Please find the results of the first phase here:
Our goal is to find a trade-off between compression ratio and timeliness. Information older than 1 second is considered outdated and must be ignored, so the first requirement is that each message is transmitted no later than 1 s after its reception. However, we start by finding the algorithm(s) that suit our data best.
Therefore, use this trace of binary sensor data and find the algorithm with the maximum compression ratio. The data is formatted as follows:
<esc> "2" : 6 byte MLAT timestamp, 1 byte signal level, 7 byte Mode-S short frame
<esc> "3" : 6 byte MLAT timestamp, 1 byte signal level, 14 byte Mode-S long frame
<esc><esc>: true 0x1a (i.e. 0x1a's within packets are escaped)
<esc> is 0x1a, and "1", "2" and "3" are 0x31, 0x32 and 0x33
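The framing rules above can be sketched as a small parser. This is only an illustration under stated assumptions: the names `parse_stream` and `PAYLOAD_LEN` are hypothetical, packets of any type other than "2" and "3" are skipped, and a packet cut short by an unescaped <esc> is dropped.

```python
ESC = 0x1A
# Payload length after the type byte: 6-byte timestamp + 1-byte signal + frame.
PAYLOAD_LEN = {0x32: 6 + 1 + 7, 0x33: 6 + 1 + 14}


def parse_stream(raw: bytes):
    """Yield (type_byte, payload) for every complete "2"/"3" packet in raw."""
    i, n = 0, len(raw)
    while i + 1 < n:
        if raw[i] != ESC:
            i += 1                      # not a packet start, keep scanning
            continue
        kind = raw[i + 1]
        if kind not in PAYLOAD_LEN:
            i += 2                      # escaped 0x1a or unsupported type: skip
            continue
        need = PAYLOAD_LEN[kind]
        i += 2
        payload = bytearray()
        while len(payload) < need and i < n:
            if raw[i] == ESC:
                # Inside a packet a literal 0x1a is transmitted as <esc><esc>.
                if i + 1 < n and raw[i + 1] == ESC:
                    payload.append(ESC)
                    i += 2
                    continue
                break                   # lone <esc>: packet truncated, resync
            payload.append(raw[i])
            i += 1
        if len(payload) == need:
            yield kind, bytes(payload)
```

Each yielded payload has the escape bytes removed, so a short packet always carries 14 bytes and a long packet 21 bytes after the type byte.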
The original format description can be found here. The upper 18 bits of the MLAT timestamp are the seconds of the day, the lower 30 bits are the nanoseconds within that second. The Mode-S frames are encoded according to ICAO Annex 10 Volume IV. Find the Mode S message encoding in this file.
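As a minimal sketch of the timestamp layout, the 6 bytes (48 bits) can be split into the 18-bit and 30-bit parts as follows. The function name `decode_mlat_timestamp` is hypothetical, and big-endian byte order is an assumption that should be checked against the original format description.

```python
def decode_mlat_timestamp(ts: bytes):
    """Split a 6-byte MLAT timestamp into (seconds_of_day, nanoseconds)."""
    if len(ts) != 6:
        raise ValueError("MLAT timestamp must be exactly 6 bytes")
    value = int.from_bytes(ts, "big")      # assumption: big-endian byte order
    seconds = value >> 30                  # upper 18 bits
    nanos = value & ((1 << 30) - 1)        # lower 30 bits
    return seconds, nanos
```

For example, a timestamp built as `((3661 << 30) | 500).to_bytes(6, "big")` decodes to 3661 seconds of the day (01:01:01) and 500 nanoseconds.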
A maximum compression ratio of about 2.2 was achieved with LZMA. A further 10% improvement can be achieved with two optimizations: XORing the CRC to reduce the entropy, and removing the ESC characters. However, the high compression ratio comes at the cost of a high computation time of up to 4 minutes. Find the presentation with the details below:
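One possible reading of "XORing the CRC" is sketched here; the exact procedure used in the presentation is not described on this page, so treat this as an assumption. The idea is to compute the Mode S 24-bit CRC over everything except the last 3 bytes of a frame and XOR it into those last 3 bytes. Where the field holds the plain CRC it becomes zero, and where it holds the CRC overlaid with the aircraft address it becomes the address, which repeats across messages from the same aircraft. The names `modes_crc24` and `xor_crc` are hypothetical.

```python
# Mode S CRC-24 generator polynomial, including the leading x^24 term.
GENERATOR = 0x1FFF409


def modes_crc24(data: bytes) -> int:
    """Remainder of data * x^24 divided by the generator (bitwise division)."""
    nbits = len(data) * 8
    reg = int.from_bytes(data, "big") << 24
    for bit in range(nbits + 23, 23, -1):
        if (reg >> bit) & 1:
            reg ^= GENERATOR << (bit - 24)
    return reg & 0xFFFFFF


def xor_crc(frame: bytes) -> bytes:
    """Return frame with the CRC of its leading bytes XORed into its last 3 bytes."""
    body, tail = frame[:-3], frame[-3:]
    folded = int.from_bytes(tail, "big") ^ modes_crc24(body)
    return body + folded.to_bytes(3, "big")
```

The transformation is its own inverse, so applying `xor_crc` a second time on the receiving side restores the original frame.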
Group 3 provided a clean dataset without the escape characters, so each message starts directly with 2 (short message, 15 bytes) or 3 (long message, 22 bytes). They also removed unnecessary messages and XORed the CRC to lower the entropy. In addition, they provided the chunks needed for the next task. Both the complete clean and optimized dataset and the chunks can be found here.
In the third task, we investigated the effect of the data size on the compression ratio and compression time. Since we aim at compressing very small chunks of data, the investigated chunk sizes are 1, 2, 4, ..., 1024 Radarcape messages. Three compression algorithms were tested: LZMA, deflate (gzip, zlib), and Burrows-Wheeler. Interestingly, while LZMA was the clear winner in the previous task, with the best compression ratio for the complete dataset, the results of this experiment show that for these very small datasets all algorithms and compression levels perform more or less equally well.
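An experiment of this kind can be sketched with the Python standard library. The sketch assumes the clean dataset layout (each message starts with "2" for 15 bytes or "3" for 22 bytes) and uses `bz2` as a stand-in for the Burrows-Wheeler compressor, which may not be the implementation used in the course. The names `split_messages` and `ratio_by_chunk_size` are hypothetical.

```python
import bz2
import lzma
import zlib

# Total message length in the clean dataset, including the leading type byte.
MSG_LEN = {0x32: 15, 0x33: 22}

COMPRESSORS = {
    "lzma": lzma.compress,
    "deflate": zlib.compress,
    "bwt (bz2)": bz2.compress,   # assumption: bzip2 as the BWT-based compressor
}


def split_messages(clean: bytes) -> list:
    """Split the clean dataset into individual messages."""
    msgs, i = [], 0
    while i < len(clean):
        size = MSG_LEN.get(clean[i])
        if size is None or i + size > len(clean):
            break                # unknown type byte or truncated tail
        msgs.append(clean[i:i + size])
        i += size
    return msgs


def ratio_by_chunk_size(msgs, sizes=tuple(2 ** k for k in range(11))):
    """Map chunk size -> {algorithm: ratio}, compressing each chunk on its own."""
    result = {}
    for size in sizes:
        chunks = [b"".join(msgs[i:i + size]) for i in range(0, len(msgs), size)]
        if not chunks:
            continue
        raw = sum(len(c) for c in chunks)
        result[size] = {
            name: raw / sum(len(compress(c)) for c in chunks)
            for name, compress in COMPRESSORS.items()
        }
    return result
```

Because every chunk is compressed independently, the fixed per-call overhead of each format is counted once per chunk, which is one plausible reason why the algorithms end up so close together for very small chunks.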
In addition to the compression ratio for different chunk sizes, we profiled the execution time of the individual steps of the compression algorithms. The results clearly show that matching the longest symbols is the most expensive task for dictionary-based compression schemes.
Find the presentation with the details below:
Two-week task: March 7 until March 18.
Use LaTeX and this template (DiscoReport.zip).