The project has the following parts:
1. (10 points) Investigation of the compression ratio of the LZW algorithm for different types of files.
The compression ratio is defined as the output (zipped) file size divided by the input (original) file size. Under this definition, the goal of any compression algorithm is to make the ratio as small as possible, i.e., to shrink the output as much as possible relative to the input. There are different implementations of the LZW algorithm, and the compression ratio may depend on both the implementation and the input data file.
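The ratio defined above (output size divided by input size) can be computed with a few lines of Python. This is only a minimal sketch: the function names `compression_ratio` and `file_compression_ratio` are hypothetical helpers introduced here for illustration, not part of any particular tool.

```python
import os


def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Return compressed size divided by original size (both in bytes)."""
    if original_size <= 0:
        raise ValueError("original size must be positive")
    return compressed_size / original_size


def file_compression_ratio(original_path: str, compressed_path: str) -> float:
    """Apply the same definition to two files that already exist on disk."""
    return compression_ratio(
        os.path.getsize(original_path),
        os.path.getsize(compressed_path),
    )


# Example: a 1000-byte file that compresses to 250 bytes.
print(compression_ratio(1000, 250))  # 0.25
```

A value below 1 means the compressed file is smaller than the original; a value above 1 means the "compressed" file actually grew.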
1.1 Investigate how different implementations of the LZW algorithm may affect the compression ratio.
Select a few different implementations of LZW (e.g., WinZip, 7-Zip, gzip) and several different input file types. Compress the files with each program and compare the compression ratios. Draw a conclusion about how the implementation of the LZW algorithm may affect the compression ratio.
1.2 Investigate how different data file types may affect the compression ratio.
Fix one implementation of the LZW algorithm. Try it on input data files of different types, such as text files, binary files (executable programs), Word documents, and different types of image, audio, and video files. Report the compression ratios and draw some conclusions. What are the extreme compression ratios (extremely high and extremely low)? You may contrive some files of your own, e.g., author some unusual text files, Word files, or image or video files, just to get an unusually high or low compression ratio.
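To get a feel for how contrived inputs produce extreme ratios, the sketch below runs a small textbook-style LZW compressor over two synthetic inputs: one highly repetitive, one pseudo-random. It is an illustration under stated assumptions, not any real program's implementation: the initial dictionary holds all 256 byte values, every output code is assumed to be stored as a fixed 16-bit value, and the dictionary stops growing at 65536 entries. The names `lzw_compress` and `lzw_ratio` are hypothetical.

```python
import random


def lzw_compress(data: bytes, max_codes: int = 65536) -> list[int]:
    """Textbook LZW over bytes; returns the list of emitted codes."""
    # Assumption: the initial dictionary contains every single byte value.
    table = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate  # keep extending the current match
        else:
            codes.append(table[current])
            if len(table) < max_codes:  # stop growing once the table is full
                table[candidate] = len(table)
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_ratio(data: bytes) -> float:
    """Output size / input size, assuming each code takes 2 bytes."""
    return 2 * len(lzw_compress(data)) / len(data)


repetitive = b"a" * 10_000
rng = random.Random(0)  # seeded so the run is reproducible
noisy = bytes(rng.randrange(256) for _ in range(10_000))

print(f"repetitive input: {lzw_ratio(repetitive):.3f}")
print(f"random input:     {lzw_ratio(noisy):.3f}")
```

Under these assumptions the repetitive input shrinks to a few percent of its size, while the random input comes out larger than the original, because almost every 16-bit code stands for only one or two bytes.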
2. (10 points) Hand execute the LZW compression algorithm.
Assume an alphabet of two letters a and b. Given the following input string
and the initial dictionary
You should output the compressed encoding. Also give the dictionary as a table with “word” and “code” as columns.
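A hand execution can be checked against a short program. The sketch below is a minimal LZW encoder for string input. The input string and initial dictionary for this exercise are not reproduced here, so the example uses a hypothetical string "ababab" and an assumed initial dictionary a = 0, b = 1; substitute the actual values from the assignment. The function name `lzw_encode` is also an illustrative choice.

```python
def lzw_encode(text, initial_dictionary):
    """Return (list of output codes, final dictionary mapping word -> code)."""
    dictionary = dict(initial_dictionary)
    next_code = max(dictionary.values()) + 1
    output = []
    current = ""
    for symbol in text:
        candidate = current + symbol
        if candidate in dictionary:
            current = candidate  # longest match so far, keep reading
        else:
            output.append(dictionary[current])  # emit code for the match
            dictionary[candidate] = next_code   # add the new word
            next_code += 1
            current = symbol
    if current:
        output.append(dictionary[current])  # flush the final match
    return output, dictionary


# Hypothetical input and initial dictionary (assumptions, not the assignment's).
codes, table = lzw_encode("ababab", {"a": 0, "b": 1})
print(codes)  # [0, 1, 2, 2]

# Dictionary as a two-column table: word, code.
print("word\tcode")
for word, code in table.items():
    print(f"{word}\t{code}")
```

For this assumed input the final dictionary is a = 0, b = 1, ab = 2, ba = 3, aba = 4, which matches what a step-by-step hand trace produces.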