Compressing Data
Large data sets can consume a lot of disk space. Data compression encodes information in a way that reduces its overall size. There are two general types of compression. Lossless compression preserves the full fidelity of the original data set. Lossy compression can provide better performance and a higher compression ratio, but it may not preserve all of the original information. Lossy compression is often used for images, video, and audio, where an exact match with the original data is not required.
The Windows Runtime exposes the Compressor and Decompressor classes, in the Windows.Storage.Compression namespace, for compression. The Compression project provides a working example of compressing and decompressing a data stream. The project contains a text file that is almost 100 kilobytes in size. The application loads that text and displays it along with a dialog showing the total number of bytes. You can then click one button to compress the text and another button to decompress it.
The compression task performs several steps. First, a local file is created to store the compressed output. Because text can be encoded in various ways, the code then uses the Encoding class to convert the text to a UTF-8 encoded byte array:
var storage = await ApplicationData.Current.LocalFolder
    .CreateFileAsync("compressed.zip", CreationCollisionOption.ReplaceExisting);
var bytes = Encoding.UTF8.GetBytes(_text);
You learned earlier in this chapter how to locate the folder for a specific user and application. You can examine the folder for the sample application to view the compressed file after you click the button to compress the text. The file is saved with a zip extension to illustrate that it was compressed, but it doesn’t contain a true archive, so you will be unable to decompress the file from Windows Explorer.
The next lines of code open the file for writing, create an instance of the Compressor, and write the bytes. The code then completes the compression operation and flushes all associated streams:
using (var stream = await storage.OpenStreamForWriteAsync())
{
    var compressor = new Compressor(stream.AsOutputStream());
    await compressor.WriteAsync(bytes.AsBuffer());
    await compressor.FinishAsync();
}
When the compression operation is complete, the bytes are read back from disk to show the compressed size. You’ll find the default algorithm cuts the text file down to almost half of its original size. The decompression operation uses the Decompressor class to perform the reverse operation and retrieve the decompressed bytes in a buffer. It then saves these bytes to disk so you can examine the result:
var decompressor = new Decompressor(stream.AsInputStream());
var bytes = new Byte[100000];
var buffer = bytes.AsBuffer();
// Request no more bytes than the buffer can hold.
var buf = await decompressor.ReadAsync(buffer, buffer.Capacity, InputStreamOptions.None);
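The fixed 100,000-byte buffer above works for the sample file because its size is known in advance. For data of unknown size, one option is to read in chunks until the stream is exhausted. The following is a minimal sketch under that assumption, not the sample project's code; the class name CompressionSketch, the method name DecompressTextAsync, and the 16 KB chunk size are hypothetical choices made for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices.WindowsRuntime;
using System.Text;
using System.Threading.Tasks;
using Windows.Storage;
using Windows.Storage.Compression;
using Windows.Storage.Streams;

public static class CompressionSketch
{
    // Hypothetical helper: reads a file written by Compressor and
    // returns its contents as text, assuming the text was UTF-8 encoded.
    public static async Task<string> DecompressTextAsync(StorageFile file)
    {
        using (var input = await file.OpenStreamForReadAsync())
        using (var decompressor = new Decompressor(input.AsInputStream()))
        using (var output = new MemoryStream())
        {
            // Reusable chunk buffer; the size is an arbitrary choice.
            var chunk = new Windows.Storage.Streams.Buffer(16 * 1024);

            while (true)
            {
                var read = await decompressor.ReadAsync(
                    chunk, chunk.Capacity, InputStreamOptions.None);

                // Assumption: a zero-length result signals the end of the data.
                if (read.Length == 0)
                {
                    break;
                }

                var data = read.ToArray();
                output.Write(data, 0, data.Length);
            }

            var all = output.ToArray();
            return Encoding.UTF8.GetString(all, 0, all.Length);
        }
    }
}
```

Reading in chunks trades a few extra calls for not having to guess the decompressed size up front.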
When you create the Compressor, you can pass a CompressAlgorithm value to the constructor to determine the compression algorithm that is used. Table 6.4 lists the possible values.
Table 6.4. Compression Algorithms
CompressAlgorithm Member | Description |
InvalidAlgorithm | Invalid algorithm. Used to generate exceptions for testing. |
NullAlgorithm | No compression is applied, and the buffer is simply passed through. Used primarily for testing. |
Mszip | Uses the MSZIP algorithm. |
Xpress | Uses the XPRESS algorithm. |
XpressHuff | Uses the XPRESS algorithm with Huffman encoding. |
Lzms | Uses the LZMS algorithm. |
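To show where a value from Table 6.4 is supplied, here is a short sketch that uses the three-argument Compressor constructor. The class name AlgorithmSketch and the method name CompressWithAsync are hypothetical, and passing 0 as the block size is assumed to request the algorithm's default block size.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices.WindowsRuntime;
using System.Threading.Tasks;
using Windows.Storage;
using Windows.Storage.Compression;

public static class AlgorithmSketch
{
    // Hypothetical helper: compresses 'data' into 'file' with the
    // algorithm chosen by the caller (for example, CompressAlgorithm.Lzms).
    public static async Task CompressWithAsync(
        StorageFile file, byte[] data, CompressAlgorithm algorithm)
    {
        using (var stream = await file.OpenStreamForWriteAsync())
        using (var compressor = new Compressor(
            stream.AsOutputStream(), algorithm, 0)) // 0: assumed default block size
        {
            await compressor.WriteAsync(data.AsBuffer());
            await compressor.FinishAsync();
        }
    }
}
```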
The Windows Runtime makes compression simple and straightforward. Use compression when you have large amounts of data to store and are concerned about the amount of disk space your application requires. Remember that compression will slow down the save operation, so be sure to experiment to find the algorithm that provides the best balance of compression ratio and performance for the type of data you are storing. Note that the Decompressor constructor takes only the input stream and does not accept an algorithm, so the data you read back must have been written by a Compressor.
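One way to run that experiment is to compress the same bytes in memory with each algorithm and compare the resulting size and elapsed time. The sketch below is only a rough starting point: the class name CompressionBenchmark and the method name CompareAlgorithmsAsync are hypothetical, a block size of 0 is assumed to mean the default, and a single timed run per algorithm is not a rigorous benchmark.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Runtime.InteropServices.WindowsRuntime;
using System.Threading.Tasks;
using Windows.Storage.Compression;
using Windows.Storage.Streams;

public static class CompressionBenchmark
{
    // Hypothetical helper: returns one report line per algorithm.
    public static async Task<List<string>> CompareAlgorithmsAsync(byte[] data)
    {
        var report = new List<string>();
        var algorithms = new[]
        {
            CompressAlgorithm.Mszip,
            CompressAlgorithm.Xpress,
            CompressAlgorithm.XpressHuff,
            CompressAlgorithm.Lzms
        };

        foreach (var algorithm in algorithms)
        {
            // Compress into memory so disk speed does not skew the timing.
            using (var memory = new InMemoryRandomAccessStream())
            using (var compressor = new Compressor(
                memory.GetOutputStreamAt(0), algorithm, 0))
            {
                var watch = Stopwatch.StartNew();
                await compressor.WriteAsync(data.AsBuffer());
                await compressor.FinishAsync();
                watch.Stop();

                report.Add(string.Format(
                    "{0}: {1} -> {2} bytes in {3} ms",
                    algorithm, data.Length, memory.Size,
                    watch.ElapsedMilliseconds));
            }
        }

        return report;
    }
}
```

Run it against data that is representative of what the application actually stores, because ratios and timings vary considerably with the content.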