Sunday, January 13, 2013

.NET Framework 4.5 & Compression - ZipPackage and the Open Packaging Conventions API

In this post we are going to discuss how we can use the Open Packaging Conventions (OPC) API in the .NET Framework for compression and decompression. Although the API cannot decompress standard Zip files, files compressed with the API can be read and decompressed with it. So if the feature under development needs to decompress data files that the application compressed itself, the API can be used in a desktop application. Keep in mind that newer file formats, including docx, pptx and nupkg, are based on the Open Packaging Conventions. Yes, that includes NuGet packages. This is what MSDN has to say about the System.IO.Packaging namespace:

"Provides classes that support storage of multiple data objects in a single container." [msdn]

The types for the API reside in the System.IO.Packaging namespace in the WindowsBase.dll assembly.


WindowsBase.dll is referenced by default in a WPF Application project. For other project types, we can add the reference manually.


The System.IO.Packaging namespace was added in .NET Framework 3.0. For compression and decompression, we can use the ZipPackage and ZipPackagePart types. ZipPackage inherits from Package.


Similarly, ZipPackagePart is a specialization of the PackagePart type available in the same namespace.
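The relationship between these types can be checked with a small sketch. The class name PackageTypesDemo, the method name Show and the part name hello.txt are made up for illustration; the sketch relies on Package.Open and Package.CreatePart handing back the Zip-based implementations at run time.

```csharp
using System;
using System.IO;
using System.IO.Packaging; // requires a reference to WindowsBase.dll

static class PackageTypesDemo
{
    // Hypothetical helper: creates a package at packagePath and reports
    // the run-time types behind the abstract Package / PackagePart APIs.
    public static void Show(string packagePath)
    {
        using (Package package = Package.Open(packagePath, FileMode.Create))
        {
            // Package is abstract; the instance is expected to be a ZipPackage.
            Console.WriteLine("Package is ZipPackage: " + (package is ZipPackage));

            Uri partUri = PackUriHelper.CreatePartUri(new Uri("hello.txt", UriKind.Relative));
            PackagePart part = package.CreatePart(partUri, "text/plain");

            // Likewise, the part is expected to be a ZipPackagePart.
            Console.WriteLine("Part is ZipPackagePart: " + (part is ZipPackagePart));
        }
    }
}
```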



Compression
Let's see how we can write some very simple code to compress files using the packaging API. The method takes a folder name as its first parameter and compresses the immediate files in that folder into the compressed file named by the second parameter.
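A minimal sketch of such a method is shown here. The names PackageCompressor and CompressFolder are assumptions made for illustration, and the sketch assumes the output file lives outside the folder being compressed.

```csharp
using System;
using System.IO;
using System.IO.Packaging; // requires a reference to WindowsBase.dll
using System.Web;          // MimeMapping; requires a reference to System.Web (.NET 4.5+)

static class PackageCompressor
{
    // Hypothetical helper: adds the immediate files of folderName
    // (no sub-folders) to a new package at compressedFileName.
    public static void CompressFolder(string folderName, string compressedFileName)
    {
        // List the files first so the package file itself is never picked up.
        string[] filePaths = Directory.GetFiles(folderName);

        using (Package package = Package.Open(compressedFileName, FileMode.Create))
        {
            foreach (string filePath in filePaths)
            {
                string fileName = Path.GetFileName(filePath);

                // Part URIs are relative and rooted at "/". Spaces in the
                // file name end up escaped as %20 in the part URI.
                Uri partUri = PackUriHelper.CreatePartUri(new Uri(fileName, UriKind.Relative));

                // The content type is derived from the file extension.
                PackagePart part = package.CreatePart(
                    partUri,
                    MimeMapping.GetMimeMapping(fileName),
                    CompressionOption.Normal);

                // Stream the file contents into the part.
                using (FileStream source = File.OpenRead(filePath))
                using (Stream target = part.GetStream())
                {
                    source.CopyTo(target);
                }
            }
        }
    }
}
```

A call might look like PackageCompressor.CompressFolder(@"C:\Data\Reports", @"C:\Data\Reports.pkg"), where both paths are placeholders.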


The code gets the list of file names in a folder. It then creates a package and adds the files to the package as package parts. Please note that the sample code only works for a single level of files in a folder. Similar code can be applied recursively to compress a whole folder hierarchy.

Here MimeMapping is from the System.Web assembly and the namespace of the same name. It is a new type introduced with .NET Framework 4.5. The default Console Application template does not reference System.Web, so we need to add a reference to the assembly before we can use the type. MimeMapping maps file extensions to MIME types.

As required by the Open Packaging Conventions, the API creates a file named [Content_Types].xml in the compressed file. The file lists the file extensions in the package along with their MIME types. In my case, the contents were as follows:
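As a rough illustration only, assuming a made-up folder that holds just .txt and .jpg files, the generated file would look something like this:

```xml
<?xml version="1.0" encoding="utf-8"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <!-- One Default entry per file extension found in the package -->
  <Default Extension="txt" ContentType="text/plain" />
  <Default Extension="jpg" ContentType="image/jpeg" />
</Types>
```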


This is the standard format for the file.

Decompression
Decompressing a compressed package is just as easy. The caveat is that standard Zip files cannot be decompressed directly using the API, because the package to be decompressed must contain a [Content_Types].xml file. In fact, Package.GetParts() wouldn't return anything if the file is absent.
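A minimal sketch of the decompression side could look like this. The names PackageDecompressor and DecompressFile are assumptions made for illustration, and the sketch only handles a flat package whose parts all sit at the root.

```csharp
using System;
using System.IO;
using System.IO.Packaging; // requires a reference to WindowsBase.dll
using System.Web;          // HttpUtility; requires a reference to System.Web

static class PackageDecompressor
{
    // Hypothetical helper: writes every part of the package at
    // compressedFileName as a file under destinationFolder.
    public static void DecompressFile(string compressedFileName, string destinationFolder)
    {
        // Create the destination folder if it does not exist yet.
        if (!Directory.Exists(destinationFolder))
        {
            Directory.CreateDirectory(destinationFolder);
        }

        using (Package package = Package.Open(compressedFileName, FileMode.Open, FileAccess.Read))
        {
            foreach (PackagePart part in package.GetParts())
            {
                // Part URIs look like "/my%20file.txt"; decode %20 back to spaces.
                // Note: UrlDecode also turns '+' into a space.
                string decoded = HttpUtility.UrlDecode(part.Uri.OriginalString);

                // Keep only the file name so output stays inside destinationFolder.
                string targetPath = Path.Combine(destinationFolder, Path.GetFileName(decoded));

                // Stream the part contents out to the target file.
                using (Stream source = part.GetStream(FileMode.Open, FileAccess.Read))
                using (FileStream target = File.Create(targetPath))
                {
                    source.CopyTo(target);
                }
            }
        }
    }
}
```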


The code checks whether the destination folder exists and creates it if it doesn't. It then reads the immediate contents of the package, i.e. the package parts, and streams them individually to the specified folder. HttpUtility is also from System.Web. We use its UrlDecode method to remove any "%20" added to the file name; the method decodes them back to spaces.

