Compression with DeflateStream

From IronPython Cookbook

This example provides two functions. 'compress' encodes a Python string as UTF-8 and compresses it into a byte array, and 'decompress' turns a compressed byte array back into a Python string. (Based on this blog entry.)

from System.IO import MemoryStream, StreamReader
from System.IO.Compression import CompressionMode, DeflateStream
from System.Text import Encoding


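def decompress(data):
    # Wrap the compressed byte array in a stream that inflates it on read.
    deflate = DeflateStream(MemoryStream(data), CompressionMode.Decompress)
    # StreamReader decodes as UTF-8 by default, matching compress().
    text = StreamReader(deflate).ReadToEnd()
    deflate.Close()
    return text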

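def compress(data):
    ms = MemoryStream()
    # The third argument (leaveOpen) keeps ms usable after the DeflateStream closes.
    deflate = DeflateStream(ms, CompressionMode.Compress, True)
    raw = Encoding.UTF8.GetBytes(data)
    deflate.Write(raw, 0, raw.Length)
    # Closing the DeflateStream flushes the remaining compressed data into ms.
    deflate.Close()
    compressed = ms.ToArray()
    ms.Close()
    return compressed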

sourceText = """
A really long string that we would really, 
really like to compress.
""" * 50

compressed = compress(sourceText)

print 'Original Length:', len(sourceText)
print 'Compressed Length:', len(compressed)

decompressed = decompress(compressed)

print 'Decompression Worked?', decompressed == sourceText


The two functions use MemoryStream and StreamReader from System.IO, DeflateStream and CompressionMode from System.IO.Compression, and Encoding from System.Text.
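
DeflateStream reads and writes raw DEFLATE data (RFC 1951), with no zlib or gzip header. As a rough cross-check that is not part of the original recipe, the same round trip can be sketched with the standard zlib module, assuming one is available in your Python. The names deflate_raw and inflate_raw are hypothetical; passing -15 as the window-bits argument is what selects the headerless format.

```python
import zlib

def deflate_raw(text):
    # Encode to UTF-8 first, mirroring Encoding.UTF8.GetBytes in compress().
    raw = text.encode('utf-8')
    # A window-bits value of -15 requests raw DEFLATE output (no header, no checksum).
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    return compressor.compress(raw) + compressor.flush()

def inflate_raw(data):
    # The same negative window-bits value tells zlib not to expect a header.
    return zlib.decompress(data, -15).decode('utf-8')

source_text = u"""
A really long string that we would really,
really like to compress.
""" * 50

compressed = deflate_raw(source_text)
round_tripped = inflate_raw(compressed)
```

In principle, data written by DeflateStream is in the same format that inflate_raw expects, which can help when the compressed bytes are consumed outside .NET.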

compress returns a Byte array rather than a string, because it isn't generally safe to represent binary data as a string on .NET, where strings are Unicode. To write the array to a file, then read it back in and decompress it, you can use code like the following:

from System import Array, Byte
from System.IO import FileStream, FileMode

compressedBytes = compress(sourceText)
outStream = FileStream('test.dat', FileMode.Create)
outStream.Write(compressedBytes, 0, compressedBytes.Length)
outStream.Close()

inStream = FileStream('test.dat', FileMode.Open)
compressedBytes = Array.CreateInstance(Byte, inStream.Length)
inStream.Read(compressedBytes, 0, inStream.Length)
inStream.Close()

decompressedString = decompress(compressedBytes)

This uses FileStream and FileMode from System.IO, along with Array and Byte from System to allocate the read buffer.

There is a shorter way of reading byte arrays from files:

>>> from System.IO import File
>>> bytes = File.ReadAllBytes(filePath)

Correspondingly, writing can be done with:

>>> from System.IO import File
>>> File.WriteAllBytes(filePath, byteArray)
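
When compressed data has to pass through a text-only channel rather than a file, one common option is to Base64-encode the bytes instead of treating them as a string. The sketch below is an illustration, not part of the original recipe: it uses the standard base64 and zlib modules as stand-ins for the .NET classes, and the variable names are hypothetical. On .NET, Convert.ToBase64String and Convert.FromBase64String play the same role for Byte arrays.

```python
import base64
import zlib

original = u"A really long string that we would really like to compress. ".encode('utf-8') * 50
compressed = zlib.compress(original)

# Base64 maps arbitrary bytes onto ASCII characters, so the result is safe to store as text.
as_text = base64.b64encode(compressed).decode('ascii')

# Decoding the text gives back exactly the same compressed bytes.
restored = base64.b64decode(as_text)
round_tripped = zlib.decompress(restored)
```

Base64 makes the data about a third larger, so it is worth using only where raw bytes cannot be stored or sent directly.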

