Skip to content

Latest commit

 

History

History
31 lines (22 loc) · 992 Bytes

README.rst

File metadata and controls

31 lines (22 loc) · 992 Bytes

huffman

Generate Huffman codebooks! Huffman codes are the optimal way to compress individual symbols into a binary sequence that can be unambiguously decoded without inter-symbol separators (it is "prefix-free").

Provided an iterable of 2-tuples in (symbol, weight) format, generate a Huffman codebook, returned as a dictionary in {symbol: code, ...} format.

>>> huffman.codebook([('A', 2), ('B', 4), ('C', 1), ('D', 1)])
{'A': '10', 'B': '0', 'C': '110', 'D': '111'}

If you have an iterable of symbols, the collections.Counter is a handy way to tally them up.

>>> huffman.codebook(collections.Counter('man the stand banana man').items())
{' ': '111',
 'a': '10',
 'b': '0100',
 'd': '0110',
 'e': '11010',
 'h': '0101',
 'm': '1100',
 'n': '00',
 's': '11011',
 't': '0111'}