Title: Functional Programming Lecture 15 - Case Study: Huffman Codes
1 Functional Programming Lecture 15 - Case Study: Huffman Codes
2 The Problem
- Design a coding/decoding scheme and implement it in Haskell.
- This requires
  - an algorithm to encode a message,
  - an algorithm to decode a message,
  - an implementation.
3 Fixed and Variable Length Codes
- A fixed length code assigns the same number of bits to each code word.
  - E.g. ASCII: letter -> 7 bits (up to 128 code words).
  - So to encode the string "at" we need 14 bits.
- A variable length code assigns a different number of bits to each code word, depending on the frequency of the code word. Frequent words are assigned short codes; infrequent words are assigned long codes.
  - E.g. a tree to encode and decode (0 for go left, 1 for go right):

          *
        0/ \1
        a   *
          0/ \1
          b   t

  - "at" is encoded by 011.
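To make the comparison concrete, here is a minimal sketch that counts the bits each scheme needs. The names fixedBits, varLen and variableBits are illustrative and not part of the lecture; the code lengths are the ones read off the example tree (a: 1 bit, b and t: 2 bits).

```haskell
-- Bits needed by a fixed-length code: every character costs the same.
fixedBits :: Int -> String -> Int
fixedBits width msg = width * length msg

-- Code lengths read off the example tree: a -> 0, b -> 10, t -> 11.
-- (Assumption: only these three letters occur in a message.)
varLen :: Char -> Int
varLen 'a' = 1
varLen 'b' = 2
varLen 't' = 2
varLen c   = error ("varLen: no code for " ++ show c)

-- Bits needed by the variable-length code: sum the individual lengths.
variableBits :: String -> Int
variableBits = sum . map varLen
```

In GHCi, fixedBits 7 "at" evaluates to 14 and variableBits "at" evaluates to 3, matching the 14 bits and the 3-bit code 011 above.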
4 Coding

      *
    0/ \1
    a   *
      0/ \1
      b   t

- a is encoded by 1 bit: 0
- b is encoded by 2 bits: 10
- t is encoded by 2 bits: 11
An important property of a Huffman code is that the codes are prefix codes: no code of a letter (code word) is the prefix of the code of another letter (code word).
- E.g. 0 is not a prefix of 10 or 11
- 10 is not a prefix of 0 or 11
- 11 is not a prefix of 0 or 10
So, aa is encoded by 00, and ba is encoded by 100.
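The prefix property can be checked mechanically. The following sketch is not from the lecture: isPrefixFree is a hypothetical helper, and code words are written as strings of '0' and '1' for readability.

```haskell
import Data.List (isPrefixOf)

-- True when no code word is a prefix of a different code word.
-- Each word is paired with its position so that a word is never
-- compared with itself.
isPrefixFree :: [String] -> Bool
isPrefixFree codes =
  and [ not (c1 `isPrefixOf` c2)
      | (i, c1) <- indexed, (j, c2) <- indexed, i /= j ]
  where
    indexed = zip [0 :: Int ..] codes
```

For the code above, isPrefixFree ["0", "10", "11"] is True, whereas isPrefixFree ["0", "01", "11"] is False because 0 is a prefix of 01.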
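5 Decoding

      *
    0/ \1
    a   *
      0/ \1
      b   t

The encoded message 1001111011 is decoded as
  10 - b, 0 - a, 11 - t, 11 - t, 0 - a, 11 - t
i.e. "battat". In view of the frequency of t, this is probably not a good code: t should be encoded by 1 bit!
P.S. Morse code follows the same idea of giving frequent letters short codes, although it is not a prefix code and relies on pauses to separate letters.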
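The decoding above can be reproduced directly on strings of '0' and '1'. This is only a sketch for experimenting with the example: exampleCodes and decodeBits are hypothetical names, and it matches code words against a table rather than walking a tree.

```haskell
import Data.List (stripPrefix)

-- The example code as a table of (letter, code word) pairs.
exampleCodes :: [(Char, String)]
exampleCodes = [('a', "0"), ('b', "10"), ('t', "11")]

-- Decode by repeatedly finding the code word that starts the input.
-- Because the code is a prefix code, at most one entry can match.
-- (Assumption: no code word in the table is empty.)
decodeBits :: [(Char, String)] -> String -> String
decodeBits _   []   = []
decodeBits tbl bits =
  case [ (ch, rest) | (ch, code) <- tbl
                    , Just rest <- [stripPrefix code bits] ] of
    ((ch, rest) : _) -> ch : decodeBits tbl rest
    []               -> error "decodeBits: no code word matches"
```

In GHCi, decodeBits exampleCodes "1001111011" evaluates to "battat".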
6 A Haskell Implementation
- Types
    -- codes --
    data Bit = L | R deriving (Eq, Show)
    type Hcode = [Bit]
    -- Huffman coding tree --
    -- characters at leaf nodes, plus frequencies --
    -- frequencies as well at internal nodes --
    data Tree = Leaf Char Int | Node Int Tree Tree
- Assume that codes are kept in a table (rather than read off a tree).
    -- table of codes --
    type Table = [(Char, Hcode)]
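As an illustration of how values of these types look, the sketch below writes the a/b/t code both as a tree and as a table. The type declarations are restated so the block stands alone; exampleTree, exampleTable and leaves are hypothetical names, and the Int fields copy the numbers used in the lecture's worked example.

```haskell
data Bit = L | R deriving (Eq, Show)
type Hcode = [Bit]
data Tree = Leaf Char Int | Node Int Tree Tree
type Table = [(Char, Hcode)]

-- The a/b/t tree: a on the left, b and t under the right branch.
exampleTree :: Tree
exampleTree = Node 3 (Leaf 'a' 0) (Node 3 (Leaf 'b' 1) (Leaf 't' 2))

-- The same code as a table: 0 is L (go left), 1 is R (go right).
exampleTable :: Table
exampleTable = [('a', [L]), ('b', [R, L]), ('t', [R, R])]

-- Characters at the leaves, read from left to right.
leaves :: Tree -> [Char]
leaves (Leaf ch _)    = [ch]
leaves (Node _ t1 t2) = leaves t1 ++ leaves t2
```

In GHCi, leaves exampleTree gives "abt" and lookup 't' exampleTable gives Just [R,R].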
7 Encoding
    -- encode a message according to code table --
    -- encode each character and concatenate --
    codeMessage :: Table -> [Char] -> Hcode
    codeMessage tbl = concat . map (lookupTable tbl)

    -- lookup the code for a character in code table --
    lookupTable :: Table -> Char -> Hcode
    lookupTable [] c = error "lookupTable"
    lookupTable ((ch,code):tbl) c
      | ch == c   = code
      | otherwise = lookupTable tbl c
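The encoder stops with error when a character is missing from the table. One possible alternative, sketched here under the assumption that a Maybe result is acceptable, uses the Prelude function lookup; codeMessageSafe and exampleTable are hypothetical names, not part of the lecture.

```haskell
data Bit = L | R deriving (Eq, Show)
type Hcode = [Bit]
type Table = [(Char, Hcode)]

-- Example table for the a/b/t code (0 is L, 1 is R).
exampleTable :: Table
exampleTable = [('a', [L]), ('b', [R, L]), ('t', [R, R])]

-- Encode each character and concatenate, as codeMessage does, but
-- return Nothing as soon as one character has no entry in the table.
codeMessageSafe :: Table -> [Char] -> Maybe Hcode
codeMessageSafe tbl msg = fmap concat (mapM (\c -> lookup c tbl) msg)
```

In GHCi, codeMessageSafe exampleTable "bat" gives Just [R,L,L,R,R], while codeMessageSafe exampleTable "cat" gives Nothing because c has no code.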
8 Decoding
    -- decode a message according to code tree --
    -- if at a leaf node, then character is decoded, --
    -- start again at root --
    -- if at an internal node, then follow sub-tree --
    -- according to next code bit --
    decode :: Tree -> Hcode -> [Char]
    decode tr = decodetree tr
      where
      decodetree (Node f t1 t2) (L:rest)
        = decodetree t1 rest
      decodetree (Node f t1 t2) (R:rest)
        = decodetree t2 rest
      -- no bits left: emit the last character, or stop --
      decodetree (Leaf ch f) [] = [ch]
      decodetree (Node f t1 t2) [] = []
      decodetree (Leaf ch f) rest
        = ch:(decodetree tr rest)
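For trying the decoder out, here is a self-contained sketch with the types restated. It includes clauses for an empty bit list so that the recursion terminates when the message runs out; the tree is the a/b/t tree of the slides.

```haskell
data Bit = L | R deriving (Eq, Show)
type Hcode = [Bit]
data Tree = Leaf Char Int | Node Int Tree Tree

-- Walk the tree bit by bit; on reaching a leaf, emit its character
-- and restart from the root tr.
-- (Assumption: the root tr is a Node, not a single Leaf.)
decode :: Tree -> Hcode -> [Char]
decode tr = decodetree tr
  where
    decodetree (Node _ t1 _) (L : rest) = decodetree t1 rest
    decodetree (Node _ _ t2) (R : rest) = decodetree t2 rest
    decodetree (Leaf ch _)   []         = [ch]  -- last character
    decodetree (Leaf ch _)   rest       = ch : decodetree tr rest
    decodetree (Node _ _ _)  []         = []    -- bits ended inside the tree

-- The a/b/t tree: a -> L, b -> RL, t -> RR.
codetree :: Tree
codetree = Node 3 (Leaf 'a' 0) (Node 3 (Leaf 'b' 1) (Leaf 't' 2))
```

In GHCi, decode codetree [R,L,L,R,R] gives "bat".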
9 Example
    codetree = Node 3 (Leaf 'a' 0)
                      (Node 3 (Leaf 'b' 1) (Leaf 't' 2))
    -- assume a is most frequent, denoted by smallest number --
    message = [R,L,L,R,R,R,R,L,R,R]

    decode codetree message
    => decodetree (Node 3 t1 (Node 3 ..)) (R:[L,L,R,R,R,R,L,R,R])
    => decodetree (Node 3 (Leaf 'b' 1) (Leaf 't' 2)) (L:[L,R,R,R,R,L,R,R])
    => decodetree (Leaf 'b' 1) [L,R,R,R,R,L,R,R]
    => 'b' : decodetree (Node 3 (Leaf 'a' 0) (Node 3 ..)) (L:[R,R,R,R,L,R,R])
    => 'b' : decodetree (Leaf 'a' 0) [R,R,R,R,L,R,R]
    => 'b' : 'a' : decodetree (Node 3 (Leaf 'a' 0) (Node 3 ..)) [R,R,R,R,L,R,R]
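The hand evaluation can be mirrored by a function that records, after each decoded character, which bits are still unread. This is a sketch only: decodeOne and decodeSteps are hypothetical helpers, not the lecture's decode, and the types are restated so the block stands alone.

```haskell
data Bit = L | R deriving (Eq, Show)
type Hcode = [Bit]
data Tree = Leaf Char Int | Node Int Tree Tree

codetree :: Tree
codetree = Node 3 (Leaf 'a' 0) (Node 3 (Leaf 'b' 1) (Leaf 't' 2))

message :: Hcode
message = [R, L, L, R, R, R, R, L, R, R]

-- Decode one character: return it together with the unread bits,
-- or Nothing if the bits run out before a leaf is reached.
decodeOne :: Tree -> Hcode -> Maybe (Char, Hcode)
decodeOne (Leaf ch _)   rest       = Just (ch, rest)
decodeOne (Node _ t1 _) (L : rest) = decodeOne t1 rest
decodeOne (Node _ _ t2) (R : rest) = decodeOne t2 rest
decodeOne (Node _ _ _)  []         = Nothing

-- One entry per decoded character, restarting at the root each time.
-- (Assumption: the root is a Node, so every step consumes a bit.)
decodeSteps :: Tree -> Hcode -> [(Char, Hcode)]
decodeSteps tr bits =
  case decodeOne tr bits of
    Just (ch, rest) | null rest -> [(ch, rest)]
                    | otherwise -> (ch, rest) : decodeSteps tr rest
    Nothing                     -> []
```

The first two entries of decodeSteps codetree message are ('b', [L,R,R,R,R,L,R,R]) and ('a', [R,R,R,R,L,R,R]), the same states reached by hand above, and map fst (decodeSteps codetree message) gives "battat".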
10
- We still have to make
  - the code tree
  - the code table
- (Next lecture!)