Huffman Codes - PowerPoint PPT Presentation

Provided by: Rollins4
Learn more at: https://www.cse.usf.edu

1
Huffman Codes
  • Drozdek Chapter 11

2
Objectives
  • You will be able to
  • Construct an optimal variable bit length code for
    an alphabet with known probability for each
    letter occurring in a message.
  • Huffman Code
  • Construct a tree for decoding messages encoded in
    a Huffman code.
  • Construct a tree for encoding messages in
    a Huffman code.

3
Huffman Codes
  • Common character codes such as ASCII and EBCDIC
    use the same number of bits for every character.
  • Typically stored as eight bits per character.
  • Contrast Morse code
  • Uses variable-length sequences.
  • Variable length codes can produce shorter
    messages than fixed length codes on average,
    when applied to many messages with given
    character probabilities.

4
Variable-Length Codes
  • Each character in such a code
  • has a weight (probability) and a length
  • The expected message length per character is the
    sum of the products of the code lengths and the
    probabilities for all the characters
  • (0.2 × 2) + (0.1 × 4) + (0.1 × 4) + (0.15 × 3) + (0.45 × 1)
    = 2.1
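
The sum above can be checked with a few lines of C++. This is only a sketch: the function name expected_code_length and the use of two parallel vectors are assumptions made for illustration, not part of the course code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expected number of bits per character: the sum of weight[i] * length[i].
// Both vectors are indexed by character and must have the same size.
double expected_code_length(const std::vector<double>& weights,
                            const std::vector<int>& lengths)
{
    assert(weights.size() == lengths.size());
    double total = 0.0;
    for (std::size_t i = 0; i < weights.size(); ++i)
    {
        total += weights[i] * lengths[i];
    }
    return total;
}
```

Called with weights 0.2, 0.1, 0.1, 0.15, 0.45 and code lengths 2, 4, 4, 3, 1, it returns 2.1 up to floating-point rounding.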

5
Immediate Decodability
  • When no sequence of bits that represents a
    character is a prefix of a longer sequence for
    another character
  • Can be decoded without waiting for remaining
    bits.
  • Note how previous scheme is not immediately
    decodable.
  • And this one is

6
Immediate Decodability
  • Codes that are immediately decodable are called
    prefix codes.
  • No valid code symbol is a prefix of another valid
    code symbol.
  • Perhaps better called prefix-free codes.
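
The prefix-free property is easy to test mechanically. The sketch below compares every pair of code words; the name is_prefix_free is hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns true if no code word is a prefix of (or equal to) another entry.
bool is_prefix_free(const std::vector<std::string>& codes)
{
    for (std::size_t i = 0; i < codes.size(); ++i)
    {
        // An empty code word would be a prefix of everything.
        assert(!codes[i].empty());
        for (std::size_t j = 0; j < codes.size(); ++j)
        {
            if (i == j) continue;
            // compare(...) == 0 means codes[j] starts with codes[i].
            if (codes[i].size() <= codes[j].size() &&
                codes[j].compare(0, codes[i].size(), codes[i]) == 0)
            {
                return false;
            }
        }
    }
    return true;
}
```

For example, the set 011, 000, 001, 010, 1 passes the check, while 0, 01, 1 fails because 0 is a prefix of 01.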

7
Optimal Codes
  • We seek codes that are
  • Immediately decodable.
  • Average message length for a large number of
    messages is minimal.
  • For a set of n characters C1 .. Cn with
    weights w1 .. wn
  • We need an algorithm which generates variable
    length bit strings representing the characters.

8
Huffman Codes
  • An optimal code scheme developed by David A.
    Huffman while a PhD student at MIT.
  • A Method for the Construction of
    Minimum-Redundancy Codes
  • Proceedings of the I.R.E., Sept. 1952
  • http://en.wikipedia.org/wiki/David_A._Huffman
  • http://www.huffmancoding.com/david-huffman/scientific-american

9
Huffman's Algorithm
  • How to determine an optimal code for a set of N
    characters given their relative frequencies (or
    weights).

10
Huffman's Algorithm
  • Initialize a list of one-node binary trees
  • One node for each character containing the
    character and its weight.
  • While there is more than one tree in the list
  • Find two trees in the list having minimal
    weights.
  • Remove those trees from the list and make them
    the left and right subtrees of a new node having
    the sum of their weights as its weight.
  • Label the arc to the left subtree with 0.
  • Label the arc to the right subtree with 1.
  • Add the new tree to the list.

11
Huffman's Algorithm
  • The code for character Ci is the bit string along
    the path from the root to Ci in the final binary
    tree.
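
The algorithm can be sketched compactly with a priority queue in place of a list that is searched for its two lightest trees. This is an illustrative variant, not the list-based program developed later in these slides; Node, ByWeight, collect_codes and huffman_codes are all hypothetical names. When weights tie, the particular codes produced may differ from the slides, but the average length is the same.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Tree node stored in a vector; a child index of -1 means "no child".
struct Node
{
    char ch;
    double weight;
    int left;
    int right;
};

// Orders node indices so the smallest weight is on top of the queue.
struct ByWeight
{
    const std::vector<Node>* nodes;
    bool operator()(int a, int b) const
    {
        return (*nodes)[a].weight > (*nodes)[b].weight;
    }
};

// Walk from the root, appending 0 for a left arc and 1 for a right arc.
void collect_codes(const std::vector<Node>& nodes, int i,
                   const std::string& prefix,
                   std::map<char, std::string>& codes)
{
    if (nodes[i].left < 0)  // leaf: record the path as this character's code
    {
        codes[nodes[i].ch] = prefix;
        return;
    }
    collect_codes(nodes, nodes[i].left, prefix + "0", codes);
    collect_codes(nodes, nodes[i].right, prefix + "1", codes);
}

std::map<char, std::string> huffman_codes(
    const std::vector<std::pair<char, double> >& alphabet)
{
    assert(alphabet.size() >= 2);
    std::vector<Node> nodes;
    ByWeight cmp = { &nodes };
    std::priority_queue<int, std::vector<int>, ByWeight> pq(cmp);

    // One single-node tree per character.
    for (std::size_t i = 0; i < alphabet.size(); ++i)
    {
        Node leaf = { alphabet[i].first, alphabet[i].second, -1, -1 };
        nodes.push_back(leaf);
        pq.push(static_cast<int>(nodes.size()) - 1);
    }

    // Repeatedly merge the two trees of minimal weight.
    while (pq.size() > 1)
    {
        int a = pq.top(); pq.pop();
        int b = pq.top(); pq.pop();
        Node parent = { 0, nodes[a].weight + nodes[b].weight, a, b };
        nodes.push_back(parent);
        pq.push(static_cast<int>(nodes.size()) - 1);
    }

    std::map<char, std::string> codes;
    collect_codes(nodes, pq.top(), "", codes);
    return codes;
}
```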

12
Example
  • Given characters A, B, C, D, E
  • and probabilities 0.2, 0.1, 0.1, 0.15, 0.45
  • The end result is

Character  Huffman Code
A          011
B          000
C          001
D          010
E          1

Note arbitrary choice for sibling of D.

13
Alternate Result
Average message length is the same.
14
Huffman Decoding Algorithm
  • Given a message as a string of 0's and 1's
  • Initialize pointer p to the root of Huffman tree.
  • While end of message string not reached
  • Let x be the next bit of the message string.
  • If x is 0
  • move p to the left child
  • else
  • move p to the right child
  • If p points to a leaf
  • Display the character at that leaf.
  • Reset p to the root of the Huffman tree.

15
Huffman Decoding Algorithm
  • For message string 0001011010
  • Using Huffman Tree and decoding algorithm
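
The decoding loop can be sketched as follows. The sketch assumes the code table from the Example slide (A 011, B 000, C 001, D 010, E 1) and rebuilds a decoding tree from it; TrieNode, build_decode_tree and decode are hypothetical names, not part of the course code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Node of the decoding tree; a child index of -1 means "no child".
struct TrieNode
{
    char ch;
    int child[2];
};

// Build a decoding tree from a character-to-code table. Node 0 is the root.
std::vector<TrieNode> build_decode_tree(const std::map<char, std::string>& table)
{
    TrieNode empty = { 0, { -1, -1 } };
    std::vector<TrieNode> tree(1, empty);
    std::map<char, std::string>::const_iterator it;
    for (it = table.begin(); it != table.end(); ++it)
    {
        int p = 0;
        for (std::size_t i = 0; i < it->second.size(); ++i)
        {
            int bit = it->second[i] - '0';
            assert(bit == 0 || bit == 1);
            if (tree[p].child[bit] < 0)
            {
                tree.push_back(empty);
                tree[p].child[bit] = static_cast<int>(tree.size()) - 1;
            }
            p = tree[p].child[bit];
        }
        tree[p].ch = it->first;  // the end of the path is this character's leaf
    }
    return tree;
}

// Follow one arc per bit; on reaching a leaf, emit it and return to the root.
std::string decode(const std::vector<TrieNode>& tree, const std::string& bits)
{
    std::string out;
    int p = 0;
    for (std::size_t i = 0; i < bits.size(); ++i)
    {
        int bit = bits[i] - '0';
        assert(bit == 0 || bit == 1);
        p = tree[p].child[bit];
        assert(p >= 0);  // the message must follow arcs that exist
        if (tree[p].child[0] < 0 && tree[p].child[1] < 0)
        {
            out += tree[p].ch;
            p = 0;
        }
    }
    return out;
}
```

Under that assumed table, 0001011010 splits as 000 | 1 | 011 | 010 and decodes to BEAD.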

16
Implementing a Huffman Code Program
  • Let's implement a program to build a Huffman code
    tree.
  • Encode and decode text messages using the
    resulting Huffman code.
  • Limit input to letters and spaces.
  • Convert letters to lower case.

17
Implementing a Huffman Code Program
  • In order to create a Huffman code for English
    text, we need weighting factors for the letters.
  • Frequency tables are readily available.
  • To simplify testing and debugging, start with
    a small example
  • Just the letters A, B, C, D, and E

18
Getting Started
  • Create a new empty C++ project in Visual Studio,
    Huffman_Code
  • or a directory in Unix.
  • Add a C++ source file main.cpp

19
main.cpp
    #include <iostream>
    using namespace std;

    int main(void)
    {
        cout << "This is the Huffman Code program" << endl;
        cin.get();
        cin.get();
        return 0;
    }
  • Build and test

20
Program Running
21
Class Char_Freq
  • We need a class to hold the elements of a Huffman
    tree.
  • Data
  • Character
  • Frequency (Probability of occurrence)
  • Pointers
  • Left child
  • Right child
  • Add class Char_Freq

22
Char_Freq.h
    #pragma once
    #include <iostream>
    using std::ostream;

    class Char_Freq
    {
    private:
        char ch;
        double freq;
        Char_Freq* left;
        Char_Freq* right;

    public:
        Char_Freq(void);
        Char_Freq(char c, double f);
        Char_Freq(char c, double f, Char_Freq* Left, Char_Freq* Right);
        char Ch() const { return ch; }
        double Freq() const { return freq; }
        bool operator<(const Char_Freq& rhs) const;
        friend ostream& operator<<(ostream& os, const Char_Freq& cf);
    };

23
Char_Freq.cpp
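    #include "Char_Freq.h"

    // Default constructor: an empty node with no children.
    Char_Freq::Char_Freq(void) : ch(0), freq(0), left(0), right(0) {}

    Char_Freq::Char_Freq(char c, double f) :
        ch(c), freq(f), left(0), right(0) {}

    Char_Freq::Char_Freq(char c, double f, Char_Freq* Left, Char_Freq* Right) :
        ch(c), freq(f), left(Left), right(Right) {}

    // Order nodes by frequency so that list::sort() puts the lightest first.
    bool Char_Freq::operator<(const Char_Freq& rhs) const
    {
        return this->freq < rhs.freq;
    }

    ostream& operator<<(ostream& os, const Char_Freq& cf)
    {
        os << cf.ch << " " << cf.freq;
        return os;
    }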

24
The Huffman Tree
  • Add class Huffman_Tree
  • Will hold code to build and access the Huffman
    code for a specific set of characters and
    frequencies.

25
Starting the Huffman Tree
  • We will build multiple trees of Char_Freq
    elements.
  • Keep the roots in a list.
  • Use Standard Template Library list class.
  • Initially one tree per character to be coded.
  • Each tree consists of root only.
  • Method Add() will be used to add char-freq pairs
    to the list

26
Huffman_Tree.h
    #pragma once
    #include <list>
    #include "Char_Freq.h"

    class Huffman_Tree
    {
    public:
        Huffman_Tree(void);
        ~Huffman_Tree(void);

        // Add a single node tree to the list.
        void Add(char c, double frequency);
        void Display_List(void);

    private:
        std::list<Char_Freq> node_list;
    };

27
Huffman_Tree.cpp
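    #include <iostream>
    #include <string>
    #include "Huffman_Tree.h"
    using namespace std;

    Huffman_Tree::Huffman_Tree(void) {}

    // Needed only if Huffman_Tree.h declares the destructor.
    Huffman_Tree::~Huffman_Tree(void) {}

    // Add a single node tree to the list.
    void Huffman_Tree::Add(char c, double frequency)
    {
        Char_Freq cf(c, frequency);
        node_list.push_back(cf);
    }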

28
Huffman_Tree.cpp
    void Huffman_Tree::Display_List(void)
    {
        cout << "Character frequency list" << endl;
        list<Char_Freq>::iterator itr;
        for (itr = node_list.begin(); itr != node_list.end(); ++itr)
        {
            cout << *itr << endl;
        }
    }

29
main.cpp
    #include <iostream>
    #include <string>
    #include "Huffman_Tree.h"
    using namespace std;

    Huffman_Tree huffman_tree;

    int main(void)
    {
        cout << "This is the Huffman code program.\n\n";
        huffman_tree.Add('a', 0.2 );
        huffman_tree.Add('b', 0.1 );
        huffman_tree.Add('c', 0.1 );
        huffman_tree.Add('d', 0.15);
        huffman_tree.Add('e', 0.45);
        huffman_tree.Display_List();
    }

30
Program in Action
31
Implementing Huffman's Algorithm
  • Huffman's algorithm requires us to identify two
    trees with minimal total frequency.
  • To do this we can sort the list.
  • The < operator for the Char_Freq class compares
    the frequency values.
  • So the sort method of the list template class
    will sort the trees into increasing order by
    frequency.
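
A small stand-alone sketch shows how std::list::sort() picks up a member operator<. Weighted is a hypothetical stand-in for Char_Freq, and two_smallest is an illustrative helper that does not appear in the course code.

```cpp
#include <cassert>
#include <list>
#include <utility>

// Hypothetical stand-in for Char_Freq: compared by frequency only.
struct Weighted
{
    char ch;
    double freq;
    bool operator<(const Weighted& rhs) const { return freq < rhs.freq; }
};

// Sort a copy of the list, then take the first two elements.
std::pair<Weighted, Weighted> two_smallest(std::list<Weighted> items)
{
    assert(items.size() >= 2);
    items.sort();  // uses Weighted::operator<, ascending by freq
    Weighted first = items.front();
    items.pop_front();
    return std::make_pair(first, items.front());
}
```

Because list::sort() is stable, elements with equal frequency keep their original relative order.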

32
Implementing Huffman's Algorithm
  • Add function Make_Decode_Tree to class
    Huffman_Tree.
  • Repeatedly
  • Sort the list of trees by frequency
  • Remove the first two trees
  • Create a new node with these trees as subtrees.
  • Frequency is sum of their frequencies
  • Add the new node to the list.
  • Continue until there is only one node on the list.

33
Huffman_Tree.h
  • Add new public method
    void Make_Decode_Tree(void);

34
Huffman_Tree.cpp
  • Start by sorting the list.
  • Display the sorted list.
    void Huffman_Tree::Make_Decode_Tree(void)
    {
        node_list.sort();
        cout << "\nSorted list\n";
        Display_List();
    }

35
main.cpp
  • Add call to Make_Decode_Tree.
    int main(void)
    {
        cout << "This is the Huffman code program.\n";
        huffman_tree.Add('a', 0.2 );
        huffman_tree.Add('b', 0.1 );
        huffman_tree.Add('c', 0.1 );
        huffman_tree.Add('d', 0.15);
        huffman_tree.Add('e', 0.45);
        huffman_tree.Display_List();
        huffman_tree.Make_Decode_Tree();
        cin.get();
        cin.get();
        return 0;
    }

36
Program in Action
37
Huffman_Tree.cpp
  • Add to function Make_Decode_Tree()
    while (node_list.size() > 1)
    {
        // Copy the two lightest trees to the heap and remove them from the list.
        Char_Freq* cf1 = new Char_Freq(node_list.front());
        node_list.pop_front();
        Char_Freq* cf2 = new Char_Freq(node_list.front());
        node_list.pop_front();
        // The merged node has no character; its weight is the sum.
        Char_Freq cf3(0, cf1->Freq() + cf2->Freq(), cf1, cf2);
        node_list.push_back(cf3);
        node_list.sort();
    }

This is the essence of Huffman's algorithm!
38
Huffman_Tree.h
  • Add a new private member variable to class
    Huffman_Tree to hold the root of the tree.
    private:
        std::list<Char_Freq> node_list;
        Char_Freq* decode_tree_root;

39
Huffman_Tree.cpp
  • In order to check our results we need to be able
    to display the tree.
  • Also show the code as a list.
  • Add public functions to Huffman_Tree.h
    void Display_Decode_Tree(Char_Freq* cf, int indent) const;
    void Display_Code(Char_Freq* cf, std::string prefix) const;
  • Add at top of Huffman_Tree.cpp
    #include <iomanip>
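
The body of Display_Code is not shown in this part of the slides. One plausible shape is a recursive walk that extends the prefix with 0 for a left arc and 1 for a right arc, and reports the prefix at each leaf. The sketch below is an assumption for illustration only: Tree_Node is a hypothetical stand-in for Char_Freq, and code_listing returns the listing as a string instead of printing it.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for Char_Freq with public members.
struct Tree_Node
{
    char ch;
    Tree_Node* left;
    Tree_Node* right;
};

// Returns one "character code" line per leaf, visiting left before right.
std::string code_listing(const Tree_Node* node, const std::string& prefix)
{
    assert(node != 0);
    if (node->left == 0 && node->right == 0)
    {
        // Leaf: the path from the root is this character's code.
        return std::string(1, node->ch) + " " + prefix + "\n";
    }
    std::string lines;
    if (node->left != 0)
    {
        lines += code_listing(node->left, prefix + "0");
    }
    if (node->right != 0)
    {
        lines += code_listing(node->right, prefix + "1");
    }
    return lines;
}
```

A caller would start at the root with an empty prefix and send the returned string to cout.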

40
Display_Decode_Tree()
    void Huffman_Tree::Display_Decode_Tree(Char_Freq* cf, int indent) const
    {
        if (cf->left != 0)
        {
            Display_Decode_Tree(cf->left, indent + 8);
        }
        cout << setw(indent) << " " << *cf << endl;
        if (cf->right != 0)
        {
            Display_Decode_Tree(cf->right, indent + 8);
        }
    }
  • Note access of private members of cf.
  • Make class Huffman_Tree a friend of class
    Char_Freq.

41
Char_Freq.h
  • Add at the end of Char_Freq.h
        bool operator<(const Char_Freq& rhs) const;
        friend ostream& operator<<(ostream& os, const Char_Freq& cf);
        friend class Huffman_Tree;
    };

42
Char_Freq.cpp
  • Update << to handle merged nodes
  • ch will be 0
    ostream& operator<<(ostream& os, const Char_Freq& cf)
    {
        if (cf.ch > 0)
        {
            os << cf.ch << " " << cf.freq;
        }
        else
        {
            // Marker for a merged node; the choice of '*' is an assumption.
            os << '*' << " " << cf.freq;
        }
        return os;
    }

43
Huffman_Tree.cpp
  • Add at the end of function Make_Decode_Tree()
    decode_tree_root = &node_list.front();
    cout << endl << "The Huffman Tree" << endl;
    Display_Decode_Tree(decode_tree_root, 0);

44
Program in Action