An intuitive introduction to information theory
1
An intuitive introduction to information theory
  • Ivo Grosse
  • Leibniz Institute of Plant Genetics and Crop
    Plant Research Gatersleben
  • Bioinformatics Centre Gatersleben-Halle

2
Outline
  • Why information theory?
  • An intuitive introduction

3
History of biology
St. Thomas Monastery, Brno
4
Genetics
  • Gregor Mendel (1822–1884)
  • 1866: Mendel's laws, the foundation of genetics
  • Ca. 1900: Biology becomes a quantitative science

5
50 years later: 1953
James Watson and Francis Crick
6
50 years later: 1953
7
(No Transcript)
8
DNA
  • Watson and Crick, 1953: the double helix structure of DNA
  • 1953: Biology becomes a molecular science

9
1953–2003: 50 years of revolutionary discoveries
10
1989
11
1989
  • Goals
  • Identify all of the ca. 30,000 genes
  • Identify all of the ca. 3,000,000,000 base pairs
  • Store all information in databases
  • Develop new software for data analysis

12
2003: Human Genome Project officially finished
2003: Biology becomes an information science
13
2003–2053: biology as an information science
14
2003–2053: biology as an information science
Systems Biology
15
What is information?
  • Many intuitive definitions
  • Most of them wrong
  • One clean definition since 1948
  • Requires 3 steps
  • Entropy
  • Conditional entropy
  • Mutual information

16
Before starting with entropy
  • Who is the father of information theory?
  • Who is this?
  • Claude Shannon (1916–2001)
  • "A Mathematical Theory of Communication." Bell System Technical Journal, 27, 379–423 and 623–656, 1948

17
Before starting with entropy
  • Who is the grandfather of information theory?
  • Simon bar Kochba (ca. 100–135)
  • Jewish guerrilla fighter against the Roman Empire (132–135)

18
Entropy
  • Given: a text composed from an alphabet of 32 letters (each letter equally probable)
  • Person A chooses a letter X (randomly)
  • Person B wants to know this letter
  • B may ask only binary questions
  • Question: how many binary questions must B ask in order to learn which letter X was chosen by A?
  • Answer: the entropy H(X)
  • Here H(X) = 5 bits (see the sketch below)

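A minimal sketch of this counting argument in Python (the 32-letter uniform alphabet is from the slide; the function name and layout are illustrative):

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H(X) = -sum_x p(x) * log2 p(x)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# 32 equally probable letters: each binary question can halve the
# set of remaining candidates, so B needs log2(32) = 5 questions.
uniform = [1 / 32] * 32
print(entropy(uniform))  # 5.0
```
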
19
Conditional entropy (1)
  • The sky is blu_
  • How many binary questions?
  • 5?
  • No!
  • Why?
  • What's wrong?
  • The context tells us something about the
    missing letter X

20
Conditional entropy (2)
  • Given: a text composed from an alphabet of 32 letters (each letter equally probable)
  • Person A chooses a letter X (randomly)
  • Person B wants to know this letter
  • B may ask only binary questions
  • A may tell B the letter Y preceding X
  • E.g.
  • L_
  • Q_
  • Question: how many binary questions must B ask in order to learn which letter X was chosen by A?
  • Answer: the conditional entropy H(X|Y) (see the sketch below)

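A sketch of how H(X|Y) can be computed from a joint distribution p(x, y); the two-letter toy distribution below (X is almost always 'u' after Y = 'q') is invented for illustration, not taken from the slides:

```python
import math

def conditional_entropy(joint):
    """H(X|Y) = -sum_{x,y} p(x,y) * log2( p(x,y) / p(y) ),
    where joint[(x, y)] = p(x, y)."""
    p_y = {}
    for (x, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    return -sum(p * math.log2(p / p_y[y])
                for (x, y), p in joint.items() if p > 0)

# Hypothetical bigram statistics: after 'q' the next letter is almost
# certainly 'u', so knowing Y shrinks the uncertainty about X.
joint = {('u', 'q'): 0.24, ('a', 'q'): 0.01,
         ('u', 'l'): 0.25, ('a', 'l'): 0.50}
print(conditional_entropy(joint))  # approx. 0.75 bits, vs. H(X) of approx. 1.0 bit
```
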
21
Conditional entropy (3)
  • H(X|Y) ≤ H(X) (a derivation is sketched below)
  • Clear!
  • In the worst case, namely if B ignores all the information in Y about X, B needs H(X) binary questions
  • Under no circumstances should B need more than H(X) binary questions
  • Knowledge of Y cannot increase the number of binary questions
  • Knowledge can never harm! (a mathematical statement, perhaps not true in real life)

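The inequality can be made precise with a short derivation (a sketch; it uses without proof Gibbs' inequality, i.e. the non-negativity of relative entropy, which the slide does not spell out):

```latex
H(X) - H(X \mid Y)
  = \sum_{x,y} p(x,y) \log_2 \frac{p(x,y)}{p(x)\, p(y)}
  = D\big( p(x,y) \,\big\|\, p(x)\, p(y) \big)
  \ge 0,
\quad\text{hence}\quad H(X \mid Y) \le H(X).
```
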
22
Mutual information (1)
  • Compare two situations:
  • I: learn X without knowing Y
  • II: learn X with knowing Y
  • How many binary questions in case I? → H(X)
  • How many binary questions in case II? → H(X|Y)
  • Question: how many binary questions could B save in case II?
  • Question: how many binary questions could B save by knowing Y?
  • Answer: I(X;Y) = H(X) - H(X|Y)
  • I(X;Y) = the information in Y about X (see the sketch below)

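The definition translates directly into code; a minimal sketch reusing the entropy and conditional_entropy helpers from the earlier sketches:

```python
def mutual_information(joint):
    """I(X;Y) = H(X) - H(X|Y), where joint[(x, y)] = p(x, y)."""
    p_x = {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
    return entropy(p_x.values()) - conditional_entropy(joint)
```
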
23
Mutual information (2)
  • H(X|Y) ≤ H(X) ⇒ I(X;Y) ≥ 0
  • In the worst case, namely if B ignores all the information in Y about X, or if there is no information in Y about X, then I(X;Y) = 0
  • The information in Y about X can never be negative
  • Knowledge can never harm! (a mathematical statement, perhaps not true in real life)

24
Mutual information (3)
  • Example 1: a random sequence composed of A, C, G, T (equally probable)
  • I(X;Y) = ?
  • H(X) = 2 bits
  • H(X|Y) = 2 bits
  • I(X;Y) = H(X) - H(X|Y) = 0 bits
  • Example 2: a deterministic sequence ACGT ACGT ACGT ACGT ...
  • I(X;Y) = ?
  • H(X) = 2 bits
  • H(X|Y) = 0 bits
  • I(X;Y) = H(X) - H(X|Y) = 2 bits
  • Both examples are checked empirically in the sketch below

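Both examples can be verified by estimating p(x, y) from consecutive letter pairs (Y = the letter preceding X) and plugging the estimate into the mutual_information sketch above; the helper below is illustrative:

```python
import random
from collections import Counter

def pair_mi(sequence):
    """Estimate I(X;Y) with X = a letter and Y = its predecessor."""
    pairs = list(zip(sequence[1:], sequence[:-1]))  # (x, y) pairs
    counts = Counter(pairs)
    n = len(pairs)
    joint = {xy: c / n for xy, c in counts.items()}
    return mutual_information(joint)

print(pair_mi(random.choices("ACGT", k=100_000)))  # approx. 0 bits
print(pair_mi("ACGT" * 1000))                      # approx. 2 bits: Y determines X
```
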
25
Mutual information (4)
  • I(X;Y) = I(Y;X)
  • Always! For any X and any Y!
  • The information in Y about X = the information in X about Y
  • Examples:
  • How much information is there in the amino acid sequence about the secondary structure? How much information is there in the secondary structure about the amino acid sequence?
  • How much information is there in the expression profile about the function of the gene? How much information is there in the function of the gene about the expression profile?
  • In each pair, both answers are the same: the mutual information (see the identity below)

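The symmetry is immediate once both directions are rewritten in terms of the joint entropy H(X,Y) (a standard identity, not spelled out on the slide):

```latex
I(X;Y) = H(X) - H(X \mid Y)
       = H(X) + H(Y) - H(X,Y)
       = H(Y) - H(Y \mid X)
       = I(Y;X).
```
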
26
Summary
  • Entropy
  • Conditional entropy
  • Mutual information
  • There is no such thing as "information content"
  • Information is not defined for a single variable
  • 2 random variables are needed to talk about information
  • Information in Y about X
  • I(X;Y) = I(Y;X) ⇒ the info in Y about X = the info in X about Y
  • I(X;Y) ≥ 0 ⇒ information is never negative ⇒ knowledge cannot harm
  • I(X;Y) = 0 if and only if X and Y are statistically independent
  • I(X;Y) > 0 otherwise