Strings and Languages Operations - PowerPoint PPT Presentation

Title:

Strings and Languages Operations

Description:

Strings and Languages Operations Concatenation Exponentiation Kleene Star Regular Expressions – PowerPoint PPT presentation

Slides: 272
Provided by: Patchrawat4

1
Strings and Languages Operations
  • Concatenation
  • Exponentiation
  • Kleene Star
  • Regular Expressions

2
Strings and Language Operations
  • Concatenation
  • Exponentiation
  • Kleene star
  • Pages 27-30 of the text
  • Regular expressions
  • Pages 71-75 of the text

3
String Concatenation
  • If x and y are strings over alphabet Σ, the
    concatenation of x and y is the string xy formed
    by writing the symbols of x and the symbols of y
    consecutively.
  • Suppose x = abb and y = ba
  • xy = abbba
  • yx = baabb

4
Properties of String Concatenation
  • Suppose x, y, and z are strings.
  • Concatenation is not commutative.
  • xy is not guaranteed to be equal to yx
  • Concatenation is associative
  • (xy)z = x(yz) = xyz
  • The empty string is the identity for
    concatenation
  • x/\ = /\x = x

5
Language Concatenation
  • Suppose L1 and L2 are languages (sets of
    strings).
  • The concatenation of L1 and L2, denoted L1L2, is
    defined as
  • L1L2 = { xy | x ∈ L1 and y ∈ L2 }
  • Example
  • Let L1 = {ab, bba} and L2 = {aa, b, ba}
  • What is L1L2?
  • Solution
  • Let x1 = ab, x2 = bba, y1 = aa, y2 = b, y3 = ba
  • L1L2 = {x1y1, x1y2, x1y3, x2y1, x2y2, x2y3}
    = {abaa, abb, abba, bbaaa, bbab, bbaba}
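
The definition of language concatenation can be sketched in C++ as follows. This is a minimal illustration, assuming a finite language is stored as a std::set of strings; the names Language and concat are chosen here for the example and do not come from the slides.

```cpp
#include <set>
#include <string>

// Assumption: a finite language is represented as a set of strings.
using Language = std::set<std::string>;

// L1L2 = { xy | x in L1 and y in L2 }
Language concat(const Language& l1, const Language& l2) {
    Language result;
    for (const std::string& x : l1)
        for (const std::string& y : l2)
            result.insert(x + y);  // the set drops duplicate strings
    return result;
}
```

Calling concat with {ab, bba} and {aa, b, ba} reproduces the six strings of the worked solution. Concatenating with the empty set yields the empty set, because the inner loop never runs.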

6
Language Concatenation is not commutative
  • Let L1 = {aa, bb, ba} and L2 = {/\, aba}
  • Let x1 = aa, x2 = bb, x3 = ba, y1 = /\, y2 = aba
  • L1L2 = {x1y1, x1y2, x2y1, x2y2, x3y1, x3y2}
    = {aa, aaaba, bb, bbaba, ba, baaba}
  • L2L1 = {y1x1, y1x2, y1x3, y2x1, y2x2, y2x3}
    = {aa, bb, ba, abaaa, ababb, ababa}
  • L2L2 = {y1y1, y1y2, y2y1, y2y2}
    = {/\, aba, aba, abaaba} = {/\, aba, abaaba}
    (dropped extra aba)

7
Associativity of Language Concatenation
  • (L1L2)L3 = L1(L2L3) = L1L2L3
  • Example
  • Let L1 = {a,b}, L2 = {c,d}, and L3 = {e,f}
  • L1L2L3 = ({a,b}{c,d}){e,f} = {ac, ad, bc, bd}{e,f}
    = {ace, acf, ade, adf, bce, bcf, bde, bdf}
  • L1L2L3 = {a,b}({c,d}{e,f}) = {a,b}{ce, cf, de, df}
    = {ace, acf, ade, adf, bce, bcf, bde, bdf}

8
Special Cases
  • What language is the identity for language
    concatenation?
  • The set containing only the empty string /\: {/\}
  • Example
  • {aab,ba,abc}{/\} = {/\}{aab,ba,abc}
    = {aab,ba,abc}
  • What about {} (the empty language)?
  • For any language L, L{} = {}L = {}
  • Thus {} for concatenation is like 0 for
    multiplication
  • Example
  • {aab,ba,abc}{} = {}{aab,ba,abc} = {}
  • The intuitive reason is that we must choose a
    string from both sets that are being
    concatenated, but there is nothing to choose from
    in {}.

9
Exponentiation
  • We use exponentiation to indicate the number of
    items being concatenated
  • Symbols
  • Strings
  • Set of symbols (Σ for example)
  • Set of strings (languages)
  • a^3 = aaa
  • x^3 = xxx
  • Σ^3 = ΣΣΣ = { x ∈ Σ* | |x| = 3 }
  • L^3 = LLL

10
Examples of Exponentiation
  • Let x = abb, Σ = {a,b}, L = {ab,b}
  • a^4 = aaaa
  • x^3 = (abb)(abb)(abb) = abbabbabb
  • Σ^3 = ΣΣΣ = {a,b}{a,b}{a,b} = {aaa, aab, aba, abb,
    baa, bab, bba, bbb}
  • L^3 = LLL = {ab,b}{ab,b}{ab,b}
    = {ababab, ababb, abbab, abbb,
    babab, babb, bbab, bbb}
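
Exponentiation of a language is repeated concatenation, which the sketch below makes explicit. It assumes finite languages stored as std::set of strings; the names Language, concat and power are illustrative and not taken from the slides. The empty std::string stands for the empty string /\.

```cpp
#include <set>
#include <string>

// Assumption: a finite language is represented as a set of strings.
using Language = std::set<std::string>;

// L1L2 = { xy | x in L1 and y in L2 }
Language concat(const Language& l1, const Language& l2) {
    Language result;
    for (const std::string& x : l1)
        for (const std::string& y : l2)
            result.insert(x + y);
    return result;
}

// L^k: concatenate k copies of L, starting from the identity {/\}.
Language power(const Language& l, unsigned k) {
    Language result;
    result.insert(std::string());  // the language containing only the empty string
    for (unsigned i = 0; i < k; ++i)
        result = concat(result, l);
    return result;
}
```

Starting from the language that contains only the empty string means power returns that language whenever k is 0, even when L itself is empty.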

11
Results of Exponentiation
  • Exponentiation of a symbol or a string results in
    a string.
  • Exponentiation of a set of symbols or a set of
    strings results in a set of strings
  • a symbol → a string
  • a string → a string
  • a set of symbols → a set of strings
  • a set of strings → a set of strings

12
Special Cases of Exponentiation
  • a^0 = /\
  • x^0 = /\
  • Σ^0 = {/\}
  • L^0 = {/\} for any language L
  • {aa,bb}^0 = {/\}
  • {a, aa, aaa, aaaa, ...}^0 = {/\}
  • {/\}^0 = {/\}
  • ∅^0 = {}^0 = {/\}

13
Kleene Star
  • Kleene * is a unary operation on languages.
  • Kleene * is not an operation on strings
  • However, see the pages on regular expressions.
  • L* represents any finite number of concatenations
    of L.
  • L* = ∪(k ≥ 0) L^k = L^0 ∪ L^1 ∪ L^2 ∪ ...
  • For any L, /\ is always an element of L*
  • because L^0 = {/\}
  • Thus, for any L, L* ≠ ∅

14
Example of Kleene Star
  • Let L = {aa}
  • L^0 = {/\}
  • L^1 = L = {aa}
  • L^2 = {aaaa}
  • L^3 = ...
  • L* = L^0 ∪ L^1 ∪ L^2 ∪ L^3 ∪ ...
  • = {/\, aa, aaaa, aaaaaa, ...}
  • = set of all strings that can be obtained by
    concatenating 0 or more copies of aa

15
Example of Kleene Star
  • Let L = {aa, b}
  • L^0 = {/\}
  • L^1 = L = {aa, b}
  • L^2 = LL = {aaaa, aab, baa, bb}
  • L^3 = ...
  • L* = L^0 ∪ L^1 ∪ L^2 ∪ L^3 ∪ ...
  • = set of all strings that can be obtained by
    concatenating 0 or more copies of aa and b
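
L* is infinite whenever L contains a nonempty string, so a program can only list part of it. The sketch below lists every string of L* up to a chosen length. The length bound maxLen and the names Language and kleeneStarUpTo are assumptions made for this example only.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Assumption: a finite language is represented as a set of strings.
using Language = std::set<std::string>;

// All strings of L* whose length is at most maxLen.
Language kleeneStarUpTo(const Language& l, std::size_t maxLen) {
    Language result;
    result.insert(std::string());  // L^0 = {/\}, so /\ is always in L*
    Language frontier = result;    // strings discovered in the previous round
    while (!frontier.empty()) {
        Language next;
        for (const std::string& x : frontier)
            for (const std::string& y : l) {
                const std::string xy = x + y;
                // Keep only new strings that respect the length bound.
                if (xy.size() <= maxLen && result.insert(xy).second)
                    next.insert(xy);
            }
        frontier = next;
    }
    return result;
}
```

The loop stops because only finitely many strings fit under the bound, and a string is extended further only the first time it is found.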

16
Regular Languages
  • Regular languages are languages that can be
    obtained from the very simple languages over Σ,
    using only
  • Union
  • Concatenation
  • Kleene Star
  • See lecture 14 and pages 71-75 of the text

17
Examples of Regular Languages
  • {aab} (i.e. {a}{a}{b})
  • {aa,b} (i.e. {a}{a} ∪ {b})
  • {a,b}* = language of strings that can be
    obtained by concatenating any number of a's and
    b's
  • {bb}{a,b}* = language of strings that begin with
    bb (followed by any number of a's and b's)
  • {a}*{bb,/\} = language of strings that begin
    with any number of a's and end with an optional
    bb.
  • {a}* ∪ {b}* = language of strings that consist of
    only a's or only b's, including /\.

18
Regular Expressions
  • We can simplify the formula for regular languages
    slightly by
  • leaving out the set brackets and
  • replacing ∪ with +
  • The results are called regular expressions.

19
Examples of Regular Expressions
Set notation     Regular Expressions
{aab}            aab
{aa,b}           aa ∪ b         aa+b
{a,b}*           (a ∪ b)*       (a+b)*
{bb}{a,b}*       bb(a ∪ b)*     bb(a+b)*
{a}*{bb,/\}      a*(bb ∪ /\)    a*(bb+/\)
{a}* ∪ {b}*      a* ∪ b*        a*+b*
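
Regular expressions like these can be tried out with the C++ standard library. The sketch below is only an illustration: std::regex uses ECMAScript syntax by default, which writes union as | rather than +, and an optional bb is written (bb)? here instead of (bb+/\). The helper name inLanguage is an assumption of this example.

```cpp
#include <regex>
#include <string>

// True if the whole string s is matched by the ECMAScript pattern,
// i.e. s belongs to the language the pattern represents.
bool inLanguage(const std::string& s, const std::string& pattern) {
    return std::regex_match(s, std::regex(pattern));
}
```

For example, inLanguage("aaabb", "a*(bb)?") tests membership in the language of a*(bb+/\), and inLanguage("bbab", "bb(a|b)*") tests membership in the language of bb(a+b)*. std::regex_match requires the entire string to match, which is the behavior needed for language membership.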
20
String or Language?
  • Consider the regular expression a*(bb+/\)
  • a*(bb+/\) is a string over alphabet {a, b, +, *,
    /\, (, ), ∅}
  • a*(bb+/\) represents a language over alphabet {a,
    b}
  • It represents the language of strings over {a,b}
    that begin with any number of a's and end with an
    optional bb.
  • Some regular expressions look just like strings
    over alphabet {a,b}
  • Regular expression aaba represents the language
    {aaba}
  • Regular expression /\ represents the language
    {/\}
  • It should be clear from the context whether a
    sequence of symbols is a regular expression or
    just a string.

21
Module 1: Course Overview
  • Course: CSE 460
  • Instructor: Dr. Eric Torng
  • TA: To be determined

22
What is this course?
  • Philosophy of computing course
  • We take a step back to think about computing in
    broader terms
  • Science of computing course
  • We study fundamental ideas/results that shape the
    field of computer science
  • Applied computing course
  • We study a broad range of material with
    relevance to computing today

23
Philosophy
  • Phil. of life
  • What is the purpose of life?
  • What are we capable of accomplishing in life?
  • Are there limits to what we can do in life?
  • Why do we drive on parkways and park on
    driveways?
  • Phil. of computing
  • What is the purpose of programming?
  • What can we achieve through programming?
  • Are there limits to what we can do with programs?
  • Why don't debuggers actually debug programs?

24
Science
  • Physics
  • Study of fundamental physical laws and phenomena
    like gravity and electricity
  • Engineering
  • Governed by physical laws
  • Our material
  • Study of fundamental computational laws and
    phenomena like undecidability and universal
    computers
  • Programming
  • Governed by computational laws

25
Applied computing
  • Applications are not immediately obvious
  • In some cases, seeing the applicability of this
    material requires advanced abstraction skills
  • Every year, there are people who leave this
    course unable to see the applicability of the
    material
  • Others require more material in order to
    completely understand their application
  • for example, to understand how regular
    expressions and context-free grammars are applied
    to the design of compilers, you need to take a
    compilers course

26
Some applications
  • Important programming languages
  • regular expressions (perl)
  • finite state automata (used in hardware design)
  • context-free grammars
  • Proofs of program correctness
  • Subroutines
  • Using them to prove problems are unsolvable
  • String searching/Pattern matching
  • Algorithm design concepts such as recursion

27
Fundamental Theme
  • What are the capabilities and limitations of
    computers and computer programs?
  • What can we do with computers/programs?
  • Are there things we cannot do with
    computers/programs?

28
Module 2 Fundamental Concepts
  • Problems
  • Programs
  • Programming languages

29
Problems
  • We view solving problems as the main application
    for computer programs

30
Definition
  • A problem is a mapping or function between a set
    of inputs and a set of outputs
  • Example Problem: Sorting
  • (4,2,3,1) → (1,2,3,4)
  • (3,1,2,4) → (1,2,3,4)
  • (7,5,1) → (1,5,7)
  • (1,2,3) → (1,2,3)
31
How to specify a problem
  • Input
  • Describe what an input instance looks like
  • Output
  • Describe what task should be performed on the
    input
  • In particular, describe what output should be
    produced

32
Example Problem Specifications
  • Sorting problem
  • Input
  • Integers n1, n2, ..., nk
  • Output
  • n1, n2, ..., nk in nondecreasing order
  • Find element problem
  • Input
  • Integers n1, n2, ..., nk
  • Search key S
  • Output
  • yes if S is in n1, n2, ..., nk, no otherwise

33
Programs
  • Programs solve problems

34
Purpose
  • Why do we write programs?
  • One answer
  • To solve problems
  • What does it mean to solve a problem?
  • Informal answer For every legal input, a correct
    output is produced.
  • Formal answer To be given later

35
Programming Language
  • Definition
  • A programming language defines what constitutes a
    legal program
  • Example: a pseudocode program may not be a legal
    C program, which may not be a legal C++ program
  • A programming language is typically referred to
    as a computational model in a course like this.

36
C++
  • Our programming language will be C++ with minor
    modifications
  • Main procedure will use input parameters in a
    fashion similar to other procedures
  • no argc/argv
  • Output will be returned
  • type specified by main function type

37
Maximum Element Problem
  • Input
  • integer n >= 1
  • List of n integers
  • Output
  • The largest of the n integers

38
C++ Program which solves the Maximum Element
Problem
  • int main(int *A, int n) {
  • int i, max;
  • if (n < 1)
  • return ("Illegal Input");
  • max = A[0];
  • for (i = 1; i < n; i++)
  • if (A[i] > max)
  • max = A[i];
  • return (max);
  • }
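
The slide uses the course's modified language, in which main takes the input as parameters and returns the output. For comparison, here is a sketch in standard C++. It swaps in an ordinary function over std::vector and reports illegal input with an exception; the name maxElement and the exception choice are assumptions of this example, not part of the slides.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns the largest of the integers in a.
// Assumption: an empty list is the illegal input and is reported by throwing.
int maxElement(const std::vector<int>& a) {
    if (a.empty())
        throw std::invalid_argument("Illegal Input");
    int max = a[0];
    for (std::size_t i = 1; i < a.size(); ++i)
        if (a[i] > max)
            max = a[i];
    return max;
}
```

The loop is the same as on the slide: start with the first element and replace the running maximum whenever a larger element is seen.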

39
Fundamental Theme
  • Exploring capabilities and limitations of C++
    programs

40
Restating the Fundamental Theme
  • We will study the capabilities and limits of C++
    programs
  • Specifically, we will try to identify
  • What problems can be solved by C++ programs
  • What problems cannot be solved by C++ programs

41
Question
  • Is C++ general enough?
  • Or is it possible that there exists some problem
    Π such that
  • Π can be solved by some program P in some other
    reasonable programming language
  • but Π cannot be solved by any C++ program?

42
Church's Thesis (modified)
  • We have no proof of an answer, but it is commonly
    accepted that the answer is no.
  • Church's Thesis (three identical statements)
  • C++ is a general model of computation
  • Any algorithm can be expressed as a C++ program
  • If some algorithm cannot be expressed by a C++
    program, it cannot be expressed in any reasonable
    programming language

43
Summary
  • Problems
  • When we talk about what programs can or cannot
    DO, we mean what PROBLEMS can or cannot be
    solved

44
Module 3 Classifying Problems
  • One of the main themes of this course will be to
    classify problems in various ways
  • By solvability
  • Solvable, half-solvable, unsolvable
  • We will focus our study on decision problems
  • function (one correct answer for every input)
  • finite range (yes or no is the correct output)

45
Classification Process
  • Take some set of problems and partition it into
    two or more subsets of problems where membership
    in a subset is based on some shared problem
    characteristic

46
Classify by Solvability
  • Criteria used is whether or not the problem is
    solvable
  • that is, does there exist a C++ program which
    solves the problem?

47
Function Problems
  • We will focus on problems where the mapping from
    input to output is a function

48
General (Relation) Problem
  • the mapping is a relation
  • that is, more than one output is possible for a
    given input

49
Criteria for Function Problems
  • mapping is a function
  • unique output for each input

50
Example Non-Function Problem
  • Divisor Problem
  • Input: Positive integer n
  • Output: A positive integral divisor of n
  • Example input: 9
51
Example Function Problems
  • Sorting
  • Multiplication Problem
  • Input: 2 integers x and y
  • Output: xy
  • Example input: 2, 5
52
Another Example
  • Maximum divisor problem
  • Input: Positive integer n
  • Output: size of maximum divisor of n smaller than
    n
  • Example input: 9
53
Decision Problems
  • We will focus on function problems where the
    correct answer is always yes or no

54
Criteria for Decision Problems
  • Output is yes or no
  • range = {Yes, No}
  • Note, problem must be a function problem
  • only one of Yes/No is correct

55
Example
  • Decision sorting
  • Input: list of integers
  • Yes/No question: Is the list in nondecreasing
    order?

56
Another Example
  • Decision multiplication
  • Input: Three integers x, y, z
  • Yes/No question: Is xy = z?

57
A Third Example
  • Decision Divisor Problem
  • Input: Two integers x and y
  • Yes/No question: Is y a divisor of x?
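
The three example decision problems can each be sketched as a C++ function that returns true for yes and false for no. The function names are illustrative. Two choices are assumptions of this sketch and not stated on the slides: arithmetic overflow in the multiplication is ignored, and y = 0 is answered with no.

```cpp
#include <cstddef>
#include <vector>

// Decision sorting: is the list in nondecreasing order?
bool decisionSorting(const std::vector<int>& list) {
    for (std::size_t i = 1; i < list.size(); ++i)
        if (list[i - 1] > list[i])
            return false;
    return true;
}

// Decision multiplication: is xy = z? (overflow is ignored in this sketch)
bool decisionMultiplication(long long x, long long y, long long z) {
    return x * y == z;
}

// Decision divisor: is y a divisor of x?
// Assumption: y = 0 is answered with "no".
bool decisionDivisor(long long x, long long y) {
    if (y == 0)
        return false;
    if (y == 1 || y == -1)
        return true;  // avoids x % -1, which can overflow
    return x % y == 0;
}
```

Each function gives exactly one of the two answers for every input, which is what makes these decision problems function problems.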

58
Focus on Decision Problems
  • When studying solvability, we are going to focus
    specifically on decision problems
  • There is no loss of generality, but we will not
    explore that here

59
Finite Domain Problems
  • These problems have only a finite number of inputs

60
Lack of Generality
  • All finite domain problems can be solved using
    table lookup idea

61
Table Lookup Program
  • int main(string x) {
  • switch (x) {
  • case "Bill": return(3);
  • case "Judy": return(25);
  • case "Tom": return(30);
  • default: cerr << "Illegal input\n";
  • }
  • }
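
Standard C++ does not allow switch on a string, so the slide's program is best read as pseudocode in the course's modified language. The same table lookup idea can be sketched with std::map. The function name lookup and the use of an exception for illegal input are assumptions of this example.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Table lookup for a finite domain problem: every legal input is a key.
int lookup(const std::string& x) {
    static const std::map<std::string, int> table = {
        {"Bill", 3}, {"Judy", 25}, {"Tom", 30}};
    const auto it = table.find(x);
    if (it == table.end())
        throw std::invalid_argument("Illegal input");  // input outside the finite domain
    return it->second;
}
```

Because the domain is finite, the whole problem is encoded in the table and no computation beyond the lookup is required.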

62
Key Concepts
  • Classification Theme
  • Decision Problems
  • Important subset of problems
  • We can focus our attention on decision problems
    without loss of generality
  • Same is not true for finite domain problems
  • Table lookup

63
Module 4: Formal Definition of Solvability
  • Analysis of decision problems
  • Two types of inputs: yes inputs and no inputs
  • Language recognition problem
  • Analysis of programs which solve decision
    problems
  • Four types of inputs: yes, no, crash, loop inputs
  • Solving and not solving decision problems
  • Classifying Decision Problems
  • Formal definition of solvable and unsolvable
    decision problems

64
Analyzing Decision Problems
  • Can be defined by two sets

65
Decision Problems and Sets
  • Decision problems consist of 3 sets
  • The set of legal input instances (or universe of
    input instances)
  • The set of yes input instances
  • The set of no input instances

66
Redundancy
  • Only two of these sets are needed; the third is
    redundant
  • Given
  • The set of legal input instances (or universe of
    input instances)
  • This is given by the description of a typical
    input instance
  • The set of yes input instances
  • This is given by the yes/no question
  • We can compute
  • The set of no input instances

67
Typical Input Universes
  • Σ*: The set of all finite length strings over
    finite alphabet Σ
  • Examples
  • {a}* = {/\, a, aa, aaa, aaaa, aaaaa, ...}
  • {a,b}* = {/\, a, b, aa, ab, ba, bb, aaa, aab, aba,
    abb, ...}
  • {0,1}* = {/\, 0, 1, 00, 01, 10, 11, 000, 001, 010,
    011, ...}
  • The set of all integers
  • If the input universe is understood, a decision
    problem can be specified by just giving the set
    of yes input instances

68
Language Recognition Problem
  • Input Universe
  • Σ* for some finite alphabet Σ
  • Yes input instances
  • Some set L subset of Σ*
  • No input instances
  • Σ* - L
  • When Σ is understood, a language recognition
    problem can be specified by just stating what L
    is.

69
Language Recognition Problem
  • Traditional Formulation
  • Input
  • A string x over some finite alphabet Σ
  • Task
  • Is x in some language L subset of Σ*?
  • 3 set formulation
  • Input Universe
  • Σ* for a finite alphabet Σ
  • Yes input instances
  • Some set L subset of Σ*
  • No input instances
  • Σ* - L
  • When Σ is understood, a language recognition
    problem can be specified by just stating what L
    is.

70
Equivalence of Decision Problems and Languages
  • All decision problems can be formulated as
    language recognition problems
  • Simply develop an encoding scheme for
    representing all inputs of the decision problem
    as strings over some fixed alphabet Σ
  • The corresponding language is just the set of
    strings encoding yes input instances
  • In what follows, we will often use decision
    problems and languages interchangeably

71
Visualization
72
Analyzing Programs which Solve Decision Problems
  • Four possible outcomes

73
Program Declaration
  • Suppose a program P is designed to solve some
    decision problem Π. What does P's declaration
    look like?
  • What should P return on a yes input instance?
  • What should P return on a no input instance?

74
Program Declaration II
  • Suppose a program P is designed to solve a
    language recognition problem Π. What does P's
    declaration look like?
  • bool main(string x)
  • We will assume that the string declaration is
    correctly defined for the input alphabet Σ
  • If Σ = {a,b}, then string will define variables
    consisting of only a's and b's
  • If Σ = {a, ..., z, A, ..., Z}, then string will
    define variables consisting of any string of
    alphabet characters

75
Programs and Inputs
  • Notation
  • P denotes a program
  • x denotes an input for program P
  • 4 possible outcomes of running P on x
  • P halts and says yes: P accepts input x
  • P halts and says no: P rejects input x
  • P halts without saying yes or no: P crashes on
    input x
  • We typically ignore this case as it can be
    combined with rejects
  • P never halts: P infinite loops on input x

76
Programs and the Set of Legal Inputs
  • Based on the 4 possible outcomes of running P on
    x, P partitions the set of legal inputs into 4
    groups
  • Y(P): The set of inputs P accepts
  • When the problem is a language recognition
    problem, Y(P) is often represented as L(P)
  • N(P): The set of inputs P rejects
  • C(P): The set of inputs P crashes on
  • I(P): The set of inputs P infinite loops on
  • Because L(P) is often used in place of Y(P) as
    described above, we use notation I(P) to
    represent this set

77
Illustration
All Inputs
I(P)
78
Analyzing Programs and Decision Problems
  • Distinguish the two carefully

79
Program solving a decision problem
  • Formal Definition
  • A program P solves decision problem Π if and only
    if
  • The set of legal inputs for P is identical to the
    set of input instances of Π
  • Y(P) is the same as the set of yes input
    instances for Π
  • N(P) is the same as the set of no input instances
    for Π
  • Otherwise, program P does not solve problem Π
  • Note: C(P) and I(P) must be empty in order for P
    to solve problem Π

80
Solvable Problem
  • A decision problem Π is solvable if and only if
    there exists some C++ program P which solves Π
  • When the decision problem is a language
    recognition problem for language L, we often say
    that L is solvable or L is decidable
  • A decision problem Π is unsolvable if and only if
    all C++ programs P do not solve Π
  • Similar comment as above

81
Illustration of Solvability
Inputs of Program P
Y(P)
N(P)
82
Program half-solving a problem
  • Formal Definition
  • A program P half-solves problem Π if and only if
  • The set of legal inputs for P is identical to the
    set of input instances of Π
  • Y(P) is the same as the set of yes input
    instances for Π
  • N(P) union C(P) union I(P) is the same as the set
    of no input instances for Π
  • Otherwise, program P does not half-solve problem
    Π
  • Note: C(P) and I(P) need not be empty

83
Half-solvable Problem
  • A decision problem Π is half-solvable if and only
    if there exists some C++ program P which
    half-solves Π
  • When the decision problem is a language
    recognition problem for language L, we often say
    that L is half-solvable
  • A decision problem Π is not half-solvable if and
    only if all C++ programs P do not half-solve Π

84
Illustration of Half-Solvability
Inputs of Program P
Y(P)
N(P)
85
Hierarchy of Decision Problems
All decision problems
The set of half-solvable decision problems is a
proper subset of the set of all decision
problems.
The set of solvable decision problems is a proper
subset of the set of half-solvable decision
problems.
86
Why study half-solvable problems?
  • A correct program must halt on all inputs
  • Why then do we define and study half-solvable
    problems?
  • One answer: the set of half-solvable problems is
    the natural class of problems associated with
    general computational models like C++
  • Every program half-solves some decision problem
  • Some programs do not solve any decision problem
  • In particular, programs which do not halt do not
    solve their corresponding decision problems

87
Key Concepts
  • Four possible outcomes of running a program on an
    input
  • The four subsets every program divides its set of
    legal inputs into
  • Formal definition of
  • a program solving (half-solving) a decision
    problem
  • a problem being solvable (half-solvable)
  • Be precise with the above two statements!

88
Module 5
  • Topics
  • Proof of the existence of unsolvable problems
  • Proof Technique
  • There are more problems/languages than there are
    programs/algorithms
  • Countable and uncountable infinities

89
Overview
  • We will show that there are more problems than
    programs
  • Actually more problems than programs in any
    computational model (programming language)
  • Implication
  • Some problems are not solvable

90
Preliminaries
  • Define set of problems
  • Observation about programs

91
Define set of problems
  • We will restrict the set of problems to be the
    set of language recognition problems over the
    alphabet {a}.
  • That is
  • Universe: {a}*
  • Yes Inputs: Some language L subset of {a}*
  • No Inputs: {a}* - L

92
Set of Problems
  • The number of distinct problems is given by the
    number of languages L subset of {a}*
  • 2^({a}*) is our shorthand for this set of subset
    languages
  • Examples of languages L subset of {a}*
  • 0 elements: {}
  • 1 element: {/\}, {a}, {aa}, {aaa}, {aaaa}, ...
  • 2 elements: {/\, a}, {/\, aa}, {a, aa}, ...
  • Infinite number of elements: {a^n | n is even},
    {a^n | n is prime}, {a^n | n is a perfect square}

93
Infinity and {a}*
  • All strings in {a}* have finite length
  • The number of strings in {a}* is infinite
  • The number of languages L in 2^({a}*) is infinite
  • The number of strings in a language L in 2^({a}*)
    may be finite or infinite

94
Define set of programs
  • The set of programs we will consider is the set
    of legal C++ programs as defined in earlier
    lectures
  • Key Observation
  • Each C++ program can be thought of as a finite
    length string over alphabet ΣP
  • ΣP = {a, ..., z, A, ..., Z, 0, ..., 9, white space,
    punctuation}

95
Example
  • int main(int *A, int n) {      26 characters including newline
  • int i, max;                    13 characters including initial tab
  • (blank line)                   1 character (newline)
  • if (n < 1)                     12 characters
  • return ("Illegal Input");      28 characters including 2 tabs
  • max = A[0];                    13 characters
  • for (i = 1; i < n; i++)        25 characters
  • if (A[i] > max)                18 characters
  • max = A[i];                    15 characters
  • return (max);                  15 characters
  • }                              2 characters including newline

96
Number of programs
  • The set of legal C++ programs is clearly infinite
  • It is also no larger than ΣP*
  • ΣP = {a, ..., z, A, ..., Z, 0, ..., 9, white space,
    punctuation}

97
Goal
  • Show that the number of languages L in 2^({a}*) is
    greater than the number of strings in ΣP*
  • ΣP = {a, ..., z, A, ..., Z, 0, ..., 9, white space,
    punctuation}
  • Problem
  • Both are infinite

98
How do we compare the relative sizes of infinite
sets?
  • Bijection (yes)
  • Proper subset (no)

99
Bijections
  • Two sets have EQUAL size if there exists a
    bijection between them
  • A bijection is a 1-1 and onto function between two
    sets
  • Examples
  • Set {1, 2, 3} and Set {A, B, C}
  • Positive even numbers and positive integers

100
Bijection Example
  • Positive Integers ↔ Positive Even Integers
  • 1 ↔ 2
  • 2 ↔ 4
  • 3 ↔ 6
  • ...
  • i ↔ 2i
  • ...

101
Proper subset
  • Finite sets
  • S1 proper subset of S2 implies S2 is strictly
    bigger than S1
  • Example
  • women proper subset of people
  • number of women less than number of people
  • Infinite sets
  • Counterexample
  • even numbers and integers

102
Two sizes of infinity
  • Countable
  • Uncountable

103
Countably infinite set S
  • Definition 1
  • S is equal in size (bijection) to N
  • N is the set of natural numbers {1, 2, 3, ...}
  • Definition 2 (Key property)
  • There exists a way to list all the elements of
    set S (enumerate S) such that the following is
    true
  • Every element appears at a finite position in the
    infinite list

104
Uncountable infinity
  • Any set which is not countably infinite
  • Examples
  • Set of real numbers
  • 2^({a}*), the set of all languages L which are a
    subset of {a}*
  • Further gradations within this set, but we ignore
    them

105
Proof
106
(1) The set of all legal C++ programs is
countably infinite
  • Every C++ program is a finite string
  • Thus, the set of all legal C++ programs is a
    language LC
  • This language LC is a subset of ΣP*

107
For any alphabet Σ, Σ* is countably infinite
  • Enumeration ordering
  • All length 0 strings
  • |Σ|^0 = 1 string: λ
  • All length 1 strings
  • |Σ| strings
  • All length 2 strings
  • |Σ|^2 strings
  • ...
  • Thus, ΣP* is countably infinite

108
Example with alphabet {a,b}
  • Length 0 strings
  • 0 and λ
  • Length 1 strings
  • 1 and a, 2 and b
  • Length 2 strings
  • 3 and aa, 4 and ab, 5 and ba, 6 and bb, ...
  • Question
  • write a program that takes a number as input and
    computes the corresponding string as output
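
One possible answer to the question above is sketched here. It follows the enumeration order of the slide (shorter strings first, then alphabet order within a length), so with alphabet ab the numbers 0, 1, 2, 3, ... map to λ, a, b, aa, .... The function name nthString and the default alphabet parameter are assumptions of this example.

```cpp
#include <string>

// Returns the string at position n (counting from 0) when all strings over
// `alphabet` are listed by length, then in alphabet order within a length.
std::string nthString(unsigned long long n, const std::string& alphabet = "ab") {
    const unsigned long long k = alphabet.size();
    std::string result;
    if (k == 0)
        return result;  // over an empty alphabet only the empty string exists
    while (n > 0) {
        --n;  // shift so that the k symbols cover remainders 0 .. k-1
        result.insert(result.begin(), alphabet[n % k]);
        n /= k;
    }
    return result;
}
```

Every string receives exactly one number and every number yields exactly one string, so the function is a bijection between the nonnegative integers and the strings over the alphabet. This is the property that makes the set of strings countably infinite.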

109
(2) The set of languages in 2^({a}*) is uncountably
infinite
  • Diagonalization proof technique
  • Algorithmic proof
  • Typically presented as a proof by contradiction

110
Algorithm Overview
  • To prove this set is uncountably infinite, we
    construct an algorithm D that behaves as follows
  • Input
  • A countably infinite list L of languages, each a
    subset of {a}*
  • Output
  • A language D(L) which is a subset of {a}* that
    is not on list L

111
Visualizing D
List L L0 L1 L2 L3 ...
Language D(L) not in list L
112
Why existence of D implies result
  • If the number of languages in 2^({a}*) is countably
    infinite, there exists a list L s.t.
  • L is complete
  • it contains every language in 2^({a}*)
  • L is countably infinite
  • The existence of algorithm D implies that no list
    of languages in 2^({a}*) is both complete and
    countably infinite
  • Specifically, the existence of D shows that any
    countably infinite list of languages is not
    complete

113
Visualizing One Possible L
  • Columns: λ, a, aa, aaa, aaaa, ...
  • Rows: L0, L1, L2, L3, L4, ...
  • Number of rows is countably infinite
  • Given
  • Number of columns is countably infinite
  • {a}* is countably infinite
  • Consider each string to be a feature
  • A set contains or does not contain each string
114
Constructing D(L )
  • We construct D(L) by using a unique feature
    (string) to differentiate D(L) from Li
  • Typically use ith string for language Li
  • Thus the name diagonalization

        λ     a     aa    aaa   aaaa  ...
L0      IN    IN    IN    IN    IN
L1      OUT   IN    IN    IN    OUT
L2      OUT   OUT   OUT   OUT   OUT
L3      IN    IN    OUT   OUT   OUT
L4      IN    IN    OUT   OUT   OUT
...
D(L)    OUT   OUT   IN    IN    IN    ...
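
The diagonal construction can be sketched in C++ under some assumptions. A language over {a} is represented by a membership test on n, meaning "is a^n in the language?", and the list is represented by a function from i to Li. The names Language, LanguageList, diagonal and the sample list multiplesList are all introduced for this example only.

```cpp
#include <functional>

// Assumption: a language over {a} is a membership test on n ("is a^n in it?").
using Language = std::function<bool(unsigned)>;
// Assumption: the list maps position i to the language Li.
using LanguageList = std::function<Language(unsigned)>;

// D(L): a^i is IN exactly when a^i is OUT of Li, so D(L) differs from every Li.
Language diagonal(const LanguageList& list) {
    return [list](unsigned i) { return !list(i)(i); };
}

// Hypothetical sample list: Li = { a^n | n is a multiple of i + 1 }.
Language multiplesList(unsigned i) {
    return [i](unsigned n) {
        return n % (static_cast<unsigned long long>(i) + 1) == 0;
    };
}
```

For any i, diagonal(list)(i) and list(i)(i) disagree by construction, so D(L) cannot equal any language on the list. That is the whole argument; the sample list only makes it concrete.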
115
Questions
        λ     a     aa    aaa   aaaa  ...
L0      IN    IN    IN    IN    IN
L1      OUT   IN    IN    IN    OUT
L2      OUT   OUT   OUT   OUT   OUT
L3      IN    IN    OUT   OUT   OUT
L4      IN    IN    OUT   OUT   OUT
...
  • Do we need to use the diagonal?
  • Every other column and every row?
  • Every other row and every column?
  • What properties are needed to construct D(L)?

116
Visualization
All problems
The set of solvable problems is a proper subset
of the set of all problems.
117
Summary
  • Equal size infinite sets bijections
  • Countable and uncountable infinities
  • More languages than algorithms
  • Number of algorithms countably infinite
  • Number of languages uncountably infinite
  • Diagonalization technique
  • Construct D(L) using infinite set of features
  • The set of solvable problems is a proper subset
    of the set of all problems

118
Module 6
  • Topics
  • Program behavior problems
  • Input of problem is a program/algorithm
  • Definition of type program
  • Program correctness
  • Testing versus Proving

119
Number Theory Problems
  • These are problems where we investigate
    properties of numbers
  • Primality
  • Input: Positive integer n
  • Yes/No Question: Is n a prime number?
  • Divisor
  • Input: Integers m,n
  • Yes/No question: Is m a divisor of n?

120
Graph Theory Problems
  • These are problems where we investigate
    properties of graphs
  • Connected
  • Input: Graph G
  • Yes/No Question: Is G a connected graph?
  • Subgraph
  • Input: Graphs G1 and G2
  • Yes/No question: Is G1 a subgraph of G2?

121
Program Behavior Problems
  • These are problems where we investigate
    properties of programs and how they behave
  • Give an example problem with one input program P
  • Give an example problem with two input programs
    P1 and P2

122
Program Representation
  • Program variables
  • Abstractly, we define the type program
  • graph G, program P
  • More concretely, we define type program to be a
    string over the program alphabet ΣP = {a, ..., z,
    A, ..., Z, 0, ..., 9, punctuation, white space}
  • Note, many strings over ΣP are not legal programs
  • We consider them to be programs that always crash
  • Possible declaration of main procedure
  • bool main(program P)

123
Program correctness
  • How do we determine whether or not a program P we
    have written is correct?
  • What are some weaknesses of this approach?
  • What might be a better approach?

124
Testing versus Analyzing
Test Inputs x1 x2 x3 ...
Outputs P(x1) P(x2) P(x3) ...
Analysis of Program P
Program P
125
2 Program Behavior Problems
  • Correctness
  • Input
  • Program P
  • Yes/No Question
  • Does P correctly solve the primality problem?
  • Functional Equivalence
  • Input
  • Programs P1, P2
  • Yes/No Question
  • Is program P1 functionally equivalent to program
    P2

126
Module 7
  • Halting Problem
  • Fundamental program behavior problem
  • A specific unsolvable problem
  • Diagonalization technique revisited
  • Proof more complex

127
Definition
  • Input
  • Program P
  • Assume the input to program P is a single
    unsigned int
  • This assumption is not necessary, but it
    simplifies the following unsolvability proof
  • To see the full generality of the halting
    problem, remove this assumption
  • Nonnegative integer x, an input for program P
  • Yes/No Question
  • Does P halt when run on x?
  • Notation
  • Use H as shorthand for halting problem when space
    is a constraint

128
Example Input
  • Program with one input of type unsigned int
  • bool main(unsigned int Q) {
  • int i = 2;
  • if ((Q == 0) || (Q == 1)) return false;
  • while (i < Q) {
  • if (Q % i == 0) return (false);
  • i++;
  • }
  • return (true);
  • }
  • Input x
  • 4

129
Three key definitions
130
Definition of list L
  • ΣP* is countably infinite where ΣP =
    {characters, digits, white space, punctuation}
  • Type program will be type string with ΣP as the
    alphabet
  • Define L to be the strings in ΣP* listed in
    enumeration order
  • length 0 strings first
  • length 1 strings next
  • ...
  • Every program is a string in ΣP*
  • For simplicity, consider only programs that have
  • one input
  • the type of this input is an unsigned int
  • Consider strings in ΣP* that are not legal
    programs to be programs that always crash (and
    thus halt on all inputs)

131
Definition of PH
  • If H is solvable, some program must solve H
  • Let PH be a procedure which solves H
  • We declare it as a procedure because we will use
    PH as a subroutine
  • Declaration of PH
  • bool PH(program P, unsigned int x)
  • In general, the type of x should be the type of
    the input to P
  • Comments
  • We do not know how PH works
  • However, if H is solvable, we can build programs
    which call PH as a subroutine

132
Definition of program D
  • bool main(unsigned int y) /* main for program D */
  • {
  •   program P = generate(y);
  •   if (PH(P,y)) while (1 > 0); else return (yes);
  • }
  • /* generate the yth string in SP* in enumeration
    order */
  • program generate(unsigned int y)
  • /* code for program of slide 21 from module 5
    did this for {a,b} */
  • bool PH(program P, unsigned int x)
  • /* how PH solves H is unknown */

133
Generating Py from y
  • We won't go into this in detail here
  • This was the basis of the question at the bottom
    of slide 21 of lecture 5 (alphabet for that
    problem was {a,b} instead of SP).
  • This is the main place where our assumption about
    the input type for program P is important
  • for other input types, how to do this would vary
  • Specification
  • 0 maps to program λ (the empty string)
  • 1 maps to program a
  • 2 maps to program b
  • 3 maps to program c
  • ...
  • 26 maps to program z
  • 27 maps to program A
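
The mapping from y to the yth string can be sketched as follows. This is only an illustration, not the course's generate procedure: it assumes the enumeration is by length first and then by alphabet order, and the default alphabet "ab" is a stand-in for SP. Passing an alphabet that begins a..z followed by A..Z gives the specification listed on this slide (26 maps to z, 27 maps to A).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the y-th string over `alphabet` in
// enumeration order (shorter strings first, ties broken by alphabet order).
// Index 0 is the empty string.
std::string generate(unsigned int y, const std::string& alphabet = "ab") {
    assert(!alphabet.empty());  // an empty alphabet has no enumeration
    const unsigned int k = static_cast<unsigned int>(alphabet.size());
    std::string s;
    // Bijective base-k numeration: peel off one symbol per iteration,
    // least significant symbol first, and prepend it to the result.
    while (y > 0) {
        y -= 1;
        s.insert(s.begin(), alphabet[y % k]);
        y /= k;
    }
    return s;
}
```

With the default alphabet, indices 0 through 6 give the empty string, a, b, aa, ab, ba, bb, so every string over the alphabet is reached at exactly one index.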

134
Proof that H is not solvable
135
Argument Overview
H is solvable
D is NOT on list L
136
Proving D is not on list L
  • Use list L to specify a program behavior B that
    is distinct from all real program behaviors (for
    programs with one input of type unsigned int)
  • Diagonalization argument similar to the one for
    proving the number of languages over {a} is
    uncountably infinite
  • No program P exists that exhibits program
    behavior B
  • Argue that D exhibits program behavior B
  • Thus D cannot exist and thus is not on list L

137
Non-existent program behavior B
138
Visualizing List L
  • [Table: rows P0, P1, P2, P3, P4, ...; columns
    0, 1, 2, 3, 4, ...]
  • The number of rows is countably infinite
  • SP* is countably infinite
  • The number of columns is countably infinite
  • Set of nonnegative integers is countably infinite
  • Consider each number to be a feature
  • A program halts or doesn't halt on each integer
  • We have a fixed L this time


139
Diagonalization to specify B
  • We specify a non-existent program behavior B by
    using a unique feature (number) to differentiate
    B from Pi
  • [Table: rows B, P0, P1, P2, P3, P4, ...; columns
    0, 1, 2, 3, 4, ...; each entry is H (halts) or
    NH (does not halt); B's entry in column i is the
    opposite of Pi's entry in column i]
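
The diagonal construction can be tried on a finite stand-in for the table. The sketch below is an illustration under assumptions: the real list L is infinite, while this uses a small square table of made-up halting behaviors (true for H, false for NH), and the names Row, diagonal_behavior and differs_from_every_row are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One row per program: row[i] is true if the program halts on input i.
using Row = std::vector<bool>;

// Build behavior B by flipping the diagonal: B halts on i exactly when
// program i does not halt on i. The table must be square.
Row diagonal_behavior(const std::vector<Row>& table) {
    Row b(table.size());
    for (std::size_t i = 0; i < table.size(); ++i) {
        assert(table[i].size() == table.size());
        b[i] = !table[i][i];
    }
    return b;
}

// B disagrees with row i in column i, so it can match no row.
bool differs_from_every_row(const Row& b, const std::vector<Row>& table) {
    for (std::size_t i = 0; i < table.size(); ++i) {
        if (b == table[i]) return false;
    }
    return true;
}
```

For any square table, differs_from_every_row(diagonal_behavior(table), table) holds, which is the finite analogue of "no program on list L exhibits behavior B".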
140
Arguing D exhibits program behavior B
141
Code for D
  • bool main(unsigned int y) /* main for program D */
  • {
  •   program P = generate(y);
  •   if (PH(P,y)) while (1 > 0); else return (yes);
  • }
  • /* generate the yth string in SP* in enumeration
    order */
  • program generate(unsigned int y)
  • /* code for extra credit program of slide 21
    from lecture 5 did this for {a,b} */
  • bool PH(program P, unsigned int x)
  • /* how PH solves H is unknown */

142
Visualization of D in action on input y
  • Program D with input y
  • (type for y unsigned int)
  • Given input y, generate the program (string) Py
  • Run PH on Py and y
  • Guaranteed to halt since PH solves H
  • if (PH(Py,y)) while (1 > 0); else return (yes);

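  • [Figure: table of halting behaviors with rows
    P0, P1, P2, ..., Py, ... and columns 0, 1, 2,
    ..., y, ...; entries are H (halts) or NH (does
    not halt); D's behavior on input y is compared
    with Py's entry in column y]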
143
Alternate Proof
144
Alternate Proof Overview
  • For every program Py, there is a number y that we
    associate with it
  • The number we use to distinguish program Py from
    D is this number y
  • Using this idea, we can arrive at a contradiction
    without explicitly using the table L
  • The diagonalization is hidden

145
H is not solvable, proof II
  • Assume H is solvable
  • Let PH be the program which solves H
  • Use PH to construct program D which cannot exist
  • Contradiction
  • This means program PH cannot exist.
  • This implies H is not solvable
  • D is the same as before

146
Arguing D cannot exist
  • If D is a program, it must have an associated
    number y
  • What does D do on this number y?
  • 2 cases
  • D halts on y
  • This means PH(D,y) = NO (by the definition of D)
  • This means D does not halt on y (because PH
    solves H)
  • Contradiction
  • This case is not possible

147
Continued
  • D does not halt on this number y
  • This means PH(D,y) = YES (by the definition of D)
  • This means D halts on y (because PH solves H)
  • Contradiction
  • This case is not possible
  • Neither case is possible, but one of them must
    hold if D exists
  • Thus D cannot exist
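
The two cases can be checked mechanically against any proposed answer for PH. The sketch below is only a model under stated assumptions: it does not run programs, it represents a candidate PH as a function on program numbers, and the names HaltClaim, d_halts_on and candidate_wrong_about_d are hypothetical.

```cpp
#include <cassert>  // for the checks in the usage note
#include <functional>

// A candidate for PH: claims whether program number p halts on input x.
using HaltClaim = std::function<bool(unsigned int p, unsigned int x)>;

// Models D's behavior instead of running it: D loops forever when the
// candidate says "P_y halts on y", and returns (halts) otherwise.
bool d_halts_on(unsigned int y, const HaltClaim& candidate) {
    return !candidate(y, y);
}

// If D is the program numbered d, the candidate's answer for (d, d)
// disagrees with what D actually does on input d.
bool candidate_wrong_about_d(unsigned int d, const HaltClaim& candidate) {
    return candidate(d, d) != d_halts_on(d, candidate);
}
```

Whatever candidate is supplied, assert(candidate_wrong_about_d(d, candidate)) passes for every d, which mirrors the argument that no PH can answer correctly about D on D's own number.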

148
Implications
  • The Halting Problem is one of the simplest
    problems we can formulate about program behavior
  • We can use the fact that it is unsolvable to show
    that other problems about program behavior are
    also unsolvable
  • This has important implications restricting what
    we can do in the field of software engineering
  • In particular, perfect debuggers/testers do not
    exist
  • We are forced to test programs for correctness
    even though this approach has many flaws

149
Summary
  • Halting Problem definition
  • Basic problem about program behavior
  • Halting Problem is unsolvable
  • We have identified a specific unsolvable problem
  • Diagonalization technique
  • Proof more complicated because we actually need
    to construct D, not just give a specification B

150
Module 8
  • Closure Properties
  • Definition
  • Language class definition
  • set of languages
  • Closure properties and first-order logic
    statements
  • For all, there exists

151
Closure Properties
  • A set is closed under an operation if applying
    the operation to elements of the set produces
    another element of the set
  • Example/Counterexample
  • set of integers and addition
  • set of integers and division

152
Integers and Addition
[Figure: the result 7 lies inside the set of Integers]
153
Integers and Division
[Figure: 2 / 5 = .4, which lies outside the set of Integers]
154
Language Classes
  • We will be interested in closure properties of
    language classes
  • A language class is a set of languages
  • Thus, the elements of a language class (set of
    languages) are languages which are sets
    themselves
  • Crucial Observation
  • When we say that a language class is closed under
    some set operation, we apply the set operation to
    the languages (elements of the language classes)
    rather than the language classes themselves

155
Example Language Classes
  • In all these examples, we do not explicitly state
    what the underlying alphabet S is
  • Finite languages
  • Languages with a finite number of strings
  • CARD-3
  • Languages with at most 3 strings

156
Finite Sets and Set Union
[Figure: the union {0,1,00,11} lies inside Finite Sets]
157
CARD-3 and Set Union
[Figure: the union {0,1,00,11} lies outside CARD-3]
CARD-3 = sets with at most 3 elements
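
A small sketch of why CARD-3 is not closed under union. The operand languages {0,1} and {00,11} are an assumption chosen to produce the union {0,1,00,11} shown on the slide; the names Language, set_union and in_card3 are hypothetical.

```cpp
#include <cassert>  // for the checks in the usage note
#include <set>
#include <string>

// A finite language, represented as a set of strings.
using Language = std::set<std::string>;

// Ordinary set union of two languages.
Language set_union(const Language& a, const Language& b) {
    Language result = a;
    result.insert(b.begin(), b.end());
    return result;
}

// Membership test for the language class CARD-3.
bool in_card3(const Language& l) { return l.size() <= 3; }
```

With a = {"0","1"} and b = {"00","11"}, both in_card3(a) and in_card3(b) hold while in_card3(set_union(a, b)) does not, so this single pair is a counterexample. The union is still a finite set, which is consistent with the class of finite languages being closed under union.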
158
Finite Sets and Set Complement
[Figure: the complement of {0,1,01} is {/\,00,10,11,000,...}, which lies outside Finite Sets]
159
Infinite Number of Facts
  • A closure property often represents an infinite
    number of facts
  • Example: The set of finite languages is closed
    under the set union operation
  • {} union {} is a finite language
  • {} union {λ} is a finite language
  • {} union {0} is a finite language
  • ...
  • {λ} union {} is a finite language
  • ...

160
First-order logic and closure properties
  • A way to formally write (not prove) a closure
    property
  • For all L1, ..., Lk in LC, op(L1, ..., Lk) in LC
  • Only one expression is needed because of the for
    all quantifier
  • Number of languages k is determined by arity of
    the operation op

161
Example F-O logic statements
  • For all L1,L2 in FINITE, L1 union L2 in FINITE
  • For all L1,L2 in CARD-3, L1 union L2 in CARD-3
  • For all L in FINITE, L complement in FINITE
  • For all L in CARD-3, L complement in CARD-3

162
Stating a closure property is false
  • What is true if a set is not closed under some
    k-ary operator?
  • There exist k elements of that set which, when
    combined together under the given operator,
    produce an element not in the set
  • There exist L1, ..., Lk in LC such that
    op(L1, ..., Lk) is not in LC
  • Example
  • Finite sets and set complement

163
Complementing a F-O logic statement
  • Complement: For all L1,L2 in CARD-3, L1 union L2
    in CARD-3
  • not (For all L1,L2 in CARD-3, L1 union L2 in
    CARD-3)
  • There exists L1,L2 in CARD-3, not (L1 union L2 in
    CARD-3)
  • There exists L1,L2 in CARD-3, L1 union L2 not in
    CARD-3

164
Proving/Disproving
  • Which is easier and why?
  • Proving a closure property is true
  • Proving a closure property is false

165
Module 9
  • Recursive and r.e. language classes
  • representing solvable and half-solvable problems
  • Proofs of closure properties
  • for the set of recursive (solvable) languages
  • for the set of r.e. (half-solvable) languages
  • Generic element/template proof technique
  • Relationship between RE and REC
  • pseudoclosure property

166
RE and REC language classes
  • REC
  • A solvable language is commonly referred to as a
    recursive language for historical reasons
  • REC is defined to be the set of solvable or
    recursive languages
  • RE
  • A half-solvable language is commonly referred to
    as a recursively enumerable or r.e. language
  • RE is defined to be the set of r.e. or
    half-solvable languages

167
Why study closure properties of RE and REC?
  • It tests how well we really understand the
    concepts we encounter
  • language classes, REC, solvability,
    half-solvability
  • It highlights the concept of subroutines and how
    we can build on previous algorithms to construct
    new algorithms
  • we don't have to build our algorithms from
    scratch every time

168
Example Application
  • Setting
  • I have two programs which can solve the language
    recognition problems for L1 and L2
  • I want a program which solves the language
    recognition problem for L1 intersect L2
  • Question
  • Do I need to develop a new program from scratch
    or can I use the existing programs to help?
  • Does this depend on which languages L1 and L2 I
    am working with?

169
Closure Properties of REC
  • We now prove REC is closed under two set
    operations
  • Set Complement
  • Set Intersection
  • In these proofs, we try to highlight intuition
    and common sense

170
Set Complement Example
  • Even = the set of even length strings over {0,1}
  • Complement of Even?
  • Odd = the set of odd length strings over {0,1}
  • Is Odd recursive (solvable)?
  • How is the program P' that solves Odd related to
    the program P that solves Even?

171
Set Complement Lemma
  • If L is a solvable language, then L complement is
    a solvable language
  • Proof
  • Let L be an arbitrary solvable language
  • First line comes from For all L in REC
  • Let P be the C program which solves L
  • P exists by definition of REC

172
proof continued
  • Modify P to form P' as follows
  • Identical except at very end
  • Complement answer
  • Yes -> No
  • No -> Yes
  • Program P' solves L complement
  • Halts on all inputs
  • Answers correctly
  • Thus L complement is solvable
  • Definition of solvable

173
P' Illustration
[Figure: Input x is fed to P; P' answers No when P answers YES, and YES when P answers No]
174
Code for P'
  • bool main(string y)
  • {
  •   if (P(y)) return no; else return yes;
  • }
  • bool P(string y)
  • /* details deleted; key fact is P is guaranteed
    to halt on all inputs */
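
A runnable sketch of the same construction, using the Even/Odd example from the earlier slide. It assumes a program that solves a language can be modeled as a function from strings to bool that always returns; the names Decider, complement and even_length are hypothetical.

```cpp
#include <cassert>  // for the checks in the usage note
#include <functional>
#include <string>

// Models a program that solves a language: it halts on every input
// and answers yes (true) or no (false).
using Decider = std::function<bool(const std::string&)>;

// Builds P' from P: run P as a subroutine and flip its answer.
// P' halts on all inputs because P does.
Decider complement(Decider p) {
    return [p](const std::string& y) { return !p(y); };
}

// Solves Even: the strings of even length.
bool even_length(const std::string& y) { return y.size() % 2 == 0; }
```

Here complement(even_length) solves Odd: it answers yes on "0" and no on "01" and on the empty string. Nothing about even_length is used beyond the fact that it always returns, which is why the same wrapper works for any solvable language.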

175
Set Intersection Example
  • Even = the set of even length strings over {0,1}
  • Mod-5 = the set of strings of length a multiple
    of 5 over {0,1}
  • What is Even intersection Mod-5?
  • Mod-10 = the set of strings of length a multiple
    of 10 over {0,1}
  • How is the program P3 (Mod-10) related to
    programs P1 (Even) and P2 (Mod-5)?

176
Set Intersection Lemma
  • If L1 and L2 are solvable languages, then L1
    intersection L2 is a solvable language
  • Proof
  • Let L1 and L2 be arbitrary solvable languages
  • Let P1 and P2 be programs which solve L1 and L2,
    respectively

177
proof continued
  • Construct program P3 from P1 and P2 as follows
  • P3 runs both P1 and P2 on the input string
  • If both say yes, P3 says yes
  • Otherwise, P3 says no
  • P3 solves L1 intersection L2
  • Halts on all inputs
  • Answers correctly
  • L1 intersection L2 is a solvable language

178
P3 Illustration
[Figure: the input is fed to both P1 and P2, each producing Yes/No; P3 combines the two answers]
179
Code for P3
  • bool main(string y)
  • {
  •   if (P1(y) && P2(y)) return yes;
  •   else return no;
  • }
  • bool P1(string y) /* details deleted; key fact
    is P1 always halts. */
  • bool P2(string y) /* details deleted; key fact is
    P2 always halts. */
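
A runnable sketch of P3 for the Even and Mod-5 example. As before, it assumes programs that solve languages can be modeled as functions from strings to bool that always return; the names Decider, intersection, even_length and mod5_length are hypothetical.

```cpp
#include <cassert>  // for the checks in the usage note
#include <functional>
#include <string>

// Models a program that solves a language: halts on every input.
using Decider = std::function<bool(const std::string&)>;

// Builds P3 from P1 and P2: run both on the input and answer yes
// only if both answer yes. P3 halts because P1 and P2 both halt.
Decider intersection(Decider p1, Decider p2) {
    return [p1, p2](const std::string& y) {
        const bool in_l1 = p1(y);
        const bool in_l2 = p2(y);
        return in_l1 && in_l2;
    };
}

// Solves Even: strings of even length.
bool even_length(const std::string& y) { return y.size() % 2 == 0; }

// Solves Mod-5: strings whose length is a multiple of 5.
bool mod5_length(const std::string& y) { return y.size() % 5 == 0; }
```

The decider intersection(even_length, mod5_length) accepts exactly the strings whose length is a multiple of 10, so it solves Mod-10 without any new program being written from scratch.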

180
Other Closure Properties
  • Unary Operations
  • Language Reversal
  • Kleene Star
  • Binary Operations
  • Set Union
  • Set Difference
  • Symmetric Difference
  • Concatena