Parameterized Pattern Matching

1
Parameterized Pattern Matching
  • Amihood Amir
  • Martin Farach
  • S. Muthukrishnan

2
Parameterized Matching
  • Input: two strings $s$ and $t$, $|s| = |t|$, over
    alphabets $\Sigma_s$ and $\Sigma_t$.
  • $s$ parameterize-matches $t$ if there exists a bijection
    $\pi: \Sigma_s \to \Sigma_t$ such that $\pi(s) = t$.

Example
  • $s$ = a a b b b
  • $t$ = x x y y y
  • $\pi(a) = x$, $\pi(b) = y$
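
The definition can be checked directly by building the symbol mapping while reading both strings. This is a minimal sketch; the name `is_pmatch` is made up for illustration, and the mapping is only built on the symbols that actually occur.

```python
def is_pmatch(s, t):
    """Return True if a one-to-one symbol mapping turns s into t."""
    if len(s) != len(t):
        return False
    forward, backward = {}, {}
    for a, b in zip(s, t):
        # setdefault stores the first pairing and returns the stored one later,
        # so any conflicting pairing in either direction is detected.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


if __name__ == "__main__":
    print(is_pmatch("aabbb", "xxyyy"))
    print(is_pmatch("aabbb", "xxyxy"))
```

Keeping both `forward` and `backward` is what enforces a bijection: `forward` alone would accept "ab" against "xx".
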
3
Parameterized Matching
  • Input: two strings $T$ and $P$, $|T| = n$, $|P| = m$.
  • Output: all text locations $i$
  • such that $\pi(P) = T_i \cdots T_{i+m-1}$ for some bijection $\pi$.

4
Parameterized Matching History
  • Introduced by Brenda Baker [Baker93].
  • Others: [AFM94], [Bak95], [Bak97].
  • Two dimensions: [AACLP03].
  • Used in scaled matching: [ABL99].
  • Periodicity of parameterized matching:
    [ApostolicoGiancarlo].
  • Approximate parameterized matching: [HLS04].

5
Alternate Definition
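  • Notice:
  • An alphabet bijection between $S$ and $T$ means
  • $S_i \cong T_i$ for all $i$,
  • where $S_i \cong T_i$ if
  • $S_i \ne S_k$ for $k = 1, \dots, i-1$
    and
  • $T_i \ne T_k$ for $k = 1, \dots, i-1$,
  • or for all $k = 1, \dots, i-1$:
  • $S_i = S_k$ iff $T_i = T_k$.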
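
One common way to test this position-by-position condition is to replace every symbol by the distance to its previous occurrence (the "prev" encoding used by Baker) and compare the two encoded sequences. The sketch below assumes that encoding; the function names are made up for illustration.

```python
def prev_encoding(s):
    """Encode each position as the distance to the previous occurrence
    of the same symbol, or 0 if the symbol has not appeared before."""
    last = {}
    encoded = []
    for i, c in enumerate(s):
        encoded.append(i - last[c] if c in last else 0)
        last[c] = i
    return encoded


def is_pmatch_by_encoding(s, t):
    # Equal encodings mean: at every position i, the set of earlier
    # positions k with S_i == S_k is the same as the set with T_i == T_k.
    return len(s) == len(t) and prev_encoding(s) == prev_encoding(t)
```

The encoding removes the symbol names entirely, so no explicit bijection has to be stored.
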

6
Parameterized Matching Algorithm
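  • Run KMP with the following modifications:
  • 1. Construct table $A[1], \dots, A[m]$ where
  • $A[i] = \begin{cases} \text{largest } k,\ 1 \le k < i, \text{ s.t. } P_i = P_k, & \text{if such } k \text{ exists} \\ i, & \text{if no such } k \text{ exists} \end{cases}$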
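
A minimal sketch of the table construction, keeping the 1-based indexing of the slides (entry 0 is unused). The name `build_table` is hypothetical, and a hash-based `dict` stands in for whatever lookup structure is used to find the previous occurrence.

```python
def build_table(pattern):
    """Return A with A[i] = largest k < i such that P_k == P_i,
    or A[i] = i if the symbol P_i has not occurred before (1-based)."""
    m = len(pattern)
    A = [0] * (m + 1)          # A[0] is a placeholder so indices match the slides
    last = {}                  # symbol -> most recent 1-based position
    for i in range(1, m + 1):
        symbol = pattern[i - 1]
        A[i] = last.get(symbol, i)
        last[symbol] = i
    return A
```
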

7
  • 2. Replace equality checks as follows:
  • Instead of "$P_i = T_j$?" do
  • Compare($P_i$, $T_j$)
  • If $A[i] = i$ and $T_j \ne T_k$ for $k = j-i+1, \dots, j-1$
  • then return equal
  • If $A[i] \ne i$ and $T_j = T_{j-i+A[i]}$
  • then return equal
  • return not equal
  • End
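
The Compare routine above can be sketched as follows, with 1-based positions as on the slide. The function name `compare_pt` is made up, and the window is scanned naively here, so this version is not linear time; it assumes $P_1 \dots P_{i-1}$ already matches the $i-1$ text symbols before $T_j$.

```python
def compare_pt(A, text, i, j):
    """Decide whether T_j extends a parameterized match of P_1..P_{i-1}
    against T_{j-i+1}..T_{j-1}. A is the 1-based table of the pattern."""
    if A[i] == i:
        # P_i is a new symbol, so T_j must be new within the matched window.
        return all(text[j - 1] != text[k - 1] for k in range(j - i + 1, j))
    # P_i repeats P_{A[i]}, so T_j must repeat the text symbol aligned with it.
    return text[j - 1] == text[j - i + A[i] - 1]
```

For the pattern "aab" the table is `[0, 1, 1, 3]`: against the text "xxy" every call with `i == j` returns True, while against "xxx" the call `compare_pt(A, "xxx", 3, 3)` returns False because the third symbol is not new.
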

8
  • Instead of "$P_i = P_j$?" do
  • Compare($P_i$, $P_j$)
  • If ($A[i] = i$ or $i - A[i] \ge j$) and $P_j \ne P_k$ for
    $k = 1, \dots, j-1$
  • then return equal
  • If $A[i] \ne i$, $i - A[i] < j$ and $P_j = P_{j-i+A[i]}$
  • then return equal
  • return not equal
  • End
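
Putting both Compare routines into KMP gives a sketch like the one below. It is an illustration under stated assumptions, not the authors' code: indices are 0-based internally, the table $A$ is replaced by the equivalent distance to the previous occurrence (0 meaning "new symbol"), a `dict` is used for lookups, and all names are hypothetical. Reported positions are 1-based.

```python
def prev_dist(s):
    """Distance to the previous occurrence of each symbol, 0 if none."""
    last, dist = {}, []
    for i, c in enumerate(s):
        dist.append(i - last[c] if c in last else 0)
        last[c] = i
    return dist


def parameterized_kmp(text, pattern):
    """Return all 1-based positions where pattern parameterize-matches text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    dp, dt = prev_dist(pattern), prev_dist(text)

    def eq_pp(q, i):
        # Pattern prefix position q against pattern position i (window of q symbols).
        d = dp[i]
        if d == 0 or d > q:                 # symbol at i is new inside the window
            return dp[q] == 0               # so the prefix symbol must be new too
        return pattern[q] == pattern[q - d]

    def eq_pt(q, j):
        # Pattern position q against text position j (window of q symbols).
        if dp[q] == 0:                      # pattern symbol is new
            return dt[j] == 0 or dt[j] > q  # text symbol must be new in the window
        return text[j] == text[j - dp[q]]

    # fail[q]: longest proper prefix that parameterize-matches a suffix of pattern[:q]
    fail = [0] * (m + 1)
    k = 0
    for i in range(1, m):
        while k > 0 and not eq_pp(k, i):
            k = fail[k]
        if eq_pp(k, i):
            k += 1
        fail[i + 1] = k

    matches, q = [], 0
    for j in range(n):
        while q > 0 and not eq_pt(q, j):
            q = fail[q]
        if eq_pt(q, j):
            q += 1
        if q == m:
            matches.append(j - m + 2)       # 1-based start of the match
            q = fail[q]
    return matches
```

Because the text pointer `j` never moves backwards, the previous-occurrence distances for the text can be computed once up front, which keeps each comparison at a single lookup.
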

9
Correctness
  • The automaton construction guarantees that each failure
    arrow points to
  • the largest prefix that parameterize-matches the
    suffix.

10
TIME
  • KMP is linear time, but we have a new Compare
    subroutine.
  • Take text size to be $2m$, and Compare takes time
    $O(\log \sigma)$, where $\sigma = \min(|\Sigma|, m)$.
  • This is the time to search whether $T_j$ or $P_j$
    appears in a balanced tree.

11
TIME
  • Automaton construction: $O(m \log \sigma)$.
  • Text scanning: $O(n \log \sigma)$.
  • Can we do better?

12
Alphabet $\Sigma = \{1, \dots, n\}$
  • Can be done in linear time.
  • How?
  • Construct array:
  • 1: list of indices of symbol 1
  • 2: list of indices of symbol 2
  • .
  • .
  • m: list of indices of symbol m.

13
To check if $T_j \ne T_k$, $k = j-i+1, \dots, j-1$
  • Assume the symbol in $T_j$ is $a$.
  • Check if
  • the previous index to $j$ in $a$'s list is $< j-i+1$.
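
A minimal sketch of this check for an integer alphabet. As a simplification, the per-symbol index lists are flattened into a single array `prev`, where `prev[j]` holds the previous index of the symbol at position `j` (0 if there is none); the names and the 1-based layout are assumptions made for illustration.

```python
def previous_positions(text, sigma):
    """text is a sequence of integers in 1..sigma.
    Returns prev (1-based) with prev[j] = previous position of symbol T_j, or 0."""
    last = [0] * (sigma + 1)        # last[a] = most recent position of symbol a
    prev = [0] * (len(text) + 1)    # prev[0] is unused
    for j, a in enumerate(text, start=1):
        prev[j] = last[a]
        last[a] = j
    return prev


def is_new_in_window(prev, i, j):
    """True if T_j differs from T_{j-i+1}, ..., T_{j-1}."""
    return prev[j] < j - i + 1
```

Indexing an array by symbol takes constant time, which is what removes the logarithmic factor of the tree search.
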

14
LOWER BOUNDS
  • What about general alphabets?
  • Element Distinctness Problem (EDP)
  • Input: array $A[1], \dots, A[n]$ of natural numbers.
  • Decide: whether all elements of $A$ are distinct (i.e.,
    there is no $i \ne j$ with $A[i] = A[j]$).

15
TIME FOR EDP
  • In the comparison model:
  • General alphabets: $\Omega(n \log n)$.
  • Alphabet $\Sigma = \{1, \dots, n\}$: linear time
  • (construct array of indices).
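
The two cases can be illustrated with a short sketch. Sorting is one way to solve EDP with $O(n \log n)$ comparisons, and a direct-address array handles the bounded integer alphabet in linear time; both function names are made up for illustration.

```python
def distinct_by_sorting(a):
    """Comparison-based: sort, then any duplicates end up adjacent."""
    b = sorted(a)
    return all(x != y for x, y in zip(b, b[1:]))


def distinct_bounded(a, n):
    """Linear time when every element is an integer in 1..n."""
    seen = [False] * (n + 1)
    for x in a:
        if seen[x]:
            return False
        seen[x] = True
    return True
```
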

16
Linear Reduction
  • Claim: EDP is linearly reducible to Parameterized
    Matching.
  • Proof: Let $A[1], \dots, A[n]$ be an array of numbers.
  • In linear time, check if $A[1]$ is unique.
  • If so, construct $S = A[2], A[3], \dots, A[n], A[1]$.

17
Linear Reduction (cont.)
  • $A$ parameterize-matches $S$ iff all elements of $A$ are distinct.
  • ($\Leftarrow$) Trivial.
  • ($\Rightarrow$) By induction on the prefixes of $A$.
  • $A[1]$ is unique: we checked.
  • Assume $A[1], \dots, A[k]$ are unique in $A$.
  • In particular, $A[k]$ is unique.

18
Linear Reduction (cont.)
  • But $A[k]$ was parameter-matched to $S[k]$, so $S[k]$
    only appears once in $S$.
  • But $S[k] = A[k+1]$.
  • This means that $A[k+1]$ is unique.
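
The reduction can be sketched as code. The parameterized-matching test is passed in as an oracle; the default `simple_pmatch` below is only a stand-in for demonstration (it uses hashing, so it sits outside the comparison model), and all names are hypothetical.

```python
def simple_pmatch(s, t):
    """Stand-in oracle: True if s parameterize-matches t."""
    if len(s) != len(t):
        return False
    fwd, bwd = {}, {}
    return all(fwd.setdefault(a, b) == b and bwd.setdefault(b, a) == a
               for a, b in zip(s, t))


def all_distinct_via_pmatch(a, pmatch=simple_pmatch):
    """Decide element distinctness with one call to a parameterized matcher."""
    if len(a) <= 1:
        return True
    # Linear-time check that A[1] is unique.
    if any(x == a[0] for x in a[1:]):
        return False
    # S = A[2], A[3], ..., A[n], A[1]
    s = a[1:] + a[:1]
    return pmatch(a, s)
```

For example, `[3, 1, 4, 1, 5]` is rotated to `[1, 4, 1, 5, 3]`; the symbol 1 would have to map to both 4 and 5, so the match fails and the duplicate is detected.
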