The reuse of grammars with embedded semantic actions

1
The reuse of grammars with embedded semantic
actions
  • Terence Parr
  • University of San Francisco

2
The goal and problem
  • We want to reuse grammars (and fragments)
  • All apps recognize same language
  • Bug fixes propagate automatically
  • Development / testing easier with reuse
  • Sometimes we need embedded, unrestricted semantic
    actions
  • Uses: program comprehension, configuration file
    readers, ...
  • But actions within a grammar lock it into a
    specific app

3
Common rewriting approach
  • Simply decouple syntax and semantics
  • Can treat grammars like libraries
  • Term rewriters use this approach
  • ASF+SDF, Stratego (also AST rewrites)
  • TXL has concrete syntax transformation rules
  • These systems have rewrites in nice concrete
    declarative form
  • Can generate API for accessing implicitly-created
    tree, visitors

4
ANTLR rewriter strategy
  • Have grammar build IR tree
  • Isolate semantics in tree grammars; e.g., walk
    trees to annotate, build symbol tables, or make
    use-def chains, ...
  • Also can trigger events like XML SAX
  • For rewriting, ANTLR is more raw than a purely
    declarative approach: you must describe the AST
    and the tree grammar

5
Decoupled approach issues
  • New app must use verbatim syntax and AST
    structure (if walking ASTs), or
  • Grammar metalanguages must support includes,
    inheritance, or modules
  • ANTLR v2 supported grammar inheritance
  • Subgrammar overrides syntax or AST rules
  • Such metalanguage reuse mechanisms work best when
    decoupled; i.e., no actions in grammar

6
What about non-rewrite apps?
  • Can't decouple when you need an internal data
    structure, not text output
  • Cannot escape need to execute actions in a
    general-purpose language
  • External visitors or API calls are cumbersome,
    lack grammatical context
  • Cost of proximity (actions in grammar) is
    entangling syntax, semantics

7
Reuse in presence of actions
  • Existing reuse mechanisms don't deal well with
    tweaks to actions from super or new actions
  • Currently, coders dup then modify existing
    grammar to add actions, change rules, ...
  • OK, except bug fixes aren't propagated: decidedly
    lowbrow and undignified

8
Aside: ANTLR LL(*) parsers
  • LL(*) recursive-descent generator; unified
    syntax for lexer, parser, tree parser
  • decl is not LL(k) for any fixed k, but is LL(*)
  • DFA spins ahead until it sees a token that
    distinguishes the alternatives
  • No strict ordering with LR(k): decl is not LR(k)
  • reduce-reduce conflict between modifier rules
  • For non-LL(*), ANTLR accepts PEGs

// simplified Java declaration rule; e.g.,
//   "public static int i"
//   "public static int f() ..."
// (the closure operator * on the modifier rules is assumed)
decl : variable_modifier* variable
     | function_modifier* function
     ;
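
To make the unbounded lookahead concrete, here is a minimal hand-written sketch of the kind of decision the lookahead DFA has to make for decl. It is not ANTLR's generated code. The modifier set, the one-token type and name, and the use of '(' as the distinguishing token are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Set;

public class DeclLookahead {
    // Hypothetical modifier vocabulary; the slide does not list the modifiers.
    static final Set<String> MODIFIERS = Set.of("public", "private", "static", "final");

    /** Predicts which alternative of decl applies by scanning ahead. */
    static String predict(List<String> tokens) {
        int i = 0;
        // Skip any number of modifiers: this loop is why no fixed k suffices.
        while (i < tokens.size() && MODIFIERS.contains(tokens.get(i))) {
            i++;
        }
        i += 2; // assumption: a one-token type and a one-token name follow
        // Assumption: '(' right after the name marks a function.
        boolean isFunction = i < tokens.size() && tokens.get(i).equals("(");
        return isFunction ? "function" : "variable";
    }

    public static void main(String[] args) {
        System.out.println(predict(List.of("public", "static", "int", "i", ";")));
        System.out.println(predict(List.of("public", "static", "int", "f", "(", ")")));
    }
}
```

Because the while loop can consume arbitrarily many modifiers before the deciding token appears, no fixed lookahead depth k is enough, which is the point the slide makes about decl.
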
9
Current ANTLR grammar reuse
  • Cobble together grammar from others
  • Embed complete grammar in another
  • E.g., Java within HTML or SQL within C
  • Derive variant
  • E.g., GCC vs C, vendor specific SQL
  • Traverse trees from existing grammar
  • E.g., program comprehension tools like lint
  • Copy, modify semantics of existing grammar
  • Change symbol table actions, tweak pretty printer
  • Lots of cut-and-paste going on, few opportunities
    for verbatim grammar reuse

10
Grammar inheritance imperfect
  • Single grammar inheritance (v2 did include)
  • Works in some cases, but
  • subgrammar can require changes to super
  • fine-grained control forces small unnatural rules
  • copy-n-paste preferable to altering working super
  • with lots of subgrammars, hard to imagine overall
    language

11
Grammar inheritance imperfect (cont'd)
  • Inheritance is blunt instrument for altering
    actions strewn through super
  • At least with ANTLR, had to override entire rule
    to change a single action
  • Could identify actions by name, but would require
    labeling all of them in case needed
  • Not sufficient: might need to tweak inside of an
    action, not replace it
  • Is it a lost cause to reuse grammars with actions?

12
Prototype-based grammar reuse
  • Solution: consider a tool, not a metalanguage
    feature
  • Recall cut-n-paste has great flexibility; only
    problem is lack of change propagation (lose
    single change point)
  • If we propagate changes from an original
    prototype grammar to our modified version, that
    smacks of revision control
  • ANTLR philosophy is to formalize what programmers
    do naturally
  • So, formalize grammar reuse via prototype grammar
    mechanism
  • Track changes between prototype and derived, not
    history of changes to single file

13
Grammar prototyping tools
  • Begin project: gderive Java.g MyJava.g
  • Later, pick up changes: gsync MyJava.g
  • Could either update rules, actions, or both
  • Works well to
  • add actions to standard action-free grammar
  • tweak actions to get slightly different app
  • keep multiple versions that differ only in
    actions (ANTLR has 5 identical tree grammars
    with different actions)
  • Yeah, but how is this different from diff3?

14
Why not diff3?
  • Must compare structure not text lines
  • Text tools like diff3 cannot separate grammars
    from actions, etc.:
    (ID {names.add($ID.text);}) vs ID
  • Can't see rule renaming as a rename since diff3
    can't identify refactoring patterns
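
A rough sketch of what comparing structure rather than text lines could mean: strip the brace-delimited actions and normalize spacing, then compare what is left. This is only an illustration, not how gsync works. The class and method names are hypothetical, the rule `ids` and its `+` closure are made up for the example, and the stripper ignores braces that appear inside string literals within actions.

```java
public class ActionStripper {
    /** Removes brace-delimited actions and normalizes spacing around punctuation. */
    static String stripActions(String rule) {
        StringBuilder out = new StringBuilder();
        int depth = 0; // current nesting depth of { ... } actions
        for (char c : rule.toCharArray()) {
            if (c == '{') {
                depth++;
            } else if (c == '}') {
                if (depth > 0) depth--;
            } else if (depth == 0) {
                out.append(c); // keep only text outside actions
            }
        }
        return out.toString()
                .replaceAll("\\s+", " ")               // collapse whitespace runs
                .replaceAll(" ?([():;|+*?]) ?", "$1")  // drop spaces next to punctuation
                .trim();
    }

    /** True when two rules have the same syntax once actions are ignored. */
    static boolean sameSyntax(String a, String b) {
        return stripActions(a).equals(stripActions(b));
    }

    public static void main(String[] args) {
        String derived = "ids : (ID {names.add($ID.text);})+ ;";
        String prototype = "ids : (ID)+ ;";
        System.out.println(derived.equals(prototype));      // textual comparison
        System.out.println(sameSyntax(derived, prototype)); // structural comparison
    }
}
```

A line-oriented tool sees the two rules as different, while the action-blind comparison treats them as the same syntax. A real tool would compare grammar trees rather than normalized strings.
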

15
Change patterns seen in antlr.g over 14 months
  • new rules
  • add new rule and refer to that
  • extract rules and change references
  • modify rules
  • add/remove branch
  • add/change actions
  • change closure notations ()?, ()*, ()+
  • rename rules
  • rule labels
  • add new label and the references to the label
    variables
  • rename label and update the references
  • delete label and delete the references
  • meta-language changes
  • add token declarations, change options

16
What does gsync look like?
  • Computer-aided: can't always do a safe automatic
    merge
  • Visual diff will show different perspectives of
    prototype and derived grammar with tabbed views,
    one for each pattern it can detect
  • gsync could use grammar refactoring patch tools
    to do the actual merge (e.g., work of Laemmel).

17
ANTLR grammar composition
  • Prototype grammar mechanism breaks down when
    composing grammars
  • Might need to break up big grammar
  • An Oracle 10g grammar yielded 129k lines of Java
    code! (v2's include mechanism)
  • Can't auto-split due to actions
  • Let coder break up into logical chunks
  • We need grammar composition now to put it back
    together
  • Allows better organization, size control, and
    opportunities for reuse

18
Composition mechanism
  • Root grammar imports dependent grammars
  • Import operates like inheritance
  • rule overriding
  • polymorphic rule invocation
  • Duplicates resolved in favor of rule imported
    first
  • Imported grammars can import others
  • Implementation: delegation model; if R imports A, B
  • ANTLR generates classes R, R_A, R_B
  • R has 2 delegate pointers to R_A, R_B
  • R_A, R_B have back delegator pointers to R
  • Every rule can get to every other rule

19
Composition example
parser grammar JavaDecl;
type : 'int' ;
decl : type ID ';'            // calls Java.type
     | type ID init ';'
     ;
init : '=' INT ;

// Root grammar
parser grammar Java;
import JavaDecl;
prog : decl ;
type : 'int' | 'float' ;      // overrides

class Java extends Parser {
    JavaDecl d;
    void prog() { decl(); }
    void type() { ... }
    void decl() { d.decl(); }
    void init() { d.init(); }
}
class JavaDecl extends Parser {
    public Java parent;
    void decl() { ... }
    void init() { ... }
}

20
Interesting effects
  • Overriding alters lookahead
  • rule type overridden, alters lookahead of rule
    decl
  • FIRST_3(decl) was {int, ID, '=', ';'} but is
    now {int, float, ID, '=', ';'}
  • Polymorphism through delegate pointer
  • Ref to rule r in delegate sees R.r if overridden
    in R.
  • Here, JavaDecl.decl invokes Java.type via
    parent.type()
  • Broke up Java.g from a 22k-line file into 6
    smaller ones
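
The polymorphism through the delegate pointer can be tried out with a small hand-written sketch that mirrors the shape of the generated classes. It is not ANTLR output: the Tokens helper, the regular expressions for ID and INT, and the parses method are assumptions added so the example runs on its own.

```java
import java.util.List;

public class DelegationSketch {
    /** Shared token cursor (hypothetical stand-in for a real token stream). */
    static class Tokens {
        final List<String> toks;
        int p = 0;
        Tokens(List<String> toks) { this.toks = toks; }
        String la() { return p < toks.size() ? toks.get(p) : "<EOF>"; }
        void match(String expected) {
            if (!la().equals(expected)) {
                throw new IllegalStateException("expected " + expected + ", found " + la());
            }
            p++;
        }
        void matchPattern(String regex) {
            if (!la().matches(regex)) {
                throw new IllegalStateException("unexpected " + la());
            }
            p++;
        }
    }

    /** Delegate: rules of the imported grammar JavaDecl. */
    static class JavaDecl {
        final Java parent; // back pointer to the delegator
        final Tokens in;
        JavaDecl(Java parent, Tokens in) { this.parent = parent; this.in = in; }
        void type() { in.match("int"); } // overridden by Java.type
        void decl() {
            parent.type();                     // goes through the root, so sees the override
            in.matchPattern("[A-Za-z_]\\w*"); // ID
            if (in.la().equals("=")) parent.init();
            in.match(";");
        }
        void init() { in.match("="); in.matchPattern("\\d+"); } // '=' INT
    }

    /** Root: rules of grammar Java, which imports JavaDecl. */
    static class Java {
        final JavaDecl d; // delegate pointer
        final Tokens in;
        Java(Tokens in) { this.in = in; this.d = new JavaDecl(this, in); }
        void prog() { decl(); }
        void type() { // overriding rule: 'int' | 'float'
            if (in.la().equals("float")) in.match("float"); else in.match("int");
        }
        void decl() { d.decl(); }
        void init() { d.init(); }
    }

    /** True when the whole token sequence is recognized by Java.prog. */
    static boolean parses(String... toks) {
        Tokens in = new Tokens(List.of(toks));
        try {
            new Java(in).prog();
            return in.p == toks.length;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("float", "x", "=", "3", ";"));
        System.out.println(parses("double", "z", ";"));
    }
}
```

The declaration starting with float is accepted even though JavaDecl.type only knows 'int', because JavaDecl.decl reaches type through parent and therefore runs the overriding rule in the root.
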

21
Prototypes and composition
  • Even with composition, need prototype grammars to
    alter pre-existing grammars
  • Easy to combine mechanisms
  • Delegates become prototypes
  • Composer imports refined delegates
  • Sync by syncing derived delegates from respective
    prototypes, then running ANTLR on the root grammar

22
Summary
  • Can't always decouple the semantics from syntax
  • Non-rewriter apps need actions to build data
    structures
  • Currently coders cut, paste, and modify
  • Prototype grammars are like live cut-and-paste:
    changes to prototype merged into derived via
    structured tree difference tool akin to RCS
  • Tool applicable to any metalanguage tool
  • Grammar composition to factor large grammars;
    uses delegation to get rule polymorphism and
    inheritance
  • Combination allows programmers to easily reuse
    existing grammars or grammar fragments even in
    presence of semantic actions