Provided by: benjamin123
1
An Overview of the Trimaran Compiler
Infrastructure
2
What Is Trimaran?
  • A parametric compilation and performance
    monitoring system
  • A full-blown C compiler for the HPL-PD
    instruction set architecture (ISA)
  • A cycle-by-cycle parametric machine simulator
    and cache simulator
  • A suite of optimization and analysis tools
  • Uses HPL-PD, a parameterized very long instruction
    word (VLIW) ISA
  • Supports predication, control and data
    speculation, and compiler-controlled management of
    the memory hierarchy
  • Compiles for target architectures specified by a
    machine description language
  • Can compile optimized code for a variety of VLIW
    and Superscalar architectures

3
Trimaran's Goal
  • To provide a vehicle for implementation and
    experimentation for state of the art research in
    compiler techniques for instruction-level
    parallel architectures.
  • Currently, the infrastructure is oriented towards
    Explicitly Parallel Instruction Computing (EPIC)
    architectures.
  • It can also support compiler research for
    Superscalar architectures.
  • Primarily for back-end compiler research:
    instruction scheduling, register allocation, and
    machine-dependent optimizations.

4
Compiling a Program

Source Program (C, C++, Java, etc.)
  • Compiles programs for only one architecture
  • All optimizations are tuned for the given target
    machine

5
A Retargetable Compiler and Simulator

[Diagram labels: Source Program (C, C++, Java, etc.), Front-End, High-level Optimizations, Low-level Optimizations, Code Generation]
  • MDES influences optimizations and code generation
  • Executing the binary performs cycle-by-cycle
    simulation based on MDES

6
Terms and Definitions
  • ILP (Instruction-Level Parallelism): more than one
    operation issued per clock cycle within a single CPU
  • EPIC (Explicitly Parallel Instruction Computing):
    ILP under compiler control
  • A single instruction may contain many operations
  • Compiler determines operation dependences and
    specifies which operations may execute
    concurrently
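
As a rough illustration of the EPIC idea above, the sketch below packs operations into wide instructions so that no operation shares an instruction with one it depends on. The operation names, the dependence table, the unit-latency assumption, and the `issue_width` parameter are all made up for illustration; this is not Trimaran's scheduler.

```python
def pack_operations(ops, deps, issue_width):
    """Greedily pack operations into instructions (lists of operations).

    ops: operation names in program order (producers before consumers).
    deps: maps an operation to the set of operations it depends on.
    issue_width: maximum operations per instruction (hypothetical parameter).
    """
    cycle_of = {}      # operation -> index of the instruction holding it
    instructions = []  # instruction index -> list of operations
    for op in ops:
        # Earliest legal slot is one past the latest producer (unit latency assumed).
        cycle = max((cycle_of[d] + 1 for d in deps.get(op, ())), default=0)
        # Skip instructions that are already full.
        while cycle < len(instructions) and len(instructions[cycle]) >= issue_width:
            cycle += 1
        while len(instructions) <= cycle:
            instructions.append([])
        instructions[cycle].append(op)
        cycle_of[op] = cycle
    return instructions


ops = ["load_a", "load_b", "add", "mul", "store"]
deps = {"add": {"load_a", "load_b"}, "mul": {"load_a"}, "store": {"add"}}
schedule = pack_operations(ops, deps, issue_width=2)
for i, instruction in enumerate(schedule):
    print(i, instruction)
```

With an issue width of 2 the five operations fit in three instructions; with an issue width of 1 each operation needs its own instruction.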

7
Infrastructure Components
  • A machine description language, HMDES, for
    describing ILP architectures.
  • A parameterized ILP Architecture called HPL-PD
  • Current instantiation in the infrastructure is as
    an EPIC architecture
  • A compiler front-end for C, performing parsing,
    type checking, and a large suite of high-level
    (i.e., machine-independent) optimizations.
  • This is the IMPACT module (IMPACT group,
    University of Illinois)

8
Infrastructure Components
  • A compiler back-end, parameterized by a machine
    description, performing instruction scheduling,
    register allocation, and machine-dependent
    optimizations
  • Each stage of the back-end may be replaced or
    modified by a compiler researcher
  • Primarily implemented as part of the ELCOR effort
    by the CAR Group at HP Labs
  • Augmented with a scalar register allocator from
    CREST

9
Infrastructure Components (contd)
  • An extensible IR (intermediate program
    representation)
  • Has both an internal and textual representation,
    with conversion routines between the two. The
    textual language is called REBEL
  • Supports modern compiler techniques by
    representing control flow, data and control
    dependence, and many other attributes
  • Easy to use in its internal representation (clear
    C++ object hierarchy) and textual representation
    (human-readable)
  • A cycle-level simulator of the HPL-PD architecture,
    configurable by an MDES, that provides run-time
    information on execution time, branch
    frequencies, and resource utilization
  • This information can be used for profile-driven
    optimizations, as well as to provide validation
    of new optimizations

10
Infrastructure Support (contd)
  • An integrated graphical user interface (GUI) for
    configuring and running the Trimaran system.

11

Trimaran System Organization

[Diagram labels, by component:
  IMPACT (input: C program): K&R/ANSI-C Parsing, Renaming, Flattening, Control-Flow Profiling, C Source File Splitting, Function Inlining, Classical Optimizations, Code Layout, Superblock Formation, Hyperblock Formation, ILP Transformations
  Elcor/CAR: Dependence Graph Construction, Modulo Scheduling, Acyclic Scheduling, Post-pass Scheduling, ...
  ReaCT-ILP: Region-based Register Allocation
  Simulator: Execution Statistics
  Also shown: Machine Description, To IR]
12
An Overview of the IMPACT Module and Its
Optimization Suite
13
K&R/ANSI-C Parser
  • Built upon the EDG C parser
  • Solid but persnickety about the C language spec
  • May need to modify benchmark source to match spec
  • Utilizes the native compiler's header files (in most
    cases) and libraries
  • We may only distribute binaries and source diffs
  • Unmodified source available via free educational
    license from EDG (see web site for source diffs
    and instructions)
  • Modified to generate our source-level
    intermediate representation
  • Compile all the available source together
  • Don't link in libraries if you have the source for
    them!
  • Profiler and source analysis tools need everything

14
IMPACT steps
  • Flattening: transforms complex expressions into
    simple ones, adding temporary variables if required
  • Profiling: simple control-arc weighting based on one
    profile run
  • File splitting: generates one file per function
  • Function inlining: inlines the most frequently called
    functions to accelerate them while limiting code growth
  • Classical optimizations: performs certain
    optimizations (Red Dragon book)
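
The flattening step described above can be sketched as follows. This is a minimal illustration, not IMPACT's implementation: the tuple-based expression format and the temporary names `t1`, `t2`, ... are assumptions made for the example.

```python
import itertools


def flatten(expr, code, temps):
    """Flatten a nested expression into simple assignments.

    expr is either a variable name (str) or a tuple (op, left, right).
    Appends (dest, op, left, right) tuples to code and returns the name
    that holds the value of expr.
    """
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    left_name = flatten(left, code, temps)
    right_name = flatten(right, code, temps)
    dest = f"t{next(temps)}"  # fresh temporary variable
    code.append((dest, op, left_name, right_name))
    return dest


# (a * b) + (c - d)
code = []
result = flatten(("+", ("*", "a", "b"), ("-", "c", "d")), code, itertools.count(1))
for dest, op, left, right in code:
    print(f"{dest} = {left} {op} {right}")
```

Each emitted assignment has at most one operator, which is the form later passes can handle uniformly.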

15
IMPACT steps (contd)
  • Code layout optimizations
  • Makes most branches fall through, etc.
  • Superblock/Hyperblock formation
  • Can generate superblocks/hyperblocks
  • ILP optimizations
  • Expose more ILP through unrolling of loops, etc.

16
An Overview of the ELCOR Module

17
Elcor Functional Overview
  • Elcor is a collection of compiler components and
    scripts that analyze and transform Rebel
  • Analysis modules
  • Control dependence
  • Data flow
  • Transformation and optimization modules
  • Scheduling modules
  • Acyclic schedulers
  • Loop schedulers
  • Rotating register allocator
  • Static register allocator (by ReaCT-ILP)
  • Elementary data structures
  • Container classes
  • Data structures for compiler algorithms
  • Intermediate Representation data structures
  • I/O modules
  • Rebel reader/writer
  • Lcode reader/writer
  • Mdes interface

18
Control Flow Analysis
  • Dominator analysis
  • Control Dependence Analysis
  • Loop detection
  • Induction variable detection
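
To make dominator analysis and loop detection concrete, here is a small sketch using the textbook iterative dominator algorithm; an edge whose target dominates its source is a back edge and marks a loop. The dictionary-based CFG and the block names are assumptions for the example and do not reflect Elcor's data structures.

```python
def dominators(succ, entry):
    """Return {node: set of nodes that dominate it} for a CFG.

    succ maps every node to a list of its successors.
    """
    nodes = set(succ)
    preds = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].add(n)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def back_edges(succ, entry):
    """An edge n -> t is a back edge if t dominates n."""
    dom = dominators(succ, entry)
    return [(n, t) for n in succ for t in succ[n] if t in dom[n]]


cfg = {"entry": ["header"], "header": ["body", "exit"],
       "body": ["header"], "exit": []}
print(back_edges(cfg, "entry"))
```

In this CFG the only back edge runs from `body` to `header`, so `header` is the head of the single loop.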

19
Control Flow Transformations
  • Loop region construction
  • Constructs a cyclic region which can be modulo
    scheduled
  • Single back edge
  • Structural region formation
  • Identifies acyclic subgraphs of CFG that are
    single entry and multiple exit.
  • Branch normalization/denormalization
  • Constructs a memory layout independent form of
    CFG that can be transformed easily.

20
Control Flow Transformations
  • Tail duplication
  • Useful for constructing single entry multiple
    exit regions

[Figure: control-flow graphs over blocks A, B, C, and D illustrating tail duplication]
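
A minimal sketch of tail duplication on an acyclic CFG follows. The CFG shape (a diamond A, B, C, D), the chosen trace, and the primed names for copies are assumptions for the example; Elcor's implementation works on its own region structures.

```python
def tail_duplicate(succ, trace):
    """Remove side entrances into trace by duplicating its tail.

    succ: {block: [successors]}; trace: list of blocks forming a path.
    Returns a new successor map in which only the first trace block
    can be entered from outside the trace. Assumes an acyclic CFG.
    """
    succ = {b: list(s) for b, s in succ.items()}

    def side_preds(i):
        # Predecessors of trace[i] other than the previous trace block.
        return [p for p in succ if trace[i] in succ[p] and p != trace[i - 1]]

    first = next((i for i in range(1, len(trace)) if side_preds(i)), None)
    if first is None:
        return succ  # already single entry
    tail = trace[first:]
    copy = {b: b + "'" for b in tail}
    # Redirect every side entrance to the copy of the block it entered.
    for i in range(first, len(trace)):
        for p in side_preds(i):
            succ[p] = [copy[trace[i]] if s == trace[i] else s for s in succ[p]]
    # Each copy keeps the original's successors but stays inside the copied tail.
    for b in tail:
        succ[copy[b]] = [copy.get(s, s) for s in succ[b]]
    return succ


cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(tail_duplicate(cfg, ["A", "B", "D"]))
```

After the transformation the path A, B, D is entered only at A, because C now branches to the copy `D'`.
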
21
Control Flow Transformations
  • If-conversion of single entry multiple exit basic
    block regions

Supports if-conversion with or without fully
resolved predicates (FRPs)
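
The idea behind if-conversion can be sketched as follows: the two sides of a branch are merged into one straight-line block, and each operation is guarded by a predicate so that only one side takes effect. The opcode names, the tuple format, and the use of two separate compares are simplifications assumed for this example; they are not HPL-PD's actual compare-to-predicate operations.

```python
def if_convert(cond, then_ops, else_ops):
    """Replace a branch diamond by one straight-line predicated block.

    then_ops / else_ops are lists of (dest, op, src1, src2).
    Returns ops of the form (guard, dest, op, src1, src2), where guard is
    a predicate name or None for "always execute".
    """
    block = [(None, "p_true", "cmp_ne0", cond, None),
             (None, "p_false", "cmp_eq0", cond, None)]
    block += [("p_true",) + op for op in then_ops]
    block += [("p_false",) + op for op in else_ops]
    return block


def run(block, env):
    """Interpret a predicated block; an op with a false guard is nullified."""
    env = dict(env)
    funcs = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b,
             "cmp_ne0": lambda a, _: a != 0, "cmp_eq0": lambda a, _: a == 0}
    for guard, dest, op, s1, s2 in block:
        if guard is None or env[guard]:
            env[dest] = funcs[op](env[s1], env.get(s2))
    return env


# if (c) x = a + b; else x = a - b;
block = if_convert("c", [("x", "add", "a", "b")], [("x", "sub", "a", "b")])
print(run(block, {"c": 1, "a": 5, "b": 3})["x"])
print(run(block, {"c": 0, "a": 5, "b": 3})["x"])
```

The converted block contains no branch; which assignment to `x` takes effect depends only on the predicate values.
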
22
Data-flow analysis
  • Live variable analysis
  • Live variable information is annotated on the IR
  • Reaching definitions analysis
  • A data structure for def-use chains is annotated
    on the IR
  • Available expression analysis
  • Queries for expression availability are provided
    at any point on the control-flow graph
  • These analyses can be performed on any region

23
Data-flow analysis architecture
  • Uses a CFG consisting of basic-blocks and
    hyperblocks.
  • Such a cut has to exist in the region hierarchy
  • Region-based analysis has three steps
  • Transfer functions are constructed for each
    entry-exit pair on a CFG node
  • Transfer functions are constructed using local
    predicate relationships.
  • Global iterative solver is conventional
  • Solves data-flow equations at basic/hyperblock
    entry and exit points.
  • Local analysis is used to determine data-flow
    equation solutions at points within a block using
    global solver results
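
The conventional global iterative solver mentioned above can be illustrated with live variable analysis. This sketch works on plain basic blocks with one use/def pair per block and ignores predicates and multiple entry-exit pairs, which the Elcor solver handles through its transfer functions; the block names and variable sets are assumptions for the example.

```python
def liveness(blocks, succ):
    """Iterative live variable analysis.

    blocks: {name: (use, defs)} where use = variables read before any
    write in the block and defs = variables written in the block.
    succ: {name: [successor names]}.
    Returns (live_in, live_out) dictionaries of sets.
    """
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b, (use, defs) in blocks.items():
            # live_out(b) is the union of live_in over all successors.
            out = set().union(*(live_in[s] for s in succ[b]))
            # live_in(b) = use(b) | (live_out(b) - def(b))
            new_in = use | (out - defs)
            if out != live_out[b] or new_in != live_in[b]:
                live_out[b], live_in[b] = out, new_in
                changed = True
    return live_in, live_out


# B1: x = 1; y = 2      B2: x = x + y; if (x < n) goto B2      B3: return x
blocks = {"B1": (set(), {"x", "y"}),
          "B2": ({"x", "y", "n"}, {"x"}),
          "B3": ({"x"}, set())}
succ = {"B1": ["B2"], "B2": ["B2", "B3"], "B3": []}
live_in, live_out = liveness(blocks, succ)
print(live_in["B1"], live_out["B2"])
```

Only `n` is live on entry to `B1`, since `x` and `y` are defined there before any use.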

24
Optimizations
  • Predicate speculation
  • Dead code elimination
  • Global copy propagation (forward)
  • Local copy propagation (forward and backward)
  • Global common sub-expression elimination
  • Loop-invariant code removal
  • Global register renaming
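
Two of the optimizations listed above, forward local copy propagation and dead code elimination, can be sketched on a single straight-line block. The `(dest, op, srcs)` tuple format and the `mov` opcode are assumptions for the example, and operations are assumed to have no side effects.

```python
def copy_propagate(ops):
    """Forward local copy propagation over (dest, op, srcs) tuples."""
    copies = {}  # dest -> source, for copies that are still valid
    out = []
    for dest, op, srcs in ops:
        srcs = tuple(copies.get(s, s) for s in srcs)
        # A write to dest invalidates every recorded copy that mentions dest.
        copies = {d: s for d, s in copies.items() if d != dest and s != dest}
        if op == "mov":
            copies[dest] = srcs[0]
        out.append((dest, op, srcs))
    return out


def eliminate_dead(ops, live_out):
    """Drop operations whose result is never read later in the block or after it."""
    live = set(live_out)
    kept = []
    for dest, op, srcs in reversed(ops):
        if dest in live:
            kept.append((dest, op, srcs))
            live.discard(dest)
            live.update(srcs)
    return kept[::-1]


ops = [("t", "mov", ("a",)),
       ("u", "add", ("t", "b")),
       ("v", "mul", ("u", "t"))]
optimized = eliminate_dead(copy_propagate(ops), live_out={"v"})
print(optimized)
```

Copy propagation replaces the uses of `t` with `a`, which leaves the `mov` without readers so that dead code elimination can remove it.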

25
The Elcor Intermediate Representation

26
Factors Motivating the Design
  • Global scheduling is key to exploiting ILP
  • We are moving towards bigger and more complex regions
  • Frequency-based regions have more complex
    structure than traditional structure-based
    regions
  • Even a trace is a multiple-entry, multiple-exit
    region
  • Many of the ILP enhancing techniques, e.g.,
    height reduction, rely on estimates of height and
    resource usage
  • Such estimates may be helpful even in earlier
    phases
  • Analyses like memory disambiguation are expensive
  • Need to represent and maintain their results
    accurately

27
Factors Motivating the Design
  • Flexibility in phase ordering
  • Because we don't fully understand the right phase
    order
  • Flexibility and ability to grow
  • In many cases, we don't fully understand the
    requirements
  • An IR highly optimized for a specific purpose may
    not be the right one
  • Provide general mechanisms to support various policies
  • Well defined interfaces to modules and
    encapsulation
  • Uniformity
  • Easy to build software, modify and grow

28
IR Features
  • Registers carry values, edges represent
    dependences
  • A uniform, edge-based representation of control
    flow and data dependences
  • Supports threading of data dependences
  • dependence flow graphs
  • Hierarchical non-overlapping region structure (a
    tree)
  • Multi-state IR
  • Provides mechanisms for representing:
  • Traditional control flow graph
  • Control dependences
  • Data dependences for both registers and memory in
    various forms
  • Various forms of register usage: single
    assignment, multiple assignment
  • Expanded virtual registers (EVRs)
  • Predicated execution
  • Data section
  • Global symbols, arrays, etc.

29
Internal vs. Textual Representation
  • Each component of the graph data structure is a
    C++ object
  • All modules of Elcor use this IR
  • Optimizations are simply IR-to-IR transformations
  • There is an ASCII intermediate representation,
    called Rebel
  • Phases of Elcor may communicate using Rebel
  • A reader procedure is provided that reads Rebel
    and constructs the corresponding internal program
    representation
  • A writer procedure is provided for generating
    Rebel from the internal representation

30
Example of textual IR
  • See http://www.trimaran.org/docs/elcor_ir_manual.pdf

31
Program Representation
  • A program unit is represented by a graph of
    operations connected by edges
  • Control flow is represented explicitly and at the
    operation level
  • A region structure over the operation graph (a
    tree)
  • The root of the tree is the program unit, e.g. a
    procedure
  • The leaf nodes of the tree are operations
  • Operation graph elements
  • Op(eration) class
  • Operand class
  • Edge class
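
The shape of this representation, a region tree whose leaves are operations plus edges between operations, can be sketched as below. The class and field names are hypothetical and chosen only to mirror the slide; Elcor's real classes are C++ and considerably richer.

```python
from dataclasses import dataclass, field


@dataclass
class Operand:
    name: str


@dataclass
class Op:
    opcode: str
    dests: list
    srcs: list


@dataclass
class Edge:
    src: Op
    dest: Op
    kind: str  # e.g. "control" or "flow" (labels assumed for this sketch)


@dataclass
class Region:
    kind: str  # e.g. "procedure", "hyperblock", "basic block"
    children: list = field(default_factory=list)  # Regions or Ops

    def ops(self):
        """Yield every operation (leaf) under this region, in order."""
        for child in self.children:
            if isinstance(child, Op):
                yield child
            else:
                yield from child.ops()


load = Op("load", [Operand("r1")], [Operand("a")])
add = Op("add", [Operand("r2")], [Operand("r1"), Operand("r1")])
block = Region("basic block", [load, add])
procedure = Region("procedure", [block])
edges = [Edge(load, add, "flow")]
print([op.opcode for op in procedure.ops()])
```

Walking the tree from the root visits the operations in program order, while the edge list records dependences between individual operations.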

32
Navigating ELCOR code
33
Projects for this class
  • You are going to be modifying parts of ELCOR
  • ELCOR code is written in C++ with its own
    template library
  • Very large code base
  • Many tools are already available in ELCOR; find
    them first to avoid reinventing the wheel
    every time

34
Some useful directories
  • Graph directory: contains the C++ representation of
    Op, Region, Operand, etc.
  • Control directory: contains tools helpful in
    identifying control structures (such as loops)
  • Analysis directory: analyses such as dominator
    analysis, liveness analysis, etc.
  • Opti directory: contains code for certain
    optimizations
  • Main directory: starting point for ELCOR
  • Tools directory: structures such as maps, vectors,
    etc.

35
Some tips
  • Use ctags (exuberant-ctags) or etags
  • Look at Main/process_function.cpp
  • Do NOT assume Trimaran to be bug-free
  • Look at how supplied tools are used
  • There are iterators that are very useful (look in
    Graph and Control)
  • Spend time exploring the code before starting to
    code. You may find that things are already there

36
The HPL-PD Simulator and Performance Monitoring
Environment
37
User View of the Simulator
  • To the user, the simulator is simply another
    phase of the compilation/execution process.
  • Transparently to the user, Makefiles guide:
  • Configuration of the simulator using MDES
  • Generation of executable code from the Rebel
    output of the back end.
  • Creation of an interface for foreign calls
    to C routines provided by the user or as part of
    a standard library.
  • A GUI is provided to extract and analyze the
    execution results of the simulator.

[Diagram labels: C program, Front End, Back End, Simulator, Execution Statistics]
38
Execution Results
  • During execution, the simulator produces raw
    data, namely a trace specifying
  • Control flow execution
  • gives the order of control-block execution
  • Memory addresses referenced
  • Guarded predicate values
  • whether an operation within an HPL-PD instruction
    was disabled by predication.
  • A trace-driven profiler tool is run after
    execution.
  • Reads the trace and the Rebel file(s), and extracts
    the desired information.
  • Emits a detailed statistics / profile information
    file.

39
Statistics
  • List of items generated by the Trace-Driven
    Profiler
  • IPC (number of HPL-PD operations / clock cycle).
  • Memory address usage frequencies.
  • Control block visit frequencies.
  • Resource utilization.
  • Register Usage frequencies.
  • Functional Unit utilization.
  • Memory (stack/heap) utilization.
  • Effectiveness of guarded predicates.
  • Register allocation overhead.
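
As a worked example of the first statistic, the sketch below computes IPC and the fraction of operations nullified by predication from a toy trace. The trace format is invented for illustration and is not the simulator's actual trace format; the sketch also assumes that nullified operations still count as issued and that the trace is non-empty.

```python
def summarize(trace):
    """Summarize a toy execution trace.

    trace: one list per simulated cycle; each entry is (operation name,
    executed), where executed is False if the operation was nullified
    by its guarding predicate.
    """
    cycles = len(trace)
    issued = sum(len(cycle) for cycle in trace)
    nullified = sum(1 for cycle in trace for _, executed in cycle if not executed)
    return {"cycles": cycles,
            "ipc": issued / cycles,
            "nullified_fraction": nullified / issued}


trace = [[("load", True), ("load", True)],
         [("add", True), ("sub", False)],
         [],                               # stall cycle: nothing issued
         [("store", True)]]
stats = summarize(trace)
print(stats)
```

Here five operations issue over four cycles, giving an IPC of 1.25, and one of the five is nullified.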

40
Viewing execution statistics using the GUI
41
Viewing execution statistics using the GUI
42
The Trimaran GUI
  • The Trimaran system is configured and run via a
    Graphical User Interface
  • choose program to compile
  • configure target machine
  • configure compilation stages
  • view graphical program representations at various
    stages of compilation
  • view execution statistics (graphs, pie charts,
    etc.)
  • view extensive on-line help and documentation.
  • If desired, the system can also be run from the
    command line and be invoked from shell scripts.

43
The control panel
  • The GUI is operated from this main control panel.

[Control panel callouts: Viewing Program Intermediate Representation; GUI settings and defaults; Compiler and Simulator Parameters; Compiler options; Organize Collections of Programs, Machines, Parameter Sets, etc.; Viewing Execution Statistics; Target Machine Configuration]
44
The Compiler Panel
  • The compiler panel allows you to choose a
  • benchmark program to compile
  • you can add your own as well.
  • target machine configuration
  • parameter set (for the compiler and simulator)
  • project file
  • It also allows you to easily configure the stages
    of the compiler and start the compilation
    process.

45
Choosing a benchmark and machine
Choosing a benchmark
Choosing a machine
46
Configuring the compiler
Front end features
Back end features
Simulator on/off
47
On-line Documentation
  • On-line documentation is available for each
    component of Trimaran
  • this is the on-line help for the compiler panel.

48
The Machine Panel
  • The machine panel is used to create new target
    machines and modify existing ones.
  • Here, one selects an existing machine to copy or
    modify.

49
Machine Descriptions
  • To edit a machine description, the GUI opens an
    editor window.
  • Trimaran includes a very powerful machine
    description facility.
  • It is the subject of an entire section of this
    tutorial.
  • The GUI interface simplifies the process of
    machine description.

50
The Parameters Panel
  • The parameters panel allows you to modify a large
    number of parameters used by the front end, the
    back end, and the simulator.
  • Generally, the default settings will be used.
  • Once a new parameter set has been configured, the
    set can be named and saved for subsequent use.

51
Modifying Parameters
  • Upon clicking "open", the parameters are
    displayed.
  • Here, the compiler front end parameters are
    displayed, along with their current values.
  • Clicking a "?" button opens a help window for
    that parameter.
  • Parameters can also be modified by editing text
    files, if desired.

52
Parameters for the Back End
  • The compiler back end has the largest number of
    parameters.
  • The parameters are organized into groups
    according to their use.
  • Analysis
  • Optimizations
  • Register Allocation
  • Etc.

Help window
53
The Statistics Panel
  • The statistics panel allows you to choose what
    statistics are displayed for the programs in
    one's project file.
  • Function level execution profile
  • Region level profile
  • Instruction usage
  • Etc.

54
Viewing Statistics
  • For each program in your project file, a separate
    graph is displayed.
  • Here, pie charts show the dynamic instruction
    distribution.

55
The View IR Panel
  • The IR viewer provides five kinds of views of a
    program.

The program regions (hyperblocks, loops, etc.)
Dependence Graph
Control Flow Graph (CFG)
ILP Instruction Schedule
Profile Information
56
Control Flow View
  • Here is a portion of the control flow graph for a
    program.
  • The user can specify a portion of the program to
    display.
  • The viewer has zoom in, zoom out, scroll, etc.

57
Summary
  • Trimaran is an open-source compilation/simulation
    environment for studying VLIW compilation
  • Intel IA-64 (Itanium and McKinley)
  • Trimaran has a parameterized structure
  • Relies on an HPL-PD architecture description file
  • Trimaran supports various research and
    educational efforts
  • www.trimaran.org for more information