Languages and the Machine
Provided by: mathUaa

1
Languages and the Machine
  • Chapter 5
  • CS221

2
Topics
  • The Compilation Process
  • The Assembly Process
  • Linking and Loading
  • Macros
  • We will skip
  • Case Study: Extensions to the Instruction Set:
    The Intel MMX and Motorola AltiVec SIMD
    Instructions

3
Compilation Process
  • Assembly to machine code is fairly straightforward,
    but compilation is not
  • Translate a program written in a high-level
    language into a functionally equivalent program
    in assembly language
  • Consider a simple high-level language assignment
    statement
  • Foo = Bar + Zot + 15
  • Steps involved in compiling this statement into
    assembly code
  • Lexical analysis: separate into tokens Foo, =,
    Bar, +, etc.
  • Syntactic analysis / parsing: determine that we
    are performing an assignment, VAR = EXPRESSION
  • Semantic analysis: determine that Foo, Bar, Zot
    are names, 15 is an integer
  • Code generation: determine the proper assembly
    code to perform the action
  • ld [Bar], %r0, %r1
  • ld [Zot], %r0, %r2
  • addcc %r1, %r2, %r1
  • addcc %r1, 15, %r2
  • st %r2, %r0, [Foo]
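
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how a production compiler works: it accepts just statements of the form VAR = operand + operand + ..., the function names (tokenize, generate, compile_statement) are hypothetical, and the emitted text mirrors the three-operand ld/st form used on this slide. Unlike the slide, it accumulates the sum in a single register.

```python
import re

# One token is a name, an integer literal, or one of the operators = and +.
TOKEN_RE = re.compile(r"\s*(?:(?P<name>[A-Za-z_]\w*)|(?P<int>\d+)|(?P<op>[=+]))")

def tokenize(source):
    """Lexical analysis: split the statement into (kind, text) tokens."""
    tokens, pos = [], 0
    source = source.strip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens

def generate(tokens):
    """Parse VAR = operand (+ operand)* and emit assembly text."""
    # Syntactic analysis: check the shape VAR = EXPRESSION.
    if len(tokens) < 3 or len(tokens) % 2 == 0:
        raise SyntaxError("expected VAR = operand (+ operand)*")
    if tokens[0][0] != "name" or tokens[1] != ("op", "="):
        raise SyntaxError("expected VAR = EXPRESSION")
    target = tokens[0][1]
    operands, pluses = tokens[2::2], tokens[3::2]
    if any(kind == "op" for kind, _ in operands) or any(t != ("op", "+") for t in pluses):
        raise SyntaxError("expected operands separated by +")

    # Semantic analysis: names become loads, integers become immediates.
    code, sources, next_reg = [], [], 1
    for kind, text in operands:
        if kind == "name":
            reg = f"%r{next_reg}"
            next_reg += 1
            code.append(f"ld [{text}], %r0, {reg}")
            sources.append(reg)
        else:
            sources.append(text)

    # Code generation: sum everything into one accumulator register.
    acc = sources[0]
    if not acc.startswith("%"):
        reg = f"%r{next_reg}"
        code.append(f"addcc %r0, {acc}, {reg}")  # %r0 always reads as zero
        acc = reg
    for src in sources[1:]:
        code.append(f"addcc {acc}, {src}, {acc}")
    code.append(f"st {acc}, %r0, [{target}]")
    return code

def compile_statement(source):
    return generate(tokenize(source))

for line in compile_statement("Foo = Bar + Zot + 15"):
    print(line)
```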

4
Compiler Issues
  • Each compiler is specific to a particular ISA
  • E.g., an int on one machine may be 32 bits, on
    another may be 64 bits
  • This was the cause of an error in a networking
    library ported to Alpha
  • Int issue is not a problem in Java: the JVM
    specifies 32 bits
  • E.g., in the previous example, if the ISA allowed
    operands of addcc to be memory addresses, we
    could have done
  • addcc [Bar], [Zot], %r1
  • addcc %r1, 15, [Foo]
  • Hopefully the compiler generates efficient code,
    but optimization is a tough issue!
  • Cross compiler: one that generates code for a
    different ISA (example: CodeWarrior)

5
Mapping Variables to Memory
  • Global variables
  • Accessible from anywhere in the program, given a
    fixed address
  • E.g., global variable X at memory address 400
  • Local variables
  • Also called automatic variables
  • Defined inside a function or method, e.g.
  • void foo() {
  •   int a, b;
  • }
  • These variables are created when foo is invoked
    and destroyed when foo exits
  • These variables are created by pushing them on
    the stack when the function is invoked, and are
    popped off when the function exits

6
Local Variables and the Stack
  • Recall that the stack typically grows downward in
    memory
  • Here we start with 1234 stored on the top of the
    stack

[Figure: memory before and after Push FFFF. Before: address 8 holds 1234, SP = 8. After: address 4 holds FFFF, address 8 holds 1234, SP = 4.]
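
A downward-growing stack can be modeled with a dictionary for memory and an integer for SP. The sketch below is illustrative only: the class name DownwardStack and the 4-byte word size are assumptions, and the starting values (1234 at address 8, then Push FFFF) are taken from the figure on this slide.

```python
WORD = 4  # assumed word size in bytes

class DownwardStack:
    """Toy model of a stack that grows toward lower addresses."""

    def __init__(self, sp, memory=None):
        self.sp = sp                      # address of the current top of stack
        self.memory = dict(memory or {})  # address -> stored word

    def push(self, value):
        self.sp -= WORD                   # grow downward first...
        self.memory[self.sp] = value      # ...then store at the new top

    def pop(self):
        value = self.memory.pop(self.sp)  # read the top word...
        self.sp += WORD                   # ...then shrink back upward
        return value

stack = DownwardStack(sp=8, memory={8: 0x1234})
stack.push(0xFFFF)
print(stack.sp, {addr: hex(word) for addr, word in sorted(stack.memory.items())})
```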
7
Local Variables and the Stack
  • In our case, local variables are pushed on the
    stack upon entering the function
  • void foo() { int a; }
  • Copy SP into Frame Pointer FP (also called the
    Base Pointer, or BP)

[Figure: memory before Foo and in Foo. Before Foo: address 8 holds 1234, SP = 8. In Foo: address 4 holds Var a, address 8 holds 1234, SP = 4, FP = 8.]
8
Accessing Stack Variables
  • These variables are referenced as offsets from
    the frame pointer, called based addressing
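  • To access a: fp - 4
  • Why not use sp? Consider pushing lots of stuff
    on the stack, or data structures: sp moves, so
    an offset from sp would keep changing

[Figure: memory in Foo. Address 4 holds Var a, address 8 holds 1234, SP = 4, FP = 8.]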
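
The reason for addressing locals from the frame pointer can be checked with a small simulation. This is a sketch under assumptions: memory is a plain dictionary, the word size is 4 bytes, and the values stored in a and in the temporary are made up for illustration. The addresses follow the figure (SP = 8 on entry).

```python
WORD = 4
memory = {8: 0x1234}  # caller's data, as in the figure
sp = 8

# Function entry: copy SP into FP, then reserve one word for local a.
fp = sp
sp -= WORD
A_OFFSET = -4                                   # a lives at fp - 4
memory[fp + A_OFFSET] = 42                      # a = 42 (illustrative value)
offset_from_sp_before = (fp + A_OFFSET) - sp    # distance from SP to a

# Pushing a temporary moves SP, so a's offset from SP changes.
sp -= WORD
memory[sp] = 7                                  # illustrative temporary
offset_from_sp_after = (fp + A_OFFSET) - sp

# The frame-pointer offset is unaffected by the push.
value_of_a = memory[fp + A_OFFSET]
print(offset_from_sp_before, offset_from_sp_after, value_of_a)
```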
9
C to ASM Example on x86
  • #include <stdio.h>
  • int c;
  • int main() {
  •   int a, b;
  •   a = 3;
  •   b = 4;
  •   c = a * b;
  • }

pushl %ebp
movl %esp, %ebp
subl $8, %esp
movl $3, -4(%ebp)
movl $4, -8(%ebp)
movl -4(%ebp), %eax
imull -8(%ebp), %eax
movl %eax, c
.comm c,4,4

10
Arrays in Memory
  • Arrays may be allocated on the stack or allocated
    off the heap, a pool of memory where portions may
    be dynamically allocated. Accessing elements of an
    array is a bit different from accessing regular
    variables.
  • int A[10]; // Array of 10 integers

[Figure: memory allocated for A, spanning addresses 4 to 40, with A (Base) = 4 and elements A[0], A[1], ..., A[9].]
ElementAddr = A + (Index * Size), e.g. A[2] is at 4 + (2 * 4) = 12
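
The element-address formula is simple enough to check directly. In this sketch the function name element_address and the bounds check are assumptions added for illustration. The base address 4 and element size 4 come from the slide's example.

```python
def element_address(base, index, size, length):
    """Return base + index * size, rejecting out-of-range indices."""
    if not 0 <= index < length:
        raise IndexError(f"index {index} outside 0..{length - 1}")
    return base + index * size

# int A[10] with base address 4 and 4-byte elements, as on the slide.
addresses = [element_address(4, i, 4, 10) for i in range(10)]
print(addresses)
```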
11
If-Statements
  • Conditional statements map to a comparison and a
    branch instruction
  • C
  • if (x == y) statement1; else statement2;
  • Assembly (assume x in %r1, y in %r2)
  • subcc %r1, %r2 ! Zero flag set if result = 0
  • bne Statement2 ! Branch if zero flag is not
    set
  • ! Statement1 code
  • ba StatementNext ! Branch always
  • Statement2: ! Statement2 code
  • StatementNext:

12
Loops
  • While, Do-While, and For loops are implemented
    using the same conditional check and branch as
    the if-then statement
  • The branch jumps back to earlier code instead
    of jumping forward over code
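
To make the backward branch concrete, here is a toy interpreter for a handful of ARC-style instructions, running the loop `while (i != 0) { sum += i; i -= 1; }`. Everything about it is an assumption made for illustration: the tuple encoding, the run function, and the choice to model only the zero flag and only addcc, subcc, be, and ba. As in ARC, %r0 always reads as zero and writes to it are discarded.

```python
def run(program, labels, regs):
    """Execute a list of (op, *args) tuples; return the final registers."""
    regs = {**regs, "%r0": 0}
    pc, zero = 0, False

    def value(operand):
        # Strings name registers; anything else is an immediate constant.
        return regs[operand] if isinstance(operand, str) else operand

    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op in ("addcc", "subcc"):
            a, b, dst = args
            result = value(a) + value(b) if op == "addcc" else value(a) - value(b)
            if dst != "%r0":
                regs[dst] = result
            zero = (result == 0)      # condition code used by be
        elif op == "be":              # branch if the zero flag is set
            if zero:
                pc = labels[args[0]]
        elif op == "ba":              # branch always
            pc = labels[args[0]]
        else:
            raise ValueError(f"unknown instruction {op}")
    return regs

program = [
    ("subcc", "%r1", "%r0", "%r0"),   # loop: compare i with 0
    ("be", "done"),                   #       leave the loop when i == 0
    ("addcc", "%r2", "%r1", "%r2"),   #       sum += i
    ("subcc", "%r1", 1, "%r1"),       #       i -= 1
    ("ba", "loop"),                   #       backward branch to the test
]
labels = {"loop": 0, "done": 5}
result = run(program, labels, {"%r1": 4, "%r2": 0})
print(result["%r2"])
```

The only structural difference from an if-statement is the final `ba loop`, whose target lies earlier in the program.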

13
Production Level Assemblers
  • Allow programmer to specify location of data and
    code
  • Provide mnemonics for all instructions and
    addressing modes
  • Permit the use of symbolic labels to represent
    addresses and constants
  • Provide a means to specify the starting address
    of the program
  • Include a way to share variables between
    different assembled programs
  • Support macros

14
Assembly Example
15
Assembled Code
16
Two Pass Assemblers
  • Most assemblers are two-pass
  • First pass
  • Determine addresses of all data and instructions
  • Perform any assembly-time arithmetic
  • Put definitions and constants into the symbol
    table
  • Second pass
  • Generate machine code
  • Insert actual addresses and values of symbols
    which are known from the symbol table
  • Two passes are useful for forward references,
    i.e. references to symbols defined later in the
    program
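
A minimal sketch of the two passes, assuming a simplified source format: an optional `label:` prefix, `!` starting a comment, and every instruction occupying 4 bytes. The assemble function is hypothetical, and instead of emitting machine code the second pass only substitutes addresses for label operands, which is enough to show how a forward reference gets resolved.

```python
def assemble(lines, origin=0, word=4):
    """Return (symbol_table, listing) for a tiny label-only assembly format."""
    # First pass: assign an address to every instruction and record labels.
    symbols, instructions, location = {}, [], origin
    for line in lines:
        text = line.split("!")[0].strip()        # drop comments
        if ":" in text:
            label, text = text.split(":", 1)
            symbols[label.strip()] = location
            text = text.strip()
        if text:
            instructions.append((location, text))
            location += word

    # Second pass: every label is now known, including ones defined later.
    listing = []
    for address, text in instructions:
        mnemonic, _, rest = text.partition(" ")
        operands = [op.strip() for op in rest.split(",")] if rest else []
        operands = [str(symbols.get(op, op)) for op in operands]
        listing.append((address, " ".join([mnemonic, ", ".join(operands)]).strip()))
    return symbols, listing

program = [
    "        ba done            ! forward reference to a later label",
    "loop:   addcc %r1, 1, %r1",
    "        ba loop            ! backward reference",
    "done:   halt",
]
symbols, listing = assemble(program, origin=2048)
for address, text in listing:
    print(address, text)
```

A single pass would reach `ba done` before `done` has an address. Deferring substitution to the second pass avoids that.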

17
Forward Reference
18
Symbol Table
  • Generated during the first pass
  • Maps identifiers to values, table filled in as
    values are encountered and the program is parsed
    from top to bottom
  • .org 2048: says to assemble code starting at
    address 2048
  • const .equ value: defines const equal to value
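
How the symbol table fills in during the first pass, including the two directives above, can be sketched as follows. The function build_symbol_table and the sample program are hypothetical, and the sketch assumes whitespace-separated fields, integer directive arguments, and 4 bytes per instruction or data word.

```python
def build_symbol_table(lines, word=4):
    """First pass only: map each identifier to its value or address."""
    symbols, location = {}, 0
    for line in lines:
        fields = line.split("!")[0].split()      # drop comments, split fields
        if not fields:
            continue
        if fields[0] == ".org":
            location = int(fields[1])            # move the location counter
        elif len(fields) >= 3 and fields[1] == ".equ":
            symbols[fields[0]] = int(fields[2])  # constant, takes no memory
        else:
            if fields[0].endswith(":"):
                symbols[fields[0][:-1]] = location
                fields = fields[1:]
            if fields:
                location += word                 # instruction or data word
    return symbols

source = [
    "        .org 2048",
    "limit   .equ 15",
    "start:  ld [x], %r1",
    "        addcc %r1, limit, %r1",
    "x:      25",
]
table = build_symbol_table(source)
print(table)
```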

20
Assembled Program
21
Final Tasks of the Assembler
  • Linking and Loading
  • We need the following additional info
  • Module name and size
  • Address of the start symbol
  • Information about global and external symbols
  • Information about any library routines
  • Values of constants
  • Relocation information

22
Location of Programs in Memory
  • We have been using .org to specify a fixed start
    location
  • Typically we will want programs capable of
    running in arbitrary locations
  • If we are concatenating together different
    modules, the addresses for identifiers in the
    different modules must be relocated
  • Linker: software that combines separately
    assembled modules
  • Loader: software that loads another program into
    memory and may modify addresses if the program is
    loaded in a location different from the origin
  • Must also set appropriate registers, e.g. SP
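
Relocation when modules are concatenated can be illustrated with a toy linker. This is a sketch under assumptions: each module was assembled as if it started at address 0, it is described by a hypothetical dictionary with its name, size in bytes, and a symbol table of module-relative offsets, and every symbol is treated as relocatable.

```python
def link(modules, load_address=0):
    """Place modules back to back and relocate their symbols."""
    module_bases, global_symbols = {}, {}
    base = load_address
    for module in modules:
        module_bases[module["name"]] = base
        for symbol, offset in module["symbols"].items():
            if symbol in global_symbols:
                raise ValueError(f"symbol {symbol} defined twice")
            # Relocate: module-relative offset plus where the module landed.
            global_symbols[symbol] = base + offset
        base += module["size"]
    return module_bases, global_symbols

modules = [
    {"name": "main", "size": 16, "symbols": {"main": 0}},
    {"name": "sub", "size": 8, "symbols": {"sub": 0, "count": 4}},
]
bases, symbols = link(modules, load_address=2048)
print(bases, symbols)
```

Calling link with a different load_address shifts every result by the same amount, which is the adjustment a relocating loader makes.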

23
Linking .global and .extern
  • .global is used in the module where a symbol is
    defined, and .extern is used in every other module
    that refers to it

24
Linking and Loading
  • Symbol tables for previous example
  • Symbols whose address might change are marked
    relocatable (not all addresses! Some may be fixed)

25
DLLs
  • Windows uses Dynamic Link Libraries, or DLLs
  • Linking a common routine in many programs results
    in duplicate code from that common routine in
    each program
  • In a DLL, commonly used routines (e.g. memory
    management, graphics) present in only one place,
    the DLL
  • Smaller program sizes, each program does not need
    to have its own copy
  • All programs share the exact same code while
    executing
  • Programs don't need recompiling or relinking
    when a DLL is updated
  • Disadvantages
  • Deletion of a shared DLL by mistake can cause
    problems
  • Versions must be the same
  • DLL code file can live in many places in Windows
  • DLL Hell

26
Macros
  • An assembly macro looks kind of like defining a
    subroutine
  • For example, say that there is no PUSH
    instruction to push data on the stack. We can
    make a macro for push

27
Macro Expansion
  • Given the previous macro, we could now write the
    following code
  • push %r15 ! Push %r15 on the stack
  • push %r20 ! Push %r20 on the stack
  • Upon assembly, these macros are expanded to
    generate the following actual code
  • addcc %r14, -4, %r14
  • st %r15, %r14
  • addcc %r14, -4, %r14
  • st %r20, %r14
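
Macro expansion is plain text substitution, which a short sketch can show. The macro table format, the `{arg}` placeholder syntax, and the expand function are assumptions made for illustration. The body of push is the two-instruction sequence from the expansion on this slide.

```python
# Hypothetical macro table: name -> (parameter names, body templates).
MACROS = {
    "push": (["arg"], ["addcc %r14, -4, %r14", "st {arg}, %r14"]),
}

def expand(lines, macros=MACROS):
    """Replace each macro invocation with its body, substituting arguments."""
    output = []
    for line in lines:
        code = line.split("!")[0].strip()        # drop comments
        if not code:
            continue
        name, _, rest = code.partition(" ")
        if name not in macros:
            output.append(code)                  # ordinary instruction
            continue
        params, body = macros[name]
        args = [a.strip() for a in rest.split(",")] if rest else []
        if len(args) != len(params):
            raise ValueError(f"{name} expects {len(params)} argument(s)")
        bindings = dict(zip(params, args))
        output.extend(template.format(**bindings) for template in body)
    return output

expanded = expand([
    "push %r15   ! Push %r15 on the stack",
    "push %r20   ! Push %r20 on the stack",
])
print("\n".join(expanded))
```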

28
Macros vs. Subroutines
  • Later we will see how to write actual subroutines
    we can call
  • Only one copy of the shared code in a subroutine
  • Tradeoffs
  • Subroutines
  • Take up less memory since there is only one copy
    of the code
  • But slower than macros: subroutines have the
    overhead of invoking and returning
  • Macros
  • Take up more space than subroutine calls due to
    macro expansion for each occurrence of the macro
  • Faster than subroutines: no overhead to
    invoke/return
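
The tradeoff can be put in rough numbers by counting instructions. The model below is a deliberate simplification with assumed costs: one instruction to call, one to return, and nothing for argument passing or pipeline effects. The function names and the sample figures (a 2-instruction body used 10 times) are illustrative, not measurements.

```python
def code_size(uses, body_len, call_len=1, return_len=1):
    """Instructions stored in memory: (macro version, subroutine version)."""
    macro = uses * body_len                              # body copied per use
    subroutine = body_len + return_len + uses * call_len  # one shared body
    return macro, subroutine

def executed_instructions(uses, body_len, call_len=1, return_len=1):
    """Instructions executed: (macro version, subroutine version)."""
    macro = uses * body_len                              # no call overhead
    subroutine = uses * (call_len + body_len + return_len)
    return macro, subroutine

print(code_size(uses=10, body_len=2))
print(executed_instructions(uses=10, body_len=2))
```

Under these assumptions the subroutine version is smaller once the body is used more than a few times, while the macro version always executes fewer instructions.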

29
Skipping for now
  • Discussion on Pentium MMX
  • We may return to this later if time permits