An Introduction to the Mozart Abstract Machine

1
An Introduction to the Mozart Abstract Machine
  • Per Brand and Konstantin Popov

2
The Mozart System - Overview
  • Mozart Compiler
      • compiles Oz into an intermediate language
      • written in Oz
  • Mozart Virtual Machine
      • executes intermediate code
      • written in C
  • Tcl/Tk interpreter (GUI)
  • Emacs-based OPI (Oz Programming Interface, an Emacs Lisp mode)

3
Virtual Machines - why?
  • Portability
      • the same intermediate code runs everywhere
      • of course, a VM must be available on the target
        platform
  • Easier to implement
      • the so-called semantic gap between source language
        and machine language is bridged by the intermediate
        language
      • the Mozart Compiler and the Mozart VM taken together
        are simpler than a potential Oz-to-machine-code
        compiler

4
Virtual Machines around...
  • Historically: Lisp, Smalltalk
  • Low-level, stack-based: Forth
  • Logic programming: Prolog, etc.
  • Functional programming: ML, Haskell, Erlang
  • Modern imperative: Java

5
The Mozart VM - the Idea
  • The VM is a loop fetching and executing instructions.
  • Instructions: creating data structures,
    conditionals, procedure calls, thread creation
    etc.
  • Values are stored in the Store.
      • as in the language itself
  • The VM has a program pointer and registers.
      • registers refer to values in the Store

6
Properties of the Mozart Virtual Machine
  • Register-based virtual machine
      • temporaries and parameters are found in registers
        (so-called X registers)
      • the Java VM, by contrast, is stack-based
  • Register-based vs stack-based
      • closer to machine architecture - less work for
        the JIT
      • X registers are either machine registers or at
        least in cache
      • instructions are longer in a register-based than
        in a stack-based machine
  • Multi-paradigm virtual machine

7
Terminology
  • X-registers
      • a set of registers common to the whole virtual
        machine
  • Y-registers
      • correspond to stack variables (local variables)
        in conventional programming languages
      • relative to the current frame
  • G-registers
      • closure references
      • relative to the current procedure

8
The Mozart VM - the Idea (I)
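[Diagram: the Emulator fetches instructions from the Code Area through PC; Registers refer to values in the Store, e.g. the atom a.]
Code Area: inst(x(0)) inst(g(1)) inst ...
Emulator:
    emulator() {
      ...
      while (1) {
        op = fetch(PC);
        switch (op) {
          case call(X): ... inc(PC); continue;
          ...
        }
      }
    }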
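
The fetch-and-dispatch loop can be sketched in a few lines of Python. This is only an illustration, not Mozart's emulator: the tuple encoding of instructions and the run() helper are assumptions, and only three opcodes named on these slides are handled.

```python
# Toy machine: the opcode names follow the slides, but the tuple encoding
# and the run() helper are assumptions made for this sketch.
def run(code, x):
    """Fetch and execute instructions until 'return'; x holds the X registers."""
    pc = 0                              # program counter (PC)
    while True:
        op, *args = code[pc]            # fetch(PC)
        pc += 1                         # inc(PC)
        if op == "putConstant":         # putConstant(value reg)
            value, reg = args
            x[reg] = value
        elif op == "move":              # move(src dst)
            src, dst = args
            x[dst] = x[src]
        elif op == "return":
            return x
        else:
            raise ValueError(f"unknown instruction: {op}")


program = [
    ("putConstant", "a", 0),            # x(0) refers to the atom a
    ("move", 0, 1),                     # x(1) refers to the same value
    ("return",),
]
registers = run(program, [None, None])
```

After the run both X registers refer to the same atom; nothing is copied, only references move between registers.
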
9
Hello World example (1)
    declare P in
    proc {P}
       {System.show 'hello world'}
    end
    {P}

  • P is a procedure printing 'hello world'
  • P's closure contains a reference to the
    System module!

10
Hello World example (simplified)
    declare P in
    proc {P}
       {System.show 'hello world'}
    end
    {P}

compiles to

    lbl(7)  definition(x(0) 21 g(122))
            move(g(0) x(0))
            inlineDot(x(0) show x(1))
            putConstant('hello world' x(0))
            call(x(1))
            return
            endDefinition(7)
    lbl(21) unify(x(0) g(1024))
            call(g(1024))
11
Hello World example (2)
            definition(x(0) 21 g(122))
            move(g(0) x(0))
            inlineDot(x(0) show x(1))
            putConstant('hello world' x(0))
            call(x(1))
            return
            endDefinition(7)
    lbl(21) unify(x(0) g(1024))
            call(g(1024))

  • definition(x(0) 21 g(122)) creates a procedure as a
    first-class value.
      • x(0) is the register that will refer to the
        procedure
      • 21 is the label after the definition
      • g(122) is the register referring to the System
        module

12
Hello World example (3)
            definition(x(0) 21 g(122))
            move(g(0) x(0))
            inlineDot(x(0) show x(1))
            putConstant('hello world' x(0))
            call(x(1))
            return
            endDefinition(7)
    lbl(21) unify(x(0) g(0))
            call(g(0))

  • move(g(0) x(0)) moves the content of g(0) into x(0)
      • g(0) contains a reference to the System
        module
      • g(0) is local to the procedure and initialized
        by definition (discussed later!)

13
Hello World example (4)
            definition(x(0) 21 g(122))
            move(g(0) x(0))
            inlineDot(x(0) show x(1))
            putConstant('hello world' x(0))
            call(x(1))
            return
            endDefinition(7)
    lbl(21) unify(x(0) g(0))
            call(g(0))

  • inlineDot(x(0) show x(1)) retrieves the show procedure
    out of the System module
      • x(0) is initialised above
      • x(1) becomes a reference to the Show procedure

14
Hello World example (5)
            definition(x(0) 21 g(122))
            move(g(0) x(0))
            inlineDot(x(0) show x(1))
            putConstant('hello world' x(0))
            call(x(1))
            return
            endDefinition(7)
    lbl(21) unify(x(0) g(0))
            call(g(0))

  • putConstant('hello world' x(0)) creates an atom
    'hello world' in the Store and puts a reference to it
    into x(0)
15
Hello World example (6)
            definition(x(0) 21 g(122))
            move(g(0) x(0))
            inlineDot(x(0) show x(1))
            putConstant('hello world' x(0))
            call(x(1))
            return
            endDefinition(7)
    lbl(21) unify(x(0) g(0))
            call(g(0))

  • call(x(1)): now x(1) refers to the Show procedure and
    x(0) refers to 'hello world'. Show accesses its
    argument as x(0)!
16
Hello World example (7)
            definition(x(0) 21 g(122))
            move(g(0) x(0))
            inlineDot(x(0) show x(1))
            putConstant('hello world' x(0))
            call(x(1))
            return
            endDefinition(7)
    lbl(21) unify(x(0) g(0))
            call(g(0))

  • return gives control back to the place just after
    call(g(0))
  • endDefinition is not used for execution per
    se

17
Hello World example (8)
            definition(x(0) 21 g(122))
            move(g(0) x(0))
            inlineDot(x(0) show x(1))
            putConstant('hello world' x(0))
            call(x(1))
            return
            endDefinition(7)
    lbl(21) unify(x(0) g(0))
            call(g(0))
            continue...

Execution proceeds further...
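
The whole walkthrough can be replayed on a toy interpreter. The sketch below is an assumption-laden simplification, not Mozart code: instructions are Python tuples, "moveGX" is a made-up name for move(g(..) x(..)), the closure list of definition is given as indexes into the current G registers, the System module is a dictionary, Show is a Python callable, and the top-level unify/call pair is reduced to a single call(x(0)).

```python
from dataclasses import dataclass


@dataclass
class Abstraction:
    entry: int      # index of the first instruction of the body
    g: list         # closure environment (G registers)


def run(code, top_g):
    x = [None] * 2              # X registers, shared by the whole machine
    g = top_g                   # G registers of the running procedure
    tasks = []                  # task stack of (return pc, saved G registers)
    pc = 0
    while pc < len(code):
        op, *a = code[pc]
        pc += 1
        if op == "definition":          # definition(dst end_label captured)
            dst, end, captured = a
            x[dst] = Abstraction(pc, [g[i] for i in captured])
            pc = end                    # skip over the procedure body
        elif op == "moveGX":            # move(g(i) x(j))
            x[a[1]] = g[a[0]]
        elif op == "inlineDot":         # inlineDot(x(i) feature x(j))
            x[a[2]] = x[a[0]][a[1]]
        elif op == "putConstant":       # putConstant(value x(i))
            x[a[1]] = a[0]
        elif op == "call":
            target = x[a[0]]
            if callable(target):        # builtin: takes its argument from x(0)
                target(x[0])
            else:                       # abstraction: save the return point
                tasks.append((pc, g))
                pc, g = target.entry, target.g
        elif op == "return":
            if not tasks:
                break
            pc, g = tasks.pop()


output = []
system = {"show": output.append}        # stand-in for the System module
code = [
    ("definition", 0, 6, [0]),          # 0: x(0) := P, closure captures System
    ("moveGX", 0, 0),                   # 1: x(0) := g(0), the System module
    ("inlineDot", 0, "show", 1),        # 2: x(1) := System.show
    ("putConstant", "hello world", 0),  # 3: x(0) := 'hello world'
    ("call", 1),                        # 4: Show reads x(0)
    ("return",),                        # 5: back to the instruction after the call
    ("call", 0),                        # 6: top level applies P
]
run(code, [system])
```

Note how x(0) is reused three times (procedure, module, argument): X registers are scratch space, while the reference to System survives in the closure.
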
18
Oz Data Types in VM
  • A (partial) value in the VM is a graph such that
      • nodes of primitive values (atoms, integers etc.)
        have no outgoing arcs
      • nodes of compound values (e.g. records) do have
        outgoing arcs; we call them references
      • variable nodes can be bound, after which they
        become transparent references

19
Primitive Values
  • Atoms - objects with strings inside
  • Integers - objects with integers inside
  • Booleans - objects with 0/1 values inside
  • Conveniently, these boxes are real C objects with
    operations relevant to their types.

20
Records
  • A record in the VM is an object that refers to
      • an atom (name) which is the record's label
      • a sorted list of feature names
      • record subtrees (stored in an array)
  • For the sake of efficiency, records also refer to
    hash tables that map feature names to array
    indexes
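
As a rough illustration of this layout (not Mozart's actual C data structures), a record can be modelled with a label, a sorted feature list, an array of subtrees, and a hash table from feature to array index. The Record class and its select method are hypothetical names.

```python
class Record:
    """Sketch of a record node: label, sorted features, subtree array, index table."""

    def __init__(self, label, fields):
        self.label = label
        self.features = sorted(fields)                      # sorted list of feature names
        self.subtrees = [fields[f] for f in self.features]  # subtrees stored in an array
        # hash table mapping each feature name to its array index
        self.index = {f: i for i, f in enumerate(self.features)}

    def select(self, feature):
        """Feature access: one hash lookup, then one array access."""
        return self.subtrees[self.index[feature]]


r = Record("label", {"f2": 1, "f1": "a"})   # R = label(f1:a f2:1)
```

The hash table avoids a linear search through the feature list each time a feature is selected.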

21
Records (II)
R = label(f1:a f2:1)
[Diagram: R refers to the atom label, to the feature list f1 f2 with its Hash Table, and to the subtrees a and 1.]
22
Records (III)
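R = label(f1:a f2:R)
[Diagram: R refers to the atom label, to the feature list f1 f2 with its Hash Table, and to the subtrees a and R itself (a cyclic reference).]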
23
Cells
  • A Cell is just a box with a reference to a value.

    C = {Cell.new unit}

[Diagram: C refers to a cell box, which refers to unit.]
24
Abstractions
  • (Remember) Oz procedures can refer to values in
    their lexical scope - they are closures
  • the environment of a procedure is an array of
    references (g-registers)

    declare EnvVar Proc in
    proc {Proc X}
       EnvVar = X
    end

[Diagram: Proc refers to its code (inst() inst() return ...) in the Code Area through a PC, and to EnvVar through its environment.]
25
Representation of Types of Nodes
  • Types of nodes in the store are represented by
    references which are typed. We call them tagged
    references (3 or 4 bits).

[Diagram: Emulator registers pointing into the Store through references tagged list and int; the int nodes hold 1 and 2.]
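
A tagged reference packs a few type bits next to the payload in one machine word. The sketch below assumes 3 tag bits in the low end of the word and invents the tag values; Mozart's real tag assignment is not given on the slides.

```python
TAG_BITS = 3                          # the slides mention 3 or 4 bits
TAG_MASK = (1 << TAG_BITS) - 1

# Hypothetical tag values, chosen only for this example.
TAG_INT, TAG_LIST, TAG_REF = 1, 2, 3


def make_tagged(payload, tag):
    """Pack a payload (small integer or address) and a tag into one word."""
    return (payload << TAG_BITS) | tag


def tag_of(word):
    return word & TAG_MASK            # low bits carry the type


def payload_of(word):
    return word >> TAG_BITS           # remaining bits carry the value or address


word = make_tagged(42, TAG_INT)
```

With this layout the VM can decide what a reference points to by masking the word, without touching the node itself.
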
26
Representation of Types of Nodes-2
  • Sometimes a tagged reference (or pointer) has to be
    combined with a tagged object (each object knows
    its size)

[Diagram: Emulator registers referring to nodes tagged obj and ext.]
27
Variables
  • A variable is an object such that
      • an unbound variable has no reference in it. Thus,
        it looks like a primitive value
      • an unbound variable is recognised by the VM as such
      • a bound variable object refers to another value
      • a bound variable object is transparent for
        operations on values, i.e. it becomes a reference
  • The VM can step through adjacent references. This
    is called dereferencing

28
Variables
  • X Y Z in X=Y, X=Z, X=1
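
A minimal sketch of binding and dereferencing for the three variables above. It assumes a Var class whose ref field is None while the variable is unbound, and it only covers binding a variable, not full unification of two values.

```python
class Var:
    """Variable node: unbound while ref is None, a transparent reference afterwards."""

    def __init__(self):
        self.ref = None


def deref(node):
    """Step through bound variables until an unbound variable or a value is reached."""
    while isinstance(node, Var) and node.ref is not None:
        node = node.ref
    return node


def bind(left, right):
    left, right = deref(left), deref(right)
    if left is right:
        return                          # already the same node
    if not isinstance(left, Var):
        raise TypeError("this sketch only binds unbound variables")
    left.ref = right                    # left becomes a reference


X, Y, Z = Var(), Var(), Var()
bind(X, Y)      # X = Y: X now refers to Y
bind(X, Z)      # X = Z: X dereferences to Y, so Y is bound to Z
bind(X, 1)      # X = 1: X dereferences to Z, so Z is bound to 1
```

After the three bindings X reaches the integer 1 through the chain X, Y, Z, which is exactly the chain of adjacent references the VM has to step through.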

29
Compiling Data Structures
  • Values from a program text need to be constructed
    in the Store.
  • Primitive values are constructed with putInt,
    putConstant, etc.

    X = atomEx    compiles to    putConstant(atomEx x(2))
30
Compiling Records
  • Records are constructed top-down, similar to
    Prolog's WAM

    R = rec(f1:tup(a) f2:2)

compiles to

    putRecord(rec f1 f2 x(2))
    unifyVariable(x(1))
    unifyNumber(2)
    getRecord(tup 1 x(1))
    unifyLiteral(a)
31
Compiling Records (2)
R = rec(f1:tup(a) f2:2)

    putRecord(rec f1 f2 x(2))
    unifyVariable(x(1))
    unifyNumber(2)
    getRecord(tup 1 x(1))
    unifyLiteral(a)

putRecord(rec f1 f2 x(2)) creates a record node with subtrees which
are unbound variables
32
Compiling Records (3)
R = rec(f1:tup(a) f2:2)

    putRecord(rec f1 f2 x(2))
    unifyVariable(x(1))
    unifyNumber(2)
    getRecord(tup 1 x(1))
    unifyLiteral(a)

unifyVariable(x(1)) unifies the first subtree (under f1) with a
new variable in x(1)
33
Compiling Records (4)
R = rec(f1:tup(a) f2:2)

    putRecord(rec f1 f2 x(2))
    unifyVariable(x(1))
    unifyNumber(2)
    getRecord(tup 1 x(1))
    unifyLiteral(a)

unifyNumber(2) unifies the second subtree (under f2) with
integer 2
34
Compiling Records (5)
R = rec(f1:tup(a) f2:2)

    putRecord(rec f1 f2 x(2))
    unifyVariable(x(1))
    unifyNumber(2)
    getRecord(tup 1 x(1))
    unifyLiteral(a)

getRecord(tup 1 x(1)) unifies x(1) with a new tuple
35
Compiling Records (6)
R = rec(f1:tup(a) f2:2)

    putRecord(rec f1 f2 x(2))
    unifyVariable(x(1))
    unifyNumber(2)
    getRecord(tup 1 x(1))
    unifyLiteral(a)

unifyLiteral(a) unifies the first (and sole) subtree of tup()
with a
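
The five instructions can be replayed on a small model of the Store. This is a sketch under stated assumptions: Machine, Rec, Var and value_of are hypothetical names, unify_value stands for both unifyNumber and unifyLiteral, and get_record only covers the case shown here, where the register holds an unbound variable.

```python
class Var:
    def __init__(self):
        self.ref = None                 # None while unbound


class Rec:
    def __init__(self, label, features):
        self.label = label
        self.features = features
        self.subtrees = [Var() for _ in features]   # subtrees start unbound


class Machine:
    def __init__(self):
        self.x = [None] * 3
        self.rec = None                 # record whose subtrees are being filled in
        self.pos = 0                    # index of the next subtree

    def put_record(self, label, features, reg):
        self.x[reg] = self.rec = Rec(label, features)
        self.pos = 0

    def get_record(self, label, features, reg):
        # Assumes x(reg) is an unbound variable: bind it to a fresh record.
        var = self.x[reg]
        var.ref = self.rec = Rec(label, features)
        self.pos = 0

    def unify_variable(self, reg):
        self.x[reg] = self.rec.subtrees[self.pos]   # remember the subtree in x(reg)
        self.pos += 1

    def unify_value(self, value):
        self.rec.subtrees[self.pos].ref = value     # bind the subtree to a value
        self.pos += 1


def value_of(node):
    """Dereference and convert a store graph to plain Python data."""
    while isinstance(node, Var) and node.ref is not None:
        node = node.ref
    if isinstance(node, Rec):
        fields = {f: value_of(s) for f, s in zip(node.features, node.subtrees)}
        return (node.label, fields)
    return node


m = Machine()
m.put_record("rec", ["f1", "f2"], 2)    # putRecord(rec f1 f2 x(2))
m.unify_variable(1)                     # unifyVariable(x(1))
m.unify_value(2)                        # unifyNumber(2)
m.get_record("tup", [1], 1)             # getRecord(tup 1 x(1))
m.unify_value("a")                      # unifyLiteral(a)
result = value_of(m.x[2])
```

The outer record is finished before the nested tuple is started, which is what "top-down" means here: x(1) holds the pending subtree until the later instructions fill it in.
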
36
Compiling Abstractions
  • One specifies registers that are to be saved in
    the closure

    declare EnvVar Proc in
    proc {Proc X}
       EnvVar = X
    end

compiles to

    lbl(7)  definition(x(0) 21 g(122))
            ...
            return
            endDefinition(7)
    lbl(21) ...
37
Conditionals
  • Check condition(s) and proceed in one of two
    branches.
      • there is a branch <label> instruction
      • very similar to C compiled for any RISC
        architecture!

38
Conditionals (II)
    declare X in
    if X == 1 then {Show ok} else {Show no} end

With x(1) containing X and x(2) containing Show, this compiles to

            testNumber(x(1) 1 22)
            putConstant(ok x(0))
            call(x(2))              % ok passed in x(0)
            branch 31               % skip else clause
    lbl(22) putConstant(no x(0))
            call(x(2))              % no passed in x(0)
    lbl(31) ...
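
The control flow of this listing can be checked with a toy interpreter. It is a sketch, assuming that X is already bound when the test runs (an unbound X would suspend the thread instead), that labels are resolved by a simple scan, and that a made-up "show" instruction stands in for call(x(2)) with x(2) referring to Show.

```python
def run(code, x1):
    # Map each label to the index of its lbl pseudo-instruction.
    labels = {ins[1]: i for i, ins in enumerate(code) if ins[0] == "lbl"}
    x = [None, x1]                      # x(1) contains X
    shown = []
    pc = 0
    while pc < len(code):
        op, *a = code[pc]
        pc += 1
        if op == "testNumber":          # testNumber(x(i) number else_label)
            reg, number, else_label = a
            if x[reg] != number:
                pc = labels[else_label] # condition failed: go to the else clause
        elif op == "putConstant":       # putConstant(value x(i))
            x[a[1]] = a[0]
        elif op == "show":              # stands in for call(x(2))
            shown.append(x[0])
        elif op == "branch":
            pc = labels[a[0]]
        elif op == "lbl":
            pass                        # labels do nothing at run time
    return shown


code = [
    ("testNumber", 1, 1, 22),
    ("putConstant", "ok", 0),
    ("show",),
    ("branch", 31),                     # skip else clause
    ("lbl", 22),
    ("putConstant", "no", 0),
    ("show",),
    ("lbl", 31),
]
then_result = run(code, 1)
else_result = run(code, 2)
```
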
39
Procedure Application
  • Arguments are passed in X registers
      • there is one single set of X registers
  • The return point is saved in a task on the task stack
      • it is called a task stack because it also serves
        other purposes (e.g. exception handling)
  • A procedure finishes with return
      • pops the task from the stack

40
Procedure Application (II)
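[Diagram: PC points into the code of the running procedure (inst() ... return ...); each task on the Task Stack points to the instruction following a call() in a caller (call() inst() ...).]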
41
Local Variables (II)
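[Diagram: PC points into the running procedure (move y(0) x(1) ... return ...), which uses the topmost set of Y registers; a task on the Task Stack points to the caller's continuation (call() ... move y(1) x(1) ...) with its own Y registers.]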
42
Local Variables
  • Local variables are kept in Y registers
      • associated with tasks; only the topmost set is
        accessible for manipulation
      • explicitly allocated and deallocated through
        allocate <N> and deallocate instructions
      • Y registers are accessible through move
        <reg> <reg> instructions

43
Tail-call optimization
  • A task frame is only needed when a procedure
    contains either
      • 2 or more call instructions
      • 1 call instruction, but other instructions follow
  • Otherwise no frame is allocated
  • Example: Partition (tail-recursive)

44
Accessing Closure Variables
  • A pointer to the abstraction's environment array
    is known to the VM as G registers
      • set by the call <reg> instruction
      • saved on the stack when a nested procedure is called
      • accessible by move <reg> <reg> instructions
  • Thus, a task in the task stack is a triple
    <PCret, Yregs, Gregs>
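
The following sketch shows the <PCret, Yregs, Gregs> triple being pushed by call and popped by return. It is a toy model with several assumptions: the register-specific move names (moveXY, moveYX, moveGX) are invented, and call takes the callee's entry point and environment directly instead of a register.

```python
def run(code, g):
    x = [None] * 3              # one single set of X registers
    y = None                    # Y registers of the running procedure
    tasks = []                  # task stack of (PCret, Yregs, Gregs) triples
    pc = 0
    while True:
        op, *a = code[pc]
        pc += 1
        if op == "allocate":            # allocate <N>
            y = [None] * a[0]
        elif op == "deallocate":
            y = None
        elif op == "putConstant":
            x[a[1]] = a[0]
        elif op == "moveXY":
            y[a[1]] = x[a[0]]
        elif op == "moveYX":
            x[a[1]] = y[a[0]]
        elif op == "moveGX":
            x[a[1]] = g[a[0]]
        elif op == "call":              # hypothetical form: call(entry env)
            tasks.append((pc, y, g))    # save return point, Y and G registers
            pc, y, g = a[0], None, a[1]
        elif op == "return":
            if not tasks:
                return x
            pc, y, g = tasks.pop()      # restore the caller's context


code = [
    ("allocate", 1),                    # 0: caller gets one Y register
    ("putConstant", "kept", 0),         # 1: x(0) := kept
    ("moveXY", 0, 0),                   # 2: y(0) := x(0)
    ("call", 8, ["callee env"]),        # 3: callee overwrites x(0)
    ("moveYX", 0, 1),                   # 4: x(1) := y(0), still intact
    ("moveGX", 0, 2),                   # 5: x(2) := g(0), the caller's closure again
    ("deallocate",),                    # 6
    ("return",),                        # 7: empty task stack, so the run ends
    ("moveGX", 0, 0),                   # 8: callee body: x(0) := its own g(0)
    ("return",),                        # 9
]
x = run(code, ["caller env"])
```

The call clobbers x(0), while the value parked in y(0) and the caller's G registers come back unchanged once the task is popped.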

45
Memory Management (overview)
  • Values in the store can become garbage
      • e.g. when an array of Y registers is deallocated
  • Garbage collection reclaims garbage: it traces
    all live data and frees unused space
  • Mozart (so far) uses a stop-and-copy
    collector: live data is copied into a new area,
    and the old area is freed (including garbage!)
  • Nodes reachable through registers and stack in
    the VM are considered live (i.e. could be
    accessed in the future)
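
A stop-and-copy collector can be sketched as follows. This is a generic illustration in the style of Cheney's algorithm, not Mozart's collector: the heap is a Python list of (value, child indexes) nodes, and roots stand for the registers and stacks.

```python
def collect(from_space, roots):
    """Copy everything reachable from roots into a fresh to-space."""
    to_space = []
    forward = {}                        # old index -> new index

    def copy(old):
        if old not in forward:          # copy each live node exactly once
            forward[old] = len(to_space)
            value, children = from_space[old]
            to_space.append((value, list(children)))    # children still old indexes
        return forward[old]

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(to_space):         # scan copied nodes, copying their children
        value, children = to_space[scan]
        to_space[scan] = (value, [copy(c) for c in children])
        scan += 1
    return to_space, new_roots


old_heap = [
    ("garbage", []),                    # 0: unreachable
    ("b", [3]),                         # 1: refers back to node 3 (a cycle)
    ("unused", [1]),                    # 2: unreachable, although it points to live data
    ("a", [1]),                         # 3: the only root
]
new_heap, new_roots = collect(old_heap, [3])
```

Only the two reachable nodes are copied; the old area, garbage included, can then be freed as a whole.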

46
Threads
  • Threads are created by means of the
    thread <lbl> instruction

    thread E end ...

compiles to

            thread(33)
            CE                  % code for E
    lbl(33) ...

  • E is executed concurrently in a new thread

47
Threads (II)
  • A thread consists essentially of a task
    stack
  • A thread can be runnable, running, blocked or
    terminated.
      • blocked: cannot advance because of lack of
        information in the Store
  • The VM contains a scheduler and a pool of runnable
    threads
  • Question: how to manage blocked threads?

48
Threads (III)
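  • Blocked threads are associated with the variable(s)
    whose missing bindings caused the threads to block

    declare X in if X == 1 then ... end

[Diagram: the variable X in the Store refers to a suspension, which refers to the blocked Thread and its stack; the suspension list ends with nil.]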
49
Threads (IV)
  • Suspensions are created by, e.g., the test
    instructions used for compiling conditionals
  • There can be many threads blocked on the same
    variable!
  • Binding a variable involves scheduling all
    blocked threads for execution
      • suspensions are deallocated
      • threads are entered into the thread pool
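
The interplay of suspensions, binding and the scheduler can be sketched with Python generators playing the role of threads. Everything here is an assumption for illustration: the Var class, the conditional thread body and the schedule function are hypothetical, and a thread blocks by yielding the variable it needs.

```python
from collections import deque


class Var:
    def __init__(self):
        self.bound = False
        self.value = None
        self.suspensions = []           # threads blocked on this variable


runnable = deque()                      # pool of runnable threads


def bind(var, value):
    var.bound, var.value = True, value
    runnable.extend(var.suspensions)    # schedule every blocked thread
    var.suspensions = []                # the suspensions are dropped


def conditional(name, var, log):
    """A thread running 'if X == 1 then ... end', written as a generator."""
    while not var.bound:
        yield var                       # the test cannot decide yet: block on var
    log.append((name, var.value == 1))


def schedule():
    while runnable:
        thread = runnable.popleft()
        try:
            var = next(thread)          # run until the thread blocks or ends
        except StopIteration:
            continue                    # thread terminated
        var.suspensions.append(thread)  # create a suspension on the variable


X, log = Var(), []
runnable.extend([conditional("t1", X, log), conditional("t2", X, log)])
schedule()                              # both threads block on X
blocked = len(X.suspensions)
bind(X, 1)                              # wakes both threads
schedule()
```

Blocked threads cost nothing while they wait: they sit in the variable's suspension list and the scheduler only sees them again after the binding.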

50
Advanced Issues - Data Types
  • Ports are objects with the send method, which
    just adds elements to the port's stream
      • no additional synchronisation primitive(s) are
        needed
  • Objects have a highly optimised built-in
    implementation
      • dedicated (C) abstraction
      • specialised VM instructions for method
        application, etc.

51
Advanced Issues - Exceptions
  • Remember that handling an exception means that
      • the remaining computation in the try clause is
        discarded
      • a specified action is executed instead
  • Note that the remaining computation is
    represented by a certain number of topmost tasks
    in the thread's stack

52
Advanced Issues - Exceptions (II)
  • The exception handling mechanism pushes a dedicated
    task onto the task stack - an exception handler
      • it delimits the computation in the try clause
      • when no exception occurs, the task is
        silently discarded when reached
      • when an exception does occur, all tasks above
        the exception handler are discarded and the
        handler's action is executed
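
Both paths through a handler task can be sketched on a plain list used as the task stack. The task encoding (kind, payload) and the two helper functions are assumptions; in particular, popping the handler task itself when an exception is raised is a guess at the detail the slide leaves open.

```python
def pop_task(tasks):
    """Normal return: handler tasks that are reached are silently discarded."""
    while tasks:
        kind, payload = tasks.pop()
        if kind != "handler":
            return payload
    return None                         # nothing left to run


def raise_exception(tasks):
    """Discard tasks down to the nearest handler and return its action."""
    while tasks:
        kind, payload = tasks.pop()
        if kind == "handler":
            return payload              # the handler's action runs next
    return None                         # no handler: the exception is uncaught


# The try clause was entered after the task returning to 10 was pushed.
stack = [("return", 10), ("handler", "catch code"), ("return", 20), ("return", 30)]
action = raise_exception(stack)

quiet = [("return", 10), ("handler", "catch code")]
next_pc = pop_task(quiet)               # no exception: the handler is skipped
```
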

53
Advanced Issues - Optimization
  • Run-time code optimization
      • {Obj m(X Y Z)}
      • compiled as: move(? x(0)) move(? x(1)) move(?
        x(2)) sendMsg(m ObjectReg 3)
      • at run time, changed to direct access to the method
        in the method table (from Smalltalk)
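
One common way to get such direct access is an inline cache, where the instruction remembers the method found by the first lookup. The sketch below shows that general technique only; how Mozart actually rewrites sendMsg is not described on the slide, and ObjClass, send_msg and the stats counter are hypothetical.

```python
class ObjClass:
    """Hypothetical class descriptor holding a method table."""

    def __init__(self, methods):
        self.methods = methods          # method table: name -> function


def send_msg(instr, cls, args, stats):
    """Execute a sendMsg-like instruction, caching the method in the instruction."""
    if instr.get("cached_class") is not cls:
        stats["lookups"] += 1                           # slow path: search the table
        instr["cached_method"] = cls.methods[instr["name"]]
        instr["cached_class"] = cls                     # rewrite the instruction in place
    return instr["cached_method"](*args)                # fast path: direct access


point = ObjClass({"m": lambda x, y, z: x + y + z})
instr = {"name": "m"}                                   # sendMsg(m ObjectReg 3)
stats = {"lookups": 0}
first = send_msg(instr, point, (1, 2, 3), stats)
second = send_msg(instr, point, (4, 5, 6), stats)
```

The second send skips the table lookup because the instruction already carries the method for that class.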

54
Advanced Issues - Optimization -2
  • Threaded code
      • requires the GNU compiler (not standard C)
      • each instruction is actually the address of the
        case branch for that specific instruction
      • 30% speedup
      • increases code size
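
Python has no computed goto, so the sketch below only mimics the idea: opcodes are replaced once, at load time, by direct references to their handlers, so the execution loop no longer decodes anything. The handler names and the thread_code helper are assumptions.

```python
def op_put(x, value, reg):
    x[reg] = value                      # putConstant(value x(reg))


def op_move(x, src, dst):
    x[dst] = x[src]                     # move(x(src) x(dst))


HANDLERS = {"putConstant": op_put, "move": op_move}


def run_switch(code, x):
    """Plain dispatch: the opcode is decoded every time it is executed."""
    for op, *args in code:
        HANDLERS[op](x, *args)
    return x


def thread_code(code):
    """Replace each opcode by the address of its handler, once, at load time."""
    return [(HANDLERS[op], args) for op, *args in code]


def run_threaded(threaded, x):
    """Threaded dispatch: jump straight to the stored handler."""
    for handler, args in threaded:
        handler(x, *args)
    return x


code = [("putConstant", "a", 0), ("move", 0, 1)]
plain = run_switch(code, [None, None])
threaded = run_threaded(thread_code(code), [None, None])
```

A handler address is wider than a small opcode, which is where the increase in code size comes from.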

55
Advanced Issues - Optimization -3
  • Emulation is interpretation at a very low level
  • It is believed (based on experience with other
    VMs) that native code compilation would increase
    performance by 2-3 times on a RISC
  • JIT or native code compilation of a stack-based
    VM (e.g. Java) gives more
      • stack-based virtual machines are slower
      • part of the improvement comes from stack-to-register
        transformations
  • Dynamically typed language systems, and even
    statically typed ones with dataflow, are slower than
    statically typed languages

56
Topics NOT Mentioned At All
  • Further, nearly conservative (in spirit)
    extensions of the VM allow for
      • constraint solving facilities, including the Oz
        search combinator
      • distributed programming
      • covered in the next lecture