Trustless Grid Computing in ConCert (Progress Report)

Learn more at: http://www.cs.cmu.edu

1
Trustless Grid Computing in ConCert (Progress Report)

Robert Harper
Carnegie Mellon University

2
Acknowledgements

Co-PIs: Karl Crary, Frank Pfenning, Peter Lee.
Support: NSF ITR program.
Students (who do the real work): Chang, Delap,
Dreyer, Kliger, Magill, Moody, Murphy, Petersen,
Sarkar, Vanderwaart, Watkins.
Thanks to FGC Organizers for the invitation!

3
Grid Computing

The network is a computer.
Exploit idle resources on the network.
Many ad hoc grids.
SETI@home
Folding@home
But what is a general grid model?
Trust model, programming model, participation
model?

4
Application Model

What is the (a?) grid computer?
Parallelism?
Dependencies?
Sharing resources?
Failures?
Centralized vs. distributed.
Bottlenecks (e.g., SETI traffic at UCB).
Reliability, robustness.

5
Application Model

Most grid apps are massively parallel.
Depth 1, no dependencies.
Ray tracing, GIMPS, SETI.
Is a grid useful for depth > 1?
Game-tree search.
Theorem proving.
Is parallelism the only benefit?
What about data locality?

6
Host Model

Active intervention required.
Must download code, apply upgrades.
Must decide on which grids to participate.
Motivation to participate?
At scale, largely altruism, coolness.
Ad hoc grids on an intranet.
Economic models? (Cf. Lillibridge et al.)

7
Trust Relationships

Hosts trust applications.
Denial of service attacks.
Privacy/secrecy attacks.
Accidental misbehavior (e.g., SETI).
Applications trust hosts.
Spoofed answers.
Collusion among participants.
Can we minimize these?

8
The ConCert Approach

One computer, many keyboards.
Decentralized scheduling.
Emphasis on code mobility.
Policy-based participation.
Declarative statement of participation criteria.
Applications must prove compliance.
Dependency-based scheduling.
Arbitrary depth.
And/or dependencies.
Inspired by Cilk-NOW.

9
The ConCert Network

Diagram labels: Client, Hosts.

10
Host Setup

Locator: peer-to-peer discovery protocol.
Scheduler: distributed scheduler.
Worker: loader/verifier/runner.

11
Scheduler

Maintain ready and waiting queues.
Ready queue: available for stealing.
Wait queue: awaiting satisfying assignment.
Work-stealing model.
Who has work to do?
Grab work, compute result, deliver to owner.
Dependencies.
Supports depth > 1 parallelism.
Don't-care and don't-know parallelism.

12
Scheduler

The unit of work on the grid is a cord.

13
Scheduler

Cord structure:
Code: cached using MD5 fingerprints.
Certificate of compliance (more later).
Dependencies: positive boolean formula.
Assumptions:
Idempotent: can always be re-run.
Non-blocking: runs to completion (but may create
more cords, often as continuations).
Communication only via dependencies. Satisfying
assignment passed on activation.
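
The cord structure and the two scheduler queues can be sketched as follows. This is a minimal illustration, not the ConCert implementation: the tuple encoding of formulas and the names `satisfy`, `Cord`, and `Scheduler` are all assumptions made for the example. A cord waits until the delivered results satisfy its positive boolean formula, then moves to the ready queue together with its satisfying assignment.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical encoding of a positive boolean formula (no negation):
# a cord id (str), ("and", f1, f2, ...), or ("or", f1, f2, ...).

def satisfy(formula, results):
    """Return a satisfying assignment (cord id -> result), or None."""
    if isinstance(formula, str):
        return {formula: results[formula]} if formula in results else None
    op, *parts = formula
    if op == "and":
        witness = {}
        for part in parts:
            sub = satisfy(part, results)
            if sub is None:
                return None
            witness.update(sub)
        return witness
    if op == "or":
        for part in parts:
            sub = satisfy(part, results)
            if sub is not None:
                return sub
        return None
    raise ValueError(f"unknown connective: {op}")

@dataclass
class Cord:
    cord_id: str
    deps: object = ("and",)              # empty conjunction: no dependencies
    witness: dict = field(default_factory=dict)

class Scheduler:
    def __init__(self):
        self.ready = deque()             # available for stealing
        self.waiting = []                # awaiting a satisfying assignment
        self.results = {}

    def submit(self, cord):
        self.waiting.append(cord)
        self._promote()

    def deliver(self, cord_id, result):
        self.results[cord_id] = result
        self._promote()

    def _promote(self):
        # Move every cord whose formula is now satisfied to the ready queue.
        still_waiting = []
        for cord in self.waiting:
            witness = satisfy(cord.deps, self.results)
            if witness is None:
                still_waiting.append(cord)
            else:
                cord.witness = witness   # passed to the cord on activation
                self.ready.append(cord)
        self.waiting = still_waiting

    def steal(self):
        return self.ready.popleft() if self.ready else None
```

Because cords are assumed idempotent, a real scheduler could hand the same cord out again after a failure; this sketch omits that.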

14
Worker

Steal work from (self or) neighbor.
Obtain cord from host.
Typically arguments dependencies.
Code shipped at most once.
Verify certificate of compliance.
Load and execute as a DLL.
Currently combined with verification.
Should verify at most once (cache result).
Deliver result to owner.
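
A minimal sketch of the "ship at most once, verify at most once" idea, assuming code is identified by its MD5 digest. The `Worker` class and the `verify` callback are hypothetical stand-ins for the real loader and certificate checker.

```python
import hashlib

class Worker:
    def __init__(self, verify):
        self.verify = verify     # hypothetical certificate checker: bytes -> bool
        self.code_cache = {}     # MD5 hex digest -> code bytes
        self.verified = {}       # MD5 hex digest -> cached verification verdict

    def has_code(self, digest):
        """A peer can ask this first and skip shipping code already cached."""
        return digest in self.code_cache

    def receive_code(self, code):
        digest = hashlib.md5(code).hexdigest()
        self.code_cache[digest] = code
        return digest

    def load(self, digest):
        """Return code that passed verification, checking each digest only once."""
        code = self.code_cache[digest]
        if digest not in self.verified:
            self.verified[digest] = self.verify(code)
        if not self.verified[digest]:
            raise PermissionError("certificate of compliance rejected")
        return code
```

MD5 appears here only because the slides name it; it is no longer considered collision-resistant, so a present-day design would likely pick a stronger hash.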

15
Control

Client.
Submit a job to the grid.
One per keyboard.
Monitor.
Web server interface.
Displays cord status.
Change policy.

16
Moving Cords Around
A client submits work, broken into cords, to the
local conductor.
17
Moving Cords Around
Idle peers steal cords to work on. Cords have
destinations for their answers, shown by color
here.
18
Moving Cords Around
Some cords spawn new cords. They might depend on
other cords before they can run. The destination
of F and G is the green node, since they will be
used to fill H's dependencies.
19
Moving Cords Around
When a cord finishes, the result is sent to its
destination. The client interprets and displays
the results. Simultaneously, unfinished cords
continue to be stolen...
20
Moving Cords Around
When the green node has answers for F and G, H is
then ready to be stolen.
21
Popcorn/Grid Model

my_cord : string × witness → string.
Marshals argument and result itself.
Witness is the satisfying assignment for its
dependencies.
Typical structure
Input: entry point and arguments.
Dispatch on entry point.
Cords as distributed continuations.
Perform some work, spawn new cords.
Supports various higher-level parallelism models.
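
The typical structure above, a single function that takes a string argument and a witness, returns a string, and dispatches on an entry point, might look like this in outline. The JSON encoding and the entry-point names `square` and `sum_results` are assumptions made for the example; the slides do not say how Popcorn/Grid cords encode their arguments.

```python
import json

def my_cord(argument: str, witness: dict) -> str:
    """Single entry function: string argument plus witness in, string out."""
    request = json.loads(argument)       # the cord unmarshals its own input
    entry, args = request["entry"], request["args"]
    if entry == "square":
        return json.dumps(args[0] * args[0])
    if entry == "sum_results":
        # Continuation-style entry point: combine the dependency results
        # that arrive in the witness (the satisfying assignment).
        return json.dumps(sum(json.loads(r) for r in witness.values()))
    raise ValueError(f"unknown entry point: {entry}")
```

A parent cord would spawn two `square` cords and a `sum_results` cord that depends on both, which is the "cords as distributed continuations" pattern.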

22
ML/Grid Model

One program for client and its cords.
Compiler separates client from cords.
Compiler handles marshalling.
Run-time checks enforce distinctions (more
later).
Cord cannot perform I/O.
Client cannot submit itself as a cord.
Compiles to TAL/Grid.

23
ML/Grid Model

Primitives
spawn : (unit → α) → α task
sync : α task → α
relax : α task list → α × α task list
Must be provided as primitives.
Requires access to representations.
Further higher-level libraries.
E.g., parallelism models.
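
To make the three primitives concrete, here is a local analogue built on Python's `concurrent.futures`, with a thread pool standing in for the grid. It is only a sketch: the reading of `relax` as "return one available result together with the remaining tasks" is an assumption based on its signature, and nothing here involves marshalling or certification.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

_pool = ThreadPoolExecutor(max_workers=4)    # local stand-in for the grid

def spawn(thunk):
    """Start a zero-argument function as a task."""
    return _pool.submit(thunk)

def sync(task):
    """Block until the task's result is available and return it."""
    return task.result()

def relax(tasks):
    """Return the result of one finished task plus the tasks still outstanding."""
    if not tasks:
        raise ValueError("relax needs at least one task")
    done, _ = wait(tasks, return_when=FIRST_COMPLETED)
    finished = next(iter(done))
    return finished.result(), [t for t in tasks if t is not finished]
```

With these, and-parallelism is a `sync` over every task, while or-parallelism takes the first answer from `relax` and ignores the rest.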

24
Examples

GML ray-tracer (ICFP01 Contest).
Depth 1.
Written in Popcorn/Grid, compiles to TALx86/Grid.
Chess player.
Depth > 1, and-or dependencies.
Written in Popcorn/Grid, compiles to TALx86/Grid.
Theorem prover for MLL.
Depth > 1, and-or dependencies.
Written in SML, runs on simulator.
Being ported to ML/Grid.

25
Some Problems

Failures.
Fail-stop model is easily supported.
Demonic failures require result certification.
Abandoning cords.
Or-dependencies are satisfied by first cord to
deliver answer.
Parent must be prepared to receive result long
after it is no longer needed.
Sharing results.
Grid-wide cache of answers?
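
The or-dependency behaviour can be illustrated with a small sketch in which the first delivered answer wins and later ones are accepted but discarded. The `OrDependency` class is hypothetical and exists only for this example.

```python
class OrDependency:
    """Tracks an or-dependency: the first delivered answer satisfies it."""

    def __init__(self, cord_ids):
        self.cord_ids = set(cord_ids)
        self.winner = None               # (cord_id, result) once satisfied

    def deliver(self, cord_id, result):
        """Return True if this delivery satisfied the dependency."""
        if cord_id not in self.cord_ids:
            raise KeyError(cord_id)
        if self.winner is not None:
            return False                 # late answer: received, then dropped
        self.winner = (cord_id, result)
        return True
```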

26
Result Certification

Main idea: make the host prove validity of its answer.
Avoid need for application to trust hosts.
Some applications admit native certification.
For a theorem prover, the proof.
For factoring, the factors.
Are there general result certification methods?
Work-stealing model precludes random allocation /
redundancy methods (SETI, Bayanihan).
Centralized methods are not robust or scalable.
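
Factoring is a convenient example of native certification, because the answer can be checked far more cheaply than it can be found. A minimal sketch, with hypothetical function names and a trial-division primality test that is only adequate for small numbers:

```python
from math import prod

def is_prime(n):
    """Trial division; fine for an illustration, too slow for real key sizes."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def check_factoring(n, factors):
    """Accept a host's answer only if the answer certifies itself."""
    return bool(factors) and prod(factors) == n and all(is_prime(p) for p in factors)
```

The application never needs to trust the host: a spoofed answer fails the check.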

27
Result Certification

A crazy idea: use the PCP theorem.
Use interactive dialog to spot-check a proof.
Host proves that it ran given code on given data.
Execution trace is a proof that it did.
But traces can be huge!
Engage in a dialog with O(1) rounds to check
proof with high probability.
Avoids need to transmit trace itself.
But the representation is enormous!

28
Two Foundational Questions

What is a type system for a GPL?
Enforce mobility constraints.
Clean type system to support development,
compilation, certification.
What policies can we support?
How to state policies?
How to prove compliance?
How to support multiple policies?

29
A Type System for GPL

Main idea: modalities for mobility.
Cf. related ideas by Cardelli, Gordon, et al.
Cf. recent work by Walker.
Here: Curry-Howard applied to modal logic.
Necessity (□A): a computation of A anywhere.
Classifies mobile code of type A.
Enforces marshalling and access restrictions.
Possibility (◇A): a computation of A somewhere.
Classifies remote code of type A.
Ensures that access is limited to remote values.

30
Necessity for Mobility

Truth (local) typing judgement, over two kinds of bindings:
True (local) bindings.
Valid (mobile) bindings.
31
Necessity for Mobility

Validity (mobile) typing judgement.
Mobile = does not use local resources.

32
Necessity for Mobility

Box: marshal value and bindings.
Values of boxed type are mobile.

33
Necessity for Mobility

Unboxing: unbox and run mobile code.
Implicit un-marshalling.
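
A sketch of the judgement and rules that the Necessity slides describe, assuming they follow the standard judgmental presentation of S4 due to Pfenning and Davies. The term syntax (`box`, `let box`) and the symbols Δ (valid, mobile bindings) and Γ (true, local bindings) are assumptions taken from that presentation, not from the slides.

```latex
% Requires amssymb for \Box.
% Typing judgement: \Delta = valid (mobile) bindings, \Gamma = true (local) bindings.
\[
  \Delta;\ \Gamma \vdash M : A
\]
% Variables: a local hypothesis and a mobile hypothesis.
\[
  \frac{}{\Delta;\ \Gamma, x{:}A, \Gamma' \vdash x : A}
  \qquad
  \frac{}{\Delta, u{::}A, \Delta';\ \Gamma \vdash u : A}
\]
% Box: the body may use only mobile bindings, so the local context is empty.
\[
  \frac{\Delta;\ \cdot \vdash M : A}
       {\Delta;\ \Gamma \vdash \mathsf{box}\ M : \Box A}
\]
% Unboxing: bind the mobile code to a valid variable u and continue.
\[
  \frac{\Delta;\ \Gamma \vdash M : \Box A
        \qquad
        \Delta, u{::}A;\ \Gamma \vdash N : C}
       {\Delta;\ \Gamma \vdash \mathsf{let\ box}\ u = M\ \mathsf{in}\ N : C}
\]
```

The empty local context in the premise of the box rule is what "mobile = does not use local resources" amounts to in this formulation.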

34
Necessity for Mobility

Marshalling: cast into network form.
Base types, structured types: fairly typical.
Function types: certified binary.
Code mobility is a form of semantic linking.
Import object from the network.
Un-marshall, verify, load, execute.
(More later.)

35
Possibility for Locality

Possible (somewhere) typing judgement
What is here is somewhere

36
Possibility for Locality

Create a local reference to something somewhere

37
Possibility for Locality

Move to remote entity
May be useful for managing data locality.
Return call has type (A → B).
Cf. upcalls.
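
One plausible reading of the three Possibility slides, again assuming the standard judgmental S4 rules of Pfenning and Davies. The possibility judgement written with ÷ and the term syntax (`dia`, `let dia`) are assumptions taken from that presentation.

```latex
% Requires amssymb for \Diamond.
% "What is here is somewhere": a true term is also a possible one.
\[
  \frac{\Delta;\ \Gamma \vdash M : A}
       {\Delta;\ \Gamma \vdash M \div A}
\]
% Create a local reference to something somewhere.
\[
  \frac{\Delta;\ \Gamma \vdash E \div A}
       {\Delta;\ \Gamma \vdash \mathsf{dia}\ E : \Diamond A}
\]
% Move to the remote entity: the body sees x and the mobile bindings only.
\[
  \frac{\Delta;\ \Gamma \vdash M : \Diamond A
        \qquad
        \Delta;\ x{:}A \vdash E \div C}
       {\Delta;\ \Gamma \vdash \mathsf{let\ dia}\ x = M\ \mathsf{in}\ E \div C}
\]
```

In the last rule the local context Γ is dropped in the second premise, which reflects that the computation continues somewhere else.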

38
Modalities for Mobility

These rules are for S4 modal logic.
Accessibility is reflexive and transitive.
Is this the right notion of accessibility?
Symmetry: S5. You can go home again.
Judgmental form requires three contexts.
Explicit-world form uses a record of contexts.
Other varieties of modal logic are also under
consideration.

39
Policies and Certification

Current certification methods are uniform.
∃ security policy ∀ problems: safety is assured.
E.g., PCC for Java.
E.g., TAL for Popcorn.
Safety means memory and type safety.
Baseline requirement.
But not adequate for all applications.
Recall policies should be per-host.

40
Foundational Certification

Non-uniform setup: ∀ problems ∃ type system.
Shift the type system for object code out of the
TCB (untrusted, problem-specific).
Must provide a proof that type system is safe.
Compare Appel, et al.
Their goal: minimize TCB.
Our goal: support multiple safety policies.
Could be consolidated, but it's a lot of work.

41
Foundational Safety

Host specifies target architecture.
Fully realistic, e.g., IA-32 + OS + RTS.
No unsafe transitions.
Safety policy: target does not get stuck.
Any type system must come with a proof of
progress relative to the target machine.
Experience shows that progress proofs are readily
mechanizable.

42
Foundational Certification (I)
43
Foundational Certification (I)

Object code is essentially a DLL.
Type system is specified in LF.
Using typical LF representations.
Safety proof: well-typed ⇒ safe.
Represented as an LF term.
Obtained with Twelf proof search engine.
Derivation: type annotations for code.
Makes mechanical checking feasible.

44
Foundational Certification (I)

May cache type system and safety proof.
Reduces certificate size.
Many cords for one type system is typical.
May use oracle strings for derivation.
Relies on details of operational behavior of
host-side checker.
Therefore not completely declarative.
But significantly reduces certificate size.

45
Foundational Certification (II)
46
Foundational Certification (II)

Object code is a DLL as before.
Type checker is a program.
Currently, a Twelf logic program.
Could be ML code.
Safety proof shows partial correctness of the
checker.
Checking succeeds ⇒ safety.
Annotations support mechanical checking.
Time limit precludes looping.
Can refuse if limit is too large.
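
The time-limit idea can be sketched with a step budget. Here the untrusted checker is modelled as a Python generator that yields once per step and returns its verdict; that modelling, and the names `run_checker`, `toy_checker`, and `looping_checker`, are assumptions made for the example (the slides describe a Twelf logic program, not Python).

```python
def run_checker(checker, object_code, claimed_limit, host_max_limit):
    """Run an untrusted checker for at most `claimed_limit` steps."""
    if claimed_limit > host_max_limit:
        return "refused"                 # the host may refuse a limit that is too large
    steps = checker(object_code)         # generator: yields per step, returns a verdict
    for _ in range(claimed_limit):
        try:
            next(steps)
        except StopIteration as finished:
            return "accepted" if finished.value is True else "rejected"
    return "rejected"                    # budget exhausted: a looping checker cannot stall the host

def toy_checker(object_code):
    """Hypothetical checker: accept only if every byte is a known opcode."""
    for byte in object_code:
        yield                            # one step per instruction examined
        if byte not in (0x01, 0x02, 0x03):
            return False
    return True

def looping_checker(object_code):
    """A checker that never terminates, to exercise the limit."""
    while True:
        yield
```

Partial correctness is what makes this safe: the proof only has to show that if the checker succeeds then the code is safe, and the budget handles non-termination separately.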

47
Examples

TALT
Essentially TALx86 with a safety proof.
Proof is mechanically derived and checked.
Structured as a safety proof for an abstract
machine plus a simulation lemma for target.
TALT Resource Bounds
Goal: ensure that object code yields the processor at
set intervals.
Precludes denial of CPU service.

48
Resource Bound Certification

Type system enforces upper bound on yield
interval.
Specified as a parameter of the type system.
Basic method
Conservative instruction counting (join points).
Yield processor at start of every basic block.
Prove that block can complete before next yield
(else split block).
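
The basic method can be illustrated on a toy representation where a basic block is a list of instruction names and every instruction costs one unit. Both simplifications are assumptions, as are the function names; the real system works on TALT object code and must also account for the cost of the yield itself.

```python
def insert_yields(blocks, bound):
    """Put a yield at the start of every block, splitting blocks longer than `bound`."""
    if bound < 1:
        raise ValueError("bound must be at least 1")
    result = []
    for block in blocks:
        # Each chunk of at most `bound` instructions becomes its own block.
        for start in range(0, len(block), bound):
            result.append(["yield"] + block[start:start + bound])
    return result

def max_yield_interval(blocks):
    """Conservative count of instructions executed between consecutive yields."""
    return max((len(block) - 1 for block in blocks), default=0)
```

For example, with a bound of 2 a five-instruction block is split into three blocks, each starting with a yield.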

49
Resource Bound Certification

Smarter techniques are under development.
Better analysis of code behavior across calls.
Fewer yields overall.
Run-time checks reduce overhead.
Use static analysis to insert minor yields that
check true interval.
Minor yields re-calibrate, possibly incurring a
major yield (system call).
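
One way to read the minor/major yield scheme is sketched below. A minor yield is a cheap inline check of how much of the interval has actually been used; it escalates to a major yield only when the next check might come too late. The `YieldMonitor` class, the `clock` and `major_yield` callbacks, and the `max_gap` parameter (a static bound on the distance to the next minor yield) are all assumptions made for this illustration.

```python
class YieldMonitor:
    def __init__(self, interval, max_gap, clock, major_yield):
        self.interval = interval         # required upper bound between major yields
        self.max_gap = max_gap           # static bound on time until the next minor yield
        self.clock = clock               # hypothetical cheap clock, e.g. a cycle counter
        self.major_yield = major_yield   # hypothetical expensive yield (system call)
        self.last_major = clock()

    def minor_yield(self):
        """Cheap check; returns True if it escalated to a major yield."""
        elapsed = self.clock() - self.last_major
        if elapsed + self.max_gap > self.interval:
            self.major_yield()
            self.last_major = self.clock()   # re-calibrate after yielding
            return True
        return False
```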

50
A Meta-Grid?

ConCert Conductor represents one model of grid
computing.
Compute-intensive, distributed scheduling.
Not much reason to believe this is canonical.
Can we support a variety of models inside of a
single meta-grid?
Applications choose grid model.
Hosts are indifferent to programming model.

51
A Meta-Grid?

The ur-grid
A TCP port.
Foundational code certification.
A grid framework
Scheduler, recovery model, host policy.
Runs application cords.

52
A Meta-Grid?

Key capability: safe dynamic loading and linking.
Current ConCert framework must be certified
against host safety policy.
It must be able to load application policies and
application code.
Requires a fairly sophisticated theory of safe
linking.

53
Semantic Linking

Marshalling is meta-programming.
Create values of a grid type system.
Cast grid values as local values.
Certification is how we marshal code.
Functions are marshalled as closures plus proof
of compliance with host type system.
Ensures that cast will succeed, safely.
The ur-grid is just an unmarshaller.
Grid frameworks are meta-programs.

54
Summary