Title: PASIS: Perpetually Available and Secure Information Systems
1. PASIS: Perpetually Available and Secure Information Systems
http://www.ices.cmu.edu/pasis/
Greg Ganger, Pradeep Khosla
Mehmet Bakkaloglu, Michael Bigrigg, Garth Goodson, Semih Oguz, Vijay Pandurangan, Craig Soules, John Strunk, Ken Tew, Cory Williams, Ted Wong, Jay Wylie
Carnegie Mellon University
2. PASIS Objective
- Create information storage systems that are
  - Perpetually Available
    - Information should always be available even when some system components are down or unavailable
  - Perpetually Secure
    - Information integrity and confidentiality should always be enforced even when some system components are compromised
  - Graceful in degradation
    - Information access functionality and performance should degrade gracefully as system components fail
- Assumptions: some components will fail, some components will be compromised, some components will be inconsistent, BUT
  - surviving components allow the information storage system to survive
3. Survivable Storage Systems
- Surviving server-side intrusions
  - decentralization + data distribution schemes
  - provides for availability and security of storage
- Surviving client-side intrusions
  - server-side data versioning and request auditing
  - enables intrusion diagnosis and recovery
- Tradeoff management balances availability, security, and performance
  - maximize performance given other two
4. Step 1: Decentralized storage systems
5. Step 2: Data distribution schemes
- Scheme = Algorithm + <Parameters>
  - E.g., 3-fold replication = replication <n = 3>
- 1000s of possible choices
  - Many different algorithms
    - Cryptographic
    - Threshold (n shares, any m to reconstruct)
    - Hybrids and combinations
  - Many reasonable parameters
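
As one concrete instance of a threshold scheme (n shares, any m to reconstruct), the sketch below uses Shamir's secret sharing over a prime field. The slides do not say which threshold algorithms PASIS implements, so the choice of Shamir's scheme, the field size `PRIME`, and the function names `split` and `reconstruct` are all assumptions made for illustration.

```python
import secrets

# Assumption: a 127-bit Mersenne prime field; secrets must be smaller than PRIME.
PRIME = 2**127 - 1


def split(secret, m, n):
    """Split an integer secret into n shares; any m of them reconstruct it."""
    if not (1 <= m <= n and 0 <= secret < PRIME):
        raise ValueError("need 1 <= m <= n and 0 <= secret < PRIME")
    # Random polynomial of degree m - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]

    def evaluate(x):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        return y

    return [(x, evaluate(x)) for x in range(1, n + 1)]


def reconstruct(shares):
    """Lagrange-interpolate the polynomial at x = 0 from m distinct shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

For example, `split(123456789, m=3, n=5)` yields five shares, and `reconstruct` recovers the secret from any three of them. Fewer than three shares give no information about it, which is the property behind "no single storage node can expose data".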
6. PASIS Agent Architecture
[Figure: PASIS agent architecture. Labeled components: System Characteristics, User Preferences, Tradeoff Management, Client Applications, Encode/Decode, Multi-read/write Communication, PASIS Storage Nodes]
7. Features of PASIS Architecture
- Security
  - confidentiality: no single storage node can expose data
  - integrity: no single storage node can modify data
- Availability
  - any M-of-N storage nodes can collectively provide data
- Flexibility
  - range of options in space of trade-offs among availability, security, and performance
8. Engineering survivable systems
- Performance and manageability need to approach that of conventional systems
  - to ensure significant acceptance
- Approach: exploit threshold scheme flexibility
  - achieve maximum performance given desired levels of availability and security
  - requires quantification of the corresponding trade-offs
- Approach: exploit ability to use any M shares
  - send requests to more than M and use quickest responses
  - send requests to closest servers first
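
The "use any M shares" idea can be illustrated with a small simulation. This is a sketch under stated assumptions, not the PASIS agent's code: the function `simulate_read`, the `(name, expected_ms, actual_ms)` tuple layout, and the latency numbers are all hypothetical. It contacts the closest servers first and completes the read as soon as M responses have arrived.

```python
def simulate_read(servers, m, extra):
    """Simulate one read that needs any m shares.

    servers: list of (name, expected_ms, actual_ms) tuples (hypothetical layout).
    Contacts the m + extra servers with the lowest expected latency, then
    finishes as soon as any m of them have answered.
    Returns (names of the servers whose shares were used, read latency in ms).
    """
    if m + extra > len(servers):
        raise ValueError("not enough servers to contact")
    # Closest servers first: rank by expected latency.
    contacted = sorted(servers, key=lambda s: s[1])[: m + extra]
    # Responses arrive in order of actual latency; the first m are used.
    arrivals = sorted(contacted, key=lambda s: s[2])
    used = arrivals[:m]
    return [name for name, _, _ in used], used[-1][2]


# Illustrative numbers only: server b is normally close but slow right now.
servers = [
    ("a", 5, 6),
    ("b", 8, 90),
    ("c", 12, 13),
    ("d", 40, 41),
]
exact = simulate_read(servers, m=2, extra=0)  # must wait for slow server b
over = simulate_read(servers, m=2, extra=1)   # c's response replaces b's
```

With these numbers, requesting exactly M = 2 shares takes 90 ms because the read waits for server b, while over-requesting by one share completes in 13 ms. The cost is one extra request and its network traffic.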
9. Trade-off management challenges
- Reasoning about security and availability
  - specifically, need to translate settings into configuration rules and limitations
  - e.g., M > 0.7N, (N-M) > 2, M shares cannot be on same OS
- Finding best performing configuration
  - within the limitations imposed by first step and given the expected workload and system components
  - configuration includes choices of data distribution scheme, values for M and N and P, degree of over-requesting, server selection algorithm, etc.
  - 2-step approach: predict performance of any possible configuration and then search for optimal choice
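
The 2-step approach can be sketched as a filter followed by a search. The rule check below encodes only the two numeric example rules from the slide (the OS-placement rule is omitted). The performance model in `predicted_cost` is a toy assumption for illustration, not the model PASIS uses: it counts network bytes moved per byte of user data when each share is 1/M of the data. All function names are hypothetical.

```python
def allowed(m, n):
    """Example rules from the slide: M > 0.7N and (N - M) > 2."""
    # Integer arithmetic avoids floating-point edge cases in M > 0.7N.
    return 10 * m > 7 * n and (n - m) > 2


def predicted_cost(m, n, read_fraction):
    """Toy model (an assumption): bytes moved per byte of user data.

    A read fetches m shares of size 1/m each (cost 1); a write sends
    n shares of size 1/m each (cost n/m).
    """
    return read_fraction * 1.0 + (1.0 - read_fraction) * n / m


def best_configuration(max_n, read_fraction):
    """Return the allowed (m, n) with the lowest predicted cost, or None."""
    candidates = [
        (m, n)
        for n in range(1, max_n + 1)
        for m in range(1, n + 1)
        if allowed(m, n)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: predicted_cost(c[0], c[1], read_fraction))
```

Under these example rules no configuration with N of 10 or fewer is allowed, so `best_configuration(10, 0.5)` returns `None`. With up to 14 nodes the search picks M = 11, N = 14. A real search would also range over the scheme, the degree of over-requesting, and the server selection algorithm.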
10. Trade-off space
[Figure: Scheme Selection Surface]
11. Quantifying the axes
- Performance (MB/s)
  - based on simple performance model
  - computed with standard performance eval. techniques
- Availability (nines)
  - standard fault tolerance math with independent failures
  - relative values are useful even if not independent
- Security (Effort to defeat)
  - estimate effort involved with possible attack paths
  - overall effort is minimum of possible efforts
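
The availability and security axes can be made concrete with a short calculation. This is a minimal sketch assuming an M-of-N scheme whose nodes fail independently, each up with probability `p_up`. The function names and the attack-path dictionary are hypothetical, and the effort values in any example are placeholders rather than measured numbers.

```python
from math import comb, log10


def unavailability(m, n, p_up):
    """P(fewer than m of n nodes are up), with nodes failing independently."""
    return sum(comb(n, k) * p_up**k * (1 - p_up) ** (n - k) for k in range(m))


def nines(m, n, p_up):
    """Availability as a count of nines, e.g. 0.999 available -> about 3.

    Undefined when unavailability is exactly 0 (for example p_up == 1).
    """
    return -log10(unavailability(m, n, p_up))


def effort_to_defeat(attack_paths):
    """Overall effort is the minimum effort over the possible attack paths."""
    return min(attack_paths.values())
```

For example, 3-fold replication (M = 1, N = 3) with nodes that are each up 90% of the time is unavailable only when all three are down, a probability of 0.001, so `nines(1, 3, 0.9)` is about 3. If breaking into enough nodes costs 30 units of effort and circumventing the cryptography costs 25, `effort_to_defeat` returns 25, because an attacker takes the cheapest path.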
12. Generation of scheme selection surface
- Quantify performance, security, and availability of each algorithm + parameters
- Select best performing scheme for each region
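
The two steps above can be sketched as a grid computation. For each (availability, security) requirement, the code picks the fastest scheme that meets both. The function name `selection_surface` and every number in the `schemes` table are made up purely for illustration and are not PASIS measurements.

```python
def selection_surface(schemes, availability_levels, security_levels):
    """Map each (availability, security) requirement to the best scheme.

    schemes: dict name -> (performance, availability, security).
    Returns a dict keyed by (availability, security); the value is the name
    of the fastest scheme meeting both requirements, or None if none does.
    """
    surface = {}
    for a in availability_levels:
        for s in security_levels:
            feasible = [
                (perf, name)
                for name, (perf, avail, sec) in schemes.items()
                if avail >= a and sec >= s
            ]
            surface[(a, s)] = max(feasible)[1] if feasible else None
    return surface


# Made-up values for illustration only: (MB/s, nines, effort units).
schemes = {
    "replication <n = 3>": (40.0, 5.0, 1.0),
    "threshold <m = 2, n = 4>": (25.0, 4.0, 2.0),
    "threshold <m = 3, n = 6>": (15.0, 6.0, 3.0),
}
surface = selection_surface(schemes, [3, 5, 7], [1, 2, 3])
```

Each cell of `surface` is one region of the selection surface. With these placeholder values, low requirements select replication, stricter security selects a threshold scheme, and requirements no scheme can meet map to `None`.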
13. Trade-off space
[Figure: Scheme Selection Surface]
14. Selection surface sensitivity
- Models are insensitive to small perturbations of configuration parameters
- Scheme selection surface is different for truly different configurations
15. Extreme read workload
[Figure panels: 50% Read Workload; 99% Read Workload]
16. Security Model Sensitivity
[Figure panels: E_CircumventCrypto = E_BreakIn; E_CircumventCrypto = 2.5 × E_BreakIn]
17. Self-Securing Storage Nodes
- Goal: protect data from authorized but malicious users
  - both client-side intruders and insider attacks
- How: assume all clients are compromised
  - keep all versions of all data
  - audit all requests
- Benefits
  - fast and complete recovery by preventing data destruction and undetectable modifications
  - enhanced detection and diagnosis of intrusions by providing tamper-proof audit logs
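
The "keep all versions, audit all requests" idea can be sketched with a toy in-memory store. The class `VersionedStore` and its methods are hypothetical and are not the PASIS storage node interface. A real self-securing node would enforce these properties beneath the client-visible interface and on stable storage, which this sketch does not attempt.

```python
import time


class VersionedStore:
    """Toy storage node that never overwrites data and logs every request."""

    def __init__(self):
        self._versions = {}   # key -> list of (timestamp, value), oldest first
        self._audit_log = []  # append-only list of (timestamp, client, op, key)

    def _audit(self, client, op, key):
        now = time.time()
        self._audit_log.append((now, client, op, key))
        return now

    def write(self, client, key, value):
        # A write appends a new version; earlier versions are retained.
        now = self._audit(client, "write", key)
        self._versions.setdefault(key, []).append((now, value))

    def delete(self, client, key):
        # A delete is recorded as a new empty version, not a removal.
        now = self._audit(client, "delete", key)
        self._versions.setdefault(key, []).append((now, None))

    def read(self, client, key, version=-1):
        # version=-1 is the latest; 0 is the oldest retained version.
        self._audit(client, "read", key)
        return self._versions[key][version][1]

    def audit_log(self):
        # Return a copy so callers cannot alter the log in place.
        return list(self._audit_log)
```

If a compromised client overwrites or deletes a file, an administrator can still read the earlier version by index, and the audit log shows which client issued each request. These are the ingredients for the recovery and diagnosis benefits listed above.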
18. Where we're at
- PASIS architecture and first prototype complete
- Re-implementation of agent in progress
  - more efficient, portable, flexible
  - and more data distribution schemes and storage protocols
  - based on lessons from initial prototype
- Re-implemented multi-versioning storage node
  - working on internal space and time optimizations
  - investigating how it can be used for intrusion diagnosis
- Trade-off quantification in progress
  - measurements and modeling continue
19. Technology Transfer
- Transfer path via CMU Consortia (e.g., PDL)
  - 15-20 storage and networking companies
    - EMC, HP, IBM, Intel, 3Com, Veritas, Sun, Seagate, Lucent, Snap, LSI Logic, Hitachi, Panasas, Network Appliances, Platys Communications
  - 20 embedded system infrastructure companies
    - Raytheon, Boeing, United Technologies, Hughes, Bosch, AT&T, Adtranz, Emerson Electric, Ford, HP, Intel, Motorola, NIIIP Consortium
- Joint Battlespace Infosphere (JBI)
  - working with AFRL researchers to understand how PASIS technologies might fit into JBI infrastructures
20. PASIS Summary
- Decentralization + data distribution schemes
  - provides for availability and security of storage
- Tradeoff management balances availability, security, and performance
  - and it is good engineering practice!
- Data versioning to survive malicious users
  - enables intrusion diagnosis and recovery