Title: OASIS PI Meeting
1OASIS PI Meeting HACQIT
[Title slide figure. Labels: To Critical Users, VPN, Primary Nodes, Monitor/Adapter, Backup Nodes, Sensors, Decoys/Fishbowls, Controls]
- James E. Just
- James C. Reynolds
- Karl Levitt, Jeff Rowe
- 25 July 2001
The UC Davis Computer Security Laboratory
2Outline
- Team
- HACQIT Goals and Approach
- System Model and Threat
- Architecture
- Application and Attack Coverage
- Status
- Plans
- Validation and Integration
- Questions
3HACQIT Team
- Teknowledge Corporation: architecture, design, monitor/adapter development, integration
  - J. Just
  - J. Reynolds
  - L. Clough
  - R. Maglich, E. Lawson
- UC Davis: attack modeling, forensics, sensing and response options
  - K. Levitt
  - J. Rowe
  - N. Carlson, M. Dawson
4Project Goals
- Prototype HACQIT controlled cluster will deliver
  - 4 hours of intrusion tolerance under active Red Team attacks (including unknown attacks) against
    - Hosts providing critical COTS/GOTS applications and services to
    - Critical users at 75% capability level or better
  - Ability to add a new user while under attack
- Focus
  - COTS HW and SW based for near term utility
  - Architecture based intrusion tolerance framework for longer term extensibility
  - Deliver intrusion tolerance for broad class of COTS/GOTS applications without equipment footprint required for Byzantine fault tolerance
5Phased Approach
- Phase 1: Leverage Desiderata and other components
  - Build demo prototype and explore space
  - Analyze more formal model
  - Refine architecture and implementation plan
- Phase 2: Implement new process pair architecture
  - Initial critical applications with diversity, then non-diverse
  - Implement initial and advanced Forensics and Response module to identify and block unknown attacks (classes)
  - Enable full and incremental restore and resync
  - Reconstitution module for compromised servers
  - Robust policy generation, dissemination, and protection
  - Continue more formal model analysis
  - Test on open Internet with Red Team
  - Identify user candidates
- Optional Phase 3: Refine implementation. Work with selected user to meet requirements.
6System Model Assumptions
Goal: Enable critical users to continue critical work while under attack with <25% degradation in performance
- Key Assumptions
  - LAN is reliable, cannot be flooded, and is the only means of communication between machines/users
  - HACQIT cluster hardware and software are pristine at startup and patched against known vulnerabilities
  - Critical users are trusted
  - Unknown vulnerabilities exist
  - Users/attackers interact with services via HW/SW
7Threat Model
- Attacker can be anyone other than critical users or HACQIT system administrators, who are trusted
- No physical access to HACQIT cluster HW/SW, e.g., no life cycle attacks or trojan hardware
- Any form of attack against users, servers or cluster that is carried out via the network, except DoS attacks directed against network bandwidth or critical users
- Attacks entail communication via LAN either with
  - Software already resident on cluster or
  - New software downloaded by the attacker
- Attacker's Goal
  - Degrade critical operations by more than 25% during 4 hours of attack
8Measures of Merit
- Concept: compare useful work completed by critical users (wrt critical services) while under attack to that completed while not under attack
- Possible measurements
  - Keystroke comparison
  - Percentage of usable product if quantifiable
  - Subjective estimate of relative value of products
  - Time required to produce equivalent product under normal conditions
- Issues
  - What if all critical users are knocked out but server is OK?
  - What about errors or qualitative differences?
  - What about loss of confidentiality?
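The comparison of useful work under attack to useful work under normal conditions can be written as a simple ratio. The Python sketch below is illustrative only: the function names and the keystroke counts are hypothetical, and the threshold is the 75% capability level stated in the project goals.

```python
def capability_level(work_under_attack: float, work_baseline: float) -> float:
    """Ratio of useful work completed under attack to work completed when not under attack."""
    if work_baseline <= 0:
        raise ValueError("baseline work must be positive")
    return work_under_attack / work_baseline


def meets_goal(work_under_attack: float, work_baseline: float,
               required_level: float = 0.75) -> bool:
    """True when critical users retain the required capability level or better."""
    return capability_level(work_under_attack, work_baseline) >= required_level


if __name__ == "__main__":
    # Hypothetical keystroke counts for one critical user over the 4-hour window.
    baseline_keystrokes = 12000
    attacked_keystrokes = 9600
    level = capability_level(attacked_keystrokes, baseline_keystrokes)
    print(f"capability level: {level:.0%}")
    print("goal met:", meets_goal(attacked_keystrokes, baseline_keystrokes))
```

Any of the listed measurements (keystrokes, usable product, time to equivalent product) could be substituted as the unit of work. The issues raised on the slide, such as qualitative differences or loss of confidentiality, are not captured by a single ratio.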
9Observations/Analysis
- If attackers can be prevented from downloading, storing, and/or executing foreign software, their attacks must focus on software resident on cluster
- Attackers can be forced to attack through the critical applications rather than the OS
- Analysis
  - Any DoS attack against all critical users wins
  - HACQIT must survive continuing unknown attacks
  - Failover not enough without large numbers of backups
10Implications for/from Design Effort
- Implement model-derived characteristics
- Some key design goals and requirements
  - Handle repeated unknown attacks: how to identify and how to stop
  - Deal with attacks at common effects level
  - Channel attacks in desired directions
  - Maintain consistent replicas
  - Reconstitute and reuse compromised servers
  - Protect (somewhat) weak clients
  - Maintain separation boundaries
  - Minimize false alarms and responses to them
  - Defend in depth to cope with unknowns
  - Policy-driven normal, pro-active and reactive postures
  - Control and sensor system must be protected
- Diversity helps with maintaining current service but makes restoration and identifying unknown attacks more difficult
11HACQIT Reference Architecture
[Figure: HACQIT reference architecture. Labels: User, Remote User J, User K, User N, Server r, LAN, WAN, FW. Key: HACQIT Protected Node, Monitor/Adapter, Primary Servers, Backup Servers, Server, Critical Service, Fishbowls/Decoys, Out-of-Band Comms Between HACQIT M/As]
All critical user interaction with HACQIT protected critical applications is via VPN or IPSec
12Phase 2 Results to Date
- Implementation: Teknowledge, Jim Reynolds
- Testing
- Implementation: UC Davis, Jeff Rowe
- Demo last night of Host Monitor, Mediator-Adaptor-Controller response, Policy Editor
13Current HACQIT Design/Implementation
14Components
- Firewall, primary, backup, controller hosts
- Extra backup to allow dynamic restoration
- Sandbox for confirming unknown attacks
- Enclave / personal firewalls for weak client protection
- Gateway (N-minute circular buffer, content filters)
- Host monitor (host application heartbeat, file integrity, policy implementation, mediators, sensor control, ...)
- Sensors (Tripwire, Snort, wrappers, QoS, ...)
- Application wrappers
- Mediator-Adapter-Controller
- Situation awareness
- Application output comparison
- Controls (process pair, failover, checkpoint and restoration, forensics, reconstitution, response, fishbowl/decoy, ...)
- Policy editor, policy server, policy checker
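The gateway's N-minute buffer of inputs can be pictured with a short Python sketch. This is an assumption-laden illustration, not the HACQIT implementation: the class name `TimeWindowBuffer` is hypothetical, and a time-bounded queue stands in for the circular buffer so that only inputs from the last N minutes are retained.

```python
from collections import deque
import time


class TimeWindowBuffer:
    """Keeps only the inputs received during the last `window_seconds`."""

    def __init__(self, window_seconds: float, clock=time.monotonic):
        self.window_seconds = window_seconds
        self._clock = clock          # injectable clock, so the window can be tested
        self._items = deque()        # (timestamp, payload) pairs, oldest first

    def _evict(self, now: float) -> None:
        # Drop entries that have aged out of the window.
        while self._items and now - self._items[0][0] > self.window_seconds:
            self._items.popleft()

    def record(self, payload: bytes) -> None:
        """Store one input to the cluster or application."""
        now = self._clock()
        self._evict(now)
        self._items.append((now, payload))

    def snapshot(self) -> list:
        """Return buffered payloads, oldest first, e.g. for forensics or replay."""
        self._evict(self._clock())
        return [payload for _, payload in self._items]
```

A 10-minute buffer would be created with `TimeWindowBuffer(window_seconds=600)`. A fixed-size ring would bound memory more strictly, at the cost of losing the time guarantee under heavy traffic.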
15HACQIT Host Monitor
16Mediator/Adaptor/Controller (MAC)
17Policy Editor (Rules)
18Policy Editor (Users)
19Policy Editor (Applications)
20Policy Editor (Servers)
21Policy Editor (Implementations)
22Policy Editor (Wrapper Settings)
23Policy Editor (Response)
24Testing
- Test each prototype version
  - Open Internet
  - Project web site and current implementation at www.hacqit.net
  - Red Team for selected prototypes
- Latest code available shortly
  - Contact reynolds_at_teknowledge.com
  - 703-352-9300, x203
25Prevention Forensics
- Protecting against future attack instances requires determination of the cause.
- Forensics need only proceed until a response blocking future attacks is determined, not to the ultimate root cause.
- The HACQIT Forensic Agent will automatically detect the suspicious events in the application and network traffic logs.
- Reports from the Forensic Agent are used by the Response Agent to block future instances.
26Forensic Methods
- Simple rules
  - The last log entry in a crashed application is the suspect transaction
  - Recent anomalous application transactions are suspect (DEMIDS)
- Buffer replay
  - An offline mirror of the application is maintained
  - Suspect transactions recorded by the application are replayed against the mirror to find the one(s) that reproduce the behavior
  - Replay with subtle modifications zeros in on the true cause (e.g., reordering of parameters, randomized time intervals, subtraction of obviously benign transactions)
- System interaction rules
  - Specify system dependencies between audit logs, processes and the file system
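The buffer replay idea, in particular the subtraction of benign transactions, can be sketched in Python. Everything named here is hypothetical: `reproduces` stands for "replay this list against a fresh offline mirror and report whether the bad behavior recurs", and `toy_mirror` is a stand-in that "crashes" on an over-long request. The sketch covers only subtraction, not reordering or timing changes.

```python
from typing import Callable, List, Sequence

Transaction = str


def minimize_suspects(suspects: Sequence[Transaction],
                      reproduces: Callable[[List[Transaction]], bool]) -> List[Transaction]:
    """Subtract transactions one at a time, keeping only those needed to reproduce the behavior."""
    current = list(suspects)
    if not reproduces(current):
        return []                    # the recorded transactions do not reproduce it on the mirror
    i = 0
    while i < len(current):
        trial = current[:i] + current[i + 1:]
        if reproduces(trial):
            current = trial          # transaction i was benign: drop it
        else:
            i += 1                   # transaction i is needed: keep it
    return current


def toy_mirror(transactions: List[Transaction]) -> bool:
    # Hypothetical offline mirror: misbehaves when any request exceeds 40 characters.
    return any(len(t) > 40 for t in transactions)


if __name__ == "__main__":
    recorded = ["GET /index.html", "GET /" + "A" * 64, "GET /about.html"]
    print(minimize_suspects(recorded, toy_mirror))
```

Each call to `reproduces` implies restoring the mirror to a known state first, so the number of replays (at most one per recorded transaction, plus one) is the main cost of this approach.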
27Response Methods
- Based upon an attack model (JIGSAW)
  - Focuses on the capabilities obtained by an attacker
  - Automatic chaining of attacks into complex scenarios by matching pre- and post-conditions of single point attacks
- Attacks are mapped to a model of response
- Response modeled through hierarchy of effects
  - Coarse responses contain the cumulative effects of the finer grained, precise responses that are their subset
- The response that blocks the capabilities obtained from the attack model with the minimum effect is selected
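The selection rule in the last bullet can be sketched in Python. This is a minimal sketch under assumptions that are not in the slides: capabilities are plain strings, each response carries a single numeric `effect` score, and the response names and scores are invented for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Response:
    name: str
    blocks: FrozenSet[str]   # attacker capabilities this response denies
    effect: int              # hypothetical cost to legitimate service; lower is better


def select_response(responses: Iterable[Response],
                    attack_capabilities: FrozenSet[str]) -> Optional[Response]:
    """Pick the lowest-effect response that blocks every capability the attack obtains."""
    candidates = [r for r in responses if attack_capabilities <= r.blocks]
    return min(candidates, key=lambda r: r.effect, default=None)


# Hypothetical hierarchy: each coarser response blocks a superset of the finer one.
RESPONSES = [
    Response("block_request_pattern", frozenset({"http_overflow"}), effect=1),
    Response("block_source_address", frozenset({"http_overflow", "port_scan"}), effect=5),
    Response("shut_down_service",
             frozenset({"http_overflow", "port_scan", "remote_exec"}), effect=50),
]

if __name__ == "__main__":
    chosen = select_response(RESPONSES, frozenset({"http_overflow", "port_scan"}))
    print(chosen.name if chosen else "no single response blocks this attack")
```

Because coarse responses subsume finer ones, a coarse response is always a fallback when no precise response covers the attack. The minimum-effect rule keeps it from being chosen when a precise one suffices.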
28Preventing Future Attacks
[Figure: preventing future attacks. Labels: Internet, Firewall, Gateway, QoS Subsystem, Protected Clients, Critical Servers, Application Monitor, Monitor/Adapter, Response Agent, Forensics Agent (Correlation, Replay), Blocking Response, Attack Signature, Diagnosis Request, Event Query, Suspect Transactions]
29Phase 2 Plans
- Non-diverse application failover with process pairs and time delays to prevent common mode failure
- Restore/resync and reconstitution modules
- Forensics improvements
  - N-minute buffer of inputs to cluster and/or application
  - Network sensor, host and application data
  - Identify unknown attack sequences after first attack
- Responses
- Weak client protection changes
- Integration with Cyber Panel: bi-directional reports and requests/commands
- Decoys and fishbowls -- ??
30Thoughts on Integration
- Distinguish among development/design time mechanisms (e.g., source code, compilers), run time mechanisms (e.g., COTS applications), and other
- Run time characterization examples
  - Classes of applications and attacks covered
    - Sequential unknown attacks against unknown vulnerabilities on both servers and clients
    - Insider attacks (e.g., users, system administrators)
    - Denial of service attacks on bandwidth or users
    - Lifecycle software or hardware attacks
    - Attacks that compromise server but not service
  - Configuration errors, protection SW bugs, false alarms
  - Workload for administration and adding new apps
  - Confidentiality, integrity, availability impacts
- Comparisons of system projects and components
- Integrations with FTN projects for DoS
31Possible Interactions with Cyber Panel
- Outgoing reports consist of aggregate sets of atomic reports
  - System Operation / Health
    - Operations normal, Responding to attack, Prevented attack attempt, Compromise detected which did not result in a fault, System suffered fault of origin unknown/attack
  - Activity Detection
    - Process violated behavior policy, with info on attempts to do bad things such as modifying O/S library or registry or executing programs
  - Response Actions Taken
    - Blocking IP addresses, failing over to backup, performing system integrity checks, restoring critical files, killing bad processes, etc.
- Aggregate Report Example: Process Attempt to Modify System DLL
  - H2 System responding to attack
  - A1 Process violated behavior policy
  - A2 Unsuccessful attempt to modify Windows library
  - R3 Unauthorized process killed
  - H3 Prevented attack attempt
  - H1 System operations normal
- Incoming Indications and Warnings (e.g., Code Red)
- Policy recommendations include INFOCON changes, system priorities
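One way to picture an aggregate report as a set of atomic reports is the Python sketch below. The class names and the text rendering are assumptions made for illustration. Only the H/A/R tags and report texts come from the example above, with H, A and R read as Health, Activity and Response.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Kind(Enum):
    HEALTH = "H"      # System Operation / Health
    ACTIVITY = "A"    # Activity Detection
    RESPONSE = "R"    # Response Actions Taken


@dataclass(frozen=True)
class AtomicReport:
    kind: Kind
    code: int
    text: str

    @property
    def tag(self) -> str:
        return f"{self.kind.value}{self.code}"


@dataclass
class AggregateReport:
    title: str
    reports: List[AtomicReport] = field(default_factory=list)

    def render(self) -> str:
        """Render the title followed by one indented line per atomic report."""
        lines = [self.title]
        lines += [f"  {r.tag} {r.text}" for r in self.reports]
        return "\n".join(lines)


EXAMPLE = AggregateReport(
    "Process Attempt to Modify System DLL",
    [
        AtomicReport(Kind.HEALTH, 2, "System responding to attack"),
        AtomicReport(Kind.ACTIVITY, 1, "Process violated behavior policy"),
        AtomicReport(Kind.ACTIVITY, 2, "Unsuccessful attempt to modify Windows library"),
        AtomicReport(Kind.RESPONSE, 3, "Unauthorized process killed"),
        AtomicReport(Kind.HEALTH, 3, "Prevented attack attempt"),
        AtomicReport(Kind.HEALTH, 1, "System operations normal"),
    ],
)

if __name__ == "__main__":
    print(EXAMPLE.render())
```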
32Questions?
33Backup
34Monitor/Adapter Functionality
[Figure: Monitor/Adapter functionality. Labels: Continue Critical Services, Stop Current Attacker, Conflict Detection/Adjud., State Estimates (Current and History), Responses (Current and History), Policies/Specs, Mediators, Control Signals (Current and History) To Actuators, Sensor Readings (Current and History) From Sensors]
Should these be implemented as agents and blackboards?
35HACQIT Reference Architecture
[Figure: HACQIT reference architecture with multiple enclaves. Labels: HACQIT Protected Enclave, Other Enclaves, User 1, User 2, User 3, User J, User K, User M, User N, User P, Server p, Server q, Server r, LAN, WAN, FW. Key: HACQIT Protected Node, Monitor/Adapter, Server, Critical Service, Out-of-Band Comms Between HACQIT M/As]
All critical user interaction with HACQIT protected critical applications is via VPN or IPSec
36Goals of M/A Control Modules
- Assess/monitor system state (network, host, apps)
- Continue critical service
  - Migrate critical applications and state
  - Administer system (e.g., add or remove critical user or critical service)
- Gather more information (e.g., refocus sensors, turn on more intrusive sensing)
- Stop current attacker
  - Note: Control over enclave firewall and critical user protection features needed
- Reconstitute compromised server, checkpoint and restore primary, and resync as process pair
- Stop future similar attacks (forensics and response)
  - Forensics to identify unknown attack from captured data and block it or prevent its future success
  - Use decoys or fishbowls
37Some Application Types (Not Diversity)
Application Type (Example) | Description | Failover Requirements | Failover Mechanism
File server (for MS Word, PowerPoint) | Central file server for critical data | Human response time (HRT), file move or replicate, redirect users | File replication
Simple web server, DNS | Client server, no state | HRT, start up appl., redirect users | User redirection
Email, web application with database | Client server, state captured as appl. or checkpoint file | HRT, replicate state, start up appl., redirect users | File replication
Mathematical simulation model or multiple appl. with same functionality | Batch or client server with complex or differ. internal state | HRT, capture and restore state (convert?), start up appl., redirect users | Mirroring, process pairs
Air defense | Real time | Can data be lost? | May not be feasible
Radar return processing | Real time, low latency | Can data be lost? | May not be feasible
38Illustrative Attack Types and Characteristics
39Phase 1 Experiments and Demonstrations
- Demonstrations (based on Desiderata infrastructure)
  - Console based migration (Apache startup and shutdown)
  - Migration behavior under graceful shutdown v. crash
- Demonstrations (initial M/A and other sensors)
  - Migration based on integrity violation (Tripwire) on primary with heightened monitoring and auditing
  - Migration based on rogue process detection with process kill and heightened monitoring and auditing
- Experiments (enhanced M/A with wrappers for protection and sensing)
  - Multiple applications (Apache and JAMES)
  - Cross platform migration
  - Simple state capture, save, and restore
  - Attacker address identification and blocking
40Final Phase 1 Software Implementation
41Final Phase 1 Experimentation Setup
42Insights from Phase 1
- Wrappers can provide significant protection
- Cannot prove intrusion tolerance against all attacks for our architecture
- Application diversity can allow significant intrusion tolerance with fast failover if applicable
- Architecture significantly protects critical services against network based attacks but not
  - Flooding attacks (network or critical users)
  - Insider/lifecycle attacks
- Requirements sharpened
- Desiderata (and likely many QoS control apps) provides some protection against single attacks
43The Attack Model
Concept
44What is Jigsaw?
- Requires/provides modeling language specifying the scenario attack model
- Modular approach for specifying attacks
- Basic unit: Concept
  - Pre- and post-conditions
  - Capability-linked

Concept udpStorm is
  requires
    ServiceActive sa1, sa2
    ForgedPacketSend fps
  with
    sa1.service echo
    sa2.service echo
    fps.srchost sa1.host
    fps.srcport sa1.port
  end
  provides
    NetworkDoSHost nd1, nd2
  with
    nd1.host sa1.host
    nd2.host sa2.host
  end
end.
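
The requires/provides idea behind a concept such as udpStorm can be illustrated with a small Python sketch. This is not Jigsaw itself. Capabilities are reduced to bare names, the `with` constraints that bind hosts and ports are ignored, and `spoofing_tool` and `RawSocketAccess` are hypothetical additions used only to show chaining.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Concept:
    name: str
    requires: FrozenSet[str]   # capabilities needed before the attack step can occur
    provides: FrozenSet[str]   # capabilities the attacker holds afterwards


def chain(concepts: List[Concept], initial: Set[str]) -> List[str]:
    """Return concept names in the order they become enabled, starting from `initial`."""
    held = set(initial)
    fired: List[str] = []
    progress = True
    while progress:
        progress = False
        for c in concepts:
            if c.name not in fired and c.requires <= held:
                held |= c.provides      # post-conditions become available to later concepts
                fired.append(c.name)
                progress = True
    return fired


UDP_STORM = Concept(
    "udpStorm",
    requires=frozenset({"ServiceActive", "ForgedPacketSend"}),
    provides=frozenset({"NetworkDoSHost"}),
)
# Hypothetical concept, not from the slides, that supplies ForgedPacketSend.
SPOOFING_TOOL = Concept(
    "spoofing_tool",
    requires=frozenset({"RawSocketAccess"}),
    provides=frozenset({"ForgedPacketSend"}),
)

if __name__ == "__main__":
    print(chain([UDP_STORM, SPOOFING_TOOL], {"ServiceActive", "RawSocketAccess"}))
```

Matching the capabilities one concept provides against those another requires is what allows single point attacks to be chained into multi-step scenarios without enumerating the scenarios in advance.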