Title: Software Assurance Metrics and Tool Evaluation
1. Software Assurance Metrics and Tool Evaluation
- Paul E. Black
- National Institute of Standards and Technology
- http://www.nist.gov/
- paul.black_at_nist.gov
2. What is NIST?
- National Institute of Standards and Technology
- A non-regulatory agency in Dept. of Commerce
- 3,000 employees plus adjuncts
- Gaithersburg, Maryland and Boulder, Colorado
- Primarily research, not funding
- Over a century of experience in standards and measurement, from dental ceramics to microspheres, from quantum computers to building codes
3. What is Software Assurance?
- "the planned and systematic set of activities that ensures that software processes and products conform to requirements, standards, and procedures" - from the NASA Software Assurance Guidebook and Standard
- to help achieve
- Trustworthiness - no vulnerabilities exist, either of malicious or unintentional origin
- Predictable Execution - justifiable confidence that the software functions as intended
4. Getting Software Assurance
Good Requirements
Good Operations
Good Testing
Good Coding
Good Design
5. What is an Assurance Case?
- A documented body of evidence that provides a convincing and valid argument that a specified set of claims about a system's properties is justified in a given environment. - after Howell and Ankrum, MITRE, 2004
6. ...in other words?
Claims, subclaims
Arguments
Evidence
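One way to picture that structure, as a rough illustration only (the claims, arguments, and evidence below are invented, not taken from SAMATE): a top-level claim decomposes into subclaims, each backed by an argument that cites evidence.

```python
# Rough, invented sketch of an assurance case skeleton:
# claims decompose into subclaims, each supported by an
# argument that cites concrete evidence.
assurance_case = {
    "claim": "The service resists common injection attacks",
    "subclaims": [
        {
            "claim": "All database queries are parameterized",
            "argument": "Static analysis and code review cover every query site",
            "evidence": ["analyzer report", "review checklist sign-off"],
        },
        {
            "claim": "Inputs are validated at trust boundaries",
            "argument": "Fuzzing exercised every external interface without failures",
            "evidence": [],  # no evidence yet, so this subclaim is unsupported
        },
    ],
}

# Walking the case shows which subclaims still lack evidence.
for sub in assurance_case["subclaims"]:
    status = "supported" if sub["evidence"] else "UNSUPPORTED"
    print(f'{sub["claim"]}: {status}')
```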
7. Evidence Comes From All Phases
- All tools should produce explicit assurance evidence
- Design tools
- Compilers
- Test managers
- Source code analyzers
- Etc.
- What form should the evidence take? (one sketch follows below)
- OMG: a common framework for analysis and exchange
- Grand Verification Challenge: a tool bus
- Software certificates, a la proof-carrying code
- Software security facts label
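As a purely illustrative answer to that question (not an OMG, SAMATE, or NIST format; every field name here is invented), a tool run might emit a small structured record that could later be rolled up into something like a security facts label:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class AssuranceEvidence:
    """Hypothetical machine-readable evidence record a tool might emit."""
    tool: str        # tool that produced the evidence
    artifact: str    # program or file examined
    claim: str       # claim the evidence supports
    result: str      # e.g. "no findings", "3 warnings"
    produced: date = field(default_factory=date.today)

# Example: a source code analyzer contributing one entry to a "facts label".
evidence = AssuranceEvidence(
    tool="ExampleScan 1.0",  # hypothetical analyzer
    artifact="webapp/login.py",
    claim="No string concatenation of user input into SQL queries",
    result="0 findings in 412 lines scanned",
)
print(evidence)
```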
8. So What About Maintenance??
- A change should trigger reevaluation of the assurance case
- A typical change should entail regular (re)assurance work: unit tests, subsystem regression tests, etc. (sketched below)
- A significant change may imply changes to the assurance case model
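A minimal sketch of that idea, with invented file and evidence names: map each piece of assurance evidence to the source files it depends on, and let a change set identify which evidence has gone stale and must be re-established.

```python
# Minimal sketch (all names hypothetical): given a change set, decide which
# pieces of assurance evidence are stale and must be re-established.

# Map each evidence item to the source files it depends on.
EVIDENCE_DEPENDS_ON = {
    "unit tests pass (auth module)": {"auth/login.py", "auth/tokens.py"},
    "subsystem regression suite (payments)": {"payments/charge.py"},
    "static analysis: no injection findings": {"auth/login.py", "payments/charge.py"},
}

def stale_evidence(changed_files):
    """Return evidence items whose supporting files were touched by the change."""
    changed = set(changed_files)
    return [name for name, deps in EVIDENCE_DEPENDS_ON.items() if deps & changed]

# A typical change: redo only the affected (re)assurance work.
print(stale_evidence(["auth/login.py"]))
# -> ['unit tests pass (auth module)', 'static analysis: no injection findings']
```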
9. Why Does NIST Care?
- NIST and DHS signed an agreement in 2004 which kicked off the Software Assurance Metrics And Tool Evaluation (SAMATE) project
- http://samate.nist.gov/
- NIST will take the lead to
- Examine software development and testing methods and tools to target bugs, backdoors, bad code, etc.
- Develop requirements to take to DHS for R&D funding.
- Create studies and experiments to measure the effectiveness of tools and techniques.
- Develop SwA tool specifications and tests.
- NIST already accredits labs (NVLAP, NIAP) and produces security standards (FISMA, DES, AES)
10. Details of SwA Tool Evaluations
- Develop clear (testable) requirements
- Focus group develops tool function specification
- Spec posted to web for public comment
- Comments incorporated
- Testable requirements developed
- Develop a measurement methodology
- Write test procedures
- Develop reference datasets or implementations
- Write scripts and auxiliary programs (a scoring sketch follows this list)
- Document interpretation criteria
- Come up with test cases
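For example, one of the auxiliary scripts might score a tool run against a reference test case. This is only a sketch with made-up data; real interpretation criteria (e.g., how close an alarm must be to the seeded flaw to count) would come from the documented methodology.

```python
# Minimal sketch of an auxiliary scoring script (all data hypothetical):
# compare a tool's reported findings against the weaknesses known to be
# seeded in a reference test case, then report hits, misses, and extras.

def score(expected, reported):
    """expected/reported: sets of (file, line) locations of weaknesses."""
    true_positives = expected & reported
    false_negatives = expected - reported
    false_positives = reported - expected
    return {
        "recall": len(true_positives) / len(expected) if expected else 1.0,
        "missed": sorted(false_negatives),
        "extra alarms": sorted(false_positives),
    }

# Known seeded weaknesses in one test case vs. what a tool reported.
expected = {("case042.py", 17), ("case042.py", 31)}
reported = {("case042.py", 17), ("case042.py", 55)}
print(score(expected, reported))
```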
11. SAMATE Reference Dataset
- Much more than a group of (vetted) test sets (an example follows below)
- (Diagram: the SRD in relation to a code scanner benchmark and a pen test minimum)
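To make "vetted test case" concrete, here is a hypothetical example in the spirit of an SRD entry, written in Python for consistency with the other sketches (actual SRD cases are largely C, C++, and Java): a small program with a deliberately seeded weakness, plus metadata saying what a source code analyzer should report.

```python
# Hypothetical reference test case with a seeded weakness (not an actual SRD entry).
import os
import subprocess

# Metadata a reference dataset would carry alongside the code (fields invented).
TEST_CASE_METADATA = {
    "weakness": "OS command injection (CWE-78)",
    "flawed_function": "archive_user_file",
    "expected_tool_result": "report user-controlled data reaching a shell command",
}

def archive_user_file(filename: str) -> None:
    # FLAW: user-controlled 'filename' is pasted into a shell command string.
    os.system("tar czf backup.tgz " + filename)

def archive_user_file_fixed(filename: str) -> None:
    # FIX: pass arguments as a list so no shell interprets the filename.
    subprocess.run(["tar", "czf", "backup.tgz", filename], check=True)
```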
12. SRD Home Page
13-15. (No transcript: screenshot slides)
16. But Are the Tools Effective?
- Do they really find vulnerabilities? In other
words, how much assurance does running a tool or
using a technique provide?
17. Studies and Experiments to Measure Tool Effectiveness
- Do tools find real vulnerabilities?
- Is a program secure (enough)?
- How secure does tool X make a program?
- How much more secure does technique X make a program after doing Y and Z?
- Dollar for dollar, can I get more reliability from methodology P or S?
18. Contact for Participation
- Paul E. Black
- SAMATE Project Leader
- Information Technology Laboratory (ITL)
- U.S. National Institute of Standards and Technology (NIST)
- paul.black_at_nist.gov
- http://samate.nist.gov/
19. Possible Study: Do Tools Catch Real Vulnerabilities?
- Step 1: Choose programs which are widely used, have source available, and have long histories.
- Step 2: Retrospective
- 2a: Run tools on older versions of the programs.
- 2b: Compare alarms to reported vulnerabilities and patches.
- Step 3: Prospective
- 3a: Run tools on current versions of the programs.
- 3b: Wait (6 months to 1 year or more).
- 3c: Compare alarms to reported vulnerabilities and patches (a matching sketch follows this list).
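Step 2b (and 3c) could be automated roughly like this; the alarms, vulnerability entries, and the five-line matching window are all invented for illustration.

```python
# Minimal sketch of steps 2b/3c (all data hypothetical): check which tool
# alarms line up with vulnerabilities later reported and patched.

# Alarms from a tool run on an older version: (file, line, warning).
alarms = [
    ("src/parse.c", 120, "possible buffer overflow"),
    ("src/net.c", 45, "unchecked return value"),
]

# Vulnerabilities reported against that version, located from the patches.
reported_vulns = [
    ("src/parse.c", 118, "CVE-XXXX-1111 buffer overflow"),  # placeholder ID
]

def matches(alarm, vuln, line_slack=5):
    """Count an alarm as a hit if it is in the same file within a few lines."""
    return alarm[0] == vuln[0] and abs(alarm[1] - vuln[1]) <= line_slack

hits = [(a, v) for a in alarms for v in reported_vulns if matches(a, v)]
print(f"{len(hits)} of {len(reported_vulns)} reported vulnerabilities were flagged")
```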
20. Possible Study: Transformational Sensitivity
- Step 1: Choose a program measure.
- Step 2: Transform a program into a semantically equivalent one (loop unrolling, in-lining, breaking into procedures, etc.).
- Step 3: Measure the transformed program.
- Step 4: Repeat steps 2 and 3.
- If the measurement is consistent, it measures the algorithm. If it differs, it measures the program. (A toy sketch follows below.)
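A toy illustration of the idea, using an invented measure (AST statement count) and an invented transformation (manual loop unrolling); the point is only to show the consistency check, not to propose either as the real measure or transform.

```python
# Minimal sketch of transformational sensitivity: measure two semantically
# equivalent versions of the same computation and see whether the measure changes.
import ast

# Version A: a loop that sums the first four squares.
version_a = """
def sum_squares():
    total = 0
    for i in range(1, 5):
        total += i * i
    return total
"""

# Version B: the same computation with the loop unrolled.
version_b = """
def sum_squares():
    total = 0
    total += 1 * 1
    total += 2 * 2
    total += 3 * 3
    total += 4 * 4
    return total
"""

def statement_count(source: str) -> int:
    """A crude program measure: count statement nodes in the AST."""
    return sum(isinstance(node, ast.stmt) for node in ast.walk(ast.parse(source)))

# The counts differ even though the computation is the same, so this
# particular measure tracks the program, not the algorithm.
print("version A:", statement_count(version_a))
print("version B:", statement_count(version_b))
```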
21. Dawson Engler's Qualm
- (Plot: vulnerabilities vs. time spent fixing weaknesses reported by tools)
22. Possible Study: Engler's Qualm
- Step 1: Choose programs which are widely used, have long histories, and have adopted tools.
- Use the number of vulnerabilities reported as a surrogate for (in)security.
- DHS funded Coverity to check many programs.
- Step 2: Count vulnerabilities before and after tool adoption.
- Step 3: Compare for statistical significance (a sketch follows this list).
- Confounding factors
- Change in size or popularity of a package
- When was tool feedback incorporated?
- Reported vulnerabilities may come from the tool itself.
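Step 3 might look roughly like this; the counts are made up, and the choice of a conditional binomial test for comparing two rates over equal observation windows is my assumption, not something prescribed by the study.

```python
# Minimal sketch of step 3 (hypothetical counts, one possible test choice):
# compare vulnerability counts over equal-length windows before and after a
# project adopted a static analysis tool.
from scipy.stats import binomtest

before = 23   # vulnerabilities reported in the 2 years before adoption (made up)
after = 11    # vulnerabilities reported in the 2 years after adoption (made up)

# Conditional test for two Poisson rates with equal exposure: under the null
# hypothesis of no change, 'before' is Binomial(before + after, 0.5).
result = binomtest(before, n=before + after, p=0.5, alternative="two-sided")
print(f"before={before}, after={after}, p-value={result.pvalue:.3f}")

# A small p-value suggests the change in rate is unlikely to be chance alone,
# but the confounders listed above (size, popularity, feedback timing) still
# have to be ruled out before crediting the tool.
```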