Title: Information Discovery, Use and Sharing
1Information Discovery, Use and Sharing
- Mary Brady
- Program Manager
- Information Technology Laboratory
- mbrady_at_nist.gov
- (301) 975-4094
2Information Discovery, Use and Sharing
- Address National Priority areas by applying ITL's
core competencies in measurement and testing,
mathematical and statistical analyses, modeling
and simulation, and standards development and
deployment to enable
- Scientific / Knowledge discovery
- Develop methods to characterize, analyze, and
extract knowledge from information
- Widespread, seamless, and secure information
exchange, and
- Develop methods for acquiring, defining,
securing, accessing, integrating and preserving
information (data, metadata)
- Usability of vast stores of information
- Improved methods for turning data and metadata into
information (search and retrieval)
- Improved accessibility and usability
Turn data into information
3Barriers to Innovation
- Massive amounts of data must be gathered,
integrated and analyzed
- Human annotation and metadata methods do not
scale
- Robust computational methods are needed to
extract relevant features from huge data
repositories
- Methods are needed for finding, fusing and
visualizing highly complex data to make
experimental inferences
- Highly sophisticated computer modeling is needed
to enable adequate comparisons and drive
potential hypotheses
4Information Discovery, Use, and Sharing
- Medical Imaging (Change analysis in Lung Cancer)
- Develop measurement infrastructure to improve
methods of measuring change in lung cancer tumors
- Computational Biology
- Computational science capability to support high
throughput measurements. Focus on techniques to
convert physical cell measurements into
quantitative descriptors that can be analyzed
statistically.
- Core Infrastructure
- Apply automated test methods, coupled with
combinatorial testing approaches, to XML-based
languages used in manufacturing and electronics
industry supply chains; investigate the potential
measurement science role in web-based exchanges
(reg/rep, ontologies, SOAs, web services, XML,
etc.).
- Shape Searching
- 3D shape searching techniques for protein
structures within the HIV AIDS Database and for
3D CAD parts in a supply chain.
- Multimodal
- Develop and apply metrics and testing to advance
the state of the art in human language
processing, including text, video, and speech
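As an illustration of the combinatorial testing idea named under Core Infrastructure, the sketch below greedily builds a pairwise (2-way) test set: every pair of values from any two parameters appears in at least one test, usually with far fewer tests than the full cross product. The parameter names and values are hypothetical, and the greedy strategy is a generic textbook technique, not necessarily the approach used in the NIST work.

```python
from itertools import combinations, product

# Hypothetical parameters of an XML exchange under test (illustrative only).
EXAMPLE_PARAMETERS = {
    "encoding": ["UTF-8", "UTF-16"],
    "schema": ["XSD", "DTD"],
    "transport": ["HTTP", "SMTP"],
}

def pairwise_tests(parameters):
    """Greedily pick tests until every 2-way value combination is covered."""
    names = sorted(parameters)
    # Every ((param_a, value_a), (param_b, value_b)) pair still to be covered.
    uncovered = {
        ((a, va), (b, vb))
        for a, b in combinations(names, 2)
        for va in parameters[a]
        for vb in parameters[b]
    }
    # Candidate tests: the full cross product (fine for small examples).
    candidates = [
        dict(zip(names, values))
        for values in product(*(parameters[n] for n in names))
    ]

    def covers(test, pair):
        (a, va), (b, vb) = pair
        return test[a] == va and test[b] == vb

    tests = []
    while uncovered:
        # Choose the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda t: sum(covers(t, p) for p in uncovered))
        tests.append(best)
        uncovered = {p for p in uncovered if not covers(best, p)}
    return tests

if __name__ == "__main__":
    for test in pairwise_tests(EXAMPLE_PARAMETERS):
        print(test)
```

Enumerating the full cross product as the candidate pool keeps the sketch short but does not scale; practical tools construct covering arrays without enumerating every combination.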
5Medical Imaging Change Analysis in Lung Cancer
- Detecting and analyzing changes in lung tumors is
useful in disease diagnosis and therapy
evaluation for lung cancer patients.
- High growth rate -> indicative of disease onset
- Exponential growth rate -> cancer
- Change of tumor size over time -> response,
stable, progressive disease
- BUT, change analysis is difficult due to the
lack of quantitative measurements.
- Sources of Measurement Error
- CT Scanner - not calibrated, changes in
parameters affect tumor image
- Lesion - size, texture (density variation),
margins, complexity
- Patient - cardiac motion, respiration,
involuntary movements, and overall health
- NIST Role
- Address scanner and patient issues through the
development of phantoms and standard protocols
with traceability to NIST
- Develop methods to effectively characterize
tumors with known uncertainties
- Conduct round-robin evaluations of reconstruction
and analysis algorithms
- Collaborators
- NCI, FDA, RSNA, CRPF, GE, Siemens, Philips,
Cornell, Stanford
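To make the growth-rate bullets above concrete, here is a minimal sketch that estimates a tumor volume doubling time from two measurements, assuming purely exponential growth between scans. The function name, units, and sample numbers are illustrative assumptions and do not come from the slides.

```python
import math

def doubling_time(volume_first, volume_second, days_between):
    """Doubling time in days, assuming V(t) = V0 * 2**(t / T) between two scans.

    A negative result means the tumor shrank (a halving time in magnitude).
    """
    if volume_first <= 0 or volume_second <= 0 or days_between <= 0:
        raise ValueError("volumes and interval must be positive")
    if volume_first == volume_second:
        return math.inf  # no measurable change
    return days_between * math.log(2) / math.log(volume_second / volume_first)

if __name__ == "__main__":
    # Hypothetical volumes in cubic millimetres, scans 30 days apart.
    print(doubling_time(100.0, 200.0, 30.0))
```

Because the estimate depends on the logarithm of a volume ratio, small errors in either volume measurement can shift the result considerably when the two volumes are close, which is one reason the measurement-error sources listed above matter.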
6Computational Biology Single Cell Analyses
- Intracellular molecular reactions and
interactions that control the response and fate
of cells and organisms cannot be unambiguously
compared and combined due to a lack of standards
and validated protocols.
- NIST Role
- Provide the measurement tools and standards that
enable quantifiable and reproducible measurements
of cells and their interactions through
- Standard data/metadata format for image capture,
storage, retrieval, analysis.
- Software to enable high throughput cell image
analysis and interoperability.
- Standards and validation required to ensure
reproducible image analysis.
- Multi-site experiments to test software and
validation protocols.
- Technical Approach
- Create and evaluate an integrated data
collection, organization and analysis
infrastructure for cellular imaging.
- Experimentalists and computational scientists -
focus on the physical standards and protocols for
data collection, image processing and analysis,
storage of data and metadata, and evaluation of
results.
7Core Infrastructure
8Conformance Test Development
Diagram labels: W3C Specification, Spec Processing,
Semi-formal Assertions, Test Assertions, Test
Development, NIST Tests; stages marked Compute
Intensive and Labor Intensive
Improved W3C, Vendor QA
- Comprehensive, coherent tests
- Common test description and reporting
- Problems identified early in life cycle
- Requirement for Recommendation
- Better specifications, better products
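The path from test assertions to executable tests on this slide can be sketched as a tiny table-driven harness. Each case pairs an input document with the outcome an assertion requires, and the harness reports pass or fail using a common format. The test IDs and cases are invented for illustration, and Python's built-in parser stands in for an implementation under test; this is not the NIST test suite.

```python
import xml.etree.ElementTree as ET

# Hypothetical test cases: (test id, input document, must the parser accept it?)
TEST_CASES = [
    ("wf-001", "<doc><item/></doc>", True),    # well-formed
    ("wf-002", "<doc><item></doc>", False),    # mismatched end tag
    ("wf-003", "<doc a='1' a='2'/>", False),   # duplicate attribute
]

def is_well_formed(text):
    """Stand-in for the implementation under test: accept or reject a document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

def run_suite(cases, parser=is_well_formed):
    """Run every case and return a uniform report: test id -> 'pass' or 'fail'."""
    report = {}
    for test_id, document, expected in cases:
        report[test_id] = "pass" if parser(document) == expected else "fail"
    return report

if __name__ == "__main__":
    for test_id, outcome in run_suite(TEST_CASES).items():
        print(test_id, outcome)
```

Keeping the cases as data, separate from the harness, is one way to get the common test description and reporting listed above, and it lets the table be regenerated when the specification changes.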
9XML Technologies
- XML is a critical enabling technology
- Family of standards for describing and
manipulating information
- Describes data in an intelligent format that can
be exchanged over incompatible applications,
systems, and output media
- To achieve the promise of XML, need to ensure
the usability, correctness and quality of XML
technology standards and their implementation
- NIST Role
- Provide technical leadership and conformance
guidance
- Partner with industry to develop a set of
conformance test suites
- Provide technical direction regarding use of XML
technologies to others (e.g., NIST researchers,
government agencies)
10XML Testing
- Drivers/Motivation
- XML software not working correctly: inaccurate
data, no interoperability
- Need for conformance tests
- early in the WG process
- adaptable to spec changes
- use to improve spec
- Accelerate the development of specifications and
implementations
- Expected Outcomes
- Lead efforts for all W3C efforts
- Test suites used by many, results in bug fixes to
all XML implementations and specs
- Tests are used as part of the process: 2
implementations must pass tests to exit CR
- Test methodology influences others: open source
Linux, OASIS, SourceForge, Apache
- Over 30,000 tests developed
- Collaborators/Customers
- IT Industry
- IBM, Sun, Microsoft, Netscape, X-Hive, Arbortext,
Oracle, Lucent Technologies, DataDirect, AT&T,
Software AG, BumbleBee
- W3C, OASIS, University of Edinburgh
- Vertical Industries: Health Care, automotive,
aerospace, construction, financial, travel, etc.
- Consumers of XML technologies
- Staffing (FTEs)
- (FTEs, includes students, GRs)
- 1 XML Core
- 1 XML Schema
- 1 XML Query
- .5 DOM, XSLT, XSL-FO (maintenance)
- Exit Strategy
- Successfully applied to applications
- Technology transfer enriches communities
11ebXML Conformance Testing
- Expected Outcomes
- Uniform method for testing ebXML standards
- Test Framework and Suites used to verify
conformance and interoperability
- Test Framework and Suites used by
- Regional IT consortia: OASIS (US), EAN-USS and
ETSI (Eur), KorBIT and ECOM (Asia)
- Test companies, ebXML vendors and users
- Demoed at Reliable Infrastructures for XML, ETSI
Plugtest, OAG meeting
- Drivers/Motivation
- Interoperability is key to Internet B2B; ebXML
provides the foundation
- Need to consolidate disparate testing efforts for
ebXML family of standards
- A globally recognized, open testing architecture
and test suite promotes global ebXML
interoperability through a standard set of
testing tools
- Reduces development efforts and duplicative test
suites and testing
- Collaborators/Customers
- OASIS, Sun Microsystems, Drake Certivo, Fujitsu
America, Cyclone Commerce, Adobe, KorBit
- Vertical industries
- Automotive, Aerospace, Healthcare, Construction
- OAG, HIMSS/IHE
- ebXML vendors and testers, small/medium
businesses using ebXML
- Staffing (FTEs)
- (FTEs, GRs)
- 2 (1 FTE, 1 GR)
- Exit Strategy
- Successfully applied to applications
- Technology transfer enriches communities - spurs
global usage and contribution to a common set of
ebXML test suites
12ebXML Conformance Testing
OASIS Technical Committees (TC)
Test Framework Components
Regional Consortia, Vendors, Users
- Led OASIS Registry TC
- Developed initial Spec and tests
- Lead Testing effort in OASIS Implementation,
Interop, Conformance (IIC) TC
- Developed Test Framework Software and
Specification
- Developed and published Messaging Test Suite
- Revising Test Framework Software and
Specification/Implementation, extend use to other
ebXML specs
- Collaborate with other ebXML TCs (message,
registry, business process, Collaboration
Protocol Profile) to use Test Framework to produce
test suites
- Work with Registry TC, developing test
requirements and core test suite
13Health Information Technology
- Expected Outcomes
- Lead and participate in standards and
conformance efforts
- partners with industry standards groups
- chair conformance efforts
- consultant to federal agencies with healthcare
missions
- Develop conformance tests, tools and prototypes
- based on industry priorities
- prototypes fill in industry gaps
- Conformance definitions included in standards
- Drivers/Motivation
- Fragmentation in our health care system
- Demand for online access to medical info and
services
- Improve patient care, reduce medical errors and
costs (98,000 deaths/yr)
- build a standardized platform all can
communicate electronically
- (HHS Secretary Thompson, 2003)
- Vision of an interconnected electronic health
infrastructure
- Staffing
- (FTEs, includes students, GRs)
- 5.5 Mess. Conformance, EHR, Infrast. Integration
- 2 IEEE Medical Device Comm
- 3 Standards Landscape
- .25 Telemedicine
- Exit Strategy
- Tests, tools and approaches successfully applied
- Technology transfer enriches community
- continued use of products
- ability to build upon products
- rely on NIST for consultation
- Collaborators/Customers
- American Telemedicine Association (ATA)
- ANSI HC Informatics Standards Board (ANSI HISB)
- ASTM Committee E31 on HC Informatics
- Consolidated Health Informatics (CHI)
- Health Level Seven (HL7)
- IEEE Medical Device Communications Group
- Integrating the HC Environment (IHE)
- PITAC - President's Information Technology
Advisory Committee
14Infrastructure Integration
- Putting standards together for effective
information exchange
- seamless information flow from application to
application across and between HC enterprises
- comprehensive view of patient information results
in quality care decisions
- Define and build standard-based profiles to
address specific problems
- NIST Activities
- work with HIMSS IHE to define, test, and
implement integration profiles
- develop measurements and tools to validate that
approaches are technically and economically
feasible
- Co-authored IHE Cross-Enterprise Document Sharing
(XDS) standard
- Develop XDS reference implementation and test
tool
- NIST components featured at HIMSS Showcases
Integrated HC environment: HC standards, IT
standards, and glue to bridge gaps
15Cross-Enterprise Document Sharing
Diagram labels: Cross-Enterprise Document Registry (XDS);
Enterprise and Repository (four of each); Imaging Center;
Physician; Hospital A; Hospital B; Emergency Room;
Patient Admin
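The registry and repository roles in this diagram can be illustrated with a toy sketch: documents stay in each enterprise's repository, while a shared registry holds only metadata and a pointer to the repository that stores each document. The class names, fields, and sample records are hypothetical and greatly simplified; they are not the transactions or metadata model defined by the IHE XDS profile.

```python
class DocumentRepository:
    """Stores document content for one enterprise."""

    def __init__(self, name):
        self.name = name
        self._documents = {}

    def store(self, document_id, content):
        self._documents[document_id] = content

    def retrieve(self, document_id):
        return self._documents[document_id]


class DocumentRegistry:
    """Shared index: holds metadata and repository pointers, never content."""

    def __init__(self):
        self._entries = []

    def register(self, patient_id, document_id, doc_type, repository):
        self._entries.append({
            "patient_id": patient_id,
            "document_id": document_id,
            "doc_type": doc_type,
            "repository": repository,
        })

    def query(self, patient_id):
        return [e for e in self._entries if e["patient_id"] == patient_id]


def fetch_patient_documents(registry, patient_id):
    """Consumer side: ask the registry where documents are, then retrieve them."""
    return [
        (entry["doc_type"], entry["repository"].retrieve(entry["document_id"]))
        for entry in registry.query(patient_id)
    ]


if __name__ == "__main__":
    registry = DocumentRegistry()
    hospital_a = DocumentRepository("Hospital A")
    imaging = DocumentRepository("Imaging Center")

    hospital_a.store("doc-1", "discharge summary text")
    registry.register("patient-42", "doc-1", "discharge-summary", hospital_a)
    imaging.store("doc-2", "chest CT report text")
    registry.register("patient-42", "doc-2", "imaging-report", imaging)

    # An emergency room at another hospital looks up the patient's records.
    print(fetch_patient_documents(registry, "patient-42"))
```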
16Standards Landscape
- Plethora of standards even within a discipline
(e.g., clinical standards)
- Who's doing what? How are standards being used?
Deployed?
- monitoring the relevant standards is an arduous and
labor-intensive effort
- Repository of information on HC standards,
organizations, and initiatives.
- knowledge of who's doing what
- minimize overlap and duplication of standards
development
- facilitate collaborative standards work
- foster use and adherence to standards
- NIST Activities
- develop an interactive web prototype to
demonstrate the concept
- collaborate with partners such as ANSI HISB,
eGov's CHI, and AHRQ
- develop a tool set that can be adopted and
extended to enable widespread implementation
- initiated pilot focused on HISB information --
adapted for use in NHIN
173D Shape Searching
18Research Challenge
- Need shape descriptor that is
- Discriminating
- Quick to compute
- Concise to store
- Pose-independent
- Efficient to match
Diagram labels: 3D CAD database, Shape descriptor,
Nearest Neighbor, Rank List
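As one possible illustration of the descriptor requirements above, the sketch below uses a histogram of pairwise point distances (often called a D2 shape distribution). It is unaffected by rotation and translation, short to store, and cheap to compare, and a query is answered by ranking database shapes by descriptor distance. This technique is swapped in purely as an example; the slides do not say which descriptor the project uses, and the shapes and names here are made up.

```python
import math
from itertools import combinations

def d2_descriptor(points, bins=8):
    """Normalized histogram of pairwise distances, scaled by the largest one."""
    distances = [math.dist(p, q) for p, q in combinations(points, 2)]
    longest = max(distances)
    histogram = [0] * bins
    for d in distances:
        index = min(int(d / longest * bins), bins - 1)
        histogram[index] += 1
    return [count / len(distances) for count in histogram]

def rank_by_similarity(query_points, database):
    """Return database shape names ordered from most to least similar (L1 distance)."""
    query = d2_descriptor(query_points)
    scored = []
    for name, points in database.items():
        candidate = d2_descriptor(points)
        score = sum(abs(a - b) for a, b in zip(query, candidate))
        scored.append((score, name))
    return [name for _, name in sorted(scored)]

def box_corners(width, depth, height):
    """Corner points of an axis-aligned box with one corner at the origin."""
    return [(x, y, z) for x in (0, width) for y in (0, depth) for z in (0, height)]

if __name__ == "__main__":
    cube = box_corners(1, 1, 1)
    # Same cube rotated 90 degrees about the z axis and shifted in space.
    moved_cube = [(-y + 5, x - 2, z + 3) for x, y, z in cube]
    bar = box_corners(1, 1, 4)
    print(rank_by_similarity(cube, {"bar": bar, "moved_cube": moved_cube}))
```

The linear scan over the database is the simplest form of nearest-neighbor matching; a large CAD or protein database would need an index structure to meet the "efficient to match" requirement.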
19Shape Searching structural bioinformatics
20MultiModal Round-robins
21TREC
- Workshop series for large-scale evaluation of
(text) retrieval technology
- realistic test collections
- uniform, appropriate scoring procedures
- a forum for the exchange of research ideas
- Started in 1992 as an evaluation vehicle for
DARPA Tipster project
- http://trec.nist.gov
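As an example of what a uniform scoring procedure can look like, the sketch below computes average precision for one ranked result list against a set of judged-relevant documents. Average precision is a standard retrieval measure, but the function name and document IDs are illustrative assumptions, and this is not the official TREC scoring software.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values at each rank where a relevant document appears."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("at least one relevant document is required")
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)

if __name__ == "__main__":
    # Hypothetical system output and relevance judgments for one topic.
    print(average_precision(["d1", "d2", "d3", "d4"], {"d1", "d3"}))
```

Scoring every system against the same judgments with the same function is what makes results comparable across participants.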
22TREC Tracks
- Personal documents: Blog, Spam
- Retrieval in a domain: Legal, Genome
- Answers, not docs: Novelty, QA
- Web searching, size: Enterprise, Terabyte, Web, VLC
- Beyond text: Video, Speech, OCR
- Beyond just English: X->{X,Y,Z}, Chinese, Spanish
- Human-in-the-loop: Interactive, HARD
- Streamed text: Filtering, Routing
- Static text: Ad Hoc, Robust
23NIST ASR Benchmark Test History May. 06
Chart: benchmark results by year, 1988 to 2011, on a
vertical axis running from 1 to 100. Labeled series:
Air Travel Planning Kiosk Speech; Meeting Speech
(SDM, MDM, IHM). Reference band: Range of Human
Error In Transcription.
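The vertical axis of this chart is not labeled in the text. Assuming it reports word error rate, the measure commonly used to score speech recognition output, the following minimal sketch shows how that figure is computed: the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The sample sentences are invented.

```python
def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must not be empty")
    # Dynamic-programming edit distance over words, one row at a time.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref_words)

if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, the rate can exceed 100 percent when the system outputs many more words than the reference contains.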
24Questions?