Title: Harmonizing Semantics in EGovernment
1Harmonizing Semantics in E-Government
- Presentation to the Ontolog-Forum (http://ontolog.cim3.net)
- Brand L. Niemann
- U.S. Environmental Protection Agency Enterprise Architecture Team
- CIO Council's Architecture and Infrastructure Committee (AIC)
- Co-Chair, Semantic Interoperability Community of Practice (SICoP)
- CIO Council's Best Practices Committee (Knowledge Management Working Group)
- April 22, 2004
2A Little History
- Led a Team That Won the Special Award for Innovation with XML and VoiceXML Web Services from Mark Forman and the Quad Council at FOSE, March 2002.
- Led the CIO Council XML Web Services Working Group from August 2002-October 2003.
- TopQuadrant led the Semantic Technologies for eGovernment Pilot.
- TopQuadrant helped organize the very successful Semantic Technologies for eGovernment Conference at the White House Conference Center, September 8, 2003.
- The TopQuadrant Pilot and the CIO Council's Knowledge Management Working Group (Best Practices Committee) Helped Start the new Semantic Interoperability Community of Practice (SICoP).
- The XML Web Services Working Group Pilots Supported the Development of the
- Federal Enterprise Architecture's (FEA) Data and Information Reference Model and Its Data Management Strategy and the
- Government Enterprise Architecture Framework (GEAF) of the CIO Council's Architecture and Infrastructure Committee (AIC) Governance Subcommittee.
3Organizational Relationships
Industry Advisory Council (IAC)
U.S. CIO Council
OMB - FEAPMO
Enterprise Architecture Special Interest Group
Architecture Infrastructure Committee
IT Workforce Connections
Best Practices Committee
WGs and CoPs
Subcommittees Governance Components Emerging
Technologies
Semantic Interoperability Community of Practice
Chief Architects Forum
4Some Upcoming Events
- Collaboration Expedition Workshop 31, April 28th, National Science Foundation, Ballston, Virginia
- Joint Workshop with SICoP on Multiple Taxonomies
- See http://ua-exp.gov
- Collaboration Expedition Workshop 32, May 11th, National Science Foundation, Ballston, Virginia
- Workshop on Emerging Technology Innovations in Software Component Development, Reuse, and Management Applications to Government Enterprise Architecture (e.g., the new Chief Architects Forum CoP)
- See http://ua-exp.gov
- SICoP Monthly Meeting 2, May 19th, MITRE, McLean, Virginia
- Progress Reports on White Paper Modules (3), Collaboration Tools, Discussion of Common Upper Ontologies, etc.
- See http://web-services.gov and http://km.gov
- Fourth Quarterly Emerging Technology Components Conference, June 3rd, MITRE, McLean, Virginia
- Populating the Service Grid with Service Components
- See http://Componenttechnology.org
5An Upcoming Event
- Joint Workshop with SICoP on Multiple Taxonomies, April 28th
- Welcome
- Organizer: Michel Biezunski, Coolheads Consulting
- The Semantic Web-What Is This Really About?
- Renee Lewis, Pensare Group
- Increased Knowledge Sharing and Mission Success: Implementing Taxonomies for NASA
- Jayne Dutra, Jet Propulsion Laboratory
- Master and Relational Taxonomies
- Kevin Hannon, Independent Consultant
- Clustering of Search Results With and Without Taxonomies
- Raul Valdez-Perez, Vivisimo, Inc.
6An Upcoming Event
- Joint Workshop with SICoP on Multiple Taxonomies, April 28th (continued)
- Semantics, Ontologies, and the Semantic Web
- Leo Obrst, The MITRE Corporation
- How to Create Many Taxonomies That Integrate Into a Single Enterprise-Wide Taxonomy
- Denise Bedford, The World Bank
- Ontology Overview
- Adam Pease, Independent Consultant
- Issues in Negotiating Multiple Semantic Models
- LeeEllen Friedland, The MITRE Corporation
- Accessibility, Usability, and Preservation of Government Information
- Eliot Christian, USGS and Chair, Categorization of Government Information Working Group of the Interagency Committee on Government Information
- Open Dialogue
- Steven Newcomb, Coolheads Consulting
7A Past Event
- SICoP Monthly Meeting 1, April 14th, Army CIO's Office, Crystal City, Virginia
- Part 1: Community Business
- Old Business
- Minutes and Charter
- Emerging Products
- White Paper On Implementing the Semantic Web
- Module 1: Harnessing the Power of Information Semantics (Jie-Hong Morrison, State Department)
- Module 2: Exploring the Business Value of Semantic Interoperability (Irene Polikoff, TopQuadrant)
- Module 3: Roadmap for Operationalizing the Semantic Web (Michael Daconta, Smart Data Associates) (Slides 13-14)
- Army Knowledge Management Conference, August 31-September 2nd, Semantic Web Track (need speakers).
Posted at http://web-services.gov, Past Meetings and Presentations, April 14th
8A Past Event
- SICoP Monthly Meeting 1, April 14th, Army CIO's Office, Crystal City, Virginia (continued)
- Part 2: Building Shared Understanding
- Ontologies for Semantically Interoperable Systems, Leo Obrst, The MITRE Corporation (deferred to the next meeting) (Slides 9-12)
- A Data and Information Reference Model (DRM) Registry and Repository Pilot, Brand Niemann, US EPA (deferred to the next meeting)
- Common Upper Ontology for Cross-Domain Semantic Interoperability, Jim Schoening, The U.S. Army Communications Electronics Command
- Part 3: Launching/Building the Supported Community of Practice
- Proposed CoP Development Process, Rick Morris, US Army CIO Office
- Facilitated Discussion, Rick Morris and Brand Niemann, Co-Chairs
9Tightness of Coupling & Semantic Explicitness
[Figure: technologies arranged by Looseness of Coupling, from "Implicit, TIGHT" (Local) to "Explicit, Loose" (Far), against Semantics Explicitness. Scope labels: Data, Application, Enterprise, Community. Annotations: "From Synchronous Interaction to Asynchronous Communication" and "Performance = k / Integration_Flexibility".]
Labels in the figure, listed from the loose end toward the tight end: Modal Policies; Internet; Semantic Mappings; Semantic Brokers; OWL-S; Agent Programming; RDF/S, OWL; Peer-to-peer; Web Services (UDDI, WSDL); Web Services (SOAP); Applets; XML, XML Schema; N-Tier Architecture; EAI; Workflow; Ontologies; Same Intranet; Conceptual Models; Middleware; Web; Data Marts; Same Wide Area Network; Client-Server; Data Warehouses; Same Local Area Network; Federated DBs; Distributed Systems; OOP; Systems of Systems; Same OS; Same DBMS; Same Address Space; Same CPU; Linking; Same Programming Language; Compiling; Same Process Space; 1 System, Small Set of Developers.
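10Dimensions of Interoperability & Integration
[Figure: levels of interoperability plotted against kinds of integration on an Interoperability Scale from 0 to 100, with the annotation "Our interest lies here".]
6 Levels of Interoperability, labels as they appear in the figure: Community, Enterprise, System, Application, Component, Object, Data
3 Kinds of Integration: Semantic, Syntactic, Structural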
11Ontology Spectrum One View
strong semantics
weak semantics
12Ontology Spectrum One View
[Figure: ontology spectrum, running from strong semantics (top) to weak semantics (bottom).]
- Logical Theory: "Is Disjoint Subclass of, with transitivity property"
- Conceptual Model: "Is Subclass of"
- Thesaurus: "Has Narrower Meaning Than"
- Taxonomy: "Is Sub-Classification of"
Languages and models placed along the spectrum, strongest first: Modal Logic; First Order Logic; Description Logic; DAML+OIL, OWL; UML; RDF/S; XTM; Extended ER; ER; DB Schemas, XML Schema; Relational Model, XML.
13The Smart Data Enterprise
Figure 2. Developer's Perspective on Data: To the application developer, the data evolution timeline is viewed through the correlation of programming paradigms with the relation of data and code. From "Designing the Smart-Data Enterprise: Get prepared for the 10 ways that semantic computing will impact enterprise IT," by Michael C. Daconta, posted November 28, 2003, Enterprise Architect Magazine.
14The Smart Data Enterprise
Figure 3. The Smart Data Continuum: Data has progressed through four stages of increasing intelligence. (Reprinted with permission from The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management, John Wiley & Sons, 2003.) From "Designing the Smart-Data Enterprise: Get prepared for the 10 ways that semantic computing will impact enterprise IT," by Michael C. Daconta, posted November 28, 2003, Enterprise Architect Magazine.
15Abstract
- The history and broader context of this work.
- See Section 1.
- The eGov Act of 2002 has two sections (207 and 212) which require more structure and interoperability for government data and information, and work has begun in several committees and communities of practice.
- See Section 2 (just a few highlights).
- The new Semantic Web standards and technologies provide a way to accomplish the purposes of the eGov Act of 2002 and the FEA Data and Information Reference Model Data Management Strategy.
- See Section 3 (will skip over for this group).
- The work on repurposing the Statistical Abstract of the United States, 2003, into a DRM Registry and Repository illustrates how a number of objectives can be accomplished at the same time, including the highest priority of the CIO Council's Architecture and Infrastructure Committee, namely intergovernmental exchange of data and information.
- See Section 4 (just a few highlights).
- The additional pilots underway are outlined.
- See Section 5.
16Overview
- 1. Introduction (slides 17-19).
- 2. eGovernment Drivers: The eGov Act of 2002 and the FEA Data and Information Reference Model (DRM) (slides 20-32).
- 3. Semantic Technologies for eGovernment (slides 33-49).
- 4. Repurposing the Statistical Abstract of the United States, 2003, Into a DRM Registry and Repository (slides 50-72).
- 5. Additional Pilots (slides 73-74).
171. Introduction
- Repurposing of large documents with mixed content (text, tables, graphics, etc.) into XML content collections began with The Statistical Abstract of the United States (1999 Edition) as part of the FedStats.Net project to build a distributed network of statistical data and information using new XML standards and technology.
- The Statistical Abstract of the United States was considered to be one of the best examples of "manual aggregation of government information" (from some 200 programs across about 70 agencies) that would benefit from a distributed XML-based content network that would leave the content in the hands of its originators and create a more "living document".
- This work was recognized by OMB Associate Director for Information Technology and E-Government, Mark Forman, and the Quad Council with a Special Award for Innovation in the 2002 CIO Showcase of Excellence for the use of XML in a distributed content network (renamed FedGov) and use of VoiceXML in providing universal access to emergency response information.
181. Introduction
- More recently, the eGov Act of 2002's provisions for an Intergovernmental Committee on Government Information (ICGI) and Data Integration Pilots, the Federal Enterprise Architecture's Data and Information Reference Model (DRM) and its Data Management Strategy, and the focus in the CIO Council's Architecture and Infrastructure Committee on Intergovernmental Data Exchange have all been tied together in a new pilot that simultaneously accomplishes multiple objectives (see next slide).
- This "Smart Data Enterprise" approach came from the Semantic Technologies for eGov Conference, September 8, 2003, at the White House Conference Center (in which the EPA CIO and her staff participated), and continues in the new CIO Council's Semantic Interoperability (Web Services) Community of Practice (SICoP) (see subsequent slides).
191. Introduction
- (1) Repurpose government data and information into structured documents using new XML-based standards and technologies that facilitate reuse and exchange.
- (2) Repurpose the data and information so that it can be readily decomposed into XML fragments (for text and tables) and RDF metadata (for graphics) that can be stored and referenced in a database and can in turn be repurposed into new documents that provide additional user-defined views of the data and information.
- (3) Organize and categorize the repurposed data and information using taxonomies and even ontologies in semantic registries and repositories.
- (4) Use "XML data islands", RDF, and OWL to add metadata, interoperability, and semantic meaning to data and information to be reused and exchanged.
- (5) Standardize the data element and XML tag names in a DRM registry and repository.
- (6) Share these results with others that are working on Semantic Web and Technology Applications for eGovernment.
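Objective (2) above can be illustrated with a minimal Python sketch. It is not the pilot's implementation: the sample document, its element names, the `fragments` table, and the choice of SQLite are all assumptions made only to show the idea of storing XML fragments by reference and reassembling them into a new view.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical source document; element names and content are illustrative only.
SOURCE = """<Document>
  <Section id="s1"><Text>Narrative text for section one.</Text></Section>
  <Section id="s2"><Table><Row><Year>2000</Year></Row></Table></Section>
</Document>"""


def store_fragments(xml_text, conn):
    """Split a document into per-section XML fragments and store each by id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fragments (id TEXT PRIMARY KEY, xml TEXT)"
    )
    for section in ET.fromstring(xml_text).findall("Section"):
        fragment = ET.tostring(section, encoding="unicode").strip()
        conn.execute(
            "INSERT INTO fragments VALUES (?, ?)", (section.get("id"), fragment)
        )


def build_view(conn, ids):
    """Assemble a new document (a user-defined view) from stored fragments."""
    view = ET.Element("View")
    for frag_id in ids:
        row = conn.execute(
            "SELECT xml FROM fragments WHERE id = ?", (frag_id,)
        ).fetchone()
        view.append(ET.fromstring(row[0]))
    return ET.tostring(view, encoding="unicode")


conn = sqlite3.connect(":memory:")
store_fragments(SOURCE, conn)
table_only = build_view(conn, ["s2"])  # a view containing only the table section
```

Because each fragment is stored under its own identifier, the same fragment can appear in any number of derived documents without being copied by hand.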
202. eGovernment Drivers
- The eGov Act of 2002
- SEC. 207. ACCESSIBILITY, USABILITY, AND PRESERVATION OF GOVERNMENT INFORMATION.
- (a) PURPOSE. The purpose of this section is to improve the methods by which Government information, including information on the Internet, is organized, preserved, and made accessible to the public.
- (b) DEFINITIONS. In this section, the term
- (1) "Committee" means the Interagency Committee on Government Information established under subsection (c); and
- (2) "directory" means a taxonomy of subjects linked to websites that
- (A) organizes Government information on the Internet according to subject matter; and
- (B) may be created with the participation of human editors.
212. eGovernment Drivers
- The eGov Act of 2002 (continued)
- SEC. 207. ACCESSIBILITY, USABILITY, AND PRESERVATION OF GOVERNMENT INFORMATION.
- (d) CATEGORIZING OF INFORMATION.
- (1) COMMITTEE FUNCTIONS. Not later than 2 years after the date of enactment of this Act, the Committee shall submit recommendations to the Director on
- (A) the adoption of standards, which are open to the maximum extent feasible, to enable the organization and categorization of Government information
- (i) in a way that is searchable electronically, including by searchable identifiers; and
- (ii) in ways that are interoperable across agencies;
- (B) the definition of categories of Government information which should be classified under the standards; and
- (C) determining priorities and developing schedules for the initial implementation of the standards by agencies.
Note: Received the 2002 CIO Council Showcase of Excellence Special Innovation Award for XML Web Services (VoiceXML and the FedGov Content Network) in March 2002.
222. eGovernment Drivers
- The eGov Act of 2002 (continued)
- SEC. 212. INTEGRATED REPORTING STUDY AND PILOT PROJECTS.
- (a) PURPOSES. The purposes of this section are to
- (1) enhance the interoperability of Federal information systems;
- (2) assist the public, including the regulated community, in electronically submitting information to agencies under Federal requirements, by reducing the burden of duplicate collection and ensuring the accuracy of submitted information; and
- (3) enable any person to integrate and obtain similar information held by 1 or more agencies under 1 or more Federal requirements without violating the privacy rights of an individual.
232. eGovernment Drivers
- The eGov Act of 2002 (continued)
- SEC. 212. INTEGRATED REPORTING STUDY AND PILOT PROJECTS.
- (d) PILOT PROJECTS TO ENCOURAGE INTEGRATED COLLECTION AND MANAGEMENT OF DATA AND INTEROPERABILITY OF FEDERAL INFORMATION SYSTEMS.
- (1) IN GENERAL. In order to provide input to the study under subsection (c), the Director shall designate, in consultation with agencies, a series of no more than 5 pilot projects that integrate data elements. The Director shall consult with agencies, the regulated community, public interest organizations, and the public on the implementation of the pilot projects.
- (2) GOALS OF PILOT PROJECTS.
- (A) IN GENERAL. Each goal described under subparagraph (B) shall be addressed by at least 1 pilot project each.
- (B) GOALS. The goals under this paragraph are to
- (i) reduce information collection burdens by eliminating duplicative data elements within 2 or more reporting requirements;
- (ii) create interoperability between or among public databases managed by 2 or more agencies using technologies and techniques that facilitate public access; and
- (iii) develop, or enable the development of, software to reduce errors in electronically submitted information.
242. eGovernment Drivers
- The Federal Enterprise Architecture (FEA) Data and Information Reference Model (DRM)
- Volume 1: Bob Haycock, OMB Chief Architect, will soon release it with guidance to the agencies.
- The E-Government Act 2002, Section 207, Interagency Committee on Government Information, will use the top two layers of the DRM structure for categorization of government information (see next slide).
- The E-Government Act 2002, Section 212, calls for a series of no more than 5 pilot projects that integrate data elements to encourage integrated collection and management of data and interoperability of Federal information systems.
- Data Management Strategy: In process, with a draft to be released soon.
- Have several critiques of ISO 11179 to improve the DRM Model, including the suggested use of the Meta Object Facility (MOF) from the Object Management Group (OMG) by MetaMatrix (see slide 16).
- Volumes 2-4: To Be Released by Fall 2004 (see slides 17-19).
- DRM business context, DRM information exchange, and DRM data elements.
25The Current DRM Model
- A model for discovery of information
- Context and classification.
- To determine available packages and elements.
- A model for exchange of information
- Information packages, built from common data elements.
- Sharing mechanism.
- A model for representation of information
- Data elements defined in standard way.
ISO 11179
26Expanding the DRM Model
DRM Model
MetaMatrix Model
- MetaMatrix vision
- Generic classification to tag metadata with context
- vs. 2-level context.
- Packages built from complex datatypes and deployable for exchange or data access
- vs. exchange-only packaging of ISO 11179 data elements.
- Formal datatype model
- vs. more conceptual ISO 11179 model.
- Formal reference information to add semantic value to data definitions
- vs. nothing.
BUSINESS CONTEXT
CLASSIFICATION
Subject Area
Context
Super Type
Category
BUSINESS DATA FLOW
PACKAGE
Virtual Database
Exchange Package
Info Exch Package
INSTANCE
Virtual
Transform
Physical
TYPE
DATA ELEMENT
Schema/Association
ISO 11179
Complex Datatype
Data Object
Abstract Datatype
Data Property
Simple Datatype
Data Representation
REFERENCE
Glossary
Thesaurus
Bibliography
272. eGovernment Drivers
282. eGovernment Drivers
292. eGovernment Drivers
302. eGovernment Drivers
- The FEA DRM Data Management Strategy, Business Driver 4: Resolve Data Semantics Issues That Impede Community of Practice Work, Brand Niemann and Ken Gill
- Introduction to Data Semantics.
- Domain Data Harmonization Strategy.
- Data Harmonization Guiding Principles (10).
- Global Justice Information Sharing Initiative (Global) Example.
- Increased Collaboration by Means of and with "Smart Data" (Daconta's Declaration of Data Independence).
- Recommendations.
Note: See http://web-services.gov for details.
312. eGovernment Drivers
- The FEA DRM has been and currently is the object of a series of pilot projects and collaborative work within the Communities of Practice
- Open GIS Consortium (OGC)
- Information Communities and Semantics WG (ICS WG)
- http://www.opengis.org/groups/?iid=50
- Sustainable Intergovernmental Network Exchange (SINE): Global Justice, Environmental Information (EPA), and Health IT Sharing (Health)
- Collaborative Work Environment
- http://sine.cim3.net/
- Intelligence Community Metadata Working Group (IC MWG)
- http://www.xml.saic.com/icml/
- CIO Council's (Best Practices Committee) Knowledge Management Working Group (KM.GOV)
- Semantic (Web Services) Interoperability Community of Practice (SICoP)
- See http://km.gov and http://web-services.gov/
322. eGovernment Drivers
- The FEA DRM has been and currently is the object of a series of pilot projects and collaborative work within the Communities of Practice (continued)
- E-Gov SmartServices
- To join the group send an email to eGov_SmartServices-subscribe_at_yahoogroups.com with empty Subject and Body. You will then receive an email with a web link where you can select the subscription option.
- Open International Forum on Business Ontology
- ONTOLOG - collaborative work environment
- http://ontolog.cim3.net/ (April 22nd presentation)
- Semantic Technologies for E-Government, September 8, 2003, White House Conference Center
- http://www.topquadrant.com/conferences/tq_proceedings.htm
- 2nd Semantic Technologies for E-Government, September 8, 2004 (tentative).
- University of Maryland MINDLab (Professor Jim Hendler) and TopQuadrant (Ralph Hodgson)
- http://owl.mindswap.org/ and http://www.topquadrant.com/
- TopMIND Tutorials with Government Data Examples, March 22-25, 2004
- http://www.topquadrant.com/seminars/topmind.htm
333. Semantic Technologies for eGovernment
- Web-Enabled Government 2004 Conference and Exhibition, Session 2-4, February 4th, 2004: Understanding Semantic Web Technology, by Professor Jim Hendler and Brand Niemann
- (1) Tree of Knowledge Technologies and The Semantic Technology Layer Cake
- (2) Where We Are
- (3) Emerging Vendors Landscape: Semantic Integration
- (4) Semantic Technologies and Web Services
- (5) The First Site on the Semantic Web
- (6) Taxonomy
- (7) Topic Maps
- (8) RDF and Ontology Components
- (9) RDF Syntax and Validator
- (10) OWL Syntax and Functionality
- (11) Some Educational Resources
Note: Based on TopMIND Tutorials, November 3-4 and December 3-4, 2003
343. Semantic Technologies for eGovernment
- Jim Hendler is a Professor at the University of
Maryland and the Director of Semantic Web and
Agent Technology at the Maryland Information and
Network Dynamics Laboratory. He holds joint
appointments in the Department of Computer
Science, the Institute for Advanced Computer
Studies and the Institute for Systems Research,
and he is also an affiliate of the Electrical
Engineering Department. He has authored close to
150 technical papers in the areas of artificial
intelligence, robotics, agent-based computing and
high performance processing.
- Hendler was the recipient of a 1995 Fulbright Foundation Fellowship, is a member of the US Air Force Science Advisory Board, and is a Fellow of the American Association for Artificial Intelligence. As Chief Scientist and Program Manager at DARPA for the DAML program, he has been one of the major drivers in the creation of the Semantic Web, and continues to be a prominent player in the W3C's Semantic Web Activity.
35(1) Tree of Knowledge Technologies
Content Management Languages
Semantic Technology Languages
Process Knowledge Languages
AI Knowledge Representation
Software Modeling Languages
36(1) The Semantic Technology Layer Cake
Source: Dieter Fensel
37(2) Where We Are
We Are Here
38(3) Emerging Vendors Landscape Semantic
Integration
[Figure: vendor landscape chart. Vertical axis: Expressivity and Semantic Power, with levels XML, RDF, and OWL. Horizontal axis: Enterprise Support, with labels covering Data and Schema Management, Validation, Run-time Engine, and Integration and Orchestration.]
Legend (Current Support / Primary Strength): S = Structured information, U = Unstructured information, SU = Supports both.
Vendors shown: Ontoprise, Network Inference, enLeague, Unicorn, Ontology Works, Miosoft, Celcorp, Modulant, Contivo, Vitria, MetaMatrix, SchemaLogic, IGS.
Source: Irene Polikoff, TopQuadrant, "Positioning Semantic Technologies: The Emerging Vendor Landscape," September 8, 2003.
39(4) Semantic Technologies and Web Services
Semantic Web Services
Enterprise Ontology and Web Services Registry
Dynamic Resources
Semantic Web Services
Web Services
Static Resources
WWW
Semantic Web
Source: Derived in part from two separate presentations at the Web Services One Conference 2002 by Dieter Fensel and Dragan Sretenovic.
Interoperable Syntax
Interoperable Semantics
40(5) The First Site on the Semantic Web
http://owl.mindswap.org
PhotoStuff Image Annotation Tool with OWL
41(6) Taxonomy
Goals for enterprise taxonomies
Regardless of end goals, look to a future where taxonomies interoperate (domains connect).
Expect new stakeholders to take an interest but have their own viewpoints.
Technology Recommendation: RDF(S)
From Tim Berners-Lee, ISWC 2003
42(6) What is a Taxonomy?
- A taxonomy is a model of knowledge organized as a hierarchical arrangement (tree structure) of concepts.
- Parent nodes denote more general ideas than their children.
A
B
43(6) Types of Taxonomy
- A taxonomy can be
- A classification hierarchy, e.g., Natural Taxonomy
- Unique Beginner (plant) -> Life-Form (bush) -> Generic (rose) -> Specific (hybrid tea) -> Varietal (Peace)
- A part hierarchy (Meronomy)
- A category hierarchy
- Taxonomies can intersect; intersection means there are different relationships at work.
Reference: D. A. Cruse, Lexical Semantics, Cambridge University Press, 1986
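The classification hierarchy on this slide can be written down as a small tree. The Python sketch below is only an illustration: the `Node` class and `path_to` helper are hypothetical names, and the single branch shown is the plant-to-Peace example from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One concept in a taxonomy; children are narrower than their parent."""
    rank: str
    name: str
    children: list = field(default_factory=list)

    def add(self, rank, name):
        child = Node(rank, name)
        self.children.append(child)
        return child


def path_to(node, name, trail=()):
    """Return the chain of concepts from the root down to `name`, or None."""
    trail = trail + (node.name,)
    if node.name == name:
        return trail
    for child in node.children:
        found = path_to(child, name, trail)
        if found:
            return found
    return None


plant = Node("Unique Beginner", "plant")
rose = plant.add("Life-Form", "bush").add("Generic", "rose")
rose.add("Specific", "hybrid tea").add("Varietal", "Peace")

lineage = path_to(plant, "Peace")
```

Walking from the root to a leaf reads off the "more general to more specific" ordering that defines the hierarchy.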
44(6) topSAIL/tdf Taxonomy Development Framework: A five-step method for taxonomy development
1. Focus
- What is the taxonomy for?
- What business challenges will it overcome?
- What results will it achieve?
- How to measure stakeholder benefit?
2. Analysis
- What is the context for the taxonomy?
- What are the types and sources of knowledge?
- How does knowledge map to processes?
3. Design
- What types of taxonomy concepts are needed?
- What to do first?
- What system capabilities are needed?
- What will be the impact?
- Is the taxonomy design correct, complete and consistent?
4. Construct
- Have we enough content mapped?
- How to connect taxonomies to content?
- How to integrate with IT systems?
5. Deploy
- How do we ensure there will be feedback for assessment?
- Have we accomplished set objectives?
- What should be done next?
45(7) Topic Maps
The TAO of Topic Maps
- Topic
- The entry in a topic map that refers to a subject in the real world.
- Topic Maps make a Plato-distinction between Things in the Real World (Subjects) and Things in the Topic Map world (Topics).
- Association
- Linkages between Topics.
- "Tosca was written by Puccini."
- Occurrence
- Topics occur in resources.
- Resources are indicated by, e.g., URLs.
- Types of Occurrence: mention, illustration, article, etc.
Note: See http://www.giuseppeverdi.it/verdi. Also see http://www.coolheads.com/egov for merging of topic maps.
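The three TAO building blocks can be sketched as plain data classes. This is an illustrative Python model only, not the XTM data model or any topic map library: the class names, the `members` dictionary, the role names, and the example.org URL are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Occurrence:
    kind: str       # e.g. "mention", "illustration", "article"
    resource: str   # where the topic occurs, e.g. a URL


@dataclass
class Topic:
    name: str
    occurrences: list = field(default_factory=list)


@dataclass
class Association:
    kind: str
    members: dict   # role name -> Topic (role names are hypothetical)


def related_topics(topic, associations):
    """Names of topics linked to `topic` through any association."""
    related = []
    for assoc in associations:
        members = list(assoc.members.values())
        if any(m is topic for m in members):
            related.extend(m.name for m in members if m is not topic)
    return related


tosca = Topic("Tosca")
puccini = Topic("Puccini")
# Hypothetical resource URL, used only to show an occurrence.
tosca.occurrences.append(Occurrence("article", "http://www.example.org/tosca-article"))
written_by = Association("written by", {"work": tosca, "composer": puccini})
```

Here `related_topics(tosca, [written_by])` follows the association "Tosca was written by Puccini" in either direction, which is the navigation a topic map is meant to support.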
46(8) RDF and Ontology Components
Key Ontology Components
RDF Triple Components
The company sells batteries.
depiction
Image
knows
Person birthdate date Gender char
Object
Predicate
published
Subject
Resource
Predicate
Literal
works for
is-A
leads
Resource Description Framework
Leader
Organization
URI
Literal
Source: The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management, Wiley Technology Publishing, June 2003.
Property or Association
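The subject, predicate, object structure of the slide's sentence "The company sells batteries." can be shown with a few Python classes. This is a sketch under stated assumptions, not an RDF library: the class names, the `EX` namespace, the resource names, and the `name` property are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Resource:
    uri: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: Resource
    predicate: Resource
    object: Union[Resource, Literal]  # the object may be a resource or a literal


EX = "http://www.example.org/"  # hypothetical namespace

company = Resource(EX + "TheCompany")
graph = {
    # "The company sells batteries."
    Triple(company, Resource(EX + "sells"), Resource(EX + "Batteries")),
    # An illustrative literal-valued statement about the same subject.
    Triple(company, Resource(EX + "name"), Literal("The company")),
}


def objects(graph, subject, predicate_uri):
    """All objects asserted for a given subject and predicate."""
    return {
        t.object
        for t in graph
        if t.subject == subject and t.predicate.uri == predicate_uri
    }
```

Subjects and predicates are always identified by URI; only the object position can hold a literal value, which is why `Triple.object` accepts either type.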
47(9) RDF Syntax and Validator
Graph of the Data Model
Syntax
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://www.example.org#">
  <rdf:Description rdf:ID="Jen">
    <rdf:type>
      <rdf:Description rdf:about="#Person"/>
    </rdf:type>
    <hasName>Jen Golbeck</hasName>
    <hasJob>
      <rdf:Description rdf:about="#Job1">
        <employer>George Washington University</employer>
        <position>Adjunct Professor</position>
        <hired>July 2001</hired>
        <salary>1</salary>
        <hoursPerWeek>15</hoursPerWeek>
      </rdf:Description>
    </hasJob>
    ...
http://www.w3.org/RDF/Validator
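To show how nested RDF/XML of this kind flattens into triples, here is a small Python sketch using only the standard library. It handles just the `rdf:Description` nesting pattern used on this slide and is not a general RDF/XML parser (a dedicated RDF library would normally be used). The trimmed sample document, the `#`-terminated example namespace, and the `hasName` element name are assumptions.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://www.example.org#"  # assumed default namespace

# Trimmed, well-formed version of the slide's example (an assumption).
RDF_XML = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns="{EX_NS}">
  <rdf:Description rdf:ID="Jen">
    <hasName>Jen Golbeck</hasName>
    <hasJob>
      <rdf:Description rdf:about="#Job1">
        <employer>George Washington University</employer>
        <position>Adjunct Professor</position>
      </rdf:Description>
    </hasJob>
  </rdf:Description>
</rdf:RDF>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.split("}", 1)[-1]


def node_id(description):
    """Identifier of a described node, from rdf:ID or rdf:about."""
    rdf_id = description.get(f"{{{RDF_NS}}}ID")
    if rdf_id is not None:
        return "#" + rdf_id
    return description.get(f"{{{RDF_NS}}}about")


def triples(description):
    """Yield (subject, predicate, object) for a description and its nested nodes."""
    subject = node_id(description)
    for prop in description:
        nested = prop.find(f"{{{RDF_NS}}}Description")
        if nested is not None:
            yield (subject, local_name(prop.tag), node_id(nested))
            yield from triples(nested)
        else:
            yield (subject, local_name(prop.tag), (prop.text or "").strip())


root = ET.fromstring(RDF_XML)
all_triples = [
    t for d in root.findall(f"{{{RDF_NS}}}Description") for t in triples(d)
]
```

Each nested `rdf:Description` becomes both the object of its parent property and the subject of its own statements, which is how the XML tree maps onto the graph of the data model.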
48(10) OWL Syntax and Functionality
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns="http://www.example.org#"
    xml:base="http://www.example.org/">

  <owl:Class rdf:ID="Person"/>
  <owl:Class rdf:ID="Employee">
    <rdfs:subClassOf rdf:resource="#Person"/>
    <rdfs:label>Our Cool Employee Class</rdfs:label>
  </owl:Class>
  <owl:Class rdf:ID="Civil_Servant">
    <rdfs:subClassOf rdf:resource="#Employee"/>
    <rdfs:label>Our Cool Civil Servant Class</rdfs:label>
  </owl:Class>
</rdf:RDF>
- Applications for OWL
- Markup for web pages and other web-based media.
- Raw Data Sharing.
- Web Services.
- Media Markup
- Google and other keyword searches are excellent because they can work with text.
- Not likely to be much improved by semantic web.
- Image searches are much worse than text searches.
- No way to know what is happening in an image, what is in it, in what context it was taken, or who is doing what.
- MP3 searches.
- "I want that song that was in the Mitsubishi commercial."
- Video search.
- Challenges
- Trust and Provenance.
- Visualization.
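The subclass declarations in the OWL example on this slide imply more than they state: a Civil_Servant is an Employee, and therefore also a Person. The Python sketch below illustrates only that one transitive inference. It is not an OWL reasoner, and the dictionary representation (one superclass per class) and function names are simplifying assumptions.

```python
# Direct rdfs:subClassOf statements from the slide, as child -> parent.
SUBCLASS_OF = {
    "Employee": "Person",
    "Civil_Servant": "Employee",
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """All classes reachable by following subClassOf links upward."""
    found = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        found.append(cls)
    return found


def is_subclass_of(cls, other, subclass_of=SUBCLASS_OF):
    """True if `cls` is `other` or inherits from it, directly or transitively."""
    return cls == other or other in superclasses(cls, subclass_of)
```

A query for everything that is a Person can therefore return civil servants even though no statement links Civil_Servant to Person directly.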
49(11) Some Educational Resources
Dieter Fensel, Ontologies: A Silver Bullet for Knowledge Management and Electronic Commerce, Springer-Verlag, 2001
Johan Hjelm, Creating the Semantic Web with RDF, John Wiley, 2001
John Davies, Dieter Fensel, and Frank van Harmelen, Towards the Semantic Web: Ontology-Driven Knowledge Management, John Wiley, 2002
Dieter Fensel, Wolfgang Wahlster, Henry Lieberman, James Hendler (Eds.), Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential, MIT Press, 2002
Michael C. Daconta, Leo J. Obrst, Kevin T. Smith, The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management, John Wiley, 2003
Vladimir Geroimenko (Editor), Chaomei Chen (Editor), Visualizing the Semantic Web, Springer-Verlag, 2003
M. Klein and B. Omelayenko (eds.), Knowledge Transformation for the Semantic Web, Vol. 95, Frontiers in Artificial Intelligence and Applications, IOS Press, 2003
Shelley Powers, Practical RDF, O'Reilly, 2003
504. Repurposing the Statistical Abstract of the
United States, 2003, Into a DRM Registry and
Repository
- Overview
- Steps in Repurposing the Data Tables
- (1) Table in Adobe Reader 6.0.
- (2) Define Basic XML Tags in XMLSPY 2004.
- (3) Define XML Tags for Data Element Names in XMLSPY 2004.
- (4) Markup the Table in XMLSPY 2004.
- (5) Grid View in XMLSPY 2004.
- (6) XML Table Database in Excel 2002.
- (7) Create the HTML Interface.
- (8) HTML Interface in Browser.
- (9) XML Table Database in Browser.
- Some Features of the DRM Registry and Repository
- Note that it is embedded in the document itself,
not separate!
514. Repurposing the Statistical Abstract of the
United States, 2003, Into a DRM Registry and
Repository
- Overview
- The methodology for repurposing the Statistical Abstract, 2003, documents (45 PDF files/14.2 MB) into a structured XML content collection was presented previously.
- See Past Meetings and Presentations at http://web-services.gov, November 18-19, 2003, Website Content Management for Government Conference, Invited Presentation on November 19th on "Repurposing Documents Into Semantic Web Services and Networks" (EPA Enterprise Integration Portal/Data Exchange Network Pilot), Doubletree Hotel, Arlington, VA. Also see Folio-to-XML Conversion and Webinar.
- Current plans call for the completion of the repurposing of this document and continued work on state of the environment and national and community indicator reports.
52Step 1. Table in Adobe Reader 6.0
Text Select Tool: Highlight Table, Edit > Copy, Edit > Paste to XMLSPY 2004
53Step 2. Define Basic XML Tags in XMLSPY 2004
- <TableTitle>
- <TableHeadNote>
- <TableBody>
- <TableFootnote>
- <TableSource>
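As a rough illustration of how these five basic tags frame a table, the Python sketch below builds an empty table skeleton with the standard library. The `Table` wrapper element, the `make_table` helper, and the placeholder title and source strings are assumptions; the slides do not show the actual wrapper or content.

```python
import xml.etree.ElementTree as ET

# The five basic tags named on this slide, in slide order.
BASIC_TAGS = ["TableTitle", "TableHeadNote", "TableBody", "TableFootnote", "TableSource"]


def make_table(parts):
    """Build a table element with one child per basic tag, in a fixed order."""
    table = ET.Element("Table")  # wrapper element name is an assumption
    for tag in BASIC_TAGS:
        ET.SubElement(table, tag).text = parts.get(tag, "")
    return table


# Placeholder values, not taken from the Statistical Abstract.
table = make_table({"TableTitle": "Example title", "TableSource": "Example source"})
xml_text = ET.tostring(table, encoding="unicode")
```

Fixing the order of the five parts means every repurposed table has the same outer shape, whatever its body contains.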
54Step 3. Define XML Tags for Data Element Names in
XMLSPY 2004
- Census Date (Year, Month Day)
- Resident Population (Number)
- Resident Population (Number Per Square Mile of Land Area)
- Resident Population Increase Over Preceding Census (Number)
- Resident Population Increase Over Preceding Census (Percent)
- Area (Square Miles) Total
- Area (Square Miles) Land
- Area (Square Miles) Water
- CensusDateYearMonthDay
- ResidentPopulationNumber
- ResidentPopulationPerSquareMileofLandArea
- ResidentPopulationIncreaseOverPrecedingCensusNumber
- ResidentPopulationIncreaseOverPrecedingCensusPercent
- AreaSquareMilesTotal
- AreaSquareMilesLand
- AreaSquareMilesWater
The heart of the DRM Registry and Repository for reuse!
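Most of the tag names on this slide follow mechanically from the data element names by dropping spaces and punctuation. The Python sketch below shows that rule; the `to_tag_name` function is hypothetical, and the slides do not say the names were generated this way (they may well have been typed by hand).

```python
import re


def to_tag_name(data_element_name):
    """Join the alphanumeric words of a data element name into one XML tag name."""
    words = re.findall(r"[A-Za-z0-9]+", data_element_name)
    tag = "".join(words)
    if not tag or tag[0].isdigit():
        # XML element names cannot be empty or start with a digit.
        raise ValueError(f"cannot derive an XML tag name from {data_element_name!r}")
    return tag


census_date_tag = to_tag_name("Census Date (Year, Month Day)")
area_total_tag = to_tag_name("Area (Square Miles) Total")
```

The rule is not a perfect match for the slide: applied to "Resident Population (Number Per Square Mile of Land Area)" it keeps the word "Number", whereas the slide's tag is ResidentPopulationPerSquareMileofLandArea. Differences like this are what harmonizing names in a registry is meant to catch.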
55Step 4. Markup the Table in XMLSPY 2004
Text View in XMLSPY 2004
56Step 4. Markup the Table in XMLSPY 2004 (continued)
Text View in XMLSPY 2004
57Step 5. Grid View in XMLSPY 2004 (like a spreadsheet!)
Highlight Grid Table, Edit > Copy as Structured Text, and Paste to Excel.
58Step 6. XML Table Database in Excel 2002
Highlight Table, Format > Column > AutoFit Selection. Also, spreadsheet-like data tables can be pasted into XMLSPY 2004.
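The copy-and-paste route from the XML grid into a spreadsheet can also be done programmatically. The Python sketch below turns an XML table body into CSV text that a spreadsheet can open. The `Row` element name, the `table_to_csv` helper, and the cell values are assumptions; the values are placeholders, not census figures.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical table body; "Row" and the placeholder values are illustrative only.
XML_TABLE = """<TableBody>
  <Row>
    <CensusDateYearMonthDay>date-1</CensusDateYearMonthDay>
    <ResidentPopulationNumber>value-1</ResidentPopulationNumber>
  </Row>
  <Row>
    <CensusDateYearMonthDay>date-2</CensusDateYearMonthDay>
    <ResidentPopulationNumber>value-2</ResidentPopulationNumber>
  </Row>
</TableBody>"""


def table_to_csv(xml_text):
    """Flatten rows of tagged cells into CSV, using the tag names as the header."""
    rows = ET.fromstring(xml_text).findall("Row")
    columns = [cell.tag for cell in rows[0]]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        writer.writerow([row.findtext(col, default="") for col in columns])
    return out.getvalue()


csv_text = table_to_csv(XML_TABLE)
```

Because the column headers come straight from the XML tag names, the spreadsheet view and the XML table database stay aligned on the same data element names.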
59Step 7. Create the HTML Interface
Navigation Functionality (non-XML)
Note two references to statabs2003no1.xml.
60Step 7. Create the HTML Interface (continued)
Data Element Names
XML Tag Names
Note this makes the XML table database
independent of the HTML presentation.
61Step 8. HTML Interface in Browser
Link to XML File
Navigation Buttons
Can easily browse through long tables.
62Step 9. XML Table Database in Browser
Can expand and collapse using + and -.
The heart of the DRM Registry and Repository
for interoperable exchange.
63Some Features of the DRM Registry and Repository
Taxonomy of Federal Statistical Data and
Information!
64Some Features of the DRM Registry and Repository
Detailed Table of Contents for Entire Document.
65Some Features of the DRM Registry and Repository
Detailed Table of Contents for Each Section.
66Some Features of the DRM Registry and Repository
Graphics can have RDF metadata.
67Some Features of the DRM Registry and Repository
Tables are structured data (copy to Excel) and
available in XML
68Some Features of the DRM Registry and Repository
Table copied to Excel from Browser
69Some Features of the DRM Registry and Repository
Search within just one chapter of the entire
document.
70Some Features of the DRM Registry and Repository
Better search than from conventional Internet
search engines.
71Some Features of the DRM Registry and Repository
Appendix III on Limitations of the Data (Data
Quality) for Major Databases!
72Some Features of the DRM Registry and Repository
Harmonization/Standardization of Data Element and
XML Tag Names
735. Additional Pilots
- "Where does the FEA go next?", Bob Haycock, Chief Architect, OMB, at the Chief Architects Forum, April 5, 2004
- Complete the DRM.
- Conduct DRM Community of Practice Pilots.
- Continue to develop and implement further DRM volumes and FEA Data Management Strategy.
- Etc.
745. Additional Pilots
- Census Bureau/FedStats (Statistical Abstract of the US)
- Lead original Line of Business (Data and Statistics), which was exempted, so it became a logical selection for a best practice pilot!
- National Indicator System and the Community Statistical System
- GAO, CEQ, Community Indicator Consortium, etc.
- Sustainable Intergovernmental Network Exchange (SINE)
- Global Justice, EPA, Health, etc.
- Intelligence Community Metadata Working Group (IC MWG)
- XML Enablement Strategy and Tool Evaluation.
- Componenttechnology.Org
- Proposals from participants in this Community of Practice to Populate the Service Grid with Services Components.
- Categorization of Government Information Working Group of the Interagency Committee on Government Information
- GSA Office of Intergovernmental Solutions (Susan Turnbull): Outreach to Involve State and Local Governments.
- University of Maryland MINDLab (Professor Jim Hendler) and TopQuadrant (Ralph Hodgson)
- Semantic Markup and Tools for Government Content (getting content ready for them!).