Title: XML Web Services: Data Standards Branch Training
Slide 1: XML Web Services: Data Standards Branch Training
- Brand Niemann
- XML Web Services Solutions Architect
- Office of Environmental Information
- US EPA
- August 1, 2002
Slide 2: Overview
- 1. Bringing XML to EPA Data Standards
- 1.1 Standards (Documents and Pilot Implementations)
- 1.2 Library (Old and New)
- 1.3 Registries and Repositories (EDR, etc.)
- 2. Bringing XML to EPA Networks
- 2.1 Nodes (XML and Content)
- 2.2 Networks (NEIEN and Others)
- 2.3 Registries and Repositories (Nodes and Networks)
- 3. Contact Information
Slide 3: 1. Bringing XML to EPA Data Standards
- Standards: a term loosely applied to any agreed-upon way of doing things. However, there is a big difference in how a standard has been developed and will be maintained, and often a big difference in who has agreed upon its contents.
- Accredited standards: developed and adopted as standards through an open consensus process, under the guidelines of national or international standards bodies (ISO, IEC, ANSI, etc.).
- Industry specifications: formalized industry practices, generally developed by a group within the industry (Web Offset Publications, etc.).
- De facto standards: usually developed or owned by a single group or company; they gain credibility through use by a critical mass of people (PostScript, Windows, etc.).
Slide 4: 1. Bringing XML to EPA Data Standards
- Business Case for XML (GML)
- Mark Forman (E-Gov 2002)
- Mark was asked about the reported redundancy in state-federal geospatial data activities. He responded that the states especially have complained about the costs involved, namely $10B total ($6B Federal and $4B State), and that about half of that ($5B) is wasted due to duplication of effort!
- Kim Nelson (GIS Day, November 8, 2001, and ORD Science Meeting, May 1-2, 2002)
- Everyone needs to think about how to geographically reference all of the data that we use and collect, so that we can share each other's resources. We have hundreds of geospatial data products and resources. We need to develop data collection standards which will enable us to link and cross-reference these and other newly acquired resources.
- http://www.envindicators.org/indicators/faq.htm
- Solution: organize by geography and add GML tags (see Section 1.3.5).
Slide 5: 1. Bringing XML to EPA Data Standards
- Understanding XML Standards, Chapter 19 in XML and Web Services Unleashed, Sams, February 2002, 814-845
- Definitions
- American Heritage Dictionary
- Noun: something, such as a practice or a product, that is widely recognized or employed, especially because of its excellence.
- Adjective: widely recognized or employed as a model of authority or excellence (a standard reference work).
- Etymology (origin and development of a word)
- Middle English, from the Old French estandard, rallying place, probably from Frankish standhard (standan, to stand).
- Some issues
- Marshmallow-soft, inaccurate term: what is really meant is initiative, application, or recommendation.
- Basically about one thing: getting agreement.
Slide 6: 1. Bringing XML to EPA Data Standards
- Understanding XML Standards, Chapter 19 in XML and Web Services Unleashed, Sams, February 2002, 814-845
- Vocabularies
- A set of agreed-upon language constructs that mean the same things to all parties using them.
- In many ways, vocabularies that are defined within user communities and have a well-defined mechanism for their maintenance are called standards!
- Somewhat controversial: many consider a standard to be one that has been in use by a large population for a given number of years, whereas others consider a standard to be a well-defined specification that addresses the needs of a wide user base.
- The widespread use of XML has resulted in hundreds of industry vocabularies, specifications, and standards.
Slide 7: 1. Bringing XML to EPA Data Standards
- Understanding XML Standards, Chapter 19 in XML and Web Services Unleashed, Sams, February 2002, 814-845
- The standards stack
- A common metaphor (more visual than logical) used to identify the wide set of specifications and standards that impact a particular technology segment and to show how they interrelate (diagrams later).
- Standards are judged by the process and organization that created them.
- Governments will always be the best place to establish a standard that can be enforced by law, regulation, and established guidelines of conduct.
Slide 8: 1. Bringing XML to EPA Data Standards
- Understanding XML Standards, Chapter 19 in XML and Web Services Unleashed, Sams, February 2002, 814-845
- Open Standards
- "Open" is the opposite of "proprietary" (closed to outside development and viewing, closed-minded, not customer-centric, and slow to change), a word many consider pejorative.
- Better: out in the open, open process, software that can be replaced, and software that plays well with other software.
Slide 9: 1. Bringing XML to EPA Data Standards
- Understanding XML Standards, Chapter 19 in XML and Web Services Unleashed, Sams, February 2002, 814-845
- The World Wide Web Consortium (W3C)
- Preeminent standards-setting body in the XML world, with 35 specifications in just 5 years. To say its word is the gold currency of the industry is an understatement.
- "Recommendation" is the W3C's non-politically charged word for "standard".
- Three central principles: interoperability, evolution, and decentralization.
Slide 10: 1. Bringing XML to EPA Data Standards
- Understanding XML Standards, Chapter 19 in XML and Web Services Unleashed, Sams, February 2002, 814-845
- Others
- The Organization for the Advancement of Structured Information Standards (OASIS): more like a community than an official standards body such as W3C; at least three implementations of a specification must be created before approval by the OASIS membership.
- International Organization for Standardization (ISO): very formal process and over 12,000 standards with 300,000 pages of documentation; has begun to specify XML-based standards that will surely be used for many years to come.
- Industry Consortia: Health Level 7 (HL7), Open GIS Consortium, etc.
- Birds of a Feather Vendor Groupings: Universal Description, Discovery, and Integration (UDDI) (a practical and quickly implemented central repository for Web Services components and descriptions).
- Individuals and Organizations: Microsoft, Sun Microsystems, etc.
Slide 11: 1. Bringing XML to EPA Data Standards
- Understanding XML Standards, Chapter 19 in XML and Web Services Unleashed, Sams, February 2002, 814-845
- The Standards Stack (like a stack of pancakes)
- The higher in the stack one goes, the more technology and specifications each layer depends on or references.
- Some aspects of XML specifications exhibit layering behavior, whereas others can be applied to multiple layers in the stack.
- The uses for XML fall into two different camps: message-oriented protocols (right side; span all) and document-oriented specifications (left side).
Slide 12: 1. Bringing XML to EPA Data Standards: The XML Standards Stack
Community Specifications
Business Process Layer
Services Layer
Security Aspect
Query Aspect
Presentation Aspect
Semantics Aspect
Messaging Layer
Transport Layer
XML Base Architecture
Slide 13: 1. Bringing XML to EPA Data Standards
- The XML Standards Stack: Layers
- XML Base Architecture: all specifications use XML (e.g. XML Schema).
- XML Transport Layer: uses HTTP, SMTP, and FTP for transport from place to place, but also BEEP (Blocks Extensible Exchange Protocol), etc.
- XML Messaging Layer: packaging XML documents for transmission (analogy to a postal envelope) (SOAP, the Simple Object Access Protocol, to become the W3C's XML Protocol).
- Services Layer: functionalities that can be accessed by machines in a distributed manner (WSDL, the Web Services Description Language).
- Process Layer: turning functionality into coordinated action and individual components into larger applications (various workflow specifications that even allow human interaction to occur at various points in the machine-to-machine dialogue).
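The postal-envelope analogy for the messaging layer can be illustrated with a short Python sketch that wraps an XML payload in a SOAP 1.1 Envelope and Body. The payload element names (FindLEPC, ZipCode) and the urn:example:lepc namespace are hypothetical and chosen only for illustration; they are not taken from any EPA service.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical application namespace, used only for this illustration.
APP_NS = "urn:example:lepc"


def wrap_in_envelope(payload: ET.Element) -> ET.Element:
    """Package an XML payload inside a SOAP Envelope/Body pair."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope


def build_request(zip_code: str) -> ET.Element:
    """Build a hypothetical lookup request carrying one ZIP code."""
    request = ET.Element(f"{{{APP_NS}}}FindLEPC")
    ET.SubElement(request, f"{{{APP_NS}}}ZipCode").text = zip_code
    return wrap_in_envelope(request)


if __name__ == "__main__":
    # Use a readable prefix for the SOAP namespace when serializing.
    ET.register_namespace("soap", SOAP_NS)
    print(ET.tostring(build_request("22181"), encoding="unicode"))
```

The envelope carries no knowledge of the payload, which is the point of the analogy: the transport layer below it (HTTP, SMTP, etc.) only moves the envelope, and the services layer above it decides what the contents mean.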
Slide 14: 1. Bringing XML to EPA Data Standards
- The XML Standards Stack: Aspects
- Presentation Aspect: how XML should be presented or modified in presentation for usability (XHTML, XForms, and SVG, Scalable Vector Graphics).
- Security Aspect: provides a level of protection for XML information (encryption, authentication, authorization and permission, and privacy).
- Query Aspect: assists in locating XML resources (tagging with metadata and retrieving).
- Semantics Aspect: helps apply meaning and context to XML documents (synchronizing XML vocabularies with other incompatible representations).
Slide 15: 1. Bringing XML to EPA Data Standards: XML Standards Stack Pyramid
Community Vocabularies
Community Vocabularies
Message-Oriented Protocols
Document-Oriented Specifications
XML Base Architecture
Slide 16: 1. Bringing XML to EPA Data Standards
- XML Standards Stack Pyramid
- Community Vocabularies Layer
- All the industry-specific implementations and problem-oriented specifications (where the rubber meets the road).
- How a specific user community plans to make use of XML, the specifics of data exchange, and often some of the first specifications to be developed.
- The number of community vocabularies is proliferating.
- Upside-down pyramid (relative numbers in each layer)
- From few (XML Base Architecture) to many (Community Vocabularies) specifications.
Slide 17: 1.1 Standards
- 1.1.1 Documents (e.g. Contact Standard)
- Repurpose: unstructured to structured.
- XML file for machine processing (Stylesheets: CSS and XSLT; Schema: XSD; Transformations: XSLT and XSL-FO) and storage in a registry and repository.
- 1.1.2 Pilot Implementations (e.g. W3C and OASIS process)
- EPA Local Emergency Planning Committee Database.
- EPA Facility Database.
- EPA Employee Directory.
- US Government Blue Pages.
- Commercial Qsent Comprehensive Directory.
- Other: FGDC Address Content Standard and OASIS DSML (Directory Services Markup Language), etc.
Slide 18: 1.1.1 Documents
- Structure
- Point of Contact
- Individual (7 data elements)
- Organization (1 data element)
- Affiliation Type (1 data element)
- Address
- Mailing Address (6 data elements)
- Geographic Address (6 data elements)
- Communication
- Telephonic (3 data elements)
- Electronic (2 data elements)
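As a rough illustration of repurposing this outline into a structured XML document, the following Python sketch builds an element skeleton from the groups and data-element counts listed above. The element and attribute names (Contact, PointOfContact, dataElements, and so on) are hypothetical; the slide gives only the counts, not the individual data elements, so none are filled in here.

```python
import xml.etree.ElementTree as ET

# Group -> {subgroup: number of data elements}, as listed on the slide.
# Tag spellings are assumptions made for this sketch.
CONTACT_STRUCTURE = {
    "PointOfContact": {"Individual": 7, "Organization": 1, "AffiliationType": 1},
    "Address": {"MailingAddress": 6, "GeographicAddress": 6},
    "Communication": {"Telephonic": 3, "Electronic": 2},
}


def build_skeleton() -> ET.Element:
    """Create one element per group and subgroup, recording the count."""
    root = ET.Element("Contact")  # hypothetical root element
    for group, subgroups in CONTACT_STRUCTURE.items():
        group_el = ET.SubElement(root, group)
        for name, count in subgroups.items():
            ET.SubElement(group_el, name, {"dataElements": str(count)})
    return root


def total_data_elements(root: ET.Element) -> int:
    """Sum the data-element counts recorded on the subgroup elements."""
    return sum(
        int(el.get("dataElements"))
        for el in root.iter()
        if el.get("dataElements") is not None
    )


if __name__ == "__main__":
    skeleton = build_skeleton()
    print(ET.tostring(skeleton, encoding="unicode"))
    print(total_data_elements(skeleton))
```

A skeleton like this is the kind of structure an XSD schema would then constrain.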
Slide 19: 1.1.1 Documents (PDF)
Slide 20: 1.1.1 Documents (Excel)
Slide 21: 1.1.1 Documents (XML Spy, Text View)
Slide 22: 1.1.1 Documents (XML Spy, Enhanced Grid View)
Slide 23: 1.1.2 Pilot Implementations
- EPA Local Emergency Planning Committee Database Data Elements
- XML and VoiceXML Web Services
- EPA Facility Database Data Elements
- XML Web Service (Locational Data Improvement Project)
- EPA Employee Directory Data Elements
- In process
- US Government Blue Pages Data Elements
- XML and VoiceXML Pilot Web Services
- Commercial Qsent Comprehensive Directory DTD/XSD
- XML Web Service
Slide 24: 1.1.2 Pilot Implementations
- EPA Local Emergency Planning Committee Database Data Elements
- LEPC Name
- Street
- Address2 (e.g. P.O. Box)
- City
- State
- Postal Code
- First Name
- Last Name
- Phone
- Email
- LEPC ID
- Edit Date
- Idmarplot (geo-reference from the LandView
MarPlot Mapping System)
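To suggest how a client might consume records like these from an XML Web Service, here is a minimal Python sketch that parses a response into dictionaries and filters by postal code. The response layout, the tag names (LEPCList, LEPC, LEPCName, PostalCode, LEPCID), and all sample values are invented for illustration; the actual output format of the pilot service is not shown in this deck.

```python
import xml.etree.ElementTree as ET

# Made-up sample response; tag names and values are placeholders.
SAMPLE_RESPONSE = """<LEPCList>
  <LEPC>
    <LEPCName>Example County LEPC</LEPCName>
    <City>Example City</City>
    <State>VA</State>
    <PostalCode>22181</PostalCode>
    <LEPCID>0001</LEPCID>
  </LEPC>
  <LEPC>
    <LEPCName>Sample Township LEPC</LEPCName>
    <City>Sampleville</City>
    <State>MD</State>
    <PostalCode>20850</PostalCode>
    <LEPCID>0002</LEPCID>
  </LEPC>
</LEPCList>"""


def parse_lepc_records(xml_text: str) -> list:
    """Turn each <LEPC> element into a dict of field name -> text."""
    root = ET.fromstring(xml_text)
    return [
        {field.tag: (field.text or "") for field in record}
        for record in root.findall("LEPC")
    ]


def find_by_postal_code(records: list, postal_code: str) -> list:
    """Return the records whose PostalCode matches exactly."""
    return [r for r in records if r.get("PostalCode") == postal_code]


if __name__ == "__main__":
    records = parse_lepc_records(SAMPLE_RESPONSE)
    for match in find_by_postal_code(records, "22181"):
        print(match["LEPCName"], match["State"])
```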
Slide 25: EPA Local Emergency Planning Committee Database
http://www.epa.gov/ceppo/lepclist.htm
Slide 26: EPA Local Emergency Planning Committee Database
http://130.11.53.73/lepc/FMPro?-dbLEPC.FP5-format-fmp_xmlzip_lepczip_code22181-find
Slide 27: 1.1.2 Pilot Implementations
- EPA Facility Database Data Elements
- EPA ID
- Name
- Street
- City
- State
- ZIP Code
- SIC Code
- State Code
- County Code
- Latitude
- Longitude
- Etc.
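Because the facility record carries Latitude and Longitude, it is a natural candidate for the "add GML tags" idea raised earlier in the deck. The Python sketch below attaches a GML 2 style Point with a coordinates element to a facility record. The Facility, EPAID, and Name tags, the sample values, and the longitude-then-latitude ordering are assumptions made for this illustration; the ordering in practice depends on the spatial reference system in use.

```python
import xml.etree.ElementTree as ET

# Open GIS Consortium GML namespace.
GML_NS = "http://www.opengis.net/gml"


def facility_to_xml(epa_id: str, name: str,
                    latitude: float, longitude: float) -> ET.Element:
    """Build a facility record with an embedded gml:Point."""
    facility = ET.Element("Facility")  # hypothetical wrapper element
    ET.SubElement(facility, "EPAID").text = epa_id
    ET.SubElement(facility, "Name").text = name
    point = ET.SubElement(facility, f"{{{GML_NS}}}Point")
    coords = ET.SubElement(point, f"{{{GML_NS}}}coordinates")
    # Assumed ordering: x (longitude) first, then y (latitude).
    coords.text = f"{longitude},{latitude}"
    return facility


if __name__ == "__main__":
    ET.register_namespace("gml", GML_NS)
    # Placeholder identifier and coordinates, not a real facility.
    record = facility_to_xml("EXAMPLE0001", "Example Facility", 38.89, -77.03)
    print(ET.tostring(record, encoding="unicode"))
```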
Slide 28: EPA Facility Database
http://130.11.44.140591/fmpro?-dbepasites.fp5-format-dso_xml-find
Slide 29: 1.1.2 Pilot Implementations
- EPA Employee Directory Data Elements
- Name (First, Middle Initial, and Last)
- Mail Code
- Building Site
- Street Address
- City, State, and ZIP Code
- Telephone Number
- Email Address
Slide 30: EPA Employee Directory
http://www.epa.gov/epahome/locator.htm
Slide 31: 1.1.2 Pilot Implementations
- US Government Blue Pages Data Elements
- Functional Listing
- Organization
- Listing
- Area Code
- Phone Number
- Other
Slide 32: Blue Pages Pilot Project
Slide 33: Blue Pages Pilot Project
Search Form
XML Output
Slide 34: Blue Pages Pilot Project
Search Form
Web Output
Slide 35: 1.1.2 Pilot Implementations
- Commercial Qsent Comprehensive Directory Features
- Over 145 million residential, business and government listings (99). (Every record verified through phone installation and account activation with credit history.)
- 250,000 to 500,000 updates daily.
- Four types of searches
- U.S. Residential: search for an individual.
- U.S. Business and Government: search for a business or government agency.
- Reverse Lookup: search by telephone number.
- U.S. All: search all directories at once.
- Geographic searches
- City Surround: expand search incrementally from city center (lowest ZIP).
- Neighborhood Search: search by neighborhoods using ZIP+4.
Slide 36: Qsent's iQ411 Services: Interactive (Web)
Slide 37: Qsent's iQ411 Services: Integrated (XML)
Slide 38: Qsent's iQ411 Services: Integrated (XML)
Slide 39: Qsent's iQ411 Services: Integrated (XML)
Slide 40: 1.2 Library
- Old
- Tia Green's Notes Database
- I have the key documents in an XML publishing system (see next slides).
- New
- Carmen Farrow's Notes Database: more than Tia's (had only 25), but mostly SAIC.
- Convert more to XML for the XML publishing system (Metadata Networks in Section 2).
Slide 41: 1.2 Library
- Tools from NextPage: http://www.nextpage.com
- Folio Views: SGML-like markup (pre-XML) in a GUI.
- CD-ROM distribution.
- Web Server (Markup-to-HTML on the fly).
- LivePublish: basic XML support (uses DTD; see next slide).
- Site Administrator.
- Personal Edition (Desktop and CD-ROM).
- Web Server (Markup-to-HTML on the fly).
- NXT 3: advanced support for XML (LivePublish plus XSL, SOAP, etc.; see later slide).
- Content Network Manager.
- Content Network Server.
Slide 42: 1.2 Library
- LivePublish Uses of XML
- Serve up native XML.
- Convert XML to HTML using a CSS or XSL stylesheet at run time using the Display Filter API.
- Convert XML to HTML at build time.
- Uses an XML-based file to define site look and feel.
- The build Makefiles are XML files that define the structure and contents of the information collections.
- XML-based legacy conversion tools simplify the conversion of existing content into HTML.
- Indexsheets (XIL) define and control the indexing of content, just as stylesheets (XSL) define and control the formatting.
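The XML-to-HTML conversion step can be illustrated in a few lines of Python. This sketch does not use XSL or the LivePublish Display Filter API; it is a hand-written transform with the standard library standing in for what a stylesheet would do, and the input vocabulary (library, document, title, format) is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document for the illustration.
SAMPLE_XML = """<library>
  <document><title>Contact Standard</title><format>XML</format></document>
  <document><title>LEPC Database</title><format>XML</format></document>
</library>"""


def xml_to_html(xml_text: str) -> str:
    """Render each <document> as an HTML list item."""
    source = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    listing = ET.SubElement(body, "ul")
    for doc in source.findall("document"):
        item = ET.SubElement(listing, "li")
        item.text = f"{doc.findtext('title')} ({doc.findtext('format')})"
    # method="html" serializes using HTML rather than XML conventions.
    return ET.tostring(html, encoding="unicode", method="html")


if __name__ == "__main__":
    print(xml_to_html(SAMPLE_XML))
```

Running the same function once while building a site or on every request is the difference between the build-time and run-time conversions listed above.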
Slides 43-49: 1.2 Library
Slide 50: 1.3 Registries and Repositories
- 1.3.1 Environmental Data Registry
- 1.3.2 Chemicals (List of Lists)
- 1.3.3 Integrated Taxonomic Information System
- 1.3.4 GEneral Multilingual Environmental Thesaurus (GEMET) 2002
- 1.3.5 State of the Environment Report and Environmental Indicators Initiative
- 1.3.6 Re-purpose every EPA Information System into an XML Document and Web Service
Slide 51: 1.3.1 Environmental Data Registry: Document-oriented
Slide 52: 1.3.1 Environmental Data Registry: Document-oriented
Slide 53: 1.3.1 Environmental Data Registry: Data-oriented
Slide 54: 1.3.1 Environmental Data Registry: Data-oriented
Slide 55: 1.3.2 Chemicals (List of Lists)
http://130.11.53.73/lol/
Slide 56: 1.3.2 Chemicals (List of Lists)
http://www.filemaker.com/xml/index.html
Slide 57: 1.3.3 Integrated Taxonomic Information System
http://sis.agr.gc.ca/pls/itisca/taxaget?p_ifx
Slide 58: 1.3.3 Integrated Taxonomic Information System: Select XML and Query for flowers
Slide 59: 1.3.4 GEneral Multilingual Environmental Thesaurus (GEMET) 2002
http://www.mu.niedersachsen.de/cds/etc-cds_neu/library/select.html
Slide 60: 1.3.4 GEneral Multilingual Environmental Thesaurus (GEMET) 2002
Slide 61: 1.3.4 GEneral Multilingual Environmental Thesaurus (GEMET) 2002
- The GEMET DTD allows one to see the Thesaurus in XML with a Web browser that supports XML Version 1.0.
- The Thesaurus may be viewed in different ways with different XSL stylesheets.
- See cds-thes-xml.dtd in XML Spy in the next slide.
- This DTD can be converted to an XML Schema in XML Spy 4.4 (see the slide after next).
Slide 62: 1.3.4 GEneral Multilingual Environmental Thesaurus (GEMET) 2002: DTD in XML Spy
Slide 63: 1.3.4 GEneral Multilingual Environmental Thesaurus (GEMET) 2002
Slide 64: 1.3.5 State of the Environment Report and Environmental Indicators Initiative: Geography
Slide 65: 1.3.5 State of the Environment Report and Environmental Indicators Initiative: Themes
Slide 66: 1.3.5 State of the Environment Report and Environmental Indicators Initiative: Regions and States
Slide 67: 1.3.6 Re-purpose Every EPA Information System into an XML Document and Web Service: Integrate, Link, and Chain All Together (Section 2)
Slide 68: 3. Contact Information
- Brand Niemann, Ph.D.
- USEPA Headquarters, EPA West, Room 6143D
- Office of Environmental Information, MC 2822T
- 1200 Pennsylvania Avenue, NW, Washington, DC 20460
- 202-566-1657
- niemann.brand_at_epa.gov
- EPA: http://161.80.70.167
- Outside EPA: http://130.11.44.140