Title: Data Representation System XML
CBS OPAG-ISS Expert Team on the Assessment of Data Representation Systems (ET-ADRS)
Washington DC, USA, 23-25 April 2008
- Jan W. Noteboom
- Royal Netherlands Meteorological Institute (KNMI)
- noteboom_at_knmi.nl
- XML Overview
- XML SWOT Analysis
- XML Practical Experiences
- Discussion
XML Overview: Introduction
- XML is
  - eXtensible Markup Language
  - a W3C Recommendation since February 1998
  - a subset of Standard Generalized Markup Language (SGML)
  - a meta-language, used to create markup languages
  - designed to represent and exchange data as structured documents across information systems, particularly via the Internet
  - human-readable (text-based, Unicode support)
  - an open standard, licence-free
XML Overview: Structure and Semantics
- Basic components
  - elements, attributes, comments, PCDATA, processing information (e.g. declaration, namespaces)
- Well-formed
  - one root element (hierarchical structure)
  - no unclosed tags, proper nesting: <parentTag><childTag1> </childTag1></parentTag>
  - attribute values must be quoted
  - element names are case-sensitive
- Valid
  - document conforms to some semantic rules (e.g. Document Type Definition DTD or XML Schema XSD)
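The well-formedness rules above can be checked mechanically by any XML parser. The following is a minimal sketch using Python's standard-library parser; the helper name `is_well_formed` and the sample strings are illustrative assumptions, not part of the presentation. It checks well-formedness only; validation against a DTD or XSD needs additional tooling.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per rule listed on the slide.
samples = {
    "proper nesting": "<parentTag><childTag1> </childTag1></parentTag>",
    "unclosed tag": "<parentTag><childTag1></parentTag>",
    "two root elements": "<a/><b/>",
    "unquoted attribute": "<a id=1/>",
    "case mismatch": "<Tag></tag>",
}

results = {name: is_well_formed(doc) for name, doc in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'not well-formed'}")
```

Only the first sample is accepted; each of the others breaks exactly one of the rules.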
XML Overview: Structure and Semantics
- Semantic rules
  - XML Schema XSD (W3C, 2001)
    - uses XML syntax (well-formed)
    - structural definitions, type definitions, defaults
    - very powerful and flexible
    - facilitates creation of own libraries with exchange data types
  - Schemas are also useful for
    - prior agreements between parties for data exchange
    - development of applications that process the data
  - Alternatives: RELAX NG (OASIS), Schematron
    - supported by DSDL, Document Schema Definition Languages (ISO 19757)
- Namespaces (W3C, 2006)
  - identify your vocabularies (usage: xmlns:prefix="URI")
  - qualify element and attribute names to avoid name collisions
  - allow modularization of schemas
    - mix and match elements from multiple schemas in document instances
    - import or include from one XML Schema into another (re-use)
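As an illustration of how namespaces avoid name collisions, here is a small sketch in Python. The two vocabularies and their URIs (`http://example.org/station`, `http://example.org/sensor`) are invented for this example; both define an element called `name`, yet the qualified names stay distinct.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vocabularies that both define a "name" element.
STATION_NS = "http://example.org/station"  # assumed URI, illustration only
SENSOR_NS = "http://example.org/sensor"    # assumed URI, illustration only

doc = f"""
<st:station xmlns:st="{STATION_NS}" xmlns:sn="{SENSOR_NS}">
  <st:name>De Bilt</st:name>
  <sn:name>thermometer</sn:name>
</st:station>
"""

root = ET.fromstring(doc)
prefixes = {"st": STATION_NS, "sn": SENSOR_NS}

# The prefix is only shorthand; the parser stores {URI}localname.
station_name = root.find("st:name", prefixes).text
sensor_name = root.find("sn:name", prefixes).text

print(root.tag)
print(station_name, sensor_name)
```

The prefixes themselves carry no meaning: any prefix bound to the same URI identifies the same vocabulary.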
XML Example
<?xml version="1.0"?>
<swe:CompositePhenomenon
    xmlns:swe="http://www.opengis.net/swe/1.0.1"
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xsi:schemaLocation="http://www.opengis.net/swe/1.0.1
        http://schemas.opengis.net/sweCommon/1.0.1/swe.xsd"
    gml:id="weather1" dimension="6">
  <gml:name codeSpace="urn:ietf:rfc:2141">urn:ogc:def:phenomenon:SEEGrid:weather1</gml:name>
  <swe:base xlink:href="urn:ogc:def:phenomenon:OGC:Weather"/>
  <swe:component xlink:href="urn:ogc:def:phenomenon:OGC:AirTemperature"/>
  <swe:component xlink:href="urn:ogc:def:phenomenon:OGC:WindSpeed"/>
  <swe:component xlink:href="urn:ogc:def:phenomenon:OGC:WindDirection"/>
  <swe:component xlink:href="http://sweet.jpl.nasa.gov/ontology/property.owl#Visibility"/>
</swe:CompositePhenomenon>
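A sketch of how an application might read such a document, using Python's standard library. It assumes the prefixed names in the example read `swe:component`, `gml:name` and `xlink:href`, and it uses a shortened copy of the document (no `schemaLocation`, two components only).

```python
import xml.etree.ElementTree as ET

NS = {
    "swe": "http://www.opengis.net/swe/1.0.1",
    "gml": "http://www.opengis.net/gml",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Shortened copy of the example document (assumed reconstruction).
DOC = """<?xml version="1.0"?>
<swe:CompositePhenomenon
    xmlns:swe="http://www.opengis.net/swe/1.0.1"
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    gml:id="weather1" dimension="6">
  <gml:name codeSpace="urn:ietf:rfc:2141">urn:ogc:def:phenomenon:SEEGrid:weather1</gml:name>
  <swe:base xlink:href="urn:ogc:def:phenomenon:OGC:Weather"/>
  <swe:component xlink:href="urn:ogc:def:phenomenon:OGC:AirTemperature"/>
  <swe:component xlink:href="urn:ogc:def:phenomenon:OGC:WindSpeed"/>
</swe:CompositePhenomenon>
"""

root = ET.fromstring(DOC)

# Namespaced attributes are addressed as {URI}localname.
HREF = "{http://www.w3.org/1999/xlink}href"
gml_id = root.get("{http://www.opengis.net/gml}id")
name = root.findtext("gml:name", namespaces=NS)
components = [c.get(HREF) for c in root.findall("swe:component", NS)]

print(gml_id, name)
for href in components:
    print(" component:", href)
```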
XML Overview: Processing XML
[Figure: XML processing chain. Labels: XML doc, XML schema, XML parser, XML validator, XML Infoset, API implementor]
- Programming APIs
  - DOM - Document Object Model
  - SAX - Simple API for XML
  - StAX - Streaming API for XML
- Data binding
  - JAXB - Java Architecture for XML Binding
  - Hibernate - relational/object/XML mapping tool
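The three API styles differ mainly in who drives the parsing and how much is kept in memory. The sketch below reads the same hypothetical document three ways in Python: with a DOM tree, with SAX callbacks, and with a pull-style iterator. Python has no StAX implementation as such; `iterparse` is used here as a stand-in that is only comparable in spirit.

```python
import io
import xml.sax
from xml.dom import minidom
import xml.etree.ElementTree as ET

DOC = "<obs><t>11.5</t><t>12.0</t><t>12.5</t></obs>"  # hypothetical observations

# DOM: build the whole tree in memory, then navigate it.
dom = minidom.parseString(DOC)
dom_values = [float(n.firstChild.data) for n in dom.getElementsByTagName("t")]


# SAX: the parser pushes events to a handler; no tree is kept.
class TempHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.values = []
        self._buf = None  # collects character data while inside <t>

    def startElement(self, name, attrs):
        if name == "t":
            self._buf = []

    def characters(self, content):
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        if name == "t":
            self.values.append(float("".join(self._buf)))
            self._buf = None


handler = TempHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_values = handler.values

# Pull style: the application asks for the next event and can
# discard each element once it has been handled.
pull_values = []
for event, elem in ET.iterparse(io.BytesIO(DOC.encode("utf-8")), events=("end",)):
    if elem.tag == "t":
        pull_values.append(float(elem.text))
        elem.clear()

print(dom_values, sax_values, pull_values)
```

For small documents the DOM style is usually the most convenient; the event and pull styles matter when documents are too large to hold in memory.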
XML Overview: XML extensions
- XPath, XPointer for addressing XML subdocuments
- XLink to create hyperlinks between resources
- XSLT for rearranging and restructuring XML docs
- XQuery for querying
- SOAP, XML protocol for message and object serialization and remote procedure calls
- RDF to describe resource metadata
- XForms for Web forms
- XMI, XML Metadata Interchange
- etc.
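To give a feel for XPath addressing, the sketch below uses the limited XPath subset supported by Python's standard library. The document and its element names are invented for the example; a full XPath engine supports considerably more than what is shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical observation document.
DOC = """
<observations>
  <station id="A"><temp unit="C">11.5</temp><wind unit="m/s">4.0</wind></station>
  <station id="B"><temp unit="C">9.0</temp></station>
</observations>
"""
root = ET.fromstring(DOC)

# All <temp> elements at any depth.
all_temps = [float(e.text) for e in root.findall(".//temp")]

# The <temp> of the station whose id attribute equals "B".
temp_b = root.find("./station[@id='B']/temp").text

# Stations that have a <wind> child element.
stations_with_wind = [s.get("id") for s in root.findall("./station[wind]")]

print(all_temps, temp_b, stations_with_wind)
```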
XML Overview: Dialects
- Geography Markup Language (GML, ISO 19136)
- Keyhole Markup Language (KML)
- Digital Weather Markup Language (DWML)
- Climate Data Markup Language (CDML)
- Weather Markup Language (WxML)
- Emergency Data Exchange Language (EDXL)
- Water Markup Language (WaterML)
- Chemical Markup Language (CML)
- Electronic Business XML Initiative (ebXML)
- Scalable Vector Graphics (SVG), etc.
XML SWOT Analysis
XML SWOT Analysis: Criteria
- Ability/suitability to present WMO data
  - includes also pictorial data, textual information (e.g. warnings), metadata
- Ability/suitability to exchange data
  - operational data between NMHSs and information to users outside NMHSs
- Ability/suitability to store data
  - usage in storage systems
- Compliance with and status of existing standards
- Available support: skills and technology (tooling)
- Other abilities
  - ability to translate back and forth to other DRSs
  - ability/suitability to envelope objects or act as a pseudo-carrier
XML SWOT Analysis: Summary
Criteria | Score | Remarks
Ability/suitability to present WMO data (includes also pictorial data, textual information, metadata) | | Not suitable for raster data (pictures)
Ability/suitability to exchange data (operational data between NMHSs; information to users outside NMHSs) | | Substantial bandwidth and processing overhead
Ability/suitability to store data (usage in storage systems) | | Mapping schemas to RDBMS tables is difficult
Compliance with and status of existing standards | | Only a Core Metadata XML schema (WMO)
Available support: skills and technology (tooling) | | High level of support
Other abilities (ability to translate back and forth to other DRSs; ability to envelope objects or act as a pseudo-carrier) | | No BUFR translation tooling found
XML SWOT Analysis: Presenting WMO data (weather, climate, water, atmospheric constituents, oceanography, aviation, etc.)
- Strengths
  - structured text format (hierarchical)
  - self-documenting
  - can represent common data structures: records, lists, trees
  - human-readable
  - very flexible: you can define and mix other languages (GML)
  - supports modularity (namespaces)
- Weaknesses
  - fairly verbose and partially redundant
  - binary data
  - expressing non-hierarchical relationships is difficult
- Opportunities / Threats
  - many languages and schemas available that are useful to describe weather- and climate-related aspects
  - potentially complex parsing (many namespaces)
XML SWOT Analysis: Presenting WMO data
- Pictorial data
  - feature data, vector data (GML, KML)
  - presenting raster data
  - Remark: GML (ISO 19136) enables OGC services (WFS, WMS, WCS)
- Text information (e.g. warnings)
  - XML is human-readable and multilingual (Unicode)
  - EDXL (Emergency Data Exchange Language) by OASIS
  - rearranging and restructuring abilities with XSL (to text, XHTML, PDF, etc.)
  - Remark: CAP, Common Alerting Protocol (OASIS), open, part of the EDXL family of standards
- Metadata
  - ISO 191xx support (ISO 19139 Metadata Schema)
  - Remark: WMO Core Metadata Profile available (extension to ISO 19139)
XML SWOT Analysis: Exchanging operational data (NMHSs and centers)
- Strengths
  - strict syntax and parser requirements
  - XML schemas, useful for
    - validating
    - defining exchange formats
    - writing applications that process the data
  - web enabling
- Weaknesses
  - verbose, bandwidth consumption
  - processing overhead (complex parsing, validation)
- Opportunities / Threats
  - data compression (gzip)
  - no agreed WMO schemas, except for Core Metadata
- Usage: still rare, e.g. Cyclone XML
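The trade-off between verbosity and compression can be made concrete with a small experiment. The sketch below builds a synthetic observation document (element names, attributes and values are invented for illustration) and compresses it with gzip; the ratio obtained says nothing about real WMO data, it only shows that repetitive markup compresses well.

```python
import gzip
import xml.etree.ElementTree as ET

# Build a synthetic, repetitive observation document (hypothetical names).
root = ET.Element("observations")
for i in range(200):
    obs = ET.SubElement(root, "observation", {"station": f"S{i:03d}"})
    ET.SubElement(obs, "airTemperature", {"unit": "Cel"}).text = f"{10 + (i % 15) * 0.5:.1f}"
    ET.SubElement(obs, "windSpeed", {"unit": "m/s"}).text = f"{(i % 9) * 1.5:.1f}"

raw = ET.tostring(root, encoding="utf-8")  # serialized XML as bytes
packed = gzip.compress(raw)                # what would travel over the network
restored = gzip.decompress(packed)         # what the receiver parses

print(f"raw: {len(raw)} bytes, gzip: {len(packed)} bytes, "
      f"ratio: {len(packed) / len(raw):.2f}")
```

Compression reduces bandwidth but adds to the processing overhead listed under Weaknesses, since the receiver must decompress before parsing.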
XML SWOT Analysis: Transmitting information (users outside NMHSs and centers)
- Strengths
  - human-readable (text)
  - language support (Unicode)
  - web enabling (Web services, SOA support, e-Business)
  - interoperability
  - translation to text, HTML or other XML (XSL)
  - open standard, licence-free
  - technology support (tools)
- Weaknesses
  - verbose, bandwidth consumption
  - transmitting binary data
- Opportunities / Threats
  - usage for metadata exchange is common practice (catalogue data)
  - too many proprietary schemas to transmit data
- Usage: growing practice, e.g. Road Weather Information Network, Canada
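One reason binary data counts as a weakness: XML is text, so binary content has to be text-encoded before it can be embedded. The sketch below uses base64, a common choice; the element name `payload` and its `encoding` attribute are assumptions made for this example only.

```python
import base64
import xml.etree.ElementTree as ET

payload = bytes(range(256)) * 12  # 3072 bytes of stand-in binary data

# Encode the bytes as base64 text and wrap them in a hypothetical element.
elem = ET.Element("payload", {"encoding": "base64"})
elem.text = base64.b64encode(payload).decode("ascii")
document = ET.tostring(elem)

# The receiver parses the XML and decodes the text back to bytes.
decoded = base64.b64decode(ET.fromstring(document).text)

overhead = len(elem.text) / len(payload)
print(f"binary: {len(payload)} bytes, base64 text: {len(elem.text)} chars, "
      f"factor: {overhead:.2f}")
```

Base64 maps every 3 bytes to 4 characters, so the encoded content is about one third larger than the original, before any markup is counted.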
XML SWOT Analysis: Usage in storage systems
- Strengths
  - native XML DBMSs available
  - many XML-enabled relational databases
  - technology such as XPath and XQuery, SQL/XML
- Weaknesses
  - normalizing/mapping XML data into RDBMS tables can be difficult
  - a native XML DBMS requires more space (less efficient)
- Opportunities / Threats
  - not much experience with XML for storage in NMHSs
- Usage: writing XML queries instead of SQL (growing
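To illustrate what mapping XML into relational tables involves, here is a minimal sketch with an in-memory SQLite database. The document structure, table names and columns are all invented for the example. Even in this simple case the nesting has to be split into two tables linked by a key, and an optional element becomes a NULL column; deeper or more irregular documents need correspondingly more mapping decisions.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical nested document: stations containing observations.
DOC = """
<observations>
  <station id="A" name="Alpha">
    <obs time="2008-04-23T12:00Z"><temp>11.5</temp><wind>4.0</wind></obs>
    <obs time="2008-04-23T12:20Z"><temp>11.9</temp></obs>
  </station>
  <station id="B" name="Bravo">
    <obs time="2008-04-23T12:00Z"><temp>9.0</temp><wind>6.5</wind></obs>
  </station>
</observations>
"""

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE station (id TEXT PRIMARY KEY, name TEXT)")
db.execute(
    "CREATE TABLE obs (station_id TEXT REFERENCES station(id), "
    "time TEXT, temp REAL, wind REAL)"
)

for st in ET.fromstring(DOC).findall("station"):
    db.execute("INSERT INTO station VALUES (?, ?)", (st.get("id"), st.get("name")))
    for ob in st.findall("obs"):
        wind = ob.findtext("wind")  # optional element: None when absent
        db.execute(
            "INSERT INTO obs VALUES (?, ?, ?, ?)",
            (
                st.get("id"),
                ob.get("time"),
                float(ob.findtext("temp")),
                float(wind) if wind is not None else None,
            ),
        )

rows = db.execute(
    "SELECT station_id, COUNT(*), AVG(temp) FROM obs "
    "GROUP BY station_id ORDER BY station_id"
).fetchall()
missing_wind = db.execute("SELECT COUNT(*) FROM obs WHERE wind IS NULL").fetchone()[0]

print(rows, missing_wind)
```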
XML SWOT Analysis: Standards
- Strengths
  - XML is an open standard (W3C)
  - ISO 191xx standards are supported by XML schemas
  - GML (ISO 19136): XML for geospatial aspects
  - OGC O&M model supported by GML schemas
  - ability to implement UML conceptual models (ISO 19103) in XML schemas (using XMI)
- Weaknesses
  - no agreed WMO schemas available (except for Core Metadata)
- Opportunities / Threats
  - all harmonization initiatives are based on ISO, W3C and OGC standards (RA VI EU-INSPIRE)
  - OGC Observations and Measurements model used to develop WXXM (weather exchange model) for aviation (Eurocontrol)
  - HollowWorld applied to develop WMO Core Metadata
XML SWOT Analysis: Other abilities
- Translating XML from and to other DRSs
  - NetCDF <-> XML tooling
    - (to) NetCDF Markup Language (NcML), and NcML-GML
    - (to/from) LeoNetCDF
  - BUFR <-> XML tooling: unknown
  - HDF5 <-> XML tooling: (to) h5dump
- Combining XML with other data formats (e.g. envelope, pseudo-carrier)
  - XML metadata/header for HDF5, NetCDF or BUFR datasets
  - SOAP (XML protocol) for exchanging data
XML SWOT Analysis: Available support (skills and technology)
- Numerous XML extensions (XSL, XQuery, XForms, XPath, KML, DWML, SOAP, etc.)
- XMI to translate data models (UML) into XML schemas
- GML for geospatial aspects (GIS systems)
- XML technology is widespread, easily available and cheap
- Increasing number of individuals with XML skills
XML Practical Experiences
- Cyclone XML, TIGGE project
- RWIN, Canada
XML Practical Experiences
- Cyclone XML
  - XML format for cyclone analyses and forecasts (CXML)
  - to improve sharing of cyclone information with users other than NMHSs
  - alternative to the BUFR/CREX format
  - development recently started (TIGGE project)
  - details: http://www.bom.gov.au/bmrc/projects/THORPEX/CXML/index.html
- RWIN (Road Weather Information Network)
  - XML format for road weather observations (CMML)
  - interchange between Canadian transportation ministries and contractors (network maintenance)
  - 200 observing sites, every 20 minutes
  - operational
  - details: http://www.clarusinitiative.org/
XML Practical Experiences
- WXXM, Weather Exchange Model
  - for data and objects related to weather for aviation
  - Conceptual Model (WXCM) based on the OGC Observations and Measurements model
  - following ISO 19100 principles and OGC recommendations
  - using GML for compatibility with third-party GML applications
  - under development
  - details: http://www.eurocontrol.int/aim/public/stan
- Aeronautical Information Exchange Model (AIXM)
Discussion
- Is there a need to develop a WMO Markup Language?
- How to benefit from the extensive and cheap XML support?
- What synergy XML <-> HDF5/BUFR/NetCDF is achievable?
- What governance should WMO offer to support XML?