Title: Metadata and Data Discovery
1 Metadata and Data Discovery
2 Two Uses
- This topic is still very fluid. The ideas of registries of data sets and
  registries of services are not yet well settled in the community.
  - An FGDC Node ! SOAP UDDI ! IOOS Registry
- Use 1: Full characterization of a data set requires rich and broadly
  written metadata (e.g. FGDC CSDGM).
- Use 2: Access to a finite list of data services that provide well-defined
  data streams requires only the characteristics of the data service(s),
  with pointers to the full metadata.
- Full metadata may or may not be transported with each data transmission.
3 Metadata (use 1)
- We are obligated to participate in the FGDC Clearinghouse NSDI system
  (Executive Order 12906).
- We need a thesaurus and keywords for FGDC. There is a proposal in the DMAC
  standards process that references GCMD, IHO, OBIS, and CF.
- The IOOS PO DIF Concept of Operations document points toward ISO, as do
  the FGDC and DMAC. ISO is rich, powerful, and complex.
- Options
  - Write metadata and host your own FGDC Clearinghouse node:
    http://www.fgdc.gov/dataandservices/isite_tutorial
  - Write metadata and participate in Geospatial OneStop:
    http://gos2.geodata.gov/wps/portal/gos
  - Use MERMaid:
    http://www.ncddc.noaa.gov/metadataresource/metadata-tools
- This addresses the need for full characterization of data sets.
4 Registries of Data Services
5 Discovery in a Controlled System
- Data discovery in a system with predefined content, structure, and quality
  is simpler than discovery in a totally open system.
- EPA Contract Laboratory Program (CLP) model.
  - EPA places basic requirements on laboratories.
  - The quality of the data is expressed in a standardized way.
    - Data quality is handled just like any other attribute, such as a date
      or units.
  - The data reporting is standardized.
    - Any user of CLP data can depend on the content/structure.
- We still need to establish a regimen of quality levels and flags for each
  level (e.g. QARTOD). There is a submission to the DMAC standards process
  on this, but it is unclear if/when it might be adopted.
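One way to picture "quality handled just like any other attribute" is the minimal Python sketch below, in which each observation carries a quality flag next to its units and time. The flag names and numeric levels, the `Observation` fields, and the `usable` helper are placeholders invented for this illustration. They are not the QARTOD scheme or anything adopted through the DMAC process.

```python
from dataclasses import dataclass
from enum import IntEnum


class QualityFlag(IntEnum):
    """Placeholder quality levels (hypothetical, not an adopted standard)."""
    NOT_EVALUATED = 0
    PASSED = 1
    SUSPECT = 2
    FAILED = 3


@dataclass(frozen=True)
class Observation:
    """One reported value; quality sits beside units and time as a plain attribute."""
    parameter: str
    value: float
    units: str
    time: str  # ISO 8601 timestamp, kept as text for simplicity
    quality: QualityFlag


def usable(observations, accepted=(QualityFlag.PASSED,)):
    """Return only the observations whose flag is in the accepted set."""
    return [obs for obs in observations if obs.quality in accepted]


# Made-up sample records for the illustration.
records = [
    Observation("salinity", 34.9, "psu", "2008-01-15T12:00:00Z", QualityFlag.PASSED),
    Observation("salinity", 12.1, "psu", "2008-01-15T13:00:00Z", QualityFlag.SUSPECT),
    Observation("salinity", 35.0, "psu", "2008-01-15T14:00:00Z", QualityFlag.NOT_EVALUATED),
]

good = usable(records)
lenient = usable(records, accepted=(QualityFlag.PASSED, QualityFlag.SUSPECT))
```

Because every record has the same structure, a data user can filter on quality with the same kind of test used for dates or units.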
6 Example Obs. Registry Architecture
7 A Simple Example
- The IOOS Observation Registry uses a simple list of participants. An entry
  for a participant looks like:

  <registrant>
    <name>name of organization (CDP)</name>
    <email>somebodyobfuscate emailsome.server.edu</email>
    <file_url>http://www.yourserver.edu/obsreg/obs.xml</file_url>
    <regional_association>AOOS</regional_association>
  </registrant>
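As a rough illustration of how a client might read such a participant list, the sketch below parses one entry with Python's standard `xml.etree.ElementTree` module. The wrapping `<registry>` root element, the placeholder email text, and the `Registrant` class and `parse_registrants` function are assumptions made for this example, not part of the IOOS Observation Registry itself.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical sample: one participant entry inside an assumed <registry> root.
SAMPLE = """
<registry>
  <registrant>
    <name>name of organization (CDP)</name>
    <email>somebody at some.server.edu</email>
    <file_url>http://www.yourserver.edu/obsreg/obs.xml</file_url>
    <regional_association>AOOS</regional_association>
  </registrant>
</registry>
"""


@dataclass
class Registrant:
    name: str
    email: str
    file_url: str
    regional_association: str


def parse_registrants(xml_text):
    """Return one Registrant per <registrant> element found in the document."""
    root = ET.fromstring(xml_text)
    found = []
    for node in root.iter("registrant"):
        found.append(Registrant(
            name=node.findtext("name", default="").strip(),
            email=node.findtext("email", default="").strip(),
            file_url=node.findtext("file_url", default="").strip(),
            regional_association=node.findtext("regional_association", default="").strip(),
        ))
    return found


registrants = parse_registrants(SAMPLE)
for entry in registrants:
    # A harvester would next fetch entry.file_url to read the observation list.
    print(entry.regional_association, entry.file_url)
```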
8
- An entry in a registry of services might look something like:

  <registrant>
    <name>name of organization</name>
    <email>somebodyobfuscate emailsome.server.edu</email>
    <service_url>http://www.yourserver.edu/cgi-bin/ioos/microWFS.cgi</service_url>
    <observations>
      <observation md_url="http://url/to/sal_metadata">salinity</observation>
      <observation md_url="http://url/to/wt_metadata">water temperature</observation>
      <observation md_url="http://url/to/cs_metadata">current speed</observation>
      <observation md_url="http://url/to/cd_metadata">current direction</observation>
    </observations>
    <gml:boundedBy>
      <gml:Envelope srsName="EPSG:4326" srsDimension="2">
        <gml:lowerCorner>-27.928720 -159.833328</gml:lowerCorner>
        <gml:upperCorner>60.095001 144.788834</gml:upperCorner>
      </gml:Envelope>
    </gml:boundedBy>
  </registrant>
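A service entry of this kind could support simple discovery by parameter and location. The Python sketch below reads one entry and checks whether the service offers a given observation at a given point. Several things here are assumptions for illustration: the `xmlns:gml` declaration and its URI (the entry above does not declare the prefix), the reading of each corner as "latitude longitude", and the helper names `read_service` and `offers`.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the gml prefix; the slide does not declare one.
GML_NS = {"gml": "http://www.opengis.net/gml"}

SERVICE_ENTRY = """
<registrant xmlns:gml="http://www.opengis.net/gml">
  <name>name of organization</name>
  <service_url>http://www.yourserver.edu/cgi-bin/ioos/microWFS.cgi</service_url>
  <observations>
    <observation md_url="http://url/to/sal_metadata">salinity</observation>
    <observation md_url="http://url/to/wt_metadata">water temperature</observation>
  </observations>
  <gml:boundedBy>
    <gml:Envelope srsName="EPSG:4326" srsDimension="2">
      <gml:lowerCorner>-27.928720 -159.833328</gml:lowerCorner>
      <gml:upperCorner>60.095001 144.788834</gml:upperCorner>
    </gml:Envelope>
  </gml:boundedBy>
</registrant>
"""


def read_service(xml_text):
    """Extract the service URL, observation-to-metadata map, and envelope corners."""
    node = ET.fromstring(xml_text)
    envelope = node.find("gml:boundedBy/gml:Envelope", GML_NS)
    lower = [float(v) for v in envelope.findtext("gml:lowerCorner", namespaces=GML_NS).split()]
    upper = [float(v) for v in envelope.findtext("gml:upperCorner", namespaces=GML_NS).split()]
    return {
        "service_url": node.findtext("service_url").strip(),
        # Map each observation name to the pointer to its full metadata record.
        "observations": {obs.text.strip(): obs.get("md_url") for obs in node.iter("observation")},
        "lower": lower,  # assumed order: [latitude, longitude]
        "upper": upper,
    }


def offers(service, parameter, lat, lon):
    """True if the parameter is listed and the point lies inside the envelope."""
    inside = (service["lower"][0] <= lat <= service["upper"][0]
              and service["lower"][1] <= lon <= service["upper"][1])
    return inside and parameter in service["observations"]


service = read_service(SERVICE_ENTRY)
if offers(service, "salinity", 21.3, -157.8):
    print(service["service_url"], service["observations"]["salinity"])
```

The registry entry stays small: it names the data streams and points to the full metadata through `md_url` rather than carrying that metadata itself.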
9 Suggestions
- Suggestion: Write FGDC metadata as required.
- Suggestion: Capitalize on that investment to support decisions about data
  use.
- Suggestion: Employ a simple registry of data services to facilitate data
  discovery in a system of Certified Data Providers.
- Suggestion: Write an FGDC record for each parameter-specific data stream
  and host it near the data access point.