Title: Discovering Geospatial Data Resources via Clearinghouse
1 Discovering Geospatial Data Resources via Clearinghouse
- Lessons learned through a passing familiarity with Z39.50
- Doug Nebert, FGDC
2 Clearinghouse expectations
- Standard fields and operators for discovery and display of data
- Distributed data and metadata resources
- Search engine capabilities on full-text and fields
- Delivery of structured documents in consistent form
3 Standard fields and operators
- Ability for information provider to support well-known semantics of fields in search
- Support for internationalization
- Allow custom approaches to data management and schema, but hidden from the end-user
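As a minimal sketch of how a custom local schema might stay hidden behind well-known field names, the example below re-keys records and translates queries. The column names, the standard field names, and both helper functions are hypothetical and are not taken from the slides.

```python
# Hypothetical mapping from one provider's local column names to
# well-known search field names (all names here are assumptions).
LOCAL_TO_STANDARD = {
    "ttl": "title",
    "pub_dt": "publication_date",
    "kw": "theme_keyword",
}


def to_standard(local_record):
    """Re-key a local record so clients only see standard field names.

    Local columns with no standard equivalent are left out of the result.
    """
    return {
        LOCAL_TO_STANDARD[key]: value
        for key, value in local_record.items()
        if key in LOCAL_TO_STANDARD
    }


def translate_query(field, value):
    """Turn a query on a standard field into one on the local column."""
    standard_to_local = {std: loc for loc, std in LOCAL_TO_STANDARD.items()}
    return standard_to_local[field], value
```

A client would search on `title`; only the provider's own code ever sees `ttl`.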
4 Distributed data and metadata
- Data and metadata are better managed locally rather than centrally
- Centralized index is a significant investment with likely synchronization problems
- Community list of registered data collections managed and reflected as Internet resource for distributed query
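The idea of a registered list of collections driving a distributed query can be sketched as a simple fan-out. This is an illustration only: the registry contents, node names, and function names are hypothetical, and in-memory lists stand in for what would really be remote servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical registry: node name -> metadata records managed at that node.
REGISTRY = {
    "node-a": [{"title": "Hydrography", "source": "node-a"}],
    "node-b": [
        {"title": "Roads", "source": "node-b"},
        {"title": "Hydrography 1:100k", "source": "node-b"},
    ],
}


def search_node(name, term):
    """Stand-in for a remote search; a real node would be queried over the network."""
    return [r for r in REGISTRY[name] if term.lower() in r["title"].lower()]


def distributed_search(term):
    """Send the same query to every registered node and merge the hits."""
    names = sorted(REGISTRY)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        # pool.map preserves the order of `names`, so results are deterministic.
        per_node = list(pool.map(lambda n: search_node(n, term), names))
    return [hit for hits in per_node for hit in hits]
```

No central index is built here; each node answers from its own records, and only the list of nodes is shared.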
5 Search engine capabilities
- Descriptive properties need to be made quickly searchable
- Properties are stored as full-text, text, and numeric
- Multiple properties can be combined with "and" and "or" boolean operators
- May be local or remote
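Combining text and numeric properties with "and" and "or" can be sketched with small predicate functions. The records, field names, and helper names below are made up for illustration and do not represent any particular search engine.

```python
# Hypothetical metadata records with text and numeric properties.
RECORDS = [
    {"title": "Hydrography of the Potomac", "theme": "hydrography", "year": 1996},
    {"title": "Road centerlines", "theme": "transportation", "year": 1994},
    {"title": "Stream gauges", "theme": "hydrography", "year": 1991},
]


def text_contains(field, word):
    """Full-text style match: the word appears anywhere in the field."""
    return lambda rec: word.lower() in str(rec[field]).lower()


def text_equals(field, value):
    """Exact match on a short text property."""
    return lambda rec: rec[field] == value


def number_at_least(field, minimum):
    """Numeric comparison on a numeric property."""
    return lambda rec: rec[field] >= minimum


def all_of(*predicates):
    """Boolean AND over any number of predicates."""
    return lambda rec: all(p(rec) for p in predicates)


def any_of(*predicates):
    """Boolean OR over any number of predicates."""
    return lambda rec: any(p(rec) for p in predicates)


def search(records, predicate):
    """Return the titles of all records that satisfy the predicate."""
    return [r["title"] for r in records if predicate(r)]
```

For example, `search(RECORDS, all_of(text_equals("theme", "hydrography"), number_at_least("year", 1995)))` keeps only the hydrography records dated 1995 or later.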
6 Structured results
- For comparison of results from different sources, it is desirable to present the results in consistent ways
- HTML, SGML, and text formats supported
- Well-known subsets of data elements can be selected for delivery; Brief, Summary, and Full are examples
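Selecting a well-known subset of elements for delivery might look like the sketch below. Which fields belong to Brief or Summary is an assumption made for this example, as are the field names themselves.

```python
# Hypothetical element sets; the field choices are assumptions.
ELEMENT_SETS = {
    "Brief": ["title"],
    "Summary": ["title", "abstract", "publication_date"],
}


def present(record, element_set="Full"):
    """Return only the elements of `record` named by the chosen element set.

    "Full" returns a copy of every element; an unknown set name raises KeyError.
    """
    if element_set == "Full":
        return dict(record)
    wanted = ELEMENT_SETS[element_set]
    return {name: record[name] for name in wanted if name in record}
```

Because every source applies the same element set names, results from different sources come back with the same shape.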
7 More basics
- Properties are typically aggregated to the dataset level for discovery and presentation, for distinct data products
- Some properties are pertinent to collections of data products (series, collections, flightlines)
- Specific properties may vary across a collection (pub date, footprint)
8 Data Management may be via
- A database management application separate from the data resource
- An index of structured information not in a conventional DBMS
- Properties of spatial data managed with and within the geodata management system
9 A discovery view
[Diagram: User, Registry, Node]
10-12 A discovery view
[Diagram: User, Registry, Node; labels "?" and "Query"]
13 A discovery view
[Diagram: User, Registry, Node; labels "?" and "Response"]
14 A discovery view
[Diagram: User, Registry, Node; label "Data Connections"]
15 NSDI Clearinghouse
- Uses list of servers registry
- Provides redundant WWW-Z39.50 protocol gateways with server list
- Returns Brief, Summary, or Full metadata reports to client
- Query of 50 servers becoming difficult
- Pre-emptive search under consideration
16 Canada CEONet
- Uses same Z39.50 protocol and FGDC field terminology
- Requires trader-like WWW-Z39.50 gateway for all queries
- Allows terminal descriptions to map to FGDC properties in non-standard formats
17 CEOS (NASA/ESA/NASDA)
- Catalogue Interoperability Profile (CIP) also uses Z39.50 and WWW
- Requires DCE middleware to support query routing and ordering
- Requires intentional search of collections information then subordinate search (no product inventory search)
18 Scalability of Services
- Discovery needs to occur as if via DNS
- Multiple registries will exist, multiple communities may define servicespace
- Gateways for WWW-??? protocols can be bottlenecks
- Arbitrated queries via trader should be permitted, not required
19 Catalog Services for OGC
- Should support a single common abstract query interface to data set properties
- Should evaluate interfaces in terms of interoperability among CORBA, COM, SQL, and WWW implementations
- Should support common properties and community differentiation and extension
20 Interactions
- User/client interacts with a registry/referral server and with data servers
- Registry/referral server interacts with user and data servers
- Data servers interact with users and forward information to registry/referral servers
21 Technology suggestions
- Z39.50 (ISO 23950) Search and Retrieve
- X.500 or LDAP for aggregate properties
- Whois for referral service
- Uniform Resource Characteristics (URC)
- PICS-NG ratings/metadata service
- Extensible Markup Language (XML)
- Resource Description Framework
22 Feature-Level Metadata
23 Hierarchies of Metadata
- Properties are most often associated with the data set or minimum collection of features available for acquisition
- How are feature-level properties exposed to search?
- How are multi-level properties presented to the user?
24 One multi-level approach
- An arbitrary set of features is associated with one or more properties via a unique identifier
- Permits inheritance of properties from multiple sources
- Requires a "supplements" or "supersedes" condition where a parent case exists
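One possible reading of the "supplements" or "supersedes" condition is sketched below. It assumes that a supplementing record adds its properties to those inherited from its parent (overriding on conflict), while a superseding record replaces the parent's properties entirely. That interpretation, the record "Ab", and all property values are assumptions made for illustration.

```python
# Hypothetical metadata records; "parent" and "mode" encode the inheritance rule.
METADATA = {
    "A": {
        "parent": None,
        "mode": None,
        "props": {"source": "general", "accuracy": "unknown"},
    },
    "Aa": {
        "parent": "A",
        "mode": "supplements",
        "props": {"accuracy": "12 m"},
    },
    "Ab": {
        "parent": "A",
        "mode": "supersedes",
        "props": {"source": "resurvey"},
    },
}


def resolve(record_id):
    """Return the effective properties of a record after applying inheritance."""
    rec = METADATA[record_id]
    if rec["parent"] is None or rec["mode"] == "supersedes":
        # No parent, or the parent is replaced outright: use own properties only.
        return dict(rec["props"])
    # "supplements": start from the parent's effective properties, then
    # layer this record's own properties on top.
    merged = resolve(rec["parent"])
    merged.update(rec["props"])
    return merged
```

Under these assumptions, `resolve("Aa")` keeps the parent's `source` and refines `accuracy`, while `resolve("Ab")` carries nothing over from "A".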
25 National Hydro Dataset
[Diagram labels: USGS DLG, EPA RF3; General Case: A, B; Basin-Specific, Quad-Specific; Supplemental: Aa, Bb; features 1 to 5]
Feature-list Associations
4023  Bb  2,3,4,5
4024  Aa  1,2,3
26 Feature Addition and Update
[Diagram labels: USGS DLG, EPA RF3; General Case: A, B; Basin-Specific, Quad-Specific; Supplemental: Aa, Bb; features 1 to 5; New Feature: 6; Update; C, D]
4023  Bb   2,3,4,5
4024  Aa   1,2,3
4025  C    6
4026  D-b  5
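The association rows on these two slides (4023, 4024, then 4025 and 4026) can be read as a small lookup table. The sketch below assumes each row holds an association identifier, a metadata record name, and a list of feature numbers. That column interpretation and the function names are assumptions, not something the slides state.

```python
# Rows from the first slide, read as: association id -> (record, feature ids).
ASSOCIATIONS = {
    4023: ("Bb", {2, 3, 4, 5}),
    4024: ("Aa", {1, 2, 3}),
}


def records_for_feature(feature_id):
    """List, in sorted order, every metadata record associated with a feature."""
    return sorted(
        record
        for record, features in ASSOCIATIONS.values()
        if feature_id in features
    )


def add_association(assoc_id, record, features):
    """Register a new association without touching the existing rows."""
    ASSOCIATIONS[assoc_id] = (record, set(features))
```

Adding the rows `4025 C 6` and `4026 D-b 5` with `add_association` leaves the original rows intact: feature 6 resolves to record C, and feature 5 picks up D-b alongside Bb.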
27 Issues
- Packaging and presentation of linked and supplemental metadata
- Processing of updates to features and propagation of metadata over data lifecycle
- Feasibility in existing data management systems, software, and schema