Title: Metadata
1Metadata
Understanding the Value and Importance of Proper Data Documentation
Michael Moeller, Metadata Specialist
NOAA Coastal Services Center
2What is Metadata?
3What is Metadata?
Simply put, metadata is information about your
data.
4This is the metadata for this.
Author(s): Boullosa, Carmen.
Title(s): They're cows, we're pigs / by Carmen Boullosa
Place: New York Grove Press, 1997.
Physical Descr: viii, 180 p 22 cm.
Subject(s): Pirates Caribbean Area Fiction.
Format: Fiction
While the card-catalog entry is a form of
metadata, it does not address topics such as
quality, accuracy, or scale. Well-written
geospatial metadata describes these and many more
aspects of the data.
5This is the metadata for this.
Emily and Madison
What's Missing?
6A Common, Everyday Example
Entity
Attributes
7a small part of
This is the metadata for this.
Identification_Information
  Citation
    Citation_Information
      Originator: NOAA Coastal Services Center
      Publication_Date: 19971131
      Title: Hurricane Storm Surge
      Geospatial_Data_Presentation_Form: Map
      Publication_Information
        Publication_Place: Charleston, SC
        Publisher: NOAA Coastal Services Center
      Larger_Work_Citation
        Citation_Information
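The hierarchy in this excerpt can be pictured as nested mappings. The following is a minimal sketch, assuming a plain Python dict as the in-memory form. The helper name `to_indented_text` is hypothetical and is not part of any FGDC tool, and the truncated Larger_Work_Citation branch is left out because the slide does not show its contents.

```python
# Hypothetical in-memory form of the citation excerpt. Values are copied
# from the slide; the dict layout itself is an assumption for illustration.
record = {
    "Identification_Information": {
        "Citation": {
            "Citation_Information": {
                "Originator": "NOAA Coastal Services Center",
                "Publication_Date": "19971131",
                "Title": "Hurricane Storm Surge",
                "Geospatial_Data_Presentation_Form": "Map",
                "Publication_Information": {
                    "Publication_Place": "Charleston, SC",
                    "Publisher": "NOAA Coastal Services Center",
                },
            }
        }
    }
}


def to_indented_text(node, depth=0):
    """Render nested elements as indented 'Name: value' lines."""
    lines = []
    pad = "  " * depth
    for name, value in node.items():
        if isinstance(value, dict):
            # Compound element: print its name, then its children one level deeper.
            lines.append(f"{pad}{name}:")
            lines.extend(to_indented_text(value, depth + 1))
        else:
            # Data element: print the name with its value.
            lines.append(f"{pad}{name}: {value}")
    return lines


print("\n".join(to_indented_text(record)))
```

Keeping the record as nested data, rather than free text, is what lets tools check structure and render the same record in several formats.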
8Metadata as a Component of Data
9A Component of Data
Properly documented data provides vital
information to interested parties.
10A Component of Data
Metadata is that component of data which
describes it.
Environmental Sensitivity Index Data
Metadata
11A Component of Data
It's data about a data set.
12A Component of Data
Metadata describes the content, condition, quality, and other
characteristics of the data.
13A Component of Data
Because metadata provides vital information about
a dataset, it should never be viewed or treated
as a separate entity.
14A Component of Data
Take Home Message
Metadata is a critical and integral component
of any complete data set.
15The Value of Metadata
16The Value of Metadata
The Current Concept
- Primary external value
- Discovery
- Assessment
- Access
- Use
17The Value of Metadata
The Current Concept
- Primary internal value
- Inheritance
Properly documenting a data set is the key to
preserving its usefulness through time.
18The Value of Metadata
An Emerging Concept
- A data management tool
- Internal value
- Discovery
- Assessment
- Access
- Use
19The Value of Metadata
Benefits as a data management tool
- Data Currency
- Date of last edit/update
- Age of source files
- Data Utility
- Track source file usage
- Track distribution frequency
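As an illustration of the data currency and data utility points above, here is a minimal sketch that flags data sets whose last update is older than a chosen threshold. The inventory entries, field names, and the 365-day threshold are all hypothetical and are not drawn from any real catalog.

```python
from datetime import date

# Hypothetical inventory entries; field names and values are illustrative only.
inventory = [
    {"title": "Data set A", "last_update": date(2000, 1, 1), "times_distributed": 42},
    {"title": "Data set B", "last_update": date(2001, 6, 1), "times_distributed": 3},
]


def is_stale(last_update, today, max_age_days=365):
    """Return True when the last update is more than max_age_days before today."""
    return (today - last_update).days > max_age_days


def stale_titles(entries, today, max_age_days=365):
    """List the titles of entries whose metadata says they need a review."""
    return [e["title"] for e in entries
            if is_stale(e["last_update"], today, max_age_days)]


print(stale_titles(inventory, today=date(2002, 1, 1)))
```

The same inventory could be sorted by `times_distributed` to see which data sets are requested most often.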
20The Value of Metadata
Benefits as a data management tool
- Monitoring Data Development
- Data processing steps
- Status of development
- Estimate Development Costs
- Data processing time and extent
- Source file availability
21The Value of Metadata
Obstacles to metadata production
Examples include
- Metadata standards are too extensive and difficult to implement.
- Metadata production requires time and other resources.
- Few immediate and tangible benefits, and few incentives to produce
metadata.
22The Value of Metadata
Make metadata part of the process
To realize the full potential of metadata under
this new concept, metadata creation must
become integral to the data development process.
The question is: how?
23The Value of Metadata
Sell it!
Approach metadata development from a business
perspective.
- Preserves data investment
- Limits liability
- Helps manage data resources
- Aids in external data acquisition
- Facilitates data access and transfer
- Provides for efficient data distribution
24The Value of Metadata
Build administrative support
Garner administrative support by stressing the
organizational benefits
- Data archive
- Data assessment
- Data management
- Data discovery
- Data transfer
- Data distribution
25The Value of Metadata
Build technical support
Stress the individual benefits of metadata
- Reduces workload over the long term
- Field fewer data inquiries
- Provides a means of documenting personal contributions
- Facilitates sharing of reliable information
26The Value of Metadata
Build technical support
Develop strong staff support
- Incorporate metadata expectations into job descriptions and
performance standards
- Provide staff development opportunities
- The three Ts: Training, Tools, Time
27The Value of Metadata
Build organizational support
Develop templates to facilitate efficient and
consistent metadata creation
- Identify pertinent fields within the metadata structure
- Populate fixed fields
- Use standardized language
- Define distribution methods
- Cite standards used
- Build source and contact libraries
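The idea of populating fixed fields once can be sketched as follows. This is a minimal illustration, assuming a flat dict of fields. The `TEMPLATE` contents and the `new_record` helper are hypothetical, and a real template would follow the nested structure of the standard.

```python
import copy

# Hypothetical template: fields that stay the same for every record an
# organization produces. The flat layout is a simplification.
TEMPLATE = {
    "Metadata_Standard_Name": "FGDC Content Standard for Digital Geospatial Metadata",
    "Publisher": "NOAA Coastal Services Center",
    "Publication_Place": "Charleston, SC",
}


def new_record(template, **per_dataset_fields):
    """Copy the template, then add the fields that differ per data set."""
    record = copy.deepcopy(template)  # never modify the shared template
    record.update(per_dataset_fields)
    return record


record = new_record(TEMPLATE, Title="Hurricane Storm Surge")
print(record)
```

Copying the template before filling it in keeps the fixed fields consistent across records, which is the point of using a template in the first place.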
28The Value of Metadata
Distribute the effort
- Map metadata fields to the work flow
- Establish and assign responsibilities
- Technicians: lineage
- Analysts: process and methodology
- Field Scientists: accuracy assessments
- I.T. Managers: tools, automated collection methods, information
management
29The Value of Metadata
Establish standard policies
- Mandate the use of standards and templates.
- Develop boilerplate metadata deliverable language for data
contractors.
- Require publication of metadata.
- Create and publish a metadata SOP to document policies and
procedures.
30The FGDC Workbook
31FGDC's Metadata Workbook
Defines the 334 metadata elements.
32What do I use The Workbook for?
- It is the definitive resource for applying the
FGDC Content Standard.
- It provides section and element definitions.
- It describes element domain values, which are
valid values that can be assigned to the data
element.
- However, it does not define the production rules.
33Use the Graphical Representation for quick
access.
- It is a quick reference for production rules
and structure.
- You will still need to use the workbook to
find the definition of a particular element and
its domain.
34Playing the Metadata Game
35Organization of the Content Standard
The Content Standard is organized using numbered
chapters called sections. There are 7 main
sections and 3 supporting sections. Each
section is organized into a series of elements
that define the information content for metadata
to document a set of digital geospatial data.
36Organization of the Content Standard
The Seven Main Sections
1. Identification Information
2. Data Quality Information
3. Spatial Data Organization Information
4. Spatial Reference Information
5. Entity and Attribute Information
6. Distribution Information
7. Metadata Reference Information
37Organization of the Content Standard
Each section begins with the name and definition
of the section. These are followed by the
component elements of the section. Each
section provides the names and definitions of
its component elements, information about the
types of values that can be provided for the
elements, and information about the elements
that are mandatory or repeatable.
38Interpreting the Graphical Production Rules
The workbook uses graphics to illustrate the
production rules of the standard. These graphics
include most of the information provided by the
production rules, including:
- How elements are grouped.
- What is mandatory and what is not.
- What elements can repeat and how many times
they can repeat.
39Interpreting the Graphical Production Rules
Sections are depicted by this symbol.
40Interpreting the Graphical Production Rules
A data element is a logically primitive item of
data. Data elements are the things that you fill
in.
The form for the definition of a data element is:
Data element name -- definition.
Type: (choice of integer, real, text, date, or time)
Domain: (describes valid values that can be assigned)
An example of the definition of a data element is:
Abstract -- a brief narrative summary of the data set.
Type: text
Domain: free text
Note: Data element definitions are contained in
the text of the Content Standard, not in the
graphical production rules.
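The name, definition, type, and domain form of a data element definition maps naturally onto a small record type. The following is a minimal sketch. The `DataElement` class and its `describe` method are hypothetical names chosen for illustration, and only the five type choices come from the slide.

```python
from dataclasses import dataclass

# The five type choices listed on the slide.
ALLOWED_TYPES = {"integer", "real", "text", "date", "time"}


@dataclass(frozen=True)
class DataElement:
    """Hypothetical container for one data element definition."""
    name: str
    definition: str
    type: str
    domain: str

    def __post_init__(self):
        # Reject any type that is not one of the five listed choices.
        if self.type not in ALLOWED_TYPES:
            raise ValueError(f"unknown type: {self.type!r}")

    def describe(self):
        """Render the definition in the 'name -- definition' form."""
        return (f"{self.name} -- {self.definition}. "
                f"Type: {self.type}. Domain: {self.domain}")


abstract = DataElement(
    name="Abstract",
    definition="a brief narrative summary of the data set",
    type="text",
    domain="free text",
)
print(abstract.describe())
```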
41How Elements Are Grouped
Compound elements are composed of other compound
and data elements. The composition is represented
by nested boxes.
Compound Element 1 is composed of Compound
Element 1.1 and Data Element 1.2.
Compound Element 1.1 is composed of Data
Element 1.1.1 and Data Element 1.1.2.
42What's Mandatory? What's Not?
Meaning
Mandatory - must be provided.
43Repeating Elements
- If an element can be repeated independently from other elements, it
will be indicated as such below the element name.
This group of elements would repeat.
Compound Element 1
  Compound Element 1.1
    Data Element 1.1.1
    Data Element 1.1.2
  Data Element 1.2
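The nesting of compound and data elements, and the idea that a whole group can repeat, can be sketched with two small classes. The class names `Data` and `Compound`, the `repeatable` flag, and the `walk` helper are assumptions made for this illustration. The element names are the generic ones used on the slides.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Data:
    """A data element: a primitive item that you fill in."""
    name: str


@dataclass
class Compound:
    """A compound element: a group of compound and data elements."""
    name: str
    children: List[Union["Compound", Data]] = field(default_factory=list)
    repeatable: bool = False  # True when the whole group may repeat


def walk(element, depth=0):
    """Yield (depth, name) pairs in the order the nested boxes are drawn."""
    yield depth, element.name
    if isinstance(element, Compound):
        for child in element.children:
            yield from walk(child, depth + 1)


tree = Compound(
    "Compound Element 1",
    [
        Compound(
            "Compound Element 1.1",
            [Data("Data Element 1.1.1"), Data("Data Element 1.1.2")],
        ),
        Data("Data Element 1.2"),
    ],
    repeatable=True,
)

for depth, name in walk(tree):
    print("  " * depth + name)
```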
44Using the Graphics to Make Decisions
- All elements are colored yellow, so all are
mandatory and must be reported.
Compound Element 1
  Compound Element 1.1
    Data Element 1.1.1
    Data Element 1.1.2
  Data Element 1.2
45Using the Graphics to Make Decisions
- Compound Element 1 is mandatory.
- Compound Element 1.1 is optional.
- If you choose to report it, Data Elements 1.1.1 and 1.1.2
are mandatory.
- If not, do not report Compound Element
1.1, Data Element 1.1.1 or 1.1.2, and skip
to Data Element 1.2.
- Data Element 1.2 is mandatory.
46Using the Graphics to Make Decisions
- Compound Element 1 is mandatory.
- Compound Element 1.1 is mandatory.
- Data Element 1.1.1 is mandatory.
- Data Element 1.1.2 is mandatory if applicable.
- Data Element 1.2 is optional.
47Using the Graphics to Make Decisions
- Compound Element 1 is mandatory if applicable. If
not applicable to the data set, do not report any
elements. If applicable, it is mandatory and:
- Compound Element 1.1 is mandatory.
- Data Element 1.1.1 is mandatory if applicable. If not applicable, do
not report it. If applicable, it is mandatory.
- Data Element 1.1.2 is mandatory.
- Data Element 1.2 is optional.
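The decision process walked through on these slides can be sketched as a short recursive function. This is an illustration under stated assumptions, not the standard's formal production rules. The tuple layout, the obligation labels, and the function name `elements_to_report` are hypothetical. The example tree is the case where Compound Element 1.1 is optional and every other element is mandatory.

```python
def elements_to_report(element, applicable=frozenset(), chosen=frozenset()):
    """Return the names of the elements that should appear in the record.

    element    -- (name, obligation, children) tuple; hypothetical layout
    applicable -- names of 'mandatory if applicable' elements that apply
    chosen     -- names of optional elements the author decides to report
    """
    name, obligation, children = element
    if obligation == "mandatory":
        report = True
    elif obligation == "mandatory_if_applicable":
        report = name in applicable
    elif obligation == "optional":
        report = name in chosen
    else:
        raise ValueError(f"unknown obligation: {obligation!r}")
    if not report:
        # Skipping a compound element also skips everything inside it.
        return []
    names = [name]
    for child in children:
        names.extend(elements_to_report(child, applicable, chosen))
    return names


example = (
    "Compound Element 1", "mandatory", [
        ("Compound Element 1.1", "optional", [
            ("Data Element 1.1.1", "mandatory", []),
            ("Data Element 1.1.2", "mandatory", []),
        ]),
        ("Data Element 1.2", "mandatory", []),
    ],
)

print(elements_to_report(example))
print(elements_to_report(example, chosen={"Compound Element 1.1"}))
```

The key behavior is that an obligation only matters once the parent is reported: mandatory children of an unreported optional compound are skipped along with it.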
49Exercise 1 Using the Green Book
50Metadata as a Data Discovery Tool
51Discovering Data Through Metadata
The FGDC metadata clearinghouse is a
decentralized system of Internet servers you can
use to search for available geospatial data.
Servers housing metadata
Client
FGDC Gateway
52Discovering Data Through Metadata
The descriptive information that fuels the FGDC
clearinghouse is metadata, which is collected in
a standard format to facilitate query and
consistent presentation across the multiple
participating sites.
53A Brief Look at the FGDC Clearinghouse
The FGDC has six gateways to its clearinghouse
system, with access to over 250 spatial data
servers.
www.fgdc.gov/clearinghouse/clearinghouse.html
54A Brief Look at the FGDC Clearinghouse
Searches can be performed by using the NSDI
Search Wizard, or by using a map interface with
place names, or by place names alone.
55A Brief Look at the FGDC Clearinghouse
The new NSDI Smart Select Search Wizard bins
servers by the types of metadata they house.
56A Brief Look at the FGDC Clearinghouse
Searches can be performed using a map interface
that allows the user to define an area of
interest by dragging a box on the map.
57A Brief Look at the FGDC Clearinghouse
The selected area defines the bounding
coordinates that will be used in the search.
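A search by bounding coordinates comes down to testing whether two rectangles overlap. The following is a minimal sketch of that test, not the clearinghouse's actual search code. The `BoundingBox` type, the `intersects` function, and all coordinate values are hypothetical, and the sketch assumes boxes in decimal degrees that do not cross the 180-degree meridian.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    """West, east, south, and north limits in decimal degrees."""
    west: float
    east: float
    south: float
    north: float


def intersects(a, b):
    """True when the two boxes share any area or touch at an edge.

    Assumes neither box crosses the 180-degree meridian.
    """
    return (a.west <= b.east and b.west <= a.east
            and a.south <= b.north and b.south <= a.north)


# Hypothetical area of interest and record extents.
area_of_interest = BoundingBox(west=-82.0, east=-78.0, south=31.0, north=35.0)
coastal_record = BoundingBox(west=-81.0, east=-79.0, south=32.0, north=34.0)
distant_record = BoundingBox(west=-125.0, east=-120.0, south=40.0, north=45.0)

print(intersects(area_of_interest, coastal_record))
print(intersects(area_of_interest, distant_record))
```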
58A Brief Look at the FGDC Clearinghouse
You can search all the servers listed, or you can
select only those that interest you.
59A Brief Look at the FGDC Clearinghouse
Select individual servers of interest to your
search.
60A Brief Look at the FGDC Clearinghouse
Search criteria can be further refined by time
period of content and keywords.
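Refining by keywords and time period can be pictured as two more filters applied to each record. This is a minimal sketch. The record layout, the `matches` function, and the dates are hypothetical, and only the title "Hurricane Storm Surge" appears elsewhere in this presentation.

```python
from datetime import date


def matches(record, keywords=(), start=None, end=None):
    """Return True if the record passes the keyword and time-period filters."""
    if keywords:
        have = {k.lower() for k in record["keywords"]}
        # Keep the record when at least one search keyword is present.
        if not any(k.lower() in have for k in keywords):
            return False
    # Two time periods overlap when each begins before the other ends.
    if start is not None and record["end"] < start:
        return False
    if end is not None and record["begin"] > end:
        return False
    return True


# Hypothetical records; the dates and keywords are illustrative only.
records = [
    {"title": "Hurricane Storm Surge",
     "keywords": ["hurricane", "storm surge"],
     "begin": date(1997, 1, 1), "end": date(1997, 12, 31)},
    {"title": "Hypothetical Land Cover",
     "keywords": ["land cover"],
     "begin": date(1985, 1, 1), "end": date(1990, 12, 31)},
]

hits = [r["title"] for r in records
        if matches(r, keywords=["Hurricane"],
                   start=date(1995, 1, 1), end=date(2000, 1, 1))]
print(hits)
```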
61A Brief Look at the FGDC Clearinghouse
The status of each selected node is displayed as
the search is conducted.
62A Brief Look at the FGDC Clearinghouse
When the search is complete, the status window
lets you know if you were successful in
discovering metadata that matched your search
criteria.
63A Brief Look at the FGDC Clearinghouse
Select a server to see what metadata is
available.
64A Brief Look at the FGDC Clearinghouse
Metadata discovered by the search is shown by
title.
65A Brief Look at the FGDC Clearinghouse
Metadata record returned in HTML format.
66A Brief Look at the FGDC Clearinghouse
67A Brief Look at the FGDC Clearinghouse
The Coastal Information Directory (CID) at the
NOAA Coastal Services Center is similar to the
FGDC gateway interface, but the CID searches
only those spatial data servers that house
metadata of a coastal nature.
www.csc.noaa.gov/CID/
68A Brief Look at the FGDC Clearinghouse
For more information on the clearinghouse system,
visit the FGDC Web site (www.fgdc.gov). Here
you can find information on how to establish your
own clearinghouse node using free Isite
software. On-line tutorials provide assistance
for setting up and configuring this software.
69Exercise 1
Searching for Metadata
- Access the FGDC Website at www.fgdc.gov.
- Click on the Clearinghouse link, and then click on Search for
Geospatial Data.
- Choose a Clearinghouse Gateway.
- Decide how you are going to search.
- Perform a search using the keywords from the exercise Searching for
Metadata.
70Exercise 2
Reading a Metadata File
71Writing Metadata
72It's not THAT bad!
- First records are the hardest.
- Not all fields may need to be filled in.
- Tools are available.
- Training classes can be taken.
- Can often be produced automatically.
- Can (and should) be reviewed for updates.
73Writing Metadata
Before you begin writing, get organized.
74Writing Metadata
Document your data as you go.
75Writing Metadata
Write so others can understand.
76Writing Metadata
Always review your document.
77Items required
FGDC Workbook
Metadata entry tool
Chocolate
Coffee
Sense of Humor!
78Writing Metadata
Keep your readers in mind.
- Write simply but completely.
- Document for a general audience.
- Be consistent in style and terminology.
79Writing Metadata
Keep your readers in mind.
- Clearly state data limitations.
80Writing Metadata
Write a complete title that includes
- What
- Where
- When
- Scale
- Who
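One way to make sure no part of the title is forgotten is to assemble it from its five components. This is a minimal sketch. The `build_title` function and its output layout are hypothetical, and the scale value in the example is invented for illustration.

```python
def build_title(what, where, when, scale, who):
    """Assemble a title and refuse to continue if any part is missing."""
    parts = {"what": what, "where": where, "when": when,
             "scale": scale, "who": who}
    missing = [label for label, value in parts.items() if not value]
    if missing:
        raise ValueError("title is missing: " + ", ".join(missing))
    return f"{what}, {where}, {when}, {scale} ({who})"


# The scale shown here is a made-up example value.
print(build_title(
    what="Hurricane Storm Surge",
    where="Charleston, SC",
    when="1997",
    scale="1:24,000",
    who="NOAA Coastal Services Center",
))
```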
81Writing Metadata
The title is critical in helping others find your
data.
Which is better?
82Writing Metadata
Be specific. Quantify when you can.
Vague: We checked our work and it looks complete.
Specific: We checked our work using 3 separate sets of check
plots reviewed by 2 different people. We
determined our work to be 95% complete based on
these visual inspections.
83Writing Metadata
- Select your key words wisely.
- Use unambiguous words.
- Use descriptive words.
- Fully qualify geographic locations.
84Writing Metadata
Review your final product.
- Have someone else read it.
- If you're the only reviewer, put it away and
read it again later.
- Check for clarity and omissions.
85Writing Metadata
When you review your work, ask
- Can a novice understand what you wrote?
- Are your data properly documented for
posterity?
86Writing Metadata
When you review your work, ask
- Does the documentation present all the
information needed to use or reuse the data?
- Are any pieces missing?
87Writing Metadata
Write so that others will understand.
88Tool Time
Metadata Creation and Validation
89Tool Time
A sample of some of the available tools for
metadata creation, validation, and publication.
- TKME: Text editor used for metadata entry.
- NOAA CSC ArcView Metadata Collector: Extension for ArcView 3.x.
- NOAA CSC MetaScribe: Allows you to create a template record that can
be used to create large numbers of similar records.
- CNS and MP: Chew 'n' Spit, which checks and corrects structural
errors, and Metadata Parser, which checks for errors in element
compliance.
- Commercially available software
90Metadata Entry Tools
TKME - An editor for formal metadata, TKME is
intended to simplify the process of creating
metadata that conform to the standard.
91Metadata Entry Tools
NOAA CSC ArcView Metadata Collector - The
ArcView Metadata Collection Tool was developed
by the National Oceanic and Atmospheric
Administration (NOAA) Coastal Services Center in
ArcView using the Avenue scripting language.
This tool collects and compiles Federal
Geographic Data Committee (FGDC)-compliant
metadata for ARC/INFO coverages, shapefiles,
grids, and supported image formats.
92Metadata Entry Tools
NOAA CSC MetaScribe
93Tool Time
ArcGIS and Metadata
94Tool Time
MP (Metadata Parser) - A compiler to parse formal
metadata, checking the syntax against the FGDC
Content Standard for Digital Geospatial Metadata
and generating output suitable for viewing with a
web browser or text editor.
CNS (Chew 'n' Spit) - A pre-parser for formal
metadata designed to help metadata managers
convert records that cannot be parsed by mp into
records that can be parsed by mp.
95Tool Time
96Tool Time
TKME, CNS, and MP are available as free downloads
from the United States Geological Survey (USGS)
Website (geology.usgs.gov/tools/metadata). TKME
will run from a shortcut on the desktop, but both
MP and CNS must be run from a command line in
MS-DOS or UNIX.
97Finally...
Remember, metadata is an integral component of
your data. Making the metadata process more
streamlined and efficient lets it provide many
benefits at various levels within an organization.
98Michael Moeller Mike.Moeller_at_noaa.gov www.csc.noaa.gov/metadata/
99Writing metadata