Tony Rees - PowerPoint PPT Presentation

Transcript and Presenter's Notes


1
C-squares - a new approach to representing,
querying, displaying and exchanging dataset
spatial extents at the metadata level
Tony Rees, Divisional Data Centre, CSIRO Marine
Research, Australia (Tony.Rees_at_csiro.au)
2
Talk Outline
  • Introduce myself, my agency, our approach to data
    and metadata
  • Review characteristics of metadata, and current
    handling of spatial extents in metadata records
  • Describe limitations of bounding rectangles
    representation for non-rectangular / patchy data
  • The C-squares approach
  • Current c-squares resources / future possibilities

3
Acknowledgements ...
  • CMR staff and colleagues in Australia, Europe and
    USA for helpful discussions
  • WMO and Australian Blue Pages for nomenclature
    for the squares and their subdivisions
  • Miroslaw Ryba (CMR) for programming used in the
    c-squares mapper and search interface
  • David Hastings / NOAA GLOBE Task Team and CSIRO
    Atmospheric Research for images used as base maps
  • Doug Nebert / FGDC for hosting my US visit and
    interest in the system

4
Author/Agency Background
  • From CSIRO Marine Research in Australia (located
    in Hobart, Tasmania, plus 2 other locations;
    c. 300 staff)

5
CMR's Data and Metadata Storage - similar to many
other agencies ...
6
Metadata functions
  • Dataset discovery - by providing a filtered
    subset of all possible records (according to
    user-specified criteria)
  • Dataset description - permits a degree of
    resource appraisal (will this data be what I
    need?)
  • Dataset surrogate - may enable some questions to
    be answered, and/or statistics compiled, without
    need to access the actual data
  • Should also provide access route to the data if
    required (online link or contact point)
  • C-squares assists each of the first three
    points above.

7
Bounding Rectangles Representation - and
overlapping rectangles search method
  • Current metadata systems hold a bounding
    rectangle (bounding box) for each dataset (N, S,
    E, W bounding coordinates)
  • Spatial searching is carried out by an
    overlapping rectangles test

Cases (1) and (2) include the tacit assumption
that the data rectangle is actually filled with
data: all overlaps with the data rectangle are
inferred to be overlaps with the actual data.
8
The California Problem
  • The State of California is a classic (previously
    cited) case where the bounding rectangle is a
    poor fit to the real spatial extent ...

search regions in Nevada, a little of
Arizona, plus offshore Pacific Ocean will all
intersect this data rectangle (false hits)
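
The overlapping rectangles test, and the kind of false hit described for California, can be sketched in a few lines of Python. This is a minimal illustration only, not the search code of any particular metadata system: `BBox` and `overlaps` are hypothetical names, the coordinates are rough illustrative values, and rectangles that cross the 180º meridian are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Bounding rectangle in decimal degrees (south and west are negative)."""
    north: float
    south: float
    east: float
    west: float


def overlaps(a: BBox, b: BBox) -> bool:
    """Overlapping rectangles test: True if the two boxes share any area or edge."""
    return (a.south <= b.north and b.south <= a.north
            and a.west <= b.east and b.west <= a.east)


# Approximate bounding rectangle for California (illustrative values only).
california = BBox(north=42.0, south=32.5, east=-114.1, west=-124.4)

# Hypothetical search region lying wholly inside Nevada.
nevada_search = BBox(north=40.0, south=39.0, east=-116.0, west=-117.0)

# Hypothetical search region well to the north-east of the data rectangle.
distant_search = BBox(north=45.0, south=43.0, east=-100.0, west=-105.0)

# The first search is reported as a hit although it contains no Californian data.
print(overlaps(california, nevada_search))
print(overlaps(california, distant_search))
```

The test can only say that the two rectangles intersect; it has no way of knowing that the part of the data rectangle covering Nevada is empty.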
9
False Hits from Overlapping Rectangles Searches
Potential problems can be deconstructed into 3
contributing ones ...
(a) Filled polygons, but a poor fit to their
bounding rectangle
(b) Multiple discrete polygons
(c) Incompletely filled polygons
10
Consequences of False Hits ...
  • Can get nonsensical results (sea ice at the
    Equator, marine species in the desert)
  • Time / effort wasted accessing inappropriate
    datasets
  • Cannot use resultsets quantitatively, e.g.
  • how many records / species occur in this defined
    region
  • compare content of one defined region with
    another
  • sum the results of consecutive searches
  • etc.

11
Author's Agency Data (typical)
12
C-squares approach
  • gives flexibility to represent a variety of
    dataset shapes, also patchiness (gaps in data
    coverage)

13
Highlighted Squares
can be expressed as a set of codes (labels) in
an ASCII string, e.g. code1 code2
code5 code7 code13 code14 code15
code21 (etc.)
  • List of codes is potentially more succinct
    (concise) than original data
  • codes potentially terse in themselves
  • multiple points in single square only coded once
  • empty cells not coded
  • Now has capability for increased precision of
    querying (on individual square, not bounding
    rectangle)
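
The data reduction listed above (several points in one square coded once, empty cells not coded) can be illustrated with a small sketch. It assumes a plain 1 x 1 degree grid and uses the south-west corner of each cell as a stand-in label, since the actual code notation is only chosen on the following slides; the sample points and the function name `occupied_cells` are made up for illustration.

```python
import math


def occupied_cells(points, cell_size=1.0):
    """Return the sorted list of grid cells that contain at least one point.

    Each cell is labelled by the (lat, lon) of its south-west corner.
    """
    cells = set()
    for lat, lon in points:
        cells.add((math.floor(lat / cell_size) * cell_size,
                   math.floor(lon / cell_size) * cell_size))
    return sorted(cells)


# Five hypothetical observations; three of them fall in the same cell.
points = [(-42.1, 147.3), (-42.5, 147.9), (-42.9, 147.2),
          (-43.4, 147.1), (-40.2, 148.6)]

cells = occupied_cells(points)
print(len(points), "points ->", len(cells), "cells")

# Join the labels into a single ASCII string, one label per occupied cell.
print(" ".join(f"{lat:g},{lon:g}" for lat, lon in cells))
```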

14
What Notation to Use? (choosing a taxonomy of
space)
  • Available coding systems (global grids)
  • Lat/long-based systems
  • 10 x 10 degree squares (WMO squares, Marsden
    Squares)
  • 6 x 4 degree squares (International Map of the
    World)
  • 2 x 1 degree squares (Maidenhead locators)
  • Equal-area systems
  • UTM grids
  • other National or local grids (e.g. US, UK
    national systems; local mapsheet refs)
  • commercial products (e.g. Go2, MapPlanet)
  • Dutton's Quaternary Triangular Mesh (basis for
    MS Encarta)
  • ...Other numeric systems (e.g. postcodes,
    numbered features or zones) - unsuitable because
    of local usage only, and/or lack of scalability

15
Basis for C-squares Codes ...
  • WMO (World Meteorological Organization) 10 x 10
    degree squares chosen as starting point for codes
  • Subsequent subdivisions are base 10 (with
    intermediate base 2 divisions embedded), for
    compatibility with decimal degrees
  • Name: C-squares (Concise Spatial Query and
    Representation System)
  • any square (at any resolution) encoded according
    to this method can also be termed a c-square.

16
WMO 10 x 10 degree squares - Numbering Principle
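(Figure: world map from 180º W to 180º E and 90º N
to 90º S, divided at the Equator and 0º E/W into
four global quadrants that set the first digit of
each square's code: NE = 1xxx, SE = 3xxx,
SW = 5xxx, NW = 7xxx; example square shown: 1817)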
17
WMO 10 x 10 degree squares in practice (examples)
(Maps courtesy R. Curry/WHOI)
18
Basis for Recursive Subdivision (e.g. in NW
global quadrant)
(Principle as used in Australian Blue Pages
metadata system, 1996)
  • 10 x 10 deg. square - e.g. 7307
  • divided as follows (Blue Pages nomenclature)
  • 73074 (5 x 5 deg. square)
  • 7307487 (1 x 1 deg. square)
  • C-squares then extends this principle
    recursively, e.g. ...
  • 73074873 (0.5 x 0.5 deg. square)
  • 7307487393 (0.1 x 0.1 deg. square)
  • etc.

(NB, arrangement is mirror image across 0º
latitude and 0º longitude; 100 is always closest
to the global origin, 499 is furthest away)
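
The subdivision scheme above can be sketched as a small point encoder. This is a minimal illustration under stated assumptions, not any of the existing encoders: the function name `encode_csquare`, the `resolution` strings and the `sep` argument are hypothetical; points exactly on 0º are treated as north/east; and points exactly on 90º or 180º are nudged inward. The published c-squares specification writes the digit groups with colon separators (e.g. 7307:487:393), which `sep=":"` reproduces, while the default gives the codes as they appear in this transcript.

```python
from decimal import Decimal

# Cell sizes named on the slides, coarse to fine (in degrees).
RESOLUTIONS = ("10", "5", "1", "0.5", "0.1")

# Leading digit of the WMO 10 x 10 degree square, keyed by (north?, east?).
QUADRANT = {(True, True): 1, (False, True): 3, (False, False): 5, (True, False): 7}


def encode_csquare(lat, lon, resolution="1", sep=""):
    """Encode a point (decimal degrees) as a c-square code at the given cell size."""
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {RESOLUTIONS}")
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("latitude or longitude out of range")

    # Work on absolute values as exact decimals; nudge the extreme edges inward
    # so that 90 and 180 fall in the outermost squares (assumption of this sketch).
    a_lat = min(abs(Decimal(str(lat))), Decimal("89.99999"))
    a_lon = min(abs(Decimal(str(lon))), Decimal("179.99999"))

    # WMO 10 x 10 degree square: quadrant digit, tens of latitude, tens of longitude.
    quadrant = QUADRANT[(lat >= 0, lon >= 0)]
    groups = [f"{quadrant}{int(a_lat // 10)}{int(a_lon // 10):02d}"]

    r_lat, r_lon = a_lat % 10, a_lon % 10   # remainders within the 10 degree square
    level = RESOLUTIONS.index(resolution)
    while level > 0:
        d_lat, d_lon = int(r_lat), int(r_lon)
        # Intermediate (base 2) digit: 1 is nearest the global origin, 4 furthest.
        group = str(2 * (d_lat >= 5) + (d_lon >= 5) + 1)
        if level >= 2:
            group += f"{d_lat}{d_lon}"      # full base 10 step
        groups.append(group)
        r_lat, r_lon = (r_lat - d_lat) * 10, (r_lon - d_lon) * 10
        level -= 2
    return sep.join(groups)


for res in RESOLUTIONS:
    print(res, encode_csquare(38.93, -77.36, res))
```

For a point at 38.93º N, 77.36º W this reproduces the sequence 7307, 73074, 7307487, 73074873, 7307487393 used on this slide.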
19
Actual Size Examples: 10 x 10, 5 x 5 degree
squares
20
Actual Size Examples: 5 x 5, 1 x 1 degree
squares (1 x 1 degree squares are approx. 110 x
70 km)
Example: within 5 x 5 degree square 73074,
7307487 is bounded by 38º N (code digits 3 and 8)
and 77º W (code digits 07 and 7); 7307487393 would
be bounded by 38.9º N (added digit 9) and 77.3º W
(added digit 3)
21
Actual Size Examples: 0.1 x 0.1 degree squares
(approx. 11 x 7 km)
(Figure: grid of 0.1 x 0.1 degree squares spanning
38.8º to 39.1º latitude and 76.8º to 77.4º
longitude, covering parts of 1 x 1 degree squares
7307486, 7307487, 7307496 and 7307497)
76.8
22
Efficiency via Data Reduction Available ...
  • Global coverage requires up to ...
  • 648 10 x 10 degree squares
  • 64,800 1 x 1 degree squares
  • 259,200 0.5 x 0.5 degree squares
  • To reduce the number of codes required to
    represent large areas without compromising
    resolution, a wildcard notation is permitted,
    e.g.
  • 3414* to indicate 34141 through 34144 (4
    codes)
  • 3414*** to indicate 3414100 through 3414499
    (100 codes)
  • 3414**** to indicate 34141001 through
    34144994 (400 codes)
  • (etc.)
  • Result is similar to a quadtree approach (only
    subdivide as far as necessary, to match varying
    levels of detail required)
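
The wildcard ranges quoted above can be checked with a short expansion routine. This is a sketch under stated assumptions: an asterisk is taken as the wildcard character, one per digit position; wildcards may only follow a complete 10 x 10 degree code or complete three-digit groups; and the names `expand_wildcard`, `TRIPLETS` and `QUADRANTS` are hypothetical.

```python
from itertools import product

# The four intermediate (base 2) digits.
QUADRANTS = ["1", "2", "3", "4"]


def _triplets():
    """All 100 valid three-digit groups, from 100 (nearest origin) to 499."""
    out = []
    for q in range(1, 5):
        lat_half, lon_half = divmod(q - 1, 2)
        for lat_digit in range(5 * lat_half, 5 * lat_half + 5):
            for lon_digit in range(5 * lon_half, 5 * lon_half + 5):
                out.append(f"{q}{lat_digit}{lon_digit}")
    return out


TRIPLETS = _triplets()


def expand_wildcard(code):
    """Expand a code ending in '*' wildcards into the full list of codes."""
    literal = code.rstrip("*")
    stars = len(code) - len(literal)
    if (not literal.isdigit() or len(literal) < 4
            or (len(literal) - 4) % 3 != 0 or stars % 3 == 2):
        raise ValueError(f"unsupported wildcard pattern: {code!r}")
    # Three stars stand for a full triplet, a single trailing star for a quadrant.
    choices = [TRIPLETS] * (stars // 3) + [QUADRANTS] * (stars % 3)
    return ["".join((literal,) + combo) for combo in product(*choices)]


for pattern in ("3414*", "3414***", "3414****"):
    codes = expand_wildcard(pattern)
    print(pattern, len(codes), codes[0], "...", codes[-1])
```

The global totals on this slide follow the same arithmetic: 36 x 18 = 648 squares at 10 degrees, 360 x 180 = 64,800 at 1 degree and 720 x 360 = 259,200 at 0.5 degrees.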

23
Real-world c-squares implementation (example 1)
24
Real-world c-squares implementation (example 2)
603 squares at 0.1 deg. resolution; 7838
characters / 8 Kb
25
Encode - Decode methods
  • Encoders currently available (3 versions)
  • original at CSIRO Marine Research (Oracle PL/SQL)
  • another in use at OBIS, USA (Java)
  • another at FishBase, ICLARM (ColdFusion)
  • source code for all three available via
    c-squares website
  • (all these are for encoding point data)
  • Decoding - not needed for searching (see
    following slide), or for mapping if the c-squares
    mapper is invoked (mapper does the decoding)
  • otherwise, is a very simple algorithm if needed
    (or can do by inspection!)
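
As an indication of how simple decoding can be, the sketch below turns a code back into its bounding coordinates. It is an illustration only, not the algorithm used by the c-squares mapper: the function name `decode_csquare` and the (south, north, west, east) return order are assumptions, validation is minimal, and colon separators are accepted but optional.

```python
from decimal import Decimal


def decode_csquare(code):
    """Return (south, north, west, east) in decimal degrees for a c-square code."""
    digits = code.replace(":", "")
    tail = digits[4:]
    if (not digits.isdigit() or len(digits) < 4 or digits[0] not in "1357"
            or len(tail) % 3 == 2):
        raise ValueError(f"not a valid c-square code: {code!r}")

    lat_sign = 1 if digits[0] in "17" else -1   # 1xxx and 7xxx are north
    lon_sign = 1 if digits[0] in "13" else -1   # 1xxx and 3xxx are east
    lat = Decimal(int(digits[1]) * 10)          # edge nearest the global origin
    lon = Decimal(int(digits[2:4]) * 10)
    size = Decimal(10)

    while tail:
        if len(tail) >= 3:
            # Full triplet: digits 2 and 3 are the next decimal digits of lat, lon.
            size /= 10
            lat += int(tail[1]) * size
            lon += int(tail[2]) * size
            tail = tail[3:]
        else:
            # Single trailing digit (1-4): one quarter of the current square.
            if tail[0] not in "1234":
                raise ValueError(f"not a valid c-square code: {code!r}")
            size /= 2
            half_lat, half_lon = divmod(int(tail[0]) - 1, 2)
            lat += half_lat * size
            lon += half_lon * size
            tail = tail[1:]

    south, north = sorted((lat_sign * lat, lat_sign * (lat + size)))
    west, east = sorted((lon_sign * lon, lon_sign * (lon + size)))
    return float(south), float(north), float(west), float(east)


print(decode_csquare("7307487393"))
```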

26
C-squares search mechanism (behind-the-scenes)
  • Look for a text match between search dataset
    extent (expressed as c-square/s) and c-squares
    string for any dataset, e.g.
  • does 3111499 (or 31114, or 3111) appear
    anywhere in the string

3013497 3111468 3111478 3111479 3111488 3111489
3111499 3112122 3112123 3112131 3112132
(etc.)
  • Advantage 1: needs no special, vector-based
    searching overhead (simple text search)
  • Advantage 2: nested nomenclature means that
    searching can be carried out at any level of the
    hierarchy equal to, or greater than, the encoded
    resolution
  • Advantage 3: search precision is now potentially
    to the level of an individual c-square (much
    better than bounding rectangle).
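
The text-match search can be sketched as follows, using the codes from the example string above. This is an illustration rather than the MarLIN implementation: the function names are hypothetical, codes are assumed to be separated by spaces or "|", and the match is anchored to the start of each code (an assumption added here, because an unanchored substring test could match digits in the middle of an unrelated code).

```python
import re


def parse_codes(extent):
    """Split a dataset's c-squares string into individual codes."""
    return [c for c in re.split(r"[|\s]+", extent.strip()) if c]


def matches(search_code, dataset_extent):
    """True if any dataset code is, or lies within, the search square."""
    return any(code.startswith(search_code) for code in parse_codes(dataset_extent))


extent = ("3013497 3111468 3111478 3111479 3111488 3111489 "
          "3111499 3112122 3112123 3112131 3112132")

# The same dataset is found at 1, 5 and 10 degree search resolution.
for query in ("3111499", "31114", "3111"):
    print(query, matches(query, extent))

# A square with no data is not returned.
print("3113", matches("3113", extent))

# "1349" occurs inside 3013497, so a raw substring test would be misleading.
print("1349", "1349" in extent, matches("1349", extent))
```

Because every finer code begins with the code of its parent square, one prefix comparison serves all coarser search levels.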

27
C-squares search interface (example from CMR's
MarLIN metadata system)
  • Point-and-click user interface, e.g.

28
C-squares Search Result
29
View Metadata Record (initial portion) ...
30
C-squares Search Result (continued)
  • If no c-squares string is held, the search
    defaults to a standard bounding rectangles
    search, with the record returned as a possible
    match, e.g.

(this way, c-squares enabled and non-c-squares
enabled records can co-exist in the same metadata
repository or in distributed searches)
31
C-squares as Explicit Spatial Extent Code/s
  • C-squares can also be quoted explicitly in
    metadata records, or any other web document
    referring to a point or region

32
Can Then Utilize Capabilities of a Standard
Internet Search Engine, e.g.
33
C-squares applicable to a Variety of Data Types,
e.g.
34
Pause to Take Stock ...
  • Light, portable, metadata-friendly system for
    describing a wide variety of dataset footprint
    types
  • Could be expressed as an XML element (e.g.
    <csquares> </csquares>)
  • Codes can be easily derived from lats/longs in
    decimal degrees (and vice versa)
  • Can be used for visualization of dataset spatial
    extents via web link to the c-squares mapper (or
    similar)
  • Amenable to text searching via current text / web
    search technology - no additional hardware or
    software overhead needed
  • Improves reliability of search resultsets, fewer
    or no false hits (results suitable for
    quantitative analysis)
  • Could provide an interoperable nomenclature for
    previously binned data (e.g. into 0.1 x 0.1
    degree cells, etc.)

35
C-squares Potential Uses ...
36
C-squares Potential Uses - continued
Spatially enabled web pages?? (like the dot.geo
concept, but requiring no administrative /
hardware overhead)
37
Strengths / Weaknesses ...
  • Strengths ...
  • C-squares is a concise and flexible method of
    encoding simple to moderately complex forms
  • Encoding/decoding is easy and follows previously
    documented methods; codes are also directly
    related to lats and longs in decimal degrees
  • Spatial searching is a standard text string
    matching operation - already supported by most
    database search applications (and web search
    engines)
  • C-squares mapper utility available via simple
    web call
  • Can be used as adjunct to bounding coordinates
    searches
  • No proprietary software or hardware required to
    implement the system
  • Potentially globally applicable and
    interoperable; equally suitable to marine and
    terrestrial data.

38
Strengths / Weaknesses ...
  • Weaknesses
  • WMO square nomenclature (and its subdivisions) is
    only one of several available (competing?)
    taxonomies of space - further effort may be
    needed to promote it as a common/interoperable
    solution
  • C-squares is not an equal-area system - not
    amenable to rapid computation of areas or
    distances
  • Coding is inefficient near the poles (needs
    larger number of codes for same size areas)
  • Strings can become quite long for large, complex
    regions (e.g. Pacific Ocean) - need to be able
    to incorporate data reduction using wildcard
    method
  • Encoding algorithms not yet developed for line/
    polygon vector data, only for points
  • Method can be ambiguous at boundaries of natural
    features or administrative areas (since these
    will not always coincide neatly with c-square
    boundaries).
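
The equal-area and polar weaknesses can be made concrete with a rough calculation of how the ground area of a 1 x 1 degree cell shrinks with latitude. The sketch assumes a spherical Earth with a mean radius of 6371 km; the function name `cell_area_km2` is hypothetical and the figures are approximations only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation


def cell_area_km2(lat_south, size_deg=1.0):
    """Approximate area of a size_deg x size_deg cell with southern edge at lat_south."""
    phi1 = math.radians(lat_south)
    phi2 = math.radians(lat_south + size_deg)
    # Area of a latitude/longitude cell on a sphere: R^2 * d_lon * (sin(phi2) - sin(phi1)).
    return EARTH_RADIUS_KM ** 2 * math.radians(size_deg) * (math.sin(phi2) - math.sin(phi1))


for lat in (0, 30, 60, 89):
    print(f"1 x 1 degree cell starting at {lat}º: {cell_area_km2(lat):.0f} km2")
```

A cell starting at 60º covers roughly half the area of one at the Equator, so about twice as many codes are needed there to describe a region of the same size.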

39
Resources Currently Available
  • C-squares website: www.marine.csiro.au/csquares/ -
    includes
  • C-squares draft specification and general
    background
  • Sample code for lat/long to c-squares conversion
  • On-line lat/long to c-squares converter
  • How to link to the c-squares mapper
  • Sample presentations, and links to c-squares
    enabled metadata records
  • Abstracts, presentations from 2 conferences (May,
    November 2002)
  • Paper describing c-squares submitted for
    publication in Oceanography, late 2002
    (anticipated publication date March 2003)

40
Some Questions to Consider ...
  • Does the system have value in the context of the
    present audience's needs?
  • Who would be potential users?
  • What mechanisms could / should be utilized to
    promote it?
  • Who might have an interest in further concept /
    system development, if needed?
  • Is there a place for c-squares in formal metadata
    standards?