The Canopy Database Project Tools for Research

Slides: 17
Provided by: judit49
1
The Canopy Database Project Tools for Research
Information Integration
http://canopy.evergreen.edu
  • Judy Cushing, Nalini Nadkarni
  • Mike Finch, Anne Fiala
  • Youngmi Kim, Aaron Crosland and others
  • The Evergreen State College

Collaborating Ecologists
Van Pelt, Bond, Dial, Ishii, Keim, Parker, Shaw, Sillett, Sumida, et al.
Collaborating Computer Scientists
Dave Maier, Lois Delcambre, Travis Brooks (OHSU)
Collaborating LTER Information Managers
Eda, Nicole, Kristin, Ken, Jonathan, James Brunt and others?
NSF CISE and BIO 04-xxx, 03-xxx, 01-31952,
01-9309 99-75510, 9630316, 93-07771
2
Canopy DB Vision
PI and IM use of database technology components
can ease metadata provision, data validation
and archiving, and data mining for synthesis, BUT
  • Researchers aren't programmers.
  • The technology must be easy to use.
  • The technology must increase research productivity.

3
The Underlying Idea: Database Design with Domain-Specific
Components
  • Validate generated databases with rules
  • e.g., Stem
  • depends on study area, plot
  • includes species table

Capitalize on core components for tools:
visualization, metadata provision, data
acquisition and validation, research protocol,
statistical analysis.
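
The rule idea on this slide (Stem depends on study area and plot, and includes a species table) can be illustrated with a small check over a set of selected components. This is only a sketch under stated assumptions: the rule format, the component names as spelled here, and the function `validate_design` are hypothetical and are not taken from DataBank itself. Only the Stem rule comes from the slide.

```python
# Hypothetical rule table. Only the Stem entry reflects the slide;
# the other components are listed with no rules so they are "known".
RULES = {
    "Stem": {"depends_on": ["StudyArea", "Plot"], "includes": ["Species"]},
    "StudyArea": {"depends_on": [], "includes": []},
    "Plot": {"depends_on": [], "includes": []},
    "Species": {"depends_on": [], "includes": []},
}


def validate_design(selected, rules=RULES):
    """Return a list of rule violations for the selected components."""
    chosen = set(selected)
    problems = []
    for name in sorted(chosen):
        rule = rules.get(name)
        if rule is None:
            problems.append(f"{name}: unknown component")
            continue
        # A component is only valid if everything it depends on is present.
        for needed in rule["depends_on"]:
            if needed not in chosen:
                problems.append(f"{name}: missing required component {needed}")
        # Included tables must also be part of the generated database.
        for table in rule["includes"]:
            if table not in chosen:
                problems.append(f"{name}: missing included table {table}")
    return problems
```

Calling `validate_design(["Stem", "Plot"])` reports the missing study area and species table, while a selection that contains all four components returns an empty list.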
4
Approach
  • Pathfinder Projects
  • Ecologists design and carry out field research at
    several sites.
  • Find research, archiving and data mining
    bottlenecks.
  • Determine spatial data structures.
  • Reverse-engineer components.
  • Database Tools for the Field Ecologist
  • Design field databases: DataBank.
  • Visualize data using those databases:
    CanopyView.
  • Lab-specific metadata acquisition.
  • Hand-held (Palm Pilot) field data acquisition.
  • Reality-check with LTER Information Managers.
  • Web Accessible Research Reference -- BCD

5
Research Bottlenecks: Database Technology for
Researcher Productivity
Metadata Generation
  • Archive in Lab (common types)

Data Visualization
Statistical analysis
Data validation (against metadata)
Data and metadata capture
  • Database and Protocol Design
  • Research Reference Tools

Information Synthesis

6
Recent Work
  • Finding and maintaining the components
  • Ecology theory: spatial categorization of the
    canopy
  • Template Editor
  • Refine existing software
  • Template-embedded semantic metadata, carried
    forward
  • DataBank is now stand-alone
  • Generate Excel, as well as Access and other RDBMS
  • New visualizations
  • Collaborate with other eco-informatics projects
  • Closer integration with EML, Morpho
  • LTER IM collaboration: Kaplan, Melendez-Colom,
    Ramsey, Vanderbilt, Walsh.
  • Outreach to computer science community and agencies
  • NSF/USGS/NASA/EPA, JIIS special issue, dg.o

7
Future Work
  • Carry out collaborative field studies
  • Develop and test synthesis hypotheses
  • Develop theoretical constructs on canopy
    structure-function
  • Develop statistical protocols that guide study
    design
  • Create and enhance informatics tools
  • Build theory-based components
  • Build better UIs, data import validation, more
    visualization
  • Build parameterized queries for standard
    statistical scripts
  • Develop better metadata capture and evolution
  • Develop or adapt warehouse interface to other
    tools
  • Field test tools from the get-go
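
The "parameterized queries for standard statistical scripts" item above could look roughly like the following. This is a minimal sketch, not the project's design: the `stem` table, its columns, the sample rows, and the function `stem_diameters` are all made up for illustration, and SQLite (from the Python standard library) stands in for the SQL Server and Access back ends named elsewhere in the slides.

```python
import sqlite3
from statistics import mean


def stem_diameters(conn, plot_id, species_code):
    """Fetch diameters for one plot and species using bound parameters."""
    sql = (
        "SELECT dbh_cm FROM stem "
        "WHERE plot_id = ? AND species_code = ? "
        "ORDER BY stem_id"
    )
    # The values are bound by the driver, never pasted into the SQL text.
    return [row[0] for row in conn.execute(sql, (plot_id, species_code))]


def demo():
    """Build a tiny in-memory table and run the query for one plot/species."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stem ("
        "stem_id INTEGER PRIMARY KEY, plot_id TEXT, "
        "species_code TEXT, dbh_cm REAL)"
    )
    # Made-up sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO stem (plot_id, species_code, dbh_cm) VALUES (?, ?, ?)",
        [("P1", "PSME", 80.0), ("P1", "PSME", 120.0),
         ("P1", "TSHE", 45.0), ("P2", "PSME", 60.0)],
    )
    values = stem_diameters(conn, "P1", "PSME")
    conn.close()
    return values, mean(values)
```

The same query text can then be reused by any analysis script that supplies a plot and a species, which is the point of parameterizing it.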

8
How DataBank Works
Mike Finch
9
Research Bottlenecks: Database Technology
Research Productivity Gain
EML Generation
CanopyView
  • DataBank Database Generator
  • BCD

10
Conclusions
  • Database design is a complex web app
  • Sociological aspects are important
  • Proprietary data
  • Technology adoption
  • Integrative ecology is new
  • Defining an intuitive, adequate set of templates is
    hard
  • Spatial is special.
  • Visualization is cool.

11
DataBank Workflow
Database Components
shopping cart
DB design
Empty DB
convert SQL MSSQL MSAccess
12
DataBank Software Architecture
Internet Browser (IE 5, Netscape 6)
Web Server (Apache)
Access Field DB
Enhydra (Middleware)
Databank Backend (Java)
Viz Toolkit, JDK
DB SQL Server
13
Canopy DataBank
  • What is it?
  • End-user database design with components (aka
    templates)
  • Variable- and table-level metadata inherent
  • Study-level metadata available from the BCD
  • Technology
  • HTML, Java, Enhydra, SQL Server, Access, JTK
  • Aim to produce XML/EML for exchange and archive
  • Status
  • Some templates (mostly spatial tree structure)
  • About 5 field studies
  • Some visualization
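
Because variable- and table-level metadata are described as inherent in the templates, a data row can in principle be checked against them before it is stored. The sketch below illustrates that idea only; the metadata layout, the variable names, the ranges, and the function `validate_row` are assumptions made for this example, not DataBank's actual format.

```python
# Hypothetical variable-level metadata for one table (names and
# ranges are illustrative, not taken from any DataBank template).
VARIABLE_METADATA = {
    "stem_id": {"type": int, "required": True},
    "dbh_cm": {"type": float, "required": True, "min": 0.0, "max": 500.0},
    "height_m": {"type": float, "required": False, "min": 0.0, "max": 120.0},
}


def validate_row(row, metadata=VARIABLE_METADATA):
    """Return a list of problems found when checking a row against metadata."""
    errors = []
    for name, meta in metadata.items():
        value = row.get(name)
        if value is None:
            if meta["required"]:
                errors.append(f"{name}: required value is missing")
            continue
        if not isinstance(value, meta["type"]):
            errors.append(f"{name}: expected {meta['type'].__name__}")
            continue
        if "min" in meta and value < meta["min"]:
            errors.append(f"{name}: value {value} is below minimum {meta['min']}")
        if "max" in meta and value > meta["max"]:
            errors.append(f"{name}: value {value} is above maximum {meta['max']}")
    # Columns that the metadata does not describe are also reported.
    for name in row:
        if name not in metadata:
            errors.append(f"{name}: not described in metadata")
    return errors
```

For example, a row with a negative `dbh_cm` yields a single "below minimum" message, and a row that satisfies every constraint yields an empty list.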

14
DataBank Architecture (workflow)
template.xml descr.xml pic.gif bigpc.gif
shopping cart
DB design
Empty DB
TEOF internal object representation
TDM convert SQL MSSQL MSAccess
schema element dependencies entities
observation attributes
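
The workflow on this slide (template file, internal object representation, conversion to SQL, empty database) can be sketched end to end as follows. Everything concrete here is an assumption for illustration: the XML layout of the template, the `Entity` and `Attribute` classes, and the function names are hypothetical, and SQLite stands in for the MSSQL and MS Access targets named on the slide.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical template layout; the real template.xml format is not
# shown in the slides.
TEMPLATE_XML = """
<template name="stem">
  <attribute name="stem_id" type="INTEGER" key="true"/>
  <attribute name="plot_id" type="TEXT"/>
  <attribute name="dbh_cm" type="REAL"/>
</template>
"""


@dataclass
class Attribute:
    name: str
    sql_type: str
    is_key: bool


@dataclass
class Entity:
    name: str
    attributes: list


def parse_template(xml_text):
    """Turn template XML into an in-memory entity description."""
    root = ET.fromstring(xml_text)
    attributes = [
        Attribute(a.get("name"), a.get("type"), a.get("key") == "true")
        for a in root.findall("attribute")
    ]
    return Entity(root.get("name"), attributes)


def to_create_table(entity):
    """Render one entity as a CREATE TABLE statement."""
    columns = []
    for attr in entity.attributes:
        column = f"{attr.name} {attr.sql_type}"
        if attr.is_key:
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {entity.name} ({', '.join(columns)})"


def build_empty_db(xml_text):
    """Create an empty in-memory database from a template."""
    conn = sqlite3.connect(":memory:")
    conn.execute(to_create_table(parse_template(xml_text)))
    return conn
```

Keeping the parsed entity separate from the SQL rendering mirrors the split between an internal representation and a conversion step, so a second renderer for another database dialect could reuse the same parsed objects.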
15
Next Steps
  • XML/EML for data exchange
  • Outreach to CS community
  • VLDB Panel on Ecosystem Informatics (August)
  • NSF BDEI PIs Meetings Forum (May, Nov)
  • Further define and support spatial data structures
    -- additional collaborator(s)?
  • Visualization (!!!)

16
Discussion
  • Are we on the right track with visualization?
  • What off-the-shelf viz. tools are available?
  • Who might consult with us on visualization?
  • How about spatial scaling?
  • How to refine our spatial categorization scheme?
  • What collaborators (data sets) should we seek?
  • How is modeling linked to visualization?
  • Comments about DataBank?