GLOBAL BIODIVERSITY - PowerPoint PPT Presentation

About This Presentation
Title:

GLOBAL BIODIVERSITY

Description:

I intend to draw attention to a problem for users with some GBIF data ... Andorra. Italy. India. Chordata. Chordata. Organisation of biodiversity data: By taxonomy ... – PowerPoint PPT presentation

Slides: 26
Provided by: lspe5

Transcript and Presenter's Notes

Title: GLOBAL BIODIVERSITY


1
GLOBAL BIODIVERSITY
INFORMATION FACILITY
Developing Uncertainty Measures Related to
Taxonomic Determinations
Larry Speers, Global Biodiversity Information Facility
Arthur Chapman, Australian Biodiversity Information Services
WWW.GBIF.ORG

2
Disclaimers
  • I intend to draw attention to a problem for users
    with some GBIF data
  • I do not intend to present any finalized
    recommendations as to how to deal with this issue
  • I hope to initiate a broader discussion as to
    possible solutions and I will present an example
    solution to initiate this discussion

3
http://www.gbif.org/prog/digit/data_quality/data_quality
4
http://www.gbif.org/prog/digit/data_quality/data_cleaning
5
Issues with QA/QC
  • Legacy Data
  • Need to deal with what we have
  • Data cleaning tools
  • New data
  • Do everything in our power to avoid the problems
    we find with today's legacy data

6
Data Quality
Quality, as applied to data, has various
definitions, but in the geographic world one
definition is now largely accepted: that of
fitness for use (Chrisman 1983).
7
Fitness for Use
In a database, the data have no actual quality or
value; they only have potential value. That value
is realized only when someone uses the data to do
something useful (English 1999). The quality of
data cannot be assessed independently of the
users of that data (Strong et al. 1997).
8
What do we mean by fitness for use?
  • Fitness for use
  • Does species x occur in Tasmania?
  • Does species x occur in National Park y?

[Diagram, compliments of Arthur Chapman]
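The contrast between the two questions above can be sketched in code. This is a minimal illustration only: the `Occurrence` class, the `fit_for_use` rule, and all the distances are hypothetical assumptions, not part of the presentation or of any GBIF tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """A georeferenced record; the field names are hypothetical."""
    species: str
    uncertainty_km: float  # radius within which the true locality lies


def fit_for_use(record: Occurrence, region_extent_km: float) -> bool:
    """Crude rule of thumb (an assumption, not a published method):
    a record can help answer "does the species occur in this region?"
    only if its positional uncertainty is smaller than the region."""
    return record.uncertainty_km < region_extent_km


# Hypothetical record with a 25 km positional uncertainty.
record = Occurrence(species="species x", uncertainty_km=25.0)

print(fit_for_use(record, region_extent_km=300.0))  # a large island
print(fit_for_use(record, region_extent_km=10.0))   # a small park
```

The same record is usable for the coarse question and unusable for the fine one, which is the sense in which quality depends on the intended use.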
9
Fitness for use
Data are of high quality if they are fit for
their intended use in operations, decision-making,
and planning. (Juran 1964)
10
Exploring biodiversity data
  • Organisation of biodiversity data
  • By taxonomy
  • By geography
  • By time

[Figure: the three axes along which biodiversity data are organised]
Time: 2006, 2000, 1950, 1900, 1800, 500
Geography: Europe (Italy, Belgium, Andorra); Africa (Congo, Benin, Angola); Asia (India, China, Bangladesh)
Taxonomy: Animalia (Chordata, Annelida, Arthropoda); Fungi (Ascomycota, Basidiomycota); Plantae (Coniferophyta, Equisetophyta)
11
(No Transcript)
12
J. Wieczorek et al., Int. J. Geographical
Information Science, Vol. 18, No. 8, December
2004, 745-767
13
Arthur D. Chapman et al. 2006
14
Exploring biodiversity data
  • Organisation of biodiversity data
  • By taxonomy
  • By geography
  • By time

[Figure as on slide 10: Time, Geography and Taxonomy axes]
15
Documenting Fitness for Use
  • In general, error must not be treated as a
    potentially embarrassing inconvenience, because
    error or uncertainty provides a critical component
    in judging fitness for use.

16
(No Transcript)
17
Problem: Misidentification
During the revision of Euscelidia, a frightening
proportion of the borrowed determined material
was found to be misidentified (62-73%), and a
literature search in a BIOSIS Previews revealed
that the problem is widespread.
Meier & Dikow, Conservation Biology, Volume 18,
No. 2, April 2004, pages 478-488
18
Problem: Misidentification
For example, of the 1522 rove beetle specimens
(Staphylinidae: Coleoptera) in the Struve
collection, 262 (17%) were misidentified (Rose
2000), and Papp (1978) reports that for a
collection of Hungarian Lauxaniidae (Diptera) 28
of the 74 species determined and labeled by
Szilády were consistently misidentified.
Meier & Dikow, Conservation Biology, Volume 18,
No. 2, April 2004, pages 478-488
19
Problem: Use of Invalid Names
In Euscelidia, 13% of all borrowed specimens were
classified under an incorrect name, and for a
recent inventory of palm collections in botanical
gardens, 260 (22%) of the submitted 1208 names
were synonyms and 46 (4%) were invalid (Maunder
et al. 2001).
Meier & Dikow, Conservation Biology, Volume 18,
No. 2, April 2004, pages 478-488
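The bracketed figures quoted from Rose (2000) and Maunder et al. (2001) read as percentages of the stated totals. A short check, using only the counts given on the slides (the helper name `percent` is just for illustration), recomputes them:

```python
def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Counts and totals as quoted on the slides.
rove_beetles_misidentified = percent(262, 1522)  # Struve collection
palm_names_synonyms = percent(260, 1208)         # palm inventory
palm_names_invalid = percent(46, 1208)           # palm inventory

print(rove_beetles_misidentified, palm_names_synonyms, palm_names_invalid)
# prints: 17 22 4
```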
20
Exploring biodiversity data
  • Organisation of biodiversity data
  • By taxonomy
  • By geography
  • By time

[Figure as on slide 10: Time, Geography and Taxonomy axes]
21
Documenting Taxonomic Determinations
  • Several methods exist for documenting taxonomic
    determinations - none are completely satisfactory
  • Herbarium Information Standards and Protocols for
    the Interchange of Data (HISPID)
  • Australian National Fish Collection (1993)
  • Several others restricted to one or two
    institutions
  • Proposal: four levels
  • Who determined the specimen and when
  • What was the determination based on (type
    specimen, local flora, monograph, etc.)
  • Level of expertise of the determiner
  • What confidence did the determiner have in the
    determination.
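One way to picture the proposal is as a record attached to each specimen, with one field per item listed above. The sketch below is an assumption for discussion, not a schema from HISPID, GBIF, or the presentation; the class name, field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Determination:
    """Hypothetical record of one taxonomic determination."""
    determiner: Optional[str] = None        # who determined the specimen
    determined_on: Optional[date] = None    # and when
    basis: Optional[str] = None             # e.g. type specimen, local flora
    expertise: Optional[str] = None         # level of expertise of the determiner
    confidence: Optional[str] = None        # determiner's own confidence

    def missing(self) -> List[str]:
        """Names of the fields that were never documented."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# A made-up legacy record that documents only who and when.
legacy = Determination(determiner="A. Collector", determined_on=date(1950, 6, 1))
print(legacy.missing())  # ['basis', 'expertise', 'confidence']
```

Keeping every field optional reflects the legacy-data situation: older records can be loaded as they are, and the gaps stay visible instead of being filled with guesses.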

22
Taxon Verification Status - proposed
Name of determiner
From Chapman (2005) Principles of Data Quality.
GBIF
23
Issues with QA/QC
  • Legacy Data
  • Need to deal with what we have
  • Data cleaning tools
  • New data
  • Do everything in our power to avoid the problems
    we find with today's legacy data

24
Taxon Verification Status - proposed
Name of determiner
Date of determination
Basis of determination (e.g. compared with holotype,
used national flora)
  • identified by World expert in the taxon with high
    certainty
  • identified by World expert in the taxon with
    reasonable certainty
  • identified by World expert in the taxon with some
    doubt
  • identified by regional expert in the taxon with
    high certainty
  • identified by regional expert in the taxon with
    reasonable certainty
  • identified by regional expert in the taxon with
    some doubt
  • identified by non-expert in the taxon with high
    certainty
  • identified by non-expert in the taxon with
    reasonable certainty
  • identified by non-expert in the taxon with some
    doubt
  • identified by the collector with high certainty
  • identified by the collector with reasonable
    certainty
  • identified by the collector with some doubt.

From Chapman (2005) Principles of Data Quality.
GBIF
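The twelve statuses listed above are the combinations of four determiner categories and three certainty levels. A minimal sketch of that structure follows; the enum and function names are hypothetical, and only the wording of the labels is taken from the slide.

```python
from enum import Enum
from itertools import product


class Expertise(Enum):
    """Who made the determination (wording from the slide)."""
    WORLD_EXPERT = "World expert in the taxon"
    REGIONAL_EXPERT = "regional expert in the taxon"
    NON_EXPERT = "non-expert in the taxon"
    COLLECTOR = "the collector"


class Certainty(Enum):
    """How sure the determiner was (wording from the slide)."""
    HIGH = "high certainty"
    REASONABLE = "reasonable certainty"
    SOME_DOUBT = "some doubt"


def status_label(expertise: Expertise, certainty: Certainty) -> str:
    """Compose one verification-status label from its two parts."""
    return f"identified by {expertise.value} with {certainty.value}"


# Enum classes iterate in definition order, so this follows the slide's order.
ALL_STATUSES = [status_label(e, c) for e, c in product(Expertise, Certainty)]

print(len(ALL_STATUSES))  # 12
print(ALL_STATUSES[0])
```

Storing the two components separately, instead of one of twelve free-text strings, would let a user filter on expertise or on certainty independently.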
25
Where does this discussion fit within the TDWG
process?