Title: Digitizing Tool For Jane Goodall
1. Digitizing Tool for Jane Goodall's Chimpanzee
Project
2. Overview
- Background
- Motivation
- Problem Definition
- Related Work and its limitations
- Our contribution
- Details of the software
- Error Reduction Techniques
- Future Work
- Questions and Comments
3. Background
- In 1960 Jane Goodall began her research, and in
1977 she founded the Jane Goodall Institute for
Wildlife Research, Education and Conservation.
- Accolades
- - U.N. Messenger of Peace (2002)
- - Gandhi/King Award for Nonviolence (2001)
- - and many more
- Author of several books, including the best seller
Reason for Hope: A Spiritual Journey.
4. Background
- Jane Goodall Institute's Center for Primate
Studies at the University of Minnesota
(http://www.discoverchimpanzees.org).
- Some of the projects undertaken
- - Female dispersal and inbreeding avoidance
- - Sex differences in diet
- - Group and individual ranging patterns
- - A study of social relationships between
females
5. Motivation
- Mission of the center
- - Preserve, organize and digitize all the paper
data. Collect and digitize slides, black and
white photographs, and video of the Gombe
chimps.
- - Create a relational database of all of these
materials.
- - Analyze this data to further our knowledge
about the complex lives of chimpanzees.
6. Problem Definition
Given: Paper sheets used to record data at
Gombe.
Objective: To write a program that takes
scanned images of the sheets as input and
provides an easy and effective user interface to
digitize the data.
Constraints: Microsoft Access as the back-end
database. Data should be entered in the existing
tables in the right format.
7. Related Work
- Direct Entry: The user had to read values
from the paper sheet and key them into the
database tables.
- MS Access Forms: Forms were written in Access.
The user would read values from the sheet and
fill in the form elements.
- Digitizing Tablet (Calcomp): In conjunction with
a digitizing puck, this hardware device
connected to the computer's serial port and
digitized the data using a batch program.
8. Limitations of Related Work
- Resources are neither cheap nor commonly available.
- Equipment has compatibility issues with the latest
versions of operating systems.
- Data entry takes a long time.
- The digitization process is error-prone.
- Only one person can use the resources
and digitize at a time.
9. Our Contribution
- Port the digitization process to commonly available
software (Java Swing) and hardware (scanner)
resources.
- Design and implement software for image
calibration to allow for digitization of multiple
types of sheet.
- Design and implement software for image
digitization of the scanned sheets.
- Provide features to facilitate validation of
calibrated data.
- Provide algorithms and techniques to facilitate
validation of digitized data.
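10. New Setup
[Flow diagram: "Is it a new type?" If yes, the sheet
type goes through the Calibrator, which produces
Calibrated Data. If no, the sheets go directly to the
Digitizer, which uses the Calibrated Data and produces
Digitized Data.]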
11. Calibrator
- Step 1: Scan the sheet type
- Step 2: Run the Calibrator program and select the
scanned sheet type
- Step 3: Calibrate the sheet
- Output: Calibrated Data
12. Step 1: Scan the sheet type
- A sheet type defines the prototype, or reference,
for the actual sheets that are to be digitized.
- It changes when chimps die or new chimps are
born.
- A sheet type changes about once a year.
- Multiple prototypes or references can be stored
at the same time. Each one is identified by
a unique reference name given by the user.
13. Step 2: Select scanned sheet type
- Run the Calibrator program
- On the first screen:
- - Select the scanned sheet type that will be used
to define the reference or prototype.
- - Give the sheet type a unique reference name.
Error checking is done for duplicate reference
names.
- - Delete an existing reference if it is no longer
used (optional).
14. Step 3: Calibrate the sheet
- On the second screen:
- - Starting from the leftmost column, mark the four
corners of each column using the buttons at
the top of the screen.
- - If the corners are marked in clockwise
order, button clicks are not needed.
- - Select the type of column. If it is a chimp
column, type the initials of the chimp.
- - Press Next to mark the next column, or Done if
all the columns have been marked.
15. Calibrated Data
- The following data is stored for each column:
- - Reference Name
- - Type of column
- - Chimp Name (if applicable)
- - Dimension Information
- Row information is not calibrated because it
does not change between sheet types.
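The per-column record listed above could be modeled roughly as follows. This is only an illustrative sketch: the class name, the field names, the `"chimp"` type label, and the choice to store the dimension information as four corner points are assumptions, not the project's actual schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class CalibratedColumn:
    """One calibrated column of a reference sheet (hypothetical layout)."""
    reference_name: str            # unique name of the sheet type
    column_type: str               # e.g. "chimp" (label is an assumption)
    chimp_name: Optional[str]      # initials, only for chimp columns
    corners: Tuple[Point, Point, Point, Point]  # assumed form of "dimension information"

    def __post_init__(self):
        # A chimp column without initials would be an incomplete record.
        if self.column_type == "chimp" and not self.chimp_name:
            raise ValueError("a chimp column needs the chimp's initials")
```

Rows are left out of the record, matching the note that row information is not calibrated.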
17. Digitizer
- Step A: Scan the sheets to be digitized
- Step B: Run the Digitizer and select the scanned sheets
- Step C: Select a reference sheet and calibrate the
current sheet(s), using the Calibrated Data
- Step D: Focal information entry screen
- Step E: Follow arrival entry screen
- Step F: Food information entry screen
- Step G: Other species entry screen
- Output: Digitized Data
18. Step A: Scan the sheets
- Typically two sheets are recorded per day
for a chimp:
- - Asabuhi (Morning) time sheet
- - Jioni (Evening) time sheet
- There might be just one of the above sheets. The
software allows digitization of a single sheet.
- Scanned sheets can be of any size and may be tilted.
19. Step B: Select the scanned sheets
- Run the Digitizer program
- On the first screen:
- - Select the scanned sheet(s) to be
digitized using the two buttons at the top of the
screen.
- - If only one sheet was recorded for the day, press
either of the two buttons and select the scanned
sheet.
20. Step C: Calibrate current sheet(s)
- Calibrate the current sheet(s) by marking the four
corners using the four buttons at the top of the
screen.
- Select a reference sheet from the drop-down menu.
- The skeleton of the entire sheet is redrawn using
the reference sheet information for confirmation.
- If only one sheet was recorded for the day, the
time of day, Asabuhi or Jioni (Morning or Evening),
can be selected.
21. Step D: Focal information screen
- Here the information about the target chimp, date,
observer, map recorder, start/end map numbers
and times, and follow start time is entered.
- With the above information, this screen writes a
record to the Follow table.
- An existing set of records for one set of
readings can be deleted on this screen. This is
useful if partial or incorrect data has been
entered for a set of sheets.
22. Step E: Follow arrival screen
- Here the information for all the chimps being
followed along with the target chimp is recorded
and written to the Follow Arrival table.
- Two consecutive mouse clicks record the start/end
times and map numbers for a chimp. Other fields
are automatically populated.
- Certainty 0 and 1 are differentiated using
left and right mouse clicks, respectively.
- Multiple sequences for the same chimp can be entered.
23. Step F and G: Food / Other species screen
- After the follow arrival screen, a menu screen is
presented with options to go to the Food screen, the
Other species screen, or exit the program.
- The Food screen records the food name and food part.
The normalized food name is filled in automatically by
comparing the food name on the screen with existing
variants in the database.
- The Other species screen provides a drop-down menu of
species names to select from.
- The start/end times and map numbers for both
screens are recorded in the same way as on the follow
arrival screen.
- Data is stored in the Food and Other Species tables.
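The automatic fill of the normalized food name could work along the following lines. This is a minimal sketch assuming the known variants are available as a mapping from a spelling variant to its normalized name. The function name, the case- and whitespace-insensitive comparison, and the sample entries are hypothetical; the slides do not describe the actual matching rule.

```python
def normalize_food_name(entered, variants):
    """Return the normalized name for an entered food name, or None.

    `variants` maps lower-case spelling variants to normalized names
    (assumed to be loaded from the database beforehand).
    """
    # Compare case-insensitively and ignore extra whitespace (an assumption).
    key = " ".join(entered.lower().split())
    return variants.get(key)


# Hypothetical variant table, for illustration only.
example_variants = {
    "food a": "Food A",
    "food-a": "Food A",
    "fooda": "Food A",
}
```

Returning `None` for an unknown variant leaves it to the form to ask the user for the normalized name rather than guessing one.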
24. Food / Other Species Screens
25. Map Numbers
- The start map number and time are taken from the
first screen.
- If the map numbers are not sequential, a Map Number
screen is presented after the Focal Information screen.
- A map number file is generated containing records
with the focal chimp ID, date, times and
corresponding map numbers.
- Map numbers are retrieved from the file for each
follow arrival, food and other species entry.
27. Validation of Calibrated Data
- Done by redrawing the skeleton of the current
sheet using the selected reference data.
- The main considerations are varying sheet sizes and
varying tilts.
- Approach
- - Find the scaling ratio.
- - Get the individual reference column widths and
heights and multiply them by the scaling ratio.
- - Calculate and distribute the tilts.
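The scaling part of this approach can be illustrated with a short sketch. It assumes separate horizontal and vertical ratios, taken as the current sheet size divided by the reference sheet size; the slides mention only "the scaling ratio", so this split, along with the `Column` class and function name, is an assumption. Tilt distribution is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Column:
    name: str
    width: float
    height: float


def scale_columns(reference: List[Column],
                  ref_sheet_size: Tuple[float, float],
                  cur_sheet_size: Tuple[float, float]) -> List[Column]:
    """Scale reference column dimensions to the size of the current sheet."""
    ref_w, ref_h = ref_sheet_size
    cur_w, cur_h = cur_sheet_size
    # Scaling ratios: current sheet size relative to the reference sheet.
    sx = cur_w / ref_w
    sy = cur_h / ref_h
    return [Column(c.name, c.width * sx, c.height * sy) for c in reference]
```

For example, a reference sheet of 1000 x 800 pixels rescanned at 500 x 400 gives ratios of 0.5, so every reference column is redrawn at half its width and height.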
28. Validation of Calibrated Data
[Screenshot: the green lines show the skeleton of the
redrawn sheet.]
29. Validation of Digitized Data
- Done when the mouse is clicked for automatic
recording of time and map number information.
- Using the X and Y coordinates of the click, a sheet
boundary check is done and the column information is
retrieved from the database.
- If it is not the right column (for example, the mouse
was clicked in a column adjacent to the one of
interest), an error message pops up and the data
is not recorded.
- Extensive form error checking is provided to
facilitate validation of digitized data.
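The click check described above might look roughly like this. It is a sketch under simplifying assumptions: columns are treated as axis-aligned rectangles held in memory rather than queried from the database, and `ColumnBox`, `find_column` and `validate_click` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ColumnBox:
    column_type: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        # Half-open on the right so adjacent columns never both match.
        return self.x_min <= x < self.x_max and self.y_min <= y < self.y_max


def find_column(columns: List[ColumnBox], x: float, y: float) -> Optional[ColumnBox]:
    """Return the column containing the click, or None if it is off the sheet."""
    for col in columns:
        if col.contains(x, y):
            return col
    return None


def validate_click(columns: List[ColumnBox], x: float, y: float,
                   expected_type: str) -> Tuple[bool, str]:
    """Accept the click only if it falls in a column of the expected type."""
    col = find_column(columns, x, y)
    if col is None:
        return False, "Click is outside the sheet boundary."
    if col.column_type != expected_type:
        return False, f"Expected the {expected_type} column, got {col.column_type}."
    return True, ""
```

When the first element of the result is `False`, the caller would show the message and skip recording the data.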
30. Validation of Digitized Data
[Screenshot: an error message displayed when the mouse
was clicked outside the Map Number column for a Map Time
reading.]
31. Errors
- Types of Errors
- - Scanner errors
- - - Cylindrical Distortion (Tilts)
- - - Mechanical Distortion (Sheet Crumpling)
- - Rounding errors
- - Human errors
- - - Errors when calibrating a sheet type
- - - Errors when drawing the paper sheet
- Effects
- - Errors in Column Type Validation
- - Errors in Exact Time Calculation
32. Reducing Errors in Column Validation
- To improve the validation of column type, and
hence facilitate validation of calibrated data,
errors caused by tilted scans need to be
reduced.
- One possible way is to rotate the sheet
coordinates by the tilt angle when storing them in
the database.
- Each time the mouse is clicked for data entry, the
click coordinates need to be rotated by the tilt
angle, and the database can then be queried for
column type information.
- The disadvantage is that the coordinates need to be
rotated even for small tilts.
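Rotating a click by the tilt angle is a standard 2D rotation. The sketch below assumes the tilt is estimated from the marked top-left and top-right corners of the sheet and that the rotation is taken about the top-left corner; the slides do not say how the angle is obtained, and the function names are hypothetical.

```python
import math


def tilt_angle(top_left, top_right):
    """Angle (radians) of the sheet's top edge relative to the x axis."""
    return math.atan2(top_right[1] - top_left[1], top_right[0] - top_left[0])


def rotate_point(point, angle, origin=(0.0, 0.0)):
    """Rotate `point` by `angle` radians about `origin`."""
    dx = point[0] - origin[0]
    dy = point[1] - origin[1]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return (origin[0] + dx * cos_a - dy * sin_a,
            origin[1] + dx * sin_a + dy * cos_a)


def deskew_click(click, top_left, top_right):
    """Map a click on the tilted scan into untilted sheet coordinates."""
    return rotate_point(click, -tilt_angle(top_left, top_right), origin=top_left)
```

With this in place, the stored column coordinates and every click go through the same transform before the column lookup, which is the per-click cost noted in the last bullet.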
33. Reducing Errors in Column Validation
[Figure labels: Tilted Sheet, MOBR, Common Region]
- Another solution is to use a Filter and Refine
strategy.
- Here the MOBR (Minimum Orthogonal Bounding Rectangle)
is calculated at each mouse click. If there is an
overlap, refining is done to select the correct
column of the two.
- As the tilt decreases, the need for refining
decreases, so the overhead is low when tilts are small.
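A filter-and-refine lookup of this kind can be sketched as follows. The assumptions here are that each column is stored as four corner points in order around the quadrilateral, and that the refine step is an exact point-in-convex-polygon test; the slides do not specify the refine test, and the function names are hypothetical.

```python
def mobr(corners):
    """Minimum orthogonal bounding rectangle of a column's corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def in_mobr(box, point):
    x_min, y_min, x_max, y_max = box
    return x_min <= point[0] <= x_max and y_min <= point[1] <= y_max


def in_convex_quad(corners, point):
    """Exact test: the point lies on the same side of every edge."""
    sign = 0
    n = len(corners)
    for i in range(n):
        ax, ay = corners[i]
        bx, by = corners[(i + 1) % n]
        cross = (bx - ax) * (point[1] - ay) - (by - ay) * (point[0] - ax)
        if cross != 0:
            if sign == 0:
                sign = 1 if cross > 0 else -1
            elif (cross > 0) != (sign > 0):
                return False
    return True


def locate_column(columns, point):
    """Filter by MOBR; refine only when MOBRs overlap at the click."""
    # Filter step: cheap rectangle test against every column.
    candidates = [name for name, corners in columns.items()
                  if in_mobr(mobr(corners), point)]
    if len(candidates) <= 1:
        return candidates[0] if candidates else None
    # Refine step: exact test among the overlapping candidates.
    for name in candidates:
        if in_convex_quad(columns[name], point):
            return name
    return None
```

For an untilted sheet the MOBRs coincide with the columns, so the refine step only runs for clicks exactly on a shared edge, which matches the observation that the overhead shrinks with the tilt.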
34. Reducing Errors in Time Calculation
- To calculate time accurately, consider the three
reference points marked initially for the current
sheets.
- This reduces the effect of rounding errors
and scanner (tilt) errors in time calculation.
35. Challenges
- Designing an easy and effective interface with a
smooth flow.
- Separating type calibration and digitization.
- Allowing digitization of multiple kinds of
sheets.
- Designing forms that are easy to understand and
navigate.
- Helping users reduce the effort and time needed to
digitize.
- Allowing time and map number entries by mouse
clicks.
- Providing features to speed up redundant steps.
- Reducing errors in the digitization
process.
36. Testing and validation of software
- User: Ian Gilby
- Sheets tested: 2
- One demo was given to Ian beforehand, after which he
could use the software and successfully digitize a set
of sheets without any help.
- No errors were encountered during the digitization.
- Overall user experience was good.
37. Future Work
- Change the queries to add data to the modified
table structures.
- Replace looping over each follow arrival with
specific follow arrival entry.
- Provide a means for partial entry of data for a set
of sheets, so that the session can be resumed later.
- Populate the Follow Map Time table instead of writing
to a file, and use it for retrieving map number
information.
- Provide support for the application.