Title: Bee Counted Vote Today
1. Bee Counted Vote Today!
2. BeeSpace: An Interactive Environment for Functional Analysis of Social Behavior
- Bruce Schatz, principal investigator
- Graduate School of Library and Information Science (GSLIS)
- Department of Computer Science, Program in Neuroscience
- Theme for Genomics of Neural and Behavioral Plasticity
- www.beespace.uiuc.edu
- IGB Thematic Research Seminar, November 2, 2004
3. BeeSpace FIBR Project
- BeeSpace project is NSF FIBR flagship
- Frontiers in Integrative Biological Research
- $5M for 5 years at University of Illinois
- Nature-Nurture using honey bee as model
- Genome technologies in wet lab and dry lab biology
- Localized Gene Expression for Normal Social Behavior
- Gene Robinson, Entomology (behavioral expressions)
- Susan Fahrbach, Entomology (anatomical localization)
- Sandra Rodriguez-Zas, Animal Sciences (data analysis)
- Interactive Information System for Functional Analysis
- Bruce Schatz, Library and Information Science (info systems)
- ChengXiang Zhai, Computer Science (text analysis)
- Chip Bruce, Library and Information Science (user support)
4. Post-Genome Informatics
- Classical Organisms have extensive Genetic Descriptions
- There will be NO more classical organisms beyond Mice and Men other than Worms and Flies, Yeasts and Weeds.
- So must use comparative genomics to classical organisms,
- Via sequence homologies and literature clusterings.
- Automatic annotation of genes to standard classifications,
- Such as Gene Ontology via sequence homology.
- Automatic analysis of functions to scientific literature,
- Such as concept spaces via text mining.
- Descriptions in Literature MUST be used for future interactive environments for functional analysis!
5. Conceptual Navigation in BeeSpace
6. Biology: The Model Organism
- The Western Honey Bee, Apis mellifera
- has become a primary model for social behavior
- Complex social behavior in controllable urban environment.
- Normal Behavior: honey bees live in the wild
- Controllable Environment: hives and queens can be modified
- Small size manageable with current genomic technology.
- Capture bees on-the-fly during normal behavior.
- Record gene expressions for whole-brain or brain-region.
7. Gene Expressions vs. Social Behavior
- Whole Brain for Nurses versus Foragers
- Charles Whitfield, Anne-Marie Cziko, Gene Robinson (2003). Gene Expression Profiles in the Brain Predict Behavior in Individual Honey Bees. Science 302: 296-299 (10 October 2003).
- Whole Brain not enough resolution for smaller differences.
- Builders versus Guards: YES
- Guards versus Undertakers: NO
- Must do anatomical distribution of behavioral transcripts
- Functional correspondences between insects and vertebrates
- Insect mushroom bodies versus vertebrate hippocampus
8. Gene Expression Experiments
- Record Gene Expressions for ALL Social Behaviors
- during normal lives of honey bees.
- 660 brain gene expression profiles
- 22 behaviors × 3 colonies × 10 bees/colony
- 2700 microarrays (4 × 660 for controls plus tests)
- Screen 300 important bee genes for brain localization
- (out of projected 13,000 total and 6000 in the brain)
- In-situ hybridization to identify cell populations within regions
- Honey Bee is again just the right scale for feasible experiments!
- 750K neurons versus 75M for mouse (100-times-less)
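The experiment sizes on this slide can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative, and it assumes the factor of 4 in the microarray count multiplies the 660 profiles, which gives a value the slide appears to round up to 2700.

```python
# Figures quoted on the slide; variable names are illustrative only.
behaviors = 22
colonies = 3
bees_per_colony = 10

# One brain expression profile per sampled bee.
profiles = behaviors * colonies * bees_per_colony

# Assumption: four arrays per profile (controls plus tests).
# The slide quotes roughly 2700 for this count.
microarrays = 4 * profiles

# Scale comparison between the honey bee and mouse brains.
bee_neurons = 750_000
mouse_neurons = 75_000_000
scale_ratio = mouse_neurons // bee_neurons

print(profiles, microarrays, scale_ratio)
```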
9. Societal Roles in Bee Colony
- Home
- Builder (honeycomb), Undertaker (remove corpses)
- Hygienic behavior (remove diseased brood)
- Offspring
- Brood care, Attend queen, Personal reproduction (worker)
- Defense
- Guard, Soldier
- Food
- Forage for nectar, Forage for pollen
- Forage for water, Forage for resin
- Dance communication sender, Dance communication receiver
- Process food (nectar to honey), Scout
- Experimentally induced roles to study development of social behavior
- Precocious forager, Normal age nurse, Normal age forager
- Overage nurse, Reverted nurse, Socially isolated
10. Informatics: From Bases to Spaces
- data Bases support genome data
- e.g. FlyBase has sequences and maps
- Genes annotated by GO and linked to literature
- BeeBase (Christine Elsik, Texas A&M)
- Uses computed homologies to annotate genes
- information Spaces support biomedical literature
- e.g. BeeSpace uses automatically generated conceptual relationships to navigate functions
11. BeeSpace Software Environment
- Will build a Concept Space of Biomedical Literature for Functional Analysis of Genes
- Partition Literature into Community Collections
- Extract and Index Concepts within Collections
- Navigate Concepts within Documents
- Follow Links from Documents into Databases
- Locate Candidate Genes in Related Literatures, then follow links into Genome Databases
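The first two steps listed on this slide (partition literature into community collections, then extract and index concepts within each collection) can be sketched as below. This is a minimal illustration, not the BeeSpace implementation: the stopword list, the `extract_concepts` heuristic (runs of non-stopword tokens standing in for noun phrases), the community labels, and the sample documents are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list; a real system would use proper NLP.
STOPWORDS = {"the", "of", "in", "and", "a", "is", "are", "for", "to", "with", "by"}

def extract_concepts(text):
    """Crude noun-phrase stand-in: maximal runs of non-stopword tokens."""
    concepts, run = [], []
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in STOPWORDS:
            if run:
                concepts.append(" ".join(run))
            run = []
        else:
            run.append(token)
    if run:
        concepts.append(" ".join(run))
    return concepts

def build_collections(docs):
    """Partition documents by community label, then index concepts per collection."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, (community, text) in docs.items():
        for concept in extract_concepts(text):
            index[community][concept].add(doc_id)
    return index

# Hypothetical documents: (community label, title text).
docs = {
    "d1": ("behavior", "Foraging behavior in the honey bee"),
    "d2": ("behavior", "Gene expression and foraging behavior"),
    "d3": ("anatomy", "Mushroom bodies of the honey bee"),
}
index = build_collections(docs)
print(sorted(index["behavior"]["foraging behavior"]))
```

Keeping a separate concept index per community is what lets the same phrase ("honey bee") point to different documents depending on which collection the user is browsing.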
12. BeeSpace Software Implementation
- Natural Language Processing
- Identify noun phrases
- Recognize biological entities
- Statistical Information Retrieval
- Compute statistical contexts
- Support conceptual navigation
- Network Information System
- Concept switch across community collections
- Semantic Links into biological databases
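One common way to "compute statistical contexts" is concept co-occurrence within documents, which then supports navigation from a concept to its strongest neighbors. The sketch below assumes that approach; the slide does not specify the weighting, so the conditional-frequency weight used here, the function names, and the sample concepts are assumptions for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations

def concept_space(doc_concepts):
    """Map each concept to co-occurring concepts, weighted by the
    fraction of that concept's documents in which the other appears."""
    freq = Counter()
    cooc = defaultdict(Counter)
    for concepts in doc_concepts:
        unique = set(concepts)
        freq.update(unique)
        for a, b in permutations(unique, 2):
            cooc[a][b] += 1
    return {
        a: {b: n / freq[a] for b, n in others.items()}
        for a, others in cooc.items()
    }

def related(space, concept, k=3):
    """Return the k concepts most strongly associated with `concept`.
    Ties are broken alphabetically so the output is deterministic."""
    neighbors = space.get(concept, {})
    return sorted(neighbors, key=lambda b: (-neighbors[b], b))[:k]

# Hypothetical per-document concept lists.
docs = [
    ["foraging", "juvenile hormone", "mushroom bodies"],
    ["foraging", "juvenile hormone"],
    ["foraging", "dance language"],
]
space = concept_space(docs)
print(related(space, "foraging", k=2))
```

The weights are asymmetric by design: "juvenile hormone" always appears with "foraging" in this toy data, but "foraging" appears with "juvenile hormone" in only two of its three documents.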
13. BeeSpace Information Sources
- Biomedical Literature
- Medline (medicine)
- Biosis (biology)
- Agricola, CAB Abstracts, Agris (agriculture)
- Model Organisms (heredity)
- Gene Descriptions (FlyBase, WormBase)
- Natural Histories (environment)
- BeeKeeping Books (Cornell Library, Harvard Press)
14. Worm Community System (1991)
- WCS Information Sources
- Literature: Biosis, Medline, newsletters, meetings
- Data: Genes, Maps, Sequences, strains, cells
- WCS Interactive Environment
- Browsing: search, navigation
- Filtering: selection, analysis
- Sharing: linking, publishing
- WCS: 250 users at 50 labs across Internet (1991)
- Flagship in NSF National Collaboratory program
15. Medical Concept Spaces (1998)
- Obtain discipline-scale collection
- MEDLINE from NLM, 10M bibliographic abstracts
- human classification: Medical Subject Headings
- Partition discipline into Community Repositories
- 4 core terms per abstract for MeSH classification
- 32K nodes with core terms (classification tree)
- Community is all abstracts classified by core term
- 40M abstracts containing 280M concepts
- computation took 2 days on NCSA Origin 2000
- Simulating World of Medical Communities
- 10K repositories with > 1K abstracts (1K w/ > 10K)
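The partitioning described on this slide, where a community is all abstracts classified by a given core term, amounts to placing each abstract into one repository per core term. The sketch below illustrates that step on toy data; the abstract IDs and term assignments are invented. Reading the 40M figure as 10M abstracts each counted in 4 communities is an interpretation of the slide, not something it states outright.

```python
from collections import defaultdict

def partition_by_core_terms(abstracts):
    """Assign each abstract to one repository per core term it carries."""
    repositories = defaultdict(list)
    for abstract_id, core_terms in abstracts.items():
        for term in core_terms:
            repositories[term].append(abstract_id)
    return dict(repositories)

def large_repositories(repositories, min_size):
    """Keep only repositories holding more than `min_size` abstracts."""
    return {t: ids for t, ids in repositories.items() if len(ids) > min_size}

# Toy data: IDs and MeSH-style core terms are invented for illustration.
abstracts = {
    "a1": ["Arthritis, Rheumatoid", "Analgesics"],
    "a2": ["Analgesics", "Gastrointestinal Hemorrhage"],
    "a3": ["Analgesics"],
}
repositories = partition_by_core_terms(abstracts)

# Scale check with the slide's figures: 10M abstracts, 4 core terms each.
total_assignments = 10_000_000 * 4

print(large_repositories(repositories, 1), total_assignments)
```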
16. Biological Concept Spaces (2005)
- Compute concept spaces for All of Biology
- BioSpace across biomedical literature
- 50M abstracts across 50K repositories
- Use Gene Ontology to partition literature into biological communities for functional analysis
- GO same scale as MeSH but adequate coverage?
- GO light on social behavior (biological process)
17. WCS Molecular
18. WCS Cellular
19. WCS PPCS demo
20. Navigation in MEDSPACE
- For a patient with Rheumatoid Arthritis
- Find a drug that reduces the pain (analgesic)
- but does not cause stomach (gastrointestinal) bleeding
Choose Domain
21. Concept Search
22. Concept Navigation
23. Retrieve Document
24. Biomedical Session
25. Categories and Concepts
26. Concept Switching
27. Document Retrieval
28. BeeSpace: An Interactive Environment for Analyzing Nature and Nurture in Societal Roles
- BeeSpace will enable users to navigate a uniform space of diverse databases and literature sources for hypothesis development and testing, with a software system that goes beyond a searchable database, using statistical literature analyses to discover functional relationships between genes and behavior.
- New text mining technology to integrate molecular description with information from physiology, behavior, neuroscience, and evolution. THIS TECHNOLOGY IS GENERAL!
- BirdSpace? BehaviorSpace? BrainSpace? SoySpace? CowSpace? IGBSpace? BioSpace?
29. BeeSpace Information Sources
- General for All Spaces
- Scientific Literature
- Medline, Biosis, Agricola, Agris, CAB Abstracts
- Genome Databases
- GenBank, ProteinDataBank, ArrayExpress
- Model Organisms
- Gene Descriptions (FlyBase, WormBase, MGI, SGD, TAIR)
- Special Sources for Natural History
- BeeKeeping Books (Cornell Library, Harvard Press)
30. Towards CowSpace
- Organize Genome Databases (CowBase)
- Partition Scientific Literature for Cattle
- Gene Descriptions from Model Organisms
- Natural Histories from Population Databases
- Key to Functional Analysis is Special Sources
- Collecting Appropriate Text about Genes
- Extracting Adequate Data about Histories
- Cattle Leverage is AIPL Databases and Dairy Manuals