Title: BeeSpace: An Interactive Environment for Functional Analysis of Social Behavior
1BeeSpace: An Interactive Environment for Functional Analysis of Social Behavior
- Bruce Schatz
- Institute for Genomic Biology
- University of Illinois at Urbana-Champaign
- www.beespace.uiuc.edu
- First Annual BeeSpace Workshop
- University of Illinois, June 6, 2005
2BeeSpace FIBR Project
- BeeSpace project is NSF FIBR flagship
- Frontiers in Integrative Biological Research,
- $5M for 5 years at University of Illinois
- Analyzing Nature and Nurture in Societal Roles using honey bee as model
- (Functional Analysis of Social Behavior)
- Genomic technologies in wet lab and dry lab
- Bee (Biology): gene expressions
- Space (Informatics): concept navigations
3 for Social Beehavior
4Complex Systems I
- Understanding Social Behavior
- Honey Bees have only 1 million neurons
- Yet
- A Worker Bee exhibits Social Behavior!
- She forages when she is not hungry
- but the Hive is
- She fights when she is not threatened
- but the Hive is
5for Functional Analysis
6Complex Systems II
- Understanding Functional Analysis
- Molecular Mechanisms of Social Behavior
- Can only be Discovered via the
- Interactive Navigations of Distributed Systems
- The Interspace is the next generation
- of the Net (beyond the Web)
- Where Concept Navigation across
- Distributed Communities is routine
9System Architecture
10Post-Genome Informatics
- Classical Organisms have extensive Genetic Descriptions!
- There will be NO more classical organisms beyond
- Mice and Men other than Worms and Flies, Yeasts and Weeds.
- So must use comparative genomics to classical organisms,
- Via sequence homologies and literature analysis.
- Automatic annotation of genes to standard classifications,
- Such as Gene Ontology via sequence homology.
- Automatic analysis of functions to scientific literature,
- Such as concept spaces via text mining.
- Descriptions in Literature MUST be used for future
- interactive environments for functional analysis!
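The homology-based annotation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the BeeSpace implementation: the gene names, GO labels, identity scores, and the 0.5 threshold are all hypothetical placeholders, and the hits are assumed to come precomputed from a sequence-alignment step.

```python
def annotate_by_homology(hits, go_annotations, min_identity=0.5):
    """Transfer GO terms from the best-matching annotated model-organism gene.

    hits: dict mapping a query gene to a list of (model_gene, identity) pairs,
          with identity as a fraction in [0, 1] (assumed precomputed).
    go_annotations: dict mapping model-organism genes to sets of GO terms.
    """
    annotated = {}
    for gene, candidates in hits.items():
        # Keep only hits that clear the threshold and carry annotations.
        usable = [(identity, model) for model, identity in candidates
                  if identity >= min_identity and model in go_annotations]
        if usable:
            _, best_model = max(usable)  # highest identity wins
            annotated[gene] = set(go_annotations[best_model])
    return annotated


# Hypothetical toy data (placeholders, not real identifiers).
hits = {
    "bee_gene_1": [("fly_gene_A", 0.82), ("worm_gene_B", 0.61)],
    "bee_gene_2": [("fly_gene_C", 0.30)],
}
go_annotations = {
    "fly_gene_A": {"GO:example_behavior"},
    "worm_gene_B": {"GO:example_signaling"},
    "fly_gene_C": {"GO:example_metabolism"},
}

print(annotate_by_homology(hits, go_annotations))
```

Genes with no hit above the threshold, such as bee_gene_2 here, stay unannotated, which is where literature analysis would have to take over.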
11Informational Science
- Computational Science is the Third Branch of Science (beyond Experimental and Theoretical)
- Genes are Computed, Proteins are Computed,
- Sequence equivalences are Computed.
- Informational Science is coming to be accepted as
- The Fourth Branch of Science
- Based on Information Science technologies for
- Functional Mining of Information Sources
- Comparative Analysis within the
- Dry Lab of Biological Knowledge
12Biology: The Model Organism
- The Western Honey Bee, Apis mellifera
- has become a primary model for social behavior
- Complex social behavior in controllable urban environment
- Normal Behavior: honey bees live in the wild
- Controllable Environment: hives can be modified
- Small size manageable with current genomic technology
- Capture bees on-the-fly during normal behavior
- Record gene expressions for whole-brain or brain-region
- (Note logistical limitations with bees and expressions)
13Informatics: From Bases to Spaces
- data Bases support genome data
- e.g. FlyBase has sequences and maps
- Genes annotated by GeneOntology and
- linked to biological literature
- BeeBase (Christine Elsik, Texas A&M)
- Uses computed homologies to annotate genes
- information Spaces support biological literature
- e.g. BeeSpace uses automatically generated
- conceptual relationships to navigate functions
14Project Investigators
- BeeSpace project is NSF FIBR flagship
- Frontiers in Integrative Biological Research,
- $5M for 5 years at University of Illinois
- Biology
- Gene Robinson, Entomology (behavioral expression)
- Susan Fahrbach, Wake Forest (anatomical localization)
- Sandra Rodriguez-Zas, Animal Sciences (data analysis)
- Informatics
- Bruce Schatz, Library and Information Science (systems)
- ChengXiang Zhai, Computer Science (text analysis)
- Chip Bruce, Library and Information Science (users)
15Education and Outreach
- Explaining Social Behavior at all Levels
- Graduate Students and Postdocs as System Users
- 5 early adopter labs then 15 international labs
- Undergraduates to plan Bioinformatics Course through Susan Fahrbach at Wake Forest
- Run Workshop for Middle School Minorities through UIUC SummerMath (George Reese)
- University High School Biology Courses (David Stone)
- Home Hi Middle School for Girls Science (Jim Buell)
16BeeSpace GOALS
- Analyze the relative contributions of
- Nature and Nurture in
- Societal Roles in Honey Bees
- Experimentally measure differential gene expression for important societal roles during normal behavior
- varying heredity (nature) and environment (nurture)
- Interactively annotate gene functions for important gene clusters using concept navigation across biological literature representing community knowledge
17Concept Navigation in BeeSpace
18BeeSpace Software Environment
- Will build a Concept Space of Biomedical Literature for Functional Analysis of Bee Genes
- Partition Literature into Community Collections
- Extract and Index Concepts within Collections
- Navigate Concepts within Documents
- Follow Links from Documents into Databases
- Locate Candidate Genes in Related Literatures then follow links into Genome Databases
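The steps listed on this slide (partition, index, navigate, follow links) can be sketched in a few lines. This is only a toy illustration under stated assumptions: the record layout, the community labels, the document ids, and the "db:" link targets are hypothetical, and concepts are assumed to have been extracted already.

```python
from collections import defaultdict


def partition(abstracts):
    """Group abstracts into community collections by their community label."""
    collections = defaultdict(list)
    for abstract in abstracts:
        collections[abstract["community"]].append(abstract)
    return dict(collections)


def index_concepts(collection):
    """Build an inverted index: concept -> ids of abstracts mentioning it."""
    index = defaultdict(set)
    for abstract in collection:
        for concept in abstract["concepts"]:
            index[concept].add(abstract["id"])
    return dict(index)


def follow_links(doc_ids, links):
    """Collect the database entries linked from a set of documents."""
    return sorted({entry for doc in doc_ids for entry in links.get(doc, [])})


# Hypothetical toy data; concepts are assumed to be already extracted.
abstracts = [
    {"id": "doc1", "community": "behavior", "concepts": ["foraging", "division of labor"]},
    {"id": "doc2", "community": "behavior", "concepts": ["foraging", "brain"]},
    {"id": "doc3", "community": "development", "concepts": ["brain"]},
]
links = {"doc1": ["db:gene_X"], "doc2": ["db:gene_X", "db:gene_Y"]}

collections = partition(abstracts)
index = index_concepts(collections["behavior"])
print(follow_links(index["foraging"], links))
```

Keeping one index per collection, rather than one global index, mirrors the idea that each community has its own vocabulary.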
19BeeSpace Software Implementation
- Natural Language Processing
- Identify noun phrases
- Recognize biological entities
- Statistical Information Retrieval
- Compute statistical contexts
- Support conceptual navigation
- Network Information System
- Concept switch across community collections
- Semantic Links into biological databases
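One simple way to picture "statistical contexts" is as co-occurrence counts between noun phrases within documents. The sketch below assumes that interpretation; it is not necessarily the statistic BeeSpace computes. The phrase lists are hypothetical and stand in for the output of a noun-phrase extractor.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence(docs):
    """Count, for every phrase, how often each other phrase shares a document."""
    related = defaultdict(Counter)
    for phrases in docs:
        # Use each phrase once per document, and count each pair both ways.
        for a, b in combinations(sorted(set(phrases)), 2):
            related[a][b] += 1
            related[b][a] += 1
    return related


def navigate(related, concept, k=3):
    """Return up to k phrases that co-occur most often with the concept."""
    return [phrase for phrase, _ in related[concept].most_common(k)]


# Hypothetical noun phrases, one list per document.
docs = [
    ["foraging", "juvenile hormone", "brain"],
    ["foraging", "juvenile hormone"],
    ["foraging", "nectar"],
]

related = cooccurrence(docs)
print(navigate(related, "foraging", k=1))
```

Raw counts favor frequent phrases; a production system would likely weight them, but that choice is outside what the slide states.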
20BeeSpace Information Sources
- Biomedical Literature
- Medline (medicine)
- Biosis (biology)
- Agricola, CAB Abstracts, Agris (agriculture)
- Model Organisms (heredity)
- Gene Descriptions (FlyBase, WormBase)
- Natural Histories (environment)
- BeeKeeping Books (Cornell Library, Harvard Press)
21Worm Community System (1991)
- WCS Information Sources
- Literature: Biosis, Medline, newsletters, meetings
- Data: Genes, Maps, Sequences, strains, cells
- WCS Interactive Environment
- Browsing: search, navigation
- Filtering: selection, analysis
- Sharing: linking, publishing
- WCS: 250 users at 50 labs across Internet (1991)
- NSF National Collaboratories Flagship
22WCS Molecular
23WCS Cellular
24Medical Concept Spaces (1998)
- Medical Literature (Medline, 10M abstracts)
- Partition with Medical Subject Headings (MeSH)
- Community is all abstracts classified by core term
- 40M abstracts containing 280M concepts
- computation is 2 days on NCSA Origin 2000
- Simulating World of Medical Communities
- 10K repositories with > 1K abstracts
- (1K with > 10K)
25Navigation in MedSpace
- For a patient with Rheumatoid Arthritis
- Find a drug that reduces the pain (analgesic)
- but does not cause stomach (gastrointestinal) bleeding
Choose Domain
26Concept Search
27Concept Navigation
28Retrieve Document
29CONCEPT SWITCHING
- Concept versus Term
- set of semantically equivalent terms
- Concept switching
- region to region (set to set) match
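The set-to-set match can be illustrated with a simple overlap measure. The sketch below treats each concept as a set of related terms and picks the target concept whose set overlaps most with the source region. Jaccard overlap is an assumption chosen for illustration, not the matching method stated in the slides, and the terms are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two term sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def switch_concept(source_region, target_space):
    """Pick the concept in the target space whose term set best matches
    the source region (a set of terms from the source space)."""
    return max(target_space,
               key=lambda concept: jaccard(source_region, target_space[concept]))


# Hypothetical regions: each concept maps to a set of equivalent or related terms.
source_region = {"pain", "analgesic", "inflammation"}
target_space = {
    "nsaid": {"analgesic", "inflammation", "cox"},
    "ulcer": {"stomach", "bleeding"},
}

print(switch_concept(source_region, target_space))
```

Matching whole regions rather than single terms lets a switch succeed even when the two communities share no exact term for the concept itself.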
30Biomedical Session
31Categories and Concepts
32Concept Switching
33Document Retrieval
34Biological Concept Spaces (2006)
- Compute concept spaces for All of Biology
- BioSpace across entire biomedical literature
- 50M abstracts across 50K repositories
- Use Gene Ontology to partition literature into
- biological communities for functional analysis
- GO same scale as MeSH but adequate coverage?
- GO light on social behavior (biological process)
35Interactive Functional Analysis
- BeeSpace will enable users to navigate a uniform space of diverse databases and literature sources for hypothesis development and testing, with a software system that goes beyond a searchable database, using statistical literature analyses to discover functional relationships between genes and behavior.
- Genes to Behaviors
- Behaviors to Genes
- Concepts to Concepts
- Clusters to Clusters
- Navigation across Sources
36BeeSpace Information Sources
- General for All Spaces
- Scientific Literature
- Medline, Biosis, Agricola, Agris, CAB Abstracts
- partitioned by organisms and by functions
- Model Organisms
- Gene Descriptions (FlyBase, WormBase, MGI, OMIM, SGD, TAIR)
- Special Sources for BeeSpace
- Natural History Books (Cornell Library, Harvard Press)
37XSpace Information Sources
- Organize Genome Databases (XBase)
- Compute Gene Descriptions from Model Organisms
- Partition Scientific Literature for Organism X
- Compute XSpace using Semantic Indexing
- Boost the Functional Analysis from Special Sources
- Collecting Useful Data about Natural Histories
- e.g. CowSpace: Leverage in AIPL Databases
38Towards the Interspace
- The Analysis Environment technology is GENERAL!
- BirdSpace? BeeSpace?
- PigSpace? CowSpace?
- BehaviorSpace? BrainSpace?
- BioSpace
- Interspace