Title: Sponsoring Cancer Center
1Integrative Cancer Research
- Sponsoring Cancer Center
- Lombardi Comprehensive Cancer Center
- Georgetown University
- Workspaces
- Architecture (developer)
- Integrative Cancer Research (developer)
- LCCC caBIG Representative
- Robert Clarke, Ph.D., D.Sc.
- clarker_at_georgetown.edu
- (202) 687-3755
2Integrated Cancer Research Teams
- Georgetown University
- Edmund Gehan (Biostatistics and Biomathematics)
- Stephen Moore (Advanced Research Computing)
- Seong Ki Mun (Imaging Science and Information Systems)
- Cathy Wu (Protein Information Resource)
- Virginia Tech
- Joseph Wang (Engineering and Computer Science)
- Catholic University
- Jason Xuan (Engineering and Computer Science)
- National Institutes of Health
- Spiderweb Team (NCICB)
- Javed Khan (NHGRI)
- Aiyi Liu (NICHD)
- University of Edinburgh, Scotland
- William Miller (Oncology)
3Overview
- Project Activity
- Tool development for exploring very high dimensional data sets
- Deployment of grid-enabled and integrated array (MIAME-compliant) and clinical research databases (Spiderweb)
- Contribution of expression array and other data from multiple clinical, translational, and basic research projects
- Stage of Maturity
- Tools at each stage of development
- Technical Details/Standards
- Open source; currently most are written in C and/or MatLab
- Points of Interoperability
- Use of caCORE APIs and caBIG-compatible APIs
- Resources
- Platform migration and software engineering personnel
4Examples of Ongoing Funded Projects
- Bioengineering Research Partnership (NCI)
- Large, prospective, molecular profiling study in breast cancer
- Expression array tool development and deployment
- Expression array and clinical data
- Gene/Nutrition Interactions in Breast Cancer (NCI)
- U54 program project array tool development and deployment data
- Breast Cancer Center of Excellence (DOD)
- Study of alcohol and breast cancer risk array tool deployment data
- Clinical Translational Research Study (DOD)
- Large, prospective and retrospective, molecular profiling study of Letrozole and Tamoxifen in breast cancer
- Array tool development and deployment data
5Examples of Ongoing Funded Projects
- Computational Decomposition of Composite Molecular Signatures (NCBIB)
- Expression array tool development, optimization, and deployment
- Intelligent Mapping and Visual Exploration of Gene Expression Profiles (NCI)
- Expression array tool development, optimization, and deployment
- Comprehensive Computational Analysis of Gene Expression (NCI)
- Expression array tool development, optimization, and deployment
6Integrated Tool Development
7Tools for Very High Dimensional Data Sets
- General Approach
- Probabilistic approaches to address the properties of very high dimensional data spaces (e.g., "curse of dimensionality", "concentration of measure phenomenon")
- Data Preprocessing
- Normalization
- Tissue heterogeneity correction
- Multitask, goal-specific, gene selection tool (e.g., classification vs. signaling/function)
- Cluster Discovery and Visualization
- Visual Statistical Data Analyzer (VISDA) package
- Classification and Prediction
- Optimized multilayer perceptron classifiers
- Expression array tool development
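The "concentration of measure phenomenon" named under General Approach can be illustrated with a small numerical sketch. This is not one of the tools listed on this slide; it is a generic demonstration, assuming points drawn uniformly from a unit cube, and the function names `relative_contrast` and `contrast_by_dimension` are hypothetical. It shows that, as the number of dimensions grows, the nearest and farthest neighbors of a query point end up at nearly the same distance, which is one reason distance-based methods need care on expression array data with thousands of genes.

```python
import math
import random


def relative_contrast(n_points, dim, rng):
    """Return (d_max - d_min) / d_min for the Euclidean distances from one
    query point to n_points other points, all uniform in the unit cube."""
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))
    d_min, d_max = min(distances), max(distances)
    return (d_max - d_min) / d_min


def contrast_by_dimension(dims, n_points=200, seed=0):
    """Map each dimension in dims to its relative contrast (seeded, repeatable)."""
    rng = random.Random(seed)
    return {dim: relative_contrast(n_points, dim, rng) for dim in dims}


if __name__ == "__main__":
    # The contrast shrinks toward zero as the dimension increases.
    for dim, contrast in contrast_by_dimension([2, 20, 200, 2000]).items():
        print(f"dim={dim:5d}  relative contrast={contrast:.3f}")
```

A small relative contrast means "near" and "far" samples are hard to tell apart by raw distance alone, which is consistent with the slide's emphasis on preprocessing, gene selection, and probabilistic modeling before clustering or classification.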
8Comprehensive Data Analysis
Cluster Discovery and Visualization
Data Preprocessing
Classification and Prediction
Adaptive Hierarchical Subspace Experts
Cross-Phenotype Normalization
Tissue Heterogeneity Correction
Information Visualization
Optimized Multilayer Perceptron Classifiers
Visual Statistical Data Analyzer (VISDA)
Multitask Gene Selection
Matlab
Neural Networks
Pattern Recognition
Independent Component Analysis
9Examples of Novel Tools
Discriminant Components Analysis
10Dynamic Contrast Enhanced-MRI
Tumor angiogenesis in the breast
11Integrating Imaging and Molecular Analysis
12Integrate Data: Protein Information Resource
13SpiderWeb Collaboration Goals
- Integration of multiple, varied information to achieve a basis for rational design of novel diagnostics and therapeutics
- Integration of functional genomics information into clinical information so it can be used to diagnose genetic predisposition, sub-classify disease, and help with optimal selection of therapies
- Bring new efficiencies to clinical research by integrating bench to bedside and back
14SpiderWeb Project (LCCC/NCICB)
Approval
15Timelines and Resources
- Development Requirements (approx)
- 3 programmers for data analysis tools
- 3 programmers for PIR integration into caBIG
- Infrastructure
- C/Java development environment enhancements
- software system administrator
- Draft 12-month Work Plan and Milestones (approx)
- 1-3 months: VISDA caBIG interoperability
- 4-6 months: CPN and THC caBIG interoperability
- 7-9 months: MTGS and MLP/AHSE caBIG interoperability
- 10-12 months: software system integration and final testing
- 12 months for PIR integration into caBIG