Title: PowerPoint Template
1IBM Research: An Inter-Corporate Collaboration
on Computer Curation of Intellectual Property
and the Scientific Literature
3What we are trying to accomplish
the challenges of today's researchers
Applying text and image analysis technology
to better understand IP (patents) and the
scientific literature
- Computer curation of the literature -
Stephen Boyer, Ph.D. Sboyer_at_us.ibm.com 408-858-5544
4The Problem
All content and no discovery ?
5What we are trying to accomplish
the challenges of today's researchers
The problem: Gain a better understanding of IP (patents) and the scientific literature.
The question: Can we use computers to read documents, identify critical entities, and perform meaningful associations that can help us with our work?
What we did:
1) Apply text analytics technology to analyze patents and the scientific literature (>30 M IP documents and Medline abstracts)
2) Apply image analytics to IP documents
3) Explore how these technologies can be applied to foreign documents (for example, Chinese and Japanese patents)
The value: Provide new insights into chemical and biomedical information (still a work in progress).
6Collaborators
A collaborative work in progress
Corporate Sponsors
Other informal collaborators and partners
- IBM Research
- Novartis
- Pfizer
- DuPont
- Lilly
- Boehringer Ingelheim
- Roche / Genentech
- AstraZeneca (AZ)
- Bristol-Myers Squibb (BMS)
- NIH
- University of Texas
- EMBL - EBI
- University of Dundee
- UC Davis
- ChemAxon
- CambridgeSoft
- Dalhousie
- Univ of New Mexico
7Why this is important!
What are the differences between these two
molecules?
Chemistry: 1 carbon, 1 nitrogen, 1 double bond, 1 hydrogen
Business: 1.7B in revenue. An opportunity loss of 320M; a revenue gain of 320M
Bayer patented molecule
- Vardenafil (Levitra)
- Annual sales of 320 million
- Late to market, found a similar molecule and gained share
Pfizer patented molecule
- Sildenafil (Viagra)
- Annual sales of >1.7 billion
- 1st to market, but didn't patent (cover) the full chemical space
8Example IP Challenge
the challenges of today's researchers
Additional Properties
Relationships
How do I find entities from the docs?
How do I find entity relationships?
New IP
Web, Scientific News
Worldwide Patents
Medline
How do I exploit other Information sources?
New Insights
9Can you find the key molecules in unstructured
text, for example a scientific journal or patent?
Chemical nomenclature can be daunting
 a) (2P/4S)-4-4-Amino-5-(4-benzyloxy-phenyl)pyrro
lo2,3-dpyrimidin-7-yl-2-hydroxymethyl-pyrrolidi
ne-1-carboxylic acid tert-butyl ester prepared
analogously to Example 18 starting from
(2R/4S)-4-4-amino-5-(4-benzyloxy-phenyl)-pyrrolo
2,3-dpyrimidin-7-yl-pyrrolidine-1,2-dicarboxylic
acid 1-tert-butyl ester 2-ethyl ester (Example
20a). 1 H-NMR (CDCl3, ppm) 8.52 (s, 1H),
7.52-7.32 (m, 7H), 7.1 (d, 2H), 6.95 (d,1 H),
5.50 (m, 1H), 5.13 (s, 2H), 4.62-4.42 (m, 2H),
4.28 (m, 2H), 4.10 (m, 1H), 3.95-3.70 (m, 1H),
2.75 (m, 1H), 2.50 (m, 1H), 1.49 (s, 9H). b)
(2R/4S)-4-4-Amino-5-(4-benzyloxy-phenyl)-pyrrolo
2,3-dpyrimidin-7-yl-pyrrolidin-2-yl-methanol
0.100 g of (2R/4S)4-4-amino-5-(4-benzyloxy-phenyl
)-pyrrolo2,3-dpyrimidin-7-yl-pyrrolidine-1,2-di
carboxylic acid 1-tert-butyl ester is dissolved
in 4 ml of tetrahydrofuran 10 ml of 4M hydrogen
chloride in diethyl ether are added, and stirring
is carried out for 1 hour at room temperature.
The product is filtered off and dried under a
high vacuum. The dihydrochloride of the title
compound is obtained. 1 H-NMR (CD3 OD, ppm) 8.4
(s, 1H) 7.60 (s, 1H), 7.5-7.10 (m, 9H), 5.65 (m,
1H), 5.18 (s, 2H), 4.32 (m, 1H), 4.00-3.65 (m,
4H), 2.60 (m, 2H). EXAMPLE 24 (2R/4S)-4-(4-Amino-
5-phenyl-pyrrolo2,3-dpyrimidin-7-yl)-1-(2,2-dime
thyl-propionyl)-pyrrolidine-2-carboxylic acid
ethyl ester 0.130 g of (2R/4S)-4-(4-benzyloxycarbo
nylamino-5-phenyl-pyrrolo2,3-dpyrimidin-7-yl)-1-
(2,2-dimethyl-propionyl)-pyrrolidine-2-carboxylic
acid ethyl ester is dissolved in 8 ml of
methanol, and the solution is hydrogenated over
0.030 g of palladium-on-carbon (10%) for 1 hour
at normal pressure. The catalyst is removed by
filtration, the filtrate is concentrated by
10Identify the chemical names, then convert them
to structures: chemical names -> structures!
entity identification
What is this compound ??
11Problem: I need to find information about Valium
nomenclature issues
Valium (trade name)
CAS 439-14-5 (chemical ID)
Diazepam (generic name)
Valium has > 149 names
ALBORAL, ALISEUM, ALUPRAM, AMIPROL, ANSIOLIN, ANSIOLISINA, APAURIN, APOZEPAM,
ASSIVAL, ATENSINE, ATILEN, BIALZEPAM, CALMOCITENE, CALMPOSE, CERCINE, CEREGULART,
CONDITION, DAP, DIACEPAN, DIAPAM, DIAZEMULS, DIAZEPAN, DIAZETARD, DIENPAX, DIPAM,
DIPEZONA, DOMALIUM, DUKSEN, DUXEN, E-PAM, ERIDAN, EVACALM, FAUSTAN, FREUDAL,
FRUSTAN, GIHITAN, HORIZON, KIATRIUM, LA-III, LEMBROL, LEVIUM, LIBERETAS,
METHYL DIAZEPINONE, MOROSAN, NEUROLYTRIL, NOAN, NSC-77518, PACITRAN, PARANTEN,
PAXATE, PAXEL, PLIDAN, QUETINIL, QUIATRIL, QUIEVITA, RELAMINAL, RELANIUM, RELAX,
RENBORIN, RO 5-2807, S.A. R.L., SAROMET, SEDAPAM, SEDIPAM, SEDUKSEN, SEDUXEN,
SERENACK, SERENAMIN, SERENZIN, SETONIL, SIBAZON, SONACON, STESOLID, STESOLIN,
TENSOPAM, TRANIMUL, TRANQDYN, TRANQUASE, TRANQUIRIT, TRANQUO-TABLINEN, UMBRIUM,
UNISEDIL, USEMPAX AP, VALEO, VALITRAN, VALRELEASE, VATRAN, VELIUM, VIVAL, VIVOL,
WY-3467
12There are many different chemical names for Valium
entity identification
Valium
CAS 439-14-5
Diazepam
13Problems of taxonomy: name normalization
The scientist simply wants information about
valium
Choose keywords
Medline
In-house database
Chem. Abstracts
Patent database
DIAPAM
439-14-5 (Chemical ID)
Pereira notebook 23a
7-CHLORO-1-METHYL-5-PHENYL-2H-1,4-BENZODIAZEPIN-2-ONE
Sedapam
Multiple documents contain Information about
Valium
7-CHLORO-1,3-DIHYDRO-1-METHYL-5-PHENYL-2H-1,4-BENZODIAZEPIN-2-ONE
Diazepam
14Considerations for searching documents (or web
pages) for chemical substances
Name normalization is important
- Chemicals have a wide variety of trivial and official names.
- A text search for one name will not find chemicals that are named using one of the alternative names.
- Synonym expansion is insufficient.
- Searching by structure will find all such cases.
Source J Cooper / IBM
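A minimal sketch of why normalization helps, assuming every name can be resolved to one identifier before indexing. The lookup table below is a hypothetical stand-in for real name-to-structure conversion. The names and the CAS number come from the slides, while the document IDs and function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical lookup table standing in for a real name-to-structure converter.
# Names and the CAS number are taken from the slides; a real system would
# resolve names to structures rather than rely on a hand-made table.
NAME_TO_ID = {
    "valium": "439-14-5",
    "diazepam": "439-14-5",
    "diapam": "439-14-5",
    "sedapam": "439-14-5",
    "7-chloro-1,3-dihydro-1-methyl-5-phenyl-2h-1,4-benzodiazepin-2-one": "439-14-5",
}

def normalize(name):
    """Return the normalized identifier for a name, or None if unknown."""
    return NAME_TO_ID.get(name.strip().lower())

def build_index(docs):
    """Map each normalized identifier to the set of documents mentioning it.

    docs: dict of document id -> list of chemical names found in that document.
    """
    index = defaultdict(set)
    for doc_id, names in docs.items():
        for name in names:
            key = normalize(name)
            if key is not None:
                index[key].add(doc_id)
    return index

# Invented example documents, each using a different name for the same substance.
docs = {
    "medline-1": ["Diazepam"],
    "patent-7": ["DIAPAM"],
    "notebook-23a": ["7-CHLORO-1,3-DIHYDRO-1-METHYL-5-PHENYL-2H-1,4-BENZODIAZEPIN-2-ONE"],
    "other": ["toluene"],
}
index = build_index(docs)
hits = sorted(index[normalize("Valium")])
print(hits)
```

A query for "Valium" reaches all three documents even though none of them contains that word, which a plain keyword search would not do.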
15Finding similarity: structures, not just text!
Find documents with similar structures
- Further, we would like to find compounds which are supersets of the given structure.
- For example, toluene and methylnaphthalene
Text searches won't find documents with similar
structures
Source J Cooper / IBM
16The Solution
The proposed solution
Applying text and image analytics to better
understand IP (patents) and the scientific
literature
- Computer curation of the literature -
17Patents contain molecular data in multiple forms
- As text
- As images
- And as (manually created) chemical Complex Work Units (CWUs)
18Text Analytics
Let's start with text analysis
The computer reads documents and attempts to
identify domain-specific entities, for
example chemical names, gene names, disease
names, etc.
19Step 1: Identify the chemical entities
Step 2: Extract chemical names and load them into
tables
Entity extraction
20Step 3: Convert words to structures
Convert the chemicals into machine-readable
formats!
7-CHLORO-1-METHYL-5-PHENYL-2H-1,4-BENZODIAZEPIN-2-ONE
SMILES string (benzene): c1ccccc1
InChI (benzene): InChI=1/C6H6/c1-2-4-6-5-3-1/h1-6H
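To show what "machine readable" buys, here is a small sketch that pulls the chemical formula out of the benzene InChI on this slide, written with the standard `InChI=` prefix. It relies only on InChI layers being separated by `/`, with the formula as the second layer. The function names are illustrative, and salts or mixtures (formulas containing `.` or leading multipliers) are not handled.

```python
import re

def inchi_formula(inchi):
    """Return the chemical-formula layer of an InChI string."""
    # Drop the "InChI=" prefix if present, then split into layers.
    body = inchi.split("=", 1)[1] if "=" in inchi else inchi
    layers = body.split("/")
    # layers[0] is the version ("1" or "1S"); layers[1] is the formula.
    return layers[1]

def element_counts(formula):
    """Count atoms per element in a simple single-component formula."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts

benzene = "InChI=1/C6H6/c1-2-4-6-5-3-1/h1-6H"
formula = inchi_formula(benzene)
counts = element_counts(formula)
print(formula, counts)
```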
21Step 4: Automate the process
- Scale up and automate the process -
HealthCare Life Science Data warehouse
IBM Servers
Any text
Web Pages
Medline
Patents
Valium
Benzene
- 11 million patent documents
- 18 million Medline abstracts
- 100 million chemical structures
- >12 million unique
22Summary of overall text analysis operations for
chemistry (HMM, CRF, CFG)
Overall process flow for text analysis
Dictionary of the English Language minus
the Dictionary of Desired Entities
Name: toluene (methyl benzene)
SMILES string: CC1=CC=CC=C1
2D structure
- Options to compute 300 properties per molecule
- Blue Gene enabled -
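One possible reading of the "English dictionary minus desired entities" step, sketched under heavy assumptions: words left in the reduced English dictionary are treated as ordinary prose, and every other token becomes a candidate entity for the statistical models (HMM, CRF, CFG) named in the slide title. The word lists and the function name are tiny hypothetical stand-ins, not the project's dictionaries.

```python
import re

# Hypothetical miniature dictionaries; real ones would hold far more entries.
ENGLISH = {"the", "product", "is", "dissolved", "in", "of", "and",
           "dried", "toluene", "then", "added"}
DESIRED_ENTITIES = {"toluene", "tetrahydrofuran", "methanol"}

# English words that are NOT desired entities are treated as ordinary prose.
ORDINARY_WORDS = ENGLISH - DESIRED_ENTITIES

def candidate_entities(text):
    """Return tokens that are not ordinary English words, in order of appearance."""
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9\-]*", text)
    return [t for t in tokens if t.lower() not in ORDINARY_WORDS]

text = "The product is dissolved in tetrahydrofuran and toluene is then added"
found = candidate_entities(text)
print(found)
```

Subtracting the entity dictionary matters for words such as "toluene" that also appear in general English dictionaries; without the subtraction they would be discarded as ordinary prose.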
24Why use Blue Gene?
- Find and compute the 3D structure of every molecule on every page of every patent (and Medline abstract)
- Identify every protein (from our dictionary of >350K proteins) on every page of every patent (and Medline abstract)
- Identify every disease (from our list of 14,500) on every page of every patent and map it to Medline MeSH codes
- Identify the occurrence of every biomarker (from our dictionary of 485 biomarkers) on every page of every patent
- ...your request goes here!
- Equivalent to 240K simultaneous Google searches -
Compute properties, find relationships,
Data warehouse
25Examples
Chemicals derived from text analytics
26Examples
Chemicals derived from text analytics
27Examples of structures created via automated
chemical annotation
Chemicals derived from text analytics
28Leading Causes of Annotator Problems
Typical problems encountered when dealing with
OCR text
- Improper spacing within the chemical name
  2-_(Bicyclo_2.2._1_hept-5-en-2-ylamino)_-5-_2-_(4-chloro-3-methylphenoxy)_ethyl-l,_3-_thiazol-4_(5H)-one
- Run-on lists
  indane, 1,2,_3,4- tetrahydroquinoline, 3,_4-dihydro-2H-1,_4-benzoxazine, 1,5-naphthyridine, 1, 8- naphthyridine
- Numbering of compounds
  Comparative Example 3, 2-bromo-4- (1, 3-dioxo-1, 3-dihydro-2H-isoindol-2-yl) butanoic acid 4-(1,3-dioxo-1,3-dihydro-2H-isoindol-2-yl) butanoic acid
- Formatting issues
  2-2-(bicyclo 2.2. 1 hept-5-en-2-ylamino) -4-oxo-4, 5-dihydro-1, 3-thiazol-5-yl -N-<BR> <BR> <BR> <BR> <BR> <BR> <BR> <BR> (4-metlioxyphenyl)-N-methylacetamide
- Missing or incorrect parentheses
  5-(2-anilinoethyl)-2-(2-cyclohex-1-en-1-ylethyl)amino-1,3-thiazol-4(5H)-one
using WO/2005/075471 as an example
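A rough sketch of how some of the spacing problems above might be repaired before name-to-structure conversion. The two rules are illustrative assumptions rather than the annotators' actual logic: remove spaces between digit locants, and remove spaces after a hyphen.

```python
import re

def tidy_ocr_name(name):
    """Remove stray OCR spaces inside locants and after hyphens."""
    # "1, 8" -> "1,8": a comma between two digits is treated as a locant separator.
    name = re.sub(r"(?<=\d),\s+(?=\d)", ",", name)
    # "8- naphthyridine" -> "8-naphthyridine"; "4- (1" -> "4-(1".
    name = re.sub(r"-\s+(?=[\w(])", "-", name)
    return name

print(tidy_ocr_name("1, 8- naphthyridine"))
print(tidy_ocr_name("3, 4-dihydro-2H-1, 4-benzoxazine"))
```

The same rules would also turn "Comparative Example 3, 2-bromo-..." into "Comparative Example 3,2-bromo-...", fusing the compound number into the name. That is the "numbering of compounds" problem listed above, so a rule like this needs context checks before it can be trusted.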
29OCR Errors: Compound Names
Searching full-text patents (WO, EP, US, FR, GB,
DE, JP) for the term Simvastatin yields 9030
patents (3666 INPADOC families).
30OCR Errors: Chemical Names
If you think that was bad... look at the IUPAC
names
WO2007096753 6(R)-2-(8'(S)-2",2"-dimethylbutyryloxy-2'(S),6'(R)-dimethyl- l',2',6',7,'8',8a'(R)-hexahydronapthyl-l'(S))-ethyl-4(R)-hydroxy -3,4-5,6-tetrahydro- 2H-pyran-2-one
WO2005095374 6(R)-2-8(5)-(2,2-dimethyl.butyyloxy)-2 (S), 6 (R)-dimethyl-1, 2, 6, 7, 8, 8a(R)-hexahydro-l (S)-napthylelhyl/-4(R)-hydroxy-3, 4, 5, 6-tetrahydro-2H-pyran-2 one
WO2005095374 6(R)-2-8(S)-(2, 2-dimethylbulyryloxy)-2 (S), 6 (R)-dimethyl-1, 2, 6, 7, 8, 8a(R)-hexabydro-l (S)-napthylethyl/-4(R)-hydroxy-3, 4, 5, 6-tetrahydro-2H-pyran-2 one
WO2003018570 6(R)-2-8(S)-(2,2 10 dimethylbutylyloxy)-2(S),6(R)-dimethyl-1,2, 6,7,8,8a(R) hexahydronaphthyl-l(S)ethyl-4(R)-hydroxy-3,4,5,6 tetra hydro-2H-pyrane-2-one
WO2003048149 6(R)-2-8(S)-(2,2- dimethylbutylyloxy)-2(S),6(R)-dimethyl-1,2,6,7,8,8a(R)- hexahydronaphthyl-l(S)ethyl-4(R)-hydroxy-3,4,5,6 20 tetrahydro-2H-pyrane-2-on
WO2003018570 6(R)-2-8(S)-(2,2-dimethylbutylyloxy)-2(S),6(R)-dimeth yl-1,2,6,7,8,8a(R)-hexahydronaphthyl-l(S) ethyl-hydrox y-3,4,5,6-tetrahydro-2H-pyrane-2-one
WO2005095374 6(R)-2-8(S)-(2,2-dimethylbutyrylaxy)-2 (S),6 (R)-dimethyAl, 2, 6, 7, 8, 8a(R)-hexahydro-l (S)-napthylJethyl)-4(R)-hydroxy-3, 4, 5, 6-tetrahydro-2H-pyran-2 one
WO2006072963 6(R)-28(S)-(2,2dimethylbutyryloxy)2(5),6(R).. dimethyI..</p><p>1,2,6,7,8,8a(R)-hexahydro-1 (S)-naphthylJethy1J-4(R)hydroxy3,4,5, 6 tetrahydro-2H-pyran-2-one
31Transposed Characters
Some errors cannot originate from an erroneous
OCR process. Accidentally transposed characters
are another source of variation.
- ehtyl: 1565 patents
- mehtyl: 840 patents
- compuond: 231 patents
- relaese: 44 patents
- formual: 1689 patents
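A small sketch of how such typos could be recognized: each misspelling on this slide differs from a plausible intended word by a single swap of two adjacent characters. The intended words below are inferred for illustration, and the function name is hypothetical.

```python
def is_adjacent_transposition(observed, intended):
    """True if observed equals intended with exactly one adjacent pair swapped."""
    if len(observed) != len(intended) or observed == intended:
        return False
    diffs = [i for i, (a, b) in enumerate(zip(observed, intended)) if a != b]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and observed[diffs[0]] == intended[diffs[1]]
        and observed[diffs[1]] == intended[diffs[0]]
    )

# Misspellings from the slide, paired with inferred intended words.
PAIRS = {
    "ehtyl": "ethyl",
    "mehtyl": "methyl",
    "compuond": "compound",
    "relaese": "release",
    "formual": "formula",
}
results = {typo: is_adjacent_transposition(typo, word) for typo, word in PAIRS.items()}
print(results)
```

An OCR-style error such as "metlioxy" for "methoxy" changes the length of the word, so this check rejects it, which fits the slide's point that transpositions are a separate error source.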
32Chemical Name Annotation of US patents backfile
(1976-2005) and US patent applications (2002-2005)
Rule 112 Analysis
- Preliminary results as of June 20, 2006 -
Number of molecules identified (total): 65,645,252
Number of unique molecules: 3,623,248
Number of molecules passing the Lipinski rules: 1,830,575
Number of documents with possible 112 violations: 363,993
Number of 2005 pre-grants with possible 112 violations: 17,122
All identified molecules were successfully
converted to SMILES strings
33Analysis Results
Post-processing with Pipeline Pilot
Molecules
- TOTAL: 65,645,252
- UNIQUE: 3,623,248
- DRUG¹: 1,830,575
¹ Passing Lipinski's Rule of 5: http://en.wikipedia.org/wiki/Lipinski's_Rule_of_Five
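For reference, a minimal sketch of a Rule of Five check, assuming the four properties have already been computed elsewhere (for example by a tool such as Pipeline Pilot). The thresholds follow the usual statement of the rule: molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, and at most 10 hydrogen-bond acceptors. Whether the analysis above tolerated one violation is not stated, so that is a parameter here. The property values in the example are invented and do not describe real molecules.

```python
from dataclasses import dataclass

@dataclass
class MolProps:
    mol_weight: float   # daltons
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors

def lipinski_violations(p):
    """Count how many of the four Rule of Five criteria are violated."""
    return sum([
        p.mol_weight > 500,
        p.logp > 5,
        p.h_donors > 5,
        p.h_acceptors > 10,
    ])

def passes_rule_of_five(p, max_violations=1):
    """Assumption: a molecule passes if it has at most max_violations violations."""
    return lipinski_violations(p) <= max_violations

# Invented property sets for illustration only.
small = MolProps(mol_weight=300.0, logp=2.5, h_donors=1, h_acceptors=4)
large = MolProps(mol_weight=650.0, logp=6.2, h_donors=2, h_acceptors=8)
print(passes_rule_of_five(small), passes_rule_of_five(large))
```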
34IBM's Research Collaboration on Computer
Curation
Automated Text and Image Analysis!
Annotation Factory -> Data Warehouse
Data
- Annotators
- Chemicals
- Biomarkers
- Genes
- Proteins
- Cell Lines
- Cell Types
- People
- Institutions
- Diseases
- Symptoms
- Other
Full-Text Chemical Structures
Journals
Attributes
Medline
Search
Patents
Entities
Edgar
Analysis
Relationships
Web
Co-occurrence, Lipinski Rules, Section 112, Trends,
Molecular Networks, Time lines
"UIMA"
Blue Gene
Scitegic Pipeline Pilot and other Partner Tools
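Co-occurrence, the first analysis listed for the data warehouse, can be sketched as counting how often two annotated entities appear in the same document. The entity labels and document IDs below are placeholders, not data from the project.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(doc_entities):
    """Count, for each unordered entity pair, the documents containing both."""
    counts = Counter()
    for entities in doc_entities.values():
        # Deduplicate within a document and sort so each pair has one canonical order.
        for a, b in combinations(sorted(set(entities)), 2):
            counts[(a, b)] += 1
    return counts

# Placeholder annotations as an annotator pipeline might emit them.
doc_entities = {
    "doc1": ["chem:A", "gene:G1", "disease:D1"],
    "doc2": ["chem:A", "gene:G1"],
    "doc3": ["chem:B", "disease:D1"],
}
pairs = cooccurrence(doc_entities)
print(pairs.most_common(1))
```

Pairs with high counts are candidates for a relationship between the two entities. They are not evidence of one, which is why further analysis sits downstream.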
35What about processing image data ??
Image entity recognition
IBM pioneered a process for converting images of
chemical structures into Mol files (machine
readable representations of chemical structures)
We can also analyze the image content of patents
and journals
36Seminal paper on converting chemical images into
MOL files
Optical Recognition of Chemical Structures
(OROCS)
37Optical recognition of chemical structures
(OROCS) How it works
OC(CN1C2(C3CCCCC3)OC(C)CC1O)N(C)C4C2CC(Cl)
CC4
38Optimization of image processing
Extract the images from the page
Isolate the chemical images
Pre-processing of the images makes a
significant difference
SMILES string
39This shows the selective extraction of image data
from within the patent
Individual images
40Image Extracted from the page
Structure Generated from the image
SMILES string generated from the image
Chemical derived from OCR of image data
Examples: results from OCR of chemical images
Source: Dr. John Kinney
41Learning from the Exceptions
- Radicals, polymers, organometallics
- Name lookup table differences
- formal
- Structure conventions differ
- i.e., CH3MgBr vs. CH3Mg.Br-
- Ionization state/stereochemistry
- Internal error corrections
- Some names are incomplete and therefore ambiguous!
42Differences of opinion
Often tagged as ambiguous
Where do the punctuation marks belong?
43Structures from Images
- Image-to-structure software is very effective on clean, crisp images
- Like text, image quality in documents varies greatly!
- Improper structure assignments are common
44Structure Recognition Process
- Clipped images from documents are used.
- Processing of full-page images is slow and gives many errors.
- OSRA (NIH) is run to produce SDFile output.
- A Pipeline Pilot protocol is used to analyze and filter the resulting structure set.
45Criteria for filtering invalid structures
- Presence of non-element atoms (R, X, etc.)
- Inappropriate internal coordinates (bond lengths and angles) of the 2D representation
- Over-assigned stereochemistry can be corrected rather than removing the entire structure
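The first criterion can be sketched as a membership test against the periodic table, assuming the atom symbols have already been read from each record of the structure file. This stands in for the filtering step and is not the Pipeline Pilot protocol itself; the function name is illustrative.

```python
# Symbols of the 118 known elements.
ELEMENTS = set("""
H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn
Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La
Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po
At Rn Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr Rf Db Sg Bh Hs Mt Ds Rg
Cn Nh Fl Mc Lv Ts Og
""".split())

def has_non_element_atoms(atom_symbols):
    """True if any atom label (such as R or X) is not a chemical element symbol."""
    return any(symbol not in ELEMENTS for symbol in atom_symbols)

print(has_non_element_atoms(["C", "C", "R"]))   # generic R group present
print(has_non_element_atoms(["C", "N", "Cl"]))  # all real elements
```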
46Examples of common errors in translation
Example structure / Error / Filter rule
Error: Double bond interpreted as two single bonds
Filter rule: The minimum bond distance where neither atom is hydrogen is required to be greater than 0.85 Å.
Error: Aromatic bond interpreted as exocyclic bond from ring
Filter rule: The minimum bond angle from an exocyclic terminal atom to the ring atoms is required to be greater than 50°.
47Examples of common errors in translation
Example structure / Error / Filter rule
Error: Atom found in center of single bond
Filter rule: The maximum bond angle of a carbon with exactly two single bonds is required to be less than 155°.
Error: Single bond divided into two single bonds
Filter rule: The minimum bond angle which includes any terminal atom is required to be greater than 10°.
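Two of the geometric filter rules above can be sketched directly on 2D coordinates: the minimum heavy-atom bond length (0.85, assumed to be in angstroms) and the maximum angle at a carbon with exactly two single bonds (155, taken here to be degrees). The data layout, function names, and example coordinates are assumptions made for illustration.

```python
import math

MIN_HEAVY_BOND_LENGTH = 0.85       # threshold from the slides; angstroms assumed
MAX_TWO_BOND_CARBON_ANGLE = 155.0  # threshold from the slides; degrees assumed

def angle_deg(center, a, b):
    """Angle a-center-b in degrees for 2D points."""
    v1 = (a[0] - center[0], a[1] - center[1])
    v2 = (b[0] - center[0], b[1] - center[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        return 0.0  # coincident atoms; treat as a degenerate angle
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

def passes_geometry_filters(atoms, bonds):
    """atoms: list of (symbol, x, y); bonds: list of (i, j, order)."""
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j, order in bonds:
        (si, xi, yi), (sj, xj, yj) = atoms[i], atoms[j]
        # Rule: bonds between non-hydrogen atoms must be longer than the minimum.
        if si != "H" and sj != "H":
            if math.dist((xi, yi), (xj, yj)) <= MIN_HEAVY_BOND_LENGTH:
                return False
        neighbors[i].append((j, order))
        neighbors[j].append((i, order))
    for i, (symbol, x, y) in enumerate(atoms):
        nbrs = neighbors[i]
        # Rule: a carbon with exactly two single bonds must be clearly bent.
        if symbol == "C" and len(nbrs) == 2 and all(o == 1 for _, o in nbrs):
            a, b = (atoms[n][1:] for n, _ in nbrs)
            if angle_deg((x, y), a, b) >= MAX_TWO_BOND_CARBON_ANGLE:
                return False
    return True

# Invented coordinates: a bent three-carbon chain, a straight one, and a too-short bond.
bent = [("C", 0.0, 0.0), ("C", 1.3, 0.75), ("C", 2.6, 0.0)]
straight = [("C", 0.0, 0.0), ("C", 1.5, 0.0), ("C", 3.0, 0.0)]
chain_bonds = [(0, 1, 1), (1, 2, 1)]
short = [("C", 0.0, 0.0), ("C", 0.4, 0.0)]
print(passes_geometry_filters(bent, chain_bonds),
      passes_geometry_filters(straight, chain_bonds),
      passes_geometry_filters(short, [(0, 1, 1)]))
```

The straight chain is the signature of an atom wrongly placed in the middle of a single bond, which is the case the 155 rule is meant to catch.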
48Conversion Statistics
- 20,081 patents with 487,537 clip files
- 35% clean
49Combining Text and Image Structures
50Image Processing Operations
PTO/ Data Processing Operations
Chem CWUs
Clip Images
CDX / MOL files
OSRA /Clide
SDF files
SDF files
Multi-step post processing Operations
Multi-step post processing Operations
Multi-step post processing Operations
51Image Extracted from the page
Structure Generated from the image
SMILES string generated from the image
Chemical derived from OCR of image data
Examples: results from OCR of chemical images
Source: Dr. John Kinney
52 Computer curation now involves multiple types of
analysis
combining technologies into workflow protocols
IBM Collaborator input
Derived Meta data
Output db to Collaborators
Internal data
53Multiple workflows for processing text and images
via different technologies
54Computer Curation Process Overview
Services Hosted at IBM Almaden
User Applications
Annotation Factory
ChemVerse
Selected Internet Content
Pipeline Pilot
U.S. Patents (1976 -2009)
ChemVerse db
ChemVerse (Semantic Associations)
e Classifier Other Data Associations
View selected Documents Reports
BIW
U.S. Pre- Grants (All)
ADU
IP Database (e.g. DB2)
Database computed Meta Data
Data Sources
Parse Extract data
PCT EPO Apps
Cognos/DDQB/ Other Apps
Medline Abstracts (>18 M)
In-House Content
Computational Analytics
Annotator 1
Chem Search
Annotator 2
SIMPLE
ADU: Automated Data Update
55What about additional meta data ?
Data association
How should we identify, extract, and associate
attributes?
Semantic associations using ChemVerse
56ChemVerse a tool for associating molecular
attributes from different sources
Attributes derived from different sources
IP Attributes
- Orange Book
- Legal status
- Assignee
- Foreign filings
- Expiration date
Spectral Attributes
- NIST db
- IR spectra
- NMR
- Mass spec, etc.
Physical Attributes
- Computational
- MW
- MF
- Bp
- Mp, etc.
Screening Attributes
Molecular entities have various attributes
(from different sources)
Drug Attributes
- PubChem
- Activity
- Pharma data
- Target data for SRA
- Literature references
- Drugbank
- Activity
- Pharma data
- Protein Binding
- half life
Toxicity Attributes
- WomBat
- Activity
- Pharma data
- Target data for SRA
- Literature references
- EPA databases
- Toxicity studies
- LD50
- Literature references
57ChemVerse Semantically maps associations of
attributes from different sources
Semantic association of attributes
Drugbank
PubChem
FDA
Orange Book
Others
Database C (Tox)
Internet
Location
Data Source 1 Schema 1
Attributes
Output file list of attributes
Data Sources
Input list of SMILES
The Tank
Data Source 2 Schema 2
Attributes
Input list of Attributes
Output file list of SMILES
Attributes
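The two lookups in this diagram (a list of SMILES in and attributes out, or attributes in and a list of SMILES out) can be sketched as a merge across source tables keyed by structure. Everything here is a hypothetical illustration: the source names, attribute keys, assignee values, and function names are invented, and a real system would first canonicalize SMILES so that the same molecule always maps to the same key.

```python
# Hypothetical source tables, each with its own schema, keyed by SMILES.
SOURCES = {
    "ip": {
        "c1ccccc1": {"assignee": "ExampleCo"},
        "CC1=CC=CC=C1": {"assignee": "OtherCo"},
    },
    "physical": {
        "c1ccccc1": {"mf": "C6H6"},
        "CC1=CC=CC=C1": {"mf": "C7H8"},
    },
}

def attributes_for(smiles_list, sources):
    """Input: list of SMILES. Output: merged attributes per structure."""
    report = {}
    for smiles in smiles_list:
        merged = {}
        for source_name, table in sources.items():
            for key, value in table.get(smiles, {}).items():
                # Prefix with the source name so schemas cannot collide.
                merged[f"{source_name}.{key}"] = value
        report[smiles] = merged
    return report

def smiles_with(attribute, value, sources):
    """Input: a 'source.key' attribute and value. Output: matching SMILES."""
    source_name, key = attribute.split(".", 1)
    table = sources.get(source_name, {})
    return sorted(s for s, attrs in table.items() if attrs.get(key) == value)

forward = attributes_for(["c1ccccc1"], SOURCES)
reverse = smiles_with("physical.mf", "C7H8", SOURCES)
print(forward, reverse)
```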
58Who is participating?
Pfizer
Novartis
Bruce.a.Lefker, Christopher.Kibbey,
David.J.Walsh, Sarah.Blendermann, Bryn
Williams-Jones Jacquelyn Klug-McLeod Lee Harland
Robert Owen Marudai Balasubramanian
BMS
Therese.Vachon, Edgar.Jacoby, Peter.Ertl,
Peter.Gedeck, Fatma.Oezdemir-Zaech, John-w.Davies
, Jeremy.Jenkins, Allen.Cornett, Stefan
Wetzel Greg Landrum Richard Lewis A J Dambra
Cynthia.Yang, Charles.Hand, Michael.Rogers,
Ramesh.durvasula, Alice.goshorn,
Mark.Hermsmeier,
AstraZeneca
Sorel.Muresan, Christopher.Southan,
Niklas.Blomberg, Plamen.Petrov,
WIPO
Glenn.Macstravic, Chrstophe.Mazenc, Lustin
Diaconescu, Paul Halfpenny
IBM Almaden Research
Lilly
Ana Lelescu Linda Kato Su Yan Ashish
Sanghavi Ramachandran Prasad Qi He Timothy J
Bethea Yanbo Wu Meenakshi Nagarajan Christopher
Campbell
Stephen Boyer, Jeff Kreulen Ying Chen Tom
Griffin Alfredo Alba Scott Spangler Eric Louie
Brad Wade John Colino Isaac Cheng
Thompson Doman
Boehringer
Jasmin.Saric, Scott.Oloff, John.Hart,
Stephen.Boyer, John.Proudfoot, Markus.Kunze,
NIH
Marc Nicklaus Igor Filippov Marcus Sitzmann
EBI
Genentech / Roche
John Overington Christopher Steinbeck Dominique
Clark
Jeff Blaney, Slaton Lipscomb Keven Clark Jw
Feng Vickie Tsui Bin Qing Ben Sellers
Dupont
John.B.Kinney, Timothy.E.Mueller,
59Research - It's a journey
Backup materials