1. Genomic Information Retrieval Using Links
Fabrice Camous
Supervisors: Stephen Blott and Alan Smeaton
10th of February 2005
School of Computing, DCU
GenIRL is funded by Enterprise Ireland under the
Basic Research Grants Scheme, project number
SC-2003-0047-Y.
2. Gene Expression Process Overview
3. Why this project?
[Diagram: Web Search, Genomic Databases, and Wet Labs, each annotated with "Links"]
4. Talk Overview
- Genomic data: a heterogeneous environment
- Links: diversity in genomic information
- Current work and results
- Future work
5. Genomic Data Heterogeneity
Model Organisms
Gene Expression Process
Publications
AGGTCTCTAAGTCTTAGAGGTACCT
RSNVQABLFTNLGHBQTRRNVV
6. Explicit Links
Model Organisms
Gene Expression Process
Publications
AGGTCTCTAAGTCTTAGATTTACCT
AGGTCTCTAAGTCTTAGAGGTACCT
Homology
Citation
Homology
Cross-domain references
RSNVQABLFTNLGHBQTRRNVV
7. Implicit Links
Model Organisms
Gene Expression Process
Publications
AGGTCTCTAAGTCTTAGATTTACCT
AGGTCTCTAAGTCTTAGAGGTACCT
Sequence similarity
Textual similarity
Ontology-based similarity
RSNVQABLFTNLGHBQTRRNVV
Ontology-based similarity (MEDLINE database)
8. Implicit Links: MEDLINE records
PMID- 10506108
TI  - Reduction of UV-induced skin tumors in hairless mice by selective COX-2
      inhibition.
AB  - UV light is a complete carcinogen, inducing both basal and squamous cell
      skin cancers. The work described uses the selective COX-2 inhibitor
      celecoxib to examine the efficacy of COX-2 inhibition in the reduction
      of UV light-induced skin tumor formation in hairless mice. UVA-340 sun
      lamps were chosen as a light source that effectively mimics the solar
      UVA and UVB spectrum. Hairless mice were irradiated for 5 days a week
      for a total dose of 2.62 J/cm(2). When 90% of the animals had at least
      one tumor, the mice were divided into two groups so that the tumor
      number and multiplicity were the same (P < 0.31). Half of the mice were
      then fed a diet containing 1500 p.p.m. celecoxib. Tumor number,
      multiplicity and size were then observed for the next 10 weeks.
      Ninety-five percent of the tumors formed were histopathologically
      evaluated as
MH  - Animals
MH  - Carcinoma, Squamous Cell/enzymology/pathology/prevention & control
MH  - Cell Division
MH  - Cyclooxygenase Inhibitors/therapeutic use
MH  - Female
MH  - Immunohistochemistry
MH  - Isoenzymes/drug effects/metabolism
MH  - Mice
MH  - Mice, Inbred HRS
MH  - Neoplasms, Radiation-Induced/enzymology/pathology/prevention & control
MH  - Prostaglandin-Endoperoxide Synthase/drug effects/metabolism
MH  - Skin Neoplasms/enzymology/pathology/prevention & control
MH  - Sulfonamides/therapeutic use
MH  - Support, U.S. Gov't, P.H.S.
MH  - Ultraviolet Rays
PMID- 10434051
TI  - Quantitative alterations of hyaluronan and dermatan sulfate in the
      hairless mouse dorsal skin exposed to chronic UV irradiation.
AB  - The quantitative alterations of hyaluronan and dermatan sulfate in the
      upper dermis (fibrous tissue) and the lower dermis (adipose tissue) of
      the hairless mouse skin chronically exposed to the UV irradiation as
      solar-simulating irradiation (lambda(max) 352 nm, UV distribution:
      300-310 nm, 0.9%; 310-320 nm, 2.0%; 320-420 nm, 97.1%) were evaluated.
      Hyaluronan and dermatan sulfate contents in each part of dermis were
      determined as follows: skin sections on a glass slide prepared by
      histological technique were processed into the upper dermis and the
      lower dermis with a small surgical knife, and treated with
      chondroitinase ABC and ACII in the presence of bacterial collagenase.
      The resulting unsaturated disaccharides were determined by HPLC method.
      By applying this method
MH  - Animals
MH  - Chondroitin ABC Lyase
MH  - Collagenases
MH  - Deoxyribonucleases, Type II Site-Specific
MH  - Dermatan Sulfate/radiation effects
MH  - Disaccharides/analysis
MH  - Female
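To build links from records like the two above, the MeSH headings first have to be pulled out of each record. The following is a minimal sketch, assuming the records are available as text in the tagged MEDLINE layout (a short tag, a hyphen, then the value, with indented continuation lines). The function names `parse_medline` and `mesh_descriptors` are illustrative and not taken from the talk.

```python
import re

# A tagged line looks like "MH  - Mice, Inbred HRS" or "PMID- 10434051".
TAG_LINE = re.compile(r"^([A-Z]{2,4})\s*-\s?(.*)$")

def parse_medline(text):
    """Parse one MEDLINE-format record into {tag: [values]}."""
    record = {}
    tag = None
    for line in text.splitlines():
        match = TAG_LINE.match(line)
        if match:
            tag = match.group(1)
            record.setdefault(tag, []).append(match.group(2).strip())
        elif tag and line.strip():
            # Indented continuation line: append to the last value seen.
            record[tag][-1] += " " + line.strip()
    return record

def mesh_descriptors(record):
    """Return main headings only, dropping subheadings and '*' markers."""
    return [mh.split("/")[0].lstrip("*") for mh in record.get("MH", [])]
```

For example, the heading `Dermatan Sulfate/radiation effects` is reduced to the descriptor `Dermatan Sulfate`, which is the unit that has a position in the MeSH hierarchy.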
9. The Medical Subject Headings (MeSH)
1- Anatomy [A]
2- Organisms [B]
3- Diseases [C]
4- Chemicals and Drugs [D]
5- Analytical, Diagnostic and Therapeutic Techniques and Equipment [E]
6- Psychiatry and Psychology [F]
7- Biological Sciences [G]
8- Physical Sciences [H]
9- Anthropology, Education, Sociology and Social Phenomena [I]
10- Technology and Food and Beverages [J]
11- Humanities [K]
12- Information Science [L]
13- Persons [M]
14- Health Care [N]
15- Geographic Locations [Z]
22,568 unique descriptors
[Diagram: part of the Anatomy hierarchy: Body Regions, Digestive System, Sense Organs, Cells (Blood Cells, Neurons, Stem Cells, Muscle Cells)]
[Diagram: Research Papers -> Human Indexer -> MeSH Assignments -> MEDLINE database]
10. MeSH link weight calculation using the MeSH hierarchy
MeSH link weight
Analytical, Diagnostic and Therapeutic Techniques
and Equipment
- PMID- 10434051
- MH - Animals
- MH - Chondroitin ABC Lyase
- MH - Collagenases
- MH - Deoxyribonucleases, Type II Site-Specific
- MH - Dermatan Sulfate/radiation effects
- MH - Disaccharides/analysis
- MH - Female
- MH - Histological Techniques
- MH - Hyaluronic Acid/radiation effects
- MH - Mice
- MH - Mice, Inbred HRS
- MH - Skin/chemistry/pathology/radiation effects
- MH - Swine
- MH - Ultraviolet Rays
- PMID- 10506108
- MH - Animals
- MH - Carcinoma, Squamous Cell/enzymology/pathology/prevention & control
- MH - Cell Division
- MH - Cyclooxygenase Inhibitors/therapeutic use
- MH - Female
- MH - Immunohistochemistry
- MH - Isoenzymes/drug effects/metabolism
- MH - Mice
- MH - Mice, Inbred HRS
- MH - Neoplasms, Radiation-Induced/enzymology/pathology/prevention & control
- MH - Prostaglandin-Endoperoxide Synthase/drug effects/metabolism
- MH - Skin Neoplasms/enzymology/pathology/prevention & control
- MH - Sulfonamides/therapeutic use
- MH - Support, U.S. Gov't, P.H.S.
- MH - Ultraviolet Rays
Investigative Techniques
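The slide does not give the formula behind the MeSH link weight, so the following is only a sketch of one plausible hierarchy-based weighting, not the method used in the project. It assumes each descriptor is represented by a dotted tree number, scores two descriptors by how much of their path from the root they share, and averages the best matches between two records. The tree numbers in the usage note are made up for illustration.

```python
def common_prefix_depth(a, b):
    """Count the leading tree-number components shared by a and b."""
    depth = 0
    for x, y in zip(a.split("."), b.split(".")):
        if x != y:
            break
        depth += 1
    return depth

def descriptor_similarity(a, b):
    """Shared depth divided by the depth of the deeper descriptor (0 to 1)."""
    deepest = max(len(a.split(".")), len(b.split(".")))
    return common_prefix_depth(a, b) / deepest

def mesh_link_weight(record_a, record_b):
    """Symmetric average of best descriptor matches between two records.

    Each record is a list of tree numbers. This averaging scheme is an
    assumption made for illustration only.
    """
    if not record_a or not record_b:
        return 0.0

    def directed(src, dst):
        best = [max(descriptor_similarity(s, d) for d in dst) for s in src]
        return sum(best) / len(best)

    return (directed(record_a, record_b) + directed(record_b, record_a)) / 2
```

With hypothetical tree numbers, `mesh_link_weight(["E05.200.500", "B01.050"], ["E05.200.750", "B01.050"])` rewards the shared descriptor fully and the two sibling techniques partially. This mirrors the slide's example, where Histological Techniques and Immunohistochemistry are different headings that meet under Investigative Techniques.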
11. The use of MeSH link weights in the retrieval process
Query -> Text-based Retrieval System (over MEDLINE abstracts) -> Result Set
Calculation of the MeSH link weights for each pair of MEDLINE records in the Result Set (link weight matrix)
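The pairwise step can be sketched as follows. This is a minimal illustration, assuming the result set is a mapping from PMID to that record's MeSH headings. The default weight is a plain Jaccard overlap used only as a stand-in, since the actual MeSH link weight is hierarchy-based; any symmetric function of two heading lists can be passed in its place.

```python
from itertools import combinations

def jaccard(a, b):
    """Stand-in weight: overlap of two heading lists (not the talk's measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def link_weight_matrix(result_set, weight=jaccard):
    """Return (pmids, matrix) where matrix[i][j] is the weight between records i and j."""
    pmids = sorted(result_set)
    index = {pmid: i for i, pmid in enumerate(pmids)}
    n = len(pmids)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 1.0  # convention: a record is fully linked to itself
    for p, q in combinations(pmids, 2):
        w = weight(result_set[p], result_set[q])
        matrix[index[p]][index[q]] = w
        matrix[index[q]][index[p]] = w  # weights are symmetric
    return pmids, matrix
```

Only the upper triangle is computed and then mirrored, so a result set of n records needs n(n-1)/2 weight calculations per query.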
12. The cluster hypothesis
"Closely associated documents tend to be relevant to the same request." (van Rijsbergen, 1979)
13. The cluster hypothesis
14. Initial experimental method
48,753 judged MEDLINE documents for 50 queries from the TREC Genomics Track 2004.
MeSH link weight matrix calculation for each query
15. Evaluation measures
- Recall = $r_{tight} / R$
- Precision = $r_{tight} / N_{tight}$
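The slide does not define the symbols, so the sketch below reads them in the most natural way, as an assumption: r_tight is the number of relevant documents inside the tightest cluster, R is the number of relevant documents in the whole result set, and N_tight is the size of the tightest cluster. The function name is illustrative.

```python
def tight_cluster_scores(tight_cluster, relevant):
    """Return (recall, precision) of the tightest cluster.

    tight_cluster: ids of the documents in the tightest cluster
    relevant:      ids of all documents judged relevant for the query
    """
    tight_cluster, relevant = set(tight_cluster), set(relevant)
    r_tight = len(tight_cluster & relevant)
    recall = r_tight / len(relevant) if relevant else 0.0
    precision = r_tight / len(tight_cluster) if tight_cluster else 0.0
    return recall, precision
```

For instance, a cluster of four documents that contains three of a query's six relevant documents has recall 0.5 and precision 0.75.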
16. Recall
Recall per topic
Average recall: 0.75
17. Recall
Recall divided by relative size of tightest cluster
Average ratio: 1.271012038
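This ratio can be computed per topic as in the sketch below. It assumes that "relative size" means the tightest cluster's size as a fraction of the whole result set; the parameter names are illustrative. A ratio above 1 means the cluster holds a larger share of the relevant documents than its share of the result set.

```python
def recall_to_size_ratio(r_tight, n_tight, total_relevant, result_set_size):
    """Recall of the tightest cluster divided by its relative size."""
    if min(n_tight, total_relevant, result_set_size) <= 0:
        raise ValueError("sizes must be positive")
    recall = r_tight / total_relevant
    relative_size = n_tight / result_set_size
    return recall / relative_size
```

For example, a cluster holding 50 of 100 retrieved documents and 30 of the 40 relevant ones gives a ratio of 0.75 / 0.5 = 1.5.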
18. Precision
Precision before/after clustering
Precision improved for 42 of the 50 topics
19. Precision
Precision increase (%) after clustering
Average precision increase (%): 0.27
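The precision increase appears to be closely tied to the ratio reported on the second recall slide. The short derivation below assumes that precision before clustering is $R/N$, where $N$ (a symbol not used on the slides) is the size of the result set, and that the increase is measured relative to the precision before clustering.

```latex
\frac{\text{Recall}}{\text{relative size}}
  = \frac{r_{tight}/R}{N_{tight}/N}
  = \frac{r_{tight}/N_{tight}}{R/N}
  = \frac{P_{after}}{P_{before}},
\qquad
\frac{P_{after} - P_{before}}{P_{before}}
  = \frac{P_{after}}{P_{before}} - 1
```

Under these assumptions, an average ratio of about 1.27 and an average relative precision increase of about 0.27 are two views of the same result.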
20. Future work
- Ways to improve what we did
- Create implicit links with the Gene Ontology (linking different model organism databases)
- Integrate sequence data with publications
21. References
- Blott et al. (2005). On the use of Clustering and the MeSH Controlled Vocabulary to Improve MEDLINE Abstract Search. To appear in the proceedings of the second Conférence en Recherche d'Informations et Applications (CORIA) 2005, Grenoble, France.
- Funk, Reid and McGoogan (1983). Indexing consistency in MEDLINE. Bulletin of the Medical Library Association, 71:176-83.
- Van Rijsbergen, C. J. (1979). Information Retrieval. London: Butterworths.
- Williams (2003). Genomic Information Retrieval. Proceedings of the Fourteenth Australasian Database Conference on Database Technologies 2003, Volume 17, Adelaide, Australia.