milkER: a milk informatics resource

Slides: 33
Provided by: edwa121
Transcript and Presenter's Notes

1
milkER: a milk informatics resource
  • Stephen Edwards BSc.
  • University of Edinburgh
  • BioNLP meeting 6th June 2005

2
Overview
  • Aims of milkER
  • milkER database
  • Text-mining
  • Potential targets

3
milkER aims
  • To amalgamate dispersed milk information into one
    resource, allowing more focused analysis of milk
    proteins in relation to dairy issues, health and
    disease.

4
A milk database
  • Knowledge on milk affects many industries
  • UniProt, GenBank excellent resources
  • Marsupial genomics database (New Zealand)
  • Glasgow genomics data
  • Chinese database
  • Polish bioactive peptide database
  • Food property database (commercial)

5
Milk components
  • Fat, carbohydrates, proteins, minerals
  • Growth factors, enzymes, enzyme inhibitors,
    immunoglobulins, allergens, disease factors,
    anti-bacterial proteins, opioids
  • 1. Deliberate
  • 2. Leakage from blood
  • 3. Result of disease conditions
  • 4. Engineered
  • 5. Bacterial origin

6
milkER database
  • Database using BioSQL which allows incorporation
    of UniProt, EMBL, GenBank entries

7
LOCUS       NM_173929                790 bp    mRNA    linear   MAM 27-OCT-2004
DEFINITION  Bos taurus lactoglobulin, beta (LGB), mRNA.
ACCESSION   NM_173929
VERSION     NM_173929.2  GI:31343239
KEYWORDS    .
SOURCE      Bos taurus (cow)
  ORGANISM  Bos taurus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovidae;
            Bovinae; Bos.
REFERENCE   1  (bases 1 to 790)
  AUTHORS   Jayat,D., Gaudin,J.C., Chobert,J.M., Burova,T.V., Holt,C.,
            McNae,I., Sawyer,L. and Haertle,T.
  TITLE     A recombinant C121S mutant of bovine beta-lactoglobulin is more
            susceptible to peptic digestion and to denaturation by reducing
            agents and heating
  JOURNAL   Biochemistry 43 (20), 6312-6321 (2004)
   PUBMED   15147215
  REMARK    GeneRIF: Results suggest that the stability of
            beta-lactoglobulin arising from the hydrophobic effect is reduced
            by the C121S mutation so that unfolded or partially unfolded
            states are more favored.
ORIGIN
        1 actccactcc ctgcagagct cagaagcgtg atcccggctg cagccatgaa gtgcctcctg
       61 cttgccctgg ccctcacctg tggcgcccag gccctcatcg tcacccagac catgaagggc
..
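
A record like the one above can be read field by field before it is loaded into a database. The following is a minimal sketch, assuming the standard GenBank flat-file layout (keyword at the start of the line, sequence after ORIGIN, record closed by "//"). The function name and the abbreviated sample record are illustrative and are not part of milkER; indented sub-keywords such as ORGANISM are deliberately skipped.

```python
import re

# Abbreviated copy of the header lines of the record shown on the slide.
RECORD = """\
LOCUS       NM_173929                790 bp    mRNA    linear   MAM 27-OCT-2004
DEFINITION  Bos taurus lactoglobulin, beta (LGB), mRNA.
ACCESSION   NM_173929
VERSION     NM_173929.2  GI:31343239
SOURCE      Bos taurus (cow)
ORIGIN
        1 actccactcc ctgcagagct cagaagcgtg atcccggctg cagccatgaa gtgcctcctg
//
"""


def parse_genbank_header(text):
    """Collect top-level keyword/value pairs and the sequence from one record."""
    fields = {}
    sequence = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-record marker
            break
        if in_origin:
            # Sequence lines hold a position number and blocks of bases;
            # keep only the lowercase base letters.
            sequence.append(re.sub(r"[^a-z]", "", line))
            continue
        # Top-level keywords start in column 1; indented lines are ignored.
        match = re.match(r"^([A-Z]+)\s*(.*)$", line)
        if match:
            key, value = match.groups()
            if key == "ORIGIN":
                in_origin = True
            else:
                fields[key] = value.strip()
    fields["sequence"] = "".join(sequence)
    return fields


record = parse_genbank_header(RECORD)
print(record["ACCESSION"], record["SOURCE"], len(record["sequence"]))
```

For real data an established parser would be the safer choice; this sketch only shows how the flat-file fields map onto key/value pairs.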
8
(Diagram: information retrieval from other databases (EMBL, UniProt) and information extraction from other sources (e.g. published tables) feed milkER population; the milkER database is accessed by web query.)
9
milkER database
  • Database using BioSQL which allows incorporation
    of UniProt, EMBL, GenBank entries
  • Library of literature on milk
  • User interface (www.milker.org.uk)

13
Text-mining
  • Machine reading of text
  • Many techniques involved
  • Tokenisation
  • Stemming (Activation → Activat)
  • POS tagging (Protein → noun)
  • Abbreviation expansion (CN → Casein)
  • Entity identification (Casein → protein)
  • Dictionary
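
Several of the techniques listed above can be illustrated with a few lines of Python. This is a minimal sketch only: the lookup tables, the suffix list and all function names are hypothetical, the stemmer is a crude suffix stripper rather than a published algorithm, and POS tagging is left out because it needs a trained tagger.

```python
import re

# Hypothetical lookup tables; a real system would use curated resources.
ABBREVIATIONS = {"CN": "casein"}
ENTITY_DICTIONARY = {"casein": "protein", "diabetes": "disease"}
SUFFIXES = ("ion", "ing", "ed", "s")  # crude suffix list, not a real stemmer


def tokenise(text):
    """Split text into word tokens, keeping internal hyphens (e.g. B-LG)."""
    return re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)


def stem(token):
    """Lowercase the token and strip the first matching suffix,
    leaving at least three characters."""
    lower = token.lower()
    for suffix in SUFFIXES:
        if lower.endswith(suffix) and len(lower) - len(suffix) >= 3:
            return lower[: -len(suffix)]
    return lower


def expand_abbreviation(token):
    """Replace a known abbreviation with its long form."""
    return ABBREVIATIONS.get(token, token)


def identify_entity(token):
    """Return the entity type from the dictionary, or None if unknown."""
    return ENTITY_DICTIONARY.get(token.lower())


tokens = tokenise("Activation of CN was increased")
expanded = [expand_abbreviation(t) for t in tokens]
print([(t, stem(t), identify_entity(t)) for t in expanded])
```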

14
Increased levels of IgA antibodies to B-LG were
found and were shown to be an independent risk
marker for type 1 diabetes.
  • Tokeniser / POS tagger: Increased → past participle,
    levels → plural noun, of → preposition
  • Entity identification: IgA → antibody, B-LG → protein,
    Diabetes → disease
  • Parser: IgA antibodies to B-LG MARKER type 1 diabetes
15
Information extraction
  • Rule based
  • Verbs: interact, bind, activate
  • Pattern: protein (0-5 words) verb (0-5 words)
    protein
  • (Blaschke and Valencia, 2002)
  • Machine-learning
  • Statistical methods, Hidden Markov Models
  • Learn interfillers, text lying between tagged
    entities (Bunescu et al, 2004)
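
The rule-based pattern above (protein, 0-5 words, verb, 0-5 words, protein) can be approximated with a regular expression. This is a rough sketch under stated assumptions: the protein list is a hypothetical stand-in for a gene/protein dictionary, verb inflections are handled by allowing trailing letters, and it is not the implementation used by Blaschke and Valencia.

```python
import re

# Hypothetical protein names; the verbs are the three listed on the slide.
PROTEINS = ["beta-lactoglobulin", "alpha-lactalbumin", "casein", "IgA"]
VERBS = ["interact", "bind", "activate"]

protein = "(?:" + "|".join(re.escape(p) for p in PROTEINS) + ")"
verb = "(?:" + "|".join(VERBS) + r")\w*"  # allow simple suffixes: binds, activated
gap = r"(?:\s+\S+){0,5}?\s+"              # 0-5 intervening words

PATTERN = re.compile(
    rf"\b(?P<a>{protein})\b{gap}(?P<verb>{verb}){gap}(?P<b>{protein})\b",
    re.IGNORECASE,
)


def extract_interactions(sentence):
    """Return (protein, verb, protein) triples matched by the rule."""
    return [
        (m.group("a"), m.group("verb"), m.group("b"))
        for m in PATTERN.finditer(sentence)
    ]


print(extract_interactions("Alpha-lactalbumin strongly binds to casein in vitro."))
```

A rule like this is easy to read and adjust, but it does not handle negation or co-ordination.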

16
Difficulties
  • Synonyms
  • Proteins and genes with same name
  • Funny names e.g. ERK-1/2, and gene!
  • Variability of natural language
  • Compounded names
  • Co-ordination, negatives, speeling errors

17
Evaluation
  • Precision (P) - how much of the output is correct
  • Recall (R) - how much of the correct information is picked up
  • F-measure - combines P and R
  • IE systems can achieve high results, but not
    enough to populate databases automatically
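
The three measures above can be computed from a set of extracted items and a set of hand-annotated (gold) items. A minimal sketch follows; the function name and the toy item labels are illustrative, and the F-measure is the usual weighted harmonic mean, with beta = 1 giving equal weight to P and R.

```python
def precision_recall_f(predicted, gold, beta=1.0):
    """Return (precision, recall, F-measure) for two collections of items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: fraction of the output that is correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of the correct items that were found.
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = (1 + beta**2) * precision * recall / (beta**2 * precision + recall)
    return precision, recall, f_measure


# Toy example: 3 items extracted, 2 of them correct, 4 items in the gold set.
p, r, f = precision_recall_f({"A", "B", "X"}, {"A", "B", "C", "D"})
print(round(p, 3), round(r, 3), round(f, 3))
```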

18
Text-mining uses
  • Aim to extract interactions and diseases
  • Swanson (Fish oil)
  • Srinivasan (Turmeric)

19
General model for discovering implicit links between topics
  • Starting topic: Turmeric
  • (inhibits)
  • Intermediate topic: Nuclear factor-kappa B
  • (involved in)
  • Terminal topic: Crohn's disease
Diagram taken from Srinivasan et al, 2004
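
The model above chains two explicit links (starting topic to intermediate topic, intermediate topic to terminal topic) into one implicit link. A minimal sketch of that chaining step follows. The link list contains only the example from the slide, the function name is hypothetical, and the ranking and filtering that a real system such as that of Srinivasan et al would need are omitted.

```python
from collections import defaultdict

# Toy (topic, relation, topic) links, taken from the example on the slide.
LINKS = [
    ("turmeric", "inhibits", "nuclear factor-kappa B"),
    ("nuclear factor-kappa B", "involved in", "Crohn's disease"),
]


def implicit_links(links, start):
    """Find terminal topics reachable from `start` via one intermediate
    topic and not already linked to it directly."""
    outgoing = defaultdict(list)
    for source, relation, target in links:
        outgoing[source].append((relation, target))
    direct = {target for _, target in outgoing[start]}
    found = []
    for rel_ab, intermediate in outgoing[start]:
        for rel_bc, terminal in outgoing[intermediate]:
            if terminal != start and terminal not in direct:
                found.append((start, rel_ab, intermediate, rel_bc, terminal))
    return found


for chain in implicit_links(LINKS, "turmeric"):
    print(" -> ".join(chain))
```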
20
Targets for text mining
  • Many milk relationships still require further
    investigation
  • Positive reasons
  • - nutritional benefits
  • - neonatal growth
  • - antimicrobial activity
  • - bioactive peptides

21
Targets for text mining (cont.)
  • Negative reasons
  • - recent link with Alzheimer's
  • - diabetes link
  • - asthma
  • - human reactions to cow hormones
  • (e.g. Acne, Danby 2005)
  • - drug transfer to milk and effects
  • - allergic reactions/intolerance
  • - toxic contaminants

22
milkER process
  • 897 proteins, 772 DNA, 1232 RNA
  • Analyze references (1465 MEDLINE refs)
  • MeSH terms, GO terms etc
  • POS tag
  • UMLS standardisation
  • Gene/protein dictionary
  • Extract relations

23
Milk literature
24
milkER interactions
  • Table of interacting proteins
  • Store as queryable XML strings?
  • Discover links between proteins and disease
  • Create hypotheses
  • Confirm experimentally
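
One way the "queryable XML strings" idea above might look is sketched here with Python's standard library. The element and attribute names are purely hypothetical (no milkER schema is given in the slides), and the example relation is the IgA/B-LG marker relation from the parsing slide.

```python
import xml.etree.ElementTree as ET


def interaction_to_xml(entity_a, relation, entity_b, source):
    """Serialise one extracted relation as an XML string (hypothetical schema)."""
    element = ET.Element("interaction", {"relation": relation, "source": source})
    ET.SubElement(element, "participant").text = entity_a
    ET.SubElement(element, "participant").text = entity_b
    return ET.tostring(element, encoding="unicode")


def find_partners(xml_strings, entity):
    """Return every entity stored in an interaction together with `entity`."""
    partners = set()
    for xml_string in xml_strings:
        names = [p.text for p in ET.fromstring(xml_string).findall("participant")]
        if entity in names:
            partners.update(name for name in names if name != entity)
    return partners


stored = [
    interaction_to_xml("IgA antibodies to B-LG", "MARKER", "type 1 diabetes", "slide example"),
]
print(stored[0])
print(find_partners(stored, "type 1 diabetes"))
```

Storing each relation as a self-contained string keeps the table simple, at the cost of parsing every string on each query.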

25
Diabetes
  • Pancreas secretes hormones
  • Glucagon, increases conversion of glycogen → glucose
  • Insulin, increases conversion of glucose → glycogen.
    Allows glucose into cells.
  • Condition where the amount of glucose in the
    blood is abnormally high as the body cannot use
    it adequately as fuel

26
Diabetes
  • Affects 3-5% of industrialised populations
  • Type 1 (10%)
  • Genetic and environmental factors (e.g. diet)
  • Decreased insulin production
  • Mostly develops < age 20
  • Type 2 (90%)
  • Resistance of body to insulin
  • Normally develops > age 40
  • Often associated with high blood pressure, cholesterol and
    arterial disease

27
Milk and diabetes
28
Selected quotes
  • More research is needed on all aspects of
    lactation in women with diabetes.
  • Reader D. et al, Curr Diab Rep. 2004
  • The effect of high protein intakes from
    different sources on glucose-insulin metabolism
    needs further study
  • Hoppe et al, European Journal of Clinical
    Nutrition 2005
  • American children also tend to be heavier than
    those from European countries, skewing the
    growth charts further.
  • The Scotsman Sat 5 Feb 2005
  • The government currently recommends that babies
    should be fed breast milk alone for the first six
    months - the WHO recommends two years.

29
Conclusions
  • Knowledge of milk vital in many areas
  • milkER aims to bring disparate milk data together
  • Text-mining can wade through large amounts of
    data to retrieve and discover vital information

30
Future work
  • Relation extraction of milk literature
  • Extend content of milkER to include interaction
    data
  • Create hypotheses for experimental work

31
Acknowledgements
  • Prof. Lindsay Sawyer
  • Dr. Carl Holt (Hannah Research Institute, Ayr)
  • Prof. Bonnie Webber (Informatics)
  • Dr. Alistair Kerr and Dr. Douglas Armstrong for
    technical support

32
References
  • Acne/milk
  • Acne and milk, the diet myth, and beyond (Danby,
    2005)
  • Diabetes/milk
  • Milk and diabetes (Schrezenmeir et al, 2000)
    REVIEW
  • The role of β-casein variants in the induction of
    insulin-dependent diabetes (Elliott et al, 1997)
  • Text-mining
  • Natural language processing and systems biology
    (Cohen et al, 2004) REVIEW
  • Mining MEDLINE for implicit links between dietary
    substances and diseases (Srinivasan et al, 2004)
  • Learning to extract proteins and their
    interactions from MEDLINE abstracts (Bunescu et
    al, 2003)