How to use the web for bioinformatics

Slides: 29
Provided by: ethans4


1
How to use the web for bioinformatics
  • Molecular Technologies
  • Ethan Strauss
  • ethan.strauss_at_promega.com
  • 274-4330 X 1171
  • http://www.q7.com/ethan

2
Objectives
  • At the end of this session you should be able to
    do all of the following using freely available
    tools on the world wide web
  • Use GenBank or a similar database to find nucleic
    acid sequences of interest
  • Understand the parts of a GenBank entry
  • Use some of the databases at NCBI to find more
    information about a sequence.
  • Perform an alignment of several nucleic acid
    sequences
  • Find an arbitrary tool or database on the web.

3
How to find all those dang URLs!
  • http://q7.com/ethan/molbio/

4
Outline
  • What is Bioinformatics?
  • Sequence Databases
  • What does a GenBank Entry look like?
  • Other NCBI databases
  • Multiple Sequence Alignment
  • New tools and databases

5
What is Bioinformatics?
  • Bioinformatics refers to the creation and
    advancement of algorithms, computational and
    statistical techniques, and theory to solve
    formal and practical problems posed by or
    inspired from the management and analysis of
    biological data (Wikipedia)

6
What is Bioinformatics? (my working definition)
  • Anything done on a computer in which knowledge of
    biology is helpful.
  • or
  • Anything done in biology in which knowledge of
    computers is helpful.

7
What sort of questions can Bioinformatics answer?
  • Sequence analysis
  • Where are restriction sites?
  • How does an RNA molecule fold?
  • What changes can be made to a DNA sequence to
    get a new protein with specific functional
    changes?
  • Computational evolutionary biology
  • How are two sequences related?
  • Analysis of gene expression
  • Is this gene highly expressed in cancer cells?

8
What sort of work is done in Bioinformatics?
  • Measuring biodiversity
  • How diverse are individuals of a species?
  • Is it one species or two?
  • Analysis of regulation
  • What does this drug do to expression of a gene?
  • Analysis of mutations in cancer
  • What is different about these cancer cells as
    compared to non-cancer cells?
  • High-throughput image analysis
  • How can we analyze the effects of 1000 different
    compounds on the location of a specific protein?
  • And more!

9
Sequence Databases
  • NCBI databases: Nucleic acids, proteins,
    literature, genomes, taxonomy, SNPs and more!
  • EMBL: Nucleic acid, protein, structure,
    microarray data and more.
  • DDBJ: Nucleic acid, protein.
  • SwissProt: Very well annotated protein database.
  • Many other general and specialized databases
    exist.

10
Sequence Databases: NCBI/GenBank
  • National Center for Biotechnology Information
    (NCBI)
  • Sponsored and run by the US government.
  • Contains many different databases and huge
    amounts of information.
  • Most or all data is freely downloadable.
  • This one site is probably sufficient for all your
    Nucleic acid and Protein database needs!

11
Sequence Databases: Entrez
  • Allows searching and access to NCBI databases.

12
Sequence Databases: Sequence Records
  • LOCUS - Number, size, type, topology,
    division, date
  • DEFINITION - Name of the Sequence
  • ACCESSION - Unique Id number
  • VERSION - Other numbers which are associated
  • KEYWORDS
  • SOURCE - What it was isolated from
  • ORGANISM - More taxonomic detail
  • REFERENCE - Paper or papers about the sequence
  • AUTHORS
  • TITLE
  • JOURNAL
  • FEATURES - A complete list of all of the
    features of a sequence. Can be very extensive and
    useful!
  • ORIGIN - The actual sequence!
  • http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=58533118
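
The record layout above is regular enough to split apart with a short script. The sketch below is a simplified illustration, assuming that top-level keywords start in the first column and that a line starting with // ends the record. The function names are hypothetical and the sample record is a made-up toy, not a real GenBank entry.

```python
def parse_genbank_sections(text):
    """Map each top-level keyword (LOCUS, DEFINITION, ...) to its text."""
    collected = {}
    current = None
    for line in text.splitlines():
        if line.startswith("//"):        # end-of-record marker
            break
        if not line.strip():
            continue
        if not line[0].isspace():        # keyword lines start in column 1
            current, _, rest = line.partition(" ")
            collected[current] = [rest.strip()] if rest.strip() else []
        elif current is not None:        # indented lines continue the section
            collected[current].append(line.strip())
    return {key: "\n".join(lines) for key, lines in collected.items()}


def origin_sequence(sections):
    """Strip position numbers and spaces from the ORIGIN block."""
    return "".join(ch for ch in sections.get("ORIGIN", "") if ch.isalpha()).upper()


# Toy record for illustration only (hypothetical, not from any database).
SAMPLE = """
LOCUS       TOY0001                   12 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Hypothetical toy sequence.
ACCESSION   TOY0001
VERSION     TOY0001.1
SOURCE      synthetic construct
  ORGANISM  synthetic construct
FEATURES             Location/Qualifiers
     source          1..12
ORIGIN
        1 gaattcgaat tc
//
"""

sections = parse_genbank_sections(SAMPLE)
print(sections["ACCESSION"], origin_sequence(sections))
```

Indented sub-keywords such as ORGANISM are simply folded into their parent section here; a fuller parser would treat them, and the FEATURES table, separately.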

13
Other NCBI databases
  • Online Mendelian Inheritance in Man (OMIM)
  • A catalog of human genes and genetic disorders
    with links to other NCBI databases, including
    sequence databases.
  • This is a good starting point if you want to get
    sequences for a specific disorder.
  • http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=omim&term=HFI

14
Other NCBI databases
  • Gene Database
  • Gathers information about a single gene.
  • Exactly one entry per Gene.
  • A good place to dig deeper into a single gene or
    to reduce redundancy about a single gene.

15
Other NCBI databases
  • HomoloGene: Gathers homologs from various species
  • 3D Domains: Protein structure collection
  • Taxonomy: Species information
  • GEO (Gene Expression Omnibus): A gene
    expression/molecular abundance repository

16
General Utilities
  • http://searchlauncher.bcm.tmc.edu/seq-util/seq-util.html
  • Translation
  • Restriction Digestion
  • Reformatting (alternately FASTA Formatter)
  • Complement/Reverse
  • Etc.
  • http://www.promega.com/biomath/calc11.htm
  • Melting Temperature of an oligo.
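
Two of the utilities listed above are small enough to sketch directly. The example below is a rough illustration, not how the linked web tools work: the melting temperature uses the simple Wallace rule, 2 x (A+T) + 4 x (G+C) in degrees C, which is only a crude estimate for short oligos and is an assumption swapped in here for whatever the online calculator uses. The function names and the example oligo are hypothetical.

```python
# Translation table that swaps each base for its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Complement every base, then reverse the strand."""
    return dna.translate(COMPLEMENT)[::-1]


def wallace_tm(oligo):
    """Rough melting temperature in degrees C by the Wallace rule."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


oligo = "GAATTCGGATCC"  # made-up example oligo
print(reverse_complement(oligo), wallace_tm(oligo))
```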

17
Database search by sequence similarity
  • Basic Local Alignment Search Tool (BLAST)

18
Multiple Sequence Alignment
  • Many programs can align multiple sequences with
    each other to find the best fit for all.
  • This is generally more biologically meaningful
    for protein sequences since they are more highly
    conserved.
  • Clustal is the most common.

19
Multiple Sequence Alignment
  • MEAGAYLNAIIFVLVATIIAVISRGLTRTEPCTIRITGESITVHACHIDSX  ETIKALA
    MEAGAYLNAIIFVLVATIIAVISRGLTRTEPCTIRITGESITVHACHIDS...ETIKALA
    MEA..YLNAII.VLV.TIIAVIS..L.RTEPC.IkITGESITV.ACklDa.....I..L.
    MEAgaYLNAIIfVLVaTIIAVISrgLtRTEPCtIrITGESITVhAChiDsx  etIkaLa
  • LK PLSLERLFQ
    LK.PLSLERLFQ
    ......L.....
    lk plsLerlfq

20
New Tools
  • Development of new tools and databases is
    ongoing.
  • Your needs will probably change over time.
  • You can find new tools using:
  • Google
  • Lists
  • Nucleic Acids Research Annual Database issue

21
Homework
  • Assignments due next session
  • Find an entry of interest in OMIM
    (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM)
  • Find a Gene associated with that entry
  • Click on the "Links" link on the right and
    choose "Gene"

22
Homework
  1. The Gene page has gathered scads of information
    about this one gene. Find homologs in other
    species. From this page, again choose "Links" and
    go to "HomoloGene"

23
Homework
  1. Gather the protein sequences for each homologous
    gene (or 5 of them if there are more than that).
  2. Click "Download" in the HomoloGene listing
  3. Download everything with the default settings.

24
Homework
  • You will get a text file in FASTA format. Save
    it somewhere convenient.
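
A FASTA file is plain text: each record starts with a header line beginning with > followed by one or more lines of sequence. A minimal sketch of reading one is shown below. The function name parse_fasta is hypothetical, and the two records are made-up placeholders rather than real downloads.

```python
def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its full sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)   # sequences may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


# Made-up example input (hypothetical names and sequences).
example = """>species_a example protein
MEAGAYLNAI
IFVLV
>species_b example protein
MEAYLNAII
"""

proteins = parse_fasta(example)
for name, sequence in proteins.items():
    print(name, len(sequence))
```

To read a saved file instead, pass the result of open(path).read() to the same function, where path is wherever the download was saved.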

25
Homework
  • Go to the Clustal server at
    http://searchlauncher.bcm.tmc.edu/multi-align/multi-align.html
  • Paste your complete FASTA file contents into the
    input box and click submit.
  • This takes a while, so be patient. You will get
    output that looks something like this.

26
Homework
  • At the bottom of the alignment file are the same
    results in FASTA format. Copy the complete
    FASTA results and paste them into the input box at
    a BoxShade server
    (http://bioweb.pasteur.fr/seqanal/interfaces/boxshade.html)

27
Homework
  • Depending on the parameters chosen for BoxShade,
    you will see something like this. Regions which
    are the same in all species are likely involved
    in function in some way.
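
The idea of looking for regions that are identical in all species can also be checked by script once the sequences are aligned. The sketch below is an illustration only, assuming the aligned sequences all have the same length and that gaps are written as - or a period. The function name and the three short sequences are made up.

```python
GAP_CHARS = {"-", "."}


def conserved_columns(aligned):
    """Return 0-based column indexes where every sequence has the same residue."""
    lengths = {len(seq) for seq in aligned}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    conserved = []
    for index, column in enumerate(zip(*aligned)):
        residues = {residue.upper() for residue in column}
        # A column counts only if it holds one residue and no gap character.
        if len(residues) == 1 and residues.isdisjoint(GAP_CHARS):
            conserved.append(index)
    return conserved


# Made-up aligned sequences (hypothetical, for illustration).
alignment = ["MEAGAYLN", "MEA-AYLN", "MEAGSYLN"]
print(conserved_columns(alignment))
```

Runs of consecutive conserved columns are the stretches that shading tools such as BoxShade make visible at a glance.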

28
Homework
  • After all that work, your boss comes to you and
    says that sequence comparison is obsolete! He
    wants you to do structural alignments of these
    proteins. Figure out what a structural alignment
    is, find two different tools to find conserved 3D
    structures and choose which one you would use for
    this. Describe why this tool is preferable to the
    other.
  • NOTE: You do not need to actually do any
    structural alignments. Just find out how you
    would go about doing one if you had to.