Bioinformatics Primer

Alistair Chalk, Elisabet Andersson. Stem Cell Biology and Bioinformatic Tools, ...

1
Bioinformatics Primer
  • Goal: Introductory skills for bioinformatics
    analysis.
  • Format: Complete the exercises, ask anything.

2
Basic Skills: InterPro
  • InterPro (www.ebi.ac.uk/interpro/)
  • Exercise
  • For 3 proteins important to your research area
    (choose 2 well defined, 1 not well defined):
  • download their protein sequences from
    www.ncbi.nlm.nih.gov
  • analyse them using InterPro
  • What domains do they contain?
  • What are the functions of these domains?
  • What families do the proteins belong to?
  • How would you do this on 100 proteins, or 20,000
    proteins?

3
Basic Skills: Gene Ontology
  • Gene Ontology database
  • www.geneontology.org
  • Exercise
  • Keep this information saved, as you will use it
    in the following days.
  • 1) Define:
  • Molecular function
  • Biological process
  • Cellular component (subcellular location)
  • 2) Find GO identifiers that describe functions,
    processes or locations that are relevant to your
    research.
  • List the identifier, type and description.
  • Should you use identifiers further up or down the
    hierarchy?

4
Basic Skills: Gene Ontology
  • Exercise, continued
  • 3) For 3 proteins relevant to your research:
  • What GO terms are assigned to the protein?
  • What evidence is there for the assignments?
  • 4) Describe the differences between the evidence
    codes.
  • 5) How would you find all proteins with a
    specific molecular function?

5
Basic Skills: ArrayExpress/GEO
  • GEO/ArrayExpress
  • Microarray repositories containing published
    microarray data
  • Note the differences in ease of use and completeness!
  • Exercise
  • Compare GEO and ArrayExpress.
  • Search for human stem cell microarray studies.
  • What are the GEO/ArrayExpress identifiers for
    some recent stem cell microarray studies?
  • What data is available? Raw data? Processed data?
  • Download a CEL file (or set of CEL files) from a
    stem cell microarray study.
  • Go to ArrayExpress Atlas.
  • Look up at least two genes of interest (in stem
    cell biology).
  • What does the database tell you?

6
Basic Skills: Ensembl
  • Exercise
  • Go to Ensembl. Describe it.
  • Look up a (human) gene. How many transcript
    variants does it have?
  • Explore!
  • Use BioMart to gather all Ensembl identifiers and
    Entrez Gene IDs for all human and mouse genes, and
    export this data into Excel (you will need this
    later).

7
Basic Skills: UCSC
  • Exercise
  • Go to genome.ucsc.edu.
  • Look up a (human) gene. Select several different
    gene models. How many transcript variants are
    found for your gene in UCSC Known Genes, AceView,
    RefSeq?
  • Use the Table Browser to download all human genes
    (RefSeq) into Excel.
  • What else of interest can you download?

8
Basic Skills: R
  • See accompanying worksheet

9
Basic Skills: command line login
  • Try this on your own laptop
  • Windows command line:
  • Windows key + R, type cmd
  • Cygwin (Unix in Windows):
  • open Cygwin
  • PuTTY (log into a Unix server):
  • IP address, username, password
  • VMware (virtual machine within Windows):
  • choose a Unix virtual machine (e.g. tinyunix)
  • open a terminal
  • Apple Mac:
  • OS X: open a terminal

10
Basic Skills: command line
  • Basic command line operations
  • Directories
  • cd <directory>: change the current directory
  • pwd: print the current working directory
  • Viewing files and directories
  • ls <path>: list the contents of a directory
    (dir in Windows)
  • more <file>: see contents of file on screen,
    stop after every page
  • less <file>: see contents of file (with better
    ability to move around in the file)
  • cat <file>: see contents of file, don't stop at
    each new page
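
A minimal sketch of the directory and viewing commands above, assuming a Unix-like shell (Cygwin, a Linux terminal or the OS X terminal). The directory name demo and the file notes.txt are made up for illustration; more and less are left out because they wait for key presses.

```shell
# Work in a throwaway directory so none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir"
pwd                      # print the current working directory

# Create a small directory and file to look at (hypothetical names).
mkdir demo
printf 'line one\nline two\nline three\n' > demo/notes.txt

ls demo                  # list the contents of the demo directory
cd demo                  # change the current directory to demo
pwd                      # the path now ends in /demo
cat notes.txt            # print the whole file without pausing
```

Try more notes.txt and less notes.txt by hand afterwards; in less, press q to quit.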

11
Basic Skills: command line
  • Basic command line operations
  • Editing files
  • emacs <file>: open file for editing in emacs
  • other programs: nedit, vi
  • Copying and moving files
  • cp <file> <destination>: copy file to
    destination (copy in Windows)
  • mv <file> <destination>: move (or rename) file
    to destination (move in Windows)
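
A short sketch of cp and mv, assuming a Unix-like shell. All file and directory names here (original.txt, copy.txt, renamed.txt, archive) are hypothetical.

```shell
# Work in a throwaway directory so none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir"
printf 'some text\n' > original.txt   # create a one-line file to work with

cp original.txt copy.txt      # copy: original.txt and copy.txt both exist
mv copy.txt renamed.txt       # rename: copy.txt is gone, renamed.txt appears
mkdir archive
mv renamed.txt archive/       # move into a directory, keeping the file name

ls . archive                  # list the current directory and archive
```

Note that mv does both jobs: with a new file name as the destination it renames, with a directory as the destination it moves.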

12
Basic Skills: command line
  • Basic command line operations
  • Login and copying
  • ssh / scp: log in to server, copy files
  • Viewing parts of files
  • head -lines <file>: look at first lines
  • tail -lines <file>: look at last lines
  • Pattern matching
  • grep -e pattern <file>: find lines in file with
    pattern
  • grep -v pattern <file>: find lines in file
    without pattern
  • Counting
  • wc <file>: count lines and words in file
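
A small sketch of head, tail, grep and wc on a toy file, assuming a Unix-like shell. The file genes.txt and the labels in it are invented for illustration only and are not real annotation data.

```shell
# Work in a throwaway directory so none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Toy input: five lines, each a gene symbol plus a made-up label.
printf 'SOX2 stem\nOCT4 stem\nGFAP glia\nNANOG stem\nACTB control\n' > genes.txt

head -2 genes.txt         # first 2 lines
tail -1 genes.txt         # last line
grep -e stem genes.txt    # lines that contain "stem"
grep -v stem genes.txt    # lines that do not contain "stem"
wc genes.txt              # counts of lines, words and bytes
wc -l genes.txt           # line count only
```

The number after the dash in head and tail is the number of lines to show.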

13
Basic Skills: command line
  • Basic command line operations
  • > <file>: send results to file
  • more filename > filename2 (send all of filename
    to filename2)
  • ls > directory_contents.txt
  • | (pipe): send the results forward to another
    program
  • grep -e pattern filename > filename_pattern.txt
  • head -5 filename > filename_pattern.txt
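
A brief sketch of redirection and pipes, assuming a Unix-like shell. The toy file genes.txt and the output file names are made up for illustration.

```shell
# Work in a throwaway directory so none of your own files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Toy input: five lines, each a gene symbol plus a made-up label.
printf 'SOX2 stem\nOCT4 stem\nGFAP glia\nNANOG stem\nACTB control\n' > genes.txt

ls > directory_contents.txt               # save the directory listing to a file
grep -e stem genes.txt > stem_genes.txt   # save the matching lines to a file

# Pipe: the output of grep becomes the input of wc.
grep -e stem genes.txt | wc -l

# Pipe and redirect together: keep only the first 2 matches, then save them.
grep -e stem genes.txt | head -2 > first_two_stem.txt
cat first_two_stem.txt
```

Redirection with > overwrites the target file if it already exists, so choose output names with care.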

14
Command line exercises
  • Create a directory, name it after yourself
  • What is the current working directory?
  • Copy exercise.txt into your directory
  • Change the working directory to that directory
  • Look at the file with more
  • Read the man page for wc with man wc
  • What are the first 5 lines?
  • What are the last 3 lines?
  • How many lines contain the word fish? (hint: you
    need to use a pipe)

15
Command line exercises
  • Command line in Windows
  • Test the following in the Windows command line (open
    with Windows key + R, then cmd):
  • more
  • | (pipe)
  • grep
  • wc
  • sed
  • Which work, which do not?
  • How do you find help for a program?
  • What is sed for?

16
Additional resources
  • Plenty of tutorials are available online for R
    and Unix
  • Unix tutorial for beginners
  • http://www.ee.surrey.ac.uk/Teaching/Unix/
  • R
  • http://cran.r-project.org/other-docs.html
  • Note: some are very large (100 pages)