BIOINFORMATIK I UEBUNG 2 - PowerPoint PPT Presentation

About This Presentation
Title:

BIOINFORMATIK I UEBUNG 2

Description:

Title: Slide 1. Author: Andreas Prokesch. Last modified by: Hackl Hubert. Created Date: 1/23/2004 7:41:20 AM. Document presentation format: On-screen Show (4:3).


1
BIOINFORMATIK I UEBUNG 2 (Bioinformatics I, Exercise 2)
http://icbi.at/bioinf
2
mRNA processing
3
splicing
4
Spliceosome assembly
[Figure: spliceosome assembly. Pre-mRNA sequence labels: GU, A, YAG. Components shown: hnRNP; U1, U2, U4, U5, U6; SR proteins; kinases and phosphatases; RNA helicases; cyclophilins; 200 non-snRNP proteins]
5
Different levels of regulation
6
Regulation of transcription
7
ChIP procedure
[Figure: ChIP procedure (label: DNA). Source: Farnham, Nature Rev Genetics, 2009]
8
microRNAs
http://www.mirbase.org/
9
Ensembl BioMart
10
UCSC Table Browser
11
UCSC Table Browser
12
Notepad++ and regular expressions
^>.*\r\n
^    beginning of line
>    the character ">"
.    any symbol
*    0 or more times
\r   carriage return (CR)
\n   line feed (LF)
13
Notepad++ and regular expressions
character   meaning
\           escape, used to make special characters non-special
()          group; you can retrieve its contents, e.g. with \1 for the first group
[]          any character inside is considered a match
.           matches any character
*           match the previous character 0 or more times
+           match the previous character 1 or more times
{n}         match the previous character n times
^           as the first character in the regex, means beginning of line; inside [] means not
$           as the last character in the regex, means end of line
\s          any space character (space, tab)
\t          tab (-->)
\r          carriage return (CR)
\n          line feed (LF)
14
Notepad++ and regular expressions
^>.*\r\n replace with (empty)
ACGT.\r\n replace with
(.{20}).*\r\n replace with \1\r\n
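The two clearest replacements on this slide, deleting FASTA header lines with ^>.*\r\n and keeping only the first 20 characters of each line with (.{20}).*\r\n, can be tried outside the editor as well. The following is a minimal sketch using Python's re module; the FASTA records are made-up toy data, and the variable names are only illustrative.

```python
import re

# Toy FASTA text with Windows line endings (\r\n); the sequences are made up.
fasta = (
    ">example_record_1\r\n"
    "acgtacgtacGTACGTACGTACGTACGT\r\n"
    ">example_record_2\r\n"
    "ttgacctgagGTAAGTCCCTTTAAAGGG\r\n"
)

# Step 1: delete header lines by replacing them with an empty string.
# With re.MULTILINE, ^ matches at the start of every line.
no_headers = re.sub(r"^>.*\r\n", "", fasta, flags=re.MULTILINE)

# Step 2: keep only the first 20 characters of every remaining line.
# The group (.{20}) captures them; \1 writes them back.
first_20 = re.sub(r"(.{20}).*\r\n", r"\1\r\n", no_headers)

print(first_20)
```

To keep the last 20 characters instead, the group can be moved to the end of the pattern, for example .*(.{20})\r\n replaced with \1\r\n.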
15
\r\n
replace with gt
replace with
\r\ngt repeatMaskingnone replace with
\r\n gt.\r\n
replace with .(.20)
replace with \1
16
Sequence Logo
http://icbi.at/logo
17
KEGG
18
Protein domains
Uniprot, Prosite, Interpro, Pfam, CD, SMART
19
Gene Ontology
The Gene Ontology project provides a controlled
vocabulary to describe gene and gene product
attributes in any organism.
3 organizing principles
  • cellular component (e.g. mitochondrion)
  • biological process (e.g. lipid metabolism)
  • molecular function (e.g. hydrolase activity)

Each entry in GO has a unique numerical
identifier of the form GO:nnnnnnn, and a GO term
Evidence code
  • ISS Inferred from Sequence Similarity
  • IEP Inferred from Expression Pattern
  • IMP Inferred from Mutant Phenotype
  • IGI Inferred from Genetic Interaction
  • IPI Inferred from Physical Interaction
  • IDA Inferred from Direct Assay
  • RCA Inferred from Reviewed Computational Analysis
  • TAS Traceable Author Statement
  • NAS Non-traceable Author Statement
  • IC Inferred by Curator
  • ND No biological Data available

Directed acyclic graph (DAG) with different
levels and 2 relations (part_of, is_a)
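Because GO is a directed acyclic graph, a term can have several parents, and all of its ancestors are found by following is_a and part_of links upwards. The sketch below illustrates this on a hypothetical toy ontology. The identifiers only mimic the GO identifier format (written here with a colon, GO:nnnnnnn) and are placeholders, not real annotations; the names TOY_GO, is_go_id and ancestors are illustrative.

```python
import re

# Hypothetical toy ontology: term -> list of (relation, parent term).
# The identifiers are placeholders that only mimic the GO identifier format.
TOY_GO = {
    "GO:9999991": [],
    "GO:9999992": [("is_a", "GO:9999991")],
    "GO:9999993": [("part_of", "GO:9999992")],
    "GO:9999994": [("is_a", "GO:9999992"), ("is_a", "GO:9999993")],
}

# "GO:" followed by exactly seven digits.
GO_ID = re.compile(r"GO:\d{7}")


def is_go_id(text):
    """Return True if text has the form GO:nnnnnnn."""
    return GO_ID.fullmatch(text) is not None


def ancestors(term, graph):
    """Collect every term reachable from `term` via is_a or part_of links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relation, parent in graph[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(ancestors("GO:9999994", TOY_GO)))
```

A gene product annotated with a term is implicitly annotated with all ancestors of that term, which is why this kind of traversal is useful when comparing annotations.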
20
Orthologs
[Figure: homology relationships for protein A]
  • Homologs: A, B, C
  • Orthologs: B1, C1
  • Paralogs: C1, C2, C3
  • Inparalogs: C2, C3
  • Outparalogs: B2, C1
  • Xenologs: A1, AB1
21
Ortholog prediction
22
Ortholog databases
  • YOGY (eukarYotic OrtholoGY), a web-based
    resource that integrates 5 independent resources
    (Sanger)
  • COG (Clusters of Orthologous Groups of proteins) and
    KOG for 7 eukaryotic genomes (NCBI)
  • Inparanoid (Stockholm Bioinformatics Center)
  • HomoloGene (NCBI)
  • OrthoMCL, which uses the Markov Clustering algorithm
    (University of Pennsylvania)

23
Multiple sequence alignment (CLUSTALW)
Progressive tree alignment
Jalview
24
Exercise 2-1 REGULATORY GENOMICS
Pyruvate carboxylase as example
Ensembl BioMart
1.1 For the human transcript NM_000920 (pyruvate carboxylase) find:
  • official gene symbol
  • number of exons
  • Ensembl transcript ID
  • Ensembl gene ID
  • 3'UTR sequence as FASTA file
  • length of 3'UTR
microRNA target prediction
1.2 Is there a sequence within the 3'UTR of PC that is complementary to
positions 2-8 of the microRNA hsa-mir-182?
UCSC Genome Browser
1.3 Position of the transcription start site and transcription end of
pyruvate carboxylase (NM_000920) in the hg19 assembly
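For question 1.2, the search for a site complementary to microRNA positions 2-8 can be sketched in a few lines. This is only an illustration under stated assumptions: strict Watson-Crick pairing (no G:U wobble), the microRNA given 5' to 3', and toy sequences that are made up. The real hsa-mir-182 sequence (from miRBase) and the real 3'UTR (from BioMart) have to be substituted; the function names are hypothetical.

```python
def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence (A/C/G/U only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def seed_sites(mirna, utr, start=2, end=8):
    """Find sites in `utr` complementary to microRNA positions start..end.

    Positions are 1-based and inclusive. DNA input (with T) is converted to RNA.
    Returns the target motif and the 1-based start positions of all matches.
    """
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    seed = mirna[start - 1:end]
    target = reverse_complement_rna(seed)
    hits = []
    pos = utr.find(target)
    while pos != -1:
        hits.append(pos + 1)
        pos = utr.find(target, pos + 1)
    return target, hits


# Made-up toy sequences, NOT hsa-mir-182 and NOT the PC 3'UTR.
toy_mirna = "UGACCUAGGAUCCGAUUACGGA"
toy_utr = "AAACTAGGTCAAAGGGCTAGGTCTT"

motif, positions = seed_sites(toy_mirna, toy_utr)
print(motif, positions)
```

The site is searched as the reverse complement of the seed because the microRNA pairs with the mRNA in antiparallel orientation.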
25
Exercise 2-1 REGULATORY GENOMICS
Find splicing signals
1.4 Get sequences (+10bp/-10bp) around intron-exon borders and
exon-intron borders from pyruvate carboxylase using the UCSC Table
Browser and Notepad++
1.5 Construct in both cases a sequence logo and a frequency plot. Can you
identify (regulatory) sequence motifs?
Regulatory motifs (transcription factor binding sites)
1.6 We know from chromatin immunoprecipitation (ChIP-seq) experiments in a
mouse cell line that the transcription factor Pparg binds near the
pyruvate carboxylase gene and hence potentially regulates its
transcription (ppar.wig). Show the binding region as a custom track in the
UCSC Genome Browser and extract the sequence.
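The frequency plot and sequence logo asked for in 1.5 are both derived from per-position base frequencies of equally long, aligned sequences. The following is a minimal sketch of that calculation, assuming a DNA alphabet and no small-sample correction; the input sequences are made-up toy data rather than real pyruvate carboxylase borders, and the function names are illustrative.

```python
import math
from collections import Counter


def frequency_matrix(seqs, alphabet="ACGT"):
    """Return one dict of base frequencies per alignment column."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    matrix = []
    for i in range(length):
        counts = Counter(s[i].upper() for s in seqs)
        total = sum(counts[base] for base in alphabet)
        matrix.append({base: counts[base] / total for base in alphabet})
    return matrix


def information_content(column):
    """Information content of one column in bits (0 = uniform, 2 = fully conserved)."""
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    return math.log2(len(column)) - entropy


# Made-up toy sequences of equal length.
toy_sites = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TCGGTACGT"]

matrix = frequency_matrix(toy_sites)
for position, column in enumerate(matrix, start=1):
    print(position, column, round(information_content(column), 2))
```

In a sequence logo, the total height of each column corresponds to its information content, and each letter's share of that height corresponds to its frequency.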
26
Exercise 2-2 PROTEIN FUNCTION
Identify function/processes/pathways for a protein
2.1 What is the function of pyruvate carboxylase, and in which pathways
and processes is this enzyme involved?
  • Show pathway maps and find the Enzyme ID (EC) using KEGG
  • Identify functional domains and Gene Ontology annotation of the
    protein sequence using Uniprot, Prosite, Pfam
Find orthologs and perform multiple sequence alignment
2.2 Find ortholog protein sequences in Mus musculus, Rattus norvegicus,
and Saccharomyces cerevisiae, perform a multiple sequence alignment using
ClustalW, and visualize it with Jalview.