1
Welcome to SMD
  • Stanford Microarray Database
  • April 6, 2009
  • Janos Demeter

2
User Help Tutorials and Workshops
  • SMD Help FAQ
  • http://smd.stanford.edu/help/index.html
  • SMD Office hours
  • Monday 3 - 5 pm
  • Wednesday 2 - 4 pm
  • SMD Tutorials regularly scheduled
  • Welcome to SMD tutorial
  • Data analysis, Normalization and Clustering

3
Welcome to SMD: a tutorial
  • What we'll talk about
  • User Registration /Accounts
  • Submitting Data
  • Finding Your Data
  • Displaying Your Data
  • Organizing Data
  • Repository
  • What we will not discuss, or only brush the
    surface of
  • Experimental Design
  • Data Normalization
  • Data Quality Assessment
  • Data Retrieval and Analysis (clustering)
  • External User Tools (XCluster, TreeView, etc.)
  • Please fill out the sign-up sheet and survey form
  • Questions? Email us at array@genome.stanford.edu

4
Welcome to SMD What is SMD?
  • Database
  • Stores/gives you access to your data
  • Gives you access to data from other members of
    your lab/collaborators/public data
  • You can control who has access to your data
  • Services e.g. gene annotations kept up-to-date
  • Tools to analyze microarray data
  • File management system to keep results of your
    analyses easily retrievable and annotated.

5
Welcome to SMD
6
Welcome to SMD
7
Welcome to SMD
  • User Registration/Accounts
  • Navigating SMD
  • Submitting Data
  • Finding Your Data
  • Displaying Your Data
  • Organizing Data
  • Repository

8
User Registration
  • The registration form is on SMD's login page
  • http://smd.stanford.edu/cgi-bin/tools/display/registration.pl
  • Accounts are grouped by labs. Access shared
    within groups.
  • Lab PI must verify every user in group and the
    type of account needed
  • Two requirements to obtain account in SMD
  • Upon publication all experiments are made public.
    Once published, data are deposited in GEO and
    ArrayExpress
  • SMD is again free for Stanford users. External
    user groups have to pay (per group, per year)
  • # of arrays loaded: fee
  • >200: $15,000
  • 100-199: $9,000
  • 50-99: $4,500
  • 1-49: $2,250


9
Accounts Types of Users
  • Accounts are grouped into lab groups
  • Unrestricted User (lab members)
  • Can load data into SMD (i.e. have loader account)
  • Can edit/delete/change access to all his/her
    experiments
  • Can view/clone all experiments of his/her lab
    group
  • Restricted User (collaborators)
  • May view only those arrays for which they were
    given access privileges by the experimenter
  • Can NOT edit or delete data
  • Can NOT load data into SMD
  • Inactive Users (users who have left the lab)
  • Can still see, but no longer enter/edit/delete
    their own data
  • Can no longer view group data

10
Accounts loader account
  • Every unrestricted user gets an sFTP account
    on loader.stanford.edu
  • Login information (user name and password) is the
    same for web and loader accounts
  • You need an sFTP capable program locally to
    transfer files to and from loader account
  • http://www.stanford.edu/group/itss/ess/
  • (e.g. SecureFX for PC, Fetch for Mac)
  • This account is used for transferring files to SMD

11
Welcome to SMD
  • User Registration/Accounts
  • Navigating SMD
  • Submitting Data
  • Finding Your Data
  • Displaying Your Data
  • Organizing Data
  • Repository
  • Submitting a Printlist

12
Navigating SMD web-site Unrestricted User menu
  • Most frequently used programs are accessible from
    menu
  • All help docs: help - first item is
    context-specific help
  • lists -> index page

13
Navigating SMD web-site Unrestricted User
Index page
Restricted users can only see Search and List
Data options
14
Navigating SMD web-site SMD Tools
  • Find tools that are not so easy to find

15
Welcome to SMD
  • User Registration/Accounts
  • Navigating SMD
  • Submitting Data
  • Finding Your Data
  • Displaying Your Data
  • Organizing Data
  • Repository
  • Submitting a Printlist

16
Submitting Data to SMD Prerequisites
  • SMD currently accepts expression data produced
    by Agilent
  • GenePix
  • ScanAlyze
  • SpotReader
  • Affymetrix (MAS 5, GCOS, dChip)
  • NimbleGen (single channel)
  • Illumina (soon)
  • The appropriate print design file has to be
    entered into the database before experimental
    data can be loaded.
  • If you are using arrays printed at SFGF, this is
    done automatically and you don't need to worry
    about this
  • If slides come from external source, please send
    us the design file in MAGE-ML or gal format a few
    days before you plan to load the data.
  • If you are creating your own prints: submit a
    printlist (godlist) file

17
Submitting Data to SMD (Finding a print)
  • You can find prints from the lists menu ->
    prints
  • Type in the name of the print (hoad)
  • Click on the gal file icon
  • Select the id/annotation column you need
  • Download the gal file

18
Submitting Data to SMD
  • Files required
  • For ScanAlyze Data
  • data.dat, grid.sag, channel1.scn and channel2.scn
  • For GenePix Data
  • data.gpr, grid.gps, channel1.tif, channel2.tif
  • For SpotReader Data
  • data.srr, grid.sra, channel1.tif, channel2.tif
  • For Agilent Data
  • data.txt, shape.shp, channel1.tif, channel2.tif
  • For Affymetrix Data
  • expt.exp, image.dat, cell.cel, probeset.txt
    (currently text versions)
  • Compressed files are accepted for loading
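
The file requirements above can be turned into a small check to run before uploading. This is only an illustrative sketch: the extension table is transcribed from the slide, while the function name `missing_files` and the idea of matching files by extension are assumptions and not part of SMD.

```python
from collections import Counter
from pathlib import PurePosixPath

# Extensions per software, transcribed from the slide
# (value = number of files with that extension that are needed).
REQUIRED = {
    "ScanAlyze": {".dat": 1, ".sag": 1, ".scn": 2},
    "GenePix": {".gpr": 1, ".gps": 1, ".tif": 2},
    "SpotReader": {".srr": 1, ".sra": 1, ".tif": 2},
    "Agilent": {".txt": 1, ".shp": 1, ".tif": 2},
    "Affymetrix": {".exp": 1, ".dat": 1, ".cel": 1, ".txt": 1},
}


def missing_files(software, filenames):
    """Return {extension: number of files still missing} for one experiment."""
    have = Counter(PurePosixPath(name).suffix.lower() for name in filenames)
    need = REQUIRED[software]
    return {ext: n - have[ext] for ext, n in need.items() if have[ext] < n}
```

For example, `missing_files("GenePix", ["data.gpr", "grid.gps", "channel1.tif"])` reports that one `.tif` scan is still missing. Compressed files would need their compression suffix stripped first; that step is left out of this sketch.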

19
Submitting Data to SMD Data Entry
Select Enter My Data from the menu
Or, go here to load your experiments
Go here to annotate your experiments
20
Submitting Data to SMD Single Experiment Entry
  • Choose whether you are entering a new
    experiment, a new result set for already existing
    experiment or a batch of experiments/result sets.
  • Select printing method
  • Select feature extraction software used for data
    generation
  • Select organism whose genes are arrayed

21
Submitting Data to SMD Agilent or Affymetrix
Experiment Entry
  • Select a Result Set Name and Description
  • As for any single Agilent or Affymetrix
    experiment there may be n result sets, you must
    create a name for each of these sets so that each
    result set may be identified and retrieved
    unambiguously from the database

22
Submitting Data to SMD Data File Locations
  • Select the Print Name from the pull-down list
  • Enter a Slide Name
  • unique
  • should be informative
  • Barcode
  • Slide
  • Max of 30 characters
  • Choose the data, grid, green scan and red scan
    files to be loaded from your loader account
  • You will get a pull down menu with a list of all
    files in the incoming directory of your loader
    account

23
Submitting Data to SMD Loader
  • incoming
  • Stores all files prior to experiment loading
    (This is where you will upload the data files.)
  • ORA-OUT
  • Feedback files from the database are written to
    this directory
  • Experiment loading logs
  • arraylists
  • The database will look in this folder to retrieve
    arraylists (a.k.a. result set lists)
  • genelists
  • The database will look in this folder to retrieve
    any genelists

Clean-up - all inappropriate files are
deleted daily - data files in incoming
directory are deleted when more than 3 weeks
old.
24
Submitting Data to SMD Common Problems
  • Can't connect to loader - check protocol (sFTP)
  • File transfer not complete. Check size of
    uploaded files.
  • Filename problems: don't use spaces or unusual
    characters (e.g. !, @, /)
  • Files stored on loader longer than 3 weeks are
    deleted
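
The filename advice above can be checked locally before a transfer. The sketch below is an assumption-laden example: SMD does not publish an exact list of allowed characters on this slide, so the whitelist (letters, digits, dot, underscore, hyphen) and the function name `is_safe_filename` are hypothetical, conservative choices.

```python
import re

# Assumption: a conservative whitelist of letters, digits, dot,
# underscore and hyphen. Spaces and characters such as ! @ / are rejected.
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def is_safe_filename(name):
    """True if the name contains no spaces or unusual characters."""
    return SAFE_NAME.fullmatch(name) is not None
```

Renaming a file such as `slide 01.gpr` to `slide_01.gpr` before upload would pass this check.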

25
Submitting Data to SMD
  • Although SMD does archive your data, this is your
    primary result

Please Archive Your Data!
26
Submitting Data to SMD Affy Expt Loading
  • The first 3 files have to be tab-delimited text
  • Exp file: generated by the Affy scanning software;
    contains protocol information
  • Cell file: data for individual features on the
    slide
  • Gene Intensity file: probe set (gene) level data
  • DAT file: image file

27
Submitting Data to SMD Affy Expt Loading
  • These files should be exported from the analysis
    software as tab-delimited text files
  • The probe set file (MAS5/GCOS) should include the
    following columns
  • Analysis Name, Probe Set Name, Stat Pairs, Stat
    Pairs Used, Signal, Detection, Detection p-value
  • Cell file should have these columns
  • X, Y, MEAN, STDV, NPIXELS
  • We are working to accept binary files
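
A quick way to confirm that an exported cell file has the columns listed above is to inspect its header line. This is a minimal sketch, assuming the column names sit in the first line of the tab-delimited export; the function name `missing_columns` is hypothetical and not an SMD tool.

```python
import csv
import io

# Cell file columns as listed on the slide.
CELL_COLUMNS = ["X", "Y", "MEAN", "STDV", "NPIXELS"]


def missing_columns(text, required):
    """Return the required column names absent from the header line."""
    header = next(csv.reader(io.StringIO(text), delimiter="\t"))
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]
```

The same function can be pointed at the probe set file by passing its own list of required column names.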

28
Submitting Data to SMD Experiment Description
Details
  • Experiment Date
  • Date of Hybridization
  • Date of Data Entry
  • SMD Experiment Name
  • Unique
  • Should be descriptive
  • Category/Subcategory
  • Green & Red channel descriptions
  • Reverse Replicate: for normalization & data
    quality purposes
  • Normalization Type
  • (described later)

29
Submitting Data to SMD: Doping Controls
  • Doping control data
  • - recommended amount: 10 ng
  • - after amplification and dilution, used: 12 ng
  • -> factor 1.2
  • DCV2.1 is given in 2 tubes: DCV2.1_MJ and
    DCV2.1_A_S.
  • You can enter them separately (if factors are
    different), or as a single mix (if factors are
    the same)
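
The factor in the example above is simply the amount used divided by the recommended amount. A minimal sketch of that arithmetic, where the function name `doping_factor` is hypothetical and the 10 ng default is taken from the slide:

```python
def doping_factor(used_ng, recommended_ng=10.0):
    """Ratio of the doping control amount used to the recommended amount."""
    return used_ng / recommended_ng
```

With 12 ng used against the recommended 10 ng, this gives the factor of 1.2 shown on the slide.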

30
Submitting Data to SMD Experiment Access
  • SMD Experimenter
  • Person who will have edit/delete/access
    privileges
  • Experiment Owner
  • SMD Group
  • By default, your lab group will be able to see
    all your experiments
  • If you wish for another entire group to view your
    experiments, you select the group name here
  • SMD Individual User
  • Give an individual user the ability to view your
    experiment

31
Submitting Data to SMD Experiment Access cont.
  • World Access
  • Selecting Yes here will make your data viewable
    by the WORLD!
  • This is usually only done for published data

32
Submitting Data to SMD Errors
  • Loading software checks for common errors
  • Experiments will not be loaded if there are
    errors. You must go back, correct your error(s)
    and resubmit your data

33
Submitting Data to SMD Queue
  • After passing the checks, your data goes to the
    loading queue
  • The queue holds all experiments being loaded and
    processes them in an ordered fashion
  • Progression of loading can be checked by looking
    at the log file or from the email you are sent.
  • If there is an error, please send the link to the
    curators.

34
Submitting Data to SMD Batch Loading
  • Instead of loading experiments one by one, you
    can load experiments in batch
  • All experiments have to be listed in a
    tab-delimited file (a batch file) in your loader
    account
  • There are sample batch files located on the batch
    entry help page
  • http://smd.stanford.edu/help/batch_load.shtml

35
Submitting Data to SMD Assembling a Batch File
  • (Result Set Name)
  • Print Name
  • Experiment Category
  • Experiment SubCategory
  • Slide Name
  • Data File Location
  • Grid File Location
  • Green Scan File Location
  • Red Scan File Location
  • Experiment Date
  • Experiment Name
  • Green Channel (CH1) Description
  • Red Channel (CH2) Description
  • Normalization Type
  • Norm Value
  • Experimenter
  • Experiment Description
  • Collaborative Group
  • Individual User

All underlined column headers indicate required
data
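
A batch file with the columns above can be assembled with any tool that writes tab-delimited text. The sketch below is an assumption, not SMD's own format definition: the column names are copied from the slide, but their exact spelling and which of them are required should be taken from the sample batch files on the batch entry help page. The optional Result Set Name column (shown in parentheses on the slide) is left out, and `build_batch_file` is a hypothetical helper name.

```python
import csv
import io

# Column headers as listed on the slide (Result Set Name omitted).
BATCH_COLUMNS = [
    "Print Name", "Experiment Category", "Experiment SubCategory",
    "Slide Name", "Data File Location", "Grid File Location",
    "Green Scan File Location", "Red Scan File Location",
    "Experiment Date", "Experiment Name",
    "Green Channel (CH1) Description", "Red Channel (CH2) Description",
    "Normalization Type", "Norm Value", "Experimenter",
    "Experiment Description", "Collaborative Group", "Individual User",
]


def build_batch_file(experiments):
    """Render a list of dicts (one per experiment) as tab-delimited text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=BATCH_COLUMNS, delimiter="\t",
                            restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(experiments)  # unknown keys raise ValueError
    return buf.getvalue()
```

Columns that are not supplied for an experiment are written as empty fields, so every row keeps the same number of tabs as the header.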
36
Submitting Data to SMD Batch File
37
Submitting Data to SMD Batch Loading
  • Your batch file and all experiment files MUST be
    in your loader account in the incoming directory.
  • You have the option to first check your batch
    file.
  • This will check for the usual errors before the
    data are loaded into queue.
  • After your batch file has passed the check, you
    can load your batch file.
  • Experiment loading proceeds as for single
    experiment entry.

38
Submitting Data to SMD: Categories & Subcategories
  • Currently, these are the only searchable fields
    to find experiments. Once you publish your data,
    you will want outside users to be able to search
    the data with useful terms
  • Before you enter any data into the database you
    must have a category and subcategory to annotate
    your experiments
  • Make sure that your categories and subcategories
    are meaningful and not cryptic.
  • Once you have chosen your terms and their
    definitions, please email your request to
  • array@genome.stanford.edu

39
Submitting Data to SMD: Categories & Subcategories
Go here to see existing lists of categories &
subcategories
40
Normalization Why normalize data?
  • Normalization allows you to recognize the
    biological information in your data and compare
    data from one array to another
  • Goal: remove signal that is unrelated to
    biological information (dye bias, location bias,
    intensity dependence, etc.).
  • During data loading simple normalizations are
    done - adding new data columns to existing ones

41
Submitting Data to SMD: Normalization
  • Uploaded result file (after analysis of the
    scanned images)
  • Normalization: normalized data columns added to
    original data (GenePix, ScanAlyze, SpotReader)
  • No normalization
  • No normalization, but several result sets:
    Agilent, Affy, NimbleGen
42
Normalization Channel biases
Before Normalization
43
Normalization Channel biases
After Normalization
44
Normalization Default normalization in SMD
  • Assume that on average the channel intensity
    ratio should be 1 (i.e. no difference between
    samples/channels) - not always true
  • Choose spots that are nice in both channels
  • Calculate a factor for nice spots
  • Apply this factor to channel two's data for all
    spots

45
Normalization Choosing Spots
  • There are two options for selecting nice spots
    for normalization (Only non-flagged spots are
    used for each)
  • Regression correlation spots with uniform
    color. (pixel regression correlation greater than
    0.6)
  • Computed bright spots. (large percentage of
    pixels are at least one standard deviation above
    background)
  • "Nice" spots are those with at least 65% of
    pixels significantly above background.
  • If less than 10% of spots on the array meet the
    threshold, the 65% threshold is reduced stepwise
    until either 10% of spots pass or the threshold
    reaches 55% of pixels above background (whichever
    comes first)
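
The stepwise rule for computed bright spots can be sketched as a short loop. This is an illustration under stated assumptions, not SMD's implementation: the slide gives the 65%, 55% and 10% figures but not the step size, so the 1-point step is a guess, and the function name `select_nice_spots` is hypothetical. The input is, for each non-flagged spot, the percentage of its pixels that are significantly above background.

```python
def select_nice_spots(pct_above_bg, start_pct=65, floor_pct=55,
                      step_pct=1, min_pass_pct=10):
    """Return (threshold used, indices of spots that pass it).

    step_pct is an assumption; the slide only says "stepwise".
    """
    threshold = start_pct
    while True:
        nice = [i for i, p in enumerate(pct_above_bg) if p >= threshold]
        # Integer arithmetic avoids floating-point drift in the comparison.
        enough = len(nice) * 100 >= min_pass_pct * len(pct_above_bg)
        if enough or threshold <= floor_pct:
            return threshold, nice
        threshold = max(floor_pct, threshold - step_pct)
```

If too few spots pass even at the 55% floor, the function returns whatever passes at that floor, which matches the "whichever comes first" wording.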

46
Normalization Calculating and Applying the
Factor
  • Normalization factor is the geometric mean of the
    red/green ratio of the nice spots (arithmetic
    mean of log-ratios)
  • Alternatively, a user can specify a normalization
    factor
  • Both foreground and background intensity of
    channel 2 (red) for all spots are divided by the
    normalization factor
  • Other normalized values are calculated from these
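
The calculation described above can be written out in a few lines. This is a minimal sketch assuming positive channel intensities for the "nice" spots only; the function names are hypothetical, and details such as background subtraction before taking ratios are not specified on the slide and are left out.

```python
import math


def normalization_factor(red, green):
    """Geometric mean of red/green ratios, computed as
    exp(arithmetic mean of the log-ratios)."""
    logs = [math.log(r / g) for r, g in zip(red, green)]
    return math.exp(sum(logs) / len(logs))


def normalize_channel2(values, factor):
    """Divide channel 2 (red) intensities by the normalization factor."""
    return [v / factor for v in values]
```

Applying `normalize_channel2` to both the foreground and the background values of channel 2 mirrors the third bullet; a user-specified factor can be passed in place of the computed one.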

47
Normalization: Available Tools
  • Additional normalization/ background correction
    methods are available for experiments already in
    the database
  • In batch
  • Individual experiments (see later)

48
Re-normalize Experiments
  • Default normalization options
  • Bioconductor normalizations
  • Use a subset of spots to do normalization
  • Background correction options
49
Welcome to SMD
  • User Registration/Accounts
  • Navigating SMD
  • Submitting Data
  • Finding Your Data
  • Displaying Your Data
  • Organizing Data
  • Repository
  • Submitting a Printlist

50
Finding Your Data in SMD
  • Ways to search for data (searches public, lab
    group, collaborators' and your own data)
  • Advanced Search
  • Basic Search
  • Experiment List
  • Name Search
  • Direct ways to data you own
  • Display My Data
  • Select My Data

51
Finding Your Data in SMD Basic Search
  • There are three ways to find your data via Basic
    Search
  • Publications include all published data in SMD
  • Experiment sets allow you to select pre-defined
    experiment groups.
  • Search for data by category

52
Finding Your Data in SMD Advanced Search Results
53
Advanced vs Basic Search
  • Use Basic Search to retrieve
  • a single Publication
  • a single Experiment set
  • your personal sets
  • others, if viewable
  • a single Experimental category
  • Use Advanced Search to perform
  • boolean search
  • by Experimenter
  • by Category
  • by Subcategory
  • retrieval by Print
  • retrieval by result set list
  • search experiment
  • word search in description field

54
Welcome to SMD
  • User Registration/Accounts
  • Navigating SMD
  • Submitting Data
  • Finding Your Data
  • Displaying Your Data
  • Organizing Data
  • Repository
  • Submitting a Printlist

55
Displaying Your Data
56
Display Data
Clone experiment
57
Display Data: Download Original Data Files
  • Simple script to download original datafiles
    loaded into SMD
  • Same files that were uploaded
  • Currently only works for single experiments
  • Can be run in batch for a result set list

58
Display Options Raw Data
  • Downloaded file contains all measured and
    normalized data, biological annotations, and
    experiment annotation.
  • File name
  • <exptid><software><result set>.xls
  • E.g., 12345GENEPIX0.xls
  • The file is actually a tab-delimited text file
    that can be opened in any program

59
Display Options Alignment data
60
Display Options View Data
  • Select Columns to be displayed
  • Array expression data
  • Biological Annotation Data
  • Select filtering criteria
  • Select sorting column
  • Select how many rows to be displayed per page
  • Include controls/nulls
  • Data may be viewed or downloaded

61
Display Data Clickable Image
  • Gives you the array image (gif image, not
    original tif files)
  • Does not give you the filtering option
  • If you click on a spot, you get the spot details

62
Display Data Spot Image
63
Display Data View images with grids
  • Assess data quality
  • Select data for grids with the usual filtering
    options.
  • Selected spots may be flagged or un-flagged en
    masse.
  • If you see a spot that you want to flag, you can
    do so by clicking on the spot.
  • When you click on a spot you get

64
Display Data Plot Array Data
  • You can evaluate data quality by plotting values
    for any array.
  • Histograms and scatter graphs
  • These graphs are very useful before/after
    normalization or selecting filter values for data
    retrieval.
  • Arrays in an Experiment Set may be plotted
    simultaneously.

65
Display Options View Details
  • All experiment and protocol information
  • Displays the normalization method and value
  • Links to tools to assess the quality of data
  • ArrayQuality plots (doping control plots)

66
Ratios on Array Tool
  • Quick visualization of log-ratio distribution on
    the slide
  • Color assignments are based on log-ratio values
    and also intensity
  • Can visualize normalized or non-normalized
    log-ratios
  • PLUS ANOVA analysis to detect spatial bias
    (print-tip or plate)

67
Ratios on Array Tool
  • Not normalized vs. normalized (loess intensity,
    print-tip)

68
Display Data: Clone an Experiment
69
Display Data: Edit Experiment Details
  • Edit all names and descriptions
  • Associate clinical information with an array
    (only for human arrays)
  • Experiment Type
  • CGH
  • Chromatin IP
  • Expression Type I
  • Expression Type II
  • GMS
  • Associate procedural information
  • View Data Distribution
  • Re-normalize data
  • Edit doping controls

70
Display Data: Enter Experiment Annotations
  • Single array entry available from edit tool.
  • Tag-value data and/or free-text protocols.
  • Batch entry available from main page.

71
Display Data: Experiment Annotation in Batch
  • Tab-delimited text file with procedures and
    parameters
    (http://smd.stanford.edu/help/batch_procedure.shtml).
    Put it into the incoming directory on loader.
  • List of procedures, protocols, parameters and
    experiment types available

72
Display Data: Experiment Annotation
  • MGED - Microarray Gene Expression Data
    Society
  • In September 2002, MGED sent out a letter to
    journals and reviewers requesting that microarray
    publications include the minimal MIAME information
  • Several journals have adopted these policies
    concerning publications
  • MIAME checklist:
    http://www.mged.org/Workgroups/MIAME/miame_checklist.html
  • Nature Genetics (2001) 29, 365-371.
  • SMD allows you to store all required information
    (but it won't happen on its own).

73
Display Data: Experiment Annotation
  • Experimental Design
  • Array Design
  • Biological Samples
  • Hybridizations
  • Measurements
  • Data Normalization and Transformation

74
Display Data Changing Access
  • Here, you can add or remove experiment access to
    individual users or to groups by experiment.
  • To add access in batch make a resultset list
    (arraylist) and use

75
Display Data: Delete an Experiment
  • Only the owner (the experimenter) of an
    experiment can delete it
  • Once an experiment is deleted from the database,
    it can not be recovered
  • Once an experimenter leaves the lab, the lab head
    should consider what to do with his/her
    experiments, i.e. should the user still have the
    ability to delete all their experiments?

76
Welcome to SMD
  • User Registration/Accounts
  • Navigating SMD
  • Submitting Data
  • Finding Your Data
  • Displaying Your Data
  • Organizing Data
  • Repository
  • Submitting a Printlist

77
Organizing Data
  • Grouping experiments
  • Experiment sets -> database record
  • Result set lists (a.k.a. arraylists) -> text file
  • Grouping genes
  • Genelists -> text file
  • Saving files online
  • Repository -> database record

78
Organizing Data: Experiment sets & Arraylists
  • From Search pages ->
  • data retrieval
  • Create Result Set List
  • Create Experiment Set

79
Organizing Data Creating a Result Set List
  • Name your Result Set List
  • Select which experiments to include in the list
  • Order experiments
  • Select filters to use during data retrieval
  • First, customize filters per array, then save the
    list
  • or, save the list without customizing it.
  • A file is created in the arraylists directory of
    your loader account
  • You can use this result set list to select your
    experiments within the Advanced Search or give
    batch access to experiments

80
Organizing Data Creating an Experiment Set
  • Same first page as for Resultset list to select
    and order arrays (no filters)
  • The following pages allow you to specify
    experimental factors (see next page)
  • and meta-information about the set
  • Name
  • Experiment set design
  • Longevity
  • Etc.
  • Group of experiments can be made public here. We
    ask you to make sets for publications.

81
Organizing Data Creating an Experiment Set
  • Enter experimental factors and values.
  • May be drawn from procedural data, if entered.

82
Organizing Data Creating an Experiment Set
  • Factor data are available when viewing the
    experiment set.

83
Organizing Data Arraylist vs Experiment Set
  • Result Set List
  • Tab-delimited text file that exists in your
    loader account
  • Contains filters and their values per array
  • Accessed through Advanced Search
  • Can be converted to Experiment Set (link under
    Tools)
  • Experiment Set
  • Annotated list of experiments
  • Exists in the database
  • Accessed through Basic Search
  • Required for publication of data

84
Organizing Data Genelists
  • What is a genelist?
  • A tab-delimited text file containing a list of
    gene identifiers that exists in your loader
    account in the genelists directory
  • What is the purpose of a genelist?
  • Data retrieval for only a set of genes
  • Collapse gene data -> synthetic genes
  • Have your own annotation for these genes instead
    of using SMD's
  • Other uses e.g. normalize arrays based on a list
    of genes
  • There are several shared standard genelists that
    are available for many organisms.
  • You may create your own precompiled list of
    genes.

85
Organizing Data Creating your own genelists file
  • Create a tab-delimited text file.
  • The first line of the file must have the
    appropriate label for the data contained within
    it.
  • NAME (e.g. YPR119W, IMAGE:1542757, HPY1808,
    hSQ000234)
  • LUID (SMD identifier, unique for an instance of a
    sequence - plate well)
  • SPOT
  • GOID
  • GOTERM
  • Your file may contain additional columns with any
    type of annotation data you desire for each gene
    (Annotation).
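
A genelist following the rules above can be produced from a simple list of identifiers. This is a hedged sketch: the identifier labels come from the slide, but using the literal header `Annotation` for the extra column is an assumption, and `build_genelist` is a hypothetical helper name.

```python
# Identifier labels allowed on the first line, as listed on the slide.
ID_LABELS = {"NAME", "LUID", "SPOT", "GOID", "GOTERM"}


def build_genelist(genes, id_label="NAME"):
    """Render (identifier, annotation) pairs as a tab-delimited genelist."""
    if id_label not in ID_LABELS:
        raise ValueError(f"unknown identifier label: {id_label}")
    # Assumption: the extra annotation column is headed "Annotation".
    lines = [f"{id_label}\tAnnotation"]
    lines += [f"{gene}\t{note}" for gene, note in genes]
    return "\n".join(lines) + "\n"
```

The resulting text would be saved as a file in the genelists directory of the loader account.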

86
Welcome to SMD
  • User Registration/Accounts
  • Navigating SMD
  • Submitting Data
  • Finding Your Data
  • Displaying Your Data
  • Organizing Data
  • Repository
  • Submitting a Printlist

87
Organizing Data Repository
Here!
88
Organizing Data Repository
89
Using Your Repository PCL Deposits
  • Re-enter data retrieval pipeline to
  • Filter by gene expression pattern
  • Cluster
  • Avoid repeating data retrieval
  • Use SVD and KNN Impute tools
  • Average data by synthetic genes
  • Use analysis tools in GenePattern
  • View retrieval report
  • Download files
  • Share data with collaborators
  • 200 MB storage space

90
Using Your Repository CDT Deposits
  • View cluster using GeneXplorer or (java)TreeView
  • View cluster images
  • View retrieval and clustering report
  • Download files
  • Assign access

91
Depositing Data
  • Deposit from data retrieval pipeline
  • Upload from desktop computer

92
SMD Staff
  • Heng Jin, Scientific Programmer
  • Gavin Sherlock, Asst. Professor, Co-PI
  • Michael Nitzberg, Database Administrator
  • Catherine Ball, Director
  • Farrell Wymore, Lead Programmer
  • Janos Demeter, Computational Biologist
  • Zachariah Zachariah, Sr. Systems Manager
  • Jeremy Hubble, Programmer
93
Questions to SMD
  • Send e-mail: array@genome.stanford.edu
  • Office hours: Mondays 3-5 pm, Wednesdays 2-4 pm
  • Office: Grant S201
  • Phone: 736-0075
  • Online help: http://smd.stanford.edu/help/index.html
  • (Map labels: SMD, Fairchild, Bio-X)
94
Welcome to SMD
  • User Registration/Accounts
  • Navigating SMD
  • Submitting Data
  • Finding Your Data
  • Displaying Your Data
  • Organizing Data
  • Submitting a Printlist

95
Submitting a Printlist to SMD
  • The creation of a print within SMD is a complex
    process, but is absolutely required prior to
    experiment entry.
  • If you receive your arrays from SFGF, this is
    done automatically and you do not need to stay
    for the remainder of the tutorial
  • A printlist (godlist) is a tab-delimited list of
    plate samples (well address contents) in the
    order the plates were put in the printer.
  • There is a program to assist you in printlist
    submission
  • Located under Tools on the SMD homepage
  • Printlist must be in your ORA-OUT directory on
    loader

96
Submitting a Printlist Is a new list required?
  • Yes, if the plates used have not been previously
    entered into the database
  • Yes, if the plate was entered in the past, but
    their contents have changed over time (well
    contamination, well emptied)
  • No, if your lab makes 3 different prints using
    the exact same plates in the same or different
    order
  • Need to supply the database with a list of SMD
    plateIDs and plateNames from the first print in
    their new order.

97
Submitting a Printlist Column Headers for New
Plates
  • PLAT: The plate number, e.g. 1, 2, 3, etc.
    (INTEGER)
  • PROW: The plate row, e.g. A, B, C, etc. (CHARACTER)
  • PCOL: The plate column, e.g. 1, 2, 3, etc.
    (INTEGER)
  • NAME: The sequence name
  • usually a systematic name or clone identifier
    (e.g. YBL016 or IMAGE:753234)
  • This is the only name used for samples of TYPE
    other than CDNA.
  • TYPE: The sequence type
  • Usually ORF, CDNA, CONTROL, or EMPTY.
  • List of types can be seen from the SMD homepage
    under List Data Sequence Type
  • FAIL: Whether the PCR failed
  • 0: one distinct band - success
  • 1: no signal - fail
  • 2: multiple distinct bands
  • 3: signal, but not a distinct band (smear)
  • 4: multiple smears
  • 5: unknown
  • 101: worst cases of peeled away or haloed
    spots (assigned on a 96 well plate basis)
  • 102: less bad cases of peeled away or haloed
    spots (assigned on a 96 well plate basis)
  • Null is assumed to be 0 (success)

98
Submitting a Printlist Additional Columns for
cDNA data
  • CLONEID: Required for samples of TYPE=CDNA, if
    ACC is absent/null. Real cDNA clones must have a
    cloneID.
  • ACC: Required if CLONEID is absent/null.
  • This is the GenBank accession, usually acquired
    from dbEST.
  • IS_CONT: Whether the sample is known to be
    contaminated. A blank entry will default to
    unknown (U).
  • IS_VER: Whether the DNA in a well has been
    verified. A blank entry will default to
    unverified (U).
  • SOURCE: A string describing the source of the
    clone or DNA. This has typically been used to
    indicate the original plate source, and the 96
    and 384 well plate locations that a clone has
    been in
  • GF20096(1A1)384(1A1).
  • GF200 refers to a set of resgen plates

99
Submitting a Printlist Optional Columns
  • DESC: A description of the molecular entity. This
    description is associated with the SUID itself
    (not a clone or plate sample description)
  • LUID: Laboratory Unique ID. For those samples
    that have identical NAME and TYPE, but require
    distinction within the laboratory for
    experimental reasons (different sources, new
    PCR, new plate). If you wish to enter LUIDs for
    your lab's plate samples, please contact the
    curators: array@genome.stanford.edu
  • GENE_NAME: Sometimes clones will stop being
    included in UniGene for spurious reasons, but
    users have a 'Preferred Name' for those clones.
  • ORIGIN: For CDNA clones, this can indicate
    whether this is a public or private clone.
  • SAMPLE_DESC: A description, if any, about that
    particular sample. This description is specific
    to the plate sample.
  • ORGANISM: Used when submitting a print containing
    samples from multiple organisms (e.g. human,
    yeast). For those few rows where the sample is
    derived from an organism other than the default
    (user-defined), the organism code must be
    specified.

100
Submitting a Printlist Creating New SUIDs
  • New samples in your printlist (i.e. not currently
    in the database) will need to have a unique
    identifier assigned to them (SUID)
  • A SUID is meant to represent a unique molecular
    entity within SMD. It is meaningless outside the
    context of the database.
  • The combination of NAME, TYPE and ORGANISM
    uniquely identifies an SUID
  • YBL001C, ORF, SC -> SUID 3429
  • IMAGE:486544, CDNA, HS -> SUID 28546
  • SUIDs allow comparison of the same samples across
    different prints.
  • It is extremely important that erroneous SUIDs
    are not created.
  • Erroneous SUIDs prevent comparisons between
    prints/experiments
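
Why a misspelled NAME leads to an erroneous SUID can be shown with a lookup keyed on the NAME, TYPE and ORGANISM triple. This is a toy illustration, not SMD's schema: the table holds only the YBL001C example from the slide, and `lookup_suid` is a hypothetical function name.

```python
# One known entry, taken from the slide: (NAME, TYPE, ORGANISM) -> SUID.
KNOWN_SUIDS = {("YBL001C", "ORF", "SC"): 3429}


def lookup_suid(name, seq_type, organism, table=KNOWN_SUIDS):
    """Return the existing SUID, or None if the triple is not known.

    A None result means a new SUID would be created, so the sample
    should be verified by the user first.
    """
    return table.get((name, seq_type, organism))
```

Here `lookup_suid("YBL001C", "ORF", "SC")` finds the existing record, while a variant spelling such as `"ybl001c"` does not match and would be treated as a new molecular entity.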

101
Submitting a Printlist Avoiding Common Name
Errors
  • Erroneous SUIDs are usually created by a bad NAME
  • misspelled, non-standard, or non-systematic
  • ACT1 (ORF, SC) or Actin (ORF, SC) instead of
    YFL039C (ORF, SC)
  • 3X SSC (CONTROL, SC) vs. 3xssc (CONTROL, SC)
  • Every new sample must be verified by the user
    before it is assigned a new SUID and before the
    printlist can be entered.
  • Please be a conscientious user and verify that
    any new SUIDs you approve are valid.
  • Empty wells must be specified as such
  • All empty wells must be designated NAME=>EMPTY
    and TYPE=>EMPTY.
  • Do not use "blank" or "control" to describe empty
    wells.

102
Submitting a Printlist Avoiding Common Errors
  • Headers misspelled or absent
  • Required data missing
  • except FAIL, CLONEID, but column header must
    still be present
  • Correct Plate ordering
  • No wells may be skipped (with the exception of
    the last plate in the print run).
  • Useful check: number of plate samples vs. number
    of printed spots
  • # samples (printlist rows - 1) <= # tips x rows
    per sector x columns per sector = # spots
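
The size check in the last bullet can be computed before submitting a printlist. This sketch rests on one reading of the slide, namely that the number of samples must not exceed tips times rows per sector times columns per sector; the function names are hypothetical.

```python
def printed_spots(tips, rows_per_sector, cols_per_sector):
    """Number of spots the print layout provides."""
    return tips * rows_per_sector * cols_per_sector


def printlist_fits(printlist_rows, tips, rows_per_sector, cols_per_sector):
    """True if the samples (rows minus the header line) fit on the array."""
    samples = printlist_rows - 1
    return samples <= printed_spots(tips, rows_per_sector, cols_per_sector)
```

A printlist with 385 lines (384 samples plus the header) fits a layout of 4 tips with 10 x 10 spots per sector, whereas 402 lines would not.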

103
Submitting a Printlist Printlist Check Program
  • The printlist must be placed in your ORA-OUT
    directory on your loader account
  • This program will assist you in printlist
    submission
  • It follows the rules stipulated above.
  • The program will send all feedback to your
    ORA-OUT directory
  • Filename.new
  • Filename.errors

104
Submitting a Printlist Notify Curators
  • Additional information needed
  • Number of sector rows/columns
  • Distance of rows/columns in sector
  • Printing algorithm:
    http://smd.stanford.edu/help/createPrint.shtml
  • Number of slides printed
  • Plate location
  • Printer used for printing
  • When your printlist is correct - send email with
    info above to array@genome.stanford.edu

105
Submitting Data to SMD: The work flow
106
Welcome to SMD
  • User Registration/Accounts
  • Category/Subcategory
  • Submitting Data
  • Finding Your Data
  • Displaying Your Data
  • Organizing Data
  • Submitting a Printlist

107
Submitting Data to SMD Successful Experiment
Entry
  • Once your experiment has been loaded into the
    database, there are 2 methods to get the details
    of the experiment loading process
  • From the queue page
  • A file will be created on your loader account in
    the ORA-OUT directory
  • process_no.log

108
Submitting Data to SMD Example queue logfile
Loading Expt Batch NO 3279
Experiment Name blah blah
Thu Dec 13 15:54:01 2001
Processing Data File
/loader/ftphome/youruserid/incoming/slidename.gpr
Inserting experiment info into experiment table...
exptID 28765
The experiment data has been successfully inserted into experiment table!
Updating Experiment Access Control Table ...
Updating expt_access for experimenter YOURUSERID () ... OK
Updating expt_access for Brown/Botstein labs () ... OK
Calculate norm value...
Reading all data from datafile and doing all calculation now...
PassCriteria 16005
Using 36490 spots for normalization
43.8% passed criteria of a good spot with 0.65
Updating exptNorm table...
NormType Computed
NormValue 0.96
Updating Result table...
Total Record 43200
Updating Result table...
Expected 43200, actual is 43200
1000 . . .
109
Normalization computed method
  • "Nice" spots are those with at least 65% of
    pixels significantly above background.
  • If less than 10% of spots on the array meet the
    threshold, the 65% threshold is reduced stepwise
    until either 10% of spots pass or the threshold
    reaches 55% of pixels above background (whichever
    comes first)

110
Submitting Data to SMD Data Standards
  • MGED - Microarray Gene Expression Data
    Society
  • initially established November, 1999, Cambridge,
    UK.
  • MGED 7, last year at Toronto, had over 300
    participants - international.
  • Interest in data standards and format
    specifications.

111
Submitting Data to SMD MIAME
  • Nature Genetics (2001) 29, 365-371.

112
Submitting Data to SMD Upload gif files
  • How
  • Use your copy of tif files
  • Make composite and save as .gif
  • Upload on loader into incoming/
  • Use the Upload.gif link
  • When
  • The gif created by the default process is not
    acceptable
  • After renormalization
  • If SMD's gif creation fails, notify the curators
    before uploading your own - they may be able to
    fix the problem.

Data Entry
113
Display Options View Details
  • Data Distribution
  • Plot Data
  • Ratios on Array
  • Channel Intensities
  • These graphs are covered in the data analysis
    tutorial.

114
Submitting Data to SMD Experiment Entry Log File
  • The log file will give you the following
    information
  • ExptID (experiment ID)
  • Information on experiment access
  • Information on normalization applied
  • Number of spots that pass criteria
  • Spots used to calculate normalization
  • Percentage of spots that passed criteria
  • Normalization Value
  • Error message if (sub)process failed