Clustering Tool in 8 Minutes - PowerPoint PPT Presentation

Slides: 23
Provided by: masu2
Transcript and Presenter's Notes


1
Clustering Tool in 8 Minutes
  • D93725001 WK TSAO
  • R93922143 MASUYA

2
Presentation Schema
  • Where does the tool come from
  • When and why you need it
  • How to use it
  • Background knowledge
  • Analysis process flow
  • Q&A

3
Where
http://rana.lbl.gov/
  • You can download executable software, source
    code, user manual from this website

4
When
  • When you have a large micro-array dataset and
    you want to take a first look at it
  • When you want an idea of which direction to
    pursue further
  • When either supervised or unsupervised
    understanding is needed

5
Why
  • Compared to other data-mining algorithms,
    clustering needs minimal domain knowledge
    to determine input parameters

6
Background Knowledge
  • Statistics about normalization
  • Clustering concepts and algorithms
  • Be familiar with your data!

7
Analysis Process Flow
Raw data collection
Data preparation
Data cleaning
  • Filtration
  • Adjustment
Data-mining
  • Clustering
Visualization
  • View the result

8
1. Data Preparation
  • Format your data as follows.

Simple form
Complete form
The difference is whether you want to take weight and
order into account
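
The slide images showing the two forms are not preserved in this transcript. As a rough sketch only, the Python below writes a tab-delimited table in both forms. The labels `YORF`, `NAME`, `GWEIGHT`, `GORDER`, `EWEIGHT` and `EORDER` are assumed from the Cluster user manual; the gene IDs, array names and values are made up for illustration, so check the manual before relying on this layout.

```python
import csv
import io

# Hypothetical expression values: gene ID -> (description, values per array).
genes = {
    "GENE1": ("first example gene", [0.5, 1.2, -0.3]),
    "GENE2": ("second example gene", [-1.1, 0.0, 0.8]),
}
arrays = ["EXP1", "EXP2", "EXP3"]


def simple_form(genes, arrays):
    """Simple form: one ID column followed by one column per array."""
    rows = [["YORF"] + arrays]
    for gene_id, (_, values) in genes.items():
        rows.append([gene_id] + [str(v) for v in values])
    return rows


def complete_form(genes, arrays):
    """Complete form: adds name, weight and order columns plus weight/order rows."""
    rows = [["YORF", "NAME", "GWEIGHT", "GORDER"] + arrays]
    rows.append(["EWEIGHT", "", "", ""] + ["1"] * len(arrays))
    rows.append(["EORDER", "", "", ""] + [str(i + 1) for i in range(len(arrays))])
    for order, (gene_id, (name, values)) in enumerate(genes.items(), start=1):
        rows.append([gene_id, name, "1", str(order)] + [str(v) for v in values])
    return rows


def to_tsv(rows):
    """Serialize rows as tab-delimited text."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


simple_text = to_tsv(simple_form(genes, arrays))
complete_text = to_tsv(complete_form(genes, arrays))
print(complete_text)
```

Here every weight is set to 1, which should behave like the simple form; the complete form only matters once weights or orders differ.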
9
Import data into the software
  • Cluster will inform you about the loaded data

10
2. Adjustment
  • Before clustering the data, it is necessary to
    adjust it. This tool provides several modes of
    data adjustment.

11
What Is Normalization
Normalization of Expression Profiles
12
Why? To Avoid Correlation Pitfalls
Correlation = 0.97
13
Why? To Avoid Correlation Pitfalls
Correlation = -0.02
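
The figures for these two slides are not preserved, so the exact pitfall they showed is unknown. The Python sketch below illustrates one possible pitfall under made-up data: when some arrays are much brighter than others, two genes that actually move in opposite directions can still look strongly correlated until each array is normalized. The scale factors, gene levels and the median-based normalization are all assumptions for illustration.

```python
import math
import statistics


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length profiles."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


# Hypothetical data: per-array brightness factors and true relative levels.
array_scale = [1, 2, 4, 8, 16]
true_levels = [
    [1.0, 1.2, 0.8, 1.1, 0.9],  # gene A
    [1.0, 0.8, 1.2, 0.9, 1.1],  # gene B, moving opposite to gene A
    [1.0, 1.0, 1.0, 1.0, 1.0],  # unchanged reference gene
]

# Raw intensities: every gene is multiplied by the brightness of its array.
raw = [[s * v for s, v in zip(array_scale, gene)] for gene in true_levels]

# Normalize each array (column) by its median intensity across genes.
medians = [statistics.median(column) for column in zip(*raw)]
normalized = [[v / m for v, m in zip(gene, medians)] for gene in raw]

r_raw = pearson(raw[0], raw[1])
r_norm = pearson(normalized[0], normalized[1])
print(f"raw: {r_raw:.2f}, normalized: {r_norm:.2f}")
```

The raw correlation is driven almost entirely by array brightness; after normalization the sign flips and the opposite behaviour of the two genes becomes visible.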
14
Adjustment List, WHY?
  • Log transform: micro-array datasets are based
    on fluorescence
  • Normalization: avoids possible correlation
    pitfalls
  • Mean/median center transform: when your
    micro-array samples are compared to a common
    reference sample
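
The three adjustments above can be sketched in a few lines of Python. This is a minimal illustration on one made-up gene profile, not the tool's implementation: the base-2 logarithm and the "scale to unit sum of squares" reading of normalization are assumptions, and the function names are hypothetical.

```python
import math
import statistics


def log_transform(ratios):
    """Base-2 log, so 2-fold up and 2-fold down become +1 and -1."""
    return [math.log2(v) for v in ratios]


def median_center(values):
    """Subtract the median so the profile is centered on zero."""
    med = statistics.median(values)
    return [v - med for v in values]


def normalize(values):
    """Scale the profile so that its sum of squares equals 1."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values] if norm else list(values)


# Hypothetical fluorescence ratios (sample / reference) for one gene.
ratios = [4.0, 1.0, 0.25, 2.0, 0.5]

logged = log_transform(ratios)
centered = median_center(logged)
unit = normalize(centered)
print(logged)
print(unit)
```

Order matters here: the log transform is applied first because centering and normalization assume values that are symmetric around "no change".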

15
3. Filtration, Remove Bad Data
  • The filter tab allows you to remove genes that
    do not fulfill the desired criteria for your
    dataset.
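
As an illustration of what such criteria might look like, the Python sketch below keeps a gene only if enough values are present and the profile varies enough. The three criteria, their thresholds and the name `passes_filter` are assumptions chosen for the example, not the tool's actual defaults.

```python
import statistics


def passes_filter(values, min_present=0.8, min_sd=0.5, min_range=1.0):
    """Return True if a gene profile meets all (hypothetical) criteria.

    Missing measurements are represented as None.
    """
    present = [v for v in values if v is not None]
    if len(present) < 2 or len(present) / len(values) < min_present:
        return False  # too many missing values
    if statistics.stdev(present) < min_sd:
        return False  # profile is nearly flat
    return max(present) - min(present) >= min_range


# Hypothetical log-ratio profiles.
genes = {
    "GENE1": [2.0, -1.5, 0.3, 1.8, -2.2],    # varies a lot
    "GENE2": [0.1, 0.0, -0.1, 0.05, 0.0],    # nearly flat
    "GENE3": [1.9, None, None, -2.0, None],  # mostly missing
}

kept = [gene_id for gene_id, values in genes.items() if passes_filter(values)]
print(kept)
```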

16
4. Clustering
  • The tool provides three methods of
    clustering
  • Hierarchical
  • K-means
  • Self-organizing map (SOM)

17
4.1 Hierarchical Clustering
  • The hierarchical clustering tab allows you to
    perform hierarchical clustering on the dataset.
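
To give a feel for what hierarchical clustering does, here is a minimal agglomerative sketch in Python. It assumes Euclidean distance and average linkage, which are only one of several possible choices and not necessarily what the tool uses; the profiles and the function names are made up.

```python
import math
from itertools import combinations


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles):
    """Repeatedly merge the two closest clusters.

    Returns a list of (members_a, members_b, distance) in merge order.
    """
    clusters = [(name,) for name in profiles]
    merges = []

    def linkage(pair):
        # Average distance over all member pairs of the two clusters.
        a, b = pair
        dists = [euclidean(profiles[i], profiles[j]) for i in a for j in b]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical expression profiles.
profiles = {
    "GENE1": [1.0, 2.0, 3.0],
    "GENE2": [1.1, 2.1, 2.9],
    "GENE3": [-2.0, -1.0, 0.0],
    "GENE4": [-2.2, -0.9, 0.1],
}

merges = average_linkage(profiles)
for a, b, dist in merges:
    print(a, "+", b, f"at distance {dist:.3f}")
```

The merge list is the information a tree (dendrogram) is drawn from: early merges join similar genes, the last merge joins the two most different groups.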

18
4.2 Partitioned Clustering
  • The parameters controlling K-means clustering
    are
  • The number of clusters (K)
  • The maximum number of cycles
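
The role of the two parameters can be seen in a small K-means sketch in Python. This is an illustration under assumptions, not the tool's code: it uses Euclidean distance and a deterministic start (evenly spaced data points as initial centers) so the result is reproducible, whereas K-means implementations commonly start from a random assignment. The names `kmeans` and `max_cycles` are hypothetical.

```python
import math


def kmeans(points, k, max_cycles=100):
    """Assign each point to one of k clusters; stop early once stable."""
    # Assumption: deterministic start from evenly spaced data points.
    centers = [list(points[i * len(points) // k]) for i in range(k)]
    labels = None
    for _ in range(max_cycles):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the solution is stable
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical two-dimensional profiles forming two groups.
points = [
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2],
]

labels, centers = kmeans(points, k=2, max_cycles=50)
print(labels)
```

K fixes how many groups are produced whether or not the data really has that many, and the maximum number of cycles only bounds the run time if the assignment keeps changing.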

19
4.3 SOM Clustering
  • Here we consider a one-dimensional SOM
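
A one-dimensional SOM arranges its nodes along a line, and each training step pulls the best-matching node and its neighbours on that line toward the current profile. The Python sketch below is a minimal illustration under assumptions: the learning-rate and neighbourhood schedules, the deterministic initialisation and all names are chosen for the example and are not taken from the tool.

```python
import math


def best_node(weights, point):
    """Index of the node whose weight vector is closest to the point."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], point))


def train_som_1d(points, n_nodes, epochs=20, lr0=0.5, sigma0=1.0):
    """Train a 1-D SOM; requires n_nodes >= 2."""
    dims = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dims)]
    hi = [max(p[d] for p in points) for d in range(dims)]
    # Assumption: nodes start evenly spaced between the data minimum and maximum.
    weights = [
        [lo[d] + (hi[d] - lo[d]) * i / (n_nodes - 1) for d in range(dims)]
        for i in range(n_nodes)
    ]
    for epoch in range(epochs):
        frac = 1 - epoch / epochs
        lr = lr0 * frac                  # learning rate shrinks over time
        sigma = max(sigma0 * frac, 0.5)  # neighbourhood width shrinks too
        for p in points:
            bmu = best_node(weights, p)
            for i, w in enumerate(weights):
                # Nodes close to the winner on the line move the most.
                h = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(dims):
                    w[d] += lr * h * (p[d] - w[d])
    return weights


# Hypothetical two-dimensional profiles forming two groups.
points = [
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2],
]

weights = train_som_1d(points, n_nodes=3)
som_labels = [best_node(weights, p) for p in points]
print(som_labels)
```

Unlike K-means, neighbouring nodes on the line end up with similar profiles, so the node index itself carries some ordering information.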

20
5. Result Viewer
  • Cluster: micro-array raw data → data cleaning
    and data-mining → clustering result
    (.CDT and .GTR files)
  • Tree-View: .CDT and .GTR files → visualization

21
5. Result Tree-viewer
  • Use your mouse: click/drag/find
  • Use your keyboard: up/down/left/right

22
Thank You, Q&A