Unsupervised clustering in mRNA expression profiles

Slides: 12
Provided by: LanceRP3
Transcript and Presenter's Notes


1
Unsupervised clustering in mRNA expression
profiles
  • D.K. Tasoulis, V.P. Plagianakos, and M.N.
    Vrahatis
  • Computational Intelligence Laboratory (CILAB),
    Department of Mathematics, University of Patras,
    GR-26110 Patras, Greece
  • University of Patras Artificial Intelligence
    Research Center (UPAIRC), University of Patras,
    GR-26110 Patras, Greece
  • Computers in Biology and Medicine, In Press,
    Corrected Proof, available online 24 October 2005

2
K-Windows Clustering
  • Adaptation of K-means, originally proposed in
    2002 by Vrahatis et al.
  • Windowing technique improves speed and accuracy
  • Tries to place a d-dimensional window (box)
    around all patterns that belong to a single
    cluster

3
K-Windows Basic Concepts
  • Move windows to find cluster centers (fig a)
      • Select k points as centers of d-dimensional
        windows of size a.
      • The mean of the points in a window becomes
        its new center.
      • Repeat until the stopping criterion is met
        (the center no longer moves appreciably).
  • Enlarge windows to determine cluster edges (fig
    b)
      • Enlarge one dimension by a specified percent.
      • Relocate the window as above.
      • Keep the enlargement only if the increase in
        instances inside the window exceeds a
        threshold.
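
The move-and-enlarge steps on this slide can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names, the `grow` and `min_gain` parameters, the tolerance `tol`, and the choice of an axis-aligned box with one side length per dimension are all assumptions introduced here.

```python
from statistics import fmean


def points_in_window(points, center, sizes):
    """Return the points inside the axis-aligned box centered at `center`."""
    return [
        p for p in points
        if all(abs(x - c) <= s / 2.0 for x, c, s in zip(p, center, sizes))
    ]


def move_window(points, center, sizes, tol=1e-6, max_iter=100):
    """Recenter the window on the mean of its points until it stops moving."""
    center = tuple(center)
    for _ in range(max_iter):
        inside = points_in_window(points, center, sizes)
        if not inside:
            break  # an empty window has no mean to move toward
        new_center = tuple(fmean(col) for col in zip(*inside))
        shift = max(abs(n - c) for n, c in zip(new_center, center))
        center = new_center
        if shift < tol:
            break
    return center


def enlarge_window(points, center, sizes, grow=0.5, min_gain=0.2):
    """Try growing each dimension in turn; keep a growth step only if the
    relative increase in captured points exceeds `min_gain` (hypothetical
    stand-ins for the enlargement and coverage settings)."""
    center, sizes = tuple(center), list(sizes)
    for d in range(len(sizes)):
        before = len(points_in_window(points, center, sizes))
        trial_sizes = sizes.copy()
        trial_sizes[d] *= 1.0 + grow
        trial_center = move_window(points, center, trial_sizes)
        after = len(points_in_window(points, trial_center, trial_sizes))
        if before > 0 and (after - before) / before > min_gain:
            center, sizes = trial_center, trial_sizes
    return center, sizes
```

In this sketch a window that gains too few points from an enlargement simply keeps its previous size, which is one plausible reading of "keep only if the increase exceeds threshold".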

4
Unsupervised K-Windows (UKW)
  • Start with a sufficiently large number of windows
  • Merge windows to determine the number of
    clusters automatically
      • For each pair of overlapping windows,
        calculate the proportion of overlap for each
        window.
      • Large overlap: the windows are considered
        the same cluster, and W1 is deleted.
      • Many points in common: the windows are
        considered part of the same cluster.
      • Low overlap: the windows are considered two
        different clusters.
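
The pairwise overlap test can be sketched as follows. The slide does not state how the two proportions are combined or what the cut-offs are, so the rule below (delete W1 when most of its points also lie in W2, otherwise compare the average proportion against a merging cut-off) and the names `same_threshold` and `merge_threshold` are assumptions made for illustration only.

```python
def members(points, window):
    """Indices of the points inside a window given as (center, sizes)."""
    center, sizes = window
    return {
        i for i, p in enumerate(points)
        if all(abs(x - c) <= s / 2.0 for x, c, s in zip(p, center, sizes))
    }


def overlap_proportions(points, w1, w2):
    """Share of each window's points that also fall in the other window."""
    m1, m2 = members(points, w1), members(points, w2)
    shared = len(m1 & m2)
    p1 = shared / len(m1) if m1 else 0.0
    p2 = shared / len(m2) if m2 else 0.0
    return p1, p2


def classify_pair(points, w1, w2, same_threshold=0.8, merge_threshold=0.1):
    """Return 'delete_w1', 'same_cluster', or 'different_clusters'.

    Both thresholds and the averaging rule are hypothetical choices.
    """
    p1, p2 = overlap_proportions(points, w1, w2)
    if p1 >= same_threshold:
        return "delete_w1"  # W1 is almost entirely covered by W2
    if (p1 + p2) / 2.0 >= merge_threshold:
        return "same_cluster"  # enough shared points to link the windows
    return "different_clusters"
```

Counting the surviving groups of linked windows after this pass is what yields the number of clusters without it being specified in advance.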

5
Experimental Setup
  • Well-characterized leukemia dataset
  • Default UKW parameters used
  • Supervised dimension reduction
      • Two previously published gene subsets and
        their union
  • Unsupervised dimension reduction
      • Biclustering with UKW
      • PCA
      • PCA and UKW hybrid

6
Supervised Feature Selection
  • Use two gene subsets selected in previously
    published papers using supervised techniques.
  • All algorithms did best on combined set, results
    below.

7
Unsupervised Feature Selection (Biclustering
Technique)
  • Apply UKW to cluster the genes, then select the
    gene closest to each cluster center as that
    cluster's representative.
  • Apply UKW to the samples, using only those
    representative genes (239).
  • UKW accuracy: 93.6% (ALL) and 76% (AML)
  • No results reported for other algorithms
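
The representative-gene step can be sketched as below. The sketch assumes cluster labels are already available from some clustering of the genes (UKW itself is not implemented here) and takes the cluster center to be the mean expression profile of the cluster; both the function name and that choice of center are assumptions.

```python
from collections import defaultdict
from math import dist
from statistics import fmean


def representative_genes(profiles, labels):
    """Map each cluster label to the index of the gene whose expression
    profile lies closest (Euclidean distance) to the cluster mean."""
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)

    reps = {}
    for label, idxs in groups.items():
        # Cluster center taken as the per-sample mean over member genes.
        centroid = [fmean(col) for col in zip(*(profiles[i] for i in idxs))]
        reps[label] = min(idxs, key=lambda i: dist(profiles[i], centroid))
    return reps
```

The returned gene indices would then define the reduced feature set on which the samples are clustered.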

8
Unsupervised Feature Selection (PCA Techniques)
  • PCA and scree plot to reduce features
      • Poor performance
  • Hybrid PCA and UKW method
      • Partition genes using UKW
      • Transform each partition using PCA
      • Select representative factors from each
        cluster
      • UKW accuracy: 97.87% (ALL) and 88% (AML)
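
A rough sketch of the hybrid idea follows: given a partition of the genes (assumed here to come from an earlier clustering step), each partition is reduced to one factor per sample. Taking only the first principal component, and finding it by power iteration, are simplifications chosen for this example; the slide does not say how many factors are kept or how they are chosen.

```python
from math import sqrt
from statistics import fmean


def first_component_scores(rows, n_iter=200):
    """Project samples (rows) onto the first principal component of their
    columns, found by power iteration. Needs at least two samples."""
    means = [fmean(col) for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    n, d = len(centered), len(means)
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Uneven start vector; a start exactly orthogonal to the leading
    # eigenvector would make power iteration converge to another one.
    v = [1.0 / (i + 1) for i in range(d)]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # zero variance in this partition
        v = [x / norm for x in w]
    return [sum(x * vi for x, vi in zip(row, v)) for row in centered]


def hybrid_features(samples, partitions):
    """One factor per gene partition for every sample."""
    per_partition = [
        first_component_scores([[row[g] for g in part] for row in samples])
        for part in partitions
    ]
    return [list(factors) for factors in zip(*per_partition)]
```

The resulting low-dimensional sample vectors are what a final clustering pass over the samples would consume.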

9
UKW Results Summary
Feature selection                     ALL accuracy (%)   AML accuracy (%)
Published gene subsets (supervised)   90                 100
UKW biclustering (unsupervised)       93.6               76
PCA (unsupervised)                    N/A                N/A
PCA-UKW hybrid (unsupervised)         97.87              88

10
(No Transcript)
11
  • Default parameters
      • initial window size a = 5
      • enlargement threshold θe = 0.8
      • merging threshold θm = 0.1
      • coverage threshold θc = 0.2
      • variability threshold θv = 0.02
  • Link to article
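
For experimentation, the default settings listed on this slide could be collected in a small configuration object. The class and field names are hypothetical, and the values assume that the garbled entries read as a = 5, 0.8, 0.1, 0.2, and 0.02 respectively.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UKWDefaults:
    """Default UKW settings as read from the slide (names are assumptions)."""
    initial_window_size: float = 5.0     # a
    enlargement_threshold: float = 0.8
    merging_threshold: float = 0.1
    coverage_threshold: float = 0.2
    variability_threshold: float = 0.02


# Override a single setting while keeping the other defaults.
custom = UKWDefaults(initial_window_size=3.0)
```

Freezing the dataclass keeps a run's settings from being changed accidentally partway through an experiment.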