About Me - PowerPoint PPT Presentation

Slides: 15
Provided by: peterro

Transcript and Presenter's Notes

1
About Me
  • Swaroop Butala
  • MSCS graduating in Dec 09
  • Specialization: Systems and Databases
  • Interests
  • Learning new technologies
  • Application of technology to financial sectors

2
Co-clustering Documents and Words Using
Bipartite Spectral Graph Partitioning
  • Author: Inderjit S. Dhillon
  • Department of Computer Sciences
  • University of Texas, Austin
  • Presented by
  • Swaroop Butala, Fall 2008

3
Clustering and Current Solutions(1)
  • Clustering
  • Collection of Objects
  • Future Navigation and Searches

4
Clustering and Current Solutions(2)
  • Current Solutions
  • K-means
  • Fuzzy C-means
  • Hierarchical clustering
  • Document Clustering
  • Word Clustering

5
Document Clustering
  • Problem
  • Vector Space Model
  • Extract Unique Content-Bearing Words
  • Word by Document matrix
  • Existing Solutions
  • K-means Algorithm
  • Self-organizing maps
  • Computationally Prohibitive
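The word-by-document matrix named on this slide can be sketched as follows. This is a minimal illustration, assuming whitespace tokenization and raw term counts; the stop-word list, the toy corpus, and the function name `word_by_document_matrix` are hypothetical and not taken from the presentation.

```python
from collections import Counter

# Hypothetical stop-word list, used here to keep only content-bearing words.
STOP_WORDS = {"the", "a", "of", "and", "in", "is", "to"}

def word_by_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts word i in document j."""
    counts = [
        Counter(w for w in doc.lower().split() if w not in STOP_WORDS)
        for doc in docs
    ]
    vocab = sorted(set().union(*counts))           # unique content-bearing words
    matrix = [[c[w] for c in counts] for w in vocab]
    return vocab, matrix

# Hypothetical toy corpus.
docs = [
    "the graph cut of the graph",
    "spectral graph partitioning",
    "stock market and bond market",
]
vocab, A = word_by_document_matrix(docs)
for word, row in zip(vocab, A):
    print(f"{word:12s} {row}")
```

Each row describes one word by the documents it occurs in, and each column describes one document by its words, which is the vector space view the slide refers to.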

6
Word Clustering
  • Words are clustered on the basis of the documents in which they co-occur
  • Words that typically associate together in
    documents should be associated with similar
    concepts.
  • Uses
  • Automatic Classification of documents
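One way to make "words that co-occur in documents are related" concrete is to compare rows of the word-by-document matrix. The sketch below assumes cosine similarity as the comparison, which is an illustrative choice and not something the slides specify; the word rows are hypothetical counts.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

# Hypothetical rows of a word-by-document matrix (3 documents).
word_rows = {
    "graph":  [2, 1, 0],
    "cut":    [1, 0, 0],
    "market": [0, 0, 2],
}

sim_graph_cut = cosine_similarity(word_rows["graph"], word_rows["cut"])
sim_graph_market = cosine_similarity(word_rows["graph"], word_rows["market"])
print(sim_graph_cut, sim_graph_market)
```

Words that share documents ("graph" and "cut") score high, while words that never appear together ("graph" and "market") score zero.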

7
Co-clustering Documents and Words
  • Novel Idea
  • Duality of word and document clustering
  • Use of Bipartite Graphs
  • The clustering problem can now be posed as a
    partitioning problem
  • Solution
  • Spectral Co-Clustering algorithm

8
Bipartite Graph(1)
  • Edges connect words to the documents that contain them
  • No edges between two words or between two documents

9
Bipartite Graphs(2)
Adjacency Matrix
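Because the graph has no word-word or document-document edges, its adjacency matrix has the block form M = [[0, A], [A^T, 0]], where A is the word-by-document matrix. A minimal sketch, assuming words are numbered before documents; the function name and the toy matrix are hypothetical:

```python
def bipartite_adjacency(A):
    """Build M = [[0, A], [A^T, 0]] from an m x n word-by-document matrix A."""
    m, n = len(A), len(A[0])
    size = m + n
    M = [[0] * size for _ in range(size)]
    for i in range(m):
        for j in range(n):
            M[i][m + j] = A[i][j]   # edge from word i to document j
            M[m + j][i] = A[i][j]   # same edge, seen from the document side
    return M

# Hypothetical matrix: 3 words (rows) by 2 documents (columns).
A = [[1, 0],
     [1, 1],
     [0, 2]]
M = bipartite_adjacency(A)
for row in M:
    print(row)
```

The two zero blocks on the diagonal are exactly the missing word-word and document-document edges.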

10
The Partitioning Problem
  • Minimum cut vertex partitions in bipartite graphs
  • Finding the optimal partitioning is NP-complete
  • Heuristic solutions exist, such as the Kernighan-Lin (KL)
    and Fiduccia-Mattheyses (FM) algorithms
  • The spectral algorithm gives a good global solution
  • Often better solutions than the KL and FM algorithms
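The cut of a partition is the total weight of the edges that cross it. In the bipartite setting every edge joins a word to a document, so the cut can be read directly off the word-by-document matrix. A small sketch under that assumption; the matrix and the two candidate partitions are hypothetical:

```python
def cut_weight(A, word_side, doc_side):
    """Sum the weights of edges whose word and document lie on different sides."""
    return sum(
        A[i][j]
        for i in range(len(A))
        for j in range(len(A[0]))
        if word_side[i] != doc_side[j]
    )

# Hypothetical 3-word x 3-document count matrix.
A = [[2, 1, 0],
     [1, 2, 0],
     [0, 1, 3]]

# Two candidate partitions; side labels are 0 or 1.
cut_a = cut_weight(A, word_side=[0, 0, 1], doc_side=[0, 0, 1])
cut_b = cut_weight(A, word_side=[0, 1, 1], doc_side=[0, 1, 1])
print(cut_a, cut_b)
```

Checking every possible assignment of side labels grows exponentially with the number of vertices, which is why heuristics are used in practice.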

11
Graph Partitioning
  • Goal: find (nearly) equally sized vertex subsets such that
    the cut is minimized
  • Eigenvectors give the optimal partition vectors of the
    real relaxation
  • The relaxation is needed since the discrete problem is NP-complete
  • The Bipartitioning Algorithm
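The sketch below shows one possible shape of a spectral bipartitioning step; it should be read as an illustration under stated assumptions, not as the exact algorithm from the paper. It scales A by the word and document degrees, An = D1^{-1/2} A D2^{-1/2}, and uses the second singular vectors of An as the relaxed partition vector. Two substitutions are made to keep it dependency-free: the second singular vector is found by power iteration with deflation of the known leading vector (instead of a library SVD), and the final split thresholds at zero (instead of clustering the vector entries). The function name and the toy matrix are hypothetical.

```python
import math

def spectral_bipartition(A, iters=200):
    """Split words (rows) and documents (columns) of A into two co-clusters.

    Assumes every row and column sum of A is positive and that A has rank >= 2.
    """
    m, n = len(A), len(A[0])
    d1 = [sum(row) for row in A]                              # word degrees
    d2 = [sum(A[i][j] for i in range(m)) for j in range(n)]   # document degrees
    # Normalized matrix An = D1^{-1/2} A D2^{-1/2}.
    An = [[A[i][j] / math.sqrt(d1[i] * d2[j]) for j in range(n)] for i in range(m)]
    # The leading right singular vector of An is sqrt(d2), normalized.
    total = math.sqrt(sum(d2))
    v1 = [math.sqrt(x) / total for x in d2]

    def deflate_and_normalize(v):
        dot = sum(a * b for a, b in zip(v, v1))
        v = [a - dot * b for a, b in zip(v, v1)]
        norm = math.sqrt(sum(a * a for a in v))
        return [a / norm for a in v]

    # Power iteration on An^T An, kept orthogonal to v1, approaches v2.
    v = deflate_and_normalize([float(j + 1) for j in range(n)])
    for _ in range(iters):
        u = [sum(An[i][j] * v[j] for j in range(n)) for i in range(m)]
        v = deflate_and_normalize(
            [sum(An[i][j] * u[i] for i in range(m)) for j in range(n)]
        )
    u = [sum(An[i][j] * v[j] for j in range(n)) for i in range(m)]
    # Relaxed partition vector z = [D1^{-1/2} u ; D2^{-1/2} v].
    z_words = [u[i] / math.sqrt(d1[i]) for i in range(m)]
    z_docs = [v[j] / math.sqrt(d2[j]) for j in range(n)]
    # Simplification: threshold at zero to get the two sides.
    return [int(z > 0) for z in z_words], [int(z > 0) for z in z_docs]

# Hypothetical matrix: words 0-1 mostly in documents 0-1, words 2-3 in documents 2-3.
A = [[3, 1, 0, 0],
     [1, 3, 1, 0],
     [0, 1, 3, 1],
     [0, 0, 1, 3]]
word_labels, doc_labels = spectral_bipartition(A)
print(word_labels, doc_labels)
```

On this toy matrix the first two words and documents land on one side and the last two on the other; which side is labeled 0 or 1 is arbitrary, since a singular vector is only defined up to sign.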

12
Conclusions
  • A novel idea of co-clustering words and documents
    together is proposed
  • A real relaxation of the NP-complete optimal
    partitioning problem is provided
  • Algorithm works well on real examples

13
Critique
  • Actual motivation for combining document and word
    clustering is not stated
  • The solution is not guaranteed to be optimal, since the
    partitioning problem is NP-complete

14
Questions?