Running Clustering Algorithm in Weka

Learn more at: https://www2.cs.uh.edu
1
Running Clustering Algorithm in Weka
Presented by Rachsuda Jiamthapthaksin
Computer Science Department, University of Houston
2
What is Weka?
  • Data mining software in Java
  • Supervised learning (classification)
  • Unsupervised learning (clustering)
  • Tools
  • Exploration
  • Visualization
  • Experiment
  • Statistical summary

3
Download Weka
  • http://www.cs.waikato.ac.nz/ml/weka/
  • Windows (weka-3-5-6jre.exe)
  • Linux

4
Getting Started
5
Memory Limitation in Weka
  • Run the GUI Chooser from the DOS prompt to increase the memory available to Weka
  • C:\> java -Xmx128m -classpath ./progra~1/weka-3-5/weka.jar weka.gui.GUIChooser
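
To confirm that a heap setting such as -Xmx128m took effect, the JVM can be asked for its limit. The following is a minimal sketch using only the Java standard library; the class name HeapCheck is made up for illustration and is not part of Weka. The reported figure may be slightly below the requested value on some JVMs.

```java
// Minimal sketch (not part of Weka): print the heap limit of the running JVM.
final class HeapCheck {
    /** Converts a byte count to whole megabytes. */
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // maxMemory() reports the most memory the JVM will try to use,
        // which is governed by the -Xmx option given on the command line.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap for this JVM: " + toMegabytes(maxBytes) + " MB");
    }
}
```

Run it with the same option as the GUI Chooser, for example java -Xmx128m HeapCheck, and compare the printed value with the requested one.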

6
Weka GUI
7
Explorer
8
Open Files (.csv, .arff)
9
Dataset Description
Dataset statistics
Attributes
10
Remove Class Attribute
Non-class attributes
11
Select A Clustering Algorithm
12
Select A Clustering Algorithm
13
Select A Clustering Algorithm
14
Parameters Setting
15
Run A Clustering Algorithm
16
DBSCAN Results
  • Run information
    • Scheme: weka.clusterers.DBScan -E 0.9 -M 6 -I weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase -D weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclidianDataObject
    • Relation: iris-weka.filters.unsupervised.attribute.Remove-R5
    • Instances: 150
    • Attributes: 4
      • sepallength
      • sepalwidth
      • petallength
      • petalwidth
    • Test mode: evaluate on training data
  • Model and evaluation on training set
    • DBScan clustering results
    • Clustered DataObjects: 150
    • Number of attributes: 4
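
The -E and -M options in the scheme line are the epsilon radius and the minimum number of points. To show how these two parameters drive the clustering, here is a minimal DBSCAN sketch in plain Java. It is an illustration written for this transcript, not Weka's implementation: the class and method names are assumptions, the neighbourhood test uses distance <= epsilon and counts the point itself, and no attribute normalisation is applied (Weka may differ on each of these).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Illustrative DBSCAN sketch; not the weka.clusterers.DBScan source.
final class DbscanSketch {
    static final int NOISE = -1;
    private static final int UNVISITED = -2;

    /** Returns one label per point: 0, 1, ... for clusters, NOISE for outliers. */
    static int[] cluster(double[][] points, double epsilon, int minPoints) {
        int[] labels = new int[points.length];
        Arrays.fill(labels, UNVISITED);
        int nextCluster = 0;
        for (int i = 0; i < points.length; i++) {
            if (labels[i] != UNVISITED) continue;
            List<Integer> neighbors = regionQuery(points, i, epsilon);
            if (neighbors.size() < minPoints) {
                labels[i] = NOISE; // may be relabelled later as a border point
                continue;
            }
            labels[i] = nextCluster; // i is a core point: start a new cluster
            Deque<Integer> queue = new ArrayDeque<>(neighbors);
            while (!queue.isEmpty()) {
                int j = queue.poll();
                if (labels[j] == NOISE) labels[j] = nextCluster; // border point
                if (labels[j] != UNVISITED) continue;
                labels[j] = nextCluster;
                List<Integer> reachable = regionQuery(points, j, epsilon);
                // Only core points extend the cluster further.
                if (reachable.size() >= minPoints) queue.addAll(reachable);
            }
            nextCluster++;
        }
        return labels;
    }

    /** Indices of all points within epsilon of points[index], itself included. */
    static List<Integer> regionQuery(double[][] points, int index, double epsilon) {
        List<Integer> result = new ArrayList<>();
        for (int k = 0; k < points.length; k++) {
            if (distance(points[index], points[k]) <= epsilon) result.add(k);
        }
        return result;
    }

    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int d = 0; d < a.length; d++) sum += (a[d] - b[d]) * (a[d] - b[d]);
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Made-up toy points: two tight groups and one isolated point.
        double[][] toy = {{0, 0}, {0, 1}, {1, 0}, {10, 10}, {10, 11}, {11, 10}, {50, 50}};
        System.out.println(Arrays.toString(cluster(toy, 1.5, 3)));
    }
}
```

A larger epsilon or a smaller minPoints makes more points count as core points, so clusters grow and fewer points are left as noise.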

17
Simplify A Tested Dataset
18
Simplify A Tested Dataset
19
Parameters Setting
20
DBSCAN Clustering Results
  • Run information
    • Scheme: weka.clusterers.DBScan -E 0.3 -M 50 -I weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase -D weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclidianDataObject
    • Relation: iris-weka.filters.unsupervised.attribute.Remove-R1-2,5
    • Instances: 150
    • Attributes: 2
      • petallength
      • petalwidth
    • Test mode: evaluate on training data
  • Model and evaluation on training set
    • DBScan clustering results
    • Clustered DataObjects: 150
    • Number of attributes: 2
    • Epsilon: 0.3; minPoints: 50
    • Index: weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase
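
The relation name records which attributes were removed: R1-2,5 lists 1-based attribute indices, leaving petallength and petalwidth. The sketch below shows one way such a range could be interpreted. It is plain Java written for illustration, not Weka's Remove filter; the class and method names are assumptions, and only numeric indices and simple ranges are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Illustrative only: interprets a range such as "1-2,5" as 1-based indices.
final class RemoveColumnsSketch {
    /** Expands "1-2,5" into the set {1, 2, 5}. */
    static TreeSet<Integer> parseRange(String spec) {
        TreeSet<Integer> indices = new TreeSet<>();
        for (String part : spec.split(",")) {
            String[] bounds = part.trim().split("-");
            int first = Integer.parseInt(bounds[0].trim());
            // A single index such as "5" is treated as the range 5-5.
            int last = Integer.parseInt(bounds[bounds.length - 1].trim());
            for (int i = first; i <= last; i++) indices.add(i);
        }
        return indices;
    }

    /** Returns the attribute names that survive removal of the given range. */
    static List<String> remove(List<String> attributes, String spec) {
        TreeSet<Integer> drop = parseRange(spec);
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < attributes.size(); i++) {
            if (!drop.contains(i + 1)) kept.add(attributes.get(i)); // 1-based index
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> iris = Arrays.asList(
                "sepallength", "sepalwidth", "petallength", "petalwidth", "class");
        System.out.println(remove(iris, "1-2,5"));
    }
}
```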

21
Run k-Means in Weka
22
Parameters Setting
23
k-Means Clustering Results
  • Run information
    • Scheme: weka.clusterers.SimpleKMeans -N 2 -S 10
    • Relation: iris-weka.filters.unsupervised.attribute.Remove-R1-2,5
    • Instances: 150
    • Attributes: 2
      • petallength
      • petalwidth
    • Test mode: evaluate on training data
  • Model and evaluation on training set
    • kMeans
    • Number of iterations: 6
    • Within cluster sum of squared errors: 5.179687509974782
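
The "within cluster sum of squared errors" line is the quantity k-means tries to reduce. As an illustration of where such a number comes from, here is a minimal k-means sketch in plain Java with a cluster count and a random seed, loosely mirroring -N and -S. It is not SimpleKMeans: the names are assumptions, initial centroids are simply k distinct random points, and attributes are not normalised, so its output is not comparable with the value Weka reports.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative k-means sketch; not the weka.clusterers.SimpleKMeans source.
final class KMeansSketch {
    /** Runs k-means and returns the final centroids. Requires k <= points.length. */
    static double[][] fit(double[][] points, int k, long seed, int maxIterations) {
        int n = points.length;
        int dims = points[0].length;
        Random random = new Random(seed);
        // Pick k distinct points as starting centroids (partial shuffle).
        int[] order = new int[n];
        for (int i = 0; i < n; i++) order[i] = i;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            int pick = c + random.nextInt(n - c);
            int tmp = order[c]; order[c] = order[pick]; order[pick] = tmp;
            centroids[c] = points[order[c]].clone();
        }
        int[] labels = new int[n];
        Arrays.fill(labels, -1);
        for (int iteration = 0; iteration < maxIterations; iteration++) {
            boolean changed = false;
            for (int i = 0; i < n; i++) { // assignment step
                int best = nearest(centroids, points[i]);
                if (best != labels[i]) { labels[i] = best; changed = true; }
            }
            if (!changed) break; // converged: no point switched cluster
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (int i = 0; i < n; i++) { // update step: accumulate per cluster
                counts[labels[i]]++;
                for (int d = 0; d < dims; d++) sums[labels[i]][d] += points[i][d];
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue; // leave an empty cluster's centroid alone
                for (int d = 0; d < dims; d++) centroids[c][d] = sums[c][d] / counts[c];
            }
        }
        return centroids;
    }

    /** Index of the centroid closest to p (first one wins on ties). */
    static int nearest(double[][] centroids, double[] p) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(p, centroids[c]) < squaredDistance(p, centroids[best])) best = c;
        }
        return best;
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int d = 0; d < a.length; d++) sum += (a[d] - b[d]) * (a[d] - b[d]);
        return sum;
    }

    /** Sum over all points of the squared distance to the nearest centroid. */
    static double sumOfSquaredErrors(double[][] points, double[][] centroids) {
        double total = 0;
        for (double[] p : points) total += squaredDistance(p, centroids[nearest(centroids, p)]);
        return total;
    }

    public static void main(String[] args) {
        // Made-up toy points, not the iris data.
        double[][] toy = {{1, 0}, {2, 0}, {3, 0}, {10, 0}, {11, 0}, {12, 0}};
        double[][] centroids = fit(toy, 2, 10L, 100);
        System.out.println("SSE = " + sumOfSquaredErrors(toy, centroids));
    }
}
```

k-means can settle in a local optimum, so a different seed may give a different sum of squared errors on less well-separated data.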

24
ArffViewer: Convert A Dataset's File Format
25
Open A Dataset File
26
Select A Dataset File
27
View the Dataset
28
Manipulate the Dataset (Optional)
29
Save As .arff File
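
For readers curious about what the saved file contains, the sketch below turns a small all-numeric CSV into ARFF text: a @relation line, one @attribute line per column, then the rows under @data. It is a plain-Java illustration, not the ArffViewer code; the class name, the relation name "toy" and the sample rows are made up, and nominal attributes, quoting and missing values are not handled.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative CSV-to-ARFF conversion for all-numeric data; not Weka code.
final class CsvToArffSketch {
    /** First CSV line is the header; every remaining line is a data row. */
    static String convert(String relationName, List<String> csvLines) {
        StringBuilder arff = new StringBuilder();
        arff.append("@relation ").append(relationName).append("\n\n");
        for (String name : csvLines.get(0).split(",")) {
            // Assumption: every column is numeric.
            arff.append("@attribute ").append(name.trim()).append(" numeric\n");
        }
        arff.append("\n@data\n");
        for (String row : csvLines.subList(1, csvLines.size())) {
            if (!row.trim().isEmpty()) arff.append(row.trim()).append("\n");
        }
        return arff.toString();
    }

    public static void main(String[] args) {
        List<String> csv = Arrays.asList("petallength,petalwidth", "1.4,0.2", "4.7,1.4");
        System.out.print(convert("toy", csv));
    }
}
```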
30
Weka Documentation