A Short Introduction to Weka - PowerPoint PPT Presentation

Provided by: csColumb9

Transcript and Presenter's Notes

Title: A Short Introduction to Weka


1
A Short Introduction to Weka
  • Natural Language Processing
  • Thursday, September 25th

2
What is Weka?
  • Java-based Machine Learning Tool
  • Implements numerous classifiers
  • 3 modes of operation
    • GUI
    • Command Line
    • Java API (not discussed here)
  • Google "weka java"

3
weka Homepage
  • http://www.cs.waikato.ac.nz/ml/weka/
  • To run
  • java -Xmx1024M -jar cs4705/bin/weka.jar

4
.arff file format
  • http://www.cs.waikato.ac.nz/ml/weka/arff.html
  • % 1. Title: Iris Plants Database
  • @RELATION iris
  • @ATTRIBUTE sepallength NUMERIC
    @ATTRIBUTE sepalwidth NUMERIC
    @ATTRIBUTE petallength NUMERIC
    @ATTRIBUTE petalwidth NUMERIC
    @ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
  • @DATA
    5.1,3.5,1.4,0.2,Iris-setosa
    4.9,3.0,1.4,0.2,Iris-setosa
    4.7,3.2,1.3,0.2,Iris-setosa

5
.arff file format
  • @attribute attrName type, with type one of: numeric, string, <nominal>, date
  • numeric: a number
  • nominal: a (finite) set of strings,
    e.g. {Iris-setosa,Iris-versicolor,Iris-virginica}
  • string: arbitrary strings
  • date: (default ISO-8601) yyyy-MM-dd'T'HH:mm:ss
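
The attribute declarations above can be generated from a feature extractor's output. The following is a minimal Python sketch, not part of the original slides; the function name `format_arff` and its argument layout are hypothetical. It assumes values contain no spaces, commas, or quotes (ARFF requires quoting in those cases, which this sketch does not handle).

```python
def format_arff(relation, attributes, rows):
    """Return the text of a simple ARFF file.

    attributes: list of (name, kind) pairs; kind is "numeric" or "string",
                or a list of allowed values for a nominal attribute.
    rows:       list of value lists, one value per attribute.
    """
    lines = ["@RELATION " + relation, ""]
    for name, kind in attributes:
        if isinstance(kind, (list, tuple)):
            # Nominal attribute: a finite set of values inside braces.
            kind = "{" + ",".join(kind) + "}"
        else:
            kind = kind.upper()
        lines.append("@ATTRIBUTE {} {}".format(name, kind))
    lines += ["", "@DATA"]
    for row in rows:
        if len(row) != len(attributes):
            raise ValueError("row length does not match attribute count")
        # One comma-separated instance per line.
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Rebuild the first two instances of the iris example.
iris_text = format_arff(
    "iris",
    [
        ("sepallength", "numeric"),
        ("sepalwidth", "numeric"),
        ("petallength", "numeric"),
        ("petalwidth", "numeric"),
        ("class", ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]),
    ],
    [
        [5.1, 3.5, 1.4, 0.2, "Iris-setosa"],
        [4.9, 3.0, 1.4, 0.2, "Iris-setosa"],
    ],
)
```

Writing `iris_text` to a file ending in `.arff` should give something the Explorer's 'Open file...' dialog can load, under the assumptions stated above.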

6
Example Arff Files
  • cs4705/bin/weka-3-4-11/data/
  • iris.arff
  • soybean.arff
  • weather.arff

7
To Classify with weka GUI
  1. Run weka GUI
  2. Click 'Explorer'
  3. 'Open file...'
  4. Select 'Classify' tab
  5. 'Choose' a classifier
  6. Confirm options
  7. Click 'Start'
  8. Wait...
  9. Right-click on Result list entry
    • 'Save result buffer'
    • 'Save model'

8
Classify
  • Some classifiers to start with:
    • NaiveBayes
    • JRip
    • J48
    • SMO
  • Find references by selecting a classifier
  • Use cross-validation!
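
To make the cross-validation advice concrete, here is a small Python sketch of how N-fold cross-validation partitions a dataset. It is illustrative only: Weka does the splitting itself, and this is not Weka's implementation. The function name `k_fold_splits` and the `seed` parameter are hypothetical.

```python
import random


def k_fold_splits(n_instances, k, seed=1):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_instances:
        raise ValueError("k must be between 2 and the number of instances")
    indices = list(range(n_instances))
    # Shuffle once so that folds do not depend on the file order.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_splits(10, 5))
```

Each instance lands in exactly one test fold, so every instance is used for testing once and for training in the remaining rounds.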

9
Analyzing Results
  • Important tools for Homework 2
  • Accuracy
  • Correctly classified instances
  • F-measure
  • Confusion matrix
  • Save model
  • Visualization
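
As a reminder of how these measures relate, the sketch below derives accuracy and per-class precision, recall, and F-measure from a confusion matrix. It is a Python illustration, not Weka code; it assumes rows are actual classes and columns are predicted classes, and the function name and the counts in the example matrix are made up for illustration.

```python
def confusion_metrics(matrix, labels):
    """Return (accuracy, {label: (precision, recall, f_measure)}).

    matrix[i][j] is the number of instances whose actual class is
    labels[i] and whose predicted class is labels[j] (assumed layout).
    """
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    accuracy = correct / total if total else 0.0
    per_class = {}
    for i, label in enumerate(labels):
        true_pos = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = (precision, recall, f_measure)
    return accuracy, per_class


# Hypothetical two-class result: 17 of 20 instances classified correctly.
accuracy, per_class = confusion_metrics([[8, 2], [1, 9]], ["yes", "no"])
```

Here accuracy is 17/20, which corresponds to the fraction of correctly classified instances; the F-measure is the harmonic mean of precision and recall for each class.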

10
Running weka from the Command Line
  • Running an N-fold cross-validation experiment
  • java -cp cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -t trainingdata.arff -x N -i
  • Using a predefined test set
  • java -cp cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -t trainingdata.arff -T testingdata.arff

11
  • Saving the model
  • java -cp cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -t trainingdata.arff -d output.model
  • Classifying a test set
  • java -cp cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -l input.model -T testingdata.arff
  • Getting help
  • java -cp cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -?
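
When running many experiments, it can help to assemble these command lines in a script. The Python sketch below only builds the argument list; it uses the jar path, class name, and flags exactly as they appear on the slides, while the function name `weka_command` and its parameter names are hypothetical.

```python
WEKA_JAR = "cs4705/bin/weka.jar"                  # jar path as written on the slides
CLASSIFIER = "weka.classifiers.bayes.NaiveBayes"  # classifier used on the slides


def weka_command(train=None, test=None, folds=None, save_model=None,
                 load_model=None, extra=(), jar=WEKA_JAR, classifier=CLASSIFIER):
    """Build the argument list for one of the command lines shown above."""
    cmd = ["java", "-cp", jar, classifier]
    if train:
        cmd += ["-t", train]        # training data
    if folds is not None:
        cmd += ["-x", str(folds)]   # number of cross-validation folds
    if test:
        cmd += ["-T", test]         # predefined test set
    if save_model:
        cmd += ["-d", save_model]   # write the trained model to this file
    if load_model:
        cmd += ["-l", load_model]   # read a previously saved model
    cmd += list(extra)              # any further flags, passed through as-is
    return cmd


# The 10-fold cross-validation experiment from the slides (N = 10).
cv_cmd = weka_command(train="trainingdata.arff", folds=10, extra=["-i"])
# Classifying a test set with a saved model.
apply_cmd = weka_command(load_model="input.model", test="testingdata.arff")
```

The resulting list can be handed to `subprocess.run(cv_cmd, capture_output=True, text=True)` on a machine where Java and the jar are available; that step is left out here because it depends on the local installation.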

12
Homework 2 Weka Workflow

[Workflow diagram. Labeled elements: T1 ... TN, S1 ... SN, Your Feature Extractor (twice), .arff, Test .arff, Weka (twice), results (twice), best model. Stages: Preprocessing (you), Experimentation (you), Grading (us).]

13
Tips for Homework Success
  • Start early
  • Read instructions carefully
  • Start simply
  • Your system should always work
  • 80/20 Rule
  • Add features incrementally
  • This way, you always have something you can turn
    in.