A Closer Look at WEKA

Provided by: xx197

1
A Closer Look at WEKA
  • Notes on Manipulating Data in Weka

2
General Remarks
  • WEKA is a Java-based system that accepts
    contributions from independent researchers.
  • Although commercial use is not prohibited, it is
    primarily a research-oriented tool, which explains
    the difficulties users encounter with its
    interface (input and output).

3
Preprocessing (Input)
  • WEKA uses ARFF files for input. ARFF is a
    text-based format. See the example files under
    the installation folders.
  • It accepts comma-delimited text files as well.
    Given an Excel file, this formatting requirement
    is easily met: use Save As and choose CSV (comma
    delimited) as the file type.
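The ARFF layout is simple enough to produce by hand: a relation name, one declaration per attribute, then the data rows. As a rough illustration, the sketch below converts CSV text to ARFF text in plain Python. It is not WEKA code; the function names `csv_to_arff` and `is_number` are made up for this example, and it assumes a header row, no missing values, and no values that need quoting.

```python
import csv
import io


def is_number(text):
    """Return True if text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation="data"):
    """Convert CSV text (header row first) into ARFF text.

    A column is declared numeric if every value parses as a number;
    otherwise it becomes a nominal attribute listing its distinct values.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    lines = [f"@relation {relation}", ""]
    for i, name in enumerate(header):
        column = [row[i] for row in data]
        if all(is_number(v) for v in column):
            lines.append(f"@attribute {name} numeric")
        else:
            values = ",".join(sorted(set(column)))
            lines.append(f"@attribute {name} {{{values}}}")
    lines += ["", "@data"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


sample = "outlook,temperature,play\nsunny,85,no\novercast,83,yes\nrainy,70,yes\n"
arff = csv_to_arff(sample, relation="weather")
print(arff)
```

In practice WEKA reads the CSV file directly, so a converter like this is only useful for seeing how the two formats relate.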

4
Preprocessing (Input)
  • To modify the input file, for instance to remove
    some attributes, there are two ways to proceed:
  • Modify the CSV file using Excel.
  • Modify the input file using WEKA filters and save
    the result as an ARFF file for future use.
  • WEKA filters are powerful tools, and their use is
    documented in the user manual.

5
Applying a Filter
  • Select a filter and click Add. When the Apply
    Filters button is pressed, all filters that
    appear highlighted in the box are applied.
  • Once a file has been filtered, it is technically
    a different file (although it might be the same
    as the original in terms of content) and can be
    saved as such.
  • The saved file needs to be reloaded for the
    filtered data to be in use.

6
Some Basic Filters
  • Add filter: adds an attribute to the data set
  • Attribute filter: removes attributes from the
    data set
  • Attribute expression: creates a new numeric
    attribute as a function of an existing one
  • Discretize filter: converts a numeric attribute
    into a nominal one by using bins
  • Randomize filter: randomizes the order of
    instances
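To make the effect of these filters concrete, here is a minimal sketch of three of them in plain Python. It is not WEKA's implementation: the function names are invented for this example, and equal-width binning is only one possible discretization strategy (the slides do not say which one the filter uses).

```python
import random


def remove_attribute(rows, index):
    """Drop the column at position index from every row."""
    return [row[:index] + row[index + 1:] for row in rows]


def discretize_equal_width(values, bins=3):
    """Map each numeric value to a bin label 0..bins-1 using equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        return [0 for _ in values]
    # The maximum value would land in bin `bins`, so clamp it into the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def randomize(rows, seed=42):
    """Return a shuffled copy of rows; a fixed seed keeps the order reproducible."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return shuffled


rows = [
    ["sunny", 85, "no"],
    ["overcast", 83, "yes"],
    ["rainy", 70, "yes"],
    ["rainy", 65, "no"],
]
temperatures = [row[1] for row in rows]
bins = discretize_equal_width(temperatures, bins=2)
without_outlook = remove_attribute(rows, 0)
shuffled = randomize(rows)
print(bins)
print(without_outlook)
```

Each function returns a new list and leaves its input untouched, which mirrors the point made earlier that a filtered file is technically a different file.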

7
Conclusion
  • The GUI of WEKA has improved a lot.
  • From an instructional perspective, it is a strong
    and useful tool because it supports alternative
    methods.
  • For more information:
    http://www.cs.waikato.ac.nz/ml/weka/

8
Some Classifiers in WEKA
  • ZeroR: predicts the majority class, or the
    average if the output attribute is numeric
  • OneR: uses only one attribute in rule generation
  • NaiveBayes: implements the naive Bayes classifier
    (the normal distribution assumption for numerical
    attributes can be modified if necessary)
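The two simplest classifiers above fit in a few lines. The following is a sketch of the ideas in plain Python, not WEKA's code: the function names are hypothetical, and this OneR version handles nominal attributes only and breaks ties by taking the first attribute with the fewest training errors.

```python
from collections import Counter, defaultdict


def zero_r(labels):
    """ZeroR for a nominal class: always predict the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


def zero_r_numeric(targets):
    """ZeroR for a numeric target: always predict the mean."""
    return sum(targets) / len(targets)


def one_r(rows, labels):
    """OneR: pick the single attribute whose value -> majority-class rule
    makes the fewest errors on the training data.

    Returns (attribute_index, rule), where rule maps attribute value -> class.
    """
    best = None
    for i in range(len(rows[0])):
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[i]][label] += 1
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[i]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, i, rule)
    return best[1], best[2]


# Attributes: (outlook, windy); class: play
rows = [
    ("sunny", "false"), ("sunny", "true"), ("overcast", "false"),
    ("rainy", "false"), ("rainy", "true"), ("overcast", "true"),
]
labels = ["no", "no", "yes", "yes", "yes", "yes"]

majority = zero_r(labels)
attribute, rule = one_r(rows, labels)
print(majority, attribute, rule)
```

ZeroR ignores the attributes entirely, which makes it a useful baseline: any classifier worth using should beat it.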

9
Some Classifiers in WEKA (ctd.)
  • IBk: instance-based k-nearest-neighbor classifier
  • J4.8 and PART: J4.8 is a decision-tree-based
    algorithm; PART generates rules
  • ID3
  • Logistic Regression
  • ...
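As an illustration of the instance-based approach, the sketch below classifies a query point by majority vote among its k nearest training points. It is a generic k-nearest-neighbor example in plain Python, assuming numeric attributes and Euclidean distance; it is not IBk itself, and `knn_predict` is a name chosen for this example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of query by majority vote among its k nearest
    training points (Euclidean distance)."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

near_a = knn_predict(points, labels, (1.1, 1.0))
near_b = knn_predict(points, labels, (5.0, 5.1))
print(near_a, near_b)
```

There is no training step: the "model" is the stored data, and all the work happens at prediction time.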

10
Methods for Prediction (Numerical Output Attribute)
  • Regression
  • Regression / Model trees (M5)
  • Neural Networks (supports classification as well)
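For a numeric output attribute, the simplest case is a straight-line fit with one input. The sketch below computes the ordinary least-squares slope and intercept in plain Python. It is a textbook formula shown for illustration, not WEKA's regression code, and `fit_line` is a hypothetical name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Model trees such as M5 can be thought of as fitting lines like this one separately in different regions of the data.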

11
Other Methods
  • Clustering: K-means, EM, Cobweb
  • Association rules (Apriori algorithm)
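Of the clustering methods listed, K-means is the easiest to show in a few lines. The sketch below alternates between assigning points to their nearest centroid and moving each centroid to the mean of its points. It is a plain-Python illustration, not WEKA's clusterer; the function name and the choice to pass the starting centroids explicitly (rather than picking them at random) are assumptions made to keep the example deterministic.

```python
import math


def k_means(points, centroids, iterations=10):
    """Alternate an assignment step and a centroid update step."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignment


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, assignment = k_means(points, [points[0], points[3]])
print(centroids, assignment)
```

The result depends on the starting centroids, which is why K-means is usually run several times with different initializations.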