Debellor: Data Mining Platform with Stream Architecture

1
Debellor: Data Mining Platform with Stream Architecture
Marcin Wojnarski
Warsaw University, Poland
2
Outline
  • Debellor data mining platform
  • Motivation
  • Main features
  • Architecture
      • Cell
      • data streaming
      • multi-threading
  • Available in ver. 0.6
  • Future releases
  • Summary

3
Debellor
  • Language: Java
  • Licence: open source (GPL)
  • Download: www.debellor.org
  • Debello = to conquer (Latin). Debellor =
    conqueror of data

4
Debellor data mining platform
[Diagram labels: Rseslib, LibSVM, Debellor, Weka, TA-Lib, own, own]
5
Motivation
  • Demand for more complex algorithms.
  • Necessity to combine elementary algorithms.

6
Motivation
  • Data Processing Network (DPN)

7
Motivation
  • Committee of algorithms

8
Motivation
  • Nested algorithms

[Diagram labels: RBF neural network, K-means]
9
Requirements
  • Versatile
  • Efficient
  • Simple
10
Features of Debellor
  • All types of data processing algorithms
  • Extendible data types
  • Stream architecture → large data sets
  • Multi-threading
  • Immutability of data objects → safety
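
The link between immutability and safety can be illustrated with a minimal sketch. `ImmutableSample` and its methods are hypothetical names chosen for this example, not Debellor's data classes; the sketch only assumes that a data object never changes after construction, so it can be passed between cells (and threads) without defensive locking.

```java
// Hypothetical immutable data object; not part of the Debellor API.
final class ImmutableSample {
    private final double[] features;  // never exposed directly
    private final String label;

    ImmutableSample(double[] features, String label) {
        this.features = features.clone();  // defensive copy: caller keeps no handle
        this.label = label;
    }

    double feature(int i) { return features[i]; }
    int size() { return features.length; }
    String label() { return label; }

    // "Modification" returns a new object and leaves this one untouched.
    ImmutableSample withLabel(String newLabel) {
        return new ImmutableSample(features, newLabel);
    }
}

class ImmutableSampleDemo {
    public static void main(String[] args) {
        double[] raw = {1.0, 2.0};
        ImmutableSample s = new ImmutableSample(raw, "A");
        raw[0] = 99.0;                        // does not affect the sample
        ImmutableSample t = s.withLabel("B"); // s still carries label "A"
        System.out.println(s.feature(0) + " " + s.label() + " " + t.label());
    }
}
```

Because every change produces a new object, a cell that received a sample earlier can keep using it without worrying that a later cell altered it.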

11
Debellor
12
Algorithm = Cell
  • Cell cell = new RseslibClassifier("C45");
  • cell.set("pruning", "true");

13
Cell = data source
  • cell.open();
  • Sample s1 = cell.next(),
  •        s2 = cell.next(),
  •        ...
  • cell.close();
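
The open / next / close protocol above can be sketched as a self-contained example. The classes `SourceSample`, `ListSourceCell` and `SourceDemo` are hypothetical stand-ins, not Debellor's classes, and returning `null` at the end of the stream is an assumption of this sketch rather than documented behaviour.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sample type; not Debellor's Sample class.
final class SourceSample {
    final double value;
    SourceSample(double value) { this.value = value; }
}

// A cell acting as a data source: open, pull samples one by one, close.
class ListSourceCell {
    private final List<SourceSample> data;
    private Iterator<SourceSample> it;  // non-null only while the stream is open

    ListSourceCell(List<SourceSample> data) { this.data = data; }

    void open() { it = data.iterator(); }

    // Returns the next sample, or null when the stream is exhausted (assumption).
    SourceSample next() {
        if (it == null) throw new IllegalStateException("stream is not open");
        return it.hasNext() ? it.next() : null;
    }

    void close() { it = null; }
}

class SourceDemo {
    // The consumer drives the stream: it asks for each sample in turn.
    static double sum(ListSourceCell cell) {
        double total = 0;
        cell.open();
        try {
            for (SourceSample s = cell.next(); s != null; s = cell.next()) {
                total += s.value;
            }
        } finally {
            cell.close();  // release the stream even if processing fails
        }
        return total;
    }

    public static void main(String[] args) {
        ListSourceCell cell = new ListSourceCell(Arrays.asList(
            new SourceSample(1), new SourceSample(2), new SourceSample(3)));
        System.out.println(sum(cell));
    }
}
```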

14
Cell = data receiver
[Diagram: anotherCell feeding data into cell]
  • cell.setSource(anotherCell);
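
A minimal sketch of how connecting cells with `setSource` might work. `PipeCell`, `NumbersCell` and `DoublingCell` are hypothetical classes written for this example; only the method names `setSource`, `open`, `next` and `close` come from the slides, and the end-of-stream `null` is an assumption.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical minimal cell contract, modelled on the method names in the slides.
abstract class PipeCell {
    protected PipeCell source;
    void setSource(PipeCell source) { this.source = source; }
    abstract void open();
    abstract Double next();  // null marks end of stream (assumption)
    abstract void close();
}

// Source cell: emits numbers from a fixed list.
class NumbersCell extends PipeCell {
    private final List<Double> data;
    private Iterator<Double> it;
    NumbersCell(Double... data) { this.data = Arrays.asList(data); }
    void open() { it = data.iterator(); }
    Double next() { return it.hasNext() ? it.next() : null; }
    void close() { it = null; }
}

// Receiver cell: pulls each sample from its source and doubles it.
class DoublingCell extends PipeCell {
    void open() { source.open(); }
    Double next() {
        Double x = source.next();
        return x == null ? null : 2 * x;
    }
    void close() { source.close(); }
}

class PipeDemo {
    // Pulls everything from the last cell of a chain and sums it.
    static double drainAndSum(PipeCell cell) {
        double total = 0;
        cell.open();
        for (Double x = cell.next(); x != null; x = cell.next()) total += x;
        cell.close();
        return total;
    }

    public static void main(String[] args) {
        PipeCell anotherCell = new NumbersCell(1.0, 2.0, 3.0);
        PipeCell cell = new DoublingCell();
        cell.setSource(anotherCell);
        System.out.println(drainAndSum(cell));
    }
}
```

Since a receiver is itself a cell, chains of any length can be built the same way: each cell only knows its immediate source.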

15
Trainable Cell
[Diagram: EMPTY cell, then TRAINED cell]
  • cell.setSource()
  • cell.learn()
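
The transition from an empty to a trained cell can be sketched as follows. `TrainSource` and `MeanCell` are hypothetical classes for illustration, and learning a mean stands in for a real training algorithm; the sketch only assumes that `learn()` pulls the whole training stream from the source set beforehand.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical training-data source; null marks end of stream (assumption).
class TrainSource {
    private final List<Double> data;
    private Iterator<Double> it;
    TrainSource(Double... data) { this.data = Arrays.asList(data); }
    void open() { it = data.iterator(); }
    Double next() { return it.hasNext() ? it.next() : null; }
    void close() { it = null; }
}

// Hypothetical trainable cell: "training" here just means learning the mean.
class MeanCell {
    enum State { EMPTY, TRAINED }

    private TrainSource source;
    private State state = State.EMPTY;
    private double mean;

    void setSource(TrainSource source) { this.source = source; }

    // Consumes the training stream one sample at a time, then becomes TRAINED.
    void learn() {
        if (source == null) throw new IllegalStateException("no source set");
        long n = 0;
        double sum = 0;
        source.open();
        try {
            for (Double x = source.next(); x != null; x = source.next()) {
                sum += x;
                n++;
            }
        } finally {
            source.close();
        }
        if (n == 0) throw new IllegalStateException("empty training stream");
        mean = sum / n;
        state = State.TRAINED;
    }

    State state() { return state; }

    double mean() {
        if (state != State.TRAINED) throw new IllegalStateException("cell is not trained");
        return mean;
    }
}

class TrainDemo {
    public static void main(String[] args) {
        MeanCell cell = new MeanCell();
        cell.setSource(new TrainSource(2.0, 4.0, 6.0));
        cell.learn();
        System.out.println(cell.state() + " " + cell.mean());
    }
}
```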

16
Data Streaming
[Diagram labels: BATCH, STREAM]
It's the cell that is responsible for asking for
data.
17
Benefits of streaming
[Diagram: training of k-means; labels: X, X, crash!]
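
To make the memory argument concrete, here is a minimal sketch of k-means trained from a stream. It uses the sequential (MacQueen-style) update, which is a swapped-in textbook technique and not necessarily the algorithm Debellor implements; the class `OnlineKMeans1D` and the one-dimensional data are assumptions made to keep the example short. Only the k centroids and their counts are held in memory, regardless of how many samples pass through.

```java
import java.util.Random;

// Sequential k-means on 1-D data: each sample is seen once and then discarded.
class OnlineKMeans1D {
    private final double[] centroids;
    private final long[] counts;
    private int initialised = 0;  // how many centroids have been seeded so far

    OnlineKMeans1D(int k) {
        if (k < 1) throw new IllegalArgumentException("k must be at least 1");
        centroids = new double[k];
        counts = new long[k];
    }

    void observe(double x) {
        // Seed the centroids with the first k samples.
        if (initialised < centroids.length) {
            centroids[initialised] = x;
            counts[initialised] = 1;
            initialised++;
            return;
        }
        // Move the nearest centroid towards x by a running-mean step.
        int best = nearest(x);
        counts[best]++;
        centroids[best] += (x - centroids[best]) / counts[best];
    }

    private int nearest(double x) {
        int best = 0;
        for (int i = 1; i < initialised; i++) {
            if (Math.abs(x - centroids[i]) < Math.abs(x - centroids[best])) best = i;
        }
        return best;
    }

    double[] centroids() { return centroids.clone(); }
}

class OnlineKMeansDemo {
    public static void main(String[] args) {
        OnlineKMeans1D km = new OnlineKMeans1D(2);
        Random rnd = new Random(42);
        // One million samples generated on the fly; none of them is stored.
        for (long i = 0; i < 1_000_000; i++) {
            double centre = (i % 2 == 0) ? 0.0 : 10.0;
            km.observe(centre + rnd.nextGaussian());
        }
        double[] c = km.centroids();
        System.out.println(c[0] + " " + c[1]);
    }
}
```

A batch implementation would need the whole data set in memory before the first iteration, which is where large inputs become a problem.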
18
Multi-threading
[Diagram labels: Thread_1, A, B]
19
Multi-threading
[Diagram labels: Thread_2, Thread_1, A, B]
  • A.newThread()
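
One way to picture what moving a cell into its own thread involves is a producer thread that hands samples over through a bounded queue. This is a sketch under stated assumptions, not Debellor's implementation of `newThread()`: the class `ThreadedSource`, the queue capacity and the end-of-stream marker are all hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical wrapper: the upstream iterator runs in its own thread,
// while the consumer keeps calling next() from the original thread.
class ThreadedSource {
    private static final Object END = new Object();  // end-of-stream marker
    private final BlockingQueue<Object> queue = new ArrayBlockingQueue<>(16);
    private final Thread producer;
    private boolean done;

    // The upstream iterator must not yield null (the queue rejects null items).
    ThreadedSource(Iterator<Integer> upstream) {
        producer = new Thread(() -> {
            try {
                while (upstream.hasNext()) queue.put(upstream.next());
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // stop quietly on close()
            }
        }, "Thread_2");
        producer.setDaemon(true);  // never keeps the JVM alive on its own
    }

    void open() { producer.start(); }

    // Blocks until a sample is available; returns null at end of stream.
    Integer next() {
        if (done) return null;
        try {
            Object item = queue.take();
            if (item == END) {
                done = true;
                return null;
            }
            return (Integer) item;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            done = true;
            return null;
        }
    }

    void close() { producer.interrupt(); }
}

class ThreadedDemo {
    public static void main(String[] args) {
        ThreadedSource a = new ThreadedSource(Arrays.asList(1, 2, 3, 4).iterator());
        a.open();
        int sum = 0;
        for (Integer x = a.next(); x != null; x = a.next()) sum += x;
        a.close();
        System.out.println(sum);
    }
}
```

The bounded queue provides back-pressure: the producer thread blocks when the consumer falls behind, so memory use stays constant.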

20
Available in version 0.6
  • Rseslib algorithms
      • classifiers (20 algorithms)
  • Weka algorithms
      • ARFF reader
      • classifiers (60)
      • filters (47)
  • Debellor algorithms
      • TrainTest evaluation
      • k-means for large data (stream-based)
  • Data types
      • numeric and symbolic features
      • vectors of features, vectors of vectors of

21
Future releases
  • Multi-input multi-output cells
  • Composite cells (e.g. meta-learning)
  • Serialization and copying

22
Summary
  • Platform
  • Stream architecture
  • Extendible
  • Multi-threaded
  • Weka and Rseslib partially integrated

23
Home
www.debellor.org
24
Thank You