Visualization and Data Mining - PowerPoint PPT Presentation

Slides: 34
Provided by: lia9

Transcript and Presenter's Notes

1
Visualization and Data Mining
2
Napoleon's Invasion of Russia, 1812
3
Marey, 1885
4
(No Transcript)
5
Snow's Cholera Map, 1855
6
Asia at night
7
South and North Korea at night
North Korea: notice how dark it is
Seoul, South Korea
8
Visualization Role
  • Support interactive exploration
  • Help in result presentation
  • Disadvantage: requires human eyes
  • Can be misleading

9
Bad Visualization: Spreadsheet with misleading Y-axis
Year Sales
1999 2110
2000 2105
2001 2120
2002 2121
2003 2124
Y-Axis scale gives WRONG impression of big change
10
Better Visualization
Year Sales
1999 2110
2000 2105
2001 2120
2002 2121
2003 2124
Axis scale from 0 to 2000 gives correct impression of small change
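
The difference between the two charts can be put in numbers by asking what fraction of the axis height the data actually spans. The sketch below is illustrative only: the helper names and both axis ranges (2100 to 2125 for the truncated chart, 0 to 2500 for the zero-based one) are assumptions, since the transcript does not record the exact ranges drawn on the slides.

```python
# Sales figures as listed on the slides.
sales = {1999: 2110, 2000: 2105, 2001: 2120, 2002: 2121, 2003: 2124}


def axis_fraction(values, axis_min, axis_max):
    """Fraction of the axis height covered by the data's min-to-max spread."""
    return (max(values) - min(values)) / (axis_max - axis_min)


def relative_change(first, last):
    """Change from first to last, relative to first."""
    return (last - first) / first


values = list(sales.values())
truncated = axis_fraction(values, 2100, 2125)  # assumed truncated Y-axis
zero_based = axis_fraction(values, 0, 2500)    # assumed zero-based Y-axis
actual = relative_change(sales[1999], sales[2003])

print(f"truncated axis: data fills {truncated:.0%} of the chart height")
print(f"zero-based axis: data fills {zero_based:.2%} of the chart height")
print(f"actual change 1999 to 2003: {actual:.2%}")
```

Under these assumed ranges the same data fills most of the truncated chart but under one percent of the zero-based chart, which is much closer to the actual change in sales.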
11
Lie Factor = 14.8
(E.R. Tufte, The Visual Display of Quantitative
Information, 2nd edition)
12
Lie Factor
Tufte requirement: 0.95 < Lie Factor < 1.05
(E.R. Tufte, The Visual Display of Quantitative
Information, 2nd edition)
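
As a rough sketch of how a Lie Factor can be computed, the code below uses Tufte's definition (size of the effect shown in the graphic divided by size of the effect in the data, each measured as a relative change). The function names are hypothetical, and the input figures are an assumption: they are the fuel-economy numbers usually quoted for the 14.8 example (18 to 27.5 miles per gallon drawn as lines 0.6 and 5.3 inches long) and are not recorded in this transcript.

```python
def effect_size(first, second):
    """Relative change from the first value to the second."""
    return (second - first) / first


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Effect shown in the graphic divided by effect present in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)


def within_tufte_bounds(factor, low=0.95, high=1.05):
    """True when the factor meets the 0.95 < Lie Factor < 1.05 requirement."""
    return low < factor < high


# Assumed figures for the fuel-economy graphic (not taken from the slides).
factor = lie_factor(0.6, 5.3, 18.0, 27.5)
print(f"Lie Factor: {factor:.1f}, acceptable: {within_tufte_bounds(factor)}")
```

A graphic whose drawn sizes grow in proportion to the data gives a factor of 1 and passes the check.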
13
Tufte's Principles of Graphical Excellence
  • Give the viewer
  • the greatest number of ideas
  • in the shortest time
  • with the least ink in the smallest space.
  • Tell the truth about the data!

(E.R. Tufte, The Visual Display of Quantitative
Information, 2nd edition)
14
Visualization Methods
  • Visualizing in 1-D, 2-D and 3-D
  • well-known visualization methods
  • Visualizing more dimensions
  • Parallel Coordinates
  • Other ideas

15
1-D (Univariate) Data
  • Representations

Tukey box plot: low, middle 50%, high, and mean marked
Histogram
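
Both univariate representations reduce to a few numbers. The following is a minimal sketch with made-up data and hypothetical function names; it assumes the "low" and "high" marks are the minimum and maximum (some box plot variants place the whiskers differently) and that the "middle 50%" is the range between the first and third quartiles.

```python
from statistics import mean, quantiles


def box_plot_summary(values):
    """Numbers drawn by a box plot: low, quartiles, high, and the mean."""
    ordered = sorted(values)
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return {
        "low": ordered[0],
        "q1": q1,          # lower edge of the middle 50%
        "median": median,
        "q3": q3,          # upper edge of the middle 50%
        "high": ordered[-1],
        "mean": mean(ordered),
    }


def histogram(values, bin_width, start=0):
    """Count values per bin [left, left + bin_width), keyed by left edge."""
    counts = {}
    for v in values:
        left = start + int((v - start) // bin_width) * bin_width
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up data
summary = box_plot_summary(sample)
bins = histogram(sample, bin_width=5)
print(summary)
print(bins)
```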
16
2-D (Bivariate) Data
  • Scatter plot

Axes: price and mileage
17
3-D Data (projection)
Axis label: price
18
3-D image (requires 3-D blue and red glasses)
Taken by Mars Rover Spirit, Jan 2004
19
Visualizing in 4 Dimensions
  • Scatterplots
  • Parallel Coordinates
  • Chernoff faces

20
Multiple Views
Give each variable its own display

     A  B  C  D  E
  1  4  1  8  3  5
  2  6  3  4  2  1
  3  5  7  2  4  3
  4  2  6  3  1  5

Problem: does not show correlations
21
Scatterplot Matrix
Represent each possible pair of variables in its own 2-D scatterplot (car data).
Q: Useful for what? A: Linear correlations (e.g. horsepower and weight).
Q: Misses what? A: Multivariate effects.
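
To make the pairing concrete, the sketch below enumerates the panels of a scatterplot matrix and computes the Pearson correlation that each panel would reveal. The car figures are made up for illustration (the slide's actual car data is not in the transcript), and the function names are hypothetical. With d variables there are d(d-1)/2 unordered pairs, and each panel only ever looks at two variables at once, which is why effects involving three or more variables are missed.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def scatterplot_matrix_panels(table):
    """One entry per unordered pair of variables: (name_a, name_b) -> r."""
    return {(a, b): pearson(table[a], table[b]) for a, b in combinations(table, 2)}


# Made-up car data, one list per variable.
cars = {
    "horsepower": [70, 95, 110, 150, 200],
    "weight": [2000, 2400, 2700, 3300, 4100],
    "mpg": [38, 31, 27, 20, 14],
}

panels = scatterplot_matrix_panels(cars)
for (a, b), r in panels.items():
    print(f"{a} vs {b}: r = {r:+.2f}")
```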
22
Parallel Coordinates
  • Encode variables along a horizontal row
  • Vertical line specifies values

Same dataset in parallel coordinates
Dataset in Cartesian coordinates
Invented by Alfred Inselberg while at IBM, 1985
23
Example: Visualizing Iris Data
Iris versicolor
Iris setosa
Iris virginica
24
Flower Parts
Petal, a non-reproductive part of the flower
Sepal, a non-reproductive part of the flower
25
Parallel Coordinates
Sepal Length: 5.1
26
Parallel Coordinates: 2-D
Sepal Length: 5.1
Sepal Width: 3.5
27
Parallel Coordinates: 4-D
Sepal Length: 5.1
Sepal Width: 3.5
Petal Length: 1.4
Petal Width: 0.2
28
Parallel Visualization of Iris data
Values marked on the axes: 5.1, 3.5, 1.4, 0.2
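
The construction built up over the last few slides can be sketched in code: each attribute gets its own axis, each value is scaled to that axis, and one flower becomes one polyline through its four scaled values. The first row below is the sample from the slides; the other two rows are assumptions, quoted from memory of the standard Iris data set so that the axes have a range to scale against. The function names and the min-max scaling are also assumptions, not something the slides specify.

```python
AXES = ["sepal length", "sepal width", "petal length", "petal width"]

flowers = {
    "setosa": [5.1, 3.5, 1.4, 0.2],      # sample shown on the slides
    "versicolor": [7.0, 3.2, 4.7, 1.4],  # assumed row
    "virginica": [6.3, 3.3, 6.0, 2.5],   # assumed row
}


def axis_ranges(rows):
    """(min, max) of every attribute, one pair per axis."""
    return [(min(column), max(column)) for column in zip(*rows)]


def to_polyline(row, ranges):
    """Vertices (axis index, height in [0, 1]) of the line for one record."""
    vertices = []
    for index, (value, (low, high)) in enumerate(zip(row, ranges)):
        # A constant attribute has no spread; place it mid-axis.
        height = 0.5 if high == low else (value - low) / (high - low)
        vertices.append((index, height))
    return vertices


ranges = axis_ranges(flowers.values())
for name, row in flowers.items():
    print(name, to_polyline(row, ranges))
```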
29
Parallel Visualization Summary
  • Each data point is a line
  • Similar points correspond to similar lines
  • Lines crossing over correspond to negatively
    correlated attributes
  • Interactive exploration and clustering
  • Problems: order of axes, limit of 20 dimensions
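
The link between crossing lines and negative correlation can be checked directly. In this sketch (hypothetical function name, made-up data), two records cross between adjacent axes exactly when their order on the left axis is reversed on the right axis, so counting crossings amounts to counting reversed pairs.

```python
from itertools import combinations


def count_crossings(left, right):
    """Number of record pairs whose lines cross between two adjacent axes."""
    crossings = 0
    for i, j in combinations(range(len(left)), 2):
        # Opposite ordering on the two axes means the segments intersect.
        if (left[i] - left[j]) * (right[i] - right[j]) < 0:
            crossings += 1
    return crossings


x = [1, 2, 3, 4, 5]
same_order = [10, 20, 30, 40, 50]      # positively correlated with x
reversed_order = [50, 40, 30, 20, 10]  # negatively correlated with x

print(count_crossings(x, same_order))
print(count_crossings(x, reversed_order))
```

Because the count depends on which axes sit next to each other, it also illustrates why the order of axes matters: a correlation between two attributes is only visible this way when their axes are adjacent.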

30
Chernoff Faces
Encode different variables' values in characteristics of a human face
http://www.cs.uchicago.edu/wiseman/chernoff/
http://hesketh.com/schampeo/projects/Faces/chernoff.html
Cute applets
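
The encoding step can be sketched as a lookup from variables to facial features. Everything named below is hypothetical: the feature list, the drawing ranges, and the assumption that each variable has already been scaled to [0, 1]. The sketch produces only the drawing parameters and does not render a face.

```python
# Hypothetical facial features and the range of each drawing parameter.
FEATURE_RANGES = {
    "face_width": (0.6, 1.0),
    "eye_size": (0.05, 0.20),
    "mouth_curvature": (-1.0, 1.0),  # -1 is a frown, +1 is a smile
    "eyebrow_slant": (-30.0, 30.0),  # degrees
}


def scale(value, low, high):
    """Map a value in [0, 1] onto the range [low, high]."""
    return low + value * (high - low)


def chernoff_face(record, assignment):
    """Turn one record into drawing parameters.

    record: variable name -> value already scaled to [0, 1]
    assignment: variable name -> facial feature that encodes it
    """
    face = {}
    for variable, feature in assignment.items():
        low, high = FEATURE_RANGES[feature]
        face[feature] = scale(record[variable], low, high)
    return face


assignment = {"a": "face_width", "b": "eye_size", "c": "mouth_curvature"}
face = chernoff_face({"a": 0.0, "b": 1.0, "c": 0.5}, assignment)
print(face)
```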
31
Interactive Face
32
Chernoff faces, example
33
Visualization Summary
  • Many methods
  • Visualization is possible in more than 3-D
  • Aim for graphical excellence