Title: Multivariate Data
1Multivariate Data Representations
- CS 7450 - Information Visualization
- Jan. 20, 2005
- John Stasko
2Agenda
- Data forms and representations
- Basic representation techniques
- Multivariate (>3) techniques
3Data Sets
- Data comes in many different forms
- Typically, not in the way you want it
- How is it stored (in the raw)?
4Example
- Cars
- make
- model
- year
- miles per gallon
- cost
- number of cylinders
- weights
- ...
5Example
6Data Tables
- Often, we take raw data and transform it into a form that is more workable
- Main idea
- Individual items are called cases
- Cases have variables (attributes)
7Data Table Format
              Case1     Case2     Case3     ...
Variable1     Value11   Value21   Value31
Variable2     Value12   Value22   Value32
Variable3     Value13   Value23   Value33
...
Rows (variables) are the dimensions
Think of it as a function: f(case1) = <Val11, Val12, ...>
8Example
         Mary    Jim     Sally    Mitch   ...
SSN      145     294     563      823
Age      23      17      47       29
Hair     brown   black   blonde   red
GPA      2.9     3.7     3.4      2.1
...
People in class
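As a minimal sketch of the data table idea, the "People in class" example can be held as a mapping from each case to its tuple of variable values, which also gives the function view f(case) = <values> directly. The names `VARIABLES`, `TABLE`, `f`, and `column` are illustrative choices, not something from the lecture.

```python
# Variables (dimensions) of the "People in class" table, in row order.
VARIABLES = ("SSN", "Age", "Hair", "GPA")

# Each case maps to one value per variable.
TABLE = {
    "Mary":  (145, 23, "brown", 2.9),
    "Jim":   (294, 17, "black", 3.7),
    "Sally": (563, 47, "blonde", 3.4),
    "Mitch": (823, 29, "red", 2.1),
}

def f(case):
    """Return the tuple of values for one case: f(case) = <val1, val2, ...>."""
    return TABLE[case]

def column(variable):
    """Return one variable's values across all cases, keyed by case."""
    i = VARIABLES.index(variable)
    return {case: values[i] for case, values in TABLE.items()}

print(f("Mary"))
print(column("Age"))
```

Looking up a case returns all of its values, while `column` reads the table the other way, one variable across all cases.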
9Example
Baseball statistics
10Variable Types
- Three main types of variables
- N-Nominal (equal or not equal to other values)
- Example: gender
- O-Ordinal (obeys < relation, ordered set)
- Example: fr, so, jr, sr
- Q-Quantitative (can do math on them)
- Example: age
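One way to make the three variable types concrete is to look at which operations are meaningful for each. The sketch below is only an illustration: the `ClassYear` enum and the sample values (hair colors and ages borrowed from the "People in class" table) are assumptions made for the example.

```python
from enum import IntEnum

# Nominal: only equality (or inequality) is meaningful.
hair_a, hair_b = "brown", "red"
same_hair = (hair_a == hair_b)

# Ordinal: values form an ordered set, so < comparisons are meaningful.
# IntEnum is used here only for its ordering; arithmetic on the ranks
# would not be meaningful for an ordinal variable.
class ClassYear(IntEnum):
    FR = 1
    SO = 2
    JR = 3
    SR = 4

is_earlier = ClassYear.SO < ClassYear.SR

# Quantitative: arithmetic is meaningful.
ages = [23, 17, 47, 29]
mean_age = sum(ages) / len(ages)

print(same_hair, is_earlier, mean_age)
```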
11Metadata
- Descriptive information about the data
- Might be something as simple as the type of a variable, or could be more complex
- For times when the table itself just isn't enough
- Example: if variable1 is l, then variable3 can only be 3, 7 or 16
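A constraint like the one in the example can be recorded next to the table and checked against each case. This is a rough sketch under stated assumptions: the `METADATA` layout and the `violations` helper are hypothetical, the variable types are made up for illustration, and the value l is treated as the literal string "l".

```python
# Metadata sketch: variable types plus one cross-variable constraint.
METADATA = {
    "types": {"variable1": "N", "variable3": "Q"},  # assumed types, for illustration
    "constraints": [
        # If variable1 is "l", then variable3 can only be 3, 7 or 16.
        {"if": ("variable1", "l"), "then": ("variable3", {3, 7, 16})},
    ],
}

def violations(case):
    """Return the constraints that a single case (a dict of values) breaks."""
    broken = []
    for rule in METADATA["constraints"]:
        if_var, if_val = rule["if"]
        then_var, allowed = rule["then"]
        if case.get(if_var) == if_val and case.get(then_var) not in allowed:
            broken.append(rule)
    return broken

print(violations({"variable1": "l", "variable3": 7}))   # no violations
print(violations({"variable1": "l", "variable3": 5}))   # one violation
```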
12How Many Variables?
- Data sets of dimensions 1, 2, 3 are common
- Number of variables per class
- 1 - Univariate data
- 2 - Bivariate data
- 3 - Trivariate data
- >3 - Hypervariate data
13Representation
- What's a common way of visually representing multivariate data sets?
- Graphs!
14Good Example
www.nationmaster.com
15Basic Symbolic Displays
- Graphs
- Charts
- Maps
- Diagrams
From S. Kosslyn, "Understanding charts and graphs", Applied Cognitive Psychology, 1989.
161. Graph
Showing the relationships between variables' values in a data table
17Properties
- Graph
- Visual display that illustrates one or more relationships among entities
- Shorthand way to present information
- Allows a trend, pattern or comparison to be
easily comprehended
18Issues
- Critical to remain task-centric
- Why do you need a graph?
- What questions are being answered?
- What data is needed to answer those questions?
- Who is the audience?
(Figure: example graph with axes labeled money and time)
19Graph Components
- Framework
- Measurement types, scale
- Content
- Marks, lines, points
- Labels
- Title, axes, ticks
20Other Symbolic Displays
Aside
212. Chart
- Structure is important, relates entities to each other
- Primarily uses lines, enclosure, position to link entities
Examples: flowchart, family tree, org chart, ...
223. Map
- Representation of spatial relations
- Locations identified by labels
23Choropleth Map
Areas are filled and colored differently
to indicate some attribute of that region
24Cartography
- Cartographers and map-makers have a wealth of knowledge about the design and creation of visual information artifacts
- Labeling, color, layout, ...
- Information visualization researchers should
learn from this older, existing area
254. Diagram
- Schematic picture of object or entity
- Parts are symbolic
Examples: figures, steps in a manual, illustrations, ...
26Details
- What are the constituent pieces of these four symbolic displays?
- What are the building blocks?
27Visual Structures
- Composed of
- Spatial substrate
- Marks
- Graphical properties of marks
28Space
- Visually dominant
- Often put axes on space to assist
- Use techniques of composition, alignment, folding, recursion, overloading to (1) increase use of space and (2) do data encodings
29Marks
- Things that occur in space
- Points
- Lines
- Areas
- Volumes
30Graphical Properties
- Size, shape, color, orientation...
                        Spatial properties   Object properties
Expressing extent       Position, Size       Grayscale
Differentiating marks   Orientation          Color, Shape, Texture
31Intermission
- Getting slides
- Getting papers
- Photos
32Back to Data
- What were the different types of data sets?
- Number of variables per class
- 1 - Univariate data
- 2 - Bivariate data
- 3 - Trivariate data
- >3 - Hypervariate data
33Univariate Data
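(Figures: a univariate plot with case label Bill and scale marks 7, 5, 3, 1; a Tukey box plot on a 0 to 20 scale marking low, high, mean, and the middle 50%)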
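The numbers a box plot draws can be computed directly from one variable's values. The following is a minimal sketch, assuming Python's `statistics` module and a made-up sample; the `box_summary` helper and its dictionary keys are illustrative names.

```python
import statistics

def box_summary(values):
    """Summary numbers a box plot typically draws for one variable."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "low": min(values),
        "q1": q1,          # the box spans q1..q3, the middle 50% of values
        "median": median,
        "q3": q3,
        "high": max(values),
        "mean": statistics.mean(values),
    }

# Hypothetical values on a 0-20 scale, for illustration only.
sample = [2, 4, 5, 7, 8, 9, 11, 12, 15, 19]
print(box_summary(sample))
```

Different quartile conventions give slightly different box edges, so the `method` argument is a choice rather than a fixed rule.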
34What goes where
- In univariate representations, we often think of
the data case as being shown along one dimension,
and the value in another
Bar graph: Y-axis is quantitative variable. Compare relative point values
Line graph: Y-axis is quantitative variable. See changes over consecutive values
35Alternative View
- We may think of a graph as representing independent (data case) and dependent (value) variables
- Guideline
- Independent vs. dependent variables
- Put independent on x-axis
- See resultant dependent variables along y-axis
36Bivariate Data
Scatter plot is common
Two variables, want to see relationship
Is there a linear, curved or random pattern?
Each mark is now a data case
(Figure: scatter plot with axes price and mileage)
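A numeric companion to eyeballing the scatter plot is the Pearson correlation coefficient, which measures how close the points come to a straight line. The sketch below uses made-up (mileage, price) pairs. Note that a value near 0 does not distinguish a curved pattern from a random one, which is one reason the plot itself still matters.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: near +1 or -1 suggests a linear pattern."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical (mileage, price) pairs; each pair is one data case.
mileage = [18, 22, 25, 30, 35]
price = [32000, 27000, 24000, 20000, 15000]
print(round(pearson_r(mileage, price), 3))
```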
37Trivariate Data
3D scatter plot is possible
(Figure: 3D scatter plot with axes price, horsepower, and mileage)
38Alternative Representation
Still use 2D but have a mark property represent the third variable
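Letting a mark property carry the third variable usually comes down to a scale function from data values to, say, mark size. Here is a small sketch with hypothetical car data; the `scale` helper, the size range of 4 to 20, and the choice of horsepower as the third variable are all assumptions for illustration.

```python
def scale(value, lo, hi, out_lo, out_hi):
    """Linearly map value from [lo, hi] to [out_lo, out_hi]."""
    t = (value - lo) / (hi - lo)
    return out_lo + t * (out_hi - out_lo)

# Hypothetical cars: (mileage, price, horsepower).
cars = [(18, 32000, 300), (25, 24000, 180), (35, 15000, 100)]

hp_values = [hp for _, _, hp in cars]
hp_lo, hp_hi = min(hp_values), max(hp_values)

# x and y carry two variables; the mark's size carries the third.
marks = [
    {"x": mileage, "y": price, "size": scale(hp, hp_lo, hp_hi, 4.0, 20.0)}
    for mileage, price, hp in cars
]
for mark in marks:
    print(mark)
```

The same pattern applies to other mark properties such as grayscale, by swapping the output range.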
39Alternative Representation
Represent each variable in its own explicit way
40Hypervariate Data
- Ahhh, the tough one
- A number of well-known visualization techniques exist for data sets of 1-3 dimensions
- line graphs, bar graphs, scatter plots OK
- We see a 3-D world (4-D with time)
- What about data sets with more than 3 variables?
- Often the interesting, challenging ones
41Multiple Views
Give each variable its own display
     A   B   C   D   E
1    4   1   8   3   5
2    6   3   4   2   1
3    5   7   2   4   3
4    2   6   3   1   5
(Figure: one small display per variable A through E, each showing cases 1 through 4)
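The multiple-views idea amounts to slicing the table by variable and drawing each slice separately. The sketch below assumes the numbers on the slide read as four cases (1-4) by five variables (A-E) and renders each view as a tiny text bar chart; `views` is an illustrative helper name.

```python
# The small table from the slide: rows are cases 1-4, columns are variables A-E.
VARIABLES = ["A", "B", "C", "D", "E"]
ROWS = {
    1: [4, 1, 8, 3, 5],
    2: [6, 3, 4, 2, 1],
    3: [5, 7, 2, 4, 3],
    4: [2, 6, 3, 1, 5],
}

def views(variables, rows):
    """Split the table into one {case: value} view per variable."""
    return {
        var: {case: values[i] for case, values in rows.items()}
        for i, var in enumerate(variables)
    }

# Render each view as a tiny text bar chart, one display per variable.
for var, view in views(VARIABLES, ROWS).items():
    print(var)
    for case, value in view.items():
        print(f"  {case} {'#' * value}")
```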
42Scatterplot Matrix
Represent each possible pair of variables in their own 2-D scatterplot
Useful for what?
Misses what?
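The number of panels in a scatterplot matrix follows from enumerating the unordered variable pairs. This is a minimal sketch using `itertools.combinations`; the variable names are placeholders.

```python
from itertools import combinations

def scatterplot_pairs(variables):
    """List every unordered pair of variables; each pair gets one 2-D scatterplot."""
    return list(combinations(variables, 2))

pairs = scatterplot_pairs(["A", "B", "C", "D", "E"])
print(len(pairs))   # n * (n - 1) / 2 panels for n variables
print(pairs[:3])
```

The panel count grows quadratically with the number of variables, and no single panel shows more than two variables at once.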
43Chernoff Faces
Encode different variables' values in characteristics of a human face
http://www.cs.uchicago.edu/wiseman/chernoff/
http://hesketh.com/schampeo/projects/Faces/chernoff.html
Cute applets
44Star Plots
Space out the n variables at equal angles around a circle
Each spoke encodes a variable's value
(Figure: star plot with spokes Var 1 through Var 5; the distance along a spoke shows the value)
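The star plot geometry can be sketched in a few lines: spoke k sits at angle 2*pi*k/n, and the value sets the distance from the center. The values below are hypothetical and assumed to be pre-scaled to [0, 1]; `star_vertices` is an illustrative name.

```python
import math

def star_vertices(values):
    """Place value k on spoke k, with the n spokes at equal angles around a circle."""
    n = len(values)
    points = []
    for k, value in enumerate(values):
        angle = 2 * math.pi * k / n
        points.append((value * math.cos(angle), value * math.sin(angle)))
    return points

# Hypothetical values for Var 1 .. Var 5, already scaled to [0, 1].
for x, y in star_vertices([0.8, 0.4, 0.6, 0.9, 0.3]):
    print(f"({x:.2f}, {y:.2f})")
```

Connecting the returned points in order, and closing the loop, gives the star outline for one data case.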
45Star Plot examples
http://seamonkey.ed.asu.edu/behrens/asu/reports/compre/comp1.html
46Star Coordinates
E. Kandogan, "Star Coordinates: A Multi-dimensional Visualization Technique with Uniform Treatment of Dimensions", InfoVis 2000 Late-Breaking Hot Topics, Oct. 2000
Demo
47Parallel Coordinates
48Parallel Coordinates
Encode variables along a horizontal row
Vertical line specifies values
(Figure: parallel vertical axes labeled V1 through V5)
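In a parallel coordinates display, each data case becomes a polyline that crosses every vertical axis at that variable's value. A minimal sketch of that mapping follows, assuming each axis is normalized to its own min-max range; the sample cases and the `parallel_polylines` name are hypothetical.

```python
def parallel_polylines(rows):
    """Turn each case into a polyline of (axis_index, normalized_value) points."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    polylines = []
    for row in rows:
        line = []
        for axis, value in enumerate(row):
            span = highs[axis] - lows[axis]
            # Normalize to [0, 1] along each axis; a constant column sits mid-axis.
            height = (value - lows[axis]) / span if span else 0.5
            line.append((axis, height))
        polylines.append(line)
    return polylines

# Hypothetical cases with values for V1 .. V5.
cases = [(4, 1, 8, 3, 5), (6, 3, 4, 2, 1), (2, 6, 3, 1, 5)]
for line in parallel_polylines(cases):
    print(line)
```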
49Parallel Coords Example
Basic
Grayscale
Color
50Application
- System that uses parallel coordinates for information analysis and discovery
- Interactive tool
- Can focus on certain data items
- Color
Taken from A. Inselberg, "Multidimensional Detective", InfoVis 97, 1997.
51Discuss
- What was their domain?
- What was their problem?
- What were their data sets?
52The Problem
- VLSI chip manufacture
- Want high quality chips (high speed) and a high yield batch (% of useful chips)
- Able to track defects
- Hypothesis: No defects gives desired chip types
- 473 batches of data
53The Data
- 16 variables
- X1 - yield
- X2 - quality
- X3-X12 - defects (inverted)
- X13-X16 - physical parameters
54Parallel Coordinate Display
(Figure: parallel coordinate display with axis groups yield, quality, defects, parameters)
Yikes! But not that bad
Distributions: x1 - normal, x2 - bipolar
55Top Yield Quality
split
defects
Have some defects
56Minimal Defects
Not the highest yields and quality
57Best Yields
Appears that some defects are necessary to produce the best chips
Non-intuitive!
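The kind of interactive selection used in this analysis can be thought of as a range filter over the batches: pick a band on some axes, then inspect what the selected cases do on the others. The sketch below is purely illustrative. The batch values are invented (they are not the 473 real batches), only two defect variables are shown, and the reading of "inverted" as "a low value means more defects" is an assumption.

```python
# Hypothetical batches: X1 = yield, X2 = quality, X3.. = defect variables.
# Assumption: defect variables are inverted, so a low value means more defects.
batches = [
    {"X1": 0.95, "X2": 0.90, "X3": 0.40, "X4": 0.90},
    {"X1": 0.60, "X2": 0.55, "X3": 0.95, "X4": 0.95},
    {"X1": 0.92, "X2": 0.88, "X3": 0.50, "X4": 0.85},
]

def brush(rows, ranges):
    """Keep rows whose value falls inside every (low, high) range given."""
    return [
        row for row in rows
        if all(lo <= row[var] <= hi for var, (lo, hi) in ranges.items())
    ]

# Select the batches with top yield and quality, then inspect their defects.
top = brush(batches, {"X1": (0.9, 1.0), "X2": (0.85, 1.0)})
for row in top:
    print(row["X3"], row["X4"])
```

Swapping the ranges, for example brushing on the defect variables first, corresponds to asking the question the other way around.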
58Xmdv
Toolsuite created by Matthew Ward of WPI
Includes parallel coordinate views
Demo
59Parallel Coordinate Tree
Demo
D. Brodbeck and L. Girardin, "Visualization of
Large-Scale Customer Satisfaction Surveys Using a
Parallel Coordinate Tree", InfoVis 03.
60Parallel Coordinates
- Technique
- Strengths?
- Weaknesses?
61Sliding Rods
T. Lanning, K. Wittenburg, et al,
"Multidimensional Information Visualization
through Sliding Rods", Proceedings of AVI 2000
62Administratia
- Computer accounts
- HW 1 in today
- HW 3 due Tuesday
63Upcoming
- Multivariate vis tools
- Reading
- Eick paper
- Visual perception
- Tufte (please be reading)
64Sources Used
- CMS book
- Referenced articles
- Marti Hearst SIMS 247 lectures
- Kosslyn 89 article
- A. Marcus, Graphic Design for Electronic Documents and User Interfaces
- M. Monmonier, How to Lie with Maps
- W. Cleveland, The Elements of Graphing Data
- C. H. Yu, Visualization Techniques of Different Dimensions
- http://seamonkey.ed.asu.edu/behrens/asu/reports/compre/comp1.html
- http://www.csc.ncsu.edu/faculty/healey/PP/PP.html