Title: Group One
1CSIS
School of Computer Science and Information Systems
Group One Data Visualization Spring 2005
Doctor of Professional Studies in Computing
2Agenda
I. Overview
II. Foundations of Visualization
III. Visualization and KDD
IV. I Can See Clearly Now
V. XmdvTool Demonstration with ISBSG Case Study
3I. Overview
Visualize: "to form a mental vision, image, or picture of (something not visible or present to sight, or of an abstraction); to make visible to the mind or imagination" (The Oxford English Dictionary, 1989)
Many variations of "visualization":
1) Visualization in Scientific Computing (Scientific Visualization)
2) Information Visualization
3) Software Visualization
4 5Fish Eating Boat
6II. Foundations of Visualization
11III. Visualization and KDD
- Knowledge Discovery from Databases
- Data Processing
- Machine Learning
- Evaluation
- Visualization
- Experiments may be nested
- Approach Advocated by YALE
- Yet Another Learning Environment
- http://www-ai.cs.uni-dortmund.de/SOFTWARE/YALE
12IV. I Can See Clearly Now
Key Points
- Data generation is exploding, particularly dimensional data
- Visualization takes place in context: tools and functionality are driven by user needs and objectives
- Yang et al. provide an excellent baseline list of core and advanced techniques for consideration
- Keim introduces an interesting three-dimensional view linking data type, interaction technique, and display type
13Data Growth Factoids
- How much new information per person? According to the Population Reference Bureau, the world population is 6.3 billion, thus almost 800 MB of recorded information is produced per person each year. It would take about 30 feet of books to store the equivalent of 800 MB of information on paper.
- Information explosion? We estimate that new stored information grew about 30% a year between 1999 and 2002.
- The World Wide Web contains about 170 terabytes of information on its surface; in volume this is seventeen times the size of the Library of Congress print collections.
- Instant messaging generates five billion messages a day (750 GB), or 274 terabytes a year.
- Email generates about 400,000 terabytes of new information each year worldwide.
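The arithmetic behind these figures can be checked directly. The sketch below is only a sanity check: it assumes decimal units (1 GB = 10^9 bytes, and so on) and a 365-day year, neither of which is stated on the slide, and the variable names are illustrative.

```python
# Sanity check of the slide's figures.
# Assumptions (not from the slide): decimal units and a 365-day year.
MB = 10**6
GB = 10**9
TB = 10**12
EB = 10**18

world_population = 6.3e9        # from the slide
per_person_bytes = 800 * MB     # "almost 800 MB" per person per year

# Total new recorded information per year, in exabytes.
total_new_info_eb = world_population * per_person_bytes / EB

# Instant messaging: 750 GB a day scaled up to a year, in terabytes.
im_per_year_tb = 750 * GB * 365 / TB

# Compound growth of 30% a year over the three years from 1999 to 2002.
growth_factor = 1.30 ** 3

print(f"New information per year: about {total_new_info_eb:.2f} exabytes")
print(f"Instant messaging per year: about {im_per_year_tb:.0f} terabytes")
print(f"Growth factor 1999 to 2002: about {growth_factor:.2f}x")
```

Under these assumptions the instant messaging figure agrees with the slide's 274 terabytes a year, and 30% annual growth roughly doubles the stored volume over three years.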
17Visualization takes place in context: different users with different needs have different requirements and techniques.
Richness of Information
Use/need
- What should I do?
- Managerial snapshot
- Interactive reporting
- "What If" analysis
- What next?
Typical Output
- Enterprise Reporting Navigation needs and reliable information
- Multi-dimensional speed of thought
- Prescribed action Alerts and notifications
- Managed Metrics Scorecard Dashboards
- Analysis and predictive values
Interaction
- Fixed display
- Filter and Zoom
- Slice and Dice, pivot tables
- Derived Information
- Recommend and Act
Use the data to prove/disprove a hypothesis
Use the data to generate hypotheses
18Yang et al. identify core navigation tools
- Filter: reduce the amount of data to increase focus
- Distortion: enlarge some part of a display to examine details
- Zooming and Panning: enlarge, make smaller, move through display
- Manual Pixel re-ordering: top to bottom, bottom to top
- Comparing: create/examine relationships
- Refining: generate a new, focused display of data subset
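Three of the core tools (filter, zoom, refine) can be sketched on a tiny two-dimensional data set. This is a minimal illustration, not the formulation used by Yang et al.: the records, function names, and window representation are all hypothetical.

```python
# Hypothetical records: (size, effort) pairs invented for illustration only.
points = [(120, 900), (450, 3800), (80, 400), (1000, 12000), (300, 2500)]

def filter_points(data, predicate):
    """Filter: keep only the records that satisfy the predicate."""
    return [p for p in data if predicate(p)]

def zoom_window(x_range, y_range, factor, center):
    """Zoom: return a window `factor` times narrower, centered on `center`."""
    cx, cy = center
    half_w = (x_range[1] - x_range[0]) / (2 * factor)
    half_h = (y_range[1] - y_range[0]) / (2 * factor)
    return (cx - half_w, cx + half_w), (cy - half_h, cy + half_h)

def refine(data, x_range, y_range):
    """Refine: build a new, focused subset from the visible window."""
    return [
        (x, y) for x, y in data
        if x_range[0] <= x <= x_range[1] and y_range[0] <= y <= y_range[1]
    ]

# Filter out the largest project, zoom 4x around (300, 3000), then refine.
small = filter_points(points, lambda p: p[0] <= 500)
xr, yr = zoom_window((0, 1000), (0, 12000), factor=4, center=(300, 3000))
focused = refine(small, xr, yr)
print(small)
print(xr, yr)
print(focused)
```

Filtering and refining both return new lists rather than modifying the input, so the full data set stays available when the user zooms back out.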
19Yang et al. identify advanced navigation tools
- Showing names: mouse-overs
- Layer re-ordering: ordering of overlapping data
- Manual relocation: separation of overlapping data
- Extent Scaling: interactive, proportional resizing
- Dynamic Masking: hiding of irrelevant data
- Automatic Shifting: automatic overlap reduction
20Keim creates a three-dimensional chart that relates interaction technique, type of data, and visualization technique
21Breakdown and examination of Keim model
Simple data
Complex data
22Breakdown and examination of Keim model
Interaction and manipulation techniques, similar
to Yang
23Breakdown and examination of Keim model
Recommended display type (some of which we will
see in the demos)
24V. XmdvTool Demonstration with ISBSG Case Study
- Tool Available at http://davis.wpi.edu/xmdv
- Methods
  - Scatterplots
  - Glyphs
  - Parallel Coordinates
  - Dimensional Stacking
- N-D Brush
  - Highlight
  - Mask
  - Values
  - Average
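The idea behind an N-D brush can be sketched as a range on several dimensions at once, with operations applied to the records that fall inside it. The code below is a rough illustration and not XmdvTool's implementation: the records, field names, and the `hide_inside` switch for masking are assumptions made for the example.

```python
# Hypothetical project records; field names are invented for illustration.
records = [
    {"size": 120, "duration": 4, "team": 3},
    {"size": 450, "duration": 9, "team": 8},
    {"size": 300, "duration": 6, "team": 5},
    {"size": 1000, "duration": 18, "team": 20},
]

# A brush is a (low, high) range on each brushed dimension.
brush = {"size": (100, 500), "duration": (5, 10)}

def in_brush(record, brush):
    """True when the record lies inside the brush on every brushed dimension."""
    return all(lo <= record[dim] <= hi for dim, (lo, hi) in brush.items())

def highlight(records, brush):
    """Highlight: flag each record so a display could draw it differently."""
    return [in_brush(r, brush) for r in records]

def mask(records, brush, hide_inside=False):
    """Mask: drop records outside the brush (or inside, if hide_inside is set)."""
    return [r for r in records if in_brush(r, brush) != hide_inside]

def average(records, brush):
    """Average: mean of every field over the brushed records, or None if empty."""
    selected = [r for r in records if in_brush(r, brush)]
    if not selected:
        return None
    return {dim: sum(r[dim] for r in selected) / len(selected) for dim in selected[0]}

flags = highlight(records, brush)
visible = mask(records, brush)
means = average(records, brush)
print(flags)
print(visible)
print(means)
```

Because the brush is defined in data space rather than screen space, the same selection could be shown in scatterplots, glyphs, parallel coordinates, or dimensional stacking.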
25Source of Case Study
- The International Software Benchmarking Standards Group
- Mission: Help Improve Management of IT Resources Through a Public Repository
- Produces ISBSG Estimating, Benchmarking & Research Suite (Release 8 in 2003) of Data and Tools
- Academic Use: Free or Nominal Charge
- Web Site: www.isbsg.org
- Same Source As Team One's Data Mining Project
26Composition of Study File
- 451 New Development Projects
- Fields
- Size in Adjusted Function Points
- Duration in Months
- Maximum Team Size
- Work Effort in Hours
- Project Delivery Rate
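The five fields are related: a minimal sketch, assuming the usual convention that project delivery rate is work effort in hours divided by size in function points. The class, field names, and sample values below are hypothetical and are not taken from the study file.

```python
from dataclasses import dataclass

@dataclass
class Project:
    """One row of a study file with the fields listed above (names are illustrative)."""
    size_afp: float            # size in adjusted function points
    duration_months: float
    max_team_size: int
    work_effort_hours: float

    @property
    def delivery_rate(self) -> float:
        """Hours of effort per function point (assumed definition)."""
        return self.work_effort_hours / self.size_afp

# Invented example values, not real ISBSG data.
example = Project(size_afp=250, duration_months=8, max_team_size=6,
                  work_effort_hours=2500)
print(example.delivery_rate)
```

Under this definition a lower delivery rate means fewer hours spent per function point delivered.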