Title: Visual Analytics Research at WPI
1. Visual Analytics Research at WPI
- Dr. Matthew Ward and Dr. Elke Rundensteiner
- Computer Science Department
2. What is Visual Analytics?
- "The science of analytical reasoning facilitated
by interactive visual interfaces," from
Illuminating the Path: The Research and
Development Agenda for Visual Analytics, J.
Thomas and K. Cook (eds.), 2005
- More than information visualization or visual
data mining, it involves technology to support
all aspects of the analysis and reasoning
processes.
3. An Overview of VA at WPI
[Figure: overview diagram. Stage labels: Data Sources, Transforms,
Abstractions, Visual Representations, Interaction Spaces, Discovery,
Reasoning. Legend: Past Work, Recent Work, Planned Work.]
Item labels in the diagram:
- Files, Databases, Numeric, Nominal
- Clustering, Sampling, Nominal to ordinal, Dimension reduction
- Data (multiple), Statistics, Structure (hierarchy)
- Data, Structure (hierarchy)
- Clusters, Associations, Nuggets, Outliers
- Spatial, Temporal, Quality
- Quality, Uncertainty, Missing values
- Data quality, Abstraction quality, Anomalies
- Events, Trends, Hypotheses
- Clutter reduction
- Streaming
- Evidence
4. Examples of Projects
5. Multiresolution Visualization
- For large datasets, visualizations quickly get
cluttered
- We have extended all of our visualizations to
work at multiple resolutions
- Hierarchical clustering generates many levels of
detail
- User can select areas of interest to view at full
resolution while the rest of the data is shown
via cluster centers and extents (shown as bands
of variable opacity)
This work was funded by NSF grant IIS-9732897
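A minimal sketch of how one cluster might be summarized for display at a coarse resolution: its center (per-dimension mean) and its extents (per-dimension minimum and maximum). The function names and the linear opacity fall-off are assumptions made for illustration; the slides only state that extents are drawn as bands of variable opacity.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def summarize_cluster(
    points: List[Point],
) -> Tuple[List[float], List[float], List[float]]:
    """Return (center, low, high) per dimension for one cluster."""
    if not points:
        raise ValueError("cluster must contain at least one point")
    dims = range(len(points[0]))
    n = len(points)
    center = [sum(p[d] for p in points) / n for d in dims]
    low = [min(p[d] for p in points) for d in dims]
    high = [max(p[d] for p in points) for d in dims]
    return center, low, high


def band_opacity(value: float, center: float, low: float, high: float) -> float:
    """Assumed fall-off: opaque at the center, fading linearly to 0 at the extent."""
    if value < low or value > high:
        return 0.0
    edge = high if value >= center else low
    if edge == center:
        return 1.0
    return 1.0 - abs(value - center) / abs(edge - center)
```

Applying `summarize_cluster` at each level of a cluster hierarchy gives one center and one band per cluster, so the number of drawn elements depends on the chosen level rather than on the dataset size.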
6. Dimension Reduction
- Dimensions are hierarchically clustered based on
similarity measures
- Hierarchy displayed using InterRing
- Users select clusters of dimensions or
representative dimensions for detailed analysis
This work was funded by NSF grant IIS-0119276
[Figure: 42-dimension census dataset.]
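A sketch of hierarchical clustering applied to dimensions rather than records. The slides do not say which similarity measure is used; this example assumes absolute Pearson correlation and average linkage, and the names `pearson` and `cluster_dimensions` are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Tuple

Cluster = Tuple[str, ...]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either column is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def cluster_dimensions(
    columns: Dict[str, Sequence[float]],
) -> List[Tuple[Cluster, Cluster, float]]:
    """Agglomerative clustering of dimensions.

    Distance between two dimensions is assumed to be 1 - |r|.
    Returns the merges in order as (left, right, distance).
    """
    names = list(columns)
    dist = {
        frozenset((a, b)): 1.0 - abs(pearson(columns[a], columns[b]))
        for a, b in combinations(names, 2)
    }

    def linkage(c1: Cluster, c2: Cluster) -> float:
        # Average linkage: mean pairwise distance between members.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters: List[Cluster] = [(name,) for name in names]
    merges: List[Tuple[Cluster, Cluster, float]] = []
    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((left, right, linkage(left, right)))
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left + right)
    return merges
```

The merge list defines the dimension hierarchy. Picking one member from each cluster at a chosen level is one possible way to obtain representative dimensions.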
7. Linking Spatial and Non-Spatial
- Diagonal plots of scatterplot matrix can have
numerous uses
- We've implemented histograms, line plots, and 2-D
options
- Example shows multispectral remote sensing data, 1
layer per diagonal plot
- User can select in either 2-D or parameter space
and see corresponding elements in other views.
8. Layout Strategies
- Different layout strategies can reveal different
patterns in the data
- Detecting, classifying, and measuring trends,
outliers, repeated patterns, clusters, and
correlations can be facilitated via specific
layouts
[Figure panels: Cyclic, Data Driven, Principal Components, Order Driven]
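To make three of the named layouts concrete, here is a small sketch that computes 2-D positions for records (for example, glyphs). The exact formulas are assumptions chosen for illustration, not taken from the slides: a row-major grid for the order-driven layout, a spiral with one turn per period for the cyclic layout, and two normalized data dimensions for the data-driven layout.

```python
import math
from typing import List, Sequence, Tuple

Position = Tuple[float, float]


def order_driven(n: int, columns: int) -> List[Position]:
    """Place record i on a row-major grid with `columns` columns."""
    return [(float(i % columns), float(i // columns)) for i in range(n)]


def cyclic(n: int, period: int) -> List[Position]:
    """Spiral layout: records at the same phase of the cycle share an angle."""
    positions = []
    for i in range(n):
        angle = 2.0 * math.pi * (i % period) / period
        radius = 1.0 + i / period  # grows by 1 per completed cycle
        positions.append((radius * math.cos(angle), radius * math.sin(angle)))
    return positions


def data_driven(records: List[Sequence[float]], x_dim: int, y_dim: int) -> List[Position]:
    """Use two data dimensions, scaled to [0, 1], as the position."""

    def scale(v: float, lo: float, hi: float) -> float:
        return 0.5 if hi == lo else (v - lo) / (hi - lo)

    xs = [r[x_dim] for r in records]
    ys = [r[y_dim] for r in records]
    return [
        (scale(x, min(xs), max(xs)), scale(y, min(ys), max(ys)))
        for x, y in zip(xs, ys)
    ]
```

With monthly data and `period=12`, for instance, the cyclic layout puts every January along the same ray, which can make repeated seasonal patterns easier to see.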
9. Visualizing Data with Nominal Fields
- Arbitrary assignment of non-numeric fields to
numbers can lead to misinterpretation, lost
patterns
- By looking at similarities in distributions
across all dimensions, we can group values of a
nominal variable with similar global
characteristics
- Assignments used to convey order and relative
distance
[Figure panels: Original Assignment, Assignment after Correspondence Analysis]
This work was funded by NSF grant IIS-0119276 and
funds from the NSA
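A sketch of how numeric scores for nominal values can be derived from a contingency table. It computes the first axis of simple correspondence analysis by reciprocal averaging, a standard iterative method. This illustrates the general idea only and is not necessarily the procedure used in the project; the function name and the table layout are assumptions, and every row and column is assumed to have at least one count.

```python
from math import sqrt
from typing import Dict, List


def ca_first_axis(table: Dict[str, List[float]], iterations: int = 200) -> Dict[str, float]:
    """Score each nominal value on the first correspondence-analysis axis.

    `table` maps a nominal value to its counts across the categories
    of the other variables (one contingency-table row per value).
    """
    names = list(table)
    n_cols = len(table[names[0]])
    row_tot = {k: sum(table[k]) for k in names}
    col_tot = [sum(table[k][j] for k in names) for j in range(n_cols)]
    grand = sum(col_tot)

    col = [float(j) for j in range(n_cols)]  # arbitrary starting scores
    row: Dict[str, float] = {}
    for _ in range(iterations):
        # Row score = weighted average of column scores, and vice versa.
        row = {
            k: sum(table[k][j] * col[j] for j in range(n_cols)) / row_tot[k]
            for k in names
        }
        col = [
            sum(table[k][j] * row[k] for k in names) / col_tot[j]
            for j in range(n_cols)
        ]
        # Center and rescale to discard the trivial constant solution.
        mean = sum(c * w for c, w in zip(col, col_tot)) / grand
        col = [c - mean for c in col]
        norm = sqrt(sum(w * c * c for c, w in zip(col, col_tot)) / grand)
        if norm == 0:
            break
        col = [c / norm for c in col]
    return row
```

Values with similar count profiles receive nearby scores, so the scores convey both an order and relative distances. The sign of the axis is arbitrary.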
10. Visual Clutter Reduction
- In scenes with thousands of moving objects, there
is a need to reduce clutter
- We've explored and developed many strategies,
including
  - Information-preserving
  - Information-reducing
  - Visual remapping
This work was funded by a grant from the AFRL
11. Data Quality Visual Encoding
- Data quality refers to the degree of uncertainty
of data
- Quality measures are visually encoded into
existing visualizations
- This helps users focus on high-quality data to
draw reliable conclusions
This work was funded by NSF grant IIS-0414380
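One simple way to encode quality in an existing view is to map it to opacity, so that high-quality values stand out. This is a sketch under assumptions: quality is taken to be normalized to [0, 1], and the function name and the `min_alpha` floor (which keeps low-quality data faintly visible) are illustrative choices rather than the project's actual encoding.

```python
from typing import Tuple

RGB = Tuple[float, float, float]
RGBA = Tuple[float, float, float, float]


def quality_to_rgba(base_rgb: RGB, quality: float, min_alpha: float = 0.25) -> RGBA:
    """Return the base color with opacity increasing linearly with quality."""
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must be in [0, 1]")
    alpha = min_alpha + (1.0 - min_alpha) * quality
    r, g, b = base_rgb
    return (r, g, b, alpha)
```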
12. Quality Space Visualization
- Quality space is visualized separately to convey
patterns in the data quality measures
- Records or dimensions can be ordered by quality
to reveal structure and relations
- Stripe view shows individual data value quality.
Histogram view shows summarization and
distribution
[Figure panels: StripeQualityMap, HistogramQualityMap]
This work was funded by NSF grant IIS-0414380
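The two operations described above can be sketched as follows, assuming a quality matrix with one row per record, one column per dimension, and values in [0, 1]. Ordering by mean quality and the equal-width binning are assumptions; the function names are hypothetical.

```python
from typing import List


def order_records_by_quality(quality: List[List[float]]) -> List[int]:
    """Return record indices sorted from highest to lowest mean quality."""
    means = [sum(row) / len(row) for row in quality]
    return sorted(range(len(quality)), key=lambda i: means[i], reverse=True)


def quality_histogram(quality: List[List[float]], bins: int = 5) -> List[int]:
    """Count individual quality values in equal-width bins over [0, 1]."""
    counts = [0] * bins
    for row in quality:
        for q in row:
            index = min(int(q * bins), bins - 1)  # q == 1.0 goes in the last bin
            counts[index] += 1
    return counts
```

The ordered indices would drive the row order of a stripe-style view, while the bin counts summarize the same values for a histogram-style view.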
13. Interactions between Data Space and Quality Space
- Linking brush: When users select a subset in one
space, the corresponding subset in the other
space will be highlighted accordingly.
- Sample figures: The data points in the data space
with high values in the third dimension are
highlighted, then the distribution of quality
measures for this subset is rendered in the
quality map.
[Figure panels: Data space with highlighting; linked Quality space]
This work was funded by NSF grant IIS-0414380
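A minimal sketch of the linking brush, assuming the data space and the quality space are parallel lists indexed by record. A range brush on one dimension yields record indices, and the same indices pick out the matching rows in the quality space. The function names are hypothetical.

```python
from typing import List, Sequence


def brush(data: List[Sequence[float]], dim: int, low: float, high: float) -> List[int]:
    """Indices of records whose value in dimension `dim` lies in [low, high]."""
    return [i for i, record in enumerate(data) if low <= record[dim] <= high]


def linked_quality(quality: List[Sequence[float]], selected: List[int]) -> List[Sequence[float]]:
    """Quality rows for the records selected in the data space."""
    return [quality[i] for i in selected]
```

Brushing high values of the third dimension, as in the sample figures, would correspond to `brush(data, 2, threshold, max_value)`; the resulting rows can then be summarized in the quality map.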
14. Nugget Management System (NMS)
- Nuggets are patterns, clusters, anomalies or
other features of a data set that have been
visually or computationally isolated.
- NMS helps users to extract, consolidate and
manage nuggets during their visual exploration.
NMS eventually builds a hypothesis view based on
the nugget space to support or refute hypotheses
of users.
[Figure panels: Nugget Space, Hypothesis View]
15. Common Themes and Strategies
- Provide data and attributes in multiple, linked
spaces
- Use automated and interactive tools for
controlling and optimizing views
- Measure quality at all stages of the pipeline and
convey it to the user for decision support
- Assess quality measures by comparing them to user
responses
- Manage scale via abstractions such as sampling
and clustering, but communicate information loss
to the analyst to allow trade-offs
- Perform usability testing with all visualizations
and interactive tools
- Release code to the public domain for widest
possible impact
16. Some References
- Hierarchical Parallel Coordinates
  - Fua, Y.-H., Ward, M. O., and Rundensteiner, E. A.,
    "Hierarchical Parallel Coordinates for Visualizing Large
    Multivariate Data Sets", IEEE Conf. on Visualization '99,
    Oct. 1999.
- Hierarchical Dimension Management
  - Jing Yang, Matthew O. Ward, Elke A. Rundensteiner and
    Shiping Huang, "Visual Hierarchical Dimension Reduction for
    Exploration of High Dimensional Datasets", Proc. VisSym 2003.
  - Jing Yang, Wei Peng, Matthew O. Ward and Elke A.
    Rundensteiner, "Interactive Hierarchical Dimension Ordering,
    Spacing and Filtering for Exploration of High Dimensional
    Datasets", IEEE Symposium on Information Visualization 2003
    (InfoVis 2003), pp 105-112, October 2003.
- Visual Clutter Measurement and Reduction
  - Wei Peng, Matthew O. Ward and Elke A. Rundensteiner,
    "Clutter Reduction in Multi-Dimensional Data Visualization
    Using Dimension Reordering", IEEE Symposium on Information
    Visualization 2004 (InfoVis 2004), pp 89-96, October 2004.
- Glyph Layout
  - Matthew O. Ward, "A taxonomy of glyph placement strategies
    for multidimensional data visualization", Information
    Visualization, Vol 1, pp 194-210, 2002.
- Nominal Data Visualization
  - Geraldine E. Rosario, Elke A. Rundensteiner, David C. Brown,
    Matthew O. Ward and Shiping Huang, "Mapping Nominal Values to
    Numbers for Effective Visualization", Information
    Visualization Journal, Vol 3, pp 80-95, 2004.
- Data Quality Visualization
  - Z. Xie, S. Huang, M. Ward, and E. Rundensteiner,
    "Exploratory Visualization of Multivariate Data with Variable
    Quality", Proc. IEEE Symposium on Visual Analytics Science
    and Technology, pp 183-190, 2006.
  - Zaixian Xie, Matthew O. Ward, Elke A. Rundensteiner, Shiping
    Huang, "Integrating Data and Quality Space Interactions in
    Exploratory Visualizations", The Fifth International
    Conference on Coordinated Multiple Views in Exploratory
    Visualization (CMV 2007), pp 47-60, July 2007.
- Discovery Management
  - Di Yang, Elke A. Rundensteiner, Matthew O. Ward, "Nugget
    Discovery in Visual Exploration Environments by Query
    Consolidation", ACM CIKM 2007, November 2007.
  - Di Yang, Elke A. Rundensteiner, Matthew O. Ward, "Analysis
    Guided Visual Exploration of Multivariate Data", IEEE
    Symposium on Visual Analytics Science and Technology,
    October 2007.