Polaris: Query, Analysis, and Visualization of Large Hierarchical Relational Databases

Slides: 64
Provided by: christoph133
Transcript and Presenter's Notes

1
Polaris: Query, Analysis, and Visualization of
Large Hierarchical Relational Databases
  • Chris Stolte
  • Computer Science Department
  • Stanford University

2
Motivation
  • Large relational databases have become very
    common
  • Corporate data warehouses
  • Amazon, Walmart, ...
  • Scientific projects
  • Human Genome Project
  • Sloan Digital Sky Survey
  • Need tools to extract meaning from these
    databases
  • Programmatic data mining/statistical analysis
  • Visual exploration and analysis

3
Related Work
  • Formalisms for graphics
  • Bertin's Semiology of Graphics
  • Mackinlay's APT
  • Roth et al.'s Sage and SageBrush
  • Wilkinson's Grammar of Graphics
  • Visual exploration of databases
  • DeVise
  • DataSplash/Tioga-2
  • Visualization and data mining
  • SGI's MineSet
  • IBM's Diamond

4
Outline
  • Review of Data Warehouses and Data Cubes
  • A Visualization Formalism
  • Polaris: Visual Data Mining
  • Multiscale Visualization

5
Review of Data Warehouses and Data Cubes
6
Review: Data Warehouses
  • Data warehouse stores data for analysis
  • measures (facts) categorized by dimensions

Fact table: Coffee chain (courtesy Visual Insights)
Nominal / Ordinal fields (categorical dimensions): State, Month, Product Name
Quantitative fields (measures): Profit, Sales, Payroll, Marketing, Inventory, Margin, ...
7
Hierarchies
  • Data warehouses are very large: need to summarize
  • Add hierarchical structure to warehouse

Dimension tables
Time: Year, Quarter, Month
Location: Market, State
Products: Product Type, Product Name
Fact table
State, Month, Product Name, Profit, Sales, Payroll, Marketing, Inventory, Margin, ...
8
Hierarchical Dimensions
Time: Year, Quarter, Month
9
Data Cube
  • For each level-of-detail, summarize relations as
    cubes
  • More efficient, powerful model for analysis

Each cell aggregates all measures for those
dimensions
Each cube axis corresponds to a dimension in the
relation at a level-of-detail
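As a rough illustration of the cube idea on this slide, the sketch below rolls a tiny fact table up to the (State, Quarter, Product Name) level-of-detail. The rows, the month-to-quarter mapping, and the helper name `build_cube` are all hypothetical; the point is only that each cell sums every measure for one combination of dimension values.

```python
from collections import defaultdict

# Hypothetical fact rows: (state, month, product_name, profit, sales).
FACTS = [
    ("CA", "Jan", "Decaf", 10.0, 40.0),
    ("CA", "Feb", "Decaf", 12.0, 45.0),
    ("CA", "Jan", "Mint Tea", 5.0, 20.0),
    ("NY", "Jan", "Decaf", 7.0, 30.0),
]

# Hypothetical slice of the Time dimension table: month -> quarter.
MONTH_TO_QUARTER = {"Jan": "Qtr1", "Feb": "Qtr1", "Mar": "Qtr1"}


def build_cube(facts, key_fn):
    """Sum every measure into the cell selected by key_fn(state, month, product)."""
    cube = defaultdict(lambda: {"profit": 0.0, "sales": 0.0})
    for state, month, product, profit, sales in facts:
        cell = cube[key_fn(state, month, product)]
        cell["profit"] += profit
        cell["sales"] += sales
    return dict(cube)


# One cube axis per dimension, each at a chosen level-of-detail.
cube = build_cube(FACTS, lambda s, m, p: (s, MONTH_TO_QUARTER[m], p))
```

Changing only the key function (for example, keeping the month instead of mapping it to a quarter) yields a different cube over the same facts.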
10
Hierarchies and Data Cubes
Hierarchies define a lattice of cubes
Each cube is defined by a level-of-detail in each
dimension.
[Figure: lattice of cubes ordered by data abstraction, from most detailed to least detailed]
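One way to picture the lattice is to enumerate every combination of levels across the three dimension hierarchies from the earlier slides. This is a minimal sketch, not the system's data structure; the dictionary layout and the helper names `lattice` and `is_at_least_as_abstract` are assumptions made for illustration.

```python
from itertools import product

# Levels of each dimension, listed from most detailed to least detailed.
HIERARCHIES = {
    "Time": ["Month", "Quarter", "Year"],
    "Location": ["State", "Market"],
    "Products": ["Product Name", "Product Type"],
}


def lattice(hierarchies):
    """Return every cube as a dict mapping dimension -> chosen level."""
    names = list(hierarchies)
    level_lists = [hierarchies[name] for name in names]
    return [dict(zip(names, levels)) for levels in product(*level_lists)]


def is_at_least_as_abstract(a, b, hierarchies):
    """True if cube a is as coarse as, or coarser than, cube b in every dimension."""
    return all(
        hierarchies[d].index(a[d]) >= hierarchies[d].index(b[d])
        for d in hierarchies
    )


cubes = lattice(HIERARCHIES)
most_detailed = cubes[0]
least_detailed = cubes[-1]
```

The comparison function gives only a partial order: a cube that is coarser in Time but finer in Location is neither above nor below its counterpart, which is why the structure is a lattice rather than a chain.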
11
Projecting Data Cubes
Can further abstract a cube by projection
Data abstraction
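Projection can be sketched as summing out the axes that are not needed. The cube contents and the helper name `project` below are hypothetical, and a single summed measure (Profit) is assumed for brevity.

```python
from collections import defaultdict


def project(cube, keep):
    """Sum out every axis whose position is not listed in keep."""
    out = defaultdict(float)
    for key, value in cube.items():
        out[tuple(key[i] for i in keep)] += value
    return dict(out)


# Hypothetical cube over (Quarter, Market, Product Type) holding Profit.
cube = {
    ("Qtr1", "East", "Coffee"): 10.0,
    ("Qtr1", "West", "Coffee"): 4.0,
    ("Qtr1", "East", "Tea"): 3.0,
    ("Qtr2", "East", "Coffee"): 6.0,
}

# Keep only the Quarter axis; Market and Product Type are aggregated away.
by_quarter = project(cube, keep=[0])
```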
12
Data Warehouse Summary
  • Industry standard for storing analytic data
  • Not operational or transactional data
  • Structured as a lattice of data cubes
    (aggregations)
  • Provide summaries of data at meaningful levels of
    detail
  • To perform data abstraction
  • Choose a cube in the lattice of cubes
  • Project to relevant dimensions
  • Where a lot of important data is stored

13
A Visualization Formalism
14
A Visualization Formalism
  • Typical approach
  • Monolithic objects defining a single visual
    metaphor
  • Formalism
  • Defines a space of visualization and unifies
    tables, different graphs as a class of visual
    representation
  • Succinct specification of sophisticated
    visualizations
  • Can be compiled into necessary drawing operations
    and database queries
  • Exposes structure and pattern of effective visual
    metaphors
  • Powerful tool for describing, comparing, and
    building visualizations

20
Polaris Formalism
  • Visualizations described using visual
    specifications that define
  • Table configuration for visualization (algebra)
  • Type of graphic in each pane
  • Encoding of data as visual properties (color,
    size, shape, ...) of marks
  • Data transformations and queries
  • Interpreter compiles a specification into drawing
    commands and database queries

21
Polaris Algebra: Operands
  • Ordinal fields interpret domain as a set that
    partitions table into rows and columns
  • Quarter = {(Qtr1), (Qtr2), (Qtr3), (Qtr4)}
  • Quantitative fields treat domain as single-element
    set and encode spatially as axes
  • Profit = {(Profit)}

22
Concatenation (+) Operator
  • Ordered union of two sets
  • Quarter + ProductType
  • = {(Qtr1), (Qtr2), (Qtr3), (Qtr4)} + {(Coffee), (Espresso)}
  • = {(Qtr1), (Qtr2), (Qtr3), (Qtr4), (Coffee), (Espresso)}
  • Profit + Sales
  • = {(Profit), (Sales)}
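The operands and the concatenation operator can be mimicked with ordered lists of 1-tuples. This is only a sketch: the list-of-tuples representation and the function name `concat` are assumptions, and the case of overlapping operands is not considered.

```python
# Ordinal field: one entry per domain value.
quarter = [("Qtr1",), ("Qtr2",), ("Qtr3",), ("Qtr4",)]
product_type = [("Coffee",), ("Espresso",)]

# Quantitative fields: a single-element set naming the field itself.
profit = [("Profit",)]
sales = [("Sales",)]


def concat(a, b):
    """Ordered union: the entries of a followed by the entries of b."""
    return a + b


rows = concat(quarter, product_type)
axes = concat(profit, sales)
```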

23
Cross (×) Operator
  • Direct product of two sets
  • Quarter × ProductType
  • = {(Qtr1, Coffee), (Qtr1, Tea), (Qtr2, Coffee), (Qtr2, Tea),
    (Qtr3, Coffee), (Qtr3, Tea), (Qtr4, Coffee), (Qtr4, Tea)}
  • ProductType × Profit
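The cross operator behaves like a Cartesian product whose result keeps the left operand as the slower-varying part. A minimal sketch, again assuming operands are ordered lists of tuples and using the hypothetical function name `cross`:

```python
from itertools import product

quarter = [("Qtr1",), ("Qtr2",), ("Qtr3",), ("Qtr4",)]
product_type = [("Coffee",), ("Tea",)]
profit = [("Profit",)]


def cross(a, b):
    """Direct product: each entry of a joined with each entry of b, in order."""
    return [x + y for x, y in product(a, b)]


quarter_by_type = cross(quarter, product_type)
type_by_profit = cross(product_type, profit)
```

Crossing an ordinal field with a quantitative one, as in the last line, yields one (ProductType, Profit) entry per product type, which can be read as one Profit axis per product type.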

24
Categorical Hierarchies
  • Quarter × Month
  • Direct product of two sets
  • Would create twelve entries for each quarter,
    e.g. (Qtr1, December)
  • Quarter / Month
  • Based on tuples in the fact table, not semantics
  • Would only create three entries per quarter
  • Can be expensive to compute
  • Quarter . Month
  • Based on tuples in dimension tables
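The difference between the three ways of combining Quarter and Month can be seen on toy data. The dimension table, the fact rows, and the helper names below are hypothetical. The fact table here deliberately covers only three months, to show that the fact-based operator reflects whatever tuples happen to exist, while the dimension-based one always follows the hierarchy.

```python
from itertools import product

QUARTERS = ["Qtr1", "Qtr2", "Qtr3", "Qtr4"]
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

# Hypothetical Time dimension table: each month paired with its quarter.
DIM_TIME = [(QUARTERS[i // 3], month) for i, month in enumerate(MONTHS)]

# Hypothetical fact table, reduced to its (quarter, month) columns.
FACT_ROWS = [("Qtr1", "January"), ("Qtr1", "January"),
             ("Qtr1", "March"), ("Qtr2", "April")]


def distinct_pairs(rows):
    """Distinct (quarter, month) pairs in first-seen order."""
    return list(dict.fromkeys(rows))


quarter_cross_month = list(product(QUARTERS, MONTHS))  # every combination
quarter_nest_month = distinct_pairs(FACT_ROWS)         # pairs seen in the facts
quarter_dot_month = distinct_pairs(DIM_TIME)           # pairs in the hierarchy
```

Computing the fact-based result requires scanning the fact table, which is the expensive step the slide refers to; the dimension table is far smaller.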

25
Encoding System
<color encoding><measure name="Profit"></color encoding>
26
SQL Dataflow
[Figure: dataflow with stages labeled Sort, Query Results, Tuples in Panes, Marks in Panes]
  • Notes
  • Aggregation operators applied after sort
  • Only one layer is shown; additional z-sort
27
Polaris: Visual Data Mining
28
The Pivot Table Interface
  • Common interface to statistical packages/Excel
  • Cross-tabulations
  • Simple interface based on drag-and-drop

29
Extending the Pivot Table Interface
  • Extend the interface by
  • Generating rich table-based graphical displays
    rather than tables of text
  • Providing a single conceptual model for both
    graphs and tables
  • Preserving the ability to rapidly construct
    displays

30
Polaris Design Goals
  • Design guided by two primary goals
  • Interactive analysis and exploration versus
    static visualization
  • Simple, consistent interface

31
Analysis and Exploration Challenges
  • Designing a user interface for analysis and
    exploration imposes several requirements
  • Data-dense displays: display both many tuples and
    many dimensions
  • Multiple display types: different displays suited
    to different tasks
  • Exploratory interfaces: rapidly change data
    transformations and views

32
Simple, Consistent Interface
  • Excel Pivot tables provide a simple interface for
    building text-based tables
  • Graphs require multiple steps and different
    interfaces and conceptual models
  • Want to unify tables, graphs, and database
    queries in one interface

33
Polaris Demo!
34
Data Mining and Visualization
  • Polaris not solely for visual analysis
  • Precursor to algorithmic analysis
  • Validate results and establish trust
  • Incorporate decision trees and classification
    algorithms into data warehouses as hierarchies

35
Multiscale Visualization
36
Multiscale Visualization
  • Directly support analysis process
  • Overview first, zoom and filter, then
    details-on-demand
  • Visual representation changes as user pans and
    zooms
  • Overview, lots of data → data highly abstracted
  • Zoom, data density decreases → more detailed
    information shown
  • Visual and data abstraction
  • Visual abstraction: different representation/same
    data
  • Data abstraction: transformations to reduce data
    set size

37
Existing Multiscale Visualizations
  • Cartography
  • Multiscale information visualization
  • Pad: alternate desktops
  • DataSplash
  • XmdvTool
  • ADVIZOR
  • Main limitations
  • One zoom path
  • Primarily visual abstraction

38
Contributions
  • Multiscale visualization with both visual and
    data abstraction using generalized mechanisms
  • Data Abstraction → Data Cubes
  • Visual Abstraction → Polaris
  • Design Patterns

39
Path of Exploration
  • Can think of an analysis as a path of specifications

40
Path of Exploration
Visual abstraction
41
Path of Exploration
This is a multiscale visualization!
Data abstraction
42
Graphical Notation
43
Graphical Notation: Templates
Instance
Template
44
Specifying Multiscale Visualizations
  • Specify multiscale visualization using a graph of
    Polaris specifications
  • zoom graphs
  • Infovis 2002 paper describes how to implement

[Figure: zoom graph. Each node is a Polaris specification; each edge is a possible zoom]
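A zoom graph can be held as a plain adjacency structure: nodes are specifications and edges are the zooms a user may take between them. The specification fields (`level`, `mark`), the node names, and the helper functions below are hypothetical stand-ins, not the notation from the Infovis 2002 paper.

```python
from collections import deque

# Hypothetical specifications: each node fixes a data abstraction (level)
# and a visual abstraction (mark type).
SPECS = {
    "overview": {"level": "Year", "mark": "bar"},
    "mid": {"level": "Quarter", "mark": "bar"},
    "detail": {"level": "Month", "mark": "line"},
}

# Edges: possible zooms between specifications.
ZOOMS = {
    "overview": ["mid"],
    "mid": ["overview", "detail"],
    "detail": ["mid"],
}


def can_zoom(src, dst):
    """True if the graph has a direct zoom from src to dst."""
    return dst in ZOOMS.get(src, [])


def reachable(start):
    """All specifications reachable from start by following zooms."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in ZOOMS.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

In this toy graph the zoom from "mid" to "detail" changes both the level and the mark, so one edge carries both data and visual abstraction.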
45
Specifying Multiscale Visualizations
  • Can specify a zooming pattern by using templates

46
Specifying Multiscale Visualizations
  • Independent zooming on different dimensions is
    described as a graph

y-axis zoom
x-axis zoom
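When the x and y axes zoom independently, the zoom graph becomes a grid: each node pairs an x-axis level with a y-axis level, and each edge changes exactly one axis by one step. The level lists and the helper name `neighbors` in this sketch are assumptions for illustration.

```python
from itertools import product

# Hypothetical levels per axis, from least to most detailed.
X_LEVELS = ["Year", "Quarter", "Month"]
Y_LEVELS = ["Market", "State"]

# A node is a pair (index into X_LEVELS, index into Y_LEVELS).
nodes = list(product(range(len(X_LEVELS)), range(len(Y_LEVELS))))


def neighbors(node):
    """Nodes reachable by zooming one step on exactly one axis."""
    x, y = node
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [
        (cx, cy) for cx, cy in candidates
        if 0 <= cx < len(X_LEVELS) and 0 <= cy < len(Y_LEVELS)
    ]
```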
47
Design Patterns
48
Design Patterns
  • Zoom graphs simplify specifying and implementing
    multiscale visualizations
  • Design is still very hard
  • Design patterns (a la Gamma et al.)
  • Capture zoom structures that have been used
    effectively and reuse them in new designs
  • We present four such patterns
  • Formal way to discuss multiscale visualization

49
Thematic Maps
50
Thematic Maps
51
Thematic Maps
52
Thematic Maps
53
Chart Stacks
54
Chart Stacks
55
Chart Stacks
56
Chart Stacks
57
Matrices
58
Matrices
59
Matrices
60
Matrices
61
Dependent QQ Plots
62
Summary
  • Polaris
  • Spreadsheet for table-based displays
  • Simple drag-and-drop interface
  • Built on a formalism that allows algebraic
    manipulation of visual mapping of tuples to marks
  • Multiscale visualizations using data and visual
    abstraction
  • Connects to SQL/MDX servers
  • See http://www.graphics.stanford.edu/projects/polaris

63
Future Work
  • Articulate full set of multiscale design patterns
  • Transition between levels of detail
  • Develop system infrastructure for browsing VLDB
  • Support layers/lenses/linking with tuple flow
  • Device independence through graphical encodings
  • Extend formalism to 3D
  • Couple scientific and information visualization