Polaris: Query, Analysis, and Visualization of Large Hierarchical Relational Databases

Transcript and Presenter's Notes

1
Polaris: Query, Analysis, and Visualization of
Large Hierarchical Relational Databases
  • Pat Hanrahan
  • With Chris Stolte and Diane Tang
  • Computer Science Department
  • Stanford University

2
Motivation
  • Large databases have become very common
  • Corporate data warehouses
  • Amazon, Walmart, ...
  • Scientific projects
  • Human Genome Project
  • Sloan Digital Sky Survey
  • Need tools to extract meaning from these databases

3
Related Work
  • Formalisms for graphics
  • Bertin's Semiology of Graphics
  • Mackinlay's APT
  • Roth et al.'s Sage and SageBrush
  • Wilkinson's Grammar of Graphics
  • Visual exploration of databases
  • DeVise
  • DataSplash/Tioga-2
  • Visualization and data mining
  • SGI's MineSet
  • IBM's Diamond

4
Formalism
5
Polaris Formalism
  • UI interpreted as visual specification that
    defines
  • Table configuration
  • Type of graphic in each pane
  • Encoding of data as visual properties of marks
  • Data transformations and queries

6
Schema
Coffee chain data (Visual Insights)
  • Ordinal fields (categorical): Market, State, Year, Quarter, Month,
    Product Type, Product
  • Quantitative fields (measures): Profit, Sales, Payroll, Marketing,
    Inventory, Margin, COGS, ...
7
Polaris Visual Encodings
Principle of Importance Ordering: Encode the most
important information in the most effective way
Cleveland & McGill
8
The Pivot Table Interface
  • Common interface to statistical packages/Excel
  • Cross-tabulations
  • Simple interface based on drag-and-drop

9
(No Transcript)
10
Data Cubes
  • Structure relation as n-dimensional cube

Each cell aggregates all measures for those
dimensions
Each cube axis corresponds to a dimension in the
relation
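
The cube structure described on this slide can be sketched in a few lines of Python. This is only an illustration, not Polaris code: the rows are invented, the helper name `build_cube` is hypothetical, and summation is assumed as the aggregate.

```python
from collections import defaultdict

# Hypothetical rows of the relation. Field names follow the coffee chain
# schema; the values are invented for illustration.
rows = [
    {"State": "CA", "Quarter": "Qtr1", "ProductType": "Coffee", "Profit": 10, "Sales": 40},
    {"State": "CA", "Quarter": "Qtr1", "ProductType": "Coffee", "Profit": 5, "Sales": 20},
    {"State": "CA", "Quarter": "Qtr2", "ProductType": "Tea", "Profit": 7, "Sales": 30},
    {"State": "NY", "Quarter": "Qtr1", "ProductType": "Tea", "Profit": 3, "Sales": 15},
]

def build_cube(rows, dimensions, measures):
    """Sum every measure into the cell addressed by the dimension values."""
    cube = defaultdict(lambda: {m: 0 for m in measures})
    for row in rows:
        cell = tuple(row[d] for d in dimensions)  # one coordinate per cube axis
        for m in measures:
            cube[cell][m] += row[m]
    return dict(cube)

cube = build_cube(rows, ["State", "Quarter", "ProductType"], ["Profit", "Sales"])
print(cube[("CA", "Qtr1", "Coffee")])
```

Each key of `cube` is one cell, and the two CA/Qtr1/Coffee rows collapse into a single aggregated entry.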
11
Table Algebra Operands
  • Ordinal fields: interpret domain as a set that
    partitions table into rows and columns
  • Quarter = {(Qtr1), (Qtr2), (Qtr3), (Qtr4)}
  • Quantitative fields: treat domain as single-element
    set and encode spatially as axes
  • Profit = {(Profit)}
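
As a rough sketch of the two operand kinds, the Python below models each operand as an ordered list of tuples. The function names `ordinal` and `quantitative` and the list-of-tuples representation are assumptions made for illustration, not the Polaris implementation.

```python
# Hypothetical domain for the ordinal field Quarter.
QUARTER_DOMAIN = ["Qtr1", "Qtr2", "Qtr3", "Qtr4"]

def ordinal(domain):
    """An ordinal field yields one single-element tuple per domain value."""
    return [(value,) for value in domain]

def quantitative(name):
    """A quantitative field yields a single tuple holding the field name."""
    return [(name,)]

quarter = ordinal(QUARTER_DOMAIN)   # four entries: four rows or columns
profit = quantitative("Profit")     # one entry: a single axis
print(quarter)
print(profit)
```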

12
Concatenation (+) Operator
  • Ordered union of two sets
  • Quarter + ProductType
  • = {(Qtr1), (Qtr2), (Qtr3), (Qtr4)} + {(Coffee), (Espresso)}
  • = {(Qtr1), (Qtr2), (Qtr3), (Qtr4), (Coffee), (Espresso)}
  • Profit + Sales
  • = {(Profit), (Sales)}
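
The ordered union above might be sketched as follows, again treating operands as ordered lists of tuples. The helper name `concat` is hypothetical, and dropping duplicate entries from the right-hand operand is an assumption about what "union" implies here.

```python
def concat(a, b):
    """Ordered union: entries of a in order, then entries of b not already in a."""
    return a + [entry for entry in b if entry not in a]

quarter = [("Qtr1",), ("Qtr2",), ("Qtr3",), ("Qtr4",)]
product_type = [("Coffee",), ("Espresso",)]

print(concat(quarter, product_type))
print(concat([("Profit",)], [("Sales",)]))
```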

13
Cross (×) Operator
  • Direct product of two sets
  • Quarter × ProductType
  • = {(Qtr1, Coffee), (Qtr1, Tea), (Qtr2, Coffee),
    (Qtr2, Tea), (Qtr3, Coffee), (Qtr3, Tea),
    (Qtr4, Coffee), (Qtr4, Tea)}
  • ProductType × Profit
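
A minimal sketch of the cross operator, assuming the same list-of-tuples representation of operands. The helper name `cross` is hypothetical; the second result is simply what this definition produces for the slide's last expression, not a value quoted from the slides.

```python
from itertools import product

def cross(a, b):
    """Direct product: pair every entry of a with every entry of b, in order."""
    return [x + y for x, y in product(a, b)]

quarter = [("Qtr1",), ("Qtr2",), ("Qtr3",), ("Qtr4",)]
product_type = [("Coffee",), ("Tea",)]
profit = [("Profit",)]

print(cross(quarter, product_type))
print(cross(product_type, profit))
```

Crossing an ordinal field with a quantitative one keeps the ordinal partitioning and attaches the same axis to every partition.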

14
(No Transcript)
15
(No Transcript)
16
(No Transcript)
17
(No Transcript)
18
SQL Dataflow
  • Notes
  • Aggregation operators applied after sort
  • Only one layer is shown; additional z-sort

(Diagram labels: Sort, Relational Table, Tuples in Panes, Marks in Panes)
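
One way to picture the flow from a relational table to tuples in panes to marks in panes is the Python sketch below. It is an assumption-laden illustration rather than the Polaris pipeline: the rows are invented, the function names are hypothetical, and a single pane field and a sum aggregate are assumed.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

# Hypothetical relational table (invented values).
rows = [
    {"Quarter": "Qtr1", "State": "NY", "Profit": 3},
    {"Quarter": "Qtr1", "State": "CA", "Profit": 10},
    {"Quarter": "Qtr1", "State": "CA", "Profit": 5},
    {"Quarter": "Qtr2", "State": "CA", "Profit": 7},
]

def tuples_in_panes(rows, pane_field):
    """Partition the table so that each pane receives its own tuples."""
    panes = defaultdict(list)
    for row in rows:
        panes[row[pane_field]].append(row)
    return dict(panes)

def marks_in_pane(tuples, group_field, measure):
    """Sort the pane's tuples, then aggregate each sorted group into one mark."""
    ordered = sorted(tuples, key=itemgetter(group_field))
    return [
        (key, sum(t[measure] for t in group))
        for key, group in groupby(ordered, key=itemgetter(group_field))
    ]

panes = tuples_in_panes(rows, "Quarter")
marks = {pane: marks_in_pane(tuples, "State", "Profit") for pane, tuples in panes.items()}
print(marks)
```

`itertools.groupby` only merges adjacent equal keys, so the aggregation here genuinely depends on the sort having run first, which mirrors the first note on the slide.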
19
Multiscale Visualization
20
Hierarchical Structure
  • Challenge: these databases are very large
  • Queries/Vis should not require all the records
  • Augment database with hierarchical structure
  • Provide meaningful levels of abstraction
  • Derived from domain or clustering
  • Provides metadata (missing data for context)

21
Hierarchies and Data Cubes
  • Each dimension in the cube is structured as a
    tree
  • Each level in tree corresponds to level of detail
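
To make the tree-per-dimension idea concrete, here is a small Python sketch that rolls a measure up the Time hierarchy (Year, Quarter, Month). The data values and the helper name `roll_up` are hypothetical, and summation is assumed as the aggregate.

```python
from collections import defaultdict

# Hypothetical leaf-level data: (Year, Quarter, Month) path -> Profit.
leaf_profit = {
    ("1999", "Qtr1", "Jan"): 4,
    ("1999", "Qtr1", "Feb"): 6,
    ("1999", "Qtr2", "Apr"): 5,
    ("2000", "Qtr1", "Jan"): 8,
}

def roll_up(leaves, depth):
    """Aggregate leaf values to tree level `depth` (1 = Year, 2 = Quarter, 3 = Month)."""
    totals = defaultdict(int)
    for path, value in leaves.items():
        totals[path[:depth]] += value  # a path prefix identifies the ancestor node
    return dict(totals)

print(roll_up(leaf_profit, 1))
print(roll_up(leaf_profit, 2))
```

Choosing `depth` selects the level of detail: a shallower level gives fewer, coarser cells along that cube axis.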

22
Schema: Star Schema
  • Fact table: State, Month, Product, Profit, Sales, Payroll, Marketing,
    Inventory, Margin, ...
  • Measures: Profit, Sales, Payroll, Marketing, Inventory, Margin, ...
  • Existence tables:
  • Location: Market, State
  • Time: Year, Quarter, Month
  • Products: Product Type, Product Name
  • Generalizations
  • Snowflake schemas
  • Lattices (DAGs)
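
A star schema of this shape can be tried out with Python's built-in `sqlite3` module. The sketch below is illustrative only: the table names `fact` and `time_dim`, the sample rows, and the simplification of using Month alone as the key (a single year of data) are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table for Time: one row per month with its ancestors.
    CREATE TABLE time_dim (Month TEXT PRIMARY KEY, Quarter TEXT, Year INTEGER);
    -- Fact table: leaf-level keys plus measures (only two measures shown).
    CREATE TABLE fact (State TEXT, Month TEXT, Product TEXT, Profit INTEGER, Sales INTEGER);
    INSERT INTO time_dim VALUES ('Jan', 'Qtr1', 1999), ('Feb', 'Qtr1', 1999), ('Apr', 'Qtr2', 1999);
    INSERT INTO fact VALUES
        ('CA', 'Jan', 'Decaf', 4, 20),
        ('CA', 'Feb', 'Decaf', 6, 25),
        ('NY', 'Apr', 'Mint', 5, 30);
""")

# Join the fact table to the Time dimension and aggregate at the Quarter level.
profit_by_quarter = conn.execute("""
    SELECT t.Quarter, SUM(f.Profit)
    FROM fact AS f JOIN time_dim AS t ON f.Month = t.Month
    GROUP BY t.Quarter
    ORDER BY t.Quarter
""").fetchall()
print(profit_by_quarter)
```

The fact table stores only the finest level of each dimension; coarser levels such as Quarter come from the join.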

23
Categorical Hierarchies
  • Quarter × Month
  • Direct product of two sets
  • Would create twelve entries for each quarter,
    e.g. (Qtr1, December)
  • Quarter / Month
  • Based on tuples in database, not semantics
  • Would only create three entries per quarter
  • Can be expensive to compute
  • Quarter . Month
  • Based on tuples in existence tables (not db)
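
The difference between the three operators can be sketched in Python as follows. Everything here is an illustrative assumption: the function names `cross`, `nest`, and `dot`, the tiny sparse fact table, and the in-memory existence table.

```python
from itertools import product

QUARTERS = ["Qtr1", "Qtr2", "Qtr3", "Qtr4"]
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]

# Hypothetical existence table for the Time dimension: the (Quarter, Month)
# pairs that are meaningful, independent of which facts have been recorded.
EXISTENCE = [(QUARTERS[i // 3], month) for i, month in enumerate(MONTHS)]

# Hypothetical (Quarter, Month) columns of a sparse fact table.
FACT_ROWS = [("Qtr1", "January"), ("Qtr1", "January"), ("Qtr1", "March"), ("Qtr2", "April")]

def cross(a, b):
    """Quarter × Month: every combination, meaningful or not."""
    return list(product(a, b))

def nest(fact_rows):
    """Quarter / Month: distinct pairs found by scanning every fact tuple."""
    return list(dict.fromkeys(fact_rows))

def dot(existence):
    """Quarter . Month: pairs read from the existence table, no fact scan."""
    return list(existence)

print(len(cross(QUARTERS, MONTHS)), len(nest(FACT_ROWS)), len(dot(EXISTENCE)))
```

With this sparse sample, `nest` returns only the pairs that happen to appear in the data, while `dot` always returns three months per quarter. With complete data the two agree, but `nest` still pays for a scan of the fact table.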

24
Cartographic Generalization
Canterbury and East Kent
1:50,000
1:625,000
25
Generalization: Techniques
  • Selection
  • Simplification
  • Exaggeration
  • Regularization
  • Displacement
  • Aggregation

26
(No Transcript)
27
(No Transcript)
28
(No Transcript)
29
(No Transcript)
30
Summary
  • Polaris
  • Spreadsheet or table-based displays
  • Simple drag-and-drop interface
  • Built on a formalism that allows algebraic
    manipulation of visual mapping of tuples to marks
  • Multiscale visualizations using data and visual
    abstraction
  • Connects to SQL/MDX servers
  • See http://www.graphics.stanford.edu/projects/polaris

31
Future Work
  • Articulate full set of multiscale design patterns
  • Transition between levels of detail
  • Develop system infrastructure for browsing VLDB
  • Support layers/lenses/linking with tuple flow
  • Device independence through graphical encodings
  • Extend formalism to 3D
  • Couple scientific and information visualization