Title: Visual Thinking
1 Visual Thinking: Thinking about Visualization
- William Ribarsky
- Charlotte Visualization Center
- SouthEast RVAC
2 Visual Reasoning
Visual Thinking
3 Foraging, Analysis, Reasoning, and
Decision-Making for Large Data and Complex
Problems
- Objective: Develop capabilities for collecting
evidence from large and multiple data sources,
with multiple analysis tools. Build hypotheses
and use them to steer data collection. Methods must
be automated but subject to user control. Integrate
all for presentation or decision.
- DHS Mission Impact: New means of support for
intelligence analysts, disaster prevention
planners, and emergency responders.
Sensemaking Loop
STAB
RESIN
Foraging/Analysis Loop
4 RESIN: Foraging, Analysis, and Reasoning
- Problem: How to reason towards a decision with
massive data, several visual analytics tools, and
limited time and other resources?
- Solution
- 1. Build an end-to-end process for reasoning
towards a decision with limited time and
resources.
- 2. Provide a mixed-initiative capability so that
computer and user can work together, but always
under user control.
- 3. Provide a capability for reasoning about
complex problems with several aspects.
STAB
5 RESIN
A runtime view of RESIN's control panel while
solving a task with a tight deadline. The image
browser tool is chosen to analyze the data. On
the bottom left is a hierarchical description of
the problem-solving process, which also captures
the real-time execution information of various
sub-tasks. On the top right is a partial view of
the Markov Decision Process used to compute the
decision policy for the task instance.
6 Multimedia: Automated Video Content Analysis
7 Multimedia: Automated Video Semantic Analysis
- News Interestingness Prediction
8 Multimedia: Video Semantic Analysis
- News Theme Network Visualization
9 Multimedia: News Broadcast Analysis
- Problem: How should we handle the stream of
thousands of stories and themes from many sources
over time?
"Ultimately, you gotta read (view) the stories."
-- John Stasko
- Solution
- 1. Develop LensRiver and EventRiver capabilities.
- 2. Develop highly interactive ways to explore
themes and sub-themes, their interlinkages, and
stories over time.
LensRiver hierarchical display
10 Multimedia: News Broadcast Analysis
- Deep Exploration Reasoning Capabilities
- Hierarchical exploration
- Filter by theme (also, broaden/narrow)
- Shoebox
- Search by Example
EventRiver
11 Multimedia: News Broadcast Analysis
Emerging events
- Comparing themes and sub-themes for different
channels
Karr
plot
Jonbenet
Ramsey
Emerging Themes
CNN (above) and Fox top 30 themes from 8/1 to
8/24/2006
12 With These Tools, There Is Much That Can Be
Done (Some of Which Is Underway Already)
- All the news. High-value content from official
and semi-official news sources at all levels.
- Identification and tracking of events and themes.
- High-quality knowledge structures over time.
- Analysis of different viewpoints, different
opinions based on origin of story, what is being
talked about, who is talking, etc.
- Local vs. national
- Different broadcast styles (e.g., Fox vs. CNN vs.
Al Jazeera)
- News at one level (e.g., local or for a foreign
region) that is not being reported nationally or
in other regions.
13 Integrating Terrorism Data Analysis and News
Analysis
- News analysis is the foundation of systematic
databases such as the Global Terrorism Database
and the Minorities at Risk Database.
- News is a source for most investigative analyses
(e.g., fraud and money laundering analysis).
- Compiling the systematic databases is very labor
intensive, requiring experienced (i.e., expensive)
investigators.
- Other investigations are also laborious.
14 Integrating Terrorism Data Analysis and News
Analysis
- News stories are as much viewpoint and opinion as
news.
- Can thus get different angles from local,
regional, or different national news sources.
- Automated news analysis provides a complete
record of everything that's going on over any
period of time.
- News stories have strong relationships.
- News can follow the flow and change of a story
over time.
- News is immediate, but it is also rough and
incomplete.
15 Integrating Terrorism Data Analysis and News
Analysis
Terrorism Visual Analysis
Terrorism Databases
Terrorism VA
Jigsaw
NVAC
STAB/ RESIN Reasoning Environment
Framing, Affective Analysis
Broadcast VA
News Visual Analysis
News Story Databases
Next: full, Web-based multimedia content and the
Dark Web
16 Visual Analysis of Terrorism Data: Supporting the
Investigative Process
Where
Who
Example selections on the GTD spatio-temporal
interface that support investigative analysis.
User would be able to follow over time and space.
What
When
The user-driven investigation addresses the
issues of why.
17 WHO: Terrorist Groups
Five Flexible Entry Components
What
WHERE
WHEN
18 Enter System by Single or Multiple Selections
System will supply Specific Information
Drilldown to Original Info
19 Terrorism Data Analysis
- Combine continuous and categorical data
- Curved ribbons for better readability of the data
- Layering of ribbons
- First results
- - Number of terrorists killed depends strongly on
type of entity attacked
- - Large number killed when attacking
police/military
- - Few terrorists killed in most other cases, like
businesses, transportation, etc.
20 Terrorism Data Analysis
- Number of female terrorists depends on the
region
- Female terrorists in Latin America and Europe
- Hardly any female terrorists in Asia, the Middle
East, and throughout Africa
- Future plans for curved/forked ribbons
- Full interaction with these ribbons: reordering,
highlighting
- Histograms on numerical axes
- Filtering by categorical or numerical axes
(including time)
21 Applying Visual Analytics to Financial
Transactions
22 Relevant Properties of Visual Analytics
- Positioned for exploration and discovery.
- - Highly interactive, contextual views,
unstructured exploration
- Meant for large and/or complex data, with
uncertainty, with missing data (but we may not
know where the holes are), with data that are
constructed to be purposely misleading.
- Support of analytical reasoning, argument
building, evidence gathering and marshaling.
- Support of argument presentation and reporting
(smart reporting).
23 Application: Financial Fraud Analysis
All transaction activity
Identify
Google
Interactive Visualization
Prioritize
Report
Investigate
24 WireVis: Challenges to Financial Fraud Detection
- Bad guys are smart
- Automatic detection (black box) approach is
reactive to already known patterns
- Usually, bad guys are one step ahead
- Evaluation is difficult
- Difficult to obtain ground truth
- Financial institutions do not perform law
enforcement
- Suspicious reports are filed
- Turnaround time on the accuracy of reports could
be long
- What is the percentage of fraudulent activities
that are actually found and reported?
25 WireVis: Challenges with Wire Fraud Detection
- Size
- More than 200,000 transactions per day
- No transaction by itself is suspicious
- "It's like searching for a needle in a stack of
needles." -- Bill Fox
- Lack of International Wire Standard
- Loosely structured data with inherent ambiguity
London
Singapore
Charlotte, NC
Indonesia
26 WireVis: Challenges with Wire Fraud Detection
London
Charlotte, NC
Singapore
Indonesia
- No Standard Form
- When a wire leaves Bank of America in Charlotte
- The recipient can appear as if receiving at
London, Indonesia, or Singapore
- Vice versa, if receiving from Indonesia to
Charlotte
- The sender can appear as if originating from
London, Singapore, or Indonesia
27 WireVis: Using Keywords
- Keywords
- Words that are used to filter all transactions
- Only transactions containing keywords are flagged
- Highly secretive
- Typically include
- Geographical information (country, city names)
- Business types
- Specific goods and services
- Etc
- Updated based on intelligence reports
- The list ranges from 200 to 350 words
- Could reduce the number of transactions by up to
90%
- Most importantly, keywords give quantifiable
meanings (labels) to each transaction, and are
repositories of expert knowledge.
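A minimal sketch of the keyword step described on this slide: each transaction is tagged with the keywords it contains, and only tagged transactions are kept. The keyword set, the free-text `memo` field, and the function names are all hypothetical; the real WireVis keyword list is confidential and its matching rules are not given here.

```python
import re
from typing import Dict, List, Set

# Hypothetical keywords for illustration only; the real list is secret.
KEYWORDS: Set[str] = {"singapore", "textiles", "consulting"}


def tag_transaction(memo: str, keywords: Set[str] = KEYWORDS) -> Set[str]:
    """Return the keywords that occur in a transaction's free text."""
    # Lowercase and split into alphanumeric tokens, then intersect.
    tokens = set(re.findall(r"[a-z0-9]+", memo.lower()))
    return tokens & keywords


def flag_transactions(memos: List[str],
                      keywords: Set[str] = KEYWORDS) -> Dict[int, Set[str]]:
    """Keep only transactions with at least one keyword.

    Returns a mapping from transaction index to its keyword labels.
    """
    flagged: Dict[int, Set[str]] = {}
    for index, memo in enumerate(memos):
        hits = tag_transaction(memo, keywords)
        if hits:
            flagged[index] = hits
    return flagged
```

The returned keyword sets are the "labels" that the later views (heatmap, keyword network, search by example) can count and compare.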
28 WireVis: Current Practice at Bank of America
- Database Querying
- Experts filter the transactions by keywords,
amounts, date, etc.
- Results are displayed in a spreadsheet.
- Problems
- Cannot see more than a week or two of
transactions
- Difficult to see temporal patterns
- It is difficult to be exploratory using a
querying system
29 WireVis: System Overview
Search by Example (Find Similar Accounts)
Heatmap View (Accounts to Keywords Relationship)
Keyword Network (Keyword Relationships)
Strings and Beads (Relationships over Time)
30 WireVis: Heatmap View
- List of Keywords
- Sorted by frequency from high to low (left to
right)
- Hierarchical Clusters of Accounts
- Sorted by activities from big companies to
individuals (top to bottom)
- Fast binning that takes O(3n)
- Number of occurrences of keywords
- Light color indicates few occurrences
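The counting behind a heatmap of this kind can be sketched as follows, assuming the input is a stream of hypothetical (account, keyword) pairs, one per keyword occurrence. This is not the WireVis binning algorithm (the O(3n) method is not described on the slide), and it omits the hierarchical clustering of accounts; it only shows per-cell counts with keyword columns ordered by total frequency.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def build_heatmap(
    records: Iterable[Tuple[str, str]],
) -> Tuple[List[str], Dict[str, List[int]]]:
    """Count keyword occurrences per account.

    records: (account_id, keyword) pairs, one per occurrence.
    Returns the keyword columns (most frequent first) and, for each
    account, the list of counts in that column order.
    """
    cell: Dict[str, Counter] = defaultdict(Counter)
    totals: Counter = Counter()
    for account, keyword in records:  # one pass over the occurrences
        cell[account][keyword] += 1
        totals[keyword] += 1
    # High-to-low frequency, ties broken alphabetically for stability.
    columns = [k for k, _ in sorted(totals.items(),
                                    key=lambda kv: (-kv[1], kv[0]))]
    rows = {acct: [counts[k] for k in columns]
            for acct, counts in cell.items()}
    return columns, rows
```

Each count would then be mapped to a color, with light colors for small counts as the slide describes.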
31 WireVis: Strings and Beads
- Each string corresponds to a cluster of accounts
in the Heatmap view
- Each bead represents a day
- Y-axis can be amounts, number of transactions,
etc.
- Fixed or logarithmic scale
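One way to compute the data behind such a view is sketched below: one "string" per account cluster, one "bead" per day, with the daily total amount as the y value. The record layout, the `cluster_of` mapping, and the choice of log10(1 + total) for the logarithmic scale are assumptions for illustration, not details taken from WireVis.

```python
import math
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Tuple


def beads(
    transactions: Iterable[Tuple[str, date, float]],
    cluster_of: Dict[str, str],
    log_scale: bool = False,
) -> Dict[str, List[Tuple[date, float]]]:
    """Aggregate transactions into per-cluster, per-day totals.

    transactions: (account_id, day, amount) records.
    cluster_of: maps each account_id to its cluster id.
    Returns {cluster: [(day, y), ...]} with beads in date order.
    """
    daily: Dict[Tuple[str, date], float] = defaultdict(float)
    for account, day, amount in transactions:
        daily[(cluster_of[account], day)] += amount

    strings: Dict[str, List[Tuple[date, float]]] = defaultdict(list)
    for (cluster, day), total in sorted(daily.items()):
        # log10(1 + x) keeps zero-valued days defined on the log scale.
        y = math.log10(1.0 + total) if log_scale else total
        strings[cluster].append((day, y))
    return dict(strings)
```

Swapping the summed amount for a transaction count gives the other y-axis option mentioned on the slide.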
32 WireVis: Keyword Network
- Each dot is a keyword
- Position of each keyword is based on its
relationships
- Keywords close to each other appear together more
frequently
- Using a spring network, keywords in the center
are the most frequently occurring keywords
- Links between keywords denote co-occurrence
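The graph that such a network draws can be sketched from per-transaction keyword sets: node weights are keyword frequencies and edge weights are co-occurrence counts. The input format and function name are hypothetical, and the spring layout itself is not implemented here; a force-directed layout would take these edge weights as spring strengths.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, Set, Tuple


def cooccurrence(
    transaction_keywords: Iterable[Set[str]],
) -> Tuple[Counter, Counter]:
    """Build node and edge weights for a keyword network.

    transaction_keywords: one set of keywords per transaction.
    Returns (frequency per keyword, co-occurrence count per pair),
    where each pair is a tuple in alphabetical order.
    """
    freq: Counter = Counter()
    edges: Counter = Counter()
    for keywords in transaction_keywords:
        ordered = sorted(set(keywords))
        freq.update(ordered)
        # Every unordered pair within one transaction co-occurs once.
        edges.update(combinations(ordered, 2))
    return freq, edges
```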
33 WireVis: Search by Example
- Accounts that are within the similarity threshold
appear ranked (most similar on top)
- Target Account
- Histogram depicts the occurrences of keywords
- User interactively selects the features within
the histogram that are used in the comparison
- Similarity threshold slider
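A sketch of how search by example could work, assuming each account is summarized by a keyword histogram and that similarity is cosine similarity over the user-selected keywords. The slide does not name the similarity measure WireVis uses, so cosine similarity, the data layout, and the function names are all assumptions.

```python
import math
from typing import Dict, Iterable, List, Tuple

Histogram = Dict[str, int]


def cosine(a: Histogram, b: Histogram, features: Iterable[str]) -> float:
    """Cosine similarity restricted to the selected keyword features."""
    feats = list(features)
    dot = sum(a.get(f, 0) * b.get(f, 0) for f in feats)
    norm_a = math.sqrt(sum(a.get(f, 0) ** 2 for f in feats))
    norm_b = math.sqrt(sum(b.get(f, 0) ** 2 for f in feats))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no selected keyword occurs in one of the accounts
    return dot / (norm_a * norm_b)


def search_by_example(
    target: Histogram,
    accounts: Dict[str, Histogram],
    features: Iterable[str],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return accounts at or above the threshold, most similar first."""
    feats = list(features)
    scored = [(acct, cosine(target, hist, feats))
              for acct, hist in accounts.items()]
    hits = [(acct, s) for acct, s in scored if s >= threshold]
    hits.sort(key=lambda pair: (-pair[1], pair[0]))
    return hits
```

Moving the threshold slider corresponds to calling `search_by_example` again with a different `threshold`; selecting features in the histogram changes `features`.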
34 WireVis: Case Study
- Evaluation performed with James Price, lead
analyst of WireWatch at Bank of America
- Dataset has been sanitized and downsampled
- Video
- This system is generalizable to visual analysis
of transactional data
35 WireVis: Integrated with Full Transaction Database
- Scalability
- We're now connected to the database at Bank of
America with 10-20 million records over the
course of a rolling year (13 months)
- Connecting to a database makes interactive
visualization tricky
- Unexpected Results (Access through the VA
interface!)
- "Go to where the data is": operations relating
to the data are pushed onto the database (e.g.,
clustering).
36 WireVis: Integrated with Full Transaction Database
- Performance Measurements
- Data-driven operations such as re-clustering,
drilldown, and transaction search by keywords take
1-2 minutes in the worst case.
- All other interactions remain real time
- No pre-computation / caching
- Single-CPU desktop computer
- WireVis is in deployment on James Price's
computer at WireWatch for testing and evaluation
- This is a general approach applicable to all
types of data.
37 WireVis: Future Work
- Use text analysis (like IN-SPIRE) to
automatically identify keywords and associated
important terms.
- Relationships between Accounts
- Seeing who sends money to whom (over time) is
important
- Evaluation
- Working with analysts, try to understand how they
use the system and how to better their workflow
- Tracking and Reporting
- With tracking, we can make the analysis results
repeatable, sharable, and accountable
38 Financial Visual Analytics Workshop
- Met in Charlotte on December 3-4, 2007
- Participants from federal agencies (DHS, CIA,
FinCEN, Treasury, DEA), NVAC, banks, the National
Insurance Crime Bureau, and several key
university researchers.
- Report and recommendations coming out and to be
disseminated within the month.
39 Visual Reasoning (Knowledge Visualization):
Interaction Theory
Can we identify (conjecture) some (design)
principles even without a full theory? Just
thinking about visualization tasks in this way
can pay off.
40 Some Ideas That Could Lead to Principles
- "The interaction is the analysis." -- Remco Chang
- Keep interaction simple and direct.
- For more complex problems, have multiple views
(more pixels).
- - Each one optimized for its purpose, integrated
with the others.
- - Balanced interaction among views.
- - There is a trade-off. How many views?
- Each interactive visualization should have the
highest value for that moment in the reasoning
process.
Knowledge Visualization
41 More Ideas
- Determine the highest value (how?)
- - Task-dependent
- But are there valuable visual artifacts that are
general, or that would be useful for a whole set
of tasks? Or are there general tasks?
- - General Task: Exploration and Discovery
- Alternatively, are there ways to set up
high-value visualizations where the artifacts that
populate them are task-dependent but the way to
set them up is general (e.g., spatio-temporal
layouts)?
- Can we build models, even rather crude heuristic
models, with predictive capability?
42 Determining the Value: Knowledge Visualization
Data Visualization
Information Visualization
Knowledge Visualization
43 Properties of Knowledge
- Knowledge is of higher value than information or
data.
- Knowledge begets knowledge.
- Knowledge is compact.
- Knowledge is connected (more connections, more
value).
- Labeling is important (also, captions, titles,
text annotations).
- Knowledge artifacts are the elements of
reasoning.
- Knowledge can be made independent of user and
context (including domain).
44 What is Knowledge?
"Knowledge is the perception of agreement or
disagreement of two ideas." -- John Locke (1689)
- To distinguish between ideas, one needs an
inferential framework.
- The basic elements in such a framework are two
concepts (or ideas) and a connecting inference.
Ideas: The content of cognition; specific
thoughts.
Country
United States
Thus knowledge is built of ideas and their
inferential relations.
Belongs to Country
- In an ontology, the basic element is two objects
or concepts and their linking (inferential)
relation.
Belongs to Country
State
Capital
45 The Value of Visualization
Visualization Model (van Wijk, 2005)
[Diagram: the visualization model, with regions
labeled data, visualization, and user, and
elements D, V, Im, P, K, dK/dt, E, S, and dS/dt]
D: data; S: specifications
V: visualization; Im: resultant image
P: knowledge process; E: interactive exploration
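The relations that the model diagram encodes can be written out as a short set of equations. This is a sketch following van Wijk (2005) as commonly stated, not a transcription of the slide: the image is written I here where the slide labels it Im, and the initial values K_0 and S_0 are symbols from the paper rather than from the slide, so the exact form should be checked against the original.

```latex
\begin{align}
  I(t) &= V(D, S, t)
    && \text{visualization turns data and specification into an image} \\
  \frac{dK}{dt} &= P(I, K)
    && \text{the user gains knowledge by perceiving the image} \\
  K(t) &= K_0 + \int_0^t P(I, K, t')\, dt'
    && \text{knowledge accumulated up to time } t \\
  \frac{dS}{dt} &= E(K)
    && \text{exploration changes the specification} \\
  S(t) &= S_0 + \int_0^t E(K)\, dt'
    && \text{specification reached at time } t
\end{align}
```

Because K(t) is an integral over the sequence of images seen, which in turn depends on the sequence of specifications chosen, the knowledge gained depends on the whole interaction path and not only on the final view.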
46 The Value of Visualization
Knowledge
Data
value
time
P is a functional and is a path integral!
Cost/Benefit Analysis
Return on Investment
Profit
47 What is the Role of Interaction?
- The principal role of interaction in knowledge
visualization is to involve the user intimately
in exploration, discovery, and knowledge
creation.
- The best interactive interface should have an air
of inevitability, successfully answering the
question "what next?"
Interaction selects the path that maximizes the
above.
48 Knowledge Visualization: Bioinformatics
49 Knowledge Visualization: Bioinformatics
50 Knowledge Visualization: Bioinformatics
51 Knowledge Visualization: Bioinformatics
52 Questions?
www.srvac.uncc.edu www.viscenter.uncc.edu