Title: Visual Analysis Algebra
1Visual Analysis Algebra
- Anna Shaverdian, Hao Zhou
- H. V. Jagadish, George Michailidis
- University of Michigan
2Find a criminal network within a network?
- 50 different solutions, 5-minute videos to explain the process, pages of text
3Desired Features in Visual Analysis
- Mix and match ideas from multiple projects
- Compare/Validate tools and techniques
- Document and reproduce results from another's visual analysis
- Not ambiguous
- Not wordy
- Optimize techniques
4Visual Analysis Algebra
- Graph Model
- Predicate/ Witness
- Graph Matching Function
- Operators
- Selection
- Labeling
- Aggregation
- Helper Functions
- Visual Operators
5Graph Model
- Attributed graph D = (G, X)
- Graph G = (V, E)
- Each node is assigned a unique id through the ?(vertex) function
- Allows directed, multi-edge graphs
- (Direction captured as an edge attribute)
- Attributes X = (XV, XE, XG)
- Each attribute has a name, type, and value
- Attributes can be intrinsic or computed
- Intrinsic: independent features that stay constant if the graph topology changes
- Computed: created through composition functions
- Examples: degree, betweenness, centrality
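The graph model above can be sketched as a small data structure. This is only an illustration under assumptions: the class name `AttributedGraph`, its field names, and the sample values are hypothetical and do not come from the slides. Edges are keyed by an edge id so that parallel edges are allowed, and direction is stored as an ordinary edge attribute.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set, Tuple


@dataclass
class AttributedGraph:
    """D = (G, X): topology G = (V, E) plus attributes X = (XV, XE, XG)."""
    nodes: Set[int] = field(default_factory=set)                         # V
    edges: Dict[int, Tuple[int, int]] = field(default_factory=dict)      # E: edge id -> (u, v)
    node_attrs: Dict[int, Dict[str, Any]] = field(default_factory=dict)  # XV
    edge_attrs: Dict[int, Dict[str, Any]] = field(default_factory=dict)  # XE
    graph_attrs: Dict[str, Any] = field(default_factory=dict)            # XG

    def add_edge(self, edge_id: int, u: int, v: int, **attrs: Any) -> None:
        # Keying edges by id (not by endpoints) allows parallel edges.
        self.nodes.update((u, v))
        self.edges[edge_id] = (u, v)
        self.edge_attrs[edge_id] = dict(attrs)

    def degree(self, node: int) -> int:
        # Computed attribute: it changes when the topology changes.
        return sum((u == node) + (v == node) for u, v in self.edges.values())


g = AttributedGraph()
g.add_edge(0, 1, 2, directed=True)
g.add_edge(1, 1, 2, directed=True)   # second edge between the same two nodes
g.add_edge(2, 2, 3, directed=False)  # direction is just an edge attribute
g.node_attrs[1] = {"phoneID": "A"}   # intrinsic attribute (made-up value)
print(g.degree(2))
```

Here `degree` plays the role of a computed attribute, while `phoneID` stands in for an intrinsic one.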
6Example Graph Model
- Cell phone network: a node represents a phone and an edge represents a call between two phones
- D = (G = (V, E), X = (XV = {XphoneID}, XE = {Xdate, Xduration, Xtower, XcallerID}, XG = {}))
- Initial data set with intrinsic attributes
- Perform operations on sets of attributed graphs (closed algebra)
- Dday1, Dday2, ..., Dday10
7Predicate Definition
- p = (V, E, XV, XE, XG, !E)
- V, E describe the graph structure
- XV, XE, XG describe the conditions on the attributes in V, E
- Example: Xv.weight.node12 < Xv.weight.node10 in XV
- !E describes the excluded edges
- An edge e1 in !E does not exist in the graph G; given a closed universe U, for every S of which G is a subgraph, e1 does not exist in S either
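One way to picture the six-part predicate is as a record holding the pattern structure, lists of attribute conditions, and the excluded edges. The sketch below is an assumption-laden illustration: the class `Predicate`, its field names, and the candidate weights are hypothetical; only the example condition (weight of node12 less than weight of node10) is taken from the slide.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set, Tuple

Attrs = Dict[str, Dict[str, Any]]   # element name -> {attribute name: value}
Condition = Callable[[Attrs], bool]


@dataclass
class Predicate:
    nodes: Set[str]                                                    # V
    edges: Set[Tuple[str, str]]                                        # E
    node_conds: List[Condition] = field(default_factory=list)          # XV
    edge_conds: List[Condition] = field(default_factory=list)          # XE
    graph_conds: List[Callable[[Dict[str, Any]], bool]] = field(default_factory=list)  # XG
    excluded_edges: Set[Tuple[str, str]] = field(default_factory=set)  # !E


# The slide's example condition: Xv.weight.node12 < Xv.weight.node10.
p = Predicate(
    nodes={"node10", "node12"},
    edges=set(),
    node_conds=[lambda xv: xv["node12"]["weight"] < xv["node10"]["weight"]],
)

# Made-up attribute values for a candidate pair of nodes.
candidate = {"node10": {"weight": 5}, "node12": {"weight": 3}}
holds = all(cond(candidate) for cond in p.node_conds)
print(holds)
```

Representing conditions as callables keeps the record independent of any particular attribute schema.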
8Witness
- An attributed graph for which
- A bijective mapping exists between its nodes and the predicate's nodes
- The predicate's conditions all hold on its node, edge, and graph attributes
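The witness conditions can be checked for one candidate mapping roughly as follows. This is a sketch under assumptions, not the authors' procedure: the function name `is_witness` and its parameters are hypothetical, edges are treated as undirected for brevity, only node conditions are evaluated, and the ages are made-up values.

```python
from typing import Any, Callable, Dict, Iterable, Tuple

Edge = Tuple[Any, Any]


def is_witness(
    mapping: Dict[str, Any],
    graph_edges: Iterable[Edge],
    node_attrs: Dict[Any, Dict[str, Any]],
    pattern_edges: Iterable[Tuple[str, str]],
    excluded_edges: Iterable[Tuple[str, str]] = (),
    node_conds: Iterable[Callable[[Dict[str, Dict[str, Any]]], bool]] = (),
) -> bool:
    # A bijection needs distinct graph nodes for distinct predicate nodes.
    if len(set(mapping.values())) != len(mapping):
        return False
    present = {frozenset(e) for e in graph_edges}  # edges treated as undirected
    if any(frozenset((mapping[u], mapping[v])) not in present for u, v in pattern_edges):
        return False
    # Excluded edges (!E) must be absent between the mapped nodes.
    if any(frozenset((mapping[u], mapping[v])) in present for u, v in excluded_edges):
        return False
    bound = {name: node_attrs.get(node, {}) for name, node in mapping.items()}
    return all(cond(bound) for cond in node_conds)


pattern = [("a", "b"), ("b", "c")]
excluded = [("a", "c")]
mapping = {"a": 1, "b": 2, "c": 3}
ages = {1: {"age": 12}, 2: {"age": 22}, 3: {"age": 16}}  # made-up values
younger = lambda xv: xv["a"]["age"] < xv["b"]["age"]

open_path = is_witness(mapping, [(1, 2), (2, 3)], ages, pattern, excluded, [younger])
triangle = is_witness(mapping, [(1, 2), (2, 3), (1, 3)], ages, pattern, excluded, [younger])
print(open_path, triangle)
```

The second call fails because the edge between nodes 1 and 3 is present in the graph but listed in the excluded edges.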
9Example Excluded Edges Witness
- [Figure: panels labeled Predicate and Attributed Graph]
10Graph Matching Function (?): subroutine used by operators
- Inputs: an attributed graph D and a predicate p
- Outputs:
- A list of witnesses W
- The attributes of each witness's nodes, edges, and graph include all attributes of the respective elements in D
- If two or more witnesses share the same ids and attributes (e.g., the same structure under a different rotation), combine them into an arbitrary one
- A model witness X
- A set of mapping lists from the witnesses in W to X
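The inputs and outputs listed above could be realized by a brute-force matcher such as the one below. It is a minimal sketch, assuming undirected edges and structure-only predicates; the function name `match` and the `x_` naming of model-witness nodes are hypothetical. Witnesses covering the same set of node ids are collapsed to one, which is a simplification of the slide's rule for rotations.

```python
from itertools import permutations


def match(graph_nodes, graph_edges, pattern_nodes, pattern_edges, excluded_edges=()):
    """Return (witnesses W, model witness X, mapping lists from W to X)."""
    present = {frozenset(e) for e in graph_edges}  # undirected, for simplicity
    names = sorted(pattern_nodes)
    witnesses, seen = [], set()
    for combo in permutations(sorted(graph_nodes), len(names)):
        m = dict(zip(names, combo))
        if any(frozenset((m[u], m[v])) not in present for u, v in pattern_edges):
            continue
        if any(frozenset((m[u], m[v])) in present for u, v in excluded_edges):
            continue
        ids = frozenset(combo)
        if ids in seen:  # same node ids in another rotation: keep only one
            continue
        seen.add(ids)
        witnesses.append(m)
    model = {name: "x_" + name for name in names}  # hypothetical model-node ids
    mappings = [{w[name]: model[name] for name in names} for w in witnesses]
    return witnesses, model, mappings


triangle = [("a", "b"), ("b", "c"), ("a", "c")]
W, X, maps = match({1, 2, 3, 4}, [(1, 2), (2, 3), (1, 3), (3, 4)], {"a", "b", "c"}, triangle)
print(W, maps)
```

Enumerating permutations is exponential in the pattern size, so this only suits toy graphs; a real implementation would use a subgraph-isomorphism algorithm.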
11Graph Matching Function (?) Example
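- [Figure: a Predicate and an Attributed Graph are passed to ?, which returns the witness found, the model witness, and the mappings; node labels shown in the figure: Age 12, Age 22, Age 16]
- Mappings (ID to Model ID):
ID  Model ID
1   6
2   7
3   8
4   9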
12Selection Operator s
- There are two types of selection operators
- One works at the attributed graph level
- One works at the element (node and edge) level
- Both operate on a set of attributed graphs and
output a set of graphs
13Set Selection s_set
- Given a set of attributed graphs D and a predicate p
- Set selection outputs the set of graphs for which a witness for the predicate exists
- Example
- The graphs with an average degree greater than 42
- p = (V = {}, E = {}, XV = {}, XE = {}, XG = {Xg.averageDegree > 42}, !E = {})
- s_set,p(D1, D2, ..., D10) = D'
- where D' is a subset of D and, for every Di in D', Xg.averageDegree > 42
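The average-degree example can be sketched as a filter over a set of graphs. Because the example predicate has no structural part, the check reduces to a graph-level condition. The names `set_select`, `average_degree`, and `complete_graph` are hypothetical helpers, and complete graphs are used only as convenient test inputs.

```python
from itertools import combinations


def average_degree(graph):
    # Each edge contributes to the degree of two nodes.
    return 2 * len(graph["E"]) / len(graph["V"]) if graph["V"] else 0.0


def set_select(graphs, graph_cond):
    """Keep the graphs whose graph-level condition holds."""
    return [g for g in graphs if graph_cond(g)]


def complete_graph(n):
    nodes = list(range(n))
    return {"name": "K%d" % n, "V": nodes, "E": list(combinations(nodes, 2))}


graphs = [complete_graph(n) for n in (10, 43, 44)]
selected = set_select(graphs, lambda g: average_degree(g) > 42)
print([g["name"] for g in selected])  # ['K44']
```

K43 has an average degree of exactly 42, so the strict inequality excludes it; only K44 (average degree 43) is selected.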
14Element Selection s_element
- Given a set of attributed graphs and a predicate p
- s_element,p(D1, D2, ..., Dn) = ∪_i D'i
- where each D'i = {Wi.p.1, ..., Wi.p.k}, the k witnesses of predicate p found in Di
- Outputs an attributed graph for each witness found in the set of graphs
15Example Element Selection s_element
- Select a subgraph from a set of graphs
- p = (V = {1, 2, 3}, E = {e12, e23}, XV = {}, XE = {}, XG = {}, !E = {})
- D1 = (G = (V = {1, 2, 3, 4}, E = {e12, e14, e23, e34}), X = ({}, {}, {}))
- s_element,p(D1)
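The example can be worked through in code. The sketch below assumes undirected edges, treats a witness as the matched nodes plus the matched edges only, and counts a path and its reversal as one witness; the slide does not spell out these choices, and the function name `element_select` is hypothetical.

```python
from itertools import permutations


def element_select(graphs, pattern_nodes, pattern_edges):
    """Return one small graph per witness found in any input graph."""
    out = []
    for g in graphs:
        present = {frozenset(e) for e in g["E"]}
        seen = set()
        for combo in permutations(sorted(g["V"]), len(pattern_nodes)):
            m = dict(zip(pattern_nodes, combo))
            matched = [frozenset((m[u], m[v])) for u, v in pattern_edges]
            key = frozenset(matched)
            # Skip repeats (e.g. the reversed path) and non-matching candidates.
            if key in seen or any(e not in present for e in matched):
                continue
            seen.add(key)
            out.append({"V": set(combo), "E": {tuple(sorted(e)) for e in matched}})
    return out


# D1 and p from the slide: a 4-cycle and a 3-node path pattern.
d1 = {"V": {1, 2, 3, 4}, "E": [(1, 2), (1, 4), (2, 3), (3, 4)]}
selected = element_select([d1], [1, 2, 3], [(1, 2), (2, 3)])
print(len(selected))  # 4
```

Under these assumptions D1 yields four witnesses, one for each 3-node path around the cycle.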
16Labeling Operator
- During graph analysis, we need a way to select nodes of interest, mark them, and continue the analysis, sometimes referring back to the marked nodes
- We do this by labeling
- Given a set of graphs and a predicate
- We modify each graph to remember its match to the predicate
17Labeling Operator
- For each attributed graph Di in which a witness for the predicate exists (found using the ? function)
- Create the model witness structure x within Di
- Label it with a unique group id
- For each witness wj found in Di
- Use the mapping lists to create directed edges between wj and x
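The labeling steps above might look like this for a single graph. It is a sketch under assumptions: the function name `label`, the dictionary layout of the graph, and the `x<group>:<name>` naming of model-witness nodes are all hypothetical, and the witnesses are supplied directly rather than computed by the matching function.

```python
def label(graph, witnesses, group_id):
    """Add a model witness to `graph` and link each witness to it."""
    if not witnesses:
        return graph  # no witness for the predicate: graph left unchanged
    # Create the model witness structure x inside the graph itself.
    model = {name: "x%d:%s" % (group_id, name) for name in witnesses[0]}
    for node in model.values():
        graph["V"].add(node)
        graph["XV"][node] = {"group": group_id}
    # One directed edge per witness node, pointing at its model counterpart.
    for structure_id, witness in enumerate(witnesses, start=1):
        for name, node in witness.items():
            graph["E"].append({"src": node, "dst": model[name],
                               "group": group_id, "structure": structure_id})
    return graph


d = {"V": {1, 2, 3}, "XV": {}, "E": [{"src": 1, "dst": 2}, {"src": 2, "dst": 3}]}
# Two witnesses of a single-edge predicate a-b, as a matcher might return them.
label(d, [{"a": 1, "b": 2}, {"a": 2, "b": 3}], group_id=1)
links = [e for e in d["E"] if e.get("group") == 1]
print(len(links))  # 4
```

Node 2 ends up linked to both model nodes, once per witness, and the structure id records which witness each link belongs to.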
18Labeling Example
- [Figure: Labeling applied to a Predicate and an Attributed Graph]
- Each edge has a group id, marking it as an edge to a model witness, and a structure id, identifying the witness it belongs to
19Example Labeling Visual Analysis
- Given a social network
- We have a suspected terrorist subnetwork and some features of interest
- Analyze the subgraphs that match the suspected subnetwork
- The predicate structure isn't the final structure we're looking for; it's an intermediate step
- VAST 2009 challenge
20Example Labeling Visual Analysis
- [Figure annotations: Degree 40; Geographic size: small island]
21Helper Functions
- Visual Operators
- Ex. feed values into a histogram, layouts, presentation
- Creating/Deleting
- Create/Delete a set of nodes/edges/attributes
- Copy a graph
22Phone Record Case Study
- In an attempt to characterize the entire network, we loaded the
entire data set into MobiVis, which links people (blue nodes) if they
had a phone conversation. Unfortunately, the tight connectivity of the
resulting network made it impossible to find interesting patterns.
Following the lead that person 200 is likely to be Ferdinando
Catalano, we filtered the data to visualize only its closest nodes.
Figure 1 shows the social network of person 200. Figure 1. Overview of
the social network of Ferdinando Catalano (id 200). This reflects the
general social structure over, at least, the first seven days. We can
further characterize this network by looking at the links between the
immediate neighbors of person 200. Persons 5, 200, 97 and 137 seem to
form a clique, whereas persons 1, 2 and 3 form another. Looking at the
amount of communication between those, which is depicted as the
thickness of the edges, we discovered that 200 and 5 talk a lot among
themselves. The color coding of the edges helps visualize the symmetry
of the calls. For example, a warm color (orange) in the middle
indicates a symmetric connection (both parties call each other
frequently), whereas a biased orange color indicates more calls in the
direction of the bias. We then characterized the network as being the
connection of two families: the Catalanos, represented in persons 200
(Ferdinando Catalano), 5 (whom we believe to be Estaban Catalano,
given the tight connection to 200), 97 and 137; and the Vidros,
represented in persons 1, 2 and 3. We can further characterize the
substructure of the Vidros as hierarchical. Although it was not
evident at first, person 1 always calls persons 2 and 3, which led us
to believe that he has a role of coordinator. We validated this with
another capability of MobiVis, which allows us to display people in
the social network according to some semantic filtering criteria. In
Figure 2(a), we display the people called by person 1 and the people
who called person 1. Those people who called person 1 are connected to
an orange node, while people who were called by person 1 are connected
to a red node. We can see that person 1 had a bi-directional
communication with Ferdinando Catalano, but only in one direction with
2, 3 and 5. Figure 2(b) shows the same analysis for person 5. We
noticed an inverse behavior: 1, 2 and 3 always call 5, but not vice
versa. Furthermore, it helped us characterize the social structure
better. The high symmetry of communication between 200 and 5 validates
our claim about their identities being those of Ferdinando and Estaban
Catalano, respectively. Person 1, however, seems to coordinate the
efforts of 2, 3 and 5, which suggests that he can be associated to
David
23Phone Record Case Study
- Original data set (10 days)
- D = (G = (V, E), X = (XV = {XphoneID}, XE = {Xdate, Xduration, Xtower, XcallerID}, XG = {}))
- View entire graph
- Create 10 graphs (one per day)
- Predicate for day i calls
- pday_i = (V = {v1, v2}, E = {e12}, XV = {}, XE = {Xe.12.day = i}, XG = {}, !E = {})
- Labeling by day
- µday_i(D)
- Element selection on the day_i group
- s_element,day_i(D) = {D1, D2, D3, D4, D5, D6, D7, D8, D9, D10}
- View each graph
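The net effect of labeling by day and then selecting on each day group is one graph per day. A shortcut that produces the same partition directly from call records is sketched below; it skips the labeling step, the function name `split_by_day` and the record fields are hypothetical, and the records are made up rather than taken from the challenge data.

```python
from collections import defaultdict


def split_by_day(calls):
    """Return {day: graph}, one graph per distinct value of the day attribute."""
    per_day = defaultdict(lambda: {"V": set(), "E": []})
    for call in calls:
        graph = per_day[call["day"]]
        graph["V"].update((call["caller"], call["callee"]))
        graph["E"].append(call)  # the call record doubles as the edge's attributes
    return dict(sorted(per_day.items()))


# Made-up records; not the challenge data.
calls = [
    {"caller": 200, "callee": 5, "day": 1, "duration": 60},
    {"caller": 5, "callee": 200, "day": 1, "duration": 30},
    {"caller": 1, "callee": 2, "day": 2, "duration": 45},
]
days = split_by_day(calls)
print(sorted(days))  # [1, 2]
```

Each per-day graph keeps the intrinsic edge attributes, so later operators can still filter on duration or caller.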
24Phone Record Case Study
- Look at pattern change in node 200's neighborhood
- Predicate for node 200 neighbor
- p200Neighbor = (V = {v1, v2}, E = {e12}, XV = {Xv.1.callerID = 200}, XE = {}, XG = {}, !E = {})
- Labeling by day
- µ200Neighbor(D1, D2, ..., D10)
- Selection on the 200 neighbor group
- s_element,200Neighbor(D1, D2, ..., D10) = {D'1, D'2, D'3, D'4, D'5, D'6, D'7, D'8, D'9, D'10}
- Aggregate the days 1-7 and days 8-10 graphs
- Set aggregation
- ?set, pdays1-7. pdays8-10(D1, D2, ..., D10) = {Dday1-7, Dday8-10}
- Element aggregation on CallerID
- ?element, pdays1-7. pdays8-10(Dday1-7, Dday8-10)
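The slides list aggregation among the operators but do not define it, so the following is only a guess at the intent of this workflow: keep the calls placed by node 200, group them into the two periods, and count calls per caller/callee pair. The function names, record fields, and counting semantics are assumptions, and the records are made up.

```python
from collections import Counter


def neighbor_calls(calls, node_id):
    # Mirrors the slide's condition callerID = 200 (calls placed by the node).
    return [c for c in calls if c["caller"] == node_id]


def aggregate_by_period(calls, periods):
    """Count calls per (caller, callee) pair within each named period."""
    totals = {name: Counter() for name in periods}
    for call in calls:
        for name, days in periods.items():
            if call["day"] in days:
                totals[name][(call["caller"], call["callee"])] += 1
    return totals


periods = {"days1-7": range(1, 8), "days8-10": range(8, 11)}
calls = [  # made-up records; not the challenge data
    {"caller": 200, "callee": 5, "day": 1},
    {"caller": 200, "callee": 5, "day": 3},
    {"caller": 200, "callee": 97, "day": 9},
    {"caller": 1, "callee": 2, "day": 9},
]
totals = aggregate_by_period(neighbor_calls(calls, 200), periods)
print(totals["days1-7"][(200, 5)], totals["days8-10"][(200, 97)])  # 2 1
```

Comparing the two counters side by side is one simple way to see whether node 200's calling pattern changes between the periods.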
25Algebraic Visual Analysis: The Catalano Phone Call Data Set Case Study
- Anna Shaverdian, Hao Zhou, George Michailidis, and H.V. Jagadish, VAKD '09
- Simulate many existing analytical workflows with operators from the visual analytic algebra
- Ability to do analysis beyond existing workflows
26Multiple Step Social Structure Analysis with
Cytoscape
- Hao Zhou, Anna Shaverdian, H.V. Jagadish, George Michailidis, VAST '09
- VAST '09 Flitter Mini Challenge Award: Good Tool Adaption
- Demonstrates Cytoscape's utility in identifying the structure in a social network