Title: Link Analysis
1Link Analysis
- 2001. 1. 31
- Sekyoung Youm
2Contents
- Introduction
- Graph theory
- Some basic graph theory
- Seven Bridges of Konigsberg
- Traveling salesman problem
- Case study
- Conclusion
3Introduction
- Link Analysis
- Analyzing telephone call patterns
- Understanding physician referral patterns
- Combining leads
4Graph theory
- What is a graph?
- A graph is an abstraction developed specifically to represent relationships
- Node(vertex)
- Edge
- Fully-connected
- Planar
- Degree
- Loop
- Path
- Cycle
- Component
- Example graph: D = 14, E = 7
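The terms above can be made concrete with a small sketch. The edge list below is a made-up example (not taken from the slides), and `components` is an illustrative helper name; the code computes node degrees and connected components from an adjacency structure.

```python
# Hypothetical undirected graph given as an edge list (not from the slides).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("E", "F")]

# Adjacency sets: each node maps to the set of nodes it shares an edge with.
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

# Degree of a node = number of edges touching it.
degree = {node: len(neighbors) for node, neighbors in adjacency.items()}


def components(adj):
    """Return the connected components as a list of sets of nodes."""
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        result.append(comp)
    return result
```

In any undirected graph without loops, the degrees sum to twice the number of edges, which is a quick sanity check on data like this.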
5Graph theory
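- The breeder, wolf, ram, and cabbage river-crossing problem
- Each state is written ( left bank / right bank )
- The breeder is on the left bank in states a to e and on the right bank in states a' to e'
- a = ( bre, wolf, ram, cab / 0 )
- b = ( bre, wolf, ram / cab )
- c = ( bre, wolf, cab / ram )
- d = ( bre, ram, cab / wolf )
- e = ( bre, ram / wolf, cab )
- a' = ( wolf, cab / bre, ram )
- b' = ( wolf / bre, ram, cab )
- c' = ( ram / bre, wolf, cab )
- d' = ( cab / bre, wolf, ram )
- e' = ( 0 / bre, wolf, ram, cab )
- Solution path: a → a' → c → d' → d → c' → e → e'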
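The puzzle can be treated as a shortest-path search over a graph whose nodes are the safe states. The following is a minimal sketch under that assumption; the state encoding (items on the left bank plus the breeder's side) and the function names are illustrative choices, not the slide's notation.

```python
from collections import deque

ITEMS = frozenset({"wolf", "ram", "cab"})
# Pairs that cannot be left on a bank without the breeder.
FORBIDDEN = [{"wolf", "ram"}, {"ram", "cab"}]


def safe(bank):
    """A bank without the breeder is safe if it holds no forbidden pair."""
    return not any(pair <= bank for pair in FORBIDDEN)


def neighbors(state):
    """Yield safe states reachable by one crossing.

    state = (items on the left bank, breeder side "L" or "R").
    """
    left, side = state
    here = left if side == "L" else ITEMS - left
    for cargo in [None, *sorted(here)]:  # cross alone or with one item
        new_left = set(left)
        if cargo is not None:
            if side == "L":
                new_left.discard(cargo)
            else:
                new_left.add(cargo)
        new_left = frozenset(new_left)
        new_side = "R" if side == "L" else "L"
        # The bank the breeder just left is the unattended one.
        unattended = new_left if new_side == "R" else ITEMS - new_left
        if safe(unattended):
            yield (new_left, new_side)


def shortest_path(start, goal):
    """Breadth-first search; returns the list of states from start to goal."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in neighbors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


path = shortest_path((ITEMS, "L"), (frozenset(), "R"))
```

The search returns eight states, that is, seven crossings, and the only safe first move is to take the ram across.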
6Graph theory
- Directed graphs
- Source node (no incoming edges)
- Sink node (no outgoing edges)
- Cycle
- Detecting cycles in a graph
- Repeatedly remove source and sink nodes; if any nodes remain, the graph contains a cycle
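The removal procedure for detecting cycles can be sketched as follows. This is one straightforward way to implement it, assuming the directed graph is given as a list of (source, destination) pairs; `has_cycle` is an illustrative name.

```python
def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    nodes = {n for edge in edges for n in edge}
    remaining = set(edges)
    while nodes:
        has_in = {dst for _, dst in remaining}
        has_out = {src for src, _ in remaining}
        # Sources have no incoming edge; sinks have no outgoing edge.
        removable = {n for n in nodes if n not in has_in or n not in has_out}
        if not removable:
            # Every remaining node has an edge in and an edge out,
            # which is only possible if a cycle exists.
            return True
        nodes -= removable
        remaining = {(s, d) for s, d in remaining if s in nodes and d in nodes}
    return False
```

For example, `has_cycle([("A", "B"), ("B", "C")])` is `False`, while adding the edge `("C", "A")` makes it `True`.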
7Seven Bridges of Konigsberg
- Is it possible to walk over all seven bridges exactly once?
- The Pregel River in Konigsberg has two islands connected by a total of seven bridges
8Eulerian Path
- Eulerian path
- A path that uses every edge in a graph exactly once
- Exists only when the degrees of all the nodes in a graph are even, except at most two
- Path: 1, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 11, 2, 3, 4
- Seven Bridges of Konigsberg
- There are four nodes whose degrees are odd, so no Eulerian path exists
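The degree test can be checked directly on the Konigsberg graph. In this sketch the four land masses are given the assumed names `north`, `south`, `island`, and `east`, and each bridge is one edge of a multigraph.

```python
from collections import Counter

# The seven bridges as edges between four land masses (node names are assumed).
bridges = [
    ("north", "island"), ("north", "island"),
    ("south", "island"), ("south", "island"),
    ("east", "island"),
    ("north", "east"), ("south", "east"),
]


def odd_degree_nodes(edges):
    """Return the nodes touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(n for n, d in degree.items() if d % 2 == 1)


def may_have_eulerian_path(edges):
    """Degree test only; it assumes the graph is connected."""
    return len(odd_degree_nodes(edges)) <= 2
```

All four land masses have odd degree (5, 3, 3, and 3), so `may_have_eulerian_path(bridges)` is `False`.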
9Traveling Salesman problem
- Hamiltonian path
- a path that visits all nodes in a graph exactly once
- Weighted graph
- the shortest path: ABCDE (24)
- Greedy algorithm
- CDBEA (32)
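The gap between the greedy result and the shortest path can be reproduced on a small example. The slide's edge weights are not given, so the weights below are invented for illustration and will not reproduce the figures 24 and 32. The greedy rule is assumed here to be the nearest-neighbor heuristic.

```python
from itertools import permutations

# Hypothetical symmetric edge weights; the slide's actual weights are not given.
W = {
    ("A", "B"): 2, ("A", "C"): 9, ("A", "D"): 10, ("A", "E"): 7,
    ("B", "C"): 6, ("B", "D"): 4, ("B", "E"): 3,
    ("C", "D"): 8, ("C", "E"): 5,
    ("D", "E"): 1,
}
NODES = "ABCDE"


def weight(u, v):
    """Look up the weight of the undirected edge between u and v."""
    return W[(u, v)] if (u, v) in W else W[(v, u)]


def path_cost(path):
    """Total weight of the consecutive edges along the path."""
    return sum(weight(u, v) for u, v in zip(path, path[1:]))


def greedy_path(start):
    """Nearest-neighbor heuristic: always move to the closest unvisited node."""
    path = [start]
    while len(path) < len(NODES):
        rest = [n for n in NODES if n not in path]
        path.append(min(rest, key=lambda n: weight(path[-1], n)))
    return path


def best_path():
    """Exhaustive search over all orderings; feasible only for tiny graphs."""
    return min(permutations(NODES), key=path_cost)
```

With these made-up weights, the greedy path from A is A, B, E, D, C with cost 14, while exhaustive search finds a Hamiltonian path of cost 12. The greedy choice is cheap to compute but can commit early to an edge that forces an expensive one later.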
10Graph Coloring Algorithm
- Complete graph: the number of colors needed equals the number of nodes
- Cyclic graph (the numbers of nodes and edges are equal)
- number of nodes is odd: 3 colors
- number of nodes is even: 2 colors
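These color counts can be checked with a simple greedy coloring. This is a sketch of one common heuristic, not necessarily the algorithm the slides had in mind; greedy coloring is not optimal on every graph, but it reaches the minimum on the complete and cyclic graphs built by the illustrative helpers below.

```python
def greedy_coloring(adjacency):
    """Give each node the smallest color index not used by a colored neighbor."""
    colors = {}
    for node in adjacency:
        used = {colors[n] for n in adjacency[node] if n in colors}
        colors[node] = next(c for c in range(len(adjacency)) if c not in used)
    return colors


def complete_graph(n):
    """Every node is adjacent to every other node."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def cycle_graph(n):
    """Nodes 0..n-1 arranged in a ring (n >= 3)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def colors_used(adjacency):
    """Number of distinct colors the greedy coloring assigns."""
    return len(set(greedy_coloring(adjacency).values()))
```

For instance, `colors_used(complete_graph(4))` is 4, `colors_used(cycle_graph(5))` is 3, and `colors_used(cycle_graph(6))` is 2.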
11Case study
- Who is using fax machines from home?
- The data
- Several types of fax machine usage
- Dedicated fax (only for fax communication)
- Shared (fax and voice calls)
- Data (via fax or via computer modem)
12Approach
- The process used to find fax machines
- graph coloring algorithm
- a call graph for 15 numbers and 19 calls
13Approach(cont)
- Short calls are removed and nodes are labeled as fax, unknown, or information
- Nodes connected to the initial fax machines are assigned the fax label
- Those connected to information are assigned the voice label
- Those connected to both are shared
- The rest are unknown
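The labeling rules above can be sketched as a single pass over a call graph. Everything concrete here is assumed for illustration: the call records, the 10-second threshold for a "short" call, and the names `label_numbers` and `MIN_DURATION` are not from the case study.

```python
# Hypothetical call records: (caller, callee, duration in seconds).
calls = [
    ("fax1", "n1", 120), ("info", "n2", 45), ("fax1", "n3", 60),
    ("info", "n3", 30), ("n4", "n5", 90), ("fax1", "n5", 2),
]
MIN_DURATION = 10  # assumed threshold below which a call counts as "short"
initial = {"fax1": "fax", "info": "information"}  # known starting labels


def label_numbers(calls, initial, min_duration=MIN_DURATION):
    """Label each number from the initial labels of the numbers it is linked to."""
    links = {}
    for a, b, seconds in calls:
        if seconds < min_duration:
            continue  # short calls are removed from the graph
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    labels = dict(initial)
    for number, neighbors in links.items():
        if number in initial:
            continue
        seen = {initial[n] for n in neighbors if n in initial}
        if seen == {"fax", "information"}:
            labels[number] = "shared"
        elif seen == {"fax"}:
            labels[number] = "fax"
        elif seen == {"information"}:
            labels[number] = "voice"
        else:
            labels[number] = "unknown"
    return labels


labels = label_numbers(calls, initial)
```

In this made-up data, `n3` talks to both the fax machine and the information line and is labeled shared, while `n5` stays unknown because its only call to the fax machine was too short to count.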
14Conclusion
- Strengths of link analysis
- Most Appropriate for Linked Data
- Useful for Visualization
- Creates Derived Attributes
- Weaknesses of link analysis
- Not Applicable to Many Types of Data
- Few Tools
- Inefficient in SQL