Title: Small Worlds in Semantic Networks
1 Small Worlds in Semantic Networks
- Mark Steyvers
- Josh Tenenbaum
- Stanford University
2
- Real-Life Networks
  - Collaboration network of film actors
  - Power grid
  - Neural network of the worm C. elegans
  - WWW
- Properties
  - Short paths between any pair of nodes (small world)
  - Clustering: neighbors are often also each other's neighbors (small world)
  - Power-law distribution in the number of neighbors
- This Research
  - Consider semantic networks
  - Do semantic networks have similar properties?
  - What network model can predict these properties?
3 Path Lengths & Clustering (Watts & Strogatz 98; Adamic 99; Albert, Jeong, Barabasi, 99)
L = average length of the shortest path between nodes
C = fraction of a node's neighbors that are connected to each other

Network                                 L       C
Film actors (n = 220,000)               3.65    .79
Power grid (n = 4,900)                  18.7    .08
Neural network, C. elegans (n = 282)    2.65    .28
WWW (n = 3,000)                         4.06    .16
WWW (n = 300,000)                       11.2    -
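The two statistics defined on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the table: the adjacency-dictionary representation, the toy graph, and the choice to average C only over nodes with at least two neighbors are all assumptions made here.

```python
from collections import deque
from itertools import combinations


def average_path_length(adj):
    """L: mean shortest-path length over all connected ordered pairs of distinct nodes."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives shortest paths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def clustering_coefficient(adj):
    """C: fraction of a node's neighbor pairs that are linked, averaged over nodes.

    Assumption: nodes with fewer than two neighbors are skipped.
    """
    fractions = []
    for nbs in adj.values():
        if len(nbs) < 2:
            continue
        linked = sum(1 for a, b in combinations(nbs, 2) if b in adj[a])
        possible = len(nbs) * (len(nbs) - 1) // 2
        fractions.append(linked / possible)
    return sum(fractions) / len(fractions) if fractions else 0.0


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
L = average_path_length(toy)
C = clustering_coefficient(toy)
```

On the toy graph, L works out to 16/12 (about 1.33) and C to 7/9 (about 0.78): the triangle pushes C up while the pendant node lengthens a few paths.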
4 Erdős-Rényi Graphs
Connect every pair of nodes with probability p
Short path lengths: $L \sim \log(n)$
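The construction rule above can be sketched directly. This is an illustrative sketch only: the function name `erdos_renyi`, the graph size, and the value of p are hypothetical choices, not values from the slides.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Return an undirected random graph as {node: set(neighbors)}.

    Every one of the n*(n-1)/2 node pairs is linked independently with probability p.
    """
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for a, b in combinations(range(n), 2):
        if rng.random() < p:
            adj[a].add(b)
            adj[b].add(a)
    return adj


n, p = 500, 0.02  # hypothetical size and link probability
graph = erdos_renyi(n, p, seed=0)
mean_degree = sum(len(nbs) for nbs in graph.values()) / n
expected_degree = p * (n - 1)  # each node has n - 1 potential neighbors
```

Because every pair is linked independently, two neighbors of the same node are themselves linked with probability p, so the expected clustering of such a graph is simply p. For sparse graphs this is very small.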
5 Path Lengths & Clustering Compared with Erdős-Rényi Graphs
Watts & Strogatz (1998)

Network                                 L       L_Erdős-Rényi   C      C_Erdős-Rényi
Film actors (n = 220,000)               3.65    2.99            .79    .00027
Power grid (n = 4,900)                  18.7    12.4            .08    .005
Neural network, C. elegans (n = 282)    2.65    2.25            .28    .05
WWW (n = 3,000)                         4.06    4.05            .16    .0012
6 Degree Distribution
Erdős-Rényi
- P(k) is Poisson distributed
- Exponential tail: unlikely to have hubs
Real-life networks
- $P(k) \sim k^{-\gamma}$
- Power-law tail, linear in a log/log plot
- There are a few hubs connected to many nodes
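A small numerical sketch may help make the contrast concrete. It checks that a power law is a straight line with slope -γ on log/log axes, and compares how quickly the two tails fall off. The exponent, the mean degree, and the range of k are hypothetical values chosen for illustration.

```python
import math


def loglog_slope(ks, probs):
    """Least-squares slope of log P(k) against log k."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(p) for p in probs]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def poisson_pmf(k, mean):
    """Poisson P(k), evaluated in log space to avoid overflow in k!."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


gamma = 3.0    # hypothetical power-law exponent
mean_k = 10.0  # hypothetical mean degree for the Poisson case
ks = list(range(1, 101))

power_law = [k ** -gamma for k in ks]  # unnormalized P(k) ~ k^-gamma
poisson = [poisson_pmf(k, mean_k) for k in ks]

slope = loglog_slope(ks, power_law)

# How much rarer is a degree-100 node than a degree-10 node?
hub_ratio_power_law = power_law[99] / power_law[9]
hub_ratio_poisson = poisson[99] / poisson[9]
```

Under these assumed values the fitted slope is -γ, and a degree-100 node is a thousand times rarer than a degree-10 node under the power law, but rarer by dozens of orders of magnitude under the Poisson distribution. That gap is what "unlikely to have hubs" refers to.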
7 Semantic Networks
- Associative Networks
- WordNet
- Roget's Thesaurus
8 Word Association (Nelson et al., 1999)
N_words = 5,000
9 WordNet (George Miller and colleagues)
N_words = 120,000; N_senses = 99,000
10 Roget's Thesaurus (1911)
N_words = 29,000; N_categories = 1,000
11 Path Lengths & Clustering

Network              L       L_Erdős-Rényi   C       C_Erdős-Rényi
Word Association     3.04    3.03            .175    .0004
WordNet              10.6    10.6            .745    .0000 (0.51)
Roget's Thesaurus    5.60    5.43            .875    .0004 (0.61)
12 Degree Distribution
13 Growing, Scale-Free Networks (Barabasi & Albert 99)
- Start with m0 nodes
- Growth: at each time step, add a node with m links.
- Preferential attachment: connect the links to existing nodes with probability proportional to their degree, $\Pi(k_i) = k_i / \sum_j k_j$
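The growth procedure in the list above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `barabasi_albert` is hypothetical, the m0 seed nodes are assumed to start fully connected (the slide does not say how they are wired), and duplicate targets are redrawn, which makes the sampling only approximately proportional to degree.

```python
import random


def barabasi_albert(m0, m, steps, seed=None):
    """Grow a graph by preferential attachment; return {node: set(neighbors)}.

    Requires 2 <= m0 and m <= m0.
    """
    rng = random.Random(seed)
    # Assumption: the m0 seed nodes form a complete graph, so each has nonzero degree.
    adj = {i: {j for j in range(m0) if j != i} for i in range(m0)}
    # Each node appears in `stubs` once per link end, so a uniform draw from
    # `stubs` selects node i with probability k_i / sum_j k_j.
    stubs = [i for i in range(m0) for _ in range(m0 - 1)]
    for new in range(m0, m0 + steps):
        targets = set()
        while len(targets) < m:  # redraw until m distinct targets are found
            targets.add(rng.choice(stubs))
        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
            stubs.extend([t, new])
    return adj


# Hypothetical parameters, chosen small so the example runs quickly.
graph = barabasi_albert(m0=5, m=3, steps=2000, seed=1)
degrees = {node: len(nbs) for node, nbs in graph.items()}
early = sum(degrees[i] for i in range(100)) / 100        # first 100 nodes
late = sum(degrees[i] for i in range(1905, 2005)) / 100  # last 100 nodes
```

Keeping a flat list of link ends is a common trick for this kind of sampling: it avoids recomputing the normalizing sum at every step. In a run like this one, the earliest nodes end up with a clearly higher average degree than the latest ones.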
14 Apply Model to Word Association
(m0 = 11, m = 11, T = 5018)
Degree Distribution
15 Degree vs. Time
Because of preferential attachment, early nodes get the most connections.
Prediction: words acquired early in life have more connections?
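The advantage of early nodes can be made explicit with the standard mean-field argument for preferential attachment. The derivation below is a sketch of that textbook result, not something shown on the slide; it assumes a node i added at time t_i with m links, and a total degree in the network of roughly 2mt at time t.

```latex
\frac{\partial k_i}{\partial t}
  = m \, \frac{k_i}{\sum_j k_j}
  \approx \frac{k_i}{2t},
\qquad k_i(t_i) = m
\quad\Longrightarrow\quad
k_i(t) = m \left( \frac{t}{t_i} \right)^{1/2}
```

Under these assumptions, a node's degree grows with the square root of t/t_i, so a smaller arrival time t_i means a larger degree at any later time t.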
16 Degree vs. Age of Acquisition (rated)
17 Degree vs. Age of Acquisition (objective)
18 Frequent Words on Shortest Paths
Roget's Thesaurus
WordNet
Word Association
19 Watts & Strogatz (1998)
[Figure: L and C as a function of rewiring probability, from regular lattice through small-world network to random graph]
L = characteristic path length: average minimum path length
C = clustering coefficient: fraction of neighbors that are connected to each other
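The rewiring construction behind this figure can be sketched as below. This is a simplified illustration rather than the exact published procedure: the function name `watts_strogatz` and all parameter values are hypothetical, and one end of each rewired edge is assumed to stay fixed while the other is redrawn uniformly among nodes that would not create a self-loop or a duplicate edge.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Ring lattice of n nodes, each linked to its k nearest neighbors (k even, n > k),
    with every lattice edge rewired with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Regular lattice: link each node to k/2 neighbors on either side.
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    # Rewiring pass: visit each original lattice edge exactly once.
    for step in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + step) % n
            if j in adj[i] and rng.random() < p:
                # Candidates exclude self-loops and edges that already exist.
                candidates = [c for c in range(n) if c != i and c not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


# Hypothetical sizes for the three regimes in the figure.
lattice = watts_strogatz(20, 4, 0.0, seed=0)      # p = 0: regular lattice
small_world = watts_strogatz(20, 4, 0.1, seed=0)  # small p: a few shortcuts
rewired = watts_strogatz(20, 4, 1.0, seed=0)      # p = 1: close to a random graph
```

Rewiring moves edges rather than adding them, so all three graphs have the same number of edges. Only the arrangement changes, which is why L and C can be compared directly across rewiring probabilities.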