Title: Social Networks and the Semantic Web
1Social Networks and the Semantic Web
- Peter Mika
- BI/FEW, BO/FSW
- Vrije Universiteit, Amsterdam
2Contents
- Motivation I.
- The explosive mix of social networks and the Web
- A brief history of Network Science
- Social Network Analysis
- SNA in entrepreneurship
- Motivation II.
- Two problems in search of a solution
- Contributions
3Motivation I.
- Excitement: two vibrant application areas converging rapidly
- Social networks meet the Web
- The Semantic Web meets social networks
Hardly a surprise: personal information made the first Web
4Social networks meet the Web
- Making Friendsters in High Places (Wired News, July 17, 2003)
- Will Microsoft Wallop Friendster? (Wired News, Nov 8, 2003)
- Social Nets Find Friends in VCs (Wired News, Nov 17, 2003)
- Google spawns social networking service (CNET, Jan 22, 2004)
- The Technology of the Year: Social Network Applications (Business 2.0, November 2003)
- And the backlash
- Social Nets Not Making Friends (Wired News, Jan 28, 2004)
- Too many confirmations to answer
- Pretendsters, fakesters
- No clear goal
- Different definitions of friendship etc.
- Next generation social networks AND
10Links
- Friendship-network sites
- www.orkut.com
- www.friendster.com (over 5 million users)
- www.linkedin.com
- www.tribe.net
- www.tickle.com (formerly Emode)
- www.ryze.com
- www.flickr.com
- www.wiw.hu
- Weblogs
- Paolo Massa's blog
- Clay Shirky's blog
- Judith Meskill's blog
11Friend-of-a-Friend
- Friend-of-a-Friend (FOAF): a standard vocabulary for recording personal information in a machine-readable format (RDF)
- FOAF documents contain information such as
- name
- homepage
- image
- depiction
- interests
- projects
- publications
- memberships
- etc.
http://www.foaf-project.org/
12Example
<foaf:Person rdf:ID="FrankvH">
  <foaf:name>Frank van Harmelen</foaf:name>
  <foaf:mbox_sha1sum>241021fb0e6289f92815fc210f9e9137262c252e</foaf:mbox_sha1sum>
  <foaf:homepage rdf:resource="http://www.cs.vu.nl/frankh" />
  <foaf:img rdf:resource="http://www.cs.vu.nl/frankh/figs/FvH-2003.jpg" />
  <foaf:knows rdf:resource="http://...HansAkkermans" />
  <foaf:knows rdf:resource="http://...PeterMika" />
</foaf:Person>
Thousands of these documents exist already on the
(Semantic) Web. All of them are linked together
through the knows relationship. This network is
also known as the FOAF-web.
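As a rough illustration of how a program might read such a document, the sketch below parses a small FOAF fragment with Python's standard `xml.etree.ElementTree` and lists its `foaf:knows` links. The `example.org` URIs and the function name `knows_edges` are placeholders invented for this sketch; a real application would more likely use a dedicated RDF library.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical FOAF document; the example.org URIs are placeholders.
FOAF_DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:ID="FrankvH">
    <foaf:name>Frank van Harmelen</foaf:name>
    <foaf:knows rdf:resource="http://example.org/people#HansAkkermans" />
    <foaf:knows rdf:resource="http://example.org/people#PeterMika" />
  </foaf:Person>
</rdf:RDF>"""


def knows_edges(rdf_xml):
    """Return (person_id, known_uri) pairs found in a FOAF RDF/XML string."""
    root = ET.fromstring(rdf_xml)
    edges = []
    for person in root.iter(f"{{{FOAF}}}Person"):
        person_id = person.get(f"{{{RDF}}}ID")
        for knows in person.findall(f"{{{FOAF}}}knows"):
            edges.append((person_id, knows.get(f"{{{RDF}}}resource")))
    return edges


edges = knows_edges(FOAF_DOC)
for source, target in edges:
    print(source, "knows", target)
```

Collecting these pairs across many documents is one way to assemble the network of knows links described above.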
13Applications
- FOAF Explorer: text browser
- FOAF Naut: graphical browser
- Codepiction: browse the network in images
14foafnaut
15Codepiction
16Codepiction
17Network Science
- Graph theory meets real life
- Models to capture what is common to networks observed in the physical world
- Part of complex systems research in physics
- Broad applications
- Social Network Analysis
- Biology
- Chemistry
- Engineering
18Graph theoretical concepts
- Degree of a node
- number of incoming / outgoing links
- Average Shortest Path, lav
- average shortest path over all pairs of vertices between which a path exists.
- Clustering Coefficient, C
- The clustering coefficient, C, represents the average fraction of existing connections between nearest neighbours of a vertex.
- Relative size of the largest component, S
- The relative size of the largest component is simply the size of the largest component divided by the total number of nodes.
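The four measures above can be sketched in a few lines of plain Python. The example graph and all function names are made up for illustration. The sketch assumes an undirected graph, and it gives nodes with fewer than two neighbours a clustering value of 0, which is one common convention.

```python
from collections import deque

# Small undirected example graph as an adjacency dict (made-up data).
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
    "e": {"f"},
    "f": {"e"},
}


def bfs_distances(graph, source):
    """Hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def average_shortest_path(graph):
    """Mean distance over pairs of distinct nodes joined by a path."""
    total = pairs = 0
    for source in graph:
        for target, d in bfs_distances(graph, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


def clustering_coefficient(graph):
    """Average, over nodes, of the fraction of neighbour pairs that are linked."""
    values = []
    for node, nbs in graph.items():
        k = len(nbs)
        if k < 2:
            values.append(0.0)  # convention: no neighbour pairs means 0
            continue
        links = sum(1 for u in nbs for v in nbs if u < v and v in graph[u])
        values.append(links / (k * (k - 1) / 2))
    return sum(values) / len(values)


def largest_component_fraction(graph):
    """Size of the largest connected component divided by the node count."""
    seen, best = set(), 0
    for node in graph:
        if node not in seen:
            component = set(bfs_distances(graph, node))
            seen |= component
            best = max(best, len(component))
    return best / len(graph)


print("degree of c:", len(GRAPH["c"]))
print("average shortest path:", average_shortest_path(GRAPH))
print("clustering coefficient:", clustering_coefficient(GRAPH))
print("relative size of largest component:", largest_component_fraction(GRAPH))
```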
19Networks and graphs
- Euler (1736): bridges of Königsberg
- Later: Cauchy, Hamilton, Cayley, Kirchhoff, Pólya et al.
- Regular graphs
- Each node has the same degree
- High clustering, high path lengths
- e.g. the lattice of atoms
- Nothing particular happened in the next 200 years
20Erdős-Rényi
- Eight papers on random graphs (from 1959)
- Given a fixed set of nodes.
- Choose two nodes and, if you roll a six with a die, place a link between them.
- Poisson degree distribution (not regular, but in large networks most nodes have a degree close to the average)
- Low clustering, low path lengths
- Explains important network phenomena
- six degrees of separation (next)
- the emergence of a giant component
- A tipping point: when the average degree in a graph reaches 1, a giant cluster emerges
- Physics: percolation or phase transition
- Dominated thinking about complex networks until recently
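The tipping point can be seen in a small simulation. This is a minimal sketch, assuming the variant of the model in which every pair of nodes is linked independently with probability p (the die roll on the slide corresponds to p = 1/6). The function names, node count and seed are arbitrary choices.

```python
import random


def random_graph(n, p, rng):
    """Erdős-Rényi style graph: link each pair of nodes with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def largest_component_fraction(adj):
    """Relative size S of the largest connected component."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best / len(adj)


rng = random.Random(0)
n = 1000
for avg_degree in (0.5, 1.0, 2.0, 4.0):
    p = avg_degree / (n - 1)  # expected degree of a node is p * (n - 1)
    s = largest_component_fraction(random_graph(n, p, rng))
    print(f"average degree {avg_degree}: S = {s:.2f}")
```

Below an average degree of 1 the largest component should stay a small fraction of the graph, and above it a single component should absorb most nodes.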
21Poisson distribution
22Power-law distribution
- Example: top 50 terms in TIME articles
- Also called Zipf's law in linguistics
23Six degrees
- The (Stanley) Milgram experiment (1967)
- Two targets
- a stock broker in Boston, MA and the wife of a divinity grad student in Sharon, MA
- Starting points
- random people in Wichita, Kansas and Omaha, Nebraska
- Chain letter to forward to a personal acquaintance who is more likely to know the target person
- 42 of the 160 letters made it back; the average chain was 5.5 long (overestimation!)
- six degrees of separation, small world
- Avg. distance increases logarithmically with network size
- Species in food webs (2), molecules in the cells (3), neurons in the brain of C. elegans (14), the Web (19)
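To make the logarithmic growth concrete, the sketch below uses a standard random-graph approximation that is not stated on the slide: the typical separation is roughly ln(N) / ln(k) for N nodes with average degree k. The function name and the average degree of 10 are assumptions made for illustration only.

```python
import math


def estimated_separation(n_nodes, avg_degree):
    """Random-graph approximation: about ln(N) / ln(<k>) steps."""
    return math.log(n_nodes) / math.log(avg_degree)


# A thousandfold increase in size adds only a few steps.
for n in (10**3, 10**6, 10**9):
    print(n, round(estimated_separation(n, 10), 2))
```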
24Granovetter
- Granovetter (1969): The strength of weak ties
- How do people network (use their social connections) to find a job?
- Mostly through acquaintances (weak ties). Close friends are also friends of each other -> unable to provide reach, access to new information
- A small and clustered world: densely knit networks of close friends, with acquaintances connecting them
- Impossible to explain by Erdős-Rényi (random graphs)
- high degree of clustering
25Watts Strogatz
- Watts-Strogatz model (Nature, 1998)
- First successful attempt to reconcile clustering and random graphs
- Very few extra links cut back the separation drastically, while not altering the clustering coefficient significantly
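The effect can be reproduced with a small experiment. The sketch below assumes the usual formulation of the model: start from a ring lattice, then move one end of each edge to a random node with probability p. The function names, sizes and seed are arbitrary illustrative choices.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Regular ring: each node is linked to its k nearest neighbours (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewire(adj, k, p, rng):
    """With probability p, move the far end of each lattice edge to a random node."""
    n = len(adj)
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            if j in adj[i] and rng.random() < p:
                candidates = [t for t in range(n) if t != i and t not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


def average_path_length(adj):
    """Mean hop count over all pairs of nodes joined by a path."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def clustering(adj):
    """Average fraction of linked neighbour pairs per node."""
    values = []
    for node, nbs in adj.items():
        k = len(nbs)
        if k < 2:
            values.append(0.0)
            continue
        links = sum(1 for u in nbs for v in nbs if u < v and v in adj[u])
        values.append(2 * links / (k * (k - 1)))
    return sum(values) / len(values)


rng = random.Random(42)
lattice = ring_lattice(300, 6)
small_world = rewire(ring_lattice(300, 6), 6, 0.05, rng)
for name, g in (("lattice", lattice), ("rewired", small_world)):
    print(name, round(average_path_length(g), 2), round(clustering(g), 2))
```

Rewiring about 5% of the edges should shorten the average path considerably while leaving most of the clustering in place.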
26Barabási
- The Barabási Model: scale-free networks
- Growth: Starting with a small number of nodes, at every time step, add a new node with m edges that link to m nodes already present in the system.
- Preferential attachment: When choosing the nodes to which the new node connects, assume that the probability that a new node will be connected to a particular node depends on the degree of that node. Highly connected nodes are more likely to become even more connected. (The rich get richer, first-mover advantage)
- The result is a network with a few highly connected nodes (hubs), while a considerable proportion of nodes have a degree of only one or two. In fact, the degree distribution ends up following a power law, $P(k) \sim k^{-3}$.
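Growth and preferential attachment can be sketched as follows. The seed network of m + 1 fully connected nodes and the function name are assumptions of this sketch, not details given on the slide.

```python
import random
from collections import Counter


def barabasi_albert(n, m, rng):
    """Grow a graph by preferential attachment; returns an adjacency dict."""
    # Seed network: m + 1 fully connected nodes (an assumption of this sketch).
    adj = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    # Each node appears in this list once per link, so a uniform draw from it
    # picks a node with probability proportional to its degree.
    endpoints = [i for i in adj for _ in adj[i]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
            endpoints.extend((t, new))
    return adj


graph = barabasi_albert(5000, 2, random.Random(1))
degree_counts = Counter(len(nbs) for nbs in graph.values())
print("max degree:", max(degree_counts))
for k in (2, 4, 8, 16, 32):
    print("nodes with degree", k, ":", degree_counts.get(k, 0))
```

Most nodes should keep a degree near m while a handful of early nodes grow into hubs, which is the heavy tail the power law describes.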
27Networks as competitive systems
- Barabási: number of links is a function of time. First movers take all.
- How could new kids on the block succeed in the world of Barabási?
- Google, Boeing, Palm
- Barabási: each node has a fitness.
- Preferential attachment is driven by the fitness-connectivity product (fitness × links)
- Speed of acquiring links is governed by fitness
- Still a power law?
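The fitness-connectivity product can be illustrated with a tiny calculation. The node names, degrees and fitness values below are invented, and the function is a sketch of the attachment rule only, not a full growth model.

```python
def attachment_probabilities(degrees, fitness):
    """Probability that a new link goes to each node: fitness * degree, normalised."""
    weights = {node: fitness[node] * degrees[node] for node in degrees}
    total = sum(weights.values())
    return {node: w / total for node, w in weights.items()}


# Made-up numbers: an early, well-connected node versus a fitter latecomer.
degrees = {"early": 10, "latecomer": 2}
fitness = {"early": 0.1, "latecomer": 0.9}
probs = attachment_probabilities(degrees, fitness)
print(probs)
```

With equal fitness the early node would attract most new links; a sufficiently fit latecomer can still be the more likely target despite having fewer links.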
28Evolution
- Evolution of a network depends on the fitness distribution
- In fit-get-rich networks, the power-law structure survives at all times; the winner's lead is never significant
- In some networks, the winner takes all. The fittest node gets almost all links, leading to a star topology (1 hub, many tiny nodes).
- Example: Microsoft?
29Criticism of the state of the art
- Does this help us as individuals?
- Path finding is local search in real life
- We have 150 weak ties on average
- Would George Bush lend me his car?
- Not all ties are equal
- Effects of similarity, proximity lacking from
models
30Implications for architectures
- Network Robustness
- Error refers to random node failure.
- Attack refers to preferential removal of the most connected nodes.
- Robustness can be measured as the relative size, S, and the average shortest path, lav, of the largest cluster.
- Simulation shows that scale-free networks are more robust to error but less robust to attack than both random and exponential networks.
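A simulation of this kind can be sketched as below. It compares error and attack on a single scale-free graph only; it does not reproduce the comparison with random or exponential networks. The graph generator, the 10% removal fraction and all names are assumptions of the sketch, and S is measured relative to the surviving nodes.

```python
import random


def scale_free_graph(n, m, rng):
    """Preferential-attachment graph: new nodes prefer well-connected targets."""
    adj = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    endpoints = [i for i in adj for _ in adj[i]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
            endpoints.extend((t, new))
    return adj


def relative_size_after_removal(adj, removed):
    """S of the largest component among the nodes that survive the removal."""
    removed = set(removed)
    alive = [v for v in adj if v not in removed]
    seen, best = set(removed), 0  # removed nodes are never visited
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best / len(alive)


rng = random.Random(0)
graph = scale_free_graph(3000, 2, rng)
count = len(graph) // 10  # remove 10% of the nodes
error = rng.sample(sorted(graph), count)  # random failures
attack = sorted(graph, key=lambda v: len(graph[v]), reverse=True)[:count]  # hubs first
print("S after error: ", round(relative_size_after_removal(graph, error), 2))
print("S after attack:", round(relative_size_after_removal(graph, attack), 2))
```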
31Social Network Analysis
- Social Network Analysis (SNA) is the study of social relations among a set of actors. [1, 2]
- Network analysis is distinguished from other fields of sociology by a focus on relationships between actors rather than attributes of actors.
- sense of interdependence: a molecular rather than atomistic view (network view)
- belief that structure affects substantive outcomes (emergent effects)
- SNA focuses on the various roles individuals play in social networks and the various kinds of relations that may exist, and looks for viable/most efficient structures. SNA has also scored successes in explaining network dynamics such as the spread of epidemics, fashions, inventions, technologies etc.
32Network research in Entrepreneurship
- Application of SNA to entrepreneurial (business) networks
- i.e. nodes are entrepreneurs or their firms, links are business relations
- Applications to Business Venturing, Management, Strategy, Organizational Studies
- Key concept: embeddedness
- Entrepreneurs are said to be embedded in their social network
- Business ties are said to be relationally embedded
- Focus on enterprise formation period or small to medium enterprises
- Investigates how certain properties of networks affect the success or failure of the enterprise
- Financial success
- Innovativeness
33Research foci
- Network content
- Information and advice
- Emotional support
- Legitimation
- Relationship governance
- Trust
- Power and influence
- Threat of ostracism and loss of reputation
- Network structure
- Structural holes (Burt)
- Weak vs. Strong ties (Granovetter)
- Network as independent vs. network as dependent
variable
34Findings
- Embeddedness as a result of local search
- Lack of information, lack of resources to pursue broad searches
- previous linkages guide partner selection
- Social identity theory: similarity strengthens self-image, similar people are treated more favorably
- Similarity can be cause (precondition) or result: similarity breeds interaction vs. interaction breeds similarity
35Two sides of embeddedness
- Enabling effect of strong ties
- Strong ties result in better information sharing/concerted acting, lower resistance (trustworthiness), comfort (familiarity), sometimes limiting of sharing (prevent leakage)
- Especially if first mover partners become tied up later
- Constraining effect of strong ties
- Over-embeddedness: actors become locked-in, homogeneity (similarity) increases
- makes it more difficult to break out (discover information, resource niches) and diversify (learn, innovate)
- non-group members increasingly isolated
- No possibility for gaining advantage by brokerage (Burt)
- Contingency
- Different mix of strong vs. weak ties is appropriate
- at different stages of the enterprise
- for different activities
36Motivation II.
- Two problems in search of a common solution
- How can the Semantic Web benefit from a machine understanding of the social networks of agents?
- In what way can Social Network science benefit from Semantic Web research and technology?
37Problem cluster: the SW perspective
- An ontology is a shared, formal conceptualization of a domain [Gruber93].
- Ontologies build upon a shared understanding within a community.
- This understanding represents an agreement of experts over the concepts and relationships that are present in a domain.
- The social factor in ontology-based KM.
- Ontologies are expressed in machine processable, logic-based representations.
- This allows computers to manipulate ontologies.
- The machine factor in ontology-based KM.
- Yet, current SW technology ignores the social aspects of knowledge.
- Result: hot potatoes, scalability problems
38Hot potatoes
- Which of these are considered difficult problems in ontology research?
- Ontology acquisition (automated)
- Ontology development (manual)
- Ontology evaluation and measures of ontology quality
- Ontology representation
- Ontology query
- Ontology-based platforms
- MAS, P2P, Grid, Web Services
- Ontology management
- Ontology storage
- Ontology-based reasoning
- Ontology versioning
- Ontology alignment, merging and mapping
- Ontology presentation and visualization
- Ontology-based search
- Ontology-based query
- Ontology-based collaboration
- And what do they have in common?
39Scalability problems
- Ontologies for Information Management: Balancing Formality, Stability, and Sharing Scope [ElstAbecker2001]
40What does this mean for the Semantic Web?
41Contingency
- An architecture of the Semantic Web that reflects
the social nature of communication and knowledge
42Problem cluster: the Social Science perspective
- A characteristic (and to some critics, a weakness) of research on networks is the lack of a core theory that in turn yields a set of well-defined propositions from which network constructs are defined
- The result is a loose federation of approaches (Burt, 1980) in which researchers often debate how concepts are operationalized rather than the underlying theoretical arguments themselves (Hoang and Antoncic, 2003)
- Result: propositions are difficult to compare because of their dependence on operationalization
43Contribution
- A core ontology of social networks and relationships
- Formulate network theories in a semi-formal manner, explicate link between theory and case study data
- Case study: the Semantic Web research community
- Action-oriented
- Beyond FOAF: Towards a social structure for the Semantic Web
44The End
- Your questions, my pleasure.
45Visualizations
- Standalone tools
- Pajek
- http://vlado.fmf.uni-lj.si/pub/networks/pajek/
- NetMiner (commercial)
- http://www.netminer.com
- 3D
- Kinemages (3D data format)
- http://kinemage.biochem.duke.edu/
- several layout programs
- Developer tools
- CAIDA (several tools)
- http://www.caida.org/tools/visualization/
- TouchGraph (Java API)
- http://www.touchgraph.com/index.html
- http://touchgraph.sourceforge.net/
- Jung (Java API for network analysis)
- http://jung.sourceforge.net/
- AT&T GraphViz (layout software, C/C++)
- http://www.research.att.com/sw/tools/graphviz/