
Slides: 105

Provided by: jwm

Learn more at: https://people.duke.edu

1
Introduction to Social Network Analysis
Columbia University, April 2007
James Moody, Duke University
2
Introduction

Introduction
Social Network data
Basic data elements
Network data sources
Local (ego) Network Analysis
Introduction
Network Composition
Network Structure
Local Network Models
Complete Network Analysis
Exploratory Analysis
Network Connections
Network Macro Structure
Stochastic Network Analyses
Social Network Software Review
Work through examples

3
Introduction
We live in a connected world
"To speak of social life is to speak of the association between people: their associating in work and in play, in love and in war, to trade or to worship, to help or to hinder. It is in the social relations men establish that their interests find expression and their desires become realized."
Peter M. Blau, Exchange and Power in Social Life, 1964
"If we ever get to the point of charting a whole city or a whole nation, we would have a picture of a vast solar system of intangible structures, powerfully influencing conduct, as gravitation does in space. Such an invisible structure underlies society and has its influence in determining the conduct of society as a whole."
J.L. Moreno, New York Times, April 13, 1933
These patterns of connection form a social space that can be seen in multiple contexts.
4
Introduction
Source: Linton Freeman, "See You in the Funny Pages," Connections, 23, 2000, 32-42.
5
Introduction
High Schools as Networks
8
Introduction
And yet, standard social science analysis methods do not take this space into account.
"For the last thirty years, empirical social research has been dominated by the sample survey. But as usually practiced, ... the survey is a sociological meat grinder, tearing the individual from his social context and guaranteeing that nobody in the study interacts with anyone else in it."
Allen Barton, 1968 (quoted in Freeman 2004)
Moreover, the complexity of the relational world makes it impossible to identify social connectivity using only our intuition. Social Network Analysis (SNA) provides a set of tools to empirically extend our theoretical intuition of the patterns that compose social structure.
9
Introduction
Why do Networks Matter?
Local vision
11
Introduction

Social network analysis is a set of relational methods for systematically understanding and identifying connections among actors. SNA:
- is motivated by a structural intuition based on ties linking social actors
- is grounded in systematic empirical data
- draws heavily on graphic imagery
- relies on the use of mathematical and/or computational models.
Social Network Analysis embodies a range of theories relating types of observable social spaces and their relation to individual and group behavior.

12
Introduction Key Questions

Social network analysis lets us answer questions about social interdependence. These include:
Networks as Variables approaches
- Are kids with smoking peers more likely to smoke themselves?
- Do unpopular kids get in more trouble than popular kids?
- Are people with many weak ties more likely to find a job?
- Do central actors control resources?
Networks as Structures approaches
- What generates hierarchy in social relations?
- What network patterns spread diseases most quickly?
- How do role sets evolve out of consistent relational activity?
We don't want to draw this line too sharply: emergent role positions can affect individual outcomes in a variable way, and variable approaches constrain relational activity.

13
1. Introduction and Background

Why networks matter
Intuitive: information travels through contacts between actors, which can reflect a power distribution or influence attitudes and behaviors. Our understanding of social life improves if we account for this social space.
Less intuitive: patterns of inter-actor contact can have effects on the spread of goods or power dynamics that could not be seen focusing only on individual behavior.

14
Social Network Data
The units of interest in a network are the combined sets of actors and their relations. We represent actors with points and relations with lines.
Actors are referred to variously as: nodes, vertices, actors, or points
Relations are referred to variously as: edges, arcs, lines, or ties
Example: [Figure: a graph with five nodes labeled a, b, c, d, e]
15
Social Network Data Basic Data Elements

Social network data consist of two linked classes of data:
a) Nodes: Information on the individuals (actors, nodes, points, vertices)
- Network nodes are most often people, but can be any other unit capable of being linked to another (schools, countries, organizations, personalities, etc.)
- The information about nodes is what we usually collect in standard social science research: demographics, attitudes, behaviors, etc.
- Often includes dynamic information about when the node is active
b) Edges: Information on the relations among individuals (lines, edges, arcs)
- Records a connection between the nodes in the network
- Can be valued or binary, directed (arcs) or undirected (edges)
- One-mode (direct ties between actors) or two-mode (actors share membership in an organization)
- Includes the times when the relation is active
Graph theory notation: G(V, E)
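One way to picture the two linked classes of data is as a pair of small record types, one for nodes and one for edges. The sketch below is only an illustration: the class names, the `age` attribute, and the five ties are all hypothetical, not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One actor plus the attributes a standard survey would collect."""
    name: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Edge:
    """One relation between two actors."""
    source: str
    target: str
    value: float = 1.0      # 1.0 for a binary tie, any weight for a valued tie
    directed: bool = False  # True for an arc, False for an undirected edge


# Hypothetical toy data: five actors and five undirected, binary ties.
V = {name: Node(name) for name in "abcde"}
V["a"].attributes["age"] = 16  # illustrative attribute only
E = [Edge("a", "b"), Edge("b", "c"), Edge("c", "d"), Edge("c", "e"), Edge("d", "e")]


def neighbors(node_name, edges):
    """Return the set of actors that node_name is tied to, respecting direction."""
    found = set()
    for e in edges:
        if e.source == node_name:
            found.add(e.target)
        elif e.target == node_name and not e.directed:
            # An undirected edge can be read from either end.
            found.add(e.source)
    return found
```

Keeping node attributes and edges in separate structures mirrors G(V, E): V carries the usual survey variables, E carries the relational data.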

17
Social Network Data Basic Data Elements
In general, a relation can be: (1) Binary or Valued; (2) Directed or Undirected.
[Figure: example graph on nodes a, b, c, d, e with directed, multiplex categorical edges]
The social process of interest will often determine what form your data take. Almost all of the techniques and measures we describe can be generalized across data formats.
18
Social Network Data Basic Data Elements Levels
of analysis
Global-Net
19
Social Network Data Basic Data Elements Levels
of analysis
We can examine networks across multiple levels:
1) Ego-network
- Have data on a respondent (ego) and the people they are connected to (alters)
- Example: 1985 GSS module
- May include estimates of connections among alters
2) Partial network
- Ego networks plus some amount of tracing to reach contacts of contacts
- Something less than a full account of connections among all pairs of actors in the relevant population
- Example: CDC contact tracing data for STDs
20
Social Network Data Basic Data Elements Levels
of analysis
We can examine networks across multiple levels

3) Complete or Global data
- Data on all actors within a particular
(relevant) boundary
- Never exactly complete (due to missing data),
but boundaries are set
- Examples: coauthorship data among all writers in the social sciences, friendships among all students in a classroom

21
Social Network Data Graph Layout
A good network drawing allows viewers to come
away from the image with an almost immediate
intuition about the underlying structure of the
network being displayed. However, because there
are multiple ways to display the same
information, and standards for doing so are few,
the information content of a network display can
be quite variable.
Consider the 4 graphs drawn at right. After
asking yourself what intuition you gain from each
graph, click on the screen.
Now trace the actual pattern of ties. You will
see that these 4 graphs are exactly the same.
22
Social Network Data Graph Layout
Network visualization helps build intuition, but you have to keep the drawing algorithm in mind. Here we show the same graphs with two different techniques:
Spring embedder layouts: Most effective with graphs that have a strong community structure (clustering, etc.). Provide a very clear correspondence between social distance and plotted distance.
Tree-based layouts: Most effective for very sparse, regular graphs. Very useful when relations are strongly directed, such as organization charts or internet connections.
[Figure: two images of the same network, one per layout, rated "(Fair - poor)" and "(good)"]
23
Social Network Data Graph Layout
Another example
Spring embedder layouts
Tree-Based layouts
(poor)
(good)
Two layouts of the same network
24
Social Network Data
Basic Data Structures
In general, graphs are cumbersome to work with analytically, though there is a great deal of good work to be done on using visualization to build network intuition. I recommend using layouts that optimize on the feature you are most interested in. The two I use most are a hierarchical layout and a force-directed layout. We'll see some examples of best practice after getting a little more familiar with data structures.
25
Social Network Data
Basic Data Structures
From pictures to matrices
Undirected, binary
Directed, binary
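The step from a picture to a matrix can be sketched as follows. The matrices on the original slide were images and did not survive, so the node names and ties here are hypothetical; the point is only that an undirected tie fills two symmetric cells while a directed tie fills one.

```python
def adjacency_matrix(nodes, ties, directed=False):
    """Build a binary adjacency matrix (list of lists) from a list of ties."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for sender, receiver in ties:
        matrix[index[sender]][index[receiver]] = 1
        if not directed:
            # An undirected tie is recorded in both cells, so the matrix is symmetric.
            matrix[index[receiver]][index[sender]] = 1
    return matrix


nodes = ["a", "b", "c", "d", "e"]
ties = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("d", "e")]  # hypothetical ties

undirected = adjacency_matrix(nodes, ties)
directed = adjacency_matrix(nodes, ties, directed=True)
```

Rows are senders and columns are receivers, so in the directed version cell (a, b) can be 1 while cell (b, a) is 0.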
26
Social Network Data
Basic Data Structures
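From matrices to lists
Arc List (sender, receiver):
a b
b a
b c
c b
c d
c e
d c
d e
e c
e d
Adjacency List (node: neighbors), as implied by the arc list:
a: b
b: a c
c: b d e
d: c e
e: c d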
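The two list formats carry the same information, and converting between them is mechanical. A minimal sketch, using the ten sender-receiver pairs from the arc list on this slide (the function name is my own):

```python
from collections import defaultdict

# The ten arcs from the slide, read as (sender, receiver) pairs.
arc_list = [
    ("a", "b"), ("b", "a"), ("b", "c"), ("c", "b"), ("c", "d"),
    ("c", "e"), ("d", "c"), ("d", "e"), ("e", "c"), ("e", "d"),
]


def to_adjacency_list(arcs):
    """Group receivers under each sender, preserving the order of the arc list."""
    adjacency = defaultdict(list)
    for sender, receiver in arcs:
        adjacency[sender].append(receiver)
    return dict(adjacency)


adjacency_list = to_adjacency_list(arc_list)
```

An arc list has one row per tie, which suits valued or time-stamped ties; an adjacency list has one row per node, which is more compact for sparse networks.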
27
Social Network Data Basic Data Elements Modes
Social network data are substantively divided by the number of modes in the data.
1-mode data represent edges based on direct contact between actors in the network. All the nodes are of the same type (people, organizations, ideas, etc.). Examples: communication, friendship, giving orders, sending email. There are no constraints on connections between classes of nodes.
1-mode data are usually singly reported (each person reports on their friends), but you can use multiple-informant data, which is more common in child development research (Cairns and Cairns).
28
Social Network Data Basic Data Elements Modes
Social network data are substantively divided by the number of modes in the data.
2-mode data represent nodes from two separate classes, where all relations cross classes. Examples:
- People as members of groups
- People as authors on papers
- Words used often by people
- Events in the life history of people
The two modes of the data represent a duality: you can project the data as people connected to people through joint membership in a group, or groups to each other through common membership.
N-mode data generalizes the constraint on ties between classes to N groups.
29
Social Network Data Basic Data Elements Modes
Breiger 1974 - Duality of Persons and Groups
Argument
Metaphor: people intersect through their associations, which defines (in part) their individuality.
The Duality argument is that relations among groups imply relations among individuals.
30
Social Network Data Basic Data Elements Modes
Bipartite networks imply a constraint on the
mixing, such that ties only cross classes. Here
we see a tie connecting each woman with the party
she attended (Davis data)
32
Social Network Data Basic Data Elements Modes
By projecting the data, one can look at the memberships shared between people or the members common to groups. This is the person-to-person projection of the 2-mode data.
33
Social Network Data Basic Data Elements Modes
By projecting the data, one can look at the memberships shared between people or the members common to groups. This is the group-to-group projection of the 2-mode data.
34
Social Network Data Basic Data Elements Modes
Working with two-mode data
A person-to-group adjacency matrix is rectangular, with persons down rows and groups across columns.
Each column is a group, each row a person, and the cell = 1 if the person in that row belongs to that group. You can tell how many groups two people both belong to by comparing the rows: identify every place where both rows = 1, sum them, and you have the overlap.
A =
     1  2  3  4  5
  A  0  0  0  0  1
  B  1  0  0  0  0
  C  1  1  0  0  0
  D  0  1  1  1  1
  E  0  0  1  0  0
  F  0  0  1  1  0
35
Social Network Data Basic Data Elements Modes
Working with two-mode data
Compare persons A and F
Person A is in 1 group, Person F is in two
groups, and they are in no groups together.
Or persons D and F
Person D is in 4 groups, Person F is in two
groups, and they are in 2 groups together.
36
Social Network Data Basic Data Elements Modes
Working with two-mode data
Similarly for Groups
Group 1 has 2 members, group 2 has 2 members, and they overlap by 1 member (C).
37
Social Network Data Basic Data Elements Modes
Working with two-mode data
In general, you can get the overlap for any pair of groups / persons by summing the multiplied elements of the corresponding rows/columns of the persons-to-groups adjacency matrix A. That is:
Groups-to-Groups: G(k,l) = sum over persons i of A(i,k) * A(i,l)
Persons-to-Persons: P(i,j) = sum over groups k of A(i,k) * A(j,k)
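The overlap rule can be checked directly against the person-to-group matrix used in these slides. This is a minimal sketch; the function names are my own, and groups are numbered from 1 to match the column labels.

```python
# Person-to-group matrix from the slides: rows are persons A-F, columns are groups 1-5.
A = {
    "A": [0, 0, 0, 0, 1],
    "B": [1, 0, 0, 0, 0],
    "C": [1, 1, 0, 0, 0],
    "D": [0, 1, 1, 1, 1],
    "E": [0, 0, 1, 0, 0],
    "F": [0, 0, 1, 1, 0],
}


def person_overlap(p, q):
    """Number of groups persons p and q share: multiply the two rows cell by cell, then sum."""
    return sum(x * y for x, y in zip(A[p], A[q]))


def group_overlap(g, h):
    """Number of members groups g and h share: multiply the two columns cell by cell, then sum."""
    return sum(row[g - 1] * row[h - 1] for row in A.values())
```

Comparing a person or group with itself returns its own row or column total, for example `person_overlap("D", "D")` is the number of groups D belongs to.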
38
Social Network Data Basic Data Elements Modes
Working with two-mode data
One can get either projection easily with a little matrix multiplication. First define AT as the transpose of A (simply swap the rows and columns). If A is of size P x G, then AT will be of size G x P.
39
Social Network Data Basic Data Elements Modes
A (6x5) =
     1  2  3  4  5
  A  0  0  0  0  1
  B  1  0  0  0  0
  C  1  1  0  0  0
  D  0  1  1  1  1
  E  0  0  1  0  0
  F  0  0  1  1  0
AT (5x6) =
     A  B  C  D  E  F
  1  0  1  1  0  0  0
  2  0  0  1  1  0  0
  3  0  0  0  1  1  1
  4  0  0  0  1  0  1
  5  1  0  0  1  0  0
P = A(AT)
G = AT(A)
40
Social Network Data Basic Data Elements Modes
Theoretically, these two equations define what Breiger means by duality: "With respect to the membership network, ... persons who are actors in one picture (the P matrix) are with equal legitimacy viewed as connections in the dual picture (the G matrix), and conversely for groups." (p. 87)
The resulting network:
1) Is always symmetric
2) Has a diagonal that tells you how many groups (persons) a person (group) belongs to (has)
In practice, most network software (UCINET, Pajek) will do all of these operations. It is also simple to do the matrix multiplication in programs like SAS, SPSS, or R.
41
Social Network Data Network Data Sources
Existing data sources

Existing Sources of Social Network Data
There are lots of network data archived. Check
INSNA for a listing. The PAJEK data page
includes a number of exemplars for large-scale
networks.
2-Mode Data
One can construct networks from many different
data sources if you want to work with 2-mode
data. Any list can be so transformed.
Director interlocks
Protest event participation
Authors on papers
Words in documents
1-Mode Data
Local Network data
Fairly common, because it is easy to collect from
sample surveys.
GSS, NHSL, Urban Inequality Surveys, etc.
Pay attention to the question asked
Key features are (a) number of people named and
(b) whether alters are able to nominate each
other.

42
Social Network Data Network Data Sources
Existing data sources

Existing Sources of Social Network Data
1-Mode Data
Partial network data
Much less common, because cost goes up
significantly once you start tracing to contacts.
Snowball data: start with focal nodes and trace to contacts
CDC style data on sexual contact tracing
Limited snowball samples
Colorado Springs drug users data
Genealogy data
Small-world network samples
Limited boundary data: select data within a limited bound
Cross-national trade data
Friendships within a classroom
Family support ties

43
Social Network Data Network Data Sources
Existing data sources

Existing Sources of Social Network Data
1-Mode Data
Complete network data
Significantly less common and never perfect.
Start by defining a theoretically relevant
boundary
Then identify all relations among nodes within
that boundary
Co-sponsorship patterns among legislators
Friendships within strongly bounded settings
(sororities, schools)
Examples
Add Health on adolescent friendships
Hallinan data on within-school friendships
McFarland's data on verbal interaction
Electronic data on citations or coauthorship (see
Pajek data page)
See INSNA home page for many small-scale networks

44
Social Network Data Network Data Sources
Collecting network data
Boundary Specification Problem: Network methods describe positions in relevant social fields, where flows of particular goods are of interest. As such, boundaries are a fundamentally theoretical question about what you think matters in the setting of interest. See Marsden (19xx) for a good review of the boundary specification problem.
In general, there are usually relevant social foci that bound the relevant social field. We expect that social relations will be very clumpy. Consider the example of friendship ties within and between a high school and a junior high.
45
Social Network Data Network Data Sources
Collecting network data

Network data collection can be time consuming. It is better (I think) to have breadth over depth. Having detailed information on <50% of the sample will make it very difficult to draw conclusions about the general network structure.
Question format:
a) If you ask people to recall names (an open list format), fatigue will result in under-reporting.
b) If you ask people to check off names from a full list, you can often get over-reporting.
c) It is common to limit people to a small number of nominations (5). This will bias network measures, but is sometimes the best choice to avoid fatigue.
d) Concrete relational indicators (who did you talk to?) are better than attitudes that are harder to define (who do you like?).

46
Social Network Data Network Data Sources
Collecting network data
Boundary Specification Problem
While students were given the option to name friends in the other school, they rarely did. As such, the school likely serves as a strong substantive boundary.
47
Social Network Data Network Data Sources
Collecting network data

Local Network data
When using a survey, it is common to use an ego-network module.
First part: a name generator question to elicit a list of names
Second part: working through the list of names to get information about each person named
Third part: asking about relations among each person named

GSS Name Generator: "From time to time, most people discuss important matters with other people. Looking back over the last six months -- who are the people with whom you discussed matters important to you? Just tell me their first names or initials."

Why this question?
- Only time for one question
- Normative pressure and influence likely travel through strong ties
- Similar to best friend or other strong tie generators
Note: there are significant substantive problems with this name generator.

48
Social Network Data Network Data Sources
Collecting network data

Electronic Small World name generator

49
Social Network Data Network Data Sources
Collecting network data
Local Network data: The second part usually asks a series of questions about each person.
GSS example: "Is (NAME) Asian, Black, Hispanic, White or something else?"
ESWP example
This adds N x (number of attributes) questions to the survey.
50
Social Network Data Network Data Sources
Collecting network data
Local Network data: The third part usually asks about relations among the alters. Do this by looping over all possible combinations. If you are asking about a symmetric relation, then you can limit your questions to the n(n-1)/2 cells of one triangle of the adjacency matrix.
GSS: "Please think about the relations between the people you just mentioned. Some of them may be total strangers in the sense that they wouldn't recognize each other if they bumped into each other on the street. Others may be especially close, as close or closer to each other as they are to you. First, think about NAME 1 and NAME 2. A. Are NAME 1 and NAME 2 total strangers? B. Are they especially close? PROBE: As close or closer to each other as they are to you?"
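Looping over one triangle of the alter-by-alter matrix is the same as listing every unordered pair of alters once. A minimal sketch, assuming a respondent who named five alters (the placeholder names and the function name are hypothetical):

```python
from itertools import combinations


def alter_pair_questions(alters):
    """Return every unordered pair of alters once: one triangle of the adjacency matrix."""
    return list(combinations(alters, 2))


alters = ["NAME 1", "NAME 2", "NAME 3", "NAME 4", "NAME 5"]
pairs = alter_pair_questions(alters)

for first, second in pairs:
    # Each pair generates one block of survey items about that dyad.
    question = f"Are {first} and {second} total strangers?"
```

With five alters this yields 5 * 4 / 2 = 10 pairs, which is why the number of names allowed has such a large effect on survey length.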
52
Social Network Data Network Data Sources
Collecting network data

Snowball Samples
Snowball samples work much the same as
ego-network modules, and if time allows I
recommend asking at least some of the basic
ego-network questions, even if you plan to sample
(some of) the people your respondent names.
Start with a name generator, then any demographic
or relational questions.
Have a sample strategy
Random Walk designs (Klovdahl)
Strong tie designs
All names designs
Get contact information from the people named
Snowball samples are very effective at providing
network context around focal nodes. New work on
Respondent Driven Sampling (RDS) makes it
possible to get good representation even with
initially biased seed nodes.

http://www.respondentdrivensampling.org/reports/RDSrefs.htm
53
Social Network Data Network Data Sources
Collecting network data
Snowball Samples
54
Social Network Data Network Data Sources
Collecting network data

Complete Network data
Data collection is concerned with all relations
within a specified boundary.
Requires sampling every actor in the population
of interest (all kids in the class, all nations
in the alliance system, etc.)
The network survey itself can be much shorter,
because you are getting information from each
person (so ego does not report on alters).
Two general formats:
- Recall surveys ("Name all of your best friends")
- Check-list formats: give people a list of names, have them check off those with whom they have relations.

55
Social Network Data Network Data Sources
Collecting network data

Complete network surveys require a process that
lets you link answers to respondents.
You cannot have anonymous surveys.
Recall: need ID numbers and a roster to link, or hand-code names to find matches
Checklists: need a roster for people to check through

56
Social Network Data Network Data Sources
Collecting network data

Complete network surveys require a process that
lets you link answers to respondents.
Typically you have a number of data tradeoffs:
- Limited number of responses. Eases survey construction and coding, but lowers density and degree, which affects nearly every other system-level measure. Some evidence that people try to fill all of the slots.
- Name check-off roster (names down a row or on screen, relations as check-boxes). Easy in small settings or CADI, but encourages over-response. The Amy Willis Problem.
- Open recall list. Very difficult cognitively, and requires an extra name-matching step in analysis.
Think carefully about what you want to learn from
your survey items.

57
Social Network Data Network Data Sources Missing
Data
Whatever method is used, data will always be
incomplete. What are the implications for
analysis?
Example 1. People can name friends out of sample, but there is no way to match them (Add Health).
[Figure: two ego-network diagrams with nodes labeled Ego, M, and Out]
If the true network looks like this, you cannot distinguish it from this.
58
Social Network Data Network Data Sources Missing
Data
Example 2. Node population: 2-step neighborhood of Actor X. Relational population: any connection among all nodes.
[Figure: block adjacency matrix with rows and columns ordered focal actor (F), 1-step, 2-step, and 3-step contacts; blocks are marked Full, Full (0), or Unknown (UK)]
59
Social Network Data Network Data Sources Missing
Data
Example 3. Node population: 2-step neighborhood of Actor X. Relational population: trace, plus all connections among 1-step contacts.
[Figure: block adjacency matrix with rows and columns ordered focal actor (F), 1-step, 2-step, and 3-step contacts; blocks are marked Full, Full (0), or Unknown (UK)]
60
Social Network Data Network Data Sources Missing
Data
Example 4. Node population: 2-step neighborhood of Actor X. Relational population: only tracing contacts.
[Figure: block adjacency matrix with rows and columns ordered focal actor (F), 1-step, 2-step, and 3-step contacts; blocks are marked Full, Full (0), or Unknown (UK)]
61
Social Network Data Network Data Sources Missing
Data
Example 5. Node population: 2-step neighborhood from 3 focal actors. Relational population: all relations among actors.
[Figure: block adjacency matrix with rows and columns ordered Focal, 1-Step, 2-Step, 3-Step; blocks are marked Full, Full (0), or Unknown (UK)]
62
Social Network Data Network Data Sources Missing
Data
Example 6. Node population: 1-step neighborhood from 3 focal actors. Relational population: only relations from focal nodes.
[Figure: block adjacency matrix with rows and columns ordered Focal, 1-Step, 2-Step, 3-Step; blocks are marked Full, Full (0), or Unknown (UK)]
63
Social Network Data Network Data Sources Missing
Data
Summary: Data collection design and missing data affect the information at hand to draw conclusions about the system. Everything we do from now on is built on some manipulation of the observed adjacency matrix, so we want to understand which conclusions are valid and which are invalid due to systematic distortions in the data.
Statistical modeling tools hold promise. We can build models of networks that account for missing data: we are able to fix the structural zeros in our models by treating them as given. This then lets us infer to the world of all graphs with that same missing data structure. These models are very new, and not widely available yet.
64
Local Network Analysis Introduction

Local network analysis uses data from a simple ego-network survey. These might include information on relations among ego's contacts, but often do not. Questions include:

Population Mixing: the extent to which one type of person is tied to another type of person (race by race, etc.)
Local Network Composition: peer behavior, cultural milieu, opportunities or resources in the network, social support
Local Network Structure: network size, density, holes and constraint, concurrency
Dyadic behavior: frequency of contact, interaction content, specific exchange behaviors, dyadic similarity
65
Local Network Analysis Introduction

Advantages
- Cost: data are easy to collect and can be sampled
- Methods are relatively simple extensions of common variable-based methods social scientists are already familiar with
- Provides information on the local network context, which is often the primary substantive interest
- Can be used to describe general features of the global network context: population mixing, concurrency, exchange frequency, etc.
Disadvantages
- Treats each local network as independent, which is false. The poor performance of number of partners for predicting STD spread is a clear example.
- Impossible to account for how position in a larger context affects local network characteristics (popular with whom?)
- If structure matters, ego-networks strongly limit the information you can get on overall structure

66
Local Network Analysis Introduction
Local
67
Local Network Analysis Introduction
Global
68
Local Network Analysis Network Composition
Perhaps the simplest network question is what
types of alters does ego interact with?
Network composition refers to the distribution of
types of people in your network.

Networks tend to be more homogeneous than the population. Using the GSS, Marsden reports heterogeneity in Age, Education, Race and Gender. He finds that:
- Age distribution is fairly wide, almost evenly distributed, though lower than the population at large
- Homogeneous by education (30% differ by less than a year, on average)
- Very homogeneous with respect to race (96% are single race)
- Heterogeneous with respect to gender

69
Local Network Analysis Network Composition
Claude Fischer's book To Dwell Among Friends is a classic study of urbanism that makes good use of local network data.
Age heterogeneity varies by ego's age and across urban settings.
70
Local Network Analysis Network Composition
Claude Fischer's book To Dwell Among Friends is a classic study of urbanism that makes good use of local network data.
Marital composition similarly varies across respondents and settings.
71
Local Network Analysis Network Composition
Calculating network composition using GSS style data.
Generally you have a separate variable for each alter characteristic, and you can construct items by summing over the relevant variables. You would, for example, have variables on age of each alter such as:
  Age_alt1  age_alt2  age_alt3  age_alt4  age_alt5
  15        35        20        12        .
You get the mean age, then, with a statement such as:
  meanage = mean(Age_alt1, age_alt2, age_alt3, age_alt4, age_alt5)
Be sure you know how the program you use (SAS, SPSS) deals with missing data.
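To make the missing-data warning concrete, here is a small Python sketch of the same calculation, in which the fifth alter's age is missing. The choice to skip missing values is an assumption of this sketch, and the helper name is my own; the behavior of your own package should be checked rather than assumed.

```python
def mean_ignoring_missing(values):
    """Average the observed values; return None if every value is missing."""
    observed = [v for v in values if v is not None]
    if not observed:
        return None
    return sum(observed) / len(observed)


# One respondent's alter ages from the slide; None stands in for the missing value (".").
respondent = {
    "age_alt1": 15,
    "age_alt2": 35,
    "age_alt3": 20,
    "age_alt4": 12,
    "age_alt5": None,
}

meanage = mean_ignoring_missing(respondent.values())
```

Here the mean is taken over the four observed alters (82 / 4 = 20.5). Treating the missing value as zero would instead give 82 / 5 = 16.4, which is why the handling of missing data matters.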
72
Local Network Analysis Network Composition
Calculating local network information from global
network data

We often want to construct local-level measures from global-level data. This involves a number of steps, but opens more opportunities than GSS-style data.
1) Define the local neighborhood
- Distance (1-step, 2-steps, what?)
- Direction of tie: sent, received, or both?
2) Pull the relevant alters
3) Match the alters to the variables of interest
Once you decide on a type of tie, you need to get the information of interest in a form similar to that in the example above.
A number of programs do this for you automatically (SPAN, R, etc.)
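The three steps can be sketched for a 1-step neighborhood as follows. The names, nominations, and smoking attribute are hypothetical, and this is not how SPAN or any particular package implements it; it only illustrates how the choice of tie direction changes the result.

```python
def local_alters(ego, arcs, direction="both"):
    """Steps 1 and 2: pull ego's 1-step alters by direction of tie."""
    sent = {receiver for sender, receiver in arcs if sender == ego}
    received = {sender for sender, receiver in arcs if receiver == ego}
    if direction == "sent":
        alters = sent
    elif direction == "received":
        alters = received
    elif direction == "both":
        alters = sent | received
    else:
        raise ValueError("direction must be 'sent', 'received', or 'both'")
    return alters - {ego}


def proportion_of_alters(ego, arcs, attribute, direction="both"):
    """Step 3: match alters to a 0/1 attribute and average it."""
    alters = local_alters(ego, arcs, direction)
    if not alters:
        return None  # isolates have no local composition
    return sum(attribute[a] for a in alters) / len(alters)


# Hypothetical global network (sender, receiver) and a hypothetical 0/1 attribute.
arcs = [("ann", "bob"), ("bob", "ann"), ("cat", "ann"), ("ann", "dan"), ("dan", "eve")]
smokes = {"ann": 0, "bob": 1, "cat": 0, "dan": 1, "eve": 1}

share_sent = proportion_of_alters("ann", arcs, smokes, "sent")
share_received = proportion_of_alters("ann", arcs, smokes, "received")
```

In this toy network every friend ann names smokes, but only half of those who name her do, so the definition of the neighborhood is a substantive decision.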

73
Local Network Analysis Network Composition
An example network: all senior males from a small (n = 350) public HS.
SPAN will do this for you.
74
Local Network Analysis Network Composition

Common composition measures
Level measures
- Mean of a given attribute (average income of alters)
- Proportion with a particular attribute (proportion who smoke)
- Counts (number of peers who have had sex)
Dispersion measures
- Heterogeneity index (racial heterogeneity)
- Index of dissimilarity
- Standard deviation
- Absolute value of the differences
- Variable range of values
Composition measures for multiple variables simultaneously
- Average correlation across all alters
- Euclidean / Mahalanobis distance measures
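A few of these measures can be sketched for a single ego as follows. The alter data are hypothetical, and the heterogeneity index is assumed here to be the common 1 minus the sum of squared proportions form; the slides do not say which formula is intended.

```python
from collections import Counter
from statistics import mean, pstdev


def proportion(values, category):
    """Share of alters in a given category."""
    return sum(v == category for v in values) / len(values)


def heterogeneity(values):
    """1 - sum of squared category proportions (assumed form of the index)."""
    n = len(values)
    return 1 - sum((count / n) ** 2 for count in Counter(values).values())


# Hypothetical alters named by one ego.
alter_income = [20, 40, 60]
alter_race = ["white", "white", "black", "hispanic"]

level = mean(alter_income)                    # level measure: mean attribute
share_white = proportion(alter_race, "white")  # level measure: proportion
spread = pstdev(alter_income)                 # dispersion: standard deviation
diversity = heterogeneity(alter_race)         # dispersion: heterogeneity index
```

The heterogeneity index is 0 when all alters fall in one category and rises toward 1 as alters spread evenly over more categories.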

75
Local Network Analysis Network Mixing
A common interest in network research is identifying how likely persons of one category are to interact with people of another category. Examples:
- Race mixing: how likely are people of one race to interact with people of another?
- Sexual activity mixing: are people with many partners likely to associate with each other?
- Neighborhood / location mixing: are people likely to name friends from the same neighborhood?
These questions can be answered by cross-classifying the category of the nominator with the category of the nominated in a mixing matrix.
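Cross-classifying nominators by nominated is a simple counting exercise. A minimal sketch with hypothetical people, categories, and nominations (rows are the nominator's category, columns the nominated person's category):

```python
from collections import Counter


def mixing_matrix(nominations, category):
    """Count ties by (nominator category, nominated category)."""
    counts = Counter((category[sender], category[receiver]) for sender, receiver in nominations)
    labels = sorted(set(category.values()))
    matrix = [[counts[(row, col)] for col in labels] for row in labels]
    return labels, matrix


# Hypothetical data: four people in two categories, six nominations.
category = {"p1": "X", "p2": "X", "p3": "Y", "p4": "Y"}
nominations = [
    ("p1", "p2"), ("p2", "p1"), ("p1", "p3"),
    ("p3", "p4"), ("p4", "p3"), ("p4", "p1"),
]

labels, matrix = mixing_matrix(nominations, category)
```

Diagonal cells count within-category ties and off-diagonal cells count ties that cross categories, so a heavy diagonal signals assortative mixing.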
76
Local Network Analysis Network Mixing
Race mixing in one of the Add Health schools
77
Local Network Analysis Network Mixing
             White   Black   Hispan   Asian   Mix/Other
White         1099     128       53       0         231
Black           97   10218     1032       0         539
Hispanic        54     961      104       1          91
Asian            0       0        0       0           0
Mix/Other      191     560       66       0         106
78
Local Network Analysis Network Mixing

Working with mixing matrices
- Group segregation index (Freeman 1972)
- Associations between rows and columns (valued relations)
- Assortative mixing
- Correlations or Q
- Log-linear models
Assessing chance levels depends on the data available. If you have full network data you can look at density between groups; without it, you can only focus on the sheer volume of ties (without information on the size of the target groups).

79
Local Network Analysis Network Structure

While network structure data are limited, there are a number of features that can be of interest, assuming you have data on the relations among ego's contacts.
Basic arguments:
- Structural amplification: that some feature of the arrangement of ties amplifies any peer effect of network composition (see Haynie's paper)
- Network range effects: that being connected to a diverse set of alters, who are not connected to each other, provides profitable returns. Granovetter's Strength of Weak Ties, Burt's Structural Holes. Familiar to students of social theory as the Tertius Gaudens argument from Simmel.
In both cases, we use the pattern of ties surrounding ego to characterize the local structure. We start with volume measures, then move on to more complex pattern measures.

80
Local Network Analysis Network Structure volume
Network Size
Mean size: 1985 = 2.9, 2004 = 2.1
From time to time, most people discuss important matters with other people. Looking back over the last six months, who are the people with whom you discussed matters important to you? Just tell me their first names or initials. IF LESS THAN 5 NAMES MENTIONED, PROBE: Anyone else?
81
Local Network Analysis Network Structure volume
Network Size by:
Age: Drops with age at an increasing rate. Elderly have few close ties.
Education: Increases with education. College degree: 1.8 times larger.
Sex (Female): No gender differences on network size.
Race: African Americans' networks are smaller (2.25) than White networks (3.1).
82
Local Network Analysis Network Structure volume
What does Fischer have to say about the size of
local nets (by context)?
83
Local Network Analysis Network Structure volume
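Density is the average value of the relation among all pairs of nodes: D = T / (N(N-1)/2), where T is the number of ties and N is the number of nodes. Density is usually calculated over the alters in the network.
[Figure: ego network of respondent R with alters 1-5]
D = 5 / ((5 × 4)/2) = 5 / 10 = 0.5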
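The density calculation can be sketched as follows. The diagram is not available in this transcript, so the five ties among the alters below are made up to match the counts in the worked example (5 alters, 5 ties); `ego_density` is a hypothetical helper for undirected, binary ties.

```python
from itertools import combinations

def ego_density(alters, ties):
    """Share of possible alter-alter pairs that are actually tied.

    `ties` is a set of frozenset pairs; ego is excluded from both arguments.
    """
    n = len(alters)
    if n < 2:
        raise ValueError("density is undefined with fewer than two alters")
    possible = n * (n - 1) / 2
    observed = sum(1 for pair in combinations(alters, 2)
                   if frozenset(pair) in ties)
    return observed / possible

alters = [1, 2, 3, 4, 5]
# Assumed ties (a ring); any 5 ties among the 5 alters give the same density.
ties = {frozenset(p) for p in [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5)]}
print(ego_density(alters, ties))
```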
84
Local Network Analysis Network Structure volume
What does Fischer have to say about the density
of local nets (by context)?
85
Local Network Analysis Network Structure volume
GSS Density
86
Local Network Analysis Network Structure volume

In general, dense networks should be more cohesive, and we would expect goods to flow through the network more efficiently.
Social support and peer influence, for example, should be stronger in dense networks.
Density is a volume measure, however, and can
mask significant structural differences

These two networks have the same density but very
different structures. Most network analysis
programs will calculate ego-network density
directly.
87
Local Network Analysis Network Structure Weak
Ties Structural Holes
The Strength of Weak Ties: In a classic
article, Granovetter (1973) argues that for many
purposes (such as getting a job), the most useful
network contacts are through weak ties. This
is because weak ties connect you to a more
diverse set of alters, increasing the range of
your network. Your strong ties tend to be tied
to each other, making them redundant for the
purposes of bringing information. Essentially
this argument works on a spurious relation. The
key value of weak ties is not in the weak
affective bond, but in the structural location of
the ties. We can measure this directly, and Ron
Burt provides a series of measures for doing so.
88
Local Network Analysis Network Structure Weak
Ties Structural Holes
[Figure: number of non-redundant contacts plotted against number of contacts, with regions marked maximum, decreasing, increasing, and minimum efficiency]
89
Local Network Analysis Network Structure Weak
Ties Structural Holes
Effective Size
Conceptually, the effective size is the number of people ego is connected to, minus the redundancy in the network; that is, it reduces to the non-redundant elements of the network. Effective size = Size - Redundancy:
$ES_i = \sum_j \left[ 1 - \sum_q p_{iq} m_{jq} \right], \quad q \ne i, j$
where j indexes all of the people that ego i has contact with, and q is every third person other than i or j. The quantity ($\sum_q p_{iq} m_{jq}$) inside the brackets is the level of redundancy between ego and a particular alter, j.
90
Local Network Analysis Network Structure Weak
Ties Structural Holes
Effective Size
p_iq is the proportion of actor i's relations that are spent with q.
Adjacency
     1  2  3  4  5
1    0  1  1  1  1
2    1  0  0  0  1
3    1  0  0  0  0
4    1  0  0  0  1
5    1  1  0  1  0
[Figure: network diagram of nodes 1-5]
91
Local Network Analysis Network Structure Weak
Ties Structural Holes
Effective Size
m_jq is the marginal strength of contact j's relation with contact q, which is j's interaction with q divided by j's strongest interaction with anyone. For a binary network, the strongest link is always 1, and thus m_jq reduces to 0 or 1 (whether j is connected to q or not). The sum of the product p_iq m_jq measures the portion of i's relation with j that is redundant to i's relation with other primary contacts.
92
Local Network Analysis Network Structure Weak
Ties Structural Holes
Effective Size
Working with 1 as ego, we get the following redundancy levels:
P       1    2    3    4    5
1     .00  .25  .25  .25  .25
2     .50  .00  .00  .00  .50
3     1.0  .00  .00  .00  .00
4     .50  .00  .00  .00  .50
5     .33  .33  .00  .33  .00
p_1q m_jq   1    2    3    4    5
1         ---  ---  ---  ---  ---
2         ---  .00  .00  .00  .25
3         ---  .00  .00  .00  .00
4         ---  .00  .00  .00  .25
5         ---  .25  .00  .25  .00
Redundancy = 1; Effective size = 4 - 1 = 3
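The worked example for ego 1 can be reproduced with a short sketch. It assumes a symmetric, binary adjacency matrix, so p is just the row-normalized matrix; node 1 is index 0. `effective_size` is a hypothetical helper, not a function from any of the packages named in these slides.

```python
# Adjacency matrix from the example network (nodes 1-5 are indices 0-4).
A = [
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 0],
]

def effective_size(A, i):
    """Sum over ego's alters j of 1 - sum_q p_iq * m_jq, with q != i, j."""
    n = len(A)
    # p[a][b]: share of a's relations spent with b (assumes symmetric ties).
    p = [[A[a][b] / sum(A[a]) if sum(A[a]) else 0.0 for b in range(n)]
         for a in range(n)]
    # m[a][b]: a's tie to b divided by a's strongest tie (0 or 1 if binary).
    m = [[A[a][b] / max(A[a]) if max(A[a]) else 0.0 for b in range(n)]
         for a in range(n)]
    alters = [j for j in range(n) if j != i and A[i][j] > 0]
    total = 0.0
    for j in alters:
        redundancy_j = sum(p[i][q] * m[j][q]
                           for q in range(n) if q not in (i, j))
        total += 1 - redundancy_j
    return total

print(effective_size(A, 0))  # ego 1: size 4 minus redundancy 1
```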
93
Local Network Analysis Network Structure Weak
Ties Structural Holes
Effective Size
When you work it out, in a binary network, redundancy reduces to the average degree of ego's alters, not counting ties with ego. Since the average degree is simply another way to say density, we can calculate redundancy as 2t/n, where t is the number of ties (not counting ties to ego) and n is the number of people in the network (not counting ego). This means that effective size = n - 2t/n.
UCINET, STRUCTURE, SPAN and PAJEK all calculate effective size.
94
Local Network Analysis Network Structure Weak
Ties Structural Holes
Efficiency is simply effective size divided by observed size. Taken from each ego's point of view, efficiency in this network would be:
Ego   Size   Effective Size   Efficiency
1     4      3                .75
2     2      1                .50
3     1      1                1.00
4     2      1                .50
5     3      1.67             .56
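The efficiency table can be checked with the binary shortcut, effective size = n - 2t/n. The sketch below is illustrative only; it hard-codes the example adjacency matrix (nodes 1-5 as indices 0-4), and `ego_summary` is a hypothetical helper.

```python
# Adjacency matrix from the example network (nodes 1-5 are indices 0-4).
A = [
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 0],
]

def ego_summary(A, i):
    """Return (size, effective size, efficiency) for ego i, binary ties."""
    alters = [j for j in range(len(A)) if j != i and A[i][j]]
    n = len(alters)
    if n == 0:
        raise ValueError("isolates have no ego network")
    # t: ties among the alters, not counting ties to ego.
    t = sum(A[a][b] for k, a in enumerate(alters) for b in alters[k + 1:])
    effective = n - 2 * t / n
    return n, effective, effective / n

for ego in range(len(A)):
    size, eff, efficiency = ego_summary(A, ego)
    print(ego + 1, size, round(eff, 2), round(efficiency, 2))
```

Isolates are rejected explicitly, since none of these measures are defined without alters.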
95
Local Network Analysis Network Structure Weak
Ties Structural Holes
Constraint
Conceptually, constraint refers to how much room
you have to negotiate or exploit potential
structural holes in your network.
"...opportunities are constrained to the extent that (a) another of your contacts q, in whom you have invested a large portion of your network time and energy, has (b) invested heavily in a relationship with contact j." (p. 54)
96
Local Network Analysis Network Structure Weak
Ties Structural Holes
Constraint
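$c_{ij} = \left( p_{ij} + \sum_{q \ne i,j} p_{iq} p_{qj} \right)^2$
where p_ij is the direct investment and $\sum_q p_{iq} p_{qj}$ is the indirect investment.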
97
Local Network Analysis Network Structure Weak
Ties Structural Holes
Constraint
Given the P matrix, you can get indirect constraint ($\sum_q p_{iq} p_{qj}$) by simply squaring the matrix, that is, taking the matrix product PP:
P       1    2    3    4    5
1     .00  .25  .25  .25  .25
2     .50  .00  .00  .00  .50
3     1.0  .00  .00  .00  .00
4     .50  .00  .00  .00  .50
5     .33  .33  .00  .33  .00
PP      1     2     3     4     5
1     ...  .083  .000  .083  .250
2    .165   ...  .125  .290  .125
3    .000  .250   ...  .250  .250
4    .165  .290  .125   ...  .125
5    .330  .083  .083  .083   ...
98
Local Network Analysis Network Structure Weak
Ties Structural Holes
Constraint
Total constraint between any two people then is
C = (P + P^2)^(2)
where P is the normalized adjacency matrix, P^2 is the matrix product PP, and the outer exponent (2) means to square the elements of the matrix.
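The matrix form of constraint can be sketched in plain Python. This is an illustration under stated assumptions: ties are symmetric and binary, P is the row-normalized adjacency matrix of the example network, and `constraint_matrix` is a hypothetical helper. Because the diagonal of P is zero, the off-diagonal cells of PP equal the sum over third parties q other than i and j.

```python
# Adjacency matrix from the example network (nodes 1-5 are indices 0-4).
A = [
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 0],
]

def constraint_matrix(A):
    """Element-wise square of (P + PP), with the diagonal set to zero."""
    n = len(A)
    # P: each row of A divided by its row sum (all zeros for isolates).
    P = [[A[i][j] / sum(A[i]) if sum(A[i]) else 0.0 for j in range(n)]
         for i in range(n)]
    # PP: ordinary matrix product, giving the indirect investment.
    PP = [[sum(P[i][q] * P[q][j] for q in range(n)) for j in range(n)]
          for i in range(n)]
    return [[(P[i][j] + PP[i][j]) ** 2 if i != j else 0.0
             for j in range(n)] for i in range(n)]

C = constraint_matrix(A)
print([round(c, 2) for c in C[0]])   # dyadic constraint on ego 1
print(round(sum(C[0]), 2))           # total constraint on ego 1
```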
99
Local Network Analysis Network Structure Weak
Ties Structural Holes
Hierarchy
Conceptually, hierarchy (for Burt) is really the extent to which constraint is concentrated in a single actor. It is calculated as
[Equation not preserved in the transcript]
Note: this measure says nothing about the direction of ties; it is not about asymmetry.
100
Local Network Analysis Network Structure Weak
Ties Structural Holes
Hierarchy
Alter          2     3     4     5   Total
C            .11   .06   .11   .25     .53
C / mean(C)  .83   .46   .83   1.9
H = .514
101
Local Network Analysis Network Structure Weak
Ties Structural Holes
Burt (2004) AJS 110:349-399
102
Local Network Analysis Network Structure Weak
Ties Structural Holes
Burt (2004) AJS 110:349-399
103
Local Network Analysis Network Structure Weak
Ties Structural Holes
Burt (2004) AJS 110:349-399
104
Local Network Analysis Local Network Models
Modeling Issues

Local Network modeling issues
Case independence
In very clustered settings, the alters that each
person names will overlap. This will lead to
non-independence among the cases.
If you have enough cases or over time data, you
can use random or fixed effect models
If you know the names of alters, you can link
them to build in a direct network autocorrelation
effect.
Small network effects
Be aware of the size of your networks. Substantively, having a 50% white network means something different in a net of size 2 vs. a net of size 10. I often suggest interactions to check for these kinds of effects.
Dealing with isolates
Isolated nodes have no network alters, so none of
these measures apply. Depending on the context,
you can either leave them out of the analysis, or
use interaction terms to selectively apply the
measures of interest.

105
Local Network Analysis Local Network Models
Modeling Issues

Selection
That some unobserved factor, z, creates both
friendships and the outcome of interest.
Endogeneity
That the causal order of peer relations and
outcomes is reversed. Peers do not cause Y, but
Y causes friendship relations

106
Local Network Analysis Local Network Models
Modeling Issues
Selection

What do we know about how friendships form?

Opportunity / focal factors
- Being members of the same group
- In the same class
- On the same team
- Members of the same church
Structural Relationship factors
- Reciprocity
- Social Balance
Behavior Homophily
- Smoking
- Drinking

107
Local Network Analysis Local Network Models
Modeling Issues
Selection
How to correct this problem?

Essentially, this is an omitted variable problem, and the obvious solution has been to identify as many potentially relevant alternative variables as you can find.
Sensitivity measures (see Ken Frank's work here)
Propensity score matching
Individual-level fixed effect models
Substantively, you only look at change in Y as a function of change in X, holding constant (because dummied out) any individual-level effect.
This works, but it's drastic. Any endogenous effect of networks on the self is essentially removed.
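The individual-level fixed effect idea can be sketched as a within-person (demeaned) slope, which gives the same coefficient as including a dummy for each individual. The data below are made up: each person has a different baseline, and Y changes by 2 per unit of X within each person. `within_slope` is a hypothetical helper.

```python
from collections import defaultdict

def within_slope(rows):
    """Slope of y on x using only within-person deviations from person means."""
    by_person = defaultdict(list)
    for pid, x, y in rows:
        by_person[pid].append((x, y))
    sxy = sxx = 0.0
    for obs in by_person.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        # Demeaning removes anything constant for this person.
        sxy += sum((x - mx) * (y - my) for x, y in obs)
        sxx += sum((x - mx) ** 2 for x, _ in obs)
    return sxy / sxx

# Hypothetical panel: (person, x, y), two observations per person.
rows = [("a", 1, 10), ("a", 2, 12), ("b", 5, 3), ("b", 6, 5)]
print(within_slope(rows))
```

In this toy panel the pooled relation between x and y is negative, because person b has high x and low y, yet the within-person slope is positive. That is the sense in which stable individual differences are held constant.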

108
Local Network Analysis Local Network Models
Modeling Issues
Endogeneity
Estimated: Y = b0 + b1(P) + e, where P = some peer function. But the actual model may really be: P = b0 + b1(f(Y)) + e
109
Local Network Analysis Local Network Models
Modeling Issues
Endogeneity
Does it matter?
Algebraically, the relation between Y and P should be a direct translation of the coefficients, since Y = b0 + b1(P) implies P = -b0/b1 + (1/b1)Y.
The statistical problem of endogeneity is that when you estimate the reversed equation, its coefficient does not equal 1/b1, because of our assumptions about x, and hence e.
There are other models that make different
assumptions, where this direction is irrelevant.
But they are uncommon and hard to work with in
the multivariate context.
(see Joel H. Levine, Exceptions are the Rule, for
a full discussion of this)
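The claim that the reversed estimate is not 1/b1 can be checked numerically. The sketch below uses ordinary least squares on made-up numbers; `ols_slope` is a hypothetical helper. With OLS, the product of the two slopes equals the squared correlation, so the reversed slope matches 1/b1 only when the fit is perfect.

```python
def ols_slope(x, y):
    """Least-squares slope from regressing y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical peer measure P and outcome Y for five people.
p = [1, 2, 3, 4, 5]
y = [1, 4, 2, 6, 5]

b_yp = ols_slope(p, y)   # slope when Y is the outcome
b_py = ols_slope(y, p)   # slope when P is the outcome
print(b_yp, 1 / b_yp, b_py)
```

Here the algebraic inverse of the first slope and the estimated reverse slope differ, because each regression puts all of the error on its own dependent variable.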
110
Local Network Analysis Local Network Models
Modeling Issues
Possible solutions

Theory: Given what we know about how friendships form, is it reasonable to assume a bi-directional cause? That is, work through the meeting, socializing, etc. process and ask whether it makes sense that Y is a cause of P.
Models
- Time Order. We are on somewhat firmer ground if P precedes Y in time.
- Simultaneous Equation Models. Model both the friendship pattern and the outcome of interest simultaneously. It is difficult to identify instruments or to specify orders that do not logically make the model inestimable.

111
Local Network Analysis Local Network Models Peer
influence example

Haynie asks whether peers matter for delinquent
behavior, focusing on
a) the distinction between selection and
influence
b) the effect of friendship structure on peer
influence
Two basic theories underlie her work
a) Hirschi's Social Control Theory
Social bonds constrain otherwise criminal
behavior
The theory itself is largely ambivalent toward
direction of network effects
b) Sutherland's Differential Association
Behavior is the result of internalized
definitions of the situation
The effect of peers is through communication of
the appropriateness of particular behaviors
Haynie adds to these the idea that the structural context of the network can boost the effect of peers: (a) transmission is more effective in locally dense networks, and (b) the effect of peers is stronger on central actors.