Title: The statistical analysis of personal network data
1 The statistical analysis of personal network data
- I. Cross-sectional analysis
- II. Dynamic analysis
- Miranda Lubbers,
- Autonomous University of Barcelona
2 Sociocentric networks
Sociocentric or complete networks consist of
the set of relations among the actors of a
defined group (e.g., a school class, a firm).
3 Personal networks
A personal network consists of the set of
relations a focal person (ego) has with an
unconstrained set of others (alters) and the
relations among them.
4 Egonet, software to aid the collection of personal network data
- Information about the respondent (ego), e.g., age, sex, nationality
- Information about the associates (alters) to whom ego is connected, e.g., alters' age, sex, nationality
- Information about the ego-alter pairs, e.g., closeness, frequency and/or means of contact, time of knowing, geographic distance, whether they discuss a certain topic, type of relation (e.g., family, friend, neighbour, workmate)
- Information about the relations among alters as perceived by ego (simply whether they are related or not, or strong/weak/no relation)
5 The statistical analysis of personal versus sociocentric networks: what are the differences?
- Whereas sociocentric network researchers often (yet not always) concentrate on a single network, personal network researchers typically investigate a sample of networks (ideally a random, representative sample).
- The dependency structure of sociocentric networks is complex, which creates the need for specialized social network software. Personal network researchers have so far hardly used the data on alter-alter relations, so the dependency structure of their data is simpler...
6 Personal network data have a multilevel structure
- E.g., a sample of 100 respondents; for each respondent, data on 45 alters were collected, giving 4500 alters in total.
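A minimal sketch of what this nested structure looks like as data, assuming a long-format layout with one row per ego-alter tie. The field names `ego_id` and `alter_id` are illustrative and not taken from Egonet.

```python
# Hypothetical long-format layout: one row per ego-alter tie.
N_EGOS = 100          # respondents (level 2), as in the example above
ALTERS_PER_EGO = 45   # alters named by each respondent (level 1)

ties = [
    {"ego_id": ego, "alter_id": alter}
    for ego in range(1, N_EGOS + 1)
    for alter in range(1, ALTERS_PER_EGO + 1)
]

# Group the level-1 rows by their level-2 unit (the ego).
alters_by_ego = {}
for row in ties:
    alters_by_ego.setdefault(row["ego_id"], []).append(row["alter_id"])

n_rows = len(ties)               # total number of alters across all networks
n_networks = len(alters_by_ego)  # number of personal networks
print(n_rows, n_networks)
```

Every alter row carries the identifier of the ego it belongs to, which is what allows an analysis to respect the nesting later on.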
7 For cross-sectional analysis, three types of analysis have been used in past research
- Type I: Aggregated analysis
- Type II: Disaggregated analysis (not okay, forget about it quickly!)
- Type III: Multilevel analysis
8 Type 1: Aggregated analysis
- First, aggregate all information to the ego level (this can be exported directly from Egonet).
- Compositional variables: aggregated characteristics of alters or ego-alter relations (e.g., percentage of women, average closeness, average distance between ego and his or her nominees, ...).
- Then use standard statistical procedures to, e.g.:
- Describe the network size and/or composition, or compare it across populations.
- Explain the size and/or composition of the networks (network as a dependent variable) with, for example, regression analysis (e.g., in SPSS, R).
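The aggregation step can be sketched as follows. This is only an illustration with invented records; the field names and the helper `aggregate_to_ego_level` are hypothetical and do not reflect Egonet's export format.

```python
# Hypothetical alter-level records for two egos (all values invented).
alters = [
    {"ego_id": 1, "sex": "F", "closeness": 4},
    {"ego_id": 1, "sex": "M", "closeness": 2},
    {"ego_id": 1, "sex": "F", "closeness": 3},
    {"ego_id": 2, "sex": "M", "closeness": 1},
    {"ego_id": 2, "sex": "M", "closeness": 3},
]


def aggregate_to_ego_level(rows):
    """Return one record per ego with simple compositional variables."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["ego_id"], []).append(row)
    result = {}
    for ego_id, members in grouped.items():
        n = len(members)
        n_women = sum(1 for m in members if m["sex"] == "F")
        result[ego_id] = {
            "network_size": n,
            "pct_women": 100.0 * n_women / n,
            "avg_closeness": sum(m["closeness"] for m in members) / n,
        }
    return result


ego_level = aggregate_to_ego_level(alters)
print(ego_level)
```

The result has one row per ego, so standard procedures that assume independent observations can be applied to it.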
9 Regression analysis
- In simple linear regression, the model that describes the relation between a single dependent variable y and a single explanatory variable x is
- $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
- $\beta_0$ and $\beta_1$ are referred to as the model parameters, and $\varepsilon$ is a probabilistic error term that accounts for the variability in y that cannot be explained by the linear relationship with x.
10 Regression analysis
- Simple linear regression:
- $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
- More explanatory variables can be added:
- $y_i = \beta_0 + \sum_p \beta_p x_{ip} + \varepsilon_i$
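As a rough illustration of the simple (one-predictor) case, the least-squares estimates can be computed in closed form. The ego-level data below are invented, and `fit_simple_ols` is a hypothetical helper; in practice a statistics package would be used, especially with several explanatory variables.

```python
def fit_simple_ols(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1 * x + error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1


# Invented ego-level data: ego's age (x) and network size (y).
age = [20, 30, 40, 50, 60]
network_size = [50, 46, 42, 38, 34]

b0, b1 = fit_simple_ols(age, network_size)
print(b0, b1)
```

Here each observation is one ego, which matches the aggregated analysis: the dependent variable is a property of the whole network.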
11 Illustration: aggregate analysis
- S. G. B. Roberts, R. I. M. Dunbar, T. V. Pollet, & T. Kuppens (2009). Exploring variation in active network size: Constraints and ego characteristics. Social Networks, 31, 138-146.
12 Illustration: explaining personal network size
1. Explaining unrelated network size
13 Illustration: explaining personal network size
2. Explaining related network size
14 Regression analysis at the aggregate level
- Is statistically correct, provided that you do not make any cross-level inferences (ecological fallacy).
15 Hypothetical illustration of why cross-level inferences should not be made on the basis of aggregate results
- I ask three persons to name ten friends each.
- I further ask what the sex of each friend is and how close they feel to each friend on a scale from 0 (not close at all) to 4 (very close).
- My question is: do persons who have many women in their networks feel closer to their network members?
16 Example: Statistical relation at aggregate level cannot be interpreted at tie level
17 Example: Statistical relation at aggregate level cannot be interpreted at tie level
18 Example: Statistical relation at aggregate level cannot be interpreted at tie level
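The point can be made concrete with a small sketch. The numbers below are invented for this purpose (three egos, ten friends each) and are not the data shown on the original slides. At the aggregate level, networks with more women have higher average closeness; within every network, however, ties with women are less close than ties with men.

```python
def mean(values):
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Invented data: (is_woman, closeness 0-4) for the ten friends of each ego.
networks = {
    "ego_A": [(1, 0)] * 2 + [(0, 1)] * 8,
    "ego_B": [(1, 1)] * 5 + [(0, 3)] * 5,
    "ego_C": [(1, 3)] * 8 + [(0, 4)] * 2,
}

# Aggregate level: proportion of women and average closeness per network.
prop_women = [mean([w for w, _ in ties]) for ties in networks.values()]
avg_closeness = [mean([c for _, c in ties]) for ties in networks.values()]
aggregate_r = pearson(prop_women, avg_closeness)

# Tie level: closeness with women minus closeness with men, per network.
within_gap = {
    ego: mean([c for w, c in ties if w == 1]) - mean([c for w, c in ties if w == 0])
    for ego, ties in networks.items()
}
print(aggregate_r, within_gap)
```

A strongly positive aggregate correlation therefore says nothing about whether individual ties with women are closer; in this toy data they are not.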
19 Type 2: Disaggregated analysis
- Disaggregated analysis of dyadic relations (e.g., a linear regression analysis on the 4500 alters) is statistically incorrect, even though it has been done (e.g., Wellman et al., 1997; Suitor et al., 1997).
- Observations of alters are not statistically independent, as standard statistical procedures assume.
- If the observations of one respondent are correlated, standard errors will be underestimated and, consequently, significance will be overestimated.
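One common way to get a feel for the size of the problem is the Kish design effect, 1 + (m - 1) × ICC, for clusters of equal size m. The sketch below applies it to the 100 × 45 example; the intraclass correlation of 0.10 is an assumed value, and the formula strictly concerns the variance of a mean, so for regression coefficients it is only a rough guide.

```python
def design_effect(cluster_size, icc):
    """Kish design effect for equally sized clusters."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)


# 100 egos with 45 alters each; the ICC of 0.10 is an assumed value.
deff = design_effect(cluster_size=45, icc=0.10)
n_eff = effective_sample_size(n_total=4500, cluster_size=45, icc=0.10)
print(deff, n_eff)
```

Under these assumptions the 4500 alters carry roughly the information of about 830 independent observations, which is why treating them as 4500 independent cases overstates significance.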
20 Type 3: Multilevel analysis
- Multilevel analysis is a generalization of linear regression in which the variance in outcome variables can be analyzed at multiple hierarchical levels. In our case, alters (level 1) are nested within egos / networks (level 2), hence the variance is decomposed into variance between and within networks.
- The regression equation $y_i = \beta_0 + \beta_1 x_i + R_i$ is now extended to $y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + R_{ij}$,
- where $\beta_{0j} = \gamma_{00} + U_{0j}$
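The idea of splitting variance into a between-network and a within-network part can be sketched descriptively with sums of squares. This is not how multilevel software estimates the variance components (that is model-based); it only illustrates the decomposition on invented tie strengths.

```python
# Invented tie strengths (level 1) for three egos (level 2), four alters each.
tie_strength = {
    "ego_1": [1.0, 2.0, 2.0, 3.0],
    "ego_2": [3.0, 4.0, 4.0, 5.0],
    "ego_3": [5.0, 6.0, 6.0, 7.0],
}


def mean(values):
    return sum(values) / len(values)


all_values = [v for ties in tie_strength.values() for v in ties]
grand_mean = mean(all_values)

# Between networks: squared distance of each network mean from the grand mean,
# counted once per alter in that network.
ss_between = sum(
    len(ties) * (mean(ties) - grand_mean) ** 2 for ties in tie_strength.values()
)
# Within networks: squared distance of each alter from its own network mean.
ss_within = sum(
    (v - mean(ties)) ** 2 for ties in tie_strength.values() for v in ties
)
ss_total = sum((v - grand_mean) ** 2 for v in all_values)

share_between = ss_between / ss_total
print(ss_between, ss_within, ss_total, share_between)
```

In this toy data most of the variation lies between networks, which corresponds to a large role for the ego-level term $U_{0j}$ relative to the tie-level residual $R_{ij}$.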
21 Type 3: Multilevel analysis
- Dependent variable: some characteristic of the dyadic relationships (e.g., strength of tie).
- Note: special multilevel models have been developed for discrete dependent variables.
- Explanatory variables can be (among others):
- characteristics of egos (level 2),
- characteristics of alters (level 1),
- characteristics of the ego-alter pairs (level 1).
- Software: e.g., R, MLwiN, HLM, VARCL
22 Illustrations of multilevel analysis for personal networks
- G. Mollenhorst, B. Völker, & H. Flap (2008). Social contexts and personal relationships: The effect of meeting opportunities on similarity for relationships of different strength. Social Networks, 30, 60-68.
- Mok, D., Carrasco, J.-A., & Wellman, B. (2009). Does Distance Still Matter in the Age of the Internet? Urban Studies, forthcoming.
23 The effect of the context where people meet on the amount of similarity between them (Mollenhorst, Völker, & Flap)
24 Illustration: Analysis of the importance of distance for overall contact frequency (Mok, Carrasco, & Wellman)
- LnDist is the natural logarithm of residential distance between ego and alter; RIMM is a dummy variable indicating whether ego is an immigrant. Bold figures are significant at p < .05, bold and italic at p < .10.
25 For a good article about the possibilities of multilevel analysis of personal networks, see
- Van Duijn, M. A. J., Van Busschbach, J. T., & Snijders, T. A. B. (1999). Multilevel analysis of personal networks as dependent variables. Social Networks, 21, 187-209.
26 In summary, cross-sectional analysis of personal networks...
27 ... but what about the relationships among alters?
- So far, we have only looked at the relationships a person (ego) has with his or her network members (alters)
28 E.g., we ask people to nominate 45 others and to report about their relationships with them
29 But data can also be collected on the relationships among network members
31 ... but what about the relations among alters?
- Most researchers are interested in alter-alter relations only to say something about the structure of personal networks at the network level:
- Compute structural measures at the aggregate level (e.g., density, betweenness centralization, number of cliques).
- Predict the structure of the networks in an aggregated analysis using, for example, regression analysis.
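As an example of such a structural measure, density is the share of possible alter-alter pairs that are actually tied. A minimal sketch for undirected ties with ego excluded, using an invented five-alter network; the `density` helper is hypothetical.

```python
def density(n_alters, alter_alter_ties):
    """Share of possible undirected alter-alter pairs that are tied."""
    possible_pairs = n_alters * (n_alters - 1) / 2
    # frozenset makes ("a", "b") and ("b", "a") count as the same tie.
    observed = len({frozenset(pair) for pair in alter_alter_ties})
    return observed / possible_pairs


# Invented network: five alters, ego excluded, four reported ties.
ties = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")]
d = density(5, ties)
print(d)
```

The resulting value is one number per network and can be used as a dependent variable in an aggregated analysis.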
32 ... but what about the relations among alters?
- It may, however, be interesting to analyze which alters are related (at the tie level).
- What predicts transitivity in personal relations? Or, as Louch expressed it, what predicts network integration?
33 Exponential Random Graph Models (ERGMs)
- The class of ERGMs is a class of statistical models for the state of a social network at one time point.
- The presence or absence of a tie between any pair of actors in the network is modeled as a function of structural tendencies (e.g., transitivity, popularity) and individual and dyadic covariates (e.g., similarity).
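For reference, the general form in which ERGMs are usually written is given below. The notation is the standard textbook one and is not taken from the slides: $y$ is an observed network, $s(y)$ a vector of network statistics (e.g., counts of ties and triangles, covariate effects), $\theta$ the corresponding parameters, and $\kappa(\theta)$ a normalizing constant summing over all possible networks $y'$ on the same set of actors.

```latex
P(Y = y) = \frac{\exp\{\theta^{\top} s(y)\}}{\kappa(\theta)},
\qquad
\kappa(\theta) = \sum_{y'} \exp\{\theta^{\top} s(y')\}
```

A positive parameter for a statistic such as transitivity means that networks with more of that configuration are more probable, other statistics being equal.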
34 Exponential Random Graph Models (ERGMs)
- ERGMs can be estimated in, among others, the software SIENA (up to version 3), statnet, and pnet (e.g., in R).
- Dependent variable: whether pairs of alters are related or not
- Explanatory variables:
- characteristics of alters,
- characteristics of the relation alters have with ego,
- characteristics of the alter-alter pair,
- endogenous network characteristics such as transitivity,
- (in a meta-analysis, characteristics of ego can be added as well).
- Type of analysis: apply a common ERGM to each network, then run a meta-analysis (cf. Lubbers, 2003; Snijders & Baerveldt, 2003; Lubbers & Snijders, 2007).
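The second step, combining per-network estimates, can be sketched in a deliberately simplified form: a fixed-effect, inverse-variance weighted mean. This is an assumption made for illustration; the cited papers may use a more elaborate procedure (for example, one that also estimates variation between networks). The estimates and standard errors below are invented.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted mean of per-network estimates."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Invented transitivity estimates and standard errors from two networks.
pooled, pooled_se = pool_fixed_effect([0.5, 1.0], [0.1, 0.2])
print(pooled, pooled_se)
```

Networks whose parameter is estimated more precisely receive more weight, so the pooled value lies closer to 0.5 than to 1.0.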
35 Ego influences parameter estimates strongly
36 so we tend to leave ego out
37 Example ERGM: Predicting relations among alters in the personal networks of immigrants
p < .05, p < .01. Conditioned on degree.
38 In summary, cross-sectional analysis of personal networks...
39 Part II. Dynamic analysis
- How do personal networks change over time?
- Studies that collect data on personal networks in two or more waves in a panel study
40 Interest in dynamic analysis
- Networks at one point in time are snapshots, the results of an untraceable history (Snijders).
- E.g., personal communities in Toronto (Wellman et al.)
- Changes following a focal life event (individual level)
- E.g., transition from high school to university (Degenne & Lebeaux, 2005); childbearing, moving, return to school in midlife (Suitor & Keeton, 1997); retirement (Van Tilburg, 1992); marriage (Kalmijn et al., 2003); divorce (Terhell, Broese Van Groenou, & Van Tilburg, 2007); widowhood (Morgan, Neal, & Carder, 2000); migration (Lubbers, Molina, Lerner, Ávila, Brandes, & McCarty, 2009)
- Broader studies of social change: social and cultural changes in countries with dramatic institutional changes
- E.g., post-communism in Finland, Russia (Lonkila, 1998), Eastern Germany (Völker & Flap, 1995), Hungary (Angelusz & Tardos, 2001), China (Ruan, Freeman, Dai, Pan, & Zhang, 1997), ...
41 Sources of change in (personal) networks
- Unreliability due to measurement error
- Inherent instability
- Systemic change
- External change
- Leik & Chalkley (1997), Social Networks, 19, 63-74
43 Personal networks are layered
- Personal network (approx. 150)
- Close / active network (approx. 50)
- Sympathy group (approx. 15)
- Support clique (approx. 5)
44 Dependent variables in dynamic personal network studies
Typology: Feld, Suitor, & Gartner Hoegh (2007), Field Methods, 19, 218-236.
45 Type 1: Persistence of ties with alters across time
- Dependent variable: whether or not a tie persists to a subsequent time point (dichotomous)
- Explanatory variables:
- characteristics of ego at t1 (e.g., gender, job situation),
- changes in characteristics of ego between t1 and t2 (e.g., change in marital status),
- characteristics of alter at t1 (e.g., educational level),
- characteristics of the ego-alter pair at t1 (e.g., tie strength),
- cross-level interactions (e.g., ego's marital status × kin).
- Type of analysis: logistic multilevel analysis (e.g., MLwiN, MIXNO)
46 Type 1: Persistence of ties with alters across time
- Logistic regression is used to predict the log odds that a tie persists over time, where log odds = log(p / q), p is the probability that the tie persists, and q = 1 - p.
- Logistic regression can be viewed as a linear regression model with the log odds as the response variable.
- The coefficients $\beta$ in a logistic regression model are expressed in terms of the log odds.
- A unit increase in the explanatory variable $x_1$ adds $\beta_1$ to the log odds of having a tie, that is, it multiplies the odds by $e^{\beta_1}$.
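A small numerical check of this interpretation, using a single-level logistic model with invented coefficients (the random ego effects of the multilevel version are left out here): raising $x_1$ by one unit multiplies the odds of tie persistence by $e^{\beta_1}$.

```python
import math


def tie_persistence_probability(b0, b1, x1):
    """Probability p that a tie persists, from the log odds b0 + b1 * x1."""
    log_odds = b0 + b1 * x1
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds(p):
    """Odds p / q with q = 1 - p."""
    return p / (1.0 - p)


# Invented coefficients: intercept and effect of tie strength.
b0, b1 = -1.0, 0.5
p_low = tie_persistence_probability(b0, b1, x1=2)
p_high = tie_persistence_probability(b0, b1, x1=3)
odds_ratio = odds(p_high) / odds(p_low)
print(p_low, p_high, odds_ratio, math.exp(b1))
```

The odds ratio equals $e^{0.5}$ regardless of the starting value of $x_1$, whereas the change in the probability itself depends on where one starts.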
47 Illustration type 1: Explaining persistence of ties for immigrants
p < .05, p < .01. Excluded: sex, employment status, marital status, recent visits to country of origin, changes in employment and marital status, tie duration, kin.
48 Type 2: Changes in characteristics of persistent ties across time
- Dependent variable: change in some characteristic of the relationship (e.g., change in strength of tie), or the characteristic at t2 with the same characteristic at t1 as a covariate (auto-correlation approach)
- Explanatory variables:
- characteristics of ego at t1 (e.g., gender, job situation),
- changes in characteristics of ego between t1 and t2 (e.g., change in marital status),
- characteristics of alter at t1 (e.g., educational level),
- characteristics of the ego-alter pair at t1 (e.g., tie strength),
- cross-level interactions (e.g., ego's marital status × kin).
- Type of analysis: multilevel analysis
49 Example
- Change in contact frequency (visits and telephone calls) after an important life event
- Two time points: shortly after the life event took place and four years later
- Van Duijn, M. A. J., Van Busschbach, J. T., & Snijders, T. A. B. (1999).
51 Type 3: Changes in the size of the network across time
- Dependent variable: change in the number of ties in the personal network
- Explanatory variables:
- characteristics of ego at t1 (e.g., gender, job situation),
- changes in characteristics of ego between t1 and t2 (e.g., change in marital status),
- characteristics of the set of alters at t1.
- Type of analysis: regression analysis at the aggregate level
52 Illustration of the analysis of the stability of personal networks over time (East York studies, Wellman et al.)
Multiple regression predicting network turnover (n = 33)
53 Type 4: Changes in overall network characteristics across time
- Dependent variable: change in a compositional or structural variable (e.g., percentage of alters with higher education, density of the network)
- Explanatory variables, e.g.:
- characteristics of ego at t1,
- characteristics of the network at t1.
- Type of analysis: regression analysis at the aggregate level
54 Dynamic personal network analysis: More than two observations
- Add an extra level to the analysis representing the observation
- One-level models become two-level models
- Two-level models become three-level models
55 Dynamic personal network analysis: More than two observations
- Example of type 2 analysis with multiple observations: changes in contact after widowhood
Guiaux, M., van Tilburg, T., & Broese van Groenou, M. (2007). Changes in contact and support exchange in personal networks after widowhood. Personal Relationships, 14, 457-473.
57 More than two observations: example of an alternative approach (type 3 analysis)
E. L. Terhell, M. I. Broese van Groenou, & T. van Tilburg (2004). Network dynamics in the long-term period after divorce. Journal of Social and Personal Relationships, 21, 719-738.
58 More than two observations: example of an alternative approach (type 3 analysis), cont'd
59 See, for example, the chapter on longitudinal data in this book:
- T. A. B. Snijders & R. J. Bosker (1999). Multilevel analysis: An introduction to basic and advanced multilevel modeling. London: Sage Publications.
60 In summary, dynamic analysis of personal networks
61 ... but what about the dynamics of alter-alter relations?
62 Time 1: An example of a changing personal network
- Node color: stable alters are dark blue; temporal alters are light blue
- Edge color: relations among stable alters are dark blue; relations among / with temporal alters are light blue
- Node size: ego's closeness with alter
- Labels: Spanish, Fellow Migrants, Originals, TransNationals
63 An example of a changing personal network
- Node color: stable alters are dark blue; temporal alters are light blue
- Edge color: relations among stable alters are dark blue; relations among / with temporal alters are light blue
- Node size: ego's closeness with alter
- Labels: Spanish, Fellow Migrants, Originals, TransNationals
64 Dependent variables in dynamic personal network studies: Composition and structure
65 Type 5: Changes in ties among alters across time
- Dependent variable: whether alters make new ties or break existing ties with other alters across time
- Independent variables:
- characteristics of alters,
- characteristics of the relation alters have with ego,
- characteristics of the alter-alter pair,
- endogenous network characteristics such as transitivity,
- (in a meta-analysis, characteristics of ego can be added as well).
- Type of analysis: apply a common SIENA model to each network (leaving ego out), then run a meta-analysis (cf. Lubbers, 2003; Snijders & Baerveldt, 2003; Lubbers & Snijders, 2007). A multilevel version of SIENA is on the agenda.
66 Just a few thoughts about the use of SIENA for personal networks
- Ego influences parameter estimates considerably; therefore, ego should be left out or, alternatively, his or her relations can be coded as structural ones (to model that ego is by definition related to everyone else).
- As ego reports about the relationships between his or her alters, relations tend to be symmetric, so the non-directed model type in SIENA is appropriate.
- Smaller networks, or networks that have only a few changes (fewer than 40 per network), can be combined into one or multiple multigroup project(s).
67 Example: Predicting the changes in ties among alters in immigrant networks
p < .01. N = 44 respondents.
68 In summary, dynamic analysis of personal networks
69 Conclusion
- Multiple statistical methods are available for personal network research, depending on your research interest.
- Combining several methods probably gives the greatest insight into your data.
70 Thanks!
- My e-mail address: MirandaJessica.Lubbers_at_uab.es