NSF grant - PowerPoint PPT Presentation

Title: NSF grant


1
NSF grant 0429452
  • Life in the Network
  • The Coming Era of Computational Social Science

David Lazer, Harvard University, NetSci 2007
2
Life in the Network
  • Much of human civilization has been about
    building network infrastructure (or taking
    advantage of naturally occurring infrastructure,
    such as rivers)
  • Benefits of trade
  • Economies of scale
  • Human drive to connect
  • Cities, roads, railroads, telephone lines
  • There has been a proliferation of various types
    of ICT networks connecting people in the last few
    decades

3
Digital traces of our networked lives
  • E-mail
  • Instant messaging
  • Text messaging
  • Telephone logs
  • Link structure among websites
  • Facebook
  • Web surfing
  • http://www.iq.harvard.edu/blog/netgov/
  • What can data like these tell us?

4
Existing approaches to studying social networks
  • Growth in interest in these kinds of relational
    phenomena
  • Generally rely on self-reports
  • Static: generally based on snapshots
  • Shaky reliability: what is being measured by
    self-reports?
  • Small scale: mostly systems in the hundreds or
    fewer
  • → Inferential challenges in existing research
  • → Many important phenomena are neglected

5
What can data like these tell us?
  • How do things spread through a network?
  • Ideas?
  • Avian flu?
  • How do people/organizations work together?
  • Collaboration and coordination?
  • Who is in key positions in the network?
  • Form an empirical basis for various types of
    policy recommendations
  • Possibly even real-time feedback for effective
    interventions

6
Computational social science
  • The capturing and analysis of human activity
    represented in digital form
  • Increased computational capacity to manipulate
    data
  • Incidental, vast archives of human activity
    (e.g., Internet, e-mail)
  • Instrumentation of human behavior (e.g., cookies,
    GPS devices)
  • Creation of virtual worlds to experiment with
  • What are the implications for our understanding
    of collective human behavior?

7
What can be done with data like these? Four
studies
  • Call log analysis
  • Instrumentation of human behavior
  • Natural language processing
  • Building virtual worlds

8
Study 1: Call log analysis
  • "Structure and tie strengths in mobile
    communication networks" (just came out in PNAS,
    with J.-P. Onnela, J. Saramäki, J. Hyvönen, G.
    Szabó, K. Kaski, J. Kertész, A.-L. Barabási)
  • Examination of call log data from a mobile phone
    company in a moderate-sized European nation: a
    total of approximately 7,000,000 users, 49
    trillion dyads
  • What does network structure look like?
  • Small world? (six degrees of separation; Watts
    and Strogatz)
  • Scale free? (hub-spoke structure; Barabási and
    Albert)
  • Strength of weak ties? (Granovetter)
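
The structural questions above can be made concrete with a small sketch. It assumes an undirected call graph held in memory as a dict of neighbor sets. The function names and the toy edge list are invented for illustration and are not taken from the study. "Degrees of separation" is read here as the mean shortest-path length over connected pairs, and the degree histogram is the raw input one would inspect for a hub-spoke pattern.

```python
from collections import deque


def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbor sets."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def bfs_distances(adj, source):
    """Hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def average_separation(adj):
    """Mean shortest-path length over all connected ordered pairs."""
    total = 0
    pairs = 0
    for source in adj:
        for target, hops in bfs_distances(adj, source).items():
            if target != source:
                total += hops
                pairs += 1
    return total / pairs if pairs else 0.0


def degree_histogram(adj):
    """Map each degree to the number of nodes that have it."""
    hist = {}
    for neighbors in adj.values():
        k = len(neighbors)
        hist[k] = hist.get(k, 0) + 1
    return hist


# Invented toy graph: one hub with four spokes, plus one extra tie.
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("d", "e")]
toy_adj = build_adjacency(toy_edges)

print(average_separation(toy_adj))
print(degree_histogram(toy_adj))
```

Running one breadth-first search per node is only practical for small graphs. At the scale of millions of users, path lengths would more plausibly be estimated from a sample of source nodes.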

9
Call log network data
10
Results
  • Hub-spoke structure (scale free)
  • Small world (on average, 13 degrees of
    separation)
  • But poorly structured for dissemination: strong
    ties tend to be clustered, and weak ties bind
    clusters together (consistent with Granovetter)
  • Simulations suggest that weak(est) ties are not
    effective at spreading (inconsistent with
    Granovetter)
  • Potentially powerful tool for studying evolving
    social structures of communities
  • Possible use of data for a variety of policy
    purposes, from criminal investigations to early
    warning system for avian flu
  • But what does a phone call between two phones
    mean?
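
One way to check whether strong ties sit inside clusters while weak ties bridge them is to compare each tie's weight with the overlap of its endpoints' neighborhoods. The sketch below is a minimal illustration under assumptions. The overlap definition (shared neighbors divided by all other neighbors of the two endpoints), the use of call minutes as tie weight, and the toy data are choices made for this example, not details confirmed by the slides.

```python
def build_adjacency(weighted_edges):
    """Dict of neighbor sets from (a, b, weight) triples; weights ignored here."""
    adj = {}
    for a, b, _weight in weighted_edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def neighborhood_overlap(adj, i, j):
    """Shared neighbors of i and j divided by all their other neighbors."""
    shared = len(adj[i] & adj[j])
    others = (len(adj[i]) - 1) + (len(adj[j]) - 1) - shared
    return shared / others if others else 0.0


def overlap_by_tie(weighted_edges):
    """List of (weight, overlap) pairs, one per tie, sorted by weight."""
    adj = build_adjacency(weighted_edges)
    rows = [(w, neighborhood_overlap(adj, a, b)) for a, b, w in weighted_edges]
    return sorted(rows)


# Invented toy data: two tight triangles joined by one weak bridge (c-d).
# Weights stand in for total call minutes and are made up.
toy_ties = [
    ("a", "b", 120), ("b", "c", 90), ("a", "c", 100),
    ("d", "e", 110), ("e", "f", 95), ("d", "f", 105),
    ("c", "d", 5),
]

for weight, overlap in overlap_by_tie(toy_ties):
    print(weight, round(overlap, 2))
```

In this toy network the weakest tie has zero overlap and the strongest ties have high overlap, which is the pattern the bullet on clustered strong ties describes.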

11
Study 2: Instrumentation of human behavior
  • Paper: "Revealing Social Relationships using
    Contextualized Proximity and Communication Data"
    (with Nathan Eagle and Sandy Pentland)
  • Collaboration with Media Lab
  • Programmed the mobile phones of 100 students for
    9 months
  • Call log data
  • Physical proximity (using Bluetooth)
  • Location (using cell tower triangulation)
  • Also collected self-report data on friendship,
    satisfaction
  • What is the information in these data?
  • Compare observations to self reports

12
Self-reported vs. observed proximity
  • Substantial recency effects: recent interactions
    weighted more heavily
  • Reciprocal non-friends: 99.5% accurate at
    reporting 0s
  • Reciprocal friends: 35% accurate at reporting
    0s
  • Friends more accurate at non-0s
13
Is friendship observable?
  • Friendship is important at individual and
    collective levels due to the resources that flow
    among friends
  • Purely cognitive relationship: in principle,
    you could be friends with someone with whom you
    do not interact.
  • But generally we all make inferences about who is
    friends with whom based on our observations
  • Can the types of information that inform our
    inferences be captured via our mobile phones?
  • Certainly, one anticipates that (for example)
    friends will tend to be proximate to each other
  • If high accuracy is possible, then possible to
    look at evolution of friendship structure in
    larger populations over time (as well as other
    cognitive relationships, such as advice)

15
Predicting friendships
  • Relational scripts: culturally embedded patterns
    of relational behavior
  • We generated seven relational variables
  • Frequency of phone calls
  • Proximity at home
  • Proximity at work
  • Proximity outside work
  • Proximity on Saturday nights
  • Proximity with no signal
  • Number of unique locations
  • Interactions broke into two factors
  • In-role communication
  • Extra-role communication

16
Reported friendships
17
Inferred friendships (based on extra-role factor)
18
Self-reported versus observed friendships
  • We were able to categorize correctly 95% of
    reciprocated friendships and reciprocated
    non-friendships
  • Unreciprocated friendships came from high
    scores on in-role communication, perhaps
    capturing cultural ambiguity
  • Created continuous construct from dichotomous
    self-report: perhaps a more valid measure of
    friendship?
  • Second layer of validation: predicting
    satisfaction based on (a) actual friendships and
    (b) inferred friendship. Second model does
    slightly better.
  • We follow culturally embedded programs with
    respect to our relationships
  • Results suggest potential for inferring
    friendship on much larger scale.
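
As a rough illustration of inferring friendship from behavioral variables, the sketch below standardizes a few proximity measures, averages them into a single score, thresholds that score, and reports the share of dyads categorized correctly against self-reports. Everything specific is an assumption. The slides do not say which variables load on the extra-role factor, so the grouping in EXTRA_ROLE_VARS is hypothetical, and a plain average of z-scores stands in for a fitted factor score. The threshold and the data are invented.

```python
from statistics import mean, pstdev

# Hypothetical grouping: the slides do not list the factor loadings.
EXTRA_ROLE_VARS = ("proximity_saturday_night", "proximity_outside_work", "proximity_home")


def zscores(values):
    """Standardize a list of numbers; a constant column maps to zeros."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def extra_role_scores(dyads):
    """Average the z-scored extra-role variables into one score per dyad."""
    columns = [zscores([d[var] for d in dyads]) for var in EXTRA_ROLE_VARS]
    return [mean(col[i] for col in columns) for i in range(len(dyads))]


def classify(scores, threshold=0.0):
    """Label a dyad as friends when its score exceeds the threshold."""
    return [score > threshold for score in scores]


def share_correct(predicted, reported):
    """Fraction of dyads where the inferred label matches the self-report."""
    return sum(p == r for p, r in zip(predicted, reported)) / len(reported)


# Invented dyads: hours of proximity in each context, plus reciprocated self-report.
toy_dyads = [
    {"proximity_saturday_night": 5, "proximity_outside_work": 3, "proximity_home": 2, "friends": True},
    {"proximity_saturday_night": 4, "proximity_outside_work": 4, "proximity_home": 3, "friends": True},
    {"proximity_saturday_night": 0, "proximity_outside_work": 1, "proximity_home": 0, "friends": False},
    {"proximity_saturday_night": 1, "proximity_outside_work": 0, "proximity_home": 0, "friends": False},
]

scores = extra_role_scores(toy_dyads)
predicted = classify(scores)
print(share_correct(predicted, [d["friends"] for d in toy_dyads]))
```

The continuous score is the useful part of the design: it turns a yes/no self-report into a graded measure that can be thresholded differently or used directly as a predictor.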

19
Current data collection: the sociometer
  • More collaboration with the Media Lab:
    sociometers
  • Study of teams (with Nancy Katz)
  • When do team members talk to each other, and who
    does the talking?
  • What difference does this make at individual and
    collective level?

20
Study 3: Automated content analysis of
Congressional websites
  • Acknowledgement: NSF grant 0429452, Stephen
    Purpura
  • Advances in computational linguistics and voice
    recognition software
  • Importance amplified by simultaneous improvements
    in voice recognition software
  • Code: human coders categorize blocks of text
    (training set and test set)
  • Train classification algorithms
  • Test against test data

21
Official Congressional websites
  • Every House member has a website at www.House.gov
  • Example: http://www.house.gov/capuano/
  • Strategic calculus of what message to send to
    constituents
  • Comparable set of websites, create panel data
    set, some things varying over time, some not

24
Official Congressional websites
  • Track evolution of language usage
  • Examples: positive/negative mentions of Bush?
  • For example, what was/is strongest single word
    predictor of partisanship of Member of Congress?
  • In 2001: "terror" (Republicans used more than
    Democrats)
  • In 2006: "Iraq" (Democrats used more than
    Republicans)
  • Allows us to see what is flowing through networks
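
A "strongest single word predictor" can be approximated by comparing how often each word appears in the two parties' text. The sketch below ranks words by the log of the ratio of smoothed relative frequencies. This scoring rule is an assumption made for illustration, since the slides do not describe the measure that was used. The snippets are made up and only echo the 2001 example from the slide.

```python
import math
from collections import Counter


def word_counts(docs):
    """Total word counts across a list of text snippets."""
    counts = Counter()
    for text in docs:
        counts.update(text.lower().split())
    return counts


def log_ratio_by_word(docs_a, docs_b, smoothing=1.0):
    """Log ratio of smoothed word frequencies; positive means group A uses it more."""
    counts_a, counts_b = word_counts(docs_a), word_counts(docs_b)
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + smoothing * len(vocab)
    total_b = sum(counts_b.values()) + smoothing * len(vocab)
    scores = {}
    for word in vocab:
        freq_a = (counts_a[word] + smoothing) / total_a
        freq_b = (counts_b[word] + smoothing) / total_b
        scores[word] = math.log(freq_a / freq_b)
    return scores


def strongest_predictor(docs_a, docs_b):
    """Word whose usage differs most between the two groups, in either direction."""
    scores = log_ratio_by_word(docs_a, docs_b)
    return max(scores, key=lambda word: abs(scores[word]))


# Invented snippets standing in for two parties' website text.
group_a_docs = ["terror threat response", "terror funding security", "tax relief terror"]
group_b_docs = ["health care funding", "education tax security", "health jobs"]

print(strongest_predictor(group_a_docs, group_b_docs))
```

Repeating the same calculation on each year's snapshot of the websites would show how the most distinguishing word shifts over time.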

25
Study 4: Building virtual deliberation world
  • Acknowledgements: Kevin Esterling, Michael
    Neblo, Curt Ziniel; NSF grant 0429452
  • Creation of virtual space (use of Macromedia
    Breeze)
  • Twenty deliberative sessions with Members of
    Congress regarding immigration
  • Allows complete control and recording of
    interactions, recruitment of representative
    sample
  • Pretest, various control groups, during-session
    questions, post-test a week later, and
    post-election survey
  • Analysis still pending (dramatic effects on
    approval and vote intention)

26
Virtual deliberation world
28
Computational social science
  • Orders of magnitude increase in data being
    collected about human behavior over last decade
  • Constant increase in computational power
  • Shift in social science research over the next
    generation
  • Thinking relationally: what is flowing among
    people? How are people working together?

29
The big picture
  • The capturing of massive amounts of digitalized
    information about human behavior (especially
    relational behavior)
  • The capacity to manipulate those data
  • New insights into collective human behavior

30
Challenges, Caveats, and Conundrums
  • Overcoming silos of academia, with a
    particularly wide gap between the sciences and
    social sciences
  • The need to develop new infrastructures within
    social sciences
  • Substantial human subjects issues
  • Partnerships with those that are guardians of the
    network
  • Concerns about use of knowledge that is produced
    (e.g., NSA wiretaps, private sector data mining)
  • Figuring out what those insights are: time to
    shift the paradigm

31
NSF grant 0429452
  • Computational Social Science

Partially supported by NSF grant 0429452
Continue at http://www.iq.harvard.edu/blog/netgov/