Title: Twitter Catches The Flu: Detecting Influenza Epidemics using Twitter
1 Twitter Catches The Flu: Detecting Influenza Epidemics using Twitter
- Eiji ARAMAKI
- Sachiko MASKAWA
- Mizuki MORITA
- The University of Tokyo
- National Institute of Biomedical Innovation
EMNLP 2011
2 Why did we develop this system?
Let me show you several existing systems
3 Centers for Disease Control and Prevention (CDC)
4 Infectious Disease Surveillance Center (IDSC)
5 European Influenza Surveillance Network (EISN)
6 Why does each country have its own surveillance system?
- Influenza epidemics are a major public health concern, because they cause tens of millions of illnesses each year.
- To reduce the number of victims, early detection of influenza epidemics is a national mission in every country.
- BUT these surveillance systems basically rely on hospital reports (written manually).
7 Two Problems / Recent Approach
- (1) Small scale
  - For example, IDSC gathers influenza patient data from 5,000 clinics, but it does not cover all cities (especially local cities).
- (2) Time delay (time lag)
  - For example, the data gathering process typically has a 1-2 week reporting lag.
- To deal with these problems
  - Recently, various approaches that directly capture people's behavior have been proposed.
8 Recent Approach
- Using phone call data
  - Espino et al. (2003) used data from a telephone triage service, a public service that gives advice to users by telephone. They reported that the number of telephone calls correlates with influenza epidemics.
- Using drug sale data
  - Magruder (2003) used the amount of drug sales.
Among the various approaches ...
9 The State of the Art: Web-based Approach
- Ginsberg et al. (Nature, 2009) used Google web search queries that correlate with an influenza epidemic, such as "flu" or "fever".
- Polgreen et al. (2008) used a Yahoo! query log.
- Hulth et al. (2009) used a query log of a Swedish web search engine.
10 This Study
- Web search queries are an extremely large-scale, real-time data resource.
- BUT the query data is closed (not freely available): it is available only to a few companies, such as Google, Yahoo, or Microsoft.
- Therefore, this study examines Twitter data, which is widely available.
11 OUTLINE
- Background
- Objective
- Method
- Experiment
- Discussion
- Conclusion
Detailed Task Definition
12 Simple Word Frequency in Twitter
[Figure: word frequency in Twitter; labels: Cold, Fever, influenza; Winter, Summer]
- The actual influenza curve is smoother.
- Simple word frequency contains various kinds of noise, because ...
13 The word "influenza" does not always indicate an influenza patient
[Examples: a Positive Influenza Tweet and a Negative Influenza Tweet]
14 Two Types of Influenza Tweets
- Positive influenza tweet
  - indicates an influenza patient
- Negative influenza tweet
  - includes a mention of influenza, but does not indicate that an influenza patient is present
- Not only general news but also various other phenomena generate negative influenza tweets.
15 Various Negative Influenza Tweets (1/2)
- Prevention
  - You need to get an influenza shot sometime soon.
- Modality (just suspicion)
  - @John might be suffering from influenza
- Question
  - Did you catch the influenza?
16 Various Negative Influenza Tweets (2/2)
- Influenza of a cat or dog
  - Today, I couldn't go home late. My cat caught the influenza...
- Influenza of a TV character
  - In the last episode of that TV series, Ritsu-chan caught the flu
17 Research Questions
- In total, half of influenza-related tweets are negative, which motivates automatic filtering.
- RQ1: Could an NLP system filter out the negative influenza tweets?
- RQ2: Could this filtering contribute to surveillance accuracy?
18 OUTLINE
- Background
- Method
- Experiment
- Discussion
- Conclusion
19 Basic Idea: Binary Classification
- We regard this task as a binary classification task, like spam mail filtering.
[Diagram: input, training corpus, and the Positive / Negative output]
- (1) What kind of corpus?
- (2) What kind of features?
- (3) What kind of machine learning method?
20 Corpus (5k Sentences with Labels)
- See the proceedings for details.
- Average annotator agreement ratio: 0.85
21 What kind of Feature?
- Twitter contains many ungrammatical expressions
- Surrounding words (BOW, no stemming, no POS)
  I (L3) think (L2) the (L1) influenza is (R1) going (R2) around (R3)
- Among various settings, window size 6 achieved the highest accuracy
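The surrounding-word feature can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `window_features` and the parameter `per_side` are hypothetical, and reading "window size 6" as three words on each side of the keyword is an assumption based on the L1-L3 / R1-R3 labels on the slide.

```python
def window_features(tokens, keyword="influenza", per_side=3):
    """Return the bag of words surrounding the first occurrence of keyword.

    Assumption: a window of size 6 means 3 words to the left (L1-L3)
    and 3 words to the right (R1-R3) of the keyword.
    """
    lowered = [t.lower() for t in tokens]
    if keyword not in lowered:
        return set()
    i = lowered.index(keyword)
    left = lowered[max(0, i - per_side):i]
    right = lowered[i + 1:i + 1 + per_side]
    # Plain bag of words: no stemming, no POS tags, word order is discarded.
    return set(left) | set(right)


tokens = "I think the influenza is going around".split()
features = window_features(tokens)
print(sorted(features))
```

Each word in the returned set would become one binary feature for the classifier; whitespace tokenization is used here only to keep the sketch short.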
22 What kind of Machine Learning Method?
Classifier              F-Measure  Time
AdaBoost                0.592       40.192
Bagging                 0.739      530.310
Decision Tree           0.698      239.446
Logistic Regression     0.729      696.704
Nearest Neighbor        0.695       22.441
Random Forest           0.729       38.683
SVM (polynomial, d=2)   0.738       92.723
- Among the classifiers compared, SVM achieved nearly the highest F-measure (0.738, against 0.739 for Bagging) in far less time
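For readers unfamiliar with the metric in the table, the sketch below computes an F-measure for the positive class. It assumes the balanced F1 variant (the harmonic mean of precision and recall), which the slide does not state explicitly; the function name and the toy labels are hypothetical and are not data from the study.

```python
def f_measure(gold, predicted, positive="positive"):
    """Balanced F-measure (F1) for the positive class.

    Assumption: the slide's "F-Measure" is F1; the slide does not say.
    """
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only (not from the corpus).
gold = ["positive", "positive", "positive", "negative", "negative"]
pred = ["positive", "positive", "negative", "positive", "negative"]
score = f_measure(gold, pred)
print(round(score, 3))
```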
23 OUTLINE
- Background
- Method
- Experiment
- Discussion
- Conclusion
24 Twitter Data (2008-2010)
[Figure: the Twitter data divided into Season I, Season II, Season III, and Season IV]
- The first month is used as the training corpus.
- We divide the remaining data into 4 seasons.
- The Twitter API specification sometimes changes, leading to dropout periods.
25 Method Comparison and Evaluation
- (1) TWEET-SVM (the proposed method)
- (2) TWEET-RAW
  - Based on the simple word frequency of "influenza"
- (3) GOOGLE (Ginsberg et al., 2009)
  - Based on Google web search queries
  - The previous estimation data is available at the Google Flu Trends website.
- (4) DRUG-SALE (Magruder, 2003)
- Evaluation is based on
  - the average correlation with the GOLD_STANDARD data, that is, the real number of influenza patients reported by the Infectious Disease Surveillance Center (IDSC)
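The evaluation described above can be illustrated with a small correlation routine. This sketch assumes the Pearson correlation coefficient, which the slide does not name explicitly; the function name and the two toy series are hypothetical and are not the study's data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length series.

    Assumption: the slide's "correlation" is Pearson's r.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Toy weekly series for illustration only.
estimated = [1.0, 2.0, 3.0, 4.0]   # e.g. counts of positive tweets
reported = [2.0, 4.0, 6.0, 8.0]    # e.g. gold-standard patient counts
r = pearson(estimated, reported)
print(round(r, 3))
```

A value near 1 means the estimate rises and falls with the reported patient counts; a value near 0 means the two series are unrelated.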
26 Result: Correlation Ratio
             TWEET-RAW  TWEET-SVM  GOOGLE   DRUG
Season I      0.683      0.816     0.817   -0.208
Season II    -0.009     -0.018     0.232    0.406
Season III    0.382      0.474     0.881    0.684
Season IV     0.390      0.957     0.976    0.130
Bold indicates a correlation above the statistical significance level.
In most seasons, the proposed method (TWEET-SVM) achieved a higher correlation than the simple word-frequency method (TWEET-RAW), demonstrating the advantage of the SVM-based filtering.
27 Result: Correlation Ratio
Except for Season II, the proposed method achieved accuracy close to that of GOOGLE (for example, 0.816 vs. 0.817 in Season I and 0.957 vs. 0.976 in Season IV).
28 Why does Twitter suffer in Season II? Because it includes a pandemic!
- WHO declared a pandemic in July 2009 (Season II).
- This suggests that Twitter might be biased by the news media.
                  TWEET-RAW  TWEET-SVM  GOOGLE  DRUG
Normal Season      0.831      0.890     0.847   0.308
Pandemic Season    0.001      0.060     0.918   0.844
29 Season I
[Figure: Season I plot; axis label: Relative number]
TWEET-SVM ≈ GOOGLE
30 Season II
[Figure: Season II plot; axis label: Relative number]
TWEET-SVM << GOOGLE
31 OUTLINE
- Background
- Method
- Experiment
- Discussion
- Conclusion
Extra Experiment
32 Frequent Question
- Could an influenza patient REALLY use Twitter or Google Search?
- That seems to be an unnatural situation! ("I'd like to sleep ...")
- Because of that, we modified the system under the following assumption:
  - People use Twitter or Google at the first sign of influenza.
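33 Implemented Using an Infectious Disease Model (Kermack and McKendrick, 1927) (≈ Markov model)
[Diagram: three states and their transitions]
  S (Susceptible, BEFORE FLU) --catch the flu--> I (Infectious, UNDER FLU) --recover, 0.38--> R (Recovered, AFTER FLU)
  I stays in I with probability 0.62
- The S-to-I transition is observed by Twitter / Google
- 38% of influenza patients recover each day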
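One way to read the model above is as a daily update of the number of currently infected people. The sketch below is an interpretation under stated assumptions, not the authors' implementation: it assumes the daily count of new cases (the observed S-to-I transitions) is added to the 62% of yesterday's patients who have not yet recovered. The function name, parameter name, and toy input are hypothetical.

```python
def infected_counts(new_cases, stay_rate=0.62):
    """Estimate the number of people under flu on each day.

    Assumption: infected[t] = stay_rate * infected[t - 1] + new_cases[t],
    where stay_rate = 1 - 0.38 (38% of patients recover each day).
    """
    infected = 0.0
    series = []
    for n in new_cases:
        # Patients who have not recovered, plus today's new cases.
        infected = stay_rate * infected + n
        series.append(infected)
    return series


# Toy input for illustration only: 10 new cases on day 0, none afterwards.
counts = infected_counts([10, 0, 0])
print([round(c, 3) for c in counts])
```

With a single burst of new cases, the estimated number of patients decays geometrically instead of dropping to zero the next day, which yields a smoother curve than the raw daily counts.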
34 BUT It ALSO Improves the Google-based Approach
- This model improves the correlation of BOTH Twitter and GOOGLE.
- This result suggests that there is room for collaboration between medical research and web/NLP research.
35 OUTLINE
- Background
- Method
- Experiment
- Discussion
- Conclusion
36 Answer to Research Questions
- This study proposed a new influenza surveillance system using Twitter
- RQ1: Could a system filter out the negative influenza tweets?
  - Yes, but NOT perfectly.
- RQ2: Could this accuracy contribute to the surveillance performance?
  - YES. It increases the correlation (except for the pandemic period).
- We could achieve almost the same accuracy as GOOGLE using freely available data.
37 Conclusion
- Even now, more than 100 (sometimes over 1,000) people die from influenza in Japan
- We hope that this study might help people
38 Thank you. NLP could save a life!
Eiji ARAMAKI, Ph.D., University of Tokyo, http://mednlp.jp