Predicting the Present - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Predicting the Present


1
Predicting the Present
  • With Google Trends

Hyunyoung Choi, Hal Varian
June 2009
2
Problem statement
  • Government agencies and other organizations
    produce monthly reports on economic activity
      • Retail Sales
      • House Sales
      • Automotive Sales
      • Unemployment
  • Problems with reports
      • Compilation delay of several weeks
      • Subsequent revisions
      • Sample size may be small
      • Not available at all geographic levels
  • Google Trends releases daily and weekly indexes of
    search queries by industry vertical
      • Real-time data
      • No revisions (but some sampling variation)
      • Large samples
      • Available by country, state, and city
  • Can Google Trends data help predict current
    economic activity?
      • Before release of preliminary statistics
      • Before release of final revision

3
Categories in Google Trends by Query Shares
Note: Queries are from 2009-01-01 to 2009-04-30; growth is compared over the same time window.
4
Real Estate
5
Screenshot callouts: Geography, Time window, Category
6
Subcategories under Real Estate by Query Shares
7
Searches on Real Estate Agencies
8
Searches on Rental Listings & Referrals
9
Depicting trends
  • Google Trends measures the normalized query share of
    a particular category of queries, which controls for
    overall growth
  • Often useful to look at year-on-year changes to
    eliminate seasonality.
  • Illustrate correlations and covariates.
  • Improving predictions
      • Forecast a time series using its own lagged values
        and add Trends data as a predictor.
      • Statistical significance?
      • Improved fit?
      • Improved forecasts?
      • Identify turning points?
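
The year-on-year comparison mentioned above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name and the monthly index values are hypothetical, and the period of 12 assumes monthly data.

```python
def year_on_year_change(series, period=12):
    """Fractional change of each observation versus the one `period` steps earlier."""
    return [
        (series[t] - series[t - period]) / series[t - period]
        for t in range(period, len(series))
    ]


# Hypothetical monthly index: a seasonal pattern, then the same pattern 10% higher.
base_year = [100, 90, 110, 120, 130, 150, 160, 155, 140, 120, 105, 95]
two_years = base_year + [round(v * 1.10, 6) for v in base_year]

yoy = year_on_year_change(two_years)
print([round(g, 3) for g in yoy])
```

Although the raw series swings between 90 and 176, every year-on-year value is the same 10% growth rate, which is the sense in which the comparison removes a stable seasonal pattern.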

10
15 yr Mortgage Rate vs. Home Financing
11
Forecasting primer
  • Basic forecasting models
      • Autoregressive: value at time t depends on
          • Value at time t-1
      • Seasonal adjustment: value at time t depends on
          • Value at time t-12 (for monthly data)
      • Transfer function: value at time t depends on
          • Other contemporaneous or lagging variables
      • Seasonal autoregressive transfer model: value at
        time t depends on
          • Value at time t-12 (seasonality)
          • Value at time t-1 (recent behavior)
          • Other lagging or contemporaneous variables (such
            as Google Trends data)
  • Typical question of interest
      • How much more accurate are forecasts with
        additional variables, over and above the accuracy
        you get from the history of the time series
        itself?
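
A seasonal autoregressive transfer model of the kind listed above can be fitted by ordinary least squares. The sketch below is an illustration under stated assumptions, not the presenters' implementation: the function names, the synthetic data, and the coefficient values are all hypothetical, and the solver is a plain Gaussian elimination so that the example needs only the standard library.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_seasonal_ar_transfer(y, x, season=12):
    """OLS fit of y[t] = b0 + b1*y[t-1] + b2*y[t-season] + b3*x[t]."""
    rows = [[1.0, y[t - 1], y[t - season], x[t]] for t in range(season, len(y))]
    target = [y[t] for t in range(season, len(y))]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic, noise-free data generated from known (hypothetical) coefficients.
rng = random.Random(0)
true_b = [5.0, 0.3, 0.5, 2.0]
x = [rng.uniform(0.0, 10.0) for _ in range(120)]
y = [rng.uniform(60.0, 90.0) for _ in range(12)]
for t in range(12, 120):
    y.append(true_b[0] + true_b[1] * y[t - 1] + true_b[2] * y[t - 12] + true_b[3] * x[t])

coefs = fit_seasonal_ar_transfer(y, x)
print([round(c, 4) for c in coefs])
```

Dropping the `x[t]` column from `rows` gives the history-only baseline, so comparing the two fits is one way to approach the "typical question of interest" above.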

12
Model
New Home Sales
Time Series
  • Recent trend with New Home Sales at t-1
  • Seasonality with New Home Sales at t-12
Google Trends
  • Recent search activity on
      • Real Estate Agencies
      • Rental Listings & Referrals
      • Home Inspections & Appraisal
      • Property Management
      • Home Insurance
      • Home Financing
Exogenous Variables
  • Housing affordability with Average/Median Home Price
13
Predicting the present
New Residential Sales from US Census
  • Monthly release 24-28 days after the month
  • Seasonally adjusted
  • National and regional aggregates
Google Trends Real Estate by Category
  • Home Inspections & Appraisal
  • Home Insurance
  • Home Financing
  • Property Management
  • Rental Listings & Referrals
  • Real Estate Agencies
14
New House Sales vs. Real Estate Google Trends
15
Analysis and Forecasting
Model:

$$Y_t = 446.1 + 0.864\,Y_{t-1} - 4.340\,\mathrm{us378.1} + 4.198\,\mathrm{us96.2} - 0.001\,\mathrm{AvgP}_{t-1}$$

  • $Y_t$: new houses sold in month t
  • $\mathrm{AvgP}_{t-1}$: average sales price of new one-family
    houses sold in month t-1
  • us378.1: Google Trends index for vertical id 378
    (Rental Listings & Referrals), 1st week of month t
  • us96.2: Google Trends index for vertical id 96
    (Real Estate Agencies), 2nd week of month t

July 2008: Actual 515K, Predicted 442.98K, Z-score 2.53
August 2008: Prediction 417.52K
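
As a rough worked example of how the fitted equation and the July 2008 figures fit together, the sketch below evaluates the model and backs out the standard error implied by the reported Z-score. Several things here are assumptions rather than facts from the slides: the signs of the coefficients are inferred from the associations stated on the Observations slide, the Z-score is taken to be the residual divided by the prediction standard error, and the function name is hypothetical.

```python
def predict_new_home_sales(y_prev, us378_1, us96_2, avg_price_prev):
    """Evaluate the fitted equation; the result is in thousands of houses.

    Signs are an assumption: negative for Rental Listings & Referrals (378)
    and average price, positive for Real Estate Agencies (96).
    """
    return (446.1 + 0.864 * y_prev - 4.340 * us378_1
            + 4.198 * us96_2 - 0.001 * avg_price_prev)


# July 2008 figures reported on the slide (thousands of houses).
actual_july = 515.0
predicted_july = 442.98
z_score = 2.53

# Assuming z = (actual - predicted) / standard error, solve for the standard error.
implied_se = (actual_july - predicted_july) / z_score
print(round(implied_se, 2))

# Marginal effect of a one-point rise in the Real Estate Agencies index.
effect_us96 = (predict_new_home_sales(500.0, 0.0, 1.0, 0.0)
               - predict_new_home_sales(500.0, 0.0, 0.0, 0.0))
```

Under that reading, the July 2008 residual of about 72K corresponds to a prediction standard error of roughly 28K.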
16
Analysis and Forecasting
  • Observations
      • Since 2005, new house sales have been decreasing,
        with little seasonality
      • Google Trends captures seasonality and recent
        trends
      • Positive association with Real Estate Agencies
        (96)
      • Negative association with Rental Listings &
        Referrals (378) and Average Price

17
Travel
18
Subcategories under Travel by Query Shares
19
Travel to Hong Kong
Visitors Arrival Statistics from Hong Kong Tourism Board
  • Monthly summaries released with a 1-month lag
  • Reports country/territory of residence of
    visitors
  • Data available 2004-2008
Google Trends Travel by Category
  • Hotels & Accommodations
  • Air Travel
  • Car Rental & Taxi Services
  • Cruises & Charters
  • Attractions & Activities
  • Vacation Destinations
      • Australia
      • Caribbean Islands
      • Hawaii
      • Hong Kong
      • Las Vegas
      • Mexico
      • New York City
      • Orlando
  • Adventure Travel
  • Bus & Rail

20
Visitors Arrival Statistics vs. Google Trends
21
Analysis and Forecasting
Model:

$$\log(Y_{i,t}) = 0.664 + 0.113\,\log(Y_{i,t-1}) + 0.828\,\log(Y_{i,t-12}) + 0.001\,X_{i,t,2} + 0.001\,X_{i,t,3} + 0.005\,\mathrm{FXrate}_{i,t} + \alpha_i + e_{i,t}$$

$$e_{i,t} \sim N(0,\ 0.0938^2), \qquad \alpha_i \sim N(0,\ 0.0228^2)$$

  • $Y_{i,t}$: arrivals to Hong Kong in month t from the i-th country
  • $X_{i,t,1}$: Google Trends search index in the 1st week of month t
    from the i-th country
  • $X_{i,t,2}$: Google Trends search index in the 2nd week of month t
    from the i-th country
  • $X_{i,t,3}$: Google Trends search index in the 3rd week of month t
    from the i-th country
  • $\mathrm{FXrate}_{i,t}$: Hong Kong Dollars per one unit of the i-th
    country's local currency in month t. The average FX rate over the
    first week is used as a proxy for each month's FX rate.
  • $\alpha_i$: random effect for the i-th country
22
Visitor Arrival Statistics - Actual & Fitted
23
Analysis and Forecasting
  • Conclusion
      • Arrival at time t is positively associated with
        arrival at time t-1 and arrival at time t-12.
          • It shows strong seasonality and autocorrelation
      • Arrival at time t is positively associated with
        searches on Hong Kong.
      • Arrival at time t is positively associated with
        FX rates.
          • When the local currency appreciates relative to
            the Hong Kong Dollar, visitors to Hong Kong increase.

24
Automobiles
25
US Auto Sales by Make
  • Monthly summaries released 1 week after end of
    month
  • Data available by Car Sales, Truck Sales, and
    Total Sales for each make
  • Data available from 2003-2008
  • Source: Automotive News Data Center
Google Trends under Vehicle Brands Category
  • Google Trends subcategory: Vehicle Brands
  • Weekly search query index
  • 31 verticals in total in this subcategory
  • 27 verticals matched to available monthly sales

26
Google Categories under Vehicle Brands
NOTE: Area represents query volume in the first half of 2008;
color represents the yearly growth rate of queries.
27
Auto Sales by Make (Top 9 Makes by Sales): Monthly
Sales vs. Google Trends in the Second Week of Each
Month
28
Analysis and Forecasting
Fixed effects model:

$$\log(Y_{i,t}) = 2.4276 + 0.2552\,\log(Y_{i,t-1}) + 0.4930\,\log(Y_{i,t-12}) + 0.0005\,X_{i,t,1} + 0.0014\,X_{i,t,2} + a_i\,\mathrm{Make}_i + e_{i,t}$$

$$e_{i,t} \sim N(0,\ 0.1347^2), \qquad \text{Adjusted } R^2 = 0.9829$$

  • $Y_{i,t}$: auto sales of the i-th make in month t
  • $X_{i,t,1}$: Google Trends search index in the 1st week of month t
    for the i-th make
  • $X_{i,t,2}$: Google Trends search index in the 2nd week of month t
    for the i-th make
  • $\mathrm{Make}_i$: dummy variable for auto make
  • $a_i$: coefficient capturing the mean level of auto sales by make

ANOVA Table
                     Df    Sum Sq   Mean Sq     F value    Pr(>F)
trends1               1     12.89     12.89    710.3542   < 2e-16
trends2               1      0.05      0.05      2.7987   0.09455 .
log(s1)               1   1532.95   1532.95  84452.7530   < 2e-16
log(s12)              1     24.07     24.07   1325.9741   < 2e-16
as.factor(brand)     26      3.34      0.13      7.0696   < 2e-16
Residuals          1480     26.86      0.02
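
The ANOVA figures can be cross-checked against the reported error variance with a little arithmetic. The sketch below assumes the usual definitions (residual mean square = residual sum of squares / residual degrees of freedom, and F = term mean square / residual mean square); the variable names are illustrative, and the inputs are the rounded values from the table, so small discrepancies are expected.

```python
import math

# Rounded values read from the Residuals and trends1 rows of the ANOVA table.
residual_sum_sq = 26.86
residual_df = 1480
trends1_mean_sq = 12.89

residual_mean_sq = residual_sum_sq / residual_df
residual_sd = math.sqrt(residual_mean_sq)        # compare with 0.1347 in N(0, 0.1347^2)
f_trends1 = trends1_mean_sq / residual_mean_sq   # compare with the tabulated 710.3542

print(round(residual_sd, 4), round(f_trends1, 1))
```

The residual standard deviation recovered this way agrees with the 0.1347 in the error distribution, and the F statistic for trends1 lands within rounding error of the tabulated value.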
29
Actual vs. Fitted Sales (Top 9 Makes by Sales)
30
Analysis and Forecasting
  • Conclusion
      • Sales at time t are positively associated with
        sales at time t-1 and sales at time t-12.
          • Sales show strong seasonality and autocorrelation
      • Monthly sales are positively correlated with the
        first- and second-week search volume of each
        month.
          • If the search volume increases by 1, the sales
            volume will increase by an average of 0.19%.
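
The 0.19 figure appears to follow from the two search coefficients in the fixed effects model (0.0005 and 0.0014). A minimal check, assuming the search index rises by one point in both the first and second week and reading the result as a percentage change in sales because the model is in logs:

```python
import math

# Coefficients on the weekly search indexes from the fixed effects model.
b_week1 = 0.0005
b_week2 = 0.0014

# Assumption: the index rises by one point in both weeks.
combined = b_week1 + b_week2

# In a log-linear model, a change of `combined` in log(sales) is a
# percentage change of (exp(combined) - 1) * 100.
pct_change = (math.exp(combined) - 1.0) * 100.0
print(round(pct_change, 2))
```

For coefficients this small, the exact percentage change and the simple sum of the coefficients times 100 are nearly identical.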

31
Unemployment
32
YoY Growth in Initial Claims & Google Search

According to the NBER, the current recession
started in December 2007. The national unemployment
rate passed 5% in mid-2008, and search queries on
Welfare and Unemployment also increased at the same
time.
33
Initial claims is an important leading indicator
34
Google Trends data Search Insights screenshot
35
Initial Claims and Google Trends
36
Strong Autocorrelation in Initial Claims
Panels: Time Series; Autocorrelation Function
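
For reference, a sample autocorrelation function like the one plotted on this slide can be computed as follows. This is a generic sketch using the standard estimator; the function name and the trending series are hypothetical and are not the initial claims data.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (standard biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


# Hypothetical, steadily rising weekly series (not real claims data).
weekly_values = [200 + 5 * t for t in range(50)]
acf = autocorrelation(weekly_values, max_lag=4)
print([round(r, 3) for r in acf])
```

A persistent series such as this one shows autocorrelations that start at 1 and decay slowly, which is the pattern the slide title describes.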
37
Initial Claims Before/After Recession Started
Panels: California; New York
38
Time Window for Analysis
Figure annotations: Recession Starts; Window for Long Term Model; Window for Short Term Model
39
Model
Reference model: ARIMA(0,1,1) × (1,0,0)_12
Model with Google Trends: ARIMA(0,1,1) × (1,0,0)_12 model with Google Trends
Model fit improved significantly: smaller standard deviation,
higher log likelihood, and smaller AIC.
Initial Claims are positively correlated with searches on Jobs and Welfare.
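
One common way to write a model of this form, offered as a hedged reading of the notation rather than the presenters' exact specification, treats the Google Trends index as a regressor with seasonal ARIMA errors. The symbols $x_t$, $\beta$, $\theta$, $\Phi$, $B$, and $\varepsilon_t$ are introduced here for illustration and do not appear in the slides:

$$(1 - \Phi B^{12})\,(1 - B)\,(y_t - \beta x_t) = (1 + \theta B)\,\varepsilon_t$$

Here $y_t$ is initial claims, $x_t$ is the Google Trends index, $B$ is the backshift operator ($B y_t = y_{t-1}$), $(1 - B)$ is the first difference, $\theta$ is the moving-average coefficient, and $\Phi$ is the seasonal autoregressive coefficient at lag 12. Setting $\beta = 0$ gives the reference model. The sign convention for $\theta$ varies between software packages.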
40
Long Term Model Prediction Comparison with MAE
  • With Google Trends, the out-of-sample prediction
    MAE decreases by 16.84%.
  • Prediction with a rolling window from 1/11/2009 to
    4/12/2009
  • Prediction error at t
  • Mean absolute error
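
The comparison above can be illustrated with a short sketch. The slide's own formulas for prediction error and MAE are not recoverable from the transcript, so this assumes the plain definition (mean of absolute differences between actual and predicted values); a relative or log-scale error is equally plausible. All numbers and names below are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Mean of |actual - predicted| over the evaluation window."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def percent_reduction(mae_baseline, mae_with_trends):
    """Percentage by which the MAE falls relative to the baseline model."""
    return 100.0 * (mae_baseline - mae_with_trends) / mae_baseline


# Hypothetical out-of-sample values and one-step-ahead predictions.
actual = [100.0, 110.0, 120.0, 130.0]
baseline_pred = [90.0, 120.0, 110.0, 140.0]   # model without Google Trends
trends_pred = [92.0, 118.0, 112.0, 138.0]     # model with Google Trends

mae_baseline = mean_absolute_error(actual, baseline_pred)
mae_trends = mean_absolute_error(actual, trends_pred)
reduction = percent_reduction(mae_baseline, mae_trends)
print(mae_baseline, mae_trends, reduction)
```

In a rolling-window evaluation, each prediction would come from a model refitted on the data available up to that point, with the MAE computed over all the resulting out-of-sample errors.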

41
Short Term Model Prediction Comparison with MAE
  • With Google Trends, the out-of-sample prediction
    MAE decreases by 19.23%.
  • Prediction errors are within the same range as the LT
    Model.
  • Fit improvement is greater with the ST Model.

42
Summary
  • Google Trends significantly improves
    out-of-sample prediction of state unemployment,
    up to 18 days in advance of data release.
  • Mean absolute error for out-of-sample predictions
    declines by 16.84% for the LT Model and 19.23% for the ST
    Model.
  • Further work
      • Can examine metro-level data
      • Other local data (real estate)
      • Combine with other predictors
      • Detect turning points?