Experience with the Gravity Model

1
Experience with the Gravity Model
2
Introduction
  • There is demand for air travel between every
    city-pair in the world (can be very small)
  • We have imperfect data on the actual travel
    (for many of the larger demands)
  • The Gravity Model is the long-standing
    traditional formulation
  • Demand is bigger, the bigger the origin city
  • Demand is bigger, the bigger the destination
  • Demand is smaller, the longer the distance
  • Other things may also matter

3
Prologue
  • We have some experience and prejudices
  • Doubling the origin city size should double the
    demand
  • Doubling the destination size should double the
    demand
  • Cost may be a better metric than distance
  • The zero demands should not be left out
  • Air demand < 200 miles competes with ground
  • Common Language helps demand
  • Common alphabet helps demand
  • Different incomes hurt demand
  • Gravity works better at distributing total
    outbound demand than at estimating size of total
    travel
  • Leisure destinations are origin-specific and
    arbitrary

4
Act 1: We Go Exploring
  • Guidelines (Pirate's Code)
  • Take the easiest first
  • Use places you know about
  • Examine the results in detail
  • US domestic data
  • Best reporting quality
  • One country, one language, one income
  • Lots of points
  • Use Seattle (SEA), Boston (BOS), Chicago (CHI)
  • Disparate types of cities
  • I've lived there

5
SEA, BOS, CHI
  • Passenger data from US ticket sample
  • Origin-Destination reporting
  • Some breakage of interline trips
  • Domestic points only: similar fares, taxes,
    hassle
  • Origin gravity weight
  • Not population or income or .....
  • Use outbound departing passengers
  • Focus results on distributing to destinations
  • Destination gravity weight
  • Arriving passengers to destination
  • O-D data as used is not directional
  • Original source has home city for trips
  • All further data (outside US) will not
  • So we will use US data in non-directional form

6
Starter Formulation
  • Calibrate gravity model
  • Pax = WO × WD / Dist^a
  • Where Pax = Origin-Destination Demand
  • WO = Origin weight (size)
  • WD = Destination weight (size)
  • Dist = Intercity Distance
  • a = Distance exponent (calibrated variable)
  • Use Log formulation, linear least squares fit
  • Examine forecast to actual Passengers/Demand
  • Allow origin size WO to be a calibrated variable
  • One each for SEA, BOS, CHI
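A minimal sketch of the log-form calibration for a single origin, assuming the weight exponents are fixed at 1 so that log(Pax / WD) = c + a × log(Dist). The function name, the synthetic data, and the choice to let exp(c) stand in for the calibrated origin weight WO are assumptions for illustration, not the presenter's code. The sketch writes distance as Dist^a with a negative a, which matches the sign of the exponents quoted elsewhere in the deck.

```python
import math

def fit_log_gravity(pax, dest_weight, dist):
    """Fit log(pax / dest_weight) = c + a * log(dist) by least squares.

    Returns (c, a). For a single origin, exp(c) plays the role of the
    calibrated origin weight WO; a is the distance exponent (negative
    when demand falls with distance).
    """
    x = [math.log(d) for d in dist]
    y = [math.log(p / w) for p, w in zip(pax, dest_weight)]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    a = sxy / sxx
    c = y_mean - a * x_mean
    return c, a

# Synthetic, noise-free data for one origin (made up, not from the deck).
true_wo, true_a = 0.002, -0.66
dest_weight = [5000.0, 12000.0, 800.0, 30000.0, 2500.0]
dist = [300.0, 900.0, 1500.0, 2400.0, 600.0]
pax = [true_wo * w * d ** true_a for w, d in zip(dest_weight, dist)]

c, a = fit_log_gravity(pax, dest_weight, dist)
print("distance exponent:", round(a, 3))
print("origin weight:", round(math.exp(c), 5))
```

Because the synthetic demands are generated without noise, the fit recovers the exponent and origin weight that produced them. Real ticket-sample data would scatter around the line.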

7
Early Observations: Some Wild Outliers
8
Futzing with Data
  • Most Dist < 200 had low actuals
  • Demand diverted to surface modes not in data
  • SEA high actuals were
  • Points in Alaska
  • Had trips interlining in SEA, with broken data
  • Were dedicated Seattle points, like college towns
  • BOS high actuals were
  • Leisure destinations, for Boston
  • Characteristics of high actuals
  • Destinations had small number of origin cities
  • Destinations had one large demand to origin
  • Some were secondary airports in a city
  • End of Act 1

9
Act 2: Our First Regressions
  • We eliminate all points < 200 miles
  • Due to ground competition
  • We eliminate all points with < 12 origins
  • Tend to be captive-to-single-origin points
  • We did a big side-study on share-of-largest
    origin
  • We generate zeros by destination (16)
  • When 1 or 2 of SEA, BOS, CHI lack demand
  • Due to log form, zeros don't work
  • We try .3, .1, .01, and .001 for zeros
  • We get rising a with smaller zeros, significantly
  • We include only 5 zeros, but get same reactions
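The sensitivity to the stand-in value can be reproduced with a toy fit. This is a sketch under stated assumptions: the markets are made up, all destination weights are taken as equal so they drop out, and the one zero-demand market sits at the longest distance. With that setup, each smaller placeholder pulls the fitted distance exponent further below the -0.66 used to generate the nonzero points.

```python
import math

def log_fit_slope(pax, dist):
    """Slope of the least-squares line through (log dist, log pax)."""
    x = [math.log(d) for d in dist]
    y = [math.log(p) for p in pax]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    return sxy / sxx

# Made-up markets with equal destination weights, so the weights drop out.
dist = [300.0, 500.0, 800.0, 1200.0, 1800.0, 2400.0]
observed = [5000.0 * d ** -0.66 for d in dist[:5]]  # five nonzero demands

# The sixth market (2400 miles) reports zero demand; try each stand-in value.
placeholders = (0.3, 0.1, 0.01, 0.001)
slopes = [log_fit_slope(observed + [p], dist) for p in placeholders]

for p, s in zip(placeholders, slopes):
    print(p, round(s, 2))
```

The direction of the shift depends on where the zero points sit. A zero at a short distance would flatten the slope instead.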

10
Small Demands are a Real Problem
  • Regression results driven by zero points
  • Least squares in log form gives equal weight to
    each demand point
  • Log form emphasizes percentage error
  • Actual needs are different
  • Forecast big demands with smaller errors
  • Forecast the small demands merely as small
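A small arithmetic illustration of why the log form weights every point equally: a forecast that is double the actual contributes the same squared log error whether the market has 2 passengers or 20,000. The function name and the numbers are hypothetical.

```python
import math

def log_sq_error(actual, forecast):
    """Squared error in log space, as minimized by a log-form least squares fit."""
    return math.log(forecast / actual) ** 2

small = log_sq_error(actual=2, forecast=4)          # miss of 2 passengers
large = log_sq_error(actual=20000, forecast=40000)  # miss of 20,000 passengers

print(small == large)
print("absolute misses:", 4 - 2, "vs", 40000 - 20000)
```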

11
Compromise
  • Ignore small demands and zeros
  • Require Pax > 10 for all three cities
  • Or drop the destination
  • Merge multiple airports in a city to a single
    city destination
  • (We had been using airports > cities)
  • We now get same answers, with or without
    remaining outliers (errors below ½ or above 2)
  • Errors on large demands more reasonable
  • Most small markets forecast as small
  • Exceptions are large for one origin
  • Could be large for other two, but no online
    services

12
Early Observations
  • Define draw as ratio actual / forecast
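A sketch of the draw ratio in code. Applying the "below ½ or above 2" band from the outlier discussion to the draw is an assumption made here for illustration. The market names and passenger counts are made up.

```python
def draw(actual, forecast):
    """Ratio of actual to forecast passengers for one market."""
    return actual / forecast

def is_outlier(actual, forecast, low=0.5, high=2.0):
    """Flag markets whose draw falls outside the assumed [low, high] band."""
    d = draw(actual, forecast)
    return d < low or d > high

# Hypothetical markets: name -> (actual passengers, forecast passengers)
markets = {"A": (120.0, 100.0), "B": (30.0, 90.0), "C": (500.0, 200.0)}
flags = {name: is_outlier(a, f) for name, (a, f) in markets.items()}

for name, (a, f) in markets.items():
    print(name, round(draw(a, f), 2), flags[name])
```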

13
Lessons Learned So Far
  • Distance exponent a = -0.66
  • NOT the same as domestic fares (Fare ~ Dist^0.2)
  • Do not include zero demand points
  • Destinations with few origins tend to be
    captive
  • Do not use them in generic calibrations
  • To improve errors in forecasts of large demands,
    use only points with large demands
  • Result will forecast small demands small, mostly
  • Use Cities, not airports

14
More Lessons
  • City WO fairly consistent with city size
  • More about this on next slide
  • Ran against Pax data adjusted to standard fares
  • Many under-forecasts were in discount markets
  • Ran international destinations
  • True O&D, not from US ticketing source
  • Distance exponent a of -1.5 (much different)
  • Demands 1/5th of forecasts from domestic
  • Suggests language, or other barriers count
  • Goods research found borders act like 3000 mi.

15
Play within the Play
  • Observed different ratios to total outbound
    travel for SEA, BOS, CHI (WO).
  • But not very different
  • Ran all US domestic pairs (Pax > 10)
  • Using just a single variable (WO × WD), with
    exponent ß
  • Results
  • Distance exponent a = -0.55 (had been -0.66)
  • City-Size exponent ß = 0.85
  • Suggests larger cities have smaller demands
  • Maybe because a higher % of demands are > 1 and
    therefore are captured by the data base. (Bigger W
    → bigger demands.)
  • Also small cities show more short-haul, which was
    removed
  • Otherwise, large cities have more direct services
    and lower fares!
  • Interpretation allows ß = 1.0 to be reasonable
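A sketch of the two-exponent log regression, assuming the model log(Pax) = c + ß × log(WO × WD) + a × log(Dist) and solving the normal equations directly. The helper names and the synthetic city-pair data are hypothetical. The data are generated with ß = 0.85 and a = -0.55 only so that the recovered values can be checked.

```python
import math

def solve_linear(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][k] * sol[k] for k in range(r + 1, n))
        sol[r] = (aug[r][n] - known) / aug[r][r]
    return sol

def fit_two_exponents(pax, weight_product, dist):
    """Least squares fit of log(pax) on log(WO*WD) and log(dist).

    Returns (c, beta, a): intercept, city-size exponent, distance exponent.
    """
    rows = [[1.0, math.log(w), math.log(d)] for w, d in zip(weight_product, dist)]
    y = [math.log(p) for p in pax]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    c, beta, a = solve_linear(xtx, xty)
    return c, beta, a

# Synthetic, noise-free city pairs (made up for illustration).
weight_product = [1.0e6, 4.0e6, 2.5e5, 9.0e6, 1.6e7, 5.0e5]
dist = [400.0, 1200.0, 700.0, 2500.0, 900.0, 1800.0]
pax = [1e-3 * w ** 0.85 * d ** -0.55 for w, d in zip(weight_product, dist)]

c, beta, a = fit_two_exponents(pax, weight_product, dist)
print("beta:", round(beta, 3), "a:", round(a, 3))
```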

16
Act 3: European Regional
  • New set of data points
  • London, Copenhagen, Istanbul
  • 200 mi < Distance < 2800 mi
  • All 3 (LON, CPH, IST) have Pax > 10
  • 219 points
  • Regression Results
  • Distance exponent a near -0.80
  • Origin-Specific adjustments not significant
  • Removing outliers has small effect on answers
  • Some really big errors in really big markets
  • Tends to confirm US data experiences

17
Europe: All Points
  • Distance > 200 mi
  • Pax > 10
  • Least squares regression
  • Distance exponent a goes to -1.2
  • Weights (WO × WD) exponent ß = 0.4
  • Gives almost all demands near 40
  • Results Not Satisfactory
  • Distance exponent seems wrong (beyond -1)
  • City size (weights) exponent ß too far from 1.0
  • Unsatisfactory forecasts by inspection
  • Most big markets forecast too small
  • Most smaller markets forecast too big

18
Go Back to Detailed Look
  • All markets with Pax > 200
  • Drop 12 high-side outliers
  • Redefine Error
  • Not percentage-error-squared (log least sq.)
  • Not Diff (Passengers − Forecast)
  • Compromise: Diff^0.75
  • Compromise is halfway between size and %
  • Iterate

19
Iterative Procedure
  • Start with Distance and Weight Exponents = 1
  • Adjust scaling so median Forecast/Pax = 1.00
  • Adjust Weight exponent ß to reduce Error
  • Readjust scaling on each try
  • Adjust Distance exponent a to reduce Error
  • Readjust scaling on each try
  • Iterate to find min Σ Diff^0.75 (min Error)
  • An ugly, unofficial, but practical, process
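One possible reading of the procedure, sketched as a simple coordinate search. Everything here is an assumption for illustration: the step size, the stopping rule, the function names, and the synthetic data are not from the presentation. The sketch writes distance as Dist^a and starts a at -1, takes the error as the sum of |Pax − Forecast|^0.75, and rescales before every error evaluation so that the median Forecast/Pax is 1.

```python
import statistics

def forecast(weight_product, dist, beta, a, scale):
    """Gravity forecast: scale * (WO*WD)^beta * Dist^a for each market."""
    return [scale * w ** beta * d ** a for w, d in zip(weight_product, dist)]

def scaled_error(pax, weight_product, dist, beta, a, power=0.75):
    """Rescale so median forecast/pax is 1, then sum |pax - forecast|^power."""
    raw = forecast(weight_product, dist, beta, a, 1.0)
    scale = 1.0 / statistics.median(f / p for f, p in zip(raw, pax))
    err = sum(abs(p - scale * f) ** power for p, f in zip(pax, raw))
    return err, scale

def calibrate(pax, weight_product, dist, step=0.05, max_rounds=200):
    """Nudge beta, then a, by +/- step; keep any move that lowers the error."""
    beta, a = 1.0, -1.0
    best, scale = scaled_error(pax, weight_product, dist, beta, a)
    for _ in range(max_rounds):
        improved = False
        for d_beta, d_a in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            err, s = scaled_error(pax, weight_product, dist, beta + d_beta, a + d_a)
            if err < best:
                beta, a, best, scale = beta + d_beta, a + d_a, err, s
                improved = True
        if not improved:
            break
    return beta, a, scale, best

# Synthetic markets with deterministic "noise" multipliers (all made up).
weight_product = [1.0e6, 4.0e6, 2.5e5, 9.0e6, 1.6e7, 5.0e5, 2.0e6]
dist = [400.0, 1200.0, 700.0, 2500.0, 900.0, 1800.0, 300.0]
noise = [1.3, 0.8, 1.1, 0.6, 1.5, 0.9, 1.0]
pax = [1e-3 * w ** 1.25 * d ** -1.05 * n
       for w, d, n in zip(weight_product, dist, noise)]

start_err, _ = scaled_error(pax, weight_product, dist, 1.0, -1.0)
beta, a, scale, final_err = calibrate(pax, weight_product, dist)
print("beta:", round(beta, 2), "a:", round(a, 2))
print("error:", round(start_err, 1), "->", round(final_err, 1))
```

A search like this only finds a local minimum on a fixed grid of steps, which is in keeping with the "ugly, unofficial, but practical" description.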

20
Results from Procedure
  • Distance Exponent a likes to be -1.05
  • Could be cultural distance
  • City Weights Exponent ß likes to be 1.25
  • Why???
  • Two effects are independent
  • Many too big forecasts for small demands

21
Poor Fit of Forecast to Data
22
One Last Regression
  • All Europe: Classic Gravity formula
  • Pax > 10, Dist > 200
  • Distance exponent fixed at 1.00
  • City weight exponent fixed at 1.00
  • Allowed factor for same country
  • Was about 5x, as for US vs International
  • Nice scatter
  • Fewer unreasonable forecasts
  • Huge errors everywhere

23
Gravity Forecast is Very Poor
24
Obituary on Gravity Model
  • Forecasts are really bad
  • Outliers have large effect on answer
  • Need to be removed
  • Zeros have large effect on answer
  • Forecasts more sensible when not included
  • Results will be misleading
  • Small markets will be forecast as medium

25
Overall Conclusion
  • Air travel between cities is
  • Strongly influenced by city-pair specific factors
  • Not amenable to gravity model approach
  • If you have to have a forecast
  • Calibrate from existing larger culturally similar
    cities to same destinations
  • Recognize the same country effect is large
    (maybe 5x)

26
More Gravity: Long Haul
  • All world markets
  • Distance > 3100 mi (5000 km)
  • Passengers > 20
  • No existing nonstop service
  • Least Squares Regressions
  • Four Equations (log calibrations)
  • 1. Traditional: calibrate ratio to gravity term
  • 2. Distance exponent a only (-1.37)
  • 3. Whole Gravity term exponent only (0.19)
  • 4. Separate City Size ß and Distance a exponents
  • (ß = 0.18 and a = -0.03)

27
Best Fit Was Not Useful, Measured by Either % or
Value Errors
  • Models 3 & 4 fit best
  • Fit achieved by low variance
  • No forecasts at large values
  • No forecasts at small values
  • Most forecasts near 40
  • This is a pretty worthless forecast
  • Model 2 had much worse misses than 1
  • Traditional Gravity form had least harmful
    answers

28
Traditional Gravity Was Best, But Not Good
29
Median Forecasts are Weakly Correlated with
Actuals