Title: The Complexity of Testing Forecasts
1The Complexity of Testing Forecasts
- Lance Fortnow, Computer Science, University of Chicago
- Rakesh Vohra, Kellogg MEDS, Northwestern
2Economic Models
5Auctions
6Markets
7Voting
8Computationally-Bounded Agents
9Forecasting
- How can we certify a forecaster?
- How well can a forecaster do when the forecaster
only knows the past and not the future?
10Certifying a Weatherman
[Figure: a sequence of daily forecasts, in percent: 60, 20, 40, 70, 30, 50, 80, 50.]
11Our Research
- We create a test of forecasters that cannot be
fooled unless the forecaster can solve
computationally hard problems.
12Calibration
[Figure: a sequence of daily forecasts, in percent: 60, 20, 40, 70, 30, 50, 80, 50.]
13Calibration Theorem
- Foster-Vohra
- There exists a probabilistic forecaster that,
given any sequence, will with high probability
be calibrated on that sequence.
- Proof uses min-max theorem.
- Holds even if we require calibration over
a countable number of subsequences.
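As a rough illustration of what "calibrated" means here, the sketch below groups rounds by the announced probability and compares each value with the observed frequency of rain. The function names and the sample data are hypothetical, and calibration in the theorem is a long-run property, so a finite check like this is only suggestive.

```python
from collections import defaultdict

def calibration_table(forecasts, outcomes):
    """Group rounds by forecast value and report the observed frequency.

    forecasts: announced probabilities of rain, one per day
    outcomes:  1 if it rained that day, 0 otherwise
    Returns {forecast: (observed_frequency, number_of_days)}.
    """
    buckets = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        buckets[p].append(y)
    return {p: (sum(ys) / len(ys), len(ys)) for p, ys in buckets.items()}

def calibration_score(forecasts, outcomes):
    """Count-weighted average gap between forecast and observed frequency."""
    table = calibration_table(forecasts, outcomes)
    n = len(forecasts)
    return sum(cnt * abs(p - freq) for p, (freq, cnt) in table.items()) / n

# Hypothetical data: 0.5 is announced on four days and it rains on two of them.
forecasts = [0.5, 0.5, 0.5, 0.5, 1.0, 0.0]
outcomes = [1, 0, 1, 0, 1, 0]
score = calibration_score(forecasts, outcomes)  # zero gap on this sample
```

A score near zero on a long sequence is what the calibration test looks for; the theorem says a forecaster can achieve this without knowing anything about the weather.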
14Certifying a Weatherman
[Figure: a sequence of daily forecasts, in percent: 60, 20, 40, 70, 30, 50, 80, 50.]
15Log Loss
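[Figure: the daily forecasts, in percent (60, 20, 40, 70, 30, 50, 80, 50), each paired with the log of the probability given to what actually happened: Log .6, Log .8, Log .6, Log .7, Log .3, Log .5, Log .8, Log .5.]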
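The log scores on this slide can be added up directly. The sketch below takes the eight probabilities that the forecaster assigned to the realized outcomes (the arguments of the Log terms above) and computes the average log-loss. The function name is hypothetical, and the natural logarithm is an assumption, since the slide does not state a base.

```python
import math

def average_log_loss(probs_of_realized):
    """Negative mean log of the probabilities given to what actually happened."""
    return -sum(math.log(p) for p in probs_of_realized) / len(probs_of_realized)

# Probabilities assigned to the realized outcomes, as listed on the slide.
realized = [0.6, 0.8, 0.6, 0.7, 0.3, 0.5, 0.8, 0.5]

loss = average_log_loss(realized)

# The sum of the logs is the log of the product, so the same score can be
# read as the probability the forecaster gave to the whole observed sequence.
product = math.prod(realized)
```

Lower loss is better; a forecaster who always gave probability 1 to what then happened would have loss 0.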
16Entropy
- Best possible long-term average log-loss for a
forecaster equals the long-term average entropy
of the distribution.
- Since we don't know the entropy of the
distribution, we can't judge whether the
forecaster is doing well.
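A one-day version of the first claim can be checked numerically. The sketch below assumes a single binary event with true probability p (a hypothetical value of 0.7) and shows that the expected log-loss is smallest when the forecast q equals p, where it equals the binary entropy of p. The function names are illustrative only.

```python
import math

def binary_entropy(p):
    """Entropy (in nats) of a coin that lands heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))

def expected_log_loss(p, q):
    """Expected log-loss of forecasting q (0 < q < 1) when the true chance is p."""
    return -(p * math.log(q) + (1 - p) * math.log(1 - q))

p = 0.7  # hypothetical true probability of rain
candidates = [i / 100 for i in range(1, 100)]
best_q = min(candidates, key=lambda q: expected_log_loss(p, q))
```

Because the tester does not know p, it cannot compute the entropy benchmark, which is the difficulty the second bullet points to.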
17Experts Test
- Test by checking whether forecaster does at least
as well as some fixed large collection of
experts.
- Various experts theorems create a
forecaster that will fool these tests.
18Sandroni's Theorem
- Let T be a test of forecasters that
- Always passes the Truth
- Says Pass or Fail after finite rounds.
- For every such test T, there is a probabilistic
forecaster that, knowing nothing except the path
so far, will with high probability pass test T.
19Forecaster
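[Figure: depth-3 binary tree of forecasts; the two edges leaving each node carry complementary probabilities (0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7).]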
20Tester
[Figure: the forecaster's tree (edge probabilities 0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7) with the eight leaves labeled Pass, Fail, Fail, Pass, Pass, Fail, Pass, Pass.]
21Testing Forecaster
[Figure: the forecaster's tree (edge probabilities 0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7) with the eight leaves labeled Pass, Fail, Fail, Pass, Pass, Fail, Pass, Pass.]
22Nature
[Figure: nature's depth-3 binary tree, with edge probabilities 0.4/0.6, 0.8/0.2, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7.]
23The Truth
[Figure: nature's probability tree shown together with a forecaster's tree carrying edge probabilities such as 0.4/0.6, 0.8/0.2, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.3/0.7.]
24Passing The Truth
- A tester passes the truth if, whenever a forecaster
outputs probabilities identical to nature's,
the tester passes that forecaster with
high probability.
25Forecaster
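[Figure: depth-3 binary tree of forecasts; the two edges leaving each node carry complementary probabilities (0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7).]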
27Probabilistic Forecaster
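[Figure: three forecasters chosen at random with weights 25%, 40%, and 35%; their edge probabilities are 0.6/0.4, 0.3/0.7, and 0.2/0.8.]
- Forecaster plays a mixed strategy.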
28Finite Decisions
[Figure: tree cut off at a finite depth, with each leaf labeled P (Pass) or F (Fail).]
29Sandroni's Theorem
- Let T be a test of forecasters that
- Always passes the Truth
- Says Pass or Fail after finite rounds.
- For every such test T, there is a probabilistic
forecaster that, knowing nothing except the path
so far, will with high probability pass test T.
30Getting Around Sandroni
- Allow decisions of Pass or Fail based on
infinite paths.
- Dekel and Feinberg (2006)
- Never say Pass but must Fail in a finite
number of steps.
- Olszewski and Sandroni (2006)
31Our Results
- Don't try to catch all forecasters, just those
that run in a short amount of time.
- For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
- Passes The Truth on every distribution.
- For some distribution of nature, every
probabilistic forecaster running in time t(n)
will be caught with high probability.
- n = number of rounds
32Further Results
- We give a linear-time tester
- Passes The Truth on every distribution.
- For every number m there is some distribution
such that if a probabilistic forecaster F fools
our tester then we can use the forecaster to
factor m.
- Base hardness on more difficult problems:
NP-complete and beyond.
33First Result
- For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
- Passes The Truth on every distribution.
- For some distribution of nature, every
probabilistic forecaster running in time t(n)
will be caught with high probability.
35First Result
- For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
- Passes The Truth on every distribution.
- For some distribution of nature, a fixed
deterministic forecaster running in time t(n)
will be caught with high probability.
36Forecaster
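[Figure: depth-3 binary tree of forecasts; the two edges leaving each node carry complementary probabilities (0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7).]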
37Low Weight Path
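[Figure: the forecaster's tree (edge probabilities 0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7) with one low-weight root-to-leaf path highlighted.]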
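One way to read the low-weight path is as the path nature gets by always choosing the outcome to which a deterministic forecaster gives probability at most 0.5. The sketch below follows that reading; it is an assumption about the construction, not a transcription of it, and `toy_forecaster` is a hypothetical stand-in for any deterministic forecaster.

```python
def low_weight_path(forecaster, n):
    """Follow, for n rounds, the outcome the forecaster finds no more likely than 0.5.

    forecaster(history) returns the probability it gives to outcome 1
    after seeing the tuple of past outcomes `history`.
    Returns the path and the product of the forecaster's probabilities along it.
    """
    history = []
    weight = 1.0
    for _ in range(n):
        p_one = forecaster(tuple(history))
        if p_one <= 0.5:
            outcome, p = 1, p_one
        else:
            outcome, p = 0, 1.0 - p_one
        history.append(outcome)
        weight *= p
    return history, weight

def toy_forecaster(history):
    # Hypothetical deterministic forecaster: alternates between 0.6 and 0.2.
    return 0.6 if len(history) % 2 == 0 else 0.2

path, weight = low_weight_path(toy_forecaster, 8)
```

Each step multiplies the weight by at most 0.5, so after n rounds the forecaster has given this path probability at most 2^-n.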
38Tester
[Figure: the tree with its leaves labeled Pass.]
39Tester
On the path: Fail if the product of forecasts is less than ?
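The rule on this slide can be sketched as a small function: off the designated path the tester always passes, and on the path it fails the forecaster when the product of its forecasts is too small. The cutoff is not recoverable from the slide, so `threshold` below is a hypothetical parameter, as are the function and argument names.

```python
def path_tester(designated_path, outcomes, forecast_probs, threshold):
    """Return "Pass" or "Fail" for one finite run.

    designated_path: the outcome sequence the tester watches for
    outcomes:        the outcomes that actually occurred
    forecast_probs:  probability the forecaster gave to each realized outcome
    threshold:       hypothetical cutoff (the slide's value is not legible)
    """
    if list(outcomes) != list(designated_path):
        return "Pass"  # off the path the tester never fails anyone
    product = 1.0
    for p in forecast_probs:
        product *= p
    return "Fail" if product < threshold else "Pass"

watched = [0, 1, 0, 1, 0, 1, 0, 1]
# A forecaster that gave the watched path little weight is failed on it...
verdict_low = path_tester(watched, watched, [0.4, 0.2] * 4, threshold=0.5 ** 8)
# ...while one that predicted the path with certainty is passed.
verdict_sure = path_tester(watched, watched, [1.0] * 8, threshold=0.5 ** 8)
```

If nature puts all its probability on the watched path, the truthful forecaster assigns it product 1 and passes, which is how a test of this shape can pass the truth.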
40Tester
[Figure: every leaf is labeled Pass except the leaf on the low-weight path, labeled "Pass if ?".]
41Passes the Truth
[Figure: every leaf is labeled Pass except the leaf on the low-weight path, labeled "Pass if ?".]
42Fails the Forecaster
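[Figure: the forecaster's tree (edge probabilities 0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7); every leaf is labeled Pass except the leaf on the low-weight path, labeled "Pass if ?".]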
43Fails the Forecaster
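[Figure: the forecaster's tree (edge probabilities 0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7) with nature's probabilities 1 and 0 marked on the edges along one path; every leaf is labeled Pass except the leaf on that path, labeled "Pass if ?".]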
44Running Time
- The tester need only simulate the forecaster,
using only slightly more computation time.
45First Result
- For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
- Passes The Truth on every distribution.
- For some distribution of nature, a fixed
deterministic forecaster running in time t(n)
will be caught with high probability.
46First Result
- For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
- Passes The Truth on every distribution.
- For some distribution of nature, a fixed
probabilistic forecaster running in time t(n)
will be caught with high probability.
47Probabilistic Forecaster
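[Figure: three forecasters chosen at random with weights 25%, 40%, and 35%; their edge probabilities are 0.6/0.4, 0.3/0.7, and 0.2/0.8.]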
48Probabilistic Forecaster
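[Figure: three forecasters chosen at random with weights 25%, 40%, and 35%; their edge probabilities are 0.6/0.4, 0.3/0.7, and 0.2/0.8.]
Take the path more likely to have a forecast at most 0.5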
49Tester
[Figure: every leaf is labeled Pass except one, labeled "Pass if".]
50First Result
- For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
- Passes The Truth on every distribution.
- For some distribution of nature, a fixed
probabilistic forecaster running in time t(n)
will be caught with high probability.
51Enumerating Forecasters
- There are countably many probabilistic
forecasters that run in time t(n).
- F1, F2, F3, ...
52Combine to Form Paths
[Figure: paths for the forecasters F1, F2, F3, F4, F5 combined into one tree.]
53Tester
[Figure: the combined tree for F1, F2, F3, F4, F5, with its leaves labeled P.]
Fail if product of probabilities is at most ?
54First Result
- For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
- Passes The Truth on every distribution.
- For some distribution of nature, every
probabilistic forecaster running in time t(n)
will be caught with high probability.
- Can we create a tester where the forecaster needs
to use much more time than the tester?
55Forecaster Must Factor
- We give a linear-time tester
- Passes The Truth on every distribution.
- For every number m there is some distribution
such that if a probabilistic forecaster F fools
our tester then we can use the forecaster to
factor the number m.
56Viewing Tree as Numbers
[Figure: tree nodes labeled with the numbers 64929, 103079, 131699, 702157.]
57Viewing Tree as Numbers
[Figure: tree whose nodes are labeled with integers, including 64929, 103079, 131699, and 702157 and smaller values such as 941, 881, 257, 19, 7, 3, and 1.]
58Factorization Paths
[Figure: tree whose nodes are labeled with integers, including 64929, 103079, 131699, and 702157 and smaller values such as 941, 881, 257, 19, 7, 3, and 1, with the factorization paths marked.]
59Tester
[Figure: tree whose nodes are labeled with integers, including 64929, 103079, 131699, and 702157 and smaller values such as 941, 881, 257, 19, 7, 3, and 1; every leaf is labeled P.]
Fail if product of blue edges are
60Efficient
[Figure: tree whose nodes are labeled with integers, including 64929, 103079, 131699, and 702157 and smaller values such as 941, 881, 257, 19, 7, 3, and 1; every leaf is labeled P.]
Fail if product of blue edges are
61Passes the Truth
[Figure: tree whose nodes are labeled with integers, including 64929, 103079, 131699, and 702157 and smaller values such as 941, 881, 257, 19, 7, 3, and 1; every leaf is labeled P.]
Fail if product of blue edges are
62Forecaster Must Factor
[Figure: tree whose nodes are labeled with integers, including 64929, 103079, 131699, and 702157 and smaller values such as 941, 881, 257, 19, 7, 3, and 1; every leaf is labeled P.]
Fail if product of blue edges are
64Forecaster Must Factor
- We give a linear-time tester
- Passes The Truth on every distribution.
- For every number m there is some distribution
such that if a probabilistic forecaster F fools
our tester then we can use the forecaster to
factor the number m.
65How Hard is Factoring?
- Testing Primality is Easy
- Solovay-Strassen, Agrawal-Kayal-Saxena
- Factoring Seems Hard
- Basis of Modern Cryptography
- No complexity basis for hardness of factoring.
- Can factor with (hypothetical) quantum computer
(Shor).
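To make "Testing Primality is Easy" concrete, here is a minimal sketch of the Solovay-Strassen randomized test named above. It checks Euler's criterion against the Jacobi symbol for random bases; a composite number survives each round with probability at most 1/2. The function names and the default of 20 rounds are illustrative choices, not taken from the talk.

```python
import random

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a  # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def is_probable_prime(n, rounds=20, rng=random):
    """Solovay-Strassen test: False means composite, True means probably prime."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)
        j = jacobi(a, n)
        # Euler's criterion: for prime n, a^((n-1)/2) is congruent to (a/n) mod n.
        if j == 0 or pow(a, (n - 1) // 2, n) != j % n:
            return False
    return True

rng = random.Random(0)  # seeded only to make the example repeatable
prime_like = is_probable_prime(941, rng=rng)
composite = is_probable_prime(64929, rng=rng)
```

Note the asymmetry the slide relies on: the test says 64929 is composite without producing any of its factors, and no comparably fast method for finding them is known.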
66Search Problems
- We would like to base the hardness on general
NP search problems.
- Traveling Salesperson
- Map Coloring
- Boolean Formula Satisfiability
- Problem: the proof needs unique witnesses, or the
test might fail the truth.
67Solution: Interactive Proofs
- Embed Interactive Proof System into the tester.
- Bonus: we get not only all NP search problems but
all of PSPACE.
- Not only must the forecaster route salespeople and
color maps, but it must also play a perfect game of
chess.
68Future Directions
- Complexity Gap
- PSPACE vs. Exponential Time
- Our proofs require Nature to create
hard-to-compute distributions.
- Connections to Entropy and Dimension
- Generally Applying Computational Complexity Tools
to Economic Models