The Complexity of Testing Forecasts - PowerPoint PPT Presentation

Slides: 69

Provided by: lancef

Transcript and Presenter's Notes

Title: The Complexity of Testing Forecasts

1
The Complexity of Testing Forecasts

Lance Fortnow, Computer Science, University of Chicago
Rakesh Vohra, Kellogg MEDS, Northwestern

2
Economic Models
3
Economic Models
4
Economic Models
5
Auctions
6
Markets
7
Voting
8
Computationally-Bounded Agents
9
Forecasting

How can we certify a forecaster?
How well can a forecaster do when the forecaster
only knows the past and not the future?

10
Certifying a Weatherman
[Figure: sequence of forecasts, in percent: 60, 20, 40, 70, 30, 50, 80, 50]
11
Our Research

We create a test of forecasters that cannot be
fooled unless the forecaster can solve
computationally hard problems.

12
Calibration
[Figure: sequence of forecasts, in percent: 60, 20, 40, 70, 30, 50, 80, 50]
13
Calibration Theorem

Foster-Vohra
There exists a probabilistic forecaster that
given any sequence will, with high probability,
be calibrated on that sequence.
Proof uses min-max theorem.
Holds even if we require calibration over
countable number of subsequences.
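
As a rough illustration of what a calibration check looks at, the following Python sketch groups rounds by forecast value and compares each forecast with the empirical frequency of the event on those rounds. The function name `calibration_table` and the sample data are illustrative assumptions, not taken from the talk, and the sketch only measures calibration; it does not construct the Foster-Vohra forecaster.

```python
from collections import defaultdict

def calibration_table(forecasts, outcomes):
    """Map each distinct forecast p to (number of rounds, empirical frequency)."""
    buckets = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        buckets[p].append(y)
    return {p: (len(ys), sum(ys) / len(ys)) for p, ys in buckets.items()}

# Hypothetical data: 1 means the event (say, rain) occurred on that round.
forecasts = [0.5, 0.5, 0.8, 0.8, 0.8, 0.8, 0.8, 0.2, 0.2, 0.2, 0.2, 0.2]
outcomes  = [1,   0,   1,   1,   1,   1,   0,   0,   0,   1,   0,   0]

table = calibration_table(forecasts, outcomes)
for p, (count, freq) in sorted(table.items()):
    print(f"forecast {p:.1f}: {count} rounds, empirical frequency {freq:.2f}")
```

A forecaster is calibrated on a sequence when, for each forecast value used often enough, the empirical frequency is close to the forecast. The sample data above was chosen so that the two match on every bucket.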

14
Certifying a Weatherman
[Figure: sequence of forecasts, in percent: 60, 20, 40, 70, 30, 50, 80, 50]
15
Log Loss
[Figure: forecasts, in percent: 60, 20, 40, 70, 30, 50, 80, 50, with the corresponding terms log .6, log .8, log .6, log .7, log .3, log .5, log .8, log .5]
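
The log terms on this slide can be reproduced with a short Python sketch. The forecasts are the eight values from the slide; the outcomes are inferred from the log terms (for example, a 20 percent forecast scored as log .8 implies the event did not occur), and the two 50 percent rounds are ambiguous, so their outcomes are assumed. The function names are illustrative.

```python
import math

# Forecast probabilities of the event, as read from the slide.
forecasts = [0.6, 0.2, 0.4, 0.7, 0.3, 0.5, 0.8, 0.5]
# Outcomes inferred from the slide's log terms (1 = event occurred).
# The two 0.5 rounds are ambiguous on the slide and are assumed here.
outcomes = [1, 0, 0, 1, 1, 1, 1, 0]

def realized_probabilities(forecasts, outcomes):
    """Probability the forecaster assigned to what actually happened."""
    return [p if y == 1 else 1.0 - p for p, y in zip(forecasts, outcomes)]

def average_log_loss(forecasts, outcomes):
    """Negative mean of the log of the realized probabilities."""
    probs = realized_probabilities(forecasts, outcomes)
    return -sum(math.log(q) for q in probs) / len(probs)

print(realized_probabilities(forecasts, outcomes))
print(average_log_loss(forecasts, outcomes))
```

The realized probabilities come out as .6, .8, .6, .7, .3, .5, .8, .5, matching the slide. A lower average log-loss means the forecaster put more probability on what happened.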
16
Entropy

Best possible long-term average log-loss for a
forecaster equals the long-term average entropy
of the distribution.
Since we don't know the entropy of the
distribution we can't judge whether the
forecaster is doing well.
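
The first claim is an instance of Gibbs' inequality. As a single-round sketch, write p for nature's probability of the event and q for the forecast (both symbols are chosen here for illustration). The expected log-loss then splits into the entropy plus a nonnegative penalty:

```latex
\mathbb{E}\bigl[\mathrm{loss}(q)\bigr]
  = -p \log q - (1-p)\log(1-q)
  = H(p) + D(p \,\|\, q) \;\ge\; H(p),
\qquad
H(p) = -p \log p - (1-p)\log(1-p),
\qquad
D(p \,\|\, q) = p \log\frac{p}{q} + (1-p)\log\frac{1-p}{1-q} \;\ge\; 0 .
```

Equality holds exactly when q = p, so forecasting nature's own probability attains the entropy and no forecast does better in expectation.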

17
Experts Test

Test by checking whether forecaster does at least
as well as some fixed large collection of
experts.
Various experts theorems yield a
forecaster that will fool these tests.

18
Sandroni's Theorem

Let T be a test of forecasters that
Always passes the Truth
Says Pass or Fail after finitely many rounds.
For every such test T, there is a probabilistic
forecaster that, knowing nothing except the path
so far, will with high probability pass test T.

19
Forecaster
[Figure: the forecaster's probability tree, with edge probabilities 0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7]
20
Tester
[Figure: probability tree with edge probabilities 0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7 and leaves labeled Pass, Fail, Fail, Pass, Pass, Fail, Pass, Pass]
21
Testing Forecaster
[Figure: probability tree with edge probabilities 0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7 and leaves labeled Pass, Fail, Fail, Pass, Pass, Fail, Pass, Pass]
22
Nature
[Figure: nature's probability tree, with edge probabilities 0.4/0.6, 0.8/0.2, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7]
23
The Truth
[Figure: probability trees for the forecaster and for nature]
24
Passing The Truth

A tester passes the truth if, whenever a forecaster
outputs probabilities identical to nature's,
the tester passes that forecaster with
high probability.

25
Forecaster
[Figure: the forecaster's probability tree, with edge probabilities 0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7]
26
Probabilistic Forecaster
27
Probabilistic Forecaster
[Figure: a mixture of forecasts: 0.6/0.4 with weight 25%, 0.3/0.7 with weight 40%, 0.2/0.8 with weight 35%]

Forecaster plays a mixed strategy.
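
A mixed strategy of this kind can be sketched in Python as a weighted draw over deterministic forecasts. The weights (25, 40, 35 percent) and forecast pairs are the ones that appear on the slide; pairing them in this order, and the names `MIXTURE`, `sample_forecast` and `expected_forecast`, are assumptions made for illustration.

```python
import random

# (weight, (probability of one outcome, probability of the other)),
# as the values appear to be grouped on the slide.
MIXTURE = [
    (0.25, (0.6, 0.4)),
    (0.40, (0.3, 0.7)),
    (0.35, (0.2, 0.8)),
]

def sample_forecast(mixture, rng):
    """Draw one deterministic forecast according to the mixture weights."""
    weights = [w for w, _ in mixture]
    forecasts = [f for _, f in mixture]
    return rng.choices(forecasts, weights=weights, k=1)[0]

def expected_forecast(mixture):
    """Weighted average of the first coordinate of each forecast."""
    return sum(w * f[0] for w, f in mixture)

rng = random.Random(0)  # fixed seed only so the demo is repeatable
print(sample_forecast(MIXTURE, rng))
print(expected_forecast(MIXTURE))
```

The tester only ever sees the forecast that was drawn, so "passing with high probability" is a statement over the forecaster's own randomness.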

28
Finite Decisions
[Figure: tree whose leaves are labeled P (Pass) or F (Fail)]
29
Sandroni's Theorem

Let T be a test of forecasters that
Always passes the Truth
Says Pass or Fail after finitely many rounds.
For every such test T, there is a probabilistic
forecaster that, knowing nothing except the path
so far, will with high probability pass test T.

30
Getting Around Sandroni

Allow decisions of Pass or Fail based on
infinite paths (Dekel and Feinberg, 2006).
Never say Pass, but must Fail in a finite
number of steps (Olszewski and Sandroni, 2006).

31
Our Results

Don't try to catch all forecasters, just those
that run in a short amount of time.
For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
Passes The Truth on every distribution.
For some distribution of nature, every
probabilistic forecaster running in time t(n)
will be caught with high probability.
n is the number of rounds.

32
Further Results

We give a linear time tester
Passes The Truth on every distribution.
For every number m there is some distribution
such that if a probabilistic forecaster F fools
our tester then we can use the forecaster to
factor m.
Base hardness on more difficult problems:
NP-complete and beyond.

33
First Result

For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
Passes The Truth on every distribution.
For some distribution of nature, every
probabilistic forecaster running in time t(n)
will be caught with high probability.

34
First Result

For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
Passes The Truth on every distribution.
For some distribution of nature, every
probabilistic forecaster running in time t(n)
will be caught with high probability.

35
First Result

For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
Passes The Truth on every distribution.
For some distribution of nature, a fixed
deterministic forecaster running in time t(n)
will be caught with high probability.

36
Forecaster
[Figure: the forecaster's probability tree, with edge probabilities 0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7]
37
Low Weight Path
[Figure: the forecaster's probability tree (0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7) with a low-weight path marked]
38
Tester
[Figure: tree with leaves labeled Pass]
39
Tester
On Path Fail if product of forecasts is less
than ?
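
The rule on this slide can be sketched in Python as follows. The threshold on the slide did not survive in the transcript, so `threshold` is a hypothetical parameter, and the choice `eps * 2**-n` below is an assumption made for this sketch rather than the value used in the talk. The slide applies the rule on one designated path; the sketch simply applies it to whatever path it is given.

```python
from math import prod

def realized_product(forecasts, outcomes):
    """Product of the probabilities the forecaster gave to what actually happened."""
    return prod(p if y == 1 else 1.0 - p for p, y in zip(forecasts, outcomes))

def product_test(forecasts, outcomes, threshold):
    """Return 'Fail' if the realized product is below `threshold`, else 'Pass'.

    `threshold` is hypothetical: the value on the slide is missing.
    """
    return "Fail" if realized_product(forecasts, outcomes) < threshold else "Pass"

n = 8                        # number of rounds
eps = 0.1                    # tolerated chance of failing the truth (assumed)
threshold = eps * 2.0 ** -n  # assumed choice, explained below

print(product_test([0.5] * n, [1] * n, threshold))  # product 2**-8, above threshold
print(product_test([0.1] * n, [1] * n, threshold))  # product 1e-8, below threshold
```

With this assumed threshold, a forecaster who reports nature's own probabilities fails with probability less than eps: there are at most 2^n paths of length n, and each path that triggers a Fail has true probability below eps * 2^-n.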
40
Tester
[Figure: tree with leaves labeled Pass]
Pass if ?
41
Passes the Truth
[Figure: tree with leaves labeled Pass]
Pass if ?
42
Fails the Forecaster
[Figure: the forecaster's probability tree (0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7) with leaves labeled Pass]
Pass if ?
43
Fails the Forecaster
[Figure: the forecaster's probability tree (0.6/0.4, 0.2/0.8, 0.5/0.5, 0.1/0.9, 0.6/0.4, 0.4/0.6, 0.3/0.7) with edges along one path also labeled with probabilities 1 and 0, and leaves labeled Pass]
Pass if ?
44
Running Time

The tester need only simulate the forecaster,
which takes only slightly more computation time.
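
A minimal Python sketch of that simulation, for a deterministic forecaster: at each round, run the forecaster on the history so far and follow the outcome it considers at most 50 percent likely. The names `low_weight_path` and `example_forecaster`, and the forecaster itself, are hypothetical and serve only to illustrate the idea of a low-weight path.

```python
def low_weight_path(forecaster, n):
    """Simulate a deterministic forecaster for n rounds, always choosing the
    outcome to which it assigns probability at most 0.5.

    Returns the path and the product of the forecaster's probabilities along it.
    """
    history, weight = [], 1.0
    for _ in range(n):
        p = forecaster(history)          # forecast probability that the next outcome is 1
        outcome = 1 if p <= 0.5 else 0   # the outcome the forecaster finds less likely
        weight *= p if outcome == 1 else 1.0 - p
        history.append(outcome)
    return history, weight

def example_forecaster(history):
    """Hypothetical forecaster: 0.6 at the start or after a 1, 0.2 after a 0."""
    return 0.6 if not history or history[-1] == 1 else 0.2

path, weight = low_weight_path(example_forecaster, 4)
print(path, weight)
```

Each round multiplies the weight by at most 0.5, so after n rounds the forecaster has assigned the path probability at most 2^-n. The cost is one call to the forecaster per round plus constant bookkeeping, which is the sense in which the tester needs only slightly more time than the forecaster.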

45
First Result

For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
Passes The Truth on every distribution.
For some distribution of nature, a fixed
deterministic forecaster running in time t(n)
will be caught with high probability.

46
First Result

For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
Passes The Truth on every distribution.
For some distribution of nature, a fixed
probabilistic forecaster running in time t(n)
will be caught with high probability.

47
Probabilistic Forecaster
[Figure: a mixture of forecasts: 0.6/0.4 with weight 25%, 0.3/0.7 with weight 40%, 0.2/0.8 with weight 35%]
48
Probabilistic Forecaster
[Figure: a mixture of forecasts: 0.6/0.4 with weight 25%, 0.3/0.7 with weight 40%, 0.2/0.8 with weight 35%]
Take the path more likely to have a forecast at
most 0.5
49
Tester
[Figure: tree with leaves labeled Pass]
Pass if ?
50
First Result

For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
Passes The Truth on every distribution.
For some distribution of nature, a fixed
probabilistic forecaster running in time t(n)
will be caught with high probability.

51
Enumerating Forecasters

There are countably many probabilistic
forecasters that run in time t(n).
F1, F2, F3, ...

52
Combine to Form Paths
[Figure: paths for forecasters F1, F2, F3, F4, F5 combined into one tree]
53
Tester
[Figure: the combined tree for F1, F2, F3, F4, F5 with leaves labeled P (Pass)]
Fail if product of probabilities is at most ?
54
First Result

For any reasonable time bound t(n), there is a
tester that runs in time t^2(n)
Passes The Truth on every distribution.
For some distribution of nature, every
probabilistic forecaster running in time t(n)
will be caught with high probability.
Can we create a tester where the forecaster needs
to use much more time than the tester?

55
Forecaster Must Factor

We give a linear-time tester
Passes The Truth on every distribution.
For every number m there is some distribution
such that if a probabilistic forecaster F fools
our tester then we can use the forecaster to
factor the number m.
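
The asymmetry behind this result is that checking a claimed factorization is cheap while finding one appears to be hard. The Python sketch below shows only the checking side; it is not the tester from the talk. The integers 702157, 64929 and 131699 appear in the tree figure later in the talk, while the factor lists and the function name `is_valid_factorization` were worked out for this sketch.

```python
from math import prod

def is_valid_factorization(m, factors):
    """Check a claimed factorization of m by multiplying it out.

    This checks only that every factor exceeds 1 and that the product is m;
    it does not check that the factors are prime.
    """
    return all(f > 1 for f in factors) and prod(factors) == m

print(is_valid_factorization(702157, [881, 797]))
print(is_valid_factorization(64929, [3, 23, 941]))
print(is_valid_factorization(131699, [127, 17, 61]))
print(is_valid_factorization(702157, [881, 799]))  # wrong factor, rejected
```

Verification is a handful of multiplications, whereas no efficient classical algorithm for producing the factors is known.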

56
Viewing Tree as Numbers
[Figure: tree nodes labeled with the integers 64929, 103079, 131699, 702157]
57
Viewing Tree as Numbers
[Figure: tree labeled with integers, including 64929, 103079, 131699 and 702157 together with smaller integers such as 941, 881, 797, 127, 61, 23, 17 and 3]
58
Factorization Paths
[Figure: the integer-labeled tree (64929, 103079, 131699, 702157, ...) with factorization paths marked]
59
Tester
[Figure: the integer-labeled tree (64929, 103079, 131699, 702157, ...) with leaves labeled P (Pass)]
Fail if product of blue edges is ?
60
Efficient
[Figure: the integer-labeled tree (64929, 103079, 131699, 702157, ...) with leaves labeled P (Pass)]
Fail if product of blue edges is ?
61
Passes the Truth
[Figure: the integer-labeled tree (64929, 103079, 131699, 702157, ...) with leaves labeled P (Pass)]
Fail if product of blue edges is ?
62
Forecaster Must Factor
[Figure: the integer-labeled tree (64929, 103079, 131699, 702157, ...) with leaves labeled P (Pass)]
Fail if product of blue edges is ?
63
Forecaster Must Factor
[Figure: the integer-labeled tree (64929, 103079, 131699, 702157, ...) with leaves labeled P (Pass)]
Fail if product of blue edges is ?
64
Forecaster Must Factor

We give a linear-time tester
Passes The Truth on every distribution.
For every number m there is some distribution
such that if a probabilistic forecaster F fools
our tester then we can use the forecaster to
factor the number m.

65
How Hard is Factoring?

Testing Primality is Easy
Solovay-Strassen, Agrawal-Kayal-Saxena
Factoring Seems Hard
Basis of Modern Cryptography
No complexity basis for hardness of factoring.
Can factor with (hypothetical) quantum computer
(Shor).

66
Search Problems

We would like to base the hardness on general
NP search problems.
Traveling Salesperson
Map Coloring
Boolean Formula Satisfiability
Problem: Proof needs unique witnesses or the test
might fail the truth.

67
Solution: Interactive Proofs

Embed Interactive Proof System into the tester.
Bonus: We get not only all NP search problems but
all of PSPACE.
Not only must the forecaster route salespeople and
color maps, but it must also play a perfect game of
chess.

68
Future Directions