Title: Miroslav K
1 - Miroslav Kárný
- Department of Adaptive Systems
- Institute of Information Theory and Automation
- Academy of Sciences of the Czech Republic
- school_at_utia.cas.cz, http://as.utia.cas.cz
2 Speaker's home institute
nickname for
Institute of Information Theory and Automation
Cybernetics = Communication & Control in
Machines & Animals
Cybernetics is the speaker's research domain and has led
to applications in
- Adaptive control of paper machines, rolling
mills, drum boilers
- Nuclear medicine: modeling & decision making (DM), dynamic image
studies
- Support of operators of complex systems (FET)
- Traffic control in cities, optimization of
financial strategies
- Multiple-participant DM and E-democracy
Bayesian DM: a single horse on a decades-lasting
trip with a good team
3 FET organizes a review process
to select the best proposals p among all
submitted proposals
- An expert e assigns marks ^e m_p ∈ {0, ..., M} to several
proposals within a small group ^e p of proposals
- A small group of experts ^p e, reviewing the
proposal p, harmonizes the final mark m_p via
discussion
- The assembly of all experts completely ranks all
proposals
The EC supports the top proposals up to a budget-implied
borderline
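The budget-implied borderline can be illustrated with a minimal sketch. It assumes that each proposal carries a requested cost and that funding stops at the first proposal that no longer fits; the function name `select_up_to_budget`, the proposal ids and the costs are hypothetical and are not taken from the talk.

```python
def select_up_to_budget(ranked, costs, budget):
    """Return the proposals funded before the budget-implied borderline.

    ranked : proposal ids, best first (the complete ranking)
    costs  : hypothetical dict {proposal id: requested funding}
    budget : total funding available
    """
    funded, spent = [], 0.0
    for p in ranked:
        if spent + costs[p] > budget:
            break  # borderline reached: everything ranked below is rejected
        funded.append(p)
        spent += costs[p]
    return funded


ranked = ["p3", "p1", "p4", "p2"]
costs = {"p1": 2.0, "p2": 1.0, "p3": 3.0, "p4": 4.0}
funded = select_up_to_budget(ranked, costs, budget=6.0)
print(funded)  # ['p3', 'p1']
```

Because the cut is strict, a small change in the ranking near the borderline changes who is funded, which is why the quality of the complete ranking matters.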
4 Addressed problem
The procedure is good & fair
up to the extremely disturbing step
- An expert e assigns marks ^e m_p ∈ {0, ..., M} to several
proposals within a small group ^e p of proposals
- A small group of experts ^p e, reviewing the
proposal p, harmonizes the final mark m_p via
discussion
- The assembly of all experts completely ranks all
proposals

- Each expert e has studied a tiny portion of all
proposals
- Experts' marks ^e m_p are subjectively scaled
- Discrete-valued marks cause many coincidences
- The time slot of the assembly is strongly limited
Consequences: errors, manipulation, expenses
5 Aims
of the research
- to test the belief that Bayesian DM is an (almost)
universal tool relying on proper modeling only
- to test a promising negotiation methodology
needed in other contexts, too
of the talk
- to help FET to be fair and cost-efficient
- to help proposing researchers to be treated
fairly
- to share fun (?) from the conclusions
6 Basic idea
Experts serve as rank-measuring devices
Project proposal p has its objective rank
r_p!
Ranking = estimation of the rank r_p from the marks ^e m_p,
which are noise-corrupted observations of the
objective rank
7 Guide
- Experts as measuring devices
- Prior knowledge
- MAP estimate
- Experimental results
- Discussion
8 Experts as measuring devices
^e m_p = r_p + ^e ε
^e m_p: mark of proposal p by the expert e
r_p: objective rank of proposal p
^e ε: personal error
Experts try to be fair => the mark ^e m_p is proportional to
r_p and ^e ε is independent of p
^e ε = ^e b + ^e ω
^e b: bias
^e ω: personal fluctuations with variance ^e v
- interpretation of marks
- top M = Nobel Prize
- top M = flawless
Simplicity & maximum entropy => ^e ω assumed to be
Gaussian
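The measuring-device model can be sketched as a small simulation: a mark is the objective rank plus the expert's bias plus Gaussian fluctuations with the expert's variance. The function name `simulate_marks`, the expert and proposal labels and all numbers are illustrative assumptions; the sketch neither rounds marks to the discrete scale nor clips them to the mark range.

```python
import random


def simulate_marks(ranks, biases, variances, assignments, seed=0):
    """Draw marks m[e, p] = r[p] + b[e] + w, with w ~ N(0, v[e]).

    ranks       : {proposal: objective rank r_p}
    biases      : {expert: bias b_e}
    variances   : {expert: fluctuation variance v_e}
    assignments : {expert: list of proposals that expert reviews}
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    marks = {}
    for e, proposals in assignments.items():
        sigma = variances[e] ** 0.5
        for p in proposals:
            marks[(e, p)] = ranks[p] + biases[e] + rng.gauss(0.0, sigma)
    return marks


ranks = {"p1": 12.0, "p2": 24.0}
biases = {"strict": -3.0, "lenient": 2.0}
assignments = {"strict": ["p1", "p2"], "lenient": ["p2"]}

# With zero variance the marks are just rank plus bias.
exact = simulate_marks(ranks, biases, {"strict": 0.0, "lenient": 0.0}, assignments)
print(exact[("strict", "p1")], exact[("lenient", "p2")])  # 9.0 26.0

# With positive variances the same experts give noisy marks.
noisy = simulate_marks(ranks, biases, {"strict": 4.0, "lenient": 1.0}, assignments)
```

The two experts disagree on p2 by five marks even without noise, purely because of their biases; this is the subjective scaling the estimate has to undo.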
9 Prior knowledge
Needed
^e m_p = r_p + ^e b + ^e ω = (r_p + C) + (^e b - C) +
^e ω, for any C
Available
rank ≥ 0 and ≤ largest mark => r_p ∈ [0, M]
bias ^e b ∈ [-M, M],
noise variance ^e v ∈ [0, M^2]
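The role of the prior ranges can be checked numerically. A common shift C moved from the ranks into the biases leaves every predicted mark unchanged, so the marks alone cannot fix the level of the ranks, while the ranges for rank and bias admit only a bounded set of shifts. The helper `admissible_shift` and the numbers are assumptions made for this sketch.

```python
def admissible_shift(ranks, biases, M):
    """Interval of shifts C keeping r_p + C in [0, M] and b_e - C in [-M, M]."""
    lo = max(-min(ranks), max(biases) - M)
    hi = min(M - max(ranks), min(biases) + M)
    return lo, hi


M = 30.0
ranks = [10.0, 20.0]
biases = [-2.0, 3.0]

# Any shift C leaves every predicted mark r_p + b_e unchanged ...
C = 4.0
for r in ranks:
    for b in biases:
        assert (r + C) + (b - C) == r + b

# ... but the prior ranges only admit a bounded set of shifts.
interval = admissible_shift(ranks, biases, M)
print(interval)  # (-10.0, 10.0)
```

A shift does not change the order of the proposals, but it does change which of them lie above a fixed acceptance threshold.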
10 MAP estimate
Posterior log-likelihood function
- smoothly dependent on the estimated r, b, v
- concave in the estimated r, b, v
- defined on a convex domain
- harmonised domain and data range
Evaluation
Conditions for the extremum are solved by successive
approximations:
fast, simple and reliable; can be used
on-line
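The slides do not give the update formulas, so the following is only a sketch of what successive approximations could look like. It assumes the additive Gaussian model with flat priors on the ranges r ∈ [0, M], b ∈ [-M, M] and v ∈ [v_min, M^2], and cycles through the stationarity conditions for rank, bias and variance, clipping each to its range. The name `map_ranks`, the floor `v_min` and the stopping rule are assumptions of this sketch.

```python
def map_ranks(marks, M, v_min=0.01, max_iter=500, tol=1e-9):
    """Successive approximations for r_p, b_e, v_e under box constraints.

    marks : {(expert, proposal): mark}
    v_min > 0 is a hypothetical floor that guards against division by zero.
    """
    def clip(x, lo, hi):
        return max(lo, min(hi, x))

    by_p, by_e = {}, {}
    for (e, p), m in marks.items():
        by_p.setdefault(p, []).append((e, m))
        by_e.setdefault(e, []).append((p, m))

    r = {p: 0.0 for p in by_p}
    b = {e: 0.0 for e in by_e}
    v = {e: 1.0 for e in by_e}

    for _ in range(max_iter):
        # Rank: precision-weighted mean of the bias-corrected marks.
        r_new = {}
        for p, obs in by_p.items():
            num = sum((m - b[e]) / v[e] for e, m in obs)
            den = sum(1.0 / v[e] for e, _ in obs)
            r_new[p] = clip(num / den, 0.0, M)
        # Bias and variance: mean and mean square of each expert's residuals.
        for e, obs in by_e.items():
            b[e] = clip(sum(m - r_new[p] for p, m in obs) / len(obs), -M, M)
            mean_sq = sum((m - r_new[p] - b[e]) ** 2 for p, m in obs) / len(obs)
            v[e] = clip(mean_sq, v_min, M ** 2)
        change = max(abs(r_new[p] - r[p]) for p in r)
        r = r_new
        if change < tol:
            break
    return r, b, v


true_r = {"p1": 10.0, "p2": 20.0, "p3": 25.0}
true_b = {"e1": -3.0, "e2": 0.0, "e3": 3.0}
# Noise-free marks, every expert marks every proposal (a toy design).
marks = {(e, p): true_r[p] + true_b[e] for e in true_b for p in true_r}
r, b, v = map_ranks(marks, M=30.0)
ranking = sorted(r, key=r.get, reverse=True)
print(ranking)  # ['p3', 'p2', 'p1']
```

In this toy design the true biases average to zero, so the estimated ranks coincide with the true ones; in general they are recovered only up to a common shift that the prior ranges restrict.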
11 Experiments - proposals viewpoint
Processed marks m ∈ {0, 0.5, ..., 30}; Assembly (A)
ranking available
Extreme cases                            Case 1   Case 2
Proposals                                    32     1341
Experts                                      33      588
Acceptance threshold T                       22       25
Proposals above T by A                       11      157
Proposals above T by us                      16       72
Proposals chosen by A and us                 11       57
Common acceptance / A-one [%]               100       36
- typical numbers
- prior does not spoil results with a few data
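The last ratio reported for the extreme cases is the number of proposals accepted both by the assembly and by the estimate, divided by the number accepted by the assembly alone. A minimal sketch of that arithmetic follows; the proposal ids are hypothetical and only the set sizes (157, 72 and 57) mirror the larger case.

```python
def common_acceptance_percent(accepted_by_assembly, accepted_by_model):
    """Share of the assembly's accepted proposals also accepted by the model."""
    common = accepted_by_assembly & accepted_by_model
    return 100.0 * len(common) / len(accepted_by_assembly)


# Hypothetical proposal ids; only the set sizes mirror the larger case
# (157 accepted by the assembly, 72 by the estimate, 57 by both).
assembly = set(range(157))
model = set(range(100, 172))
n_common = len(assembly & model)
percent = round(common_acceptance_percent(assembly, model))
print(n_common, percent)  # 57 36
```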
12 Histogram of rank estimates
box width about 2 % of the mark range!
(r > T = 25): 57
(r > T = 22): 11
13 Experiments - experts viewpoint
mean (bias) / T [%]                  6     4
minimum (bias) / T [%]             -13   -45
maximum (bias) / T [%]              15    13
mean (std. dev.) / T [%]            13    12
minimum (std. dev.) / T [%]         10     7
maximum (std. dev.) / T [%]         21    38
Box width containing a significant number of
proposals ≈ 3 % of T!
14 Individual results: small file
15 Individual top results: extensive file
16 Discussion
Evaluation aspects
- it works
- it exhibits fast and reliable convergence
- it is reasonably robust to variations of prior
statistics
Operational aspects
- it can substitute or at least support assembly
ranking
- it allows continuous-valued marking
- it avoids the need to harmonize marks within ^p e
- it makes ranking less sensitive to experts'
biases & variations
- it suppresses lottery-type results for
gray-zone-ranked proposals (those with the rank around
the threshold)
- it makes evaluation more objective
17 Discussion
Quality assurance aspects
- it checks reliability of experts, using their
biases & variances:
70-80 % of experts are o.k., but the unreliable or cheating
rest still forms a significant portion
- it allows tracking of bad experts
- it opens a way to relate prior & posterior
ranking, i.e., the achieved results of
supported projects
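A reliability check based on the estimated biases and variances might look like the sketch below. It flags experts whose bias or standard deviation, taken relative to the threshold T, exceeds a tolerance. The tolerances `max_bias` and `max_std`, the function name `flag_experts` and the numbers are hypothetical and are not values from the talk.

```python
def flag_experts(biases, variances, T, max_bias=0.3, max_std=0.3):
    """Return experts whose |bias| / T or std. dev. / T exceeds a limit.

    max_bias and max_std are hypothetical tolerances, not values from the talk.
    """
    flagged = []
    for e in sorted(biases):
        rel_bias = abs(biases[e]) / T
        rel_std = variances[e] ** 0.5 / T
        if rel_bias > max_bias or rel_std > max_std:
            flagged.append(e)
    return flagged


biases = {"e1": 1.0, "e2": -11.0, "e3": 2.0}
variances = {"e1": 9.0, "e2": 4.0, "e3": 100.0}
flagged = flag_experts(biases, variances, T=25.0)
print(flagged)  # ['e2', 'e3']
```

Here e2 is flagged for a large bias and e3 for large fluctuations. A flag marks an expert for closer inspection; on its own it does not separate careless marking from deliberate manipulation.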
Methodological aspects
- it can be tailored to other problems
- it can serve as a tool supporting negotiation
18 Future
- alternative models of experts, e.g.,
non-normal, Markov-chain type
- comparison of prior and posterior ranking
- application to other negotiation-type processes
- application to individual marks & thresholds
- quality assurance of the evaluation including
experts' competence!