Title: Talking about Science
1Talking about Science
- A lecture in the 6th Century course
- Mankind in the Universe
- by Kees van Deemter, Computing Science dept.,
University of Aberdeen
2Objectivity
- a major theme in Mankind in the Universe
- Can people know the universe? (e.g., the Big
Bang, man-made global warming)
- Can people know objectively what's right? (e.g.,
stem-cell research)
- Philosophical positions include
- Realism
- Anti-realism
- Constructivism
- This lecture: the expression of scientific data
and theories in language
3Plan of the lecture
- Publishing scientific results
- Using computers: from data to text
- (Science in daily life and politics)
41. Publishing Scientific Results
- Peer review: the main mechanism for deciding
whether a result is worth publishing (e.g., as a
journal article)
- (1) Authors submit article
- (2) Editors select expert reviewers (peers)
- (3) Reviewers assess article
- (4) Editors decide: accept/reject/revise. If
revise, then authors may go back to (1)
- Submissions as conference papers lack the revise
option
5Peer review is no guarantee against flaws
- 1. Human frailty
- Maybe the experts lack expertise
- Peers may disagree with each other
- (Maybe they don't like the authors)
- (A dishonest peer may reject, then steal
results)
- Possible solutions
- Anonymity of reviewer and/or reviewee
- Declaring conflicts of interest
- No silver bullet. Much depends on the editor.
6Peer review is no guarantee against flaws
- 2. Publication bias
- Reviewers and editors are keen on interesting
results.
- Interesting results are read eagerly, are often
quoted, and sell journals
7So how about disappointing results?
- Research hypothesis: activity x makes you more
likely to get cancer
- 1000 patients tested. 500 do x, 500 don't do x.
- x: 50 get cancer. Not x: 53 get cancer
- Your hypothesis is not confirmed (the trend even
goes in the opposite direction)
- Your journal submission may be rejected, because
it's not interesting enough.
8- Your negative findings may never get published
- Yet they tell us something of potential value
- Maybe x is unrelated to cancer
- Maybe x makes you less likely to get cancer
- Note: your experiment does not show convincingly
that x makes you less likely to get cancer
(50/53 is too small a difference)
- Statisticians say: the result is not significant
- But others may have found similar negative
results
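The claim that 50/53 is "not significant" can be checked with a standard two-proportion z-test. The sketch below is one common way to do this, not necessarily the test the lecturer had in mind; the function name is made up for illustration, and the normal approximation is an assumption that is reasonable for groups of 500.

```python
import math

def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Return (z, two_sided_p) for the difference between two proportions."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    # Under the null hypothesis both groups share one underlying rate.
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# The imaginary study: 50 of 500 who do x, 53 of 500 who don't.
z, p = two_proportion_z_test(50, 500, 53, 500)
print(f"z = {z:.2f}, p = {p:.2f}")
```

The p-value comes out far above the conventional 0.05 cut-off, so a difference this small is easily explained by chance.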
9Meta-analysis
- A statistical analysis that tries to draw
conclusions from a set of experiments. (Meta = about)
- Championed, among others, by the Cochrane
Collaboration
- Instructive logo
10The Cochrane logo explained
- A landmark 1989 analysis of the use of steroids
on prematurely born babies
- 2 studies had found a positive effect
(significant)
- 5 studies had found no significant effect
- Doctors did not believe the effect until a
meta-analysis of all 7 studies together showed a
positive effect
- Back to our imaginary study of cancer: a
meta-analysis might have shown that x makes you
less likely to get cancer
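How several inconclusive studies can add up to a clear result is easiest to see in numbers. The sketch below pools risk differences with inverse-variance weights (a simple fixed-effect meta-analysis); this is one textbook method, not necessarily the one used in the 1989 analysis. The five studies are invented for illustration and are not real data.

```python
import math

def risk_difference(events_x, n_x, events_not_x, n_not_x):
    """Return (difference, variance) of the two cancer rates in one study."""
    p_x, p_not_x = events_x / n_x, events_not_x / n_not_x
    variance = p_x * (1 - p_x) / n_x + p_not_x * (1 - p_not_x) / n_not_x
    return p_x - p_not_x, variance

def two_sided_p(estimate, variance):
    z = estimate / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2))

def fixed_effect_pool(studies):
    """Inverse-variance weighted average of the per-study differences."""
    total_weight, weighted_sum = 0.0, 0.0
    for study in studies:
        diff, var = risk_difference(*study)
        total_weight += 1 / var
        weighted_sum += diff / var
    return weighted_sum / total_weight, 1 / total_weight

# Hypothetical studies: (cancer with x, n with x, cancer without x, n without x).
studies = [
    (40, 500, 53, 500),
    (44, 500, 52, 500),
    (38, 450, 50, 450),
    (45, 520, 58, 520),
    (41, 480, 51, 480),
]

individual_p = [two_sided_p(*risk_difference(*s)) for s in studies]
pooled_diff, pooled_var = fixed_effect_pool(studies)
pooled_p = two_sided_p(pooled_diff, pooled_var)

print("per-study p-values:", [round(p, 2) for p in individual_p])
print("pooled difference:", round(pooled_diff, 4), "pooled p:", round(pooled_p, 4))
```

With these invented numbers no single study reaches p < 0.05, but the pooled estimate does, because pooling shrinks the uncertainty.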
11But
- those negative results will not be counted in the
meta-analysis, because they were never published
- Omission of disappointing results could even
result in the erroneous conclusion that x makes
you more likely to get cancer
- Goldacre (2009). Bad Science. Harper Perennial.
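A small simulation can make this danger concrete. In the sketch below, x truly has no effect on cancer, yet averaging only the "published" studies suggests that it does. The publication rule (only studies where x looks significantly harmful get published), the cancer rate, and the study sizes are all assumptions chosen for illustration.

```python
import math
import random

def simulate_study(rng, n_per_arm, cancer_rate):
    """One study in which x has no effect: both arms share the same rate."""
    with_x = sum(rng.random() < cancer_rate for _ in range(n_per_arm))
    without_x = sum(rng.random() < cancer_rate for _ in range(n_per_arm))
    return with_x, without_x

def z_score(with_x, without_x, n_per_arm):
    pooled = (with_x + without_x) / (2 * n_per_arm)
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_arm)
    if se == 0:
        return 0.0
    return (with_x - without_x) / n_per_arm / se

rng = random.Random(0)  # fixed seed so the run is repeatable
n_per_arm = 200
all_diffs, published_diffs = [], []
for _ in range(1000):
    with_x, without_x = simulate_study(rng, n_per_arm, cancer_rate=0.10)
    diff = (with_x - without_x) / n_per_arm
    all_diffs.append(diff)
    # Assumed publication rule: only "interesting" results get into print.
    if z_score(with_x, without_x, n_per_arm) > 1.96:
        published_diffs.append(diff)

mean_all = sum(all_diffs) / len(all_diffs)
mean_published = (sum(published_diffs) / len(published_diffs)
                  if published_diffs else float("nan"))

print(f"studies run: {len(all_diffs)}, published: {len(published_diffs)}")
print(f"mean difference, all studies:       {mean_all:+.4f}")
print(f"mean difference, published studies: {mean_published:+.4f}")
```

Across all simulated studies the average difference stays close to zero, while the published subset shows a positive difference by construction. A meta-analysis that sees only the published studies inherits that bias.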
123. Cheating
- Dishonesty about authorship: plagiarism
- Dishonesty about data and statistics
13Plagiarism
- taking someone else's work and passing it off
as one's own
- There is a grey area. I got my definition from
the Mac's Dictionary application. Do I have to
acknowledge this?
- If you take someone else's ideas then (try to)
say who had them first
- If you also take someone else's words verbatim
(for more than just a few words) then put
quotes around the text as well
- A grey area: "just a few words"
14Plagiarism and peer review
- Peer review contains important safeguards against
plagiarism.
- One of your reviewers may have read that earlier
article
- But peer review is no guarantee.
- What if the article was published in Japanese?
- Still, offenders get caught. Moreover, if the
dishonesty only concerned the authorship, the
implications for science are limited
- A victimless crime?
15Improper use of data
- In science (as opposed to teaching),
this is a bigger problem than plagiarism
- Conscious cheating
- Unconscious cheating
16Conscious cheating (?)
- Some notorious cases, where it appears that data
were intentionally faked or distorted
- Andrew Wakefield's work linking the MMR vaccine
to autism
- Parts of the University of East Anglia's work on
global warming
- Hwang Woo-suk's work on stem-cell research and
human cloning
17BBC News, 15 Dec 2005
- (...) Stem cell success 'faked'
- A South Korean cloning pioneer has admitted
fabricating results in key stem cell research, a
colleague claims. At least nine of 11 stem cell
colonies used in a landmark research paper by Dr
Hwang Woo-suk were faked, said Roh Sung-il, who
collaborated on the paper. Dr Hwang wants the US
journal Science to withdraw his paper on stem
cell cloning, Mr Roh said. Dr Hwang, who is
reported to be receiving hospital treatment for
stress, was not available for comment. Science
could not confirm whether it had received a
request to retract the paper. Dr Hwang's paper
had been hailed as a breakthrough, opening the
possibility of cures for degenerative diseases.
(...)
18Unconscious cheating: observer bias
- One experiment: some patients got a medicine
against multiple sclerosis, others got a placebo
- 50% of trained observers (A) knew who got the
placebo
- 50% of trained observers (B) did not know
- Observers (A) observed an improvement in the
condition of patients who were given the medicine
- Observers (B) did not observe an improvement
- Noseworthy et al. The impact of blinding on the
results of a randomized, placebo-controlled
multiple sclerosis clinical trial. Neurology
2001;57:S31-S35.
19Unconscious cheating
- Rosenthal effect. Participants were given
photographs of people, and asked to say whether
these people were successful in life.
- Some experimenters (A) were told that
participants judge most photographs as successful
- Other experimenters (B) were told that
participants judge most photographs as
unsuccessful
- Participants supervised by A judged photographs
much more positively than those supervised by B
- Supervisors could only read out a set speech!
20Unconscious cheating
- Rosenthal effect (conclusion): by believing in a
given behaviour, you can make this behaviour come
about
- Rosenthal R. Interpersonal expectations: effects
of the experimenter's hypothesis. In: Rosenthal &
Rosnow (eds.) Artifact in Behavioral Research.
New York, NY: Academic Press; 1969:181-277.
- The Rosenthal effect concerns experiments with
people; observations in physics can be hazardous
as well (e.g., when do you stop running an
experiment?)
- Observer bias and the Rosenthal effect are
reasons for making studies with human subjects
double-blind
21- Cheating is not something done by a few
criminals, but something we all need to
constantly be on guard against
- in science
- in daily life
- The science behind these phenomena is interesting
in itself
- Ben Goldacre 2008 (again!)
22Dubious uses of statistics
- There are lies, damn lies, and statistics
(author unknown)
- This is not an indictment of numbers or statistics
- Statistics is safe when performed competently,
but errors are easy to make
- These can be conscious or unconscious
23One common abuse of statistics
- Failing to declare your research hypothesis in
advance
- Recall the disappointing cancer study
- Your research hypothesis: x makes cancer more
likely
- You found weak indications for the opposite: x
makes cancer less likely (50/53, not
significant)
- Suppose you had found strong indications for this
(e.g., 40/63, significant)
- Reporting this as a confirmed hypothesis would be
wrong!
- Stats is for testing a pre-existing suspicion
- Anything else is data fishing
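The 40/63 example can be worked through to show why the order matters. The sketch below assumes a one-sided z-test of the hypothesis declared in advance ("x makes cancer more likely") and compares it with the p-value you would get by switching to the opposite hypothesis after seeing the data. The final line illustrates data fishing more generally; the figure of 20 hypotheses is an arbitrary choice, and the calculation assumes the tests are independent.

```python
import math

def z_for_difference(events_x, n_x, events_not_x, n_not_x):
    pooled = (events_x + events_not_x) / (n_x + n_not_x)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_x + 1 / n_not_x))
    return (events_x / n_x - events_not_x / n_not_x) / se

def normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2))

# 40 of 500 who do x get cancer, 63 of 500 who don't.
z = z_for_difference(40, 500, 63, 500)

# Hypothesis declared in advance: x makes cancer MORE likely.
p_declared = 1 - normal_cdf(z)
# Hypothesis picked after looking at the data: x makes cancer LESS likely.
p_after_the_fact = normal_cdf(z)

# Chance of at least one "significant" result by luck alone when
# 20 independent hypotheses are each tested at the 0.05 level.
chance_of_false_alarm = 1 - 0.95 ** 20

print(f"z = {z:.2f}")
print(f"p for the declared hypothesis:    {p_declared:.3f}")
print(f"p for the after-the-fact version: {p_after_the_fact:.3f}")
print(f"false alarm chance with 20 tests: {chance_of_false_alarm:.2f}")
```

The declared hypothesis gets no support at all, while the hypothesis chosen afterwards looks significant. The more hypotheses you allow yourself to try after the fact, the more likely it is that one of them looks significant purely by chance.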
24On to our next topic
252. Computers as authors: from data to text
- Measurement can give rise to a huge amount of
numerical information, e.g.,
- Monitoring patients in intensive care
- Climate predictions: 2 petabytes (2 x 10^15 bytes)
- People are bad at making sense of this, so we
use Natural Language Generation to let computers
produce readable text
- At Aberdeen: Reiter, Turner, Sripada, Davy.
- Example: Turner's pollen level forecasts demo
- http://www.csd.abdn.ac.uk/rturner/cgi_bin/pollen.html
26Neonatal ICU (Babytalk project)
27Baby Monitoring
28Input Sensor Data (45 mins)
29Some medical jargon
- Bradycardia: when the heart rate is too slow
- Intubation: placing a tube in the windpipe (e.g.,
for oxygen or drugs)
- FiO2: fraction of inspired oxygen, a measure of
the oxygen supplied
- Sats: oxygen saturation levels
- ETT suction: sucking away contaminated
secretions (which might cause pneumonia)
- BP: blood pressure
- HR: heart rate
30Written by nurse
- In preparation for re-intubation, a bolus of 50ug
of morphine is given at 10:39 when the FiO2 = 35%.
There is a momentary bradycardia and then the
mean BP increases to 40. The sats go down to 79
and take 2 mins to come back up. The toe/core
temperature gap increases to 1.6 degrees.
- At 10:46 the baby is turned for re-intubation and
re-intubation is complete by 11:00, the baby being
bagged with 60% oxygen between tubes. During the
re-intubation there have been some significant
bradycardias down to 60/min, but the sats have
remained OK. The mean BP has varied between 23
and 56, but has now settled at 30. The central
temperature has fallen to 36.1°C and the
peripheral temperature to 33.7°C. The baby has
needed up to 80% oxygen to keep the sats up.
- Over the next 10 mins the HR decreases to 140 and
the mean BP = 30-40. The sats fall with ETT
suction so the FiO2 is increased to 80% but by
11:12 the FiO2 is down to 49%.
31Generated by Babytalk system
- You saw the baby between 10:30 and 11:12. Heart
Rate (HR) = 148. Core Temperature (T1) = 37.5.
Peripheral Temperature (T2) = 36.3. Mean Blood
Pressure (mean BP) = 28. Oxygen Saturation (SaO2)
= 96.
- The tcm sensor was re-sited.
- By 10:40 SaO2 had decreased to 87. As a result,
Fraction of Inspired Oxygen (FIO2) was set to
36. SaO2 increased to 93. There had been a
bradycardia down to 90. Previously 50.0 mics/min
of morphine had been administered. Over the next
17 minutes mean BP gradually increased to 37.
- By 11:00 the baby had been hand-bagged a number
of times causing 2 successive bradycardias. She
was successfully re-intubated after 2 attempts.
The baby was sucked out twice.
- At 11:02 FIO2 was raised to 79.
- By 11:06 the baby had been sucked out a number of
times. Previously T2 had increased to 34.3. Over
the next 17 minutes HR decreased to 140.
- FIO2 was lowered to 61.
32How the computer generates the text (four
stages, just a sketch)
- A kind of data mining: using computers to
analyse and summarise data
- Signal analysis
- Data abstraction
- Content Determination
- Saying it in English (alternative:
graphs/diagrams)
331. Signal Analysis
- Essentially a collection of mathematical tools
- Detect trends, patterns, events, etc in the data
- (Blood oxygen levels) increasing
- Downward spike (in heart rate)
- Etc.
- Separate real data from artefacts
- Sensors can malfunction
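To give a flavour of the "mathematical tools" involved, here is a minimal sketch of three of them: discarding implausible readings, fitting a trend line, and flagging downward spikes. It is not the Babytalk implementation; the function names, the plausibility range, the spike threshold, and the sample readings are all assumptions made for illustration.

```python
from statistics import median

def remove_artefacts(samples, low, high):
    """Replace physically implausible readings with None."""
    return [s if low <= s <= high else None for s in samples]

def trend(samples):
    """Least-squares slope (units per sample) over the valid readings."""
    points = [(i, s) for i, s in enumerate(samples) if s is not None]
    n = len(points)
    mean_x = sum(i for i, _ in points) / n
    mean_y = sum(s for _, s in points) / n
    num = sum((i - mean_x) * (s - mean_y) for i, s in points)
    den = sum((i - mean_x) ** 2 for i, _ in points)
    return num / den

def downward_spikes(samples, drop):
    """Indices where a reading lies at least `drop` below the series median."""
    valid = [s for s in samples if s is not None]
    baseline = median(valid)
    return [i for i, s in enumerate(samples)
            if s is not None and baseline - s >= drop]

# Hypothetical heart-rate readings (beats/min); the 0 mimics a sensor dropout.
heart_rate = [150, 149, 151, 150, 95, 148, 0, 150, 152, 151]
cleaned = remove_artefacts(heart_rate, low=30, high=250)
spikes = downward_spikes(cleaned, drop=40)

# Hypothetical blood oxygen readings that rise steadily.
oxygen_sats = [88, 89, 91, 92, 94]
slope = trend(oxygen_sats)

print("spike positions:", spikes)
print("oxygen trend per sample:", slope)
```

Note how the dropout reading is removed before the spike detector runs. Without that step the malfunctioning sensor would be reported as a dramatic fall in heart rate.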
342. Data Abstraction (1)
- Detect higher-level events in the data
- Bradycardia
- Sensor flapping against skin (inferred from
shapes in data)
- Not just maths: medical knowledge required
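A higher-level event such as a bradycardia can be sketched as "heart rate below some threshold for some minimum time". The code below is an illustrative assumption, not the Babytalk rule: the threshold of 100 beats/min and the minimum duration are made-up values, and choosing them is exactly where the medical knowledge comes in.

```python
def detect_bradycardias(heart_rate, threshold, min_samples):
    """Return (start, end, lowest) for each run of readings below threshold.

    Runs shorter than min_samples are ignored. Both parameters encode a
    medical judgement; the values used below are illustrative assumptions.
    """
    events, start = [], None
    # The extra reading at the end closes a run that lasts to the final sample.
    for i, value in enumerate(heart_rate + [threshold]):
        if value < threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_samples:
                events.append((start, i - 1, min(heart_rate[start:i])))
            start = None
    return events

# Hypothetical heart-rate readings (beats/min), one per time step.
heart_rate = [148, 150, 92, 88, 90, 147, 149, 95, 150, 146]

strict = detect_bradycardias(heart_rate, threshold=100, min_samples=2)
lenient = detect_bradycardias(heart_rate, threshold=100, min_samples=1)

print("at least 2 samples long:", strict)
print("at least 1 sample long: ", lenient)
```

Changing the minimum duration changes how many bradycardias are found in the same data.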
352. Data Abstraction (2)
- Determine relative importance of events
- Link related events
- Blood O2 falls, therefore O2 level in incubator
is increased (reason for the action)
- HR up because baby is being handled (cause)
- Potentially a strong point of text summaries
- Graphs/diagrams seldom show such links
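Linking can be sketched as a set of domain rules plus a time window: if an event of one kind is followed soon enough by an event of another kind, the two are connected. The rules, event names, and the five-minute window below are hypothetical, chosen to mirror the two examples on the slide.

```python
from dataclasses import dataclass

@dataclass
class Event:
    time: int   # minutes since the start of the period
    kind: str   # e.g. "SaO2 falls" or "FiO2 increased"

# Hypothetical domain rules: (earlier event, later event, relation).
LINK_RULES = [
    ("SaO2 falls", "FiO2 increased", "reason"),
    ("baby handled", "HR rises", "cause"),
]

def link_events(events, max_gap):
    """Pair events that match a rule and occur within max_gap minutes."""
    links = []
    for first in events:
        for second in events:
            for earlier, later, relation in LINK_RULES:
                if (first.kind == earlier and second.kind == later
                        and 0 <= second.time - first.time <= max_gap):
                    links.append((first, second, relation))
    return links

events = [
    Event(10, "SaO2 falls"), Event(12, "FiO2 increased"),
    Event(30, "baby handled"), Event(31, "HR rises"),
    Event(55, "FiO2 increased"),
]
links = link_events(events, max_gap=5)

for first, second, relation in links:
    print(f"{first.kind} ({first.time}) -> {second.kind} ({second.time}): {relation}")
```

The FiO2 increase at minute 55 stays unlinked because no fall in SaO2 precedes it closely enough. Whether that is the right call depends on the chosen window.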
363. Content Determination
- Determine what's important enough to talk about.
- This depends on the purpose and context of the
text
- How much space/time is available?
- Saying A may force you to say B as well
- This uses the importance ratings (from Data
Abstraction (2))
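One simple way to combine these constraints is a greedy selection: take events in order of importance, bring along anything they force you to mention, and stop adding when the space runs out. This is a sketch of that idea only, not the method Babytalk uses; the importance ratings, word costs, and the dependency are invented.

```python
def select_content(events, budget, requires):
    """Greedy selection: most important events first, within a word budget.

    events:   {name: (importance, cost_in_words)}
    requires: {name: [names that must be mentioned alongside it]}
    """
    chosen, used = [], 0
    for name in sorted(events, key=lambda n: events[n][0], reverse=True):
        if name in chosen:
            continue
        # Saying A may force you to say B as well.
        bundle = [name] + [r for r in requires.get(name, []) if r not in chosen]
        cost = sum(events[n][1] for n in bundle)
        if used + cost <= budget:
            chosen.extend(bundle)
            used += cost
    return chosen

# Hypothetical importance ratings and costs in words.
events = {
    "re-intubation": (9, 12),
    "bradycardia": (8, 8),
    "morphine given": (5, 9),
    "sensor re-sited": (1, 6),
}
requires = {"bradycardia": ["morphine given"]}

long_summary = select_content(events, budget=30, requires=requires)
short_summary = select_content(events, budget=20, requires=requires)

print("30 words:", long_summary)
print("20 words:", short_summary)
```

With the smaller budget the bradycardia is dropped because it cannot be mentioned without the morphine, and a far less important event takes the leftover space. A greedy rule can therefore produce odd summaries, which shows that this stage involves judgment too.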
374. Saying it in English
- Lots of different issues. For instance,
- How to organise the text as a whole? (e.g.,
Chronologically? Organised in paragraphs?)
- What sentence patterns to use? (e.g., Active
voice? One fact per sentence?)
- e.g., "have varied between 23 and 56"
- How to refer? (e.g., refer to a time by saying
"at 11:05", or "after intubation"?)
- What words to use? (e.g., avoiding medical
jargon?)
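The simplest possible answer to these questions is a handful of fixed sentence templates, applied in chronological order. The sketch below does just that. The event format and templates are assumptions for illustration, and real systems make these choices in far more flexible ways.

```python
def realise(event):
    """Turn one abstract event into an English sentence via fixed templates."""
    time = f"{event['time'] // 60:02d}:{event['time'] % 60:02d}"
    if event["type"] == "trend":
        verb = "increased" if event["end"] > event["start"] else "decreased"
        return f"By {time} {event['channel']} had {verb} to {event['end']}."
    if event["type"] == "action":
        return f"At {time} {event['channel']} was set to {event['end']}."
    raise ValueError(f"no template for {event['type']!r}")

def realise_chronologically(events):
    """One fact per sentence, ordered by time."""
    return " ".join(realise(e) for e in sorted(events, key=lambda e: e["time"]))

# Hypothetical events; "time" is minutes after midnight.
events = [
    {"type": "action", "time": 662, "channel": "FIO2", "end": 79},
    {"type": "trend", "time": 640, "channel": "SaO2", "start": 96, "end": 87},
]
text = realise_chronologically(events)
print(text)
```

Every choice here (clock times rather than "after intubation", one fact per sentence, the technical channel names) is one of the decisions listed above, fixed in advance by whoever wrote the templates.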
38Objectivity issues
- In signal analysis: what's an event?
- Imagine three short downward spikes in HR
- Three events or one?
- In data abstraction
- Concepts like bradycardia are theory-laden
- 20 years from now, a different definition?
- Causality is problematic
- Was HR increase caused by handling?
- Many thresholds are a bit arbitrary
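The "three events or one?" question can be made concrete. In the sketch below, spikes that follow each other within a chosen gap are merged into one event. The spike times and gap values are hypothetical, and nothing in the data says which gap is the right one.

```python
def group_spikes(spike_times, max_gap):
    """Group spike times (in seconds) into events.

    A spike joins the current event if it follows the previous spike
    by no more than max_gap seconds; otherwise it starts a new event.
    """
    events = []
    for t in sorted(spike_times):
        if events and t - events[-1][-1] <= max_gap:
            events[-1].append(t)
        else:
            events.append([t])
    return events

# Hypothetical: three short downward spikes in HR.
spikes = [100, 130, 170]

for gap in (20, 30, 60):
    print(f"max_gap = {gap:2d}s ->", group_spikes(spikes, gap))
```

The same three spikes come out as three events, two events, or one, depending only on a threshold that somebody had to pick.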
39Objectivity issues
- In Content Determination
- Suppose 37.5°C counts as a fever. Suppose this
lasts for only 10 minutes
- Is this worth saying? (Can it be relevant for
clinical decisions?)
40- How long does your temperature need to be above
threshold to call it a fever?
- How long before we call something a bradycardia?
- What makes a momentary bradycardia, or
significant bradycardias?
- How long must a fever last before it is worth
reporting?
41Using vague words
- What does it take for SATs to be OK?
- As SATs decrease, medical complications become
more likely
- This is not a Yes/No thing, but something gradual
- Application of vague words can be a matter of
judgment
- Should a patient's age be taken into account?
- His/her medical condition? The nurse's
expectations?
- Computers struggle to use vague words
(significant, momentary, OK) appropriately
- Often avoided altogether (see earlier example)
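One way to picture "something gradual" is a degree between 0 and 1 instead of a yes/no answer. The sketch below is an illustration in the spirit of fuzzy logic, not a clinical rule: the anchor values of 85 and 95 and the cutoff of 0.5 are made up.

```python
def degree_ok(sats, clearly_low=85.0, clearly_ok=95.0):
    """Degree (0..1) to which an oxygen saturation reading counts as "OK".

    The two anchor values are illustrative assumptions, not clinical guidance.
    """
    if sats <= clearly_low:
        return 0.0
    if sats >= clearly_ok:
        return 1.0
    return (sats - clearly_low) / (clearly_ok - clearly_low)

def describe(sats, cutoff=0.5):
    """A crisp word forced onto the gradual scale by an arbitrary cutoff."""
    return "OK" if degree_ok(sats) >= cutoff else "low"

for reading in (79, 89.9, 90, 96):
    print(reading, round(degree_ok(reading), 2), describe(reading))
```

As soon as the computer has to choose a word, the gradual scale is cut in two, and readings of 89.9 and 90 end up with different descriptions.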
42Using vague or crisp words
- Science often replaces vague concepts by crisp
ones, e.g. obese = BMI > 30
- Such definitions make a value judgment about
what's good or bad for one's health (e.g.
motivated by statistical data about life
expectancy)
- Hence, they are theory-laden
- These value judgments may not always match a
doctor's assessment
- There is more to morbid obesity than BMI
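The crisp definition is easy to state in code, which also makes its sharp edge visible. BMI is weight in kilograms divided by the square of height in metres. The two example people below are invented.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight divided by height squared."""
    return weight_kg / height_m ** 2

def is_obese(weight_kg, height_m, threshold=30.0):
    """Crisp definition from the slide: obese means BMI above 30."""
    return bmi(weight_kg, height_m) > threshold

# Two hypothetical people, both 1.80 m tall, one kilogram apart.
for weight in (97, 98):
    print(weight, "kg:", round(bmi(weight, 1.80), 1), is_obese(weight, 1.80))
```

One kilogram moves a person across the line, although nothing about their health changes abruptly at that point.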
43- Not just in medical affairs!
- Consider weather forecasting
44Two weather forecasters (Is the cup half full or
half empty?)
- 1. Sunny spells and mainly dry. Temperatures up
to 15°C this afternoon and when the sun is out it
will feel pleasant enough in spite of a moderate
northerly breeze.
- 2. Cloudy at times with a slight chance of
rain. Temperatures only reaching 15°C this
afternoon and with any rain around and a moderate
northerly breeze it will feel cooler.
45Reading material on Computers as Authors
- Reiter (2007). An Architecture for Data-to-Text
Systems. In Proceedings of ENLG-07. (Conference
paper on the NLG challenges involved in
mapping data to text)
- van Deemter (2010). Not Exactly: In Praise of
Vagueness. Oxford University Press. (Informal
book on the expression of quantitative
information; chapters 3 and 11)
47In summary
- Complete objectivity may not always be achievable
- But we can keep trying!
483. Science in daily life and politics
- Too large a topic to squeeze into the remaining
time
- An entire 6th Century course on this topic:
Science and the Media
- http://www.abdn.ac.uk/thedifference/media-science.php
49-
- Instead, let's have a brief wrap-up of the
course