Title: CS 178H Introduction to Computer Science Research
1. CS 178H: Introduction to Computer Science Research
2. What is CS Research?
- Discovery of new knowledge of computing through
mathematical analysis and experimental evaluation
of algorithms and computer software.
3. Epistemology (definitions from Wikipedia)
- Epistemology (from Greek ἐπιστήμη, episteme,
"knowledge" + λόγος, "logos"), or theory of
knowledge, is the branch of philosophy concerned
with the nature and scope (limitations) of
knowledge. It addresses the questions:
- "What is knowledge?"
- "How is knowledge acquired?"
- "What do people know?"
- "How do we know what we know?"
4. Rationalism
- Rationalism is "any view appealing to reason as a
source of knowledge or justification" (Lacey
286). In more technical terms it is a method or a
theory "in which the criterion of the truth is
not sensory but intellectual and deductive"
(Bourke 263).
- Originated with Socrates (469 BC–399 BC) and
Plato (428/427 BC–348/347 BC).
5. Empiricism
- Empiricism is a theory of knowledge which asserts
that knowledge arises from experience. Empiricism
emphasizes the role of experience and evidence,
especially sensory perception, in the formation
of ideas.
- Originated with Aristotle (384 BC–322 BC).
6. Rationalism in CS (Theoretical CS)
- Programs are formal mathematical objects.
- Therefore, important properties of
algorithms/software can be proven mathematically:
- Termination
- Correctness (satisfies a formal specification)
- Computational Complexity (time and space
requirements)
7. Theoretical CS Research
- Algorithm Design and Analysis
- Design a new (more efficient) algorithm for some
well-defined problem (e.g. sorting,
longest-common-subsequence).
- Mathematically prove the correctness and improved
complexity of the new algorithm.
- Theoretical Analysis
- Form a mathematical conjecture about a
computational problem (e.g. graph isomorphism is
NP-complete).
- Mathematically prove the conjecture as a theorem.
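To illustrate the algorithm-design bullets above, here is a minimal sketch (not from the original slides) of the standard dynamic-programming algorithm for the length of a longest common subsequence. The name `lcs_length` is an illustrative choice. The O(mn) running time can be read off the two nested loops, which is the kind of property a theoretical analysis would prove.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b.

    Runs in O(len(a) * len(b)) time and O(len(b)) extra space.
    """
    prev = [0] * (len(b) + 1)  # LCS lengths for the previous prefix of a
    for ch_a in a:
        curr = [0]  # LCS with an empty prefix of b is 0
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the LCS of both shorter prefixes by one character.
                curr.append(prev[j - 1] + 1)
            else:
                # Drop the last character of either a's or b's prefix.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


print(lcs_length("ABCBDAB", "BDCABA"))
```

For the two strings in the usage line, a longest common subsequence such as "BCBA" has length 4.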
8. Limits of Rationalism in CS
- Sometimes software is too complex to analyze
theoretically.
- Sometimes correctness cannot be characterized
formally and depends on natural or human
behavior.
- Protein folding
- Handwriting/speech recognition
- Sometimes software behavior on real data depends
on unknown natural properties of this data.
- Locality affecting paging performance
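The paging example above can be made concrete with a small simulation. This is a sketch under stated assumptions, not material from the slides: it assumes LRU page replacement, and the two reference strings are invented to contrast high and low locality over the same pages.

```python
from collections import OrderedDict


def lru_page_faults(references, num_frames):
    """Count page faults for a reference string under LRU replacement.

    Assumes num_frames >= 1.
    """
    frames = OrderedDict()  # keys are resident pages, least recently used first
    faults = 0
    for page in references:
        if page in frames:
            frames.move_to_end(page)  # hit: mark as most recently used
        else:
            faults += 1
            if len(frames) >= num_frames:
                frames.popitem(last=False)  # evict the least recently used page
            frames[page] = None
    return faults


# Two hypothetical reference strings: same 8 pages, 64 references each.
high_locality = [p for p in range(8) for _ in range(8)]  # 0,0,...,0,1,1,...
low_locality = [p for _ in range(8) for p in range(8)]   # 0,1,...,7,0,1,...

print(lru_page_faults(high_locality, num_frames=4))
print(lru_page_faults(low_locality, num_frames=4))
```

With 4 frames, the high-locality string faults only on the first touch of each page (8 faults), while the cyclic string faults on every reference (64 faults). The program is identical in both runs; only the data's locality differs, which is why such behavior has to be measured on real data.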
9. Empiricism in CS (Experimental CS)
- Behavior of software can be studied
experimentally.
- Anecdotal evidence (running a few sample cases)
is insufficient.
- Collect data (e.g. accuracy, run-time) by running
programs many times on large, real-world
benchmark collections.
- Verify hypotheses about behavior using controlled
experiments.
- Statistically analyze results for significance.
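As a rough sketch of the data-collection bullet above (not from the original slides), the following times a function over a collection of inputs several times and summarizes the samples. The helper `time_runs` and the tiny `benchmark` list are hypothetical stand-ins; a real study would use a large, real-world benchmark collection.

```python
import statistics
import time


def time_runs(func, inputs, repeats=5):
    """Return one total wall-clock time in seconds per repeat of func over all inputs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        for x in inputs:
            func(x)
        samples.append(time.perf_counter() - start)
    return samples


# Hypothetical stand-in for a benchmark collection: reversed lists of growing size.
benchmark = [list(range(n, 0, -1)) for n in range(100, 1100, 100)]
samples = time_runs(sorted, benchmark, repeats=10)
print(f"mean={statistics.mean(samples):.6f}s  stdev={statistics.stdev(samples):.6f}s")
```

Reporting the spread alongside the mean is what makes a later significance test possible; a single timing is anecdotal evidence.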
10. Scientific Method (steps from Wikipedia)
- 1) Define the question
- 2) Gather information and resources (observe)
- 3) Form hypothesis
- 4) Perform experiment and collect data
- 5) Analyze data
- 6) Interpret data and draw conclusions that serve
as a starting point for new hypothesis
- 7) Publish results
- 8) Retest (frequently done by other scientists)
11. 1) Define the question
- Example from my research: Search Query
Disambiguation from Short Sessions
- Can a web search engine disambiguate queries?
[Figure: search box containing the ambiguous query "scrubs", followed by a question mark]
12. 2) Gather information and resources
- Obtained web search session data from Microsoft
- Find instances of ambiguous queries
- Find contextual clues that might help
disambiguate queries
13. Context Can Aid Disambiguation
14. 3) Form Hypothesis
- Previous queries and clicks in a session can help
disambiguate queries by relating them to previous
sessions involving the same query (where we know
what result was clicked).
15. 4) Perform Experiment and Collect Data
- Build a system that uses prior context and previous
session data to predict clicked results for a new
user.
- Reorder results from an existing search engine based
on the predicted probability of clicking on a result.
- This should reduce the number of results the user needs to
examine before finding a relevant one.
- Test on unseen data and compare predictions to
the actual results clicked.
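The reordering step above can be sketched as follows. This is only an illustration: the probabilities are made up and stand in for a trained model's predictions, the example ordering is invented, and `rerank` and `results_examined` are hypothetical helper names. The slides do not show the system's actual code.

```python
def rerank(results, click_prob):
    """Reorder results by descending predicted click probability.

    Results missing from click_prob are treated as probability 0.0.
    Ties keep the engine's original order because sorted() is stable.
    """
    return sorted(results, key=lambda url: -click_prob.get(url, 0.0))


def results_examined(ranking, clicked):
    """Number of results a user scanning from the top examines to reach the clicked one."""
    return ranking.index(clicked) + 1


engine_order = ["scrubs.com", "ebay.com", "scrubs-tv.com"]
# Made-up probabilities standing in for a model's output for one user.
predicted = {"scrubs-tv.com": 0.6, "scrubs.com": 0.3, "ebay.com": 0.1}

reranked = rerank(engine_order, predicted)
print(reranked)
print(results_examined(engine_order, "scrubs-tv.com"),
      results_examined(reranked, "scrubs-tv.com"))
```

In this toy case a user who wants scrubs-tv.com examines 3 results under the engine's order and 1 after reordering.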
16. Using Relational Information with a Markov Logic
Network (MLN)
[Figure: example search sessions showing the queries "huntsville school" and "scrubs" and the results hospitallink.com, scrubs.com, scrubs-tv.com, and ebay.com]
17. Controlled Experiment
- Performance of the experimental system must be
compared to some baseline or control.
- Controls are necessary to demonstrate that the system
improves over some naïve method (strawman) or the
current best system for a problem.
- For example, in the old joke, someone claims that
they are snapping their fingers "to keep the
tigers away" and justifies this behavior by
saying "see, it's working!" While this
"experiment" does not falsify the hypothesis
"snapping fingers keeps the tigers away", it does
not really support the hypothesis, because not snapping
your fingers keeps the tigers away just as
well (Wikipedia, "Experiment").
18. Control for Query Disambiguation
- A simple control is to order results from the search
engine randomly.
- Another baseline is to just use the ordering from the
existing (non-personalized) search engine.
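The two controls above are simple enough to write down directly. The function names are hypothetical, and the optional `seed` parameter is an addition of this sketch so that a random baseline run can be repeated exactly.

```python
import random


def engine_baseline(results):
    """Control: keep the existing (non-personalized) engine's ordering."""
    return list(results)


def random_baseline(results, seed=None):
    """Control: a uniformly random ordering of the same results."""
    shuffled = list(results)  # copy so the caller's list is not modified
    random.Random(seed).shuffle(shuffled)
    return shuffled


results = ["scrubs.com", "scrubs-tv.com", "ebay.com", "hospitallink.com"]
print(engine_baseline(results))
print(random_baseline(results, seed=0))
```

Both controls return the same set of results as the experimental system would, so any difference in a metric comes from the ordering alone.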
19. Performance Metrics
- Need a quantitative measure of system performance
(runtime or accuracy).
- Compare quantitative performance of the experimental
system to the baseline control system.
- To measure the accuracy of the ordering of web search
results, we measure AUC-ROC.
- Percentage of irrelevant results not seen by the user
before finding a relevant result (if scanning results
from the top)
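A minimal sketch of the metric described above, assuming binary relevance (a result is either relevant or not). The function name `ranking_auc` is illustrative. It computes AUC-ROC as the fraction of (relevant, irrelevant) pairs ordered correctly; with a single relevant result this equals the fraction of irrelevant results ranked below it, i.e. not seen by a user scanning from the top.

```python
def ranking_auc(ranking, relevant):
    """AUC-ROC of a ranked list with binary relevance.

    Equals the fraction of (relevant, irrelevant) pairs in which the
    relevant result is ranked above the irrelevant one.
    """
    n_relevant = sum(1 for r in ranking if r in relevant)
    n_irrelevant = len(ranking) - n_relevant
    if n_relevant == 0 or n_irrelevant == 0:
        raise ValueError("need at least one relevant and one irrelevant result")
    correct_pairs = 0
    irrelevant_seen = 0
    for r in ranking:  # scan from the top
        if r in relevant:
            # Every irrelevant result not yet seen is ranked below this one.
            correct_pairs += n_irrelevant - irrelevant_seen
        else:
            irrelevant_seen += 1
    return correct_pairs / (n_relevant * n_irrelevant)


ranking = ["a", "b", "c", "d"]
print(ranking_auc(ranking, {"a"}))  # relevant result first
print(ranking_auc(ranking, {"d"}))  # relevant result last
```

A relevant result at the top gives 1.0 and at the bottom gives 0.0; a random ordering averages 0.5, which is why random ordering is a natural control for this metric.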
20. 5) Analyze Data
- Do results support the hypothesis?
- Are differences statistically significant?
- Use a statistical test to determine if observed
differences are unlikely to be due only to random
variation, i.e. the probability of the observed results
under the null hypothesis is below .05 (p < .05).
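The slides do not say which statistical test was used. As one hedged illustration, the sketch below applies an exact two-sided sign test to paired per-query differences between the experimental system and a baseline. The function name and the difference values are invented for the example.

```python
from math import comb


def sign_test_p_value(differences):
    """Exact two-sided sign test on paired differences (zeros are dropped)."""
    nonzero = [d for d in differences if d != 0]
    n = len(nonzero)
    if n == 0:
        return 1.0  # no evidence either way
    positives = sum(1 for d in nonzero if d > 0)
    k = min(positives, n - positives)
    # P(X <= k) for X ~ Binomial(n, 1/2), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical per-query AUC differences (experimental minus baseline).
diffs = [0.05, 0.02, 0.08, -0.01, 0.04, 0.03, 0.06, 0.01, 0.07, 0.02]
p = sign_test_p_value(diffs)
print(p, p < 0.05)
```

Here 9 of 10 differences are positive, giving p = 22/1024 (about 0.021), which is below the .05 threshold. A sign test ignores the size of each difference; a paired t-test or a permutation test would use more of the data, under different assumptions.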
21. Results (AUC-ROC)
[Results table not recoverable. Legend: indicates statistically significant improvement
over previous result]
22. 6) Interpret data and draw conclusions that serve
as a starting point for new hypothesis
- Is random ordering the best baseline to compare
to?
- What if we just order results based on popularity
(i.e. how many people clicked on a particular
result after submitting a given ambiguous query)?
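The popularity baseline raised above can be sketched as counting past clicks per query and sorting by those counts. The click log here is invented, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_popularity(click_log):
    """Count clicks per (query, url) from an iterable of (query, clicked_url) pairs."""
    counts = defaultdict(Counter)
    for query, url in click_log:
        counts[query][url] += 1
    return counts


def popularity_rank(query, results, counts):
    """Order results by how often each was clicked for this query in past sessions.

    Unclicked results count as 0; ties keep the incoming order.
    """
    clicks = counts.get(query, Counter())
    return sorted(results, key=lambda url: -clicks[url])


# Invented click log for illustration only.
log = [
    ("scrubs", "scrubs-tv.com"),
    ("scrubs", "scrubs-tv.com"),
    ("scrubs", "scrubs.com"),
]
counts = build_popularity(log)
print(popularity_rank("scrubs", ["scrubs.com", "ebay.com", "scrubs-tv.com"], counts))
```

Unlike the context-based system, this baseline gives every user the same ordering for a query, so beating it is evidence that session context adds something beyond popularity.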
23. New Baseline Results
24. Refine System
- Develop an MLN that incorporates popularity
information.
- Rerun the experiment to obtain results for the revised
version and verify the hypothesis that it
performs better than the popularity baseline.
25. Results for Revised System
26. 7) Publish Results
- Paper submitted to the international data mining
conference.
- KDD-09, Paris, June 28–July 1, 2009