Title: The art and science of measuring people
1The art and science of measuring people
- Reliability
- Validity
- Operationalizing
2Overview of design and analysis
- Posing a usability question
- Conceptualizing the question
- Operationalizing the related concepts
- Identifying Independent, Dependent, Controlled
Variables
- Developing the Hypothesis
3Choosing the testing method
- What method is appropriate for the current
situation? (experiment, observation, surveys
etc.)
- Choice of method as a trade-off between
control and realism
- Experimental, Quasi-Experimental and
Non-Experimental Methods.
4Collecting data
- The art of finding and recruiting participants
- A practical view of randomization: Randomization
and Pseudo-Randomization
- Random Selection and Random Assignment.
- Practical issues about sample size and
statistical power.
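Random assignment can be sketched in a few lines. The example below is a minimal illustration, not a prescribed procedure: the participant IDs, the condition names `fast_site` and `slow_site`, and the helper `randomly_assign` are all hypothetical. It shuffles an already recruited sample and deals participants round-robin into conditions so the groups end up equal in size.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them round-robin into conditions."""
    rng = random.Random(seed)          # seeded so the assignment can be reproduced
    shuffled = list(participants)      # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups

# Hypothetical sample of 12 recruited participants.
participants = [f"P{i:02d}" for i in range(1, 13)]
groups = randomly_assign(participants, ["fast_site", "slow_site"], seed=42)
for condition, members in groups.items():
    print(condition, members)
```

Note that this covers random assignment only. Random selection concerns how the 12 participants were drawn from the user population in the first place.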
5Analyzing the data: Basic Statistics
- Levels of measurement: nominal, ordinal,
interval, and ratio
- Mean, median, standard deviation
- Testing mean differences
- Significance levels and what they mean
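The items above can be illustrated with a small sketch. The task times are made-up numbers, and the permutation test is one possible way to test a mean difference, chosen here because it needs only the standard library; the slides do not say which test the course uses. The p-value is the share of random relabelings that produce a mean difference at least as large as the observed one.

```python
import random
import statistics

def permutation_test(a, b, n_resamples=2000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = statistics.mean(a) - statistics.mean(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)            # relabel scores at random
        diff = statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (extreme + 1) / (n_resamples + 1)

# Hypothetical task completion times in seconds for two groups.
fast_site = [41, 38, 45, 50, 39, 44]
slow_site = [52, 49, 58, 47, 55, 60]

for name, scores in (("fast", fast_site), ("slow", slow_site)):
    print(name, statistics.mean(scores), statistics.median(scores),
          statistics.stdev(scores))

observed, p_value = permutation_test(fast_site, slow_site)
print("mean difference:", observed, "p-value:", p_value)
```

A p-value below the chosen significance level (commonly 0.05) means a difference this large would be unusual if the group labels did not matter.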
6Analysis of experimental designs: Single Factor
Experiments
- Statistical Hypothesis Testing
- Estimates of Experimental Error
- Estimates of Treatment Effects
- Evaluation of the Null Hypothesis
- Various ANOVA models
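As a rough sketch of how these pieces fit together in a single-factor, between-subjects design: the between-groups mean square estimates treatment effects plus error, the within-groups mean square estimates error alone, and their ratio F is what gets compared against the null hypothesis. The function name `one_way_anova` and the scores are illustrative assumptions.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way between-subjects ANOVA."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.mean(all_scores)
    # Treatment effect estimate: spread of group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Experimental error estimate: spread of scores around their own group mean.
    ss_within = sum((score - statistics.mean(g)) ** 2 for g in groups for score in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

# Hypothetical scores for three treatment groups.
f_ratio, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_ratio, df_b, df_w)
```

An F near 1 is what the null hypothesis predicts. Turning F into a p-value requires the F distribution, which is left to a statistics package.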
7Multi Factor Experiments
- Advantages of the factorial design
- Interaction Effects
- The power of within-subjects designs (reduction
of variance)
- The two-factor experiment
- Higher Order Factorial Designs
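An interaction effect in a 2 x 2 design can be read off the cell means. The sketch below uses invented scores and hypothetical factor names (site speed and task type). It computes the effect of task type separately at each level of speed, and the interaction is the difference between those two simple effects.

```python
import statistics

# Hypothetical satisfaction scores for each cell of a 2 x 2 design.
cells = {
    ("fast", "interesting"): [8, 9, 7],
    ("fast", "boring"): [7, 8, 6],
    ("slow", "interesting"): [7, 8, 6],
    ("slow", "boring"): [3, 4, 2],
}
means = {cell: statistics.mean(scores) for cell, scores in cells.items()}

def simple_effect_of_task(speed):
    """Effect of task type (interesting minus boring) at one level of speed."""
    return means[(speed, "interesting")] - means[(speed, "boring")]

# Non-zero when the effect of task type depends on site speed.
interaction = simple_effect_of_task("slow") - simple_effect_of_task("fast")
print(means)
print("interaction contrast:", interaction)
```

A single-factor study of speed alone would average over task type and miss this pattern, which is one advantage of the factorial design.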
8Analysis of Non-Experimental Studies
- Statistical methods for analyzing correlational
data
- Correlations, Scatter Plots, Partial Correlations
- Multiple Regression
- Introduction to Factor Analysis, Cluster
Analysis and Multidimensional Scaling
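A minimal sketch of a correlation and a first-order partial correlation follows. The variable names (`rated_speed`, `tasks_completed`, `download_seconds`) and all values are made up for illustration. The partial correlation asks how strongly two variables are related once a third has been held constant.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation of x and y with the linear effect of z removed from both."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical per-user data.
rated_speed = [1, 2, 3, 4, 5]
tasks_completed = [2, 1, 4, 3, 5]
download_seconds = [5, 3, 4, 1, 2]

print(pearson_r(rated_speed, tasks_completed))
print(partial_r(rated_speed, tasks_completed, download_seconds))
```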
9Surveys and Questionnaires
- The design of surveys and questionnaires
- How to frame questions
- Kinds of scales: Likert, Semantic Differential
etc.
- Analyzing survey data: which items are useful,
Item Response Theory
- Forming a scale to measure an attribute, e.g.,
satisfaction. Reliability, validity of scale
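One common index of scale reliability is Cronbach's alpha. The slides do not name it, so treat it as an illustrative choice rather than the course's method. The sketch below assumes each respondent answered every item on the same numeric scale, and the Likert responses are invented.

```python
import statistics

def cronbach_alpha(responses):
    """responses: one row per respondent, each row a list of item scores."""
    k = len(responses[0])                              # number of items
    items = list(zip(*responses))                      # one column per item
    sum_item_variances = sum(statistics.variance(col) for col in items)
    total_variance = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)

# Hypothetical answers from five respondents to three satisfaction items (1 to 5).
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
    [1, 2, 1],
]
alpha = cronbach_alpha(responses)
print(alpha)
```

Values closer to 1 indicate that the items vary together, which supports summing them into a single satisfaction score. Alpha says nothing about validity.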
10Measuring Individual Differences
- How to test for individual differences within
users
- Kinds of individual differences variables
- -demographic: such as age, gender etc.
- -situational: motivation, interest, fatigue
- -cognitive: memory, cognitive style etc.
- -personality: internal/external locus of control
- How to analyze existing data to identify
individual differences, and how to design studies
to test for individual differences?
11Frenzied Shopping: Obstacles to purchase, and the
perception of download times
- A study on ecommerce conducted by Jared Spool
- A critical analysis and Illustration of
alternative methods of examining this question
12Frenzied shopping
- Create a realistic scenario in present
situation, get person motivated
- Counted obstacles to purchase
- Advantages of measure
- -concrete: people agree about measure
- -valid: good measure of actual ecommerce
experience
- Disadvantages of measure
- -not reliable since situation is not structured
- -data analysis problems
13Results
- Found more than 200 obstacles to purchase
- The greater the number of users, the greater the
number of problems
- What's wrong with each test discovering hundreds
of problems?
- Client has limited resources, need to focus on
solving important (most common / most
catastrophic) problems
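One way to turn hundreds of raw obstacles into a short list is to rank them. The sketch below is an assumption about how that could be done, not the study's procedure: the obstacle descriptions, the 1 to 3 severity scale, and the rule "worst severity first, then number of distinct users affected" are all hypothetical.

```python
def prioritize(observations):
    """Rank obstacles by worst severity, then by number of distinct users affected."""
    users = {}
    worst = {}
    for user, obstacle, severity in observations:
        users.setdefault(obstacle, set()).add(user)
        worst[obstacle] = max(worst.get(obstacle, 0), severity)
    ranked = sorted(users, key=lambda o: (worst[o], len(users[o])), reverse=True)
    return [(o, worst[o], len(users[o])) for o in ranked]

# Hypothetical observations: (user, obstacle, severity), 3 = purchase abandoned.
observations = [
    ("U1", "coupon field rejected valid code", 2),
    ("U2", "coupon field rejected valid code", 2),
    ("U1", "could not find checkout button", 3),
    ("U3", "could not find checkout button", 3),
    ("U4", "could not find checkout button", 3),
    ("U2", "product photo too small", 1),
]
ranking = prioritize(observations)
for obstacle, severity, user_count in ranking:
    print(severity, user_count, obstacle)
```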
14More results: Perception of download times
- How long will users wait for pages to download?
- -Should web developers waste their time in
making pages faster?
- Method: Users were asked to rate the perceived
speed of pages after they had completed the task.
- Site: Ave. Speed, Rated
- Amazon.com: 30 sec, Fastest
- About.com: 8 sec, Slowest
15So what do download times relate to?
- Only correlated with success or failure of
shopping.
- (Amazon.com was judged to be faster than About.com
even though About.com actually downloaded much
faster)
- Result is a foregone conclusion given the task.
- Problems with method
- Memory issues: Users were asked for ratings at the
end of their experience with all the sites.
Retrospective memory problems.
- Ask someone waiting for a page to download if it
is taking too long!
16Rated speed no longer reflects the browsing,
searching part of the experience. Cannot infer
that download speeds are not important, can only
infer that perception of download speeds can be
influenced by other aspects of site
17Perception of download speed and all the ways to
study it...
18Survey: Are people bothered by long download
times?
- Sample Question
- How often do you leave a site without waiting
for the first page to download?
- 0-5% of times
- 5-10% of times
- 10% and higher
- In your opinion, how important are the web site
characteristics below? Rate their relative
importance.
- Download speed
- Site content
- Site interactivity
Possibilities: Task based surveys
19Observation: Do people seem to like fast (without
graphics) sites as compared to slow (with
graphics) sites?
- Method: Make two versions of a site, one with
sophisticated graphics (slower site) and the
other mostly text (faster site). Ask subjects to
browse / complete a task on both sites.
- Measurement: Watch participants for signs of
frustration or satisfaction with speed of site
20Experiment: Relationship of perceived download
times to actual download times?
- Method: Find similar sites with differential
speeds. Ask people to complete the same tasks on
both sites. Give them some interesting and some
boring tasks, and less than enough time to
complete the task.
- Measurement: Log the clicks of the users as they
traverse the sites. How many of the interesting
and how many of the boring tasks did they complete?
Relate that to download speed of site.
- Do some users tend to be more frustrated with
slower sites?
21User Logs: Do people leave sites while waiting
for slow pages to download?
- Method: Find similar sites with differential
speeds. Analyze the server logs for the sites.
- Measurement: Log the clicks of the users as they
traverse the sites. How many of the interesting
and how many of the boring tasks did they complete?
Relate that to download speed of site.
- Do some users tend to be more frustrated with
slower sites?
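A log analysis along these lines could be sketched as follows. It assumes the server logs have already been reduced to one record per page request, holding the load time in seconds and a flag for whether the user left before the page finished. That preprocessing, the record layout, and the 5-second buckets are all assumptions, since raw server logs do not record abandonment directly.

```python
from collections import defaultdict

def abandonment_by_bucket(requests, bucket_seconds=5):
    """Share of abandoned requests per load-time bucket (keyed by bucket start)."""
    totals = defaultdict(int)
    abandoned = defaultdict(int)
    for load_seconds, was_abandoned in requests:
        bucket = int(load_seconds // bucket_seconds) * bucket_seconds
        totals[bucket] += 1
        if was_abandoned:
            abandoned[bucket] += 1
    return {bucket: abandoned[bucket] / totals[bucket] for bucket in sorted(totals)}

# Hypothetical preprocessed records: (load time in seconds, left before load finished).
requests = [
    (2.0, False), (3.5, False), (4.0, True),
    (7.0, False), (8.0, True),
    (12.0, True), (14.0, True),
]
rates = abandonment_by_bucket(requests)
for bucket, rate in rates.items():
    print(f"{bucket}-{bucket + 5} sec: {rate:.2f}")
```

Because users are not randomly assigned to load times, a rising abandonment rate here is correlational evidence only.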
22The state of the art
- What usability methods are currently prevalent
and accepted in the field
- CUE 2
23Comparative Usability Evaluation (CUE) 2
- Purpose
- -Too much emphasis on one-way mirrors and
scan converters
- -Little knowledge of REAL usability testing
procedures
- -Who checks the checker?
- Method
- -Nine teams tested the usability of a web site
- -Seven professional teams, two student teams
- -Four European, five US teams
- Test web-site: www.hotmail.com
24Problems found in Comparative Usability Evaluation
25Problem Found by Seven Teams
During the registration process Hotmail users are
asked to provide a password hint question. The
corresponding text box must be filled. Most
users did not understand the meaning of the
password hint question. Some entered their
Hotmail password in the Hint Question text box.
26Characteristics of the tests
27Problems by teams
28What factors predict the number of problems and
the number of common (non-exclusive) problems?
29Inferences from CUE study
- Much disagreement about methods of usability
testing
- How to test?
- Who should test?
- What methods to use?
- How many testers to have?
- How many users to have?