Data Analysis (Quantitative Methods) - PowerPoint PPT Presentation

Slides: 60
Provided by: scet8
Transcript and Presenter's Notes

1
Data Analysis (Quantitative Methods)
  • Lecture 1
  • Fundamental Statistics

2
What is statistics?
  • Statistics is the science of data. This involves
    collecting, classifying, summarizing, organizing,
    analyzing, and interpreting numerical information.

3
Dealing with Data
  • Measurement Scales
  • Descriptive Statistics
  • Inferential Statistics

4
Dealing with Data
  • Quantitative data are measurements that are
    recorded on a naturally occurring numerical
    scale.
  • Qualitative data are measurements that cannot be
    measured on a natural numerical scale; they can
    only be classified into one of a group of
    categories.

5
Levels of Measurement.
  • Nominal Scale (Qualitative category membership
    e.g. gender, eye colour, nationality).
  • Ordinal Scale (Ranks or assignments, positions in
    a group, e.g. 1st, 2nd, 3rd).
  • Interval and Ratio Scales (measured on an
    independent scale with units, e.g. I.Q. scale.
    A ratio scale also has an absolute zero point,
    e.g. distance, Kelvin scale).

6
Variables
  • A variable is a characteristic or property of an
    individual unit of the population (the set of
    units we are interested in studying).
  • Discrete Variables: There are no possible values
    between adjacent units on the scale. For
    example, number of children in a family.
    X takes the values X1, X2, ..., Xn.
  • Continuous Variables: A continuous variable
    theoretically can have an infinite number of
    values between adjacent units on the scale. For
    example, time, height, weight. X takes values in
    an interval such as [0, 100], [0, 30), (12, 80]
    or (1, 2).

7
Descriptive Statistics
Descriptive statistics utilizes numerical and
graphical methods to look for patterns in a data
set, to summarize the information revealed in a
data set, and to represent that information in a
convenient form.
  • Graphical Representation of Data
  • Measures of Central Tendency
  • Measures of Dispersion

8
Representing Data Graphically
  • Bar Charts
  • Histograms
  • Pie Charts
  • Scattergrams

9
The Bar Chart
  • Used for Discrete variables
  • Bars are separated

10
Histogram
  • Columns can only represent frequencies.
  • All categories represented.
  • Columns are not spaced apart.

11
Pie Chart
  • Used to illustrate percentages

12
Scattergrams - Positive Relationships
13
Negative Correlation
14
No Relationship
15
Measures of Central Tendency
  • The Mean
  • The Median
  • The Mode

16
The Mean
Mean = sum of all values in a group divided by
the number of values in that group. So if 5
people took 135, 109, 95, 121, 140 seconds to
solve an anagram, the mean time taken is
(135 + 109 + 95 + 121 + 140) / 5 = 600 / 5 = 120 seconds.
17
The Mean: Pros & Cons
  • Advantages
  • Very sensitive measure.
  • Forms the basis of most tests used in inferential
    statistics.
  • Disadvantages
  • Can be affected by outlying scores, e.g.
  • 135, 109, 95, 121, 140, 480: mean = 1080/6 = 180
    seconds.
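
The effect of an outlying score on the mean can be checked with a short Python sketch. The data are the anagram times from the slides; the helper name `mean` is only an illustrative choice.

```python
# Anagram-solving times in seconds, taken from the slides.
times = [135, 109, 95, 121, 140]

def mean(values):
    """Sum of the values divided by how many values there are."""
    return sum(values) / len(values)

mean_time = mean(times)
print(mean_time)  # 120.0

# Adding one outlying score (480 s) pulls the mean well above most of the data.
mean_with_outlier = mean(times + [480])
print(mean_with_outlier)  # 180.0
```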

18
The Median
The median is the central value of a set of
numbers that are placed in numerical order.
For an odd set of numbers 95, 109, 121, 135, 140
the median is 121.
For an even set of numbers 95, 109, 121, 135,
140, 480 the median is the sum of the two central
scores divided by 2: (121 + 135)/2 = 128
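
As a quick check of both cases, here is a minimal sketch using `median` from Python's standard `statistics` module. The variable names are illustrative only.

```python
from statistics import median

# Deliberately unsorted: median() works on a sorted copy of the data.
odd_set = [135, 109, 95, 121, 140]
even_set = odd_set + [480]

median_odd = median(odd_set)    # middle value of the 5 ordered scores
median_even = median(even_set)  # average of the two central scores

print(median_odd)   # 121
print(median_even)  # 128.0
```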
19
The Median: Pros & Cons
  • Advantages
  • Easier and quicker to calculate than the mean.
  • Unaffected by extreme values.
  • Disadvantages
  • Doesn't take into account the exact values of
    each item.
  • If values are few it can be unrepresentative, e.g.
  • for 2, 3, 5, 98, 112 the median is 5.

20
The Mode
The Mode = the most frequently occurring value.
1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6, 6,
7, 7, 7, 8
The Mode = 5
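
Counting how often each value occurs makes the mode easy to verify. The sketch below uses `collections.Counter` on the data from the slide; the variable names are illustrative.

```python
from collections import Counter

scores = [1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6, 6, 7, 7, 7, 8]

# Counter maps each value to its frequency; most_common(1) returns
# the single (value, frequency) pair with the highest frequency.
frequencies = Counter(scores)
mode_value, mode_count = frequencies.most_common(1)[0]

print(mode_value, mode_count)  # 5 6
```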
21
The Mode: Pros & Cons
  • Disadvantages
  • Doesn't take into account the value of each item.
  • Not useful for small sets of data.
  • Advantages
  • Shows the most frequently occurring (most
    typical) value of a set.
  • Unaffected by extreme values.

22
Data Types and Central Tendency Measures.
The Mode may also be used on Ordinal and Interval
Data. The median may also be used on Interval
Data.
23
Why look at dispersion?
  • 17, 32, 34, 58, 69, 70, 98, 142
  • Mean = 65
  • 61, 62, 64, 65, 65, 66, 68, 69
  • Mean = 65

24
Measures of Dispersion
  • The Range
  • Variance
  • The Standard Deviation

25
The Range
The Range is the difference between the highest
and the lowest scores.
Range = highest score - lowest score
For 4, 10, 5, 12, 6, 14: Range = 14 - 4 = 10
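
A small sketch of the range calculation, applied to the example above and to the two data sets from the "Why look at dispersion?" slide, which share a mean of 65 but differ widely in spread. The function name `value_range` is an illustrative choice.

```python
def value_range(values):
    """Difference between the highest and the lowest score."""
    return max(values) - min(values)

slide_example = [4, 10, 5, 12, 6, 14]
spread_out = [17, 32, 34, 58, 69, 70, 98, 142]   # mean 65
clustered = [61, 62, 64, 65, 65, 66, 68, 69]     # mean 65

print(value_range(slide_example))  # 10
print(value_range(spread_out))     # 125
print(value_range(clustered))      # 8
```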
26
Population
  • A population is a set of units (usually people,
    objects, transactions, or events) that we are
    interested in studying.

27
Sample
  • A sample is a subset of the units of a
    population.
  • A statistical inference is an estimate,
    prediction, or some other generalization about a
    population based on information contained in a
    sample

28
Variance
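Population Variance: σ² = Σ(x - μ)² / N, where μ is the population mean and N the population size.
Sample Variance: s² = Σ(x - x̄)² / (n - 1), where x̄ is the sample mean and n the sample size.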
29
Calculating the Standard Deviation
Population standard deviation: σ = square root of the population variance σ².
Sample standard deviation: s = square root of the sample variance s².
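
The following sketch computes both versions of the variance and standard deviation with Python's `statistics` module. It assumes the usual definitions: `pvariance` and `pstdev` divide by N (population), while `variance` and `stdev` divide by n - 1 (sample). The two data sets are the ones from the "Why look at dispersion?" slide.

```python
from statistics import pstdev, pvariance, stdev, variance

spread_out = [17, 32, 34, 58, 69, 70, 98, 142]   # mean 65
clustered = [61, 62, 64, 65, 65, 66, 68, 69]     # mean 65

for name, data in [("spread out", spread_out), ("clustered", clustered)]:
    print(name)
    print("  population variance:", pvariance(data))  # divides by N
    print("  sample variance:    ", variance(data))   # divides by n - 1
    print("  population std dev: ", pstdev(data))
    print("  sample std dev:     ", stdev(data))
```

Both data sets have the same mean, but the dispersion measures for the first are far larger, which is the point the dispersion slide makes.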
30
Inferential Statistics
  • Inferential statistics utilizes sample data to
    make estimates, decisions, predictions, or other
    generalizations about a larger set of data.
  • Inferential statistics allows us to draw
    conclusions about populations, and to test
    research hypotheses.
  • Inferential statistics involves:
  • Probability
  • Statistical tests, e.g. the t test.

31
Summary
  • All data is measured on either Nominal, Ordinal,
    Interval or Ratio Scales.
  • Variables can be discrete or continuous.
  • Descriptive Statistics such as measures of
    central tendency and dispersion are used to
    describe or characterize data.
  • Inferential Statistics is used to make inferences
    from sample data about the population at large.

32
Data Analysis (Quantitative Methods)
  • Lecture 5
  • Probability

33
Probability
  • An experiment is an act or process of observation
    that leads to a single outcome that cannot be
    predicted with certainty.
  • A sample point is the most basic outcome of an
    experiment: Ob1, Ob2, ..., Obn.

34
Sample Space & Venn Diagram
  • The sample space of an experiment is the
    collection of all its sample points.
  • S = {Ob1, Ob2, ..., Obn}
  • Venn diagrams are graphical representations of
    the sample space.

[Venn diagram: sample space S containing the sample points Ob1, Ob2, ..., Obn]
35
Examples
  • Experiment: Observe the up face on a coin.
  • Sample space: 1. Observe a head (H)
  • 2. Observe a tail (T)
  • S = {H, T}

[Venn diagram: sample space S containing the sample points H and T]
36
Examples
  • Experiment: Observe the up face on a die.
  • Sample space: 1. Observe a 1.
  • 2. Observe a 2.
  • 3. Observe a 3.
  • 4. Observe a 4.
  • 5. Observe a 5.
  • 6. Observe a 6.
  • S = {1, 2, 3, 4, 5, 6}

[Venn diagram: sample space S containing the sample points 1, 2, 3, 4, 5, 6]
37
Examples
  • Experiment: Observe the up face on two coins.
  • Sample space: 1. Observe HH
  • 2. Observe HT
  • 3. Observe TH
  • 4. Observe TT
  • S = {HH, HT, TH, TT}

[Venn diagram: sample space S containing the sample points HH, HT, TH, TT]
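
The three example sample spaces can also be listed programmatically. This is only a sketch: representing sample points as strings and integers is an illustrative choice, and `itertools.product` is used to enumerate the two-coin outcomes.

```python
from itertools import product

# One coin and one die: the sample points can simply be listed.
coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# Two coins: every ordered pair of faces is one sample point.
two_coins = ["".join(faces) for faces in product("HT", repeat=2)]

print(coin)       # ['H', 'T']
print(die)        # [1, 2, 3, 4, 5, 6]
print(two_coins)  # ['HH', 'HT', 'TH', 'TT']
```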
38
Probability Rules for Sample Points
  • All sample point probabilities must lie between
    0 and 1.
  • The probabilities of all the sample points within
    a sample space must sum to 1.
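
Both rules can be checked mechanically for any proposed assignment of probabilities. The sketch below is illustrative only: the function name `is_valid_assignment` and the example assignments are assumptions, and `Fraction` is used so that the sum is exact.

```python
from fractions import Fraction

def is_valid_assignment(probabilities):
    """Check the two rules for sample point probabilities."""
    values = list(probabilities.values())
    each_in_range = all(0 <= p <= 1 for p in values)   # rule 1
    total_is_one = sum(values) == 1                    # rule 2
    return each_in_range and total_is_one

# Two balanced coins: four equally likely sample points.
fair_two_coins = {point: Fraction(1, 4) for point in ["HH", "HT", "TH", "TT"]}

# A hypothetical bad assignment: each value is between 0 and 1,
# but the values sum to 5/4.
bad_coin = {"H": Fraction(3, 4), "T": Fraction(1, 2)}

print(is_valid_assignment(fair_two_coins))  # True
print(is_valid_assignment(bad_coin))        # False
```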

39
Probability
  • An event is a specific collection of sample
    points.
  • Example: Consider the experiment of tossing two
    balanced coins.
  • Events:
  • A: Observe exactly one head
  • B: Observe at least one head

40
Probability
  • Sample point: Probability
  • HH: 1/4
  • HT: 1/4
  • TH: 1/4
  • TT: 1/4
  • P(A) = P(HT) + P(TH) = 1/2
  • P(B) = P(HH) + P(TH) + P(HT) = 3/4

41
Probability of an Event
  • The probability of an event A is calculated by
    summing the probabilities of the sample points in
    the sample space for A.

42
Steps for Calculating Probabilities of Events
  • Define the experiment.
  • List the sample points: Ob1, Ob2, ..., Obn.
  • Assign probabilities to the sample points:
    P(Ob1), ..., P(Obn).
  • Determine the collection of sample points
    contained in the event of interest.
  • Sum the sample point probabilities to get the
    event probability.
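
The five steps can be traced in code for the two-coin experiment, with events A (exactly one head) and B (at least one head) as defined on the earlier slide. This is a minimal sketch; the variable names are illustrative.

```python
from fractions import Fraction

# Steps 1 and 2: the experiment is tossing two balanced coins;
# list its sample points.
sample_points = ["HH", "HT", "TH", "TT"]

# Step 3: balanced coins, so each sample point gets probability 1/4.
probability = {point: Fraction(1, 4) for point in sample_points}

# Step 4: collect the sample points contained in each event.
event_a = [p for p in sample_points if p.count("H") == 1]   # exactly one head
event_b = [p for p in sample_points if p.count("H") >= 1]   # at least one head

# Step 5: sum the sample point probabilities.
p_a = sum(probability[p] for p in event_a)
p_b = sum(probability[p] for p in event_b)

print(event_a, p_a)  # ['HT', 'TH'] 1/2
print(event_b, p_b)  # ['HH', 'HT', 'TH'] 3/4
```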

43
Unions and intersections
  • The union of two events A and B is the event that
    occurs if either A or B or both occur on a single
    performance of the experiment, denoted by the
    symbol ∪, as in A ∪ B.

44
Unions and intersections
  • The intersection of two events A and B is the
    event that occurs if both A and B occur on a
    single performance of the experiment, denoted by
    the symbol ∩, as in A ∩ B.
  • P(A ∩ B)

45
Unions and intersections
[Venn diagram of events A and B]
46
Unions and intersections
  • Example 1.
  • Consider the die-toss experiment. Define the
    following events
  • A: Toss an even number
  • B: Toss a number less than or equal to 3
  • Find
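
The quantities this example asks for did not survive in the transcript. Assuming, from the slide title, that they are the union and intersection of A and B and their probabilities for a fair die, a sketch with Python sets looks like this (the helper `prob` is an illustrative name):

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}                     # fair die, assumed
a = {n for n in sample_space if n % 2 == 0}           # A: even number
b = {n for n in sample_space if n <= 3}               # B: number <= 3

def prob(event):
    """Probability of an event when all sample points are equally likely."""
    return Fraction(len(event), len(sample_space))

union = a | b          # A or B or both
intersection = a & b   # both A and B

print(sorted(union), prob(union))                # [1, 2, 3, 4, 6] 5/6
print(sorted(intersection), prob(intersection))  # [2] 1/6
```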

47
Unions and intersections
48
Complementary Events
  • The complement of an event A is the event that A
    does not occur; that is, the event consisting
    of all sample points that are not in event A,
    denoted by the symbol A^c.
  • P(A) + P(A^c) = 1

49
Probability
  • Additive Rule of Probability:
    P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
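
The additive rule in its usual form, P(A ∪ B) = P(A) + P(B) - P(A ∩ B), can be verified on the die-toss events used earlier (A: even number, B: number less than or equal to 3). This is a sketch assuming a fair die; `prob` is an illustrative helper.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # fair die, assumed
a = {2, 4, 6}                       # A: even number
b = {1, 2, 3}                       # B: number <= 3

def prob(event):
    """Probability of an event when all sample points are equally likely."""
    return Fraction(len(event), len(sample_space))

# Counting the union directly ...
direct = prob(a | b)
# ... agrees with the additive rule, which subtracts the overlap
# so that the shared sample point 2 is not counted twice.
by_rule = prob(a) + prob(b) - prob(a & b)

print(direct, by_rule)  # 5/6 5/6
```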

50
Conditional Probability
  • To find the conditional probability that event A
    occurs given that event B occurs, divide the
    probability that both A and B occur by the
    probability that B occurs, that is,
    P(A | B) = P(A ∩ B) / P(B)
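
As an illustration of the rule P(A | B) = P(A ∩ B) / P(B), the sketch below reuses the two-coin events from the earlier slides (A: exactly one head, B: at least one head). The helper name `prob` is an assumption made for the example.

```python
from fractions import Fraction

sample_space = {"HH", "HT", "TH", "TT"}   # two balanced coins
a = {"HT", "TH"}                          # A: exactly one head
b = {"HH", "HT", "TH"}                    # B: at least one head

def prob(event):
    """Probability of an event when all sample points are equally likely."""
    return Fraction(len(event), len(sample_space))

# Divide the probability that both A and B occur by the probability of B.
p_a_given_b = prob(a & b) / prob(b)

print(p_a_given_b)  # 2/3
```

Knowing that at least one head was observed rules out TT, so two of the three remaining sample points give exactly one head.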

51
Probability
  • Tree diagram

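[Tree diagram: the first toss branches to H or T, and each branch splits again into H or T on the second toss, giving the outcomes HH, HT, TH, TT]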
52
Independent Events
  • Events A and B are independent events if the
    occurrence of B does not alter the probability
    that A has occurred; that is, events A and B are
    independent if
  • P(A | B) = P(A)
  • When events A and B are independent, it is also
    true that P(B | A) = P(B)
  • Events that are not independent are said to be
    dependent.

53
Probability
  • Probability of Intersection of Two Independent
    Events
  • If events A and B are independent, the
    probability of the intersection of A and B equals
    the product of the probabilities of A and B; that
    is, P(A ∩ B) = P(A) P(B)
  • The converse is also true:
  • if P(A ∩ B) = P(A) P(B), then A and B are
    independent.
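
The product test gives a direct way to check independence. The sketch below applies it to the die-toss events from the earlier example (A: even number, B: number less than or equal to 3) and to a third event C = {1, 2}, which is hypothetical and introduced only for illustration. A fair die is assumed.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # fair die, assumed
a = {2, 4, 6}                       # A: even number
b = {1, 2, 3}                       # B: number <= 3
c = {1, 2}                          # C: hypothetical event, illustration only

def prob(event):
    """Probability of an event when all sample points are equally likely."""
    return Fraction(len(event), len(sample_space))

def are_independent(x, y):
    """Independent exactly when P(X and Y) equals P(X) * P(Y)."""
    return prob(x & y) == prob(x) * prob(y)

print(are_independent(a, b))  # False: 1/6 is not equal to 1/2 * 1/2
print(are_independent(a, c))  # True:  1/6 equals 1/2 * 1/3
```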

54
Exercise
  • Calculate the mode, mean, and median of the
    following data
  • (1) 12, 13, 15, 18, 12, 56, 13, 17, 19, 20, 35,
    36
  • (2) 35, 23, 18, 26, 35, 23, 39, 45, 47, 37, 23,
    35, 19

55
Exercise
  • Calculate the range, variance and standard
    deviation of the following data
  • (1) 2, 3, 1, 6, 8, 5, 9, 4, 5
  • (2) 2, 0, 8, 4, 7, 5, 3, 2, 100

56
Exercise
  • Two fair coins are tossed and the following
    events are defined:
  • A: Observed at least one head
  • B: Observed exactly one head
  • C: Observed exactly one tail
  • D: Observed at most one head
  • Find P(A), P(B ? D), P(A | D)

57
Exercise
  • Use a tree diagram to obtain the sample space of
    an experiment that consists of a fair coin being
    tossed four times. Consider the following
    events:
  • A: All four results are the same.
  • B: Exactly one head occurs.
  • C: At least two heads occur.
  • Find P(A), P(B), P(C), P(A)P(B)P(C), P(A ? C),
    P(A ? B)
  • Hence, explain why the events A, B and C are
    not mutually exclusive.

58
Exercise
  • Let P(A) = 0.7, P(B) = 0.5 and P(A ∪ B) = 0.8.
  • Find (1) P(A ∩ B) (2) P(B | A)
  • (3) Is event A independent of event B?
59
References
  • Statistics, 8th Edition
  • McClave and Sincich
  • Prentice Hall, 2000.