maths - PowerPoint PPT Presentation

About This Presentation
Title:

maths

Description:

'Population' is all cars which could have had additive. ... Luxcar, makers of the best luxury cars. Burnol, the finest fuel you can buy ... – PowerPoint PPT presentation

Slides: 55
Provided by: ens8
Category:
Tags: maths

Transcript and Presenter's Notes

1
Maths: statistics, from Probability to the Normal Distribution
Dr William Megill, 4E3.43, enswmm_at_bath.ac.uk
XX10118
2
Quick Review
  • Definition
  • Statistics: analysis and interpretation of data
    with a view toward objective evaluation of the
    reliability of the conclusions based on the
    data.
  • Why?
  • Variability in the real world (Stochasticity)
  • Issues with experimental data
  • Issues with design for robustness

3
Stats in experimental design
  • Knowing some stats ahead of time is a good thing
  • too often students/researchers attempt analysis
    of research data only to find
  • they have too few data to see differences
  • they have too much of the wrong data

4
Types of statistics
  • Descriptive stats
  • Organising data to talk about them
  • Inferential stats
  • Inferring information about the whole (popn)
    from characteristics of its parts (sample)

5
Types of data
  • Ratio data
  • Constant size interval between adjacent units
  • Need a zero (to enable multiplication)
  • e.g. a 15cm bar is half the length of a 30cm one
  • Interval data
  • Constant size interval
  • No zero (or zero is arbitrary)
  • e.g. temperature
  • 25-20 °C (77-68 °F) = 10-5 °C (50-41 °F)
  • but 40 °C (104 °F) ≠ 2 × 20 °C (68 °F)
  • NB. Kelvin temperature is Ratio scale
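
A quick numerical check of the interval-versus-ratio point, as a minimal sketch (the helper name `celsius_to_kelvin` is ours, not from the slides):

```python
# Ratios only mean something on a scale with a true zero.
def celsius_to_kelvin(c):
    return c + 273.15

# On the Celsius (interval) scale the ratio looks like 2, but it is an artefact
# of where the zero happens to sit.
ratio_celsius = 40 / 20

# On the Kelvin (ratio) scale the same two temperatures differ by only about 7%.
ratio_kelvin = celsius_to_kelvin(40) / celsius_to_kelvin(20)

# Differences, by contrast, are valid on an interval scale.
same_interval = (25 - 20) == (10 - 5)

print(ratio_celsius, round(ratio_kelvin, 3), same_interval)
```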

6
Types of data
  • Circular data
  • Variation on interval scale
  • e.g. compass degrees
  • Ordinal data
  • Ordered, but not constant interval
  • e.g. Grades, comfort
  • Nominal data
  • No order
  • e.g. car names, blood types

7
Accuracy & Precision
  • Accuracy
  • nearness of a measurement to the actual value
  • Precision
  • closeness of repeated measures to each other
  • Significant figures
  • refer to accuracy
  • use all digits in calcs, report proper sig figs

8
Frequency Distributions
  • Exploring data using tables & graphs

You can get graphs to say anything. Be wary when
reading them. Be honest when drawing them.
9
Grouping data
  • how many groups?
  • rule of thumb: about 10 to span the range
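
One way the rule of thumb could be applied, as a rough sketch: split the range into ten equal-width groups and count the measurements in each. The function name `group_counts` and the data are hypothetical, and the sketch assumes the data are not all identical.

```python
def group_counts(data, n_groups=10):
    """Count how many values fall into each of n_groups equal-width classes."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_groups          # assumes hi > lo
    counts = [0] * n_groups
    for x in data:
        # The largest value would land one past the end, so clamp it
        # into the last group.
        i = min(int((x - lo) / width), n_groups - 1)
        counts[i] += 1
    return counts

data = list(range(100))                   # made-up measurements 0..99
counts = group_counts(data)
print(counts)
```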

10
Frequency Polygons
  • Use instead of bar graphs/histograms
  • Not for ordinal or nominal data: x-axis is not a
    ratio scale
  • Absolute or relative frequency
  • NB: do not read intermediate frequencies

11
Cumulative Frequency Polygons (Ogives)
  • How many were better than...

12
Populations & Samples
  • Primary objective of statistics
  • to infer characteristics of a group
  • by analysing characteristics of a small
    sampling of the group
  • Population
  • entire collection of measurements about which to
    draw conclusions
  • e.g. cars in UK, lifetime of shock absorbers,
    bearing age at failure
  • Sample
  • subset of all of the measurements in the
    population
  • from the characteristics of sample, draw
    conclusions about popn
  • NB: One often samples a population that does not
    exist
  • e.g. fuel additive in 40 cars, measure gas
    consumption. Population is all cars which
    could have had additive.
  • Such a population is called hypothetical or
    potential

13
Random Sampling
  • Every member of the population has an equal and
    independent chance of being selected
  • each measurement in the population has an equal
    chance of being selected in the sample
  • the selection of any member of the sample has no
    influence on the selection of any other member
  • Usually this is obvious. Sometimes it isn't...

14
Experimental Design
  • Pseudoreplication
  • Imagine a study of tyre wear.
  • Experimenter has access to 8 cars, with 4 tyres
    each.
  • How many independent measurements?

Depends on the experiment: either 32 or 8
15
Probability
  • Definitions
  • Experiment
  • An activity with an observable result, or set of
    results
  • e.g. tossing a coin
  • Outcome
  • An observable result of an experiment
  • e.g. heads
  • Event
  • An outcome or set of outcomes of interest
  • e.g. {H}
  • Sample Space
  • Set of all possible outcomes of an experiment
  • e.g. {H, T}

16
Counting outcomes
  • IF
  • Sample space of one experiment has k1 elements
  • Sample space of another has k2
  • Then
  • Combined sample space has k1 × k2 elements

duh...
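
Obvious or not, the counting rule is easy to confirm by enumeration. A small sketch, using a coin and a die as the two experiments (our choice of example):

```python
from itertools import product

coin = ["H", "T"]              # k1 = 2 outcomes
die = [1, 2, 3, 4, 5, 6]       # k2 = 6 outcomes

# Every (coin, die) pair is one element of the combined sample space.
combined = list(product(coin, die))

print(len(combined), len(coin) * len(die))
```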
17
Permutations & Combinations
  • Permutation
  • arrangement of objects in a specific sequence
  • e.g. Horse (H), Cow (C), Sheep (S)
  • six different arrangements
  • HCS, HSC, CHS, CSH, SHC, SCH
  • note once first is placed, choices diminish for
    others
  • Notation: nPn = n(n-1)(n-2)...(3)(2)(1) = n!
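
The six arrangements can be generated rather than written out by hand. A minimal sketch with the standard library:

```python
from itertools import permutations
from math import factorial

animals = "HCS"   # Horse, Cow, Sheep

# All orderings of the three animals, joined back into strings.
arrangements = ["".join(p) for p in permutations(animals)]

print(arrangements)
print(len(arrangements), factorial(len(animals)))   # both count 3! orderings
```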

18
Permutations
  • Fewer than n positions...
  • Horse, Cow, Sheep, Pig
  • total if 4 positions: 4P4 = 4! = 24
  • but if only two positions...
  • HC,HS,HP,CH,CS,CP,SH,SC,SP,PH,PC,PS (12)
  • If some objects indistinguishable...
  • HHCC,CCHH,HCHC,CHCH,HCCH,CHHC
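
Both counts on this slide can be checked by enumeration. The sketch below uses the standard results nPr = n!/(n-r)! for r positions and n!/(n1! n2!) for indistinguishable objects; the variable names are ours.

```python
from itertools import permutations
from math import factorial, perm

# Four animals, only two positions: 4P2 = 4!/(4-2)!
two_positions = ["".join(p) for p in permutations("HCSP", 2)]
n_two = perm(4, 2)

# Two indistinguishable horses and two indistinguishable cows:
# duplicates collapse, leaving 4!/(2! 2!) distinct arrangements.
distinct = sorted(set("".join(p) for p in permutations("HHCC")))
n_distinct = factorial(4) // (factorial(2) * factorial(2))

print(two_positions, n_two)
print(distinct, n_distinct)
```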

19
Combinations
  • group of objects where sequence doesn't matter
  • HC,HS,HP,CH,CS,CP,SH,SC,SP,PH,PC,PS
  • but HC = CH, HS = SH, etc.
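
Once order is ignored, the twelve ordered pairs collapse to six. A short sketch:

```python
from itertools import combinations
from math import comb

# Unordered pairs from Horse, Cow, Sheep, Pig.
pairs = ["".join(c) for c in combinations("HCSP", 2)]

print(pairs)
print(len(pairs), comb(4, 2))   # 4!/(2! 2!) either way
```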

20
Sets
  • Set: collection of elements x
  • Subset
  • Intersection & Union
  • Complement
  • Venn Diagram

[Figure: Venn diagram of sets A, B and C]
21
Probability of an Event
  • Likelihood of an event expressed by
  • relative frequency observed from large dataset
  • knowledge of the system under study

22
Adding Probabilities
  • Mutually exclusive?
  • Add 'em up
  • P(A or B) = P(A) + P(B)
  • Not mutually exclusive?
  • P(A or B) = P(A) + P(B) - P(A and B)
  • P(A and B) = P(A)P(B) (if A and B are independent)
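
A small check of the addition rule, using one roll of a fair die as an assumed example: A is "even", B is "greater than 4". The two events overlap at 6, so the overlap has to be subtracted once.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}      # even
B = {5, 6}         # greater than 4

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

p_union_direct = prob(A | B)                       # count the union directly
p_union_rule = prob(A) + prob(B) - prob(A & B)     # addition rule

print(p_union_direct, p_union_rule)
```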

23
Distributions
  • Definition
  • the relative numbers of times each possible
    outcome will occur in a number of trials
  • Probability function/density
  • Function describing the probability that a given
    value will occur
  • Distribution function
  • Function describing the cumulative probability
    that a given value or any value smaller than it
    will occur

NB: PDF is the derivative of the DF
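
The derivative relationship can be checked numerically. The sketch below uses the standard normal from Python's `statistics` module as an assumed example and compares a finite-difference slope of the distribution function with the density at the same point.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1

def numerical_derivative(f, x, h=1e-5):
    """Central-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.5
slope_of_df = numerical_derivative(std_normal.cdf, x)
pdf_value = std_normal.pdf(x)

print(slope_of_df, pdf_value)   # should agree to several decimal places
```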
24
  • Discrete Probability Distributions
  • Binomial
  • Poisson
  • Hypergeometric
  • Continuous Probability Distributions
  • Normal

25
Binomial Distribution
  • Discrete probability distribution of obtaining
    exactly n successes out of N trials
  • e.g. machine known to produce, on average, 2%
    defective components. What is the probability that 3
    items are defective in the next 20 produced?
  • Binomial coefficient: number of ways of picking k
    unordered outcomes from n possibilities
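
A sketch of the machine example, using the standard binomial formula P(n) = C(N, n) p^n (1-p)^(N-n). It assumes the defect rate is 2% (p = 0.02), which is our reading of the slide; the function name is ours.

```python
from math import comb

def binomial_pmf(n, N, p):
    """Probability of exactly n successes in N independent trials."""
    return comb(N, n) * p**n * (1 - p)**(N - n)

p_defective = 0.02      # assumed: 2% of components are defective
p_three = binomial_pmf(3, 20, p_defective)

print(round(p_three, 4))

# Sanity check: probabilities over all possible counts sum to 1.
total = sum(binomial_pmf(k, 20, p_defective) for k in range(21))
```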

26
(No Transcript)
27
Poisson Distribution
  • Used to find probability of a single event
    occurring n times in an interval of time
  • Different from Binomial, in that we don't know q
  • e.g. death by horse kick
  • Conditions
  • Random events throughout an interval
  • Interval can be subdivided such that
  • Prob of >1 event occurring in subinterval is zero
  • Prob of 1 event occurring in subinterval is
    proportional to length of subinterval
  • Events in one subinterval independent of other
    subintervals

28
Poisson Distribution
  • Start with binomial distribution
  • express prob as function of total obs
  • rewrite
  • take limit as N gets big
  • voila
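
The limit can be watched numerically rather than derived. In this sketch the expected count nu = Np is held fixed while N grows, and the binomial probability approaches the standard Poisson form nu^n e^(-nu) / n!. The symbol nu and the values chosen are our assumptions.

```python
from math import comb, exp, factorial

def binomial_pmf(n, N, p):
    return comb(N, n) * p**n * (1 - p)**(N - n)

def poisson_pmf(n, nu):
    return nu**n * exp(-nu) / factorial(n)

nu, n = 2.0, 3   # expected count and the count of interest (illustrative)

# Gap between binomial and Poisson as the number of trials grows.
errors = {
    N: abs(binomial_pmf(n, N, nu / N) - poisson_pmf(n, nu))
    for N in (10, 100, 10_000)
}
for N, err in errors.items():
    print(N, err)
```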

29
Poisson Distribution
  • So what?

Neat thing about Poisson distribution: mean
(expectation) = variance = ν, the expected number of events in the interval
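
A numerical check that the Poisson mean and variance coincide, as a sketch. The expected count is called `nu` here and set to an arbitrary 2.5; the sums are truncated at 60 terms, beyond which the probabilities are negligible for this value.

```python
from math import exp, factorial

def poisson_pmf(n, nu):
    return nu**n * exp(-nu) / factorial(n)

nu = 2.5                 # illustrative expected count
support = range(60)      # truncated support; the tail is negligible here

mean = sum(n * poisson_pmf(n, nu) for n in support)
variance = sum((n - mean) ** 2 * poisson_pmf(n, nu) for n in support)

print(mean, variance)
```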
30
Example
31
Hypergeometric Distribution
  • If M defective parts in a total population of N,
    then the prob of selecting r defectives in a
    sample of size n is
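
The formula itself did not survive in the transcript. The standard hypergeometric result is P(r) = C(M, r) C(N-M, n-r) / C(N, n), sketched below with made-up numbers (50 parts, 5 defective, sample of 10).

```python
from math import comb

def hypergeometric_pmf(r, N, M, n):
    """P(r defectives in a sample of n) from N parts of which M are defective."""
    return comb(M, r) * comb(N - M, n - r) / comb(N, n)

N, M, n = 50, 5, 10      # hypothetical population, defectives, sample size

p_one = hypergeometric_pmf(1, N, M, n)
# math.comb returns 0 when r exceeds M, so impossible counts contribute nothing.
total = sum(hypergeometric_pmf(r, N, M, n) for r in range(n + 1))

print(round(p_one, 4), total)
```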

32
Normal Distribution
  • Generalisation of the binomial for continuous
    variables
  • Remember
  • Frequency distribution shows P(X = x)
  • Std normal distn
  • (μ = 0, σ² = 1)
  • Z statistic = standardised normal variable

33
Normal Distribution
  • Area under whole curve = 1
  • Probability of ?
  • Area under proportion of curve between Z = 0 and
    Z = 0.54
  • Use formula or lookup table
  • p(0 < Z < 0.54) = 0.2054
  • linear interpolation
  • Careful to read table caption
  • Often given as proportion that lies beyond Z
  • Important values
  • p(0 < Z < 0.955) = 0.33
  • p(0 < Z < 1.96) = 0.4750
  • p(0 < Z < 2.575) = 0.495
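
The table values can be reproduced without a lookup table. A sketch using `statistics.NormalDist`; the helper name `area_from_zero` is ours.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1

def area_from_zero(z):
    """Area under the standard normal curve between 0 and z."""
    return std_normal.cdf(z) - 0.5

for z in (0.54, 0.955, 1.96, 2.575):
    print(z, round(area_from_zero(z), 4))
```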

34
Normal Distribution
  • Add/subtract areas to calculate other
    probabilities

35
(No Transcript)
36
Applications of Normal Distribution
  • i.e. not just nice curves...

37
Probability Intervals
  • 95% of observations within 1.96σ of μ
  • p(0 < Z < 1.96) = 0.4750
  • p(-1.96 < Z < 0) = 0.4750
  • p(-1.96 < Z < 1.96) = 0.95
  • 5% of obs outside this range

38
Probability Limits
  • Z = (X - μ)/σ → X = μ + Zσ
  • p(-1.96 < Z < 1.96) = 95% → X ∈ [μ - 1.96σ, μ + 1.96σ]
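
Converting the Z limits back to the measurement scale, as a sketch. The mean of 100 and standard deviation of 15 are hypothetical values chosen only to illustrate.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0               # hypothetical population parameters

# Z value leaving 2.5% in each tail (about 1.96).
z = NormalDist().inv_cdf(0.975)

lower, upper = mu - z * sigma, mu + z * sigma

# Check: the fraction of the population between the limits.
X = NormalDist(mu, sigma)
coverage = X.cdf(upper) - X.cdf(lower)

print(round(z, 3), round(lower, 1), round(upper, 1), round(coverage, 3))
```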

39
One-tailed probability
  • p(-∞ < Z < 1.65) = 0.95
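
The one-tailed cut-off of 1.65 is a rounded table value. A two-line sketch shows the more exact figure:

```python
from statistics import NormalDist

z_95 = NormalDist().inv_cdf(0.95)      # Z with 95% of the area to its left
p_at_165 = NormalDist().cdf(1.65)      # area to the left of the rounded value

print(round(z_95, 3), round(p_at_165, 4))
```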

40
Sums & Diffs of Normal Vars
  • See workbook
  • Basic point: the distribution of a sum or difference
    of independent normally distributed variables will itself
    also be normally distributed
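
Python's `statistics.NormalDist` supports adding and subtracting two normal variables on the assumption that they are independent, which makes the point easy to illustrate. The parameters below are made up.

```python
from statistics import NormalDist

X = NormalDist(mu=10, sigma=3)   # hypothetical variable
Y = NormalDist(mu=4, sigma=4)    # hypothetical, independent of X

total = X + Y        # means add; variances add
difference = X - Y   # means subtract; variances still add

print(total.mean, total.stdev)            # sqrt(3^2 + 4^2) = 5
print(difference.mean, difference.stdev)
```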

41
Hypothesis testing
42
Verifying claims
  • Advertising line
  • Luxcar, makers of the best luxury cars
  • Burnol, the finest fuel you can buy
  • ConstructAll, designers of beautiful buildings
  • the average life of these tyres is 20,000 miles
  • on avg, low energy bulbs will last 8000 hrs
  • average bottle contents: 330 ml

43
Hypotheses
  • Null & alternative hypotheses
  • Null: nothing interesting is happening
  • Alternate: something interesting is happening
  • e.g.
  • Explosives engineers deciding how fast to run
  • one says from experience, mean burn rate =
    600 mm/sec
  • other says >600 mm/sec
  • Stats test
  • H0: μ = 600
  • H1: μ > 600

44
Types of Error
  • Can't be 100% sure, so possible that
  • (a) correct hypothesis rejected
  • (b) false hypothesis accepted
  • (a) Type I error
  • (b) Type II error

45
Test of proportion
  • Hypotheses
  • H0: 99% of the control modules match the spec
  • H1: <99% of the control modules match the spec
  • Distribution?
  • either they match or they don't → binomial
  • calculate μ and σ
  • Normal approximation
  • because
  • we can use

46
Graphically...
  • Assess how close 985 is to expected 990
  • Convert to standard normal curve
  • H1 is directional → one tail
  • since H1: p < p0, use left tail
  • Define 95% confidence
  • Probability is
  • Discrete binomial, not continuous normal, so
    we're approximating
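
The numbers in this example can be worked through as a sketch. The sample size N = 1000 is an assumption inferred from "expected 990" at 99%; the binomial mean Np and standard deviation sqrt(Np(1-p)) are the standard results, and no continuity correction is applied.

```python
from math import sqrt
from statistics import NormalDist

N = 1000            # assumed sample size (990 expected at 99%)
p0 = 0.99           # proportion claimed under H0
observed = 985      # modules found to match the spec

mu = N * p0                          # binomial mean
sigma = sqrt(N * p0 * (1 - p0))      # binomial standard deviation

z = (observed - mu) / sigma          # convert to standard normal
p_value = NormalDist().cdf(z)        # left-tail probability
z_crit = NormalDist().inv_cdf(0.05)  # left-tail cut-off at 95% confidence

reject_h0 = z < z_crit
print(round(z, 3), round(p_value, 3), round(z_crit, 3), reject_h0)
```

Under these assumptions Z falls just short of the cut-off, so H0 would not be rejected at the 95% level.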

47
Choosing a tail
48
Test of a population mean
  • e.g. to verify a manufacturer's claim re boiling
    point of a coolant
  • if we know the population variance (i.e. from
    spec sheet, or large number of experiments)
  • null hypothesis: H0: x̄ = μ
  • distribution: assume normal; need mean & stdev
  • since we're sampling, we need std error of mean
  • assume large np gt
  • calculate Z statistic
  • alternate hypothesis
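
A sketch of the steps above with invented numbers: a claimed boiling point of 110 °C, a known population standard deviation of 2 °C, and a sample of 36 measurements averaging 109.2 °C. None of these values come from the slides. The standard error of the mean is σ/√n and the statistic is Z = (x̄ - μ)/(σ/√n).

```python
from math import sqrt
from statistics import NormalDist

claimed_mean = 110.0    # hypothetical manufacturer's claim, deg C
sigma = 2.0             # hypothetical known population standard deviation
n = 36                  # hypothetical sample size
sample_mean = 109.2     # hypothetical sample mean

std_error = sigma / sqrt(n)                    # standard error of the mean
z = (sample_mean - claimed_mean) / std_error   # Z statistic

# Two-tailed alternative (mean differs from the claim in either direction).
p_value = 2 * NormalDist().cdf(-abs(z))

print(round(z, 2), round(p_value, 4))
```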

49
Testing a manufacturer's claim
  • back to coolant example

50
Experimental differences between paired
treatments can be tested by comparing sample of
differences to a population with mean of zero....
51
Two-sample testing
  • Two samples...
  • do they represent a real (significant) treatment
    effect, or
  • just two samples from the same population?

52
Two-sample testing
  • same as before
  • H0: μ1 = μ2; H1: μ1 ≠ μ2
  • Combined variance
  • Z statistic
  • compare to standard normal distribution
  • two-tailed in this case, Z0.05(2) = 1.96
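
A sketch of a two-sample Z test with known variances. The slide's formulas are missing, so this assumes the usual form in which the variance of the difference in means is σ1²/n1 + σ2²/n2; all numbers are invented.

```python
from math import sqrt

# Hypothetical summary statistics for the two samples.
mean1, var1, n1 = 52.1, 16.0, 40
mean2, var2, n2 = 50.3, 25.0, 50

# Variance of the difference between the two sample means (assumed form).
combined_variance = var1 / n1 + var2 / n2

z = (mean1 - mean2) / sqrt(combined_variance)

z_crit = 1.96                      # two-tailed, 5% level
significant = abs(z) > z_crit

print(round(z, 3), significant)
```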

53
Two-sample testing
  • Unknown, but equal variance (usual case)
  • Use pooled sample variance
  • t statistic
  • n1 + n2 - 2 degrees of freedom
  • Unknown, unequal variance
  • same statistic
  • but degrees of freedom, ν
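
The formulas on this slide are missing from the transcript. The sketch below uses the textbook pooled-variance t statistic and, for the unequal-variance case, the Welch-Satterthwaite approximation for the degrees of freedom; whether the original slide used exactly these forms is an assumption. The data are invented, and the resulting t value would be compared with a t table at the stated degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical measurements from two treatments.
a = [20.1, 19.8, 20.5, 20.3, 19.9, 20.4]
b = [19.5, 19.9, 19.2, 19.7, 19.6]

n1, n2 = len(a), len(b)
v1, v2 = variance(a), variance(b)      # sample variances (n-1 denominator)

# Equal but unknown variances: pool them.
pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_pooled = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
df_pooled = n1 + n2 - 2

# Unequal variances: Welch-Satterthwaite degrees of freedom.
se1, se2 = v1 / n1, v2 / n2
df_welch = (se1 + se2) ** 2 / (se1**2 / (n1 - 1) + se2**2 / (n2 - 1))

print(round(t_pooled, 3), df_pooled, round(df_welch, 2))
```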

54
(No Transcript)
55
(No Transcript)