PROBABILISTIC AND - PowerPoint PPT Presentation

Provided by: Lowe
1
PROBABILISTIC AND STATISTICAL FOUNDATIONS
  • Engineering systems are dynamic systems, i.e. their states change
    with time, stochastically (randomly).
  • Models of the system inputs are random variables.
  • Random variables are the basis of engineering systems.
2
  • Random changes are not often truly random; often there is some
    hint of a pattern or trend.
  • The random nature of engineering systems, as defined by a set of
    random variables, leads to random changes in the state(s) of the
    system.
  • The random nature of engineering systems is modelled using
    probability and statistics.
  • Interpretation of these probabilities and statistics leads to
    optimization of the system for current and future use.
3
STATISTICS (descriptive)
  • Used to describe system characteristics based on data collected
    on the system.
  • Used in the analysis of system data to provide an intuitive
    understanding of that data.
  • Used to display system information to external viewers.
  • Used to develop system performance measures.
4
STATISTICS (inferential) and PROBABILITY
  • Used to derive statements about what a system might do if certain
    scenarios were input (more later).
  • Statistics: for current and previous system behaviour.
  • Probability: for predictive (future) system behaviour.
5
  • What is Applied Statistics?
  • Applications from various fields.
  • What is statistics?
  • What is probability?
  • Relationship between probability and Statistics.

6
What is Applied Statistics?
  • Collection of (statistical) techniques used in
    practice.
  • Range from very simple ones such as graphical
    display, summary statistics, and time-series
    plots, to sophisticated ones such as design of
    experiments, regression analysis, principal
    component analysis, and statistical process
    control.
  • Successful application of statistical methods
    depends on the close interplay between theory and
    practice.
  • There should be interplay (communication and
    understanding) between engineers and
    statisticians.

7
  • Engineers should have adequate statistics
    background to (a) know what questions to ask (b)
    mix engineering concepts with statistics to
    optimize productivity (c) get help and
    understand the implementation.
  • The object of statistical methods is to make the
    scientific process as efficient as possible.
    Thus, the process will involve several
    iterations, each of which will consist of an
    hypothesis, data collection, and inference.
    The iterations stop when satisfactory results are
    obtained.

8
WHY DO WE NEED STATISTICS?
  • Quality is something we all look for in any
    product or service we get.
  • What is Quality?
  • It is not static and changes with time.
  • A continuous quality improvement program is a MUST
    to stay competitive these days.
  • The final quality and cost of a product depend largely
    on the (engineering) design and the manufacture of
    the product.
  • Variability is present in machines, materials,
    methods, people, environment, and measurements.
  • Manufacturing a product or providing a service
    involves at least one of the above 6 items (and
    possibly other items in addition to these).

9
  • Need to understand the variability.
  • Statistically designed experiments are used to
    find the optimum settings that improve the
    quality.
  • In every activity, we see people use (or abuse?)
    statistics to express satisfaction (or
    dissatisfaction) with a product.
  • There is no such thing as good statistics or
    bad statistics.
  • It is the people who report the statistics who
    manipulate the numbers to their advantage.
  • Statistics, properly used, will be more productive.

10
EXPLORE, ESTIMATE and CONFIRM
  • Statistical experiments are carried out to:
  • EXPLORE: gather data to study more about the
    process or the product.
  • ESTIMATE: use the data to estimate various
    effects.
  • CONFIRM: gather additional data to verify the
    hypotheses.

11
BASIC DESCRIPTIVE STATISTICS
Consider a random variable measured over a time interval.
  • The data set will have measures of central tendency and measures
    of deviation around them, i.e. target values and scatter
    (variation), i.e. signal and noise.
  • Signal/noise ratios are very important in optimization.
12
CASE STUDY 2
The data below represent 30 observations taken
over a 24-hour period that record the air
pressure (in atmospheres) provided by an
industrial compressor that is used to power a
pneumatic stamping facility
x = each value, n = 30
Armed with this data, provide a complete basic
statistical analysis of it and comment on
the results.
Measures of central tendency
Mean (average, expected value, etc.):
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
or
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Are you familiar with the difference between the
sample mean and the population mean?
13
The Mean: the expected value of a random variable, which is also
called the population mean.

The Median: a number dividing the higher half of a sample, a
population, or a probability distribution from the lower half. The
median of a finite list of numbers can be found by arranging all the
observations from lowest value to highest value and picking the
middle one. If there is an even number of observations, the median
is not unique, so one often takes the mean of the two middle values.

The Mode: in statistics, the mode is the most frequent value assumed
by a random variable, or occurring in a sampling of a random
variable, e.g. the highest peak on a data histogram. Like the
statistical mean and the median, the mode is a way of capturing
important information about a random variable or a population in a
single quantity.

Mode of a sample: the mode of a data sample is the element that
occurs most often in the collection. For example, the mode of the
sample [1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17] is 6. Given the list of
data [1, 1, 2, 4, 4], the mode is not unique (both 1 and 4 occur
twice), unlike the arithmetic mean.
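
As a minimal sketch of the three measures of central tendency described on this slide, the snippet below runs Python's standard-library `statistics` module on the two example lists quoted in the text. The variable names are illustrative only and are not part of the original presentation.

```python
import statistics

# Example sample from the slide: the value 6 occurs most often.
sample = [1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17]

mean_value = statistics.mean(sample)      # arithmetic mean
median_value = statistics.median(sample)  # middle value of the sorted data
mode_value = statistics.mode(sample)      # most frequent value

# Second list from the slide: 1 and 4 both occur twice,
# so the mode is not unique, while the mean is still a single number.
tied = [1, 1, 2, 4, 4]
tied_modes = statistics.multimode(tied)   # every most-frequent value
tied_mean = statistics.mean(tied)

print(mean_value, median_value, mode_value)
print(tied_modes, tied_mean)
```

`statistics.multimode` (Python 3.8 or later) is used for the tied case because it returns every most-frequent value rather than a single one.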
14
Measures of dispersion
  • Give an indication of how much scatter there is in the data set.
  • Scatter = variability = bad! The more variable the system, the
    harder it is to model and optimize.
  • Any dynamic system variable measured over time will have a
    central tendency and a variability to analyze.
15
The Range: the difference between the highest (Xl) and
lowest (Xs) data values, i.e. R = Xl - Xs
The Variance: a measure of the fluctuation of
the observations around the mean
16
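For a population:
$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$
or, for a sample:
$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
Why N versus n-1? The population variance is a parameter. The sample
variance, by contrast, is an estimator: it will change from sample
to sample, but it should average out to the parameter (the property
of unbiasedness). Dividing by n-1 rather than n is what gives the
sample variance this property.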
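
The unbiasedness argument above can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original slides: it takes a small hypothetical population, enumerates every ordered sample of size 2 drawn with replacement, and averages the two candidate estimators (divisor n-1 versus divisor n) over all of those samples.

```python
import itertools
import statistics

# Hypothetical small population, chosen only for illustration.
population = [1, 2, 3, 4]
pop_var = statistics.pvariance(population)  # population variance, divisor N

# Every ordered sample of size 2 drawn with replacement (16 samples).
samples = list(itertools.product(population, repeat=2))


def biased_variance(values):
    """Variance computed with divisor n instead of n - 1."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


# Average each estimator over all possible samples.
avg_unbiased = sum(statistics.variance(s) for s in samples) / len(samples)
avg_biased = sum(biased_variance(s) for s in samples) / len(samples)

print(pop_var, avg_unbiased, avg_biased)
```

With this population the n-1 estimator averages out to the population variance, while the divisor-n version comes out too small.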
17
POPULATION
  • Population is a collection of all units defined
    by some characteristic, which is the subject
    under study.
  • In the study of the MPG (miles per gallon ) of a
    new model car, the population consists of the
    MPG's of all cars of that model.
  • To study the income level of a particular city
    the population consists of the incomes of all
    working people in that city.

18
The sample: in statistics, a sample is a subset
of a population. Typically, the population is
very large, making a census or a complete
enumeration of all the values in the population
impractical or impossible. The sample represents
a subset of manageable size.
  • Much care should be devoted to the sampling.
  • There is always going to be some error involved
    in making inferences about the populations based
    on the samples.
  • The goal is to minimize this error as much as
    possible.
  • There are many ways of bringing in systematic
    bias (consistently misrepresenting the population).

19
  • This can be avoided by taking random samples.
  • Simple random sample: all units are equally
    likely to be selected.
  • Multi-stage sample: units are selected in several
    stages.
  • Cluster sample: used when there is no list of
    all the elements in the population and the
    elements are clustered in larger units.
  • Stratified sample: in cases where the population
    under study may be viewed as comprising different
    groups (strata) and where elements in each group
    are more or less homogeneous, we randomly select
    elements from every one of the strata.
  • Convenience sample: samples are taken based on
    the convenience of the experimenter.
  • Systematic sample: units are taken in a
    systematic way, such as selecting every 10th item
    after selecting the first item at random.
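
Three of the sampling schemes listed above can be sketched with Python's standard `random` module. This is an illustration under stated assumptions only: the population of 100 numbered units, the sample sizes, and the two strata are all hypothetical and do not come from the presentation.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100 numbered units.
population = list(range(1, 101))

# Simple random sample: every unit is equally likely to be selected.
simple = random.sample(population, 10)

# Systematic sample: pick the first unit at random among the first 10,
# then take every 10th unit after it.
step = 10
start = random.randrange(step)
systematic = population[start::step]

# Stratified sample: draw randomly from every stratum.
# The two strata (units 1-50 and 51-100) are made up for illustration.
strata = {"low": population[:50], "high": population[50:]}
stratified = {name: random.sample(units, 5) for name, units in strata.items()}

print(simple)
print(systematic)
print(stratified)
```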

20
DESCRIPTIVE STATISTICS
  • Deals with characterization and summary of key
    observations from the data.
  • Quantitative measures: mean, median, mode,
    standard deviation, percentiles, etc.
  • Graphs: histogram, box plot, scatter plot, Pareto
    diagram, stem-and-leaf plot, etc.
  • Here one has to be careful in interpreting the
    numbers. Usually more than one descriptive
    measure will be used to assess the problem on
    hand.

21
  • Standard deviation
  • Square root of the variance:
    $\sigma = \sqrt{\sigma^2}$ or $s = \sqrt{s^2}$
  • Measures scatter in the same units as the observations
  • Very popular in process control and tolerancing
  • e.g. diameter = 63.4 ± 0.1 mm

22
Other useful measures: skewness and kurtosis

Skewness: a measure of the asymmetry of the data set about the mean.
Consider the distribution in the figure. The bars on the right side
of the distribution taper differently than the bars on the left
side. These tapering sides are called tails, and they provide a
visual means for determining which of the two kinds of skewness a
distribution has:
  • Positive skew: the right tail is the longer one; the mass of the
    distribution is concentrated on the left of the figure. The
    distribution is said to be right-skewed.
  • Negative skew: the left tail is the longer one; the mass of the
    distribution is concentrated on the right of the figure. The
    distribution is said to be left-skewed.
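
As a rough sketch of how the sign of the skew can be computed, the snippet below uses one common definition, the third central moment divided by the cube of the population standard deviation. Both the choice of definition and the data sets are assumptions made for illustration; the slides do not say which formula was used.

```python
def skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical data sets, invented for illustration.
right_skewed = [1, 1, 1, 2, 2, 3, 10]     # long right tail
left_skewed = [-x for x in right_skewed]  # mirror image: long left tail
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_skewed))  # positive
print(skewness(left_skewed))   # negative
print(skewness(symmetric))     # zero
```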
23
Kurtosis: a measure of the "peakedness" of the data set
(Figure: pdf for the Pearson type VII distribution with excess
kurtosis of infinity (red), 2 (blue) and 0 (black))
(Figure: log-pdf for the Pearson type VII distribution with excess
kurtosis of infinity (red), 2 (blue), 1, 1/2, 1/4, 1/8 and 1/16
(gray), and 0 (black))
24
(Figure panels: left skew, right skew, no skew; mesokurtic,
leptokurtic, platykurtic)
  • A perfect mesokurtic curve is also called a
    normal curve, which by definition is not skewed
    in either direction.
  • A leptokurtic distribution is symmetrical in
    shape, similar to a normal distribution, but the
    centre peak is much higher; that is, there is a
    higher frequency of values near the mean. If you
    move scores from the shoulders of a distribution into
    the centre and tails of the distribution, the
    result is a peaked distribution with thick tails.
  • A platykurtic distribution is one in which most
    of the values share about the same frequency of
    occurrence. As a result, the curve is very flat,
    or plateau-like. Uniform distributions are
    platykurtic.
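
The three shape categories above can be told apart numerically. The sketch below assumes the "excess kurtosis" convention (fourth standardized moment minus 3), under which a normal curve scores 0, a leptokurtic shape scores above 0 and a platykurtic shape scores below 0. The convention and both data sets are assumptions for illustration, not taken from the slides.

```python
def excess_kurtosis(data):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0


# Hypothetical data sets, invented for illustration.
flat = list(range(1, 11))                  # every value equally frequent
peaked = [0, 5, 5, 5, 5, 5, 5, 5, 5, 10]   # tall centre plus a few far-out values

print(excess_kurtosis(flat))    # negative: platykurtic
print(excess_kurtosis(peaked))  # positive: leptokurtic
```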

25
Statistical Association
  • Relationship between variables: does variable x influence
    variable y?
  • Measured in terms of the correlation coefficient, r.
  • For 2 data sets, X and Y, r is given as
    $r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2 \sum_{i}(y_i - \bar{y})^2}}$
  • Presented as bivariate x-y plots.
  • Indicates the strength and direction of a linear relationship
    between two random variables.
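
A minimal sketch of the correlation coefficient follows, assuming the standard Pearson product-moment definition (the slide's own formula image is not in the transcript). The function name and the data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations, invented for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(xs, ys)
print(r)  # close to +1: strong positive linear relationship
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate little or no linear relationship.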
26
CASE STUDY 3
The data below is taken from 40 observations on
the depth of cut and tool wear in a milling
operation. Is it true to say that the amount of
tool wear is generally independent of the cutting
depth?
27
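Consider the data presented: r is calculated as 0.9397.
r ranges from -1 to 1, with zero showing no linear correlation.
The answer indicates a strong positive correlation, i.e. increasing
the cut depth tends to increase tool wear, so tool wear is not
independent of cutting depth. Note that r measures linear
association and is not a probability: r^2 is about 0.88, so roughly
88% of the variation in tool wear is accounted for by a linear
relationship with cut depth.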
28
INTRODUCTORY PROBABILITY
Illustrative Example
  • The following data corresponds to an experiment
    in which the effect of engine RPM (revolutions
    per minute) on the horsepower is under study.
  • TABLE 1 Data for HP Example

29
INTRODUCTORY PROBABILITY
  • Looking at the data in Table 1, why is it that the
    hp values, say at 4500 RPM, are not exactly the
    same if the experiment is repeated under the
    same conditions?
  • The fluctuation that occurs from one repetition
    to another is called experimental variation,
    which is usually referred to as noise,
    statistical error, or simply error. Recall
    this term from the earlier discussion on data
    collection.
  • This represents the variation that is
    inherently present in any (practical) system.
  • The noise is a random variable and is studied
    through probability.

30
What is Probability?
  • A manufacturer of blender motors wants to
    determine the warranty period for this product.
  • If motor life were constant, (say 8 years) the
    manufacturer would have no problem. The motor
    could be warranted for 8 years.
  • But, in reality, the motor life is not a constant.
  • Some motors will fail quickly and others will
    last for several years.
  • There is an element of randomness in the life of
    the motors.
  • The manufacturer cannot precisely predict how
    long any motor will last.
  • Probability theory gives the manufacturer the
    means to quantify what is known about motor
    lifetimes and helps to quantify the risks
    involved in setting a warranty period.

31
  • Similar problems arise in the context of other
    products.
  • Flexible manufacturing systems (FMS) play an important
    role in modern manufacturing. Improved quality, lower
    inventory, shorter lead times, higher productivity and
    greater safety are some of the benefits derived
    from FMS.
  • All of these have random elements.
  • Probability theory deals with randomness,
    allowing the study of quantities whose behavior
    cannot be predicted completely in advance.
  • The above examples deal with manufacturing.

32
  • We could just as easily find examples in
    business, electrical and computer engineering,
    biomedical science and engineering, sociology,
    economics, marketing, civil engineering, the
    behavioral sciences and so on. The underlying
    problem, randomness, is the same.
  • One should understand the ideas of probability
    and statistics from both theoretical and
    practical points of view.
  • To properly apply probability and statistics in
    the real world, we must appreciate both sides of
    the picture.
  • We cannot properly apply a procedure if we don't,
    at least in general terms, understand the
    reasoning (theory) behind it.

33
  • On the other hand, trying to apply theory without
    knowledge of the area of application is foolish.
    We have to have a proper perspective on both
    before meaningful progress can be made.
  • Probability theory develops mathematical models
    for random experiments.
  • A random experiment is a sequence of actions
    whose outcome cannot be predicted with certainty.
  • Examples of outcomes of random experiments: the length
    of a phone call, the gender mix of three people chosen
    from a group of 25 people, the phenotype of
    the offspring of a cross-breeding experiment, and the
    number of defects on a painted panel.

34
  • EXPERIMENT
  • Calculation of MPG of a new model car.
  • Measurements of current in a thin copper wire.
  • Measurements of film build thickness in a
    painting process.
  • Duration of phone calls.
  • Time to assemble a job.
  • Tossing a coin.
  • Any data set is subject to variability: Nature's way.
  • This creates a degree of randomness that generates
    uncertainty; we need probability theory to handle this.

35
  • EVENTS
  • n samples are collected for a quantity (variable) X.
  • The n values will differ from each other
    for numerous reasons.

All possible values of X = Sample Space S
A subset of S = Event A, i.e. $A \subseteq S$
36
Venn Diagrams: illustrations used in the
branch of mathematics known as set theory. They
show all of the possible mathematical or logical
relationships between sets, and are a good graphical way
of representing probability (diagram not in transcript).
37
Events are often combined (union), e.g. for 2 events A and B:
$A \cup B$, or for n events: $A_1 \cup A_2 \cup \cdots \cup A_n$
e.g. if A = {x1, x3} and B = {x4, x8},
then C = $A \cup B$ = {x1, x3, x4, x8}
38
(No Transcript)
39
Intersection of Events: here D = $A \cap B$,
or for n events D = $A_1 \cap A_2 \cap \cdots \cap A_n$,
where D contains all sample points common to
the events in question, e.g. if A = {x1, x2, x3}
and B = {x3, x4, x6}, then D = {x3}
40
Mutually exclusive events: 2 events, A and B, are
mutually exclusive if $A \cap B = \emptyset$ (an impossible
event), i.e. A and B cannot both occur.
41
CASE STUDY 4
A construction company is currently bidding for 2
jobs. Considering a sample space (S) containing
the outcome of winning or losing these jobs
42
  • S = {WW, WL, LW, LL}
  • A = {WL, LW}, B = {LL}, C = {WW, WL, LW}
  • $A \cap B = \emptyset$, i.e. mutually exclusive
  • $A \cup C$ = {WL, LW, WW}
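
The events in this case study map directly onto Python sets. The sketch below is illustrative only: the comments describing what A, B and C mean (exactly one job won, no job won, at least one job won) are an interpretation of the listed outcomes, not wording from the slides.

```python
# Sample space for two bids: W = win, L = lose (first letter = first job).
S = {"WW", "WL", "LW", "LL"}
A = {"WL", "LW"}        # interpreted here as: exactly one job won
B = {"LL"}              # interpreted here as: no job won
C = {"WW", "WL", "LW"}  # interpreted here as: at least one job won

a_and_b = A & B   # intersection of A and B
a_or_c = A | C    # union of A and C
not_b = S - B     # complement of B within S

# A and B are mutually exclusive when they share no outcome.
mutually_exclusive = a_and_b == set()

print(a_and_b, a_or_c, not_b, mutually_exclusive)
```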

43
What is the likelihood of an event occurring?
i.e. the probability of an event
Probability of event A = P(A)
Note: $P(\emptyset) = 0$ and P(S) = 1
P(A) is a numerical value between 0 and 1
44
CASE STUDY 5
45
P(V1) Buses  = 5/192   = 0.026
P(V2) 2-axle = 15/192  = 0.078
P(V3) 3-axle = 25/192  = 0.130
P(V4) 4-axle = 30/192  = 0.156
P(V5) 5-axle = 105/192 = 0.547
P(V6) 6-axle = 6/192   = 0.031
P(V7) 7-axle = 6/192   = 0.031

What is the probability of seeing a truck?
1 - 0.026 = 0.974, i.e. a 97.4% chance

What is the probability of seeing a truck with 5 or more axles?
0.547 + 0.031 + 0.031 = 0.609, i.e. 60.9%
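
The arithmetic on this slide can be reproduced exactly with fractions. This sketch uses the vehicle counts quoted above (out of 192 vehicles); following the slide's own calculation, it assumes that every vehicle other than a bus is a truck. The dictionary keys are illustrative labels.

```python
from fractions import Fraction

# Vehicle counts taken from the slide (total of 192 observations).
counts = {
    "bus": 5,
    "2-axle": 15,
    "3-axle": 25,
    "4-axle": 30,
    "5-axle": 105,
    "6-axle": 6,
    "7-axle": 6,
}
total = sum(counts.values())
prob = {name: Fraction(n, total) for name, n in counts.items()}

# Complement rule: P(truck) = 1 - P(bus).
p_truck = 1 - prob["bus"]

# Mutually exclusive classes, so their probabilities simply add.
p_five_plus = prob["5-axle"] + prob["6-axle"] + prob["7-axle"]

print(total, float(p_truck), float(p_five_plus))
```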
46
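Basic laws of probability
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
For three events
$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$
etc.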
47
Conditional probability: if event A depends on
event B, i.e. A|B, it is described as
conditional. The conditional probability P(A|B)
assumes event B occurs.
48
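Often described as
$P(A|B) = \frac{P(A \cap B)}{P(B)}$ and $P(B|A) = \frac{P(A \cap B)}{P(A)}$
i.e. $P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$ - Bayes' Theorem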
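
A minimal sketch of Bayes' theorem in code follows. It assumes P(B) is obtained from the law of total probability, P(B) = P(B|A)P(A) + P(B|not A)P(not A), which is an extra step not shown on the slide. The function name and all numerical inputs are hypothetical.

```python
from fractions import Fraction


def bayes(p_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) = P(B|A) P(A) / P(B), with P(B) from total probability."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b


# Hypothetical inputs, invented for illustration.
p_a = Fraction(3, 10)             # P(A)
p_b_given_a = Fraction(1, 2)      # P(B|A)
p_b_given_not_a = Fraction(1, 5)  # P(B|not A)

posterior = bayes(p_a, p_b_given_a, p_b_given_not_a)
print(posterior, float(posterior))
```

Exact fractions are used so that the result can be checked by hand without rounding.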
49
CASE STUDY 6
50
  • P(A) = 0.05, P(B) = 0.005
  • and P(B|A) = 0.17
  • a) Both A and B need to occur, i.e. $P(A \cap B)$
  • so $P(A \cap B) = P(B|A)\,P(A)$
  • b) Here A or B is relevant, i.e. $P(A \cup B)$

51
  • Here P(A|B) is required: Bayes' theorem

De Morgan's Rule: deals with complementary
probabilities of large systems, i.e.
$\overline{A \cup B} = \bar{A} \cap \bar{B}$ and $\overline{A \cap B} = \bar{A} \cup \bar{B}$
52
CASE STUDY 7
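Use conditional probability! (De Morgan)
P(s) = probability of no malfunction, so
P(malfunction) = 1 - P(s)
Because the events are independent, P(s) is the product of the
individual no-malfunction probabilities.
Answer = 1 - 0.9631 = 0.0369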
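
The complement-and-independence reasoning used in this case study can be sketched as below. The component failure probabilities are hypothetical (the case study's own inputs are not in the transcript), so the printed number is not the slide's answer; only the method is the same.

```python
import math


def prob_at_least_one_failure(failure_probs):
    """P(at least one malfunction) = 1 - P(no malfunction).

    Assumes the components malfunction independently, so P(no malfunction)
    is the product of the individual no-malfunction probabilities.
    """
    p_none = math.prod(1 - p for p in failure_probs)
    return 1 - p_none


# Hypothetical per-component malfunction probabilities.
failure_probs = [0.01, 0.02, 0.005]

p_any = prob_at_least_one_failure(failure_probs)
print(p_any)
```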