Believe it Or Not - PowerPoint PPT Presentation

Provided by: stat57
1
Believe it Or Not
  • Mike Speed

2
Statistics Can Be Fun
3
Learning Outcomes
  • At the end of this talk, I hope that you will
  • Realize that statisticians are real people with
    real lives
  • Agree that Statistics is an important area
  • Have an understanding of Data Mining

4
Who Am I and Life Beyond Statistics
  • Worked for NASA (put the men on the moon),
    1964-1969
  • Teacher/Researcher
  • Businessman
  • Civic leader

5
Outside of Statistics
  • Wife, 2 children and 5 grandchildren
  • Wife makes porcelain dolls
  • Wife and I make pottery; we specialize in
    crystalline glazes
  • We fish as often as we can

6
Pictures
7
Diane's Dolls
8
Are You New To Statistics?
  • Yes
  • No

9
But
Nice you asked: I am a brain surgeon.
I am a statistician.
10
It is Nice to Be Alone
Just kidding around
11
Shakespeare and Statistics
  • There is a lot in common
  • Good Party Talk

12
Shakespeare and Statistics
  • BEATRICE
  • He set up his bills here in Messina and
    challenged Cupid at the flight; and my uncle's
    fool, reading the challenge, subscribed for
    Cupid, and challenged him at the bird-bolt. I
    pray you, how many hath he killed and eaten in
    these wars? But how many hath he killed? for
    indeed I promised to eat all of his killing.

And your literature professor would ask: What
are the essential concepts that Shakespeare is
trying to convey? What is Shakespeare saying?
13
Statistics
And we want to know: What are the essential
parts of this data? What is the data saying
to us?
14
Difference Between Shakespeare and Statistics
  • Statistics has a set of rules and reasonable
    people will come to similar conclusions about the
    data
  • In literature, many different interpretations
    are possible

15
Where did the Shakespeare Quote Come From?
  • Much Ado About Nothing!!

16
Data Mining
  • Data mining, or knowledge discovery, is the
    computer-assisted process of digging through and
    analyzing enormous sets of data and then
    extracting the meaning of the data.
  • Data mining tools predict behaviors and future
    trends, allowing businesses to make proactive,
    knowledge-driven decisions.

17
Example
  • One Midwest grocery chain used data mining to
    analyze local buying patterns. They discovered
    that when men bought diapers on Thursdays and
    Saturdays, they also tended to buy beer. Further
    analysis showed that these shoppers typically did
    their weekly grocery shopping on Saturdays. On
    Thursdays, however, they only bought a few items.
    The retailer concluded that they purchased the
    beer to have it available for the upcoming
    weekend. The grocery chain could use this newly
    discovered information in various ways to
    increase revenue. For example, they could move
    the beer display closer to the diaper display.
    And, they could make sure beer and diapers were
    sold at full price on Thursdays.

18
Another Example
  • Merck-Medco Managed Care is a mail-order business
    which sells drugs to the country's largest health
    care providers: Blue Cross and Blue Shield state
    organizations, large HMOs, U.S. corporations,
    state governments, etc. Merck-Medco is mining its
    one terabyte data warehouse to uncover hidden
    links between illnesses and known drug
    treatments, and spot trends that help pinpoint
    which drugs are the most effective for what types
    of patients. The results are more effective
    treatments that are also less costly.
    Merck-Medco's data mining project has helped
    customers save an average of 10-15% on
    prescription costs.

19
Other Uses
  • Market segmentation - Identify the common
    characteristics of customers who buy the same
    products from your company.
  • Customer churn - Predict which customers are
    likely to leave your company and go to a
    competitor.
  • Fraud detection - Identify which transactions are
    most likely to be fraudulent.
  • Direct marketing - Identify which prospects
    should be included in a mailing list to obtain
    the highest response rate.

20
More uses
  • Interactive marketing - Predict what each
    individual accessing a Web site is most likely
    interested in seeing.
  • Market basket analysis - Understand what products
    or services are commonly purchased together,
    e.g., beer and diapers.
  • Trend analysis - Reveal the difference between a
    typical customer this month and last.
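Market basket analysis is usually summarized with three numbers per item pair: support (how often the pair occurs), confidence (how often the second item appears given the first), and lift (confidence relative to the second item's overall frequency). The following is a minimal Python sketch of that idea, not the grocery chain's actual method. The function name `pair_rules` and the five baskets are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.2):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs.

    Hypothetical helper for illustration; only pairs of items are considered.
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))            # ignore repeats within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                    # share of baskets holding both
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):          # score the rule in both directions
            confidence = count / item_counts[x]
            lift = confidence / (item_counts[y] / n)
            rules.append((x, y, support, confidence, lift))
    return sorted(rules, key=lambda r: r[4], reverse=True)


# Made-up baskets, loosely echoing the beer-and-diapers example.
baskets = [
    ["diapers", "beer", "chips"],
    ["diapers", "beer"],
    ["milk", "bread"],
    ["diapers", "milk", "beer"],
    ["bread", "chips"],
]
rules = pair_rules(baskets, min_support=0.4)
for x, y, support, confidence, lift in rules:
    print(f"{x} -> {y}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
```

A lift above 1 suggests the two items appear together more often than their individual frequencies alone would predict.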

21
Another definition
  • Data mining is the use of automated data analysis
    techniques to uncover previously undetected
    relationships among data items. Data mining
    involves the statistical analysis of data stored
    in a data warehouse. Three of the major data
    mining techniques are regression, classification
    and clustering.

22
Misconception from tu
23
Census 2000 Data Set
  • The CENSUS2000 data is a postal code-level
    summary of the entire 2000 United States Census.
    It features seven variables:
  • ID: postal code of the region
  • LOCX: region longitude
  • LOCY: region latitude
  • MEANHHSZ: average household size in the region
  • MEDHHINC: median household income in the region
  • REGDENS: region population density percentile
    (1 = lowest density, 100 = highest density)
  • REGPOP: number of people in the region

24
Census Data: 33,000 observations
25
Data, Speak to Me with Thine Numbers
  • What is going on here?
  • What are the essential parts?
  • Let us do a plot.

26
Plot of Latitude and Longitude, Color by Density
27
Pattern Discovery
The Essence of Data Mining? "...the discovery of
interesting, unexpected, or valuable structures
in large data sets." (David Hand)
28
Pattern Discovery
"If you've got terabytes of data, and you're
relying on data mining to find interesting things
in there for you, you've lost before you've even
begun." (Herb Edelstein)
29
k-means Clustering Algorithm
Training Data
1. Select inputs.
2. Select k cluster centers.
3. Assign cases to closest center.
4. Update cluster centers.
5. Re-assign cases.
6. Repeat steps 4 and 5 until convergence.
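The steps above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the SAS procedure used in the talk. The function name `kmeans`, the squared Euclidean distance, and the six sample points are assumptions made for the example. The caller supplies the starting centers (step 2), so the result is reproducible.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centers, max_iter=100):
    """Hypothetical helper: returns (centers, assignment) after convergence."""
    centers = list(initial_centers)                 # step 2: k starting centers
    k = len(centers)
    assignment = None
    for _ in range(max_iter):
        # Steps 3 and 5: assign every case to its closest center.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(pt, centers[j]))
            for pt in points
        ]
        if new_assignment == assignment:            # step 6: nothing moved, stop
            break
        assignment = new_assignment
        # Step 4: move each center to the mean of the cases assigned to it.
        for j in range(k):
            members = [pt for pt, a in zip(points, assignment) if a == j]
            if members:                             # leave an empty cluster in place
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment


# Step 1: select inputs (made-up two-dimensional cases forming two groups).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
centers, assignment = kmeans(points, initial_centers=[points[0], points[3]])
print(assignment)
print(centers)
```

Different starting centers can lead to different final clusters, which is one reason the number of clusters and the starting points deserve some care in practice.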
32
Demographic Segmentation Demonstration
Analysis goal:
Group geographic regions into segments based on
income, household size, and population density.
Analysis plan:
  • Select and transform segmentation inputs.
  • Select the number of segments to create.
  • Create segments with the Cluster tool.
  • Interpret the segments.
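The "transform" part of the plan can be illustrated with a small Python sketch. The talk's own SAS program is not reproduced in this transcript, so the code below is an assumption about what such a step might look like: a log transform to reduce right skew, followed by standardization so that inputs on different scales contribute comparably to distance-based clustering. The standardization step, the function name `log_standardize`, and the income values are all hypothetical.

```python
import math
import statistics


def log_standardize(values):
    """Log-transform positive values, then rescale to mean 0 and std. dev. 1.

    Hypothetical helper for illustration; assumes every value is > 0.
    """
    logged = [math.log(v) for v in values]      # compress the long right tail
    mean = statistics.fmean(logged)
    sd = statistics.pstdev(logged)              # population standard deviation
    return [(x - mean) / sd for x in logged]


# Made-up median household incomes with one very large value (right skew).
incomes = [20000, 30000, 45000, 60000, 250000]
scaled = log_standardize(incomes)
print([round(z, 2) for z in scaled])
```

After this step the largest income is still the largest value, but it no longer dominates the scale the way it does in the raw data.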
33
SAS Program
34
Skewed Data
35
Transformed Data - log
36
10 Clusters
37
Select a Cluster
38
What Can We Say?
39
Where Do They Live?
40
Summary
  • Statistics is a great field of study
  • Thank you for being here and wanting to teach AP
    statistics
  • We want to help you

41
  • Thanks