Title: Collection, Organization and Presentation of data
1Collection, Organization and Presentation of data
2Objectives
- Organize and tabulate the data sets using frequency distributions.
- Present the data in the form of pictures, charts or graphs, e.g. histograms, frequency polygons, etc.
3Introduction
- One useful way of organizing data is to divide the data into similar classes or categories. This then enables us to produce various charts such as histograms, frequency polygons, bar charts, pie charts, etc.
4Frequency distribution
- Collected data that have not been summarized or rearranged in a meaningful way are known as raw data, e.g. 3, 7, 2, 4, 8.
- When the data are arranged in either ascending or descending order, they form a data array.
- A data array is one of the simplest ways of presenting the data, e.g. 2, 3, 4, 7, 8.
5- A better way of arranging the data is to use a frequency distribution.
- A frequency distribution is an organized display of data showing the number of observations that fall into each of a set of mutually exclusive classes.
- The table on the next slide gives the weekly earnings of 50 manufacturing workers ().
6Weekly earnings of 50 workers
151 201 182 148 209
179 193 163 185 161
163 162 180 158 187
142 198 155 195 163
180 187 178 165 140
195 164 174 184 175
150 188 172 174 176
206 175 154 161 179
194 198 190 171 177
143 178 185 165 156
7Mutually exclusive classes
- Each data value can fall into only one class. Such classes are sometimes referred to as non-overlapping classes.
- E.g. the classes females and males: a person can belong to only one class.
- E.g.
- Class a: greater than 10
- Class b: greater than 6
- These classes are not mutually exclusive: a value such as 11 falls into both.
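The overlap test can be sketched in a few lines of Python. This is only an illustration: the helper name `classes_matching`, the test value 11 and the repaired class b ("greater than 6 but not greater than 10") are assumptions, not part of the original notes.

```python
# Each class is modelled as a (label, membership test) pair.
overlapping = [
    ("a: greater than 10", lambda x: x > 10),
    ("b: greater than 6", lambda x: x > 6),
]

# A hypothetical repair of class b that removes the overlap with class a.
exclusive = [
    ("a: greater than 10", lambda x: x > 10),
    ("b: greater than 6 but not greater than 10", lambda x: 6 < x <= 10),
]


def classes_matching(value, classes):
    """Return the labels of every class the value falls into."""
    return [label for label, test in classes if test(value)]


# 11 is greater than 10 and greater than 6, so it lands in two classes.
print(classes_matching(11, overlapping))
# With the repaired limits, 11 lands in exactly one class.
print(classes_matching(11, exclusive))
```

A set of classes is mutually exclusive only if no value can ever produce more than one label.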
8- Question 1
- Class 1 even numbers
- Class 2 odd numbers
- Question 2
- Category A married
- Category B either married or single
- Question 3
- Class x greater than 12
- Class y less than or equal to 12
9Constructing a frequency distribution
- We must consider the following
- Appropriate number of classes
- Choosing the proper class width
- Establishing the limits and the boundaries of
each class to avoid overlapping.
10The number of classes
- The number of classes depends on the number of data points and the range of the data collected.
- It is usually between 5 and 15.
- Too many classes result in too little concentration of data, whereas too few classes result in too much concentration of data.
11Class intervals
- Desirable classes should be of equal width.
- If the lowest and highest values in a data set are known, the width of the class interval can be given by the formula
class width = (highest value - lowest value) / number of classes
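As a rough sketch, assuming the usual rule class width = (highest value - lowest value) / number of classes, the width for the weekly earnings data can be worked out as follows. The choice of 7 classes and the rounding up to a whole number are assumptions made for this illustration.

```python
import math

lowest, highest = 140, 209   # smallest and largest weekly earnings in the data
number_of_classes = 7        # assumed choice, within the usual 5 to 15

# Range of the data divided by the number of classes.
raw_width = (highest - lowest) / number_of_classes

# Round up to a convenient whole-number width so all values are covered.
class_width = math.ceil(raw_width)

print(round(raw_width, 2), class_width)
```

Seven classes of width 10 starting at 140 cover every value from 140 to 209.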
12Class limits and boundaries
- Classes must be mutually exclusive
- i.e. they don't overlap
- Classes such as
- 4.00 - 4.50
- 4.50 - 5.00
- These two classes are not mutually exclusive, since the value 4.50 falls into both.
- The classes must be all-inclusive. Every data value must have a class in which it can be recorded.
13Frequency Distribution of Weekly Earnings for 50 workers
Weekly Earnings ()    Frequency
140 - 149             4
150 - 159             6
160 - 169             9
170 - 179             12
180 - 189             9
190 - 199             7
200 - 209             3
Total                 50
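The tally behind this table can be reproduced with a short Python sketch. The data are the 50 weekly earnings listed earlier; the variable names are illustrative only.

```python
earnings = [
    151, 201, 182, 148, 209, 179, 193, 163, 185, 161,
    163, 162, 180, 158, 187, 142, 198, 155, 195, 163,
    180, 187, 178, 165, 140, 195, 164, 174, 184, 175,
    150, 188, 172, 174, 176, 206, 175, 154, 161, 179,
    194, 198, 190, 171, 177, 143, 178, 185, 165, 156,
]

width = 10
lower_limits = range(140, 210, width)  # 140, 150, ..., 200

frequency = {}
for lower in lower_limits:
    upper = lower + width - 1  # stated upper limit, e.g. 149
    # Count the values that fall between the stated limits, inclusive.
    frequency[(lower, upper)] = sum(lower <= x <= upper for x in earnings)

for (lower, upper), count in frequency.items():
    print(f"{lower} - {upper}: {count}")
print("total:", sum(frequency.values()))
```

Because the classes are mutually exclusive and all-inclusive, the class frequencies add up to the 50 observations.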
14Another way of classifying the data from slide 13
140 and under 150
150 and under 160
160 and under 170
170 and under 180
180 and under 190
190 and under 200
200 and under 210
15Class size
- Also known as class width.
- It is the difference between two successive lower or upper limits. For example, the class width of the first class is 150 - 140 = 10 or 149.5 - 139.5 = 10.
16The class mark or midpoint
- It is the average of the lower and upper class limits (stated or real limits).
- For example, the class mark for the class 140 - 149 is 144.50, i.e.
- Class mark = (stated lower limit + stated upper limit) / 2 = (140 + 149) / 2 = 144.50
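A minimal Python sketch of the same calculation for all seven classes of the weekly earnings table. The list names are illustrative assumptions.

```python
# Stated (lower, upper) limits of the weekly earnings classes.
stated_limits = [(lower, lower + 9) for lower in range(140, 210, 10)]

# Class mark = average of the lower and upper limits of each class.
class_marks = [(lower + upper) / 2 for lower, upper in stated_limits]

# The real limits 139.5 and 149.5 give the same midpoint for the first class.
first_mark_from_real_limits = (139.5 + 149.5) / 2

print(class_marks)
print(first_mark_from_real_limits)
```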
17- Class mark is usually used to represent all the data values included in a class.
- When the number of classes, the class interval and the class limits have been determined, the data are then sorted into classes.
18Relative Frequency Distribution
- A relative frequency distribution can be obtained by dividing the frequency of each class by the total of all frequencies. By doing this we get a fraction or a percentage value for each class.
- Relative frequency of a class = frequency of the class / total observations
19Relative Frequency Distribution of Weekly Earnings for 50 manufacturing workers
Weekly earnings ()    Frequency    Relative Frequency
140 - 149             4            4/50 = 0.08
150 - 159             6            6/50 = 0.12
160 - 169             9            9/50 = 0.18
170 - 179             12           12/50 = 0.24
180 - 189             9            9/50 = 0.18
190 - 199             7            7/50 = 0.14
200 - 209             3            3/50 = 0.06
Totals                50           1.00
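The relative frequency column can be checked with a brief Python sketch, using the class frequencies from the table. The dictionary names are illustrative.

```python
frequencies = {
    "140 - 149": 4,
    "150 - 159": 6,
    "160 - 169": 9,
    "170 - 179": 12,
    "180 - 189": 9,
    "190 - 199": 7,
    "200 - 209": 3,
}

total = sum(frequencies.values())  # total number of observations

# Relative frequency = frequency of the class / total observations.
relative = {label: count / total for label, count in frequencies.items()}

for label, value in relative.items():
    print(f"{label}: {value:.2f}")
```

The relative frequencies always sum to 1, which is a quick check on the arithmetic.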
20Histogram
- A histogram is constructed by first marking
off class intervals on the x-axis and then
drawing rectangles whose heights equal the class
frequencies if classes are of equal width.
21Histogram
Note: this chart was produced using Microsoft Excel.
22Histogram
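- Suppose, however, we decide to combine classes 190 to 199 and 200 to 209 because there are so few observations in the range 200 to 209.
- Then the new class has frequency 10, but its width is now 20, twice as large as the other widths. Since heights equal frequencies only when classes are of equal width, the rectangle for this class is drawn with height 10/2 = 5 so that its area stays proportional to its frequency.
- Do not leave gaps when constructing a histogram.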
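One common way to handle the wider class is to scale each rectangle's height by its width relative to the standard width, so that area stays proportional to frequency. The sketch below assumes that convention; the variable names are illustrative.

```python
standard_width = 10

# (real lower limit, real upper limit, frequency) for each class.
classes = [
    (139.5, 149.5, 4),
    (149.5, 159.5, 6),
    (159.5, 169.5, 9),
    (169.5, 179.5, 12),
    (179.5, 189.5, 9),
    (189.5, 209.5, 10),  # combined class 190 - 209, width 20
]

# Height = frequency divided by the number of standard widths in the class.
heights = [
    freq / ((upper - lower) / standard_width)
    for lower, upper, freq in classes
]

print(heights)
```

Using the real limits as rectangle edges also ensures that adjacent rectangles touch, so no gaps appear in the histogram.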
23Frequency Polygon ( line chart)
- A histogram can be converted into a frequency polygon.
- It is obtained by joining the midpoints of the tops of the rectangles.
- Note that an extra class with frequency zero is usually added at each end so that the polygon begins and ends on the horizontal axis.
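The points of the frequency polygon for the weekly earnings data can be listed with a short Python sketch. It assumes one empty class of the same width is added at each end; the names are illustrative.

```python
class_marks = [144.5, 154.5, 164.5, 174.5, 184.5, 194.5, 204.5]
frequencies = [4, 6, 9, 12, 9, 7, 3]
width = 10

# Add an empty class (frequency zero) before the first and after the last
# class so that the polygon starts and ends on the horizontal axis.
xs = [class_marks[0] - width] + class_marks + [class_marks[-1] + width]
ys = [0] + frequencies + [0]

points = list(zip(xs, ys))
for x, y in points:
    print(x, y)
```

Joining these points in order with straight lines gives the frequency polygon.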
25Frequency polygon
- If we draw a smooth curve along the frequency
polygon, staying as close to the polygon as is
consistent with having a smooth curve, we can
obtain a frequency curve.
26Ogive
- An ogive is a cumulative frequency curve. It may be arranged on
- a "more than" or "or more" basis, and
- a "less than" or "or less" basis.
- Cumulative frequencies are obtained by summing the class frequencies successively.
27Ogive
- To construct an ogive, plot a point with height representing cumulative frequency above the real upper limit of each class. Then join these points.
- Note that 139.5, the real lower limit of the first non-empty class of the "less than" ogive, is used as a starting point with zero height.
- Similarly we construct the "or more" ogive, but starting from the bottom frequency. Hence 209.5, the real upper limit of the last non-empty class, has zero height.
28Cumulative Frequency Distribution of weekly earnings of 50 workers
Weekly earnings    Less than cumulative frequency    Weekly earnings    Or more cumulative frequency
139.5              0                                 139.5              50
149.5              4                                 149.5              46
159.5              10                                159.5              40
169.5              19                                169.5              31
179.5              31                                179.5              19
189.5              40                                189.5              10
199.5              47                                199.5              3
209.5              50                                209.5              0
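Both cumulative columns can be derived from the class frequencies, as in the following Python sketch. It uses `itertools.accumulate` from the standard library; the variable names are illustrative.

```python
from itertools import accumulate

frequencies = [4, 6, 9, 12, 9, 7, 3]
# Real class limits, from 139.5 up to 209.5.
boundaries = [139.5 + 10 * i for i in range(len(frequencies) + 1)]

# "Less than" basis: running total of frequencies, starting from zero.
less_than = [0] + list(accumulate(frequencies))

# "Or more" basis: whatever is left of the total at each boundary.
total = sum(frequencies)
or_more = [total - c for c in less_than]

for b, lt, om in zip(boundaries, less_than, or_more):
    print(b, lt, om)
```

At every boundary the two cumulative frequencies add up to the total of 50.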
29Pie chart
- The class 140 - 149 is a numerical or quantitative class. A frequency distribution can also have non-numerical or qualitative classes, for example, as shown in the table on the next slide.
30Private Households by Type of Household in 1980
Type of Household          No. of Households
One person                 42,386
Other No Family Nucleus    18,492
One-Family Nucleus         397,125
Multi-Family Nucleus       51,521
Total                      509,524
31- In the case of qualitative data, it is more appropriate to use a pie chart to present the data in the frequency distribution.
- Pie charts are usually used to present qualitative or categorical data, particularly percentage distributions.
- It is desirable to keep the number of components under six in a pie chart.
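To draw the pie chart by hand, each category's share of the total is converted to an angle out of 360 degrees. A minimal Python sketch using the household figures from the table (the variable names are illustrative):

```python
households = {
    "One person": 42386,
    "Other No Family Nucleus": 18492,
    "One-Family Nucleus": 397125,
    "Multi-Family Nucleus": 51521,
}

total = sum(households.values())

# Each sector's angle is its share of the total times 360 degrees.
angles = {name: count / total * 360 for name, count in households.items()}

for name, count in households.items():
    share = count / total
    print(f"{name}: {share:.1%} of households, {angles[name]:.1f} degrees")
```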
32Pie chart showing classification of private
households by Type
33Bar Chart
- The data in the previous table can also be presented by a bar chart with each bar representing a type of household.
- The height of each bar is proportional to the frequency of the class or category it represents.
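The proportionality can be illustrated with a rough text-based bar chart in Python. The scale of one `#` per 10,000 households is an assumption chosen for this sketch, and the bars are drawn horizontally for convenience.

```python
households = {
    "One person": 42386,
    "Other No Family Nucleus": 18492,
    "One-Family Nucleus": 397125,
    "Multi-Family Nucleus": 51521,
}

scale = 10_000  # assumed scale: one '#' per 10,000 households

# Bar length is proportional to the frequency of each category.
bar_lengths = {name: round(count / scale) for name, count in households.items()}

for name, count in households.items():
    print(f"{name:<24} {'#' * bar_lengths[name]} {count:,}")
```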
34Bar chart showing classification of Private
households by type
35Stem-and-leaf
- The stem-and-leaf display can be best explained by the following example.
- The following data show the number of teachers in primary schools.
- 30 33 65 32 70 42 37
- Stem (10s digit) | Leaf (1s digit)
- 3 | 0 2 3 7 represents (30, 32, 33, 37)
- 4 | 2 represents (42)
- 5 | no data values
- 6 | 5 represents 65
- 7 | 0 represents 70
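The same display can be built with a short Python sketch, splitting each value into its tens digit (stem) and units digit (leaf). The variable names are illustrative.

```python
teachers = [30, 33, 65, 32, 70, 42, 37]

# One stem for every tens digit between the smallest and largest value,
# including stems that end up with no leaves.
stems = range(min(teachers) // 10, max(teachers) // 10 + 1)
display = {stem: [] for stem in stems}

# Sorting first keeps the leaves in ascending order on each stem.
for value in sorted(teachers):
    display[value // 10].append(value % 10)

for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Stem 5 is kept with an empty row so that the gap in the data remains visible.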
36- The stem-and-leaf display shows just two figures for each data value. For example, the data value 1475 could be represented as
- 1 | 4 (1000s digit and 100s digit)
- or 4 | 7 (100s digit and 10s digit)
- or 7 | 5 (10s digit and 1s digit)
37recap
- Organize the data sets using frequency distributions.
- Present the data in the form of pictures, charts and graphs, e.g. bar charts, histograms, frequency polygons, etc.
38References Lecture Tutorial Notes from
Department of Business Management, Institute
Technology Brunei, Brunei Darussalam.