Statistics


   
 
Frequency Distribution
A teacher gave a test to a class of 26 students. The maximum mark is 5. The marks obtained by the pupils are:
 
 
Such data as above is called ungrouped (or raw) data.
 
We may arrange the marks in ascending or descending order. The data so represented is called an array.
 
 
The difference between the greatest and the smallest values is called the range of the data. For the above data, the range is 5 - 1 = 4 marks.
 
Each of the above two forms of data representation (raw data and array) is called a simple distribution.
 
We may arrange the above data in a frequency distribution table as given below:
 
Marks    Tally

 
The number of times a particular value appears in the data is called its frequency. For example, five students scored 2 marks, so the mark 2 occurs five times and its frequency is five. Similarly, the frequency of the mark 5 is three because three students scored 5 marks. Data arranged in this way is called a frequency distribution, and its tabular form is called a frequency distribution table.
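The array, the range and the frequency table described above can be sketched in a few lines of Python. The list of marks below is hypothetical, since the original list is not reproduced here; it was chosen only so that it has 26 entries, a smallest mark of 1, a greatest mark of 5, five occurrences of 2 and three occurrences of 5, as stated in the text.

```python
from collections import Counter

# Hypothetical marks for 26 students (illustrative only, not the original data).
# Chosen so that 2 occurs five times and 5 occurs three times, as in the text.
marks = [1] * 4 + [2] * 5 + [3] * 8 + [4] * 6 + [5] * 3

# An array is the raw data arranged in ascending (or descending) order.
array = sorted(marks)

# Range = greatest value - smallest value.
data_range = max(marks) - min(marks)

# Frequency = number of times each mark occurs.
frequency = Counter(marks)

print("Range:", data_range)
print("Marks  Frequency")
for mark in sorted(frequency):
    print(f"{mark:>5}  {frequency[mark]:>9}")
```

Counter is used here only for convenience; counting by hand with tally marks gives the same table.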
 
Tally mark
 
A tally mark is a vertical stroke drawn to record one occurrence of a particular score. After four vertical strokes, the fifth stroke is drawn across them to make a bunch of five. This makes totalling easier.
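As a small illustration, the bunching rule can be imitated in Python. The helper name tally and the text form "||||/" (four vertical strokes crossed by a fifth) are assumptions made for this sketch, not notation from the lesson.

```python
def tally(count: int) -> str:
    """Render a frequency as tally marks, bundling every five strokes."""
    bunches, leftover = divmod(count, 5)
    # "||||/" stands for four vertical strokes crossed by a fifth stroke.
    parts = ["||||/"] * bunches
    if leftover:
        parts.append("|" * leftover)
    return " ".join(parts)


print(tally(3))   # |||
print(tally(5))   # ||||/
print(tally(12))  # ||||/ ||||/ ||
```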
 
If the number of scores is large, the range is divided into a number of classes and the tally is obtained for each class. The resulting frequency table is called a grouped frequency distribution.
 
There are two types of grouped frequency distribution:
 
(a) Inclusive
 
(b) Exclusive
 
(a) In inclusive type of frequency distribution, the upper limit of one class does not coincide with the lower limit of the next class.
 
(b) In exclusive type of frequency distribution, the upper limit of one class coincides with the lower limit of the next class.
 
The following example illustrates the inclusive type of frequency distribution. Consider the following ungrouped marks (out of 50) given to 30 students:
 
 
 
The range for the above ungrouped data is 49 - 12 = 37. Normally it is desirable to divide the range into 6 to 10 classes. Consider the class 11 - 15. A student who scores 11 marks or 15 marks is placed in this class. For this class, 11 is the lower limit and 15 is the upper limit, and both are included in the class. In an 'exclusive' frequency distribution, by contrast, one of the limits of each class (usually the upper limit) is excluded from the class. The above frequency distribution can also be recast in the exclusive form as follows:
 
 
The score 20 is now included in the class 20 - 25 and not in 15 - 20.
 
In the above table, the data is grouped in class intervals of 5. It can also be grouped into class intervals of 10, 15. The table can also have unequal classes but it is not desirable. The very purpose of grouping will be lost if there are too few or too many or unequal intervals.
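The inclusive grouping with classes 11 - 15, 16 - 20, … can be sketched as follows. The scores are hypothetical (the original 30 marks are not reproduced here), and the function names are chosen for this illustration only.

```python
def inclusive_classes(first_lower, last_upper, width):
    """Build inclusive classes such as (11, 15), (16, 20), ..."""
    return [(lower, lower + width - 1)
            for lower in range(first_lower, last_upper + 1, width)]


def group_inclusive(scores, classes):
    """Count how many scores fall in each class; both limits are included."""
    table = {cls: 0 for cls in classes}
    for score in scores:
        for lower, upper in classes:
            if lower <= score <= upper:
                table[(lower, upper)] += 1
                break
    return table


# Hypothetical scores out of 50 (illustrative only).
scores = [12, 15, 16, 20, 23, 31, 35, 38, 44, 49]
classes = inclusive_classes(11, 50, 5)
table = group_inclusive(scores, classes)

for (lower, upper), frequency in table.items():
    print(f"{lower} - {upper}: {frequency}")
```

With a class interval of 5 this gives eight classes, which lies within the suggested 6 to 10. Note that the score 20 is counted in the class 16 - 20 because both limits belong to the class.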
 
Class
 
 
 
An inclusive type of frequency distribution can be converted into the exclusive type, as in Table (b).
 
Class limits for Table (a):
 
The lowest values of various classes are 11, 16, 21,…. These values are called lower limits. The greatest values of various classes are 15, 20, 25,…. These values are called upper limits. Therefore,
 
Lower limit of class 11 - 15 is 11
 
Upper limit of class 11 - 15 is 15
 
Class limits do not always mark the true extent of a class. Here the scores are rounded off to the nearest whole number, so a recorded score of 11 may stand for any value from 10.5 up to 11.5. To obtain the actual class intervals, the class limits are therefore extended both ways: 0.5 is subtracted from each lower limit and 0.5 is added to each upper limit. For the class interval (11 - 15), the actual lower limit or lower boundary is 10.5 and the actual upper limit or upper boundary is 15.5. In this way Table (b), with actual limits, is obtained wherever necessary.
 
 
1. The class limits can be extended by 0.5 only for continuous variables and not for discrete variables.
 
2. In discrete variables, the lower limit and actual lower limit remain the same and so also the upper limits and actual upper limits.
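The conversion from class limits to actual limits can be sketched as below. This assumes a continuous variable recorded to the nearest whole number, so that neighbouring classes are 1 apart (15 and 16, for example); the function name and the gap parameter are hypothetical.

```python
def class_boundaries(lower, upper, gap=1):
    """Actual limits: move each limit outward by half the gap between classes."""
    half = gap / 2
    return lower - half, upper + half


# Inclusive classes taken from Table (a) in the text.
for lower, upper in [(11, 15), (16, 20), (21, 25)]:
    low_b, up_b = class_boundaries(lower, upper)
    print(f"{lower} - {upper}  ->  {low_b} - {up_b}")
```

For a discrete variable the limits are left unchanged, as the two notes above state.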
 
Class width or class size is the difference between the actual limits of a class. The class width of class (11 - 15) is 15.5 - 10.5 = 5. Similarly, the class width of every other class is also 5. In the inclusive type of frequency distribution, the class 11 - 15 also has width 5 because both limits are included: it covers the five scores 11, 12, 13, 14 and 15.
 
Class mid-mark or class mark is the mid-point of each interval. The class mark for class (11 - 15) is (11 + 15)/2 = 13. Similarly, the class marks for the other intervals are 18, 23, 28, …
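Class width and class mark are simple enough to check numerically. A minimal sketch, with hypothetical function names, using the classes from Table (a):

```python
def class_width(lower_boundary, upper_boundary):
    """Class width = actual upper limit - actual lower limit."""
    return upper_boundary - lower_boundary


def class_mark(lower, upper):
    """Class mark = mid-point of the class."""
    return (lower + upper) / 2


print(class_width(10.5, 15.5))  # 5.0
print(class_mark(11, 15))       # 13.0

# Class marks of the next three classes.
print([class_mark(lo, up) for lo, up in [(16, 20), (21, 25), (26, 30)]])
```

The class mark is the same whether it is computed from the class limits (11 and 15) or from the actual limits (10.5 and 15.5).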
 
 
When the data is put in a frequency distribution table, the marks obtained by an individual student lose their identity. Some detail is therefore lost, but we learn much more about the data as a whole.
 
Thus statistics deals with the study of masses rather than individuals. We can tell the general pattern of the marks obtained by the class but not of an individual.
 
With reference to Tables (a) and (b):
 
(a) Lower limit of fifth class is 31
 
(b) Upper limit of fifth class is 35
 
(c) Actual lower limit or lower boundary of fifth class is 30.5
 
(d) Actual upper limit or upper boundary of fifth class is 35.5
 
(e) Class size or class width of 5th class is 35.5 - 30.5 = 5
 
(f) The class mark or mid-point of the fifth class is (31 + 35)/2 = 33
 
Sometimes class marks rather than classes are given for a frequency table. Consider the case:
 
 
By observing the class marks 13, 18, 23, it is obvious that the class interval is 5. Therefore, classes can be obtained as in
 
Table (a) 11 - 15, 16 - 20, …
 
Or Table (b) 10.5 - 15.5, 15.5 - 20.5, …
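Recovering the classes from the class marks can be sketched as follows. The sketch assumes equally spaced class marks and whole-number scores (so that the inclusive limits lie 0.5 inside the boundaries); the function name is hypothetical.

```python
def classes_from_marks(class_marks):
    """Return (boundaries, inclusive_limits) for equally spaced class marks."""
    width = class_marks[1] - class_marks[0]   # class interval
    half = width / 2
    # Actual limits (boundaries) lie half a class width either side of the mark.
    boundaries = [(m - half, m + half) for m in class_marks]
    # Assuming whole-number scores, inclusive limits sit 0.5 inside the boundaries.
    limits = [(int(lo + 0.5), int(hi - 0.5)) for lo, hi in boundaries]
    return boundaries, limits


boundaries, limits = classes_from_marks([13, 18, 23])
print(limits)      # [(11, 15), (16, 20), (21, 25)]
print(boundaries)  # [(10.5, 15.5), (15.5, 20.5), (20.5, 25.5)]
```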
 
Consider this example for exclusive type of distribution. The following is a survey of the pocket money of 40 students in a school (pocket money in rupees per week):
 
 
The range of the above ungrouped data is 78 - 27 = 51. We make intervals of 20 - 30, 30 - 40, 40 - 50, …
 
 
For class 20 - 30, we read it to mean 20 and above but less than 30. For class 30 - 40, we read it to mean 30 and above but less than 40, and so on. The frequency distribution table is then obtained.
 
 
In this example, overlapping classes are selected: the upper limit of each class is the lower limit of the next. Therefore the lower limits and actual lower limits are the same, and so are the upper limits and actual upper limits.
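The exclusive reading of a class, "lower limit and above but less than the upper limit", can be sketched as below. The pocket-money values are hypothetical (the survey data is not reproduced here); only the smallest and greatest values, 27 and 78, are taken from the text.

```python
def group_exclusive(values, first_lower, last_upper, width):
    """Count values per class, reading each class as lower <= value < upper."""
    classes = [(lower, lower + width)
               for lower in range(first_lower, last_upper, width)]
    table = {cls: 0 for cls in classes}
    for value in values:
        for lower, upper in classes:
            if lower <= value < upper:
                table[(lower, upper)] += 1
                break
    return table


# Hypothetical pocket money in rupees per week (illustrative only).
pocket_money = [27, 30, 35, 40, 49, 50, 59, 60, 78]
money_table = group_exclusive(pocket_money, 20, 80, 10)

for (lower, upper), frequency in money_table.items():
    print(f"{lower} - {upper}: {frequency}")
```

A value of exactly 30 is counted in 30 - 40 and not in 20 - 30, which is what distinguishes this from the inclusive grouping.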
 
With reference to the table:
 
(a) Lower limit of fourth class is 50
 
(b) Upper limit of fourth class is 60
 
(c) Actual lower limit (lower boundary) of fourth class is 50
 
(d) Actual upper limit (upper boundary) of fourth class is 60
 
(e) Class size or class width of fifth class is 10
 
(f) The class mark or mid-mark of fifth class (60 - 70) is (60 + 70)/2 = 65
 
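Sometimes we use 'open intervals', which means that the lower limit of the lowest class or the upper limit of the highest class is not stated, so that the class is described only as 'below' or 'above' a certain value.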
 
Consider the following example:
 
The wages of the skilled workers in a factory are given below:
 
 
 
Open intervals are used when the data contains extreme scores. Sometimes one more column, called cumulative frequency, is added to the frequency distribution table.
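The cumulative frequency of a class is the running total of the frequencies up to and including that class. A minimal sketch, using hypothetical class labels and frequencies:

```python
from itertools import accumulate

# Hypothetical classes and frequencies (illustrative only).
class_labels = ["20 - 30", "30 - 40", "40 - 50", "50 - 60"]
frequencies = [3, 7, 12, 8]

# Running total of the frequencies, class by class.
cumulative = list(accumulate(frequencies))

print("Class     Frequency  Cumulative frequency")
for label, f, cf in zip(class_labels, frequencies, cumulative):
    print(f"{label:<9} {f:>9}  {cf:>20}")
```

The last cumulative frequency always equals the total number of observations, which gives a quick check on the table.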
 
 
     
   