
Frequency Distributions

  1. Ungrouped
    1. Example - [Minitab]
    2. Rules
    3. Review of the Mathematics Involved
  2. Grouped
    1. Example - [Minitab]
    2. Rules

Practice Problems (includes Graphing from the next section) (Answers)
Homework


A frequency distribution is a procedure for describing a set of data. There are two types: ungrouped and grouped.

I. Ungrouped (also called a "Tally")

  1. Example - [Minitab]

    The following data represent the number of hours of TV viewing per week (X) for 20 people.

    7 5 4 7 4 6 6 6 5 4 6 7 2 7 5 5 6 2 2 7

    What follows is an ungrouped frequency distribution for the above data.

    X                 f         fr=p           %         Cf      Cp      C%
    7                 5         (5/20=).25     25        N=20    1.00    100
    6                 5         .25            25        15      .75     75
    5                 4         .20            20        10      .50     50
    4                 3         .15            15        6       .30     30
    3                 0         .00            0         3       .15     15
    2 (or 1.5-2.5)    3         .15            15        3       .15     15
                      ∑f=20=N   ∑p=1.00        ∑%=100

    The table clearly shows how the scores are distributed. For example, you can quickly see that half of the people watched 6 or more hours of TV. The distribution is expressed in three ways: as a frequency (f), a proportion (p), and a percent (%). Notice also the cumulative columns (Cf, Cp, and C%). A cumulative frequency is the frequency of scores falling at or below the upper exact limit of a score; for example, the Cf of 10 for X = 5 means that 10 people watched 5.5 hours or less. Lastly, note that decimal usage within each column is consistent.

  2. Rules for creating an ungrouped frequency distribution
    1. Locate the extreme values (the lowest score or XL and the highest score or XH) and make a column that lists all the values from XH to XL. (Note that XH is at the top of the column and XL is at the bottom. This is a convention which will make some things easier later.)
    2. Make an adjacent column labeled "frequency" (f). Count the frequency of each X and put it in the f column. This is called a tally of the scores. It is a good idea when doing this to put a slash through the number so you know it has been counted. (If you make a mistake, you can go back and use a slash in the opposite direction.) Check that the ∑f = N and put this in a row at the bottom of the column.
    3. Create another column called "relative frequency" or "proportion" (fr = p = f/N). As a general rule, proportions should be expressed in hundredths. Check that the ∑fr ≈ 1 and show it at the bottom of the column.
    4. Create another column called "percent" (% = fr * 100). As a general rule, percents should be expressed without decimal places (or sometimes with one). Show that ∑% ≈ 100 at the bottom of the column.
    5. Create another column called "cumulative frequency." Remember that this column gives the frequency of scores falling at or below the upper exact limit of a score. When creating this column, start at XL and work your way up by adding all frequencies for the scores at or below the score you are interested in. Thus:
        For XL, the Cf = f
        For XL+1, the Cf = f of XL + f of XL+1, etc.
        The Cf of XH must = N.
    6. Create another column called "cumulative proportion" (Cp = Cf/N).
    7. Optionally, create another column called "cumulative percent" (C% = Cp * 100 = Cf/N * 100).
      Note that the last two columns are not created in the same manner as the Cf column. They are computed from the values in the Cf column.
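
The rules above can be sketched in Python. This is only an illustration, not part of the original procedure: the function name `ungrouped_distribution` and the tuple layout are assumptions made for this example, and it assumes whole-number scores.

```python
from collections import Counter

def ungrouped_distribution(scores):
    """Return rows (X, f, p, %, Cf, Cp, C%) listed from XH down to XL.

    Hypothetical helper for illustration; assumes integer scores.
    """
    n = len(scores)
    counts = Counter(scores)          # the tally of each score
    rows = []
    cf = 0
    # Work upward from XL so Cf counts the scores at or below each X.
    # Every value between XL and XH gets a row, even when f = 0.
    for x in range(min(scores), max(scores) + 1):
        f = counts.get(x, 0)
        cf += f
        rows.append((x, f,
                     round(f / n, 2), round(100 * f / n),
                     cf,
                     round(cf / n, 2), round(100 * cf / n)))
    rows.reverse()                    # convention: XH at the top, XL at the bottom
    return rows

tv_hours = [7, 5, 4, 7, 4, 6, 6, 6, 5, 4, 6, 7, 2, 7, 5, 5, 6, 2, 2, 7]
table = ungrouped_distribution(tv_hours)
for row in table:
    print(row)

# The checks from rules 2 and 5: the f column sums to N, and the Cf of XH equals N.
assert sum(row[1] for row in table) == len(tv_hours)
assert table[0][4] == len(tv_hours)
```

Run on the TV viewing data, the rows should agree with the table in the example. One caveat: Python's `round()` sends exact halves to the nearest even value, which may differ from the rounding done by hand.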

  3. Review of the Mathematics Involved
    1. fr = p = f/N.
    2. % = p * 100.
    3. ∑p ≈ 1.00 and ∑% ≈ 100. "≈" rather than "=" is used because of rounding error. However, if the proportions are off by more than .02 or the percents are off by more than 1, you have too much rounding error. The solution is to increase the number of decimals used in the intermediate calculations.
    4. Cp = Cf/N (& the Cp for XH must = 1.00; notice the "=" rather than "≈").
    5. C% = Cp * 100 (& the C% for XH must = 100; notice the "=" rather than "≈").
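
To see how rounding error can push ∑p away from 1.00, consider a made-up case (not from the examples here) of 13 different scores that each occur once. The helper name `sum_of_rounded_proportions` is an assumption for this sketch.

```python
def sum_of_rounded_proportions(freqs, decimals):
    """Round each proportion f/N to `decimals` places, then add them up."""
    n = sum(freqs)
    return sum(round(f / n, decimals) for f in freqs)

# Hypothetical data: 13 distinct scores, each with f = 1, so every p = 1/13.
freqs = [1] * 13

coarse = sum_of_rounded_proportions(freqs, 2)   # each p rounds to .08
fine = sum_of_rounded_proportions(freqs, 4)     # each p rounds to .0769

print(f"2 decimals: sum = {coarse:.2f}, off by {abs(1 - coarse):.2f}")
print(f"4 decimals: sum = {fine:.4f}, off by {abs(1 - fine):.4f}")
```

With two decimals the sum comes to about 1.04, which is off by more than .02. Carrying four decimals brings it to about .9997, well within tolerance.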

II. Grouped Frequency Distribution

A grouped frequency distribution is useful for large data sets, and it makes the form or shape of the distribution more obvious. A disadvantage, though, is that the scores lose their individual identity.

  1. Example - [Minitab]
    The following data are the expected scores on Exam 1 (N=25).

    95 88 81 79 73 92 88 81 79 72 92 86 81 77 67 91 85 80 77 62 89 84 80 74 61

    A simple tally or ungrouped frequency distribution would not be very helpful in this case. What follows is a grouped frequency distribution for these data.

    Interval              Real or
    (Stated or            Exact
    Apparent limits)      limits        Midpoint    f       p         %         Cf      Cp      C%
    95-99                 94.5-99.5     97          1       .04       4         N=25    1.00    100
    90-94                 89.5-94.5     92          3       .12       12        24      .96     96
    85-89                 84.5-89.5     87          5       .20       20        21      .84     84
    80-84                 79.5-84.5     82          6       .24       24        16      .64     64
    75-79                 74.5-79.5     77          4       .16       16        10      .40     40
    70-74                 69.5-74.5     72          3       .12       12        6       .24     24
    65-69                 64.5-69.5     67          1       .04       4         3       .12     12
    60-64                 59.5-64.5     62          2       .08       8         2       .08     8
                                                    ∑f=25   ∑p=1.00   ∑%=100

    Note how easy it is to see the form or shape of the distribution. For example, it is easy to derive:

    Number Grade    Letter Grade    % of Class
    90s             A               16
    80s             B               44
    70s             C               28
    60s             D               12
                                    100

    However, using just the grouped frequency distribution we wouldn't be able to tell exactly what the highest score is (i.e., the individual scores have lost their identity).
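
The letter-grade percentages can be derived from the % column of the grouped table by adding the two intervals that fall in each decade. A small sketch of that addition follows; the dictionary names are assumptions made for this example.

```python
# Percent column of the grouped table, keyed by each interval's lower apparent limit.
grouped_pct = {95: 4, 90: 12, 85: 20, 80: 24, 75: 16, 70: 12, 65: 4, 60: 8}
letter_for_decade = {90: "A", 80: "B", 70: "C", 60: "D"}

pct_by_decade = {}
for lower_limit, pct in grouped_pct.items():
    decade = (lower_limit // 10) * 10      # 95 -> 90, 85 -> 80, and so on
    pct_by_decade[decade] = pct_by_decade.get(decade, 0) + pct

for decade, pct in pct_by_decade.items():
    print(f"{decade}s  {letter_for_decade[decade]}  {pct}%")
```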

  2. Rules for creating a grouped frequency distribution
    1. Compute the Range (R) using the formula below.

      R = XH - XL + 1

      (Consider the scores 4, 5, & 6. R = 6 - 4 + 1 = 3 scores.)
    2. The width of a group or interval (i) must be an odd number.
      If i=1, we have an ungrouped frequency distribution. Thus the smallest reasonable value for i is 3.
    3. Use about 10 to 20 groups or intervals.
    4. Make the lowest apparent limit divisible by i. This interval should contain XL.
    5. Use the formula below to help you determine an appropriate number of groups or intervals to group the data into. Note that the formula is a guide and just gives an estimate (as we will see below). To work with the formula, start by using the lowest possible value of i (i.e., 3) and divide it into R.

      Number of groups ≈ R / i

    So, in our example R = XH - XL + 1 = 95 - 61 + 1 = 35. With i = 3, R / i = 35 / 3 ≈ 11.7, or about 12 groups.

    Since we have limited space on the screen, an i of 5 was chosen (& thus we would have 35 / 5 = 7 groups). Note, however, that since 61 is not divisible by i, the lowest stated limit has to be 60, which explains why the # of groups actually used was 8 rather than 7. So here we see that the formula is just a guide.
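
The arithmetic behind these choices can be checked with a short Python sketch. The function name `interval_plan` is an assumption for this illustration; it assumes whole-number scores and takes the lowest apparent limit to be the largest multiple of i that does not exceed XL.

```python
def interval_plan(scores, i):
    """Return (R, estimated groups, lowest apparent limit, groups actually needed)."""
    x_low, x_high = min(scores), max(scores)
    r = x_high - x_low + 1                 # R = XH - XL + 1
    estimate = r / i                       # the guide: R divided by i
    lowest_limit = (x_low // i) * i        # divisible by i, and its interval contains XL
    actual = (x_high - lowest_limit) // i + 1
    return r, estimate, lowest_limit, actual

exam = [95, 88, 81, 79, 73, 92, 88, 81, 79, 72, 92, 86, 81,
        77, 67, 91, 85, 80, 77, 62, 89, 84, 80, 74, 61]

for i in (3, 5, 7):                        # candidate odd interval widths
    r, estimate, lowest_limit, actual = interval_plan(exam, i)
    print(f"i={i}: R={r}, R/i={estimate:.1f}, "
          f"lowest apparent limit={lowest_limit}, groups needed={actual}")
```

For i = 5 this gives an estimate of 7 groups, a lowest apparent limit of 60, and 8 groups actually needed, matching the example.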

    Once we decide on the lowest apparent limit and the number of groups, we can create the leftmost column of the grouped frequency distribution, which lists the scores or, more precisely, the intervals of scores. The next column should give the exact limits for these intervals. This column helps with interpreting and understanding the cumulative frequency columns. The next column should give the midpoints (which will come in handy, as we will see later). The rest of the columns are created in the same manner as for the ungrouped frequency distribution.
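
Putting the pieces together, a grouped frequency distribution might be built as in the sketch below. The function name `grouped_distribution` and the dictionary keys are assumptions for this example, and the code assumes whole-number scores, so the exact limits sit half a unit beyond the apparent limits.

```python
def grouped_distribution(scores, i):
    """Return one dict per interval, listed from the highest interval down."""
    n = len(scores)
    low = (min(scores) // i) * i           # lowest apparent limit, divisible by i
    rows = []
    cf = 0
    while low <= max(scores):
        high = low + i - 1                 # upper apparent limit
        f = sum(low <= x <= high for x in scores)
        cf += f                            # scores at or below the upper exact limit
        rows.append({
            "interval": (low, high),
            "exact": (low - 0.5, high + 0.5),
            "midpoint": (low + high) / 2,
            "f": f, "p": round(f / n, 2), "pct": round(100 * f / n),
            "Cf": cf, "Cp": round(cf / n, 2), "Cpct": round(100 * cf / n),
        })
        low += i
    rows.reverse()                         # convention: highest interval at the top
    return rows

exam = [95, 88, 81, 79, 73, 92, 88, 81, 79, 72, 92, 86, 81,
        77, 67, 91, 85, 80, 77, 62, 89, 84, 80, 74, 61]

rows = grouped_distribution(exam, 5)
for row in rows:
    print(row)
```

With i = 5 the f and Cf columns should reproduce the table in the example above.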


Copyright © 1997-2016 M. Plonsky, Ph.D.
Comments? mplonsky@uwsp.edu.