Math 3118, section 4
Spring 2001
Some facts about the normal curve
Purpose: A bit of further explanation
about the normal curve and how to work with it.
As explained in the text, the normal curve is given by the following
equation:
f(x) = (1 / √(2π)) · e^(-x² / 2)
We don't have to work directly with this function very often, so we'll just need
to know about a few of its basic properties. By one of
the class exercises, 11.4.2:
- the values of the function are all > 0.
- the graph of this function is symmetric about the y-axis;
thus f(x) = f(-x) for all x.
- the value of f(x) becomes very small when
x grows very large.
The graph of the function is shown in Figure 1 below.
Figure 1: the normal curve
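As a quick numerical check of these three properties, here is a minimal Python sketch. It assumes that the curve in question is the standard normal density, f(x) = exp(-x² / 2) / √(2π); the function name normal_curve is ours, not the text's.

```python
import math

def normal_curve(x: float) -> float:
    """Standard normal density (assumed form of the curve in the text)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

# Print f(x) and f(-x) side by side for a few sample points.
for x in (0.0, 1.0, 2.0, 5.0):
    print(x, normal_curve(x), normal_curve(-x))
```

Each row should show two equal positive numbers, and the numbers shrink quickly as x grows.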
By definition, a random variable is normally distributed if its
probability distribution looks like the normal curve, suitably shifted
and re-scaled. Thus:
- the curve is shifted so that its midpoint
corresponds to the mean of our random variable,
- it is stretched
so that 1 unit of distance from the standard normal curve corresponds to
the standard deviation of our random variable, and
- the vertical
scale is adjusted so that the total area remains equal to 1.
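The three adjustments above can be sketched in Python as follows. This is only an illustration under the assumption that the unshifted curve is the standard normal density; the names standard_curve, shifted_curve and area_under are ours, and the area is estimated with a simple midpoint sum rather than anything from the text.

```python
import math

def standard_curve(z: float) -> float:
    """Standard normal density (assumed form of the unshifted curve)."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def shifted_curve(x: float, mean: float, sd: float) -> float:
    """Shift by the mean, stretch by the s.d., and rescale the height by 1/sd."""
    z = (x - mean) / sd               # distance from the mean in s.d. units
    return standard_curve(z) / sd     # dividing by sd keeps the total area at 1

def area_under(mean: float, sd: float, steps: int = 20000) -> float:
    """Midpoint-rule estimate of the area from mean - 8 sd to mean + 8 sd."""
    lo, hi = mean - 8.0 * sd, mean + 8.0 * sd
    width = (hi - lo) / steps
    total = sum(shifted_curve(lo + (i + 0.5) * width, mean, sd) for i in range(steps))
    return total * width

print(area_under(100.0, 20.0))   # should be very close to 1
```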
We don't have to work directly with the equation here!!
Instead, we just have to use Table 11.6 in chapter 11 of the text
(on page 254 in the July 17, 2001 version) to find the area under the curve
between the two corresponding z-values. Here A(z) denotes the table entry
for z, namely the area under the curve between the midpoint (z = 0) and z.
So, in Figure 1 above, the
area of the yellow shaded region has to be
A(z2) - A(z1).
This area (being a number between
0 and 1) tells us the fraction of our
probability distribution -- or population -- for which
the value of the random variable is within the indicated range.
We'll explain below how to do this in some frequently
encountered cases.
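For readers who want to check table lookups on a computer, here is a hedged Python sketch. It uses the standard library's statistics.NormalDist as a stand-in for Table 11.6, so results may differ from the table in the last digit; the helper names A, z_value and area_between are ours.

```python
from statistics import NormalDist

def A(z: float) -> float:
    """Area under the standard normal curve between 0 and z (negative if z < 0)."""
    return NormalDist().cdf(z) - 0.5

def z_value(score: float, mean: float, sd: float) -> float:
    """Distance of a score from the mean, in standard deviation units."""
    return (score - mean) / sd

def area_between(z1: float, z2: float) -> float:
    """Area under the curve between two z-values, computed as A(z2) - A(z1)."""
    return A(z2) - A(z1)

print(round(A(1.0), 4), round(area_between(1.0, 2.0), 4))
```

Because this A(z) is negative for negative z, the same difference A(z2) - A(z1) also works when the two z-values lie on opposite sides of the mean.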
A couple of instances. In §11.5, we'll see that the
outcome of an independent trials process -- repeated a large number
of times -- is approximately normally distributed.
IQ scores provide another example, as shown in figure 2 below.
Figure 2: the distribution of IQ scores
This example illustrates the fact that the re-scaling is "almost invisible".
Namely, the standard deviation is 20, so the difference between 120 and the mean
( = 100) is exactly 1 standard deviation unit. Therefore the area
-- shaded in magenta in Figure 2 -- between
the corresponding z-values (z = 0 and
z = 1) is A(1) = 0.3413, or about 0.34.
Hence, about 34% of the population has IQ scores in this range.
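One way to see that the re-scaling does not affect the answer is to compute the same area twice: once directly on the IQ scale, and once after converting to z-values. The sketch below assumes the mean of 100 and standard deviation of 20 used in this example, and again uses statistics.NormalDist in place of the table.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=20)          # mean and s.d. from this example

# Area between 100 and 120, computed directly on the IQ scale.
direct = iq.cdf(120) - iq.cdf(100)

# The same area after converting both endpoints to z-values.
z_low, z_high = (100 - 100) / 20, (120 - 100) / 20
via_z = NormalDist().cdf(z_high) - NormalDist().cdf(z_low)

print(round(direct, 4), round(via_z, 4))   # both round to 0.3413
```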
Working with normal distributions
Part I: Determining the percentage of the
population when a range of scores is given
- Finding the area to the right of a given [positive] z-value.
For instance, this is the situation if we're given some test score
above the mean and we want to find what percentage of the people taking
the test had scores above that level.
For a specific numerical instance, suppose that in a test with a mean of 73
and a standard deviation of 6, we want to know what percentage of the
students had scores of 80 or above. (This is like the discussion at the
bottom of page 253 of the text.) So there are 3 steps:
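- Find the z-value. Since scores take integer values, we work with 79.5
instead of 80 (this "splits the difference" between 79 and 80); subtract
the mean from this value, and then divide by the standard deviation:
z = (79.5 - 73) / 6 = 6.5 / 6 = 1.083.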
- Look up the value of A(z) in the table. We round off
to z = 1.1 and then look up:
A(1.1) = 0.3643.
This gives the area that's shaded with red stripes in the diagram.
- Subtract from 0.5 to find the area of the right-hand tail
of the distribution. Thus:
0.5 - A(1.1) = 0.5 - 0.3643 = 0.1357.
This area is shaded with green in the diagram.
Thus, about 13.6% of the students have scores of 80 or above.
Figure 3: A(z) and the area of the "tail"
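The three steps of this example can be sketched in Python as follows. The function name fraction_at_or_above and the table_step parameter are ours; table_step=0.1 imitates rounding z to the nearest table entry, under the assumption that Table 11.6 lists z in steps of 0.1, and statistics.NormalDist stands in for the table itself.

```python
from statistics import NormalDist

def A(z: float) -> float:
    """Area under the standard normal curve between 0 and z."""
    return NormalDist().cdf(z) - 0.5

def fraction_at_or_above(score, mean, sd, table_step=None):
    """Fraction of integer-valued scores that are >= score."""
    z = (score - 0.5 - mean) / sd                # step 1: use 79.5 instead of 80
    if table_step is not None:
        z = round(z / table_step) * table_step   # round z as for a table lookup
    return 0.5 - A(z)                            # steps 2 and 3: tail = 0.5 - A(z)

rounded = fraction_at_or_above(80, 73, 6, table_step=0.1)
unrounded = fraction_at_or_above(80, 73, 6)
print(f"{rounded:.4f} {unrounded:.4f}")
```

The first number should agree with the 0.1357 found above; the second, computed without rounding z to 1.1, comes out slightly larger.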
- Finding the area within a given distance of the mean.
Given our information about IQ scores, suppose we want to know
what percentage of the people have IQ scores between 75 and 125, inclusive.
Assuming that our scores just take integer values, we work with 125.5 and
74.5 when finding the corresponding z-values.
- For a score of 125.5, we obtain:
z = (125.5 - 100) / 20 = 25.5 / 20 = 1.275.
For a score of 74.5, a similar calculation gives
z = -1.275.
- Working with the positive value, we round off to z = 1.3
and look in the table to find:
A(1.3) = 0.4032.
This gives the area that's shaded with green stripes in Figure 4.
- By the symmetry of the graph, the area on the other side (with orange
stripes) also has area = 0.4032. [Note that the z-value on the
other side is exactly the negative of the one that we just looked at.]
So, the total shaded area is:
0.4032 + 0.4032 = 0.8064.
Thus, about 80.6% of the people have IQ scores between 75 and 125,
inclusive.
Figure 4: The region within a given distance
of the mean
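Here is the same calculation as a hedged Python sketch. The name fraction_between and the table_step parameter are ours; as before, table_step=0.1 assumes the table lists z in steps of 0.1, and statistics.NormalDist replaces the table lookup.

```python
from statistics import NormalDist

def A(z: float) -> float:
    """Area between 0 and z under the standard normal curve (negative if z < 0)."""
    return NormalDist().cdf(z) - 0.5

def fraction_between(low, high, mean, sd, table_step=None):
    """Fraction of integer-valued scores with low <= score <= high."""
    z_low = (low - 0.5 - mean) / sd      # e.g. 74.5 for a lower limit of 75
    z_high = (high + 0.5 - mean) / sd    # e.g. 125.5 for an upper limit of 125
    if table_step is not None:
        z_low = round(z_low / table_step) * table_step
        z_high = round(z_high / table_step) * table_step
    return A(z_high) - A(z_low)          # equals 2 A(z) when z_low = -z_high

rounded = fraction_between(75, 125, 100, 20, table_step=0.1)
unrounded = fraction_between(75, 125, 100, 20)
print(f"{rounded:.4f} {unrounded:.4f}")
```

With z rounded to 1.3 this should reproduce 0.8064; without the rounding the result is a little smaller.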
- Finding the area to the left of a given [positive]
z-value. Once again, we'll illustrate this with a specific
example. So, let's suppose that 1000 students have taken an exam, where
100 points is the maximum score, the mean is 73, and the standard deviation
is 5. We'll ask how many students had scores of 80 or lower.
- To find the z-value, we work with 80.5, since this "splits the
difference" between 80 and 81. Here is our calculation:
z = (80.5 - 73) / 5 = 7.5 / 5 = 1.5.
- As usual, we look in Table 11.6 to find A(z):
A(1.5) = 0.4332.
This gives the area that's shaded with green stripes in Figure 5,
corresponding to scores which are
above the mean but below the indicated z-value.
- To find the total shaded area, we have to add the area to the left
of the midpoint -- shaded in turquoise in Figure 5. Since it's
exactly half of the entire area under the normal curve,
this area is 0.5. Hence, the total shaded area is:
0.5 + 0.4332 = 0.9332.
We conclude that about 933 of the 1000 students had test scores of 80 or lower.
Figure 5: The area to the left of a given
positive z-value
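A short Python sketch of this example, with statistics.NormalDist standing in for Table 11.6; the function name fraction_at_or_below is ours.

```python
from statistics import NormalDist

def A(z: float) -> float:
    """Area under the standard normal curve between 0 and z."""
    return NormalDist().cdf(z) - 0.5

def fraction_at_or_below(score, mean, sd):
    """Fraction of integer-valued scores that are <= score."""
    z = (score + 0.5 - mean) / sd    # use 80.5 for "80 or lower"
    return 0.5 + A(z)                # left half of the curve plus A(z)

students = 1000
fraction = fraction_at_or_below(80, 73, 5)
print(round(fraction, 4), round(students * fraction))
```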
Part II: Determining the range of scores when a
percentage of the population is given
Each of these problems is "inverse" to the corresponding problem in
Part I. Thus, in Part I we were given a range of scores and wanted to
find the percentage of the population whose scores are in that range.
In terms of Table 11.6, this meant that we calculated z
and then looked up A(z). Here, the situation is
turned around. We're given a percentage of the population, and we
want to find what range of scores corresponds to it. So, we start
by figuring out a value for A(z) and then looking in the
table to find the value of z which corresponds most
closely to it.
- Finding the [positive] z-value such that a given percentage
of the area under the normal curve lies to the right of that value.
(This question makes sense only if the given percentage
is < 50%.)
This is inverse to the problem discussed in Part IA, so that you
can refer to the same figure (Figure 3). In Part IA we were given the
z-value, so we looked up the value of A(z) and then subtracted
it from 0.5 to get the area of the tail -- shown with the
light green [solid] shading in the figure. Here, we are given the area
of the tail, so we do the steps in the opposite order, namely:
- Subtract from 0.5 to get the area under the normal curve between
the midpoint (z = 0) and the vertical line corresponding
to the [still unknown] z-value. This number is the value of
A(z) that we have to work with.
- We now "read the table backwards" to find the sought-for
z-value. Thus, we look in the right-hand column
of Table 11.6 to find the number that's closest to the value of
A(z) that we just calculated. Our z-value is on the same
line in the left-hand column of the table. For better accuracy,
if our value of A(z) is about halfway between two entries
in the right-hand column of Table 11.6, then we can split the difference
between the two corresponding z-values.
An instance: Let's determine the z-value such that
10% of the population has scores above that value. This means that we
have to take
A(z) = 0.5 - 0.1 = 0.4.
Looking in the table, the closest value is
A(z) = 0.4032, and this corresponds to
z = 1.3.
In an applied problem, we still have to re-scale and shift.
For instance in the case of IQ scores, with mean = 100
and s.d. = 20, a z-value of 1.3
corresponds to an IQ score which is 1.3·20 = 26
points above the mean, and thus to a score of
100 + 26 = 126.
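"Reading the table backwards" can be imitated in Python with a small stand-in table. The sketch below assumes that Table 11.6 lists z = 0.0, 0.1, ..., 3.0 with A(z) rounded to four decimals; that layout, and the names TABLE, z_from_area and z_for_upper_tail, are our assumptions for illustration.

```python
from statistics import NormalDist

# Stand-in for Table 11.6 (assumed layout): z -> A(z), rounded to 4 decimals.
TABLE = {k / 10: round(NormalDist().cdf(k / 10) - 0.5, 4) for k in range(31)}

def z_from_area(area: float) -> float:
    """Return the tabulated z whose A(z) is closest to the given area."""
    return min(TABLE, key=lambda z: abs(TABLE[z] - area))

def z_for_upper_tail(tail: float) -> float:
    """z-value with the given fraction of the area to its right (tail < 0.5)."""
    return z_from_area(0.5 - tail)     # first subtract from 0.5, then look up

z = z_for_upper_tail(0.10)
score = 100 + z * 20                   # re-scale and shift to the IQ scale
print(z, TABLE[z], score)
```

For comparison, NormalDist().inv_cdf(0.90) gives the z-value without rounding to a table entry; it is about 1.28, slightly below the 1.3 read from the table.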
- Finding the [positive] z-value such that a given percentage
of the area under the normal curve lies within z
standard deviation units of the mean.
This is inverse to the problem discussed in Part IB, so that you
can refer to the same figure (Figure 4). In Part IB we were given the
z-value, so we looked up the value of A(z) and then multiplied
it by 2 to get the area under the middle part of the bell curve -- shown
as the total of the two shaded areas in the figure. Here, we are given the
area of the symmetric middle part, so that we have to reverse the order of
the steps and appropriately substitute division for multiplication.
Thus:
- Divide the given symmetric area by 2, in order to find
the value of A(z).
- Look for this value in the right-hand column of Table 11.6. Our
answer is then the z-value on the same line of the table.
For instance, suppose that we want to determine the IQ
scores that characterize the middle
2 / 3 of the population.
Then we take
A(z) = 1 / 3 ≈ 0.3333
( = 2 / 3 · 1 / 2 ).
Looking in the table, the two closest values are
0.3159 and 0.3413. These correspond to
z = 0.9 and
z = 1.0 respectively.
Accordingly, our answer is about halfway between and thus
corresponds to z = 0.95.
The actual range is between 19 points
( = 0.95·20) below the mean and 19 points
above the mean, i.e., IQ scores between 81 and 119.
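As a cross-check on this example, the sketch below finds the z-value with the exact inverse from Python's statistics.NormalDist instead of a table lookup. The function name middle_range is ours, and the mean of 100 and standard deviation of 20 are the values used in this example.

```python
from statistics import NormalDist

def middle_range(fraction: float, mean: float, sd: float):
    """Return (z, low, high) so that the given middle fraction lies in [low, high]."""
    area = fraction / 2                      # value of A(z) for one side
    z = NormalDist().inv_cdf(0.5 + area)     # z with area 0.5 + A(z) to its left
    return z, mean - z * sd, mean + z * sd

z, low, high = middle_range(2 / 3, 100, 20)
print(round(z, 2), round(low), round(high))
```

The z-value found this way is about 0.97, slightly above the 0.95 obtained by splitting the difference between table entries, and the resulting range still rounds to IQ scores between 81 and 119.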
- Finding the [positive] z-value such that a given percentage
of the area under the normal curve lies to the left of that value.
(This question makes sense only if the given percentage
is > 50%.)
Look at the figure in Part IC (Figure 5) to help with understanding this case.
We have to subtract 0.5 from our given area in order to find the
value of A(z) to work with. For instance, if we want to find
the score corresponding to the 85th percentile, then we take
A(z) = 0.85 - 0.5 = 0.35.
(Etc., by analogy with the
other cases ... )
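One possible way to finish this example is sketched below in Python. It interpolates between neighbouring entries of a stand-in table (assumed layout: z = 0.0, 0.1, ..., 3.0, with A(z) to four decimals), which refines the "split the difference" idea from the first case. Applying the result to the IQ scale (mean 100, s.d. 20) is our choice for illustration, as are the names ZS, AREAS, z_from_area and percentile_score.

```python
from statistics import NormalDist

# Stand-in for Table 11.6 (assumed layout): parallel lists of z and A(z).
ZS = [k / 10 for k in range(31)]
AREAS = [round(NormalDist().cdf(z) - 0.5, 4) for z in ZS]

def z_from_area(area: float) -> float:
    """Interpolate linearly between the two table entries that bracket the area."""
    for i in range(len(ZS) - 1):
        if AREAS[i] <= area <= AREAS[i + 1]:
            t = (area - AREAS[i]) / (AREAS[i + 1] - AREAS[i])
            return ZS[i] + t * (ZS[i + 1] - ZS[i])
    raise ValueError("area is outside the range of the table")

def percentile_score(p: float, mean: float, sd: float) -> float:
    """Score with a fraction p (> 0.5) of the population at or below it."""
    z = z_from_area(p - 0.5)      # subtract 0.5 to get A(z), then look up z
    return mean + z * sd          # re-scale and shift

print(round(z_from_area(0.35), 2), round(percentile_score(0.85, 100, 20), 1))
```

Here A(z) = 0.35 falls between the entries for z = 1.0 and z = 1.1, so the z-value comes out a little above 1.0.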