- Home
- >
- Normal Probability Plot – Explanation & Examples
JUMP TO TOPIC
Normal Probability Plot – Explanation & Examples
The definition of the normal probability plot is:
“The normal probability plot is a plot used to assess the normal distribution of numerical data.”
In this topic, we will discuss the normal probability plot from the following aspects:
- What is a normal probability plot?
- How to make a normal probability plot?
- How to read a normal probability plot?
- Practice questions.
- Answer key.
1. What is a normal probability plot?
The normal probability plot is a plot used to assess the normal distribution of any numerical data.
Making a histogram of your data can help you decide whether or not a set of data is normal, but there is a more specialized type of plot you can create, called a normal probability plot.
If the data follow a normal distribution then a normal probability plot of the theoretical percentiles of the normal distribution on the x-axis versus the observed sample percentiles on the y-axis should be approximately linear.
The theoretical p% percentile of a normal distribution is the value such that p% of the values are lower than that value.
The sample p% percentile of any numerical data is the value such that p% of the measurements fall below that value.
For example, the 50% percentile or the median is the value so that 50% or half of your measurements fall below that value.
Another example, the 27% percentile is the value so that 27% of the data points in your numerical data fall below that value.
2. How to make a normal probability plot?
We will go through several examples.
– Example 1
The following are the weights (in kg) of 100 persons from a certain survey.
52.44 52.77 54.56 53.07 53.13 54.72 53.46 51.73 52.31 52.55 54.22 53.36 53.40 53.11 52.44 54.79 53.50 51.03 53.70 52.53 51.93 52.78 51.97 52.27 52.37 51.31 53.84 53.15 51.86 54.25 53.43 52.70 53.90 53.88 53.82 53.69 53.55 52.94 52.69 52.62 52.31 52.79 51.73 55.17 54.21 51.88 52.60 52.53 53.78 52.92 53.25 52.97 52.96 54.37 52.77 54.52 51.45 53.58 53.12 53.22 53.38 52.50 52.67 51.98 51.93 53.30 53.45 53.05 53.92 55.05 52.51 50.69 54.01 52.29 52.31 54.03 52.72 51.78 53.18 52.86 53.01 53.39 52.63 53.64 52.78 53.33 54.10 53.44 52.67 54.15 53.99 53.55 53.24 52.37 54.36 52.40 55.19 54.53 52.76 51.97.
Draw a normal probability plot of this data.
1. Order the numbers from smallest to largest number.
50.69 51.03 51.31 51.45 51.73 51.73 51.78 51.86 51.88 51.93 51.93 51.97 51.97 51.98 52.27 52.29 52.31 52.31 52.31 52.37 52.37 52.40 52.44 52.44 52.50 52.51 52.53 52.53 52.55 52.60 52.62 52.63 52.67 52.67 52.69 52.70 52.72 52.76 52.77 52.77 52.78 52.78 52.79 52.86 52.92 52.94 52.96 52.97 53.01 53.05 53.07 53.11 53.12 53.13 53.15 53.18 53.22 53.24 53.25 53.30 53.33 53.36 53.38 53.39 53.40 53.43 53.44 53.45 53.46 53.50 53.55 53.55 53.58 53.64 53.69 53.70 53.78 53.82 53.84 53.88 53.90 53.92 53.99 54.01 54.03 54.10 54.15 54.21 54.22 54.25 54.36 54.37 54.52 54.53 54.56 54.72 54.79 55.05 55.17 55.19.
2. Assign a rank to each value of your data.
weight | rank |
50.69 | 1 |
51.03 | 2 |
51.31 | 3 |
51.45 | 4 |
51.73 | 5 |
51.73 | 6 |
51.78 | 7 |
51.86 | 8 |
51.88 | 9 |
51.93 | 10 |
51.93 | 11 |
51.97 | 12 |
51.97 | 13 |
51.98 | 14 |
52.27 | 15 |
52.29 | 16 |
52.31 | 17 |
52.31 | 18 |
52.31 | 19 |
52.37 | 20 |
52.37 | 21 |
52.40 | 22 |
52.44 | 23 |
52.44 | 24 |
52.50 | 25 |
52.51 | 26 |
52.53 | 27 |
52.53 | 28 |
52.55 | 29 |
52.60 | 30 |
52.62 | 31 |
52.63 | 32 |
52.67 | 33 |
52.67 | 34 |
52.69 | 35 |
52.70 | 36 |
52.72 | 37 |
52.76 | 38 |
52.77 | 39 |
52.77 | 40 |
52.78 | 41 |
52.78 | 42 |
52.79 | 43 |
52.86 | 44 |
52.92 | 45 |
52.94 | 46 |
52.96 | 47 |
52.97 | 48 |
53.01 | 49 |
53.05 | 50 |
53.07 | 51 |
53.11 | 52 |
53.12 | 53 |
53.13 | 54 |
53.15 | 55 |
53.18 | 56 |
53.22 | 57 |
53.24 | 58 |
53.25 | 59 |
53.30 | 60 |
53.33 | 61 |
53.36 | 62 |
53.38 | 63 |
53.39 | 64 |
53.40 | 65 |
53.43 | 66 |
53.44 | 67 |
53.45 | 68 |
53.46 | 69 |
53.50 | 70 |
53.55 | 71 |
53.55 | 72 |
53.58 | 73 |
53.64 | 74 |
53.69 | 75 |
53.70 | 76 |
53.78 | 77 |
53.82 | 78 |
53.84 | 79 |
53.88 | 80 |
53.90 | 81 |
53.92 | 82 |
53.99 | 83 |
54.01 | 84 |
54.03 | 85 |
54.10 | 86 |
54.15 | 87 |
54.21 | 88 |
54.22 | 89 |
54.25 | 90 |
54.36 | 91 |
54.37 | 92 |
54.52 | 93 |
54.53 | 94 |
54.56 | 95 |
54.72 | 96 |
54.79 | 97 |
55.05 | 98 |
55.17 | 99 |
55.19 | 100 |
Note that repeated values or ties are ranked sequentially as usual.
The first (smallest) value is 50.69 so its rank is 1, the next value is 51.03 so its rank is 2.
The last (largest) value is 55.19 so its rank is 100.
3. Calculate the cumulative probability (pi) associated with each rank (i) using the following formula:
pi=(i-a)/(n+1-2a)
Where:
i = 1,2,3,…..n. n is the number of data points.
a = 3/8 for n ≤ 10, and = 0.5 for n > 10.
Since the number of data points = 100 which is larger than 10, so the formula reduces to:
pi=(i-0.5)/n
The following table will be produced:
weight | rank | pi |
50.69 | 1 | 0.005 |
51.03 | 2 | 0.015 |
51.31 | 3 | 0.025 |
51.45 | 4 | 0.035 |
51.73 | 5 | 0.045 |
51.73 | 6 | 0.055 |
51.78 | 7 | 0.065 |
51.86 | 8 | 0.075 |
51.88 | 9 | 0.085 |
51.93 | 10 | 0.095 |
51.93 | 11 | 0.105 |
51.97 | 12 | 0.115 |
51.97 | 13 | 0.125 |
51.98 | 14 | 0.135 |
52.27 | 15 | 0.145 |
52.29 | 16 | 0.155 |
52.31 | 17 | 0.165 |
52.31 | 18 | 0.175 |
52.31 | 19 | 0.185 |
52.37 | 20 | 0.195 |
52.37 | 21 | 0.205 |
52.40 | 22 | 0.215 |
52.44 | 23 | 0.225 |
52.44 | 24 | 0.235 |
52.50 | 25 | 0.245 |
52.51 | 26 | 0.255 |
52.53 | 27 | 0.265 |
52.53 | 28 | 0.275 |
52.55 | 29 | 0.285 |
52.60 | 30 | 0.295 |
52.62 | 31 | 0.305 |
52.63 | 32 | 0.315 |
52.67 | 33 | 0.325 |
52.67 | 34 | 0.335 |
52.69 | 35 | 0.345 |
52.70 | 36 | 0.355 |
52.72 | 37 | 0.365 |
52.76 | 38 | 0.375 |
52.77 | 39 | 0.385 |
52.77 | 40 | 0.395 |
52.78 | 41 | 0.405 |
52.78 | 42 | 0.415 |
52.79 | 43 | 0.425 |
52.86 | 44 | 0.435 |
52.92 | 45 | 0.445 |
52.94 | 46 | 0.455 |
52.96 | 47 | 0.465 |
52.97 | 48 | 0.475 |
53.01 | 49 | 0.485 |
53.05 | 50 | 0.495 |
53.07 | 51 | 0.505 |
53.11 | 52 | 0.515 |
53.12 | 53 | 0.525 |
53.13 | 54 | 0.535 |
53.15 | 55 | 0.545 |
53.18 | 56 | 0.555 |
53.22 | 57 | 0.565 |
53.24 | 58 | 0.575 |
53.25 | 59 | 0.585 |
53.30 | 60 | 0.595 |
53.33 | 61 | 0.605 |
53.36 | 62 | 0.615 |
53.38 | 63 | 0.625 |
53.39 | 64 | 0.635 |
53.40 | 65 | 0.645 |
53.43 | 66 | 0.655 |
53.44 | 67 | 0.665 |
53.45 | 68 | 0.675 |
53.46 | 69 | 0.685 |
53.50 | 70 | 0.695 |
53.55 | 71 | 0.705 |
53.55 | 72 | 0.715 |
53.58 | 73 | 0.725 |
53.64 | 74 | 0.735 |
53.69 | 75 | 0.745 |
53.70 | 76 | 0.755 |
53.78 | 77 | 0.765 |
53.82 | 78 | 0.775 |
53.84 | 79 | 0.785 |
53.88 | 80 | 0.795 |
53.90 | 81 | 0.805 |
53.92 | 82 | 0.815 |
53.99 | 83 | 0.825 |
54.01 | 84 | 0.835 |
54.03 | 85 | 0.845 |
54.10 | 86 | 0.855 |
54.15 | 87 | 0.865 |
54.21 | 88 | 0.875 |
54.22 | 89 | 0.885 |
54.25 | 90 | 0.895 |
54.36 | 91 | 0.905 |
54.37 | 92 | 0.915 |
54.52 | 93 | 0.925 |
54.53 | 94 | 0.935 |
54.56 | 95 | 0.945 |
54.72 | 96 | 0.955 |
54.79 | 97 | 0.965 |
55.05 | 98 | 0.975 |
55.17 | 99 | 0.985 |
55.19 | 100 | 0.995 |
4. Calculate the Z-score for each pi value (zi). The function qnorm of the R programming language finds the Z-score that is associated with each pi or probability.
For example, when pi = 0.5, the Z-score = 0.
qnorm(0.5)
## [1] 0
This is because the Z-score is for a normal distribution with mean = 0 and standard deviation = 1.
We know from the normal distribution properties that when the data value equals the mean or 0, the probability of data points < 0 = the probability of data points > 0 = 0.5.
As a result, the Z-score values are negative for every data point that has an associated p less than 0.5 and positive for those that have a p greater than 0.5.
The following table will be produced.
weight | rank | pi | zi |
50.69 | 1 | 0.005 | -2.58 |
51.03 | 2 | 0.015 | -2.17 |
51.31 | 3 | 0.025 | -1.96 |
51.45 | 4 | 0.035 | -1.81 |
51.73 | 5 | 0.045 | -1.70 |
51.73 | 6 | 0.055 | -1.60 |
51.78 | 7 | 0.065 | -1.51 |
51.86 | 8 | 0.075 | -1.44 |
51.88 | 9 | 0.085 | -1.37 |
51.93 | 10 | 0.095 | -1.31 |
51.93 | 11 | 0.105 | -1.25 |
51.97 | 12 | 0.115 | -1.20 |
51.97 | 13 | 0.125 | -1.15 |
51.98 | 14 | 0.135 | -1.10 |
52.27 | 15 | 0.145 | -1.06 |
52.29 | 16 | 0.155 | -1.02 |
52.31 | 17 | 0.165 | -0.97 |
52.31 | 18 | 0.175 | -0.93 |
52.31 | 19 | 0.185 | -0.90 |
52.37 | 20 | 0.195 | -0.86 |
52.37 | 21 | 0.205 | -0.82 |
52.40 | 22 | 0.215 | -0.79 |
52.44 | 23 | 0.225 | -0.76 |
52.44 | 24 | 0.235 | -0.72 |
52.50 | 25 | 0.245 | -0.69 |
52.51 | 26 | 0.255 | -0.66 |
52.53 | 27 | 0.265 | -0.63 |
52.53 | 28 | 0.275 | -0.60 |
52.55 | 29 | 0.285 | -0.57 |
52.60 | 30 | 0.295 | -0.54 |
52.62 | 31 | 0.305 | -0.51 |
52.63 | 32 | 0.315 | -0.48 |
52.67 | 33 | 0.325 | -0.45 |
52.67 | 34 | 0.335 | -0.43 |
52.69 | 35 | 0.345 | -0.40 |
52.70 | 36 | 0.355 | -0.37 |
52.72 | 37 | 0.365 | -0.35 |
52.76 | 38 | 0.375 | -0.32 |
52.77 | 39 | 0.385 | -0.29 |
52.77 | 40 | 0.395 | -0.27 |
52.78 | 41 | 0.405 | -0.24 |
52.78 | 42 | 0.415 | -0.21 |
52.79 | 43 | 0.425 | -0.19 |
52.86 | 44 | 0.435 | -0.16 |
52.92 | 45 | 0.445 | -0.14 |
52.94 | 46 | 0.455 | -0.11 |
52.96 | 47 | 0.465 | -0.09 |
52.97 | 48 | 0.475 | -0.06 |
53.01 | 49 | 0.485 | -0.04 |
53.05 | 50 | 0.495 | -0.01 |
53.07 | 51 | 0.505 | 0.01 |
53.11 | 52 | 0.515 | 0.04 |
53.12 | 53 | 0.525 | 0.06 |
53.13 | 54 | 0.535 | 0.09 |
53.15 | 55 | 0.545 | 0.11 |
53.18 | 56 | 0.555 | 0.14 |
53.22 | 57 | 0.565 | 0.16 |
53.24 | 58 | 0.575 | 0.19 |
53.25 | 59 | 0.585 | 0.21 |
53.30 | 60 | 0.595 | 0.24 |
53.33 | 61 | 0.605 | 0.27 |
53.36 | 62 | 0.615 | 0.29 |
53.38 | 63 | 0.625 | 0.32 |
53.39 | 64 | 0.635 | 0.35 |
53.40 | 65 | 0.645 | 0.37 |
53.43 | 66 | 0.655 | 0.40 |
53.44 | 67 | 0.665 | 0.43 |
53.45 | 68 | 0.675 | 0.45 |
53.46 | 69 | 0.685 | 0.48 |
53.50 | 70 | 0.695 | 0.51 |
53.55 | 71 | 0.705 | 0.54 |
53.55 | 72 | 0.715 | 0.57 |
53.58 | 73 | 0.725 | 0.60 |
53.64 | 74 | 0.735 | 0.63 |
53.69 | 75 | 0.745 | 0.66 |
53.70 | 76 | 0.755 | 0.69 |
53.78 | 77 | 0.765 | 0.72 |
53.82 | 78 | 0.775 | 0.76 |
53.84 | 79 | 0.785 | 0.79 |
53.88 | 80 | 0.795 | 0.82 |
53.90 | 81 | 0.805 | 0.86 |
53.92 | 82 | 0.815 | 0.90 |
53.99 | 83 | 0.825 | 0.93 |
54.01 | 84 | 0.835 | 0.97 |
54.03 | 85 | 0.845 | 1.02 |
54.10 | 86 | 0.855 | 1.06 |
54.15 | 87 | 0.865 | 1.10 |
54.21 | 88 | 0.875 | 1.15 |
54.22 | 89 | 0.885 | 1.20 |
54.25 | 90 | 0.895 | 1.25 |
54.36 | 91 | 0.905 | 1.31 |
54.37 | 92 | 0.915 | 1.37 |
54.52 | 93 | 0.925 | 1.44 |
54.53 | 94 | 0.935 | 1.51 |
54.56 | 95 | 0.945 | 1.60 |
54.72 | 96 | 0.955 | 1.70 |
54.79 | 97 | 0.965 | 1.81 |
55.05 | 98 | 0.975 | 1.96 |
55.17 | 99 | 0.985 | 2.17 |
55.19 | 100 | 0.995 | 2.58 |
5. Create an x-y scatter plot of your z-score values on the x-axis versus their corresponding data points on the y-axis.
6. If the weight data are consistent with the normal percentiles from a normal distribution, the points should lie close to a straight line.
As a reference, a straight line can be added to the plot which passes through the first and third quartiles.
From the table, we see that the first quartile (at pi = 0.25) was about 52.50 kg and zi = -0.69 and third quartile (at pi = 0.75) was 53.69 kg and zi = 0.66.
The further the points vary from this line, the greater the indication of departure from normality.
Nearly all the data are on the straight line, so it is normally distributed data.
– Example 2
The following is the ankle diameter in centimeters, measured as the sum of two ankles for 60 physically active individuals from a certain survey.
14.1 15.1 14.1 15.0 14.9 13.9 15.6 14.6 13.2 15.0 14.5 16.0 15.4 13.2 14.0 14.0 16.0 14.7 14.8 15.5 13.9 14.4 13.8 14.1 14.7 14.9 15.3 14.5 13.2 13.2 15.8 14.0 15.1 15.0 12.9 14.0 13.0 14.0 15.4 16.4 15.2 13.8 14.9 16.0 16.0 16.3 15.3 16.5 14.4 13.4 14.4 14.2 15.4 15.0 13.0 13.0 14.8 16.2 15.4 14.4.
Reference:
Heinz G, Peterson LJ, Johnson RW, Kerk CJ. 2003. Exploring Relationships in Body Dimensions. Journal of Statistics Education 11(2).
Draw a normal probability plot of this data.
1. Order the numbers from smallest to largest number.
12.9 13.0 13.0 13.0 13.2 13.2 13.2 13.2 13.4 13.8 13.8 13.9 13.9 14.0 14.0 14.0 14.0 14.0 14.1 14.1 14.1 14.2 14.4 14.4 14.4 14.4 14.5 14.5 14.6 14.7 14.7 14.8 14.8 14.9 14.9 14.9 15.0 15.0 15.0 15.0 15.1 15.1 15.2 15.3 15.3 15.4 15.4 15.4 15.4 15.5 15.6 15.8 16.0 16.0 16.0 16.0 16.2 16.3 16.4 16.5.
2. Assign a rank to each value of your data.
diameter | rank |
12.9 | 1 |
13.0 | 2 |
13.0 | 3 |
13.0 | 4 |
13.2 | 5 |
13.2 | 6 |
13.2 | 7 |
13.2 | 8 |
13.4 | 9 |
13.8 | 10 |
13.8 | 11 |
13.9 | 12 |
13.9 | 13 |
14.0 | 14 |
14.0 | 15 |
14.0 | 16 |
14.0 | 17 |
14.0 | 18 |
14.1 | 19 |
14.1 | 20 |
14.1 | 21 |
14.2 | 22 |
14.4 | 23 |
14.4 | 24 |
14.4 | 25 |
14.4 | 26 |
14.5 | 27 |
14.5 | 28 |
14.6 | 29 |
14.7 | 30 |
14.7 | 31 |
14.8 | 32 |
14.8 | 33 |
14.9 | 34 |
14.9 | 35 |
14.9 | 36 |
15.0 | 37 |
15.0 | 38 |
15.0 | 39 |
15.0 | 40 |
15.1 | 41 |
15.1 | 42 |
15.2 | 43 |
15.3 | 44 |
15.3 | 45 |
15.4 | 46 |
15.4 | 47 |
15.4 | 48 |
15.4 | 49 |
15.5 | 50 |
15.6 | 51 |
15.8 | 52 |
16.0 | 53 |
16.0 | 54 |
16.0 | 55 |
16.0 | 56 |
16.2 | 57 |
16.3 | 58 |
16.4 | 59 |
16.5 | 60 |
Note that repeated values or ties are ranked sequentially as usual.
The first (smallest) value is 12.9 cm so its rank is 1, the next value is 13.0 cm so its rank is 2.
The last (largest) value is 16.5 so its rank is 60.
3. Calculate the cumulative probability (pi) associated with each rank (I).
Since the number of data points = 60 which is larger than 10, so the formula reduces to:
pi=(i-0.5)/n
The following table will be produced:
diameter | rank | pi |
12.9 | 1 | 0.008 |
13.0 | 2 | 0.025 |
13.0 | 3 | 0.042 |
13.0 | 4 | 0.058 |
13.2 | 5 | 0.075 |
13.2 | 6 | 0.092 |
13.2 | 7 | 0.108 |
13.2 | 8 | 0.125 |
13.4 | 9 | 0.142 |
13.8 | 10 | 0.158 |
13.8 | 11 | 0.175 |
13.9 | 12 | 0.192 |
13.9 | 13 | 0.208 |
14.0 | 14 | 0.225 |
14.0 | 15 | 0.242 |
14.0 | 16 | 0.258 |
14.0 | 17 | 0.275 |
14.0 | 18 | 0.292 |
14.1 | 19 | 0.308 |
14.1 | 20 | 0.325 |
14.1 | 21 | 0.342 |
14.2 | 22 | 0.358 |
14.4 | 23 | 0.375 |
14.4 | 24 | 0.392 |
14.4 | 25 | 0.408 |
14.4 | 26 | 0.425 |
14.5 | 27 | 0.442 |
14.5 | 28 | 0.458 |
14.6 | 29 | 0.475 |
14.7 | 30 | 0.492 |
14.7 | 31 | 0.508 |
14.8 | 32 | 0.525 |
14.8 | 33 | 0.542 |
14.9 | 34 | 0.558 |
14.9 | 35 | 0.575 |
14.9 | 36 | 0.592 |
15.0 | 37 | 0.608 |
15.0 | 38 | 0.625 |
15.0 | 39 | 0.642 |
15.0 | 40 | 0.658 |
15.1 | 41 | 0.675 |
15.1 | 42 | 0.692 |
15.2 | 43 | 0.708 |
15.3 | 44 | 0.725 |
15.3 | 45 | 0.742 |
15.4 | 46 | 0.758 |
15.4 | 47 | 0.775 |
15.4 | 48 | 0.792 |
15.4 | 49 | 0.808 |
15.5 | 50 | 0.825 |
15.6 | 51 | 0.842 |
15.8 | 52 | 0.858 |
16.0 | 53 | 0.875 |
16.0 | 54 | 0.892 |
16.0 | 55 | 0.908 |
16.0 | 56 | 0.925 |
16.2 | 57 | 0.942 |
16.3 | 58 | 0.958 |
16.4 | 59 | 0.975 |
16.5 | 60 | 0.992 |
4. Calculate the Z-score for each pi value using the function qnorm of the R programming language.
diameter | rank | pi | zi |
12.9 | 1 | 0.008 | -2.41 |
13.0 | 2 | 0.025 | -1.96 |
13.0 | 3 | 0.042 | -1.73 |
13.0 | 4 | 0.058 | -1.57 |
13.2 | 5 | 0.075 | -1.44 |
13.2 | 6 | 0.092 | -1.33 |
13.2 | 7 | 0.108 | -1.24 |
13.2 | 8 | 0.125 | -1.15 |
13.4 | 9 | 0.142 | -1.07 |
13.8 | 10 | 0.158 | -1.00 |
13.8 | 11 | 0.175 | -0.93 |
13.9 | 12 | 0.192 | -0.87 |
13.9 | 13 | 0.208 | -0.81 |
14.0 | 14 | 0.225 | -0.76 |
14.0 | 15 | 0.242 | -0.70 |
14.0 | 16 | 0.258 | -0.65 |
14.0 | 17 | 0.275 | -0.60 |
14.0 | 18 | 0.292 | -0.55 |
14.1 | 19 | 0.308 | -0.50 |
14.1 | 20 | 0.325 | -0.45 |
14.1 | 21 | 0.342 | -0.41 |
14.2 | 22 | 0.358 | -0.36 |
14.4 | 23 | 0.375 | -0.32 |
14.4 | 24 | 0.392 | -0.27 |
14.4 | 25 | 0.408 | -0.23 |
14.4 | 26 | 0.425 | -0.19 |
14.5 | 27 | 0.442 | -0.15 |
14.5 | 28 | 0.458 | -0.11 |
14.6 | 29 | 0.475 | -0.06 |
14.7 | 30 | 0.492 | -0.02 |
14.7 | 31 | 0.508 | 0.02 |
14.8 | 32 | 0.525 | 0.06 |
14.8 | 33 | 0.542 | 0.11 |
14.9 | 34 | 0.558 | 0.15 |
14.9 | 35 | 0.575 | 0.19 |
14.9 | 36 | 0.592 | 0.23 |
15.0 | 37 | 0.608 | 0.27 |
15.0 | 38 | 0.625 | 0.32 |
15.0 | 39 | 0.642 | 0.36 |
15.0 | 40 | 0.658 | 0.41 |
15.1 | 41 | 0.675 | 0.45 |
15.1 | 42 | 0.692 | 0.50 |
15.2 | 43 | 0.708 | 0.55 |
15.3 | 44 | 0.725 | 0.60 |
15.3 | 45 | 0.742 | 0.65 |
15.4 | 46 | 0.758 | 0.70 |
15.4 | 47 | 0.775 | 0.76 |
15.4 | 48 | 0.792 | 0.81 |
15.4 | 49 | 0.808 | 0.87 |
15.5 | 50 | 0.825 | 0.93 |
15.6 | 51 | 0.842 | 1.00 |
15.8 | 52 | 0.858 | 1.07 |
16.0 | 53 | 0.875 | 1.15 |
16.0 | 54 | 0.892 | 1.24 |
16.0 | 55 | 0.908 | 1.33 |
16.0 | 56 | 0.925 | 1.44 |
16.2 | 57 | 0.942 | 1.57 |
16.3 | 58 | 0.958 | 1.73 |
16.4 | 59 | 0.975 | 1.96 |
16.5 | 60 | 0.992 | 2.41 |
5. Create an x-y scatter plot of your z-score values on the x-axis versus their corresponding data points on the y-axis.
6. If the diameter data are consistent with the normal percentiles from a normal distribution, the points should lie close to a straight line.
As a reference, a straight line is plotted which passes through the first and third quartiles.
From the table, we see that the first quartile (at pi = 0.25) was about 14.0 cm and zi = -0.65 and third quartile (at pi = 0.75) was 15.4 cm and zi = 0.70.
Nearly all the data are on the straight line, so it is normally distributed data.
3. How to read a normal probability plot?
The shape of a normal probability plot can tell you the distribution of your data.
– Example 1: normally-distributed variable
The following plot is the histogram and normal probability plot for heights in cm of 100 individuals.
When the data is normally distributed, the histogram is nearly symmetric, unimodal, and bell-shaped.
The normal probability plot of normally distributed data will show nearly all the points on the reference straight line, at least when the few large and small values are ignored.
– Example 2: normally-distributed variable with one outlier
The following plot is the histogram and normal probability plot for heights in cm of 100 individuals.
The histogram of the data will be the same except for a faraway bin for the outlier.
The normal probability plot will show that nearly all the points are near the straight line except the far away outlier point.
– Example 3: Right-skewed variable
The following plot is the histogram and normal probability plot for the Annual income of 100 individuals.
The histogram of right-skewed data looks unimodal with less frequent large values.
The normal probability plot of right-skewed data has an inverted C shape.
– Example 4: Left-skewed variable
The following plot is the histogram and normal probability plot for the Physical ability Lawyers’ ratings of state judges in the US Superior Court.
The histogram of left-skewed data looks unimodal with less frequent small values.
The normal probability plot of left-skewed data has a nearly C shape.
4. Practice questions
1. The following is the age in years for 20 participants from a certain survey.
26 48 67 39 25 25 36 44 44 47 53 52 52 51 52 40 77 44 40 45.
Draw a normal probability plot of this data.
2. The following normal probability plots for the weights (in kg) of males and females from a certain survey.
Which sex has a normally distributed weight?
3. The following normal probability plots for the total cholesterol (in mg/dl) of different smoking statuses from a certain survey.
Which smoking status has a normally distributed total cholesterol level?
4. The following normal probability plots for the annual income (in USD) of different employment statuses from a certain survey.
Which employment status has a normally distributed annual income?
5. The following normal probability plots for the air pressure (in millibars) of different storm classes (status).
Which storm class has a normally distributed pressure?
5. Answer key
1. Order the numbers from smallest to largest number.
25 25 26 36 39 40 40 44 44 44 45 47 48 51 52 52 52 53 67 77.
- Assign a rank to each value of your data.
Age | rank |
25 | 1 |
25 | 2 |
26 | 3 |
36 | 4 |
39 | 5 |
40 | 6 |
40 | 7 |
44 | 8 |
44 | 9 |
44 | 10 |
45 | 11 |
47 | 12 |
48 | 13 |
51 | 14 |
52 | 15 |
52 | 16 |
52 | 17 |
53 | 18 |
67 | 19 |
77 | 20 |
- Calculate the cumulative probability (pi) associated with each rank (I).
Since the number of data points = 20 which is larger than 10, so the formula reduces to:
pi=(i-0.5)/n
The following table will be produced:
Age | rank | pi |
25 | 1 | 0.025 |
25 | 2 | 0.075 |
26 | 3 | 0.125 |
36 | 4 | 0.175 |
39 | 5 | 0.225 |
40 | 6 | 0.275 |
40 | 7 | 0.325 |
44 | 8 | 0.375 |
44 | 9 | 0.425 |
44 | 10 | 0.475 |
45 | 11 | 0.525 |
47 | 12 | 0.575 |
48 | 13 | 0.625 |
51 | 14 | 0.675 |
52 | 15 | 0.725 |
52 | 16 | 0.775 |
52 | 17 | 0.825 |
53 | 18 | 0.875 |
67 | 19 | 0.925 |
77 | 20 | 0.975 |
- Calculate the Z-score for each pi value.
Age | rank | pi | zi |
25 | 1 | 0.025 | -1.96 |
25 | 2 | 0.075 | -1.44 |
26 | 3 | 0.125 | -1.15 |
36 | 4 | 0.175 | -0.93 |
39 | 5 | 0.225 | -0.76 |
40 | 6 | 0.275 | -0.60 |
40 | 7 | 0.325 | -0.45 |
44 | 8 | 0.375 | -0.32 |
44 | 9 | 0.425 | -0.19 |
44 | 10 | 0.475 | -0.06 |
45 | 11 | 0.525 | 0.06 |
47 | 12 | 0.575 | 0.19 |
48 | 13 | 0.625 | 0.32 |
51 | 14 | 0.675 | 0.45 |
52 | 15 | 0.725 | 0.60 |
52 | 16 | 0.775 | 0.76 |
52 | 17 | 0.825 | 0.93 |
53 | 18 | 0.875 | 1.15 |
67 | 19 | 0.925 | 1.44 |
77 | 20 | 0.975 | 1.96 |
- Create an x-y scatter plot of your z-score values on the x-axis versus their corresponding data points on the y-axis.
- As a reference, a straight line can be added to the plot which passes through the first and third quartiles.
Nearly all the points on the straight line except small and large values, so it is nearly normally distributed data.
2. Males have nearly normally distributed weights as nearly all the points are along the straight line.
In females, the normal probability plot shows an inverted C shape which means that the female weights are right-skewed.
3. All the smoking statuses have nearly normally distributed total cholesterol levels as nearly all the points are along the straight line, except for small and large values.
4. “not in labor force” and “unemployed” statuses have nearly normally distributed annual income as nearly all the points are along the straight line, except for large values.
“employed” status has right-skewed annual income as the normal probability plot takes an inverted C-shape.
5. Tropical depression storms have nearly normally distributed pressure as nearly all the points are along the straight line, except for large and small values.
Hurricane and tropical storms have left-skewed pressure values as the normal probability plot takes a C-shape.