The Normal Distribution (2024)

What does the protein content in cows' milk have in common with human IQ?

Both variables have approximately normal distributions. The normal distribution is a good model for measurements of many kinds, including IQs, heights, and lengths of pregnancies.

The distribution of the protein content in cows' milk has the classic bell shape of the normal distribution. Most observations are near the mean (3.4 grams) but a few are much larger or smaller.

The normal distribution is widely used in probability theory and underlies much of statistical inference.


The normal distribution is also called the "Gaussian distribution" or "bell curve". A normal distribution has two parameters, the mean $\mu$ and the variance $\sigma^2$. The mean can be any real number and the variance can be any positive number.


NOTATION: "\(X\sim N(\mu, \sigma^2)\)" indicates that the random variable X is normally distributed with mean $\mu$ and variance $\sigma^2$.


$f(x)=\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, $-\infty \lt x \lt \infty$
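As a quick numerical check of the density, the sketch below evaluates the standard form of the normal pdf, with normalizing constant $1/(\sigma\sqrt{2\pi})$, and compares it with Python's built-in `statistics.NormalDist`. The function name `normal_pdf` and the test points are illustrative choices, not part of the original text.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x; sigma is the standard deviation (> 0)."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Compare the hand-written formula with the standard library
# for a normal distribution with mean 3.4 and standard deviation 0.3.
for x in (3.0, 3.4, 4.1):
    print(x, normal_pdf(x, 3.4, 0.3), NormalDist(3.4, 0.3).pdf(x))
```

The two columns should agree to floating-point precision. Note that `NormalDist` takes the standard deviation $\sigma$, not the variance $\sigma^2$.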



Note: \(\sigma=\sqrt{\sigma^2}\)


The Standard Normal Distribution

The normal distribution that has mean 0 and variance 1 is called the 'standard normal' distribution. A random variable that has a standard normal distribution is usually denoted by $Z$. That is, $Z\sim N(0,1)$. Moreover, we use $\phi(z)$ and $\Phi(z)$ to denote, respectively, the probability density function and the cumulative distribution function of a standard normal random variable.

NOTATION: The probability density function of a standard normal distribution is denoted by $\phi(z)$ and the cumulative distribution function, $P(Z\leq z)$, by $\Phi(z)$.


The Empirical Rule

For a normal distribution, the area under the curve within a given number of standard deviations (SDs) of the mean is the same regardless of the value of the mean and the standard deviation. In particular, about 68% of the area is within 1 standard deviation of the mean, about 95% is within 2 standard deviations of the mean, and about 99.7% is within 3 standard deviations of the mean.

  • $P(\mu - \sigma \lt X \lt \mu+\sigma)\approx 0.68$.
  • $P(\mu - 2\sigma \lt X \lt \mu+2\sigma)\approx 0.95$.
  • $P(\mu - 3\sigma \lt X \lt \mu+3\sigma)\approx 0.997$.
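The claim that these areas do not depend on the mean or standard deviation can be checked numerically. The following is a minimal sketch using `statistics.NormalDist`; the helper name `area_within` and the particular $(\mu, \sigma)$ pairs are illustrative assumptions.

```python
from statistics import NormalDist

def area_within(k, mu, sigma):
    """P(mu - k*sigma < X < mu + k*sigma) for X ~ N(mu, sigma^2)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Try several means and standard deviations; the areas should not change.
for mu, sigma in [(0, 1), (3.4, 0.3), (69, 4)]:
    areas = [round(area_within(k, mu, sigma), 4) for k in (1, 2, 3)]
    print(mu, sigma, areas)
```

Each row should show roughly 0.6827, 0.9545, and 0.9973, the more precise values behind the 68%, 95%, and 99.7% figures.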




Standard Units

Determining how many standard deviations a value is from the mean is called standardizing the value or converting it to standard units.

Standard units indicate how many standard deviations a value is from the mean.

To standardize a value means to indicate how many standard deviations the value is from the mean.

Suppose the mean height of students in a particular statistics class is 69 inches with a standard deviation of 4 inches. How many standard deviations from the mean is a height of 73 inches? 63 inches?

73 inches is 4 inches, or 1 standard deviation, above the mean.

63 inches is 6 inches, or 1.5 standard deviations, below the mean.

When we standardize a normally distributed random variable, the resulting random variable has a standard normal distribution.

If $X$ is a random variable such that $X\sim N(\mu,\sigma^2)$, $$Z=\frac{X-\mu}{\sigma}\sim N(0,1)$$


We can also standardize a particular value, that is, find how many standard deviations the value is from the mean.

If $X\sim N(\mu, \sigma^2)$, to find how many standard deviations a value, $x$, is from the mean (i.e. to standardize $x$), subtract $\mu$ from $x$ and divide the result by the standard deviation: $z=\frac{x-\mu}{\sigma}$. The result is typically denoted by 'z' and is often referred to as a z-score.

A z-score indicates how many standard deviations a particular value is from the mean; it is the value expressed in standard units.



$z=\frac{x-\mu}{\sigma}$.


The units of the standard normal curve are standard units. That is, if $Z\sim N(0,1)$, the value of Z that is 1 sd above the mean is 1, the value that is 2 sd's below the mean is -2, etc.



The mean protein content in the milk of a group of cows in the weeks after calving is 3.4g with a standard deviation of 0.3g.

  1. How many sd's from the mean is a value of 4.1g?
  2. Express 3.2g in standard units.
  3. What value is 2.5 sd's above the mean?

  1. How many sd's from the mean is a value of 4.1g?
    $z=\frac{4.1-3.4}{.3} = 2.333$.
    4.1g is 2.333 sd's above the mean.

  2. Express 3.2g in standard units.
    $z=\frac{3.2-3.4}{.3} = -\frac{2}{3}$.
    3.2g is $-\frac{2}{3}$ in standard units, that is, $\frac{2}{3}$ of an sd below the mean.

  3. What value is 2.5 sd's above the mean?
    2.5 sd's is 2.5×0.3 = 0.75g.
    0.75g above the mean is 3.4+0.75 = 4.15g.
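The three calculations above can be sketched in a few lines of Python. The helper names `to_standard_units` and `from_standard_units` are chosen here for illustration; a positive result means the value is above the mean and a negative result means it is below.

```python
def to_standard_units(x, mu, sigma):
    """z-score: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma

def from_standard_units(z, mu, sigma):
    """Value that lies z standard deviations from the mean."""
    return mu + z * sigma

mu, sigma = 3.4, 0.3  # protein content in grams

print(round(to_standard_units(4.1, mu, sigma), 3))    # positive: above the mean
print(round(to_standard_units(3.2, mu, sigma), 3))    # negative: below the mean
print(round(from_standard_units(2.5, mu, sigma), 2))  # value 2.5 sd's above the mean
```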

Finding Probabilities with the Normal Distribution

For a continuous random variable X with probability density function \(\small{f(x)}\), \(\small{P(a \leq X \leq b) = \int_a^bf(x)dx}\). However, the probability density function of a normal random variable has no closed-form antiderivative, so the integral cannot be evaluated by hand. To find probabilities pertaining to a normal distribution, it is therefore necessary to use either software or a table.



A normal cdf table gives values of the cumulative distribution function for the standard normal distribution. To use the table to find $P(X\leq x)$ where $X\sim N(\mu, \sigma^2)$:

  1. Compute the z-score for $x$, $z=\frac{x-\mu}{\sigma}$.
  2. Find the z-score on the table, the first two digits along the left margin and a third digit along the top.
  3. The value in the table on the row and column indicated by the previous step is $P(X \leq x)$.


z-scores are indicated along the margins of the table. The body of the table contains cumulative probabilities, $P(Z\leq z) = \Phi(z)$.

To locate a specific z-score, find its first two digits in the left margin and its third digit along the top. The value where the indicated row and column intersect is the desired probability.

Find the probability that protein content, X, in the milk of a cow is less than 3g. $X \sim N(3.4, 0.3^2)$.

  1. Find 3g in standard units: $z=\frac{3-3.4}{0.3} = -1.33$
  2. Find -1.3 along the left margin of the table.
  3. Find 0.03 along the top margin of the table.

$\Phi(-1.33) = 0.0918$
$P(X \leq 3) = P(Z \leq -1.33) = 0.0918$
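The table lookup can be reproduced in software. The sketch below standardizes $x$ and then evaluates $\Phi(z)$ through the error function, using the identity $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$. The function names `phi_cdf` and `normal_cdf` are illustrative and not from the original text.

```python
import math

def phi_cdf(z):
    """Standard normal CDF, Phi(z), written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2): standardize, then apply Phi."""
    z = (x - mu) / sigma
    return phi_cdf(z)

# Probability that protein content is less than 3g, X ~ N(3.4, 0.3^2).
p = normal_cdf(3.0, 3.4, 0.3)
print(round(p, 4))
```

Because the z-score here is not rounded to two decimals, the result can differ from a table lookup in the third or fourth decimal place.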


Find the probability that a standard normal random variable takes a value less than -0.72.

Using the table, find -0.7 along the left margin and 0.02 along the top.
$\Phi(-0.72) = 0.2358$.


Normal probabilities can also be found using software or with a calculator with statistical functions.


\(X\sim N(3.4,0.3^2)\)

Evaluate $P(X \leq 2)$ on a TI-84

2ND VARS 2
lower: -1E99
upper: 2
μ: 3.4
σ: 0.3
ENTER

Notes:

  1. The true lower endpoint is $-\infty$. Since the calculator cannot accept this, enter a very large negative number such as -1E99.
  2. When using a calculator or software to find a normal probability, it is typically not necessary to standardize first.
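For readers using software rather than a TI-84, the same probability can be sketched with Python's `statistics.NormalDist`. This is only one possible tool, chosen here as an assumption; the variable names are illustrative. The `cdf` method integrates from $-\infty$ internally, so no artificial lower endpoint is needed, and no standardizing is required.

```python
from statistics import NormalDist

# X ~ N(3.4, 0.3^2); NormalDist takes the standard deviation, not the variance.
protein = NormalDist(mu=3.4, sigma=0.3)

# P(X <= 2): the value 2 is far below the mean, so the probability is tiny.
p = protein.cdf(2)
print(p)
```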

Find the probability that a standard normal random variable takes a value greater than 0.63.


$P(Z\leq 0.63) = \Phi(0.63) = 0.7357$.

Using the complement rule, $P(Z > 0.63) = 1 - \Phi(0.63) = 1 - 0.7357 = 0.2643$
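The complement rule is easy to verify numerically. In this minimal sketch the helper name `upper_tail` is an illustrative choice; the second printed value uses the symmetry of the standard normal curve, $P(Z > z) = \Phi(-z)$, as a cross-check.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def upper_tail(z):
    """P(Z > z) by the complement rule."""
    return 1.0 - Z.cdf(z)

print(round(upper_tail(0.63), 4))  # complement rule
print(round(Z.cdf(-0.63), 4))      # same area, by symmetry
```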


Suppose $X\sim N(25, 9)$.

  1. Without doing any calculations, is $P(X \leq 23)$ greater than or less than 0.5?
  2. Find $P(X \leq 23)$.

  1. Since the normal curve is symmetric about its mean, half of the area under the curve is above the mean and half is below. $P(X \leq 23)$ must be less than 0.5 since 23 < 25.

  2. $P(X \leq 23) = P(Z \leq \frac{23-25}{3}) = \Phi(-0.67) = 0.2514$
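One practical point in this example is that $N(25, 9)$ gives the variance, so the standard deviation is $\sqrt{9} = 3$. The sketch below makes that conversion explicit before computing the probability; the variable names are illustrative, and `statistics.NormalDist` is assumed as the software tool.

```python
import math
from statistics import NormalDist

variance = 9
sigma = math.sqrt(variance)        # standard deviation is 3, not 9
X = NormalDist(mu=25, sigma=sigma)

p = X.cdf(23)
print(round(p, 4))   # below 0.5, as the symmetry argument predicts
print(p < 0.5)
```

Passing the variance where the standard deviation is expected is a common slip and would give a noticeably different answer.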


To find the probability that a normal random variable takes a value in a given interval, use $\small{P(a \leq X \leq b) = F(b)-F(a)}$, where $F$ is the cumulative distribution function of $X$. A calculator or software can evaluate this probability directly.

$\small{P(a \leq X \leq b) = F(b)-F(a)}$

What is the probability that the protein content in the milk is between 2.5 and 3 grams? That is, if $X \sim N(3.4, 0.3^2)$, what is $P(2.5 \leq X \leq 3)$?


Using the table:
$$\small{\begin{array}{lcl}P(2.5 \leq X \leq 3) &=& F(3) - F(2.5) \\&=& P(X \leq 3) - P(X \leq 2.5)\\&=& P(Z \leq \frac{3-3.4}{.3}) - P(Z \leq \frac{2.5-3.4}{.3}) \\&=& \Phi(-1.33) - \Phi(-3)\\&=& 0.0918 - 0.0013 \\&=& 0.0905\end{array}}$$


Using a TI-84 calculator:

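normalcdf
lower: 2.5
upper: 3
μ: 3.4
σ: 0.3
Result: 0.0899

$P(2.5 \leq X \leq 3)\approx 0.0899$. A table lookup can differ in the third decimal place because the z-score is rounded to two decimals.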

$X\sim N(25, 9)$. Find $P(22 \leq X \leq 30)$.

$$\begin{array}{lcl}P(22 \leq X \leq 30) &=& F(30) - F(22) \\&=& P(X \leq 30) - P(X \leq 22)\\&=& P(Z \leq \frac{30-25}{3}) - P(Z \leq \frac{22-25}{3}) \\&=& \Phi(1.67) - \Phi(-1)\\&=& 0.9525 - 0.1587\\&=& 0.7938\end{array}$$
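The rule $P(a \leq X \leq b) = F(b) - F(a)$ translates directly into code. The following is a minimal sketch with `statistics.NormalDist`; the helper name `prob_between` is an illustrative choice. It evaluates both interval examples from this section without rounding the z-scores.

```python
from statistics import NormalDist

def prob_between(a, b, mu, sigma):
    """P(a <= X <= b) = F(b) - F(a) for X ~ N(mu, sigma^2)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)

# Protein content between 2.5g and 3g, X ~ N(3.4, 0.3^2).
p_protein = prob_between(2.5, 3, 3.4, 0.3)
print(round(p_protein, 4))

# X ~ N(25, 9): the variance is 9, so sigma = 3.
p_second = prob_between(22, 30, 25, 3)
print(round(p_second, 4))
```

Small differences from table-based answers are expected, since the table requires rounding each z-score to two decimal places.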

Finding Percentiles of the Normal Distribution

To find percentiles from the normal distribution, calculate how many sd's the given percentile is from the mean, then find the value of the variable that corresponds to that z-score.

How many SDs from the mean is the 70th percentile?


This question is equivalent to asking for the z-score associated with the 70th percentile. Regardless of the values of the mean and variance of a normal distribution, the z-score corresponding to the 70th percentile is the same.

The 70th percentile is the value, T, such that 70% of the area is less than T.

To find T using the table, look for 0.7 in the body of the table and find the associated z-score. Since the exact value 0.7 is not in the table, it is reasonable to use the closest available value, 0.6985. Reading from the margins, the z-score associated with 0.6985 is 0.52, that is, $P(Z \leq 0.52) = 0.6985$.

A more precise value can be obtained using software or a calculator (below). Using either of these shows that $z = 0.5244$, that is $P(Z \leq 0.5244) = 0.7$

Find the Z-score corresponding to a given percentile:

Find the 70th percentile on a TI-84

2ND VARS 3
invNorm
area: 0.7
μ: 0
σ: 1
ENTER

Notes:

  1. Since the units of the standard normal curve are standard units, use the standard normal distribution to find the z-score that corresponds to a percentile.
  2. Divide the percentile by 100 to find the corresponding area.

Find the value of a given percentile:

$X \sim N(3.4, 0.3^2)$

Find the 70th percentile on a TI-84

2ND VARS 3
invNorm
area: 0.7
μ: 3.4
σ: 0.3
ENTER

Use the z-score to find the 70th percentile of protein content for cows' milk, where $X\sim N(3.4,0.3^2)$.

The 70th percentile is 0.52 SDs above the mean.

0.52 sd's is $0.52(0.3) = 0.156$ grams.

The value that is 0.52 sd's above the mean is $3.4 + 0.156 = 3.556$ grams.

The 70th percentile for the protein content of cows' milk is 3.556 grams.
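Software can find percentiles through the inverse of the cumulative distribution function. The sketch below uses `inv_cdf` from Python's `statistics.NormalDist`, playing the role that `invNorm` plays on the calculator; the variable names are illustrative. It shows both routes: find the z-score first and convert it, or ask for the percentile of $X$ directly.

```python
from statistics import NormalDist

# Step 1: z-score of the 70th percentile (same for every normal distribution).
z70 = NormalDist().inv_cdf(0.70)
print(round(z70, 4))

# Step 2: convert the z-score to the units of X ~ N(3.4, 0.3^2).
mu, sigma = 3.4, 0.3
x70_from_z = mu + z70 * sigma

# Alternatively, ask for the percentile of X directly.
x70_direct = NormalDist(mu, sigma).inv_cdf(0.70)
print(round(x70_from_z, 4), round(x70_direct, 4))
```

The software answer is slightly larger than 3.556 grams because it uses the unrounded z-score (about 0.5244) instead of the table value 0.52.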

Sums of Normal Random Variables

A linear combination of independent, normally distributed random variables is also normally distributed. For example, if $X$ and $Y$ are independent and each is normally distributed, then ($\small{X+Y}$), ($\small{Y-X}$), and ($\small{2X+3Y}$) are all normally distributed random variables as well. The linearity properties of expected value and variance make it straightforward to find the mean and variance of the combination.

Let $X_1, X_2, \ldots X_n$ be independent, normally distributed random variables with expected values $\mu_1, \mu_2\ldots \mu_n$ and variances $\sigma^2_1, \sigma^2_2, \ldots \sigma^2_n$ respectively and let $a_1, a_2, \ldots a_n$ be constants. Then

$\sum_{i=1}^na_iX_i \sim N\left(\sum_{i=1}^na_i\mu_i, \sum_{i=1}^na_i^2\sigma^2_i\right)$

Let $X_1, X_2, \ldots X_n$, $Y_1, Y_2, \ldots Y_n$, and $Z_1, Z_2, \ldots Z_n$ be independent random variables such that $X_i \sim N(4,4)$, $Y_i \sim N(2,9)$, and $Z_i \sim N(0,1)$. Find the distributions of the following random variables:

  1. $X_1+Y_1$
  2. $2Y_1-Z_2$
  3. $\sum_{i=1}^3 X_i-2Z_3$

  1. $E(X_1+Y_1)=E(X_1)+E(Y_1)=4+2=6$
    $Var(X_1+Y_1)=Var(X_1)+Var(Y_1)=4+9=13$
    Since $X_1$ and $Y_1$ are both normally distributed, so is $X_1+Y_1$. $(X_1+Y_1) \sim N(6,13)$.

  2. $E(2Y_1-Z_2)=E(2Y_1)+E(-Z_2)=2E(Y_1)-E(Z_2)=2(2)-0=4$
    $Var(2Y_1-Z_2)=Var(2Y_1)+Var(-Z_2)=2^2Var(Y_1)+(-1)^2Var(Z_2)=4(9)+1(1)=37$
    Since $Y_1$ and $Z_2$ are both normally distributed, so is $2Y_1-Z_2.$ $(2Y_1-Z_2) \sim N(4,37)$.

  3. $E(\sum_{i=1}^3 X_i-2Z_3)=E(\sum_{i=1}^3 X_i)+E(-2Z_3)=\sum_{i=1}^3E(X_i)-2E(Z_3)=\sum_{i=1}^3(4)-2(0)=3(4)-0=12$
    $Var(\sum_{i=1}^3 X_i-2Z_3)=Var(\sum_{i=1}^3 X_i)+Var(-2Z_3)=\sum_{i=1}^3Var(X_i)+(-2)^2Var(Z_3)=\sum_{i=1}^3(4)+4(1)=3(4)+4(1)=16$
    Since $X_i$ and $Z_3$ are both normally distributed, so is $\sum_{i=1}^3 X_i-2Z_3$. $(\sum_{i=1}^3 X_i-2Z_3) \sim N(12,16)$.
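The mean and variance rules for a linear combination of independent normal random variables can be sketched as a small helper. The function name `linear_combination` and the tuple layout `(a_i, mu_i, var_i)` are assumptions made for illustration; each tuple stands for one independent term $a_iX_i$.

```python
def linear_combination(terms):
    """Mean and variance of sum(a_i * X_i) for independent normal X_i.

    terms: list of (a_i, mu_i, var_i) tuples, one per independent variable.
    """
    mean = sum(a * mu for a, mu, _ in terms)
    variance = sum(a * a * var for a, _, var in terms)
    return mean, variance

# (mean, variance) of each family: X_i ~ N(4, 4), Y_i ~ N(2, 9), Z_i ~ N(0, 1).
X, Y, Z = (4, 4), (2, 9), (0, 1)

print(linear_combination([(1, *X), (1, *Y)]))        # X_1 + Y_1
print(linear_combination([(2, *Y), (-1, *Z)]))       # 2*Y_1 - Z_2
print(linear_combination([(1, *X)] * 3 + [(-2, *Z)]))  # X_1 + X_2 + X_3 - 2*Z_3
```

Note that the coefficients are squared in the variance, so subtracting a variable still adds to the variance of the result.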

Let $X_1, X_2, \ldots X_n$ be independent random variables such that $X_i \sim N(\mu,\sigma^2)$. What is the distribution of $\bar{X} = \sum_{i=1}^n\frac{1}{n}X_i$?

$$\begin{array}{lcl}E(\bar{X}) &=& E\left(\sum_{i=1}^n\frac{1}{n}X_i\right)\\&=&\sum_{i=1}^n\frac{1}{n}E(X_i)\\&=&\sum_{i=1}^n\frac{1}{n}\mu\\&=& n\frac{1}{n}\mu\\&=& \mu \end{array}$$

$$\begin{array}{lcl}Var(\bar{X}) &=& Var\left(\sum_{i=1}^n\frac{1}{n}X_i\right)\\&=&\sum_{i=1}^n\left(\frac{1}{n}\right)^2Var(X_i)\\&=&\sum_{i=1}^n\left(\frac{1}{n}\right)^2\sigma^2\\&=& n\left(\frac{1}{n}\right)^2\sigma^2\\&=& \frac{\sigma^2}{n} \end{array}$$

$\bar{X}\sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
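A small simulation can make this result concrete. The sketch below draws many samples of size $n$, records each sample mean, and compares the mean and variance of those sample means with $\mu$ and $\sigma^2/n$. The function name, the sample size of 25, the number of repetitions, and the fixed seed are all illustrative assumptions.

```python
import random
from statistics import fmean, pvariance

def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma^2); return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [fmean([rng.gauss(mu, sigma) for _ in range(n)]) for _ in range(reps)]

mu, sigma, n = 3.4, 0.3, 25
means = simulate_sample_means(mu, sigma, n, reps=5000)

print(fmean(means))      # should be close to mu = 3.4
print(pvariance(means))  # should be close to sigma^2 / n = 0.0036
```

The simulated values only approximate the theoretical ones; the agreement improves as the number of repetitions grows.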


FAQs

What can a normal distribution be described as?

The normal distribution is the proper term for a probability bell curve. In the standard normal distribution, the mean is zero and the standard deviation is 1. Every normal distribution has zero skew and a kurtosis of 3.

How do you solve normal distribution problems?

z = (X - μ) / σ

where X is a normal random variable, μ is the mean of X, and σ is the standard deviation of X. In probability theory, the normal or Gaussian distribution is a very common continuous probability distribution.

How many samples do I need for 95% confidence?

To be 95% confident that the true value of the estimate will be within 5 percentage points of 0.5, (that is, between the values of 0.45 and 0.55), the required sample size is 385. This is the number of actual responses needed to achieve the stated level of accuracy.
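The figure of 385 is consistent with the usual large-population formula for estimating a proportion, $n = z^2\,p(1-p)/E^2$, with $p = 0.5$, margin of error $E = 0.05$, and $z \approx 1.96$. That this is the formula behind the quoted number is an assumption, and the function name below is illustrative.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence=0.95, p=0.5):
    """Smallest n with z^2 * p * (1 - p) / margin^2 <= n (no finite-population correction)."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil(z * z * p * (1.0 - p) / (margin * margin))

print(sample_size_for_proportion(0.05))  # 385
```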

What does normal distribution tell you?

The normal distribution is also known as a Gaussian distribution or probability bell curve. It is symmetric about the mean and indicates that values near the mean occur more frequently than the values that are farther away from the mean.

What is an example of a normal distribution?

A normal distribution is a common probability distribution. It has a shape often referred to as a "bell curve." Many everyday data sets approximately follow a normal distribution: for example, the heights of adult humans, the scores on a test given to a large class, and errors in measurements.

Where is normal distribution used?

A large number of random variables are either nearly or exactly represented by the normal distribution, across the physical sciences and economics. Furthermore, it can be used to approximate other probability distributions, which supports the use of the word 'normal' in the sense of the one most commonly used.

What are the 5 characteristics of a normal distribution?

Normal distributions are symmetric, unimodal, and asymptotic, and the mean, median, and mode are all equal. A normal distribution is perfectly symmetrical around its center.

How do you describe the normal and standard normal distributions?

All normal distributions, like the standard normal distribution, are unimodal and symmetrically distributed with a bell-shaped curve. However, a normal distribution can take on any value as its mean and standard deviation. In the standard normal distribution, the mean and standard deviation are always fixed.

How to tell if data is normally distributed?

Graphical test for normal distribution

If you want to check the normal distribution using a histogram, plot the normal distribution on the histogram of your data and check that the distribution curve of the data approximately matches the normal distribution curve.

What is the 95% confidence rule?

The means and their standard errors can be treated in a similar fashion. If a series of samples are drawn and the mean of each calculated, 95% of the means would be expected to fall within the range of two standard errors above and two below the mean of these means.

What is a good sample size?

For populations under 1,000, a minimum ratio of 30 percent (300 individuals) is advisable to ensure representativeness of the sample. For larger populations, such as a population of 10,000, a comparatively small minimum ratio of 10 percent (1,000) of individuals is required to ensure representativeness of the sample.

Why is 30 a good sample size?

Why is 30 the minimum sample size? The rule of thumb is based on the idea that 30 data points should provide enough information to make a statistically sound conclusion about a population. This is related to the Law of Large Numbers, which states that sample averages become more accurate as the sample size increases; the threshold of 30 itself is usually justified by the Central Limit Theorem, under which the distribution of the sample mean becomes approximately normal.

What is a normal distribution for dummies?

A normal distribution is symmetrical around the mean. Its density reaches its highest point at the mean. It is bell-shaped. The z-score is zero at the mean, and the density decreases as you move away from the mean on both sides.

Why is normal distribution used?

The first advantage of the normal distribution is that it is symmetric and bell-shaped. This shape is useful because it can be used to describe many populations, from classroom grades to heights and weights.

What is a normal distribution quizlet?

Normal distribution: a bell-shaped curve describing the spread of a characteristic throughout a population. Two pieces of data specify a normal distribution: the mean (μ) and the standard deviation (σ).

Can a normal distribution be described as asymptotic?

A normal distribution has a density function that is symmetrical, with a range that is equal to the whole real line. As the value approaches either positive infinity or negative infinity, the density function approaches zero but never reaches it; hence, the density function is also asymptotic.

What is a normal distribution best described as quizlet?

A continuous, symmetric, bell-shaped distribution of a variable.

Which shape best describes a normal distribution?

A bell curve is a graph depicting the normal distribution, which has a shape reminiscent of a bell. The top of the curve shows the mean, mode, and median of the data collected. Its standard deviation depicts the bell curve's relative width around the mean.

Article information

Author: Dr. Pierre Goyette
