14.8. Linear Regression Practice Problems

14.8.1. Learning Objectives

After studying this notebook and your lecture notes, you should be able to:

  • Interpret the correlation coefficient

  • Compute simple linear regression best fits

  • Check linear regression error assumptions using residual analysis (plots)

  • Compute residual standard error and covariance matrix for fitted parameters

  • Assemble confidence intervals for fitted parameters

# load libraries
import scipy.stats as stats
import numpy as np
import math
import matplotlib.pyplot as plt

14.8.2. Supplemental Exercise 7.5 (Navidi 2015)

A chemist is calibrating a spectrophotometer that will be used to measure the concentration of carbon monoxide (CO) in atmospheric samples. To check the calibration, samples of known concentration are measured. The true concentrations (x) and the measured concentrations (y) are given in the variables below. Because of random error, repeated measurements on the same sample will vary. The machine is considered to be in calibration if its mean response is equal to the true concentration.

x = np.array([0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]) ## True concentrations (ppm)
y = np.array([1, 11, 21, 28, 37, 48, 56, 68, 75, 86, 96])  ## Measured concentrations (ppm)

print("The true concentrations:\n", x)
print("The measured concentrations:\n", y)
The true concentrations:
 [  0  10  20  30  40  50  60  70  80  90 100]
The measured concentrations:
 [ 1 11 21 28 37 48 56 68 75 86 96]

To check the calibration, the linear model \(y = \beta_0 + \beta_1 x + \epsilon\) is fit. Ideally, the value of \(\beta_0\) should be 0 and the value of \(\beta_1\) should be 1.

a. Compute the least-squares estimates \(\hat{\beta}_0\) and \(\hat{\beta}_1\).

# There are multiple ways to solve this problem. One is to use the
# analytical equations provided.
# Add your solution here
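
As one possible starting point, the sketch below applies the standard closed-form least-squares formulas, \(\hat{\beta}_1 = S_{xy}/S_{xx}\) and \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\). The variable names (`S_xx`, `S_xy`, `beta0_hat`, `beta1_hat`) are illustrative choices, not names from the lecture notes, and the data are repeated so the block runs on its own.

```python
import numpy as np

# Data repeated from the exercise statement so this block is self-contained
x = np.array([0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100])  # true concentrations (ppm)
y = np.array([1, 11, 21, 28, 37, 48, 56, 68, 75, 86, 96])   # measured concentrations (ppm)

# Sample means
x_bar, y_bar = x.mean(), y.mean()

# Sums of squares and cross products about the means
S_xx = np.sum((x - x_bar) ** 2)
S_xy = np.sum((x - x_bar) * (y - y_bar))

# Closed-form least-squares estimates
beta1_hat = S_xy / S_xx
beta0_hat = y_bar - beta1_hat * x_bar

print(f"beta0_hat = {beta0_hat:.4f}")
print(f"beta1_hat = {beta1_hat:.4f}")
```

A quick cross-check is to compare against `np.polyfit(x, y, 1)`, which returns the slope first and the intercept second.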

b. Can you reject the null hypothesis \(H_0: \beta_0 = 0\)?

# Add your solution here

c. Can you reject the null hypothesis \(H_0: \beta_1 = 1\)?

# Add your solution here
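
Parts b and c can both be approached with a two-sided t-test on a fitted coefficient. The sketch below is one way to set this up, assuming the usual simple-linear-regression error model (independent, normally distributed errors with constant variance). It computes the residual standard error \(s\) with \(n - 2\) degrees of freedom, the standard errors of both coefficients, and the two t statistics. All variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Data repeated from the exercise statement so this block is self-contained
x = np.array([0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100])  # true concentrations (ppm)
y = np.array([1, 11, 21, 28, 37, 48, 56, 68, 75, 86, 96])   # measured concentrations (ppm)
n = len(x)

# Least-squares fit (closed-form formulas)
x_bar = x.mean()
S_xx = np.sum((x - x_bar) ** 2)
beta1_hat = np.sum((x - x_bar) * (y - y.mean())) / S_xx
beta0_hat = y.mean() - beta1_hat * x_bar

# Residual standard error with n - 2 degrees of freedom
residuals = y - (beta0_hat + beta1_hat * x)
dof = n - 2
s = np.sqrt(np.sum(residuals ** 2) / dof)

# Standard errors of the fitted coefficients
se_beta1 = s / np.sqrt(S_xx)
se_beta0 = s * np.sqrt(1 / n + x_bar ** 2 / S_xx)

# t statistics for H0: beta0 = 0 and H0: beta1 = 1
t_beta0 = (beta0_hat - 0.0) / se_beta0
t_beta1 = (beta1_hat - 1.0) / se_beta1

# Two-sided p-values from the Student t distribution
p_beta0 = 2 * stats.t.sf(abs(t_beta0), dof)
p_beta1 = 2 * stats.t.sf(abs(t_beta1), dof)

print(f"H0: beta0 = 0 -> t = {t_beta0:.3f}, p = {p_beta0:.4f}")
print(f"H0: beta1 = 1 -> t = {t_beta1:.3f}, p = {p_beta1:.4f}")
```

Each p-value is then compared with the chosen significance level (for example 0.05) to decide whether the corresponding null hypothesis can be rejected.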

d. Do the data provide sufficient evidence to conclude that the machine is out of calibration?

# Add your solution here

e. Compute a 95% confidence interval for the mean measurement (point estimate \(\hat{y}\)) when the true concentration is 20 ppm.

# Add your solution here
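
A confidence interval for the mean response at a chosen \(x_0\) is commonly written as \(\hat{y}_0 \pm t_{n-2,\,\alpha/2} \; s \sqrt{1/n + (x_0 - \bar{x})^2 / S_{xx}}\). The sketch below wraps that formula in a small helper. The function name `mean_response_ci` and its arguments are hypothetical, introduced only for illustration.

```python
import numpy as np
from scipy import stats


def mean_response_ci(x, y, x0, level=0.95):
    """Confidence interval for the mean response at x0 (simple linear regression)."""
    n = len(x)
    x_bar = np.mean(x)
    S_xx = np.sum((x - x_bar) ** 2)

    # Least-squares estimates
    beta1 = np.sum((x - x_bar) * (y - np.mean(y))) / S_xx
    beta0 = np.mean(y) - beta1 * x_bar

    # Residual standard error with n - 2 degrees of freedom
    dof = n - 2
    s = np.sqrt(np.sum((y - beta0 - beta1 * x) ** 2) / dof)

    # Point estimate and its standard error at x0
    y0_hat = beta0 + beta1 * x0
    se_mean = s * np.sqrt(1 / n + (x0 - x_bar) ** 2 / S_xx)

    # Two-sided critical value from the Student t distribution
    t_crit = stats.t.ppf(1 - (1 - level) / 2, dof)
    return y0_hat - t_crit * se_mean, y0_hat + t_crit * se_mean


# Data repeated from the exercise statement so this block is self-contained
x = np.array([0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100])  # true concentrations (ppm)
y = np.array([1, 11, 21, 28, 37, 48, 56, 68, 75, 86, 96])   # measured concentrations (ppm)

lower_20, upper_20 = mean_response_ci(x, y, x0=20)
print(f"95% CI for the mean measurement at 20 ppm: ({lower_20:.2f}, {upper_20:.2f})")
```

The same helper can be called with `x0=80` for the 80 ppm case. Because 20 and 80 are equally far from \(\bar{x} = 50\), the two intervals have the same width.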

f. Compute a 95% confidence interval for the mean measurement when the true concentration is 80 ppm.

# Add your solution here

g. Someone claims that the machine is in calibration for concentrations near 20 ppm. Do these data provide sufficient evidence for you to conclude that this claim is false? Explain.

# Add your solution here

14.8.3. Supplemental Exercise 7.8 (Navidi 2015)

The rate of lipase production, y (in \(\mu\)mol per mL enzyme per minute), and the cell mass, x (in g/L), were measured. The results are given below:

x = np.array([4.5, 4.68, 5.4, 5.45, 4.2, 4.12, 4, 4.41, 3.98, 4.72, 3.41, 4.8, 3.6, 4.95, 3.25, 4.4, 3.65, 4.23, 4.1, 5.03, 
              4.19, 4.4, 3.92, 3.5, 4.15, 4.3, 4.9, 5.23, 5.4, 4.85, 5.1, 4.94]) ##The cell mass in g/L
y = np.array([2.06, 2.1, 3.15, 4.1, 2.2, 3.2, 2.85, 4.5, 2.1, 2.75, 2.8, 4.6, 2.5, 4.1, 2.15, 4.4, 2.2, 2.3, 2.4, 4.75, 3.15,
              3.9, 3.2, 2.1, 3.75, 3.15, 5.1, 5.04, 4.96, 5, 4.92, 4.98]) ##Lipase production in micromol per mL enzyme per minute


print ("The cell mass:\n", x)
print ("Lipase production:\n", y)
The cell mass:
 [4.5  4.68 5.4  5.45 4.2  4.12 4.   4.41 3.98 4.72 3.41 4.8  3.6  4.95
 3.25 4.4  3.65 4.23 4.1  5.03 4.19 4.4  3.92 3.5  4.15 4.3  4.9  5.23
 5.4  4.85 5.1  4.94]
Lipase production:
 [2.06 2.1  3.15 4.1  2.2  3.2  2.85 4.5  2.1  2.75 2.8  4.6  2.5  4.1
 2.15 4.4  2.2  2.3  2.4  4.75 3.15 3.9  3.2  2.1  3.75 3.15 5.1  5.04
 4.96 5.   4.92 4.98]

a. Compute the least-squares line for predicting lipase production from cell mass.

# Add your solution here
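
One convenient route to the least-squares line is `scipy.stats.linregress`, which returns the slope, the intercept, and the correlation coefficient in a single call. The sketch below is only one of several equivalent approaches (the closed-form formulas or `np.polyfit` would work as well), and the variable name `fit` is an illustrative choice.

```python
import numpy as np
from scipy import stats

# Data repeated from the exercise statement so this block is self-contained
x = np.array([4.5, 4.68, 5.4, 5.45, 4.2, 4.12, 4, 4.41, 3.98, 4.72, 3.41,
              4.8, 3.6, 4.95, 3.25, 4.4, 3.65, 4.23, 4.1, 5.03, 4.19, 4.4,
              3.92, 3.5, 4.15, 4.3, 4.9, 5.23, 5.4, 4.85, 5.1, 4.94])  # cell mass (g/L)
y = np.array([2.06, 2.1, 3.15, 4.1, 2.2, 3.2, 2.85, 4.5, 2.1, 2.75, 2.8,
              4.6, 2.5, 4.1, 2.15, 4.4, 2.2, 2.3, 2.4, 4.75, 3.15, 3.9,
              3.2, 2.1, 3.75, 3.15, 5.1, 5.04, 4.96, 5, 4.92, 4.98])   # lipase production

# Fit y = beta0 + beta1 * x by ordinary least squares
fit = stats.linregress(x, y)

print(f"intercept (beta0_hat)   = {fit.intercept:.4f}")
print(f"slope (beta1_hat)       = {fit.slope:.4f}")
print(f"correlation coefficient = {fit.rvalue:.4f}")

# Residuals, useful for checking the error assumptions with a residual plot
residuals = y - (fit.intercept + fit.slope * x)
```

Plotting `residuals` against `x` (or against the fitted values) is a common way to check the constant-variance and linearity assumptions before trusting any intervals built on this fit.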

b. Compute 95% confidence intervals for \(\beta_0\) and \(\beta_1\).

# Add your solution here
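
Confidence intervals for both coefficients can be assembled from the covariance matrix of the fitted parameters, \(\hat{\sigma}^2 (X^T X)^{-1}\), where \(X\) is the design matrix with a column of ones and a column of \(x\) values. The sketch below assumes the standard regression error model. The names `X`, `cov_beta`, and `se_beta` are illustrative.

```python
import numpy as np
from scipy import stats

# Data repeated from the exercise statement so this block is self-contained
x = np.array([4.5, 4.68, 5.4, 5.45, 4.2, 4.12, 4, 4.41, 3.98, 4.72, 3.41,
              4.8, 3.6, 4.95, 3.25, 4.4, 3.65, 4.23, 4.1, 5.03, 4.19, 4.4,
              3.92, 3.5, 4.15, 4.3, 4.9, 5.23, 5.4, 4.85, 5.1, 4.94])  # cell mass (g/L)
y = np.array([2.06, 2.1, 3.15, 4.1, 2.2, 3.2, 2.85, 4.5, 2.1, 2.75, 2.8,
              4.6, 2.5, 4.1, 2.15, 4.4, 2.2, 2.3, 2.4, 4.75, 3.15, 3.9,
              3.2, 2.1, 3.75, 3.15, 5.1, 5.04, 4.96, 5, 4.92, 4.98])   # lipase production

# Design matrix: first column for the intercept, second for the slope
X = np.column_stack([np.ones_like(x), x])

# Least-squares solution, beta_hat = [beta0_hat, beta1_hat]
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residual variance estimate with n - p degrees of freedom (p = 2 parameters)
residuals = y - X @ beta_hat
dof = len(y) - X.shape[1]
sigma2_hat = residuals @ residuals / dof

# Covariance matrix of the fitted parameters and their standard errors
cov_beta = sigma2_hat * np.linalg.inv(X.T @ X)
se_beta = np.sqrt(np.diag(cov_beta))

# 95% two-sided confidence intervals
t_crit = stats.t.ppf(0.975, dof)
ci_lower = beta_hat - t_crit * se_beta
ci_upper = beta_hat + t_crit * se_beta

for name, lo, hi in zip(["beta0", "beta1"], ci_lower, ci_upper):
    print(f"95% CI for {name}: ({lo:.3f}, {hi:.3f})")
```

The off-diagonal entry of `cov_beta` is the covariance between the intercept and slope estimates, which is needed whenever an interval is built for a combination of the two (such as the mean response at a given cell mass).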

c. In two experiments, the cell masses differed by 1.5 g/L. By how much do you estimate that their lipase production will differ?

# Add your solution here

d. Find a 95% confidence interval for the mean lipase production when the cell mass is 5.0 g/L.

# Add your solution here

e. Can you conclude that the mean lipase production when the cell mass is 5.0 g/L is less than 4.4? Explain.

# Add your solution here
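
One way to frame part e is as a one-sided test of \(H_0: \mu_{y|5.0} \ge 4.4\) against \(H_1: \mu_{y|5.0} < 4.4\), using the standard error of the mean response at \(x_0 = 5.0\). The sketch below is a possible setup under the usual regression assumptions. The names `x0`, `mu0`, `t_stat`, and `p_value` are illustrative.

```python
import numpy as np
from scipy import stats

# Data repeated from the exercise statement so this block is self-contained
x = np.array([4.5, 4.68, 5.4, 5.45, 4.2, 4.12, 4, 4.41, 3.98, 4.72, 3.41,
              4.8, 3.6, 4.95, 3.25, 4.4, 3.65, 4.23, 4.1, 5.03, 4.19, 4.4,
              3.92, 3.5, 4.15, 4.3, 4.9, 5.23, 5.4, 4.85, 5.1, 4.94])  # cell mass (g/L)
y = np.array([2.06, 2.1, 3.15, 4.1, 2.2, 3.2, 2.85, 4.5, 2.1, 2.75, 2.8,
              4.6, 2.5, 4.1, 2.15, 4.4, 2.2, 2.3, 2.4, 4.75, 3.15, 3.9,
              3.2, 2.1, 3.75, 3.15, 5.1, 5.04, 4.96, 5, 4.92, 4.98])   # lipase production
n = len(x)

x0 = 5.0    # cell mass of interest (g/L)
mu0 = 4.4   # hypothesized mean lipase production

# Least-squares fit (closed-form formulas)
x_bar = x.mean()
S_xx = np.sum((x - x_bar) ** 2)
beta1_hat = np.sum((x - x_bar) * (y - y.mean())) / S_xx
beta0_hat = y.mean() - beta1_hat * x_bar

# Residual standard error with n - 2 degrees of freedom
dof = n - 2
s = np.sqrt(np.sum((y - beta0_hat - beta1_hat * x) ** 2) / dof)

# Estimated mean response at x0 and its standard error
y0_hat = beta0_hat + beta1_hat * x0
se_mean = s * np.sqrt(1 / n + (x0 - x_bar) ** 2 / S_xx)

# One-sided (lower-tail) test: small p-values support "mean < mu0"
t_stat = (y0_hat - mu0) / se_mean
p_value = stats.t.cdf(t_stat, dof)

print(f"estimated mean at x0 = {y0_hat:.3f}, t = {t_stat:.3f}, p = {p_value:.4f}")
```

A small `p_value` would support the conclusion that the mean production is below 4.4. A large one means the data do not justify that conclusion, which is different from showing that the mean is at least 4.4.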