15.5. Monte Carlo Uncertainty Analysis for Nonlinear Regression#

Further Reading: §4.12

15.5.1. Learning Objectives#

After studying this notebook, attending class, asking questions, and reviewing your notes, you should be able to:

  • Use simulation to calculate and analyze probabilities.

  • Write code to simulate data, add noise, and visually inspect the distribution of fitted parameters through Monte Carlo Uncertainty Analysis.

# load libraries
import scipy.stats as stats
import numpy as np
import scipy.optimize as optimize
import math
import matplotlib.pyplot as plt

15.5.2. Simulation#

See this notebook for notes on simulating a problem to calculate probability, mean, variance, and to determine if a population is normal.

15.5.3. Motivating Example: Cart + Incline#

See this notebook to look at a motivating example for Monte Carlo Error Propogation.

15.5.4. Motivating Example: Michaelis-Menten Enzymatic Reaction Kinetics#

The Michaelis-Menten equation is an extremely popular model to describe the rate of enzymatic reactions.

\[ \mathrm{E} + \mathrm{S} \leftrightharpoons \mathrm{ES} \rightarrow \mathrm{E} + \mathrm{P} \]
\[ r = V_{max} \frac{[\mathrm{S}]}{K_M + [\mathrm{S}]} \]

Additional information: https://en.wikipedia.org/wiki/Michaelis–Menten_kinetics

## Create dataset

# Define exact coefficients
Vmaxexact=2;
Kmexact=5;

# Evaluate model
Sexp = np.array([.3, .4, 0.5, 1, 2, 4, 8, 16]);
rexp = Vmaxexact*Sexp / (Kmexact+Sexp);

# Add some random error to simulate 
rexp += 0.05*np.random.normal(size=len(Sexp))

# Evaluate model to plot smooth curve
S = np.linspace(np.min(Sexp),np.max(Sexp),100)
r = Vmaxexact*S / (Kmexact+S)

15.5.5. Main Idea#

  • Use residuals to estimate uncertainty in dependent variable

  • Simulate regression procedure 1000s of times, adding random noise to dependent variable

  • Examine distribution of fitted values

# variance of residuals
print(sigre)
0.003717977540144133

15.5.6. Pseudocode#

Class Activity

Write pseudocode with a partner.

15.5.7. Python Hints#

Let’s say we want to generate a vector with 5 elements, each normally distributed with mean 0 and standard deviation 2. We can do this in one line using Python:

my_vec = np.random.normal(loc = 0,scale = 2,size=(5))
print(my_vec)
[-2.71241136  0.91627376 -0.96646476  2.57309647  0.73030355]

15.5.8. Implement in Python#

Class Activity

Implement your pseudocode in Python.

# Number of Monte Carlo samples
nmc = 1000;

# Number of data points
n = len(rexp)

# Declare a matrix to save the fitted parameters
theta_mc = np.zeros((nmc,2))

# Recall, the standard deviation of the residuals is sigre**0.5

# Perform Monte Carlo simulation
# Add your solution here

15.5.9. Visualize Distribution of Fitted Parameters#

# Histogram for Vmax
plt.hist(theta_mc[:,0])
plt.xlabel("Best Fit Value for $V_{max}$")
plt.ylabel("Number of MC Simulations")
plt.show()
../../_images/c6f23757d17b6d71494bf7b544d596dfa37be23f0c34a9fe5bc5bd9409221de1.png
# Histogram for Km
plt.hist(theta_mc[:,1])
plt.xlabel("Best Fit Value for $K_{m}$")
plt.ylabel("Number of MC Simulations")
plt.show()
../../_images/e25871cd4faba22a81c0c453624129fb399f73dda7254d74bf4285a1faefe52b.png
# We can even make a 2D histogram
plt.hist2d(theta_mc[:,0], theta_mc[:,1],bins=50)
plt.xlabel("Best Fit Value for $V_{max}$")
plt.ylabel("Best Fit Value for $K_{m}$")
plt.show()
../../_images/82286d93378cdb7a5ab01d95f7d7ce6932c427825c53c508f616f0f900ebbef8.png

15.5.10. Compute Covariance Matrix#

## We can calculate covariance in one, carefully constructed line
# For some reason, the NumPy developers choose to assume each ROW is a new variable
# and each COLUMN is a different observation. I would have choose a flipped convention.
# Our 'theta_mc' has each row as a different observation (simulation).
# 'rowvar=False' tells NumPy to flip its convention. This is a great example of
# WHY YOU SHOULD ALWAYS CHECK THE DOCUMENTATION!!!
np.cov(theta_mc,rowvar=False)
array([[0.01497054, 0.08128554],
       [0.08128554, 0.54461604]])