Statistical Power in Python

13.9. Statistical Power in Python#

# load libraries
import scipy.stats as stats
import numpy as np
import math
import matplotlib.pyplot as plt

13.9.1. Learning Objectives#

After studying this notebook, completing the activities, participating in class, and reviewing your notes, should be able to:

Calculate statistical power in Python.

13.9.2. Documentation and References#

13.9.2.1. Exploring Statistical Power in Python#

The statistical power calculations above were for a z-test, which is reasonable with a large sample size (recall the CLT). Power calculations are more intricate with the t-distribution and other statistical tests.

We will now look at some tools to explore statistical power in Python. Here are some tutorials with more information:

Here is the documentation for the function we will use.

Class Activity

Discuss the code and plot below. This code analyzes Activity:Question 2 from the Chap 11-08 notebook.

13.9.3. Example Revisited#

# Import tools for power calculations for one-sample t-test
from statsmodels.stats.power import TTestPower

# let's continue to examine the process yield improvement example from the last notebook
mu0 = 80 # given in last notebook's example

# means for the alternate hypothesis
mu_alt = np.array([81, 82, 83, 84, 85, 86, 87])

# standard deviation
s = 5

# significance level
alpha = 0.05

# "effect size" refers to the between the null and alternate hypothesis means normalized by 
# the sample standard deviation
effect_sizes = (mu_alt - mu0) / s

# vector of sample sizes
sample_sizes = np.array(range(5, 100))

# declare numpy array to store power calculation results
nE = len(effect_sizes)
nS = len(sample_sizes)
p = np.zeros((nE,nS))

# create empty figure
plt.figure(figsize=(10,8))

analysis = TTestPower()

# loop over all combinations of effect size and sample size
for i in range(nE):
    for j in range(nS):
        
        # Calculate statistical power
        p[i,j] = analysis.power(effect_sizes[i], # effect size
                                  sample_sizes[j], # number of observations
                                  alpha, # significance level
                                  df=sample_sizes[j] - 1, # degrees of freedom
                                  alternative='larger' # specify one-sided hypothesis
                                 )
        
    # plot curve for this effect size
    plt.plot(sample_sizes,p[i,:],label="$\mu$ = "+str(mu_alt[i]),linewidth=3)

plt.xlabel("Number of Observations",fontsize=16)
plt.ylabel("Statistical Power",fontsize=16)
plt.title("Significance Level: "+str(alpha),fontsize=16)
plt.legend(fontsize=16,loc="lower right")
plt.grid(True)
plt.title
plt.show()

../../_images/47cc9d0e8a9dc809d93e87b73a5e8e7c03059db83be31f7157bbb54550f92972.png