This notebook contains material from cbe67701-uncertainty-quantification; content is available on Github.
Created by Chenguang Zhang (czhang9@nd.edu)
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import ElasticNet, ElasticNetCV
from sklearn.datasets import load_boston
This example was adapted from:
McClarren, Ryan G. (2018). Uncertainty Quantification and Predictive Computational Science: A Foundation for Physical Scientists and Engineers, Chapter 4: Local Sensitivity Analysis Based on Derivative Approximations. Springer. https://doi.org/10.1007/978-3-319-99525-0_4
The elastic net penalty combines the ridge and lasso penalties:
\begin{equation*} \widehat{\boldsymbol{\beta}}_{el}=\arg\min_{\boldsymbol{\beta}}\sum ^{I}_{i=1}(y_i- \boldsymbol{\beta} \cdot \mathbf{x}_i)^2 + \lambda_1\left \| \boldsymbol{\beta} \right \|_1 + \lambda_2\left \| \boldsymbol{\beta} \right \|_2^2 \end{equation*}
The first penalty term (from lasso) encourages sparsity in the coefficients; the second (from ridge) encourages highly correlated features to receive similar, averaged coefficients.
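As a quick numerical illustration of this objective, the sketch below evaluates it directly for a made-up candidate $\boldsymbol{\beta}$ and made-up penalty weights (everything here is hypothetical and only meant to spell out the formula):
import numpy as np

# Evaluate the elastic net objective for one candidate coefficient vector.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((10, 3))          # made-up inputs
y_demo = rng.standard_normal(10)               # made-up responses
beta = np.array([1.0, 0.0, -2.0])              # made-up candidate coefficients
lam1, lam2 = 0.1, 0.05                         # made-up penalty weights

residual = y_demo - X_demo @ beta
objective = (np.sum(residual**2)               # least-squares term
             + lam1 * np.sum(np.abs(beta))     # lasso (L1) penalty
             + lam2 * np.sum(beta**2))         # ridge (squared L2) penalty
print(objective)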
This example is adapted from the book. We fit a data set with 100 input parameters, of which only 5 dominate the response, using ElasticNet from sklearn. For simplicity, Latin hypercube sampling is not used here; the inputs are drawn from a standard normal distribution instead.
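If Latin hypercube sampling were desired, the inputs could instead be generated with scipy.stats.qmc and mapped to standard normals through the inverse CDF. This is only a sketch of the step we are skipping, and it assumes SciPy >= 1.7:
from scipy.stats import norm, qmc

# Sketch only: 40 Latin hypercube points in 100 dimensions, transformed
# from the unit hypercube to standard normal inputs.
sampler = qmc.LatinHypercube(d=100, seed=2020)
U = sampler.random(n=40)   # stratified uniform samples in [0, 1)^100
X_lhs = norm.ppf(U)        # inverse-CDF map to N(0, 1)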
The alpha and l1_ratio arguments of ElasticNet map onto the book's penalty weights via
$\lambda_1=$ alpha * l1_ratio
$\lambda_2=$ 0.5 * alpha * (1 - l1_ratio)
so alpha sets the overall penalty strength and l1_ratio plays the role of the book's mixing parameter $\alpha$. (Note that sklearn additionally divides the least-squares term by $2\,n_{samples}$, so alpha values are not numerically interchangeable with the book's penalty weights.)
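Inverting these relations gives alpha $=\lambda_1 + 2\lambda_2$ and l1_ratio $=\lambda_1/(\lambda_1 + 2\lambda_2)$. A small helper (hypothetical, not part of the original notebook) makes the conversion explicit:
def book_to_sklearn(lam1, lam2):
    """Convert the book's (lambda_1, lambda_2) into sklearn's (alpha, l1_ratio).

    Inverts lam1 = alpha * l1_ratio and lam2 = 0.5 * alpha * (1 - l1_ratio).
    """
    alpha = lam1 + 2.0 * lam2
    l1_ratio = lam1 / alpha
    return alpha, l1_ratio

print(book_to_sklearn(0.5, 0.25))  # -> (1.0, 0.5)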
# Generate samples: 40 observations of 100 features; only the first
# five features have large true sensitivities.
np.random.seed(2020)
n_samples, n_features = 40, 100
X = np.random.randn(n_samples, n_features)
sensitivities = np.array([20, 20, 5, 2.5, 1] + [0.1] * (n_features - 5)).reshape(n_features, 1)
# Response: linear model plus small Gaussian noise.
y = np.dot(X, sensitivities) + np.random.normal(scale=0.01, size=n_samples).reshape(n_samples, 1)
# Define elastic net models with fixed alpha=1 and varying l1_ratio
# (l1_ratio=1 reduces to pure lasso).
enet0 = ElasticNet(alpha=1, l1_ratio=0.01, max_iter=1000000)
enet1 = ElasticNet(alpha=1, l1_ratio=0.1, max_iter=1000000)
enet2 = ElasticNet(alpha=1, l1_ratio=0.5, max_iter=1000000)
enet3 = ElasticNet(alpha=1, l1_ratio=0.9, max_iter=1000000)
enet4 = ElasticNet(alpha=1, l1_ratio=1, max_iter=1000000)
# Fit the data (y is flattened so that coef_ has shape (n_features,)
# rather than (1, n_features), which would break the index plots below)
enet0.fit(X, y.ravel())
enet1.fit(X, y.ravel())
enet2.fit(X, y.ravel())
enet3.fit(X, y.ravel())
enet4.fit(X, y.ravel())
# Plot the nonzero coefficients recovered by each model against the truth.
plt.scatter(np.where(enet0.coef_)[0], enet0.coef_[enet0.coef_ != 0], s=8, marker='D', label='l1_ratio=0.01')
plt.scatter(np.where(enet1.coef_)[0], enet1.coef_[enet1.coef_ != 0], s=8, marker='v', label='l1_ratio=0.1')
plt.scatter(np.where(enet2.coef_)[0], enet2.coef_[enet2.coef_ != 0], s=8, marker='o', label='l1_ratio=0.5')
plt.scatter(np.where(enet3.coef_)[0], enet3.coef_[enet3.coef_ != 0], s=8, marker='s', label='l1_ratio=0.9')
plt.scatter(np.where(enet4.coef_)[0], enet4.coef_[enet4.coef_ != 0], s=20, marker='p', label='l1_ratio=1 (lasso)')
plt.scatter(np.where(sensitivities)[0], sensitivities[sensitivities != 0], s=2, marker=',', label='True coefficients')
plt.legend(loc='best')
plt.xlabel('parameters')
plt.ylabel('coefficients')
plt.show()
Cross-validation is often used to choose hyperparameters for regression models: the data are split into a training set and a test set, the model is fit on the training set, and its predictive performance is evaluated on the test set. sklearn provides built-in cross-validation estimators such as ElasticNetCV.
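As a minimal sketch of the idea behind ElasticNetCV (a single hypothetical train/test split rather than full K-fold cross-validation):
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split

# Sketch only: hold out 25% of the data, fit one candidate model on the
# rest, and score the held-out set; K-fold CV repeats this K times and
# averages the scores for each candidate alpha.
X_tr, X_te, y_tr, y_te = train_test_split(X, y.ravel(), test_size=0.25, random_state=0)
candidate = ElasticNet(alpha=1, l1_ratio=0.5, max_iter=1000000)
candidate.fit(X_tr, y_tr)
print("held-out R^2:", candidate.score(X_te, y_te))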
# 5-fold CV
enetCV = ElasticNetCV(cv=5, l1_ratio=0.5, random_state=0)
enetCV.fit(X, y.ravel())
print("selected alpha = ", enetCV.alpha_)
plt.scatter(np.where(enetCV.coef_)[0], enetCV.coef_[enetCV.coef_ != 0], s=8, marker='D', label='CV selected')
plt.scatter(np.where(enet2.coef_)[0], enet2.coef_[enet2.coef_ != 0], s=8, marker='o', label='previous result')
plt.scatter(np.where(sensitivities)[0], sensitivities[sensitivities != 0], s=2, marker=',', label='True coefficients')
plt.legend(loc='best')
plt.xlabel('parameters')
plt.ylabel('coefficients')
plt.show()
In this section we apply elastic net regression to the Boston housing dataset bundled with sklearn and examine the sensitivity of the target to each input parameter. To keep the example simple, no data transformation (e.g., standardization) is performed.
The dataset contains 13 parameters:
# Load the Boston housing dataset (506 samples, 13 features).
# Note: load_boston is only available in older scikit-learn versions (see below).
dataset = load_boston()
X = dataset.data
y = dataset.target
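load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so the cell above only runs on older versions. On newer versions the same data can be fetched from the original CMU source, following the recipe given in scikit-learn's deprecation notice (this sketch assumes pandas is installed and network access is available; the raw file stores each record across two lines):
import pandas as pd

data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
X = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])  # 13 features
y = raw_df.values[1::2, 2]                                       # target (MEDV)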
# Iterate over different l1_ratios and select the CV model with the
# smallest mean squared error.
l1_ratios = [0.1, 0.3, 0.5, 0.7, 0.9, 0.99]
fold = 40
min_mse = np.inf  # smallest mean squared error seen so far
for rho in l1_ratios:
    enetCV = ElasticNetCV(cv=fold, l1_ratio=rho, random_state=0)
    enetCV.fit(X, y)
    mses = np.mean(enetCV.mse_path_, axis=1)  # mean CV error along the alpha path
    mse = mses[np.where(enetCV.alphas_ == enetCV.alpha_)[0][0]]  # MSE at the chosen alpha
    plt.plot(-np.log10(enetCV.alphas_), mses, label='l1_ratio=' + str(rho))
    if mse < min_mse:
        min_mse = mse
        best_rho = rho
        best_alpha = enetCV.alpha_
        best_sens = enetCV.coef_
print("Best l1_ratio and alpha: %f, %f" % (best_rho, best_alpha))
plt.legend(loc='best')
plt.xlabel('-Log(Alpha)')
plt.ylabel('MSE')
plt.show()
print(dict(zip(dataset.feature_names, best_sens)))
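To rank the inputs by importance, the fitted coefficients can be sorted by magnitude (a small follow-up sketch, assuming the dataset object from load_boston is still in scope):
# Rank features by the magnitude of their elastic net coefficients.
order = np.argsort(-np.abs(best_sens))
for name, coef in zip(dataset.feature_names[order], best_sens[order]):
    print("%8s % .4f" % (name, coef))
Since the features were not standardized, raw coefficient magnitude is only a rough sensitivity measure here.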