Lab 2: Model Identification

Lab 2: Model Identification#

Your Name:

Fitting models to data is an engineering skill that links between the real world of engineering systems to the theory you’ve been learning in the classroom. For this laboratory session you will collect data from a step test experiments, then fit the data to models derived from first-principles energy balances.

Overview#

Download a copy of this notebook to your laptop or lab computer.
Complete Exercise 0 and submit before lab on Friday.
Complete Exercise 1 during lab on Friday. The results should be embedded in the notebook. Be sure to ‘save-as-you-go’ to avoid losing your work.
After completing Exercise 1, save the resulting data file ending in .csv to your Google Drive, email it to yourself, etc.
Use your step test data (Lab 1) and sine test data (Lab 2) to fit two versions of a model for the temperature control lab: (a) one-state model with one input, and (b) two-state model with one input.
Submit your completed lab notebook via Gradescope.

Before you leave lab on Friday, you should complete Exercise 1 and email yourself a copy of this .ipynb file and your .csv data file. If you need help finding the .csv file on your computer, please ASK!

Reminder: Do NOT open Jupyter Lab via the Anaconda Navigator. Instead, open the Anaconda Prompt (or terminal on macOS) and type the following commands:

conda activate controls
jupyter lab

If you open Jupyter Lab via the Anaconda Navigator, you may experiment issues changing the kernel to controls.

Our labratory assignments this semester build up each other. For best results, you want to use the same hardware configuration for each lab. To help with this, please fill in the following:

TCLab: Did you use your own TCLab or borrow one (friend, class set)?
Power adapter: How did you plug the TCLab into power? Did you use the adapter that came with your kit, a different USB power adapter, or one of the plugs in our computer classroom? What is the power rating of the adapter? (e.g., 1A or 2A or 2.1A at 5V?, Was the power adapter/USB port labeled “phone” or “tablet”?)
Suggestion: take a picture of your TCLab setup including the power adapter. Keep this on your phone until the end of the semester.

Exercise 0. Fitting a First-Order Linear Model#

As discussed in class, a simple energy balance model for T1 is given by

\[\begin{equation*} C_p \frac{dT_1}{dt} = U_a(T_{amb} - T_1) + \alpha P_1 U_1 \end{equation*}\]

where the parameter \(\alpha\) has, through independent means, been determined as 0.00016 when U1 is given in percent of full power and power is measured in Watts.

Do the following:

Perform nonlinear regression to estimate \(U_a\) and \(C_p\). In Lab 1, we estimated \(\tau\) and \(K\) and then calculated \(U_a\) and \(C_p\). In this lab, you will instead directly regress \(U_a\) and \(C_p\).
Quantify the uncertainty in the estimates \(U_a\) and \(C_p\) by computing the Fisher information matrix, covariance matrix, and correlation matrix.
Perform an eigendecomposition of the covariance matrix.

This pre-lab exercise builds up the analysis workflow you will adapt for the rest of the lab.

import pandas as pd
import matplotlib.pyplot as plt

# Start by loading your data from the previous lab

# Add your solution here

# Next, quickly plot your data from the previous lab

# Add your solution here

import numpy as np
from scipy.integrate import solve_ivp

# Next complete the function below which simulates 
# the first order model for the temperature of the heater.

def tclab_model1(data, param, alpha = 0.00016, P1 = 200, plot=False):
    """ First order linear model for TCLab
    
    Arguments:
        data: pandas dataframe with columns 'Time', 'T1', 'Q1'
        param: list of parameters [Ua, CpH]
        alpha: power conversion factor, default 0.00016 watts / (units P1 * percent U1)
        P1: power setting for heater 1, default 200 (P1 units)
    
    Returns:
        T1: numpy array of model predictions for T1
    """

    # unpack the adjustable parameters
    Ua, CpH = param

    # extract ambient temperature
    T_amb = data['T1'][0]

    # extract experiment time
    t_expt = data['Time']

    # model solution
    def deriv(t, y):

        # interpolate the heater input to nearest value is faster than linear
        u = np.interp(t, data['Time'], data['Q1'], t)

        # unpack the state
        T1 = y

        # Note: We do NOT need to use deviation variables here.
        # We can use the true temperature and the true ambient temperature

        # Add your solution here


        return dT1dt
    soln = solve_ivp(deriv, [min(t_expt), max(t_expt)], [T_amb, T_amb], t_eval=t_expt)

    T1 = soln.y[0]

    if plot:
        # Plot the temperature data and heat power
        plt.figure()
        plt.subplot(2,1,1)
        plt.plot(t_expt, data['T1'], 'ro', label='Measured')
        plt.plot(t_expt, T1, 'b-', label='Predicted')
        plt.ylabel('Temperature (degC)')
        plt.legend()
        plt.subplot(2,1,2)
        plt.plot(t_expt, data['Q1'], 'r-', label='Heater')
        plt.ylabel('Heater (%)')
        plt.legend()
        plt.xlabel('Time (sec)')
        plt.tight_layout()
        plt.show()

    return T1


# Test your function with the data and initial parameters
# Use the initial parameters from Lab 1 for Ua and CpH
# Add your solution here

from scipy.optimize import least_squares

# Next, complete the function below to perform nonlinear regression

def tclab_estimation1(data, param_guess=[0.1, 4], plot=True):
    """ Nonlinear regression for TCLab one-state model
    
    Arguments:
        data: pandas dataframe with columns 'Time', 'T1', 'Q1'
        param_guess: list of initial guess for parameters [Ua, CpH]
    
    Returns:
        results: scipy least_squares results object
        residuals: numpy array of residuals
    """
    
    def residual(param):
        # Add your solution here

    # Set bounds (non-negative)
    bnds = ([0.0, 0.0], [np.inf, np.inf])

    results = least_squares(residual, param_guess, verbose=2, method='trf', bounds=bnds, loss='arctan')

    if plot:
        tclab_model1(data, results.x, plot=True)

    residuals = tclab_model1(data, results.x) - data['T1']

    if plot:
        # Plot the residuals
        plt.figure()
        plt.hist(residuals)

        # font size
        fs = 20
        
        # labels
        plt.xlabel("$T_1$ residuals [$^\circ{}$C]",fontsize=fs,fontweight = 'bold')
        plt.ylabel("Count",fontsize=fs,fontweight = 'bold')

        # define tick size
        plt.xticks(fontsize=fs)
        plt.yticks(fontsize=fs)
        plt.tick_params(direction="in",top=True, right=True)

        # finish plot
        plt.show()

    return results, residuals

# Test your function with the data
step_test_results1, step_test_residuals1 = tclab_estimation1(data1)

def print_results1(results):
    print(f"Ua = {results.x[0]:1.3f} W/degC")
    print(f"CpH = {results.x[1]:1.3f} J/degC")
    print(f"Success: {results.success}")
    print(f"Message: {results.message}")

# Save the results for later
Ua_ex0 = step_test_results1.x[0]
Cp_ex0 = step_test_results1.x[1]

print_results1(step_test_results1)

Next, we will write a function to quantify the uncertainty in the estimated parameters. Let’s review some key concepts.

Let \(\mathbf{Q}\) represent the Jacobian matrix of the model residuals \(\mathbf{r}\) with respect to the model parameters \(\mathbf{\theta}\).

Often, we assume the measurement error is iid (independent and identifcally distributed) Gaussian. However, we do not know the standard deviation of the measurement error. Instead, we can estimate it!

\[\begin{equation*} \hat{\sigma}^2 = \frac{\mathbf{r}^\intercal \mathbf{r}}{|\mathbf{r}| - |\mathbf{\theta}|} \end{equation*}\]

Here \(\hat{\sigma}^2\) is the estimate for the variance of the measurement error. The operator \(| \cdot |\) means cardinality, i.e., number of items in a set or the length of a vector. Thus, \(| \mathbf{r} |\) is the number of resdiauls (data points) and \(| \mathbf{\theta} |\) is the number of fitted parameters.

The Fisher information matrix is one way to quantify the amount of data contained in the experimental data about the model parameters \(\mathbf{\theta}\). With iid Gaussian measurement error, the FIM is:

\[\begin{equation*} \mathbf{M} \approx \hat{\sigma}^{-2} \mathbf{Q}^\intercal \mathbf{Q} \end{equation*}\]

This formula should look familar. This is because the the FIM \(\mathbf{M}\) is approximately the inverse of the parameter covariance matrix:

\[\begin{equation*} \mathbf{\Sigma_{\hat{\theta}}} \approx \mathbf{M}^{-1} \approx \hat{\sigma}^{2} \left( \mathbf{Q}^\intercal \mathbf{Q} \right)^{-1} \end{equation*}\]

Complete the function below to compute the FIM and parameter covariance matrix. We will also compute an eigendecomposition of the parameter covariance matrix.

# Perform uncertainty quantification

def covariance_to_correlation(cov):
    ''' Convert covariance matrix into correlation matrix

    Argument:
        cov: covariance matrix

    Returns:
        cor: correlation matrix

    '''

    # Copy matrix
    cor = cov.copy()

    # Get number of rows
    n = cor.shape[0]

    # Loop over rows
    for r in range(n):
        # Loop over columns
        for c in range(n):
            # Scale element
            cor[r,c] = cor[r,c] / np.sqrt(cov[r,r]*cov[c,c])

    return cor


def uncertainty_quantification(results, residuals):
    """ Uncertainty quantification for the estimated parameters
    
    Arguments:
        res: scipy least_squares results object
    
    Returns:
        fim: Fisher Information Matrix
        cov: covariance matrix
    """

    # compute standard deviation of the residuals
    sigma_residuals = np.sqrt(residuals.T @ residuals / (len(residuals) - len(results.x)))
    print("Standard deviation of the residuals = ", round(sigma_residuals, 3), " deg C")
    print("")

    # compute Fisher Information Matrix
    # hint: see formula above
    # Add your solution here
    print("Fisher Information Matrix")
    print(fim)
    print("")

    # compute the covariance matrix
    # hint: see formula above, this is easy once you have the FIM
    print("Covariance Matrix")
    # Add your solution here
    print(cov)
    print("")

    # eigendecomposition of the covariance matrix
    print("Eigendecomposition of the covariance matrix")
    w, v = np.linalg.eig(cov)

    print("")
    for i in range(len(w)):
        print(f"lambda_{i+1} = {w[i]:1.3e}")
        print(f"v_{i+1} = {v[:,i]}")
        print("")


    # compute the correlation matrix
    print("Correlation Matrix")
    cor = covariance_to_correlation(cov)
    print(cor)

    def print_with_abs_uncertainty(i, name, unit):
        print(f"{name} = {results.x[i]:1.3f} {unit} +/- {np.sqrt(cov[i,i]):1.3f} {unit}")

    def print_with_percent_uncertainty(i, name, unit):
        print(f"{name} = {results.x[i]:1.3f} {unit} +/- {(100*np.sqrt(cov[i,i])/results.x[i]):1.2f}% ")

    # print the estimated parameters with uncertainty
    print("\nParameter estimates with uncertainty (covariance diagonals):")
    if(len(results.x) == 2):
        print_with_abs_uncertainty(0, "Ua", "W/degC")
        print_with_abs_uncertainty(1, "CpH", "J/degC")
        print(" ")
        print_with_percent_uncertainty(0, "Ua", "W/degC")
        print_with_percent_uncertainty(1, "CpH", "J/degC")

    elif(len(results.x) == 4):
        print_with_abs_uncertainty(0, "Ua", "W/degC")
        print_with_abs_uncertainty(1, "Ub", "W/degC")
        print_with_abs_uncertainty(2, "CpH", "J/degC")
        print_with_abs_uncertainty(3, "CpS", "J/degC")
        print(" ")
        print_with_percent_uncertainty(0, "Ua", "W/degC")
        print_with_percent_uncertainty(1, "Ub", "W/degC")
        print_with_percent_uncertainty(2, "CpH", "J/degC")
        print_with_percent_uncertainty(3, "CpS", "J/degC")

    else:
        print("Skipped printing... check dimensions")
    
    return fim

fim1 = uncertainty_quantification(step_test_results1, step_test_residuals1)

Q1: Write a few sentences to comment on your residual plot. (Recall from Num Stats, the historgram of the residuals is helpful to check regression assumptions.) Based on your residual plot, are the regression assumptions reasonably satisfied?

Answer:

Q2: Write a few sentences to comment on your uncertainty estimates. As engineers, you will need to develop a “gut instinct” if calculations/results are reasonable based on your intuition. Do you think your uncertainty estimates are reasonable? Why or why not?

Answer:

Exercise 1. Sine Test and Data Collection#

In this part of the lab, you will execute a sine test experiment. In the remaining exercises, you will use both the sine test experiment (Lab 2) and the step test experiment (Lab 1) to estimate state-space models for the TCLab system.

Step 1. Verify operation.#

Execute the following cell to verify that you have a working connection to the temperature control lab hardware. This will test for installation of TCLab.py, connection to the Arduino device, and working firmware within the Arduino.

from tclab import TCLab, clock, Historian, Plotter, setup

run_tclab = False

"""
In the labs, we will us "run_tclab" to control whether the TCLab is used.
After you finish the lab experiment, set run_tclab = False.
This way, you can run all the cells without losing your TCLab output.
"""

if run_tclab:

    TCLab = setup(connected=True)

    lab = TCLab()
    print("TCLab Temperatures:", lab.T1, lab.T2)
    lab.close()

Step 2. Check for steady state#

As discussed in class, for step testing the device must be initially at steady state. Run the following code to verify the heaters are off and that the temperatures are at a steady ambient temperature.

if run_tclab:
    # experimental parameters
    tfinal = 30

    # perform experiment
    with TCLab() as lab:
        lab.U1 = 0
        lab.U2 = 0
        h = Historian(lab.sources)
        p = Plotter(h, tfinal)
        for t in clock(tfinal):
            p.update(t)

Step 3. Sine Test#

The step test consists of modulating the power to one heat to follow a sine wave and recording temperature data for at least 1200 seconds.

import numpy as np
import matplotlib.pyplot as plt
def sine_wave(t, period=4*60, amplitude=50, offset=50):
    """ Generate a sine wave with the given parameters.

    Arguments:
    t:          time (seconds)
    period:     period of the sine wave (seconds)
    amplitude:  amplitude of the sine wave
    offset:     offset of the sine wave
    
    """
    return amplitude * np.sin(2 * np.pi * t / period) + offset

t = np.linspace(0, 900, 901)
plt.plot(t, sine_wave(t))
plt.xlabel('Time / seconds')
plt.ylabel('Power / %')
plt.show()

Copy the code from Step 2 into the following cell. Then modify as needed to accomplish the step test with P1 at 200.

if run_tclab:

    # experimental parameters
    tfinal = 1200

    # perform experiment
    with TCLab() as lab:
        lab.P1 = 200
        lab.U1 = 50 # this is not needed because we will reset it in the loop
        lab.U2 = 0
        h = Historian(lab.sources)
        p = Plotter(h, tfinal)
        for t in clock(tfinal):
            # Set lab.U1 to the sine wave function

            # Add your solution here
            
            p.update(t)

Step 4. Save data to a .csv file#

Run the following cell to verify and save your data to a ‘.csv’ file. Be sure you can find and locate the data on your laptop before ending your session. You will need access to this data for subsequent exercises.

import matplotlib.pyplot as plt
import os.path

# Change the filename here
data_file = 'lab2-sine-test.csv'

if run_tclab:

    # Set to True to overwrite the file. Default is False
    # to safeguard against accidentally rerunning this cell.
    overwrite_file = False

    if not overwrite_file and os.path.isfile('./'+data_file):
        raise FileExistsError(data_file + ' already exisits. Either choose a new filename or set overwrite_file = True.')
    elif run_tclab:
        h.to_csv(data_file)
        print("Successfully saved data to "+data_file)
    else:
        print("Data was not saved because run_tclab = False")

Exercise 2. Fit First-Order Model using Sine Test Data#

You will now repeat the steps for Exercise 0 but using the sine test data.

Start by loading the data into a pandas dataframe. Verify your pandas dataframe is correct by plotting the data.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Load the data
# Add your solution here

# Add your solution here

Next, simulate the first-order linear model you estimated in Exercise 0 but for your sine wave experiment.

# Simulate the first order model for the sine wave test
# using the values of Ua and Cp you identified in Exercise 0

# Add your solution here

Next, perform nonlinear regression to refine the parameter estimates. Be sure to:

Set verbose=2 for least_squares
Plot the histogram of the residuals.

# Perform nonlinear regression on the sine wave test data
# Hint: Use the results from Exercise 0 as the initial guess
# Add your solution here

# Save the results for later
Ua_ex1 = sine_test_results1.x[0]
Cp_ex1 = sine_test_results1.x[1]

print_results1(sine_test_results1)
### END SOLUTION

Next, quantify the uncertainty in your regression results. Save the FIM in the variable fim2. (The 2 is for Exercise 2.)

# Perform uncertainty quantification on the sine wave test data
# Add your solution here

Q1: Compare your regression results, especially the time series plot, to Exercise 0 (step test)? Does the first order model better explain the step test or sine test experiment?

Answer:

Q2: Comment on your uncertainty estimates in a few sentences. Based on your uncertainty estimations, does the first order model better describe the step test or sine test experiments?

Answer:

Exercise 3. Fitting a Second-Order Model using Sine Test Data#

A second order model for the heater/sensor combination is given by

\[\begin{align*} C^H_p\frac{dT_{H,1}}{dt} & = U_a(T_{amb} - T_{H,1}) + U_b(T_{S,1} - T_{H,1}) + \alpha P_1u_1\\ C^S_p\frac{dT_{S,1}}{dt} & = U_b(T_{H,1} - T_{S,1}) \end{align*}\]

where \(T_{H,1}\) is the heater temperature, \(T_{S,1}\) is the sensor temperature, and \(U_b\) is the heat transfer coefficient between the heater and sensor.

Modify the code you developed for Exercises 0 and 1 to fit this second order model.

Do the following:

Report your best fit for \(U_a\), \(U_b\), \(C^H_p\), and \(C^S_p\) with units and a reasonable number of significant digits.
Characterize the uncertainty in these estimates.
Visualize the quality of fit and the residuals.
Answer the discussion questions.

Start by completing the function below to simulate the second-order linear model for the TCLab.

def tclab_model2(data, param, alpha = 0.00016, P1 = 200, plot=False, plot_title=None):
    """ Second order linear model for TCLab
    
    Arguments:
        data: pandas dataframe with columns 'Time', 'T1', 'Q1'
        param: list of parameters [Ua, Ub, CpH, CpS]
        alpha: power conversion factor, default 0.00016 watts / (units P1 * percent U1)
        P1: power setting for heater 1, default 200 (P1 units)
        plot: boolean, if True, plot the data and model predictions
        plot_title: string, title for the plot
    
    Returns:
        TS: numpy array of model predictions for TS
    """

    # unpack the adjustable parameters
    Ua, Ub, CpH, CpS = param

    # extract ambient temperature
    T_amb = data['T1'][0]

    # extract experiment time
    t_expt = data['Time']

    # model solution
    def deriv(t, y):

        # interpolate the heater input to nearest value is faster than linear
        u = np.interp(t, data['Time'], data['Q1'], t)

        T1H, T1S = y
        # Add your solution here
        return [dT1H, dT1S]

    soln = solve_ivp(deriv, [min(t_expt), max(t_expt)], [T_amb, T_amb], t_eval=t_expt)

    TH = soln.y[0]
    TS = soln.y[1]

    if plot:
        # Plot the temperature data and heat power
        plt.figure()
        plt.subplot(2,1,1)
        plt.plot(t_expt, data['T1'], 'ro', label='Measured $T_S$')
        plt.plot(t_expt, TS, 'b-', label='Predicted $T_S$')
        plt.plot(t_expt, TH, 'g-', label='Predicted $T_H$')
        plt.ylabel('Temperature (degC)')
        plt.legend()
        plt.subplot(2,1,2)
        plt.plot(t_expt, data['Q1'], 'r-', label='Heater')
        plt.ylabel('Heater (%)')
        plt.legend()
        plt.xlabel('Time (sec)')
        plt.title(plot_title)
        plt.tight_layout()
        plt.show()

    return TS

# Test your function with the data and initial parameters
TS = tclab_model2(data2, [0.1, 0.2, 4, 0.1], plot=True)

Next, perform nonlinear regression to improve the parameter estimates.

# Next, complete the function below to perform nonlinear regression

def tclab_estimation2(data, param_guess=[0.1, 0.2, 6, 3], plot=True, verbose=2):
    """ Nonlinear regression for TCLab two-state model
    
    Arguments:
        data: pandas dataframe with columns 'Time', 'T1', 'Q1'
        param_guess: list of initial guess for parameters [Ua, Ub, CpH, CpS]
        plot: if True, create a histogram of the residuals
        verbose: level of output passed to least_squares
    
    Returns:
        results: scipy least_squares results object
        residuals: numpy array of residuals
    """
    
    

    def residual(param):
        # Add your solution here

    # Set bounds (non-negative)
    bnds = ([0.0, 0.0, 0.0, 0.0], [np.inf, np.inf, np.inf, np.inf])

    results = least_squares(residual, param_guess, verbose=verbose, method='trf', bounds=bnds, loss='arctan')

    if plot:
        tclab_model2(data, results.x, plot=True)

    residuals = tclab_model2(data, results.x) - data['T1']

    if plot:
        # Plot the residuals
        plt.figure()
        plt.hist(residuals)

        # font size
        fs = 20
        
        # labels
        plt.xlabel("$T_S$ residuals [$^\circ{}$C]",fontsize=fs,fontweight = 'bold')
        plt.ylabel("Count",fontsize=fs,fontweight = 'bold')

        # define tick size
        plt.xticks(fontsize=fs)
        plt.yticks(fontsize=fs)
        plt.tick_params(direction="in",top=True, right=True)

        # finish plot
        plt.show()


    return results, residuals

# Test your function with the data
sine_test_results2, sine_test_residuals2 = tclab_estimation2(data2)

def print_results2(results):
    # Add your solution here

# Save the results for later
Ua_ex3, Ub_ex3, CpH_ex3, CpS_ex3 = sine_test_results2.x

print_results2(sine_test_results2)

Next, quantify the uncertainty in the parameter estimates.

# Perform uncertainty quantification
# Add your solution here

Finally, following discussion questions. Each response should be a few sentences in length.

Q1: Compare the model predictions, especially the time series plots, for the first-order and second-order linear models. Which model better described the sine test experiment? Use some quantitative data to support your answer.

Answer:

Q2: Why does your observation for Q1 mathematically make sense?

Answer:

Exercise 4. Multistart Initialization#

Nonlinear regression is (often) a nonconvex problem with many unique local minimum. We want to find a set of parameters that are (near the) global minimum. One pragmatic approach to this is multi-start initialization.

Study the following code below. Then, run the code. Warning: This cell may take ~5 minutes to execute.

def multi_start_initialization(regression_function, lowerbounds, upperbounds, n_restarts, plot=False):
    ''' Perform multi-start initialization for nonlinear regression

    Arguments:
        regression_function: function to perform nonlinear regression
        lowerbounds: lower bounds for the parameters
        upperbounds: upper bounds for the parameters
        n_restarts: number of restarts

    Returns:
        best_theta: best parameters
    ''' 

    best_theta = None
    best_objective = np.inf

    all_objectives = np.zeros(n_restarts)

    for i in range(n_restarts):
        
        print("\n***************************************************")
        print(f"Restart {i+1} of {n_restarts}")
        
        # Generate random initial guess
        theta0 = np.random.uniform(lowerbounds, upperbounds)
        print("theta0 = ", theta0)
        

        # Perform nonlinear regression
        results, residuals = regression_function(theta0)
        
        print("result object:\n", results)

        # Compute objective
        objective = results.cost

        # Store best objective
        all_objectives[i] = objective

        # Update best parameters
        if objective < best_objective:
            best_theta = results.x
            best_objective = objective


    plt.figure()
    plt.hist(all_objectives)
    plt.xlabel('Sum of Squared of Residuals')
    plt.ylabel('Frequency')
    plt.tight_layout()
    plt.show()

    return best_theta

# Define the bounds for the parameters
lowerbounds = [0.01, 0.1, 3, 1]
upperbounds = [0.1, 0.3, 7, 4]

# Define the regression function
reg_function = lambda theta, plot=False: tclab_estimation2(data2, theta, plot=plot, verbose=1)

# Perform multi-start initialization
best_theta = multi_start_initialization(reg_function, lowerbounds, upperbounds, 20)

The historgram and the output above show the results of repeating the parameter estimation problem several times using different initial conditions.

Next, let’s rerun the parameter estimation problem using the best parameter estimate obtained from the multi-start initialization. This time, we will plot the data and model predictions.

multi_start_results, multi_start_residuals = tclab_estimation2(data2, best_theta, plot=True)

Next, quantify the uncertainty in the parameter estimates. Store the results in fim4.

# Add your solution here

Finally, answer the following discussion questions.

Q1: Compare the time series plots and residual histogram plots from the original parameter estimation (Exercise 3) and with multi-start initialization (Exercise 4). Quantitatively describe a few differences. Based on these plots, is multi-start initialization important?

Answer:

Q2: Quantitatively compare the uncertainty estimates from Exercises 3 and 4. Based on these results, is multi-start initialization important?

Answer:

Exercise 5: Fit Second Order Model Using Step Test Data#

Next, we will repeat parameter estimation for the second-order linear model, but using the step test data (Lab 1).

# Perform regression
# Hint: use your previous results as a starting point
# Add your solution here

Next, quantify the uncertainty in the parameter estimates. Store the results in fim5.

# Perform uncertainty quantification
# Add your solution here

Finally, answer the following discussion questions:

Q1: Compare the time-series plots from Exercises 4 and 5. Which experimental data is better described by the model?

Answer:

Q2: Compare the residual histogram plots from Exercises 4 and 5. For which experiment are the measurement error assumptions best satisfied?

Answer:

Exercise 6: Fit Second Order Model Using Both Data Sets#

Next, perform simultanous parameter estimation for both data sets. The goal here is to find a single set of parameter values that adequately describes both experiments.

We will start by defining and testing some residual functions.

# Define functions to compute residuals for all data sets

def residual_single_data_set(param, df):
    '''Evaluate the residuals for a single data set
        Arguments:
            param: list of parameters [Ua, Ub, CpH, CpS]
            df: pandas dataframe with columns 'Time', 'T1', 'Q1'

        Returns:
            residuals: numpy array of residuals
    '''

    # Add your solution here

def residuals_all_data_sets(param, data_sets):
    ''' Return the residuals for all of the experiments'''

    residuals = np.array([])
    for name, df in data_sets.items():
        
        # Hint: Look at the documentation for np.concatenate
        residuals = np.concatenate((residuals, residual_single_data_set(param, df).to_numpy())) 
        
    return residuals 

# Assemble the dictionary of data sets
data_sets = {'sine': data2, 'step': data1}

# Define the initial guess for the parameters
param = [0.1, 0.2, 4, 0.1]

# Test the function
residual_single_data_set(param, data_sets['sine'])

# Test the function
residual_single_data_set(param, data_sets['step'])

# Test the function
residuals_all_data_sets(param, data_sets)

Next, complete the function below to perform nonlinear regression with multiple datasets.

def tclab_estimation2_multiple_datasets(data_sets, param_guess=[0.1, 0.2, 4, 0.1], plot=True, verbose=2):
    """ Nonlinear regression for TCLab two-state model that supports multiple datasets
    
    Arguments:
        data_sets: dictionary
            keys = data set names (unqiue strings)
            values = pandas dataframes with columns 'Time', 'T1', 'Q1'
        param_guess: list of initial guess for parameters [Ua, Ub, CpH, CpS]
        plot: if True, create a histogram of the residuals
        verbose: level of output passed to least_squares (default 2)
    
    Returns:
        results: scipy least_squares results object
        residuals: numpy array of residuals
    """

    # Set bounds (non-negative)
    bnds = ([0.0, 0.0, 0.0, 0.0], [np.inf, np.inf, np.inf, np.inf])
    
    results = least_squares(residuals_all_data_sets, 
                            param_guess, 
                            verbose=verbose, 
                            method='trf', 
                            bounds=bnds, 
                            loss='arctan', 
                            args=(data_sets,))

    # Compute the residuals usin the best fit value
    residuals = residuals_all_data_sets(results.x, data_sets)

    if plot:

        # Loop over the datasets
        for name, df in data_sets.items():
            # Rerun the model and plot the results
            
            tclab_model2(df, results.x, plot=True, plot_title=f"{name} test")
            
            # Plot the residuals
            residuals_single = residual_single_data_set(results.x, df)
            plt.figure()
            plt.hist(residuals_single)

            # font size
            fs = 20
            
            # labels
            plt.xlabel("$T_S$ residuals [$^\circ{}$C]",fontsize=fs,fontweight = 'bold')
            plt.ylabel("Count",fontsize=fs,fontweight = 'bold')

            # define tick size
            plt.xticks(fontsize=fs)
            plt.yticks(fontsize=fs)
            plt.tick_params(direction="in",top=True, right=True)

            # add title
            plt.title(f"{name} test", fontsize=fs,fontweight = 'bold')

            # finish plot
            plt.show()

    return results, residuals

Finally, perform nonlinear regression. Use the parameter estimates from Exercise 4 (multi-start with sine test data) as an initial point.

# Perform nonlinear regression
both_results2, both_residuals2 = tclab_estimation2_multiple_datasets(data_sets, param_guess=best_theta)

Next, quantify the uncertainty in the estimated parameters. Store the results in fim6.

# Perform uncertainty quantification
# Add your solution here

Yep, you guessed it. We should perform multi-start initialization. Store the new parameter estimate in both_multi_start. Warning: This cell may take ~5 minutes to execute.

# Add your solution here

Next, repeat parameter estimation using the best solution, stored in both_multi_start, as the initial guess. Be sure to make plots.

both_results_restart, both_residuals_restart = tclab_estimation2_multiple_datasets(data_sets, param_guess=both_multi_start, plot=True)

Complete the calculations by quantifying the uncertainty in the parameter estimates.

# Add your solution here

In place of discussion questions, write a short paragraph comparing the plots from Exercises 4, 5, and 6.

Answer:

Exercise 7: Analyze Uncertainty Estimates for Second-Order Models#

Take a few minutes to review your uncertainty analysis results for Exercises 4, 5, and 6. Then, complete the following table by filling in the estimated value \(\pm\) the uncertainty. Please only report a reasonable number of significant digits.

	Sine Test	Step Test	Both
\(U_a~~\left(\frac{\text{W}}{\text{°C}}\right)\)
\(U_b~~\left(\frac{\text{W}}{\text{°C}}\right)\)
\(C_p^H~~\left(\frac{\text{J}}{\text{°C}}\right)\)
\(C_p^S~~\left(\frac{\text{J}}{\text{°C}}\right)\)

Next, use your quantitative results to answer the following questions.

Q1: Which parameter(s) have the least uncertainty? Is your answer consistent across the three Exercises? What is the physical justification/insight from this finding?

Answer:

Q2: Which parameter(s) have the greatest uncertainty? Is your answer consistent across the three Exercises? What is the physical justification/insight from this finding?

Answer:

Q3: What are some advantages of a step test? In other words, why are step tests extremely popular for systems identification?

Answer:

Q4: Why is it important to perform a sine test? In prior years, this lab focused on only a step test. Speculate as to why the lab was refactored to introduce a sine test.

Answer:

Declarations#

Collaboration: If you worked with any classmates, please give their names here. Describe the nature of the collaboration.

Generative AI: If you used any Generative AI tools, please elaborate here.

Reminder: The written discussions responses must be in your own words. Many of these questions ask about your specific results or are open-ended questions with many reasonable answers. Thus we expect unique responses, analyses, and ideas.

We may use writing analysis software to check for overly similar written responses. You are responsible for reviewing the colaboration policy outlined in the class syllabus to avoid violations of the honor code.

Lab 2: Model Identification

Contents

Lab 2: Model Identification#

Overview#

Exercise 0. Fitting a First-Order Linear Model#

Exercise 1. Sine Test and Data Collection#

Step 1. Verify operation.#

Step 2. Check for steady state#

Step 3. Sine Test#

Step 4. Save data to a .csv file#

Exercise 2. Fit First-Order Model using Sine Test Data#

Exercise 3. Fitting a Second-Order Model using Sine Test Data#

Exercise 4. Multistart Initialization#

Exercise 5: Fit Second Order Model Using Step Test Data#

Exercise 6: Fit Second Order Model Using Both Data Sets#

Exercise 7: Analyze Uncertainty Estimates for Second-Order Models#

Declarations#