{ "cells": [ { "cell_type": "markdown", "id": "8cf2197d", "metadata": {}, "source": [ "\n", "*This notebook contains material from [CBE60499](https://ndcbe.github.io/CBE60499);\n", "content is available [on Github](git@github.com:ndcbe/CBE60499.git).*\n" ] }, { "cell_type": "markdown", "id": "d83d0eaf", "metadata": {}, "source": [ "\n", "< [3.8 Algorithms Homework 2](https://ndcbe.github.io/CBE60499/03.08-Algorithms2.html) | [Contents](toc.html) | [Tag Index](tag_index.html) | [4.0 Constrained Nonlinear Optimization: Theory and Applications](https://ndcbe.github.io/CBE60499/04.00-Constrained.html) >

\"Open

\"Download\"" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 1, "link": "[3.9 Algorithms Homework 3](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9-Algorithms-Homework-3)", "section": "3.9 Algorithms Homework 3" } }, "source": [ "# 3.9 Algorithms Homework 3\n", "\n", "**Assignment Overview:** In this assignment, we will complete the function `unconstrained_newton` which implements four Hessian options (exact, SR1 approximation, BFGS approximation, steepest descent) and three globalization strategies (none, line search, trust region). We will then compare 10 algorithm variations on three test problems.\n", "\n", "The best way to really learn the algorithms discussed in lecture is to implement them yourselves. Creating a function like `unconstrained_newton` from scratch takes a lot of time and comfort with Python. Instead, we will start with fairly complete version of `unconstrained_newton`. You will need to fill in the following details:\n", "* BFGS Hessian approximation\n", "* Backtracking line search with Armijo-Goldstein conditions\n", "* Trust region with Powell dogleg step\n", "\n", "This assignment could take a long time, especially if you are still learning Python. Recognizing this, we will try a new grading policy for this assignment.\n", "\n", "**Instructions**: To be eligible for full credit on the assignment, you need to:\n", "1. Complete all requested pseudocode\n", "2. Spend an honest **6 hours** (4 hours for CBE 40499) total working on Python implementation.\n", "3. Be sure to complete the **Feature Status** subsection.\n", "4. The (hopefully) correct results for the test cases are available online in this notebook. Answer the questions using these results." ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 2, "link": "[3.9.1 Pseudocode](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.1-Pseudocode)", "section": "3.9.1 Pseudocode" } }, "source": [ "## 3.9.1 Pseudocode\n", "\n", "After reading through the entire assignment, please prepare pseudocode for the following functions/algorithms:\n", "1. SR1 update (Python code is provided. Going from code to pseudocode is good practice.)\n", "2. BFGS update\n", "3. Line search\n", "4. Trust region\n", "\n", "Please turn in this pseudocode via Gradescope.\n", "\n", "**Reminder:** pseudocode should not look like Python code copied to paper. Instead, pseudocode should clearly communicate the main steps of the algorithm with flow logic. Add lots of comments. It should not be programming language specific." ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 2, "link": "[3.9.2 Unconstrained NLP Algorithm in Python](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.2-Unconstrained-NLP-Algorithm-in-Python)", "section": "3.9.2 Unconstrained NLP Algorithm in Python" } }, "source": [ "## 3.9.2 Unconstrained NLP Algorithm in Python" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 3, "link": "[3.9.2.1 Library of Helper Functions](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.2.1-Library-of-Helper-Functions)", "section": "3.9.2.1 Library of Helper Functions" } }, "source": [ "### 3.9.2.1 Library of Helper Functions\n", "\n", "Below is a library of helpful functions. Most of these came from in class examples." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 3, "link": "[3.9.2.1 Library of Helper Functions](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.2.1-Library-of-Helper-Functions)", "section": "3.9.2.1 Library of Helper Functions" } }, "outputs": [], "source": [ "# Load required Python libraries.\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "from scipy import linalg\n", "\n", "## Check is element of array is NaN\n", "def check_nan(A):\n", " return np.sum(np.isnan(A))\n", "\n", "## Calculate gradient with central finite difference\n", "def my_grad_approx(x,f,eps1,verbose=False):\n", " '''\n", " Calculate gradient of function my_f using central difference formula\n", " \n", " Inputs:\n", " x - point for which to evaluate gradient\n", " f - function to consider\n", " eps1 - perturbation size\n", " \n", " Outputs:\n", " grad - gradient (vector)\n", " '''\n", " \n", " n = len(x)\n", " grad = np.zeros(n)\n", " \n", " if(verbose):\n", " print(\"***** my_grad_approx at x = \",x,\"*****\")\n", " \n", " for i in range(0,n):\n", " \n", " # Create vector of zeros except eps in position i\n", " e = np.zeros(n)\n", " e[i] = eps1\n", " \n", " # Finite difference formula\n", " my_f_plus = f(x + e)\n", " my_f_minus = f(x - e)\n", " \n", " # Diagnostics\n", " if(verbose):\n", " print(\"e[\",i,\"] = \",e)\n", " print(\"f(x + e[\",i,\"]) = \",my_f_plus)\n", " print(\"f(x - e[\",i,\"]) = \",my_f_minus)\n", " \n", " \n", " grad[i] = (my_f_plus - my_f_minus)/(2*eps1)\n", " \n", " if(verbose):\n", " print(\"***** Done. ***** \\n\")\n", " \n", " return grad\n", "\n", "## Calculate gradient using central finite difference and my_hes_approx\n", "def my_hes_approx(x,grad,eps2):\n", " '''\n", " Calculate Hessian of function my_f using central difference formula and my_grad\n", " \n", " Inputs:\n", " x - point for which to evaluate gradient\n", " grad - function to calculate the gradient\n", " eps2 - perturbation size (for Hessian NOT gradient approximation)\n", " \n", " Outputs:\n", " H - Hessian (matrix)\n", " '''\n", " \n", " n = len(x)\n", " H = np.zeros([n,n])\n", " \n", " for i in range(0,n):\n", " # Create vector of zeros except eps in position i\n", " e = np.zeros(n)\n", " e[i] = eps2\n", " \n", " # Evaluate gradient twice\n", " grad_plus = grad(x + e)\n", " grad_minus = grad(x - e)\n", " \n", " # Notice we are building the Hessian by column (or row)\n", " H[:,i] = (grad_plus - grad_minus)/(2*eps2)\n", "\n", " return H\n", "\n", "## Linear algebra calculation\n", "def xxT(u):\n", " '''\n", " Calculates u*u.T to circumvent limitation with SciPy\n", " \n", " Arguments:\n", " u - numpy 1D array\n", " \n", " Returns:\n", " u*u.T\n", " \n", " Assume u is a nx1 vector.\n", " Recall: NumPy does not distinguish between row or column vectors\n", " \n", " u.dot(u) returns a scalar. This functon returns an nxn matrix.\n", " '''\n", " \n", " n = len(u)\n", " A = np.zeros([n,n])\n", " for i in range(0,n):\n", " for j in range(0,n):\n", " A[i,j] = u[i]*u[j]\n", " \n", " return A\n", "\n", "## Analyze Hessian\n", "def analyze_hes(B):\n", " print(B,\"\\n\")\n", " \n", " l = linalg.eigvals(B)\n", " print(\"Eigenvalues: \",l,\"\\n\")" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 3, "link": "[3.9.2.2 Main Function](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.2.2-Main-Function)", "section": "3.9.2.2 Main Function" } }, "source": [ "### 3.9.2.2 Main Function\n", "\n", "Below is the main function. You need to complete details in four functions." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 3, "link": "[3.9.2.2 Main Function](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.2.2-Main-Function)", "section": "3.9.2.2 Main Function" } }, "outputs": [], "source": [ "def unconstrained_newton(calc_f,calc_grad,calc_hes,x0,verbose=False,max_iter=50,\n", " algorithm=\"newton\",globalization=\"line-search\", # specify algorithm\n", " eps_dx=1E-6,eps_df=1E-6, # Convergence tolerances (all)\n", " eta_ls=0.25,rho_ls=0.9,alpha_max=1.0, # line search parameters\n", " delta_max_tr=10.0,delta_0_tr=2.0, # trust region parameters\n", " kappa_1_tr = 0.25, kappa_2_tr = 0.75, gamma_tr=0.125 # trust region parameters\n", " ):\n", " '''\n", " Newton-Type Methods for Unconstrained Nonlinear Continuous Optimization\n", " \n", " Arguments (required):\n", " calc_f : function f(x) to minimize [scalar]\n", " calc_grad: gradient of f(x) [vector]\n", " calc_hes: Hessian of f(x) [matrix]\n", " \n", " Arguments (options):\n", " algorithm : specify main algorithm.\n", " choices: \"newton\", \"sr1\", \"bfgs\", \"steepest-descent\"\n", " \n", " globalization : specify globalization strategy\n", " choices: \"none\", \"line-search\", \"trust-region\"\n", " \n", " eps_dx : tolerance for step size norm (convergence), eps1 in notes\n", " \n", " eps_df : tolerance for gradient norm (convergence), eps2 in notes\n", " \n", " eta_ls : parameter for Goldstein-Armijo conditions (line search only)\n", " \n", " rho_ls : parameter to shrink (backstep) line search\n", " \n", " alpha_max : initial step length scaling for line search and/or steepest-descent\n", " \n", " delta_max_tr : maximum trust region size\n", " \n", " delta_0_tr : initial trust region size\n", " \n", " kappa_1_tr : 'shrink' tolerance for trust region\n", " \n", " kappa_2_tr : 'expand' tolerance for trust region\n", " \n", " gamma_tr : 'accept step' tolerance for trust region\n", " \n", " Returns:\n", " x : iteration history for x (decision variables) [list of numpy arrays]\n", " f : iteration history for f(x) (objective values) [list of numpy arrays]\n", " p : iteration history for p (steps)\n", " B : iteration history for Hessian approximations [list of numpy arrays]\n", " '''\n", " \n", " # Allocate outputs as empty lists\n", " x = [] # decision variables\n", " f = [] # objective values\n", " p = [] # steps\n", " grad = [] # gradients\n", " alpha = [] # line search / steepest descent step scalar\n", " B = [] # Hessian approximation\n", " delta = [] # trust region size\n", " rho = [] # trust region actual/prediction ratio\n", " step_accepted = [] # step status for trust region. True means accepted. False means rejected.\n", " \n", " # Note: alpha, delta and rho will remain empty lists unless used in the algorithm below\n", " \n", " # Store starting point\n", " x.append(x0)\n", " k = 0\n", " \n", " flag = True\n", " \n", " # Print header for iteration information\n", " print(\"Iter. \\tf(x) \\t\\t||grad(x)|| \\t||p|| \\t\\tmin(eig(B)) \\talpha \\t\\tdelta \\t\\tstep\")\n", " \n", " while flag and k < max_iter:\n", " # Evaluate f(x) at current iteration\n", " f.append(calc_f(x[k]))\n", " \n", " # Evaluate gradient\n", " grad.append(calc_grad(x[k]))\n", " \n", " if(check_nan(grad[k])):\n", " print(\"WARNING: gradiant calculation returned NaN\")\n", " break\n", " \n", " if verbose:\n", " print(\"\\n\")\n", " print(\"k = \",k)\n", " print(\"x = \",x[k])\n", " print(\"grad = \",grad[k])\n", " print(\"f = \",f[k])\n", "\n", " # Calculate exact Hessian or update approximation\n", " if(algorithm == \"newton\"):\n", " B.append(calc_hes(x[k]))\n", " \n", " elif k == 0 or algorithm == \"steepest-descent\":\n", " # Initialize or set to identity\n", " B.append(np.eye(len(x0)))\n", " \n", " elif algorithm == \"sr1\" or algorithm == \"bfgs\":\n", " # Change in x\n", " s = x[k] - x[k-1]\n", "\n", " # Change in gradient\n", " y = grad[k] - grad[k-1]\n", "\n", " if verbose:\n", " print(\"s = \",s)\n", " print(\"y = \",y)\n", " \n", " if algorithm == \"sr1\": # Calculate SR1 update\n", " dB = sr1_update(s, y, k, B, verbose)\n", " B.append(B[k-1] + dB)\n", " \n", " else: # Calculate BFGS update \n", " dB = bfgs_update(s, y, k, B, verbose)\n", " B.append(B[k-1] + dB) \n", "\n", " else:\n", " print(\"WARNING. algorithm = \",algorithm,\" is not supported.\")\n", " break\n", "\n", " if verbose:\n", " print(\"B = \\n\",B[k])\n", " print(\"B[k].shape = \",B[k].shape)\n", " \n", " if(check_nan(B[k])):\n", " print(\"WARNING: Hessian update returned NaN\")\n", " break\n", " \n", " c = np.linalg.cond(B[k])\n", " if c > 1E12:\n", " flag = False\n", " print(\"Warning: Hessian approximation is near singular.\")\n", " print(\"B[k] = \\n\",B[k])\n", " \n", " else:\n", " \n", " # Solve linear system to calculate step\n", " pk = linalg.solve(B[k],-grad[k])\n", " \n", " if globalization == \"none\":\n", " if algorithm == \"steepest-descent\":\n", " # Save step and scale by alpha_max\n", " p.append(pk*alpha_max)\n", " \n", " else:\n", " # Take full step\n", " p.append(pk)\n", " \n", " # Apply step and calculate x[k+1]\n", " x.append(x[k] + p[k])\n", " \n", " elif globalization == \"line-search\":\n", " \n", " # Line Search Function\n", " update, alphak = line_search(x, f, grad, calc_f, pk, k, alpha_max, eta_ls, rho_ls, verbose)\n", " \n", " # Now the line search is complete, apply final value of alphak\n", " p.append(update)\n", " \n", " # Save alpha\n", " alpha.append(alphak)\n", " \n", " # Apply step and calculate x[k+1]\n", " x.append(x[k] + p[k])\n", " \n", " elif globalization == \"trust-region\":\n", " \n", " # Trust region function\n", " update = trust_region(x, grad, B, delta, k, pk, delta_0_tr, verbose)\n", " p.append(update)\n", " \n", " ### Trust region management\n", "\n", " # Actual reduction\n", " ared = f[k] - calc_f(x[k] + p[k])\n", "\n", " # Predicted reduction\n", " pred = -(grad[k].dot(p[k]) + 0.5*p[k].dot(B[k].dot(p[k])))\n", "\n", " # Calculate rho\n", " if ared == 0 and pred == 0:\n", " # This occurs is the gradient is zero and Hessian is P.D.\n", " rho.append(1E4)\n", " else:\n", " rho.append(ared/pred)\n", "\n", " if(verbose):\n", " print(\"f[k] = \",f[k])\n", " print(\"p[k] = \",p[k])\n", " print(\"f(x[k] + p[k]) = \",calc_f(x[k] + p[k]))\n", " print(\"ared = \",ared)\n", " print(\"pred = \",pred)\n", " print(\"rho = \",rho[k])\n", "\n", " ## Check trust region shrink/expand logic\n", " if rho[k] < kappa_1_tr:\n", " # Shrink trust region\n", " delta.append(kappa_1_tr*linalg.norm(p[k]))\n", "\n", " elif rho[k] > kappa_2_tr and np.abs(linalg.norm(p[k]) - delta[k]) < 1E-6:\n", " # Expand trust region\n", " delta.append(np.min([2*delta[k], delta_max_tr]))\n", "\n", " else:\n", " # Keep trust region same size\n", " delta.append(delta[k])\n", "\n", " # Compute step\n", " if rho[k] > gamma_tr:\n", " # Take step\n", " x.append(x[k] + p[k])\n", " step_accepted.append(True)\n", " else:\n", " # Skip step\n", " x.append(x[k])\n", " step_accepted.append(False)\n", " else:\n", " print(\"Warning: globalization strategy not supported\")\n", "\n", " if verbose:\n", " print(\"p = \",p[k])\n", "\n", " # Calculate norms\n", " norm_p = linalg.norm(p[k])\n", " norm_g = linalg.norm(grad[k])\n", "\n", " # Calculate eigenvalues of Hessian (for display only)\n", " min_ev = np.min(np.real(linalg.eigvals(B[k])))\n", "\n", " '''\n", " print(\"f[k] = \",f[k])\n", " print(\"norm_g = \",norm_g)\n", " print(\"norm_g = \",norm_p)\n", " print(\"min_ev = \",min_ev)\n", " '''\n", " \n", " print(k,' \\t{0: 1.4e} \\t{1:1.4e} \\t{2:1.4e} \\t{3: 1.4e}'.format(f[k],norm_g,norm_p,min_ev),end='')\n", "\n", " # Python tip. The expression 'if my_list' is false if 'my_list' is empty\n", " if not alpha:\n", " # alpha is an empty list\n", " print(' \\t -------',end='')\n", " else:\n", " # otherwise print value of alpha\n", " print(' \\t{0: 1.2e}'.format(alpha[k]),end='')\n", " \n", " if not delta:\n", " # delta is an empty list\n", " print(' \\t -------',end='')\n", " else:\n", " # otherwise print value of alpha\n", " print(' \\t{0: 1.2e}'.format(delta[k]),end='')\n", " \n", " if not step_accepted:\n", " print(' \\t -----')\n", " else:\n", " if step_accepted[k]:\n", " print(' \\t accept')\n", " else:\n", " print(' \\t reject')\n", " \n", " # Check convergence criteria.\n", " flag = (norm_p > eps_dx) and (norm_g > eps_df)\n", "\n", " # Update iteration counter\n", " k = k + 1\n", " \n", " if(k == max_iter and flag):\n", " print(\"Reached maximum number of iterations.\")\n", " \n", " print(\"x* = \",x[-1])\n", " \n", " return x,f,p,B" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 3, "link": "[3.9.2.2 Main Function](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.2.2-Main-Function)", "section": "3.9.2.2 Main Function" } }, "outputs": [], "source": [ "def sr1_update(s, y, k, B, verbose):\n", " \"\"\" \n", " Function that implements the sr1 optimization algorithm\n", " \n", " Inputs:\n", " s : Change in x\n", " y : Change in gradient\n", " k : Iteration number\n", " B : Hessian approximation\n", " verbose : toggles verbose output (True or False)\n", " \n", " Outputs:\n", " dB : Update to Hessian approximation\n", " \"\"\"\n", " \n", " # SR1 formulation\n", " u = y - B[k-1].dot(s)\n", " denom = (u).dot(s)\n", "\n", " # Formula: dB = u * u.T / (u.T * s) if u is a column vector.\n", " if abs(denom) <= 1E-10:\n", " # Skip update\n", " dB = 0\n", " else:\n", " # Calculate update\n", " dB = xxT(u)/denom\n", "\n", " if(verbose):\n", " print(\"SR1 update denominator, (y-B[k-1]*s).T*s = \",denom)\n", " print(\"SR1 update u = \",u)\n", " print(\"SR1 update u.T*u/(u.T*s) = \\n\",dB)\n", " \n", " return dB #return the update to the Hessian approximation" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 3, "link": "[3.9.2.2 Main Function](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.2.2-Main-Function)", "section": "3.9.2.2 Main Function" } }, "outputs": [], "source": [ "def bfgs_update(s, y, k, B, verbose):\n", " \"\"\" \n", " Function that implements the BFGS optimization algorithm\n", " \n", " Inputs:\n", " s : Change in x\n", " y : Change in gradient\n", " k : Iteration number\n", " B : Hessian approximation\n", " verbose : toggles verbose output (True or False)\n", " \n", " Outputs:\n", " dB : Update to Hessian approximation\n", " \"\"\"\n", " \n", " # Define constant used to check norm of the update\n", " # See Eq. (3.19) in Biegler (2010)\n", " C2 = 1E-4\n", " \n", " # YOUR SOLUTION HERE\n", " \n", " return dB #return the update to the Hessian approximation" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 3, "link": "[3.9.2.2 Main Function](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.2.2-Main-Function)", "section": "3.9.2.2 Main Function" } }, "outputs": [], "source": [ "def line_search(x, f, grad, calc_f, pk, k, alpha_max, eta_ls, rho_ls, verbose):\n", " \"\"\"\n", " Function that implements the line search globalization strategy\n", " \n", " Inputs:\n", " x : decision variables\n", " f : objective values\n", " grad : gradients\n", " calc_f : function f(x) to minimize [scalar]\n", " pk : step\n", " k - Iteration number\n", " alpha_max : initial step length scaling for line search and/or steepest-descent\n", " eta_ls : parameter for Goldstein-Armijo conditions (line search only)\n", " rho_ls : parameter to shrink (backstep) line search\n", " verbose : toggles verbose output (True or False)\n", " \n", " Outputs:\n", " update : update to p\n", " alphak : update to alpha\n", " \"\"\"\n", " \n", " # Flag - continue line search?\n", " ls = True\n", "\n", " ## Initialize alpha\n", " alphak = alpha_max\n", "\n", " if(verbose):\n", " print(\"\\t LINE SEARCH\")\n", "\n", " while ls:\n", "\n", " # Calculate test point (if step accepted)\n", " xtest = x[k] + alphak*pk\n", "\n", " # Evaluate f(x) at test point. This is used for Armijo and Goldstein conditions\n", " ftest = calc_f(xtest)\n", " \n", " # YOUR SOLUTION HERE\n", "\n", " # Now the line search is complete, apply final value of alphak\n", " update = alphak*pk\n", " \n", " return update, alphak\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 3, "link": "[3.9.2.2 Main Function](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.2.2-Main-Function)", "section": "3.9.2.2 Main Function" } }, "outputs": [], "source": [ "def trust_region(x, grad, B, delta, k, pk, delta_0_tr, verbose):\n", " \"\"\" \n", " Function that implements the trust region globalization strategy\n", " \n", " Inputs:\n", " x : decision variables\n", " grad : gradients\n", " B : Hessian approximation\n", " delta : trust region size\n", " k : Iteration number\n", " pk : step\n", " delta_0_tr : initial trust region size\n", " verbose : toggles verbose output (True or False)\n", " \n", " Outputs:\n", " update : update to p\n", " \"\"\"\n", " \n", " ### Initialize trust region radius\n", " if(k == 0):\n", " delta.append(delta_0_tr)\n", "\n", " grad_zero = (linalg.norm(grad[k]) < 1E-14)\n", "\n", " ### Powell dogleg step\n", "\n", " # Calculate Cauchy step (pC)\n", " denom = grad[k].dot(B[k].dot(grad[k]))\n", " if verbose:\n", " print(\"TR: Cauchy step. grad.T*B*grad = \",denom)\n", "\n", " if denom > 0:\n", " # Term in ( ) is a scalar\n", " pC = -(grad[k].dot(grad[k])/denom)*grad[k]\n", " elif grad_zero:\n", " pC = np.zeros(len(x[k]))\n", " else:\n", " pC = - delta[k]/linalg.norm(grad[k])*grad[k]\n", "\n", " # Use Newton step (calculate above)\n", " pN = pk\n", "\n", " # Determine step\n", " if linalg.norm(pN) <= delta[k]:\n", " # Take Newton step. pN is inside the trust region.\n", " update = pN\n", " \n", " # YOUR SOLUTION HERE\n", " \n", " return update" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 3, "link": "[3.9.2.3 Feature Status](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.2.3-Feature-Status)", "section": "3.9.2.3 Feature Status" } }, "source": [ "### 3.9.2.3 Feature Status\n", "\n", "For each feature, please indicate the status upon submission.\n", "\n", "#### SR1\n", "\n", "*Status*: please choose from \"implemented and tested\", \"implemented but testing incomplete\", \"implementation incomplete\", \"did not attempt implementation\"\n", "\n", "*Details:* Please describe in a few sentences or bullet points the outstanding tasks.\n", "\n", "#### BFGS\n", "\n", "*Status*: please choose from \"implemented and tested\", \"implemented but testing incomplete\", \"implementation incomplete\", \"did not attempt implementation\"\n", "\n", "*Details:* Please describe in a few sentences or bullet points the outstanding tasks.\n", "\n", "#### Line Search\n", "\n", "*Status*: please choose from \"implemented and tested\", \"implemented but testing incomplete\", \"implementation incomplete\", \"did not attempt implementation\"\n", "\n", "*Details:* Please describe in a few sentences or bullet points the outstanding tasks.\n", "\n", "#### Trust Region\n", "\n", "*Status*: please choose from \"implemented and tested\", \"implemented but testing incomplete\", \"implementation incomplete\", \"did not attempt implementation\"\n", "\n", "*Details:* Please describe in a few sentences or bullet points the outstanding tasks." ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 2, "link": "[3.9.3 Benchmark Tests](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3-Benchmark-Tests)", "section": "3.9.3 Benchmark Tests" } }, "source": [ "## 3.9.3 Benchmark Tests\n", "\n", "In the remainder of the assignment, we will compare using several variants of Newton-type methods in different test problems taken from in class examples. For each problem at a given starting point, we will consider ten algorithm cases:\n", "1. Newton method with **exact Hessian** and no globalization strategy\n", "2. Newton method with exact Hessian and line search\n", "3. Newton method with exact Hessian and Powell dogleg trust region\n", "4. Quasi-Newton method with **SR1 Hessian approximation** and no globalization strategy\n", "5. Quasi-Newton method with SR1 Hessian approximation and line search\n", "6. Quasi-Newton method with SR1 Hessian approximation and Powell dogleg trust region\n", "7. Quasi-Newton method with **BFGS Hessian approximation** and no globalization strategy\n", "8. Quasi-Newton method with BFGS Hessian approximation and line search\n", "9. Quasi-Newton method with BFGS Hessian approximation and Powell dogleg trust region\n", "10. **Steepest descent** with line search\n", "\n", "We will then assess each candidate solution by the eigenvalues of the true Hessian. The norm of the gradient is already displayed.\n", "\n", "All of this analysis has been automated in a function below." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 2, "link": "[3.9.3 Benchmark Tests](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3-Benchmark-Tests)", "section": "3.9.3 Benchmark Tests" } }, "outputs": [], "source": [ "def check_sln(x,calc_hes):\n", " Hval = calc_hes(x)\n", " l, v = linalg.eig(Hval)\n", " print(\"Eigenvalues of Hessian at x* = \",l)\n", "\n", "def benchmark(x0,calc_f,calc_grad,calc_hes,verbose,cases=range(0,10)):\n", " '''\n", " Test 10 algorithm cases for a single starting point\n", " \n", " Arguments:\n", " x0 - starting point\n", " calc_f - function to evaluate objective\n", " calc_grad - function to evaluate gradient\n", " calc_hes - function to evaluate Hessian\n", " verbose - toggles verbose output (True or False)\n", " cases - list of cases to consider. Especially helpful for debugging.\n", " For example, setting cases=[1, 2] runs only case 2 and 3 (recall, Python starting counting at 0)\n", " '''\n", " \n", " for i in cases:\n", " if i == 0:\n", " print(\"***Case 1: Newton method with exact Hessian and no globalization strategy***\")\n", " x1,f,p,B = unconstrained_newton(calc_f,calc_grad,calc_hes,x0,verbose,\n", " algorithm=\"newton\",globalization=\"none\")\n", " check_sln(x1[-1],calc_hes)\n", " print(\"\\n\")\n", " \n", " if i == 1:\n", " print(\"***Case 2: Newton method with exact Hessian and line search***\")\n", " x2,f,p,B = unconstrained_newton(calc_f,calc_grad,calc_hes,x0,verbose,\n", " algorithm=\"newton\",globalization=\"line-search\")\n", " check_sln(x2[-1],calc_hes)\n", " print(\"\\n\")\n", " \n", " if i == 2:\n", " print(\"***Case 3: Newton method with exact Hessian and trust region***\")\n", " x3,f,p,B = unconstrained_newton(calc_f,calc_grad,calc_hes,x0,verbose,\n", " algorithm=\"newton\",globalization=\"trust-region\")\n", " check_sln(x3[-1],calc_hes)\n", " print(\"\\n\")\n", " \n", " if i == 3:\n", " print(\"***Case 4: Quasi-Newton method with SR1 Hessian approximation and no globalization strategy***\")\n", " x4,f,p,B = unconstrained_newton(calc_f,calc_grad,calc_hes,x0,verbose,\n", " algorithm=\"sr1\",globalization=\"none\")\n", " check_sln(x4[-1],calc_hes)\n", " print(\"\\n\")\n", " \n", " if i == 4:\n", " print(\"***Case 5: Quasi-Newton method with SR1 Hessian approximation and line search***\")\n", " x5,f,p,B = unconstrained_newton(calc_f,calc_grad,calc_hes,x0,verbose,\n", " algorithm=\"sr1\",globalization=\"line-search\")\n", " check_sln(x5[-1],calc_hes)\n", " print(\"\\n\")\n", " \n", " if i == 5:\n", " print(\"***Case 6: Quasi-Newton method with SR1 Hessian approximation and trust region***\")\n", " x6,f,p,B = unconstrained_newton(calc_f,calc_grad,calc_hes,x0,verbose,\n", " algorithm=\"sr1\",globalization=\"trust-region\")\n", " check_sln(x6[-1],calc_hes)\n", " print(\"\\n\")\n", "\n", " if i == 6:\n", " print(\"***Case 7: Quasi-Newton method with BFGS Hessian approximation and no globalization strategy***\")\n", " x7,f,p,B = unconstrained_newton(calc_f,calc_grad,calc_hes,x0,verbose,\n", " algorithm=\"bfgs\",globalization=\"none\")\n", " check_sln(x7[-1],calc_hes)\n", " print(\"\\n\")\n", " \n", " if i == 7:\n", " print(\"***Case 8: Quasi-Newton method with BFGS Hessian approximation and line search***\")\n", " x8,f,p,B = unconstrained_newton(calc_f,calc_grad,calc_hes,x0,verbose,\n", " algorithm=\"bfgs\",globalization=\"line-search\")\n", " check_sln(x8[-1],calc_hes)\n", " print(\"\\n\")\n", "\n", " if i == 8:\n", " print(\"***Case 9: Quasi-Newton method with BFGS Hessian approximation and trust region***\")\n", " x9,f,p,B = unconstrained_newton(calc_f,calc_grad,calc_hes,x0,verbose,\n", " algorithm=\"bfgs\",globalization=\"trust-region\")\n", " check_sln(x9[-1],calc_hes)\n", " print(\"\\n\")\n", "\n", " if i == 9:\n", " print(\"***Case 10: Steepest Descent and line search***\")\n", " x10,f,p,B = unconstrained_newton(calc_f,calc_grad,calc_hes,x0,verbose,\n", " algorithm=\"steepest-descent\",globalization=\"line-search\")\n", " check_sln(x10[-1],calc_hes)\n", " print(\"\\n\")" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 3, "link": "[3.9.3.1 Quadratic Test Problem](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.1-Quadratic-Test-Problem)", "section": "3.9.3.1 Quadratic Test Problem" } }, "source": [ "### 3.9.3.1 Quadratic Test Problem\n", "\n", "Start debugging your function with the following problem:\n", "\n", "$$\\min_x ~~ x_1^2 + (x_2 -1)^2$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 3, "link": "[3.9.3.1 Quadratic Test Problem](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.1-Quadratic-Test-Problem)", "section": "3.9.3.1 Quadratic Test Problem" } }, "outputs": [], "source": [ "def my_f(x):\n", " return x[0]**2 + (x[1]-1)**2\n", "\n", "def my_grad(x):\n", " return np.array([2*x[0], 2*(x[1] - 1) ])\n", "\n", "def my_hes(x):\n", " return 2*np.eye(2)" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.1.1 Benchmark Cases](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.1.1-Benchmark-Cases)", "section": "3.9.3.1.1 Benchmark Cases" } }, "source": [ "#### 3.9.3.1.1 Benchmark Cases\n", "\n", "The following code below runs cases 1 through 10. You can specify only a subset using the ``cases`` keyword." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.1.1 Benchmark Cases](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.1.1-Benchmark-Cases)", "section": "3.9.3.1.1 Benchmark Cases" } }, "outputs": [], "source": [ "x0 = np.array([-3.0,2.0])\n", "benchmark(x0,my_f,my_grad,my_hes,False)" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.1.2 Discussion and Analysis](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.1.2-Discussion-and-Analysis)", "section": "3.9.3.1.2 Discussion and Analysis" } }, "source": [ "#### 3.9.3.1.2 Discussion and Analysis\n", "\n", "Classify the solutions.\n", "\n", "**Answer:**\n", "\n", "Should we expect the SR1, BFGS, and steepest descent (all with line search) to always have the same results for the first iteration? Explain in a few sentences.\n", "\n", "**Answer:**\n", "\n", "For this problem, should we always expect all of the cases to converge to the same solution (regardless of starting point)? Explain in a few sentences.\n", "\n", "**Answer:**" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 3, "link": "[3.9.3.2 One-Dimensional Example](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.2-One-Dimensional-Example)", "section": "3.9.3.2 One-Dimensional Example" } }, "source": [ "### 3.9.3.2 One-Dimensional Example\n", "\n", "Consider a scalar function $f(x): \\mathbb{R} \\rightarrow \\mathbb{R}$. Recall\n", "\n", "$$f(x) = 0.5 (x-1)^4 + (x+1)^3 - 10 x^2 + 5 x$$\n", "\n", "$$f'(x) = 6 - 8 x - 3 x^2 + 2 x^3$$\n", "\n", "$$f''(x) = -8 - 6 x + 6 x^2 $$\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 3, "link": "[3.9.3.2 One-Dimensional Example](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.2-One-Dimensional-Example)", "section": "3.9.3.2 One-Dimensional Example" } }, "outputs": [], "source": [ "## Define f(x)\n", "f_ = lambda x : 0.5*(x[0]-1)**4 + (x[0]+1)**3 - 10*x[0]**2 + 5*x[0]\n", "\n", "## Define f'(x)\n", "df_ = lambda x : (6 - 8*x[0] - 3*x[0]**2 + 2*x[0]**3)*np.ones(1)\n", "\n", "## Define f''(x)\n", "ddf_ = lambda x : (-8 - 6*x[0] + 6*x[0]**2)*np.ones((1,1))" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.2.1 Benchmark Cases with $x_0 = -3$](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.2.1-Benchmark-Cases-with-$x_0-=--3$)", "section": "3.9.3.2.1 Benchmark Cases with $x_0 = -3$" } }, "source": [ "#### 3.9.3.2.1 Benchmark Cases with $x_0 = -3$" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.2.1 Benchmark Cases with $x_0 = -3$](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.2.1-Benchmark-Cases-with-$x_0-=--3$)", "section": "3.9.3.2.1 Benchmark Cases with $x_0 = -3$" } }, "outputs": [], "source": [ "x0 = np.ones(1)*(-3.0)\n", "benchmark(x0,f_,df_,ddf_,False)" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.2.2 Discussion](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.2.2-Discussion)", "section": "3.9.3.2.2 Discussion" } }, "source": [ "#### 3.9.3.2.2 Discussion\n", "\n", "Did all of the cases converge to the nearby solution?\n", "\n", "**Answer**" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 3, "link": "[3.9.3.3 Benchmark Case with $x_0 = 0$](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.3-Benchmark-Case-with-$x_0-=-0$)", "section": "3.9.3.3 Benchmark Case with $x_0 = 0$" } }, "source": [ "### 3.9.3.3 Benchmark Case with $x_0 = 0$" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 3, "link": "[3.9.3.3 Benchmark Case with $x_0 = 0$](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.3-Benchmark-Case-with-$x_0-=-0$)", "section": "3.9.3.3 Benchmark Case with $x_0 = 0$" } }, "outputs": [], "source": [ "x0 = np.zeros(1)\n", "benchmark(x0,f_,df_,ddf_,False)" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.3.1 Discussion](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.3.1-Discussion)", "section": "3.9.3.3.1 Discussion" } }, "source": [ "#### 3.9.3.3.1 Discussion\n", "\n", "Which cases converged to a **local maximizer**?\n", "\n", "**Answer:**\n", "\n", "Which cases converged to a **local minimizer**?\n", "\n", "**Answer:**\n", "\n", "Based on these results, which algorithms do you recommend for non-convex problems?\n", "\n", "**Answer:**" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 3, "link": "[3.9.3.4 Return of Example 2.19](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.4-Return-of-Example-2.19)", "section": "3.9.3.4 Return of Example 2.19" } }, "source": [ "### 3.9.3.4 Return of Example 2.19\n", "\n", "Now we will benchmark the algorithms using **Example 2.19**.\n", "\n", "![ex2-19](figures/ex2-19.png)\n", "\n", "Consider the function $g(z) = \\sqrt{z}$, which is only defined for $z \\geq 0$. We will shortly learn how handle bounds/inequality constraints in an optimization problem.\n", "\n", "But for this assignment, we will focus on algorithms for *unconstrained* nonlinear optimization. We will use a smoothed approximation to ensure $g(z)$ is defined and twice-differentiable over $z \\in \\mathbb{R}$. Consider:\n", "\n", "$$g(z) \\approx \\sqrt{\\max(z,0)}$$\n", "\n", "This ensures that $g(z)$ is defined over $z \\in \\mathbb{R}$. But what about twice-differentiable? We will further approximate\n", "\n", "$$\\max(z,0) \\approx \\frac{1}{2}\\left(\\sqrt{z^2 + 10^{-4}} + z \\right)$$\n", "\n", "Below is an implementation of the Example 2.19 that uses this smoothed approximation." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 3, "link": "[3.9.3.4 Return of Example 2.19](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.4-Return-of-Example-2.19)", "section": "3.9.3.4 Return of Example 2.19" } }, "outputs": [], "source": [ "def asqrt(z):\n", " '''\n", " Approximate/smoothed square root\n", " '''\n", "\n", " return np.sqrt(0.5*(np.sqrt(z**2 + 1E-4) + z))\n", " \n", "def ex2_19_smoothed(x,verbose=False):\n", " ''' Evaluate function given above at point x\n", "\n", " Inputs:\n", " x - vector with 2 elements\n", " \n", " Outputs:\n", " f - function value (scalar)\n", " '''\n", " # Constants\n", " a = np.array([0.3, 0.6, 0.2])\n", " b = np.array([5, 26, 3])\n", " c = np.array([40, 1, 10])\n", " \n", " # Intermediates. Recall Python indicies start at 0\n", " u = x[0] - 0.8\n", " s = asqrt(1-u)\n", " s2 = asqrt(1+u)\n", " v = x[1] -(a[0] + a[1]*u**2*s-a[2]*u)\n", " alpha = -b[0] + b[1]*u**2*s2+ b[2]*u\n", " beta = c[0]*v**2*(1-c[1]*v)/(1+c[2]*u**2)\n", " \n", " f = alpha*np.exp(-beta)\n", " \n", " if verbose:\n", " print(\"##### my_f at x = \",x, \"#####\")\n", " print(\"u = \",u)\n", " print(\"sqrt(1-u) = \",s)\n", " print(\"sqrt(1+u) = \",s2)\n", " print(\"v = \",v)\n", " print(\"alpha = \",alpha)\n", " print(\"beta = \",beta)\n", " print(\"f(x) = \",f)\n", " print(\"##### Done. #####\\n\")\n", " \n", " return f\n", "\n", "my_f2 = lambda x : ex2_19_smoothed(x,verbose=False)\n", "\n", "my_grad2 = lambda x : my_grad_approx(x,my_f2,1E-6,verbose=False)\n", "\n", "my_hes2 = lambda x : my_hes_approx(x,my_grad2,1E-6)" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.4.1 Benchmark with $x_0$ Near Solution](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.4.1-Benchmark-with-$x_0$-Near-Solution)", "section": "3.9.3.4.1 Benchmark with $x_0$ Near Solution" } }, "source": [ "#### 3.9.3.4.1 Benchmark with $x_0$ Near Solution" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.4.1 Benchmark with $x_0$ Near Solution](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.4.1-Benchmark-with-$x_0$-Near-Solution)", "section": "3.9.3.4.1 Benchmark with $x_0$ Near Solution" } }, "outputs": [], "source": [ "x0 = np.array([0.7, 0.3])\n", "benchmark(x0,my_f2,my_grad2,my_hes2,False)" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.4.2 Discussion](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.4.2-Discussion)", "section": "3.9.3.4.2 Discussion" } }, "source": [ "#### 3.9.3.4.2 Discussion\n", "\n", "Which cases do not converge to the known solution?\n", "\n", "**Answer:**\n", "\n", "What is happening with the SR1 and BFGS update when no globalization strategy is used? *Hint:* This is related to the square root approximation.\n", "\n", "**Answer:**" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.4.3 Benchmark with $x_0$ Far From the Solution](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.4.3-Benchmark-with-$x_0$-Far-From-the-Solution)", "section": "3.9.3.4.3 Benchmark with $x_0$ Far From the Solution" } }, "source": [ "#### 3.9.3.4.3 Benchmark with $x_0$ Far From the Solution" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.4.3 Benchmark with $x_0$ Far From the Solution](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.4.3-Benchmark-with-$x_0$-Far-From-the-Solution)", "section": "3.9.3.4.3 Benchmark with $x_0$ Far From the Solution" } }, "outputs": [], "source": [ "x0 = np.array([0.0, 0.0])\n", "benchmark(x0,my_f2,my_grad2,my_hes2,False)" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.4.4 Discussion](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.4.4-Discussion)", "section": "3.9.3.4.4 Discussion" } }, "source": [ "#### 3.9.3.4.4 Discussion" ] }, { "cell_type": "markdown", "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.4.4 Discussion](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.4.4-Discussion)", "section": "3.9.3.4.4 Discussion" } }, "source": [ "Did any of the 10 cases converge to the known solution?\n", "\n", "What do you recommend trying next to more reliably solve this particular problem?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbpages": { "level": 4, "link": "[3.9.3.4.4 Discussion](https://ndcbe.github.io/CBE60499/03.09-Algorithms3.html#3.9.3.4.4-Discussion)", "section": "3.9.3.4.4 Discussion" } }, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "8dd51e03", "metadata": {}, "source": [ "\n", "< [3.8 Algorithms Homework 2](https://ndcbe.github.io/CBE60499/03.08-Algorithms2.html) | [Contents](toc.html) | [Tag Index](tag_index.html) | [4.0 Constrained Nonlinear Optimization: Theory and Applications](https://ndcbe.github.io/CBE60499/04.00-Constrained.html) >

\"Open

\"Download\"" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }