7.1. Convexity Revisited#
# This code cell installs packages on Colab
import sys

if "google.colab" in sys.modules:
    !wget "https://raw.githubusercontent.com/ndcbe/optimization/main/notebooks/helper.py"
    import helper
    helper.easy_install()
else:
    sys.path.insert(0, '../')
    import helper
helper.set_plotting_style()

import pandas as pd
import pyomo.environ as pyo
7.1.1. Background#
Reference: Beginning of Chapter 4 in Biegler (2010)
7.1.1.1. Canonical Nonlinear Program (NLP)#
Assumption: functions \(f(\mathbf{x}) : \mathbb{R}^n \to \mathbb{R}\), \(\mathbf{h}(\mathbf{x}) : \mathbb{R}^n \to \mathbb{R}^m\), and \(\mathbf{g}(\mathbf{x}) : \mathbb{R}^n \to \mathbb{R}^r\) have continuous first and second derivatives.
The canonical NLP is

\[ \min_{x \in \mathbb{R}^n} \ f(x) \quad \text{s.t.} \quad g(x) \leq 0, \quad h(x) = 0. \]

Denote the feasible region as:

\[ \mathcal{F} = \{x \in \mathbb{R}^n : g(x) \leq 0, \ h(x) = 0\}. \]
7.1.1.2. Types of Constrained Optimal Solutions#
Definition 4.1: Constrained Optimal Solutions
A point \(x^*\) is a global minimizer if \(f(x^*) \leq f(x)\) for all \(x \in \mathcal{F}\).
A point \(x^*\) is a local minimizer if \(f(x^*) \leq f(x)\) for all \(x \in \mathcal{N}(x^*) \cap \mathcal{F}\), where we define

\[ \mathcal{N}(x^*) = \{x : \|x - x^*\| < \epsilon\}, \quad \epsilon > 0. \]
A point \(x^*\) is a strict local minimizer if \(f(x^*) < f(x)\) for all \(x \in \mathcal{N}(x^*) \cap \mathcal{F}\).
A point \(x^*\) is an isolated local minimizer if there are no other local minimizers in \(\mathcal{N}(x^*) \cap \mathcal{F}\).
7.1.1.3. Key Questions#
As with unconstrained optimization, the following questions need to be considered:
If a solution \(x^*\) exists, is it a global solution in \(\mathcal{F}\) or is it only a local solution?
What conditions characterize the optimal solutions?
Are there special problem classes of the NLP whose solutions have stronger properties and are easier to solve?
Are there efficient and reliable methods to solve the NLP?
7.1.2. Convexity for Constrained Optimization#
Reference: Section 4.1 in Biegler (2010)
7.1.2.1. Illustrative Examples#
Main idea: are the objective function and feasible region both convex?
\(f(x)\) is convex on the domain \(x \in X\) if and only if
\[
\alpha f(x^a) + (1 - \alpha)f(x^b) \geq f(\alpha x^a + (1 - \alpha)x^b) \quad \forall x^a, x^b \in X, \, \alpha \in (0, 1).
\]
Strict convexity requires the inequality to be strict.
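The defining inequality is easy to check numerically at sampled points. Below is a minimal sketch (assuming NumPy is available; the function `convexity_violated` is a hypothetical helper, not part of the course code) that contrasts the convex \(f(x) = x^2\) with the nonconvex \(f(x) = \sin(x)\):

```python
import numpy as np

def convexity_violated(f, lo, hi, n_trials=1000, seed=0):
    """Return True if some sampled pair violates the convexity inequality."""
    rng = np.random.default_rng(seed)
    for _ in range(n_trials):
        xa, xb = rng.uniform(lo, hi, size=2)
        alpha = rng.uniform(0.0, 1.0)
        # Convexity requires: alpha*f(xa) + (1-alpha)*f(xb) >= f(alpha*xa + (1-alpha)*xb)
        if alpha * f(xa) + (1 - alpha) * f(xb) < f(alpha * xa + (1 - alpha) * xb) - 1e-9:
            return True
    return False

print(convexity_violated(lambda x: x**2, -5, 5))  # x^2 is convex: no violation found
print(convexity_violated(np.sin, -5, 5))          # sin(x) is nonconvex on [-5, 5]
```

Of course, sampling can only disprove convexity, never prove it; a proof requires checking the definition (or second-order conditions) analytically.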
The region \(\mathcal{Y}\) is convex if and only if
\[
\alpha x^a + (1 - \alpha)x^b \in \mathcal{Y} \quad \forall x^a, x^b \in \mathcal{Y}, \, \alpha \in [0, 1].
\]
7.1.2.2. Theorem 4.2: Convexity#
Theorem 4.2 If \(g(x)\) is convex and \(h(x)\) is linear, then the region

\[ \mathcal{F} = \{x : g(x) \leq 0, \ h(x) = 0\} \]

is convex, i.e.,

\[ \alpha x^a + (1 - \alpha)x^b \in \mathcal{F} \quad \forall x^a, x^b \in \mathcal{F}, \, \alpha \in (0, 1). \]
Proof
Consider two points \(x^a, x^b \in \mathcal{F}\) and
\[ \bar{x} = \alpha x^a + (1 - \alpha)x^b \quad \text{for some } \alpha \in (0, 1). \]

Suppose, for contradiction, that \(\bar{x} \not\in \mathcal{F}\). Then (i) \(g(\bar{x}) > 0\) or (ii) \(h(\bar{x}) \neq 0\), or both.

Case (i): Recall \(g(x)\) is convex, and \(g(x^a) \leq 0\) and \(g(x^b) \leq 0\) (both \(x^a\) and \(x^b\) are feasible). By the definition of convexity,

\[ 0 \geq \alpha g(x^a) + (1 - \alpha)g(x^b) \geq g(\alpha x^a + (1 - \alpha)x^b) = g(\bar{x}). \]

Thus \(g(\bar{x}) \leq 0\), contradicting (i).

Case (ii): Recall \(h(x)\) is linear, and \(h(x^a) = 0\) and \(h(x^b) = 0\).

\[\begin{align*} 0 & = \alpha h(x^a) + (1 - \alpha)h(x^b) \quad \text{(property of linear functions)} \\ & = h(\alpha x^a + (1 - \alpha)x^b) = h(\bar{x}) \quad \text{(definition of } \bar{x} \text{)}. \end{align*}\]

Thus \(h(\bar{x}) = 0\), contradicting (ii).

Either case leads to a contradiction, so \(\bar{x} \in \mathcal{F}\) and \(\mathcal{F}\) is convex.
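Theorem 4.2 can also be illustrated numerically: sample feasible points for a convex \(g\) and a linear \(h\), and verify that their convex combinations stay feasible. A small sketch (assuming NumPy; the constraints \(g(x) = x_1^2 + x_2^2 - 1 \leq 0\) and \(h(x) = x_1 - x_2 = 0\) are hypothetical examples):

```python
import numpy as np

g = lambda x: x[0]**2 + x[1]**2 - 1.0  # convex inequality constraint
h = lambda x: x[0] - x[1]              # linear equality constraint

def feasible(x, tol=1e-9):
    return g(x) <= tol and abs(h(x)) <= tol

rng = np.random.default_rng(42)
for _ in range(1000):
    # Sample two feasible points on the line x1 = x2 inside the unit disk
    ta, tb = rng.uniform(-1/np.sqrt(2), 1/np.sqrt(2), size=2)
    xa, xb = np.array([ta, ta]), np.array([tb, tb])
    assert feasible(xa) and feasible(xb)
    # Per Theorem 4.2, any convex combination must remain feasible
    alpha = rng.uniform(0.0, 1.0)
    assert feasible(alpha * xa + (1 - alpha) * xb)
print("All convex combinations remained feasible.")
```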
7.1.2.3. Theorem 4.3: Global Minimizers#
Theorem 4.3 If \(f(x)\) is convex and \(\mathcal{F}\) is convex, then every local minimum in \(\mathcal{F}\) is a global minimum. If \(f(x)\) is strictly convex in \(\mathcal{F}\), then a local minimum is the unique global minimum.
Proof: Convexity Claim
Assumption: There are two local minima \(x^a, x^b \in \mathcal{F}\) with \(f(x^a) > f(x^b)\). Seek contradiction.
Definition of Local Minimum:
\(f(x^a) \leq f(x), \, x \in \mathcal{N}(x^a) \cap \mathcal{F}\)
\(f(x^b) \leq f(x), \, x \in \mathcal{N}(x^b) \cap \mathcal{F}\)
By Convexity:
\((1 - \alpha)x^a + \alpha x^b \in \mathcal{F}\)
\(f((1 - \alpha)x^a + \alpha x^b) \leq (1 - \alpha)f(x^a) + \alpha f(x^b), \, \forall \alpha \in (0, 1)\)
Choose \(\alpha\) such that:
\[ \bar{x} = (1 - \alpha)x^a + \alpha x^b \in \mathcal{N}(x^a) \cap \mathcal{F}. \]

Thus,

\[ f(\bar{x}) \leq f(x^a) + \alpha (f(x^b) - f(x^a)). \]

Recall \(f(x^b) < f(x^a)\), so

\[ f(\bar{x}) < f(x^a). \]

This is a contradiction, as \(x^a\) cannot be a local minimizer.
Proof: Strict Convexity Claim. Same idea as above: assume \(f(x^a) \geq f(x^b)\) in the initial assumption, then use the strict inequality (from the definition of strict convexity) in the convexity and contradiction steps.
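Theorem 4.3 is what makes convex problems so attractive computationally: any convergent local search reaches the global minimum, regardless of where it starts. A minimal sketch (assuming NumPy; the quadratic objective and box constraint are hypothetical examples, not from Biegler) using projected gradient descent on the convex \(f(x) = (x_1 - 1)^2 + (x_2 - 2)^2\) over the convex box \(0 \leq x \leq 1.5\):

```python
import numpy as np

target = np.array([1.0, 2.0])
f = lambda x: np.sum((x - target)**2)   # convex objective
grad = lambda x: 2 * (x - target)

def projected_gradient_descent(x0, eta=0.1, iters=500):
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        # Gradient step, then project back onto the convex box 0 <= x <= 1.5
        x = np.clip(x - eta * grad(x), 0.0, 1.5)
    return x

# Different starting points all reach the same (global) minimizer (1, 1.5)
for x0 in [(0.0, 0.0), (1.5, 0.0), (1.2, 1.4)]:
    print(x0, "->", projected_gradient_descent(x0).round(6))
```

Contrast this with the nonconvex circle packing example below, where the solution found depends strongly on the initial point.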
7.1.2.4. More Illustrative Examples#
7.1.3. Circle Packing Example#
Reference: Section 4.1 in Biegler (2010)
Motivating Question: Is this problem convex?
What is the smallest rectangle you can use to enclose three given circles? Reference: Example 4.4 in Biegler (2010).
7.1.3.1. Optimization Model and Pyomo Implementation#
The following optimization model is given in Biegler (2010) and adapted to use set notation.
import random
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
def create_circle_model(circle_radii):
    ''' Create circle optimization model in Pyomo

    Arguments:
        circle_radii: dictionary with keys=circle name and values=radius (float)

    Returns:
        model: Pyomo model
    '''

    # Number of circles to consider
    n = len(circle_radii)

    # Create a concrete Pyomo model
    model = pyo.ConcreteModel()

    # Initialize index for circles
    model.CIRCLES = pyo.Set(initialize=circle_radii.keys())

    # Create parameter for the circle radii
    model.R = pyo.Param(model.CIRCLES, domain=pyo.PositiveReals, initialize=circle_radii)

    # Create variables for the box dimensions
    model.a = pyo.Var(domain=pyo.PositiveReals)
    model.b = pyo.Var(domain=pyo.PositiveReals)

    # Set objective: minimize the perimeter of the box
    model.obj = pyo.Objective(expr=2*(model.a + model.b), sense=pyo.minimize)

    # Create variables for circle centers
    model.x = pyo.Var(model.CIRCLES, domain=pyo.PositiveReals)
    model.y = pyo.Var(model.CIRCLES, domain=pyo.PositiveReals)

    # "In the box" constraints
    def left_x(m, c):
        return m.x[c] >= m.R[c]
    model.left_x_con = pyo.Constraint(model.CIRCLES, rule=left_x)

    def left_y(m, c):
        return m.y[c] >= m.R[c]
    model.left_y_con = pyo.Constraint(model.CIRCLES, rule=left_y)

    def right_x(m, c):
        return m.x[c] <= m.b - m.R[c]
    model.right_x_con = pyo.Constraint(model.CIRCLES, rule=right_x)

    def right_y(m, c):
        return m.y[c] <= m.a - m.R[c]
    model.right_y_con = pyo.Constraint(model.CIRCLES, rule=right_y)

    # No overlap constraints (only for each unordered pair of circles)
    def no_overlap(m, c1, c2):
        if c1 < c2:
            return (m.x[c1] - m.x[c2])**2 + (m.y[c1] - m.y[c2])**2 >= (m.R[c1] + m.R[c2])**2
        else:
            return pyo.Constraint.Skip
    model.no_overlap_con = pyo.Constraint(model.CIRCLES, model.CIRCLES, rule=no_overlap)

    return model
def initialize_circle_model(model, a_init=25, b_init=25):
    ''' Initialize the x and y coordinates using a uniform distribution

    Arguments:
        model: Pyomo model created by create_circle_model
        a_init: initial value for a (default=25)
        b_init: initial value for b (default=25)

    Returns:
        Nothing. But per Python scoping rules, the input argument `model`
        can be modified in this function.
    '''

    # Initialize the box dimensions
    model.a = a_init
    model.b = b_init

    for i in model.CIRCLES:
        # Adding the circle radius ensures the center remains in the positive quadrant
        model.x[i] = random.uniform(0, 10) + model.R[i]
        model.y[i] = random.uniform(0, 10) + model.R[i]
Next, we will create a dictionary containing the circle names and radii values.
# Create dictionary with circle data
circle_data = {'A':10.0, 'B':5.0, 'C':3.0}
circle_data
{'A': 10.0, 'B': 5.0, 'C': 3.0}
# Access the keys
circle_data.keys()
dict_keys(['A', 'B', 'C'])
Now let’s create the model.
# Create model
model = create_circle_model(circle_data)
And let’s initialize the model.
# Initialize model
initialize_circle_model(model)
model.pprint()
1 Set Declarations
CIRCLES : Size=1, Index=None, Ordered=Insertion
Key : Dimen : Domain : Size : Members
None : 1 : Any : 3 : {'A', 'B', 'C'}
1 Param Declarations
R : Size=3, Index=CIRCLES, Domain=PositiveReals, Default=None, Mutable=False
Key : Value
A : 10.0
B : 5.0
C : 3.0
4 Var Declarations
a : Size=1, Index=None
Key : Lower : Value : Upper : Fixed : Stale : Domain
None : 0 : 25 : None : False : False : PositiveReals
b : Size=1, Index=None
Key : Lower : Value : Upper : Fixed : Stale : Domain
None : 0 : 25 : None : False : False : PositiveReals
x : Size=3, Index=CIRCLES
Key : Lower : Value : Upper : Fixed : Stale : Domain
A : 0 : 11.513225996655489 : None : False : False : PositiveReals
B : 0 : 8.43454062603906 : None : False : False : PositiveReals
C : 0 : 4.040445303542623 : None : False : False : PositiveReals
y : Size=3, Index=CIRCLES
Key : Lower : Value : Upper : Fixed : Stale : Domain
A : 0 : 13.183454470479534 : None : False : False : PositiveReals
B : 0 : 14.420996792060324 : None : False : False : PositiveReals
C : 0 : 8.426415825883453 : None : False : False : PositiveReals
1 Objective Declarations
obj : Size=1, Index=None, Active=True
Key : Active : Sense : Expression
None : True : minimize : 2*(a + b)
5 Constraint Declarations
left_x_con : Size=3, Index=CIRCLES, Active=True
Key : Lower : Body : Upper : Active
A : 10.0 : x[A] : +Inf : True
B : 5.0 : x[B] : +Inf : True
C : 3.0 : x[C] : +Inf : True
left_y_con : Size=3, Index=CIRCLES, Active=True
Key : Lower : Body : Upper : Active
A : 10.0 : y[A] : +Inf : True
B : 5.0 : y[B] : +Inf : True
C : 3.0 : y[C] : +Inf : True
no_overlap_con : Size=3, Index=CIRCLES*CIRCLES, Active=True
Key : Lower : Body : Upper : Active
('A', 'B') : 225.0 : (x[A] - x[B])**2 + (y[A] - y[B])**2 : +Inf : True
('A', 'C') : 169.0 : (x[A] - x[C])**2 + (y[A] - y[C])**2 : +Inf : True
('B', 'C') : 64.0 : (x[B] - x[C])**2 + (y[B] - y[C])**2 : +Inf : True
right_x_con : Size=3, Index=CIRCLES, Active=True
Key : Lower : Body : Upper : Active
A : -Inf : x[A] - (b - 10.0) : 0.0 : True
B : -Inf : x[B] - (b - 5.0) : 0.0 : True
C : -Inf : x[C] - (b - 3.0) : 0.0 : True
right_y_con : Size=3, Index=CIRCLES, Active=True
Key : Lower : Body : Upper : Active
A : -Inf : y[A] - (a - 10.0) : 0.0 : True
B : -Inf : y[B] - (a - 5.0) : 0.0 : True
C : -Inf : y[C] - (a - 3.0) : 0.0 : True
12 Declarations: CIRCLES R a b obj x y left_x_con left_y_con right_x_con right_y_con no_overlap_con
7.1.3.2. Visualize Initial Point#
Next, we’ll define a function to plot the solution (or initial point)
# Plot initial point
def plot_circles(m):
    ''' Plot circles using data in Pyomo model

    Arguments:
        m: Pyomo concrete model

    Returns:
        Nothing (but makes a figure)
    '''

    # Create figure
    fig, ax = plt.subplots(1, figsize=(6, 6))

    # Adjust axes
    l = max(m.a.value, m.b.value) + 1
    ax.set_xlim(0, l)
    ax.set_ylim(0, l)

    # Draw box
    art = mpatches.Rectangle((0, 0), width=m.b.value, height=m.a.value, fill=False)
    ax.add_patch(art)

    # Draw circles and mark centers
    for i in m.CIRCLES:
        art2 = mpatches.Circle((m.x[i].value, m.y[i].value), radius=m.R[i], fill=True, alpha=0.25)
        ax.add_patch(art2)
        plt.scatter(m.x[i].value, m.y[i].value, color='black')

    # Show plot
    plt.show()
plot_circles(model)
7.1.3.3. Solve and Inspect the Solution#
# Specify the solver
solver = pyo.SolverFactory('ipopt')
# Solve the model
results = solver.solve(model, tee = True)
Ipopt 3.14.16:
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit https://github.com/coin-or/Ipopt
******************************************************************************
This is Ipopt version 3.14.16, running with linear solver MUMPS 5.7.3.
Number of nonzeros in equality constraint Jacobian...: 0
Number of nonzeros in inequality constraint Jacobian.: 30
Number of nonzeros in Lagrangian Hessian.............: 12
Total number of variables............................: 8
variables with only lower bounds: 8
variables with lower and upper bounds: 0
variables with only upper bounds: 0
Total number of equality constraints.................: 0
Total number of inequality constraints...............: 15
inequality constraints with only lower bounds: 9
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 6
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
0 1.0000000e+02 2.14e+02 1.00e+00 -1.0 0.00e+00 - 0.00e+00 0.00e+00 0
1 9.7404570e+01 1.41e+02 2.57e+00 -1.0 1.90e+01 - 1.53e-01 1.79e-01h 1
2 9.8778030e+01 4.43e+01 4.34e-01 -1.0 6.90e+00 - 5.38e-01 5.49e-01h 1
3 9.7534613e+01 2.66e+01 4.02e-01 -1.0 1.12e+01 - 3.40e-01 2.08e-01h 1
4 9.8462936e+01 7.50e+00 2.95e-01 -1.0 9.82e-01 - 9.09e-01 6.57e-01h 1
5 9.8784717e+01 0.00e+00 1.63e-02 -1.0 5.11e+00 - 1.00e+00 9.29e-01h 1
6 9.8395226e+01 0.00e+00 2.45e-03 -1.7 2.38e+01 - 1.00e+00 1.00e+00h 1
7 9.8404859e+01 0.00e+00 8.76e-04 -1.7 3.43e+01 - 1.00e+00 1.00e+00h 1
8 9.8404875e+01 0.00e+00 2.03e-05 -1.7 5.60e+00 - 1.00e+00 1.00e+00h 1
9 9.8300803e+01 0.00e+00 1.10e-05 -2.5 2.48e-01 - 1.00e+00 1.00e+00h 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
10 9.8285163e+01 0.00e+00 2.83e-07 -3.8 3.59e-02 - 1.00e+00 1.00e+00h 1
11 9.8284281e+01 0.00e+00 1.05e-09 -5.7 2.05e-03 - 1.00e+00 1.00e+00h 1
12 9.8284270e+01 0.00e+00 8.68e-13 -8.6 1.53e-04 - 1.00e+00 1.00e+00h 1
Number of Iterations....: 12
(scaled) (unscaled)
Objective...............: 9.8284270438748564e+01 9.8284270438748564e+01
Dual infeasibility......: 8.6837913151726545e-13 8.6837913151726545e-13
Constraint violation....: 0.0000000000000000e+00 0.0000000000000000e+00
Variable bound violation: 0.0000000000000000e+00 0.0000000000000000e+00
Complementarity.........: 2.5110436111686484e-09 2.5110436111686484e-09
Overall NLP error.......: 2.5110436111686484e-09 2.5110436111686484e-09
Number of objective function evaluations = 13
Number of objective gradient evaluations = 13
Number of equality constraint evaluations = 0
Number of inequality constraint evaluations = 13
Number of equality constraint Jacobian evaluations = 0
Number of inequality constraint Jacobian evaluations = 13
Number of Lagrangian Hessian evaluations = 12
Total seconds in IPOPT = 0.005
EXIT: Optimal Solution Found.
Next, we can inspect the solution. Because Pyomo is a Python extension, we can use Python (for loops, etc.) to programmatically inspect the solution.
# Print variable values
print("Name\tValue")
for c in model.component_data_objects(pyo.Var):
    print(c.name, "\t", pyo.value(c))
# Plot solution
plot_circles(model)
Name Value
a 19.999999803189517
b 29.142135416184765
x[A] 19.14213551493193
x[B] 4.999999951252107
x[C] 5.397289325741121
y[A] 9.999999901252016
y[B] 14.999999849644135
y[C] 4.723178462635149
# Print constraints
for c in model.component_data_objects(pyo.Constraint):
    print(c.name, "\t", pyo.value(c.lower), "\t", pyo.value(c.body), "\t", pyo.value(c.upper))
left_x_con[A] 10.0 19.14213551493193 None
left_x_con[B] 5.0 4.999999951252107 None
left_x_con[C] 3.0 5.397289325741121 None
left_y_con[A] 10.0 9.999999901252016 None
left_y_con[B] 5.0 14.999999849644135 None
left_y_con[C] 3.0 4.723178462635149 None
right_x_con[A] None 9.874716511149018e-08 0.0
right_x_con[B] None -19.142135464932657 0.0
right_x_con[C] None -20.744846090443644 0.0
right_y_con[A] None 9.806249856580962e-08 0.0
right_y_con[B] None 4.6454617930180575e-08 0.0
right_y_con[C] None -12.276821340554369 0.0
no_overlap_con[A,B] 225.0 224.9999977854188 None
no_overlap_con[A,C] 169.0 216.7656412595597 None
no_overlap_con[B,C] 64.0 105.77089666756719 None
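Notice that the no-overlap constraint for circles A and B is active at the solution. As a quick sanity check, we can recompute the squared center distance for this pair directly from the reported solution values; it should equal \((R_A + R_B)^2 = 225\) within Ipopt's tolerance:

```python
# Reported solution values (copied from the output above)
xA, yA = 19.14213551493193, 9.999999901252016
xB, yB = 4.999999951252107, 14.999999849644135
RA, RB = 10.0, 5.0

# Squared center distance vs. the (R_A + R_B)^2 = 225 lower bound
dist_sq = (xA - xB)**2 + (yA - yB)**2
print(dist_sq)
print(abs(dist_sq - (RA + RB)**2) < 1e-4)  # active within solver tolerance
```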
7.1.3.4. Reinitialize and Resolve#
Reinitialize the model, plot the initial point, resolve, and plot the solution. Is there more than one solution?
# Initialize and print the model
initialize_circle_model(model)
# Plot initial point
plot_circles(model)
# Solve the model
results = solver.solve(model, tee = True)
Ipopt 3.14.16:
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit https://github.com/coin-or/Ipopt
******************************************************************************
This is Ipopt version 3.14.16, running with linear solver MUMPS 5.7.3.
Number of nonzeros in equality constraint Jacobian...: 0
Number of nonzeros in inequality constraint Jacobian.: 30
Number of nonzeros in Lagrangian Hessian.............: 12
Total number of variables............................: 8
variables with only lower bounds: 8
variables with lower and upper bounds: 0
variables with only upper bounds: 0
Total number of equality constraints.................: 0
Total number of inequality constraints...............: 15
inequality constraints with only lower bounds: 9
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 6
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
0 1.0000000e+02 1.06e+02 1.00e+00 -1.0 0.00e+00 - 0.00e+00 0.00e+00 0
1 1.0393839e+02 5.88e+01 5.92e-01 -1.0 8.61e+00 - 5.61e-01 3.97e-01h 1
2 1.0671680e+02 3.79e+01 4.96e+00 -1.0 4.14e+00 0.0 8.43e-01 3.34e-01h 1
3 1.1190692e+02 0.00e+00 8.93e-01 -1.0 2.68e+00 -0.5 1.00e+00 1.00e+00h 1
4 1.0135309e+02 5.13e+00 1.08e+00 -1.0 1.30e+01 -1.0 9.82e-01 8.98e-01F 1
5 1.0112870e+02 4.47e+00 1.01e+00 -1.0 7.89e+00 -1.4 3.24e-01 5.67e-02h 1
6 1.0075878e+02 0.00e+00 6.13e-01 -1.0 1.56e+00 -1.0 1.00e+00 6.20e-01f 1
7 9.8921905e+01 0.00e+00 6.98e-01 -1.0 1.82e+01 -1.5 2.10e-01 1.05e-01f 1
8 9.9444694e+01 0.00e+00 1.28e-02 -1.0 2.52e+00 - 9.90e-01 1.00e+00f 1
9 9.8421857e+01 0.00e+00 3.72e-03 -1.7 3.28e+01 - 1.00e+00 9.53e-01f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
10 9.8404978e+01 0.00e+00 5.54e-04 -1.7 3.98e+01 - 1.00e+00 1.00e+00h 1
11 9.8405025e+01 0.00e+00 4.31e-05 -1.7 7.19e+00 - 1.00e+00 1.00e+00h 1
12 9.8300697e+01 0.00e+00 1.23e-05 -2.5 2.47e-01 - 1.00e+00 1.00e+00h 1
13 9.8285160e+01 0.00e+00 2.96e-07 -3.8 3.60e-02 - 1.00e+00 1.00e+00h 1
14 9.8284281e+01 0.00e+00 1.37e-09 -5.7 2.06e-03 - 1.00e+00 1.00e+00h 1
15 9.8284270e+01 0.00e+00 1.70e-12 -8.6 2.13e-04 - 1.00e+00 1.00e+00h 1
Number of Iterations....: 15
(scaled) (unscaled)
Objective...............: 9.8284270438747257e+01 9.8284270438747257e+01
Dual infeasibility......: 1.7041755864611193e-12 1.7041755864611193e-12
Constraint violation....: 0.0000000000000000e+00 0.0000000000000000e+00
Variable bound violation: 0.0000000000000000e+00 0.0000000000000000e+00
Complementarity.........: 2.5155239053889899e-09 2.5155239053889899e-09
Overall NLP error.......: 2.5155239053889899e-09 2.5155239053889899e-09
Number of objective function evaluations = 17
Number of objective gradient evaluations = 16
Number of equality constraint evaluations = 0
Number of inequality constraint evaluations = 17
Number of equality constraint Jacobian evaluations = 0
Number of inequality constraint Jacobian evaluations = 16
Number of Lagrangian Hessian evaluations = 15
Total seconds in IPOPT = 0.003
EXIT: Optimal Solution Found.
# Plot solution
plot_circles(model)
7.1.4. Take Away Messages#
Nonlinear programs may be nonconvex. Nonconvex problems often have many local optima that are not global optima.
Initialization is really important in optimization problems with nonlinear objectives or constraints!
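One common safeguard for nonconvex problems is a multistart strategy: solve from many random initial points and keep the best local solution found. A minimal sketch of the idea in plain Python (the 1-D objective \(f(x) = x^4 - 3x^2 + x\), which has two local minima, is a hypothetical example):

```python
import random

f = lambda x: x**4 - 3*x**2 + x   # nonconvex: two local minima
df = lambda x: 4*x**3 - 6*x + 1   # derivative

def local_solve(x0, eta=0.01, iters=2000):
    """Plain gradient descent: converges to a nearby local minimum."""
    x = x0
    for _ in range(iters):
        x -= eta * df(x)
    return x

random.seed(0)
starts = [random.uniform(-2, 2) for _ in range(20)]
minima = [local_solve(x0) for x0 in starts]

# Different starting points land in different basins of attraction...
print(sorted(set(round(m, 3) for m in minima)))
# ...but the multistart keeps the best local solution found
best = min(minima, key=f)
print(round(best, 3), round(f(best), 3))
```

Multistart offers no guarantee of global optimality, but it is a cheap, practical hedge against a poor initial point, exactly the issue seen in the circle packing example above.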