7.1. Convexity Revisited#

# This code cell installs packages on Colab

import sys
if "google.colab" in sys.modules:
    !wget "https://raw.githubusercontent.com/ndcbe/optimization/main/notebooks/helper.py"
    import helper
    helper.easy_install()
else:
    sys.path.insert(0, '../')
    import helper
helper.set_plotting_style()
import pandas as pd
import pyomo.environ as pyo

7.1.1. Background#

Reference: Beginning of Chapter 4 in Biegler (2010)

7.1.1.1. Canonical Nonlinear Program (NLP)#

minxf(x)(objective function)s.t.gi(x)0,i=1,,m(inequality constraints)hj(x)=0,j=1,,p(equality constraints)xRn(decision variables)

Assumption: functions f(x):RnR, h(x):RnRm, and g(x):RnRr have continuous first and second derivatives.

Denote the feasible region as:

F={x|g(x)0,h(x)=0}.

7.1.1.2. Types of Constrained Optimal Solutions#

Definition 4.1: Constrained Optimal Solutions

A point x is a global minimizer if f(x)f(x) for all xF.

A point x is a local minimizer if f(x)f(x) for all xN(x)F, where we define $N(x)={x:xx<ϵ},ϵ>0.$

A point x is a strict local minimizer if f(x)<f(x) for all xN(x)F.

A point x is an isolated local minimizer if there are no other local minimizers in N(x)F.

7.1.1.3. Key Questions#

As with unconstrained optimization, the following questions need to be considered:

  • If a solution x exists, is it a global solution in F or is it only a local solution?

  • What conditions characterize the optimal solutions?

  • Are there special problem classes of the NLP whose solutions have stronger properties and are easier to solve?

  • Are there efficient and reliable methods to solve the NLP?

7.1.2. Convexity for Constrained Optimization#

Reference: Section 4.1 in Biegler (2010)

7.1.2.1. Illustrative Examples#

Main idea: are the objective function and feasible region both convex?

f(x) is convex on the domain xX if and only if
$αf(xa)+(1α)f(xb)f(αxa+(1α)xb)xa,xbX,α(0,1).$

Strict convexity requires the inequality to be strict.

concept_test_1

The region Y is convex if and only if
$αxa+(1α)xbYxa,xbY,α[0,1].$

concept_test_2

7.1.2.2. Theorem 4.2: Convexity#

Theorem 4.2 If g(x) is convex and h(x) is linear, then the region

F={x|g(x)0,h(x)=0}

is convex, i.e.,

αxa+(1α)xbFfor all α(0,1) and xa,xbF.

Proof

  1. Consider two points xa,xbF and

    x¯=αxa+(1α)xbfor some α(0,1).
  2. If x¯F, then (i) g(x¯)>0 or (ii) h(x¯)0 or both.

    • Case (i): Recall g(x) is convex, and g(xa)0 and g(xb)0 (both xa and xb are feasible).
      $0αg(xa)+(1α)g(xb)g(αxa+(1α)xb)=g(x¯)(definition of convexity).Thus,g(\bar{x}) \leq 0$.

    • Case (ii): Recall h(x) is linear, and h(xa)=0 and h(xb)=0.

      0=αh(xa)+(1α)h(xb)(property of linear functions)=h(αxa+(1α)xb)=h(x¯)(definition of x¯).

      Thus, g(x¯)0 and h(x¯)=0.

This leads to a contradiction.

7.1.2.3. Theorem 4.3: Global Minimizers#

Theorem 4.3 If f(x) is convex and F is convex, then every local minimum in F is a global minimum. If f(x) is strictly convex in F, then a local minimum is the unique global minimum.

Proof: Convexity Claim

  1. Assumption: There are two local minima xa,xbF with f(xa)>f(xb). Seek contradiction.

  2. Definition of Local Minimum:

    • f(xa)f(x),xN(xa)F

    • f(xb)f(x),xN(xb)F

  3. By Convexity:

    • (1α)xa+αxbF

    • f((1α)xa+αxb)(1α)f(xa)+αf(xb),α(0,1)

  4. Choose α such that:

    x¯=(1α)xa+αxbN(xa)F.

    Thus,
    $f(x¯)f(xa)+α(f(xb)f(xa)).$

    Recall f(xb)<f(xa), so

    f(x¯)<f(xa).

    This is a contradiction, as xa cannot be a local minimizer.

Proof: Stricty Convexity Claim. Same idea as above. Assume f(xa)f(xb) in (1). Use strict inequality (from the strict convexity definition) in (3) and (4).

7.1.2.4. More Illustrative Examples#

concept_test3

7.1.3. Circle Packing Example#

Reference: Section 4.1 in Biegler (2010)

Motivating Question: Is this problem convex?

What is the smallest rectangle you can use to enclose three given circles? Reference: Example 4.4 in Biegler (2010).

picture

7.1.3.1. Optimization Model and Pyomo Implementation#

The following optimization model is given in Biegler (2010) and adapted to use set notation.

minx,y,A,B2(A+B)s.t.A0,B0xi,yi,Ri,xiBRi,yiARi,iC(xixj)2+(yiyj)2(Ri+Rj)2,i,j{iC,jC:i<j}
import random
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

def create_circle_model(circle_radii):
    ''' Create circle optimization model in Pyomo
    
    Arguments:
        circle_radii: dictionary with keys=circle name and value=radius (float)
        
    Returns:
        model: Pyomo model
    '''

    # Number of circles to consider
    n = len(circle_radii)

    # Create a concrete Pyomo model.
    model = pyo.ConcreteModel()

    # Initialize index for circles
    model.CIRCLES = pyo.Set(initialize = circle_radii.keys())
    
    # Create parameter
    model.R = pyo.Param(model.CIRCLES, domain=pyo.PositiveReals, initialize=circle_radii)

    # Create variables for box
    model.a = pyo.Var(domain=pyo.PositiveReals)
    model.b = pyo.Var(domain=pyo.PositiveReals)

    # Set objective
    model.obj = pyo.Objective(expr=2*(model.a + model.b), sense = pyo.minimize)

    # Create variables for circle centers
    model.x = pyo.Var(model.CIRCLES, domain=pyo.PositiveReals)
    model.y = pyo.Var(model.CIRCLES, domain=pyo.PositiveReals)

    # "In the box" constraints
    def left_x(m,c):
        return m.x[c] >= model.R[c]
    model.left_x_con = pyo.Constraint(model.CIRCLES, rule=left_x)

    def left_y(m,c):
        return m.y[c] >= model.R[c]
    model.left_y_con = pyo.Constraint(model.CIRCLES, rule=left_y)

    def right_x(m,c):
        return m.x[c] <= m.b - model.R[c]
    model.right_x_con = pyo.Constraint(model.CIRCLES, rule=right_x)

    def right_y(m,c):
        return m.y[c] <= m.a - model.R[c]
    model.right_y_con = pyo.Constraint(model.CIRCLES, rule=right_y)

    # No overlap constraints
    def no_overlap(m,c1,c2):
        if c1 < c2:
            return (m.x[c1] - m.x[c2])**2 + (m.y[c1] - m.y[c2])**2 >= (model.R[c1] + model.R[c2])**2
        else:
            return pyo.Constraint.Skip
    model.no_overlap_con = pyo.Constraint(model.CIRCLES, model.CIRCLES, rule=no_overlap)
    
    return model

def initialize_circle_model(model, a_init=25, b_init=25):
    ''' Initialize the x and y coordinates using uniform distribution
    
    Arguments:
        a_init: initial value for a (default=25)
        b_init: initial value for b (default=25)
        
    Returns:
        Nothing. But per Pyomo scoping rules, the input argument `model`
        can be modified in this function.
    
    '''
    # Initialize 
    model.a = 25
    model.b = 25

    for i in model.CIRCLES:
        # Adding circle radii ensures the remains in the >0, >0 quadrant
        model.x[i] = random.uniform(0,10) + model.R[i]
        model.y[i] = random.uniform(0,10) + model.R[i]

Next, we will create a dictionary containing the circle names and radii values.

# Create dictionary with circle data
circle_data = {'A':10.0, 'B':5.0, 'C':3.0}
circle_data
{'A': 10.0, 'B': 5.0, 'C': 3.0}
# Access the keys
circle_data.keys()
dict_keys(['A', 'B', 'C'])

Now let’s create the model.

# Create model
model = create_circle_model(circle_data)

And let’s initialize the model.

# Initialize model
initialize_circle_model(model)
model.pprint()
1 Set Declarations
    CIRCLES : Size=1, Index=None, Ordered=Insertion
        Key  : Dimen : Domain : Size : Members
        None :     1 :    Any :    3 : {'A', 'B', 'C'}

1 Param Declarations
    R : Size=3, Index=CIRCLES, Domain=PositiveReals, Default=None, Mutable=False
        Key : Value
          A :  10.0
          B :   5.0
          C :   3.0

4 Var Declarations
    a : Size=1, Index=None
        Key  : Lower : Value : Upper : Fixed : Stale : Domain
        None :     0 :    25 :  None : False : False : PositiveReals
    b : Size=1, Index=None
        Key  : Lower : Value : Upper : Fixed : Stale : Domain
        None :     0 :    25 :  None : False : False : PositiveReals
    x : Size=3, Index=CIRCLES
        Key : Lower : Value              : Upper : Fixed : Stale : Domain
          A :     0 : 11.513225996655489 :  None : False : False : PositiveReals
          B :     0 :   8.43454062603906 :  None : False : False : PositiveReals
          C :     0 :  4.040445303542623 :  None : False : False : PositiveReals
    y : Size=3, Index=CIRCLES
        Key : Lower : Value              : Upper : Fixed : Stale : Domain
          A :     0 : 13.183454470479534 :  None : False : False : PositiveReals
          B :     0 : 14.420996792060324 :  None : False : False : PositiveReals
          C :     0 :  8.426415825883453 :  None : False : False : PositiveReals

1 Objective Declarations
    obj : Size=1, Index=None, Active=True
        Key  : Active : Sense    : Expression
        None :   True : minimize : 2*(a + b)

5 Constraint Declarations
    left_x_con : Size=3, Index=CIRCLES, Active=True
        Key : Lower : Body : Upper : Active
          A :  10.0 : x[A] :  +Inf :   True
          B :   5.0 : x[B] :  +Inf :   True
          C :   3.0 : x[C] :  +Inf :   True
    left_y_con : Size=3, Index=CIRCLES, Active=True
        Key : Lower : Body : Upper : Active
          A :  10.0 : y[A] :  +Inf :   True
          B :   5.0 : y[B] :  +Inf :   True
          C :   3.0 : y[C] :  +Inf :   True
    no_overlap_con : Size=3, Index=CIRCLES*CIRCLES, Active=True
        Key        : Lower : Body                                : Upper : Active
        ('A', 'B') : 225.0 : (x[A] - x[B])**2 + (y[A] - y[B])**2 :  +Inf :   True
        ('A', 'C') : 169.0 : (x[A] - x[C])**2 + (y[A] - y[C])**2 :  +Inf :   True
        ('B', 'C') :  64.0 : (x[B] - x[C])**2 + (y[B] - y[C])**2 :  +Inf :   True
    right_x_con : Size=3, Index=CIRCLES, Active=True
        Key : Lower : Body              : Upper : Active
          A :  -Inf : x[A] - (b - 10.0) :   0.0 :   True
          B :  -Inf :  x[B] - (b - 5.0) :   0.0 :   True
          C :  -Inf :  x[C] - (b - 3.0) :   0.0 :   True
    right_y_con : Size=3, Index=CIRCLES, Active=True
        Key : Lower : Body              : Upper : Active
          A :  -Inf : y[A] - (a - 10.0) :   0.0 :   True
          B :  -Inf :  y[B] - (a - 5.0) :   0.0 :   True
          C :  -Inf :  y[C] - (a - 3.0) :   0.0 :   True

12 Declarations: CIRCLES R a b obj x y left_x_con left_y_con right_x_con right_y_con no_overlap_con

7.1.3.2. Visualize Initial Point#

Next, we’ll define a function to plot the solution (or initial point)

# Plot initial point

def plot_circles(m):
    ''' Plot circles using data in Pyomo model
    
    Arguments:
        m: Pyomo concrete model
    
    Returns:
        Nothing (but makes a figure)
    
    '''
    
    # Create figure
    fig, ax = plt.subplots(1,figsize=(6,6))
    
    # Adjust axes
    l = max(m.a.value,m.b.value) + 1
    ax.set_xlim(0,l)
    ax.set_ylim(0,l)
    
    # Draw box
    art = mpatches.Rectangle((0,0), width=m.b.value, height=m.a.value,fill=False)
    ax.add_patch(art)

    # Draw circles and mark center
    for i in m.CIRCLES:
        art2 = mpatches.Circle( (m.x[i].value,m.y[i].value), radius=m.R[i],fill=True,alpha=0.25)
        ax.add_patch(art2)
        
        plt.scatter(m.x[i].value,m.y[i].value,color='black')
    
    # Show plot
    plt.show()
    
plot_circles(model)
../../_images/5137110fd92d69a9f96acc41dc277d8473d6d59be233e047f0abfa3028b722a2.png

7.1.3.3. Solve and Inspect the Solution#

# Specify the solver
solver = pyo.SolverFactory('ipopt')

# Solve the model
results = solver.solve(model, tee = True)
Ipopt 3.14.16: 

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

This is Ipopt version 3.14.16, running with linear solver MUMPS 5.7.3.

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:       30
Number of nonzeros in Lagrangian Hessian.............:       12

Total number of variables............................:        8
                     variables with only lower bounds:        8
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:       15
        inequality constraints with only lower bounds:        9
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        6

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  1.0000000e+02 2.14e+02 1.00e+00  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  9.7404570e+01 1.41e+02 2.57e+00  -1.0 1.90e+01    -  1.53e-01 1.79e-01h  1
   2  9.8778030e+01 4.43e+01 4.34e-01  -1.0 6.90e+00    -  5.38e-01 5.49e-01h  1
   3  9.7534613e+01 2.66e+01 4.02e-01  -1.0 1.12e+01    -  3.40e-01 2.08e-01h  1
   4  9.8462936e+01 7.50e+00 2.95e-01  -1.0 9.82e-01    -  9.09e-01 6.57e-01h  1
   5  9.8784717e+01 0.00e+00 1.63e-02  -1.0 5.11e+00    -  1.00e+00 9.29e-01h  1
   6  9.8395226e+01 0.00e+00 2.45e-03  -1.7 2.38e+01    -  1.00e+00 1.00e+00h  1
   7  9.8404859e+01 0.00e+00 8.76e-04  -1.7 3.43e+01    -  1.00e+00 1.00e+00h  1
   8  9.8404875e+01 0.00e+00 2.03e-05  -1.7 5.60e+00    -  1.00e+00 1.00e+00h  1
   9  9.8300803e+01 0.00e+00 1.10e-05  -2.5 2.48e-01    -  1.00e+00 1.00e+00h  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  9.8285163e+01 0.00e+00 2.83e-07  -3.8 3.59e-02    -  1.00e+00 1.00e+00h  1
  11  9.8284281e+01 0.00e+00 1.05e-09  -5.7 2.05e-03    -  1.00e+00 1.00e+00h  1
  12  9.8284270e+01 0.00e+00 8.68e-13  -8.6 1.53e-04    -  1.00e+00 1.00e+00h  1

Number of Iterations....: 12

                                   (scaled)                 (unscaled)
Objective...............:   9.8284270438748564e+01    9.8284270438748564e+01
Dual infeasibility......:   8.6837913151726545e-13    8.6837913151726545e-13
Constraint violation....:   0.0000000000000000e+00    0.0000000000000000e+00
Variable bound violation:   0.0000000000000000e+00    0.0000000000000000e+00
Complementarity.........:   2.5110436111686484e-09    2.5110436111686484e-09
Overall NLP error.......:   2.5110436111686484e-09    2.5110436111686484e-09


Number of objective function evaluations             = 13
Number of objective gradient evaluations             = 13
Number of equality constraint evaluations            = 0
Number of inequality constraint evaluations          = 13
Number of equality constraint Jacobian evaluations   = 0
Number of inequality constraint Jacobian evaluations = 13
Number of Lagrangian Hessian evaluations             = 12
Total seconds in IPOPT                               = 0.005

EXIT: Optimal Solution Found.


Next, we can inspect the solution. Because Pyomo is a Python extension, we can use Pyoth (for loops, etc.) to programmatically inspect the solution.

# Print variable values
print("Name\tValue")
for c in model.component_data_objects(pyo.Var):
    print(c.name,"\t", pyo.value(c))

# Plot solution
plot_circles(model)
Name	Value
a 	 19.999999803189517
b 	 29.142135416184765
x[A] 	 19.14213551493193
x[B] 	 4.999999951252107
x[C] 	 5.397289325741121
y[A] 	 9.999999901252016
y[B] 	 14.999999849644135
y[C] 	 4.723178462635149
../../_images/646b723200b136ac990f528795216993d328a00a382ed201dfde381df32f281a.png
# Print constraints
for c in model.component_data_objects(pyo.Constraint):
    print(c.name,"\t", pyo.value(c.lower),"\t", pyo.value(c.body),"\t", pyo.value(c.upper))
left_x_con[A] 	 10.0 	 19.14213551493193 	 None
left_x_con[B] 	 5.0 	 4.999999951252107 	 None
left_x_con[C] 	 3.0 	 5.397289325741121 	 None
left_y_con[A] 	 10.0 	 9.999999901252016 	 None
left_y_con[B] 	 5.0 	 14.999999849644135 	 None
left_y_con[C] 	 3.0 	 4.723178462635149 	 None
right_x_con[A] 	 None 	 9.874716511149018e-08 	 0.0
right_x_con[B] 	 None 	 -19.142135464932657 	 0.0
right_x_con[C] 	 None 	 -20.744846090443644 	 0.0
right_y_con[A] 	 None 	 9.806249856580962e-08 	 0.0
right_y_con[B] 	 None 	 4.6454617930180575e-08 	 0.0
right_y_con[C] 	 None 	 -12.276821340554369 	 0.0
no_overlap_con[A,B] 	 225.0 	 224.9999977854188 	 None
no_overlap_con[A,C] 	 169.0 	 216.7656412595597 	 None
no_overlap_con[B,C] 	 64.0 	 105.77089666756719 	 None

7.1.3.4. Reinitialize and Resolve#

Reinitialize the model, plot the initial point, resolve, and plot the solution. Is there more than one solution?

# Initialize and print the model
initialize_circle_model(model)
# Plot initial point
plot_circles(model)
../../_images/021db11f97d7e3cfc76b677a529f925731c255241955a4259d84365f59931768.png
# Solve the model
results = solver.solve(model, tee = True)
Ipopt 3.14.16: 

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

This is Ipopt version 3.14.16, running with linear solver MUMPS 5.7.3.

Number of nonzeros in equality constraint Jacobian...:        0
Number of nonzeros in inequality constraint Jacobian.:       30
Number of nonzeros in Lagrangian Hessian.............:       12

Total number of variables............................:        8
                     variables with only lower bounds:        8
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:        0
Total number of inequality constraints...............:       15
        inequality constraints with only lower bounds:        9
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:        6

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  1.0000000e+02 1.06e+02 1.00e+00  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  1.0393839e+02 5.88e+01 5.92e-01  -1.0 8.61e+00    -  5.61e-01 3.97e-01h  1
   2  1.0671680e+02 3.79e+01 4.96e+00  -1.0 4.14e+00   0.0 8.43e-01 3.34e-01h  1
   3  1.1190692e+02 0.00e+00 8.93e-01  -1.0 2.68e+00  -0.5 1.00e+00 1.00e+00h  1
   4  1.0135309e+02 5.13e+00 1.08e+00  -1.0 1.30e+01  -1.0 9.82e-01 8.98e-01F  1
   5  1.0112870e+02 4.47e+00 1.01e+00  -1.0 7.89e+00  -1.4 3.24e-01 5.67e-02h  1
   6  1.0075878e+02 0.00e+00 6.13e-01  -1.0 1.56e+00  -1.0 1.00e+00 6.20e-01f  1
   7  9.8921905e+01 0.00e+00 6.98e-01  -1.0 1.82e+01  -1.5 2.10e-01 1.05e-01f  1
   8  9.9444694e+01 0.00e+00 1.28e-02  -1.0 2.52e+00    -  9.90e-01 1.00e+00f  1
   9  9.8421857e+01 0.00e+00 3.72e-03  -1.7 3.28e+01    -  1.00e+00 9.53e-01f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  9.8404978e+01 0.00e+00 5.54e-04  -1.7 3.98e+01    -  1.00e+00 1.00e+00h  1
  11  9.8405025e+01 0.00e+00 4.31e-05  -1.7 7.19e+00    -  1.00e+00 1.00e+00h  1
  12  9.8300697e+01 0.00e+00 1.23e-05  -2.5 2.47e-01    -  1.00e+00 1.00e+00h  1
  13  9.8285160e+01 0.00e+00 2.96e-07  -3.8 3.60e-02    -  1.00e+00 1.00e+00h  1
  14  9.8284281e+01 0.00e+00 1.37e-09  -5.7 2.06e-03    -  1.00e+00 1.00e+00h  1
  15  9.8284270e+01 0.00e+00 1.70e-12  -8.6 2.13e-04    -  1.00e+00 1.00e+00h  1

Number of Iterations....: 15

                                   (scaled)                 (unscaled)
Objective...............:   9.8284270438747257e+01    9.8284270438747257e+01
Dual infeasibility......:   1.7041755864611193e-12    1.7041755864611193e-12
Constraint violation....:   0.0000000000000000e+00    0.0000000000000000e+00
Variable bound violation:   0.0000000000000000e+00    0.0000000000000000e+00
Complementarity.........:   2.5155239053889899e-09    2.5155239053889899e-09
Overall NLP error.......:   2.5155239053889899e-09    2.5155239053889899e-09


Number of objective function evaluations             = 17
Number of objective gradient evaluations             = 16
Number of equality constraint evaluations            = 0
Number of inequality constraint evaluations          = 17
Number of equality constraint Jacobian evaluations   = 0
Number of inequality constraint Jacobian evaluations = 16
Number of Lagrangian Hessian evaluations             = 15
Total seconds in IPOPT                               = 0.003

EXIT: Optimal Solution Found.

# Plot solution
plot_circles(model)
../../_images/40bd1dc35edf334c78b27ded5a47535e30cf9ad47a6785f9799e3ed4389c3689.png

7.1.4. Take Away Messages#

  • Nonlinear programs may be nonconvex. For nonconvex problems, there often exists many local optima that are not also global optima.

  • Initialization is really important in optimization problems with nonlinear objectives or constraints!