7.4. Constraint Qualifications#
7.4.1. Concepts#
Feasible Sequence
Feasible point \(\bar{x}\)
Sequence \(\{x^k\}\) (e.g., optimization algorithm iterations)
Limit \(\lim_{k \rightarrow \infty} \{x^k\} = \bar{x}\)
\(x^{k}\) feasible for all \(k \geq K\) where \(K\) is sufficiently large
Limiting Direction \(d\)
Recall the ball rolling example.
Feasible sequence: path of the ball rolling to rest at local min
Limiting direction: tangent to the path and opposite of the direct the ball is rolling
In the above picture:
Green sequence is always feasible and thus is a feasible sequence
Blue sequence becomes feasible starting with \(x^3\) and is a feasible sequence
Theorem 4.8: If \(x^*\) is a solution of (NLP), then all feasible sequences leading to \(x^*\) must satisfy
where \(d\) is the limiting direction of the feasible sequence.
Ball analogy: can only roll down the hill. Cannot come from downhill direction (and still be feasible).
Big Picture
Constraint qualification act as the link between KKT conditions (which we numerical solve) and limiting directions (Theorem 4.8).
Key questions:
Does \(\nabla g\) and \(\nabla h\) properly characterize/identify local solutions?
Does \(\nabla h_i(x^*)^T d = 0\) and \(\nabla g_i(x^*)^T d \leq 0\) \(\forall i \in A(x^*)\) capture the limiting directions of the solution?
7.4.2. Linear Constrainted Optimization Problems#
7.4.3. Example#
Consider the following two dimensional optimization problem:
This is an example from Section 4.3 in Nonlinear Programming by Biegler.
import numpy as np
import matplotlib.cm as cm
import matplotlib.pyplot as plt
7.4.3.1. Visualize Feasible Set#
n = 101
x1 = np.linspace(-3,3,n)
plt.figure()
g1 = np.power(x1,3)
g2 = -g1
plt.plot(x1,g1,color="blue",linestyle="-",label="$x_2 \leq x_1^3$")
plt.fill_between(x1,g1,np.min(g1)*np.ones(n),color="blue",alpha=0.5)
plt.plot(x1,g2,color="red",linestyle="-",label="$-x_1^3 \leq x_2$")
plt.fill_between(x1,np.max(g2)*np.ones(n),g2,color="red",alpha=0.5)
plt.xlabel("$x_1$")
plt.ylabel("$x_2$")
plt.grid()
plt.legend(loc="center left")
plt.show()
Discussion
Where on the graph are both constraints satisfied? Choices:
White region
Blue region
Red region
Purple region
7.4.3.2. KKT Conditions#
Rewrite the problem in cannonical form:
The Lagrangian is defined as:
Specifically:
Set the gradient of the Lagrangian with respect to \(x\) equal to zero:
Complementary slackness conditions:
\((x_2 - x_1^3) u_1 = 0\)
\((-x_1^3 - x_2) u_2 = 0\)
Inequality constraints:
\(x_2 - x_1^3 \leq 0\)
\(-x_1^3 - x_2 \leq 0\)
Non-negativity of Lagrange multipliers:
\(u_1 \geq 0\)
\(u_2 \geq 0\)
7.4.3.3. Enumerate the Possible Feasible Sets#
Neither constraint is strongly active
\(u_1 = 0\), \(u_2 = 0\)Gradient of the Lagrangian:
\[\begin{split} \nabla_x L = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{split}\]Solution is not bounded!
\(g_1\) is active, \(g_2\) is NOT strongly active
\(u_1 \geq 0\), \(u_2 = 0\)Gradient of the Lagrangian:
\[\begin{split} \nabla_x L = \begin{bmatrix} 1 - 3u_1 x_1^2 \\ 0 + u_1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{split}\]From this, \(u_1 = 0\)… KKT conditions are not satisfied!
\(g_1\) is NOT strongly active, \(g_2\) is active
Same problem as case 2.Both \(g_1\) and \(g_2\) are strongly active
\(u_1 \geq 0\), \(u_2 \geq 0\)Gradient of the Lagrangian:
\[\begin{split} \nabla_x L = \begin{bmatrix} 1 - 3u_1 x_1^2 - 3u_2 x_1^2 \\ u_1 - u_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{split}\]From \(\nabla_{x_2} L = 0\), we get \(u_1 = u_2\).
Substituting into \(\nabla_{x_1} L = 0\):
\[ 1 - 3u_1 x_1^2 - 3u_2 x_1^2 = 0 \]Solving for \(x_1\):
\[ x_1 = \sqrt{\frac{1}{3u_1}} \]This means we can see from visual inspection that the optimal solution occurs at:
\(x_1 \to 0\), \(x_2 \to 0\)
\(u_1, u_2 \to \infty\)
How to handle this case? Need to draw the feasible region and possible limiting directions.
7.4.3.4. Return to Theorem 4.8#
Recall that KKT conditions encode a limiting direction too:
Consider the example when both constraints are active.
Gradients of the constraints:
Now we construct the KKT conditions:
Recall both constraints are active at the solution \(x_1 = x_2 = 0\):
Substituting \(x_1 = x_2 = 0\):
Thus there are only two solution:
… and all scalar multiples of these.
Does anyone notice anything particular about the gradients of the constraints, especially at the solution \(x_1 = x_2 = 0\)?
7.4.4. Constraint Qualifications#
Constraint qualifications are the link between KKT conditions and limiting directions (Theorem 4.8).
Main idea: Ensure active constraints are not “too nonlinear” and KKT conditions adequately describe limit directions.
7.4.4.1. Linear Independence Constraint Qualification (LICQ)#
Definition 4.12 Given a local solution \(x^*\) to:
with active set \(\mathcal{A}(x^*)\), LICQ holds if:
are linearly independent.
i.e., the part of the Jacobian corresponding to the active set is full rank.
Does LICQ hold for the example? No.
Lemma 4.13 Consider “the cone”:
The set of limiting directions for all feasible sequences is a subset of “the cone”.
If LICQ holds, “the cone” is equivalent to the set of limiting directions for all feasible sequences.
Consider all limiting directions in “the cone” with \(\|d\| = 1\). When LICQ holds, a feasible sequence \(\{x^k\}\):
\[\begin{equation*} x^k = x^* + t d + o(t^2) \end{equation*}\]can always be constructed to satisfy:
\[\begin{equation*} h_i(x^k) = t \nabla h(x^*)^T d = 0, \end{equation*}\]\[\begin{equation*} g_i(x^k) = t \nabla g_i(x^*)^T d \leq 0, \, i \in \mathcal{A}(x^*) \end{equation*}\]for some small positive \(t\) with \(\lim t \to 0\).
What does this mean? When LICQ holds, it is always possible to construct a feasible sequence in the limiting direction. This is important for algorithms.
Theorem 4.14: Consider local solution \(x^*\). If LICQ holds at \(x^*\), then:
The KKT conditions are satisfied.
If:
(i) \(f(x)\) and \(g(x)\) are convex,
(ii) \(h(x)\) is linear,
and (iii) the KKT conditions are satisfied,
then \(x^*\) is a global solution (this implies optimality).
Theorem 4.15 LICQ and Multipliers
Given a point \(x^*\) that satisfies the KKT conditions with multipliers \(u^*\) and \(v^*\), and with active set \(\mathcal{A}(x^*)\). If LICQ holds, then \(u^*\) and \(v^*\) are unique.
Why is this important for interpretation of KKT multipliers? Discuss.
7.4.4.2. Mangasarian-Fromovitz Constraint Qualification (MFCQ)#
What can we say if LICQ does not hold?
Theorem 4.16 Mangasarian-Fromovitz Constraint Qualification (MFCQ)
Given:
A local solution \(x^*\)
Active set \(\mathcal{A}(x^*)\)
MFCQ is defined by:
Linear independence of equality constraint gradients.
There exists a search direction \(d\) such that:
\[\begin{equation*} \nabla g_i(x^*)^T d < 0, \quad \nabla h_i(x^*)^T d = 0, \quad i \in \mathcal{A}(x^*). \end{equation*}\]
Key Properties
LICQ \(\implies\) MFCQ.
If MFCQ holds, then multipliers \(u^*\) and \(v^*\) are bounded (but not necessarily unique).
Does MFCQ hold for the example?
No \(d\) satisfies:
Thus, MFCQ does not hold!
7.4.5. Example Revisited: Solve with Pyomo#
from pyomo.environ import ConcreteModel, Var, Objective, Constraint, SolverFactory, Suffix, minimize, value
## Create concrete Pyomo model
m = ConcreteModel()
## Set up to extract dual variables after model solve.
m.dual = Suffix(direction=Suffix.IMPORT)
## Declare variables with initial values
m.x1 = Var(initialize=1)
m.x2 = Var(initialize=1)
## Declare objective
m.OBJ = Objective(expr=m.x1, sense = minimize)
m.g1 = Constraint(expr=m.x2 <= m.x1**3)
m.g2 = Constraint(expr=-m.x1**3 <= m.x2)
## Specify IPOPT as solver
solver = SolverFactory('ipopt')
## Solve the model
results = solver.solve(m, tee = True)
## Return the solution
print("x1 = ",value(m.x1))
print("x2 = ",value(m.x2))
print("\n")
## Inspect dual variables
m.dual.display()
Ipopt 3.14.16:
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit https://github.com/coin-or/Ipopt
******************************************************************************
This is Ipopt version 3.14.16, running with linear solver MUMPS 5.7.3.
Number of nonzeros in equality constraint Jacobian...: 0
Number of nonzeros in inequality constraint Jacobian.: 4
Number of nonzeros in Lagrangian Hessian.............: 1
Total number of variables............................: 2
variables with only lower bounds: 0
variables with lower and upper bounds: 0
variables with only upper bounds: 0
Total number of equality constraints.................: 0
Total number of inequality constraints...............: 2
inequality constraints with only lower bounds: 0
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 2
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
0 1.0000000e+00 0.00e+00 7.89e-01 -1.0 0.00e+00 - 0.00e+00 0.00e+00 0
1 9.7574511e-01 0.00e+00 1.04e-02 -1.0 2.54e-01 - 1.00e+00 1.00e+00f 1
2 6.6105230e-01 0.00e+00 2.39e-01 -1.7 4.64e+00 - 1.00e+00 3.73e-01f 1
3 4.6924687e-01 0.00e+00 2.74e-01 -1.7 1.92e-01 - 1.00e+00 1.00e+00h 1
4 3.2820073e-01 0.00e+00 3.14e-01 -1.7 1.62e-01 - 1.00e+00 8.72e-01h 1
5 2.6957767e-01 0.00e+00 1.60e-01 -1.7 5.86e-02 - 1.00e+00 1.00e+00f 1
6 1.8935274e-01 0.00e+00 2.24e+00 -2.5 2.21e-01 - 1.00e+00 3.63e-01f 1
7 1.3091124e-01 0.00e+00 3.19e-01 -2.5 5.84e-02 - 1.00e+00 1.00e+00h 1
8 9.6126564e-02 0.00e+00 2.03e+00 -2.5 4.69e-02 - 1.00e+00 7.41e-01h 1
9 7.1009142e-02 0.00e+00 2.58e-01 -2.5 2.51e-02 - 1.00e+00 1.00e+00f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
10 5.1777995e-02 0.00e+00 1.21e+01 -2.5 3.18e-02 - 1.00e+00 6.04e-01h 1
11 4.0824512e-02 0.00e+00 1.97e-01 -2.5 1.10e-02 - 1.00e+00 1.00e+00f 1
12 3.4583313e-02 0.00e+00 5.91e+01 -2.5 1.80e-02 - 1.00e+00 3.47e-01h 2
13 2.6914569e-02 0.00e+00 1.79e-01 -2.5 7.67e-03 - 1.00e+00 1.00e+00h 1
14 2.0308733e-02 0.00e+00 2.01e-01 -2.5 6.61e-03 - 1.00e+00 1.00e+00h 1
15 1.9046691e-02 0.00e+00 3.31e-02 -2.5 1.26e-03 - 1.00e+00 1.00e+00h 1
16 1.2746908e-02 0.00e+00 5.71e+02 -3.8 1.83e-02 - 1.00e+00 3.44e-01f 1
17 8.7118644e-03 0.00e+00 3.19e-01 -3.8 4.04e-03 - 1.00e+00 1.00e+00f 1
18 7.5197334e-03 0.00e+00 1.09e+03 -3.8 3.10e-03 - 1.00e+00 3.85e-01f 2
19 5.1147446e-03 0.00e+00 3.19e-01 -3.8 2.40e-03 - 1.00e+00 1.00e+00f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
20 3.3178215e-03 0.00e+00 3.78e-01 -3.8 1.80e-03 - 1.00e+00 1.00e+00f 1
21 2.2893616e-03 0.00e+00 3.40e-01 -3.8 1.03e-03 - 1.00e+00 1.00e+00f 1
22 1.8283227e-03 0.00e+00 8.34e+04 -3.8 4.48e-03 - 1.00e+00 1.03e-01f 2
23 4.5313818e-04 0.00e+00 8.82e-01 -3.8 1.38e-03 - 1.00e+00 1.00e+00f 1
24 3.0653152e-04 0.00e+00 3.16e+05 -5.7 1.47e-04 4.0 2.32e-01 1.00e+00h 1
25 -5.5285928e-04 0.00e+00 1.00e+00 -5.7 8.59e-04 - 1.00e+00 1.00e+00f 1
26 -1.1223885e-02 1.40e-06 1.36e+04 -5.7 7.96e-01 - 1.00e+00 1.34e-02f 1
27 -1.1107786e-02 1.36e-06 1.16e+03 -5.7 3.72e-03 - 1.00e+00 3.12e-02h 6
28 -7.4290268e-03 4.00e-07 4.64e-01 -5.7 3.68e-03 - 1.00e+00 1.00e+00h 1
29 -5.8687786e-03 1.92e-07 8.66e+02 -5.7 2.41e-03 - 1.00e+00 6.47e-01h 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
30 -5.8542172e-03 1.91e-07 3.64e+03 -5.7 1.86e-03 - 1.00e+00 7.81e-03f 8
31 -3.9974633e-03 5.39e-08 3.37e-01 -5.7 1.86e-03 - 1.00e+00 1.00e+00h 1
32 -2.9534118e-03 1.58e-08 5.23e+02 -5.7 1.12e-03 - 1.00e+00 9.29e-01h 1
33 -2.9344353e-03 1.53e-08 1.06e+04 -5.7 6.07e-04 - 1.00e+00 3.12e-02f 6
34 -2.3405157e-03 2.82e-09 1.76e-01 -5.7 5.94e-04 - 1.00e+00 1.00e+00h 1
35 -2.1660008e-03 1.62e-10 3.83e-02 -5.7 1.75e-04 - 1.00e+00 1.00e+00h 1
36 -2.1508378e-03 0.00e+00 6.75e-04 -5.7 1.52e-05 - 1.00e+00 1.00e+00h 1
37 -2.1544258e-03 0.00e+00 6.10e-06 -8.6 3.59e-06 - 1.00e+00 1.00e+00f 1
38 -2.1544329e-03 0.00e+00 7.78e-12 -9.0 7.05e-09 - 1.00e+00 1.00e+00h 1
Number of Iterations....: 38
(scaled) (unscaled)
Objective...............: -2.1544328718688505e-03 -2.1544328718688505e-03
Dual infeasibility......: 7.7819972688075723e-12 7.7819972688075723e-12
Constraint violation....: 0.0000000000000000e+00 0.0000000000000000e+00
Variable bound violation: 0.0000000000000000e+00 0.0000000000000000e+00
Complementarity.........: 9.0909382315597638e-10 9.0909382315597638e-10
Overall NLP error.......: 2.5317795697139416e-12 9.0909382315597638e-10
Number of objective function evaluations = 64
Number of objective gradient evaluations = 39
Number of equality constraint evaluations = 0
Number of inequality constraint evaluations = 64
Number of equality constraint Jacobian evaluations = 0
Number of inequality constraint Jacobian evaluations = 39
Number of Lagrangian Hessian evaluations = 38
Total seconds in IPOPT = 0.009
EXIT: Optimal Solution Found.
x1 = -0.0021544328718688505
x2 = 5.676436642098907e-25
dual : Direction=IMPORT, Datatype=FLOAT
Key : Value
g1 : -35907.30543965533
g2 : -35907.30543965533
Discussion
Why so many iterations for such a simple problem?
Why are the multipliers so negative?
Are the constraints satisfied?
Why is the solution not exactly \(x_1 = x_2 = 0\)?