Instructor: Hasan A. Poonawala
Mechanical and Aerospace Engineering
University of Kentucky, Lexington, KY, USA
Topics:
Equality Constraints
Inequality Constraints
KKT Conditions
General nonlinear programming problems are of the form
Definition
Consider all differentiable curves on passing through a point . The tangent plane at of is defined as the collection of the derivatives at of all these differentiable curves.
If is a regular point (to be defined) then we can make the following identification:
Another way: the tangent plane to is the set of all directions that are neither strict increase or strict decrease directions for
Definition (Regular Point)
A point satisfying the constraint is said to be a regular point of the constraint if the gradient vectors are linearly independent.
Lemma
Let be a regular point of the constraints and a local extremum point of subject to these constraints. Then for all , we have
Proof
Let and let such that and for for some .
Since is a constrained local minimum point of , we have
This lemma says that .
Theorem (FONC)
Let be a regular local minimum point of subject to the constraint . Then there is a such that
Proof
From the lemma, we may conclude that the linear system
has no feasible solution . Then, by Farkasโs lemma, its alternative system must have a solution. Specifically, there is a such that .
The FONC Equation 1 together with the constraints give a total of equations in the variables comprising .
Theorem
The first-order necessary conditions are sufficient if is convex and is affine.
The statements of various conditions often require that be a regular point1.
This requirement is known as a constraint qualification.
To see why constraint qualifications are important, consider the problem
The only feasible point is , so that is the (global) minimizer.
However, , and . cannot satisfy the FONC!
The Lagrange multipliers associated with a constrained minimization problem have an interpretation as prices, similar to the prices in LP.
Let a minimal solution be a regular point and be the corresponding Lagrange multiplier vector. Consider the family of problems
Sensitivity Theorem
Consider the family of problems Equation 4. Suppose that for every in a region containing , its minimizer is continuously differentiable depending on . Let with the corresponding Lagrange multiplier . Then
Sensitivity Theorem
Consider the family of problems Equation 4. Suppose that for every in a region containing , its minimizer is continuously differentiable depending on . Let with the corresponding Lagrange multiplier . Then
Proof
Using the chain rule and taking derivatives with respect to on both sides of
at , we have
On the other hand, using the chain rule and the first-order condition for and the above matrix equality
Theorem (Farkasโs Lemma).
Let be an matrix and be an -vector. The system of constraints has a feasible solution if and only if the system of constraints has no feasible solution . Therefore a single feasible solution for system Equation 6 establishes an infeasibility certificate for the system Equation 5.
Example 1
Suppose . Then, is feasible for system Equation 6, which proves that the system Equation 5 is infeasible.
Lemma
Let be the cone generated by the columns of matrix , that is Then C is a closed and convex set.
Proof (of Farkasโs Lemma).
Let the system Equation 5 have a feasible solution, say . Then, the system Equation 6 must be infeasible, since, otherwise, we have a contradiction
from and .
Now, let the system Equation 5 have no feasible solution, that is, . We now prove that its alternative system Equation 6 must have a feasible solution.
Since points is not in and is a closed convex set, by the separating hyperplane theorem, there is a such that But we know that for some , so we have Setting , we have from inequality Equation 7.
Proof (of Farkasโs Lemma) - Continued -
Furthermore, inequality Equation 7 also implies . Since otherwise, say the first entry of , , is positive. We can then choose a vector such that
Then, from this choice, we have
This tends to as . This is a contradiction because should be bounded from above by inequality Equation 7. Therefore, identified in the separating hyperplane theorem is a feasible solution to system Equation 6. Finally, we can always scale such that .
Geometric Interpretation
If is not in the closed and convex cone generated by the columns of the matrix , then there must be a hyperplane separating and the cone, and the feasible solution to the alternative system is the slope-vector of the hyperplane.
Corollary
Let be an matrix and an -vector. The system of constraints
has a feasible solution if and only if the system of constraints
has no feasible solution . Therefore a single feasible solution for system Equation 9 establishes an infeasibility certificate for the system Equation 8.
Theorem (SONC)
Suppose that is a regular local minimum of subject to . Then there is a such that If we denote by , the tangent plane, then the matrix on , that is, , .
Proof
From elementary calculus for every twice differentiable curve through we have Furthermore, differentiating the relation twice, we obtain Additing these two equations yields the result Since is arbitrary in , we have the stated conclusion.
Theorem (SOSC)
Suppose there is a point satisfying , and a such that Equation 10 holds. Suppose also that the matrix on . Then is a strict local minimum of subject to .
Proof
If is not a strict relative minimum point, a sequence of feasible points converging to s.t. for each , . Write , where and , . By Bolzano-Weierstrass some subsequence of converges. WLOG assume . We also have which implies . We have
Multiply Equation 12 by and add to Equation 13 to obtain
Consider the problem
The Lagrangian and subsection FONC would be
From the two equations we conclude , together with .
We have the two first-order stationary solutions
The Lagrangian Hessian matrix at these s becomes
Generally: . Solve for given .
Problem
FONC
with one solution , , , .
SOC
and the corresponding subspace is
In this case is the subspace spanned by the standard bases and of .
Therefore the restriction of is computed to be
Alternatively, we can construct matrices and determinants of order rather than .
For simplicity, let , which has full row rank.
Any satisfying can be expressed as
is the so-called projection matrix onto the nullspace of (i.e. onto )
Projected Hessian Test
The matrix is positive definite on iff the projected Hessian matrix to is positive semidefinite with rank .
In the previous example we had . Hence
Definition (relative minimum or local minimum).
A point is said to be a relative minimum point of over if such that for all within a distance of .
Definition (global minimum).
A point is said to be a global minimum point of over if for all .
Feasible direction
Given we say that a vector is a feasible direction at if there is an such that for all with .
Descent direction
An element of the set of directions with the property is called a descent direction.
If , then there is such that for all with . The direction is the steepest descent one.
Definition
Let be a point satisfying the constraints
and let be the set of indices for which . Then is said to be a regular point of the constraints Equation 14 if the gradient vectors , , , are linearly independent.
Karush-Kuhn-Tucker (KKT) Conditions
Let be a relative minimum point for the problem
and suppose is a regular point for the constraints. Then there is a vector and a vector with such that
Proof
Since and , the second of Equation 16 is equivalent to the statement that a component of may be nonzero only if the corresponding constraint is active. This is a complementary slackness condition studied in LP, which states that implies and implies .
Since is a relative minimum point over the constraint set, it is also a relative minimum over the subset of that set defined by setting the active constraints to zero. Thus, for the resulting equality constrained problem, defined in a nbhd. of , there are Lagrange multipliers. Therefore, we conclude that first of Equation 16 holds with if .
It remains to be shown that . Suppose for some . Let and be the surface and the tangent plane, resp., defined by all other active constraints at . By the regularity assumption, there is a such that , that is, and for all but , and . Multiplying this from the right to the first of Equation 16, we have
which implies that is a descent direction for the objective function.
Let with and . Then for small , is feasible โ it remains on the surface of and because (that is, constrant becomes inactive). But
which contradicts the minimality of .
Introduce the Lagrangian associated with the problem, defined as
First-Order Necessary Conditions
The Lagrangian and the (LDC) conditions are
and the (CSC) condition is .
If is convex and is affine , and are concave functions, then is convex in for every fixed and .
Therefore if meets the first of Equation 16, then is the global minimizer of the unconstrained with the same and .
Theorem
The FONC are sufficient if is convex, is affine, and is concave for all .
Proof
Let be any feasible solution and , together with and satisfy the FONC. Then we have
which completes the proof.
SONC
Suppose is a regular point of the constraints. If is a relative minimum point for the problem Equation 15, then there is a , , such that Equation 16 hold and such that is positive semidefinite on the tangent subspace of the active constraints in .
SOSC
Sufficient conditions that a point satisfying Equation 14 be a strict relative point of the problem Equation 15 is that there exist , such that
and the Hessian matrix
is positive definite on the subspace
where .
Sensitivity Theorem
Consider the family of problems
Suppose that for , , there is a local solution that is a regular point and that together with the associated Lagrange multipliers, , satisfies the SOSC for a strict local minimum. Assume further that no active inequality constraints is degenerate.
Then for every in a region containing , there is a solution , depending continuously on , such that and is a relative minimum of Equation 19. Furthermore
Optimization problem
Lagrangian
First-Order Conditions
(MSC): .
(LDC):
(CSC):
Systems Optimization I โข ME 647 Home