Instructor: Hasan A. Poonawala
Mechanical and Aerospace Engineering
University of Kentucky, Lexington, KY, USA
Topics:
Equality Constraints
Inequality Constraints
KKT Conditions
Linear Program (LP)
An LP is an optimization problem in which the objective function is linear in the unknowns and the constraints consist of linear equalities and linear inequalities.
Standard form
$$\min\ c^\top x \quad \text{subject to} \quad Ax = b, \quad x \ge 0,$$
where $A$ is an $m \times n$ matrix, $b$ is an $m$-vector, and all variables are required to be nonnegative.
A "greater than or equal to" constraint such as $a_{i1} x_1 + a_{i2} x_2 + \dots + a_{in} x_n \ge b_i$ may be alternatively expressed as
$$a_{i1} x_1 + a_{i2} x_2 + \dots + a_{in} x_n - y_i = b_i,$$
and it is clear that this is equivalent to the original inequality with $y_i \ge 0$.
Variables, such as $y_i$, adjoined in this fashion to convert a "greater than or equal to" inequality to an equality are called surplus variables.
By suitably multiplying by minus unity, and adjoining slack and surplus variables, any set of linear inequalities can be converted to standard form.
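As a concrete illustration (a minimal sketch with made-up numbers, assuming NumPy and SciPy are available; `linprog` is not part of the lecture), the "greater than or equal to" constraints of a small LP are converted to equalities with surplus variables and the result is solved in standard form:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical LP: minimize c^T x  subject to  G x >= h,  x >= 0.
c = np.array([1.0, 2.0])
G = np.array([[1.0, 1.0],
              [1.0, 3.0]])
h = np.array([2.0, 3.0])

# Standard form: adjoin one surplus variable per inequality, so that
#   G x - y = h,  x >= 0,  y >= 0   becomes   A z = h,  z >= 0.
m, n = G.shape
A = np.hstack([G, -np.eye(m)])             # [G | -I]
c_std = np.concatenate([c, np.zeros(m)])   # surplus variables cost nothing

res = linprog(c_std, A_eq=A, b_eq=h)       # default bounds give z >= 0
print(res.x[:n], res.fun)                  # optimal x and objective value
```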
Suppose one or more of the unknown variables is not required to be nonnegative; say, the constraint $x_1 \ge 0$ is not present, so that $x_1$ is free to take on either positive or negative values. For instance, consider the problem
$$\min\ x_1 + 3x_2 + 4x_3 \quad \text{subject to} \quad x_1 + 2x_2 + x_3 = 5, \quad 2x_1 + 3x_2 + x_3 = 6,$$
where $x_2 \ge 0$, $x_3 \ge 0$.
$x_1$ is free, so we can solve for it from the first constraint, obtaining
$$x_1 = 5 - 2x_2 - x_3. \quad \text{(Equation 1)}$$
Substituting this into the objective and the second constraint, we obtain the equivalent problem (after dropping the constant from the substituted objective)
$$\min\ x_2 + 3x_3 \quad \text{subject to} \quad x_2 + x_3 = 4, \quad x_2 \ge 0,\ x_3 \ge 0,$$
which is a problem in standard form.
After the smaller problem is solved (what is the answer?) the value for $x_1$ can be found from Equation 1.
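A quick numerical check of the substitution above (a sketch assuming SciPy; the expected output is computed by hand from Equation 1):

```python
from scipy.optimize import linprog

# Reduced problem in standard form: minimize x2 + 3*x3
# subject to x2 + x3 = 4, x2 >= 0, x3 >= 0.
res = linprog(c=[1.0, 3.0], A_eq=[[1.0, 1.0]], b_eq=[4.0])
x2, x3 = res.x
x1 = 5 - 2 * x2 - x3     # recover the free variable from Equation 1
print(x1, x2, x3)        # expect (-3.0, 4.0, 0.0)
```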
Determine the most economical diet that satisfies the basic minimum nutritional requirements for good health
If we denote by $x_j$ the number of units of food $j$ in the diet, the problem is to select the $x_j$'s to minimize the total cost
$$c_1 x_1 + c_2 x_2 + \dots + c_n x_n$$
subject to the nutritional constraints
$$a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n \ge b_1, \quad \dots, \quad a_{m1} x_1 + a_{m2} x_2 + \dots + a_{mn} x_n \ge b_m,$$
and the nonnegativity constraints on the food quantities.
This problem can be converted to standard form by subtracting a nonnegative surplus variable from the left side of each of the linear inequalities.
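A sketch of the diet problem with hypothetical cost and nutrition data (the numbers are illustrative, not from the lecture); `linprog` handles the "greater than or equal to" nutritional constraints by negating both sides:

```python
import numpy as np
from scipy.optimize import linprog

cost = np.array([0.5, 0.8, 0.3])    # c_j: cost per unit of food j (made up)
A = np.array([[2.0, 3.0, 1.0],      # a_ij: units of nutrient i per unit of food j
              [1.0, 2.0, 2.0]])
b = np.array([8.0, 6.0])            # b_i: minimum requirement for nutrient i

# Nutritional constraints A x >= b become -A x <= -b;
# the nonnegativity x >= 0 is linprog's default bound.
res = linprog(cost, A_ub=-A, b_ub=-b)
print(res.x, res.fun)               # cheapest diet and its cost
```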
We wish to manufacture $n$ products, in quantities $x_1, \dots, x_n$, at maximum revenue
$$c_1 x_1 + c_2 x_2 + \dots + c_n x_n$$
subject to the resource constraints
$$a_{i1} x_1 + a_{i2} x_2 + \dots + a_{in} x_n \le b_i, \quad i = 1, \dots, m,$$
and the nonnegativity constraints on all production variables.
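A sketch of the production problem with hypothetical revenue and resource data (illustrative numbers only; the original data table did not survive extraction); since `linprog` minimizes, the revenue is negated:

```python
import numpy as np
from scipy.optimize import linprog

revenue = np.array([3.0, 5.0, 4.0])   # per-unit revenue of each product (made up)
R = np.array([[1.0, 2.0, 1.0],        # resource i consumed per unit of product j
              [3.0, 1.0, 2.0]])
avail = np.array([40.0, 60.0])        # available amount of each resource

# Maximize revenue = minimize its negative, subject to R x <= avail, x >= 0.
res = linprog(-revenue, A_ub=R, b_ub=avail)
print(res.x, -res.fun)                # production plan and maximum revenue
```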
Maximal flow problem
Consider a network with nodes $1, \dots, m$, in which arc $(i, j)$ carries a flow $x_{ij}$ of at most the capacity $k_{ij}$; node 1 is the source and node $m$ is the sink. Determine the maximal flow $f$ that can be established in such a network:
$$\max\ f \quad \text{subject to} \quad \sum_j x_{ij} - \sum_j x_{ji} = \begin{cases} f, & i = 1 \\ 0, & i \ne 1, m \\ -f, & i = m \end{cases} \qquad 0 \le x_{ij} \le k_{ij},$$
where $x_{ij} = 0$ for those no-arc pairs $(i, j)$.
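The LP above can be assembled mechanically for a small hypothetical network (the arcs, capacities, and node labels are illustrative assumptions of this sketch):

```python
import numpy as np
from scipy.optimize import linprog

# Node 1 is the source, node 4 the sink; arc (i, j) has capacity k_ij.
arcs = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
cap = [2.0, 3.0, 1.0, 3.0, 2.0]
num_nodes = 4

# Variables: one flow x_ij per arc, plus the flow value f as the last column.
n = len(arcs)
A_eq = np.zeros((num_nodes, n + 1))
for k, (i, j) in enumerate(arcs):
    A_eq[i - 1, k] += 1.0        # x_ij leaves node i
    A_eq[j - 1, k] -= 1.0        # x_ij enters node j
A_eq[0, n] = -1.0                # source: net outflow minus f is zero
A_eq[-1, n] = 1.0                # sink: net outflow plus f is zero
b_eq = np.zeros(num_nodes)

bounds = [(0, kij) for kij in cap] + [(0, None)]   # 0 <= x_ij <= k_ij, f >= 0
obj = np.zeros(n + 1)
obj[n] = -1.0                    # maximize f by minimizing -f
res = linprog(obj, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(res.x[:n], res.x[n])       # arc flows and the maximal flow value
```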
A warehouse is buying and selling stock of a certain commodity in order to maximize profit over a certain length of time.
where $z_k$ is a slack variable.
$n$-dimensional data points $a_1, a_2, \dots, a_m$ are to be classified into two distinct classes,
where $\{x : y^\top x + \beta = 0\}$ is the desired separating hyperplane: we seek $y$ and $\beta$ with $y^\top a_i + \beta > 0$ for the points in one class and $y^\top a_i + \beta < 0$ for the points in the other.
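A sketch of this separation problem as a feasibility LP on hypothetical 2-dimensional points (the normalization to $\ge 1$ and $\le -1$ is a standard device to avoid strict inequalities; the data are made up):

```python
import numpy as np
from scipy.optimize import linprog

A_pts = np.array([[2.0, 2.0], [3.0, 1.0]])   # class 1 points (made up)
B_pts = np.array([[0.0, 0.0], [1.0, 0.0]])   # class 2 points (made up)

# Unknowns z = (y1, y2, beta): require y^T a + beta >= 1 on class 1
# and y^T a + beta <= -1 on class 2, written as A_ub z <= b_ub.
A_ub = np.vstack([
    np.hstack([-A_pts, -np.ones((len(A_pts), 1))]),
    np.hstack([B_pts, np.ones((len(B_pts), 1))]),
])
b_ub = -np.ones(len(A_pts) + len(B_pts))

# Zero objective: any feasible point yields a separating hyperplane.
res = linprog(np.zeros(3), A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 3)
y, beta = res.x[:2], res.x[2]
print(y, beta)
```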
Example
Markov Decision Process
An MDP problem is defined by a finite number of states, indexed by $i$, where each state $i$ has a finite set of actions, $A_i$, to take.
MDP Problem
Find a stationary policy to minimize or maximize the discounted sum of expected costs or rewards over the infinite time horizon with a discount factor $\gamma \in [0, 1)$, when the process starts from state $i$:
Maze Runner Game
Each square represents a state, where each state has two possible actions to take, except that certain states have only one action.
All actions have zero cost, except the Trap state's action, which incurs a positive cost to get out.
Suppose that the game is played infinitely; what is the optimal policy? That is, which action is best to take for every state at any time, to minimize the present-expected total cost?
MDP Problem
There are two constraints for the two actions of a two-action state, and only one constraint for the single action of a one-action state:
$$y_i \le c_i(a) + \gamma \sum_j p_{ij}(a)\, y_j \quad \text{for every state } i \text{ and every action } a \in A_i,$$
where $c_i(a)$ is the immediate cost of taking action $a$ at the current time period, and $\gamma \sum_j p_{ij}(a)\, y_j$ is the discounted optimal expected cost from the next time period and then on.
Basically, we relax the "$\min$" operator to "$\le$" from Bellman's principle, make the relations into constraints, and then maximize the sum of the $y_i$'s as the objective.
When the objective is maximized, at least one inequality constraint for every state must become an equality, so that $y$ is a fixed-point solution of the Bellman operator.
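A sketch of this LP relaxation on a tiny hypothetical MDP (the states, actions, costs, and transition probabilities are all made up for illustration):

```python
import numpy as np
from scipy.optimize import linprog

gamma = 0.9                       # discount factor
cost = [[1.0, 2.0],               # cost[i][a]: immediate cost at state i, action a
        [0.0, 3.0]]
P = [[np.array([0.5, 0.5]), np.array([1.0, 0.0])],   # P[i][a]: next-state
     [np.array([0.0, 1.0]), np.array([0.3, 0.7])]]   # distribution

# One constraint per state-action pair: y_i - gamma * P_i(a)^T y <= c_i(a),
# i.e., the "min" in Bellman's equation relaxed to "<=".
rows, rhs = [], []
for i in range(2):
    for a in range(2):
        rows.append(np.eye(2)[i] - gamma * P[i][a])
        rhs.append(cost[i][a])

# Maximize sum of y_i (minimize its negative); y is a free variable.
res = linprog(-np.ones(2), A_ub=np.vstack(rows), b_ub=np.array(rhs),
              bounds=[(None, None)] * 2)
print(res.x)                      # optimal discounted cost-to-go per state
```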
System of equalities
$$Ax = b \quad \text{(Equation 2)}$$
with $m$ equations and $n$ unknowns.
Definition
Given the set of $m$ simultaneous linear equations in $n$ unknowns, Equation 2, let $B$ be any nonsingular $m \times m$ submatrix made up of columns of $A$. Then, if all $n - m$ components of $x$ not associated with columns of $B$ are set equal to zero, the solution to the resulting set of equations is said to be a basic solution to Equation 2 with respect to the basis $B$.
Definition
The components of $x$ associated with the columns of $B$, denoted by the subvector $x_B$ according to the same column index order as in $B$, are called basic variables.
Full-Rank Assumption
The matrix $A$ has $m < n$, and the $m$ rows of $A$ are linearly independent.
Definition
If one or more of the basic variables in a basic solution has value zero, that solution is said to be a degenerate basic solution.
Definition
A vector $x$ satisfying
$$Ax = b, \quad x \ge 0 \quad \text{(Equation 3)}$$
is said to be feasible for these constraints. A feasible solution to the constraints Equation 3 that is also basic is said to be a basic feasible solution; if this solution is also a degenerate basic solution, it is called a degenerate basic feasible solution.
Corresponding to a linear program in standard form
$$\min\ c^\top x \quad \text{subject to} \quad Ax = b, \quad x \ge 0, \quad \text{(Equation 4)}$$
a feasible solution to the constraints that achieves the minimum value of the objective function subject to those constraints is said to be an optimal feasible solution. If this solution is basic, it is an optimal basic feasible solution.
Fundamental Theorem
Given a linear program in standard form Equation 4 where $A$ is an $m \times n$ matrix of rank $m$:
(1) if there is a feasible solution, there is a basic feasible solution;
(2) if there is an optimal feasible solution, there is an optimal basic feasible solution.
Proof (1)
Denote the columns of $A$ by $a_1, a_2, \dots, a_n$. Suppose $x = (x_1, x_2, \dots, x_n)$ is a feasible solution. Then, in terms of the columns of $A$, this solution satisfies
$$x_1 a_1 + x_2 a_2 + \dots + x_n a_n = b.$$
Proof (1) - Continued -
Assume that exactly $p$ of the variables $x_i$ are greater than zero, and without loss of generality, that they are the first $p$ variables. Thus
$$x_1 a_1 + x_2 a_2 + \dots + x_p a_p = b. \quad \text{(Equation 5)}$$
There are now two cases, corresponding as to whether the set $a_1, a_2, \dots, a_p$ is linearly independent or linearly dependent.
Case 1: Assume $a_1, a_2, \dots, a_p$ are linearly independent. Then clearly, $p \le m$. If $p = m$, the solution is basic and the proof is complete. If $p < m$, then, since $A$ has rank $m$, $m - p$ vectors can be found from the remaining $n - p$ vectors so that the resulting set of $m$ vectors is linearly independent. Assigning the value zero to the corresponding $m - p$ variables yields a (degenerate) basic feasible solution.
Case 2: Assume $a_1, a_2, \dots, a_p$ are linearly dependent. Then there is a nontrivial linear combination of these vectors that is zero. Thus there are constants $y_1, y_2, \dots, y_p$, at least one of which can be assumed to be positive, such that
$$y_1 a_1 + y_2 a_2 + \dots + y_p a_p = 0. \quad \text{(Equation 6)}$$
Multiplying this equation by a scalar $\varepsilon$ and subtracting it from Equation 5, we obtain
$$(x_1 - \varepsilon y_1) a_1 + (x_2 - \varepsilon y_2) a_2 + \dots + (x_p - \varepsilon y_p) a_p = b. \quad \text{(Equation 7)}$$
This equation holds for every $\varepsilon$, and for each $\varepsilon$ the components $x_i - \varepsilon y_i$ correspond to a solution of the linear equalities, although they may violate $x_i - \varepsilon y_i \ge 0$. Denoting $y = (y_1, y_2, \dots, y_p, 0, 0, \dots, 0)$, we see that for any $\varepsilon$,
$$x - \varepsilon y \quad \text{(Equation 8)}$$
is a solution to the equalities. For $\varepsilon = 0$, this reduces to the original feasible solution.
Proof (1) - Continued -
As $\varepsilon$ is increased from zero, the various components $x_i - \varepsilon y_i$ increase, decrease, or remain constant, depending upon whether the corresponding $y_i$ is negative, positive, or zero. Since we assume at least one $y_i$ is positive, at least one component will decrease as $\varepsilon$ is increased. We increase $\varepsilon$ to the first point where one or more components become zero. Specifically, we set
$$\varepsilon = \min \{ x_i / y_i : y_i > 0 \}.$$
For this value of $\varepsilon$, the solution given by Equation 8 is feasible and has at most $p - 1$ positive variables. Repeating this process as necessary, we can eliminate positive variables until we have a feasible solution with corresponding columns that are linearly independent. At that point Case 1 applies.
Proof (2)
Let $x$ be an optimal feasible solution and, as in the proof of (1) above, suppose there are exactly $p$ positive variables $x_1, x_2, \dots, x_p$. Again, there are two cases; Case 1, corresponding to linear independence, is exactly the same as before.
Case 2 also goes exactly the same as before, but it must be shown that for any $\varepsilon$ the solution Equation 8 is optimal. To show this, note that the value of the solution $x - \varepsilon y$ is
$$c^\top x - \varepsilon\, c^\top y. \quad \text{(Equation 10)}$$
For $\varepsilon$ sufficiently small in magnitude, $x - \varepsilon y$ is a feasible solution for positive or negative values of $\varepsilon$. Thus, we conclude that $c^\top y = 0$. For, if $c^\top y \ne 0$, an $\varepsilon$ of small magnitude and proper sign could be determined so as to render Equation 10 smaller than $c^\top x$ while maintaining feasibility.
Part (1) of the theorem is commonly referred to as Carathéodory's theorem.
Part (2) of the theorem reduces the task of solving a linear program to that of searching over basic feasible solutions.
Since for a problem having $n$ variables and $m$ constraints there are at most
$$\binom{n}{m} = \frac{n!}{m!\,(n - m)!}$$
basic solutions (corresponding to the number of ways of selecting $m$ of the $n$ columns), there are only a finite number of possibilities.
Thus the fundamental theorem yields an obvious, but terribly inefficient, finite search technique.
By expanding upon the technique of proof as well as the statement of the fundamental theorem, the efficient simplex procedure is derived.
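That brute-force search can be written down directly (a minimal sketch assuming NumPy; the instance is made up). It enumerates all $\binom{n}{m}$ candidate bases, exactly the count given above:

```python
import itertools
import numpy as np

def brute_force_lp(c, A, b, tol=1e-9):
    """Solve min c^T x s.t. Ax = b, x >= 0 by enumerating every basic
    solution: the obvious but terribly inefficient finite search."""
    m, n = A.shape
    best_x, best_val = None, np.inf
    for cols in itertools.combinations(range(n), m):
        B = A[:, cols]
        if abs(np.linalg.det(B)) < tol:
            continue                    # the chosen columns are not a basis
        x_B = np.linalg.solve(B, b)
        if np.any(x_B < -tol):
            continue                    # basic but not feasible
        x = np.zeros(n)
        x[list(cols)] = x_B
        if c @ x < best_val:
            best_x, best_val = x, c @ x
    return best_x, best_val

# Hypothetical instance (slack variables already adjoined).
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-1.0, -2.0, 0.0, 0.0])
print(brute_force_lp(c, A, b))          # expect x = (3, 1, 0, 0), value -5
```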
Definition
A point $x$ in a convex set $C$ is said to be an extreme point of $C$ if there are no two distinct points $x_1$ and $x_2$ in $C$ such that $x = \alpha x_1 + (1 - \alpha) x_2$ for some $\alpha$, $0 < \alpha < 1$.
Theorem (Equivalence of Extreme Points and Basic Solutions).
Let $A$ be an $m \times n$ matrix of rank $m$ and $b$ an $m$-vector. Let $K$ be the convex polytope consisting of all $n$-vectors $x$ satisfying
$$Ax = b, \quad x \ge 0. \quad \text{(Equation 11)}$$
A vector $x$ is an extreme point of $K$ if and only if $x$ is a basic feasible solution to Equation 11.
Proof
Suppose first that $x = (x_1, x_2, \dots, x_m, 0, 0, \dots, 0)$ is a basic feasible solution to Equation 11. Then
$$x_1 a_1 + x_2 a_2 + \dots + x_m a_m = b,$$
where $a_1, a_2, \dots, a_m$, the first $m$ columns of $A$, are linearly independent. Suppose that $x$ could be expressed as a convex combination of two other points in $K$; say, $x = \alpha y + (1 - \alpha) z$, $0 < \alpha < 1$, $y \ne z$.
Proof - Continued -
Since all components of $x$, $y$, $z$ are nonnegative and since $0 < \alpha < 1$, it follows immediately that the last $n - m$ components of $y$ and $z$ are zero. Thus, in particular, we have
$$y_1 a_1 + y_2 a_2 + \dots + y_m a_m = b$$
and
$$z_1 a_1 + z_2 a_2 + \dots + z_m a_m = b.$$
Since the vectors $a_1, a_2, \dots, a_m$ are linearly independent, however, it follows that $x = y = z$, and hence $x$ is an extreme point of $K$.
Conversely, assume that $x$ is an extreme point of $K$. Let us assume that the nonzero components of $x$ are the first $k$ components, so that
$$x_1 a_1 + x_2 a_2 + \dots + x_k a_k = b,$$
with $x_i > 0$, $i = 1, 2, \dots, k$.
To show that $x$ is a basic feasible solution (BFS) it must be shown that the vectors $a_1, a_2, \dots, a_k$ are linearly independent. We do this by contradiction. Suppose they are linearly dependent. Then there is a nontrivial linear combination that is zero:
$$y_1 a_1 + y_2 a_2 + \dots + y_k a_k = 0.$$
Define the $n$-vector $y = (y_1, y_2, \dots, y_k, 0, 0, \dots, 0)$. Since $x_i > 0$, $1 \le i \le k$, it is possible to select $\varepsilon$ such that
$$x + \varepsilon y \ge 0, \qquad x - \varepsilon y \ge 0.$$
We then have $x = \tfrac{1}{2}(x + \varepsilon y) + \tfrac{1}{2}(x - \varepsilon y)$, which expresses $x$ as a convex combination of two distinct vectors in $K$. This contradicts the assumption that $x$ is an extreme point, so the vectors must be linearly independent.
Corollary
If the convex set $K$ corresponding to Equation 11 is nonempty, it has at least one extreme point.
Corollary
If there is a finite optimal solution to a linear programming problem, there is a finite optimal solution which is an extreme point of the constraint set.
Corollary
The constraint set $K$ corresponding to Equation 11 possesses at most a finite number of extreme points, and each of them is finite.
Corollary
If the convex polytope $K$ corresponding to Equation 11 is bounded, then $K$ is a convex polyhedron, that is, $K$ consists of points that are convex combinations of a finite number of points.
Consider the constraint set in $E^3$ defined by
$$x_1 + x_2 + x_3 = 1, \quad x_1 \ge 0,\ x_2 \ge 0,\ x_3 \ge 0.$$
Consider the constraint set in $E^3$ defined by
$$x_1 + x_2 + x_3 = 1, \quad 2x_1 + 3x_2 = 1, \quad x_1 \ge 0,\ x_2 \ge 0,\ x_3 \ge 0.$$
Consider the constraint set in $E^2$ defined by
$$x_1 + \tfrac{8}{3} x_2 \le 4, \quad x_1 + x_2 \le 2, \quad 2x_1 \le 3, \quad x_1 \ge 0,\ x_2 \ge 0,$$
converted to standard form by adjoining three nonnegative slack variables.
A basic solution for this system is obtained by setting any two variables to zero and solving for the remaining three.
As indicated in the figure, each edge of the figure corresponds to one variable being zero, and the extreme points are the points where two variables are zero.
Even when not expressed in the standard form, the extreme points of the set defined by the constraints of a linear program correspond to the possible solution points.
The level sets of an objective function are included in the bottom figure.
Theorem (Farkas's Lemma).
Let $A$ be an $m \times n$ matrix and $b$ be an $m$-vector. The system of constraints
$$Ax = b, \quad x \ge 0 \quad \text{(Equation 12)}$$
has a feasible solution $x$ if and only if the system of constraints
$$y^\top A \le 0, \quad y^\top b > 0 \quad \text{(Equation 13)}$$
has no feasible solution $y$. Therefore a single feasible solution $y$ for the system Equation 13 establishes an infeasibility certificate for the system Equation 12.
Example 1
Suppose, for instance, $A = \begin{pmatrix} 1 & 1 \end{pmatrix}$ and $b = -1$, so that Equation 12 reads $x_1 + x_2 = -1$, $x \ge 0$. Then $y = -1$ is feasible for system Equation 13, since $y^\top A = (-1 \ \ {-1}) \le 0$ and $y^\top b = 1 > 0$, which proves that the system Equation 12 is infeasible.
Lemma
Let $C$ be the cone generated by the columns of matrix $A$, that is,
$$C = \{ Ax : x \ge 0 \}.$$
Then $C$ is a closed and convex set.
Proof (of Farkas's Lemma).
Let the system Equation 12 have a feasible solution, say $\bar{x}$. Then, the system Equation 13 must be infeasible, since, otherwise, we have a contradiction
$$0 < y^\top b = y^\top (A\bar{x}) = (y^\top A)\,\bar{x} \le 0$$
from $\bar{x} \ge 0$ and $y^\top A \le 0$.
Now, let the system Equation 12 have no feasible solution, that is, $b \notin C$. We now prove that its alternative system Equation 13 must have a feasible solution.
Since the point $b$ is not in $C$ and $C$ is a closed convex set, by the separating hyperplane theorem, there is a $y$ such that
$$y^\top b > y^\top u \quad \text{for all } u \in C. \quad \text{(Equation 14)}$$
But we know that $0 \in C$ (take $u = Ax$ with $x = 0$), so setting $u = 0$, we have $y^\top b > 0$ from inequality Equation 14.
Proof (of Farkas's Lemma) - Continued -
Furthermore, inequality Equation 14 also implies $y^\top A \le 0$. For otherwise, say the first entry of $y^\top A$, namely $y^\top a_1$, is positive. We can then choose the vector $x = (\alpha, 0, \dots, 0)^\top$ with $\alpha > 0$, giving $u = Ax = \alpha a_1 \in C$.
Then, from this choice, we have
$$y^\top u = \alpha\, y^\top a_1.$$
This tends to $\infty$ as $\alpha \to \infty$. This is a contradiction, because $y^\top u$ should be bounded from above by inequality Equation 14. Therefore, the $y$ identified in the separating hyperplane theorem is a feasible solution to system Equation 13. Finally, we can always scale $y$ such that $y^\top b = 1$.
Geometric Interpretation
If $b$ is not in the closed and convex cone $C$ generated by the columns of the matrix $A$, then there must be a hyperplane separating $b$ and the cone, and the feasible solution $y$ to the alternative system is the normal vector of that hyperplane.
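A certificate can also be searched for numerically (a sketch, not from the lecture): maximize $y^\top b$ over the cone $y^\top A \le 0$ with the normalization $y^\top b \le 1$, here on a hypothetical infeasible instance:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical infeasible system Ax = b, x >= 0: the two equations force
# x1 + x2 to equal both 1 and 3.
A = np.array([[1.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 3.0])
m, n = A.shape

# Maximize b^T y subject to A^T y <= 0 and the normalization b^T y <= 1.
res = linprog(-b,
              A_ub=np.vstack([A.T, b]),
              b_ub=np.concatenate([np.zeros(n), [1.0]]),
              bounds=[(None, None)] * m)
y = res.x
if b @ y > 1e-9:
    print("Farkas certificate y =", y)   # proves Ax = b, x >= 0 is infeasible
```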
Corollary
Let $A$ be an $m \times n$ matrix and $c$ an $n$-vector. The system of constraints
$$y^\top A \le c^\top \quad \text{(Equation 15)}$$
has a feasible solution $y$ if and only if the system of constraints
$$Ax = 0, \quad x \ge 0, \quad c^\top x < 0 \quad \text{(Equation 16)}$$
has no feasible solution $x$. Therefore a single feasible solution $x$ for the system Equation 16 establishes an infeasibility certificate for the system Equation 15.
We know that it is only necessary to consider basic feasible solutions to the system $Ax = b$, $x \ge 0$ when solving a linear program.
The idea of the simplex method is to move from a basic feasible solution (extreme point) to an adjacent one with an improved objective value.
Definition
Two basic feasible solutions are said to be adjacent if and only if they differ by one basic variable.
Suppose that, for the current basis, we have the representations $x_1 a_1 + \dots + x_m a_m = b$ and $a_q = y_{1q} a_1 + \dots + y_{mq} a_m$ for an incoming column $a_q$, so that for any $\varepsilon \ge 0$,
$$(x_1 - \varepsilon y_{1q}) a_1 + \dots + (x_m - \varepsilon y_{mq}) a_m + \varepsilon a_q = b. \quad \text{(Equation 19)}$$
If none of the $y_{iq}$'s are positive, then all coefficients in Equation 19 increase (or remain constant) as $\varepsilon$ is increased, and no new basic feasible solution is obtained.
In this case the set of feasible solutions to Equation 17 is unbounded.
Example (Basis Change Illustration)
From Equation 21, the minimum-ratio test identifies the outgoing column, so that the new basis is formed by the incoming column together with the remaining columns of the old basis.
Now, suppose we start with a different pair of columns as the initial basis and select a different incoming column. The same computation then produces another adjacent basis.
The idea of the simplex method is to select the incoming column so that the resulting new BFS will yield a lower value to the objective function than the previous one.
Suppose we have a basic feasible solution $(x_B, x_D) = (B^{-1} b, 0)$ for the partition $A = [B, D]$.
The value of the objective function is
$$z = c^\top x = c_B^\top x_B + c_D^\top x_D.$$
Hence for the current basic solution, the corresponding value is
$$z_0 = c_B^\top B^{-1} b.$$
For any feasible solution, $x_B = B^{-1} b - B^{-1} D x_D$, so that
$$z = z_0 + (c_D^\top - c_B^\top B^{-1} D)\, x_D. \quad \text{(Equation 25)}$$
Equation 25 expresses the cost of any feasible solution to Equation 17 in terms of the independent variables in $x_D$.
Here, $\lambda^\top = c_B^\top B^{-1}$ is the vector of simplex multipliers, or the shadow prices, corresponding to the basis $B$, and $r_D^\top = c_D^\top - \lambda^\top D$ is the vector of relative cost coefficients.
Theorem (Improvement of Basic Feasible Solution)
Given a nondegenerate BFS with corresponding objective value $z_0$, suppose that for some $j$ there holds $r_j = c_j - \lambda^\top a_j < 0$. Then there is a feasible solution with objective value $z < z_0$. If the column $a_j$ can be substituted for some vector in the original basis to yield a new basic feasible solution, this new solution will have $z < z_0$. If $a_j$ cannot be substituted to yield a BFS, then the solution set is unbounded and the objective function can be made arbitrarily small (toward minus infinity).
This means that $r_D \ge 0$ is equivalent to $\lambda^\top A \le c^\top$, i.e., $\lambda$ is feasible for the dual with objective value $\lambda^\top b = c_B^\top B^{-1} b = z_0$,
and strong duality forces that the current basic feasible solution is optimal.
Optimality Condition Theorem
If $r_D \ge 0$ for some basic feasible solution, then that solution is optimal.
Optimality
Diet Problem (Exact nutritional requirements)
Step 0. Given the inverse $B^{-1}$ of a current basis, and the current solution $x_B = B^{-1} b$.
Step 1. Calculate the current simplex multiplier vector $\lambda^\top = c_B^\top B^{-1}$ and then calculate the relative cost coefficients $r_D^\top = c_D^\top - \lambda^\top D$. If $r_D \ge 0$, stop; the current solution is optimal.
Step 2. Determine the column $a_q$ that is to enter the basis by selecting its most negative relative cost coefficient, the $q$th coefficient (break ties arbitrarily); and calculate $\bar{y}_q = B^{-1} a_q$.
Step 3. If $\bar{y}_q \le 0$, stop; the problem is unbounded. Otherwise, calculate the ratios $\bar{x}_{B,i} / \bar{y}_{iq}$ for $\bar{y}_{iq} > 0$ to determine the current basic column, the one attaining the minimum ratio, to leave the basis.
Step 4. Update $B^{-1}$ (or its factorization) and the new basic feasible solution $x_B = B^{-1} b$. Return to Step 1.
Remark
The basic columns in $B$ and the nonbasic columns in $D$ can be ordered arbitrarily, and the components in $x_B$, $x_D$ follow the same index orders.
More precisely, let the columns be permuted as $B = [a_{B(1)}, \dots, a_{B(m)}]$ and $D = [a_{D(1)}, \dots, a_{D(n-m)}]$. Then when $j$ is identified as the index of the most negative coefficient in $r_D$ in Step 2, $a_{D(j)}$ is the entering column. Similarly, when $i$ is identified as the minimum ratio index in Step 3, $a_{B(i)}$ is the outgoing column.
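A minimal sketch of Steps 0 through 4 (assuming NumPy, a known starting feasible basis, and nondegeneracy; $B$ is re-inverted each iteration, whereas a real implementation would update a factorization):

```python
import numpy as np

def revised_simplex(c, A, b, basis, tol=1e-9):
    """Minimize c^T x subject to Ax = b, x >= 0, from a feasible basis."""
    m, n = A.shape
    basis = list(basis)
    while True:
        B_inv = np.linalg.inv(A[:, basis])         # Step 0: basis inverse
        x_B = B_inv @ b                            # current basic solution
        lam = c[basis] @ B_inv                     # Step 1: simplex multipliers
        nonbasic = [j for j in range(n) if j not in basis]
        r = np.array([c[j] - lam @ A[:, j] for j in nonbasic])
        if np.all(r >= -tol):                      # all reduced costs >= 0
            x = np.zeros(n)
            x[basis] = x_B
            return x, c @ x                        # optimal
        q = nonbasic[int(np.argmin(r))]            # Step 2: entering column
        y_q = B_inv @ A[:, q]
        if np.all(y_q <= tol):                     # Step 3: unbounded
            raise ValueError("problem is unbounded")
        ratios = [x_B[i] / y_q[i] if y_q[i] > tol else np.inf
                  for i in range(m)]
        p = int(np.argmin(ratios))                 # minimum-ratio row leaves
        basis[p] = q                               # Step 4: update the basis

# Hypothetical instance, starting from the all-slack basis (columns 2 and 3).
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-1.0, -2.0, 0.0, 0.0])
print(revised_simplex(c, A, b, basis=[2, 3]))      # expect x = (3, 1, 0, 0)
```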
Suppose we start with an initial basis $B$ formed from two of the columns of $A$, with the remaining columns in $D$.
Step 0. Initialization
Step 1. Calculate
and
Step 2. We then see that the most negative relative cost coefficient identifies the incoming column $a_q$, and calculate $\bar{y}_q = B^{-1} a_q$.
Step 3. The ratios are, via the component-wise divide operation, $\bar{x}_{B,i} / \bar{y}_{iq}$ for the positive $\bar{y}_{iq}$.
The minimum ratio identifies the outgoing column, which the incoming column replaces, giving the new basis.
Step 4. Update
Return to Step 1.
Second Iteration
Step 1. Calculate
and
Stop! All of the reduced costs are positive so the current basic feasible solution is optimal!
where $u$ is a vector of artificial variables.
If there is a feasible solution to Equation 28, then it is clear that Equation 29 has a minimum value of zero with $u = 0$.
If Equation 28 has no feasible solution, then the minimum value of Equation 29 is greater than zero.
Equation 29 is an LP, and a BFS for it is $(x, u) = (0, b)$ (with $b \ge 0$ arranged by sign changes). It can readily be solved using the simplex technique.
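A sketch of this phase-I construction on a hypothetical system (assuming SciPy; a full two-phase code would instead run the simplex method of the previous section on Equation 29, starting from the BFS $u = b$):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical system Ax = b, x >= 0 whose feasibility we want to test.
A = np.array([[1.0, 1.0],
              [1.0, 3.0]])
b = np.array([4.0, 6.0])       # rows scaled beforehand so that b >= 0
m, n = A.shape

# Adjoin artificial variables u and minimize their sum: [A | I](x, u) = b.
phase1_cost = np.concatenate([np.zeros(n), np.ones(m)])
res = linprog(phase1_cost, A_eq=np.hstack([A, np.eye(m)]), b_eq=b)

if res.fun < 1e-9:
    print("feasible; a feasible point is x =", res.x[:n])
else:
    print("the original system has no feasible solution")
```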
Consider the dual, $\max\ \lambda^\top b$ subject to $\lambda^\top A \le c^\top$, and substitute in the dual $\lambda^\top = c_B^\top B^{-1}$.
One difference, in contrast to the primal simplex method, is that here the outgoing column is selected first and the incoming one is chosen later.
Step 0. Given the inverse $B^{-1}$ of a dual feasible basis $B$, the primal solution $\bar{x}_B = B^{-1} b$, the dual feasible solution $\lambda^\top = c_B^\top B^{-1}$, and the reduced cost vector $r_D^\top = c_D^\top - \lambda^\top D \ge 0$.
Step 1. If $\bar{x}_B \ge 0$, stop; the current solution pair is optimal. Otherwise, determine which column is to leave the basis by selecting the most negative entry, the $i$th entry (break ties arbitrarily), in $\bar{x}_B$. Now calculate $u^\top = e_i^\top B^{-1}$ and then calculate $w^\top = u^\top D$.
Step 2. If $w \ge 0$, stop; the dual problem is unbounded (so the primal is infeasible). Otherwise, calculate the ratios $r_{D,j} / (-w_j)$ for $w_j < 0$ to determine the current nonbasic column, the one corresponding to the minimum ratio index, to become basic.
Step 3. Update the basis $B$ (or its factorization), and update the primal solution $\bar{x}_B$, dual feasible solution $\lambda$, and reduced cost vector $r_D$ accordingly. Return to Step 1.
Step 0. Initialization
and
Step 1. We see that only the second component of $\bar{x}_B$ is negative, so the second basic column is to leave the basis. Now, we compute
Step 2. Since all components in $w$ are negative, the componentwise ratios are
Here, we see the minimum ratio is attained at the first component, so the corresponding nonbasic column becomes basic, replacing the outgoing column in the current basis.
Step 3. With the new basis, we update the primal solution
and
Stop! The solution pair is optimal!
and
Primal-Dual Optimality Theorem
Suppose that $\lambda$ is feasible for the original dual, and that $x$, together with zero values for the artificial variables, is feasible (and of course optimal) for the associated restricted primal. Then $x$ and $\lambda$ are optimal for the original primal and dual programs, respectively.
Proof
Clearly $x$ is feasible for the primal. Also, we have $c^\top x = \lambda^\top A x = \lambda^\top b$, because $\lambda^\top A$ is identical to $c^\top$ on the components corresponding to the nonzero elements of $x$. Thus optimality follows from strong duality or complementary slackness.
Step 1. Given a feasible solution to the dual program Equation 33, determine the associated restricted primal according to Equation 34.
Step 2. Optimize the associated restricted primal. If the minimal value of this problem is zero, the corresponding solution $x$, together with $\lambda$, is an optimal pair for the original LP Equation 33.
Step 3. If the minimal value of the associated restricted primal is strictly positive, then by the strong duality theorem the maximal objective value of the associated restricted dual Equation 35 is also positive; that is, its optimal solution $\bar{y}$ satisfies $\bar{y}^\top b > 0$. If there is no $j$ for which $\bar{y}^\top a_j > 0$, that is, if $\bar{y}^\top a_j \le 0$ for all $j$, conclude the primal has no feasible solutions from Farkas's lemma.
Step 4. If, on the other hand, $\bar{y}^\top a_j > 0$ for at least one $j$, define the new dual feasible vector $\lambda' = \lambda + \alpha \bar{y}$, where the scalar $\alpha > 0$, referred to as the stepsize, is chosen as large as possible until one of the constraints $\lambda'^\top a_j \le c_j$ becomes an equality.
If $\alpha$ can be increased to $+\infty$, then the original dual is unbounded. Otherwise $\alpha$ is finite, and we go back to Step 1 using this new dual feasible solution, whose dual objective is strictly increased.