ME/AER 647 Systems Optimization I


Linear Programming


Instructor: Hasan A. Poonawala

Mechanical and Aerospace Engineering
University of Kentucky, Lexington, KY, USA

Topics:
Equality Constraints
Inequality Constraints
KKT Conditions

Introduction

Standard Form

Linear Program (LP)

An LP is an optimization problem in which the objective function is linear in the unknowns and the constraints consist of linear (in)equalities.

Standard form

minimize๐œโŠค๐ฑ=c1x1+c2x2+โ‹ฏ+cnxnsubject to๐€๐ฑ=๐›and๐ฑโ‰ฅ๐ŸŽ. \begin{align} \operatorname{minimize} & \bm{c}^\top \bm{x} = c_1x_1 + c_2x_2 + \cdots + c_nx_n \\ \text{subject to} & \bm{A}\bm{x} = \bm{b} \quad \text{and} \quad \bm{x} \geq \bm{0}. \end{align}

  • ๐œ,๐ฑโˆˆโ„n\bm{c}, \bm{x} \in \mathbb{R}^n are column vectors, ๐€โˆˆโ„mร—n\bm{A} \in \mathbb{R}^{m \times n} a fat matrix (m<nm < n), ๐›โˆˆโ„m\bm{b} \in \mathbb{R}^m a column vector.
  • bib_iโ€™s, cic_iโ€™s and aija_{ij}โ€™s are fixed real constants, and the xix_iโ€™s are real numbers to be determined.
  • We assume that each equation has been multiplied by minus unity, if necessary, so that each biโ‰ฅ0b_i \geq 0.

Conversion to Standard Form: Slack Variables

\begin{align} \operatorname{maximize} \quad & c_1x_1 + c_2x_2 + \cdots + c_nx_n \\ \text{subject to} \quad & a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \leq b_1, \\ & a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \leq b_2, \\ & \qquad \vdots \\ & a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \leq b_m, \\ & x_1, x_2, \ldots, x_n \geq 0. \end{align}

This problem may be alternatively expressed as

\begin{align} \operatorname{minimize} \quad & -c_1x_1 - c_2x_2 - \cdots - c_nx_n \\ \text{subject to} \quad & a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n + x_{n+1} = b_1, \\ & a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \phantom{{}+x_{n+1}} + x_{n+2} = b_2, \\ & \qquad \vdots \\ & a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \phantom{{}+x_{n+1}+x_{n+2}} + x_{n+m} = b_m, \\ & x_1, x_2, \ldots, x_n, x_{n+1}, \ldots, x_{n+m} \geq 0. \end{align}

  • The new nonnegative variables x_{n+i}, i = 1, \ldots, m, that convert the inequalities to equalities are called slack variables.

 

  • The constraint matrix is now transformed to \bm{A} \rightarrow \begin{bmatrix} \bm{A} & \bm{I} \end{bmatrix} (a short sketch follows).
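As a concrete illustration, here is a minimal sketch of this conversion (assuming numpy is available; the function name to_standard_form is ours):

```python
import numpy as np

def to_standard_form(A, b, c):
    """Convert  max c^T x  s.t.  Ax <= b, x >= 0  into the standard form
    min c'^T x'  s.t.  A'x' = b, x' >= 0  by negating the objective and
    appending one slack variable per inequality."""
    m, _ = A.shape
    A_std = np.hstack([A, np.eye(m)])          # A -> [A  I]
    c_std = np.concatenate([-c, np.zeros(m)])  # slack variables have zero cost
    return A_std, b, c_std
```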

Conversion to Standard Form: Surplus Variables

  • If the inequalities are reversed so that a typical inequality is

ai1x1+ai2x2+โ‹ฏ+ainxnโ‰ฅbi, a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in} x_n \geq b_i,

it is clear that this is equivalent to

ai1x1+ai2x2+โ‹ฏ+ainxnโˆ’yi=bi, a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in} x_n - y_i = b_i,

with yiโ‰ฅ0y_i \geq 0.

  • Variables, such as yiy_i, adjoined in this fashion to convert a โ€œgreater than or equal toโ€ inequality to equality are called surplus variables.

  • By suitably multiplying by minus unity, and adjoining slack and surplus variables, any set of linear inequalities can be converted to standard form.

Conversion to Standard Form: Free variables โ€“ First Method

Suppose one or more of the unknown variables is not required to be nonnegative; say, the constraint x_1 \geq 0 is not present, so that x_1 is free to take on either positive or negative values.

  • We then write x1=u1โˆ’v1x_1 = u_1 - v_1, and require that u1,v1โ‰ฅ0u_1, v_1 \geq 0.
  • We substitute u1โˆ’v1u_1 - v_1 for x1x_1 everywhere.
  • The problem is then expressed in terms of the n+1n+1 variables u1,v1,x2,x3,โ€ฆ,xnu_1, v_1, x_2, x_3, \ldots, x_n.
  • There is a certain degree of redundancy introduced in this technique since a constant added to u1u_1 and v1v_1 does not change x1x_1.
  • Nevertheless, the simplex method can still be used to find the solution.

Conversion to Standard Form: Free variables โ€“ Second Method

  • Take the same free-variable situation, i.e., x_1 is free to take on positive or negative values.
  • One can take any one of the mm equality constraints which has a nonzero coefficient for x1x_1, say, for example,

ai1x1+ai2x2+โ‹ฏ+ainxn=bi, a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = b_i,

where ai1โ‰ 0a_{i1} \neq 0.

  • Then x1x_1 can be expressed as a linear combination of the other variables, plus a constant.
  • This expression can be substituted everywhere for x1x_1 and we are led to a new problem of exactly the same form but expressed in terms of the variables x2,x3,โ€ฆ,xnx_2, x_3, \ldots, x_n only.
  • Furthermore, the ithi^{\text{th}} equation, used to determine x1x_1 is now identically zero and it too can be eliminated.
  • We obtain a linear program having nโˆ’1n-1 variables and mโˆ’1m-1 constraint equations.

Example โ€“ Specific Case

minimizex1+3x2+4x3subject tox1+2x2+x3=5,2x1+3x2+x3=6,x2,x3โ‰ฅ0. \begin{align} \operatorname{minimize} & x_1 + 3x_2 + 4x_3 \\ \text{subject to} & x_1 + 2x_2 + x_3 = 5, \\ & 2x_1 + 3x_2 + x_3 = 6, \\ & x_2, x_3 \geq 0. \end{align}

x1x_1 is free, so we can solve for it from the first constraint, obtaining

x1=5โˆ’2x2โˆ’x3.(1) x_1 = 5 - 2x_2 - x_3. \qquad(1)

Substituting this into the objective and the second constraint, we obtain the equivalent problem

minimizex2+3x3subject tox2+x3=4,x2,x3โ‰ฅ0. \begin{align} \operatorname{minimize} & x_2 + 3x_3 \\ \text{subject to}& x_2 + x_3 = 4, \\ & x_2, x_3 \geq 0. \end{align}

which is a problem in standard form.

After the smaller problem is solved (what is the answer?) the value for x1=โˆ’3x_1 = -3 can be found from Equation 1.
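The reduced problem is easy to check numerically; the following is only a verification sketch (assuming scipy is available):

```python
from scipy.optimize import linprog

# Reduced problem: minimize x2 + 3*x3  subject to  x2 + x3 = 4, x2, x3 >= 0.
res = linprog(c=[1, 3], A_eq=[[1, 1]], b_eq=[4], bounds=[(0, None)] * 2)
x2, x3 = res.x                 # optimal: x2 = 4, x3 = 0
x1 = 5 - 2 * x2 - x3           # Equation 1 recovers the free variable: x1 = -3
```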

Examples of Linear Programming Problems

Example 1 โ€“ The Diet Problem

Determine the most economical diet that satisfies the basic minimum nutritional requirements for good health

  • There are available nn different foods.
  • There are mm basic nutritional ingredients,
  • Each unit of food jj contains aija_{ij} units of the ithi^{\text{th}} nutrient.
  • The j^{\text{th}} food sells at a price c_j per unit.
  • Each individual must receive at least bib_i units of the ithi^{\text{th}} nutrient per day.

If we denote by xjx_j the number of units of food jj in the diet, the problem is to select xjx_jโ€™s to minimize the total cost c1x1+c2x2+โ‹ฏ+cnxn c_1x_1 + c_2x_2 + \cdots + c_nx_n

subject to the nutritional constraints ai1x1+ai2x2+โ‹ฏ+ainxnโ‰ฅbi,i=1,โ€ฆ,m, a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n \geq b_i, \; i=1, \ldots, m,

and the nonnegativity constraints x_1 \geq 0, x_2 \geq 0, \ldots, x_n \geq 0 on the food quantities.

This problem can be converted to standard form by subtracting a nonnegative surplus variable from the left side of each of the mm linear inequalities.
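As a sketch, the diet LP can be handed directly to scipy.optimize.linprog by flipping the \geq rows; the nutrient data below are hypothetical:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: 3 nutrients (rows), 4 foods (columns).
A = np.array([[2.0, 0.0, 1.0, 2.0],   # a_ij: units of nutrient i per unit of food j
              [0.0, 1.0, 2.0, 1.0],
              [1.0, 1.0, 0.0, 3.0]])
b = np.array([8.0, 6.0, 5.0])         # b_i: daily requirement of nutrient i
c = np.array([1.5, 1.0, 2.0, 3.0])    # c_j: price per unit of food j

# linprog expects <= rows, so multiply the >= constraints by -1.
res = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 4)
print(res.x, res.fun)                 # cheapest diet and its cost
```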

Example 2 – The Resource-Allocation Problem

  • A facility is capable of manufacturing nn different products.
  • Each product can be produced at any level xjโ‰ฅ0x_j \geq 0, j=1,2,โ€ฆ,nj=1, 2, \ldots, n.
  • Each unit of the j^{\text{th}} product needs a_{ij} units of the i^{\text{th}} resource, i = 1, 2, \ldots, m, so a product may require various amounts of the m different resources.
  • Each unit of the jthj^{\text{th}} product can sell for ฯ€j\pi_j dollars.
  • Each b_i, i = 1, 2, \ldots, m, describes the available quantity of the i^{\text{th}} resource.

We wish to manufacture products at maximum revenue

maximizeฯ€1x1+ฯ€2x2+โ‹ฏ+ฯ€nxn \begin{align} \operatorname{maximize} & \pi_1x_1 + \pi_2x_2 + \cdots + \pi_nx_n \end{align}

subject to the resource constraints

subject toai1x1+ai2x2+โ‹ฏ+ainxnโ‰คbi,i=1,โ€ฆ,m \begin{align} \text{subject to} & a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n \leq b_i, \; i=1, \ldots, m \end{align}

and the nonnegativity constraints on all production variables.

  • The problem can also be interpreted as
    • fund nn different activities, where
    • ฯ€j\pi_j is the full reward from the jthj^{\text{th}} activity,
    • xjx_j is restricted to 0โ‰คxjโ‰ค10 \leq x_j \leq 1, representing the funding level from 0%0\% to 100%100\%.

Example 3 โ€“ The Transportation Problem

  • Quantities a1,a2,โ€ฆ,ama_1, a_2, \ldots, a_m of a certain product are to be shipped from mm locations.
  • Shipping a unit of product from origin ii to destination jj costs cijc_{ij}.
  • These products will be received in amounts of b1,b2,โ€ฆ,bnb_1, b_2, \ldots, b_n at each of nn destinations.
  • We want to determine the amounts xijx_{ij} to be shipped between each origin-destination pair i=1,2,โ€ฆ,mi = 1, 2, \ldots, m; j=1,2,โ€ฆ,nj=1, 2, \ldots, n.
\begin{array}{cccc|c} x_{11} & x_{12} & \cdots & x_{1n} & a_1 \\ x_{21} & x_{22} & \cdots & x_{2n} & a_2 \\ \vdots & \vdots & & \vdots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} & a_m \\ \hline b_1 & b_2 & \cdots & b_n & \end{array}
  • The ithi^{\text{th}} row in this array defines the variables associated with the ithi^{\text{th}} origin.
  • The jthj^{\text{th}} column defines the variables associated with the jthj^{\text{th}} destination.
  • Problem: find the nonnegative variables x_{ij} so that the sum across the i^{\text{th}} row is a_i, the sum down the j^{\text{th}} column is b_j, and the weighted sum \sum_{j=1}^n\sum_{i=1}^m c_{ij}x_{ij} is minimized (a solver sketch follows).
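A minimal linprog sketch of this formulation, on hypothetical supplies, demands, and costs (chosen so that total supply equals total demand):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: m = 2 origins, n = 3 destinations.
a = np.array([30.0, 20.0])                  # supply at each origin
b = np.array([10.0, 25.0, 15.0])            # demand at each destination
C = np.array([[4.0, 6.0, 8.0],              # c_ij: unit shipping costs
              [5.0, 3.0, 7.0]])
m, n = C.shape

# Variables x_ij flattened row-major; row sums = a_i, column sums = b_j.
A_eq = np.zeros((m + n, m * n))
for i in range(m):
    A_eq[i, i * n:(i + 1) * n] = 1.0        # sum_j x_ij = a_i
for j in range(n):
    A_eq[m + j, j::n] = 1.0                 # sum_i x_ij = b_j
res = linprog(C.ravel(), A_eq=A_eq, b_eq=np.concatenate([a, b]),
              bounds=[(0, None)] * (m * n))
X = res.x.reshape(m, n)                     # optimal shipping plan
```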

Example 4 โ€“ The Maximal Flow Problem

Maximal flow problem

Determine the maximal flow that can be established in a capacitated network between a source node and a sink node (described below).

maximizefsubject toโˆ‘j=1nx1jโˆ’โˆ‘j=1nxj1โˆ’f=0,โˆ‘j=1nxijโˆ’โˆ‘j=1nxji=0,iโ‰ 1,m,โˆ‘j=1nxmjโˆ’โˆ‘j=1nxjm+f=0,0โ‰คxijโ‰คkij,โˆ€i,j, \begin{align} \operatorname{maximize} & f \\ \text{subject to} & \sum_{j=1}^n x_{1j} - \sum_{j=1}^n x_{j1} - f = 0, \\ & \sum_{j=1}^n x_{ij} - \sum_{j=1}^n x_{ji} = 0, \quad i \neq 1, m, \\ & \sum_{j=1}^n x_{mj} - \sum_{j=1}^n x_{jm} + f = 0, \\ & 0 \leq x_{ij} \leq k_{ij}, \quad \forall i, j, \end{align}

where k_{ij} = 0 for pairs (i, j) that are not joined by an arc.

  • A capacitated network in which two special nodes, the source (node 1) and the sink (node m), are distinguished.
  • All other nodes must satisfy the conservation requirement: net flow into these nodes must be zero.
    • the source may have a net outflow,
    • the sink may have a net inflow.
  • The outflow f of the source will equal the inflow of the sink (a linprog sketch on hypothetical data follows).
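A sketch of this LP on a hypothetical 4-node network (node and arc data are ours; node 0 is the source and node 3 the sink):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical capacities k_ij on the existing arcs.
k = {(0, 1): 3.0, (0, 2): 2.0, (1, 2): 1.0, (1, 3): 2.0, (2, 3): 3.0}
arcs = list(k)
n_nodes, n_arcs = 4, len(arcs)

# Variables: (x_a for each arc, f).  Conservation at each node:
# sum(out) - sum(in) = f at the source, -f at the sink, 0 elsewhere.
A_eq = np.zeros((n_nodes, n_arcs + 1))
for a, (i, j) in enumerate(arcs):
    A_eq[i, a] += 1.0          # arc leaves node i
    A_eq[j, a] -= 1.0          # arc enters node j
A_eq[0, -1] = -1.0             # source row: out - in - f = 0
A_eq[3, -1] = 1.0              # sink row:   out - in + f = 0
c = np.zeros(n_arcs + 1)
c[-1] = -1.0                   # maximize f == minimize -f
bounds = [(0, k[a]) for a in arcs] + [(0, None)]
res = linprog(c, A_eq=A_eq, b_eq=np.zeros(n_nodes), bounds=bounds)
max_flow = res.x[-1]           # expect 5.0 for this data
```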

Example 5 โ€“ A Supply-Chain Problem

A warehouse is buying and selling stock of a certain commodity in order to maximize profit over a certain length of time.

  • Warehouse has a fixed capacity CC.
  • The price, pip_i, of the commodity is known to fluctuate over a number of time periods, say months, indexed by ii.
  • The warehouse is originally empty and is required to be empty at the end of the last period.
  • There is a cost rr per unit of holding stock for one period.
  • In any period the same price holds for both purchase and sale.
  • xix_i: level of stock in the warehouse at the beginning of period ii, uiu_i: amount bought during this period, sis_i: amount sold during this period.

maximizeโˆ‘i=1n(pi(siโˆ’ui)โˆ’rxi)subject toxi+1=xi+uiโˆ’si,i=1,2,โ€ฆ,nโˆ’1,0=xn+unโˆ’sn,xi+zi=C,i=2,โ€ฆ,n,x1=0,xiโ‰ฅ0,uiโ‰ฅ0,siโ‰ฅ0,ziโ‰ฅ0, \begin{align} \operatorname{maximize} & \sum_{i=1}^n \left(p_i(s_i - u_i) - rx_i \right) \\ \text{subject to} & x_{i+1} = x_i + u_i - s_i, \quad i = 1, 2, \ldots, n-1, \\ & 0 = x_n + u_n - s_n, \\ & x_i + z_i = C, \quad i = 2, \ldots, n, \\ & x_1 = 0, x_i \geq 0, u_i \geq 0, s_i \geq 0, z_i \geq 0, \end{align}

where ziz_i is a slack variable.
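The following sketch assembles this LP for linprog on hypothetical prices; the only liberty taken is folding the slack constraint x_i + z_i = C into the simple bound x_i \leq C:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: capacity C, holding cost r, prices over n periods.
C, r = 100.0, 0.5
p = np.array([3.0, 6.0, 4.0, 8.0])
n = len(p)

# Variables stacked as (x_1..x_n, u_1..u_n, s_1..s_n).
c = np.concatenate([r * np.ones(n), p, -p])   # minimize r*x + p*u - p*s
A_eq = np.zeros((n + 1, 3 * n))
for i in range(n - 1):                        # x_{i+1} - x_i - u_i + s_i = 0
    A_eq[i, [i + 1, i, n + i, 2 * n + i]] = [1, -1, -1, 1]
A_eq[n - 1, [n - 1, 2 * n - 1, 3 * n - 1]] = [1, 1, -1]   # x_n + u_n - s_n = 0
A_eq[n, 0] = 1.0                              # x_1 = 0 (warehouse starts empty)
bounds = [(0, C)] * n + [(0, None)] * (2 * n)
res = linprog(c, A_eq=A_eq, b_eq=np.zeros(n + 1), bounds=bounds)
profit = -res.fun                             # maximized total profit
```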

Example 6 โ€“ Linear Classifier and Support Vector Machine

dd-dimensional data points are to be classified into two distinct classes.

  • In general, we have vectors \bm{a}_i \in \mathbb{R}^d for i = 1, 2, \ldots, n_1 and vectors \bm{b}_j \in \mathbb{R}^d for j = 1, 2, \ldots, n_2.
  • We wish to find a hyperplane that separates ๐ši\bm{a}_iโ€™s from the ๐›j\bm{b}_jโ€™s, i.e., find a slope-vector yโˆˆโ„dy \in \mathbb{R}^d and an intercept ฮฒ\beta such that

๐šiโŠค๐ฒ+ฮฒโ‰ฅ1,โˆ€i,๐›jโŠค๐ฒ+ฮฒโ‰ค1,โˆ€j, \begin{align} \bm{a}_i^\top \bm{y} + \beta &\geq 1, \quad \forall i, \\ \bm{b}_j^\top \bm{y} + \beta &\leq 1, \quad \forall j, \\ \end{align}

where {๐ฑ:๐ฑ๐ฒ+ฮฒ=0}\{\bm{x}: \bm{x}^\bm{y} + \beta = 0\} is the desired hyperplane.

  • The separation is defined by the fixed margins +1+1 and โˆ’1-1, which could be made soft or variable later.
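A feasibility version of this LP (zero objective) can be posed with linprog; the two-dimensional data points below are hypothetical:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical 2-D data: class A points and class B points.
A_pts = np.array([[2.0, 2.0], [3.0, 3.0], [2.0, 3.0]])
B_pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
d = A_pts.shape[1]

# Unknowns z = (y_1, ..., y_d, beta), all free.
# a_i^T y + beta >= +1  ->  -(a_i^T y + beta) <= -1
# b_j^T y + beta <= -1  ->    b_j^T y + beta  <= -1
A_ub = np.vstack([np.hstack([-A_pts, -np.ones((len(A_pts), 1))]),
                  np.hstack([ B_pts,  np.ones((len(B_pts), 1))])])
b_ub = -np.ones(len(A_pts) + len(B_pts))
res = linprog(np.zeros(d + 1), A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * (d + 1))
y, beta = res.x[:d], res.x[d]   # res.status == 2 means no separating hyperplane
```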

Example

  • Two-dimensional data points may be grade averages in science and humanities for different students.
  • We also know the academic major of each student, as being science or humanities, which serves as the classification.

Example 7 โ€“ Markov Decision Process (MDP)

Markov Decision Process

An MDP problem is defined by a finite number of states, indexed by i = 1, \ldots, m, where each state i has a finite set of actions \mathcal{A}_i to take.

  • Each action, say jโˆˆ๐’œij \in \mathcal{A}_i, is associated with
    • an immediate cost c_j of taking the action,
    • a probability distribution ๐ฉjโˆˆโ„m\bm{p}_j \in \mathbb{R}^m to transfer to all possible states at the next time period.
  • A stationary policy for the decision maker is a function ฯ€={ฯ€1,ฯ€2,โ‹ฏ,ฯ€m}\pi = \{\pi_1, \pi_2, \cdots, \pi_m\} that specifies a single action in every state, ฯ€iโˆˆ๐’œi\pi_i \in \mathcal{A}_i.

MDP Problem

Find a stationary policy to minimize or maximize the discounted sum of expected costs or rewards over the infinite time horizon with a discount factor 0โ‰คฮณ<10 \leq \gamma < 1, when the process starts from state i0i^0:

โˆ‘t=0โˆžฮณt๐”ผ[cฯ€it] \sum_{t=0}^\infty \gamma^t \mathbb{E}[c_{\pi_{i^t}}]

Example 7 โ€“ Markov Decision Process (MDP)

Maze Runner Game

Each square represents a state, where each of states {1,2,3,4}\{1, 2, 3, 4\} has two possible actions to take:

  • red action: moves to the next state at the next time period,
  • blue action: shortcut move, with a probability distribution, to a state at the next time period.

Each of states \{5, 6\} has only one action:

  • state 5's action moves to state 6 (the Exit state),
  • state 6's action moves to state 1 (the Start state).

All actions have zero cost, except state 55โ€™s (Trap state) action, which has a 11-unit cost to get out.

Suppose the game is played indefinitely. What is the optimal policy? That is, which action is best to take in every state, at any time, to minimize the present expected total cost?

MDP Problem

Two constraints for the two actions of State 11

y1โˆ’ฮณy2โ‰ค0,y1โˆ’ฮณ(14y3+14y5+14y6)โ‰ค0. y_1 - \gamma y_2 \leq 0, \;\; y_1 - \gamma(\frac{1}{4}y_3 + \frac{1}{4}y_5 + \frac{1}{4}y_6) \leq 0.

The only constraint for the single action of State 5

y5โˆ’ฮณy6โ‰ค1. y_5 - \gamma y_6 \leq 1.

Example 7 โ€“ Markov Decision Process (MDP)

  • Let yi*y_i^\ast, i=1,โ€ฆ,mi = 1, \ldots, m represent the optimal present-expected cost when the process starts at state ii and time 00
    • also called the cost-to-go value of state ii.
    • The yi*y_i^\astโ€™s follow Bellmanโ€™s principle of optimality:

yi*=minjโˆˆ๐’œi(cj+ฮณ๐ฉjโŠค๐ฒ*), y_i^\ast = \operatorname{min}_{j \in \mathcal{A}_i} (c_j + \gamma \bm{p}_j^\top \bm{y}^\ast),

where cjc_j is the immediate cost of taking action jโˆˆ๐’œij \in \mathcal{A}_i at the current time period, and ๐ฉjโŠค๐ฒ*\bm{p}_j^\top \bm{y}^\ast is the optimal expected cost from the next time period, and then on.

  • When yi*y_i^\ast is known for every state, the optimal action in each state would be

ฯ€i*=argminjโˆˆ๐’œi(cj+ฮณ๐ฉjโŠค๐ฒ*),โˆ€i. \pi_i^\ast = \operatorname{arg min}_{j \in \mathcal{A}_i} (c_j + \gamma \bm{p}_j^\top \bm{y}^\ast), \quad \forall i.

Example 7 โ€“ Markov Decision Process (MDP)

  • One observes that ๐ฒ*โˆˆโ„m\bm{y}^\ast \in \mathbb{R}^m is a fixed-point of the Bellman operator, and it can be computed by the following linear program.

maximizeโˆ‘i=1myisubject toy1โˆ’ฮณ๐ฉjโŠค๐ฒโ‰คcjโˆ€jโˆˆ๐’œ1,โ‹ฎyiโˆ’ฮณ๐ฉjโŠค๐ฒโ‰คcjโˆ€jโˆˆ๐’œi,โ‹ฎymโˆ’ฮณ๐ฉjโŠค๐ฒโ‰คcjโˆ€jโˆˆ๐’œm. \begin{align} \operatorname{maximize} & \sum_{i=1}^m y_i \\ \text{subject to} & y_1 - \gamma \bm{p}_j^\top \bm{y} \leq c_j \quad \forall j \in \mathcal{A}_1, \\ & \vdots \\ & y_i - \gamma \bm{p}_j^\top \bm{y} \leq c_j \quad \forall j \in \mathcal{A}_i, \\ & \vdots \\ & y_m - \gamma \bm{p}_j^\top \bm{y} \leq c_j \quad \forall j \in \mathcal{A}_m. \\ \end{align}

  • Basically, we relax the "\operatorname{min}" operator in Bellman's principle to "\leq", turn the relaxed relations into constraints, and then maximize the sum of the y_i's as the objective.

  • When the objective is maximized, at least one inequality constraint in \mathcal{A}_i must hold with equality for every state i, so that \bm{y} is a fixed-point solution of the Bellman operator (a small numerical sketch follows).
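A small numerical sketch of this LP for a hypothetical two-state MDP (the costs and transition distributions are ours, not those of the maze example):

```python
import numpy as np
from scipy.optimize import linprog

gamma = 0.9
# Hypothetical 2-state MDP: actions[i] is a list of (cost c_j, distribution p_j).
actions = {0: [(1.0, np.array([0.2, 0.8])), (2.0, np.array([0.9, 0.1]))],
           1: [(0.0, np.array([0.0, 1.0]))]}
m = len(actions)

# One row per action: y_i - gamma * p_j^T y <= c_j.
A_ub, b_ub = [], []
for i, acts in actions.items():
    for cost, p in acts:
        row = -gamma * p
        row[i] += 1.0
        A_ub.append(row)
        b_ub.append(cost)
# maximize sum(y)  ==  minimize -sum(y); the y_i are free variables.
res = linprog(-np.ones(m), A_ub=np.array(A_ub), b_ub=b_ub,
              bounds=[(None, None)] * m)
y_star = res.x
# Recover the optimal policy from Bellman's principle.
policy = {i: min(range(len(acts)),
                 key=lambda j: acts[j][0] + gamma * acts[j][1] @ y_star)
          for i, acts in actions.items()}
```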

Basic Feasible Solutions

Basic Solutions

System of equalities

๐€๐ฑ=๐›(2) \bm{Ax} = \bm{b} \qquad(2)

  • From the nn columns of ๐€\bm{A}, we select a set of mm linearly independent columns.
  • WLOG, assume that the first mm columns of ๐€\bm{A} are selected: denote the mร—mm \times m matrix determined by these columns by ๐\bm{B}.
  • The matrix ๐\bm{B} is nonsingular and we may uniquely solve the equation ๐๐ฑ๐=๐›or๐ฑ๐=๐โˆ’1๐› \bm{Bx_B} = \bm{b} \quad \text{or} \quad \bm{x_B} = \bm{B}^{-1}\bm{b}
  • We refer to ๐\bm{B} as a basis, since ๐\bm{B} consists of mm linearly independent columns that can be regarded as a basis for โ„m\mathbb{R}^m.

Definition

Given the set of m simultaneous linear equations in n unknowns of Equation 2, let \bm{B} be any nonsingular m \times m submatrix made up of columns of \bm{A}. Then, if all n-m components of \bm{x} not associated with columns of \bm{B} are set equal to zero, the solution to the resulting set of equations is said to be a basic solution to Equation 2 with respect to the basis \bm{B}.

Definition

The components of ๐ฑ\bm{x} associated with the columns of ๐\bm{B}, denoted by the subvector ๐ฑB\bm{x}_B according to the same column index order in ๐\bm{B}, are called basic variables.

Full-Rank Assumption

The mร—nm \times n matrix ๐€\bm{A} has m<nm < n, and the mm rows of ๐€\bm{A} are linearly independent.
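For concreteness, a basic solution is computed by solving \bm{Bx_B} = \bm{b} and zero-filling the rest, as in this numpy sketch (the function name basic_solution is ours; the data reuse the specific case solved earlier):

```python
import numpy as np

def basic_solution(A, b, basis):
    """Given column indices `basis` (m linearly independent columns of A),
    return the basic solution with all nonbasic components set to zero."""
    _, n = A.shape
    x = np.zeros(n)
    x[list(basis)] = np.linalg.solve(A[:, list(basis)], b)  # x_B = B^{-1} b
    return x

A = np.array([[1.0, 2.0, 1.0], [2.0, 3.0, 1.0]])
b = np.array([5.0, 6.0])
x = basic_solution(A, b, [1, 2])   # x = (0, 1, 3): basic, feasible since x >= 0
```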

Basic Feasible Solutions

Definition

If one or more of the basic variables in a basic solution has value zero, that solution is said to be a degenerate basic solution.

  • In a nondegenerate basic solution, the basic variables, and hence the basis ๐\bm{B}, can be immediately identified from the positive components of the solution.
    • There is ambiguity associated with a degenerate basic solution: some zero-valued basic variables can be interchanged with nonbasic ones without changing the solution!

Definition

A vector ๐ฑ\bm{x} satisfying

๐€๐ฑ=๐›,๐ฑโ‰ฅ๐ŸŽ,(3) \bm{Ax} = \bm{b}, \quad \bm{x} \geq \bm{0}, \qquad(3)

is said to be feasible for these constraints. A feasible solution to the constraints Equation 3 that is also basic is said to be a basic feasible solution; if this solution is also a degenerate basic solution, it is called a degenerate basic feasible solution.

The Fundamental Theorem of Linear Programming

The Fundamental Theorem of LP

Corresponding to a linear program in standard form

minimize๐œโŠค๐ฑsubject to๐€๐ฑ=๐›,๐ฑโ‰ฅ0(4) \begin{align} \operatorname{minimize} & \bm{c}^\top \bm{x} \\ \text{subject to} & \bm{Ax} = \bm{b}, \;\; \bm{x} \geq 0 \end{align} \qquad(4)

a feasible solution to the constraints that achieves the minimum value of the objective function subject to those constraints is said to be an optimal feasible solution. If this solution is basic, it is an optimal basic feasible solution.

Fundamental Theorem

Given a linear program in standard form Equation 4 where ๐€\bm{A} is an mร—nm \times n matrix of rank mm,

  1. if there is a feasible solution, there is a basic feasible solution;
  2. if there is an optimal feasible solution, there is an optimal basic feasible solution.

Proof (1)

Denote the columns of ๐€\bm{A} by ๐š1,๐š2,โ€ฆ,๐šn\bm{a}_1, \bm{a}_2, \ldots, \bm{a}_n. Suppose ๐ฑ=(x1,x2,โ€ฆ,xn)\bm{x} = (x_1, x_2, \ldots, x_n) is a feasible solution. Then, in terms of the columns of ๐€\bm{A}, this solution satisfies

x1๐š1+x2๐š2+โ‹ฏ+xn๐šn=๐›. x_1\bm{a}_1 + x_2\bm{a}_2 + \cdots + x_n\bm{a}_n = \bm{b}.

The Fundamental Theorem of LP

Proof (1) - Continued -

Assume that exactly pp of the variables xix_i are greater than zero, and wlog, that they are the first pp variables. Thus

x1๐š1+x2๐š2+โ‹ฏ+xp๐šp=๐›.(5) x_1 \bm{a}_1 + x_2 \bm{a}_2 + \cdots + x_p \bm{a}_p = \bm{b}. \qquad(5)

There are now two cases, corresponding as to whether the set ๐š1,๐š2,โ€ฆ,๐šp\bm{a}_1, \bm{a}_2, \ldots, \bm{a}_p is linearly independent or linearly dependent.

Case 1: Assume ๐š1,๐š2,โ€ฆ,๐šp\bm{a}_1, \bm{a}_2, \ldots, \bm{a}_p are linearly independent. Then clearly, pโ‰คmp \leq m. If p=mp = m, the solution is basic and the proof is complete. If p<mp < m, then, since ๐€\bm{A} has rank mm, mโˆ’pm-p vectors can be found from the remaining nโˆ’pn-p vectors so that the resulting set of mm vectors is linearly independent. Assigning the value zero to the corresponding mโˆ’pm-p variables yields a (degenerate) basic feasible solution.

Case 2: Assume ๐š1,๐š2,โ€ฆ,๐šp\bm{a}_1, \bm{a}_2, \ldots, \bm{a}_p are linearly dependent. Then there is a nontrivial linear combination of these vectors that is zero. Thus there are constants y1,y2,โ€ฆ,ypy_1, y_2, \ldots, y_p, at least one of which can be assumed to be positive, such that

y1๐š1+y2๐š2+โ‹ฏ+yp๐šp=๐ŸŽ.(6) y_1\bm{a}_1 + y_2\bm{a}_2 + \cdots + y_p\bm{a}_p = \bm{0}. \qquad(6)

Multiplying this equation by a scalar ฮต\varepsilon and subtracting it from Equation 5, we obtain

(x1โˆ’ฮตy1)๐š1+(x2โˆ’ฮตy2)๐š2+โ‹ฏ+(xpโˆ’ฮตyp)๐šp=๐›.(7) (x_1 - \varepsilon y_1)\bm{a}_1 + (x_2 - \varepsilon y_2)\bm{a}_2 + \cdots + (x_p - \varepsilon y_p)\bm{a}_p = \bm{b}. \qquad(7)

This equation holds for every \varepsilon, and for each \varepsilon the components x_i - \varepsilon y_i correspond to a solution of the linear equalities, although they may violate x_i - \varepsilon y_i \geq 0. Denoting \bm{y} = (y_1, y_2, \ldots, y_p, 0, 0, \ldots, 0), we see that for any \varepsilon

๐ฑโˆ’ฮต๐ฒ(8) \bm{x} - \varepsilon \bm{y} \qquad(8)

is a solution to the equalities. For ฮต=0\varepsilon = 0, this reduces to the original feasible solution.

The Fundamental Theorem of LP

Proof (1) - Continued -

As \varepsilon is increased from zero, the various components increase, decrease, or remain constant, depending upon whether the corresponding y_i is negative, positive, or zero. Since we assume at least one y_i is positive, at least one component will decrease as \varepsilon is increased. We increase \varepsilon to the first point where one or more components become zero. Specifically, we set

ฮต=min{xiyi:yi>0}. \varepsilon = \operatorname{min}\left\{\frac{x_i}{y_i}: y_i > 0 \right\}.

For this value of ฮต\varepsilon, the solution given by Equation 8 is feasible and has at most pโˆ’1p-1 positive variables. Repeating this process as necessary, we can eliminate positive variables until we have a feasible solution with corresponding columns that are linearly independent. At that point Case 1 applies.
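The construction in this proof is effectively an algorithm; here is a rough numpy sketch of it (the tolerance handling and the SVD-based null-space step are our choices):

```python
import numpy as np

def reduce_to_bfs(A, b, x, tol=1e-9):
    """Sketch of the proof's construction: while the columns supporting x
    are dependent, find y with B y = 0, step by eps = min{x_i/y_i : y_i > 0},
    and repeat until the supporting columns are linearly independent."""
    x = x.astype(float).copy()
    while True:
        S = np.flatnonzero(x > tol)            # support of the solution
        B = A[:, S]
        if np.linalg.matrix_rank(B) == len(S):
            return x                           # columns independent: basic
        y_S = np.linalg.svd(B)[2][-1]          # nontrivial y with B y ~ 0
        if y_S.max() <= tol:
            y_S = -y_S                         # ensure a positive component
        eps = (x[S][y_S > tol] / y_S[y_S > tol]).min()
        x[S] = x[S] - eps * y_S                # drives >= 1 component to zero
        x[np.abs(x) < tol] = 0.0
```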


Proof (2)

Let ๐ฑ=(x1,x2,โ€ฆ,xn)\bm{x} = (x_1, x_2, \ldots, x_n) be an optimal feasible solution and, as in the proof of (1) above, suppose there are exactly pp positive variables x1,x2,โ€ฆ,xpx_1, x_2, \ldots, x_p. Again, there are two cases; and Case 1, corresponding to the linear independence is exactly the same as before.

Case 2 also goes exactly the same as before, but it must be shown that for any ฮต\varepsilon the solution Equation 8 is optimal. To show this, note that the value of the solution ๐ฑโˆ’ฮต๐ฒ\bm{x} - \varepsilon\bm{y} is

๐œโŠค๐ฑโˆ’ฮต๐œโŠค๐ฒ.(10) \bm{c}^\top \bm{x} - \varepsilon \bm{c}^\top \bm{y}. \qquad(10)

For \varepsilon sufficiently small in magnitude, \bm{x} - \varepsilon \bm{y} is a feasible solution for positive or negative values of \varepsilon. Thus, we conclude that \bm{c}^\top \bm{y} = 0. For, if \bm{c}^\top \bm{y} \neq 0, an \varepsilon of small magnitude and proper sign could be determined so as to render Equation 10 smaller than \bm{c}^\top \bm{x} while maintaining feasibility.

The Fundamental Theorem of LP

  • Part (1) of the theorem is commonly referred to as Carathรฉodoryโ€™s theorem.

  • Part (2) of the theorem reduces the task of solving a linear program to that of searching over basic feasible solutions.

  • Since for a problem having nn variables and mm constraints there are at most

(nm)=n!m!(nโˆ’m)! \binom{n}{m} = \frac{n!}{m!(n-m)!}

basic solutions (corresponding to the number of ways of selecting mm of nn columns), there are only a finite number of possibilities.

  • Thus the fundamental theorem yields an obvious, but terribly inefficient, finite search technique.

  • By expanding upon the technique of proof as well as the statement of the fundamental theorem, the efficient simplex procedure is derived.

Relations to Convex Geometry

Extreme Points

Definition

A point ๐ฑ\bm{x} in a convex set CC is said to be an extreme point of CC if there are no two distinct points ๐ฑ1\bm{x}_1 and ๐ฑ2\bm{x}_2 in CC such that ๐ฑ=ฮฑ๐ฑ1+(1โˆ’ฮฑ)๐ฑ2\bm{x} = \alpha \bm{x}_1 + (1-\alpha)\bm{x}_2 for some ฮฑ\alpha, 0<ฮฑ<10 < \alpha < 1.

  • An extreme point is thus a point that does not lie strictly within a line segment connecting two other points of the set.

Theorem (Equivalence of Extreme Points and Basic Solutions).

Let ๐€\bm{A} be an mร—nm \times n matrix of rank mm and ๐›\bm{b} an mm-vector. Let KK be the convex polytope consisting of all nn-vectors ๐ฑ\bm{x} satisfying

๐€๐ฑ=๐›,๐ฑโ‰ฅ๐ŸŽ.(11) \bm{Ax} = \bm{b}, \;\; \bm{x} \geq \bm{0}. \qquad(11)

A vector ๐ฑ\bm{x} is an extreme point of KK if and only if ๐ฑ\bm{x} is a basic feasible solution to Equation 11.

Proof

Suppose first that ๐ฑ=(x1,x2,โ€ฆ,xm,0,0,โ€ฆ,0)\bm{x} = (x_1, x_2, \ldots, x_m, 0, 0, \ldots, 0) is a basic feasible solution to Equation 11. Then

x1๐š1+x2๐š2+โ‹ฏ+xm๐šm=๐›, x_1 \bm{a}_1 + x_2\bm{a}_2 + \cdots + x_m\bm{a}_m = \bm{b},

where ๐š1,๐š2,โ€ฆ,๐šm\bm{a}_1, \bm{a}_2, \ldots, \bm{a}_m, the first mm columns of ๐€\bm{A} are linearly independent. Suppose that ๐ฑ\bm{x} could be expressed as a convex combination of two other points in K; say, ๐ฑ=ฮฑ๐ฒ+(1โˆ’ฮฑ)๐ณ\bm{x} = \alpha \bm{y} + (1-\alpha)\bm{z}, 0<ฮฑ<10 < \alpha < 1, ๐ฒโ‰ ๐ณ\bm{y} \neq \bm{z}.

Extreme Points

Proof

Since all components of ๐ฑ\bm{x}, ๐ฒ\bm{y}, ๐ณ\bm{z} are nonnegative and since 0<ฮฑ<10 < \alpha < 1, it follows immediately that the last nโˆ’mn-m components of ๐ฒ\bm{y} and ๐ณ\bm{z} are zero. Thus, in particular, we have

y1๐š1+y2๐š2+โ‹ฏ+ym๐šm=๐› y_1\bm{a}_1 + y_2\bm{a}_2 + \cdots + y_m\bm{a}_m = \bm{b}

and

z1๐š1+z2๐š2+โ‹ฏ+zm๐šm=๐›. z_1\bm{a}_1 + z_2\bm{a}_2 + \cdots + z_m\bm{a}_m = \bm{b}.

Since the vectors \bm{a}_1, \bm{a}_2, \ldots, \bm{a}_m are linearly independent, however, it follows that \bm{y} = \bm{z}, contradicting \bm{y} \neq \bm{z}; hence \bm{x} is an extreme point of K.

Conversely, assume that ๐ฑ\bm{x} is an extreme point of KK. Let us assume that the nonzero components of ๐ฑ\bm{x} are the first kk components. Then

x1๐š1+x2๐š2+โ‹ฏ+xk๐šk=๐›,xi>0,i=1,2,โ€ฆ,k. x_1\bm{a}_1 + x_2\bm{a}_2 + \cdots + x_k\bm{a}_k = \bm{b}, \quad x_i > 0, \quad i = 1, 2, \ldots, k.

To show that ๐ฑ\bm{x} is a basic feasible solution (BFS) it must be shown that the vectors ๐š1,๐š2,โ€ฆ,๐šk\bm{a}_1, \bm{a}_2, \ldots, \bm{a}_k are linearly independent. We do this by contradiction. Suppose they are linearly dependent. Then there is a nontrivial linear combination that is zero:

y1๐š1+y2๐š2+โ‹ฏyk๐šk=๐ŸŽ. y_1\bm{a}_1 + y_2\bm{a}_2 + \cdots y_k\bm{a}_k = \bm{0}.

Define the nn-vector ๐ฒ=(y1,y2,โ€ฆ,yk,0,0,โ€ฆ,0)\bm{y} = (y_1, y_2, \ldots, y_k, 0, 0, \ldots, 0). Since xi>0x_i > 0, 1โ‰คiโ‰คk1 \leq i \leq k, it is possible to select ฮต\varepsilon such that

๐ฑ+ฮต๐ฒโ‰ฅ๐ŸŽ,๐ฑโˆ’ฮต๐ฒโ‰ฅ๐ŸŽ. \bm{x} + \varepsilon \bm{y} \geq \bm{0}, \quad \bm{x} - \varepsilon \bm{y} \geq \bm{0}.

We then have \bm{x} = \frac{1}{2}(\bm{x}+\varepsilon\bm{y}) + \frac{1}{2}(\bm{x}-\varepsilon\bm{y}), which expresses \bm{x} as a convex combination of two distinct vectors in K. This contradicts the assumption that \bm{x} is an extreme point, so \bm{a}_1, \bm{a}_2, \ldots, \bm{a}_k must be linearly independent and \bm{x} is a basic feasible solution.

Extreme Points

Corollary

If the convex set KK corresponding to Equation 11 is nonempty, it has at least one extreme point.

Corollary

If there is a finite optimal solution to a linear programming problem, there is a finite optimal solution which is an extreme point of the constraint set.

Corollary

The constraint set KK corresponding to Equation 11 possesses at most a finite number of extreme points and each of them is finite.

Corollary

If the convex polytope KK corresponding to Equation 11 is bounded, then KK is a convex polyhedron, that is, KK consists of points that are convex combinations of a finite number of points.

Convex Geometry โ€“ Examples

Consider the constraint set in โ„3\mathbb{R}^3 defined by

x1+x2+x3=1,x1,x2,x3โ‰ฅ0. \begin{align} x_1 + x_2 + x_3 &= 1, \\ x_1, x_2, x_3 &\geq 0. \end{align}

  • This set is illustrated in the figure.
  • It has three extreme points, corresponding to the three basic solutions to x1+x2+x3=1x_1 + x_2 + x_3 = 1.

Consider the constraint set in โ„3\mathbb{R}^3 defined by

x1+x2+x3=1,2x1+3x2=1,x1,x2,x3โ‰ฅ0. \begin{align} x_1 + x_2 + x_3 &= 1, \\ 2x_1 + 3x_2 &= 1, \\ x_1, x_2, x_3 &\geq 0. \end{align}

  • This set is illustrated in the figure.
  • It has two extreme points, corresponding to the two basic feasible solutions.
  • Note that the system of equations itself has three basic solutions: (2,โˆ’1,0),(12,0,12),(0,13,23), (2, -1, 0), (\frac{1}{2}, 0, \frac{1}{2}), (0, \frac{1}{3}, \frac{2}{3}), the first of which is not feasible.
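These three basic solutions can be enumerated by brute force over the \binom{3}{2} column pairs, as in this numpy sketch:

```python
import numpy as np
from itertools import combinations

A = np.array([[1.0, 1.0, 1.0], [2.0, 3.0, 0.0]])
b = np.array([1.0, 1.0])

for cols in combinations(range(3), 2):         # all 2x2 column submatrices
    B = A[:, cols]
    if abs(np.linalg.det(B)) > 1e-12:          # nonsingular basis candidates
        x = np.zeros(3)
        x[list(cols)] = np.linalg.solve(B, b)
        print(cols, x, "feasible" if (x >= 0).all() else "infeasible")
```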

Convex Geometry โ€“ Examples

Consider the constraint set in โ„2\mathbb{R}^2 defined by

x1+83x2โ‰ค4,x1+x2โ‰ค2,2x1โ‰ค3,x1,x2โ‰ฅ0. \begin{align} x_1 + \frac{8}{3}x_2 &\leq 4, \\ x_1 + x_2 &\leq 2, \\ 2x_1 &\leq 3, \\ x_1, x_2 &\geq 0. \end{align}

  • The set has five extreme points.
  • To put this example in standard form, we introduce slack variables, yielding the equivalent set in \mathbb{R}^5.

\begin{align} x_1 + \frac{8}{3}x_2 + x_3 &= 4, \\ x_1 + x_2 + x_4 &= 2, \\ 2x_1 + x_5 &= 3, \\ x_1, x_2, x_3, x_4, x_5 &\geq 0. \end{align}

  • A basic solution for this system is obtained by setting any two variables to zero and solving for the remaining three.

  • As indicated in the figure, each edge of the figure corresponds to one variable being zero, and the extreme points are the points where two variables are zero.

  • Even when not expressed in standard form, the extreme points of the set defined by the constraints of a linear program correspond to the possible solution points.

  • The level sets of the objective function -2x_1 - x_2 are shown in the bottom figure.

    • As the level varies, different parallel lines are obtained.
    • The optimal value of the linear program is the smallest value of this level for which the corresponding line has a point in common with the feasible set.
    • In the figure, this occurs at the point (32,12)(\frac{3}{2}, \frac{1}{2}) with the level z=โˆ’72z = -\frac{7}{2}.
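A quick linprog check of this example (verification only):

```python
from scipy.optimize import linprog

# minimize -2*x1 - x2 over the five-extreme-point region above
res = linprog(c=[-2, -1],
              A_ub=[[1, 8/3], [1, 1], [2, 0]],
              b_ub=[4, 2, 3],
              bounds=[(0, None)] * 2)
print(res.x, res.fun)   # expect (3/2, 1/2) with level z = -7/2
```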

Farkasโ€™s Lemma and Alternative Systems

(In)feasibility Certificates

Theorem (Farkasโ€™s Lemma).

Let ๐€\bm{A} be an mร—nm \times n matrix and ๐›\bm{b} be an mm-vector. The system of constraints ๐€๐ฑ=๐›,๐ฑโ‰ฅ๐ŸŽ(12) \bm{Ax} = \bm{b}, \quad \bm{x} \geq \bm{0} \qquad(12) has a feasible solution ๐ฑ\bm{x} if and only if the system of constraints โˆ’๐ฒโŠค๐€โ‰ฅ๐ŸŽ,๐ฒโŠค๐›=1(or>0)(13) -\bm{y}^\top \bm{A} \geq \bm{0}, \quad \bm{y}^\top \bm{b} = 1 (\text{or} > 0) \qquad(13) has no feasible solution ๐ฒ\bm{y}. Therefore a single feasible solution ๐ฒ\bm{y} for system Equation 13 establishes an infeasibility certificate for the system Equation 12.

  • The two systems, Equation 12 and Equation 13, are called alternative systems: one of them is feasible and the other is infeasible.

Example 1

Suppose ๐€=[11],๐›=โˆ’1\bm{A} = \begin{bmatrix} 1 & 1 \end{bmatrix}, \quad \bm{b} = -1. Then, y=โˆ’1y = -1 is feasible for system Equation 13, which proves that the system Equation 12 is infeasible.

(In)feasibility Certificates

Lemma

Let CC be the cone generated by the columns of matrix ๐€\bm{A}, that is C={๐€๐ฑโˆˆโ„m:๐ฑโ‰ฅ๐ŸŽ}. C = \{\bm{Ax} \in \mathbb{R}^m: \bm{x} \geq \bm{0}\}. Then C is a closed and convex set.

Proof (of Farkasโ€™s Lemma).

Let the system Equation 12 have a feasible solution, say ๐ฑโ€พ\bar{\bm{x}}. Then, the system Equation 13 must be infeasible, since, otherwise, we have a contradiction

0<๐ฒโŠค๐›=๐ฒโŠค(๐€๐ฑโ€พ)=(๐ฒโŠค๐€)๐ฑโ€พโ‰ค0, 0 < \bm{y}^\top \bm{b} = \bm{y}^\top(\bm{A}\bar{\bm{x}}) = (\bm{y}^\top \bm{A})\bar{\bm{x}} \leq 0,

from ๐ฑโ€พโ‰ฅ๐ŸŽ\bar{\bm{x}} \geq \bm{0} and ๐ฒโŠค๐€โ‰ค๐ŸŽ\bm{y}^\top \bm{A} \leq \bm{0}.

Now, let the system Equation 12 have no feasible solution, that is, ๐›โˆ‰C:={๐€๐ฑ:๐ฑโ‰ฅ0}\bm{b} \notin C := \{\bm{Ax}: \bm{x} \geq 0\}. We now prove that its alternative system Equation 13 must have a feasible solution.

Since the point \bm{b} is not in C and C is a closed convex set, by the separating hyperplane theorem there is a \bm{y} such that \bm{y}^\top \bm{b} > \operatorname{sup}_{\bm{c} \in C} \bm{y}^\top \bm{c}. But we know that \bm{c} = \bm{Ax} for some \bm{x} \geq \bm{0}, so we have \bm{y}^\top \bm{b} > \operatorname{sup}_{\bm{x}\geq\bm{0}} \bm{y}^\top \bm{Ax} = \operatorname{sup}_{\bm{x} \geq \bm{0}} (\bm{y}^\top \bm{A})\bm{x}. \qquad(14) Setting \bm{x} = \bm{0}, we have \bm{y}^\top \bm{b} > 0 from inequality Equation 14.

(In)feasibility Certificates

Proof (of Farkasโ€™s Lemma) - Continued -

Furthermore, inequality Equation 14 also implies \bm{y}^\top \bm{A} \leq \bm{0}: otherwise, suppose some entry, say the first entry (\bm{y}^\top \bm{A})_1 of \bm{y}^\top \bm{A}, were positive. We could then choose a vector \bar{\bm{x}} \geq \bm{0} such that

xโ€พ1=ฮฑ>0,xโ€พ2=โ‹ฏ=xโ€พn=0. \bar{x}_1 = \alpha > 0, \bar{x}_2 = \cdots = \bar{x}_n = 0.

Then, from this choice, we have

sup๐ฑโ‰ฅ๐ŸŽ(๐ฒโŠค๐€)๐ฑโ‰ฅ(๐ฒโŠค๐€)๐ฑโ€พ=ฮฑ(๐ฒโŠค๐€)1. \operatorname{sup}_{\bm{x} \geq \bm{0}} (\bm{y}^\top\bm{A})\bm{x} \geq (\bm{y}^\top \bm{A})\bar{\bm{x}} = \alpha(\bm{y}^\top \bm{A})_1.

This tends to โˆž\infty as ฮฑโ†’โˆž\alpha \rightarrow \infty. This is a contradiction because (๐ฒโŠค๐€)๐ฑโ€พ(\bm{y}^\top\bm{A})\bar{\bm{x}} should be bounded from above by inequality Equation 14. Therefore, ๐ฒ\bm{y} identified in the separating hyperplane theorem is a feasible solution to system Equation 13. Finally, we can always scale ๐ฒ\bm{y} such that ๐ฒโŠค๐›=1\bm{y}^\top \bm{b} = 1.

Geometric Interpretation

If ๐›\bm{b} is not in the closed and convex cone generated by the columns of the matrix ๐€\bm{A}, then there must be a hyperplane separating ๐›\bm{b} and the cone, and the feasible solution ๐ฒ\bm{y} to the alternative system is the slope-vector of the hyperplane.

Variant of Farkasโ€™s Lemma

Corollary

Let ๐€\bm{A} be an mร—nm \times n matrix and ๐œ\bm{c} an nn-vector. The system of constraints

๐€โŠค๐ฒโ‰คc(15) \bm{A}^\top \bm{y} \leq c \qquad(15)

has a feasible solution ๐ฒ\bm{y} if and only if the system of constraints

๐€๐ฑ=๐ŸŽ,๐ฑโ‰ฅ๐ŸŽ,๐œโŠค๐ฑ=โˆ’1(or<0)(16) \bm{Ax} = \bm{0}, \quad \bm{x} \geq \bm{0}, \quad \bm{c}^\top \bm{x} = -1 \; (\text{or} < 0) \qquad(16)

has no feasible solution ๐ฑ\bm{x}. Therefore a single feasible solution ๐ฑ\bm{x} for system Equation 16 establishes an infeasibility certificate for the system Equation 15.

Adjacent Basic Feasible Solutions (Extreme Points)

Adjacent Solutions

  • We know that it is only necessary to consider basic feasible solutions to the system ๐€๐ฑ=๐›,๐ฑโ‰ฅ๐ŸŽ(17) \bm{Ax} = \bm{b}, \quad \bm{x} \geq \bm{0} \qquad(17) when solving a linear program.

  • The idea of the simplex method is to move from a basic feasible solution (extreme point) to an adjacent one with an improved objective value.

Definition

Two basic feasible solutions are said to be adjacent if and only if they differ by one basic variable.

  • A new basic solution can be generated from an old one by replacing one current basic variable by a current nonbasic one.
    • It is impossible to arbitrarily specify a pair of variables whose roles are to be interchanged and expect to maintain the nonnegativity condition.
    • However, it is possible to arbitrarily specify which current nonbasic (entering or incoming) variable is to become basic and then determine which current basic (leaving or outgoing) variable should become nonbasic.

Determination of a Vector to Leave the Basis

  • Let the BFS be partitioned as \bm{x}_{\bm{B}} = (x_1, x_2, \ldots, x_m) and \bm{x}_{\bm{N}} = (x_{m+1}, x_{m+2}, \ldots, x_n).
  • ๐›\bm{b} is a linear combination of columns of ๐=[๐š1๐š2โ‹ฏ๐šm]\bm{B} = \begin{bmatrix} \bm{a}_1 & \bm{a}_2 & \cdots & \bm{a}_m \end{bmatrix} with the positive multipliers (x1,x2,โ€ฆ,xm)(x_1, x_2, \ldots, x_m).

๐›=๐๐ฑ๐=x1๐š1+x2๐š2+โ‹ฏ+xm๐šm,where๐ฑ๐=๐šโ€พ0:=๐โˆ’1๐›โ‰ฅ๐ŸŽ.(18) \bm{b} = \bm{B}\bm{x}_{\bm{B}} = x_1\bm{a}_1 + x_2 \bm{a}_2 + \cdots + x_m\bm{a}_m, \quad \text{where} \quad \bm{x}_{\bm{B}} = \bar{\bm{a}}_0 := \bm{B}^{-1}\bm{b} \geq \bm{0}. \qquad(18)

  • Suppose we decided to bring into representation the ethe^{\text{th}} (entering) column vector of ๐€\bm{A}, ๐še\bm{a}_e (e>me > m), while keeping all others nonbasic.
  • A new representation of ๐›\bm{b} as a linear combination of m+1m+1 vectors (๐še\bm{a}_e added to the current basis ๐\bm{B}) for any nonnegative multiplier xex_e and ๐ฑ๐\bm{x}_{\bm{B}}:

๐›=๐๐ฑ๐+๐šexe,or๐šโ€พ0=๐โˆ’1๐›=๐ฑ๐+(๐โˆ’1๐še)xe.(19) \bm{b} = \bm{Bx_B} + \bm{a}_ex_e, \quad \text{or} \quad \bar{\bm{a}}_0 = \bm{B}^{-1}\bm{b} = \bm{x}_{\bm{B}} + (\bm{B}^{-1}\bm{a}_e)x_e. \qquad(19)

  • Since xex_e is the incoming variable, its value needs to be increased from the current 00 to a positive value, say ฮตโ‰ฅ0\varepsilon \geq 0.
  • As xex_e value increases, the current basic variable ๐ฑ๐\bm{x}_{\bm{B}} needs to be adjusted accordingly to keep the feasibility.

๐ฑ๐=๐šโ€พ0โˆ’ฮต(๐โˆ’1๐še)=๐šโ€พ0โˆ’ฮต๐šโ€พeโ‰ฅ๐ŸŽ,where๐šโ€พe=๐โˆ’1๐še.(20) \bm{x}_{\bm{B}} = \bar{\bm{a}}_0 - \varepsilon(\bm{B}^{-1}\bm{a}_e) = \bar{\bm{a}}_0 - \varepsilon\bar{\bm{a}}_e \geq \bm{0}, \quad \text{where} \quad \bar{\bm{a}}_e = \bm{B}^{-1}\bm{a}_e. \qquad(20)

Determination of a Vector to Leave the Basis

  • For ฮต=0\varepsilon = 0, we have the old basic feasible solution ๐ฑ๐=๐šโ€พ0>๐ŸŽ\bm{x}_{\bm{B}} = \bar{\bm{a}}_0 > \bm{0}.
  • The values of ๐ฑ๐\bm{x}_{\bm{B}} will either increase or remain unchanged if aโ€พieโ‰ค0\bar{a}_{ie} \leq 0; or decrease linearly as ฮต\varepsilon is increased if aโ€พie>0\bar{a}_{ie} > 0.
  • For small enough ฮต\varepsilon, Equation 19 gives a feasible but nonbasic solution.
  • If any components decrease, we may set \varepsilon equal to the value corresponding to the first point where one (or more) of the components vanishes (a small sketch follows this list):

ฮต=mini{aโ€พi0aโ€พie:aโ€พie>0}.(21) \varepsilon = \operatorname{min}_i \left\{\frac{\bar{a}_{i0}}{\bar{a}_{ie}}: \bar{a}_{ie} > 0 \right\}. \qquad(21)

  • We have a new BFS, with the vector ๐še\bm{a}_e replacing (outgoing) column ๐šo\bm{a}_o, where index oโ‰คmo \leq m corresponds to the minimizing-ratio in Equation 21 o=argโ€†mini{aโ€พi0aโ€พie:aโ€พie>0}. o = \operatorname{arg \, min}_i \left\{\frac{\bar{a}_{i0}}{\bar{a}_{ie}}: \bar{a}_{ie} > 0 \right\}.
  • If none of the aโ€พie\bar{a}_{ie}โ€™s are positive, then all coefficients in Equation 19 increase (or remain constant) as ฮต\varepsilon is increased, and no new basic feasible solution is obtained.

  • In this case the set KK of feasible solutions to Equation 17 is unbounded.
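A sketch of the min-ratio test of Equation 21 (the function name ratio_test is ours):

```python
import numpy as np

def ratio_test(a0_bar, ae_bar, tol=1e-12):
    """Min-ratio test (Equation 21): returns (eps, leaving index o),
    or (inf, None) when no entry of ae_bar is positive (K unbounded)."""
    pos = ae_bar > tol
    if not pos.any():
        return np.inf, None
    ratios = np.full(len(a0_bar), np.inf)
    ratios[pos] = a0_bar[pos] / ae_bar[pos]
    o = int(np.argmin(ratios))
    return ratios[o], o
```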

Conic Combination Interpretations

  • The basis transformation can be represented in the requirements space, the space where columns of ๐€\bm{A} and ๐›\bm{b} are represented.

๐š1x1+๐š2x2+โ‹ฏ+๐šnxn=๐›. \bm{a}_1x_1 + \bm{a}_2x_2 + \cdots + \bm{a}_nx_n = \bm{b}.

  • A feasible solution defines a representation of ๐›\bm{b} as a conic combination of the ๐ši\bm{a}_iโ€™s.
  • A BFS will only use mm positive weights.
  • Suppose we start with ๐š1\bm{a}_1 and ๐š2\bm{a}_2 as the initial basis.
    • Then an adjacent basis is found by bringing in some other vector.
    • If ๐š3\bm{a}_3 is brought in, then clearly ๐š2\bm{a}_2 must go out (why?).
    • On the other hand, if ๐š4\bm{a}_4 is brought in, ๐š1\bm{a}_1 must go out (why?).
  • A BFS can be constructed with positive weights on ๐š1\bm{a}_1 and ๐š2\bm{a}_2 because ๐›\bm{b} lies between them.
  • A BFS cannot be constructed with positive weights on ๐š1\bm{a}_1 and ๐š4\bm{a}_4.
  • Another interpretation is in the activity space, the space where ๐ฑ\bm{x} lives.
    • Here, the feasible region is shown directly as a convex set, and BFS are extreme points.
    • Adjacent extreme points are points that lie on a common edge.

Conic Combination Interpretations โ€“ Example

Example (Basis Change Illustration)

3x1+x2โˆ’2x3+x4=2,x1+3x2โˆ’x4=2. \begin{align} 3x_1 + x_2 - 2x_3 + x_4 &= 2, \\ x_1 + 3x_2 - x_4 &= 2. \end{align}

  • Suppose we start with ๐š1\bm{a}_1 and ๐š2\bm{a}_2 as the initial basis and select ๐š3\bm{a}_3 as the incoming column. Then

๐=(3113),๐โˆ’1=(3818โˆ’1838),๐šโ€พ0=๐โˆ’1๐›=(1212),๐šโ€พ3=๐โˆ’1๐š3=(โˆ’3414). \begin{equation} \bm{B} = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}, \quad \bm{B}^{-1} = \begin{pmatrix} \frac{3}{8} & \frac{1}{8} \\ -\frac{1}{8} & \frac{3}{8} \end{pmatrix}, \quad \bar{\bm{a}}_0 = \bm{B}^{-1}\bm{b} = \begin{pmatrix} \frac{1}{2} \\ \frac{1}{2} \end{pmatrix}, \quad \bar{\bm{a}}_3 = \bm{B}^{-1}\bm{a}_3 = \begin{pmatrix} -\frac{3}{4} \\ \frac{1}{4} \end{pmatrix}. \end{equation}

  • From Equation 21, ฮต=2\varepsilon = 2 and ๐š2\bm{a}_2 is the outgoing column so that the new basis is formed by ๐š1\bm{a}_1 and ๐š3\bm{a}_3.

  • Now, suppose we start with ๐š1\bm{a}_1 and ๐š3\bm{a}_3 as the initial basis and select ๐š4\bm{a}_4 as the incoming column. Then

๐=(3โˆ’210),๐โˆ’1=(01โˆ’1232),๐šโ€พ0=๐โˆ’1๐›=(22),๐šโ€พ4=๐โˆ’1๐š4=(โˆ’1โˆ’2). \begin{equation} \bm{B} = \begin{pmatrix} 3 & -2 \\ 1 & 0 \end{pmatrix}, \quad \bm{B}^{-1} = \begin{pmatrix} 0 & 1 \\ -\frac{1}{2} & \frac{3}{2} \end{pmatrix}, \quad \bar{\bm{a}}_0 = \bm{B}^{-1}\bm{b} = \begin{pmatrix} 2 \\ 2 \end{pmatrix}, \quad \bar{\bm{a}}_4 = \bm{B}^{-1}\bm{a}_4 = \begin{pmatrix} -1 \\ -2 \end{pmatrix}. \end{equation}

  • Since the entries of the incoming column ๐šโ€พ4\bar{\bm{a}}_4 are all negative, ฮต\varepsilon in Equation 21 can go to โˆž\infty, indicating that the feasible region is unbounded.
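Both computations above can be verified in a few lines of numpy:

```python
import numpy as np

A = np.array([[3.0, 1.0, -2.0, 1.0], [1.0, 3.0, 0.0, -1.0]])
b = np.array([2.0, 2.0])

B = A[:, [0, 1]]                        # initial basis (a1, a2)
a0_bar = np.linalg.solve(B, b)          # (1/2, 1/2)
a3_bar = np.linalg.solve(B, A[:, 2])    # incoming column: (-3/4, 1/4)
pos = a3_bar > 0
eps = (a0_bar[pos] / a3_bar[pos]).min() # eps = 2; the minimizing row belongs
                                        # to a2, so the new basis is (a1, a3)
```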

The Primal Simplex Method

Determining an Optimal Feasible Solution

The idea of the simplex method is to select the incoming column so that the resulting new BFS will yield a lower value to the objective function than the previous one.

  • Assume that ๐\bm{B} consists of the first mm columns of AA. Then partition ๐€,๐ฑ\bm{A}, \bm{x} and ๐œโŠค\bm{c}^\top as

๐€=[๐๐] \bm{A} = \begin{bmatrix} \bm{B} & \bm{N} \end{bmatrix} ๐ฑ=[๐ฑ๐๐ฑ๐],๐œโŠค=[๐œ๐โŠค๐œ๐โŠค]. \bm{x} = \begin{bmatrix}\bm{x}_{\bm{B}} & \bm{x}_{\bm{N}}\end{bmatrix}, \quad \bm{c}^\top = \begin{bmatrix} \bm{c}_{\bm{B}}^\top & \bm{c}_{\bm{N}}^\top \end{bmatrix}.

  • Suppose we have a basic feasible solution ๐ฑ๐=๐šโ€พ0:=๐โˆ’1๐›โ‰ฅ๐ŸŽ,and๐ฑ๐=๐ŸŽ. \bm{x}_{\bm{B}} = \bar{\bm{a}}_0 := \bm{B}^{-1}\bm{b} \geq \bm{0}, \quad \text{and} \quad \bm{x}_{\bm{N}} = \bm{0}.

  • The value of the objective function is z=c1x1+c2x2+โ‹ฏ+cnxn=๐œ๐โŠค๐ฑ๐+๐œ๐โŠค๐ฑ๐(22) z = c_1x_1 + c_2x_2 + \cdots + c_nx_n = \bm{c}_{\bm{B}}^\top\bm{x}_{\bm{B}} + \bm{c}_{\bm{N}}^\top\bm{x}_{\bm{N}} \qquad(22)

  • Hence for the current basic solution, the corresponding value is z0=๐œ๐โŠค๐โˆ’1๐›.(23) z_0 = \bm{c}_{\bm{B}}^\top \bm{B}^{-1}\bm{b}. \qquad(23)

  • For any value of ๐ฑ๐\bm{x}_{\bm{N}} the necessary value of ๐ฑ๐\bm{x}_{\bm{B}} is determined from mm equality constraints of the linear program, i.e., from ๐€๐ฑ=๐›\bm{Ax} = \bm{b}.

๐๐ฑ๐+๐๐ฑ๐=๐›or๐ฑ๐=๐โˆ’1๐›โˆ’๐โˆ’1๐๐ฑ๐(24) \bm{Bx_{B} + Nx_{N}} = \bm{b} \quad \text{or} \quad \bm{x_B} = \bm{B}^{-1}\bm{b} - \bm{B}^{-1}\bm{N}\bm{x_{N}} \qquad(24)

  • When this expression is substituted into Equation 22 we obtain z=๐œ๐โŠค(๐โˆ’1๐›โˆ’๐โˆ’1๐๐ฑ๐)+๐œ๐โŠค๐ฑ๐=๐œ๐โŠค๐โˆ’1๐›+(๐œ๐โŠคโˆ’๐œ๐โŠค๐โˆ’1๐)๐ฑ๐=z0+(๐œ๐โŠคโˆ’๐ฒโŠค๐)๐ฑ๐.(25) \begin{align} z &= \bm{c}_{\bm{B}}^\top(\bm{B}^{-1}\bm{b}-\bm{B}^{-1}\bm{Nx_{N}}) + \bm{c}_{\bm{N}}^\top\bm{x_{N}} \\ &= \bm{c}_{\bm{B}}^\top\bm{B}^{-1}\bm{b} + (\bm{c}_{\bm{N}}^\top - \bm{c}_{\bm{B}}^\top\bm{B}^{-1}\bm{N})\bm{x_{N}} \\ &= z_0 + (\bm{c}_{\bm{N}}^\top - \bm{y}^\top\bm{N})\bm{x_{N}}. \end{align} \qquad(25)

Determining an Optimal Feasible Solution

  • Equation 25 expresses the cost of any feasible solution to Equation 17 in terms of the independent variables in \bm{x_N}.

  • Here, ๐ฒโŠค=๐œ๐โŠค๐โˆ’1\bm{y}^\top = \bm{c}_{\bm{B}}^\top\bm{B}^{-1} is the simplex multipliers or the shadow prices corresponding to basis ๐\bm{B}.

  • Let ๐ซ๐โŠค=๐œ๐โŠคโˆ’๐ฒโŠค๐.(26) \bm{r}_{\bm{N}}^\top = \bm{c}_{\bm{N}}^\top - \bm{y}^\top\bm{N}. \qquad(26)
  • Then from formula Equation 25, z=๐œโŠค๐ฑ=z0+โˆ‘j=m+1nrjxj(27) z = \bm{c}^\top \bm{x} = z_0 + \sum_{j=m+1}^n r_jx_j \qquad(27)
  • The vector ๐ซ๐\bm{r_N} represents the relative cost vector, also called reduced cost or reduced gradient vector for nonbasic variables in ๐ฑ๐\bm{x_{N}}.
  • From formula Equation 27, we can now determine if there is any advantage in changing the basic solution by introducing one of the nonbasic variables.
    • If rjr_j is negative for some jj, m+1โ‰คjโ‰คnm+1 \leq j \leq n, then increasing xjx_j from zero to some positive value would decrease the total cost, yielding a better solution.
    • Equation 27 automatically takes into account the changes that would be required in the values of the basic variables x_1, x_2, \ldots, x_m to accommodate the change in x_j.

Determining an Optimal Feasible Solution

Theorem (Improvement of Basic Feasible Solution)

Given a nondegenerate BFS with corresponding objective value z_0, suppose that for some j there holds r_j < 0. Then there is a feasible solution with objective value z < z_0. If the column \bm{a}_j can be substituted for some vector in the original basis to yield a new basic feasible solution, this new solution will have z < z_0. If \bm{a}_j cannot be substituted to yield a BFS, then the solution set K is unbounded and the objective function can be made arbitrarily small (toward minus infinity).

  • Final question: Does rjโ‰ฅ0,โˆ€jr_j \geq 0, \;\; \forall j imply optimality?
    • The answer is โ€œyesโ€ due to strong duality and the fact that

๐ซ๐โŠค=๐œ๐โŠคโˆ’๐ฒโŠค๐=๐œ๐โŠคโˆ’๐œ๐โŠค=๐ŸŽ. \bm{r}_{\bm{B}}^\top = \bm{c}_{\bm{B}}^\top - \bm{y}^\top\bm{B} = \bm{c}_{\bm{B}}^\top - \bm{c}_{\bm{B}}^\top = \bm{0}.

This means that

0=๐ซ๐โŠค๐ฑ๐=๐œ๐โŠค๐ฑ๐โˆ’๐ฒโŠค๐๐ฑ๐=๐œ๐โŠค๐ฑ๐โˆ’๐ฒโŠค๐๐โˆ’1๐›=๐œโŠค๐ฑโˆ’๐ฒโŠค๐›. 0 = \bm{r}_{\bm{B}}^\top \bm{x_B} = \bm{c}_{\bm{B}}^\top \bm{x}_{\bm{B}} - \bm{y}^\top \bm{Bx_B} = \bm{c}_{\bm{B}}^\top \bm{x}_{\bm{B}} - \bm{y}^\top \bm{B}\bm{B}^{-1}\bm{b} = \bm{c}^\top \bm{x} - \bm{y}^\top \bm{b}.

and strong duality forces that ๐ฑ\bm{x} is optimal.

Optimality Condition Theorem

If for some basic feasible solution rjโ‰ฅ0โˆ€jr_j \geq 0 \;\; \forall j, then that solution is optimal.

Economic Interpretations โ€“ Diet Problem

 

Optimality

  • We consider a certain food not in the basis โ€“ say carrots โ€“ and determine if it would be advantageous to bring it into the basis.
  • This is easily determined by examining the cost of carrots as compared with the cost of synthetic carrots.
  • Say carrots are food jj, whose unit cost is cjc_j. The cost of a unit of synthetic carrots is โˆ‘i=1mci(๐โˆ’1๐šj)i=๐œ๐โŠค๐โˆ’1๐šj=๐ฒโŠค๐šj. \sum_{i=1}^m c_i (\bm{B}^{-1}\bm{a}_j)_i = \bm{c}_{\bm{B}}^\top \bm{B}^{-1}\bm{a}_j = \bm{y}^\top \bm{a}_j.
  • If the reduced coefficient rj=cjโˆ’๐ฒโŠค๐šj<0r_j = c_j - \bm{y}^\top\bm{a}_j < 0, it is advantageous to use real carrots in place of synthetic carrots, and carrots should be brought into the basis.
  • In general, each ๐ฒโŠค๐šj\bm{y}^\top \bm{a}_j can be thought of as the price of a unit of the column ๐šj\bm{a}_j when constructed from the current basis.
    • The difference between this synthetic price and the direct price of that column determines whether that column should enter the basis.

Diet Problem (Exact nutritional requirements)

minimize๐œโŠค๐ฑsubject to๐€๐ฑ=๐›,๐ฑโ‰ฅ๐ŸŽ. \begin{align} \operatorname{minimize} & \bm{c}^\top \bm{x} \\ \text{subject to} & \bm{Ax} = \bm{b}, \quad \bm{x} \geq \bm{0}. \end{align}

  • ๐šj\bm{a}_j gives the nutritional equivalent of a unit of a particular food.
  • Given a basis \bm{B}, say the first m columns of \bm{A}, the corresponding \bm{B}^{-1}\bm{a}_j shows how the nutritional content of any food j can be constructed as a combination of the foods in the basis.
  • For instance, if carrots are not in the basis we can, using the description given by the tableau, construct a synthetic carrot, which is nutritionally equivalent to a carrot, by an appropriate combination of the foods in the basis.

The Simplex Procedure

Step 0. Given the inverse \bm{B}^{-1} of a current basis, and the current solution \bm{x}_{\bm{B}} = \bar{\bm{a}}_0 = \bm{B}^{-1}\bm{b}.

Step 1. Calculate the current simplex multiplier vector ๐ฒโŠค=๐œ๐โŠค๐โˆ’1\bm{y}^\top = \bm{c}_{\bm{B}}^\top\bm{B}^{-1} and then calculate the relative cost coefficients ๐ซ๐โŠค=๐œ๐โŠคโˆ’๐ฒโŠค๐\bm{r}_{\bm{N}}^\top = \bm{c}_{\bm{N}}^\top - \bm{y}^\top\bm{N}. If ๐ซ๐โ‰ฅ๐ŸŽ\bm{r}_{\bm{N}} \geq \bm{0} stop; the current solution is optimal.

Step 2. Determine the vector ๐še\bm{a}_e that is to enter the basis by selecting its most negative cost coefficient, the ethe^{\text{th}} (e>me > m) coefficient (break ties arbitrarily); and calculate ๐šโ€พe=๐โˆ’1๐še\bar{\bm{a}}_e = \bm{B}^{-1}\bm{a}_e.

Step 3. If \bar{\bm{a}}_e \leq \bm{0}, stop; the problem is unbounded. Otherwise, calculate the ratios \frac{\bar{a}_{i0}}{\bar{a}_{ie}} for \bar{a}_{ie} > 0 to determine the current basic column, \bm{a}_o, where o \leq m corresponds to the index of the minimum ratio, to leave the basis.

Step 4. Update ๐โˆ’1\bm{B}^{-1} (or its factorization) and the new basic feasible solution ๐šโ€พ0=๐โˆ’1๐›\bar{\bm{a}}_0 = \bm{B}^{-1}\bm{b}. Return to Step 1.
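Steps 0 to 4 assemble into the following rough revised-simplex sketch (for clarity it recomputes \bm{B}^{-1} each iteration rather than updating a factorization, and it assumes nondegeneracy and a feasible starting basis):

```python
import numpy as np

def revised_simplex(A, b, c, basis, tol=1e-9):
    """Rough sketch of Steps 0-4 for  min c^T x  s.t.  Ax = b, x >= 0,
    given a feasible starting `basis` (list of m column indices)."""
    m, n = A.shape
    basis = list(basis)
    while True:
        B_inv = np.linalg.inv(A[:, basis])          # Step 0 (naive update)
        a0_bar = B_inv @ b                          # current x_B
        y = c[basis] @ B_inv                        # simplex multipliers
        nonbasic = [j for j in range(n) if j not in basis]
        r = c[nonbasic] - y @ A[:, nonbasic]        # Step 1: reduced costs
        if (r >= -tol).all():                       # optimal
            x = np.zeros(n)
            x[basis] = a0_bar
            return x, c @ x
        e = nonbasic[int(np.argmin(r))]             # Step 2: entering column
        ae_bar = B_inv @ A[:, e]
        if (ae_bar <= tol).all():                   # Step 3: unbounded
            raise ValueError("problem is unbounded")
        ratios = np.full(m, np.inf)
        pos = ae_bar > tol
        ratios[pos] = a0_bar[pos] / ae_bar[pos]
        basis[int(np.argmin(ratios))] = e           # Step 4: pivot and repeat

# Reproduces the iterations of the illustration on the next slides:
A = np.array([[3.0, 1.0, -2.0, 1.0], [1.0, 3.0, 0.0, -1.0]])
b = np.array([2.0, 2.0])
c = np.array([18.0, 12.0, 2.0, 6.0])
x, z = revised_simplex(A, b, c, basis=[0, 2])  # start from basis (a1, a3)
# x = (1/2, 1/2, 0, 0), z = 15
```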

Remark

The basic columns in \bm{B} and the nonbasic columns in \bm{N} can be ordered arbitrarily, and the components of \bm{x_B}, \bm{x_N}, \bm{c_B}, \bm{c_N} follow the same index orders.

More precisely, let columns be permuted as ๐=[๐šฯƒ(1)๐šฯƒ(2)โ‹ฏ๐šฯƒ(m)]\bm{B} = \begin{bmatrix} \bm{a}_{\sigma(1)} & \bm{a}_{\sigma(2)} & \cdots & \bm{a}_{\sigma(m)} \end{bmatrix} and ๐=[๐šฯƒ(m+1)๐šฯƒ(m+2)โ‹ฏ๐šฯƒ(n)]\bm{N} = \begin{bmatrix} \bm{a}_{\sigma(m+1)} & \bm{a}_{\sigma(m+2)} & \cdots & \bm{a}_{\sigma(n)} \end{bmatrix}. Then when rer_e is identified as the most negative coefficient in ๐ซ๐\bm{r}_{\bm{N}} in Step 2, ๐šฯƒ(e)\bm{a}_{\sigma(e)} is the entering column. Similarly, when oo is identified as the minimum ratio index in Step 3, ๐šฯƒ(o)\bm{a}_{\sigma(o)} is the outgoing column.

Example โ€“ Primal Simplex Procedure Illustration

Suppose we start with the initial basis ๐=[๐š1๐š3]\bm{B} = \begin{bmatrix} \bm{a}_1 & \bm{a}_3 \end{bmatrix} and ๐=[๐š2๐š4]\bm{N} = \begin{bmatrix} \bm{a}_2 & \bm{a}_4 \end{bmatrix}.

Step 0. Initialization

๐=[3โˆ’210],๐โˆ’1=[01โˆ’1232],๐šโ€พ0=๐โˆ’1๐›=[22]. \begin{equation} \bm{B} = \begin{bmatrix} 3 & -2 \\ 1 & 0 \end{bmatrix}, \quad \bm{B}^{-1} = \begin{bmatrix} 0 & 1 \\ -\frac{1}{2} & \frac{3}{2} \end{bmatrix}, \quad \bar{\bm{a}}_0 = \bm{B}^{-1}\bm{b} = \begin{bmatrix} 2 \\ 2 \end{bmatrix}. \end{equation}

Step 1. Calculate

๐ฒโŠค=๐œ๐โŠค๐โˆ’1=[182][01โˆ’1232]=[โˆ’121]. \begin{equation} \bm{y}^\top = \bm{c}_{\bm{B}}^\top\bm{B}^{-1} = \begin{bmatrix} 18 & 2 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ -\frac{1}{2} & \frac{3}{2} \end{bmatrix} = \begin{bmatrix} -1 & 21 \end{bmatrix}. \end{equation}

and

๐ซ๐โŠค=๐œ๐โŠคโˆ’๐ฒโŠค๐=[126]โˆ’[โˆ’121][113โˆ’1]=[โˆ’5028]. \begin{equation} \bm{r}_{\bm{N}}^\top = \bm{c}_{\bm{N}}^\top - \bm{y}^\top\bm{N} = \begin{bmatrix} 12 & 6 \end{bmatrix} - \begin{bmatrix} -1 & 21 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 3 & -1 \end{bmatrix} = \begin{bmatrix} -50 & 28 \end{bmatrix}. \end{equation}

Step 2. Then e = 2, that is, \bm{a}_2 is the incoming column; calculate

๐šโ€พ2=๐โˆ’1๐š2=[01โˆ’1232][13]=[34]. \bar{\bm{a}}_2 = \bm{B}^{-1}\bm{a}_2 = \begin{bmatrix} 0 & 1 \\ -\frac{1}{2} & \frac{3}{2} \end{bmatrix} \begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}.

Recall the problem being solved:

\begin{align} \operatorname{minimize} \quad & 18x_1 + 12x_2 + 2x_3 + 6x_4 \\ \text{subject to} \quad & 3x_1 + x_2 - 2x_3 + x_4 = 2, \\ & x_1 + 3x_2 - x_4 = 2, \\ & x_1, x_2, x_3, x_4 \geq 0. \end{align}

Step 3. Since ๐šโ€พ2โ‰ฅ๐ŸŽ\bar{\bm{a}}_2 \geq \bm{0} the ratios are, via the component-wise divide operation

๐šโ€พ0./๐šโ€พ2=[22]./[34]=[2312]. \bar{\bm{a}}_0 ./ \bar{\bm{a}}_2 = \begin{bmatrix} 2 \\ 2 \end{bmatrix} ./ \begin{bmatrix} 3 \\ 4 \end{bmatrix} = \begin{bmatrix} \frac{2}{3} \\ \frac{1}{2} \end{bmatrix}.

The minimum ratio corresponds to column \bm{a}_3 (o = 3), which is outgoing. That is, \bm{a}_2 replaces \bm{a}_3 in the basis, which is now \bm{B} = \begin{bmatrix} \bm{a}_1 & \bm{a}_2 \end{bmatrix} and \bm{N} = \begin{bmatrix} \bm{a}_3 & \bm{a}_4 \end{bmatrix}.

Example โ€“ Primal Simplex Procedure Illustration

Step 4. Update

๐=[3113],๐โˆ’1=[38โˆ’18โˆ’1838],๐šโ€พ0=๐โˆ’1๐›=[1212]. \begin{equation} \bm{B} = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}, \quad \bm{B}^{-1} = \begin{bmatrix} \frac{3}{8} & -\frac{1}{8} \\ -\frac{1}{8} & \frac{3}{8} \end{bmatrix}, \quad \bar{\bm{a}}_0 = \bm{B}^{-1}\bm{b} = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \end{bmatrix}. \end{equation}

Return to Step 1.

Second Iteration

Step 1. Calculate

\begin{equation} \bm{y}^\top = \bm{c}_{\bm{B}}^\top\bm{B}^{-1} = \begin{bmatrix} 18 & 12 \end{bmatrix} \begin{bmatrix} \frac{3}{8} & -\frac{1}{8} \\ -\frac{1}{8} & \frac{3}{8} \end{bmatrix} = \begin{bmatrix} \frac{21}{4} & \frac{9}{4} \end{bmatrix}. \end{equation}

and

๐ซ๐โŠค=๐œ๐โŠคโˆ’๐ฒโŠค๐=[26]โˆ’[21494][โˆ’210โˆ’1]=[2523]. \begin{equation} \bm{r}_{\bm{N}}^\top = \bm{c}_{\bm{N}}^\top - \bm{y}^\top\bm{N} = \begin{bmatrix} 2 & 6 \end{bmatrix} - \begin{bmatrix} \frac{21}{4} & \frac{9}{4} \end{bmatrix} \begin{bmatrix} -2 & 1 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} \frac{25}{2} & 3 \end{bmatrix}. \end{equation}

Stop! All of the reduced costs are positive, so the current basic feasible solution is optimal: $\bm{x}^* = \left(\frac{1}{2}, \frac{1}{2}, 0, 0\right)$ with objective value $18\cdot\frac{1}{2} + 12\cdot\frac{1}{2} = 15$.
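These calculations are straightforward to reproduce numerically. The following is a minimal NumPy sketch of the revised simplex iteration used above, applied to this example; the function name `revised_simplex`, the explicit matrix inversion, and the iteration cap are illustrative simplifications, not part of the procedure's statement.

```python
import numpy as np

# A sketch of the revised primal simplex iteration. The explicit inverse is
# for clarity only; practical codes update a factorization of B instead.
def revised_simplex(A, b, c, basis, max_iters=100):
    n = A.shape[1]
    for _ in range(max_iters):
        nonbasis = [j for j in range(n) if j not in basis]
        B_inv = np.linalg.inv(A[:, basis])
        a_bar0 = B_inv @ b                        # current basic solution
        y = c[basis] @ B_inv                      # y^T = c_B^T B^{-1}
        r_N = c[nonbasis] - y @ A[:, nonbasis]    # reduced costs
        if np.all(r_N >= 0):                      # Steps 1-2: optimal
            x = np.zeros(n)
            x[basis] = a_bar0
            return x
        e = nonbasis[int(np.argmin(r_N))]         # entering: most negative r_N
        a_bar_e = B_inv @ A[:, e]
        if np.all(a_bar_e <= 0):
            raise ValueError("LP is unbounded")
        ratios = np.full(len(a_bar0), np.inf)     # Step 3: ratio test
        pos = a_bar_e > 1e-12
        ratios[pos] = a_bar0[pos] / a_bar_e[pos]
        o = int(np.argmin(ratios))                # outgoing basis position
        basis[o] = e                              # Step 4: pivot
    raise RuntimeError("iteration limit reached")

A = np.array([[3.0, 1, -2, 1], [1, 3, 0, -1]])
b = np.array([2.0, 2])
c = np.array([18.0, 12, 2, 6])
x = revised_simplex(A, b, c, basis=[0, 2])        # B = [a1 a3] (0-indexed)
print(x, c @ x)                                   # [0.5 0.5 0.  0. ] 15.0
```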

Finding an Initial Basic Feasible Solution

  • The simplex procedure needs to start from a basic feasible solution.
  • Such a BFS is sometimes immediately available:
    • if the constraints are of this form ๐€๐ฑโ‰ค๐›,๐ฑโ‰ฅ๐ŸŽ\bm{Ax} \leq \bm{b}, \quad \bm{x} \geq \bm{0}, with ๐›โ‰ฅ๐ŸŽ\bm{b} \geq \bm{0}
    • a BFS to the corresponding standard form of the problem is provided by the slack variables.
  • An initial BFS is not always apparent for other types of LPs.
  • For those, an auxiliary LP and a corresponding application of the simplex method can be used to determine the required initial solution.
  • An LP can always be expressed in the so-called Phase I form: ๐€๐ฑ=๐›โ‰ฅ๐ŸŽ,๐ฑโ‰ฅ๐ŸŽ.(28) \bm{Ax} = \bm{b} \geq \bm{0}, \quad \bm{x} \geq \bm{0}. \qquad(28)
  • In order to find a solution to Equation 28, we consider the artificial minimization problem (Phase One linear program)

\begin{align} \operatorname{minimize} & \sum_{i=1}^m u_i \\ \text{subject to} & \bm{Ax} + \bm{u} = \bm{b} \\ & \bm{x} \geq \bm{0}, \quad \bm{u} \geq \bm{0}. \end{align} \qquad(29) where $\bm{u} = (u_1, u_2, \ldots, u_m)$ is a vector of artificial variables.

  • If there is a feasible solution to Equation 28, then it is clear that Equation 29 has a minimum value of zero with ๐ฎ=๐ŸŽ\bm{u} = \bm{0}.

  • If Equation 28 has no feasible solution, then the minimum value of Equation 29 is greater than zero.

  • Equation 29 is an LP, and a BFS for it is $\bm{x} = \bm{0}$, $\bm{u} = \bm{b}$. It can readily be solved using the simplex technique.
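As a sketch of this construction, assuming SciPy is available, one can assemble the Phase I problem of Equation 29 and test whether its optimal value is zero; `scipy.optimize.linprog` is used here purely as a black-box LP solver, and the data reuse the example LP from the primal simplex illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Phase I sketch (Equation 29): append artificial variables u, minimize
# 1^T u, and test whether the optimum is zero. linprog's default bounds
# already enforce x >= 0 and u >= 0.
A = np.array([[3.0, 1, -2, 1], [1, 3, 0, -1]])
b = np.array([2.0, 2])                  # arranged so that b >= 0
m, n = A.shape

A_eq = np.hstack([A, np.eye(m)])        # constraint matrix [A  I]
c_phase1 = np.concatenate([np.zeros(n), np.ones(m)])

res = linprog(c_phase1, A_eq=A_eq, b_eq=b)
if res.fun < 1e-9:
    print("feasible; initial point:", res.x[:n])  # u = 0 at the optimum
else:
    print("the original LP is infeasible")
```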

The Dual Simplex Method

Motivations

 

  • Often there is a basis to an LP that is not feasible for the primal problem, but its simplex multiplier vector is feasible for the dual.
    • That is, $\bm{y}^\top = \bm{c}_{\bm{B}}^\top \bm{B}^{-1}$ and $\bm{r}_{\bm{N}}^\top = \bm{c}_{\bm{N}}^\top - \bm{y}^\top\bm{N} \geq \bm{0}$.
  • Then we can apply the dual simplex method, moving from the current basic solution to a new one with a better dual objective value.
  • Assume ๐\bm{B} consists of the first mm columns of AA.

maximize๐ฒโŠค๐›subject to๐ฒโŠค๐€โ‰ค๐œโŠค \begin{align} \operatorname{maximize} & \bm{y}^\top \bm{b} \\ \text{subject to} & \bm{y}^\top \bm{A} \leq \bm{c}^\top \end{align}

  โ‡” \Leftrightarrow

maximize๐ฒโŠค๐›subject to๐ฒโŠค๐โ‰ค๐œ๐โŠค,๐ฒโŠค๐โ‰ค๐œ๐โŠค. \begin{align} \operatorname{maximize} & \bm{y}^\top \bm{b} \\ \text{subject to} & \bm{y}^\top \bm{B} \leq \bm{c}_{\bm{B}}^\top, \\ & \bm{y}^\top \bm{N} \leq \bm{c}_{\bm{N}}^\top. \end{align}

  • Define a new dual variable vector ๐ฒโ€ฒ\bm{y}^\prime via an affine transformation ๐ฒโ€ฒโŠค=๐ฒโŠค๐โˆ’๐œ๐โŠค,or๐ฒโŠค=(๐ฒโ€ฒ+๐œ๐)โŠค๐โˆ’1(30) {\bm{y}^\prime}^\top = \bm{y}^\top \bm{B} - \bm{c}_{\bm{B}}^\top, \quad \text{or} \quad \bm{y}^\top = (\bm{y}^\prime + \bm{c}_{\bm{B}})^\top\bm{B}^{-1} \qquad(30)

and substitute for $\bm{y}$ in the dual via Equation 30:

maximize๐ฒโ€ฒโŠค๐โˆ’1๐›+๐œ๐โŠค๐โˆ’1๐›subject to๐ฒโ€ฒโŠคโ‰ค0,๐ฒโ€ฒโŠค๐โˆ’1๐โ‰ค๐œ๐โŠคโˆ’๐œ๐โŠค๐โˆ’1๐. \begin{align} \operatorname{maximize} & {\bm{y}^\prime}^\top \bm{B}^{-1}\bm{b} + \bm{c}_{\bm{B}}^\top \bm{B}^{-1}\bm{b} \\ \text{subject to} & {\bm{y}^\prime}^\top \leq 0, \\ & {\bm{y}^\prime}^\top\bm{B}^{-1}\bm{N} \leq \bm{c}_{\bm{N}}^\top - \bm{c}_{\bm{B}}^\top \bm{B}^{-1}\bm{N}. \end{align}

    โ‡” \Leftrightarrow

\begin{align} \operatorname{maximize} & {\bm{y}^\prime}^\top \bar{\bm{a}}_0 + z_0 \\ \text{subject to} & {\bm{y}^\prime}^\top \leq \bm{0}, \\ & {\bm{y}^\prime}^\top \bm{B}^{-1}\bm{N} \leq \bm{r}_{\bm{N}}^\top, \end{align} \qquad(31) where $z_0 = \bm{c}_{\bm{B}}^\top\bm{B}^{-1}\bm{b}$ is the dual objective value at the current basis.

Transformed Dual โ€“ Variable ๐ฒโ€ฒ\bm{y}^\prime

  • ๐ฒโ€ฒ=๐ŸŽ\bm{y}^\prime = \bm{0} is a BFS.
  • If ๐šโ€พ0โ‰ฅ๐ŸŽ\bar{\bm{a}}_0 \geq \bm{0}, i.e., the primal basic solution is also feasible, then ๐ฒโ€ฒโŠค=๐ŸŽ{\bm{y}^\prime}^\top = \bm{0} is optimal.
  • This implies ๐ฒโŠค=๐œ๐โŠค๐โˆ’1\bm{y}^\top = \bm{c}_{\bm{B}}^\top\bm{B}^{-1} is optimal to the original dual.
  • The vector ๐šโ€พ0\bar{\bm{a}}_0 can be viewed as the scaled gradient vector of the dual objective function at basis ๐\bm{B}.
    • If one entry $(\bar{\bm{a}}_0)_o < 0$, then one can decrease the variable $y_o^\prime$ to some $-\varepsilon$ while keeping the others at $0$.
    • The new ๐ฒโ€ฒ\bm{y}^\prime remains feasible, but its objective value would increase linearly in ฮต\varepsilon.
    • With ${\bm{y}^\prime}^\top = -\varepsilon\,\bm{e}_o^\top$, the second block of constraints in Equation 31 becomes $-\varepsilon\,\bm{e}_o^\top\bm{B}^{-1}\bm{N} \leq \bm{r}_{\bm{N}}^\top \quad \text{or} \quad -\varepsilon\,\bar{\bm{a}}^o \leq \bm{r}_{\bm{N}}^\top, \qquad(32)$ where $\bar{\bm{a}}^o = \bm{e}_o^\top\bm{B}^{-1}\bm{N}$ denotes the $o$th row of $\bm{B}^{-1}\bm{N}$.
  • To keep dual feasibility, we need to choose ฮต\varepsilon such that this vector constraint is satisfied componentwise.
    • If all entries in ๐šโ€พo\bar{\bm{a}}^o are nonnegative, then ฮต\varepsilon may be chosen arbitrarily large so the dual objective is unbounded.
    • If some are negative, we can increase $\varepsilon$ until one of the inequality constraints in Equation 32 becomes an equality.
    • Say the $e^{\text{th}}$ becomes an equality. This indicates that the current nonbasic column $\bm{a}_e$ replaces $\bm{a}_o$ in the new basis $\bm{B}$.
  • The determination can be done by calculating the componentwise ratios $\frac{(\bm{r}_{\bm{N}})_j}{(-\bar{\bm{a}}^o)_j}$ for $(\bar{\bm{a}}^o)_j < 0$, $j = m+1, \ldots, n$; the minimizing index identifies the incoming column $\bm{a}_e$.

Dual Simplex Method

  • In each cycle we find a new feasible dual solution such that one of the equalities becomes an inequality and one of the inequalities becomes an equality.
    • At the same time we increase the value of the dual objective function.
  • The mm equalities in the new solution then determine a new basis.

In contrast to the primal simplex method, here the outgoing column is selected first and the incoming one is chosen later.

Step 0. Given the inverse $\bm{B}^{-1}$ of a dual feasible basis $\bm{B}$, primal solution $\bar{\bm{a}}_0 = \bm{B}^{-1}\bm{b}$, dual feasible solution $\bm{y}^\top = \bm{c}_{\bm{B}}^\top\bm{B}^{-1}$, and reduced cost vector $\bm{r}_{\bm{N}}^\top = \bm{c}_{\bm{N}}^\top - \bm{y}^\top\bm{N} \geq \bm{0}$.

Step 1. If ๐šโ€พ0โ‰ฅ๐ŸŽ\bar{\bm{a}}_0 \geq \bm{0}, stop; the current solution pair is optimal. Otherwise, determine which column ๐šo\bm{a}_o is to leave the basis by selecting the most negative entry, the otho^{\text{th}} entry (break ties arbitrarily), in ๐šโ€พ0\bar{\bm{a}}_0. Now calculate ๐ฒโ€พโŠค=๐žoโŠค๐โˆ’1\bar{\bm{y}}^\top = \bm{e}_o^\top \bm{B}^{-1} and then calculate ๐šโ€พo=๐ฒโ€พโŠค๐\bar{\bm{a}}^o = \bar{\bm{y}}^\top\bm{N}.

Step 2. If $\bar{\bm{a}}^o \geq \bm{0}$, stop; the dual is unbounded and the primal is infeasible. Otherwise, calculate the ratios $\frac{(\bm{r}_{\bm{N}})_j}{(-\bar{\bm{a}}^o)_j}$ for $(\bar{\bm{a}}^o)_j < 0$ to determine the incoming column $\bm{a}_e$, where $e$ corresponds to the minimum ratio index.

Step 3. Update the basis ๐โˆ’1\bm{B}^{-1} (or its factorization), and update primal solution ๐šโ€พ0\bar{\bm{a}}_0, dual feasible solution ๐ฒ\bm{y}, and reduced cost vector ๐ซ๐\bm{r}_{\bm{N}} accordingly. Return to Step 1.
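Below is a minimal NumPy sketch of Steps 0–3, under the assumption that the starting basis is dual feasible; the name `dual_simplex` and the explicit inverse are illustrative. Applied to the example that follows, it reproduces the iterations shown there.

```python
import numpy as np

# A sketch of Steps 0-3 of the dual simplex method, with an explicit inverse
# for clarity. The starting basis must be dual feasible (r_N >= 0).
def dual_simplex(A, b, c, basis, max_iters=100):
    n = A.shape[1]
    for _ in range(max_iters):
        nonbasis = [j for j in range(n) if j not in basis]
        B_inv = np.linalg.inv(A[:, basis])
        a_bar0 = B_inv @ b                         # primal basic solution
        y = c[basis] @ B_inv                       # dual feasible solution
        r_N = c[nonbasis] - y @ A[:, nonbasis]     # reduced costs (>= 0)
        if np.all(a_bar0 >= 0):                    # Step 1: optimal pair
            x = np.zeros(n)
            x[basis] = a_bar0
            return x, y
        o = int(np.argmin(a_bar0))                 # outgoing: most negative
        row_o = B_inv[o, :] @ A[:, nonbasis]       # o-th row of B^{-1} N
        if np.all(row_o >= 0):                     # Step 2: dual unbounded
            raise ValueError("primal is infeasible")
        ratios = np.full(len(nonbasis), np.inf)
        neg = row_o < 0
        ratios[neg] = r_N[neg] / (-row_o[neg])
        e = nonbasis[int(np.argmin(ratios))]       # incoming column
        basis[o] = e                               # Step 3: pivot
    raise RuntimeError("iteration limit reached")

A = np.array([[3.0, 1, -2, 1], [1, 3, 0, -1]])
b = np.array([2.0, 2])
c = np.array([18.0, 12, 2, 6])
x, y = dual_simplex(A, b, c, basis=[1, 2])         # B = [a2 a3] (0-indexed)
print(x, y)                                        # [0.5 0.5 0. 0.] [5.25 2.25]
```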

Example โ€“ Dual Simplex Procedure Illustration

  • We start with the initial basis ๐=[๐š2๐š3]\bm{B} = \begin{bmatrix} \bm{a}_2 & \bm{a}_3 \end{bmatrix} and ๐=[๐š1๐š4]\bm{N} = \begin{bmatrix} \bm{a}_1 & \bm{a}_4 \end{bmatrix}.

Step 0. Initialization

๐=[1โˆ’230],๐โˆ’1=[013โˆ’1216],๐šโ€พ0=[013โˆ’1216][22]=[23โˆ’23],๐ฒโŠค=[122][013โˆ’1216]=[โˆ’1133], \begin{equation} \bm{B} = \begin{bmatrix} 1 & -2 \\ 3 & 0 \end{bmatrix}, \quad \bm{B}^{-1} = \begin{bmatrix} 0 & \frac{1}{3} \\ -\frac{1}{2} & \frac{1}{6} \end{bmatrix}, \quad \bar{\bm{a}}_0 = \begin{bmatrix} 0 & \frac{1}{3} \\ -\frac{1}{2} & \frac{1}{6} \end{bmatrix}\begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} \frac{2}{3} \\ -\frac{2}{3} \end{bmatrix}, \quad \bm{y}^\top = \begin{bmatrix} 12& 2 \end{bmatrix} \begin{bmatrix} 0 & \frac{1}{3} \\ -\frac{1}{2} & \frac{1}{6} \end{bmatrix} = \begin{bmatrix} -1 & \frac{13}{3} \end{bmatrix}, \end{equation}

and

๐ซ๐โŠค=[186]โˆ’[โˆ’1133][311โˆ’1]=[503343]. \bm{r}_{\bm{N}}^\top = \begin{bmatrix} 18 & 6 \end{bmatrix} - \begin{bmatrix} -1 & \frac{13}{3} \end{bmatrix}\begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} \frac{50}{3} & \frac{34}{3} \end{bmatrix}.

Step 1. We see that only the second component of $\bar{\bm{a}}_0$ is negative, so that $o=2$ (which corresponds to column $\bm{a}_3$). Now, we compute

๐ฒโ€พโŠค=๐ž2โŠค๐โˆ’1=[01][013โˆ’1216]=[โˆ’1216],๐šโ€พ2=๐ฒโ€พโŠค๐=[โˆ’1216][311โˆ’1]=[โˆ’43โˆ’23]. \bar{\bm{y}}^\top = \bm{e}_2^\top \bm{B}^{-1} = \begin{bmatrix} 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & \frac{1}{3} \\ -\frac{1}{2} & \frac{1}{6} \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} & \frac{1}{6} \end{bmatrix}, \quad \bar{\bm{a}}^2 = \bar{\bm{y}}^\top\bm{N} = \begin{bmatrix} -\frac{1}{2} & \frac{1}{6} \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} -\frac{4}{3} & -\frac{2}{3} \end{bmatrix}.

Example โ€“ Dual Simplex Procedure Illustration

Step 2. Since all components in ๐šโ€พo\bar{\bm{a}}^o are negative, the componentwise ratios are

๐ซ๐./(โˆ’๐šโ€พ2)=[503343]./[4323]=[25217] \bm{r}_{\bm{N}} ./ (-\bar{\bm{a}}^2) = \begin{bmatrix} \frac{50}{3} & \frac{34}{3} \end{bmatrix} ./ \begin{bmatrix} \frac{4}{3} & \frac{2}{3} \end{bmatrix} = \begin{bmatrix} \frac{25}{2} & 17 \end{bmatrix}

Here, we see the minimum ratio is the first component, so that $e=1$ (which corresponds to column $\bm{a}_1$); that is, $\bm{a}_1$ replaces $\bm{a}_3$ in the current basis.

Step 3. The new basis is ๐=[๐š2๐š1]\bm{B} = \begin{bmatrix} \bm{a}_2 & \bm{a}_1 \end{bmatrix}.

๐=[1331],๐โˆ’1=[โˆ’183838โˆ’18],๐šโ€พ0=[โˆ’183838โˆ’18][22]=[1212],๐ฒโŠค=[1218][โˆ’183838โˆ’18]=[21494], \begin{equation} \bm{B} = \begin{bmatrix} 1 & 3 \\ 3 & 1 \end{bmatrix}, \quad \bm{B}^{-1} = \begin{bmatrix} -\frac{1}{8} & \frac{3}{8} \\ \frac{3}{8} & -\frac{1}{8} \end{bmatrix}, \quad \bar{\bm{a}}_0 = \begin{bmatrix} -\frac{1}{8} & \frac{3}{8} \\ \frac{3}{8} & -\frac{1}{8} \end{bmatrix}\begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \end{bmatrix}, \quad \bm{y}^\top = \begin{bmatrix} 12 & 18 \end{bmatrix} \begin{bmatrix} -\frac{1}{8} & \frac{3}{8} \\ \frac{3}{8} & -\frac{1}{8} \end{bmatrix} = \begin{bmatrix} \frac{21}{4} & \frac{9}{4} \end{bmatrix}, \end{equation}

and

๐ซ๐โŠค=[26]โˆ’[21494][โˆ’210โˆ’1]=[2523]. \bm{r}_{\bm{N}}^\top = \begin{bmatrix} 2 & 6 \end{bmatrix} - \begin{bmatrix} \frac{21}{4} & \frac{9}{4} \end{bmatrix}\begin{bmatrix} -2 & 1 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} \frac{25}{2} & 3 \end{bmatrix}.

Stop! Since $\bar{\bm{a}}_0 \geq \bm{0}$ and $\bm{r}_{\bm{N}} \geq \bm{0}$, the solution pair is optimal!
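As an informal check (not part of the procedure), a few NumPy lines reproduce the final quantities above:

```python
import numpy as np

# Numeric check of the final tableau quantities (B = [a2 a1], 0-indexed).
A = np.array([[3.0, 1, -2, 1], [1, 3, 0, -1]])
b = np.array([2.0, 2])
c = np.array([18.0, 12, 2, 6])
B_inv = np.linalg.inv(A[:, [1, 0]])     # basis columns a2, a1
print(B_inv @ b)                        # a_bar_0 = [0.5 0.5]
y = c[[1, 0]] @ B_inv
print(y)                                # y = [5.25 2.25]
print(c[[2, 3]] - y @ A[:, [2, 3]])     # r_N = [12.5  3. ]
```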

The Primal-Dual Algorithm

  • We can work simultaneously on the primal and dual problems to solve LP problems.
  • The procedure begins with a feasible solution to the dual that is improved at each step by optimizing an associated restricted primal problem.
  • As the method progresses, it can be regarded as striving to achieve the complementary slackness conditions for optimality.

minimize๐œโŠค๐ฑsubject to๐€๐ฑ=๐›,๐ฑโ‰ฅ๐ŸŽ \begin{align} \operatorname{minimize} & \bm{c}^\top\bm{x} \\ \text{subject to} & \bm{Ax} = \bm{b}, \quad \bm{x} \geq \bm{0} \end{align}

  and

maximize๐ฒโŠค๐›subject to๐ฒโŠค๐€โ‰ค๐œโŠค.(33) \begin{align} \operatorname{maximize} & \bm{y}^\top\bm{b} \\ \text{subject to} & \bm{y}^\top\bm{A} \leq \bm{c}^\top. \end{align} \qquad(33)

  • Given a feasible solution ๐ฒ\bm{y}, not necessarily basic, to the dual, define the subset ๐’ซ\mathcal{P} of indices {1,2,โ€ฆ,n}\{1, 2, \ldots, n\} by jโˆˆ๐’ซj \in \mathcal{P} if ๐ฒโŠค๐šj=cj\bm{y}^\top \bm{a}_j = c_j.
    • Since ๐ฒ\bm{y} is dual feasible, we have โˆ€jโˆ‰๐’ซ,๐ฒโŠค๐šj<cj\forall j \notin \mathcal{P}, \;\; \bm{y}^\top \bm{a}_j < c_j.
  • Corresponding to ๐ฒ\bm{y} and the index set ๐’ซ\mathcal{P}, we define the associated restricted primal problem

minimize๐ŸโŠค๐ฎ,๐ŸโŠค=[11โ‹ฏ1]subject to๐€๐ฑ+๐ฎ=๐›๐ฑโ‰ฅ๐ŸŽxj=0forjโˆ‰๐’ซ๐ฎโ‰ฅ๐ŸŽ(34) \begin{align} \operatorname{minimize} & \bm{1}^\top \bm{u}, & \bm{1}^\top = \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix} \\ \text{subject to} & \bm{Ax} + \bm{u} = \bm{b} & \\ & \bm{x} \geq \bm{0} & x_j = 0 \quad \text{for} \quad j \notin \mathcal{P} \\ & \bm{u} \geq \bm{0} & \end{align} \qquad(34)

The Primal-Dual Algorithm

  • The dual of this associated restricted primal is called the associated restricted dual with dual variable vector ๐ฒโ€ฒ\bm{y}^\prime.

maximize(๐ฒโ€ฒ)โŠค๐›subject to(๐ฒโ€ฒ)โŠค๐šjโ‰ค0jโˆˆ๐’ซ(๐ฒโ€ฒ)โ‰ค๐Ÿ.(35) \begin{align} \operatorname{maximize} & (\bm{y}^\prime)^\top \bm{b} & \\ \text{subject to} & (\bm{y}^\prime)^\top \bm{a}_j \leq 0 & j \in \mathcal{P} \\ & (\bm{y}^\prime) \leq \bm{1}. \end{align} \qquad(35)

Primal-Dual Optimality Theorem

Suppose that $\bm{y}$ is feasible for the original dual and that $\bm{x}$ and $\bm{u} = \bm{0}$ is feasible (and of course optimal) for the associated restricted primal. Then $\bm{x}$ and $\bm{y}$ are optimal for the original primal and dual programs, respectively.

Proof

Clearly $\bm{x}$ is feasible for the primal. Also, we have $\bm{c}^\top\bm{x} = \bm{y}^\top\bm{Ax}$ because $\bm{y}^\top\bm{A}$ is identical to $\bm{c}^\top$ on the components corresponding to the nonzero elements of $\bm{x}$. Thus $\bm{c}^\top\bm{x} = \bm{y}^\top\bm{Ax} = \bm{y}^\top\bm{b}$, and optimality follows from weak duality (a feasible primal–dual pair with equal objective values must be optimal) or, equivalently, from complementary slackness.

The Primal-Dual Algorithm

Step 1. Given a feasible solution ๐ฒ0\bm{y}^0 to the dual program Equation 33, determine the associated restricted primal according to Equation 34.

Step 2. Optimize the associated restricted primal. If the minimal value of this problem is zero, the corresponding solution together with $\bm{y}^0$ forms an optimal pair for the original LP Equation 33.

Step 3. If the minimal value of the associated restricted primal is strictly positive, the maximal objective value of the associated restricted dual Equation 35 is also positive by the strong duality theorem, that is, its optimal solution satisfies ${\bm{y}^\prime}^\top \bm{b} > 0$. If there is no $j \notin \mathcal{P}$ for which ${\bm{y}^\prime}^\top \bm{a}_j > 0$, conclude from Farkas's lemma that the primal has no feasible solutions.

Step 4. If, on the other hand, ${\bm{y}^\prime}^\top \bm{a}_j > 0$ for at least one $j \notin \mathcal{P}$, define the new dual feasible vector $\bm{y}(\varepsilon) = \bm{y}^0 + \varepsilon \bm{y}^\prime,$ where $\varepsilon$, referred to as the stepsize, is chosen as large as possible until one of the constraints with $j \notin \mathcal{P}$ becomes an equality: $\bm{y}(\varepsilon)^\top \bm{a}_j = c_j, \quad j \notin \mathcal{P}.$

If ฮต\varepsilon can be increased to โˆž\infty, then the original dual is unbounded. Otherwise ฮต>0\varepsilon > 0 and we go back to Step 1 using this new dual feasible solution ๐ฒ(ฮต)\bm{y}(\varepsilon) whose dual objective is strictly increased. ๐ฒ(ฮต)โŠค๐›=(๐ฒ0)โŠค๐›+ฮต๐ฒโ€ฒโŠค๐›>(๐ฒ0)โŠค๐›. \bm{y}(\varepsilon)^\top \bm{b} = (\bm{y}^0)^\top \bm{b} + \varepsilon {\bm{y}^\prime}^\top \bm{b} > (\bm{y}^0)^\top \bm{b}.
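To summarize the four steps, here is a sketch of the full primal-dual loop, assuming `scipy.optimize.linprog` as a black-box solver for the associated restricted primal (Equation 34) and the associated restricted dual (Equation 35); the function name `primal_dual`, the tolerances, and the iteration cap are illustrative choices. Note that $\bm{y} = \bm{0}$ is dual feasible for the example LP since $\bm{c} \geq \bm{0}$.

```python
import numpy as np
from scipy.optimize import linprog

# A sketch of the primal-dual algorithm (Steps 1-4). linprog serves as a
# black-box solver for the restricted primal (Eq. 34) and restricted dual
# (Eq. 35); names and tolerances here are illustrative.
def primal_dual(A, b, c, y, tol=1e-9, max_iters=100):
    m, n = A.shape
    for _ in range(max_iters):
        slack = c - y @ A                   # c_j - y^T a_j >= 0 by dual feasibility
        P = np.where(slack <= tol)[0]       # index set of tight dual constraints
        # Restricted primal: min 1^T u s.t. A_P x_P + u = b, x_P >= 0, u >= 0.
        rp = linprog(np.concatenate([np.zeros(len(P)), np.ones(m)]),
                     A_eq=np.hstack([A[:, P], np.eye(m)]), b_eq=b)
        if rp.fun < tol:                    # Step 2: u = 0, optimal pair found
            x = np.zeros(n)
            x[P] = rp.x[:len(P)]
            return x, y
        # Restricted dual: max y'^T b s.t. y'^T a_j <= 0 (j in P), y' <= 1.
        rd = linprog(-b,
                     A_ub=A[:, P].T if len(P) else None,
                     b_ub=np.zeros(len(P)) if len(P) else None,
                     bounds=[(None, 1)] * m)
        y_prime = rd.x
        rates = y_prime @ A                 # y'^T a_j for every column j
        mask = (slack > tol) & (rates > tol)
        if not mask.any():                  # Step 3: Farkas certificate
            raise ValueError("primal has no feasible solution")
        eps = np.min(slack[mask] / rates[mask])  # Step 4: largest stepsize
        y = y + eps * y_prime
    raise RuntimeError("iteration limit reached")

A = np.array([[3.0, 1, -2, 1], [1, 3, 0, -1]])
b = np.array([2.0, 2])
c = np.array([18.0, 12, 2, 6])
x, y = primal_dual(A, b, c, y=np.zeros(2))  # y = 0 is dual feasible here
print(x, y)                                 # [0.5 0.5 0. 0.] [5.25 2.25]
```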