ME/AER 647 Systems Optimization I


Trust Region Methods


Instructor: Hasan A. Poonawala

Mechanical and Aerospace Engineering
University of Kentucky, Lexington, KY, USA

Topics:
What they are
Quadratic Models and Regions

General Framework

  • Approximate ff near 𝐱k\mathbf{x}_k using a model function mkm_k
  • Minimize mkm_k near 𝐱k\mathbf{x}_k by solving minpmk(𝐱k+p)s.t.𝐱k+pTrust regionOriginal Constraints \begin{align} \min_{p} & m_k\left(\mathbf{x}_k+p\right)\\ \text{s.t.} & \mathbf{x}_k+p \in \text{Trust region}\\ & \text{Original Constraints} \end{align}

Quadratic Case

The optimization problem becomes1: minpmk(𝐱k+p)=fk+pTfk+12pTFkps.t.p2Δk \begin{align} \min_{p} & m_k\left(\mathbf{x}_k+p\right) = f_k + p^T \nabla f_k + \frac{1}{2}p^T F_k p\\ \text{s.t.} & \| p\|_2 \leq \Delta_k \end{align}

We’ve seen how to work through the KKT conditions to find a solution:

  • If fkTFk1fkΔk2\nabla f_k ^T F_k^{-1} \nabla f_k \leq \Delta_k^2, then pk=Fk1fkp_k = -F_k^{-1} \nabla f_k, assuming FkF_k is PSD
  • Otherwise, need to find μ>0\mu >0 such that fkT(Fk+μI)1fk=Δk2\nabla f_k ^T ( F_k + \mu I)^{-1} \nabla f_k = \Delta_k^2 and Fk+μIF_k +\mu I is PSD. If found, then pk=(Fk+μI)1fkp_k = -(F_k+\mu I)^{-1} \nabla f_k

Trust Region Size

  • How big should Δk\Delta_k be?

  • Define ρk=f(xk)f(xk+pk)f(xk)m(pk)\rho_k = \frac{f(x_k) -f(x_k + p_k)}{f(x_k) - m(p_k)}

    • ρk<0\rho_k <0 implies objective increased, reject step
    • ρk1\rho_k \approx 1 means model mkm_k was good over trust region
    • ρk1\rho_k \ll 1 implies model was not good over trust region
  • Use ρk\rho_k to modulate Δk\Delta_k

Trust Region Algorithm

Given Δ̂>0\hat \Delta >0, Δ0(0,Δ̂)\Delta_0 \in (0,\hat \Delta), and η[0,14)\eta \in [0,\frac{1}{4}), evaluate the following loop:

  1. Solve for pkp_k
  2. Evaluate ρk\rho_k
    • If ρk<14\rho_k < \frac{1}{4}: \quad Δk+1=14Δk\Delta_{k+1} = \frac{1}{4} \Delta_k
    • Else
      • If ρk>34\rho_k > \frac{3}{4} and pk=Δk\|p_k\| = \Delta_k: \quad Δk+1=min(2Δk,Δ̂)\Delta_{k+1} = \min(2 \Delta_k,\hat \Delta)
      • Else: \quad Δk+1=Δk\Delta_{k+1} = \Delta_k
  3. Update xk+1x_{k+1}:
    • If ρk>η\rho_k > \eta: \quad xk+1=xk+pkx_{k+1} = x_k + p_k
    • Else : \quad xk+1=xkx_{k+1} = x_k

Trust Region Method Approximations

  • For large problems, prefer to approximately solve for pkp_k
    • No point getting an expensive exact solution to an approximate problem
  • Two ideas
    • Cauchy Point: Line search along gradient.
      • Basically steepest descent with a max step size
    • Dogleg method
      • Best when FkF_k is positive definite
      • Unconstrained minimum of mkm_k is pkBp_k^B (full step), minimum of line search along fk\nabla f_k is pUp^U
      • On a two-line-segment path from xkpUpkBx_{k} \to p^U \to p_k^B, mkm_k decreases while the point gets further away from xkx_k (Lemma 4.2 in NW)
      • Find the point on this path where pk=Δk\|p_k\| = \|\Delta_k\|