Poblano Toolbox v1.1

Truncated Newton Optimization

Truncated Newton (TN) methods for minimization are Newton methods in which the Newton direction is only approximated at each iteration (thus reducing computation). Furthermore, the Poblano implementation of the truncated Newton method does not require an explicit Hessian matrix in the computation of the approximate Newton direction (thus reducing storage requirements).

The Poblano function for the truncated Newton method is called tn.

Introduction

At a high level, each iteration of the TN method in Poblano proceeds as follows [1]: a stopping tolerance for the linear conjugate gradient (CG) solver is computed (Step 5), the Newton system is solved approximately using a CG method truncated at that tolerance (Step 6), and a line search along the resulting direction is performed to update the iterate.
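This outer/inner iteration can be sketched in Python. This is an illustrative translation, not Poblano's MATLAB implementation: the simple Armijo backtracking line search below stands in for Poblano's More-Thuente line search, and the sketch assumes the 'quadratic' forcing rule and a cap of 5 inner CG iterations, matching tn's defaults.

```python
import numpy as np

def truncated_newton(f, grad, hessvec, x, stop_tol=1e-5, max_iters=100, cg_iters=5):
    """Minimize f by a truncated Newton method: the Newton system
    H(x) p = -g is solved only approximately with a few CG iterations."""
    for _ in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) / x.size < stop_tol:   # Poblano-style ||G||/N test
            break
        eta = min(0.5, np.linalg.norm(g))           # 'quadratic' CG tolerance
        # Inner linear CG on H p = -g, truncated at cg_iters or tolerance eta.
        p, r = np.zeros_like(x), -g
        d = r.copy()
        for _ in range(cg_iters):
            Hd = hessvec(x, d)
            curv = d @ Hd
            if curv <= 0:            # negative curvature: stop the CG solve;
                if not p.any():      # fall back to steepest descent if needed
                    p = -g
                break
            alpha = (r @ r) / curv
            p = p + alpha * d
            r_new = r - alpha * Hd
            if np.linalg.norm(r_new) / np.linalg.norm(g) < eta:
                break
            d = r_new + ((r_new @ r_new) / (r @ r)) * d
            r = r_new
        # Armijo backtracking line search (Poblano uses More-Thuente instead).
        step, fx = 1.0, f(x)
        while f(x + step * p) > fx + 1e-4 * step * (g @ p) and step > 1e-12:
            step *= 0.5
        x = x + step * p
    return x

# Sanity check on a strictly convex quadratic: the minimizer solves A x = b.
A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 0.0], [0.0, 0.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x = truncated_newton(lambda x: 0.5 * x @ A @ x - b @ x,
                     lambda x: A @ x - b,
                     lambda x, v: A @ v,
                     np.zeros(3))
```

Here an analytic Hessian-vector product is supplied for clarity; Poblano instead approximates it with finite differences of gradients, as described in the notes below.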

Notes

In Step 5, the linear conjugate gradient (CG) method stopping tolerance is allowed to change at each iteration. The input parameter CGTolType determines how the CG stopping tolerance is computed.

In Step 6, one of Matlab's CG methods (symmlq or pcg, chosen by the CGSolver input parameter) is used to solve for the approximate Newton direction. The Hessian-vector products required by the CG method are approximated using finite differences of gradient values, so an explicit Hessian matrix is never formed; the difference step is controlled by the HessVecFDStep input parameter.
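The finite-difference Hessian-vector product needed by the inner CG solve uses only two gradient evaluations, which is what makes an explicit Hessian unnecessary. A minimal numpy sketch of the forward-difference approximation H(x)v = (g(x + h*v) - g(x)) / h, including the iterate-based step h = 1e-8*(1+||x||) used when HessVecFDStep is 0:

```python
import numpy as np

def hessvec_fd(grad, x, v, h=0.0):
    """Approximate the Hessian-vector product H(x) @ v with a forward
    difference of gradients; no explicit Hessian matrix is formed."""
    if h == 0.0:                      # iterate-based step (HessVecFDStep = 0)
        h = 1e-8 * (1.0 + np.linalg.norm(x))
    return (grad(x + h * v) - grad(x)) / h

# Check against the exact Hessian of f(x) = 0.5 * x'Ax for a small SPD A,
# whose gradient is A @ x and whose Hessian is A itself.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
grad = lambda x: A @ x
x = np.array([0.3, -0.7])
v = np.array([1.0, 2.0])
approx = hessvec_fd(grad, x, v)
exact = A @ v
```

For this quadratic the gradient is linear, so the forward difference is exact up to rounding error.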


Method Specific Input Parameters

The input parameters specific to the tn method are presented below. See the Optimization Input Parameters documentation for more details on the Poblano parameters shared across all methods.

CGSolver         Matlab CG method to use {'symmlq'}
  'symmlq'       symmlq (designed for symmetric indefinite systems)
  'pcg'          pcg (designed for symmetric positive definite systems)
CGIters          Maximum number of conjugate gradient iterations allowed {5}
CGTolType        CG stopping tolerance type used {'quadratic'}
  'quadratic'    ||R|| / ||G|| <  min(0.5,||G||)
  'superlinear'  ||R|| / ||G|| <  min(0.5,sqrt(||G||))
  'fixed'        ||R|| < CGTol
                 where R is the residual and G is the gradient of FUN at X
CGTol            CG stopping tolerance when CGTolType is 'fixed' {1e-6}
HessVecFDStep    Hessian vector product finite difference step {1e-10}
  0              Use iterate-based step: 1e-8*(1+||X||)
  >0             Fixed value to use as the difference step
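The three CGTolType rules in the table can be written out directly. The sketch below returns the tolerance as a factor relative to ||G||, the gradient norm at the current iterate (for 'fixed', the absolute test ||R|| < CGTol is converted to an equivalent relative factor):

```python
import numpy as np

def cg_stop_tol(g_norm, tol_type='quadratic', cg_tol=1e-6):
    """Relative CG stopping tolerance eta for one TN iteration: the inner
    CG solve stops once ||R|| <= eta * ||G||."""
    if tol_type == 'quadratic':       # ||R||/||G|| < min(0.5, ||G||)
        return min(0.5, g_norm)
    if tol_type == 'superlinear':     # ||R||/||G|| < min(0.5, sqrt(||G||))
        return min(0.5, np.sqrt(g_norm))
    if tol_type == 'fixed':           # ||R|| < CGTol, as a relative factor
        return cg_tol / g_norm
    raise ValueError(tol_type)
```

As the gradient norm shrinks, 'quadratic' tightens the tolerance proportionally to ||G|| (yielding quadratic local convergence in theory), while 'superlinear' tightens it like sqrt(||G||).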


Default Input Parameters

The default input parameters are returned with the following call to tn:

params = tn('defaults')
params = 

                   CGIters: 5
                  CGSolver: 'symmlq'
                     CGTol: 1e-06
                 CGTolType: 'quadratic'
                   Display: 'iter'
              DisplayIters: 1
             HessVecFDStep: 1e-10
           LineSearch_ftol: 0.0001
           LineSearch_gtol: 0.01
    LineSearch_initialstep: 1
         LineSearch_maxfev: 20
         LineSearch_method: 'more-thuente'
         LineSearch_stpmax: 1e+15
         LineSearch_stpmin: 1e-15
           LineSearch_xtol: 1e-15
              MaxFuncEvals: 10000
                  MaxIters: 1000
                RelFuncTol: 1e-06
                   StopTol: 1e-05
                 TraceFunc: 0
            TraceFuncEvals: 0
                 TraceGrad: 0
             TraceGradNorm: 0
              TraceRelFunc: 0
                    TraceX: 0

See the Optimization Input Parameters documentation for more details on the Poblano parameters shared across all methods.


Examples

Below are the results of using the tn method in Poblano to solve example problems solved using the ncg method in the Nonlinear Conjugate Gradient Optimization and lbfgs method in the Limited-Memory BFGS Optimization documentation.

Example 1 (from Poblano Examples)

In this example, we have n = 10 and a = 3, and use a random starting point.

randn('state',0);
x0 = randn(10,1)
out = tn(@(x) example1(x,3), x0)
x0 =

     -0.43256
      -1.6656
      0.12533
      0.28768
      -1.1465
       1.1909
       1.1892
    -0.037633
      0.32729
      0.17464

 Iter  FuncEvals       F(X)          ||G(X)||/N        
------ --------- ---------------- ----------------
     0         1       1.80545257       0.73811114
tn: line search warning = 0
     1        10      -4.10636797       0.54564169
     2        18      -4.21263331       0.41893997
tn: line search warning = 0
     3        28      -7.18352472       0.36547546
     4        34      -8.07095085       0.11618518
tn: line search warning = 0
     5        41      -9.87251163       0.15057476
     6        46      -9.99999862       0.00049753
     7        50     -10.00000000       0.00000000

out = 

             Params: [1x1 inputParser]
           ExitFlag: 0
    ExitDescription: 'Successful termination based on StopTol'
                  X: [10x1 double]
                  F: -10
                  G: [10x1 double]
          FuncEvals: 50
              Iters: 7

Note that in this example the line search in the tn method displays a warning during iterations 1, 3, and 5, indicating that the norm of the search direction is nearly 0. In those iterations, the steepest descent direction is used instead.
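Assuming example1 implements f(x) = sum_i sin(a*x_i) (an assumption about the Poblano example function, consistent with the minimum value of -10 reached above for n = 10), the optimal objective value can be verified directly: every local minimizer has sin(a*x_i) = -1 in each coordinate, so every local minimum is global with value -n.

```python
import numpy as np

n, a = 10, 3
f = lambda x: np.sum(np.sin(a * x))    # assumed form of example1
g = lambda x: a * np.cos(a * x)        # its gradient

# One global minimizer: a*x_i = -pi/2, i.e. x_i = -pi/6 for all i.
x_star = np.full(n, -np.pi / (2 * a))
print(f(x_star))                   # -10: each coordinate contributes -1
print(np.linalg.norm(g(x_star)))   # ~0: a stationary point
```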

Example 2

In this example, we compute a rank-4 approximation to a Pascal matrix (generated using the Matlab function pascal(4)). The starting point is a random vector. Note that, in the interest of space, Poblano is set to display only the final iteration in this example.

m = 4; n = 4; k = 4;
Data.rank = k;
Data.A = pascal(m);
randn('state',0);
x0 = randn((m+n)*k,1);
out = tn(@(x) example2(x,Data), x0, 'Display', 'final')
 Iter  FuncEvals       F(X)          ||G(X)||/N        
------ --------- ---------------- ----------------
    21       155       0.00000001       0.00000606

out = 

             Params: [1x1 inputParser]
           ExitFlag: 0
    ExitDescription: 'Successful termination based on StopTol'
                  X: [32x1 double]
                  F: 9.4184e-09
                  G: [32x1 double]
          FuncEvals: 155
              Iters: 21

As for the ncg and lbfgs methods, out.ExitFlag indicates whether the method converged to the specified tolerance (i.e., the default StopTol input parameter value of 1e-5). Here out.ExitFlag = 0, so tn converged within the default limits on iterations and function evaluations; rerunning with those limits passed explicitly produces the same result.

out = tn(@(x) example2(x,Data), x0, 'MaxIters',1000, ...
    'MaxFuncEvals',10000,'Display','final')
 Iter  FuncEvals       F(X)          ||G(X)||/N        
------ --------- ---------------- ----------------
    21       155       0.00000001       0.00000606

out = 

             Params: [1x1 inputParser]
           ExitFlag: 0
    ExitDescription: 'Successful termination based on StopTol'
                  X: [32x1 double]
                  F: 9.4184e-09
                  G: [32x1 double]
          FuncEvals: 155
              Iters: 21

Verifying the solution, we see that the computed decomposition fits the matrix with very small relative error (given the stopping tolerance of 1e-5 used by the optimizer).

[U,V] = example2_extract(m,n,k,out.X);
norm(Data.A-U*V')/norm(Data.A)
ans =

   4.4426e-06
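The near-zero objective value is expected: pascal(4) is a 4 x 4 matrix of full rank 4, so with k = 4 an exact factorization A = U*V' exists. A Python sketch (assuming example2 minimizes the factorization residual, as the Poblano example problem is described) confirms via the SVD that the best rank-4 approximation has essentially zero error:

```python
import numpy as np
from math import comb

m = k = 4
# Matlab's pascal(4): the symmetric Pascal matrix with entries C(i+j, i).
A = np.array([[comb(i + j, i) for j in range(m)] for i in range(m)], dtype=float)

U, s, Vt = np.linalg.svd(A)
Uk = U[:, :k] * s[:k]     # left factor, scaled by the singular values
Vk = Vt[:k].T             # right factor
rel_err = np.linalg.norm(A - Uk @ Vk.T) / np.linalg.norm(A)
print(rel_err)            # ~0: an exact rank-4 fit exists
```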

Again, in Example 2, we see that tn exhibits different behavior from that of the ncg and lbfgs methods. Thus, it is recommended to perform several initial test runs on smaller problems with the different methods, to help decide which method and set of parameters work best for a particular class of problems.


References

[1] Dembo, R.S. and Steihaug, T. (1983). Truncated-Newton Algorithms for Large-Scale Unconstrained Minimization. Mathematical Programming, 26, 190-212.
