edu.stanford.nlp.optimization
Class QNMinimizer
java.lang.Object
edu.stanford.nlp.optimization.QNMinimizer
- All Implemented Interfaces:
- HasEvaluators, Minimizer<DiffFunction>
public class QNMinimizer
- extends Object
- implements Minimizer<DiffFunction>, HasEvaluators
An implementation of L-BFGS for Quasi-Newton unconstrained minimization.
The general outline of the algorithm is taken from:
Numerical Optimization (second edition) 2006
Jorge Nocedal and Stephen J. Wright
A variety of different options are available.
LINESEARCHES
BACKTRACKING: This routine
simply starts with a step-size guess of 1. If the step does not produce a
sufficient decrease in the function value, the step is updated through
step = 0.1*step. This method is certainly simpler, but it does not allow
the step size to increase and is not well suited to Quasi-Newton methods.
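The backtracking rule described above can be sketched as follows. This is a minimal illustration, not QNMinimizer's actual internals; the method and parameter names (backtrack, c1) are illustrative, and the sufficient-decrease test shown is a standard Armijo condition.

```java
import java.util.function.DoubleUnaryOperator;

public class Backtrack {
    /**
     * Start with step = 1; shrink by a factor of 10 until the trial point
     * x + step*dir gives a sufficient decrease in f (Armijo condition).
     * grad is f'(x); dir is the search direction; c1 is a small constant.
     */
    static double backtrack(DoubleUnaryOperator f, double x, double dir,
                            double grad, double c1) {
        double step = 1.0;                       // initial guess of 1
        double fx = f.applyAsDouble(x);
        // shrink until f(x + step*dir) <= f(x) + c1 * step * f'(x) * dir
        while (f.applyAsDouble(x + step * dir) > fx + c1 * step * grad * dir) {
            step *= 0.1;                         // step = 0.1*step
        }
        return step;
    }

    public static void main(String[] args) {
        DoubleUnaryOperator f = v -> v * v;      // f(x) = x^2, f'(x) = 2x
        double x = 1.0, grad = 2.0, dir = -grad; // steepest-descent direction
        System.out.println(backtrack(f, x, dir, grad, 1e-4)); // prints 0.1
    }
}
```

Note that the step only ever shrinks, which is exactly the limitation the text points out: the routine cannot grow a step that started too small.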
MINPACK: This routine is based on the implementation used in MINPACK.
This routine finds a point satisfying the Wolfe conditions, which state that
a point must have a sufficiently smaller function value and a gradient of
smaller magnitude. These conditions are enough to theoretically prove
quadratic convergence. To find such a point, the linesearch first finds an
interval that must contain a satisfying point, and then progressively
reduces that interval using cubic or quadratic interpolation.
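The two Wolfe tests described above (sufficiently smaller function value, gradient of smaller magnitude) can be written out as a small predicate. This is a sketch with illustrative names and constants (c1, c2), not the class's actual MINPACK code; the curvature test shown is the strong Wolfe form.

```java
public class Wolfe {
    /**
     * @param f0    f(x) at the current point
     * @param g0    directional derivative g(x)^T d at the current point
     * @param fStep f(x + step*d) at the trial point
     * @param gStep directional derivative at the trial point
     * @param c1    sufficient-decrease constant (e.g. 1e-4)
     * @param c2    curvature constant (e.g. 0.9)
     */
    static boolean satisfiesWolfe(double f0, double g0, double fStep,
                                  double gStep, double step,
                                  double c1, double c2) {
        // sufficiently smaller function value
        boolean sufficientDecrease = fStep <= f0 + c1 * step * g0;
        // gradient of sufficiently smaller magnitude (strong Wolfe)
        boolean curvature = Math.abs(gStep) <= c2 * Math.abs(g0);
        return sufficientDecrease && curvature;
    }
}
```

For f(x) = x^2 at x = 1 with direction d = -2, the exact minimizing step 0.5 satisfies both tests, while a tiny step passes the decrease test but fails the curvature test, which is why a pure backtracking search is not enough.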
SCALING: L-BFGS allows the initial guess at the Hessian to be updated at each
step. Standard BFGS does this by approximating the Hessian as a scaled
identity matrix. To use this method, set the scaleOpt to SCALAR. A better way
of approximating the Hessian is with a scaling diagonal matrix. The
diagonal can then be updated as more information comes in. This method can be
used by setting scaleOpt to DIAGONAL.
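For the SCALAR option, the standard choice (from Nocedal and Wright) takes the initial inverse-Hessian guess as gamma * I with gamma = (s^T y)/(y^T y), where s is the most recent step and y the corresponding change in gradient. The sketch below shows that computation only; the method name is illustrative and this is not QNMinimizer's actual code.

```java
public class Scaling {
    /**
     * Scaled-identity initial Hessian guess for L-BFGS:
     * gamma = (s^T y) / (y^T y), so H0 = gamma * I.
     */
    static double scalarGamma(double[] s, double[] y) {
        double sy = 0.0, yy = 0.0;
        for (int i = 0; i < s.length; i++) {
            sy += s[i] * y[i];   // s^T y
            yy += y[i] * y[i];   // y^T y
        }
        return sy / yy;
    }
}
```

The DIAGONAL option generalizes this by keeping a full diagonal estimate that is refined as more (s, y) pairs arrive.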
CONVERGENCE: Previously, convergence was gauged by taking the average
decrease per step, dividing that by the current value, and terminating when
that ratio became smaller than TOL. This method fails when the function
value approaches zero, so two other convergence criteria are used. The first
stores the initial gradient norm |g0|, then terminates when the new gradient
norm |g| is sufficiently smaller, i.e., |g| < eps*|g0|. The second checks whether
|g| < eps*max( 1 , |x| ), which essentially tests whether the gradient
is numerically zero.
Each of these convergence criteria can be turned on or off by setting the
flags:
private boolean useAveImprovement = true;
private boolean useRelativeNorm = true;
private boolean useNumericalZero = true;
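The two gradient-based criteria above can be sketched directly from their formulas. These are illustrative helpers, not the private methods QNMinimizer actually uses.

```java
public class Converged {
    /** Euclidean norm of a vector. */
    static double norm(double[] v) {
        double s = 0.0;
        for (double d : v) s += d * d;
        return Math.sqrt(s);
    }

    /** useRelativeNorm test: |g| < eps * |g0|. */
    static boolean relativeNorm(double[] g, double g0Norm, double eps) {
        return norm(g) < eps * g0Norm;
    }

    /** useNumericalZero test: |g| < eps * max(1, |x|). */
    static boolean numericalZero(double[] g, double[] x, double eps) {
        return norm(g) < eps * Math.max(1.0, norm(x));
    }
}
```

Both tests remain meaningful when the function value itself approaches zero, which is exactly where the average-improvement criterion breaks down.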
To use the QNMinimizer, first construct it using
QNMinimizer qn = new QNMinimizer(mem, true)
where mem is the number of previous estimate vector pairs to store;
generally 15 is plenty. Passing true tells the QNMinimizer to use the MINPACK
linesearch with DIAGONAL scaling; false leads to the use of the criteria
used in the old QNMinimizer class.
Then call:
qn.minimize(dfunction, convergenceTolerance, initialGuess, maxFunctionEvaluations);
- Author:
- akleeman
Method Summary
float[]   minimize(DiffFloatFunction function, float functionTolerance, float[] initial)
double[]  minimize(DiffFunction function, double functionTolerance, double[] initial)
          Attempts to find an unconstrained minimum of the objective function starting at initial, within functionTolerance.
double[]  minimize(DiffFunction dfunction, double functionTolerance, double[] initial, int maxFunctionEvaluations)
double[]  minimize(DiffFunction dfunction, double functionTolerance, double[] initial, int maxFunctionEvaluations, QNMinimizer.QNInfo qn)
void      setEvaluators(int iters, Evaluator[] evaluators)
void      setHistory(List<double[]> s, List<double[]> y)
void      setM(int m)
void      setOldOptions()
void      setRobustOptions()
void      shutUp()
void      terminateOnAverageImprovement(boolean toTerminate)
void      terminateOnNumericalZero(boolean toTerminate)
void      terminateOnRelativeNorm(boolean toTerminate)
void      useBacktracking()
void      useDiagonalScaling()
void      useMinPackSearch()
void      useScalarScaling()
boolean   wasSuccessful()
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
outputToFile
public boolean outputToFile
QNMinimizer
public QNMinimizer(int m)
QNMinimizer
public QNMinimizer(int m,
boolean useRobustOptions)
QNMinimizer
public QNMinimizer()
QNMinimizer
public QNMinimizer(Function monitor)
QNMinimizer
public QNMinimizer(FloatFunction monitor)
QNMinimizer
public QNMinimizer(Function monitor,
int m)
QNMinimizer
public QNMinimizer(Function monitor,
int m,
boolean useRobustOptions)
setOldOptions
public void setOldOptions()
setRobustOptions
public final void setRobustOptions()
setEvaluators
public void setEvaluators(int iters,
Evaluator[] evaluators)
- Specified by:
setEvaluators in interface HasEvaluators
terminateOnRelativeNorm
public void terminateOnRelativeNorm(boolean toTerminate)
terminateOnNumericalZero
public void terminateOnNumericalZero(boolean toTerminate)
terminateOnAverageImprovement
public void terminateOnAverageImprovement(boolean toTerminate)
useMinPackSearch
public void useMinPackSearch()
useBacktracking
public void useBacktracking()
useDiagonalScaling
public void useDiagonalScaling()
useScalarScaling
public void useScalarScaling()
wasSuccessful
public boolean wasSuccessful()
shutUp
public void shutUp()
setM
public void setM(int m)
setHistory
public void setHistory(List<double[]> s,
List<double[]> y)
minimize
public float[] minimize(DiffFloatFunction function,
float functionTolerance,
float[] initial)
minimize
public double[] minimize(DiffFunction function,
double functionTolerance,
double[] initial)
- Description copied from interface: Minimizer
Attempts to find an unconstrained minimum of the objective function starting at initial, within functionTolerance.
- Specified by:
minimize in interface Minimizer<DiffFunction>
- Parameters:
function - the objective function
functionTolerance - a double value
initial - an initial feasible point
- Returns:
- Unconstrained minimum of function
minimize
public double[] minimize(DiffFunction dfunction,
double functionTolerance,
double[] initial,
int maxFunctionEvaluations)
- Specified by:
minimize in interface Minimizer<DiffFunction>
minimize
public double[] minimize(DiffFunction dfunction,
double functionTolerance,
double[] initial,
int maxFunctionEvaluations,
QNMinimizer.QNInfo qn)
Stanford NLP Group