edu.stanford.nlp.ie.crf
Class CRFLogConditionalObjectiveFunction

java.lang.Object
  extended by edu.stanford.nlp.optimization.AbstractCachingDiffFunction
      extended by edu.stanford.nlp.optimization.AbstractStochasticCachingDiffFunction
          extended by edu.stanford.nlp.optimization.AbstractStochasticCachingDiffUpdateFunction
              extended by edu.stanford.nlp.ie.crf.CRFLogConditionalObjectiveFunction
All Implemented Interfaces:
HasCliquePotentialFunction, DiffFunction, Function, HasFeatureGrouping, HasInitial

public class CRFLogConditionalObjectiveFunction
extends AbstractStochasticCachingDiffUpdateFunction
implements HasCliquePotentialFunction, HasFeatureGrouping

Author:
Jenny Finkel, Mengqiu Wang

Nested Class Summary
 
Nested classes/interfaces inherited from class edu.stanford.nlp.optimization.AbstractStochasticCachingDiffFunction
AbstractStochasticCachingDiffFunction.SamplingMethod
 
Field Summary
static int HUBER_PRIOR
           
static int NO_PRIOR
           
static int QUADRATIC_PRIOR
           
static int QUARTIC_PRIOR
           
static boolean VERBOSE
           
 
Fields inherited from class edu.stanford.nlp.optimization.AbstractStochasticCachingDiffUpdateFunction
skipValCalc
 
Fields inherited from class edu.stanford.nlp.optimization.AbstractStochasticCachingDiffFunction
allIndices, curElement, finiteDifferenceStepSize, gradPerturbed, hasNewVals, HdotV, lastBatch, lastBatchSize, lastElement, lastVBatch, lastXBatch, method, randGenerator, recalculatePrevBatch, returnPreviousValues, sampleMethod, scaleUp, thisBatch, xPerturbed
 
Fields inherited from class edu.stanford.nlp.optimization.AbstractCachingDiffFunction
derivative, generator, value
 
Method Summary
 void calculate(double[] x)
          Calculates both the value and the partial derivatives at the point x, and saves them internally.
 void calculateStochastic(double[] x, double[] v, int[] batch)
          calculateStochastic needs to calculate a stochastic approximation to the derivative and value of a function for a given batch of the data.
 void calculateStochasticGradient(double[] x, int[] batch)
          Performs stochastic gradient update based on samples indexed by batch, but does not apply regularization.
 double calculateStochasticUpdate(double[] x, double xscale, int[] batch, double gscale)
          Performs stochastic update of weights x (scaled by xscale) based on samples indexed by batch.
 int dataDimension()
          Data dimension must return the size of the data used by the function.
 int domainDimension()
          Returns the number of dimensions in the function's domain
 CliquePotentialFunction getCliquePotentialFunction(double[] x)
           
 int[][] getFeatureGrouping()
           
static int getPriorType(java.lang.String priorTypeStr)
           
 int[][] getWeightIndices()
           
 double[] initial()
          Returns the initial point in the domain (but not necessarily a feasible one).
 void setFeatureGrouping(int[][] fg)
           
 double[] to1D(double[][] weights)
           
static double[] to1D(double[][] weights, int domainDimension)
           
 double[][] to2D(double[] weights)
           
 double[][] to2D(double[] weights, double wscale)
           
static double[][] to2D(double[] weights, java.util.List<Index<CRFLabel>> labelIndices, int[] map)
          Takes a double array of weights and creates a 2D array where the first dimension is the mapped clique-size index (e.g., node = 0, edge = 1) matching featureIndex i, and the second dimension is the number of output classes for that clique size
 double valueAt(double[] x, double xscale, int[] batch)
          Computes value of function for specified value of x (scaled by xscale) only over samples indexed by batch.
 double valueForADoc(double[][] weights, int docIndex)
           
 
Methods inherited from class edu.stanford.nlp.optimization.AbstractStochasticCachingDiffUpdateFunction
calculateStochasticGradient, calculateStochasticUpdate, getSample
 
Methods inherited from class edu.stanford.nlp.optimization.AbstractStochasticCachingDiffFunction
clearCache, decrementBatch, derivativeAt, derivativeAt, getBatch, HdotVAt, HdotVAt, HdotVAt, incrementBatch, incrementRandom, lastDerivative, lastValue, scaleUp, valueAt, valueAt
 
Methods inherited from class edu.stanford.nlp.optimization.AbstractCachingDiffFunction
copy, derivativeAt, getDerivative, gradientCheck, gradientCheck, randomInitial, valueAt
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

NO_PRIOR

public static final int NO_PRIOR
See Also:
Constant Field Values

QUADRATIC_PRIOR

public static final int QUADRATIC_PRIOR
See Also:
Constant Field Values

HUBER_PRIOR

public static final int HUBER_PRIOR
See Also:
Constant Field Values

QUARTIC_PRIOR

public static final int QUARTIC_PRIOR
See Also:
Constant Field Values

VERBOSE

public static boolean VERBOSE
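The prior constants above select a regularization penalty applied to the weights. As a rough illustration only, here are minimal sketches of the standard penalty forms these names usually denote; the exact scaling used in the CRF code (the sigma convention and the Huber cutoff epsilon in particular) is an assumption here, not taken from the source:

```java
// Illustrative penalty forms for the prior constants; scaling conventions
// (sigma, epsilon) are assumptions, not the Stanford NLP implementation.
public class Priors {
    // QUADRATIC_PRIOR: Gaussian prior, sum w_i^2 / (2 sigma^2)
    static double quadratic(double[] w, double sigma) {
        double s = 0.0;
        for (double wi : w) s += wi * wi;
        return s / (2 * sigma * sigma);
    }

    // QUARTIC_PRIOR: fourth-power penalty, sum w_i^4 / (2 sigma^2)
    static double quartic(double[] w, double sigma) {
        double s = 0.0;
        for (double wi : w) s += wi * wi * wi * wi;
        return s / (2 * sigma * sigma);
    }

    // HUBER_PRIOR: quadratic near zero, linear beyond the cutoff epsilon
    static double huber(double[] w, double sigma, double epsilon) {
        double s = 0.0;
        for (double wi : w) {
            double a = Math.abs(wi);
            s += (a < epsilon) ? wi * wi / (2 * epsilon) : a - epsilon / 2;
        }
        return s / (sigma * sigma);
    }

    public static void main(String[] args) {
        double[] w = {1.0, -2.0};
        System.out.println(quadratic(w, 1.0));  // prints "2.5"
    }
}
```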
Method Detail

initial

public double[] initial()
Description copied from interface: HasInitial
Returns the initial point in the domain (but not necessarily a feasible one).

Specified by:
initial in interface HasInitial
Overrides:
initial in class AbstractStochasticCachingDiffFunction
Returns:
a domain point

getPriorType

public static int getPriorType(java.lang.String priorTypeStr)

domainDimension

public int domainDimension()
Description copied from interface: Function
Returns the number of dimensions in the function's domain

Specified by:
domainDimension in interface Function
Returns:
the number of domain dimensions

to2D

public static double[][] to2D(double[] weights,
                              java.util.List<Index<CRFLabel>> labelIndices,
                              int[] map)
Takes a double array of weights and creates a 2D array where the first dimension is the mapped clique-size index (e.g., node = 0, edge = 1) matching featureIndex i, and the second dimension is the number of output classes for that clique size

Returns:
a 2D weight array
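As an illustration of the layout this conversion implies, here is a minimal, self-contained sketch (not the Stanford NLP implementation): `map` and `labelSizes` are stand-ins for the `map` argument and the sizes of the `labelIndices` entries, and the round trip back through a `to1D`-style flattening recovers the original array.

```java
// Hypothetical sketch of the 1D <-> 2D weight layout, assuming map[i]
// gives the clique-size index of feature i and labelSizes[c] is the
// number of output classes for clique size c.
public class WeightReshape {
    static double[][] to2D(double[] flat, int[] map, int[] labelSizes) {
        double[][] weights = new double[map.length][];
        int pos = 0;
        for (int i = 0; i < map.length; i++) {
            int n = labelSizes[map[i]];          // row width depends on clique size
            weights[i] = new double[n];
            System.arraycopy(flat, pos, weights[i], 0, n);
            pos += n;
        }
        return weights;
    }

    static double[] to1D(double[][] weights, int domainDimension) {
        double[] flat = new double[domainDimension];
        int pos = 0;
        for (double[] row : weights) {
            System.arraycopy(row, 0, flat, pos, row.length);
            pos += row.length;
        }
        return flat;
    }

    public static void main(String[] args) {
        // Two features: feature 0 is a node feature (2 labels),
        // feature 1 is an edge feature (4 label pairs).
        int[] map = {0, 1};
        int[] labelSizes = {2, 4};
        double[] flat = {0.1, 0.2, 0.3, 0.4, 0.5, 0.6};
        double[][] w = to2D(flat, map, labelSizes);
        System.out.println(w[0].length + " " + w[1].length);        // prints "2 4"
        System.out.println(java.util.Arrays.equals(flat, to1D(w, 6)));  // prints "true"
    }
}
```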

to2D

public double[][] to2D(double[] weights)

to2D

public double[][] to2D(double[] weights,
                       double wscale)

to1D

public static double[] to1D(double[][] weights,
                            int domainDimension)

to1D

public double[] to1D(double[][] weights)

getWeightIndices

public int[][] getWeightIndices()

valueForADoc

public double valueForADoc(double[][] weights,
                           int docIndex)

getCliquePotentialFunction

public CliquePotentialFunction getCliquePotentialFunction(double[] x)
Specified by:
getCliquePotentialFunction in interface HasCliquePotentialFunction

calculate

public void calculate(double[] x)
Calculates both the value and the partial derivatives at the point x, and saves them internally.

Specified by:
calculate in class AbstractCachingDiffFunction
Parameters:
x - The point at which to calculate the function

calculateStochastic

public void calculateStochastic(double[] x,
                                double[] v,
                                int[] batch)
Description copied from class: AbstractStochasticCachingDiffFunction
calculateStochastic needs to calculate a stochastic approximation to the derivative and value of a function for a given batch of the data. The approximation to the derivative must be stored in the array derivative, the approximation to the value in value, and the approximation to the Hessian vector product H.v in the array HdotV. Note that the Hessian vector product is used primarily with the Stochastic Meta Descent optimization routine SMDMinimizer. Important: the stochastic approximation must be such that the sum of the stochastic calculations over all batches equals the full calculation. That is, for a data set of size 100, the sum of the gradients for batches 1-10, 11-20, 21-30, ..., 91-100 must be the same as the gradient for the full calculation (at the very least in expectation). Be sure to take the priors into account.

Specified by:
calculateStochastic in class AbstractStochasticCachingDiffFunction
Parameters:
x - - value to evaluate at
v - - the vector for the Hessian vector product H.v
batch - - an array containing the indices of the data to use in the calculation; this array is computed internally by the abstract class, so the implementation only needs to consume it, not generate it.
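The batch-decomposition contract described above can be checked on a toy objective. A minimal sketch, substituting a simple least-squares function for the CRF log-likelihood (all names here are illustrative, not part of the Stanford NLP API): summing the gradients over disjoint batches reproduces the full-data gradient exactly.

```java
// Toy demonstration of the contract: per-batch gradients over disjoint
// batches must sum to the full-data gradient.
public class BatchContract {
    static double[] data = {1.0, 2.0, 3.0, 4.0};

    // Gradient of f(x) = 0.5 * sum_{i in batch} (x - data[i])^2
    static double grad(double x, int[] batch) {
        double g = 0.0;
        for (int i : batch) g += x - data[i];
        return g;
    }

    public static void main(String[] args) {
        double x = 0.5;
        double full = grad(x, new int[]{0, 1, 2, 3});
        double batched = grad(x, new int[]{0, 1}) + grad(x, new int[]{2, 3});
        System.out.println(full == batched);  // prints "true"
    }
}
```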

dataDimension

public int dataDimension()
Description copied from class: AbstractStochasticCachingDiffFunction
Data dimension must return the size of the data used by the function.

Specified by:
dataDimension in class AbstractStochasticCachingDiffFunction

calculateStochasticUpdate

public double calculateStochasticUpdate(double[] x,
                                        double xscale,
                                        int[] batch,
                                        double gscale)
Performs stochastic update of weights x (scaled by xscale) based on samples indexed by batch. NOTE: This function does not do regularization (regularization is done by the minimizer).

Specified by:
calculateStochasticUpdate in class AbstractStochasticCachingDiffUpdateFunction
Parameters:
x - - unscaled weights
xscale - - how much to scale x by when performing calculations
batch - - indices of which samples to compute function over
gscale - - how much to scale adjustments to x
Returns:
value of function at specified x (scaled by xscale) for samples
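To make the xscale/gscale semantics concrete, here is a hedged toy sketch, not the Stanford NLP implementation: the function is evaluated at xscale * x, and x is adjusted in place by gscale times the per-sample gradient (so a negative gscale plays the role of a learning rate for minimization). All names and the objective are hypothetical.

```java
// Toy analogue of calculateStochasticUpdate: evaluate at xscale * x,
// update x in place scaled by gscale, return the batch's function value.
public class StochasticUpdateSketch {
    static double[] targets = {2.0, 4.0};

    // f(x) = 0.5 * sum_{i in batch} (xscale * x - targets[i])^2
    static double stochasticUpdate(double[] x, double xscale,
                                   int[] batch, double gscale) {
        double value = 0.0;
        for (int i : batch) {
            double diff = xscale * x[0] - targets[i];
            value += 0.5 * diff * diff;
            x[0] += gscale * diff * xscale;  // gradient step; sign folded into gscale
        }
        return value;
    }

    public static void main(String[] args) {
        double[] x = {0.0};
        // Alternate single-sample batches, as a stochastic minimizer would.
        for (int iter = 0; iter < 200; iter++) {
            stochasticUpdate(x, 1.0, new int[]{0}, -0.1);
            stochasticUpdate(x, 1.0, new int[]{1}, -0.1);
        }
        // Settles near the mean of the targets (3.0), offset slightly by
        // the alternating update order.
        System.out.printf("%.2f%n", x[0]);  // prints "3.05"
    }
}
```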

calculateStochasticGradient

public void calculateStochasticGradient(double[] x,
                                        int[] batch)
Performs stochastic gradient update based on samples indexed by batch, but does not apply regularization.

Specified by:
calculateStochasticGradient in class AbstractStochasticCachingDiffUpdateFunction
Parameters:
x - - unscaled weights
batch - - indices of which samples to compute function over

valueAt

public double valueAt(double[] x,
                      double xscale,
                      int[] batch)
Computes value of function for specified value of x (scaled by xscale) only over samples indexed by batch. NOTE: This function does not do regularization (regularization is done by the minimizer).

Specified by:
valueAt in class AbstractStochasticCachingDiffUpdateFunction
Parameters:
x - - unscaled weights
xscale - - how much to scale x by when performing calculations
batch - - indices of which samples to compute function over
Returns:
value of function at specified x (scaled by xscale) for samples

getFeatureGrouping

public int[][] getFeatureGrouping()
Specified by:
getFeatureGrouping in interface HasFeatureGrouping

setFeatureGrouping

public void setFeatureGrouping(int[][] fg)


Stanford NLP Group