CRFClassifier (Stanford JavaNLP API)

java.lang.Object
- edu.stanford.nlp.ie.AbstractSequenceClassifier<IN>
- - edu.stanford.nlp.ie.crf.CRFClassifier<IN>

All Implemented Interfaces:

java.util.function.Function<java.lang.String,java.lang.String>

Direct Known Subclasses:

CRFBiasedClassifier, CRFClassifierFloat, CRFClassifierNoisyLabel, CRFClassifierNonlinear, CRFClassifierWithDropout, CRFClassifierWithLOP
```
public class CRFClassifier<IN extends CoreMap>
extends AbstractSequenceClassifier<IN>
```
Class for sequence classification using a Conditional Random Field model. The code has functionality for different document formats, but when using the standard ColumnDocumentReaderAndWriter for training or testing models, input files are expected to be one token per line with the columns indicating things like the word, POS, chunk, and answer class. The default for ColumnDocumentReaderAndWriter training data is 3 column input, with the columns containing a word, its POS, and its gold class, but this can be specified via the map property.
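The column layout described above can be illustrated with a small standalone sketch. It is not the ColumnDocumentReaderAndWriter implementation: the class ColumnFormatSketch and its methods are hypothetical, whitespace-separated columns are assumed, and the field names in the map string (word, tag, answer) are placeholders chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Stanford API.
final class ColumnFormatSketch {

    // Parses a map specification such as "word=0,tag=1,answer=2"
    // into a column-index -> field-name table.
    static Map<Integer, String> parseMap(String mapSpec) {
        Map<Integer, String> columns = new LinkedHashMap<>();
        for (String entry : mapSpec.split(",")) {
            String[] kv = entry.trim().split("=");
            columns.put(Integer.parseInt(kv[1].trim()), kv[0].trim());
        }
        return columns;
    }

    // Reads one-token-per-line text and returns one field map per token.
    // Blank lines are simply skipped in this sketch.
    static List<Map<String, String>> parseTokens(String text, String mapSpec) {
        Map<Integer, String> columns = parseMap(mapSpec);
        List<Map<String, String>> tokens = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (line.trim().isEmpty()) continue;
            String[] cells = line.trim().split("\\s+");
            Map<String, String> token = new LinkedHashMap<>();
            for (Map.Entry<Integer, String> col : columns.entrySet()) {
                if (col.getKey() < cells.length) {
                    token.put(col.getValue(), cells[col.getKey()]);
                }
            }
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String sample = "Stanford\tNNP\tORGANIZATION\nis\tVBZ\tO\n";
        for (Map<String, String> token : parseTokens(sample, "word=0,tag=1,answer=2")) {
            System.out.println(token);
        }
    }
}
```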
When run on a file with -textFile or -textFiles, the file is assumed to be plain English text (or perhaps simple HTML/XML), and a reasonable attempt is made at English tokenization by PlainTextDocumentReaderAndWriter. The class used to read the text can be changed with -plainTextDocumentReaderAndWriter. Extra options can be supplied to the tokenizer using the -tokenizerOptions flag.
To read from stdin, use the flag -readStdin. The same reader/writer will be used as for -textFile.
Typical command-line usage
For running a trained model with a provided serialized classifier on a text file:
java -mx500m edu.stanford.nlp.ie.crf.CRFClassifier -loadClassifier conll.ner.gz -textFile sampleSentences.txt
When specifying all parameters in a properties file (train, test, or runtime):
java -mx1g edu.stanford.nlp.ie.crf.CRFClassifier -prop propFile
To train and test a simple NER model from the command line:
java -mx1g edu.stanford.nlp.ie.crf.CRFClassifier -trainFile trainFile -testFile testFile -macro > output
To train with multiple files:
java -mx1g edu.stanford.nlp.ie.crf.CRFClassifier -trainFileList file1,file2,... -testFile testFile -macro > output
To test on multiple files, use the -testFiles option with a comma-separated list of files.
Features are defined by a FeatureFactory. NERFeatureFactory is used by default, and you should look there for feature templates and properties or flags that will cause certain features to be used when training an NER classifier. There are also various feature factories for Chinese word segmentation such as ChineseSegmenterFeatureFactory. Features are specified either by a Properties file (which is the recommended method) or by flags on the command line. The flags are read into a SeqClassifierFlags object, which the user need not be concerned with, unless wishing to add new features.
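Because the same settings can come from a properties file or from command-line flags, it may help to see how a flag list could be folded into a java.util.Properties object. The sketch below is a simplified illustration, not the library's own argument parser: the class FlagsToPropertiesSketch is hypothetical, and treating a flag without a value (such as -macro) as "true" is an assumption.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not part of the Stanford API.
final class FlagsToPropertiesSketch {

    // "-name value" becomes name=value; a flag with no following value
    // becomes name=true (an assumption made for this sketch).
    static Properties fromArgs(String[] args) {
        Properties props = new Properties();
        for (int i = 0; i < args.length; i++) {
            if (!args[i].startsWith("-")) continue; // ignore stray values
            String key = args[i].substring(1);
            // Simplification: any argument starting with '-' is treated as a flag,
            // so negative numbers cannot be passed as values here.
            boolean hasValue = i + 1 < args.length && !args[i + 1].startsWith("-");
            props.setProperty(key, hasValue ? args[++i] : "true");
        }
        return props;
    }

    public static void main(String[] args) {
        String[] flags = {"-trainFile", "trainFile", "-testFile", "testFile", "-macro"};
        Properties props = fromArgs(flags);
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + " = " + props.getProperty(name));
        }
    }
}
```

A Properties object built this way has the same shape as one loaded from a properties file, which is the form the CRFClassifier(java.util.Properties) constructor accepts.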
CRFClassifier may also be used programmatically. When creating a new instance, you must specify a Properties object. You may then call train methods to train a classifier, or load a classifier. The other way to get a CRFClassifier is to deserialize one via the static getClassifier(String) methods, which return a deserialized classifier. You may then tag (classify the items of) documents using either the assorted classify() methods here or the additional ones in AbstractSequenceClassifier. Probabilities assigned by the CRF can be interrogated using either the printProbsDocument() or getCliqueTrees() methods.

Author:

Jenny Finkel, Sonal Gupta (made the class generic), Mengqiu Wang (LOP implementation and non-linear CRF implementation)

Field Summary

Fields
Modifier and Type Field and Description

static java.lang.String DEFAULT_CLASSIFIER
Name of default serialized classifier resource to look for in a jar file.
- Fields inherited from class edu.stanford.nlp.ie.AbstractSequenceClassifier
  classIndex, featureFactories, flags, knownLCWords, pad, windowSize


Constructor Summary

Constructors
Modifier	Constructor and Description
`protected`	`CRFClassifier()`
	`CRFClassifier(CRFClassifier<IN> crf)` Makes a copy of the crf classifier
	`CRFClassifier(java.util.Properties props)`
	`CRFClassifier(SeqClassifierFlags flags)`

Method Summary

All Methods Static Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`protected void`	`addProcessedData(java.util.List<java.util.List<CRFDatum<java.util.Collection<java.lang.String>,java.lang.String>>> processedData, int[][][][] data, int[][] labels, double[][][][] featureVals, int offset)` Adds the List of Lists of CRFDatums to the data and labels arrays, treating each datum as if it were its own document.
`protected static Index<CRFLabel>`	`allLabels(int window, Index<java.lang.String> classIndex)`
`java.util.List<IN>`	`classify(java.util.List<IN> document)` Classify a `List` of something that extends `CoreMap`.
`java.util.List<IN>`	`classifyGibbs(java.util.List<IN> document)`
`java.util.List<IN>`	`classifyGibbs(java.util.List<IN> document, Triple<int[][][],int[],double[][][]> documentDataAndLabels)`
`java.util.List<IN>`	`classifyMaxEnt(java.util.List<IN> document)` Do standard sequence inference, using either Viterbi or Beam inference depending on the value of `flags.inferenceType`.
`java.util.List<IN>`	`classifyWithGlobalInformation(java.util.List<IN> tokenSeq, CoreMap doc, CoreMap sent)` Classify a `List` of something that extends `CoreMap` using as additional information whatever is stored in the document and sentence.
`void`	`combine(CRFClassifier<IN> crf, double weight)` Combines weighted crf with this crf.
`Triple<int[][][][],int[][],double[][][][]>`	`documentsToDataAndLabels(java.util.Collection<java.util.List<IN>> documents)` Convert an ObjectBank to arrays of data features and labels.
`java.util.List<Triple<int[][][],int[],double[][][]>>`	`documentsToDataAndLabelsList(java.util.Collection<java.util.List<IN>> documents)` Convert an ObjectBank to corresponding collection of data features and labels.
`Triple<int[][][],int[],double[][][]>`	`documentToDataAndLabels(java.util.List<IN> document)` Convert a document List into arrays storing the data features and labels.
`void`	`dropFeaturesBelowThreshold(double threshold)`
`void`	`dumpFeatures(java.util.Collection<java.util.List<IN>> docs)` Does nothing by default.
`protected java.util.List<CRFDatum<? extends java.util.Collection<java.lang.String>,? extends java.lang.CharSequence>>`	`extractDatumSequence(int[][][] allData, int beginPosition, int endPosition, java.util.List<IN> labeledWordInfos)` Creates a new CRFDatum from the preprocessed allData format, given the document number, position number, and a List of Object labels.
`static <INN extends CoreMap> CRFClassifier<INN>`	`getClassifier(java.io.File file)` Loads a CRF classifier from a filepath, and returns it.
`static <INN extends CoreMap> CRFClassifier<INN>`	`getClassifier(java.io.InputStream in)` Loads a CRF classifier from an InputStream, and returns it.
`static <INN extends CoreMap> CRFClassifier<INN>`	`getClassifier(java.io.ObjectInputStream ois)`
`static <INN extends CoreMap> CRFClassifier<INN>`	`getClassifier(java.io.ObjectInputStream ois, java.util.Properties props)`
`static CRFClassifier<CoreLabel>`	`getClassifier(java.lang.String loadPath)`
`static <INN extends CoreMap> CRFClassifier<INN>`	`getClassifier(java.lang.String loadPath, java.util.Properties props)`
`static <INN extends CoreMap> CRFClassifier<INN>`	`getClassifierNoExceptions(java.lang.String loadPath)`
`protected CliquePotentialFunction`	`getCliquePotentialFunctionForTest()`
`CRFCliqueTree<java.lang.String>`	`getCliqueTree(java.util.List<IN> document)`
`CRFCliqueTree<java.lang.String>`	`getCliqueTree(Triple<int[][][],int[],double[][][]> p)`
`java.util.List<CRFCliqueTree<java.lang.String>>`	`getCliqueTrees(java.lang.String filename, DocumentReaderAndWriter<IN> readerAndWriter)` Want to make arbitrary probability queries? Then this is the method for you.
`static <INN extends CoreMap> CRFClassifier<INN>`	`getDefaultClassifier()` Used to get the default supplied classifier inside the jar file.
`static <INN extends CoreMap> CRFClassifier<INN>`	`getDefaultClassifier(java.util.Properties props)` Used to get the default supplied classifier inside the jar file.
`Minimizer<DiffFunction>`	`getMinimizer()`
`Minimizer<DiffFunction>`	`getMinimizer(int featurePruneIteration, Evaluator[] evaluators)`
`int`	`getNumWeights()` Returns the total number of weights associated with this classifier.
`protected CRFLogConditionalObjectiveFunction`	`getObjectiveFunction(int[][][][] data, int[][] labels)`
`SequenceModel`	`getSequenceModel(java.util.List<IN> doc)`
`protected java.util.Collection<java.util.List<IN>>`	`loadAuxiliaryData(java.util.Collection<java.util.List<IN>> docs, DocumentReaderAndWriter<IN> readerAndWriter)` Load auxiliary data to be used in constructing features and labels. Intended to be overridden by subclasses.
`void`	`loadClassifier(java.io.ObjectInputStream ois, java.util.Properties props)` Loads a classifier from the specified InputStream.
`static Index<java.lang.String>`	`loadClassIndexFromFile(java.lang.String serializePath)`
`void`	`loadDefaultClassifier()` This is used to load the default supplied classifier stored within the jar file.
`void`	`loadDefaultClassifier(java.util.Properties props)` This is used to load the default supplied classifier stored within the jar file.
`static Index<java.lang.String>`	`loadFeatureIndexFromFile(java.lang.String serializePath)`
`protected static java.util.List<java.util.List<CRFDatum<java.util.Collection<java.lang.String>,java.lang.String>>>`	`loadProcessedData(java.lang.String filename)`
`void`	`loadTagIndex()`
`protected void`	`loadTextClassifier(java.io.BufferedReader br)`
`void`	`loadTextClassifier(java.lang.String text, java.util.Properties props)`
`static double[][]`	`loadWeightsFromFile(java.lang.String serializePath)`
`static void`	`main(java.lang.String[] args)` The main method.
`protected void`	`makeAnswerArraysAndTagIndex(java.util.Collection<java.util.List<IN>> ob)` This routine builds the `labelIndices` which give the empirically legal label sequences (of length (order) at most `windowSize`) and the `classIndex`, which indexes known answer classes.
`CRFDatum<java.util.Collection<java.lang.String>,CRFLabel>`	`makeDatum(java.util.List<IN> info, int loc, java.util.List<FeatureFactory<IN>> featureFactories)` Makes a CRFDatum by producing features and a label from input data at a specific position, using the provided factory.
`void`	`printFactorTable(java.lang.String filename, DocumentReaderAndWriter<IN> readerAndWriter)` Takes the file, reads it in, and prints out the factor table at each position.
`void`	`printFactorTableDocument(java.util.List<IN> document)` Takes a `List` of something that extends `CoreMap` and prints the factor table at each point.
`void`	`printFactorTableDocuments(ObjectBank<java.util.List<IN>> documents)` Takes a `List` of documents and prints the factor table at each point.
`protected void`	`printFeatures()`
`void`	`printFirstOrderProbs(java.lang.String filename, DocumentReaderAndWriter<IN> readerAndWriter)` Takes the file, reads it in, and prints out the likelihood of each possible label at each point.
`void`	`printFirstOrderProbsDocument(java.util.List<IN> document)` Takes a `List` of something that extends `CoreMap` and prints the likelihood of each possible label at each point.
`void`	`printFirstOrderProbsDocuments(ObjectBank<java.util.List<IN>> documents)` Takes a `List` of documents and prints the likelihood of each possible label at each point.
`void`	`printLabelInformation(java.lang.String testFile, DocumentReaderAndWriter<IN> readerAndWriter)`
`void`	`printLabelValue(java.util.List<IN> document)`
`Triple<Counter<java.lang.Integer>,Counter<java.lang.Integer>,TwoDimensionalCounter<java.lang.Integer,java.lang.String>>`	`printProbsDocument(java.util.List<IN> document)` Takes a `List` of something that extends `CoreMap` and prints the likelihood of each possible label at each point.
`protected void`	`pruneNodeFeatureIndices(int totalNumOfFeatureSlices, int numOfFeatureSlices)`
`protected static void`	`saveProcessedData(java.util.List<?> datums, java.lang.String filename)`
`void`	`scaleWeights(double scale)` Scales the weights of this CRFClassifier by the specified weight.
`void`	`serializeClassifier(java.io.ObjectOutputStream oos)` Serialize the classifier to the given ObjectOutputStream.
`void`	`serializeClassifier(java.lang.String serializePath)` Serialize a sequence classifier to a file on the given path.
`void`	`serializeClassIndex(java.lang.String serializePath)`
`void`	`serializeFeatureIndex(java.lang.String serializePath)`
`void`	`serializeFeatureIndexToText(java.lang.String serializePath)`
`protected void`	`serializeTextClassifier(java.io.PrintWriter pw)`
`void`	`serializeTextClassifier(java.lang.String serializePath)` Serialize the model to a human readable format.
`void`	`serializeWeights(java.lang.String serializePath)`
`static float[][]`	`to2D(double[] weights, java.util.List<Index<CRFLabel>> labelIndices, int[] map)`
`java.util.Map<java.lang.String,Counter<java.lang.String>>`	`topWeights()`
`java.lang.String`	`toString()`
`void`	`train(java.util.Collection<java.util.List<IN>> objectBankWrapper, DocumentReaderAndWriter<IN> readerAndWriter)` Trains a classifier from a Collection of sequences.
`protected double[]`	`trainWeights(int[][][][] data, int[][] labels, Evaluator[] evaluators, int pruneFeatureItr, double[][][][] featureVals)`
`void`	`updateWeightsForTest(double[] x)`
`void`	`writeWeights(java.io.PrintStream p)`
`java.util.List<Counter<java.lang.String>>`	`zeroOrderProbabilities(java.util.List<IN> document)`

Methods inherited from class edu.stanford.nlp.ie.AbstractSequenceClassifier
apply, backgroundSymbol, classify, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswersKBest, classifyAndWriteAnswersKBest, classifyAndWriteViterbiSearchGraph, classifyFile, classifyFilesAndWriteAnswers, classifyFilesAndWriteAnswers, classifyKBest, classifyRaw, classifySentence, classifySentenceWithGlobalInformation, classifyStdin, classifyStdin, classifyToCharacterOffsets, classifyToString, classifyToString, classifyWithInlineXML, countResults, countResultsSegmenter, defaultReaderAndWriter, finalizeClassification, getKnownLCWords, getSampler, labels, loadClassifier, loadClassifier, loadClassifier, loadClassifier, loadClassifier, loadClassifier, loadClassifierNoExceptions, loadClassifierNoExceptions, loadClassifierNoExceptions, loadClassifierNoExceptions, loadClassifierNoExceptions, makeObjectBankFromFile, makeObjectBankFromFile, makeObjectBankFromFiles, makeObjectBankFromFiles, makeObjectBankFromFiles, makeObjectBankFromReader, makeObjectBankFromString, makePlainTextReaderAndWriter, makePlainTextReaderAndWriter, makeReaderAndWriter, plainTextReaderAndWriter, printFeatureLists, printFeatures, printProbs, printProbs, printProbsDocuments, printResults, reinit, segmentString, segmentString, train, train, train, train, train, train, windowSize, writeAnswers

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface java.util.function.Function
andThen, compose, identity

- Field Detail
  - DEFAULT_CLASSIFIER
```
public static final java.lang.String DEFAULT_CLASSIFIER
```
    Name of default serialized classifier resource to look for in a jar file.
    
    See Also:
    
    Constant Field Values
- Constructor Detail
  - CRFClassifier
```
protected CRFClassifier()
```
  - CRFClassifier
```
public CRFClassifier(java.util.Properties props)
```
  - CRFClassifier
```
public CRFClassifier(SeqClassifierFlags flags)
```
  - CRFClassifier
```
public CRFClassifier(CRFClassifier<IN> crf)
```
    Makes a copy of the crf classifier
- Method Detail
  - getNumWeights
```
public int getNumWeights()
```
    Returns the total number of weights associated with this classifier.
    
    Returns:
    
    number of weights
  - scaleWeights
```
public void scaleWeights(double scale)
```
    Scales the weights of this CRFClassifier by the specified weight.
    
    Parameters:
    
    scale - The scale to multiply by
  - combine
```
public void combine(CRFClassifier<IN> crf,
                    double weight)
```
    Combines weighted crf with this crf.
    
    Parameters:
    
    crf - Other CRF whose weights to combine into this CRF
    
    weight - Amount to scale the other CRF's weights by
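    
    The arithmetic behind scaleWeights and combine can be sketched on plain feature-to-weight maps. This is an assumption about the general idea (multiply every weight by a factor; add the other model's weights scaled by weight), not the actual implementation, which operates on the classifier's internal feature and label indices. The class WeightCombineSketch is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Stanford API.
final class WeightCombineSketch {

    // Multiplies every weight in place by the given scale.
    static void scale(Map<String, Double> weights, double scale) {
        weights.replaceAll((feature, w) -> w * scale);
    }

    // Adds weight * other[f] to target[f] for every feature f in other.
    // Features missing from target are treated as having weight 0.
    static void combine(Map<String, Double> target, Map<String, Double> other, double weight) {
        for (Map.Entry<String, Double> e : other.entrySet()) {
            target.merge(e.getKey(), weight * e.getValue(), Double::sum);
        }
    }

    public static void main(String[] args) {
        Map<String, Double> a = new HashMap<>();
        a.put("word=Stanford", 1.0);
        a.put("shape=Xx", 2.0);
        Map<String, Double> b = new HashMap<>();
        b.put("shape=Xx", 4.0);
        b.put("prev=at", 2.0);
        combine(a, b, 0.5);
        scale(a, 2.0);
        System.out.println(a);
    }
}
```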
  - dropFeaturesBelowThreshold
```
public void dropFeaturesBelowThreshold(double threshold)
```
  - documentToDataAndLabels
```
public Triple<int[][][],int[],double[][][]> documentToDataAndLabels(java.util.List<IN> document)
```
    Convert a document List into arrays storing the data features and labels. This is used at test time.
    
    Parameters:
    
    document - The test document to convert
    
    Returns:
    
    A Triple, where the first element is an int[][][] representing the data, the second element is an int[] representing the labels, and the third element is a double[][][] representing the feature values (optionally null)
  - printLabelInformation
```
public void printLabelInformation(java.lang.String testFile,
                                  DocumentReaderAndWriter<IN> readerAndWriter)
                           throws java.lang.Exception
```
    Throws:
    
    java.lang.Exception
  - printLabelValue
```
public void printLabelValue(java.util.List<IN> document)
```
  - documentsToDataAndLabels
```
public Triple<int[][][][],int[][],double[][][][]> documentsToDataAndLabels(java.util.Collection<java.util.List<IN>> documents)
```
    Convert an ObjectBank to arrays of data features and labels. This version is used at training time.
    
    Returns:
    
    A Triple, where the first element is an int[][][][] representing the data, the second element is an int[][] representing the labels, and the third element is a double[][][][] representing the feature values which could be optionally left as null.
  - documentsToDataAndLabelsList
```
public java.util.List<Triple<int[][][],int[],double[][][]>> documentsToDataAndLabelsList(java.util.Collection<java.util.List<IN>> documents)
```
    Convert an ObjectBank to corresponding collection of data features and labels. This version is used at test time.
    
    Returns:
    
    A List of Triples, one for each document, where the first element is an int[][][] representing the data, the second element is an int[] representing the labels, and the third element is a double[][][] representing the feature values (optionally null).
  - printFeatures
```
protected void printFeatures()
```
  - makeAnswerArraysAndTagIndex
```
protected void makeAnswerArraysAndTagIndex(java.util.Collection<java.util.List<IN>> ob)
```
    This routine builds the labelIndices which give the empirically legal label sequences (of length (order) at most windowSize) and the classIndex, which indexes known answer classes.
    
    Parameters:
    
    ob - The training data: Read from an ObjectBank, each item in it is a List<CoreLabel>.
  - allLabels
```
protected static Index<CRFLabel> allLabels(int window,
                                           Index<java.lang.String> classIndex)
```
  - makeDatum
```
public CRFDatum<java.util.Collection<java.lang.String>,CRFLabel> makeDatum(java.util.List<IN> info,
                                                                           int loc,
                                                                           java.util.List<FeatureFactory<IN>> featureFactories)
```
    Makes a CRFDatum by producing features and a label from input data at a specific position, using the provided factory.
    
    Parameters:
    
    info - The input data. Particular feature factories might look for arbitrary keys in the IN items.
    
    loc - The position to build a datum at
    
    featureFactories - The FeatureFactories to use to extract features
    
    Returns:
    
    The constructed CRFDatum
  - dumpFeatures
```
public void dumpFeatures(java.util.Collection<java.util.List<IN>> docs)
```
    Description copied from class: AbstractSequenceClassifier
    
    Does nothing by default. Subclasses can override if necessary.
    
    Overrides:
    
    dumpFeatures in class AbstractSequenceClassifier<IN extends CoreMap>
  - classify
```
public java.util.List<IN> classify(java.util.List<IN> document)
```
    Description copied from class: AbstractSequenceClassifier
    
    Classify a List of something that extends CoreMap. The classifications are added in place to the items of the document, which is also returned by this method. Warning: In many circumstances, you should not call this method directly. In particular, if you call this method directly, your document will not be preprocessed to add things like word distributional similarity class or word shape features that your classifier may rely on to work correctly. In such cases, you should call classifySentence instead.
    
    Specified by:
    
    classify in class AbstractSequenceClassifier<IN extends CoreMap>
    
    Parameters:
    
    document - A List of something that extends CoreMap.
    
    Returns:
    
    The same List, but with the elements annotated with their answers (stored under the CoreAnnotations.AnswerAnnotation key). The answers will be the class labels defined by the CRF Classifier. They might be things like entity labels (in BIO notation or not) or something like "1" vs. "0" on whether to begin a new token here or not (in word segmentation).
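    
    When the answers are entity labels in BIO notation, a caller will often want to group them into spans. The sketch below works on plain label strings rather than on CoreMap items and assumes strict B-/I- prefixes with "O" as the background label; the class BioSpanSketch and its output format are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Stanford API.
final class BioSpanSketch {

    // Returns one entry per entity, formatted as "TYPE[start,end)" over token indices.
    static List<String> spans(List<String> labels) {
        List<String> out = new ArrayList<>();
        String type = null; // type of the span currently open, or null
        int start = -1;     // index of the first token of the open span
        // Iterate one step past the end so that a trailing span is closed.
        for (int i = 0; i <= labels.size(); i++) {
            String label = i < labels.size() ? labels.get(i) : "O";
            boolean continues = type != null && label.equals("I-" + type);
            if (type != null && !continues) {
                out.add(type + "[" + start + "," + i + ")");
                type = null;
            }
            if (!continues && !label.equals("O")) {
                // A B-X label (or a stray I-X) opens a new span of type X.
                boolean prefixed = label.length() > 2 && label.charAt(1) == '-';
                type = prefixed ? label.substring(2) : label;
                start = i;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(spans(Arrays.asList("B-PER", "I-PER", "O", "B-ORG", "B-ORG")));
    }
}
```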
  - getSequenceModel
```
public SequenceModel getSequenceModel(java.util.List<IN> doc)
```
    Overrides:
    
    getSequenceModel in class AbstractSequenceClassifier<IN extends CoreMap>
  - getCliquePotentialFunctionForTest
```
protected CliquePotentialFunction getCliquePotentialFunctionForTest()
```
  - updateWeightsForTest
```
public void updateWeightsForTest(double[] x)
```
  - classifyMaxEnt
```
public java.util.List<IN> classifyMaxEnt(java.util.List<IN> document)
```
    Do standard sequence inference, using either Viterbi or Beam inference depending on the value of flags.inferenceType.
    
    Parameters:
    
    document - Document to classify. Classification happens in place. This document is modified.
    
    Returns:
    
    The classified document
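    
    For readers unfamiliar with Viterbi inference, here is a minimal first-order decoder over explicit score tables. It is a textbook sketch under simplifying assumptions (a linear chain with one emission score per position and label, and one transition score per label pair), not the inference code used by this class, which works over cliques of up to windowSize labels. The class ViterbiSketch is hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the Stanford API.
final class ViterbiSketch {

    // emit[t][y]: score of label y at position t.
    // trans[p][y]: score of moving from label p to label y.
    // Returns the highest-scoring label sequence.
    static int[] decode(double[][] emit, double[][] trans) {
        int n = emit.length;
        if (n == 0) return new int[0];
        int k = emit[0].length;
        double[][] best = new double[n][k]; // best score of any path ending in y at t
        int[][] back = new int[n][k];       // previous label on that best path
        best[0] = emit[0].clone();
        for (int t = 1; t < n; t++) {
            for (int y = 0; y < k; y++) {
                int arg = 0;
                double max = Double.NEGATIVE_INFINITY;
                for (int p = 0; p < k; p++) {
                    double s = best[t - 1][p] + trans[p][y];
                    if (s > max) { max = s; arg = p; }
                }
                best[t][y] = max + emit[t][y];
                back[t][y] = arg;
            }
        }
        // Pick the best final label, then follow the back-pointers.
        int last = 0;
        for (int y = 1; y < k; y++) {
            if (best[n - 1][y] > best[n - 1][last]) last = y;
        }
        int[] path = new int[n];
        path[n - 1] = last;
        for (int t = n - 1; t > 0; t--) {
            path[t - 1] = back[t][path[t]];
        }
        return path;
    }

    public static void main(String[] args) {
        double[][] emit = {{3, 0}, {0, 1}, {0, 1}};
        double[][] stay = {{0, -3}, {-3, 0}}; // switching labels is penalized
        System.out.println(Arrays.toString(decode(emit, stay)));
    }
}
```

    With the switching penalty in place the decoder keeps a single label for the whole sequence; with all transition scores set to zero it reduces to picking the best label independently at each position.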
  - classifyGibbs
```
public java.util.List<IN> classifyGibbs(java.util.List<IN> document)
                                 throws java.lang.ClassNotFoundException,
                                        java.lang.SecurityException,
                                        java.lang.NoSuchMethodException,
                                        java.lang.IllegalArgumentException,
                                        java.lang.InstantiationException,
                                        java.lang.IllegalAccessException,
                                        java.lang.reflect.InvocationTargetException
```
    Throws:
    
    java.lang.ClassNotFoundException
    
    java.lang.SecurityException
    
    java.lang.NoSuchMethodException
    
    java.lang.IllegalArgumentException
    
    java.lang.InstantiationException
    
    java.lang.IllegalAccessException
    
    java.lang.reflect.InvocationTargetException
  - classifyGibbs
```
public java.util.List<IN> classifyGibbs(java.util.List<IN> document,
                                        Triple<int[][][],int[],double[][][]> documentDataAndLabels)
                                 throws java.lang.ClassNotFoundException,
                                        java.lang.SecurityException,
                                        java.lang.NoSuchMethodException,
                                        java.lang.IllegalArgumentException,
                                        java.lang.InstantiationException,
                                        java.lang.IllegalAccessException,
                                        java.lang.reflect.InvocationTargetException
```
    Throws:
    
    java.lang.ClassNotFoundException
    
    java.lang.SecurityException
    
    java.lang.NoSuchMethodException
    
    java.lang.IllegalArgumentException
    
    java.lang.InstantiationException
    
    java.lang.IllegalAccessException
    
    java.lang.reflect.InvocationTargetException
  - printProbsDocument
```
public Triple<Counter<java.lang.Integer>,Counter<java.lang.Integer>,TwoDimensionalCounter<java.lang.Integer,java.lang.String>> printProbsDocument(java.util.List<IN> document)
```
    Takes a List of something that extends CoreMap and prints the likelihood of each possible label at each point.
    
    Overrides:
    
    printProbsDocument in class AbstractSequenceClassifier<IN extends CoreMap>
    
    Parameters:
    
    document - A List of something that extends CoreMap.
    
    Returns:
    
    If verboseMode is set, a Triple of Counters recording classification decisions, else null.
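    
    The per-position label probabilities that such methods report are marginals of the sequence model. As a rough illustration of how marginals can be obtained for a linear chain, the sketch below runs the forward-backward recursions over explicit log-potential tables. It is not the CRFCliqueTree code: the class MarginalsSketch is hypothetical, a first-order chain with at least one position is assumed, and no log-space stabilization is applied, so it only suits small scores.

```java
// Hypothetical helper for illustration only; not part of the Stanford API.
final class MarginalsSketch {

    // emit[t][y] and trans[p][y] are log-potentials.
    // Returns p[t][y], the probability that position t carries label y.
    static double[][] marginals(double[][] emit, double[][] trans) {
        int n = emit.length;
        int k = emit[0].length;
        double[][] alpha = new double[n][k]; // forward sums
        double[][] beta = new double[n][k];  // backward sums
        for (int y = 0; y < k; y++) {
            alpha[0][y] = Math.exp(emit[0][y]);
            beta[n - 1][y] = 1.0;
        }
        for (int t = 1; t < n; t++) {
            for (int y = 0; y < k; y++) {
                double sum = 0.0;
                for (int p = 0; p < k; p++) {
                    sum += alpha[t - 1][p] * Math.exp(trans[p][y]);
                }
                alpha[t][y] = sum * Math.exp(emit[t][y]);
            }
        }
        for (int t = n - 2; t >= 0; t--) {
            for (int y = 0; y < k; y++) {
                double sum = 0.0;
                for (int q = 0; q < k; q++) {
                    sum += Math.exp(trans[y][q] + emit[t + 1][q]) * beta[t + 1][q];
                }
                beta[t][y] = sum;
            }
        }
        // The partition function is the total forward mass at the last position.
        double z = 0.0;
        for (int y = 0; y < k; y++) z += alpha[n - 1][y];
        double[][] p = new double[n][k];
        for (int t = 0; t < n; t++) {
            for (int y = 0; y < k; y++) {
                p[t][y] = alpha[t][y] * beta[t][y] / z;
            }
        }
        return p;
    }

    public static void main(String[] args) {
        double[][] emit = {{Math.log(3), 0}, {0, 0}};
        double[][] trans = {{0, 0}, {0, 0}};
        for (double[] row : marginals(emit, trans)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```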
  - zeroOrderProbabilities
```
public java.util.List<Counter<java.lang.String>> zeroOrderProbabilities(java.util.List<IN> document)
```
  - printFirstOrderProbs
```
public void printFirstOrderProbs(java.lang.String filename,
                                 DocumentReaderAndWriter<IN> readerAndWriter)
```
    Takes the file, reads it in, and prints out the likelihood of each possible label at each point. This gives a simple way to examine the probability distributions of the CRF. See getCliqueTrees() for more.
    
    Parameters:
    
    filename - The path to the specified file
  - printFirstOrderProbsDocuments
```
public void printFirstOrderProbsDocuments(ObjectBank<java.util.List<IN>> documents)
```
    Takes a List of documents and prints the likelihood of each possible label at each point.
    
    Parameters:
    
    documents - A List of List of INs.
  - printFactorTable
```
public void printFactorTable(java.lang.String filename,
                             DocumentReaderAndWriter<IN> readerAndWriter)
```
    Takes the file, reads it in, and prints out the factor table at each position.
    
    Parameters:
    
    filename - The path to the specified file
  - printFactorTableDocuments
```
public void printFactorTableDocuments(ObjectBank<java.util.List<IN>> documents)
```
    Takes a List of documents and prints the factor table at each point.
    
    Parameters:
    
    documents - A List of List of INs.
  - getCliqueTrees
```
public java.util.List<CRFCliqueTree<java.lang.String>> getCliqueTrees(java.lang.String filename,
                                                                      DocumentReaderAndWriter<IN> readerAndWriter)
```
    Want to make arbitrary probability queries? Then this is the method for you. Given the filename, it reads the file in, breaks it into documents, and makes a CRFCliqueTree for each document. You can then ask the clique tree for marginals and conditional probabilities of almost anything you want.
  - getCliqueTree
```
public CRFCliqueTree<java.lang.String> getCliqueTree(Triple<int[][][],int[],double[][][]> p)
```
  - getCliqueTree
```
public CRFCliqueTree<java.lang.String> getCliqueTree(java.util.List<IN> document)
```
  - printFactorTableDocument
```
public void printFactorTableDocument(java.util.List<IN> document)
```
    Takes a List of something that extends CoreMap and prints the factor table at each point.
    
    Parameters:
    
    document - A List of something that extends CoreMap.
  - printFirstOrderProbsDocument
```
public void printFirstOrderProbsDocument(java.util.List<IN> document)
```
    Takes a List of something that extends CoreMap and prints the likelihood of each possible label at each point.
    
    Parameters:
    
    document - A List of something that extends CoreMap.
  - loadAuxiliaryData
```
protected java.util.Collection<java.util.List<IN>> loadAuxiliaryData(java.util.Collection<java.util.List<IN>> docs,
                                                                     DocumentReaderAndWriter<IN> readerAndWriter)
```
    Load auxiliary data to be used in constructing features and labels. Intended to be overridden by subclasses.
  - train
```
public void train(java.util.Collection<java.util.List<IN>> objectBankWrapper,
                  DocumentReaderAndWriter<IN> readerAndWriter)
```
    Trains a classifier from a Collection of sequences. Note that the Collection can be (and usually is) an ObjectBank.
    
    Specified by:
    
    train in class AbstractSequenceClassifier<IN extends CoreMap>
    
    Parameters:
    
    objectBankWrapper - An ObjectBank or a collection of sequences of IN
    
    readerAndWriter - A DocumentReaderAndWriter to use when loading test files
  - to2D
```
public static float[][] to2D(double[] weights,
                             java.util.List<Index<CRFLabel>> labelIndices,
                             int[] map)
```
  - pruneNodeFeatureIndices
```
protected void pruneNodeFeatureIndices(int totalNumOfFeatureSlices,
                                       int numOfFeatureSlices)
```
  - getObjectiveFunction
```
protected CRFLogConditionalObjectiveFunction getObjectiveFunction(int[][][][] data,
                                                                  int[][] labels)
```
  - trainWeights
```
protected double[] trainWeights(int[][][][] data,
                                int[][] labels,
                                Evaluator[] evaluators,
                                int pruneFeatureItr,
                                double[][][][] featureVals)
```
  - getMinimizer
```
public Minimizer<DiffFunction> getMinimizer()
```
  - getMinimizer
```
public Minimizer<DiffFunction> getMinimizer(int featurePruneIteration,
                                            Evaluator[] evaluators)
```
  - extractDatumSequence
```
protected java.util.List<CRFDatum<? extends java.util.Collection<java.lang.String>,? extends java.lang.CharSequence>> extractDatumSequence(int[][][] allData,
                                                                                                                                           int beginPosition,
                                                                                                                                           int endPosition,
                                                                                                                                           java.util.List<IN> labeledWordInfos)
```
    Creates a new CRFDatum from the preprocessed allData format, given the document number, position number, and a List of Object labels.
    
    Returns:
    
    A new CRFDatum
  - addProcessedData
```
protected void addProcessedData(java.util.List<java.util.List<CRFDatum<java.util.Collection<java.lang.String>,java.lang.String>>> processedData,
                                int[][][][] data,
                                int[][] labels,
                                double[][][][] featureVals,
                                int offset)
```
    Adds the List of Lists of CRFDatums to the data and labels arrays, treating each datum as if it were its own document. Adds context labels in addition to the target label for each datum, meaning that for a particular document, the number of labels will be windowSize-1 greater than the number of datums.
    
    Parameters:
    
    processedData - A List of Lists of CRFDatums
  - saveProcessedData
```
protected static void saveProcessedData(java.util.List<?> datums,
                                        java.lang.String filename)
```
  - loadProcessedData
```
protected static java.util.List<java.util.List<CRFDatum<java.util.Collection<java.lang.String>,java.lang.String>>> loadProcessedData(java.lang.String filename)
```
  - loadTextClassifier
```
protected void loadTextClassifier(java.io.BufferedReader br)
                           throws java.lang.Exception
```
    Throws:
    
    java.lang.Exception
  - loadTextClassifier
```
public void loadTextClassifier(java.lang.String text,
                               java.util.Properties props)
                        throws java.lang.ClassCastException,
                               java.io.IOException,
                               java.lang.ClassNotFoundException,
                               java.lang.InstantiationException,
                               java.lang.IllegalAccessException
```
    Throws:
    
    java.lang.ClassCastException
    
    java.io.IOException
    
    java.lang.ClassNotFoundException
    
    java.lang.InstantiationException
    
    java.lang.IllegalAccessException
  - serializeTextClassifier
```
protected void serializeTextClassifier(java.io.PrintWriter pw)
                                throws java.lang.Exception
```
    Throws:
    
    java.lang.Exception
  - serializeTextClassifier
```
public void serializeTextClassifier(java.lang.String serializePath)
```
    Serializes the model to a human-readable text format. Text serialization is not yet complete, although it should work for the Chinese segmenter. TODO: check what serializeClassifier writes and add back any other necessary serialization.
    
    Parameters:
    
    serializePath - File to write text format of classifier to.
  - serializeClassIndex
```
public void serializeClassIndex(java.lang.String serializePath)
```
  - loadClassIndexFromFile
```
public static Index<java.lang.String> loadClassIndexFromFile(java.lang.String serializePath)
```
  - serializeWeights
```
public void serializeWeights(java.lang.String serializePath)
```
  - loadWeightsFromFile
```
public static double[][] loadWeightsFromFile(java.lang.String serializePath)
```
  - serializeFeatureIndex
```
public void serializeFeatureIndex(java.lang.String serializePath)
```
  - serializeFeatureIndexToText
```
public void serializeFeatureIndexToText(java.lang.String serializePath)
```
  - loadFeatureIndexFromFile
```
public static Index<java.lang.String> loadFeatureIndexFromFile(java.lang.String serializePath)
```
  - serializeClassifier
```
public void serializeClassifier(java.lang.String serializePath)
```
    Serialize a sequence classifier to a file on the given path.
    
    Specified by:
    
    serializeClassifier in class AbstractSequenceClassifier<IN extends CoreMap>
    
    Parameters:
    
    serializePath - The path/filename to write the classifier to.
  - serializeClassifier
```
public void serializeClassifier(java.io.ObjectOutputStream oos)
```
    Serialize the classifier to the given ObjectOutputStream.
    (Since the classifier is a processor, we don't want to serialize the whole classifier but just the data that represents a classifier model.)
    
    Specified by:
    
    serializeClassifier in class AbstractSequenceClassifier<IN extends CoreMap>
  - loadClassifier
```
public void loadClassifier(java.io.ObjectInputStream ois,
                           java.util.Properties props)
                    throws java.lang.ClassCastException,
                           java.io.IOException,
                           java.lang.ClassNotFoundException
```
    Loads a classifier from the specified InputStream. This version works quietly (unless VERBOSE is true). If props is non-null then any properties it specifies override those in the serialized file. However, only some properties are sensible to change (you shouldn't change how features are defined).
    Note: This method does not close the ObjectInputStream. (Earlier versions of the code did, so beware.)
    
    Specified by:
    
    loadClassifier in class AbstractSequenceClassifier<IN extends CoreMap>
    
    Parameters:
    
    ois - The InputStream to load the serialized classifier from
    
    props - This Properties object will be used to update the SeqClassifierFlags which are read from the serialized classifier
    
    Throws:
    
    java.lang.ClassCastException - If there are problems interpreting the serialized data
    
    java.io.IOException - If there are problems accessing the input stream
    
    java.lang.ClassNotFoundException - If there are problems interpreting the serialized data
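    
    The two contracts described above (the caller keeps ownership of the stream, and non-null props override the serialized values) can be sketched with the standard library alone. This is a hedged stand-in, not the CRFClassifier implementation: loadFlags, serialize, and the property keys are hypothetical, and a plain Properties object takes the place of the serialized SeqClassifierFlags.
```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Properties;

// Illustrative sketch; not the library's implementation.
final class StreamOwnershipSketch {
    // Hypothetical loader: reads the serialized settings, applies any
    // overrides, and deliberately leaves the stream open for the caller.
    static Properties loadFlags(ObjectInputStream ois, Properties overrides)
            throws IOException, ClassNotFoundException {
        Properties flags = (Properties) ois.readObject();
        if (overrides != null) {
            flags.putAll(overrides); // caller-supplied values win
        }
        return flags;
    }

    // Helper that produces the serialized bytes used in this demo.
    static byte[] serialize(Properties p) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(p);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        Properties saved = new Properties();
        saved.setProperty("featureOption", "fixed-at-training"); // hypothetical key
        saved.setProperty("runtimeOption", "serialized");        // hypothetical key

        Properties overrides = new Properties();
        overrides.setProperty("runtimeOption", "override");

        // The caller owns the stream, so the caller closes it.
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(serialize(saved)))) {
            Properties merged = loadFlags(ois, overrides);
            System.out.println(merged.getProperty("runtimeOption"));
        }
    }
}
```
    Using try-with-resources on the caller's side keeps the stream from leaking regardless of whether the loader closes it.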
  - loadDefaultClassifier
```
public void loadDefaultClassifier()
```
    This is used to load the default supplied classifier stored within the jar file. THIS FUNCTION WILL ONLY WORK IF THE CODE WAS LOADED FROM A JAR FILE WHICH HAS A SERIALIZED CLASSIFIER STORED INSIDE IT.
  - loadTagIndex
```
public void loadTagIndex()
```
  - writeWeights
```
public void writeWeights(java.io.PrintStream p)
```
  - topWeights
```
public java.util.Map<java.lang.String,Counter<java.lang.String>> topWeights()
```
  - classifyWithGlobalInformation
```
public java.util.List<IN> classifyWithGlobalInformation(java.util.List<IN> tokenSeq,
                                                        CoreMap doc,
                                                        CoreMap sent)
```
    Description copied from class: AbstractSequenceClassifier
    
    Classify a List of something that extends CoreMap using as additional information whatever is stored in the document and sentence. This is needed for SUTime (NumberSequenceClassifier), which requires the document date to resolve relative dates.
    
    Specified by:
    
    classifyWithGlobalInformation in class AbstractSequenceClassifier<IN extends CoreMap>
    
    Parameters:
    
    tokenSeq - A List of something that extends CoreMap
    
    Returns:
    
    Classified version of the input tokenSequence
  - loadDefaultClassifier
```
public void loadDefaultClassifier(java.util.Properties props)
```
    This is used to load the default supplied classifier stored within the jar file. THIS FUNCTION WILL ONLY WORK IF THE CODE WAS LOADED FROM A JAR FILE WHICH HAS A SERIALIZED CLASSIFIER STORED INSIDE IT.
  - getDefaultClassifier
```
public static <INN extends CoreMap> CRFClassifier<INN> getDefaultClassifier()
```
    Used to get the default supplied classifier inside the jar file. THIS FUNCTION WILL ONLY WORK IF THE CODE WAS LOADED FROM A JAR FILE WHICH HAS A SERIALIZED CLASSIFIER STORED INSIDE IT.
    
    Returns:
    
    The default CRFClassifier in the jar file (if there is one)
  - getDefaultClassifier
```
public static <INN extends CoreMap> CRFClassifier<INN> getDefaultClassifier(java.util.Properties props)
```
    Used to get the default supplied classifier inside the jar file. THIS FUNCTION WILL ONLY WORK IF THE CODE WAS LOADED FROM A JAR FILE WHICH HAS A SERIALIZED CLASSIFIER STORED INSIDE IT.
    
    Returns:
    
    The default CRFClassifier in the jar file (if there is one)
  - getClassifier
```
public static <INN extends CoreMap> CRFClassifier<INN> getClassifier(java.io.File file)
                                                              throws java.io.IOException,
                                                                     java.lang.ClassCastException,
                                                                     java.lang.ClassNotFoundException
```
    Loads a CRF classifier from a filepath, and returns it.
    
    Parameters:
    
    file - File to load classifier from
    
    Returns:
    
    The CRF classifier
    
    Throws:
    
    java.io.IOException - If there are problems accessing the input stream
    
    java.lang.ClassCastException - If there are problems interpreting the serialized data
    
    java.lang.ClassNotFoundException - If there are problems interpreting the serialized data
  - getClassifier
```
public static <INN extends CoreMap> CRFClassifier<INN> getClassifier(java.io.InputStream in)
                                                              throws java.io.IOException,
                                                                     java.lang.ClassCastException,
                                                                     java.lang.ClassNotFoundException
```
    Loads a CRF classifier from an InputStream, and returns it. This method does not buffer the InputStream, so you should have buffered it before calling this method.
    
    Parameters:
    
    in - InputStream to load classifier from
    
    Returns:
    
    The CRF classifier
    
    Throws:
    
    java.io.IOException - If there are problems accessing the input stream
    
    java.lang.ClassCastException - If there are problems interpreting the serialized data
    
    java.lang.ClassNotFoundException - If there are problems interpreting the serialized data
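    
    Because this method does not buffer its input, the caller is expected to wrap the raw stream first. The sketch below shows that caller-side pattern using only the standard library. readModel and loadFromFile are hypothetical stand-ins for a loader and its caller, and the serialized double[] is demo data rather than a real classifier; with the real API the buffered stream would be passed to getClassifier(InputStream) instead.
```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch; not the library's implementation.
final class BufferedLoadSketch {
    // Hypothetical loader: reads from whatever stream it is given and
    // adds no buffering of its own.
    static Object readModel(InputStream in) throws IOException, ClassNotFoundException {
        ObjectInputStream ois = new ObjectInputStream(in);
        return ois.readObject();
    }

    // Caller-side pattern: buffer the file stream before handing it over,
    // and close it once loading has finished.
    static Object loadFromFile(Path path) throws IOException, ClassNotFoundException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(path))) {
            return readModel(in);
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("model", ".ser");
        try {
            // Write a small serialized object to stand in for a model file.
            try (ObjectOutputStream oos = new ObjectOutputStream(Files.newOutputStream(tmp))) {
                oos.writeObject(new double[] {0.5, -1.25});
            }
            double[] weights = (double[]) loadFromFile(tmp);
            System.out.println(weights.length);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```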
  - getClassifier
```
public static <INN extends CoreMap> CRFClassifier<INN> getClassifier(java.io.ObjectInputStream ois)
                                                              throws java.io.IOException,
                                                                     java.lang.ClassCastException,
                                                                     java.lang.ClassNotFoundException
```
    Throws:
    
    java.io.IOException
    
    java.lang.ClassCastException
    
    java.lang.ClassNotFoundException
  - getClassifierNoExceptions
```
public static <INN extends CoreMap> CRFClassifier<INN> getClassifierNoExceptions(java.lang.String loadPath)
```
  - getClassifier
```
public static CRFClassifier<CoreLabel> getClassifier(java.lang.String loadPath)
                                              throws java.io.IOException,
                                                     java.lang.ClassCastException,
                                                     java.lang.ClassNotFoundException
```
    Throws:
    
    java.io.IOException
    
    java.lang.ClassCastException
    
    java.lang.ClassNotFoundException
  - getClassifier
```
public static <INN extends CoreMap> CRFClassifier<INN> getClassifier(java.lang.String loadPath,
                                                                     java.util.Properties props)
                                                              throws java.io.IOException,
                                                                     java.lang.ClassCastException,
                                                                     java.lang.ClassNotFoundException
```
    Throws:
    
    java.io.IOException
    
    java.lang.ClassCastException
    
    java.lang.ClassNotFoundException
  - getClassifier
```
public static <INN extends CoreMap> CRFClassifier<INN> getClassifier(java.io.ObjectInputStream ois,
                                                                     java.util.Properties props)
                                                              throws java.io.IOException,
                                                                     java.lang.ClassCastException,
                                                                     java.lang.ClassNotFoundException
```
    Throws:
    
    java.io.IOException
    
    java.lang.ClassCastException
    
    java.lang.ClassNotFoundException
  - toString
```
public java.lang.String toString()
```
    Overrides:
    
    toString in class java.lang.Object
  - main
```
public static void main(java.lang.String[] args)
                 throws java.lang.Exception
```
    The main method. See the class documentation.
    
    Throws:
    
    java.lang.Exception
