edu.stanford.nlp.ie.crf
Class CRFClassifier<IN extends CoreMap>

java.lang.Object
  extended by edu.stanford.nlp.ie.AbstractSequenceClassifier<IN>
      extended by edu.stanford.nlp.ie.crf.CRFClassifier<IN>
All Implemented Interfaces:
Function<String,String>
Direct Known Subclasses:
CRFBiasedClassifier

public class CRFClassifier<IN extends CoreMap>
extends AbstractSequenceClassifier<IN>

Class for Sequence Classification using a Conditional Random Field model. The code has functionality for different document formats, but when using the standard ColumnDocumentReaderAndWriter for training or testing models, input files are expected to be one token per line with the columns indicating things like the word, POS, chunk, and answer class. The default for ColumnDocumentReaderAndWriter training data is 3 column input, with the columns containing a word, its POS, and its gold class, but this can be specified via the map property.
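As a rough illustration of the column format described above, the following sketch reads one-token-per-line input according to a map string. It is not the ColumnDocumentReaderAndWriter implementation: the class ColumnFormatSketch and its methods are hypothetical, and it assumes that columns are separated by whitespace, that a blank line ends a document, and that the three-column default corresponds to a map of the form word=0,tag=1,answer=2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ColumnFormatSketch {

    /** Parses a map string such as "word=0,tag=1,answer=2" into column name -> column index. */
    static Map<String, Integer> parseMap(String mapSpec) {
        Map<String, Integer> columns = new LinkedHashMap<>();
        for (String entry : mapSpec.split(",")) {
            String[] kv = entry.trim().split("=");
            columns.put(kv[0].trim(), Integer.parseInt(kv[1].trim()));
        }
        return columns;
    }

    /** Splits one token line on whitespace and names each field according to the map. */
    static Map<String, String> parseLine(String line, Map<String, Integer> columns) {
        String[] fields = line.trim().split("\\s+");
        Map<String, String> token = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : columns.entrySet()) {
            token.put(e.getKey(), fields[e.getValue()]);
        }
        return token;
    }

    /** Groups token lines into documents; a blank line ends the current document (an assumption). */
    static List<List<Map<String, String>>> parseDocuments(List<String> lines, String mapSpec) {
        Map<String, Integer> columns = parseMap(mapSpec);
        List<List<Map<String, String>>> docs = new ArrayList<>();
        List<Map<String, String>> current = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                if (!current.isEmpty()) {
                    docs.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(parseLine(line, columns));
            }
        }
        if (!current.isEmpty()) {
            docs.add(current);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "Jenny NNP PERSON", "works VBZ O", "", "Stanford NNP ORGANIZATION");
        System.out.println(parseDocuments(lines, "word=0,tag=1,answer=2"));
    }
}
```

Changing the map string (for example to a two-column layout with only a word and an answer) changes which field is read for each name without touching the parsing code.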

When run on a file with -textFile, the file is assumed to be plain English text (or perhaps simple HTML/XML), and a reasonable attempt is made at English tokenization by PlainTextDocumentReaderAndWriter.

Typical command-line usage

For running a trained model with a provided serialized classifier on a text file:

java -mx500m edu.stanford.nlp.ie.crf.CRFClassifier -loadClassifier conll.ner.gz -textFile samplesentences.txt

When specifying all parameters in a properties file (train, test, or runtime):

java -mx1g edu.stanford.nlp.ie.crf.CRFClassifier -prop propFile

To train and test a simple NER model from the command line:

java -mx1000m edu.stanford.nlp.ie.crf.CRFClassifier -trainFile trainFile -testFile testFile -macro > output

To train with multiple files:

java -mx1000m edu.stanford.nlp.ie.crf.CRFClassifier -trainFileList file1,file2,... -testFile testFile -macro > output
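For the -prop propFile form shown above, the following sketch suggests what such a file might contain and how its contents map onto a java.util.Properties object of the kind accepted by the CRFClassifier(Properties) constructor. It assumes that property names mirror the command-line flags (trainFile, testFile) and the map property mentioned in the class description; the class PropFileSketch and the placeholder values are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropFileSketch {

    /** Illustrative contents of a properties file; the values are placeholders. */
    static final String EXAMPLE =
        "# hypothetical propFile contents\n"
        + "trainFile = trainFile\n"
        + "testFile = testFile\n"
        + "map = word=0,tag=1,answer=2\n";

    /** Loads properties text exactly as java.util.Properties would read it from a file. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load(EXAMPLE);
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + " -> " + props.getProperty(key));
        }
    }
}
```

Only the first unescaped = on a line separates key from value, so a map value such as word=0,tag=1,answer=2 can be written without escaping.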

Features are defined by a FeatureFactory. NERFeatureFactory is used by default, and you should look there for feature templates and for the properties or flags that cause particular features to be used when training an NER classifier. There is also an edu.stanford.nlp.wordseg.SighanFeatureFactory, and various successors such as edu.stanford.nlp.wordseg.ChineseSegmenterFeatureFactory, which are used for Chinese word segmentation. Features are specified either by a Properties file (which is the recommended method) or by flags on the command line. The flags are read into a SeqClassifierFlags object, which the user need not be concerned with, unless wishing to add new features.

CRFClassifier may also be used programmatically. When creating a new instance, you must specify a Properties object. You may then call one of the train methods to train a classifier, or load a previously serialized classifier. Alternatively, you can obtain a CRFClassifier from the static getClassifier(String) methods, which return a deserialized classifier. You may then tag (classify the items of) documents using classify() or the other classify methods inherited from AbstractSequenceClassifier. Probabilities assigned by the CRF can be interrogated using either the printProbsDocument() or getCliqueTrees() methods.

Author:
Jenny Finkel, Sonal Gupta (made the class generic)

Nested Class Summary
static class CRFClassifier.TestSequenceModel
           
 
Field Summary
static String DEFAULT_CLASSIFIER
          Name of default serialized classifier resource to look for in a jar file.
 
Fields inherited from class edu.stanford.nlp.ie.AbstractSequenceClassifier
classIndex, featureFactory, flags, knownLCWords, pad, windowSize
 
Constructor Summary
protected CRFClassifier()
           
  CRFClassifier(CRFClassifier<IN> crf)
          Makes a copy of the given CRF classifier.
  CRFClassifier(Properties props)
           
  CRFClassifier(SeqClassifierFlags flags)
           
 
Method Summary
protected  void addProcessedData(List<List<CRFDatum<Collection<String>,String>>> processedData, int[][][][] data, int[][] labels, int offset)
          Adds the List of Lists of CRFDatums to the data and labels arrays, treating each datum as if it were its own document.
protected static Index<CRFLabel> allLabels(int window, Index classIndex)
           
 List<IN> classify(List<IN> document)
          Classify a List of something that extends CoreMap.
 void classifyAndWriteAnswers(Collection<List<IN>> documents, List<Pair<int[][][],int[]>> documentDataAndLabels, OutputStream outStream, DocumentReaderAndWriter readerAndWriter)
           
 List<IN> classifyGibbs(List<IN> document)
           
 List<IN> classifyGibbs(List<IN> document, Pair<int[][][],int[]> documentDataAndLabels)
           
 List<IN> classifyGibbsUsingPrior(List<IN> sentence, SequenceModel[] priorModels, SequenceListener[] priorListeners, double[] modelWts)
           
 List<IN> classifyGibbsUsingPrior(List<IN> sentence, SequenceModel priorModel, SequenceListener priorListener, double model1Wt, double model2Wt)
           
 List<IN> classifyMaxEnt(List<IN> document)
          Do standard sequence inference, using either Viterbi or Beam inference depending on the value of flags.inferenceType.
 List<IN> classifyWithGlobalInformation(List<IN> tokenSeq, CoreMap doc, CoreMap sent)
          Classify a List of something that extends CoreMap using as additional information whatever is stored in the document and sentence.
 void combine(CRFClassifier<IN> crf, double weight)
          Combines the given crf, weighted by weight, with this CRF.
 Pair<int[][][][],int[][]> documentsToDataAndLabels(Collection<List<IN>> documents)
          Convert an ObjectBank to arrays of data features and labels.
 List<Pair<int[][][],int[]>> documentsToDataAndLabelsList(Collection<List<IN>> documents)
          Convert an ObjectBank to corresponding collection of data features and labels.
 Pair<int[][][],int[]> documentToDataAndLabels(List<IN> document)
          Convert a document List into arrays storing the data features and labels.
 void dropFeaturesBelowThreshold(double threshold)
           
protected  List<CRFDatum> extractDatumSequence(int[][][] allData, int beginPosition, int endPosition, List<IN> labeledWordInfos)
          Creates a List of CRFDatums from the preprocessed allData format, given the begin and end positions and the List of labeled items (labeledWordInfos).
static <IN extends CoreMap> CRFClassifier<IN> getClassifier(File file)
          Loads a CRF classifier from a File, and returns it.
static CRFClassifier getClassifier(InputStream in)
          Loads a CRF classifier from an InputStream, and returns it.
static CRFClassifier getClassifier(String loadPath)
           
static CRFClassifier getClassifier(String loadPath, Properties props)
           
static CRFClassifier getClassifierNoExceptions(String loadPath)
           
 List<CRFCliqueTree> getCliqueTrees(String filename, DocumentReaderAndWriter readerAndWriter)
          Want to make arbitrary probability queries? Then this is the method for you.
static <IN extends CoreMap> CRFClassifier<IN> getDefaultClassifier()
          Used to get the default supplied classifier inside the jar file.
static <IN extends CoreMap> CRFClassifier<IN> getDefaultClassifier(Properties props)
          Used to get the default supplied classifier inside the jar file.
static <IN extends CoreMap> CRFClassifier<IN> getJarClassifier(String resourceName, Properties props)
          Used to load a classifier stored as a resource inside a jar file.
protected  Minimizer getMinimizer()
           
protected  Minimizer getMinimizer(int featurePruneIteration, Evaluator[] evaluators)
           
 int getNumWeights()
          Returns the total number of weights associated with this classifier.
 SequenceModel getSequenceModel(List<IN> doc)
           
 SequenceModel getSequenceModel(List<IN> doc, Pair<int[][][],int[]> documentDataAndLabels)
           
 void loadClassifier(ObjectInputStream ois, Properties props)
          Loads a classifier from the specified InputStream.
 void loadDefaultClassifier()
          This is used to load the default supplied classifier stored within the jar file.
 void loadDefaultClassifier(Properties props)
          This is used to load the default supplied classifier stored within the jar file.
protected static List loadProcessedData(String filename)
           
 void loadTextClassifier(String text, Properties props)
           
static void main(String[] args)
          The main method.
protected  void makeAnswerArraysAndTagIndex(Collection<List<IN>> ob)
          This routine builds the labelIndices which give the empirically legal label sequences (of length (order) at most windowSize) and the classIndex, which indexes known answer classes.
 CRFDatum<List<String>,CRFLabel> makeDatum(List<IN> info, int loc, FeatureFactory<IN> featureFactory)
          Makes a CRFDatum by producing features and a label from input data at a specific position, using the provided factory.
 void printFirstOrderProbs(String filename, DocumentReaderAndWriter readerAndWriter)
          Takes the file, reads it in, and prints out the likelihood of each possible label at each point.
 void printFirstOrderProbsDocument(List<IN> document)
          Takes a List of something that extends CoreMap and prints the likelihood of each possible label at each point.
 void printFirstOrderProbsDocuments(ObjectBank<List<IN>> documents)
          Takes a List of documents and prints the likelihood of each possible label at each point.
 void printLabelInformation(String testFile, DocumentReaderAndWriter readerAndWriter)
           
 void printLabelValue(List<IN> document)
           
 void printProbsDocument(List<IN> document)
          Takes a List of something that extends CoreMap and prints the likelihood of each possible label at each point.
protected static void saveProcessedData(List datums, String filename)
           
 void scaleWeights(double scale)
          Scales the weights of this CRFClassifier by the specified factor.
 void serializeClassifier(String serializePath)
          Serialize a sequence classifier to a file on the given path.
 void serializeTextClassifier(String serializePath)
          Serialize the model to a human readable format.
 void train(Collection<List<IN>> docs, DocumentReaderAndWriter readerAndWriter)
          Train a classifier from documents.
 
Methods inherited from class edu.stanford.nlp.ie.AbstractSequenceClassifier
apply, backgroundSymbol, classify, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswersKBest, classifyAndWriteViterbiSearchGraph, classifyFile, classifyKBest, classifyRaw, classifySentence, classifySentenceWithGlobalInformation, classifyToCharacterOffsets, classifyToString, classifyToString, classifyWithInlineXML, countResults, getSampler, getViterbiSearchGraph, labels, loadClassifier, loadClassifier, loadClassifier, loadClassifier, loadClassifier, loadClassifier, loadClassifierNoExceptions, loadClassifierNoExceptions, loadClassifierNoExceptions, loadClassifierNoExceptions, loadClassifierNoExceptions, loadJarClassifier, makeObjectBankFromFile, makeObjectBankFromFiles, makeObjectBankFromFiles, makeObjectBankFromFiles, makeObjectBankFromReader, makeObjectBankFromString, makeReaderAndWriter, printFeatureLists, printFeatures, printProbs, printProbsDocuments, printResults, reinit, segmentString, segmentString, train, train, train, train, train, train, writeAnswers
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_CLASSIFIER

public static final String DEFAULT_CLASSIFIER
Name of default serialized classifier resource to look for in a jar file.

See Also:
Constant Field Values
Constructor Detail

CRFClassifier

protected CRFClassifier()

CRFClassifier

public CRFClassifier(Properties props)

CRFClassifier

public CRFClassifier(SeqClassifierFlags flags)

CRFClassifier

public CRFClassifier(CRFClassifier<IN> crf)
Makes a copy of the given CRF classifier.

Method Detail

getNumWeights

public int getNumWeights()
Returns the total number of weights associated with this classifier.

Returns:
number of weights

scaleWeights

public void scaleWeights(double scale)
Scales the weights of this CRFClassifier by the specified factor.

Parameters:
scale - The factor by which the weights are scaled

combine

public void combine(CRFClassifier<IN> crf,
                    double weight)
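Combines the given crf, weighted by weight, with this CRF.

Parameters:
crf - The CRFClassifier to combine with this one
weight - The weight applied to crf when combining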
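The documentation for scaleWeights and combine is terse, so the sketch below shows one plausible reading on a plain feature-to-weight map. Scaling multiplies every weight by a factor. For combine, the sketch assumes that the other model's weights are multiplied by weight and added to this model's weights, with features unknown to this model being added. That interpretation is an assumption rather than something this page confirms, and the class WeightOpsSketch is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class WeightOpsSketch {

    /** Multiplies every weight in place by the given factor. */
    static void scale(Map<String, Double> weights, double factor) {
        weights.replaceAll((feature, w) -> w * factor);
    }

    /**
     * Adds weight * other[feature] to target[feature] for every feature in other.
     * ASSUMPTION: this additive reading of "combine" is illustrative only.
     */
    static void combine(Map<String, Double> target, Map<String, Double> other, double weight) {
        other.forEach((feature, w) -> target.merge(feature, weight * w, Double::sum));
    }

    public static void main(String[] args) {
        Map<String, Double> mine = new HashMap<>();
        mine.put("word=Stanford|ORG", 1.0);
        Map<String, Double> theirs = new HashMap<>();
        theirs.put("word=Stanford|ORG", 2.0);
        theirs.put("word=Jenny|PER", 4.0);

        scale(mine, 0.5);
        combine(mine, theirs, 0.5);
        System.out.println(mine);
    }
}
```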

dropFeaturesBelowThreshold

public void dropFeaturesBelowThreshold(double threshold)

documentToDataAndLabels

public Pair<int[][][],int[]> documentToDataAndLabels(List<IN> document)
Convert a document List into arrays storing the data features and labels.

Parameters:
document - The document to convert
Returns:
A Pair, where the first element is an int[][][] representing the data and the second element is an int[] representing the labels

printLabelInformation

public void printLabelInformation(String testFile,
                                  DocumentReaderAndWriter readerAndWriter)
                           throws Exception
Throws:
Exception

printLabelValue

public void printLabelValue(List<IN> document)

documentsToDataAndLabels

public Pair<int[][][][],int[][]> documentsToDataAndLabels(Collection<List<IN>> documents)
Convert an ObjectBank to arrays of data features and labels.

Returns:
A Pair, where the first element is an int[][][][] representing the data and the second element is an int[][] representing the labels.

documentsToDataAndLabelsList

public List<Pair<int[][][],int[]>> documentsToDataAndLabelsList(Collection<List<IN>> documents)
Convert an ObjectBank to corresponding collection of data features and labels.

Returns:
A List of pairs, one for each document, where the first element is an int[][][] representing the data and the second element is an int[] representing the labels.

makeAnswerArraysAndTagIndex

protected void makeAnswerArraysAndTagIndex(Collection<List<IN>> ob)
This routine builds the labelIndices which give the empirically legal label sequences (of length (order) at most windowSize) and the classIndex, which indexes known answer classes.

Parameters:
ob - The training data: a Collection (typically read from an ObjectBank) in which each item is a List.
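The notion of "empirically legal label sequences" can be pictured as the set of label n-grams, for each length up to windowSize, that actually occur in the training data. The sketch below collects them from plain lists of label strings. It is a simplified illustration only: the class LabelWindowSketch is hypothetical, it ignores any padding at document boundaries, and it does not use the Index and CRFLabel types of the real implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class LabelWindowSketch {

    /**
     * Returns, for each length n in 1..windowSize (at list index n-1),
     * the distinct label n-grams observed in the given label sequences.
     */
    static List<Set<List<String>>> observedLabelWindows(List<List<String>> labelSeqs, int windowSize) {
        List<Set<List<String>>> byLength = new ArrayList<>();
        for (int n = 1; n <= windowSize; n++) {
            Set<List<String>> seen = new LinkedHashSet<>();
            for (List<String> seq : labelSeqs) {
                // Slide a window of n labels over the sequence.
                for (int end = n; end <= seq.size(); end++) {
                    seen.add(new ArrayList<>(seq.subList(end - n, end)));
                }
            }
            byLength.add(seen);
        }
        return byLength;
    }

    public static void main(String[] args) {
        List<List<String>> training = Arrays.asList(Arrays.asList("O", "PER", "PER", "O"));
        System.out.println(observedLabelWindows(training, 2));
    }
}
```

Restricting inference to label sequences seen in training keeps the number of candidate label windows well below the full cross product of the label set.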

allLabels

protected static Index<CRFLabel> allLabels(int window,
                                           Index classIndex)

makeDatum

public CRFDatum<List<String>,CRFLabel> makeDatum(List<IN> info,
                                                 int loc,
                                                 FeatureFactory<IN> featureFactory)
Makes a CRFDatum by producing features and a label from input data at a specific position, using the provided factory.

Parameters:
info - The input data
loc - The position to build a datum at
featureFactory - The FeatureFactory to use to extract features
Returns:
The constructed CRFDatum

classify

public List<IN> classify(List<IN> document)
Description copied from class: AbstractSequenceClassifier
Classify a List of something that extends CoreMap. The classifications are added in place to the items of the document, which is also returned by this method.

Specified by:
classify in class AbstractSequenceClassifier<IN extends CoreMap>
Parameters:
document - A List of something that extends CoreMap.
Returns:
The same List, but with the elements annotated with their answers (stored under the CoreAnnotations.AnswerAnnotation key).

classifyAndWriteAnswers

public void classifyAndWriteAnswers(Collection<List<IN>> documents,
                                    List<Pair<int[][][],int[]>> documentDataAndLabels,
                                    OutputStream outStream,
                                    DocumentReaderAndWriter readerAndWriter)
                             throws IOException
Throws:
IOException

getSequenceModel

public SequenceModel getSequenceModel(List<IN> doc)
Overrides:
getSequenceModel in class AbstractSequenceClassifier<IN extends CoreMap>

getSequenceModel

public SequenceModel getSequenceModel(List<IN> doc,
                                      Pair<int[][][],int[]> documentDataAndLabels)

classifyMaxEnt

public List<IN> classifyMaxEnt(List<IN> document)
Do standard sequence inference, using either Viterbi or Beam inference depending on the value of flags.inferenceType.

Parameters:
document - Document to classify. Classification happens in place. This document is modified.
Returns:
The classified document
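To make the Viterbi option concrete, here is a minimal first-order Viterbi decoder over hypothetical score arrays. It is a textbook sketch, not the code behind classifyMaxEnt: the class ViterbiSketch and the emit/trans arrays are illustrative names, and the real classifier scores cliques of up to windowSize labels rather than simple label pairs.

```java
import java.util.Arrays;

class ViterbiSketch {

    /**
     * emit[t][y]  = score of label y at position t
     * trans[p][y] = score of moving from label p to label y
     * Scores are additive (log space). Returns the highest-scoring label sequence.
     */
    static int[] decode(double[][] emit, double[][] trans) {
        int n = emit.length;
        if (n == 0) {
            return new int[0];
        }
        int k = emit[0].length;
        double[][] best = new double[n][k]; // best[t][y]: best score of any path ending in y at t
        int[][] back = new int[n][k];       // back[t][y]: previous label on that best path
        System.arraycopy(emit[0], 0, best[0], 0, k);

        for (int t = 1; t < n; t++) {
            for (int y = 0; y < k; y++) {
                int arg = 0;
                double max = Double.NEGATIVE_INFINITY;
                for (int p = 0; p < k; p++) {
                    double s = best[t - 1][p] + trans[p][y];
                    if (s > max) {
                        max = s;
                        arg = p;
                    }
                }
                best[t][y] = max + emit[t][y];
                back[t][y] = arg;
            }
        }

        // Pick the best final label, then follow the back-pointers.
        int last = 0;
        for (int y = 1; y < k; y++) {
            if (best[n - 1][y] > best[n - 1][last]) {
                last = y;
            }
        }
        int[] path = new int[n];
        path[n - 1] = last;
        for (int t = n - 1; t > 0; t--) {
            path[t - 1] = back[t][path[t]];
        }
        return path;
    }

    public static void main(String[] args) {
        double[][] emit = {{1, 0}, {0, 0.5}, {0, 2}};
        double[][] trans = {{0.5, -1}, {-1, 0.5}};
        System.out.println(Arrays.toString(decode(emit, trans)));
    }
}
```

Beam inference follows the same recurrence but keeps only a bounded number of partial paths at each position, trading exactness for speed.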

classifyGibbs

public List<IN> classifyGibbs(List<IN> document)
                                       throws ClassNotFoundException,
                                              SecurityException,
                                              NoSuchMethodException,
                                              IllegalArgumentException,
                                              InstantiationException,
                                              IllegalAccessException,
                                              InvocationTargetException
Throws:
ClassNotFoundException
SecurityException
NoSuchMethodException
IllegalArgumentException
InstantiationException
IllegalAccessException
InvocationTargetException

classifyGibbs

public List<IN> classifyGibbs(List<IN> document,
                              Pair<int[][][],int[]> documentDataAndLabels)
                                       throws ClassNotFoundException,
                                              SecurityException,
                                              NoSuchMethodException,
                                              IllegalArgumentException,
                                              InstantiationException,
                                              IllegalAccessException,
                                              InvocationTargetException
Throws:
ClassNotFoundException
SecurityException
NoSuchMethodException
IllegalArgumentException
InstantiationException
IllegalAccessException
InvocationTargetException

classifyGibbsUsingPrior

public List<IN> classifyGibbsUsingPrior(List<IN> sentence,
                                        SequenceModel[] priorModels,
                                        SequenceListener[] priorListeners,
                                        double[] modelWts)
                                                 throws ClassNotFoundException,
                                                        SecurityException,
                                                        NoSuchMethodException,
                                                        IllegalArgumentException,
                                                        InstantiationException,
                                                        IllegalAccessException,
                                                        InvocationTargetException
Parameters:
sentence -
priorModels - an array of prior models
priorListeners - an array of prior listeners
modelWts - an array of model weights. IMPORTANT: this includes the weight of the CRF classifier at position 0, and is therefore longer than the priorListeners/priorModels arrays by 1.
Returns:
A list of INs with
Throws:
ClassNotFoundException
SecurityException
NoSuchMethodException
IllegalArgumentException
InstantiationException
IllegalAccessException
InvocationTargetException
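The length contract on modelWts is easy to get wrong, so the following sketch checks it explicitly. It assumes, for illustration only, that the models are combined as a weighted sum of their scores; the class ModelWeightsSketch and its method are hypothetical and are not part of CRFClassifier.

```java
class ModelWeightsSketch {

    /**
     * Combines a CRF score with prior-model scores.
     * modelWts[0] weights the CRF; modelWts[i + 1] weights priorScores[i].
     * ASSUMPTION: a weighted sum is used here purely as an illustration.
     */
    static double combinedScore(double crfScore, double[] priorScores, double[] modelWts) {
        if (modelWts.length != priorScores.length + 1) {
            throw new IllegalArgumentException(
                "modelWts must have one more entry than the prior models: expected "
                + (priorScores.length + 1) + " but got " + modelWts.length);
        }
        double total = modelWts[0] * crfScore;
        for (int i = 0; i < priorScores.length; i++) {
            total += modelWts[i + 1] * priorScores[i];
        }
        return total;
    }

    public static void main(String[] args) {
        double[] priorScores = {1.0, 3.0};    // two prior models
        double[] modelWts = {0.5, 1.0, 2.0};  // CRF weight first, then one weight per prior
        System.out.println(combinedScore(2.0, priorScores, modelWts));
    }
}
```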

classifyGibbsUsingPrior

public List<IN> classifyGibbsUsingPrior(List<IN> sentence,
                                        SequenceModel priorModel,
                                        SequenceListener priorListener,
                                        double model1Wt,
                                        double model2Wt)
                                                 throws ClassNotFoundException,
                                                        SecurityException,
                                                        NoSuchMethodException,
                                                        IllegalArgumentException,
                                                        InstantiationException,
                                                        IllegalAccessException,
                                                        InvocationTargetException
Throws:
ClassNotFoundException
SecurityException
NoSuchMethodException
IllegalArgumentException
InstantiationException
IllegalAccessException
InvocationTargetException

printProbsDocument

public void printProbsDocument(List<IN> document)
Takes a List of something that extends CoreMap and prints the likelihood of each possible label at each point.

Specified by:
printProbsDocument in class AbstractSequenceClassifier<IN extends CoreMap>
Parameters:
document - A List of something that extends CoreMap.

printFirstOrderProbs

public void printFirstOrderProbs(String filename,
                                 DocumentReaderAndWriter readerAndWriter)
Takes the file, reads it in, and prints out the likelihood of each possible label at each point. This gives a simple way to examine the probability distributions of the CRF. See getCliqueTrees() for more.

Parameters:
filename - The path to the specified file

printFirstOrderProbsDocuments

public void printFirstOrderProbsDocuments(ObjectBank<List<IN>> documents)
Takes a List of documents and prints the likelihood of each possible label at each point.

Parameters:
documents - A List of List of INs.

getCliqueTrees

public List<CRFCliqueTree> getCliqueTrees(String filename,
                                          DocumentReaderAndWriter readerAndWriter)
Want to make arbitrary probability queries? Then this is the method for you. Given the filename, it reads the file in and breaks it into documents, and then makes a CRFCliqueTree for each document. You can then ask the clique tree for marginals and conditional probabilities of almost anything you want.
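As background for the kind of query a clique tree answers, the sketch below computes per-position label marginals for a first-order chain with the forward-backward algorithm. It does not use CRFCliqueTree: the class MarginalsSketch and the emit/trans score arrays are hypothetical, and exponentiating scores directly, as done here, is only suitable for small examples.

```java
import java.util.Arrays;

class MarginalsSketch {

    /**
     * emit[t][y] and trans[p][y] are additive (log-space) scores.
     * Returns m where m[t][y] is the marginal probability of label y at position t.
     */
    static double[][] marginals(double[][] emit, double[][] trans) {
        int n = emit.length;
        if (n == 0) {
            return new double[0][];
        }
        int k = emit[0].length;
        double[][] fwd = new double[n][k];
        double[][] bwd = new double[n][k];

        // Forward pass: total unnormalized mass of all prefixes ending in y at t.
        for (int y = 0; y < k; y++) {
            fwd[0][y] = Math.exp(emit[0][y]);
        }
        for (int t = 1; t < n; t++) {
            for (int y = 0; y < k; y++) {
                double sum = 0.0;
                for (int p = 0; p < k; p++) {
                    sum += fwd[t - 1][p] * Math.exp(trans[p][y]);
                }
                fwd[t][y] = sum * Math.exp(emit[t][y]);
            }
        }

        // Backward pass: total unnormalized mass of all suffixes following p at t.
        Arrays.fill(bwd[n - 1], 1.0);
        for (int t = n - 2; t >= 0; t--) {
            for (int p = 0; p < k; p++) {
                double sum = 0.0;
                for (int y = 0; y < k; y++) {
                    sum += Math.exp(trans[p][y]) * Math.exp(emit[t + 1][y]) * bwd[t + 1][y];
                }
                bwd[t][p] = sum;
            }
        }

        // Normalizer: total mass of all label sequences.
        double z = 0.0;
        for (int y = 0; y < k; y++) {
            z += fwd[n - 1][y];
        }

        double[][] m = new double[n][k];
        for (int t = 0; t < n; t++) {
            for (int y = 0; y < k; y++) {
                m[t][y] = fwd[t][y] * bwd[t][y] / z;
            }
        }
        return m;
    }

    public static void main(String[] args) {
        double[][] emit = {{1, 0}, {0, 0.5}, {0, 2}};
        double[][] trans = {{0.5, -1}, {-1, 0.5}};
        System.out.println(Arrays.deepToString(marginals(emit, trans)));
    }
}
```

Each row of the result sums to 1, which is a quick sanity check when experimenting with different scores.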


printFirstOrderProbsDocument

public void printFirstOrderProbsDocument(List<IN> document)
Takes a List of something that extends CoreMap and prints the likelihood of each possible label at each point.

Parameters:
document - A List of something that extends CoreMap.

train

public void train(Collection<List<IN>> docs,
                  DocumentReaderAndWriter readerAndWriter)
Train a classifier from documents.

Specified by:
train in class AbstractSequenceClassifier<IN extends CoreMap>
Parameters:
docs - The training documents. This is typically an ObjectBank, but any Collection of documents is accepted.
readerAndWriter - A DocumentReaderAndWriter to use when loading test files

getMinimizer

protected Minimizer getMinimizer()

getMinimizer

protected Minimizer getMinimizer(int featurePruneIteration,
                                 Evaluator[] evaluators)

extractDatumSequence

protected List<CRFDatum> extractDatumSequence(int[][][] allData,
                                              int beginPosition,
                                              int endPosition,
                                              List<IN> labeledWordInfos)
Creates a List of CRFDatums from the preprocessed allData format, given the begin and end positions and the List of labeled items (labeledWordInfos).

Returns:
A List of new CRFDatums

addProcessedData

protected void addProcessedData(List<List<CRFDatum<Collection<String>,String>>> processedData,
                                int[][][][] data,
                                int[][] labels,
                                int offset)
Adds the List of Lists of CRFDatums to the data and labels arrays, treating each datum as if it were its own document. Adds context labels in addition to the target label for each datum, meaning that for a particular document, the number of labels will be windowSize-1 greater than the number of datums.

Parameters:
processedData - a List of Lists of CRFDatums

saveProcessedData

protected static void saveProcessedData(List datums,
                                        String filename)

loadProcessedData

protected static List loadProcessedData(String filename)

loadTextClassifier

public void loadTextClassifier(String text,
                               Properties props)
                        throws ClassCastException,
                               IOException,
                               ClassNotFoundException,
                               InstantiationException,
                               IllegalAccessException
Throws:
ClassCastException
IOException
ClassNotFoundException
InstantiationException
IllegalAccessException

serializeTextClassifier

public void serializeTextClassifier(String serializePath)
Serialize the model to a human readable format. This is not yet complete, although it should now work for the Chinese segmenter. TODO: check things in serializeClassifier and add other necessary serialization back.

Parameters:
serializePath - File to write text format of classifier to.

serializeClassifier

public void serializeClassifier(String serializePath)
Serialize a sequence classifier to a file on the given path.

Specified by:
serializeClassifier in class AbstractSequenceClassifier<IN extends CoreMap>
Parameters:
serializePath - The path/filename to write the classifier to.

loadClassifier

public void loadClassifier(ObjectInputStream ois,
                           Properties props)
                    throws ClassCastException,
                           IOException,
                           ClassNotFoundException
Loads a classifier from the specified InputStream. This version works quietly (unless VERBOSE is true). If props is non-null then any properties it specifies override those in the serialized file. However, only some properties are sensible to change (you shouldn't change how features are defined).

Note: This method does not close the ObjectInputStream. (Earlier versions of the code did close it, so beware.)

Specified by:
loadClassifier in class AbstractSequenceClassifier<IN extends CoreMap>
Parameters:
ois - The InputStream to load the serialized classifier from
props - This Properties object will be used to update the SeqClassifierFlags which are read from the serialized classifier
Throws:
ClassCastException - If there are problems interpreting the serialized data
IOException - If there are problems accessing the input stream
ClassNotFoundException - If there are problems interpreting the serialized data
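Because loadClassifier leaves the ObjectInputStream open, the caller is responsible for closing it. The sketch below shows that ownership pattern with try-with-resources, using only java.io. The readModel method is a hypothetical stand-in for a loader that reads from the stream without closing it, and a String takes the place of a serialized classifier.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class StreamOwnershipSketch {

    /** Hypothetical stand-in for a loader that reads from, but does not close, the stream. */
    static Object readModel(ObjectInputStream ois) throws IOException, ClassNotFoundException {
        return ois.readObject();
    }

    /** Serializes an object to memory, then reads it back while the caller owns both streams. */
    static Object roundTrip(Serializable model) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                oos.writeObject(model);
            }
            // The caller opens the stream, so the caller closes it (here via try-with-resources).
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return readModel(ois);
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("placeholder for a serialized classifier"));
    }
}
```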

loadDefaultClassifier

public void loadDefaultClassifier()
This is used to load the default supplied classifier stored within the jar file. THIS FUNCTION WILL ONLY WORK IF THE CODE WAS LOADED FROM A JAR FILE WHICH HAS A SERIALIZED CLASSIFIER STORED INSIDE IT.


loadDefaultClassifier

public void loadDefaultClassifier(Properties props)
This is used to load the default supplied classifier stored within the jar file. THIS FUNCTION WILL ONLY WORK IF THE CODE WAS LOADED FROM A JAR FILE WHICH HAS A SERIALIZED CLASSIFIER STORED INSIDE IT.


getDefaultClassifier

public static <IN extends CoreMap> CRFClassifier<IN> getDefaultClassifier()
Used to get the default supplied classifier inside the jar file. THIS FUNCTION WILL ONLY WORK IF THE CODE WAS LOADED FROM A JAR FILE WHICH HAS A SERIALIZED CLASSIFIER STORED INSIDE IT.

Returns:
The default CRFClassifier in the jar file (if there is one)

getDefaultClassifier

public static <IN extends CoreMap> CRFClassifier<IN> getDefaultClassifier(Properties props)
Used to get the default supplied classifier inside the jar file. THIS FUNCTION WILL ONLY WORK IF THE CODE WAS LOADED FROM A JAR FILE WHICH HAS A SERIALIZED CLASSIFIER STORED INSIDE IT.

Returns:
The default CRFClassifier in the jar file (if there is one)

getJarClassifier

public static <IN extends CoreMap> CRFClassifier<IN> getJarClassifier(String resourceName,
                                                                      Properties props)
Used to load a classifier stored as a resource inside a jar file. THIS FUNCTION WILL ONLY WORK IF THE CODE WAS LOADED FROM A JAR FILE WHICH HAS A SERIALIZED CLASSIFIER STORED INSIDE IT.

Parameters:
resourceName - Name of classifier resource inside the jar file.
Returns:
A CRFClassifier stored in the jar file

getClassifier

public static <IN extends CoreMap> CRFClassifier<IN> getClassifier(File file)
                                                       throws IOException,
                                                              ClassCastException,
                                                              ClassNotFoundException
Loads a CRF classifier from a File, and returns it.

Parameters:
file - File to load classifier from
Returns:
The CRF classifier
Throws:
IOException - If there are problems accessing the input stream
ClassCastException - If there are problems interpreting the serialized data
ClassNotFoundException - If there are problems interpreting the serialized data

getClassifier

public static CRFClassifier getClassifier(InputStream in)
                                   throws IOException,
                                          ClassCastException,
                                          ClassNotFoundException
Loads a CRF classifier from an InputStream, and returns it. This method does not buffer the InputStream, so you should have buffered it before calling this method.

Parameters:
in - InputStream to load classifier from
Returns:
The CRF classifier
Throws:
IOException - If there are problems accessing the input stream
ClassCastException - If there are problems interpreting the serialized data
ClassNotFoundException - If there are problems interpreting the serialized data

getClassifierNoExceptions

public static CRFClassifier getClassifierNoExceptions(String loadPath)

getClassifier

public static CRFClassifier getClassifier(String loadPath)
                                   throws IOException,
                                          ClassCastException,
                                          ClassNotFoundException
Throws:
IOException
ClassCastException
ClassNotFoundException

getClassifier

public static CRFClassifier getClassifier(String loadPath,
                                          Properties props)
                                   throws IOException,
                                          ClassCastException,
                                          ClassNotFoundException
Throws:
IOException
ClassCastException
ClassNotFoundException

main

public static void main(String[] args)
                 throws Exception
The main method. See the class documentation.

Throws:
Exception

classifyWithGlobalInformation

public List<IN> classifyWithGlobalInformation(List<IN> tokenSeq,
                                              CoreMap doc,
                                              CoreMap sent)
Description copied from class: AbstractSequenceClassifier
Classify a List of something that extends CoreMap using as additional information whatever is stored in the document and sentence. This is needed for SUTime (NumberSequenceClassifier), which requires the document date to resolve relative dates.

Specified by:
classifyWithGlobalInformation in class AbstractSequenceClassifier<IN extends CoreMap>
Returns:
Classified version of the input tokenSequence


Stanford NLP Group