public class CRFBiasedClassifier<IN extends CoreMap> extends CRFClassifier<IN>
CRFBiasedClassifier extends CRFClassifier and supports most command-line
parameters available in CRFClassifier. In addition,
CRFBiasedClassifier interprets the parameter -classBias, as in:
java -server -mx500m edu.stanford.nlp.ie.crf.CRFBiasedClassifier -loadClassifier model.gz -testFile test.txt -classBias A:0.5,B:1.5
The command above sets a bias of 0.5 towards class A and a bias of
1.5 towards class B. These biases (which internally are treated as
feature weights in the log-linear model underpinning the CRF
classifier) can take any real value. As the weight of A tends towards plus
infinity, the classifier will only predict A labels, and as it tends
towards minus infinity, it will never predict A labels.
DEFAULT_CLASSIFIER
classIndex, featureFactories, flags, knownLCWords, pad, windowSize
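The effect of the -classBias weights described above can be illustrated with a small, self-contained sketch. It is a toy, not the library's implementation: the class ClassBiasDemo and its methods parseClassBias and predict are hypothetical names, the per-class scores are made up, and the example scores a single position in isolation, whereas a real CRF decodes whole sequences. It only shows that adding a per-class bias to a log-linear score shifts which label wins.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy illustration only; hypothetical class, not part of the library. */
class ClassBiasDemo {

  /** Parses a spec such as "A:0.5,B:1.5" into a label-to-bias map. */
  static Map<String, Double> parseClassBias(String spec) {
    Map<String, Double> biases = new LinkedHashMap<>();
    for (String pair : spec.split(",")) {
      int sep = pair.lastIndexOf(':');
      if (sep <= 0 || sep == pair.length() - 1) {
        throw new IllegalArgumentException("Expected label:weight, got: " + pair);
      }
      String label = pair.substring(0, sep).trim();
      double weight = Double.parseDouble(pair.substring(sep + 1).trim());
      biases.put(label, weight);
    }
    return biases;
  }

  /** Returns the label whose (score + bias) is highest; unlisted labels get bias 0. */
  static String predict(Map<String, Double> scores, Map<String, Double> biases) {
    String best = null;
    double bestValue = Double.NEGATIVE_INFINITY;
    for (Map.Entry<String, Double> e : scores.entrySet()) {
      double value = e.getValue() + biases.getOrDefault(e.getKey(), 0.0);
      if (best == null || value > bestValue) {
        best = e.getKey();
        bestValue = value;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    // Made-up unbiased log-linear scores for one position.
    Map<String, Double> scores = new LinkedHashMap<>();
    scores.put("A", 1.0);
    scores.put("B", 2.0);

    System.out.println(predict(scores, parseClassBias("A:0.0")));  // no bias: B wins
    System.out.println(predict(scores, parseClassBias("A:5.0")));  // large positive bias: A wins
    System.out.println(predict(scores, parseClassBias("A:-5.0"))); // large negative bias: A cannot win
  }
}
```

In this sketch the bias is added to the score before the argmax, which mirrors the description of biases as extra feature weights: a large enough positive weight makes A win everywhere, and a large enough negative weight removes it from the predictions.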
Constructor and Description |
---|
CRFBiasedClassifier(java.util.Properties props) |
CRFBiasedClassifier(SeqClassifierFlags flags) |
Modifier and Type | Method and Description |
---|---|
void | adjustBias(java.util.List<java.util.List<IN>> develData, java.util.function.DoubleUnaryOperator evalFunction, double low, double high) Adjust the bias parameter to optimize some objective function. |
java.util.List<IN> | classify(java.util.List<IN> document) Classify a List of something that extends CoreMap. |
static void | main(java.lang.String[] args) The main method, which is essentially the same as in CRFClassifier. |
CRFDatum<java.util.Collection<java.lang.String>,CRFLabel> | makeDatum(java.util.List<IN> info, int loc, java.util.List<FeatureFactory<IN>> featureFactories) Makes a CRFDatum by producing features and a label from input data at a specific position, using the provided factory. |
void | setBiasWeight(int cindex, double weight) |
void | setBiasWeight(java.lang.String cname, double weight) |
addProcessedData, allLabels, classifyGibbs, classifyGibbs, classifyMaxEnt, classifyWithGlobalInformation, combine, documentsToDataAndLabels, documentsToDataAndLabelsList, documentToDataAndLabels, dropFeaturesBelowThreshold, dumpFeatures, extractDatumSequence, getClassifier, getClassifier, getClassifier, getClassifier, getClassifier, getClassifier, getClassifierNoExceptions, getCliquePotentialFunctionForTest, getCliqueTree, getCliqueTree, getCliqueTrees, getDefaultClassifier, getDefaultClassifier, getMinimizer, getMinimizer, getNumWeights, getObjectiveFunction, getSequenceModel, loadAuxiliaryData, loadClassifier, loadClassIndexFromFile, loadDefaultClassifier, loadDefaultClassifier, loadFeatureIndexFromFile, loadProcessedData, loadTagIndex, loadTextClassifier, loadTextClassifier, loadWeightsFromFile, makeAnswerArraysAndTagIndex, printFactorTable, printFactorTableDocument, printFactorTableDocuments, printFeatures, printFirstOrderProbs, printFirstOrderProbsDocument, printFirstOrderProbsDocuments, printLabelInformation, printLabelValue, printProbsDocument, pruneNodeFeatureIndices, saveProcessedData, scaleWeights, serializeClassifier, serializeClassifier, serializeClassIndex, serializeFeatureIndex, serializeFeatureIndexToText, serializeTextClassifier, serializeTextClassifier, serializeWeights, to2D, topWeights, toString, train, trainWeights, updateWeightsForTest, writeWeights, zeroOrderProbabilities
apply, backgroundSymbol, classify, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswers, classifyAndWriteAnswersKBest, classifyAndWriteAnswersKBest, classifyAndWriteViterbiSearchGraph, classifyFile, classifyFilesAndWriteAnswers, classifyFilesAndWriteAnswers, classifyKBest, classifyRaw, classifySentence, classifySentenceWithGlobalInformation, classifyStdin, classifyStdin, classifyToCharacterOffsets, classifyToString, classifyToString, classifyWithInlineXML, countResults, countResultsSegmenter, defaultReaderAndWriter, finalizeClassification, getKnownLCWords, getSampler, labels, loadClassifier, loadClassifier, loadClassifier, loadClassifier, loadClassifier, loadClassifier, loadClassifierNoExceptions, loadClassifierNoExceptions, loadClassifierNoExceptions, loadClassifierNoExceptions, loadClassifierNoExceptions, makeObjectBankFromFile, makeObjectBankFromFile, makeObjectBankFromFiles, makeObjectBankFromFiles, makeObjectBankFromFiles, makeObjectBankFromReader, makeObjectBankFromString, makePlainTextReaderAndWriter, makePlainTextReaderAndWriter, makeReaderAndWriter, plainTextReaderAndWriter, printFeatureLists, printFeatures, printProbs, printProbs, printProbsDocuments, printResults, reinit, segmentString, segmentString, train, train, train, train, train, train, windowSize, writeAnswers
public CRFBiasedClassifier(java.util.Properties props)
public CRFBiasedClassifier(SeqClassifierFlags flags)
public CRFDatum<java.util.Collection<java.lang.String>,CRFLabel> makeDatum(java.util.List<IN> info, int loc, java.util.List<FeatureFactory<IN>> featureFactories)
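Description copied from class: CRFClassifier
Makes a CRFDatum by producing features and a label from input data at a
specific position, using the provided factory.
Overrides: makeDatum in class CRFClassifier<IN extends CoreMap>
Parameters:
info - The input data. Particular feature factories might look for arbitrary keys in the IN items.
loc - The position to build a datum at.
featureFactories - The FeatureFactories to use to extract features.
public void setBiasWeight(java.lang.String cname, double weight)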
public void setBiasWeight(int cindex, double weight)
public java.util.List<IN> classify(java.util.List<IN> document)
Description copied from class: AbstractSequenceClassifier
Classify a List of something that extends CoreMap.
The classifications are added in place to the items of the document,
which is also returned by this method.
Warning: In many circumstances, you should not call this method directly.
In particular, if you call this method directly, your document will not be preprocessed
to add things like word distributional similarity class or word shape features that your
classifier may rely on to work correctly. In such cases, you should call
classifySentence instead.
Overrides: classify in class CRFClassifier<IN extends CoreMap>
Parameters:
document - A List of something that extends CoreMap.
Returns: The same List, but with the elements annotated with their
answers (stored under the CoreAnnotations.AnswerAnnotation
key). The answers will be the class labels defined by the CRF
Classifier. They might be things like entity labels (in BIO
notation or not) or something like "1" vs. "0" on whether to
begin a new token here or not (in word segmentation).
public void adjustBias(java.util.List<java.util.List<IN>> develData, java.util.function.DoubleUnaryOperator evalFunction, double low, double high)
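The summary for adjustBias says only that it adjusts the bias parameter to optimize some objective function. As a rough illustration of that idea, the sketch below scans candidate bias values between low and high and keeps the one with the best objective value. Everything here is an assumption: the class BiasSearchSketch, the method bestBias, the steps parameter, the plain grid scan, and the choice to maximize are hypothetical. This page does not state which search strategy the library uses, what argument evalFunction receives, or whether the objective is maximized or minimized.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch of a one-dimensional bias search; not the library's code. */
class BiasSearchSketch {

  /**
   * Evaluates the objective at steps + 1 evenly spaced bias values in
   * [low, high] and returns the bias with the highest objective value.
   */
  static double bestBias(DoubleUnaryOperator objective, double low, double high, int steps) {
    if (steps < 1 || high < low) {
      throw new IllegalArgumentException("Need steps >= 1 and low <= high");
    }
    double best = low;
    double bestValue = objective.applyAsDouble(low);
    for (int i = 1; i <= steps; i++) {
      double bias = low + (high - low) * i / steps;
      double value = objective.applyAsDouble(bias);
      if (value > bestValue) {
        bestValue = value;
        best = bias;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    // Stand-in objective that peaks at bias = 1.0. In practice this would
    // score the classifier's output on development data at the given bias.
    DoubleUnaryOperator objective = b -> -(b - 1.0) * (b - 1.0);
    System.out.println(bestBias(objective, -2.0, 2.0, 40));
  }
}
```

A grid scan is used here only because it is easy to read; a line search would typically need fewer objective evaluations, which matters when each evaluation reclassifies the development set.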
public static void main(java.lang.String[] args) throws java.lang.Exception
Throws: java.lang.Exception