java.lang.Object
  edu.stanford.nlp.ie.AbstractSequenceClassifier
public abstract class AbstractSequenceClassifier
This class provides common functionality for (probabilistic) sequence models. It is a superclass of our CMM and CRF sequence classifiers, and is even used in the (deterministic) NumberSequenceClassifier. See implementing classes for more information.
Field Summary | |
---|---|
Index<String> | classIndex |
FeatureFactory | featureFactory |
SeqClassifierFlags | flags |
static String | JAR_CLASSIFIER_PATH |
protected Set<String> | knownLCWords |
protected FeatureLabel | pad |
protected DocumentReaderAndWriter | readerAndWriter |
int | windowSize |
Constructor Summary | |
---|---|
AbstractSequenceClassifier() - This does nothing. |
Method Summary | |
---|---|
Object | apply(Object in) - Maps a String input to an XML-formatted rendition of applying NER to the String. |
String | backgroundSymbol() |
Sampler<List<FeatureLabel>> | getSampler(List<FeatureLabel> input) |
SequenceModel | getSequenceModel(List<FeatureLabel> doc) |
protected void | init(Properties props) |
protected void | init(SeqClassifierFlags flags) |
Set<String> | labels() |
void | loadClassifier(File file) |
void | loadClassifier(File file, Properties props) - Loads a classifier from the specified file. |
void | loadClassifier(InputStream in) |
abstract void | loadClassifier(InputStream in, Properties props) - Load a classifier from the specified input stream. |
void | loadClassifier(String loadPath) - Loads a classifier from the file specified by loadPath. |
void | loadClassifierNoExceptions(BufferedInputStream in) - Loads a classifier from the given input stream. |
void | loadClassifierNoExceptions(File file) |
void | loadClassifierNoExceptions(File file, Properties props) |
void | loadClassifierNoExceptions(String loadPath) |
void | loadClassifierNoExceptions(String loadPath, Properties props) |
void | loadJarClassifier(String modelName, Properties props) - This function will load a classifier that is stored inside a jar file (if it is so stored). |
protected ObjectBank<List<FeatureLabel>> | makeObjectBank(BufferedReader in) |
protected ObjectBank<List<FeatureLabel>> | makeObjectBank(BufferedReader in, boolean quietly) - Set up an ObjectBank that will allow one to iterate over a collection of documents obtained from the passed in Reader. |
ObjectBank<List<FeatureLabel>> | makeObjectBank(String filename) |
void | printProbs(String filename) - Takes the file, reads it in, and prints out the likelihood of each possible label at each point. |
abstract void | printProbsDocument(List<FeatureLabel> document) |
void | printProbsDocuments(ObjectBank<List<FeatureLabel>> documents) - Takes a List of documents and prints the likelihood of each possible label at each point. |
protected void | reinit() - This method should be called after there have been changes to the flags (SeqClassifierFlags) variable, such as after deserializing a classifier. |
List<String> | segmentString(String sentence) - Only use if a Chinese word segmenter has been loaded. |
abstract void | serializeClassifier(String serializePath) |
abstract List<FeatureLabel> | test(List<FeatureLabel> document) - Classify a List of FeatureLabels. |
void | testAndWriteAnswers(String testFile) - Load a test file, run the classifier on it, and then print the answers to stdout (with timing to stderr). |
void | testAndWriteAnswersKBest(String testFile, int k) - Load a test file, run the classifier on it, and then print the answers to stdout (with timing to stderr). |
List<List<FeatureLabel>> | testFile(String filename) - Classify a Sentence. |
Counter<List<FeatureLabel>> | testKBest(List<FeatureLabel> doc, String answerField, int k) |
List<FeatureLabel> | testSentence(List<? extends HasWord> sentence) - Classify a Sentence. |
List<List<FeatureLabel>> | testSentences(String sentences) - Classify a Sentence. |
List<FeatureLabel> | testSentenceWithCasing(List<FeatureLabel> sentence) - Classify a List of FeatureLabels using a TrueCasingDocumentReader. |
String | testString(String sentences) - Classify the contents of a String. |
String | testStringInlineXML(String sentences) - Classify the contents of a String. |
String | testStringXML(String sentences) - Classify the contents of a String. |
void | train() |
abstract void | train(ObjectBank<List<FeatureLabel>> docs) |
void | train(String filename) |
void | writeAnswers(List<FeatureLabel> doc) - Write the classifications of the Sequence classifier out in a format determined by the DocumentReaderAndWriter used. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final String JAR_CLASSIFIER_PATH
public SeqClassifierFlags flags
public Index<String> classIndex
protected DocumentReaderAndWriter readerAndWriter
public FeatureFactory featureFactory
protected FeatureLabel pad
public int windowSize
protected Set<String> knownLCWords
Constructor Detail |
---|
public AbstractSequenceClassifier()
Method Detail |
---|
protected void init(Properties props)
protected void init(SeqClassifierFlags flags)
protected void reinit()
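This method should be called after there have been changes to the flags (SeqClassifierFlags) variable, such as after deserializing a classifier.
Implementation note: At the moment this method doesn't set windowSize or featureFactory, since they are being serialized separately in the file. It would probably be better to stop serializing them and just reinitialize them from the flags.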
public String backgroundSymbol()
public Set<String> labels()
public List<FeatureLabel> testSentence(List<? extends HasWord> sentence)
Classify a Sentence.
Parameters:
sentence - The Sentence to be classified.
Returns:
The classified Sentence, where the classifier output for each token is stored in its "answer" field.
public SequenceModel getSequenceModel(List<FeatureLabel> doc)
public Sampler<List<FeatureLabel>> getSampler(List<FeatureLabel> input)
public Counter<List<FeatureLabel>> testKBest(List<FeatureLabel> doc, String answerField, int k)
public List<FeatureLabel> testSentenceWithCasing(List<FeatureLabel> sentence)
Classify a List of FeatureLabels using a TrueCasingDocumentReader.
Parameters:
sentence - A list of FeatureLabels to be classified.
public List<List<FeatureLabel>> testSentences(String sentences)
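Classify a Sentence.
Parameters:
sentences - The sentence(s) to be classified.
Returns:
List of classified Sentences.
public List<List<FeatureLabel>> testFile(String filename)
Classify a Sentence.
Parameters:
filename - Contains the sentence(s) to be classified.
Returns:
List of classified Sentences.
public Object apply(Object in)
Maps a String input to an XML-formatted rendition of applying NER to the String.
Specified by:
apply in interface Function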
public String testStringInlineXML(String sentences)
Classify the contents of a String. Plain text or XML is expected and the PlainTextDocumentReaderAndWriter is used. Output is in inline XML format (e.g. <PERSON>Bill Smith</PERSON> went to <LOCATION>Paris</LOCATION> .)
Parameters:
sentences - The string to be classified
Returns:
String annotated with classification information.
public String testStringXML(String sentences)
Classify the contents of a String. Plain text or XML is expected and the PlainTextDocumentReaderAndWriter is used. Output is in XML format.
Parameters:
sentences - The string to be classified
Returns:
String annotated with classification information.
public String testString(String sentences)
Classify the contents of a String. Plain text or XML is expected and the PlainTextDocumentReaderAndWriter is used. Output looks like: My/O name/O is/O Bill/PERSON Smith/PERSON ./O
Parameters:
sentences - The string to be classified
Returns:
String annotated with classification information.
public List<String> segmentString(String sentence)
Only use if a Chinese word segmenter has been loaded.
Parameters:
sentence - The string to be classified
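The slash-tagged and inline XML output formats described for testString and testStringInlineXML can be illustrated with a small standalone sketch. It does not use or reproduce the library's code: LabeledToken, toSlashTags and toInlineXml are hypothetical names, the background label "O" is taken from the example output above, and merging adjacent tokens with the same label into one element is an assumption based on the <PERSON>Bill Smith</PERSON> example.

```java
import java.util.Arrays;
import java.util.List;

public class OutputFormatSketch {
    /** Hypothetical stand-in for a classified token (word plus answer label). */
    public static final class LabeledToken {
        final String word;
        final String answer;
        public LabeledToken(String word, String answer) {
            this.word = word;
            this.answer = answer;
        }
    }

    /** Renders tokens as word/LABEL pairs separated by spaces. */
    public static String toSlashTags(List<LabeledToken> tokens) {
        StringBuilder sb = new StringBuilder();
        for (LabeledToken t : tokens) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(t.word).append('/').append(t.answer);
        }
        return sb.toString();
    }

    /**
     * Renders tokens as inline XML, wrapping each run of adjacent tokens that
     * share a non-background label in one element. No XML escaping is done.
     */
    public static String toInlineXml(List<LabeledToken> tokens, String background) {
        StringBuilder sb = new StringBuilder();
        String open = null; // label of the element currently open, or null
        for (LabeledToken t : tokens) {
            boolean isEntity = !t.answer.equals(background);
            // Close the open element when the label changes or the run ends.
            if (open != null && !(isEntity && t.answer.equals(open))) {
                sb.append("</").append(open).append('>');
                open = null;
            }
            if (sb.length() > 0) sb.append(' ');
            if (isEntity && open == null) {
                sb.append('<').append(t.answer).append('>');
                open = t.answer;
            }
            sb.append(t.word);
        }
        if (open != null) sb.append("</").append(open).append('>');
        return sb.toString();
    }

    public static void main(String[] args) {
        List<LabeledToken> tokens = Arrays.asList(
            new LabeledToken("My", "O"), new LabeledToken("name", "O"),
            new LabeledToken("is", "O"), new LabeledToken("Bill", "PERSON"),
            new LabeledToken("Smith", "PERSON"), new LabeledToken(".", "O"));
        System.out.println(toSlashTags(tokens));
        System.out.println(toInlineXml(tokens, "O"));
    }
}
```

In this sketch the first line printed matches the testString example output, and the second wraps "Bill Smith" in a single PERSON element.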
public abstract List<FeatureLabel> test(List<FeatureLabel> document)
Classify a List of FeatureLabels.
Parameters:
document - A List of FeatureLabels.
Returns:
The same List, but with the elements annotated with their answers (with setAnswer()).
public void train()
public void train(String filename)
public abstract void train(ObjectBank<List<FeatureLabel>> docs)
public ObjectBank<List<FeatureLabel>> makeObjectBank(String filename)
protected ObjectBank<List<FeatureLabel>> makeObjectBank(BufferedReader in)
protected ObjectBank<List<FeatureLabel>> makeObjectBank(BufferedReader in, boolean quietly)
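Set up an ObjectBank that will allow one to iterate over a collection of documents obtained from the passed in Reader. The documents are read as specified by flags.documentReader and, for some reader choices, the column mapping given in flags.map.
Parameters:
in - Input data
quietly - Print fewer messages if this is true (use when calling it repeatedly on small bits of text)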
public void printProbs(String filename)
Takes the file, reads it in, and prints out the likelihood of each possible label at each point.
Parameters:
filename - The path to the specified file
public void printProbsDocuments(ObjectBank<List<FeatureLabel>> documents)
Takes a List of documents and prints the likelihood of each possible label at each point.
Parameters:
documents - A List of List of FeatureLabels.
public abstract void printProbsDocument(List<FeatureLabel> document)
public void testAndWriteAnswers(String testFile) throws Exception
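Load a test file, run the classifier on it, and then print the answers to stdout (with timing to stderr).
Parameters:
testFile - The file to test on.
Throws:
Exception
public void testAndWriteAnswersKBest(String testFile, int k) throws Exception
Load a test file, run the classifier on it, and then print the answers to stdout (with timing to stderr).
Parameters:
testFile - The file to test on.
Throws:
Exception
public void writeAnswers(List<FeatureLabel> doc) throws Exception
Write the classifications of the Sequence classifier out in a format determined by the DocumentReaderAndWriter used.
Throws:
Exception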
public abstract void serializeClassifier(String serializePath)
public void loadClassifierNoExceptions(BufferedInputStream in)
public void loadClassifier(InputStream in) throws IOException, ClassCastException, ClassNotFoundException
Throws:
IOException
ClassCastException
ClassNotFoundException
public abstract void loadClassifier(InputStream in, Properties props) throws IOException, ClassCastException, ClassNotFoundException
Load a classifier from the specified input stream.
Parameters:
in - The InputStream to load the serialized classifier from
props - This Properties object will be used to update the SeqClassifierFlags which are read from the serialized classifier
Throws:
IOException
ClassCastException
ClassNotFoundException
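The idea of a Properties object updating settings read from a serialized classifier can be sketched as a simple overlay. This is only an illustration under stated assumptions, not how SeqClassifierFlags is actually updated: applyOverrides is a hypothetical helper, the keys and values are made up for the demo, and treating null as "override nothing" is borrowed from the loadJarClassifier description.

```java
import java.util.Properties;

public class FlagOverrideSketch {
    /**
     * Returns a new Properties holding the serialized settings with any
     * overrides laid on top. A null overrides argument changes nothing.
     */
    public static Properties applyOverrides(Properties serialized, Properties overrides) {
        Properties merged = new Properties();
        merged.putAll(serialized);
        if (overrides != null) {
            merged.putAll(overrides); // caller-supplied values win
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical settings standing in for what a serialized model carries.
        Properties serialized = new Properties();
        serialized.setProperty("documentReader", "readerA");
        serialized.setProperty("map", "word=0,answer=1");

        // Hypothetical caller override of a single setting.
        Properties overrides = new Properties();
        overrides.setProperty("documentReader", "readerB");

        Properties merged = applyOverrides(serialized, overrides);
        System.out.println(merged.getProperty("documentReader"));
        System.out.println(merged.getProperty("map"));
    }
}
```

Settings the caller does not mention keep their serialized values, which is the behaviour the props parameter description suggests.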
public void loadClassifier(String loadPath) throws ClassCastException, IOException, ClassNotFoundException
Loads a classifier from the file specified by loadPath.
Throws:
ClassCastException
IOException
ClassNotFoundException
public void loadClassifierNoExceptions(String loadPath)
public void loadClassifierNoExceptions(String loadPath, Properties props)
public void loadClassifier(File file) throws ClassCastException, IOException, ClassNotFoundException
Throws:
ClassCastException
IOException
ClassNotFoundException
public void loadClassifier(File file, Properties props) throws ClassCastException, IOException, ClassNotFoundException
Loads a classifier from the specified file.
Throws:
ClassCastException
IOException
ClassNotFoundException
public void loadClassifierNoExceptions(File file)
public void loadClassifierNoExceptions(File file, Properties props)
public void loadJarClassifier(String modelName, Properties props)
This function will load a classifier that is stored inside a jar file (if it is so stored). The path inside the jar file (/classifiers/) is coded in this class. If the classifier is not stored in the jar file or this is not run from inside a jar file, then this function will throw a RuntimeException.
Parameters:
modelName - The name of the model file. Iff it ends in .gz, then it is assumed to be gzip compressed.
props - A Properties object which can override certain properties in the serialized file, such as the DocumentReaderAndWriter. You can pass in null to override nothing.
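The rule that a model name ending in .gz is treated as gzip compressed can be sketched with the standard java.util.zip classes. This is a minimal illustration, not the class's own loading code: openModelStream, gzip and readModelText are hypothetical helpers, the model names are made up, and the bytes come from memory rather than from a jar resource.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipByNameSketch {
    /** Wraps the raw stream in a GZIPInputStream iff the name ends in ".gz". */
    public static InputStream openModelStream(String modelName, InputStream raw) throws IOException {
        InputStream buffered = new BufferedInputStream(raw);
        return modelName.endsWith(".gz") ? new GZIPInputStream(buffered) : buffered;
    }

    /** Gzip-compresses a byte array (used here only to build demo input). */
    public static byte[] gzip(byte[] data) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
                out.write(data);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the stored bytes as UTF-8 text, decompressing based on the name. */
    public static String readModelText(String modelName, byte[] stored) {
        try (InputStream in = openModelStream(modelName, new ByteArrayInputStream(stored))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] plain = "serialized model bytes".getBytes(StandardCharsets.UTF_8);
        // Hypothetical model names; only the ".gz" suffix matters here.
        System.out.println(readModelText("demo-model.ser", plain));
        System.out.println(readModelText("demo-model.ser.gz", gzip(plain)));
    }
}
```

Both calls in main yield the same text, since the second input is decompressed only because its name ends in .gz.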