Uses parsed files to train classifier and test on data set.
Simple extractor which combines several other Extractors.
Main driver for Machine Reading training, annotation, and evaluation.
Base class for feature factories. Created by Sonal Gupta.
Class for comparing the output of information extraction to a gold standard, and printing the results.
bin/javanlp.sh edu.stanford.nlp.ie.machinereading.MachineReading --arguments machinereading.properties
Sample properties files are included in projects/core/src/edu/stanford/nlp/ie/machinereading. Eventually, we will have one for each corpus. The attributes for the properties file are explained below:
GenericDataSetReader to use (needs to match the corpus in question). For example:
serializedModelPath: where to store/load the serialized extraction model
trainPath: path to the training file/directory (needs to match the
serializedTrainingSentencesPath: where to store the serialized training sentences objects (To save time loading the training data, the objects produced when reading them in are serialized.)
forceRetraining: retrains an extraction model even if it already exists (otherwise, we only train if the serializedModelPath doesn't exist on disk, default is
trainOnly: if true, don't run evaluation (implies forceRetraining, default is
serializedTestSentencesPath properties can be omitted if
true. Otherwise, these are analogous to their train counterparts.
extractRelations: whether we should extract relations (currently ignored)
extractEvents: whether we should extract events (currently ignored)
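The interaction between serializedModelPath, forceRetraining, and trainOnly described above can be sketched in a few lines of Java. This is only an illustration of the documented rules, not the MachineReading implementation: the class TrainingDecision and its methods are hypothetical, the model path is a placeholder, and a missing flag is treated as false, which is an assumption of this sketch rather than a documented default.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper for illustration; not part of the machinereading package.
class TrainingDecision {

    // Parses properties-file text into a Properties object.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    // Reads a boolean flag. Treating a missing key as false is an
    // assumption of this sketch, not a documented default.
    static boolean flag(Properties props, String key) {
        return Boolean.parseBoolean(props.getProperty(key, "false"));
    }

    // Train when retraining is forced (trainOnly implies forceRetraining)
    // or when no serialized model exists at serializedModelPath.
    static boolean shouldTrain(Properties props, boolean modelExists) {
        boolean force = flag(props, "forceRetraining") || flag(props, "trainOnly");
        return force || !modelExists;
    }

    // Evaluation is skipped when trainOnly is set.
    static boolean shouldEvaluate(Properties props) {
        return !flag(props, "trainOnly");
    }

    public static void main(String[] args) {
        // Placeholder values; real property files will differ.
        Properties props = parse(
            "serializedModelPath = /tmp/model.ser\n"
            + "trainOnly = true\n");
        boolean modelExists = true; // pretend a model is already on disk
        System.out.println("train: " + shouldTrain(props, modelExists));
        System.out.println("evaluate: " + shouldEvaluate(props));
    }
}
```

Passing modelExists as a parameter keeps the sketch independent of the file system. A real driver would instead check whether the file named by serializedModelPath exists.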
Stanford NLP Group