edu.stanford.nlp.ie.machinereading (Stanford JavaNLP API)

Interface Summary
Interface	Description
Extractor
LabelValidator

Class Summary
Class	Description
BasicEntityExtractor	Uses parsed files to train a classifier and test it on a data set.
BasicRelationExtractor
BasicRelationFeatureFactory
EntityExtractorResultsPrinter
ExtractorMerger	Simple extractor which combines several other Extractors.
GenericDataSetReader
MachineReading	Main driver for Machine Reading training, annotation, and evaluation.
MachineReadingProperties
NilLabelValidator
RelationExtractorResultsPrinter
RelationFeatureFactory	Base class for feature factories. Created by Sonal Gupta.
ResultsPrinter	Class for comparing the output of information extraction to a gold standard, and printing the results.

Enum Summary
Enum	Description
RelationFeatureFactory.DEPENDENCY_TYPE

Package edu.stanford.nlp.ie.machinereading Description

A package for supervised relation and event extraction.

Usage

The easiest way to run Machine Reading is with the following command, run from your javanlp directory:

bin/javanlp.sh edu.stanford.nlp.ie.machinereading.MachineReading --arguments machinereading.properties

Sample properties files are included in projects/core/src/edu/stanford/nlp/ie/machinereading. Eventually, there will be one for each corpus. The properties are explained below.

`MachineReading` Properties

Required Properties

datasetReaderClass: which GenericDataSetReader to use (needs to match the corpus in question). For example: edu.stanford.nlp.ie.machinereading.reader.AceReader
serializedModelPath: where to store/load the serialized extraction model
trainPath: path to the training file/directory (needs to match the datasetReaderClass)
serializedTrainingSentencesPath: where to store the serialized training sentence objects. To save time loading the training data on later runs, the objects produced when reading it in are serialized.
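
As an illustration, the four required properties could appear in a properties file like the sketch below. The property names and the reader class come from the list above; every path is a hypothetical placeholder, not a location shipped with the package.

```properties
# Hypothetical example: all paths below are placeholders.
datasetReaderClass = edu.stanford.nlp.ie.machinereading.reader.AceReader
serializedModelPath = /path/to/model.ser
trainPath = /path/to/corpus/train
serializedTrainingSentencesPath = /path/to/train_sentences.ser
```

The value of trainPath has to be something the chosen datasetReaderClass can read, so the two properties should be changed together when switching corpora.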

Optional Properties

The following properties are optional because the code falls back to a default value for each one, and prints that value when the property is not defined.

forceRetraining: retrain the extraction model even if one already exists. Otherwise, training happens only if serializedModelPath doesn't exist on disk. Default is false.
trainOnly: if true, don't run evaluation (implies forceRetraining). Default is false.
testPath, serializedTestSentencesPath: analogous to their train counterparts. Both can be omitted if trainOnly is true.
extractRelations: whether we should extract relations (currently ignored)
extractEvents: whether we should extract events (currently ignored)
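
The rules above (four required properties, test paths needed unless trainOnly is true, trainOnly implying forceRetraining) can be checked before launching a run. The following is a minimal sketch that uses only java.util.Properties. The class MachineReadingConfigCheck and its methods are hypothetical helpers written for this illustration; they are not part of the machinereading package and do not reflect how MachineReading itself parses its arguments.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper for illustration only; not part of the machinereading package.
final class MachineReadingConfigCheck {

  // Property names documented as required.
  static final String[] REQUIRED = {
    "datasetReaderClass", "serializedModelPath", "trainPath", "serializedTrainingSentencesPath"
  };

  // Needed only when evaluation runs, i.e. when trainOnly is false.
  static final String[] TEST_ONLY = {"testPath", "serializedTestSentencesPath"};

  // Reads a boolean flag; both documented flags default to false.
  static boolean flag(Properties props, String name) {
    return Boolean.parseBoolean(props.getProperty(name, "false").trim());
  }

  // trainOnly implies forceRetraining.
  static boolean effectiveForceRetraining(Properties props) {
    return flag(props, "trainOnly") || flag(props, "forceRetraining");
  }

  static boolean isBlank(String value) {
    return value == null || value.trim().isEmpty();
  }

  // Returns the names of properties that are needed but absent or empty.
  static List<String> missing(Properties props) {
    List<String> result = new ArrayList<>();
    for (String key : REQUIRED) {
      if (isBlank(props.getProperty(key))) {
        result.add(key);
      }
    }
    if (!flag(props, "trainOnly")) {
      for (String key : TEST_ONLY) {
        if (isBlank(props.getProperty(key))) {
          result.add(key);
        }
      }
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    Properties props = new Properties();
    if (args.length > 0) {
      // Load the properties file named on the command line.
      try (Reader reader = Files.newBufferedReader(Paths.get(args[0]))) {
        props.load(reader);
      }
    } else {
      // No file given: use an in-memory sample with placeholder paths.
      props.setProperty("datasetReaderClass", "edu.stanford.nlp.ie.machinereading.reader.AceReader");
      props.setProperty("serializedModelPath", "/path/to/model.ser");
      props.setProperty("trainPath", "/path/to/corpus/train");
      props.setProperty("serializedTrainingSentencesPath", "/path/to/train_sentences.ser");
      props.setProperty("trainOnly", "true");
    }
    System.out.println("missing properties: " + missing(props));
    System.out.println("forceRetraining in effect: " + effectiveForceRetraining(props));
  }
}
```

Running the check against a properties file before starting a long training job catches a forgotten testPath early. The sketch deliberately leaves out extractRelations and extractEvents, since they are currently ignored.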
