edu.stanford.nlp.parser.lexparser
Interface UnknownWordModelTrainer


public interface UnknownWordModelTrainer

An interface for training an UnknownWordModel. Once the trainer has been initialized, you can feed it trees and then call finishTraining to get the UnknownWordModel.

Author:
John Bauer

Field Summary
static IntTaggedWord NULL_ITW
           
static short nullTag
           
static int nullWord
           
static String unknown
           
 
Method Summary
 UnknownWordModel finishTraining()
          Returns the trained UWM.
 void incrementTreesRead(double weight)
           
 void initializeTraining(Options op, Lexicon lex, Index<String> wordIndex, Index<String> tagIndex, double totalTrees)
          Initialize the trainer with a few of the data structures it needs to train.
 void train(Collection<Tree> trees)
          Tallies statistics for this particular collection of trees.
 void train(Collection<Tree> trees, double weight)
          Tallies statistics for a weighted collection of trees.
 void train(TaggedWord tw, int loc, double weight)
          Tallies statistics for a single word.
 void train(Tree tree, double weight)
          Tallies statistics for a single tree.
 

Field Detail

unknown

static final String unknown
See Also:
Constant Field Values

nullWord

static final int nullWord
See Also:
Constant Field Values

nullTag

static final short nullTag
See Also:
Constant Field Values

NULL_ITW

static final IntTaggedWord NULL_ITW

Method Detail

initializeTraining

void initializeTraining(Options op,
                        Lexicon lex,
                        Index<String> wordIndex,
                        Index<String> tagIndex,
                        double totalTrees)
Initialize the trainer with a few of the data structures it needs to train. The caller must also supply an estimate of the number of trees the trainer will be given (totalTrees), because many of the UWMs switch training modes after seeing a fraction of the trees.
Initialization is a method rather than part of the constructor because these trainers are generally loaded by reflection. A method call on the interface is checked by the compiler, whereas a reflective constructor call with the wrong arguments would fail only at run time.
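The reflection rationale can be illustrated with a minimal, self-contained sketch. It does not use the Stanford classes: SimpleTrainer, CountingTrainer, expectedTrees and load are hypothetical stand-ins, and the one-argument initializeTraining is a simplification of the real five-argument method. The point is only that the no-argument constructor is invoked reflectively, while the initialization call goes through the interface and is therefore type-checked at compile time.

```java
public class ReflectiveTrainerDemo {

    /** Hypothetical stand-in for the trainer interface; NOT the Stanford API. */
    public interface SimpleTrainer {
        void initializeTraining(double totalTrees);
        double expectedTrees();
    }

    /** Hypothetical implementation with a public no-argument constructor. */
    public static class CountingTrainer implements SimpleTrainer {
        private double totalTrees;

        public CountingTrainer() {}

        @Override
        public void initializeTraining(double totalTrees) {
            this.totalTrees = totalTrees;
        }

        @Override
        public double expectedTrees() {
            return totalTrees;
        }
    }

    /**
     * Loads a trainer by class name, then initializes it through the interface.
     * Only the no-argument constructor is resolved at run time; the arguments
     * to initializeTraining are checked by the compiler.
     */
    public static SimpleTrainer load(String className, double totalTrees) {
        try {
            SimpleTrainer trainer = Class.forName(className)
                    .asSubclass(SimpleTrainer.class)
                    .getDeclaredConstructor()
                    .newInstance();
            trainer.initializeTraining(totalTrees);
            return trainer;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not load trainer " + className, e);
        }
    }

    public static void main(String[] args) {
        SimpleTrainer trainer = load(CountingTrainer.class.getName(), 100.0);
        System.out.println(trainer.expectedTrees());
    }
}
```

Had the data structures been constructor parameters instead, a mismatch in their number or types would surface only when the reflective call ran.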


train

void train(Collection<Tree> trees)
Tallies statistics for this particular collection of trees. Can be called multiple times.


train

void train(Collection<Tree> trees,
           double weight)
Tallies statistics for a weighted collection of trees. Can be called multiple times.


train

void train(Tree tree,
           double weight)
Tallies statistics for a single tree. Can be called multiple times.


train

void train(TaggedWord tw,
           int loc,
           double weight)
Tallies statistics for a single word. Can be called multiple times.
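One plausible way for an implementation to relate the four train overloads is to have each delegate to the next finer-grained one. The sketch below is an assumption about structure, not a description of the Stanford implementations: ToyTree stands in for Tree, a plain String stands in for TaggedWord, the unweighted overload is assumed to use a weight of 1.0, and loc is assumed to be the word's position in the sentence.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OverloadChainDemo {

    /** Hypothetical stand-in for a parse tree: just its words in order. */
    public static final class ToyTree {
        final List<String> words;

        public ToyTree(String... words) {
            this.words = Arrays.asList(words);
        }
    }

    // Accumulated weight per word, and per word seen at position 0.
    private final Map<String, Double> wordCounts = new HashMap<>();
    private final Map<String, Double> initialCounts = new HashMap<>();

    /** Unweighted collection: assumed here to mean weight 1.0 per tree. */
    public void train(Collection<ToyTree> trees) {
        train(trees, 1.0);
    }

    /** Weighted collection: every tree gets the same weight. */
    public void train(Collection<ToyTree> trees, double weight) {
        for (ToyTree tree : trees) {
            train(tree, weight);
        }
    }

    /** Single tree: walk its words, passing each word's position as loc. */
    public void train(ToyTree tree, double weight) {
        int loc = 0;
        for (String word : tree.words) {
            train(word, loc, weight);
            loc++;
        }
    }

    /** Single word: the only overload that actually updates the tallies. */
    public void train(String word, int loc, double weight) {
        wordCounts.merge(word, weight, Double::sum);
        if (loc == 0) {
            initialCounts.merge(word, weight, Double::sum);
        }
    }

    public double count(String word) {
        return wordCounts.getOrDefault(word, 0.0);
    }

    public double initialCount(String word) {
        return initialCounts.getOrDefault(word, 0.0);
    }
}
```

Because every overload only adds to running tallies, calling any of them repeatedly accumulates statistics, which matches the "can be called multiple times" note on each method.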


incrementTreesRead

void incrementTreesRead(double weight)
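This page gives no description for incrementTreesRead. A reasonable guess, given the note under initializeTraining about switching training modes after a fraction of the trees, is that it advances a weighted count of trees seen so far. The sketch below illustrates that guess only; TreesReadDemo, switchFraction and inSecondPhase are hypothetical names and do not come from the Stanford code.

```java
public class TreesReadDemo {

    private final double totalTrees;      // estimate supplied at initialization
    private final double switchFraction;  // hypothetical: fraction at which the mode changes
    private double treesRead = 0.0;       // weighted number of trees seen so far

    public TreesReadDemo(double totalTrees, double switchFraction) {
        this.totalTrees = totalTrees;
        this.switchFraction = switchFraction;
    }

    /** Adds the weight of one more tree to the running total. */
    public void incrementTreesRead(double weight) {
        treesRead += weight;
    }

    /** True once more than the chosen fraction of the expected trees has been read. */
    public boolean inSecondPhase() {
        return treesRead > totalTrees * switchFraction;
    }

    public double treesRead() {
        return treesRead;
    }
}
```

Under this reading, a poor totalTrees estimate would move the switch point, which is consistent with the requirement that callers estimate the number of trees up front.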

finishTraining

UnknownWordModel finishTraining()
Returns the trained UWM. Many of the implementations build exactly one model, and some finishTraining methods modify their data permanently, so this method should be called only once.
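An implementation could make the call-once contract explicit by rejecting a second call. The following is a minimal sketch under that assumption; the interface itself does not promise any such check, and OneShotTrainerDemo with its Map-based "model" is a hypothetical stand-in rather than a real UnknownWordModel.

```java
import java.util.HashMap;
import java.util.Map;

public class OneShotTrainerDemo {

    private final Map<String, Double> counts = new HashMap<>();
    private boolean finished = false;

    /** Tallies a weighted observation of a word; rejected after finishing. */
    public void train(String word, double weight) {
        if (finished) {
            throw new IllegalStateException("train called after finishTraining");
        }
        counts.merge(word, weight, Double::sum);
    }

    /**
     * Hands over the accumulated tallies as the "model". The internal map is
     * returned directly rather than copied, so a second call is refused.
     */
    public Map<String, Double> finishTraining() {
        if (finished) {
            throw new IllegalStateException("finishTraining may only be called once");
        }
        finished = true;
        return counts;
    }
}
```

Failing fast on a repeated call turns a silent data-corruption risk into an immediate, diagnosable error.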



Stanford NLP Group