edu.stanford.nlp.parser.lexparser
Class FactoredLexicon

java.lang.Object
  extended by edu.stanford.nlp.parser.lexparser.BaseLexicon
      extended by edu.stanford.nlp.parser.lexparser.FactoredLexicon
All Implemented Interfaces:
Lexicon, Serializable

public class FactoredLexicon
extends BaseLexicon

A lexicon that accommodates a separation between the surface form and inflectional features, which are encoded in the POS tags.

TODO: Could do smoothing during training so that each word is counted with its base category.
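
As an illustration of what "inflectional features encoded in the POS tags" can look like, the following is a minimal sketch and not the class's actual implementation. The tag format (`NN|gen=f|num=pl`), the class name `FactoredTagSketch`, and both helper methods are hypothetical and were invented for this example; the real encoding is determined by the MorphoFeatureSpecification passed to the constructor.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: splits a factored tag into a base category and
// its morphological features. The "BASE|key=value|..." format is an
// assumption made for this example only.
public class FactoredTagSketch {

    /** Returns everything before the first '|', or the whole tag if there is none. */
    public static String baseCategory(String factoredTag) {
        int bar = factoredTag.indexOf('|');
        return bar < 0 ? factoredTag : factoredTag.substring(0, bar);
    }

    /** Returns the key=value features that follow the base category, in order. */
    public static Map<String, String> features(String factoredTag) {
        Map<String, String> result = new LinkedHashMap<>();
        String[] parts = factoredTag.split("\\|");
        // parts[0] is the base category; the rest are features.
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq > 0) {
                result.put(parts[i].substring(0, eq), parts[i].substring(eq + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String tag = "NN|gen=f|num=pl";
        System.out.println(baseCategory(tag)); // prints NN
        System.out.println(features(tag));     // prints {gen=f, num=pl}
    }
}
```

Under this assumed format, the smoothing idea in the TODO would amount to also counting each word once under `baseCategory(tag)` during training.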

Author:
Spence Green
See Also:
Serialized Form

Field Summary
 
Fields inherited from class edu.stanford.nlp.parser.lexparser.BaseLexicon
DEBUG_LEXICON, DEBUG_LEXICON_SCORE, flexiTag, lastSentencePosition, lastSignatureIndex, lastWordToSignaturize, nullTag, nullWord, rulesWithWord, seenCounter, smartMutation, smoothInUnknownsThreshold, tagNumberer, tags, uwModel, wordNumberer, words
 
Fields inherited from interface edu.stanford.nlp.parser.lexparser.Lexicon
BOUNDARY, BOUNDARY_TAG, UNKNOWN_WORD
 
Constructor Summary
FactoredLexicon(MorphoFeatureSpecification morphoSpec)
           
FactoredLexicon(Options.LexOptions op, MorphoFeatureSpecification morphoSpec)
           
 
Method Summary
 Iterator<IntTaggedWord> ruleIteratorByWord(int word, int loc, String featureSpec)
          Generate the possible taggings for a word at a sentence position.
 
Methods inherited from class edu.stanford.nlp.parser.lexparser.BaseLexicon
addAll, addAll, addTagging, evaluateCoverage, getBaseTag, getUnknownWordModel, initRulesWithWord, isKnown, isKnown, listOfLabeledWordsToEvents, listToEvents, main, numRules, printLexStats, readData, ruleIteratorByWord, score, setTagNumberer, setUnknownWordModel, setWordNumberer, tagNumberer, train, train, train, train, trainWithExpansion, treeToEvents, treeToEvents, tune, wordNumberer, writeData
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FactoredLexicon

public FactoredLexicon(MorphoFeatureSpecification morphoSpec)

FactoredLexicon

public FactoredLexicon(Options.LexOptions op,
                       MorphoFeatureSpecification morphoSpec)

Method Detail

ruleIteratorByWord

public Iterator<IntTaggedWord> ruleIteratorByWord(int word,
                                                  int loc,
                                                  String featureSpec)
Description copied from class: BaseLexicon
Generate the possible taggings for a word at a sentence position. The taggings may be based either on a strict lexicon or on an expanded, more generous set of possible taggings.

Implementation note: Expanded sets of possible taggings are calculated dynamically at runtime, so as to reduce the memory used by the lexicon (a space/time tradeoff).

Specified by:
ruleIteratorByWord in interface Lexicon
Overrides:
ruleIteratorByWord in class BaseLexicon
Parameters:
word - The word (as an int)
loc - The word's index in the sentence (usually only relevant for unknown words)
featureSpec - Additional word features, such as morphosyntactic information.
Returns:
An iterator over the possible taggings
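
The space/time tradeoff described in the implementation note can be sketched as follows. This is a simplified illustration under stated assumptions, not the code of BaseLexicon or FactoredLexicon: the class `LazyTaggingSketch`, its methods `observe` and `tagIteratorByWord`, and the use of bare integer tag ids are all hypothetical. Only taggings seen in training are stored per word; the generous fallback set is shared and handed out at lookup time instead of being stored for every word.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a strict lexicon with a lazily supplied
// "generous" fallback. Names and structure are assumptions.
public class LazyTaggingSketch {
    // Strict lexicon: tags actually observed with each word id.
    private final Map<Integer, List<Integer>> seenTags = new HashMap<>();
    // Shared fallback set (for example, open-class tags); stored once.
    private final List<Integer> openClassTags;

    public LazyTaggingSketch(List<Integer> openClassTags) {
        this.openClassTags = new ArrayList<>(openClassTags);
    }

    /** Records that a word was seen with a tag during training. */
    public void observe(int word, int tag) {
        List<Integer> tags = seenTags.computeIfAbsent(word, k -> new ArrayList<>());
        if (!tags.contains(tag)) {
            tags.add(tag);
        }
    }

    /** Known words get their strict taggings; unknown words get the shared fallback. */
    public Iterator<Integer> tagIteratorByWord(int word) {
        List<Integer> strict = seenTags.get(word);
        if (strict != null) {
            return strict.iterator();
        }
        // No per-word expanded set is stored; the fallback is resolved at call time.
        return openClassTags.iterator();
    }
}
```

A caller would drain the returned iterator once per lookup, which matches the documented return type of an Iterator rather than a stored list.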


Stanford NLP Group