public class FactoredLexicon extends BaseLexicon
Fields inherited from class BaseLexicon:
DEBUG_LEXICON, DEBUG_LEXICON_SCORE, flexiTag, NULL_ITW, nullTag, nullWord, op, rulesWithWord, seenCounter, smartMutation, smoothInUnknownsThreshold, tagIndex, tags, testOptions, trainOptions, useSignatureForKnownSmoothing, uwModel, uwModelTrainer, uwModelTrainerClass, wordIndex, words

Fields inherited from interface Lexicon:
BOUNDARY, BOUNDARY_TAG, UNKNOWN_WORD
| Constructor and Description |
|---|
| `FactoredLexicon(MorphoFeatureSpecification morphoSpec, Index<String> wordIndex, Index<String> tagIndex)` |
| `FactoredLexicon(Options op, MorphoFeatureSpecification morphoSpec, Index<String> wordIndex, Index<String> tagIndex)` |
| Modifier and Type | Method and Description |
|---|---|
| `protected void` | `initRulesWithWord()` Rule table is lemmas! |
| `static void` | `main(String[] args)` |
| `Iterator<IntTaggedWord>` | `ruleIteratorByWord(int word, int loc, String featureSpec)` Rule table is lemmas. |
| `float` | `score(IntTaggedWord iTW, int loc, String word, String featureSpec)` Get the score of this word with this tag (as an IntTaggedWord) at this location. |
| `void` | `train(Collection<Tree> trees, Collection<Tree> rawTrees)` This method should populate wordIndex, tagIndex, and morphIndex. |
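The summary notes twice that the rule table is keyed by lemmas rather than inflected surface forms. The idea can be sketched with a minimal stdlib table; the class and method names here are hypothetical illustrations, not part of FactoredLexicon:

```java
import java.util.*;

// Hypothetical sketch: a rule table keyed by lemma, so that different
// inflected forms of one word share a single entry of observed taggings.
public class LemmaRuleTable {
    private final Map<String, List<String>> rulesByLemma = new HashMap<>();

    void addTagging(String lemma, String tag) {
        rulesByLemma.computeIfAbsent(lemma, k -> new ArrayList<>()).add(tag);
    }

    // Analogue of ruleIteratorByWord: iterate the taggings seen for a lemma.
    Iterator<String> ruleIteratorByLemma(String lemma) {
        return rulesByLemma.getOrDefault(lemma, Collections.emptyList()).iterator();
    }

    public static void main(String[] args) {
        LemmaRuleTable t = new LemmaRuleTable();
        // "walks" (VBZ) and "walked" (VBD) both index under the lemma "walk".
        t.addTagging("walk", "VBZ");
        t.addTagging("walk", "VBD");
        Iterator<String> it = t.ruleIteratorByLemma("walk");
        while (it.hasNext()) System.out.println(it.next());
    }
}
```

Keying on lemmas pools counts across morphological variants, which matters for morphologically rich languages where any single surface form is sparse.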
Methods inherited from class BaseLexicon:
addAll, addAll, addTagging, evaluateCoverage, examineIntersection, finishTraining, getBaseTag, getUnknownWordModel, incrementTreesRead, initializeTraining, isKnown, isKnown, listToEvents, numRules, printLexStats, readData, ruleIteratorByWord, ruleIteratorByWord, setUnknownWordModel, tagSet, train, train, train, train, train, trainUnannotated, trainWithExpansion, treeToEvents, tune, writeData
public FactoredLexicon(MorphoFeatureSpecification morphoSpec, Index<String> wordIndex, Index<String> tagIndex)
public Iterator<IntTaggedWord> ruleIteratorByWord(int word, int loc, String featureSpec)
Specified by:
ruleIteratorByWord in interface Lexicon
Overrides:
ruleIteratorByWord in class BaseLexicon

Parameters:
word - The word (as an int)
loc - Its index in the sentence (usually only relevant for unknown words)
featureSpec - Additional word features like morphosyntactic information.

public float score(IntTaggedWord iTW, int loc, String word, String featureSpec)
Description copied from class: BaseLexicon

Implementation documentation:

Seen:
c_W = count(W)
c_TW = count(T,W)
c_T = count(T)
c_Tunseen = count(T) among new words in 2nd half
total = count(seen words)
totalUnseen = count("unseen" words)
p_T_U = Pmle(T|"unseen")
pb_T_W = P(T|W); if c_W > smoothInUnknownsThreshold, pb_T_W = c_TW/c_W; else (if not smart mutation) pb_T_W = Bayes prior smooth[1] with p_T_U
p_T = Pmle(T)
p_W = Pmle(W)
pb_W_T = log(pb_T_W * p_W / p_T)   [Bayes' rule]

Note that this doesn't really properly reserve mass for unknowns.

Unseen:
c_TS = count(T,Sig|Unseen)
c_S = count(Sig)
c_T = count(T|Unseen)
c_U = totalUnseen above
p_T_U = Pmle(T|Unseen)
pb_T_S = Bayes smooth of Pmle(T|S) with P(T|Unseen)   [smooth[0]]
pb_W_T = log(P(W|T)) inverted
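The documented smoothing math can be sketched in plain Java. This is an illustrative stand-in, not the actual BaseLexicon code: the counts and the `smooth0`/`smooth1` constants are hypothetical substitutes for the lexicon's internal tables, and the Bayes-prior smooth is written in the standard additive form.

```java
// Illustrative sketch of the documented scoring math; all inputs are
// hypothetical stand-ins for BaseLexicon's internal counts.
public class LexiconScoreSketch {

    /** Seen branch: Bayes-prior smoothing of Pmle(T|W) toward p_T_U. */
    static double pbTW(double cTW, double cW, double pTU, double smooth1) {
        return (cTW + smooth1 * pTU) / (cW + smooth1);
    }

    /** pb_W_T = log(pb_T_W * p_W / p_T), i.e. Bayes' rule in log space. */
    static double pbWT(double pbTW, double pW, double pT) {
        return Math.log(pbTW * pW / pT);
    }

    /** Unseen branch: Bayes smooth of Pmle(T|Sig) with P(T|Unseen). */
    static double pbTS(double cTS, double cS, double pTU, double smooth0) {
        return (cTS + smooth0 * pTU) / (cS + smooth0);
    }

    public static void main(String[] args) {
        double cW = 3, cTW = 2;        // word seen 3 times, twice with tag T
        double pTU = 0.1;              // Pmle(T | "unseen")
        double pT = 0.05, pW = 0.0005; // Pmle(T), Pmle(W)
        double p = pbTW(cTW, cW, pTU, 1.0);
        System.out.println(p);               // smoothed P(T|W) = 0.525
        System.out.println(pbWT(p, pW, pT)); // log P(W|T), via Bayes' rule
        System.out.println(pbTS(4, 10, 0.2, 1.0)); // unseen-branch P(T|Sig)
    }
}
```

Note how the seen branch only falls back to the smoothed estimate when c_W is at or below smoothInUnknownsThreshold; above it, the raw relative frequency c_TW/c_W is trusted.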
Specified by:
score in interface Lexicon
Overrides:
score in class BaseLexicon

Parameters:
iTW - An IntTaggedWord pairing a word and POS tag
loc - The position in the sentence. In the default implementation this is used only for unknown words to change their probability distribution when sentence initial
word - The word itself; useful so we don't have to look it up in an index
featureSpec - TODO

public void train(Collection<Tree> trees, Collection<Tree> rawTrees)
Specified by:
train in interface Lexicon
Overrides:
train in class BaseLexicon
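The contract stated in the summary, populating wordIndex, tagIndex, and morphIndex from the training trees, amounts to building insertion-ordered string-to-int indices over the tagged yield. A hedged stdlib sketch (the tagged-word list and `index` helper are hypothetical stand-ins for the trees' tagged yield and the `Index<String>` type):

```java
import java.util.*;

// Hypothetical sketch of what train(...) must accomplish: walking the
// training trees' (word, tag) pairs and assigning each distinct string
// a stable integer id, in first-seen order.
public class IndexTrainingSketch {
    static Map<String, Integer> index(List<String> items) {
        Map<String, Integer> idx = new LinkedHashMap<>();
        for (String s : items) idx.putIfAbsent(s, idx.size());
        return idx;
    }

    public static void main(String[] args) {
        // Stand-in for the tagged yield of the training trees.
        List<String[]> taggedWords = Arrays.asList(
            new String[]{"the", "DT"}, new String[]{"dog", "NN"},
            new String[]{"the", "DT"}, new String[]{"barks", "VBZ"});
        List<String> words = new ArrayList<>(), tags = new ArrayList<>();
        for (String[] tw : taggedWords) { words.add(tw[0]); tags.add(tw[1]); }
        System.out.println(index(words)); // {the=0, dog=1, barks=2}
        System.out.println(index(tags));  // {DT=0, NN=1, VBZ=2}
    }
}
```

A stable word-to-int mapping is what lets methods like ruleIteratorByWord and score take the word as an `int` rather than a `String`.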
protected void initRulesWithWord()
Overrides:
initRulesWithWord in class BaseLexicon
public static void main(String[] args)
Parameters:
args -