java.lang.Object
  edu.stanford.nlp.parser.lexparser.Lexicon
A class implementing a Lexicon.
Field Summary
static String | BOUNDARY
static String | BOUNDARY_TAG
protected int | lastSentencePosition
protected int | lastSignatureIndex
    Caches the index of the last signature looked up, because the same signature is requested many times when an unknown word is encountered. (Note that under the current scheme, an unknown word seen both sentence-initially and non-initially will be parsed with two different signatures.)
protected int | lastWordToSignaturize
protected static short | nullTag
protected static int | nullWord
protected Set | rules
protected List[] | rulesWithWord
protected Counter | seenCounter
protected static long | serialVersionUID
protected Set | sigs
protected Numberer | tagNumberer
protected Set | tags
static String | UNKNOWN_WORD
protected Counter | unSeenCounter
protected Numberer | wordNumberer
protected Set | words
Constructor Summary
Lexicon()
Method Summary
protected void | addTagging(boolean seen, IntTaggedWord itw, double count)
    Adds the tagging with count to the data structures in this Lexicon.
double | evaluateCoverage(Collection trees, Set missingWords, Set missingTags, Set missingTW)
    Evaluates how many words (= terminals) in a collection of trees are covered by the lexicon.
protected String | getSignature(String word, int loc)
    Returns a String that is the "signature" of the class of a word.
protected int | getSignatureIndex(int wordIndex, int sentencePosition)
    Returns the index of the signature of the word numbered wordIndex, where the signature is the String representation of unknown word features.
protected void | initRulesWithWord()
boolean | isKnown(int word)
boolean | isKnown(String word)
    Checks whether a word is in the lexicon.
void | printLexStats()
void | readData(BufferedReader in)
    Populates data in this Lexicon from the character stream given by the BufferedReader in.
protected void | readObject(ObjectInputStream stream)
Iterator | ruleIterator()
Iterator | ruleIteratorByWord(int word, int loc)
double | score(IntTaggedWord iTW, int loc)
protected double | scoreAll(List trees)
String | showTags()
Numberer | tagNumberer()
void | train(Collection trees)
    Trains this lexicon on the Collection of trees.
protected List | treeToEvents(Collection trees)
protected List | treeToEvents(Tree tree)
void | tune(List trees)
Numberer | wordNumberer()
void | writeData(Writer w)
    Writes out data from this Object to the Writer w.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
public static final String UNKNOWN_WORD
public static final String BOUNDARY
public static final String BOUNDARY_TAG
protected transient List[] rulesWithWord
protected transient Set rules
protected transient Set tags
protected transient Set words
protected transient Set sigs
protected Counter seenCounter
protected Counter unSeenCounter
protected static final int nullWord
protected static final short nullTag
protected transient int lastSignatureIndex
protected transient int lastSentencePosition
protected transient int lastWordToSignaturize
protected transient Numberer tagNumberer
protected transient Numberer wordNumberer
protected static final long serialVersionUID
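The fields lastWordToSignaturize, lastSentencePosition and lastSignatureIndex support the one-entry signature cache described in the field summary. The following is a minimal sketch of how such a cache might work, not this class's actual code. The class name SignatureCacheSketch, the computeCount counter, the placeholder computation, and the choice to key the cache on both word index and sentence position are all assumptions made for illustration.

```java
final class SignatureCacheSketch {
    // Field names mirror the documented ones; -1 means "nothing cached yet".
    private int lastWordToSignaturize = -1;
    private int lastSentencePosition = -1;
    private int lastSignatureIndex = -1;

    // Hypothetical counter, present only so the effect of the cache is observable.
    int computeCount = 0;

    int getSignatureIndex(int wordIndex, int sentencePosition) {
        // Cache hit: same word at the same position as the previous lookup.
        if (wordIndex == lastWordToSignaturize
                && sentencePosition == lastSentencePosition) {
            return lastSignatureIndex;
        }
        // Cache miss: compute the signature index and remember the lookup.
        int sig = computeSignatureIndex(wordIndex, sentencePosition);
        lastWordToSignaturize = wordIndex;
        lastSentencePosition = sentencePosition;
        lastSignatureIndex = sig;
        return sig;
    }

    // Placeholder for the expensive step (assumption): sentence-initial
    // lookups get a different index from non-initial ones.
    private int computeSignatureIndex(int wordIndex, int sentencePosition) {
        computeCount++;
        return 31 * wordIndex + (sentencePosition == 0 ? 1 : 0);
    }
}
```

Because the sketch keys the cache on position as well as on the word, the same unknown word looked up sentence-initially and then non-initially is computed twice, which matches the "two different signatures" caveat in the field summary.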
Constructor Detail
public Lexicon()
Method Detail
public Numberer tagNumberer()
public Numberer wordNumberer()
public Iterator ruleIterator()
public boolean isKnown(int word)
public boolean isKnown(String word)
word - The word as a String
public Iterator ruleIteratorByWord(int word, int loc)
protected void initRulesWithWord()
protected List treeToEvents(Tree tree)
protected List treeToEvents(Collection trees)
public void train(Collection trees)
protected void addTagging(boolean seen, IntTaggedWord itw, double count)
protected int getSignatureIndex(int wordIndex, int sentencePosition)
protected String getSignature(String word, int loc)
word - The word to make a signature for
loc - Its position in the sentence (mainly so sentence-initial capitalized words can be treated differently)
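To illustrate the idea of a word "signature" and why loc matters, here is a small hypothetical sketch. The class name SignatureSketch, the feature labels (UNK, INITC, CAP, LC, NUM, DASH), and the convention that loc 0 is sentence-initial are assumptions for illustration only. They are not the features this Lexicon actually computes.

```java
final class SignatureSketch {
    /**
     * Maps a word to a coarse feature string. A capitalized word at loc 0
     * gets a weaker label than one elsewhere, because sentence-initial
     * capitalization says little about the word itself.
     */
    static String signature(String word, int loc) {
        StringBuilder sb = new StringBuilder("UNK");
        if (word.isEmpty()) {
            return sb.toString();
        }
        boolean hasDigit = false;
        boolean hasDash = false;
        boolean hasLower = false;
        for (int i = 0; i < word.length(); i++) {
            char ch = word.charAt(i);
            if (Character.isDigit(ch)) {
                hasDigit = true;
            } else if (ch == '-') {
                hasDash = true;
            } else if (Character.isLowerCase(ch)) {
                hasLower = true;
            }
        }
        if (Character.isUpperCase(word.charAt(0))) {
            // Hypothetical labels: INITC for sentence-initial, CAP otherwise.
            sb.append(loc == 0 ? "-INITC" : "-CAP");
        } else if (hasLower) {
            sb.append("-LC");
        }
        if (hasDigit) {
            sb.append("-NUM");
        }
        if (hasDash) {
            sb.append("-DASH");
        }
        return sb.toString();
    }
}
```

In this sketch, signature("Stanford", 0) and signature("Stanford", 3) yield different strings, so a single unknown word can end up with two signatures depending on where it occurs.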
public double score(IntTaggedWord iTW, int loc)
protected double scoreAll(List trees)
public void tune(List trees)
public void printLexStats()
public double evaluateCoverage(Collection trees, Set missingWords, Set missingTags, Set missingTW)
public String showTags()
protected void readObject(ObjectInputStream stream) throws IOException, ClassNotFoundException
IOException
ClassNotFoundException
public void readData(BufferedReader in) throws IOException
IOException
public void writeData(Writer w) throws IOException
IOException
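The coverage figure described for evaluateCoverage can be pictured with a simplified sketch that works on plain tokens instead of trees. The class name CoverageSketch, the use of string sets, and the choice to return a fraction between 0 and 1 are assumptions. The real method also tracks missing tags and missing word/tag pairs, and its return scale is not stated in this page.

```java
import java.util.List;
import java.util.Set;

final class CoverageSketch {
    /**
     * Returns the fraction of tokens found in knownWords and adds every
     * uncovered token to missingWords (an output parameter, mirroring the
     * missingWords argument of evaluateCoverage).
     */
    static double coverage(List<String> tokens,
                           Set<String> knownWords,
                           Set<String> missingWords) {
        if (tokens.isEmpty()) {
            return 0.0; // nothing to cover; avoids division by zero
        }
        int covered = 0;
        for (String token : tokens) {
            if (knownWords.contains(token)) {
                covered++;
            } else {
                missingWords.add(token);
            }
        }
        return (double) covered / tokens.size();
    }
}
```

Passing the missing-word set in as a parameter lets the caller inspect which terminals were not covered after the call returns.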