java.lang.Object
  edu.stanford.nlp.parser.lexparser.PetrovLexicon
public class PetrovLexicon
Field Summary |
---|
Fields inherited from interface edu.stanford.nlp.parser.lexparser.Lexicon |
---|
BOUNDARY, BOUNDARY_TAG |
Constructor Summary |
---|
PetrovLexicon() |
Method Summary | | |
---|---|---|
int | getSignature(int word, int loc) | |
java.lang.String | getSignature(java.lang.String word, int loc) | |
UnknownWordModel | getUnknownWordModel() | |
protected void | initRulesWithWord() | |
boolean | isKnown(int word) | Checks whether a word is in the lexicon. |
boolean | isKnown(java.lang.String word) | Checks whether a word is in the lexicon. |
int | numRules() | Returns the number of rules (tag rewrites as word) in the Lexicon. |
void | readData(java.io.BufferedReader in) | Read the lexicon from the BufferedReader in the format written by writeData. |
java.util.Iterator<IntTaggedWord> | ruleIteratorByWord(int word, int loc) | Get an iterator over all rules (pairs of (word, POS)) for this word. |
float | score(IntTaggedWord iTW, int loc) | Computes an estimate of log P(word \| tag). |
void | setUnknownWordModel(UnknownWordModel uwm) | |
void | train(java.util.Collection<Tree> trees) | Trains this lexicon on the Collection of trees. |
void | writeData(java.io.Writer w) | Write the lexicon in human-readable format to the Writer. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public PetrovLexicon()
Method Detail |
---|
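public boolean isKnown(int word)
Description copied from interface: Lexicon
Checks whether a word is in the lexicon.
Specified by: isKnown in interface Lexicon
Parameters:
word - The word as an int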
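public boolean isKnown(java.lang.String word)
Description copied from interface: Lexicon
Checks whether a word is in the lexicon.
Specified by: isKnown in interface Lexicon
Parameters:
word - The word as a String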
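The two isKnown overloads above take the same word either as an int or as a String. The sketch below shows one way such a pair could be kept consistent through a shared word index. KnownWordSketch, indexOf, and observe are hypothetical names invented for this illustration; this is not the PetrovLexicon implementation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for illustration only; not the PetrovLexicon code. */
final class KnownWordSketch {
    // Assumed word-to-index table, playing the role of a word numberer.
    private final Map<String, Integer> wordIndex = new HashMap<>();
    // Indices of the words that have been observed so far.
    private final Set<Integer> seen = new HashSet<>();

    /** Returns the index of a word, assigning a fresh index on first use. */
    int indexOf(String word) {
        Integer id = wordIndex.get(word);
        if (id == null) {
            id = wordIndex.size();
            wordIndex.put(word, id);
        }
        return id;
    }

    /** Records a word as known. */
    void observe(String word) {
        seen.add(indexOf(word));
    }

    /** Integer form: the word is given by its index. */
    boolean isKnown(int word) {
        return seen.contains(word);
    }

    /** String form: looks up the index without allocating a new one. */
    boolean isKnown(String word) {
        Integer id = wordIndex.get(word);
        return id != null && seen.contains(id);
    }

    public static void main(String[] args) {
        KnownWordSketch lex = new KnownWordSketch();
        lex.observe("parser");
        System.out.println(lex.isKnown("parser"));
        System.out.println(lex.isKnown("zyzzyva"));
        System.out.println(lex.isKnown(lex.indexOf("parser")));
    }
}
```

In this sketch the String overload never assigns a new index, so querying an unseen word does not grow the index table.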
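public java.util.Iterator<IntTaggedWord> ruleIteratorByWord(int word, int loc)
Description copied from interface: Lexicon
Get an iterator over all rules (pairs of (word, POS)) for this word.
Specified by: ruleIteratorByWord in interface Lexicon
Parameters:
word - The word, represented as an integer in Numberer
loc - The position of the word in the sentence (counting from 0). Implementation note: The BaseLexicon class doesn't actually make use of this position information.
Returns: An iterator over the rules for this word. (Each rule can be thought of as a tag -> word rule.)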
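To make the idea of iterating over tag -> word rules concrete, here is a minimal sketch that stores rules per word and hands back an iterator. RuleIteratorSketch, WordTag, and addRule are hypothetical names; WordTag merely stands in for IntTaggedWord, and the storage layout is an assumption rather than what PetrovLexicon does internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of per-word rule storage. */
final class RuleIteratorSketch {
    /** Hypothetical pair of integer ids; stands in for IntTaggedWord. */
    static final class WordTag {
        final int word;
        final int tag;

        WordTag(int word, int tag) {
            this.word = word;
            this.tag = tag;
        }
    }

    // Assumed layout: word id -> all (word, tag) rules seen for that word.
    private final Map<Integer, List<WordTag>> rulesWithWord = new HashMap<>();

    /** Registers one tag -> word rule. */
    void addRule(int word, int tag) {
        rulesWithWord.computeIfAbsent(word, k -> new ArrayList<>())
                     .add(new WordTag(word, tag));
    }

    /** Iterates over the rules for a word; loc is accepted but ignored here. */
    Iterator<WordTag> ruleIteratorByWord(int word, int loc) {
        List<WordTag> rules =
            rulesWithWord.getOrDefault(word, Collections.emptyList());
        return rules.iterator();
    }

    /** Total number of stored rules across all words. */
    int numRules() {
        int total = 0;
        for (List<WordTag> rules : rulesWithWord.values()) {
            total += rules.size();
        }
        return total;
    }

    public static void main(String[] args) {
        RuleIteratorSketch lex = new RuleIteratorSketch();
        lex.addRule(0, 1);
        lex.addRule(0, 2);
        lex.addRule(3, 1);
        Iterator<WordTag> it = lex.ruleIteratorByWord(0, 0);
        while (it.hasNext()) {
            WordTag rule = it.next();
            System.out.println("tag " + rule.tag + " -> word " + rule.word);
        }
        System.out.println("rules: " + lex.numRules());
    }
}
```

A word with no stored rules yields an empty iterator rather than null, which keeps caller loops simple.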
protected void initRulesWithWord()
public int numRules()
Returns the number of rules (tag rewrites as word) in the Lexicon.
Specified by: numRules in interface Lexicon
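public void train(java.util.Collection<Tree> trees)
Description copied from interface: Lexicon
Trains this lexicon on the Collection of trees.
Specified by: train in interface Lexicon
Parameters:
trees - Trees to train on
public float score(IntTaggedWord iTW, int loc)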
Computes an estimate of log P(word | tag).
Specified by: score in interface Lexicon
Parameters:
iTW - An IntTaggedWord pairing a word and POS tag
loc - The position in the sentence. In the default implementation this is used only for unknown words to change their probability distribution when sentence initial.
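As a rough illustration of what an estimate of log P(word | tag) can look like, the sketch below uses a plain relative-frequency estimate, log(count(word, tag) / count(tag)), over toy counts. This is an assumption chosen for clarity, not the estimator PetrovLexicon uses: it has no smoothing and no unknown-word handling, and it ignores the loc parameter. LexiconScoreSketch and count are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical relative-frequency scorer; not the PetrovLexicon estimator. */
final class LexiconScoreSketch {
    // Joint counts keyed by a packed (word, tag) pair.
    private final Map<Long, Integer> wordTagCounts = new HashMap<>();
    // Marginal counts per tag.
    private final Map<Integer, Integer> tagCounts = new HashMap<>();

    /** Packs two int ids into one long key. */
    private static long key(int word, int tag) {
        return ((long) word << 32) | (tag & 0xFFFFFFFFL);
    }

    /** Records one observation of a word with a tag. */
    void count(int word, int tag) {
        wordTagCounts.merge(key(word, tag), 1, Integer::sum);
        tagCounts.merge(tag, 1, Integer::sum);
    }

    /** Returns log(count(word, tag) / count(tag)), or -infinity if unseen. */
    float score(int word, int tag) {
        int joint = wordTagCounts.getOrDefault(key(word, tag), 0);
        int total = tagCounts.getOrDefault(tag, 0);
        if (joint == 0 || total == 0) {
            return Float.NEGATIVE_INFINITY;
        }
        return (float) Math.log((double) joint / total);
    }

    public static void main(String[] args) {
        LexiconScoreSketch lex = new LexiconScoreSketch();
        for (int i = 0; i < 3; i++) {
            lex.count(0, 1);   // word 0 seen three times with tag 1
        }
        lex.count(2, 1);       // word 2 seen once with tag 1
        System.out.println(lex.score(0, 1));  // log(3/4)
        System.out.println(lex.score(5, 1));  // unseen pair
    }
}
```

A real lexicon would normally smooth these counts so that unseen (word, tag) pairs do not receive a score of negative infinity.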
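public void writeData(java.io.Writer w) throws java.io.IOException
Description copied from interface: Lexicon
Write the lexicon in human-readable format to the Writer.
Specified by: writeData in interface Lexicon
Parameters:
w - The writer to output to
Throws:
java.io.IOException - If an I/O problem occurs
public void readData(java.io.BufferedReader in) throws java.io.IOException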
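Description copied from interface: Lexicon
Read the lexicon from the BufferedReader in the format written by writeData.
Specified by: readData in interface Lexicon
Parameters:
in - The BufferedReader to read from
Throws:
java.io.IOException - If an I/O problem occurs
public int getSignature(int word, int loc)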
public java.lang.String getSignature(java.lang.String word, int loc)
public UnknownWordModel getUnknownWordModel()
Specified by: getUnknownWordModel in interface Lexicon
public void setUnknownWordModel(UnknownWordModel uwm)
Specified by: setUnknownWordModel in interface Lexicon
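The writeData and readData entries above describe a round trip: readData consumes the format that writeData produces. The sketch below illustrates that contract with a Writer and a BufferedReader over an in-memory buffer. The line format ("tag word count"), the class LexiconIoSketch, and the roundTrip helper are all hypothetical; the actual on-disk format of PetrovLexicon is not given on this page.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical round-trip illustration; the line format is an assumption. */
final class LexiconIoSketch {
    // Key is "tag word" (space-separated); value is an observed count.
    final Map<String, Integer> counts = new TreeMap<>();

    /** Writes one "tag word count" line per entry. */
    void writeData(Writer w) throws IOException {
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            w.write(e.getKey() + " " + e.getValue() + "\n");
        }
        w.flush();
    }

    /** Reads lines in the format produced by writeData. */
    void readData(BufferedReader in) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue;  // tolerate blank lines
            }
            int split = line.lastIndexOf(' ');
            counts.put(line.substring(0, split),
                       Integer.parseInt(line.substring(split + 1)));
        }
    }

    /** Writes the given counts to a buffer and reads them back. */
    static Map<String, Integer> roundTrip(Map<String, Integer> original) {
        try {
            LexiconIoSketch out = new LexiconIoSketch();
            out.counts.putAll(original);
            StringWriter buffer = new StringWriter();
            out.writeData(buffer);

            LexiconIoSketch in = new LexiconIoSketch();
            in.readData(new BufferedReader(new StringReader(buffer.toString())));
            return in.counts;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> original = new TreeMap<>();
        original.put("NN parser", 3);
        original.put("VB parse", 2);
        System.out.println(roundTrip(original).equals(original));
    }
}
```

Checking that the data read back equals the data written is a cheap way to test any such pair of methods.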