java.lang.Object
  |
  +--edu.stanford.nlp.parser.lexparser.LexicalizedParser
A reasonably good lexicalized PCFG parser. It implements a product-of-experts model that combines plain PCFG parsing with lexicalized dependency parsing. Alternatively, it can do unlexicalized PCFG parsing by using just that component parser. Note that training requires a lot of memory; try -mx1500m. See the package documentation for more details and examples of use. See the main method documentation for details of invoking the parser.
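The product-of-experts idea can be illustrated with a small sketch. Multiplying the probabilities assigned by two models is the same as adding their log-probabilities, so a candidate's factored score is the sum of its PCFG and dependency log scores. The class and method names below are hypothetical, and rescoring a fixed list of candidates is a simplification for illustration, not a description of how this parser searches.

```java
class FactoredScoreSketch {
    /** Product of experts in log space: multiplying probabilities means adding log-probabilities. */
    static double combine(double pcfgLogProb, double depLogProb) {
        return pcfgLogProb + depLogProb;
    }

    /** Returns the index of the candidate with the highest combined score, or -1 if there are none. */
    static int best(double[] pcfgLogProbs, double[] depLogProbs) {
        if (pcfgLogProbs.length != depLogProbs.length) {
            throw new IllegalArgumentException("score arrays must have the same length");
        }
        int bestIndex = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < pcfgLogProbs.length; i++) {
            double score = combine(pcfgLogProbs[i], depLogProbs[i]);
            if (score > bestScore) {
                bestScore = score;
                bestIndex = i;
            }
        }
        return bestIndex;
    }

    public static void main(String[] args) {
        // Hypothetical log scores for three candidate analyses of one sentence.
        double[] pcfg = {-2.0, -1.0, -3.0};
        double[] dep = {-1.0, -4.0, -1.5};
        // The PCFG alone prefers candidate 1; the combined score prefers candidate 0.
        System.out.println(best(pcfg, dep)); // prints 0
    }
}
```

In this toy data the two experts disagree, which is the situation where combining them changes the outcome.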
Constructor Summary | |
LexicalizedParser()
Construct a new LexicalizedParser object from a previously assembled grammar read from the property edu.stanford.nlp.SerializedLexicalizedParser, or from a default place. |
|
LexicalizedParser(edu.stanford.nlp.parser.lexparser.LexicalizedParser.ParserData pd)
Construct a new LexicalizedParser object from a previously assembled grammar. |
|
LexicalizedParser(ObjectInputStream in)
Construct a new LexicalizedParser object from a previously assembled grammar read from an InputStream. |
|
LexicalizedParser(ObjectInputStream in,
int maxLeng)
Construct a new LexicalizedParser object from a previously assembled grammar read from an InputStream. |
|
LexicalizedParser(String serializedFileOrUrl)
Construct a new LexicalizedParser from parser data loaded from the given serialized file or URL. |
|
LexicalizedParser(String treebankPath,
FileFilter filt,
int maxLeng)
Construct a new LexicalizedParser. |
|
LexicalizedParser(String treebankPath,
FileFilter filt,
int maxLeng,
TreebankLangParserParams tlpParams)
Construct a new LexicalizedParser. |
|
LexicalizedParser(String treebankPath,
FileFilter filt,
TreebankLangParserParams tlpParams)
Construct a new LexicalizedParser by training from treebank files. |
|
LexicalizedParser(String serializedFileOrUrl,
int maxLeng)
Construct a new LexicalizedParser from parser data loaded from the given serialized file or URL, with a maximum sentence length. |
|
LexicalizedParser(String treebankPath,
TreebankLangParserParams tlpParams)
Construct a new LexicalizedParser by training from treebank files. |
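The ObjectInputStream constructors take a stream that the caller has already opened. The following is a minimal sketch of opening one, assuming the .gz naming convention that the main method documentation describes for serialized parser files. SerializedStreamSketch, open and write are hypothetical helpers, not part of this package; write exists only so the sketch can be run without a real grammar file.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class SerializedStreamSketch {
    /** Opens an ObjectInputStream, adding GZip decompression when the name ends in ".gz". */
    static ObjectInputStream open(String path) throws IOException {
        InputStream raw = new BufferedInputStream(new FileInputStream(path));
        try {
            InputStream in = path.endsWith(".gz") ? new GZIPInputStream(raw) : raw;
            return new ObjectInputStream(in);
        } catch (IOException e) {
            raw.close(); // do not leak the file handle if a stream header is invalid
            throw e;
        }
    }

    /** Writes one object with the same naming convention (demonstration helper only). */
    static void write(String path, Serializable value) throws IOException {
        OutputStream raw = new BufferedOutputStream(new FileOutputStream(path));
        OutputStream out = path.endsWith(".gz") ? new GZIPOutputStream(raw) : raw;
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(value);
        }
    }

    public static void main(String[] args) throws Exception {
        File tmp = File.createTempFile("grammar", ".ser.gz");
        tmp.deleteOnExit();
        write(tmp.getPath(), "stand-in for parser data");
        try (ObjectInputStream in = open(tmp.getPath())) {
            System.out.println(in.readObject());
        }
    }
}
```

A stream obtained this way is the kind of argument that LexicalizedParser(ObjectInputStream in) is documented to accept.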
Method Summary | |
Object |
apply(Object in)
Converts a Sentence/List into a Tree. |
protected static edu.stanford.nlp.parser.lexparser.LexicalizedParser.ParserData |
deserializeParser(String filenameOrUrl)
|
Tree |
getBestDependencyParse()
|
Tree |
getBestParse()
Return the best parse of the sentence most recently parsed. |
Tree |
getBestPCFGParse()
|
static void |
main(String[] args)
A simple main program for using the parser. |
boolean |
parse(List sentence)
Parse a sentence represented as a List. |
boolean |
parse(Sentence sentence)
Parse a Sentence. |
boolean |
parse(Sentence sentence,
String goal)
Parse a Sentence. |
void |
setTreebankLangParserParams(TreebankLangParserParams tlpp)
Allows the caller to specify a TreebankLangParserParams to use. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
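A typical calling sequence is parse followed by getBestParse. The sketch below shows one way to guard that sequence, assuming the boolean returned by parse reports whether a parse was found (the summary above does not say so explicitly). ListParser is a hypothetical stand-in that mirrors only the documented parse(List) and getBestParse() signatures so the example compiles without the parser library, and toyParser is a demonstration stub, not a real parser.

```java
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

class ParseCallSketch {
    /** Hypothetical stand-in mirroring the documented parse(List) and getBestParse() methods. */
    interface ListParser {
        boolean parse(List<String> sentence);
        Object getBestParse();
    }

    /** Returns the best parse, or null when the sentence could not be parsed. */
    static Object parseOrNull(ListParser parser, List<String> sentence) {
        try {
            if (!parser.parse(sentence)) {
                return null; // assumed meaning of false: no parse was found
            }
            return parser.getBestParse(); // safe here: the preceding parse succeeded
        } catch (UnsupportedOperationException e) {
            return null; // documented for sentences that are too long or exhaust resources
        }
    }

    /** Toy parser for demonstration only: brackets the words and rejects long input. */
    static ListParser toyParser(final int maxLength) {
        return new ListParser() {
            private String best; // result of the most recent successful parse

            public boolean parse(List<String> sentence) {
                if (sentence.size() > maxLength) {
                    throw new UnsupportedOperationException("sentence too long");
                }
                best = sentence.isEmpty() ? null : "(X " + String.join(" ", sentence) + ")";
                return best != null;
            }

            public Object getBestParse() {
                if (best == null) {
                    throw new NoSuchElementException("no successfully parsed sentence");
                }
                return best;
            }
        };
    }

    public static void main(String[] args) {
        ListParser parser = toyParser(3);
        System.out.println(parseOrNull(parser, Arrays.asList("dogs", "bark")));
        System.out.println(parseOrNull(parser, Arrays.asList("a", "b", "c", "d")));
    }
}
```

Checking the result of parse before calling getBestParse avoids the NoSuchElementException that getBestParse is documented to throw when no sentence has been parsed successfully.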
Constructor Detail |
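public LexicalizedParser()
Construct a new LexicalizedParser object from a previously assembled grammar read from the property edu.stanford.nlp.SerializedLexicalizedParser, or from a default place.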
public LexicalizedParser(String serializedFileOrUrl)
Throws:
IllegalArgumentException - If parser data cannot be loaded

public LexicalizedParser(String serializedFileOrUrl, int maxLeng)
Parameters:
maxLeng - Maximum sentence length that you want the parser to be able to parse (this affects memory consumption)
Throws:
IllegalArgumentException - If parser data cannot be loaded

public LexicalizedParser(edu.stanford.nlp.parser.lexparser.LexicalizedParser.ParserData pd)
Parameters:
pd - A ParserData object (not null)

public LexicalizedParser(ObjectInputStream in) throws Exception
Parameters:
in - The ObjectInputStream

public LexicalizedParser(ObjectInputStream in, int maxLeng) throws Exception
Parameters:
in - The ObjectInputStream
maxLeng - Maximum sentence length that you want the parser to be able to parse (this affects memory consumption)

public LexicalizedParser(String treebankPath, FileFilter filt, TreebankLangParserParams tlpParams)
Parameters:
treebankPath - a String value
filt - a FileFilter value. This may be null if no filtering of selected files is needed.

public LexicalizedParser(String treebankPath, TreebankLangParserParams tlpParams)
Parameters:
treebankPath - a String value

public LexicalizedParser(String treebankPath, FileFilter filt, int maxLeng, TreebankLangParserParams tlpParams)
Parameters:
treebankPath - a String value
filt - a FileFilter value
maxLeng - The maximum sentence length that the parser should be able to parse. A large value requires a great deal of memory (and time) for parsing, but allows parsing longer sentences.
tlpParams - The Treebank parameters class for different languages

public LexicalizedParser(String treebankPath, FileFilter filt, int maxLeng)
Parameters:
treebankPath - a String value
filt - a FileFilter value
maxLeng - The maximum sentence length that the parser should be able to parse. A large value requires a great deal of memory (and time) for parsing, but allows parsing longer sentences.

Method Detail |
public void setTreebankLangParserParams(TreebankLangParserParams tlpp)
Parameters:
tlpp - The TreebankLangParserParams to use

public Object apply(Object in)
Specified by:
apply in interface Appliable
Parameters:
in - The input Sentence/List
Throws:
IllegalArgumentException - If argument isn't a List

public boolean parse(Sentence sentence)
Specified by:
parse in interface Parser
Parameters:
sentence - A Sentence to be parsed

public boolean parse(Sentence sentence, String goal)
Specified by:
parse in interface Parser
Parameters:
sentence - A Sentence to be parsed
goal - The category to parse the sentence as (e.g., NP, S)

public boolean parse(List sentence)
Parameters:
sentence - The sentence to parse
Throws:
UnsupportedOperationException - If the Sentence is too long or otherwise fails for resource reasons

public Tree getBestParse()
Specified by:
getBestParse in interface ViterbiParser
Throws:
NoSuchElementException - If no sentence has previously been parsed successfully

public Tree getBestPCFGParse()

public Tree getBestDependencyParse()

protected static edu.stanford.nlp.parser.lexparser.LexicalizedParser.ParserData deserializeParser(String filenameOrUrl)

public static void main(String[] args)
Usages:
java edu.stanford.nlp.parser.lexparser.LexicalizedParser [-v] -validate
trainFilesPath [start stop
-treebank [testFilePath [start stop]]]
java -mx512m edu.stanford.nlp.parser.lexparser.LexicalizedParser
[-v] serializedParserFilename filename+
java -mx512m edu.stanford.nlp.parser.lexparser.LexicalizedParser
[-v] serializedParserFilename -treebank testFilePath [start stop]
java edu.stanford.nlp.parser.lexparser.LexicalizedParser [-v] -train
trainFilesPath [start stop] serializedParserFilename
If the serializedParserFilename ends in .gz, then the serialization data is written and read compressed (GZip). The argument filename may be a URL, starting with http://. If no files are supplied in the third usage, then a hardwired sentence is parsed. All final arguments are passed to FactoredParser.
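As an illustration of the second usage form, the following sketch assembles the argument list for a child JVM. ParserCommandSketch and parseFilesCommand are hypothetical names, and the file names in main are placeholders. The list is only built and printed here; it has the shape that java.lang.ProcessBuilder accepts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ParserCommandSketch {
    static final String MAIN_CLASS = "edu.stanford.nlp.parser.lexparser.LexicalizedParser";

    /** Builds: java -mx512m MAIN_CLASS [options] serializedParserFilename filename+ */
    static List<String> parseFilesCommand(List<String> options, String serializedParser,
                                          List<String> files) {
        List<String> cmd = new ArrayList<String>();
        cmd.add("java");
        cmd.add("-mx512m");       // heap size taken from the usage line
        cmd.add(MAIN_CLASS);
        cmd.addAll(options);      // flags such as -v go before the serialized parser name
        cmd.add(serializedParser);
        cmd.addAll(files);        // each entry is a file name or an http:// URL
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = parseFilesCommand(
                Arrays.asList("-v"),
                "englishParser.ser.gz",       // placeholder file name
                Arrays.asList("input.txt"));  // placeholder file name
        System.out.println(String.join(" ", cmd));
    }
}
```

Keeping the arguments as a list rather than one string avoids quoting problems when a file name contains spaces.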
In the same position as the verbose flag (-v), many other options can be specified. The most useful to an end user are:
-tLPP class: Specify a different TreebankLangParserParams, for when using a different language or treebank (the default is English Penn Treebank)
-encoding charset: Specify the character encoding of the input files
-tokenized: Says that the input is already separated into whitespace-delimited tokens
-tokenizer class: Specifies a Tokenizer class to be used for tokenization
-sentences token: Specifies a token that marks sentence boundaries, or "sDelimited" or "newline", which have special interpretations (see package documentation)
-tagSeparator char: Specifies that tags are attached to words, separated from them by the reserved character char
-maxLength leng: Specify the longest sentence that will be parsed (and hence indirectly the amount of memory needed)
-outputTreeFormat style: Choose the style of output sentences: penn for pretty-printing as in the Penn treebank files, or oneline for printing sentences one per line
Parameters:
args - Command line arguments, as above
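To make the -tokenized and -tagSeparator options concrete, here is a sketch of how pre-tokenized, tagged input such as dog_NN could be split into words and tags. Splitting at the last occurrence of the separator is an assumption made for this example; the documentation above does not say which occurrence the parser uses. TagSeparatorSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class TagSeparatorSketch {
    /** Splits one token at the last separator; the tag is null when no usable separator exists. */
    static String[] splitToken(String token, char sep) {
        int i = token.lastIndexOf(sep);
        if (i <= 0 || i == token.length() - 1) {
            return new String[] { token, null }; // no tag attached to this word
        }
        return new String[] { token.substring(0, i), token.substring(i + 1) };
    }

    /** Splits a whitespace-delimited line into {word, tag} pairs. */
    static List<String[]> splitLine(String line, char sep) {
        List<String[]> pairs = new ArrayList<String[]>();
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return pairs; // nothing to split
        }
        for (String token : trimmed.split("\\s+")) {
            pairs.add(splitToken(token, sep));
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (String[] pair : splitLine("The_DT dog_NN barks_VBZ", '_')) {
            System.out.println(pair[0] + "\t" + pair[1]);
        }
    }
}
```

Choosing a separator that never occurs inside a word, which is presumably what "reserved" means here, makes the split unambiguous regardless of which occurrence is used.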