public class LexicalizedParser extends ParserGrammar implements java.io.Serializable
See the package documentation for more details and examples of use.
For information on invoking the parser from the command line, and for
a more detailed list of options, see the main(java.lang.String[])
method.
Note that training on a 1 million word treebank requires a fair amount of memory to run. Try -mx1500m to increase the memory allocated by the JVM.
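As a minimal sketch of typical programmatic use (assuming the Stanford parser jar and the englishPCFG model are on the classpath; the model path is the default location named elsewhere on this page):

```java
import java.util.List;

import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.ling.Sentence;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.trees.Tree;

public class ParseDemo {
  public static void main(String[] args) {
    // Load the default English PCFG grammar from the classpath.
    LexicalizedParser lp = LexicalizedParser.loadModel(
        "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");

    // Turn pretokenized words into HasWord tokens and parse them.
    List<HasWord> sentence = Sentence.toWordList(
        "The quick brown fox jumped over the lazy dog .".split(" "));
    Tree parse = lp.parse(sentence);
    parse.pennPrint();
  }
}
```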
Modifier and Type | Field and Description
---|---
BinaryGrammar | bg
static java.lang.String | DEFAULT_PARSER_LOC
DependencyGrammar | dg
Lexicon | lex
Reranker | reranker
Index&lt;java.lang.String&gt; | stateIndex
Index&lt;java.lang.String&gt; | tagIndex
UnaryGrammar | ug
Index&lt;java.lang.String&gt; | wordIndex
Constructor and Description
---
LexicalizedParser(Lexicon lex, BinaryGrammar bg, UnaryGrammar ug, DependencyGrammar dg, Index&lt;java.lang.String&gt; stateIndex, Index&lt;java.lang.String&gt; wordIndex, Index&lt;java.lang.String&gt; tagIndex, Options op)
Modifier and Type | Method and Description
---|---
static TreeAnnotatorAndBinarizer | buildTrainBinarizer(Options op)
static CompositeTreeTransformer | buildTrainTransformer(Options op)
static CompositeTreeTransformer | buildTrainTransformer(Options op, TreeAnnotatorAndBinarizer binarizer)
static LexicalizedParser | copyLexicalizedParser(LexicalizedParser parser)
java.lang.String[] | defaultCoreNLPFlags() — Returns a set of options which should be set by default when used in CoreNLP.
static Triple&lt;Treebank,Treebank,Treebank&gt; | getAnnotatedBinaryTreebankFromTreebank(Treebank trainTreebank, Treebank secondaryTreebank, Treebank tuneTreebank, Options op)
java.util.List&lt;Eval&gt; | getExtraEvals() — Returns a list of extra Eval objects to use when scoring the parser.
Lexicon | getLexicon()
Options | getOp()
static LexicalizedParser | getParserFromFile(java.lang.String parserFileOrUrl, Options op)
static LexicalizedParser | getParserFromSerializedFile(java.lang.String serializedFileOrUrl)
protected static LexicalizedParser | getParserFromTextFile(java.lang.String textFileOrUrl, Options op)
static LexicalizedParser | getParserFromTreebank(Treebank trainTreebank, Treebank secondaryTrainTreebank, double weight, GrammarCompactor compactor, Options op, Treebank tuneTreebank, java.util.List&lt;java.util.List&lt;TaggedWord&gt;&gt; extraTaggedWords) — A method for training from two different treebanks, the second of which is presumed to be orders of magnitude larger.
java.util.List&lt;ParserQueryEval&gt; | getParserQueryEvals() — Returns a list of Eval-style objects which care about the whole ParserQuery, not just the finished tree.
TreebankLangParserParams | getTLPParams()
TreePrint | getTreePrint() — Returns a TreePrint for formatting parsed output trees.
LexicalizedParserQuery | lexicalizedParserQuery()
static LexicalizedParser | loadModel() — Constructs a new LexicalizedParser object from a previously serialized grammar read from the System property edu.stanford.nlp.SerializedLexicalizedParser, or a default classpath location (edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz).
static LexicalizedParser | loadModel(java.io.ObjectInputStream ois) — Reads one object from the given ObjectInputStream, which is assumed to be a LexicalizedParser.
static LexicalizedParser | loadModel(Options op, java.lang.String... extraFlags) — Constructs a new LexicalizedParser object from a previously serialized grammar read from the System property edu.stanford.nlp.SerializedLexicalizedParser, or a default classpath location (edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz).
static LexicalizedParser | loadModel(java.lang.String parserFileOrUrl, java.util.List&lt;java.lang.String&gt; extraFlags)
static LexicalizedParser | loadModel(java.lang.String parserFileOrUrl, Options op, java.lang.String... extraFlags) — Constructs a new LexicalizedParser.
static LexicalizedParser | loadModel(java.lang.String parserFileOrUrl, java.lang.String... extraFlags)
static void | main(java.lang.String[] args) — A main program for using the parser with various options.
Tree | parse(java.util.List&lt;? extends HasWord&gt; lst) — Parses the list of HasWord.
java.util.List&lt;Tree&gt; | parseMultiple(java.util.List&lt;? extends java.util.List&lt;? extends HasWord&gt;&gt; sentences)
java.util.List&lt;Tree&gt; | parseMultiple(java.util.List&lt;? extends java.util.List&lt;? extends HasWord&gt;&gt; sentences, int nthreads) — Launches multiple threads which call parse on each of the sentences in order, returning the resulting parse trees in the same order.
ParserQuery | parserQuery()
Tree | parseStrings(java.util.List&lt;java.lang.String&gt; lst) — Processes a list of strings into a list of HasWord and returns the parse tree associated with that list.
Tree | parseTree(java.util.List&lt;? extends HasWord&gt; sentence) — Similar to parse(), but instead of returning an X tree on failure, returns null.
boolean | requiresTags() — Whether the model requires text to be pretagged.
void | saveParserToSerialized(java.lang.String filename) — Saves the parser to the given filename.
void | saveParserToTextFile(java.lang.String filename) — Saves the parser to the given filename in a text format.
void | setOptionFlags(java.lang.String... flags) — Sets options on the parser, in a way exactly equivalent to passing in the same sequence of command-line arguments.
static LexicalizedParser | trainFromTreebank(java.lang.String treebankPath, java.io.FileFilter filt, Options op)
static LexicalizedParser | trainFromTreebank(Treebank trainTreebank, GrammarCompactor compactor, Options op) — Constructs a new LexicalizedParser.
static LexicalizedParser | trainFromTreebank(Treebank trainTreebank, Options op)
TreebankLanguagePack | treebankLanguagePack()
Methods inherited from class ParserGrammar: apply, lemmatize, lemmatize, loadModelFromZip, loadTagger, parse, tokenize
public Lexicon lex
public BinaryGrammar bg
public UnaryGrammar ug
public DependencyGrammar dg
public Index<java.lang.String> stateIndex
public Index<java.lang.String> wordIndex
public Index<java.lang.String> tagIndex
public Reranker reranker
public static final java.lang.String DEFAULT_PARSER_LOC
public LexicalizedParser(Lexicon lex, BinaryGrammar bg, UnaryGrammar ug, DependencyGrammar dg, Index<java.lang.String> stateIndex, Index<java.lang.String> wordIndex, Index<java.lang.String> tagIndex, Options op)
public Options getOp()
Overrides: getOp in class ParserGrammar

public TreebankLangParserParams getTLPParams()
Overrides: getTLPParams in class ParserGrammar

public TreebankLanguagePack treebankLanguagePack()
Overrides: treebankLanguagePack in class ParserGrammar

public java.lang.String[] defaultCoreNLPFlags()
Description copied from class ParserGrammar: Returns a set of options which should be set by default when used in CoreNLP.
Overrides: defaultCoreNLPFlags in class ParserGrammar

public boolean requiresTags()
Description copied from class ParserGrammar: Whether the model requires text to be pretagged.
Overrides: requiresTags in class ParserGrammar
public static LexicalizedParser loadModel()
Construct a new LexicalizedParser object from a previously serialized grammar read from the System property edu.stanford.nlp.SerializedLexicalizedParser, or a default classpath location (edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz).

public static LexicalizedParser loadModel(Options op, java.lang.String... extraFlags)
Construct a new LexicalizedParser object from a previously serialized grammar read from the System property edu.stanford.nlp.SerializedLexicalizedParser, or a default classpath location (edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz).
Parameters: op - Options to the parser. These get overwritten by the Options read from the serialized parser; I think the only thing determined by them is the encoding of the grammar, and only if it is a text grammar.

public static LexicalizedParser loadModel(java.lang.String parserFileOrUrl, java.lang.String... extraFlags)

public static LexicalizedParser loadModel(java.lang.String parserFileOrUrl, java.util.List&lt;java.lang.String&gt; extraFlags)

public static LexicalizedParser loadModel(java.lang.String parserFileOrUrl, Options op, java.lang.String... extraFlags)
Construct a new LexicalizedParser.
Parameters: parserFileOrUrl - Filename/URL to load parser from. op - Options for this parser. These will normally be overwritten by options stored in the file.
Throws: java.lang.IllegalArgumentException - If parser data cannot be loaded.

public static LexicalizedParser loadModel(java.io.ObjectInputStream ois)
Reads one object from the given ObjectInputStream, which is assumed to be a LexicalizedParser.
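A hedged sketch of the loadModel overloads above (the model path is the default classpath location; extra flags shown are examples from elsewhere on this page):

```java
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.parser.lexparser.Options;

public class LoadDemo {
  public static void main(String[] args) {
    // No-argument form: reads the model named by the System property
    // edu.stanford.nlp.SerializedLexicalizedParser, falling back to the
    // default classpath location if the property is unset.
    System.setProperty("edu.stanford.nlp.SerializedLexicalizedParser",
        "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");
    LexicalizedParser lp1 = LexicalizedParser.loadModel();

    // Explicit file/URL form, with extra flags applied after loading.
    LexicalizedParser lp2 = LexicalizedParser.loadModel(
        "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz",
        "-maxLength", "80");

    // Form taking an Options object; the Options stored in the
    // serialized parser normally overwrite it.
    LexicalizedParser lp3 = LexicalizedParser.loadModel(
        "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz",
        new Options(), "-outputFormat", "penn");
  }
}
```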
public static LexicalizedParser copyLexicalizedParser(LexicalizedParser parser)

public static LexicalizedParser trainFromTreebank(Treebank trainTreebank, GrammarCompactor compactor, Options op)
Construct a new LexicalizedParser.
Parameters: trainTreebank - a treebank to train from

public static LexicalizedParser trainFromTreebank(java.lang.String treebankPath, java.io.FileFilter filt, Options op)

public static LexicalizedParser trainFromTreebank(Treebank trainTreebank, Options op)
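A sketch of training from a treebank path, under the assumption that the path below points at a local directory of treebank files (whether a null FileFilter is accepted may depend on the version; pass a real filter if not):

```java
import java.io.FileFilter;

import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.parser.lexparser.Options;

public class TrainDemo {
  public static void main(String[] args) {
    Options op = new Options();
    // Hypothetical local path to a directory of Penn Treebank files;
    // a null filter is intended here to mean "use every file".
    String treebankPath = "/path/to/treebank";
    FileFilter filter = null;
    LexicalizedParser lp =
        LexicalizedParser.trainFromTreebank(treebankPath, filter, op);

    // Persist the trained grammar for later use with loadModel.
    lp.saveParserToSerialized("myGrammar.ser.gz");
  }
}
```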
public Tree parseStrings(java.util.List&lt;java.lang.String&gt; lst)
Will process a list of strings into a list of HasWord and return the parse tree associated with that list.

public Tree parse(java.util.List&lt;? extends HasWord&gt; lst)
Parses the list of HasWord.
Overrides: parse in class ParserGrammar
Parameters: lst - The input sentence (a List of words)

public java.util.List&lt;Tree&gt; parseMultiple(java.util.List&lt;? extends java.util.List&lt;? extends HasWord&gt;&gt; sentences)

public java.util.List&lt;Tree&gt; parseMultiple(java.util.List&lt;? extends java.util.List&lt;? extends HasWord&gt;&gt; sentences, int nthreads)
Will launch multiple threads which call parse on each of the sentences in order, returning the resulting parse trees in the same order.

public TreePrint getTreePrint()
Return a TreePrint for formatting parsed output trees.

public Tree parseTree(java.util.List&lt;? extends HasWord&gt; sentence)
Similar to parse(), but instead of returning an X tree on failure, returns null.
Overrides: parseTree in class ParserGrammar
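The multithreaded parseMultiple overload can be sketched as follows (assuming the default model is available on the classpath; the thread count of 4 is an arbitrary example):

```java
import java.util.ArrayList;
import java.util.List;

import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.ling.Sentence;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.trees.Tree;

public class MultiParseDemo {
  public static void main(String[] args) {
    LexicalizedParser lp = LexicalizedParser.loadModel();

    List<List<? extends HasWord>> sentences = new ArrayList<>();
    sentences.add(Sentence.toWordList("This is the first sentence .".split(" ")));
    sentences.add(Sentence.toWordList("Here is another one .".split(" ")));

    // Parse with 4 worker threads; trees come back in input order.
    List<Tree> trees = lp.parseMultiple(sentences, 4);
    for (Tree t : trees) {
      t.pennPrint();
    }
  }
}
```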
public java.util.List&lt;Eval&gt; getExtraEvals()
Description copied from class ParserGrammar: Returns a list of extra Eval objects to use when scoring the parser.
Overrides: getExtraEvals in class ParserGrammar

public java.util.List&lt;ParserQueryEval&gt; getParserQueryEvals()
Description copied from class ParserGrammar: Returns a list of Eval-style objects which care about the whole ParserQuery, not just the finished tree.
Overrides: getParserQueryEvals in class ParserGrammar

public ParserQuery parserQuery()
Specified by: parserQuery in interface ParserQueryFactory
Overrides: parserQuery in class ParserGrammar
public LexicalizedParserQuery lexicalizedParserQuery()
public static LexicalizedParser getParserFromFile(java.lang.String parserFileOrUrl, Options op)
public Lexicon getLexicon()
public void saveParserToSerialized(java.lang.String filename)
public void saveParserToTextFile(java.lang.String filename)
protected static LexicalizedParser getParserFromTextFile(java.lang.String textFileOrUrl, Options op)
public static LexicalizedParser getParserFromSerializedFile(java.lang.String serializedFileOrUrl)
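A save/reload round trip using the methods above can be sketched as follows (output filenames are hypothetical; note that the text format may not be supported for every grammar type):

```java
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;

public class SaveReloadDemo {
  public static void main(String[] args) {
    LexicalizedParser lp = LexicalizedParser.loadModel();

    // A .gz extension makes the file GZip-compressed on write and read.
    lp.saveParserToSerialized("englishCopy.ser.gz");
    // Text format is human-readable but slower to load.
    lp.saveParserToTextFile("englishCopy.txt");

    LexicalizedParser reloaded =
        LexicalizedParser.getParserFromSerializedFile("englishCopy.ser.gz");
  }
}
```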
public static TreeAnnotatorAndBinarizer buildTrainBinarizer(Options op)
public static CompositeTreeTransformer buildTrainTransformer(Options op)
public static CompositeTreeTransformer buildTrainTransformer(Options op, TreeAnnotatorAndBinarizer binarizer)
public static Triple<Treebank,Treebank,Treebank> getAnnotatedBinaryTreebankFromTreebank(Treebank trainTreebank, Treebank secondaryTreebank, Treebank tuneTreebank, Options op)
public static LexicalizedParser getParserFromTreebank(Treebank trainTreebank, Treebank secondaryTrainTreebank, double weight, GrammarCompactor compactor, Options op, Treebank tuneTreebank, java.util.List<java.util.List<TaggedWord>> extraTaggedWords)
Parameters:
trainTreebank - A treebank to train from
secondaryTrainTreebank - Another treebank to train from
weight - A weight factor to give the secondary treebank. If the weight is 0.25, each example in the secondaryTrainTreebank will be treated as 1/4 of an example sentence.
compactor - A class for compacting grammars. May be null.
op - Options for how the grammar is built from the treebank
tuneTreebank - A treebank to tune free parameters on (may be null)
extraTaggedWords - A list of words to add to the Lexicon

public void setOptionFlags(java.lang.String... flags)
This will set options to the parser, in a way exactly equivalent to passing in the same sequence of command-line arguments. One option that cannot be changed this way is which TreebankLangParserParams (-tLPP) to use. The TreebankLangParserParams should be set up on construction of a LexicalizedParser, by constructing an Options that uses the required TreebankLangParserParams, and passing that to a LexicalizedParser constructor. Note that despite this method being an instance method, many flags are actually set as static class variables.
Overrides: setOptionFlags in class ParserGrammar
Parameters: flags - Arguments to the parser, for example, {"-outputFormat", "typedDependencies", "-maxLength", "70"}
Throws: java.lang.IllegalArgumentException - If an unknown flag is passed in

public static void main(java.lang.String[] args)
A main program for using the parser with various options.
Sample usages:

java -mx1500m edu.stanford.nlp.parser.lexparser.LexicalizedParser [-v] -train trainFilesPath [fileRange] -saveToSerializedFile serializedGrammarFilename

java -mx1500m edu.stanford.nlp.parser.lexparser.LexicalizedParser [-v] -train trainFilesPath [fileRange] -testTreebank testFilePath [fileRange]

java -mx512m edu.stanford.nlp.parser.lexparser.LexicalizedParser [-v] serializedGrammarPath filename [filename]*

java -mx512m edu.stanford.nlp.parser.lexparser.LexicalizedParser [-v] -loadFromSerializedFile serializedGrammarPath -testTreebank testFilePath [fileRange]

If the serializedGrammarPath ends in .gz, then the grammar is written and read as a compressed (GZip) file. If the serializedGrammarPath is a URL starting with http://, then the parser is read from the URL.

A fileRange specifies a numeric value that must be included within a filename for it to be used in training or testing (this works well with most current treebanks). It can be specified like a range of pages to be printed, for instance as 200-2199 or 1-300,500-725,9000, or just as 1 (if all your trees are in a single file, either omit this parameter or just give a dummy argument such as 0).

If the filename to parse is "-" then the parser parses from stdin. If no files are supplied to parse, then a hardwired sentence is parsed.
The parser can write a grammar as either a serialized Java object file or in a text format (or as both), specified with the following options:

java edu.stanford.nlp.parser.lexparser.LexicalizedParser [-v] -train trainFilesPath [fileRange] [-saveToSerializedFile grammarPath] [-saveToTextFile grammarPath]
In the same position as the verbose flag (-v), many other options can be specified. The most useful to an end user are:

-tLPP class
Specify a different TreebankLangParserParams, for when using a different language or treebank (the default is English Penn Treebank). This option MUST occur before any other language-specific options that are used (or else they are ignored!). (It's usually a good idea to specify this option even when loading a serialized grammar; it is necessary if the language pack specifies a needed character encoding or you wish to specify language-specific options on the command line.)

-encoding charset
Specify the character encoding of the input and output files. This will override the value in the TreebankLangParserParams, provided this option appears after any -tLPP option.

-tokenized
Says that the input is already separated into whitespace-delimited tokens. If this option is specified, any tokenizer specified for the language is ignored, and a universal (Unicode) tokenizer, which divides only on whitespace, is used. Unless you also specify -escaper, the tokens must all be correctly tokenized tokens of the appropriate treebank for the parser to work well (for instance, if using the Penn English Treebank, you must have coded "(" as "-LRB-", etc.). (Note: we do not use the backslash escaping in front of / and * that appeared in Penn Treebank releases through 1999.)

-escaper class
Specify a class of type Function&lt;List&lt;HasWord&gt;,List&lt;HasWord&gt;&gt; to do customized escaping of tokenized text. This class will be run over the tokenized text and can fix the representation of tokens. For instance, it could change "(" to "-LRB-" for the Penn English Treebank. A provided escaper that does such things for the Penn English Treebank is edu.stanford.nlp.process.PTBEscapingProcessor.

-tokenizerFactory class
Specifies a TokenizerFactory class to be used for tokenization.

-tokenizerOptions options
Specifies options to a TokenizerFactory class to be used for tokenization, as a comma-separated list. For PTBTokenizer, options of interest include americanize=false and quotes=ascii (for German). Note that any choice of tokenizer options that conflicts with the tokenization used in the parser training data will likely degrade parser performance.

-sentences token
Specifies a token that marks sentence boundaries. A value of newline causes sentence breaking on newlines. A value of onePerElement causes each element (using the XML -parseInside option) to be treated as a sentence. All other tokens will be interpreted literally, and must be exactly the same as tokens returned by the tokenizer. For example, you might specify "|||" and put that symbol sequence as a token between sentences. If no explicit sentence breaking option is chosen, sentence breaking is done based on a set of language-particular sentence-ending patterns.

-parseInside element
Specifies that parsing should only be done for tokens inside the indicated XML-style elements (done as simple pattern matching, rather than XML parsing). For example, if this is specified as sentence, then the text inside the sentence element would be parsed. Using "-parseInside s" gives you support for the input format of Charniak's parser. Sentences cannot span elements. Whether the contents of the element are treated as one sentence or potentially multiple sentences is controlled by the -sentences flag. The default is potentially multiple sentences. This option gives support for extracting and parsing text from very simple SGML and XML documents, and is provided as a user convenience for that purpose. If you want to really parse XML documents before NLP parsing them, you should use an XML parser, and then call a LexicalizedParser on appropriate CDATA.

-tagSeparator char
Specifies to look for tags on words following the word and separated from it by the special character char. For instance, many tagged corpora have the representation "house/NN" and you would use -tagSeparator /. Notes: This option requires that the input be pretokenized. The separator has to be only a single character, and there is no escaping mechanism. However, splitting is done on the last instance of the character in the token, so that cases like "3\/4/CD" are handled correctly. The parser will in all normal circumstances use the tag you provide, but will override it in the case of very common words in cases where the tag that you provide is not one that it regards as a possible tagging for the word. The parser supports a format where only some of the words in a sentence have a tag (if you are calling the parser programmatically, you indicate them by having them implement the HasTag interface). You can do this at the command line by only having tags after some words, but you are limited by the fact that there is no way to escape the tagSeparator character.

-maxLength leng
Specify the longest sentence that will be parsed (and hence indirectly the amount of memory needed for the parser). If this is not specified, the parser will try to dynamically grow its parse chart when long sentences are encountered, but may run out of memory trying to do so.

-outputFormat styles
Choose the style(s) of output sentences: penn for prettyprinting as in the Penn treebank files, or oneline for printing sentences one per line, words, wordsAndTags, dependencies, typedDependencies, or typedDependenciesCollapsed. Multiple options may be specified as a comma-separated list. See the TreePrint class for further documentation.

-outputFormatOptions
Provide options that control the behavior of various -outputFormat choices, such as lexicalize, stem, markHeadNodes, or xml. Options are specified as a comma-separated list; see the TreePrint class for documentation.

-writeOutputFiles
Write output files corresponding to the input files, with the same name but a ".stp" file extension. The format of these files depends on the outputFormat option. (If not specified, output is sent to stdout.)

-outputFilesExtension
The extension that is appended to the filename that is being parsed to produce an output file name (with the -writeOutputFiles option). The default is stp. Don't include the period.

-outputFilesDirectory
The directory in which output files are written (when the -writeOutputFiles option is specified). If not specified, output files are written in the same directory as the input files.

-nthreads
Parsing files and testing on treebanks can use multiple threads. This option tells the parser how many threads to use. A negative number indicates to use as many threads as the machine has cores.
Parameters:
args - Command line arguments, as above
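The same options can be applied programmatically through setOptionFlags, which is documented above as exactly equivalent to passing the corresponding command-line arguments (the flag values below are the example from that method's documentation):

```java
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;

public class FlagsDemo {
  public static void main(String[] args) {
    LexicalizedParser lp = LexicalizedParser.loadModel();

    // Exactly equivalent to passing these arguments on the command line;
    // an unknown flag raises IllegalArgumentException.
    lp.setOptionFlags("-outputFormat", "typedDependencies",
        "-maxLength", "70");
  }
}
```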