edu.stanford.nlp.parser.lexparser
Class Options

java.lang.Object
  extended by edu.stanford.nlp.parser.lexparser.Options
All Implemented Interfaces:
Serializable

public class Options
extends Object
implements Serializable

See Also:
Serialized Form

Field Summary
 boolean coarseDistance
          Use coarser distance (4 bins) in dependency calculations
 boolean dcTags
          "Double count" tag-to-word rewrites in both the PCFG and dependency parsers.
 boolean directional
          Whether dependency grammar considers left/right direction.
 boolean distance
          Use distance bins in the dependency calculations
 boolean doDep
          Do a dependency parse of the sentence.
 boolean doPCFG
          Do a PCFG parse of the sentence.
 boolean flexiTag
           
 boolean forceCNF
          Forces parsing with a strictly CNF grammar: unary chains are converted to XP&YP symbols and back.
 boolean freeDependencies
          If true, any child can be the head (seems rather bad!).
 boolean genStop
           
 int numStates
           
 int numTags
           
 int numWords
           
 boolean smartMutation
          Smarter smoothing for rare words.
 int SMOOTH_IN_UNKNOWNS_THRESHOLD
          Words more common than this are tagged with MLE P(t|w).
 TreebankLangParserParams tlpParams
          The treebank-specific parser parameters to use.
 int useUnknownWordSignatures
          Use suffix and capitalization information for unknowns.
 
Method Summary
 void display()
           
static Options get()
           
 TreebankLanguagePack langpack()
          Returns the treebank language pack for the treebank on which the parser was trained.
 void readData(BufferedReader in)
          Populates data in this Options from the character stream given by the BufferedReader in.
static void set(Options pt)
           
 void writeData(Writer w)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

numWords

public int numWords

numTags

public int numTags

numStates

public int numStates

useUnknownWordSignatures

public int useUnknownWordSignatures
Use suffix and capitalization information for unknown words. 0 means a single unknown token. 1 uses suffix and capitalization. 2 uses a richer variant of the signature; this is the recommended setting. 3 uses a richer form of signature that mimics the NER word type patterns. 4 is a variant of 2. 5 is another variant with more English-specific morphology (good for English unknowns). Using the richer signatures in versions 3 or 4 seems to have very marginal or no positive value.
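
As a rough illustration of the difference between levels 0 and 1, the sketch below maps an unknown word to a class label. The class UnknownSignatureSketch, the method signature, the label format, and the two-character suffix length are all hypothetical; the parser's actual signature functions are more elaborate and are not reproduced here.

```java
final class UnknownSignatureSketch {

    /** Hypothetical helper: builds a signature for levels 0 and 1 only. */
    static String signature(String word, int level) {
        if (level == 0 || word.isEmpty()) {
            return "UNK"; // level 0: every unknown word shares one token
        }
        StringBuilder sb = new StringBuilder("UNK");
        // Capitalization cue: mark words whose first character is upper case.
        if (Character.isUpperCase(word.charAt(0))) {
            sb.append("-CAP");
        }
        // Suffix cue: last two characters, lowercased (assumed suffix length).
        String lower = word.toLowerCase(java.util.Locale.ROOT);
        int start = Math.max(0, lower.length() - 2);
        sb.append('-').append(lower.substring(start));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(signature("Blorfing", 0));
        System.out.println(signature("Blorfing", 1));
    }
}
```

With a scheme like this, unseen words that share capitalization and a suffix pool their tag statistics, which is the motivation for any level above 0.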


SMOOTH_IN_UNKNOWNS_THRESHOLD

public int SMOOTH_IN_UNKNOWNS_THRESHOLD
Words more common than this are tagged with MLE P(t|w). Default 100. The smoothing is sufficiently slight that changing this has little effect.
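
The threshold rule can be sketched as follows, assuming the maximum-likelihood estimate P(t|w) = count(w, t) / count(w). The class MleTagSketch and its methods are hypothetical and only illustrate the rule; they are not the parser's lexicon implementation, and the smoothing applied to rarer words is left out.

```java
import java.util.HashMap;
import java.util.Map;

final class MleTagSketch {
    // Default value stated in the field description.
    static final int SMOOTH_IN_UNKNOWNS_THRESHOLD = 100;

    private final Map<String, Integer> wordCounts = new HashMap<>();
    private final Map<String, Integer> wordTagCounts = new HashMap<>();

    /** Records that the word was seen with the tag the given number of times. */
    void observe(String word, String tag, int times) {
        wordCounts.merge(word, times, Integer::sum);
        wordTagCounts.merge(word + "\t" + tag, times, Integer::sum);
    }

    /** True when the word is more common than the threshold, so plain MLE applies. */
    boolean usesMle(String word) {
        return wordCounts.getOrDefault(word, 0) > SMOOTH_IN_UNKNOWNS_THRESHOLD;
    }

    /** Unsmoothed estimate of P(tag | word); returns 0 for unseen words. */
    double mle(String word, String tag) {
        int wordCount = wordCounts.getOrDefault(word, 0);
        if (wordCount == 0) {
            return 0.0;
        }
        return wordTagCounts.getOrDefault(word + "\t" + tag, 0) / (double) wordCount;
    }

    public static void main(String[] args) {
        MleTagSketch lexicon = new MleTagSketch();
        lexicon.observe("the", "DT", 150);
        lexicon.observe("the", "NN", 50);
        System.out.println(lexicon.usesMle("the") + " " + lexicon.mle("the", "DT"));
    }
}
```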


tlpParams

public TreebankLangParserParams tlpParams
The treebank-specific parser parameters to use.


smartMutation

public boolean smartMutation
Smarter smoothing for rare words.


forceCNF

public boolean forceCNF
Forces parsing with a strictly CNF grammar: unary chains are converted to XP&YP symbols and back.
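
One way to picture the conversion is a reversible encoding that joins the labels of a unary chain with '&' and splits them again after parsing. The class UnaryChainSketch and its methods are hypothetical, and the sketch assumes that category labels never contain '&' themselves; how the parser actually performs the conversion is not shown here.

```java
import java.util.Arrays;
import java.util.List;

final class UnaryChainSketch {

    /** Collapses a unary chain such as NP -> QP into the single symbol "NP&QP". */
    static String collapse(List<String> chain) {
        return String.join("&", chain);
    }

    /** Restores the original chain of labels from a collapsed symbol. */
    static List<String> expand(String symbol) {
        return Arrays.asList(symbol.split("&"));
    }

    public static void main(String[] args) {
        String merged = collapse(Arrays.asList("NP", "QP"));
        System.out.println(merged);
        System.out.println(expand(merged));
    }
}
```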


doPCFG

public boolean doPCFG
Do a PCFG parse of the sentence. If both doPCFG and doDep are on, also do a combined parse of the sentence.


doDep

public boolean doDep
Do a dependency parse of the sentence.


freeDependencies

public boolean freeDependencies
If true, any child can be the head (seems rather bad!).


directional

public boolean directional
Whether the dependency grammar considers left/right direction. Enabling this works well.


genStop

public boolean genStop

distance

public boolean distance
Use distance bins in the dependency calculations


coarseDistance

public boolean coarseDistance
Use coarser distance (4 bins) in dependency calculations
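
Distance binning can be sketched as a lookup against a small table of bin boundaries. The class DistanceBinSketch is hypothetical, and the boundaries used here (0, 1, 2 to 5, 6 and above) are an assumption chosen only to produce four bins; the values the parser actually uses are not stated on this page.

```java
final class DistanceBinSketch {
    // Assumed inclusive upper bounds of the first three bins; the fourth bin is open-ended.
    private static final int[] COARSE_UPPER_BOUNDS = {0, 1, 5};

    /** Maps a non-negative head-to-dependent distance to one of four bin indices (0 to 3). */
    static int coarseBin(int distance) {
        if (distance < 0) {
            throw new IllegalArgumentException("distance must be non-negative");
        }
        for (int i = 0; i < COARSE_UPPER_BOUNDS.length; i++) {
            if (distance <= COARSE_UPPER_BOUNDS[i]) {
                return i;
            }
        }
        return COARSE_UPPER_BOUNDS.length; // everything beyond the last bound
    }

    public static void main(String[] args) {
        for (int d : new int[] {0, 1, 3, 6, 40}) {
            System.out.println(d + " -> bin " + coarseBin(d));
        }
    }
}
```

Fewer bins mean fewer parameters to estimate, at the cost of treating quite different distances alike.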


flexiTag

public boolean flexiTag

dcTags

public boolean dcTags
"Double count" tag-to-word rewrites in both the PCFG and dependency parsers. Good for combined parsing only (it previously did not take effect for PCFG parsing). This option is used only at test time, but it is stored in Options so that the correct choice for a grammar is recorded in a serialized parser.

Method Detail

get

public static Options get()

set

public static void set(Options pt)

langpack

public TreebankLanguagePack langpack()
Returns the treebank language pack for the treebank on which the parser was trained.


display

public void display()

writeData

public void writeData(Writer w)
               throws IOException
Throws:
IOException

readData

public void readData(BufferedReader in)
              throws IOException
Populates data in this Options from the character stream given by the BufferedReader in.

Throws:
IOException
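
The writeData and readData pair suggests a text round trip. The sketch below shows one possible shape for it, using a stand-in class with two fields and a "name value" line format. OptionsRoundTripSketch, the line format, and the roundTrip helper are all hypothetical; the real on-disk format of Options is not documented on this page.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

final class OptionsRoundTripSketch {
    boolean doPCFG = true;
    int useUnknownWordSignatures = 0;

    /** Writes each field as one "name value" line (assumed format). */
    void writeData(Writer w) throws IOException {
        w.write("doPCFG " + doPCFG + "\n");
        w.write("useUnknownWordSignatures " + useUnknownWordSignatures + "\n");
        w.flush();
    }

    /** Reads "name value" lines until end of stream; unknown names are ignored. */
    void readData(BufferedReader in) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length < 2) {
                continue; // skip blank or malformed lines
            }
            switch (parts[0]) {
                case "doPCFG":
                    doPCFG = Boolean.parseBoolean(parts[1]);
                    break;
                case "useUnknownWordSignatures":
                    useUnknownWordSignatures = Integer.parseInt(parts[1]);
                    break;
                default:
                    break;
            }
        }
    }

    /** Serializes src to a string and reads it back into a fresh instance. */
    static OptionsRoundTripSketch roundTrip(OptionsRoundTripSketch src) {
        try {
            StringWriter buffer = new StringWriter();
            src.writeData(buffer);
            OptionsRoundTripSketch copy = new OptionsRoundTripSketch();
            copy.readData(new BufferedReader(new StringReader(buffer.toString())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for in-memory streams
        }
    }

    public static void main(String[] args) {
        OptionsRoundTripSketch src = new OptionsRoundTripSketch();
        src.useUnknownWordSignatures = 2;
        System.out.println(roundTrip(src).useUnknownWordSignatures);
    }
}
```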


Stanford NLP Group