java.lang.Object
  edu.stanford.nlp.parser.lexparser.TrainOptions
public class TrainOptions
Non-language-specific options for training a grammar from a treebank. These options are not used at parsing time. Because they are all static, multiple parsers cannot be trained in multiple threads with different options until this is changed.
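
The limitation described above can be illustrated with a small, self-contained sketch. `StaticOptions` and `InstanceOptions` are hypothetical stand-ins written for this example, not classes from the parser; only the field name `PA` is borrowed from this page. The sketch shows why static option fields cannot hold two configurations at once, while per-instance fields can.

```java
public class SharedOptionsSketch {

    // Hypothetical stand-in: the option is static, so every trainer shares one copy.
    static class StaticOptions {
        static boolean PA = false;
    }

    // Hypothetical stand-in: the option belongs to each options object.
    static class InstanceOptions {
        boolean PA = false;
    }

    /** Configures "trainer A", then "trainer B", through static fields; returns what A sees. */
    public static boolean staticValueSeenByA() {
        StaticOptions.PA = true;   // intended for trainer A
        StaticOptions.PA = false;  // intended for trainer B, but overwrites A's setting
        return StaticOptions.PA;
    }

    /** Same configuration through two separate instances; returns what A sees. */
    public static boolean instanceValueSeenByA() {
        InstanceOptions a = new InstanceOptions();
        InstanceOptions b = new InstanceOptions();
        a.PA = true;
        b.PA = false;
        return a.PA;
    }

    public static void main(String[] args) {
        System.out.println("static fields, A sees PA = " + staticValueSeenByA());
        System.out.println("instance fields, A sees PA = " + instanceValueSeenByA());
    }
}
```

With static fields the second assignment silently replaces the first, so both trainers would run with the last configuration written. With threads the same overwrite can also happen in the middle of a training run.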
Field Summary | | |
---|---|---|
boolean | basicCategoryTagsInDependencyGrammar | Whether to use the basic or split tags in the dependency grammar. |
boolean | cheatPCFG | Add all test set trees to training data for PCFG. |
boolean | collinsPunc | Promote/delete punctuation like Collins. |
int | compactGrammar | How to compact grammars as FSMs. |
Set<String> | deleteSplitters | |
double | fractionBeforeUnseenCounting | Start to aggregate signature-tag pairs only for words unseen in the first fraction of the data given by this value. |
boolean | gPA | This variable controls doing 2 levels of parent annotation. |
int | HSEL_CUT | |
boolean | hSelSplit | |
boolean | leftRec | Left edge is right-recursive (X << X) Bad. |
boolean | leftToRight | |
boolean | markFinalStates | Whether or not to mark final states in binarized grammar. |
boolean | markovFactor | Whether to do "horizontal Markovization" (as in ACL 2003 paper). |
int | markovOrder | |
int | markUnary | Mark all unary nodes specially. |
boolean | markUnaryTags | Mark POS tags which are the sole member of their phrasal constituent. |
boolean | noTagSplit | |
int | openClassTypesThreshold | A POS tag has to have been attributed to more than this number of word types before it is regarded as an open-class tag. |
boolean | PA | This variable controls doing parent annotation of phrasal nodes. |
boolean | postGPA | |
boolean | postPA | |
Set | postSplitters | |
boolean | postSplitWithBaseCategory | Whether, in post-splitting of categories, nodes are annotated with the (grand)parent's base category or with its complete subcategorized category. |
TreeTransformer | preTransformer | A transformer to use on the training data before any other processing step. |
PrintWriter | printAnnotatedPW | |
boolean | printAnnotatedRuleCounts | |
boolean | printAnnotatedStateCounts | |
PrintWriter | printBinarizedPW | |
boolean | printStates | |
int | printTreeTransformations | Just for debugging: check that your tree transforms work correctly. |
boolean | rightRec | Right edge is right-recursive (X << X) Bad. |
double | ruleDiscount | Discounts the count of BinaryRules (only, apparently) in training data. |
boolean | ruleSmoothing | Enables linear rule smoothing during grammar extraction but before grammar compaction. |
double | ruleSmoothingAlpha | |
boolean | selectivePostSplit | |
double | selectivePostSplitCutOff | |
boolean | selectiveSplit | Only split the "common high KL divergence" parent categories.... |
double | selectiveSplitCutOff | |
boolean | sisterAnnotate | Selective Sister annotation. |
Set<String> | sisterSplitters | |
boolean | smoothing | TODO wsg2011: This is the old grammar smoothing parameter that no longer does anything in the parser. |
boolean | splitPrePreT | Mark all pre-preterminals (also does splitBaseNP: don't need both) |
Set<String> | splitters | Set the splitter strings. |
String | taggedFiles | A set of files to use as extra information in the lexicon. |
boolean | tagPA | Parent annotation on tags. |
boolean | tagSelectivePostSplit | |
double | tagSelectivePostSplitCutOff | |
boolean | tagSelectiveSplit | Do parent annotation on tags selectively. |
double | tagSelectiveSplitCutOff | |
int | trainLengthLimit | |
String | trainTreeFile | |
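
The `PA` and `gPA` entries above refer to parent annotation of phrasal nodes. As a rough illustration of the general technique (not the parser's own implementation), the sketch below annotates a toy tree. The `Node` class, the `annotate` method, and the `^` and `~` separators are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;

public class ParentAnnotationSketch {

    /** Minimal tree node; a hypothetical stand-in, not the parser's Tree class. */
    static class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }

        boolean isLeaf() { return children.isEmpty(); }

        boolean isPreterminal() { return children.size() == 1 && children.get(0).isLeaf(); }

        @Override public String toString() {
            if (isLeaf()) return label;
            StringBuilder sb = new StringBuilder("(" + label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    /**
     * Returns a copy of the tree in which each phrasal node (neither a word nor a
     * POS tag) is suffixed with its parent's original category and, if
     * useGrandparent is true, with its grandparent's original category as well.
     */
    public static Node annotate(Node n, String parent, String grandparent, boolean useGrandparent) {
        if (n.isLeaf()) return new Node(n.label);
        String label = n.label;
        if (!n.isPreterminal() && parent != null) {
            label += "^" + parent;
            if (useGrandparent && grandparent != null) label += "~" + grandparent;
        }
        Node copy = new Node(label);
        for (Node c : n.children) {
            // Pass down the un-annotated labels so annotations do not accumulate.
            copy.children.add(annotate(c, n.label, parent, useGrandparent));
        }
        return copy;
    }

    public static void main(String[] args) {
        Node tree = new Node("S",
            new Node("VP",
                new Node("VBD", new Node("saw")),
                new Node("NP", new Node("NN", new Node("dogs")))));
        System.out.println(annotate(tree, null, null, false));
        System.out.println(annotate(tree, null, null, true));
    }
}
```

In this sketch POS tags are left untouched, which mirrors the separation on this page between `PA` (phrasal nodes) and `tagPA` (tags).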
Constructor Summary | |
---|---|
TrainOptions() | |
Method Summary | | |
---|---|---|
int | compactGrammar() | |
void | display() | |
boolean | outsideFactor() | If true, declare early -- leave this on except maybe with markov on. |
static void | printTrainTree(PrintWriter pw, String message, Tree t) | |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public String trainTreeFile
public int trainLengthLimit
public boolean cheatPCFG
public boolean markovFactor
public int markovOrder
public boolean hSelSplit
public int HSEL_CUT
public boolean markFinalStates
public int openClassTypesThreshold
public double fractionBeforeUnseenCounting
public boolean PA
public boolean gPA
public boolean postPA
public boolean postGPA
public boolean selectiveSplit
public double selectiveSplitCutOff
public boolean selectivePostSplit
public double selectivePostSplitCutOff
public boolean postSplitWithBaseCategory
public boolean sisterAnnotate
public Set<String> sisterSplitters
public int markUnary
public boolean markUnaryTags
public boolean splitPrePreT
public boolean tagPA
public boolean tagSelectiveSplit
public double tagSelectiveSplitCutOff
public boolean tagSelectivePostSplit
public double tagSelectivePostSplitCutOff
public boolean rightRec
public boolean leftRec
public boolean collinsPunc
public Set<String> splitters
public Set postSplitters
public Set<String> deleteSplitters
public int printTreeTransformations
public PrintWriter printAnnotatedPW
public PrintWriter printBinarizedPW
public boolean printStates
public int compactGrammar
public boolean leftToRight
public boolean noTagSplit
public boolean ruleSmoothing
public double ruleSmoothingAlpha
public boolean smoothing
public double ruleDiscount
public boolean printAnnotatedRuleCounts
public boolean printAnnotatedStateCounts
public boolean basicCategoryTagsInDependencyGrammar
public TreeTransformer preTransformer
public String taggedFiles
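
The `markovFactor` and `markovOrder` fields above concern horizontal Markovization during binarization. The sketch below shows the general idea under stated assumptions: a long rule is split left to right into binary rules, and each intermediate symbol remembers at most `order` of the most recently generated sisters. The method name `binarize`, the `@X|...` symbol naming, and the exact meaning given to `order` are hypothetical choices for this example and may differ from what the parser does.

```java
import java.util.ArrayList;
import java.util.List;

public class HorizontalMarkovSketch {

    /**
     * Binarizes parent -> children left to right. Intermediate symbols are named
     * "@parent|history", where history holds at most `order` of the children
     * generated so far (the most recent ones).
     */
    public static List<String> binarize(String parent, List<String> children, int order) {
        List<String> rules = new ArrayList<>();
        int n = children.size();
        if (n <= 2) {
            // Already unary or binary: nothing to split.
            rules.add(parent + " -> " + String.join(" ", children));
            return rules;
        }
        String lhs = parent;
        for (int i = 0; i < n - 2; i++) {
            int from = Math.max(0, i + 1 - order);
            String history = String.join("_", children.subList(from, i + 1));
            String intermediate = "@" + parent + "|" + history;
            rules.add(lhs + " -> " + children.get(i) + " " + intermediate);
            lhs = intermediate;
        }
        // The last intermediate symbol rewrites to the final two children.
        rules.add(lhs + " -> " + children.get(n - 2) + " " + children.get(n - 1));
        return rules;
    }

    public static void main(String[] args) {
        List<String> kids = List.of("DT", "JJ", "JJ", "NN");
        for (int order = 1; order <= 2; order++) {
            System.out.println("order " + order + ": " + binarize("NP", kids, order));
        }
    }
}
```

A smaller order makes different long rules share intermediate symbols, which yields a more compact grammar that generalizes to unseen sister sequences; a larger order keeps more context at the cost of more symbols.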
Constructor Detail |
---|
public TrainOptions()
Method Detail |
---|
public boolean outsideFactor()
public int compactGrammar()
public void display()
public static void printTrainTree(PrintWriter pw, String message, Tree t)