public abstract class AbstractTreebankParserParams extends java.lang.Object implements TreebankLangParserParams
An abstract base class from which to complete a TreebankLangParserParams implementing class.
Modifier and Type | Class and Description |
---|---|
protected static class | AbstractTreebankParserParams.AnnotatePunctuationFunction: Annotation function for mapping punctuation to PTB-style equivalence classes. |
protected class | AbstractTreebankParserParams.RemoveGFSubcategoryStripper: The job of this class is to remove subcategorizations from tag and category nodes, so as to put a tree in a suitable state for evaluation. |
protected class | AbstractTreebankParserParams.SubcategoryStripper: The job of this class is to remove subcategorizations from tag and category nodes, so as to put a tree in a suitable state for evaluation. |
Modifier and Type | Field and Description |
---|---|
protected boolean | evalGF: If true, then evaluation is over grammatical functions as well as the labels. If false, then grammatical functions are stripped for evaluation. |
protected boolean | generateOriginalDependencies |
protected java.lang.String | inputEncoding |
protected java.lang.String | outputEncoding |
protected TreebankLanguagePack | tlp |
Modifier | Constructor and Description |
---|---|
protected | AbstractTreebankParserParams(TreebankLanguagePack tlp): Stores the passed-in TreebankLanguagePack and sets up charset encodings. |
Modifier and Type | Method and Description |
---|---|
abstract AbstractCollinizer | collinizer(): The tree transformer used to produce trees for evaluation. |
abstract AbstractCollinizer | collinizerEvalb(): The tree transformer used to produce trees for evaluation. |
java.lang.String[] | defaultCoreNLPFlags(): The flags to use by default when run inside StanfordCoreNLP. |
abstract java.util.List<? extends HasWord> | defaultTestSentence(): Returns a default sentence of the language (for testing). |
Extractor<DependencyGrammar> | dependencyGrammarExtractor(Options op, Index<java.lang.String> wordIndex, Index<java.lang.String> tagIndex) |
DiskTreebank | diskTreebank(): Allows you to read in trees from the source you want. |
abstract void | display(): Displays (writes to stderr) language-specific settings. |
boolean | generateOriginalDependencies(): Whether to generate original Stanford Dependencies or the newer Universal Dependencies. |
GrammaticalStructure | getGrammaticalStructure(Tree t, java.util.function.Predicate<java.lang.String> filter, HeadFinder hf): Builds a GrammaticalStructure from a Tree. |
java.lang.String | getInputEncoding(): Returns the input encoding being used. |
java.lang.String | getOutputEncoding(): Returns the output encoding being used. |
abstract HeadFinder | headFinder(): The HeadFinder to use for your treebank. |
boolean | isEvalGF() |
Lexicon | lex(Options op, Index<java.lang.String> wordIndex, Index<java.lang.String> tagIndex): Vends a Lexicon object suitable to the particular language/treebank combination of interest. |
MemoryTreebank | memoryTreebank(): Allows you to read in trees from the source you want. |
double[] | MLEDependencyGrammarSmoothingParams(): Gives the parameters for smoothing in the MLEDependencyGrammar. |
AbstractEval | ppAttachmentEval(): Returns a language-specific object for evaluating PP attachment. |
Label | processHeadWord(Label headWord): Allows language-specific processing (e.g., stemming) of head words. |
java.io.PrintWriter | pw(): The PrintWriter used to print output. |
java.io.PrintWriter | pw(java.io.OutputStream o): The PrintWriter used to print output. |
java.util.List<GrammaticalStructure> | readGrammaticalStructureFromFile(java.lang.String filename): Reads the given file and turns its content into a list of GrammaticalStructures. |
void | setEvalGF(boolean evalGF) |
void | setEvaluateGrammaticalFunctions(boolean evalGFs): Sets whether to consider grammatical functions in evaluation. |
void | setGenerateOriginalDependencies(boolean originalDependencies): For languages that have implementations of both the original Stanford dependencies and Universal dependencies, this parameter decides which implementation is used. |
void | setInputEncoding(java.lang.String encoding): Sets the input encoding. |
int | setOptionFlag(java.lang.String[] args, int i): Sets language-specific options according to flags. |
void | setOutputEncoding(java.lang.String encoding): Sets the output encoding. |
abstract java.lang.String[] | sisterSplitters(): Returns the splitting strings used for selective splits. |
TreeTransformer | subcategoryStripper(): Returns a TreeTransformer appropriate to the Treebank which can be used to remove functional tags (such as "-TMP") from categories. |
boolean | supportsBasicDependencies(): By default, parsers are assumed to not support dependencies. |
MemoryTreebank | testMemoryTreebank(): You can often return the same thing for testMemoryTreebank as for memoryTreebank. |
abstract Tree | transformTree(Tree t, Tree root): This method does language-specific tree transformations such as annotating particular nodes with language-relevant features. |
Treebank | treebank(): Implemented as required by TreebankFactory. |
TreebankLanguagePack | treebankLanguagePack(): Returns an appropriate TreebankLanguagePack. |
abstract TreeReaderFactory | treeReaderFactory(): Returns a factory for reading in trees from the source you want. |
TokenizerFactory<Tree> | treeTokenizerFactory() |
abstract HeadFinder | typedDependencyHeadFinder(): The HeadFinder to use when extracting typed dependencies. |
protected boolean evalGF
protected java.lang.String inputEncoding
protected java.lang.String outputEncoding
protected TreebankLanguagePack tlp
protected boolean generateOriginalDependencies
protected AbstractTreebankParserParams(TreebankLanguagePack tlp)
Parameters:
tlp - The treebank language pack to use
public Label processHeadWord(Label headWord)
Allows language-specific processing (e.g., stemming) of head words.
Specified by: processHeadWord in interface TreebankLangParserParams
Parameters:
headWord - A Label that minimally implements the HasWord and HasTag interfaces.
Returns: a Label
public void setEvaluateGrammaticalFunctions(boolean evalGFs)
Specified by: setEvaluateGrammaticalFunctions in interface TreebankLangParserParams
public void setInputEncoding(java.lang.String encoding)
Specified by: setInputEncoding in interface TreebankLangParserParams
public void setOutputEncoding(java.lang.String encoding)
Specified by: setOutputEncoding in interface TreebankLangParserParams
public java.lang.String getOutputEncoding()
Specified by: getOutputEncoding in interface TreebankLangParserParams
public java.lang.String getInputEncoding()
Specified by: getInputEncoding in interface TreebankLangParserParams
public abstract TreeReaderFactory treeReaderFactory()
Specified by: treeReaderFactory in interface TreebankLangParserParams
public AbstractEval ppAttachmentEval()
Specified by: ppAttachmentEval in interface TreebankLangParserParams
Returns: an AbstractEval
public DiskTreebank diskTreebank()
Specified by: diskTreebank in interface TreebankLangParserParams
public MemoryTreebank memoryTreebank()
Specified by: memoryTreebank in interface TreebankLangParserParams
public MemoryTreebank testMemoryTreebank()
Specified by: testMemoryTreebank in interface TreebankLangParserParams
public Treebank treebank()
Specified by: treebank in interface TreebankLangParserParams
Specified by: treebank in interface TreebankFactory
public java.io.PrintWriter pw()
Specified by: pw in interface TreebankLangParserParams
public java.io.PrintWriter pw(java.io.OutputStream o)
Specified by: pw in interface TreebankLangParserParams
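The encoding accessors and pw(java.io.OutputStream) suggest that output is written through a PrintWriter configured with the output encoding. The following is a minimal sketch of that idea using only the JDK; the class EncodedWriterSketch and its helpers writerFor and encode are hypothetical names, and the fallback on an unsupported charset is an assumption rather than the documented behavior of this class.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;

public class EncodedWriterSketch {

    // Hypothetical helper: wraps a stream in an auto-flushing PrintWriter
    // that encodes characters with the named charset.
    static PrintWriter writerFor(OutputStream out, String encoding) {
        try {
            return new PrintWriter(new OutputStreamWriter(out, encoding), true);
        } catch (UnsupportedEncodingException e) {
            // Assumed fallback: use the platform default charset.
            return new PrintWriter(out, true);
        }
    }

    // Writes text through the encoded writer and returns the raw bytes produced.
    static byte[] encode(String text, String encoding) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintWriter pw = writerFor(buffer, encoding);
        pw.print(text);
        pw.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // The same character occupies a different number of bytes per encoding.
        System.out.println(encode("\u00e9", "UTF-8").length);
        System.out.println(encode("\u00e9", "ISO-8859-1").length);
    }
}
```

The point of the sketch is that the encoding is fixed when the writer is created, so a change made later through a setter would only affect writers created afterwards.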
public TreebankLanguagePack treebankLanguagePack()
Specified by: treebankLanguagePack in interface TreebankLangParserParams
public abstract HeadFinder headFinder()
Specified by: headFinder in interface TreebankLangParserParams
public abstract HeadFinder typedDependencyHeadFinder()
Specified by: typedDependencyHeadFinder in interface TreebankLangParserParams
public Lexicon lex(Options op, Index<java.lang.String> wordIndex, Index<java.lang.String> tagIndex)
Vends a Lexicon object suitable to the particular language/treebank combination of interest.
Specified by: lex in interface TreebankLangParserParams
Parameters:
op - Options as to how the Lexicon behaves
public double[] MLEDependencyGrammarSmoothingParams()
Specified by: MLEDependencyGrammarSmoothingParams in interface TreebankLangParserParams
public abstract AbstractCollinizer collinizer()
Specified by: collinizer in interface TreebankLangParserParams
public abstract AbstractCollinizer collinizerEvalb()
Specified by: collinizerEvalb in interface TreebankLangParserParams
public abstract java.lang.String[] sisterSplitters()
Specified by: sisterSplitters in interface TreebankLangParserParams
public TreeTransformer subcategoryStripper()
Specified by: subcategoryStripper in interface TreebankLangParserParams
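To make the effect of stripping functional tags concrete, here is a rough sketch that operates on bare label strings rather than on Tree objects. It is not the CoreNLP implementation: the class SubcategoryStripSketch, the methods basicCategory and labelForEvaluation, and the choice of '-' and '=' as annotation introducers are all assumptions made for illustration. The evalGF switch mirrors the field description: functions are kept when it is true and stripped when it is false.

```java
public class SubcategoryStripSketch {

    // Hypothetical helper: cuts a label at the first '-' or '=' after position 0.
    // Labels that start with '-' (such as "-NONE-") are treated as atomic.
    static String basicCategory(String label) {
        if (label == null || label.isEmpty() || label.charAt(0) == '-') {
            return label;
        }
        for (int i = 1; i < label.length(); i++) {
            char c = label.charAt(i);
            if (c == '-' || c == '=') {
                return label.substring(0, i);
            }
        }
        return label;
    }

    // Keeps grammatical functions when evalGF is true, strips them otherwise.
    static String labelForEvaluation(String label, boolean evalGF) {
        return evalGF ? label : basicCategory(label);
    }

    public static void main(String[] args) {
        System.out.println(labelForEvaluation("NP-TMP", false));
        System.out.println(labelForEvaluation("NP-TMP", true));
        System.out.println(labelForEvaluation("-NONE-", false));
    }
}
```

A real stripper would walk the tree and apply a transformation like this to every tag and category node.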
public abstract Tree transformTree(Tree t, Tree root)
This method does language-specific tree transformations such as annotating particular nodes with language-relevant features. It changes both the labels and the shape of the tree t.
Specified by: transformTree in interface TreebankLangParserParams
Parameters:
t - The input tree (with non-language-specific annotation already done, so you need to strip back to basic categories)
root - The root of the current tree (can be null for words)
public abstract void display()
Specified by: display in interface TreebankLangParserParams
public int setOptionFlag(java.lang.String[] args, int i)
Generic options are processed separately by Options.setOption(String[],int), and implementations of this method do not have to worry about them. The Options class handles routing options. TreebankParserParams that extend this class should call super when overriding this method.
Specified by: setOptionFlag in interface TreebankLangParserParams
Parameters:
args - Array of command line arguments
i - Index in command line arguments to try to process as an option
public abstract java.util.List<? extends HasWord> defaultTestSentence()
Specified by: defaultTestSentence in interface TreebankLangParserParams
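The advice for setOptionFlag, that subclasses should call super when overriding, can be illustrated with a self-contained sketch. Nothing here is the CoreNLP code: BaseParams stands in for the abstract base class, MyLangParams is a hypothetical subclass, and the flags -evalGF and -markVerbs are illustrative names. The sketch also assumes that the method returns the index after the last argument it consumed, and returns i unchanged when it does not recognize the flag.

```java
public class OptionFlagSketch {

    // Hypothetical stand-in for the abstract base class; not the CoreNLP implementation.
    static class BaseParams {
        protected boolean evalGF = false;

        // Tries to process args[i]. Assumed contract: returns the index after the
        // last argument consumed, or i itself when the flag is not recognized.
        public int setOptionFlag(String[] args, int i) {
            if (args[i].equalsIgnoreCase("-evalGF") && i + 1 < args.length) {
                evalGF = Boolean.parseBoolean(args[i + 1]);
                return i + 2;
            }
            return i;
        }
    }

    // Hypothetical language-specific subclass that adds one flag of its own.
    static class MyLangParams extends BaseParams {
        boolean markVerbs = false;

        @Override
        public int setOptionFlag(String[] args, int i) {
            if (args[i].equalsIgnoreCase("-markVerbs")) {
                markVerbs = true;
                return i + 1;
            }
            // Not one of ours: delegate so the base-class flags keep working.
            return super.setOptionFlag(args, i);
        }
    }

    public static void main(String[] args) {
        String[] flags = {"-markVerbs", "-evalGF", "true", "-unknown"};
        MyLangParams params = new MyLangParams();
        int i = 0;
        while (i < flags.length) {
            int next = params.setOptionFlag(flags, i);
            if (next == i) {
                System.out.println("Unrecognized flag: " + flags[i]);
                next = i + 1; // skip it so the loop terminates
            }
            i = next;
        }
        System.out.println("markVerbs=" + params.markVerbs + ", evalGF=" + params.evalGF);
    }
}
```

Without the call to super, the subclass would silently stop honoring every flag the base class understands, which is the failure the note about calling super guards against.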
public TokenizerFactory<Tree> treeTokenizerFactory()
Specified by: treeTokenizerFactory in interface TreebankLangParserParams
public Extractor<DependencyGrammar> dependencyGrammarExtractor(Options op, Index<java.lang.String> wordIndex, Index<java.lang.String> tagIndex)
Specified by: dependencyGrammarExtractor in interface TreebankLangParserParams
public boolean isEvalGF()
public void setEvalGF(boolean evalGF)
public java.util.List<GrammaticalStructure> readGrammaticalStructureFromFile(java.lang.String filename)
Reads the given file and turns its content into a list of GrammaticalStructures.
Specified by: readGrammaticalStructureFromFile in interface TreebankLangParserParams
public GrammaticalStructure getGrammaticalStructure(Tree t, java.util.function.Predicate<java.lang.String> filter, HeadFinder hf)
Builds a GrammaticalStructure from a Tree.
Specified by: getGrammaticalStructure in interface TreebankLangParserParams
public boolean supportsBasicDependencies()
Specified by: supportsBasicDependencies in interface TreebankLangParserParams
public void setGenerateOriginalDependencies(boolean originalDependencies)
Specified by: setGenerateOriginalDependencies in interface TreebankLangParserParams
Parameters:
originalDependencies - Whether to generate the original Stanford Dependencies (SD)
public boolean generateOriginalDependencies()
Whether to generate original Stanford Dependencies or the newer Universal Dependencies.
Specified by: generateOriginalDependencies in interface TreebankLangParserParams
public java.lang.String[] defaultCoreNLPFlags()
The flags to use by default when run inside StanfordCoreNLP.
Specified by: defaultCoreNLPFlags in interface TreebankLangParserParams