public interface TreebankLangParserParams extends TreebankFactory, Serializable
Modifier and Type | Method and Description |
---|---|
`TreeTransformer` | `collinizer()` The tree transformer applied to trees prior to evaluation. |
`TreeTransformer` | `collinizerEvalb()` The tree transformer used to produce trees for evaluation. |
`String[]` | `defaultCoreNLPFlags()` The flags to use by default when run inside StanfordCoreNLP. |
`List<? extends HasWord>` | `defaultTestSentence()` Returns a default sentence of the language (for testing). |
`Extractor<DependencyGrammar>` | `dependencyGrammarExtractor(Options op, Index<String> wordIndex, Index<String> tagIndex)` |
`DiskTreebank` | `diskTreebank()` Returns a DiskTreebank appropriate to the treebank source. |
`void` | `display()` Displays language-specific settings. |
`GrammaticalStructure` | `getGrammaticalStructure(Tree t, java.util.function.Predicate<String> filter, HeadFinder hf)` Builds a GrammaticalStructure from a Tree. |
`String` | `getInputEncoding()` Returns the input encoding being used. |
`String` | `getOutputEncoding()` Returns the output encoding being used. |
`HeadFinder` | `headFinder()` |
`Lexicon` | `lex(Options op, Index<String> wordIndex, Index<String> tagIndex)` Vends a Lexicon object suitable to the particular language/treebank combination of interest. |
`MemoryTreebank` | `memoryTreebank()` Returns a MemoryTreebank appropriate to the treebank source. |
`double[]` | `MLEDependencyGrammarSmoothingParams()` Gives the parameters for smoothing in the MLEDependencyGrammar. |
`AbstractEval` | `ppAttachmentEval()` Returns a language-specific object for evaluating PP attachment. |
`Label` | `processHeadWord(Label headWord)` Allows language-specific processing (e.g., stemming) of head words. |
`PrintWriter` | `pw()` Returns a PrintWriter used to print output. |
`PrintWriter` | `pw(OutputStream o)` Returns a PrintWriter used to print output to the OutputStream o. |
`List<GrammaticalStructure>` | `readGrammaticalStructureFromFile(String filename)` Reads the given file and turns its content into a list of GrammaticalStructures. |
`void` | `setEvaluateGrammaticalFunctions(boolean evalGFs)` If evalGFs is true, the evaluation of parse trees will include evaluation of grammatical functions. |
`void` | `setInputEncoding(String encoding)` |
`int` | `setOptionFlag(String[] args, int i)` Sets a language-specific option according to command-line flags. |
`void` | `setOutputEncoding(String encoding)` |
`String[]` | `sisterSplitters()` Returns the splitting strings used for selective splits. |
`TreeTransformer` | `subcategoryStripper()` Returns a TreeTransformer appropriate to the Treebank which can be used to remove functional tags (such as "-TMP") from categories. |
`boolean` | `supportsBasicDependencies()` |
`MemoryTreebank` | `testMemoryTreebank()` Returns a MemoryTreebank appropriate to the testing treebank source. |
`Tree` | `transformTree(Tree t, Tree root)` This method does language-specific tree transformations such as annotating particular nodes with language-relevant features. |
`Treebank` | `treebank()` Required to extend TreebankFactory. |
`TreebankLanguagePack` | `treebankLanguagePack()` Returns a TreebankLanguagePack containing Treebank-specific (but not parser-specific) info such as what is punctuation, and also information about the structure of labels. |
`TreeReaderFactory` | `treeReaderFactory()` Returns a factory for reading in trees from the source you want. |
`TokenizerFactory<Tree>` | `treeTokenizerFactory()` |
`HeadFinder` | `typedDependencyHeadFinder()` |

HeadFinder headFinder()
HeadFinder typedDependencyHeadFinder()
Label processHeadWord(Label headWord)
void setInputEncoding(String encoding)
void setOutputEncoding(String encoding)
void setEvaluateGrammaticalFunctions(boolean evalGFs)
String getOutputEncoding()
String getInputEncoding()
TreeReaderFactory treeReaderFactory()
Lexicon lex(Options op, Index<String> wordIndex, Index<String> tagIndex)
Vends a Lexicon object suitable to the particular language/treebank combination of interest.
op - Options as to how the Lexicon behaves
TreeTransformer collinizer()
TreeTransformer collinizerEvalb()
MemoryTreebank memoryTreebank()
DiskTreebank diskTreebank()
MemoryTreebank testMemoryTreebank()
Treebank treebank()
Specified by: treebank in interface TreebankFactory
TreebankLanguagePack treebankLanguagePack()
PrintWriter pw()
PrintWriter pw(OutputStream o)
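The encoding accessors and `pw(OutputStream o)` plausibly work together: a writer created for output would be expected to honor the configured output encoding. The following is a minimal sketch under that assumption, not the library's implementation. `EncodingParamsSketch` and its `encodedLength` helper are hypothetical names, and the UTF-8 defaults are an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;

// Hypothetical stand-in, not part of the library: shows one way the
// encoding setters and pw(OutputStream) could cooperate.
class EncodingParamsSketch {
  private String inputEncoding = "UTF-8";   // assumed default
  private String outputEncoding = "UTF-8";  // assumed default

  public void setInputEncoding(String encoding) { inputEncoding = encoding; }
  public void setOutputEncoding(String encoding) { outputEncoding = encoding; }
  public String getInputEncoding() { return inputEncoding; }
  public String getOutputEncoding() { return outputEncoding; }

  // Wraps the stream in an auto-flushing writer that uses the current
  // output encoding.
  public PrintWriter pw(OutputStream o) {
    try {
      return new PrintWriter(new OutputStreamWriter(o, outputEncoding), true);
    } catch (UnsupportedEncodingException e) {
      throw new IllegalArgumentException("Unknown encoding: " + outputEncoding, e);
    }
  }

  public PrintWriter pw() { return pw(System.out); }

  // Demo helper: number of bytes produced when printing s in the encoding.
  static int encodedLength(String encoding, String s) {
    EncodingParamsSketch params = new EncodingParamsSketch();
    params.setOutputEncoding(encoding);
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    PrintWriter writer = params.pw(buffer);
    writer.print(s);
    writer.flush();
    return buffer.size();
  }

  public static void main(String[] args) {
    System.out.println(encodedLength("UTF-8", "\u00e9"));      // 2
    System.out.println(encodedLength("ISO-8859-1", "\u00e9")); // 1
  }
}
```

The same character occupies a different number of bytes depending on the output encoding, which is why the encoding should be set before a writer is requested.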
String[] sisterSplitters()
TreeTransformer subcategoryStripper()
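To illustrate what removing functional tags such as "-TMP" from categories can look like, here is a rough sketch of the idea behind `subcategoryStripper()`. It does not use the library's `Tree` or `TreeTransformer` types: `Node` is a hypothetical minimal tree class, and the rule "cut the label at the first hyphen unless the label starts with one" is a simplifying assumption, not the rule any particular treebank implementation applies.

```java
import java.util.ArrayList;
import java.util.List;

class SubcategoryStripperSketch {
  // Hypothetical minimal tree node; the library's Tree class is richer.
  static final class Node {
    final String label;
    final List<Node> children = new ArrayList<>();
    Node(String label, Node... kids) {
      this.label = label;
      for (Node k : kids) children.add(k);
    }
    boolean isLeaf() { return children.isEmpty(); }
    @Override public String toString() {
      if (isLeaf()) return label;
      StringBuilder sb = new StringBuilder("(").append(label);
      for (Node c : children) sb.append(' ').append(c);
      return sb.append(')').toString();
    }
  }

  // Assumed rule: drop everything from the first hyphen onward, unless the
  // label starts with a hyphen (e.g. "-NONE-", "-LRB-").
  static String basicCategory(String label) {
    int cut = label.indexOf('-');
    return cut > 0 ? label.substring(0, cut) : label;
  }

  // Returns a new tree; words (leaves) are copied unchanged.
  static Node strip(Node t) {
    if (t.isLeaf()) return new Node(t.label);
    Node copy = new Node(basicCategory(t.label));
    for (Node c : t.children) copy.children.add(strip(c));
    return copy;
  }

  public static void main(String[] args) {
    Node tree = new Node("S",
        new Node("NP-SBJ", new Node("PRP", new Node("We"))),
        new Node("VP", new Node("VBD", new Node("left")),
            new Node("NP-TMP", new Node("NN", new Node("yesterday")))));
    System.out.println(strip(tree));
    // (S (NP (PRP We)) (VP (VBD left) (NP (NN yesterday))))
  }
}
```

Returning a fresh tree instead of editing labels in place keeps the original annotated tree available for evaluation; that is a design choice of this sketch only.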
Tree transformTree(Tree t, Tree root)
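This method does language-specific tree transformations such as annotating particular nodes with language-relevant features. It may change the input tree t, altering both labels and the tree shape.
t - The input tree (with non-language specific annotation already done, so you need to strip back to basic categories)
root - The root of the current tree (can be null for words)
void display()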
int setOptionFlag(String[] args, int i)
args - Array of command line arguments
i - Index in command line arguments to try to process as an option
List<? extends HasWord> defaultTestSentence()
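For `setOptionFlag(String[] args, int i)`, the sketch below shows how an implementation might try to process the option at `args[i]`. It assumes a return-value convention that this page does not spell out: the method returns the index after the last argument it consumed, and returns `i` unchanged when it does not recognize the option. The class `OptionFlagSketch`, the helper `processAll`, and the flags `-exampleFlag` and `-exampleEncoding` are hypothetical.

```java
class OptionFlagSketch {
  boolean exampleFlag = false;       // hypothetical boolean option
  String exampleEncoding = "UTF-8";  // hypothetical option taking one value

  // Tries to process the option at args[i]. Assumed convention: returns the
  // index after the last argument consumed, or i if nothing was recognized.
  int setOptionFlag(String[] args, int i) {
    if (args[i].equalsIgnoreCase("-exampleFlag")) {
      exampleFlag = true;
      return i + 1;
    }
    if (args[i].equalsIgnoreCase("-exampleEncoding") && i + 1 < args.length) {
      exampleEncoding = args[i + 1];
      return i + 2;
    }
    return i; // not a flag this class understands
  }

  // A caller can loop over the arguments and detect unknown flags by
  // checking whether the index advanced. Returns the number of unknown args.
  int processAll(String[] args) {
    int unknown = 0;
    int i = 0;
    while (i < args.length) {
      int next = setOptionFlag(args, i);
      if (next == i) {
        unknown++;
        next = i + 1; // skip the unrecognized argument
      }
      i = next;
    }
    return unknown;
  }

  public static void main(String[] args) {
    OptionFlagSketch params = new OptionFlagSketch();
    int unknown = params.processAll(
        new String[] {"-exampleEncoding", "GB18030", "-exampleFlag", "-other"});
    System.out.println(params.exampleEncoding + " " + params.exampleFlag + " " + unknown);
    // GB18030 true 1
  }
}
```

Returning the next index lets options of different lengths (a bare flag versus a flag with a value) share one dispatch loop, and lets the caller hand unrecognized arguments to another handler.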
TokenizerFactory<Tree> treeTokenizerFactory()
Extractor<DependencyGrammar> dependencyGrammarExtractor(Options op, Index<String> wordIndex, Index<String> tagIndex)
double[] MLEDependencyGrammarSmoothingParams()
AbstractEval ppAttachmentEval()
Returns a language-specific AbstractEval object for evaluating PP attachment.
List<GrammaticalStructure> readGrammaticalStructureFromFile(String filename)
GrammaticalStructure getGrammaticalStructure(Tree t, java.util.function.Predicate<String> filter, HeadFinder hf)
boolean supportsBasicDependencies()
String[] defaultCoreNLPFlags()