java.lang.Object
  edu.stanford.nlp.trees.AbstractTreebankLanguagePack
    edu.stanford.nlp.trees.international.pennchinese.ChineseTreebankLanguagePack
public class ChineseTreebankLanguagePack
extends AbstractTreebankLanguagePack
Language pack for the UPenn/Colorado Chinese treebank. The native character set for the Chinese Treebank is GB18030. This file (like the rest of JavaNLP) is in UTF-8.
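Because the treebank's native character set (GB18030) differs from the UTF-8 used by the source files, treebank data may need to be transcoded before it is mixed with UTF-8 text. The following is a minimal sketch using only the standard `java.nio.charset` API. The class `Gb18030ToUtf8` and its `transcode` method are hypothetical names for illustration and are not part of JavaNLP.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of JavaNLP): re-encodes GB18030 bytes as UTF-8. */
class Gb18030ToUtf8 {

    /** Decodes the given bytes as GB18030 and re-encodes the text as UTF-8. */
    static byte[] transcode(byte[] gb18030Bytes) {
        // Charset.forName throws UnsupportedCharsetException if the JVM lacks GB18030.
        Charset gb18030 = Charset.forName("GB18030");
        String text = new String(gb18030Bytes, gb18030);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the two-character string for "Chinese (language)".
        String sample = "\u4e2d\u6587";
        byte[] gbBytes = sample.getBytes(Charset.forName("GB18030"));
        byte[] utf8Bytes = transcode(gbBytes);
        System.out.println("GB18030 bytes: " + gbBytes.length);
        System.out.println("UTF-8 bytes:   " + utf8Bytes.length);
    }
}
```

Unicode escapes are used for the sample text so that the sketch compiles the same way regardless of the compiler's source encoding.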
| Field Summary | |
|---|---|
| static java.lang.String | ENCODING |

| Fields inherited from class edu.stanford.nlp.trees.AbstractTreebankLanguagePack |
|---|
| DEFAULT_ENCODING, DEFAULT_GF_CHAR, gfCharacter |

| Constructor Summary | |
|---|---|
| ChineseTreebankLanguagePack() | |

| Method Summary | |
|---|---|
| static Filter<java.lang.String> | chineseColonAcceptFilter() |
| static Filter<java.lang.String> | chineseCommaAcceptFilter() |
| static Filter<java.lang.String> | chineseDashAcceptFilter() |
| static Filter<java.lang.String> | chineseDouHaoAcceptFilter() |
| static Filter<java.lang.String> | chineseEndSentenceAcceptFilter() |
| static Filter<java.lang.String> | chineseLeftParenthesisAcceptFilter() |
| static Filter<java.lang.String> | chineseLeftQuoteMarkAcceptFilter() |
| static Filter<java.lang.String> | chineseOtherAcceptFilter() |
| static Filter<java.lang.String> | chineseParenthesisAcceptFilter() |
| static Filter<java.lang.String> | chineseQuoteMarkAcceptFilter() |
| static Filter<java.lang.String> | chineseRightParenthesisAcceptFilter() |
| static Filter<java.lang.String> | chineseRightQuoteMarkAcceptFilter() |
| java.lang.String | getEncoding(): Return the input Charset encoding for the Treebank. |
| TokenizerFactory<? extends HasWord> | getTokenizerFactory(): Return a tokenizer which might be suitable for tokenizing text that will be used with this Treebank/Language pair, without tokenizing carriage returns (i.e., treating them as white space). |
| GrammaticalStructureFactory | grammaticalStructureFactory(): Return a GrammaticalStructureFactory suitable for this language/treebank. |
| GrammaticalStructureFactory | grammaticalStructureFactory(Filter<java.lang.String> puncFilt): Return a GrammaticalStructureFactory suitable for this language/treebank. |
| GrammaticalStructureFactory | grammaticalStructureFactory(Filter<java.lang.String> puncFilt, HeadFinder hf): Return a GrammaticalStructureFactory suitable for this language/treebank. |
| HeadFinder | headFinder(): The HeadFinder to use for your treebank. |
| boolean | isEvalBIgnoredPunctuationTag(java.lang.String str): Accepts a String that is a punctuation tag that should be ignored by EVALB-style evaluation, and rejects everything else. |
| boolean | isPunctuationTag(java.lang.String str): Accepts a String that is a punctuation tag name, and rejects everything else. |
| boolean | isPunctuationWord(java.lang.String str): Accepts a String that is a punctuation word, and rejects everything else. |
| boolean | isSentenceFinalPunctuationTag(java.lang.String str): Accepts a String that is a sentence end punctuation tag, and rejects everything else. |
| char[] | labelAnnotationIntroducingCharacters(): Return an array of characters at which a String should be truncated to give the basic syntactic category of a label. |
| java.lang.String[] | punctuationTags(): Returns a String array of punctuation tags for this treebank/language. |
| java.lang.String[] | punctuationWords(): Returns a String array of punctuation words for this treebank/language. |
| java.lang.String[] | sentenceFinalPunctuationTags(): Returns a String array of sentence final punctuation tags for this treebank/language. |
| java.lang.String[] | sentenceFinalPunctuationWords(): Returns a String array of sentence final punctuation words for this treebank/language. |
| static void | setTokenizerFactory(TokenizerFactory<? extends HasWord> tf) |
| java.lang.String[] | startSymbols(): Returns a String array of treebank start symbols. |
| java.lang.String | treebankFileExtension(): Returns the extension of treebank files for this treebank. |
| TreeReaderFactory | treeReaderFactory(): Returns a TreeReaderFactory suitable for general purpose use with this language/treebank. |
| HeadFinder | typedDependencyHeadFinder(): The HeadFinder to use when making typed dependencies. |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

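The summary for getTokenizerFactory() describes tokenization that treats carriage returns as ordinary white space rather than as tokens. As a rough illustration of that behaviour only, the sketch below splits text on runs of white space using the standard library. `WhitespaceSplitSketch` is a hypothetical name; this is not the library's tokenizer and makes no claim about how the returned factory is implemented.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch (not the JavaNLP tokenizer): splits on runs of white space. */
class WhitespaceSplitSketch {

    /** Returns the non-empty pieces of text separated by spaces, tabs, CR or LF. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        // \s matches space, tab, carriage return and line feed, so line breaks
        // act as separators and never appear as tokens themselves.
        for (String piece : text.split("\\s+")) {
            if (!piece.isEmpty()) { // split() yields a leading "" if text starts with white space
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a b\r\nc\td"));
    }
}
```

Note that the default `\s` class does not match the full-width ideographic space (U+3000), so text that uses it as a separator would need a broader pattern.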
| Field Detail |
|---|
public static final java.lang.String ENCODING
| Constructor Detail |
|---|
public ChineseTreebankLanguagePack()
| Method Detail |
|---|
public static void setTokenizerFactory(TokenizerFactory<? extends HasWord> tf)
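public TokenizerFactory<? extends HasWord> getTokenizerFactory()
Return a tokenizer which might be suitable for tokenizing text that will be used with this Treebank/Language pair, without tokenizing carriage returns (i.e., treating them as white space).
See also: AbstractTreebankLanguagePack, WhitespaceTokenizer.
Specified by: getTokenizerFactory in interface TreebankLanguagePack
Superclass method: getTokenizerFactory in class AbstractTreebankLanguagePack
public java.lang.String getEncoding()
Return the input Charset encoding for the Treebank. See the Charset class.
Specified by: getEncoding in interface TreebankLanguagePack
Superclass method: getEncoding in class AbstractTreebankLanguagePack
public boolean isPunctuationTag(java.lang.String str)
Accepts a String that is a punctuation tag name, and rejects everything else.
Specified by: isPunctuationTag in interface TreebankLanguagePack
Superclass method: isPunctuationTag in class AbstractTreebankLanguagePack
Parameters: str - The string to check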
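public boolean isPunctuationWord(java.lang.String str)
Accepts a String that is a punctuation word, and rejects everything else.
Specified by: isPunctuationWord in interface TreebankLanguagePack
Superclass method: isPunctuationWord in class AbstractTreebankLanguagePack
Parameters: str - The string to check
public boolean isSentenceFinalPunctuationTag(java.lang.String str)
Accepts a String that is a sentence end punctuation tag, and rejects everything else.
Specified by: isSentenceFinalPunctuationTag in interface TreebankLanguagePack
Superclass method: isSentenceFinalPunctuationTag in class AbstractTreebankLanguagePack
Parameters: str - The string to check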
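public java.lang.String[] punctuationTags()
Returns a String array of punctuation tags for this treebank/language.
Specified by: punctuationTags in interface TreebankLanguagePack
Superclass method: punctuationTags in class AbstractTreebankLanguagePack
public java.lang.String[] punctuationWords()
Returns a String array of punctuation words for this treebank/language.
Specified by: punctuationWords in interface TreebankLanguagePack
Superclass method: punctuationWords in class AbstractTreebankLanguagePack
public java.lang.String[] sentenceFinalPunctuationTags()
Returns a String array of sentence final punctuation tags for this treebank/language.
Specified by: sentenceFinalPunctuationTags in interface TreebankLanguagePack
Superclass method: sentenceFinalPunctuationTags in class AbstractTreebankLanguagePack
public java.lang.String[] sentenceFinalPunctuationWords()
Returns a String array of sentence final punctuation words for this treebank/language.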
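The array-returning methods (punctuationTags(), punctuationWords()) and the boolean predicates (isPunctuationTag, isPunctuationWord) describe the same sets from two angles. One plausible way to relate them is a set-membership check over the arrays, sketched below. `PunctuationSketch` is a hypothetical class, and the tag and word values are assumptions chosen for illustration; the actual arrays returned by this language pack are not listed on this page.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: predicates implemented as membership tests over arrays. */
class PunctuationSketch {

    // Assumed values for illustration only; not taken from the library.
    private static final String[] TAGS = {"PU"};
    private static final String[] WORDS = {"\uff0c", "\u3002", "\uff1f", "\uff01"};

    private static final Set<String> TAG_SET = new HashSet<>(Arrays.asList(TAGS));
    private static final Set<String> WORD_SET = new HashSet<>(Arrays.asList(WORDS));

    /** Returns a copy so callers cannot modify the shared array. */
    static String[] punctuationTags() {
        return TAGS.clone();
    }

    static String[] punctuationWords() {
        return WORDS.clone();
    }

    /** Accepts exactly the strings listed in punctuationTags(). */
    static boolean isPunctuationTag(String str) {
        return TAG_SET.contains(str);
    }

    /** Accepts exactly the strings listed in punctuationWords(). */
    static boolean isPunctuationWord(String str) {
        return WORD_SET.contains(str);
    }

    public static void main(String[] args) {
        System.out.println(isPunctuationTag("PU"));
        System.out.println(isPunctuationTag("NN"));
        System.out.println(isPunctuationWord("\u3002"));
    }
}
```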
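public boolean isEvalBIgnoredPunctuationTag(java.lang.String str)
Accepts a String that is a punctuation tag that should be ignored by EVALB-style evaluation, and rejects everything else.
Specified by: isEvalBIgnoredPunctuationTag in interface TreebankLanguagePack
Superclass method: isEvalBIgnoredPunctuationTag in class AbstractTreebankLanguagePack
Parameters: str - The string to check
public char[] labelAnnotationIntroducingCharacters()
Return an array of characters at which a String should be truncated to give the basic syntactic category of a label.
Specified by: labelAnnotationIntroducingCharacters in interface TreebankLanguagePack
Superclass method: labelAnnotationIntroducingCharacters in class AbstractTreebankLanguagePack
public java.lang.String[] startSymbols()
Returns a String array of treebank start symbols.
Specified by: startSymbols in interface TreebankLanguagePack
Superclass method: startSymbols in class AbstractTreebankLanguagePack
public static Filter<java.lang.String> chineseCommaAcceptFilter()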
public static Filter<java.lang.String> chineseEndSentenceAcceptFilter()
public static Filter<java.lang.String> chineseDouHaoAcceptFilter()
public static Filter<java.lang.String> chineseQuoteMarkAcceptFilter()
public static Filter<java.lang.String> chineseParenthesisAcceptFilter()
public static Filter<java.lang.String> chineseColonAcceptFilter()
public static Filter<java.lang.String> chineseDashAcceptFilter()
public static Filter<java.lang.String> chineseOtherAcceptFilter()
public static Filter<java.lang.String> chineseLeftParenthesisAcceptFilter()
public static Filter<java.lang.String> chineseRightParenthesisAcceptFilter()
public static Filter<java.lang.String> chineseLeftQuoteMarkAcceptFilter()
public static Filter<java.lang.String> chineseRightQuoteMarkAcceptFilter()
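Each of the static methods above returns a Filter<java.lang.String> for one class of punctuation. The sketch below shows the general shape of such an accept filter: a predicate that accepts a fixed set of strings and rejects everything else. The nested `Filter` interface is a hypothetical stand-in declared locally so the example compiles on its own, and the comma strings are assumptions; neither the real filter contents nor the library's implementation is shown on this page.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of an "accept filter" over a fixed set of strings. */
class AcceptFilterSketch {

    /** Local stand-in for the library's Filter type, declared here for self-containment. */
    interface Filter<T> {
        boolean accept(T obj);
    }

    /** Builds a filter that accepts exactly the given strings. */
    static Filter<String> acceptFilter(String... accepted) {
        Set<String> set = new HashSet<>(Arrays.asList(accepted));
        return set::contains;
    }

    /** Assumed comma set for illustration: ASCII comma and full-width comma. */
    static Filter<String> commaFilter() {
        return acceptFilter(",", "\uff0c");
    }

    public static void main(String[] args) {
        Filter<String> comma = commaFilter();
        System.out.println(comma.accept("\uff0c")); // full-width comma
        System.out.println(comma.accept("\u3002")); // ideographic full stop
    }
}
```

Backing the filter with a set keeps each lookup constant-time, which matters when the filter is applied to every token in a treebank.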
public java.lang.String treebankFileExtension()
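public GrammaticalStructureFactory grammaticalStructureFactory()
Description copied from class: AbstractTreebankLanguagePack
Return a GrammaticalStructureFactory suitable for this language/treebank.
Specified by: grammaticalStructureFactory in interface TreebankLanguagePack
Superclass method: grammaticalStructureFactory in class AbstractTreebankLanguagePack
public GrammaticalStructureFactory grammaticalStructureFactory(Filter<java.lang.String> puncFilt)
Description copied from class: AbstractTreebankLanguagePack
Return a GrammaticalStructureFactory suitable for this language/treebank.
Specified by: grammaticalStructureFactory in interface TreebankLanguagePack
Superclass method: grammaticalStructureFactory in class AbstractTreebankLanguagePack
Parameters: puncFilt - A filter which should reject punctuation words (as Strings)
public GrammaticalStructureFactory grammaticalStructureFactory(Filter<java.lang.String> puncFilt, HeadFinder hf)
Description copied from class: AbstractTreebankLanguagePack
Return a GrammaticalStructureFactory suitable for this language/treebank.
Specified by: grammaticalStructureFactory in interface TreebankLanguagePack
Superclass method: grammaticalStructureFactory in class AbstractTreebankLanguagePack
Parameters: puncFilt - A filter which should reject punctuation words (as Strings)
hf - A HeadFinder which finds heads for typed dependencies
public TreeReaderFactory treeReaderFactory()
Description copied from class: AbstractTreebankLanguagePack
Returns a TreeReaderFactory suitable for general purpose use with this language/treebank.
Specified by: treeReaderFactory in interface TreebankLanguagePack
Superclass method: treeReaderFactory in class AbstractTreebankLanguagePack
public HeadFinder headFinder()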
public HeadFinder typedDependencyHeadFinder()
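The description of labelAnnotationIntroducingCharacters() says a label should be truncated at one of the returned characters to obtain its basic syntactic category. A minimal sketch of that truncation follows. `BasicCategorySketch` and `basicCategory` are hypothetical names, the character arrays passed in are assumed examples, and starting the scan at index 1 (so a label is never reduced to the empty string) is a design choice of this sketch rather than documented library behaviour.

```java
/** Hypothetical sketch: truncate a label at the first annotation-introducing character. */
class BasicCategorySketch {

    /** Returns true if c is one of the given introducing characters. */
    private static boolean isIntroducing(char c, char[] introducingChars) {
        for (char candidate : introducingChars) {
            if (candidate == c) {
                return true;
            }
        }
        return false;
    }

    /**
     * Returns the prefix of label before the first introducing character found
     * at index 1 or later, or the whole label if there is none.
     */
    static String basicCategory(String label, char[] introducingChars) {
        if (label == null) {
            return null;
        }
        // Start at 1 so that a label beginning with an introducing character
        // is not truncated to the empty string (an assumption of this sketch).
        for (int i = 1; i < label.length(); i++) {
            if (isIntroducing(label.charAt(i), introducingChars)) {
                return label.substring(0, i);
            }
        }
        return label;
    }

    public static void main(String[] args) {
        char[] assumedChars = {'-', '='}; // assumed values for illustration
        System.out.println(basicCategory("NP-SBJ", assumedChars));
        System.out.println(basicCategory("VP", assumedChars));
    }
}
```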