Package edu.stanford.nlp.trees.international.arabic

Class Summary
ArabicHeadFinder Finds the head of an Arabic tree using conventional heuristic head-finding rules.
ArabicTokenizer A simple tokenizer that splits off a few punctuation characters and otherwise splits on, and discards, whitespace characters.
ArabicTreebankLanguagePack Specifies the treebank- and language-specific components needed for parsing the Arabic Penn Treebank.
ArabicTreebankTokenizer Builds a tokenizer for Arabic Penn Treebank trees.
ArabicTreeNormalizer A first-version tree normalizer for the Arabic Penn Treebank.
ArabicTreeNormalizer.ArabicEmptyFilter Extends the empty filter in BobChrisTreeNormalizer to also delete empty pronouns.
ArabicTreeReaderFactory Reads ArabicTreebank trees.
ArabicTreeReaderFactory.ArabicRawTreeReaderFactory  
ArabicTreeReaderFactory.ArabicXFilteringTreeReaderFactory  
Buckwalter This class can convert between Unicode and Buckwalter encodings of Arabic.
IBMArabicEscaper Deletes the '#' and '+' symbols that the IBM segmenter uses to mark prefixes and suffixes, since they are not present in the Penn Arabic Treebank materials (though they might be added later), and escapes parenthesis characters.
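
The escaping behavior described for IBMArabicEscaper can be illustrated with a minimal sketch. This is not the class's actual source: the name IbmEscaperSketch and both methods are hypothetical, and the replacement strings -LRB- and -RRB- are an assumption based on the common Penn Treebank convention for escaped parentheses, since the summary above does not say what the parentheses are escaped to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not the Stanford IBMArabicEscaper source. */
final class IbmEscaperSketch {

    /** Drops the '#' and '+' segmenter markers and rewrites parentheses. */
    static String escape(String token) {
        StringBuilder out = new StringBuilder(token.length());
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            switch (c) {
                case '#': // prefix marker from the segmenter: delete
                case '+': // suffix marker from the segmenter: delete
                    break;
                case '(':
                    out.append("-LRB-"); // assumed Penn Treebank style escape
                    break;
                case ')':
                    out.append("-RRB-"); // assumed Penn Treebank style escape
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    /** Applies escape() to every token, preserving order. */
    static List<String> escapeAll(List<String> tokens) {
        List<String> result = new ArrayList<>(tokens.size());
        for (String token : tokens) {
            result.add(escape(token));
        }
        return result;
    }

    public static void main(String[] args) {
        // Tokens use Buckwalter-style ASCII purely for readability.
        System.out.println(escapeAll(Arrays.asList("w#", "ktAb", "+hm", "(")));
    }
}
```

The sketch works per character, so a marker is removed wherever it occurs in a token; whether the real class restricts deletion to token edges is not stated in this summary.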
 

Enum Summary
ArabicHeadFinder.TagSet  
 



Stanford NLP Group