edu.stanford.nlp.trees.international.arabic
Class ArabicTreebankTokenizer
java.lang.Object
edu.stanford.nlp.process.AbstractTokenizer<String>
edu.stanford.nlp.process.TokenizerAdapter
edu.stanford.nlp.trees.PennTreebankTokenizer
edu.stanford.nlp.trees.international.arabic.ArabicTreebankTokenizer
- All Implemented Interfaces:
- Tokenizer<String>, Iterator<String>
public class ArabicTreebankTokenizer
extends PennTreebankTokenizer
Builds a tokenizer for the Penn Arabic Treebank (ATB) using a StreamTokenizer.
This implementation is current as of the following LDC catalog numbers:
LDC2008E61 (ATBp1v4), LDC2008E62 (ATBp2v3), and LDC2008E22 (ATBp3v3.1)
- Author:
- Christopher Manning, Spence Green
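The description above says the tokenizer is built on a StreamTokenizer. The following is a minimal sketch of how a bracketed-treebank tokenizer could be configured on top of java.io.StreamTokenizer. The class name TreebankTokenSketch and its syntax-table settings are assumptions made for illustration; they are not taken from the Stanford source and may differ from what ArabicTreebankTokenizer actually does.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT the Stanford implementation.
final class TreebankTokenSketch {
    private final StreamTokenizer st;

    TreebankTokenSketch(Reader r) {
        st = new StreamTokenizer(r);
        st.resetSyntax();            // drop the default number, quote and comment handling
        st.whitespaceChars(0, ' ');  // control characters and space separate tokens
        st.wordChars('!', 255);      // all other Latin-1 characters may appear inside a word
        st.ordinaryChar('(');        // each bracket is returned as its own token
        st.ordinaryChar(')');
        // In the OpenJDK implementation, characters above 255 (for example
        // Arabic script) are treated as alphabetic, so they join words too.
    }

    /** Returns the next token, or null once the stream is exhausted. */
    String getNext() throws IOException {
        int type = st.nextToken();
        if (type == StreamTokenizer.TT_EOF) {
            return null;
        }
        if (type == StreamTokenizer.TT_WORD) {
            return st.sval;
        }
        return String.valueOf((char) type);  // an ordinary character such as '(' or ')'
    }

    /** Convenience helper: tokenizes a whole string into a list. */
    static List<String> tokenize(String text) throws IOException {
        TreebankTokenSketch tok = new TreebankTokenSketch(new StringReader(text));
        List<String> out = new ArrayList<>();
        for (String t = tok.getNext(); t != null; t = tok.getNext()) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Buckwalter-style transliteration is used here only to keep the sample ASCII.
        System.out.println(tokenize("(S (NP ktAb) (VP qara>a))"));
    }
}
```

Resetting the syntax table first matters in this sketch: by default StreamTokenizer parses numbers and treats quote and slash characters specially, which would split or swallow treebank tokens.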
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ArabicTreebankTokenizer
public ArabicTreebankTokenizer(Reader r)
getNext
public String getNext()
- Internally fetches the next token.
- Overrides:
- getNext in class TokenizerAdapter
- Returns:
- the next token in the token stream, or null if none exists.
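Because the class also implements Iterator<String>, a getNext() that returns null at the end of the stream can back hasNext() and next() through a one-token lookahead. The sketch below illustrates that pattern in isolation. LookaheadTokenizer, ListBackedTokenizer and GetNextDemo are hypothetical names, and the code is an assumption about how such an adapter can be written, not a copy of AbstractTokenizer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical demo class; declared first so a single-file launch runs its main.
final class GetNextDemo {
    /** Collects every remaining token from the iterator. */
    static List<String> drain(Iterator<String> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        Iterator<String> it = new ListBackedTokenizer(Arrays.asList("(", "NP", "ktAb", ")"));
        System.out.println(drain(it));
    }
}

// Hypothetical base class: turns a null-terminated getNext() into an Iterator.
abstract class LookaheadTokenizer implements Iterator<String> {
    private String buffered;  // token already fetched but not yet handed out

    /** Returns the next token, or null if none exists. */
    protected abstract String getNext();

    @Override
    public boolean hasNext() {
        if (buffered == null) {
            buffered = getNext();  // fetch lazily, at most one token ahead
        }
        return buffered != null;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String token = buffered;
        buffered = null;
        return token;
    }
}

// Hypothetical stand-in for a real token source such as a Reader.
final class ListBackedTokenizer extends LookaheadTokenizer {
    private final Iterator<String> source;

    ListBackedTokenizer(List<String> tokens) {
        this.source = tokens.iterator();
    }

    @Override
    protected String getNext() {
        return source.hasNext() ? source.next() : null;
    }
}
```

With this arrangement callers normally iterate with hasNext() and next() rather than calling getNext() directly, since a direct call would bypass any token already held in the lookahead buffer.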
main
public static void main(String[] args) throws IOException
- Throws:
- IOException
Stanford NLP Group