java.lang.Object
  +--edu.stanford.nlp.trees.international.CHTBTokenizer
A simple tokenizer for tokenizing Penn Chinese Treebank files. A token is any parenthesis, node label, or terminal. All SGML content of the files is ignored.
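To make the token definition above concrete, here is a minimal, self-contained sketch. It is not the CHTBTokenizer source: the class name TreebankTokenSketch and its tokenize method are hypothetical, and it assumes that "SGML content" means tags of the form <...>, which are simply dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: NOT the CHTBTokenizer implementation.
// TreebankTokenSketch and tokenize() are hypothetical names.
final class TreebankTokenSketch {

    // Splits bracketed treebank text into parentheses, node labels, and
    // terminals. Assumption: SGML content is any <...> tag, which is skipped.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c == '<') {
                // Drop everything up to and including the closing '>'.
                flush(current, tokens);
                int close = text.indexOf('>', i);
                i = (close < 0) ? text.length() : close + 1;
                continue;
            }
            if (c == '(' || c == ')') {
                // Each parenthesis is a token of its own.
                flush(current, tokens);
                tokens.add(String.valueOf(c));
            } else if (Character.isWhitespace(c)) {
                // Whitespace ends the current label or terminal.
                flush(current, tokens);
            } else {
                current.append(c);
            }
            i++;
        }
        flush(current, tokens);
        return tokens;
    }

    // Emits the pending label or terminal, if there is one.
    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        // The terminal is written with Unicode escapes so that the source
        // file compiles regardless of the platform's default encoding.
        String sample = "<S ID=1>\n(NP (NN \u4e2d\u56fd))\n</S>";
        for (String token : tokenize(sample)) {
            System.out.println(token);
        }
    }
}
```

For the sample above, the sketch yields seven tokens: two opening parentheses, the labels NP and NN, one terminal, and two closing parentheses. The two SGML tags contribute nothing.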
Constructor Summary
CHTBTokenizer(Reader r)
    Constructs a new tokenizer from a Reader.

Method Summary
static void main(String[] args)
    The main() method tokenizes a file in the specified encoding and prints it to standard output in the specified encoding.
String next()
    Returns the next token.
void pushBack()
    Pushes the previous token back.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
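Constructor Detail
public CHTBTokenizer(Reader r)
Constructs a new tokenizer from a Reader. The character encoding of the input can be handled with ConvertEncodingThread, or by specifying the file's encoding explicitly in the Reader with java.io.InputStreamReader.
Parameters:
r - Reader

Method Detail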
public String next() throws IOException
Returns the next token, as required by the edu.stanford.nlp.io.StreamTokenizer interface.
Specified by:
next in interface StreamTokenizer
Throws:
IOException
public void pushBack()
Pushes the previous token back, as required by the edu.stanford.nlp.io.StreamTokenizer interface.
Specified by:
pushBack in interface StreamTokenizer
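The next()/pushBack() pairing is a one-token lookahead pattern. The following sketch shows one common way such a pair can behave. It is an assumption-laden illustration, not the documented implementation: the class PushBackSketch is hypothetical, it reads from an in-memory Iterator rather than a Reader, and returning null at end of input is a choice made for this sketch only.

```java
import java.util.Arrays;
import java.util.Iterator;

// Illustrative sketch only: PushBackSketch is a hypothetical class, not part
// of edu.stanford.nlp. It mimics a next()/pushBack() pair with one token of
// lookahead.
final class PushBackSketch {
    private final Iterator<String> source;
    private String previous;      // the token most recently returned by next()
    private boolean pushedBack;   // true if 'previous' must be returned again

    PushBackSketch(Iterator<String> source) {
        this.source = source;
    }

    // Returns the next token. Assumption of this sketch: null signals the
    // end of input.
    String next() {
        if (pushedBack) {
            pushedBack = false;
            return previous;
        }
        previous = source.hasNext() ? source.next() : null;
        return previous;
    }

    // Makes the following call to next() return the previous token again.
    void pushBack() {
        pushedBack = true;
    }

    public static void main(String[] args) {
        PushBackSketch tokens =
                new PushBackSketch(Arrays.asList("(", "NP", ")").iterator());
        tokens.next();                      // consume "("
        String peeked = tokens.next();      // look at the token after it
        tokens.pushBack();                  // undo that read
        System.out.println(peeked.equals(tokens.next()));
    }
}
```

A caller can use this pattern to inspect the token after an opening parenthesis and then hand the unchanged stream to another routine.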
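public static void main(String[] args) throws IOException
The main() method tokenizes a file in the specified encoding and prints it to standard output in the specified encoding.
Throws:
IOException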
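Both the constructor and main() depend on the caller choosing a character encoding. As a sketch of the java.io.InputStreamReader route mentioned in the constructor notes, the example below builds a Reader with an explicit charset and writes output through a Writer with the same charset. EncodingReaderSketch and its helper methods are hypothetical names, and UTF-8 is used only as a placeholder for whatever encoding the treebank files actually use. A Reader built this way is the kind of argument the CHTBTokenizer(Reader) constructor accepts.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: EncodingReaderSketch is a hypothetical helper,
// not part of edu.stanford.nlp.
final class EncodingReaderSketch {

    // Wraps a byte stream in a Reader that decodes with the named charset,
    // instead of relying on the platform default.
    static Reader open(InputStream in, String encodingName) {
        Charset charset = Charset.forName(encodingName);
        return new BufferedReader(new InputStreamReader(in, charset));
    }

    // Reads the whole Reader into a String (for demonstration purposes).
    static String readAll(Reader reader) {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        try {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Placeholder encoding; substitute the encoding of the real files.
        String encodingName = "UTF-8";
        byte[] bytes = "(NN \u4e2d\u56fd)".getBytes(StandardCharsets.UTF_8);

        Reader reader = open(new ByteArrayInputStream(bytes), encodingName);

        // Print to standard output in the same, explicitly chosen encoding.
        PrintWriter out = new PrintWriter(
                new OutputStreamWriter(System.out, Charset.forName(encodingName)), true);
        out.println(readAll(reader));
    }
}
```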