edu.stanford.nlp.process
Class PTBEscapingProcessor
java.lang.Object
edu.stanford.nlp.process.PTBEscapingProcessor
- All Implemented Interfaces:
- Processor
- public class PTBEscapingProcessor
- extends Object
- implements Processor
Produces a new Document of Words in which the special characters of the
Penn Treebank (PTB) format have been properly escaped.
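As a rough illustration of what such escaping looks like, the sketch below maps bracket tokens to their conventional PTB replacements (-LRB-, -RRB-, and so on). It is a standalone example under stated assumptions: the class name PtbEscapeSketch and its escape method are hypothetical, it works on plain Strings rather than Words, and it is not the code of PTBEscapingProcessor itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: hypothetical names, not the Stanford implementation.
class PtbEscapeSketch {
    // Bracket tokens and their conventional PTB replacements.
    static final Map<String, String> SUBS = new LinkedHashMap<>();
    static {
        SUBS.put("(", "-LRB-");
        SUBS.put(")", "-RRB-");
        SUBS.put("[", "-LSB-");
        SUBS.put("]", "-RSB-");
        SUBS.put("{", "-LCB-");
        SUBS.put("}", "-RCB-");
    }

    // Builds and returns a new list; the input list is left unmodified.
    static List<String> escape(List<String> tokens) {
        List<String> out = new ArrayList<>(tokens.size());
        for (String t : tokens) {
            // Tokens without a substitution pass through unchanged.
            out.add(SUBS.getOrDefault(t, t));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(escape(Arrays.asList("He", "(", "John", ")", "left")));
    }
}
```

Returning a fresh list mirrors the idea of producing a new Document instead of changing the one passed in.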
- Author:
- Teg Grenager (grenager@stanford.edu)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
stringSubs
protected Map stringSubs
oldChars
protected char[] oldChars
oldStrings
protected static final String[] oldStrings
newStrings
protected static final String[] newStrings
defaultOldChars
protected static final char[] defaultOldChars
fixQuotes
protected boolean fixQuotes
PTBEscapingProcessor
public PTBEscapingProcessor()
PTBEscapingProcessor
public PTBEscapingProcessor(Map stringSubs,
char[] oldChars,
boolean fixQuotes)
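This page does not describe the three constructor arguments. One plausible reading, offered here only as an assumption, is that stringSubs replaces whole tokens, oldChars lists characters that receive a backslash prefix, and fixQuotes turns plain double-quote tokens into PTB-style opening and closing quotes. The sketch below illustrates the last two ideas; the class CharEscapeSketch, its methods, and the choice of '*' and '/' as defaults are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical names and assumed behavior,
// not the code of PTBEscapingProcessor.
class CharEscapeSketch {
    // Assumed default set of characters to backslash-escape.
    static final char[] OLD_CHARS = {'*', '/'};

    // Prefixes every occurrence of a listed character with a backslash.
    static String escapeChars(String token, char[] oldChars) {
        StringBuilder sb = new StringBuilder(token.length() + 2);
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            for (char old : oldChars) {
                if (c == old) {
                    sb.append('\\');
                    break;
                }
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Alternates plain double-quote tokens between `` (opening) and '' (closing).
    static List<String> fixQuotes(List<String> tokens) {
        List<String> out = new ArrayList<>(tokens.size());
        boolean inside = false;
        for (String t : tokens) {
            if (t.equals("\"")) {
                out.add(inside ? "''" : "``");
                inside = !inside;
            } else {
                out.add(t);
            }
        }
        return out;
    }
}
```

The quote alternation is deliberately simple; it assumes quotes are balanced and arrive as separate tokens.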
makeStringMap
protected static Map makeStringMap()
process
public Document process(Document input)
- Description copied from interface: Processor
- Converts a Document to a different Document, by transforming
or filtering the original Document. The general contract of this method
is to not modify the in Document in any way, and to
preserve the metadata of the in Document in the returned Document.
- Specified by:
process in interface Processor
- See Also:
FunctionProcessor
process
public List process(List input)
- Parameters:
input - must be a List of objects of type HasWord
main
public static void main(String[] args)
- Escapes the tokens of an input file. The input file must already be tokenized,
with tokens separated by whitespace.
Usage: java edu.stanford.nlp.process.PTBEscapingProcessor fileOrUrl
- Parameters:
args - Command line argument: a file or URL
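The input format described for main (pre-tokenized text, tokens separated by whitespace) can be sketched as follows. This is a minimal standalone example, not the library's main method: the class EscapeLinesSketch and its methods are hypothetical, it takes the text as a String instead of reading a file or URL, and it handles only bracket tokens.

```java
// Illustrative sketch only: hypothetical names, not the Stanford implementation.
class EscapeLinesSketch {
    // Maps a single bracket token to its conventional PTB form.
    static String escapeToken(String t) {
        switch (t) {
            case "(": return "-LRB-";
            case ")": return "-RRB-";
            case "[": return "-LSB-";
            case "]": return "-RSB-";
            case "{": return "-LCB-";
            case "}": return "-RCB-";
            default:  return t;
        }
    }

    // Escapes whitespace-separated tokens line by line, keeping line breaks.
    static String escapeText(String text) {
        String[] lines = text.split("\\r?\\n");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n');
            }
            StringBuilder sb = new StringBuilder();
            for (String t : lines[i].trim().split("\\s+")) {
                if (t.isEmpty()) {
                    continue; // blank line yields one empty token
                }
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(escapeToken(t));
            }
            out.append(sb);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeText("a ( b )\n[ c ]"));
    }
}
```

Because tokens are matched whole, a bracket attached to a word (for example "(b") would pass through unchanged, which is why the input must already be tokenized.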
Stanford NLP Group