edu.stanford.nlp.process
Class PTBEscapingProcessor

java.lang.Object
  extended by edu.stanford.nlp.process.PTBEscapingProcessor
All Implemented Interfaces:
Processor

public class PTBEscapingProcessor
extends Object
implements Processor

Produces a new Document of Words in which the special characters of the Penn Treebank (PTB) have been properly escaped.
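
As a rough illustration of what escaping means here, the sketch below maps bracket tokens to the replacements conventionally used in Penn Treebank text (-LRB-, -RRB-, and so on). The class PtbEscapeSketch and its escapeToken method are hypothetical and are not part of this package. The substitutions actually applied by PTBEscapingProcessor live in its stringSubs, oldStrings, and newStrings fields, whose contents this page does not list, so the mapping shown is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; not the library's implementation.
class PtbEscapeSketch {
    // Assumed mapping: conventional Penn Treebank bracket tokens.
    private static final Map<String, String> SUBS = new HashMap<>();
    static {
        SUBS.put("(", "-LRB-");
        SUBS.put(")", "-RRB-");
        SUBS.put("[", "-LSB-");
        SUBS.put("]", "-RSB-");
        SUBS.put("{", "-LCB-");
        SUBS.put("}", "-RCB-");
    }

    // Returns the escaped form of a whole token, or the token unchanged
    // when no substitution is registered for it.
    static String escapeToken(String token) {
        return SUBS.getOrDefault(token, token);
    }
}
```

A whole-token lookup keeps ordinary words untouched: only a token that is exactly a bracket is rewritten.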

Author:
Teg Grenager (grenager@stanford.edu)

Field Summary
protected static char[] defaultOldChars
           
protected  boolean fixQuotes
           
protected static String[] newStrings
           
protected  char[] oldChars
           
protected static String[] oldStrings
           
protected  Map stringSubs
           
 
Constructor Summary
PTBEscapingProcessor()
           
PTBEscapingProcessor(Map stringSubs, char[] oldChars, boolean fixQuotes)
           
 
Method Summary
static void main(String[] args)
          This will do the escaping on an input file.
protected static Map makeStringMap()
           
 Document process(Document input)
          Converts a Document to a different Document, by transforming or filtering the original Document.
 List process(List input)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

stringSubs

protected Map stringSubs

oldChars

protected char[] oldChars

oldStrings

protected static final String[] oldStrings

newStrings

protected static final String[] newStrings

defaultOldChars

protected static final char[] defaultOldChars

fixQuotes

protected boolean fixQuotes
Constructor Detail

PTBEscapingProcessor

public PTBEscapingProcessor()

PTBEscapingProcessor

public PTBEscapingProcessor(Map stringSubs,
                            char[] oldChars,
                            boolean fixQuotes)
Method Detail

makeStringMap

protected static Map makeStringMap()

process

public Document process(Document input)
Description copied from interface: Processor
Converts a Document to a different Document, by transforming or filtering the original Document. The general contract of this method is not to modify the input Document in any way, and to preserve the metadata of the input Document in the returned Document.

Specified by:
process in interface Processor
See Also:
FunctionProcessor

process

public List process(List input)
Parameters:
input - must be a List of objects of type HasWord
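
The shape of a list-based escaping pass can be sketched as follows. This is a minimal, self-contained illustration and not the code of this class: the Token interface is a hypothetical stand-in for HasWord that exposes only a word() accessor, and the backslash escaping of '/' and '*' is an assumed example of character-level escaping. The sketch builds new tokens rather than changing the input, in keeping with the contract stated for process(Document); this page does not say whether process(List) behaves the same way.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names here are not part of the library.
class ListEscapeSketch {
    // Stand-in for HasWord: only the accessor the sketch needs.
    interface Token {
        String word();
    }

    static final class SimpleToken implements Token {
        private final String word;
        SimpleToken(String word) { this.word = word; }
        public String word() { return word; }
    }

    // Builds a new list; the input list and its elements are left untouched.
    static List<Token> process(List<? extends Token> input) {
        List<Token> output = new ArrayList<>(input.size());
        for (Token t : input) {
            // Assumed escape for illustration: a backslash before '/' and '*'.
            String escaped = t.word().replace("/", "\\/").replace("*", "\\*");
            output.add(new SimpleToken(escaped));
        }
        return output;
    }
}
```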

main

public static void main(String[] args)
This will do the escaping on an input file. The input file must already be tokenized, with tokens separated by whitespace.
Usage: java edu.stanford.nlp.process.PTBEscapingProcessor fileOrUrl

Parameters:
args - Command line argument: a file or URL
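
The expected input format can be pictured with a small sketch that splits one line on whitespace, escapes each token, and joins the results with single spaces. The class LineEscapeSketch is hypothetical, and the two parenthesis substitutions are assumed for illustration; the file and URL handling performed by main is not reproduced.

```java
import java.util.StringJoiner;

// Hypothetical sketch of processing one line of pre-tokenized text.
class LineEscapeSketch {
    // Assumed substitutions: conventional PTB tokens for parentheses.
    static String escapeToken(String token) {
        switch (token) {
            case "(": return "-LRB-";
            case ")": return "-RRB-";
            default: return token;
        }
    }

    // Splits on runs of whitespace, escapes each token, rejoins with spaces.
    static String escapeLine(String line) {
        StringJoiner joined = new StringJoiner(" ");
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                joined.add(escapeToken(token));
            }
        }
        return joined.toString();
    }
}
```

Because the split is purely on whitespace, a bracket attached to a word (for example "(quietly") would not be recognized as a separate token, which is why the input needs to be tokenized beforehand.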


Stanford NLP Group