edu.stanford.nlp.process
Class PTBEscapingProcessor

java.lang.Object
  extended by edu.stanford.nlp.process.AbstractListProcessor
      extended by edu.stanford.nlp.process.PTBEscapingProcessor
All Implemented Interfaces:
Function<java.util.List<HasWord>,java.util.List<HasWord>>, ListProcessor, Processor, java.io.Serializable

public class PTBEscapingProcessor
extends AbstractListProcessor
implements Function<java.util.List<HasWord>,java.util.List<HasWord>>

Produces a new Document of Words in which characters that are special in the Penn Treebank (PTB) format have been properly escaped.
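As a rough illustration of what token-level PTB escaping means, the sketch below replaces bracket tokens with their customary Penn Treebank forms (-LRB-, -RRB-, and so on). It is a standalone example that does not use the Stanford NLP classes. The class name PtbTokenEscapeSketch and the method escapeTokens are hypothetical, and whether PTBEscapingProcessor uses exactly this substitution table is an assumption, since this page does not list the contents of oldStrings and newStrings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Standalone illustration only; not the library's implementation.
final class PtbTokenEscapeSketch {

    // Customary PTB replacements for bracket tokens (assumed table).
    static final Map<String, String> BRACKET_SUBS = new LinkedHashMap<>();
    static {
        BRACKET_SUBS.put("(", "-LRB-");
        BRACKET_SUBS.put(")", "-RRB-");
        BRACKET_SUBS.put("[", "-LSB-");
        BRACKET_SUBS.put("]", "-RSB-");
        BRACKET_SUBS.put("{", "-LCB-");
        BRACKET_SUBS.put("}", "-RCB-");
    }

    // Returns a new list in which every token that exactly matches a key
    // of BRACKET_SUBS is replaced; all other tokens are copied unchanged.
    static List<String> escapeTokens(List<String> tokens) {
        List<String> out = new ArrayList<>(tokens.size());
        for (String token : tokens) {
            out.add(BRACKET_SUBS.getOrDefault(token, token));
        }
        return out;
    }
}
```

Under these assumptions, calling escapeTokens on the tokens He ( she ) left yields He -LRB- she -RRB- left.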

See Also:
Serialized Form

Field Summary
protected static char[] defaultOldChars
           
protected  boolean fixQuotes
           
protected static java.lang.String[] newStrings
           
protected  char[] oldChars
           
protected static java.lang.String[] oldStrings
           
protected  java.util.Map stringSubs
           
 
Constructor Summary
PTBEscapingProcessor()
           
PTBEscapingProcessor(java.util.Map stringSubs, char[] oldChars, boolean fixQuotes)
           
 
Method Summary
 java.util.List<HasWord> apply(java.util.List<HasWord> hasWordsList)
          Escape a List of HasWords.
static void main(java.lang.String[] args)
          Escapes the contents of an input file.
protected static java.util.Map makeStringMap()
           
 java.util.List process(java.util.List input)
          Take a List (including a Sentence) of input, and return a List that has been processed in some way.
 
Methods inherited from class edu.stanford.nlp.process.AbstractListProcessor
processDocument, processLists
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

stringSubs

protected java.util.Map stringSubs

oldChars

protected char[] oldChars

oldStrings

protected static final java.lang.String[] oldStrings

newStrings

protected static final java.lang.String[] newStrings

defaultOldChars

protected static final char[] defaultOldChars

fixQuotes

protected boolean fixQuotes

Constructor Detail

PTBEscapingProcessor

public PTBEscapingProcessor()

PTBEscapingProcessor

public PTBEscapingProcessor(java.util.Map stringSubs,
                            char[] oldChars,
                            boolean fixQuotes)
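
This page does not document the three constructor parameters. The sketch below shows one plausible reading of their names: stringSubs as a table of whole-token replacements, oldChars as characters to prefix with a backslash, and fixQuotes as a switch that rewrites straight double-quote tokens as `` and ''. All of this behaviour is an assumption made for illustration, and the class EscaperConfigSketch is hypothetical rather than part of the library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in showing how the three settings might interact.
final class EscaperConfigSketch {
    private final Map<String, String> stringSubs; // assumed: whole-token replacements
    private final char[] oldChars;                // assumed: chars to backslash-escape
    private final boolean fixQuotes;              // assumed: rewrite " tokens as `` and ''

    EscaperConfigSketch(Map<String, String> stringSubs, char[] oldChars, boolean fixQuotes) {
        this.stringSubs = new HashMap<>(stringSubs);
        this.oldChars = oldChars.clone();
        this.fixQuotes = fixQuotes;
    }

    List<String> escape(List<String> tokens) {
        List<String> out = new ArrayList<>(tokens.size());
        boolean openQuote = true; // the next " token opens a quotation
        for (String token : tokens) {
            if (fixQuotes && token.equals("\"")) {
                out.add(openQuote ? "``" : "''");
                openQuote = !openQuote;
            } else if (stringSubs.containsKey(token)) {
                out.add(stringSubs.get(token));
            } else {
                out.add(backslashEscape(token));
            }
        }
        return out;
    }

    // Inserts a backslash before every character listed in oldChars.
    private String backslashEscape(String token) {
        StringBuilder sb = new StringBuilder(token.length() + 2);
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            for (char old : oldChars) {
                if (c == old) {
                    sb.append('\\');
                    break;
                }
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

The sketch alternates opening and closing quotes from the start of the list, which is the simplest possible policy; the actual class may decide quote direction differently.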

Method Detail

makeStringMap

protected static java.util.Map makeStringMap()

apply

public java.util.List<HasWord> apply(java.util.List<HasWord> hasWordsList)
Escape a List of HasWords. Implements the Function<List<HasWord>, List<HasWord>> interface.

Specified by:
apply in interface Function<java.util.List<HasWord>,java.util.List<HasWord>>

process

public java.util.List process(java.util.List input)
Description copied from interface: ListProcessor
Take a List (including a Sentence) of input, and return a List that has been processed in some way.

Specified by:
process in interface ListProcessor
Parameters:
input - must be a List of objects of type HasWord

main

public static void main(java.lang.String[] args)
Escapes the contents of an input file. The input file must already be tokenized, with tokens separated by whitespace.
Usage: java edu.stanford.nlp.process.PTBEscapingProcessor fileOrUrl

Parameters:
args - Command line argument: a file or URL
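
The input requirement described above (tokens already separated by whitespace) can be sketched as a simple line-by-line loop. The example below is standalone and does not call the library. The class EscapeFileSketch, the method escapeLines, and the decision to emit one output line per input line are assumptions for illustration, and the substitution table is supplied by the caller rather than taken from this class.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: escapes whitespace-separated tokens line by line.
final class EscapeFileSketch {

    // Reads all lines from source, replaces each token found in subs,
    // and rejoins the tokens of each line with single spaces.
    static List<String> escapeLines(Reader source, Map<String, String> subs) {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                StringBuilder sb = new StringBuilder();
                for (String token : line.trim().split("\\s+")) {
                    if (token.isEmpty()) {
                        continue; // blank line: nothing to escape
                    }
                    if (sb.length() > 0) {
                        sb.append(' ');
                    }
                    sb.append(subs.getOrDefault(token, token));
                }
                result.add(sb.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

Passing a java.io.FileReader for a tokenized file, together with a map such as ( to -LRB- and ) to -RRB-, would give an approximation of the command-line behaviour under these assumptions.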