edu.stanford.nlp.process
Class PTBEscapingProcessor

java.lang.Object
  extended by edu.stanford.nlp.process.AbstractListProcessor
      extended by edu.stanford.nlp.process.PTBEscapingProcessor
All Implemented Interfaces:
ListProcessor, Processor, Function<List<HasWord>,List<HasWord>>

public class PTBEscapingProcessor
extends AbstractListProcessor
implements Function<List<HasWord>,List<HasWord>>

Produces a new Document of Words in which characters that are special in Penn Treebank (PTB) notation have been escaped.
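
As a rough illustration of what escaping PTB special characters means, the sketch below maps bracket tokens to their conventional Penn Treebank symbols. It is a standalone approximation and not this class's implementation: the class name PtbBracketEscapeSketch, the BRACKETS table, and the use of plain String tokens in place of HasWord objects are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; hypothetical class, not PTBEscapingProcessor itself.
final class PtbBracketEscapeSketch {

    // Bracket tokens and their conventional Penn Treebank replacements.
    static final Map<String, String> BRACKETS = new LinkedHashMap<>();
    static {
        BRACKETS.put("(", "-LRB-");
        BRACKETS.put(")", "-RRB-");
        BRACKETS.put("[", "-LSB-");
        BRACKETS.put("]", "-RSB-");
        BRACKETS.put("{", "-LCB-");
        BRACKETS.put("}", "-RCB-");
    }

    // Returns a new list; the input list is left unmodified.
    static List<String> escape(List<String> tokens) {
        List<String> out = new ArrayList<>(tokens.size());
        for (String token : tokens) {
            // Replace whole tokens that are brackets; pass others through.
            out.add(BRACKETS.getOrDefault(token, token));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("He", "left", "(", "early", ")", ".");
        System.out.println(escape(tokens));
        // prints [He, left, -LRB-, early, -RRB-, .]
    }
}
```

The sketch returns a fresh list rather than editing in place, mirroring the description above of producing a new Document.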

Author:
Teg Grenager (grenager@stanford.edu)

Field Summary
protected static char[] defaultOldChars
           
protected  boolean fixQuotes
           
protected static String[] newStrings
           
protected  char[] oldChars
           
protected static String[] oldStrings
           
protected  Map<String,String> stringSubs
           
 
Constructor Summary
PTBEscapingProcessor()
           
PTBEscapingProcessor(Map<String,String> stringSubs, char[] oldChars, boolean fixQuotes)
           
 
Method Summary
 List<HasWord> apply(List<HasWord> hasWordsList)
          Escape a List of HasWords.
static void main(String[] args)
          Escapes the contents of an input file.
protected static Map<String,String> makeStringMap()
           
 List process(List input)
          Take a List (including a Sentence) of input, and return a List that has been processed in some way.
static String unprocess(String s)
           
 
Methods inherited from class edu.stanford.nlp.process.AbstractListProcessor
processDocument, processLists
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

stringSubs

protected Map<String,String> stringSubs

oldChars

protected char[] oldChars

oldStrings

protected static final String[] oldStrings

newStrings

protected static final String[] newStrings

defaultOldChars

protected static final char[] defaultOldChars

fixQuotes

protected boolean fixQuotes

Constructor Detail

PTBEscapingProcessor

public PTBEscapingProcessor()

PTBEscapingProcessor

public PTBEscapingProcessor(Map<String,String> stringSubs,
                            char[] oldChars,
                            boolean fixQuotes)
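
This page does not describe the three constructor parameters, so the following sketch shows one plausible reading of them and should not be taken as the documented behavior. It assumes that stringSubs maps whole tokens to replacements, that each character in oldChars is escaped by prefixing a backslash, and that fixQuotes turns straight double-quote tokens into alternating `` and '' tokens. The class ConfigurableEscaperSketch and its methods are hypothetical, and String tokens stand in for HasWord objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in; the parameter semantics below are assumptions.
final class ConfigurableEscaperSketch {
    private final Map<String, String> stringSubs;
    private final char[] oldChars;
    private final boolean fixQuotes;

    ConfigurableEscaperSketch(Map<String, String> stringSubs,
                              char[] oldChars,
                              boolean fixQuotes) {
        // Defensive copies so later caller changes do not affect this object.
        this.stringSubs = new HashMap<>(stringSubs);
        this.oldChars = oldChars.clone();
        this.fixQuotes = fixQuotes;
    }

    List<String> escape(List<String> tokens) {
        List<String> out = new ArrayList<>(tokens.size());
        boolean opening = true; // the next straight quote opens a quotation
        for (String token : tokens) {
            if (fixQuotes && token.equals("\"")) {
                out.add(opening ? "``" : "''");
                opening = !opening;
            } else if (stringSubs.containsKey(token)) {
                out.add(stringSubs.get(token)); // whole-token substitution
            } else {
                out.add(backslashEscape(token));
            }
        }
        return out;
    }

    // Prefixes every occurrence of a character in oldChars with a backslash.
    private String backslashEscape(String token) {
        StringBuilder sb = new StringBuilder(token.length());
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (isOldChar(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    private boolean isOldChar(char c) {
        for (char old : oldChars) {
            if (old == c) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions, a processor built with a map containing "(" to "-LRB-", the character '/', and fixQuotes set to true would turn the tokens ", a/b, (, " into ``, a\/b, -LRB-, ''.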

Method Detail

makeStringMap

protected static Map<String,String> makeStringMap()

apply

public List<HasWord> apply(List<HasWord> hasWordsList)
Escape a List of HasWords. Implements the Function<List<HasWord>, List<HasWord>> interface.

Specified by:
apply in interface Function<List<HasWord>,List<HasWord>>
Parameters:
hasWordsList - the List of HasWord tokens to process
Returns:
the List of HasWord tokens produced by processing the input

unprocess

public static String unprocess(String s)
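
No description is given for unprocess. Judging by its name and signature, it presumably reverses the escaping on a string, and the sketch below shows what such an inverse could look like. This is an assumption rather than the documented contract: the class PtbUnescapeSketch, the method unescape, and the particular set of replacements are chosen only for illustration.

```java
// Hypothetical inverse of PTB-style escaping; not the library's unprocess.
final class PtbUnescapeSketch {

    // Escaped forms and the original text each one is assumed to stand for.
    private static final String[][] PAIRS = {
        {"-LRB-", "("}, {"-RRB-", ")"},
        {"-LSB-", "["}, {"-RSB-", "]"},
        {"-LCB-", "{"}, {"-RCB-", "}"},
        {"\\/", "/"}, {"\\*", "*"},
    };

    static String unescape(String s) {
        String result = s;
        for (String[] pair : PAIRS) {
            // String.replace performs literal (non-regex) replacement.
            result = result.replace(pair[0], pair[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(unescape("-LRB- a\\/b -RRB-"));
        // prints ( a/b )
    }
}
```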

process

public List process(List input)
Description copied from interface: ListProcessor
Take a List (including a Sentence) of input, and return a List that has been processed in some way.

Specified by:
process in interface ListProcessor
Parameters:
input - must be a List of objects of type HasWord

main

public static void main(String[] args)
Escapes the contents of an input file. The input file must already be tokenized, with tokens separated by whitespace.
Usage: java edu.stanford.nlp.process.PTBEscapingProcessor fileOrUrl

Parameters:
args - Command line argument: a file or URL


Stanford NLP Group