edu.stanford.nlp.process
Class PTBEscapingProcessor<IN extends HasWord,L,F>

java.lang.Object
  extended by edu.stanford.nlp.process.AbstractListProcessor<IN,HasWord,L,F>
      extended by edu.stanford.nlp.process.PTBEscapingProcessor<IN,L,F>
Type Parameters:
IN - The type of the input items, which must implement HasWord
L - The type of the labels
F - The type of the features
All Implemented Interfaces:
DocumentProcessor<IN,HasWord,L,F>, ListProcessor<IN,HasWord>, Function<List<IN>,List<HasWord>>

public class PTBEscapingProcessor<IN extends HasWord,L,F>
extends AbstractListProcessor<IN,HasWord,L,F>
implements Function<List<IN>,List<HasWord>>

Produces a new Document of Words in which characters that are special in the Penn Treebank (PTB) format have been escaped.

Author:
Teg Grenager (grenager@stanford.edu), Sarah Spikes (sdspikes@cs.stanford.edu) (Templatization)

Field Summary
protected  char[] escapeChars
           
protected  boolean fixQuotes
           
protected  String[] replaceEscapes
           
protected  String[] replaceSubsts
           
protected  char[] substChars
           
 
Constructor Summary
PTBEscapingProcessor()
           
PTBEscapingProcessor(char[] escapeChars, String[] replaceEscapes, char[] substChars, String[] replaceSubsts, boolean fixQuotes)
           
 
Method Summary
 List<HasWord> apply(List<IN> hasWordsList)
          Escape a List of HasWords.
 String escapeString(String s)
           
static void main(String[] args)
          Escapes the contents of an input file.
 List<HasWord> process(List<? extends IN> input)
          Take a List (including a Sentence) of input, and return a List that has been processed in some way.
static String unprocess(String s)
           
 
Methods inherited from class edu.stanford.nlp.process.AbstractListProcessor
processDocument, processLists
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

substChars

protected char[] substChars

replaceSubsts

protected String[] replaceSubsts

escapeChars

protected char[] escapeChars

replaceEscapes

protected String[] replaceEscapes

fixQuotes

protected boolean fixQuotes
Constructor Detail

PTBEscapingProcessor

public PTBEscapingProcessor()

PTBEscapingProcessor

public PTBEscapingProcessor(char[] escapeChars,
                            String[] replaceEscapes,
                            char[] substChars,
                            String[] replaceSubsts,
                            boolean fixQuotes)
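
The constructor takes two pairs of parallel arrays. A plausible reading is that each character at index i is replaced by the string at the same index in its companion array. The following is a minimal standalone sketch of that idea, not the library's implementation: the class name ParallelArrayEscaper, the ptbDefaults() tables (conventional PTB bracket tokens and backslash escapes), and the lookup order are all assumptions, and characters that are already escaped in the input are not treated specially.

```java
/** Standalone sketch of escaping driven by parallel arrays (hypothetical class). */
class ParallelArrayEscaper {
    private final char[] escapeChars;
    private final String[] replaceEscapes;
    private final char[] substChars;
    private final String[] replaceSubsts;

    ParallelArrayEscaper(char[] escapeChars, String[] replaceEscapes,
                         char[] substChars, String[] replaceSubsts) {
        // Each char array must line up index-for-index with its replacement array.
        if (escapeChars.length != replaceEscapes.length
                || substChars.length != replaceSubsts.length) {
            throw new IllegalArgumentException("parallel arrays must have equal lengths");
        }
        this.escapeChars = escapeChars.clone();
        this.replaceEscapes = replaceEscapes.clone();
        this.substChars = substChars.clone();
        this.replaceSubsts = replaceSubsts.clone();
    }

    /** Assumed tables: conventional PTB bracket tokens plus backslash escapes. */
    static ParallelArrayEscaper ptbDefaults() {
        return new ParallelArrayEscaper(
                new char[] {'/', '*'},
                new String[] {"\\/", "\\*"},
                new char[] {'(', ')', '[', ']', '{', '}'},
                new String[] {"-LRB-", "-RRB-", "-LSB-", "-RSB-", "-LCB-", "-RCB-"});
    }

    /** Replaces every listed character with the string at the same index. */
    String escapeString(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            String replacement = lookup(c, substChars, replaceSubsts);
            if (replacement == null) {
                replacement = lookup(c, escapeChars, replaceEscapes);
            }
            if (replacement == null) {
                out.append(c);
            } else {
                out.append(replacement);
            }
        }
        return out.toString();
    }

    private static String lookup(char c, char[] keys, String[] values) {
        for (int i = 0; i < keys.length; i++) {
            if (keys[i] == c) {
                return values[i];
            }
        }
        return null; // character is not in this table
    }

    public static void main(String[] args) {
        System.out.println(ptbDefaults().escapeString("f(x)/2")); // prints f-LRB-x-RRB-\/2
    }
}
```

Keeping the tables as constructor arguments means a caller could swap in a different escaping scheme without changing the loop.
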
Method Detail

apply

public List<HasWord> apply(List<IN> hasWordsList)
Escape a List of HasWords. Implements the Function<List<IN>,List<HasWord>> interface.

Specified by:
apply in interface Function<List<IN extends HasWord>,List<HasWord>>
Parameters:
hasWordsList - The List of HasWord objects to escape
Returns:
The List of HasWord objects after escaping

unprocess

public static String unprocess(String s)
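
This page gives no description for unprocess. Assuming it reverses the bracket substitution, so that tokens such as -LRB- become literal brackets again, the idea can be sketched as below. The class PtbUnescapeSketch and the method unescapeBrackets are hypothetical names. The token table is the conventional PTB one, and backslash escapes are deliberately left untouched.

```java
/** Standalone sketch of reversing PTB bracket tokens (hypothetical class). */
class PtbUnescapeSketch {
    // Parallel tables: TOKENS[i] is turned back into BRACKETS[i].
    private static final String[] TOKENS =
            {"-LRB-", "-RRB-", "-LSB-", "-RSB-", "-LCB-", "-RCB-"};
    private static final String[] BRACKETS =
            {"(", ")", "[", "]", "{", "}"};

    /** Replaces each PTB bracket token with the literal bracket it stands for. */
    static String unescapeBrackets(String s) {
        for (int i = 0; i < TOKENS.length; i++) {
            // String.replace does literal (non-regex) replacement of all occurrences.
            s = s.replace(TOKENS[i], BRACKETS[i]);
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(unescapeBrackets("-LRB- NP -RRB-")); // prints ( NP )
    }
}
```
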

process

public List<HasWord> process(List<? extends IN> input)
Description copied from interface: ListProcessor
Take a List (including a Sentence) of input, and return a List that has been processed in some way.

Specified by:
process in interface ListProcessor<IN extends HasWord,HasWord>
Parameters:
input - must be a List of objects of type HasWord
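
List-level processing can be pictured as applying a string escaper to the word of each element. The sketch below is a simplification under stated assumptions: Token is a hypothetical stand-in for HasWord with only word() and setWord(), the escaper is passed in as a function, and each element is rewritten in place. Whether the library copies elements or applies further steps (for example when fixQuotes is set) is not stated on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

/** Standalone sketch of escaping every word in a list (hypothetical class). */
class ListEscapeSketch {
    /** Hypothetical stand-in for HasWord: a holder for one mutable word. */
    static class Token {
        private String word;
        Token(String word) { this.word = word; }
        String word() { return word; }
        void setWord(String word) { this.word = word; }
    }

    /** Applies the escaper to each token's word and collects the tokens in order. */
    static List<Token> process(List<? extends Token> input, UnaryOperator<String> escaper) {
        List<Token> output = new ArrayList<>(input.size());
        for (Token t : input) {
            t.setWord(escaper.apply(t.word())); // assumption: rewrite in place
            output.add(t);
        }
        return output;
    }

    public static void main(String[] args) {
        // Illustrative escaper covering round brackets only.
        UnaryOperator<String> brackets =
                w -> w.replace("(", "-LRB-").replace(")", "-RRB-");
        List<Token> in = Arrays.asList(new Token("("), new Token("cat"), new Token(")"));
        for (Token t : process(in, brackets)) {
            System.out.println(t.word());
        }
    }
}
```
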

escapeString

public String escapeString(String s)

main

public static void main(String[] args)
Escapes the contents of an input file. The input file should already be tokenized, with tokens separated by whitespace.
Usage: java edu.stanford.nlp.process.PTBEscapingProcessor fileOrUrl

Parameters:
args - Command line argument: a file or URL


Stanford NLP Group