edu.stanford.nlp.process
Class PTBEscapingProcessor<IN extends HasWord,L,F>
java.lang.Object
edu.stanford.nlp.process.AbstractListProcessor<IN,HasWord,L,F>
edu.stanford.nlp.process.PTBEscapingProcessor<IN,L,F>
- Type Parameters:
L - The type of the labels
F - The type of the features
- All Implemented Interfaces:
- DocumentProcessor<IN,HasWord,L,F>, ListProcessor<IN,HasWord>, Function<List<IN>,List<HasWord>>
public class PTBEscapingProcessor<IN extends HasWord,L,F>
- extends AbstractListProcessor<IN,HasWord,L,F>
- implements Function<List<IN>,List<HasWord>>
Produces a new list of words in which characters that are special in
Penn Treebank (PTB) notation have been escaped.
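As a rough illustration of what PTB escaping usually means, here is a minimal, self-contained sketch. It does not use this class or any Stanford NLP code. The class name PtbEscapeSketch, the substitution table, and the choice of '/' and '*' as backslash-escaped characters are assumptions based on common Penn Treebank conventions, not values read from stringSubs or defaultOldChars. The sketch also does not handle quote fixing or input that is already escaped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of PTB-style escaping; not the library implementation.
public class PtbEscapeSketch {
    // Whole-token substitutions (assumed conventional PTB bracket names).
    static final Map<String, String> STRING_SUBS = new LinkedHashMap<>();
    static {
        STRING_SUBS.put("(", "-LRB-");
        STRING_SUBS.put(")", "-RRB-");
        STRING_SUBS.put("[", "-LSB-");
        STRING_SUBS.put("]", "-RSB-");
        STRING_SUBS.put("{", "-LCB-");
        STRING_SUBS.put("}", "-RCB-");
    }

    // Characters that get a backslash prefix inside a token (assumed).
    static final char[] OLD_CHARS = {'/', '*'};

    // Escape one token: whole-token substitution first, then per-character escapes.
    public static String escapeToken(String token) {
        String whole = STRING_SUBS.get(token);
        if (whole != null) {
            return whole;
        }
        StringBuilder sb = new StringBuilder();
        for (char c : token.toCharArray()) {
            for (char old : OLD_CHARS) {
                if (c == old) {
                    sb.append('\\');
                    break;
                }
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Escape every token, returning a new list and leaving the input untouched.
    public static List<String> escapeTokens(List<String> tokens) {
        List<String> out = new ArrayList<>(tokens.size());
        for (String t : tokens) {
            out.add(escapeToken(t));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [He, said, -LRB-, 1\/2, -RRB-]
        System.out.println(escapeTokens(Arrays.asList("He", "said", "(", "1/2", ")")));
    }
}
```

Returning a new list mirrors the class description, which speaks of producing new output rather than modifying the input in place.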
- Author:
- Teg Grenager (grenager@stanford.edu), Sarah Spikes (sdspikes@cs.stanford.edu) (Templatization)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
stringSubs
protected Map<String,String> stringSubs
oldChars
protected char[] oldChars
oldStrings
protected static final String[] oldStrings
newStrings
protected static final String[] newStrings
defaultOldChars
protected static final char[] defaultOldChars
fixQuotes
protected boolean fixQuotes
PTBEscapingProcessor
public PTBEscapingProcessor()
PTBEscapingProcessor
public PTBEscapingProcessor(Map<String,String> stringSubs,
char[] oldChars,
boolean fixQuotes)
makeStringMap
protected static Map<String,String> makeStringMap()
apply
public List<HasWord> apply(List<IN> hasWordsList)
- Escape a List of HasWords. Implements the
Function<List<IN>, List<HasWord>> interface.
- Specified by:
apply
in interface Function<List<IN extends HasWord>,List<HasWord>>
- Parameters:
hasWordsList
- The list of HasWord tokens to escape
- Returns:
- A list of HasWord tokens with PTB special characters escaped
unprocess
public static String unprocess(String s)
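The page gives no description for unprocess. Judging by its name and signature, it presumably reverses the escaping on a string. The following self-contained sketch shows one way such a reversal could look. The class PtbUnescapeSketch and its substitution pairs are hypothetical and follow common PTB conventions; they are not taken from newStrings or oldStrings.

```java
// Hypothetical sketch of reversing PTB-style escapes; not the library implementation.
public class PtbUnescapeSketch {
    // Escaped form paired with its original character (assumed conventions).
    static final String[][] PAIRS = {
        {"-LRB-", "("}, {"-RRB-", ")"},
        {"-LSB-", "["}, {"-RSB-", "]"},
        {"-LCB-", "{"}, {"-RCB-", "}"}
    };

    public static String unescape(String s) {
        // String.replace performs literal (non-regex) replacement.
        for (String[] pair : PAIRS) {
            s = s.replace(pair[0], pair[1]);
        }
        // Drop the backslash in front of '/' and '*'.
        return s.replace("\\/", "/").replace("\\*", "*");
    }

    public static void main(String[] args) {
        // prints ( 1/2 )
        System.out.println(unescape("-LRB- 1\\/2 -RRB-"));
    }
}
```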
process
public List<HasWord> process(List<IN> input)
- Description copied from interface:
ListProcessor
- Take a List (including a Sentence) of input, and return a
List that has been processed in some way.
- Specified by:
process
in interface ListProcessor<IN extends HasWord,HasWord>
- Parameters:
input
- must be a List of objects of type HasWord
main
public static void main(String[] args)
- This will do the escaping on an input file. Input file must already be tokenized,
with tokens separated by whitespace.
Usage: java edu.stanford.nlp.process.PTBEscapingProcessor fileOrUrl
- Parameters:
args
- Command line argument: a file or URL
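The requirement that input be pre-tokenized and whitespace-separated can be pictured with the sketch below. It is self-contained and does not call this class. TokenizedInputSketch, escapeText, and the bracketsOnly escaper are hypothetical names introduced only for illustration, and the per-token escaper is passed in as a parameter so any escaping rule can be plugged in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical sketch: apply a per-token escaper to whitespace-tokenized text.
public class TokenizedInputSketch {
    // Splits each line on whitespace, escapes each token, and rejoins with single spaces.
    public static List<String> escapeText(String text, UnaryOperator<String> escaper) {
        List<String> out = new ArrayList<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                out.add("");
                continue;
            }
            StringBuilder sb = new StringBuilder();
            for (String token : trimmed.split("\\s+")) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(escaper.apply(token));
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // A deliberately tiny escaper that handles round brackets only.
        UnaryOperator<String> bracketsOnly =
            t -> t.equals("(") ? "-LRB-" : t.equals(")") ? "-RRB-" : t;
        // prints [a -LRB- b -RRB-, c]
        System.out.println(escapeText("a ( b )\nc", bracketsOnly));
    }
}
```

Because tokens are found by splitting on whitespace, a bracket attached to a word (for example "(b") would not be matched as a token, which is why the input must be tokenized beforehand.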
Stanford NLP Group