edu.stanford.nlp.process
Class Americanize

java.lang.Object
  extended by edu.stanford.nlp.process.Americanize
All Implemented Interfaces:
Function<HasWord,HasWord>

public class Americanize
extends java.lang.Object
implements Function<HasWord,HasWord>

Takes a HasWord or String and returns an Americanized version of it. Optionally, it also normalizes month and day names to their capitalized forms. This is deterministic spelling conversion, so it cannot deal with certain cases involving complex ambiguities, but it can handle most of the simple cases of British to American conversion.

This list is still quite incomplete, but it covers some of the commonest cases found when running our parser or doing biomedical processing. To expand this list, we should probably look at: http://wordlist.sourceforge.net/ or http://home.comcast.net/~helenajole/Harry.html.
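As a rough illustration of what "deterministic spelling conversion" means here, the following is a minimal sketch of a table-driven converter. It is not the code of this class: the class name SpellingTableSketch, its method, and the four table entries are all hypothetical and chosen only for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the Stanford implementation.
class SpellingTableSketch {
    // A tiny example table (assumed entries). A real list would be much larger.
    private static final Map<String, String> BRITISH_TO_AMERICAN = new HashMap<>();
    static {
        BRITISH_TO_AMERICAN.put("colour", "color");
        BRITISH_TO_AMERICAN.put("centre", "center");
        BRITISH_TO_AMERICAN.put("tumour", "tumor");
        BRITISH_TO_AMERICAN.put("haemoglobin", "hemoglobin");
    }

    // Returns the mapped spelling if the word is in the table,
    // otherwise returns the input unchanged.
    static String americanize(String word) {
        String mapped = BRITISH_TO_AMERICAN.get(word);
        return mapped != null ? mapped : word;
    }

    public static void main(String[] args) {
        System.out.println(americanize("colour")); // prints "color"
        System.out.println(americanize("tyre"));   // not in the table: prints "tyre"
    }
}
```

Because the lookup sees one word at a time and no context, the same input always produces the same output. That is why an approach like this handles the simple cases but cannot resolve a word whose correct spelling depends on how it is used, and why a word missing from the list passes through unchanged.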

Author:
Christopher Manning

Field Summary
static int DONT_CAPITALIZE_TIMEX
           
 
Constructor Summary
Americanize()
           
Americanize(int flags)
          Make an object for Americanizing spelling.
 
Method Summary
static java.lang.String americanize(java.lang.String str)
          Convert the spelling of a word from British to American English.
static java.lang.String americanize(java.lang.String str, boolean capitalizeTimex)
          Convert the spelling of a word from British to American English.
 HasWord apply(HasWord w)
          Americanize the HasWord or String coming in.
static void main(java.lang.String[] args)
          Americanize and print the command line arguments.
static void setStaticCapitalizeTimex(boolean capitalizeTimex)
           
 java.lang.String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

DONT_CAPITALIZE_TIMEX

public static final int DONT_CAPITALIZE_TIMEX
See Also:
Constant Field Values
Constructor Detail

Americanize

public Americanize()

Americanize

public Americanize(int flags)
Make an object for Americanizing spelling.

Parameters:
flags - An integer representing bit flags. At present the only recognized flag is DONT_CAPITALIZE_TIMEX = 1, which suppresses capitalization of days of the week and months.
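The bit-flag convention described for this parameter can be sketched as follows. This is an assumption about how such a flag is typically tested, not a copy of the constructor's code; the class FlagSketch and the method shouldCapitalizeTimex are hypothetical names, and only the value 1 for DONT_CAPITALIZE_TIMEX comes from the parameter description.

```java
// Hypothetical illustration of testing a bit flag; not the Stanford implementation.
class FlagSketch {
    // Value taken from the parameter description: DONT_CAPITALIZE_TIMEX = 1.
    static final int DONT_CAPITALIZE_TIMEX = 1;

    // Capitalization stays on unless the DONT_CAPITALIZE_TIMEX bit is set.
    static boolean shouldCapitalizeTimex(int flags) {
        return (flags & DONT_CAPITALIZE_TIMEX) == 0;
    }

    public static void main(String[] args) {
        System.out.println(shouldCapitalizeTimex(0));                     // prints "true"
        System.out.println(shouldCapitalizeTimex(DONT_CAPITALIZE_TIMEX)); // prints "false"
    }
}
```

Testing the bit with a mask rather than comparing the whole integer means the check would keep working if further flags were ever combined into the same value.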
Method Detail

apply

public HasWord apply(HasWord w)
Americanize the HasWord or String coming in.

Specified by:
apply in interface Function<HasWord,HasWord>
Parameters:
w - A HasWord or String to convert to American if needed.
Returns:
Either the input or an Americanized version of it.

americanize

public static java.lang.String americanize(java.lang.String str)
Convert the spelling of a word from British to American English. This is deterministic spelling conversion, so it cannot deal with certain cases involving complex ambiguities, but it can handle most of the simple cases of British to American conversion. Month and day names will be capitalized unless you have changed the default setting.

Parameters:
str - The String to be Americanized
Returns:
The American spelling of the word.

americanize

public static java.lang.String americanize(java.lang.String str,
                                           boolean capitalizeTimex)
Convert the spelling of a word from British to American English. This is deterministic spelling conversion, so it cannot deal with certain cases involving complex ambiguities, but it can handle most of the simple cases of British to American conversion.

Parameters:
str - The String to be Americanized
capitalizeTimex - Whether to capitalize time expressions like month names in return value
Returns:
The American spelling of the word.
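The effect of a capitalizeTimex switch can be sketched like this. The class TimexSketch, its word set, and the choice to leave out "may" and "march" (which are ambiguous with ordinary words) are all assumptions made for this example; the actual list used by americanize is not documented on this page.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of optional day/month capitalization;
// not the Stanford implementation.
class TimexSketch {
    // Assumed word set. "may" and "march" are deliberately omitted in this
    // sketch because they are ambiguous with common non-date words.
    private static final Set<String> TIMEX = new HashSet<>(Arrays.asList(
        "monday", "tuesday", "wednesday", "thursday", "friday",
        "saturday", "sunday",
        "january", "february", "april", "june", "july", "august",
        "september", "october", "november", "december"));

    // Capitalizes the first letter of a known lowercase day or month name
    // when capitalizeTimex is true; otherwise returns the word unchanged.
    static String normalize(String word, boolean capitalizeTimex) {
        if (capitalizeTimex && TIMEX.contains(word)) {
            return Character.toUpperCase(word.charAt(0)) + word.substring(1);
        }
        return word;
    }

    public static void main(String[] args) {
        System.out.println(normalize("monday", true));  // prints "Monday"
        System.out.println(normalize("monday", false)); // prints "monday"
    }
}
```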

setStaticCapitalizeTimex

public static void setStaticCapitalizeTimex(boolean capitalizeTimex)

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

main

public static void main(java.lang.String[] args)
Americanize and print the command line arguments. This main method is just for debugging.

Parameters:
args - Command line arguments: a list of words


Stanford NLP Group