public class WordShapeClassifier
extends java.lang.Object
Modifier and Type | Field and Description |
---|---|
static int | NOWORDSHAPE |
static int | WORDSHAPECHINESE |
static int | WORDSHAPECHRIS1 |
static int | WORDSHAPECHRIS2 |
static int | WORDSHAPECHRIS2USELC |
static int | WORDSHAPECHRIS3 |
static int | WORDSHAPECHRIS3USELC |
static int | WORDSHAPECHRIS4 |
static int | WORDSHAPECLUSTER1 |
static int | WORDSHAPEDAN1 |
static int | WORDSHAPEDAN2 |
static int | WORDSHAPEDAN2BIO |
static int | WORDSHAPEDAN2BIOUSELC |
static int | WORDSHAPEDAN2USELC |
static int | WORDSHAPEDIGITS |
static int | WORDSHAPEJENNY1 |
static int | WORDSHAPEJENNY1USELC |
Modifier and Type | Method and Description |
---|---|
static int | lookupShaper(java.lang.String name) - Look up a shaper by a short String name. |
static void | main(java.lang.String[] args) - Usage: java edu.stanford.nlp.process.WordShapeClassifier [-wordShape name] string+ where name is an argument to lookupShaper. |
static java.lang.String | wordShape(java.lang.String inStr, int wordShaper) - Returns the result of applying the word shaper identified by the int to the given String. |
static java.lang.String | wordShape(java.lang.String inStr, int wordShaper, java.util.Collection<java.lang.String> knownLCWords) - Returns the result of applying the word shaper identified by the int to the given String, optionally using a collection of known lowercase words. |
static java.lang.String | wordShapeChris4(java.lang.String s) |
public static final int NOWORDSHAPE
public static final int WORDSHAPEDAN1
public static final int WORDSHAPECHRIS1
public static final int WORDSHAPEDAN2
public static final int WORDSHAPEDAN2USELC
public static final int WORDSHAPEDAN2BIO
public static final int WORDSHAPEDAN2BIOUSELC
public static final int WORDSHAPEJENNY1
public static final int WORDSHAPEJENNY1USELC
public static final int WORDSHAPECHRIS2
public static final int WORDSHAPECHRIS2USELC
public static final int WORDSHAPECHRIS3
public static final int WORDSHAPECHRIS3USELC
public static final int WORDSHAPECHRIS4
public static final int WORDSHAPEDIGITS
public static final int WORDSHAPECHINESE
public static final int WORDSHAPECLUSTER1
public static int lookupShaper(java.lang.String name)
name
- Shaper name. Known names have patterns along the lines of:
dan[12](bio)?(UseLC)?, jenny1(useLC)?, chris[1234](useLC)?, cluster1.
public static java.lang.String wordShape(java.lang.String inStr, int wordShaper)
inStr
- String to calculate word shape of
wordShaper
- Constant for which shaping formula to use
public static java.lang.String wordShape(java.lang.String inStr, int wordShaper, java.util.Collection<java.lang.String> knownLCWords)
inStr
- String to calculate word shape of
wordShaper
- Constant for which shaping formula to use
knownLCWords
- A Collection of known lowercase words, which some shapers use
to decide the class of capitalized words.
Note: while this code works with any Collection, you should
provide a Set for decent performance. If this parameter is
null or empty, then this option is not used (capitalized words
are treated the same, regardless of whether the lowercased
version of the String has been seen).
public static java.lang.String wordShapeChris4(java.lang.String s)
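The knownLCWords parameter of the three-argument wordShape can be illustrated with a small self-contained sketch. Everything in it is hypothetical: ToyWordShape, its shape method, the X/x/d character classes and the trailing 'k' marker are assumptions made for illustration and are not the formulas used by WordShapeClassifier. The sketch only mirrors the documented contract: a null or empty collection disables the option, and a Set keeps the membership test cheap.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical toy shaper for illustration only; NOT the library's algorithm.
final class ToyWordShape {

    // Maps uppercase letters to 'X', lowercase letters to 'x', digits to 'd',
    // and copies any other character unchanged. If knownLCWords is usable and
    // the word is capitalized, a 'k' is appended when the lowercased word is known.
    public static String shape(String inStr, Collection<String> knownLCWords) {
        StringBuilder sb = new StringBuilder(inStr.length() + 1);
        for (int i = 0; i < inStr.length(); i++) {
            char c = inStr.charAt(i);
            if (Character.isUpperCase(c)) {
                sb.append('X');
            } else if (Character.isLowerCase(c)) {
                sb.append('x');
            } else if (Character.isDigit(c)) {
                sb.append('d');
            } else {
                sb.append(c);
            }
        }
        // A null or empty collection means the option is not used.
        boolean useKnown = knownLCWords != null && !knownLCWords.isEmpty();
        if (useKnown
                && !inStr.isEmpty()
                && Character.isUpperCase(inStr.charAt(0))
                && knownLCWords.contains(inStr.toLowerCase(Locale.ROOT))) {
            sb.append('k'); // assumed marker for "lowercased form has been seen"
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A HashSet gives constant-time contains(); a List would be scanned linearly.
        Set<String> known = new HashSet<>(Arrays.asList("the", "apple"));
        System.out.println(shape("Apple", known));  // capitalized, lowercased form is known
        System.out.println(shape("Zurich", known)); // capitalized, lowercased form is unknown
        System.out.println(shape("Apple", null));   // option disabled
    }
}
```

With the library on the classpath, the analogous call would pass the same Set as the third argument of wordShape, together with an int obtained from lookupShaper or one of the WORDSHAPE constants.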
public static void main(java.lang.String[] args)
Usage:
java edu.stanford.nlp.process.WordShapeClassifier [-wordShape name] string+
where name is an argument to lookupShaper.
Known names have patterns along the lines of: dan[12](bio)?(UseLC)?,
jenny1(useLC)?, chris[1234](useLC)?, cluster1.
If you don't specify a word shape function, you get chris1.
args
- Command-line arguments, as above.
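The command-line form described for main, an optional -wordShape name followed by one or more strings with chris1 as the default, could be parsed along the following lines. This is a minimal sketch and not the library's own main: ShapeArgs and its parse method are hypothetical names, and throwing on malformed input is a choice made here, since the text above does not say how the real program reacts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical holder for parsed arguments; not part of WordShapeClassifier.
final class ShapeArgs {
    final String shaperName;    // name that would be handed to lookupShaper
    final List<String> strings; // the "string+" part of the usage line

    private ShapeArgs(String shaperName, List<String> strings) {
        this.shaperName = shaperName;
        this.strings = strings;
    }

    // Parses: [-wordShape name] string+
    public static ShapeArgs parse(String[] args) {
        String name = "chris1"; // documented default when no shaper is given
        int start = 0;
        if (args.length > 0 && "-wordShape".equals(args[0])) {
            if (args.length < 2) {
                // Assumed behavior of this sketch: reject a flag without a value.
                throw new IllegalArgumentException("-wordShape requires a name");
            }
            name = args[1];
            start = 2;
        }
        List<String> strings =
                new ArrayList<>(Arrays.asList(args).subList(start, args.length));
        if (strings.isEmpty()) {
            // Assumed behavior of this sketch: "string+" means at least one string.
            throw new IllegalArgumentException("expected at least one string");
        }
        return new ShapeArgs(name, strings);
    }

    public static void main(String[] args) {
        ShapeArgs parsed = parse(new String[] {"-wordShape", "dan2", "Hello", "World"});
        System.out.println(parsed.shaperName + " " + parsed.strings);
    }
}
```

In a real program, shaperName would then be passed to lookupShaper and each entry of strings to wordShape.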