public class Morphology
extends java.lang.Object
implements java.util.function.Function
Another way of using Morphology is to run it on an input file by running `java Morphology filename`. In this case, POS tags MUST be separated from words by an underscore ("_").
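The underscore convention can be illustrated with a small, self-contained sketch. It does not call Morphology; the class name TaggedTokenFormat and its helpers are hypothetical. Two points are assumptions rather than documented behavior: that tokens in the file are separated by spaces, and that a token is split at its last underscore.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the library: builds and parses word_TAG tokens.
public class TaggedTokenFormat {

    /** Joins a word and its POS tag into the word_TAG form. */
    static String join(String word, String tag) {
        return word + "_" + tag;
    }

    /** Builds one line of word_TAG tokens (space-separated: an assumption). */
    static String toLine(String[] words, String[] tags) {
        if (words.length != tags.length) {
            throw new IllegalArgumentException("words and tags differ in length");
        }
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            tokens.add(join(words[i], tags[i]));
        }
        return String.join(" ", tokens);
    }

    /**
     * Splits a token into {word, tag} at the LAST underscore (an assumption),
     * so a word that itself contains an underscore keeps it.
     */
    static String[] split(String token) {
        int i = token.lastIndexOf('_');
        if (i < 0) {
            throw new IllegalArgumentException("Missing '_' between word and tag: " + token);
        }
        return new String[] { token.substring(0, i), token.substring(i + 1) };
    }

    public static void main(String[] args) {
        String line = toLine(new String[] {"dogs", "ran"}, new String[] {"NNS", "VBD"});
        System.out.println(line);                        // dogs_NNS ran_VBD
        String[] parts = split("part_of_NN");
        System.out.println(parts[0] + " / " + parts[1]); // part_of / NN
    }
}
```

A file prepared this way could then be passed as the filename argument described above.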
Note that a single instance of Morphology is not thread-safe, because the underlying lexer object is not built to be re-entrant. One way around this is to build a new Morphology object for each thread or for each set of calls to Morphology. For example, the MorphaAnnotator builds a Morphology for each document it annotates. The other approach is to use the synchronized static methods in this class. The crucial lexer-accessing portion of every static method is synchronized (otherwise, their use tended to be threading bugs waiting to happen). If you want less synchronization, create your own Morphology objects.
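The one-instance-per-thread approach can be sketched with ThreadLocal. To stay self-contained, this example uses a hypothetical stand-in class, StatefulAnalyzer, in place of Morphology. With the library on the classpath, the same pattern would presumably be `ThreadLocal.withInitial(Morphology::new)`, using the no-argument constructor listed on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerThreadAnalyzer {

    /** Hypothetical stand-in for a non-re-entrant analyzer such as Morphology. */
    static final class StatefulAnalyzer {
        // Mutable state reused across calls: unsafe to share between threads.
        private final StringBuilder buffer = new StringBuilder();

        String process(String word) {
            buffer.setLength(0);
            buffer.append(word.toLowerCase(Locale.ROOT));
            return buffer.toString();
        }
    }

    // Each thread lazily creates, and then reuses, its own analyzer instance.
    private static final ThreadLocal<StatefulAnalyzer> ANALYZER =
            ThreadLocal.withInitial(StatefulAnalyzer::new);

    /** Processes the words on a fixed pool, preserving input order in the result. */
    static List<String> processAll(List<String> words, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String w : words) {
                Callable<String> task = () -> ANALYZER.get().process(w);
                futures.add(pool.submit(task));
            }
            List<String> out = new ArrayList<>();
            for (Future<String> f : futures) {
                out.add(f.get());
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processAll(Arrays.asList("Dogs", "RAN", "Quickly"), 2));
        // prints [dogs, ran, quickly]
    }
}
```

Because no instance is ever shared between threads, no locking is needed. The cost is one analyzer per worker thread.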
Constructor and Description |
---|
Morphology() |
Morphology(java.io.Reader in) Morphologically process words from a Reader. |
Morphology(java.io.Reader in, int flags) |
Modifier and Type | Method and Description |
---|---|
java.lang.Object | apply(java.lang.Object in) |
java.lang.String | lemma(java.lang.String word, java.lang.String tag) |
java.lang.String | lemma(java.lang.String word, java.lang.String tag, boolean lowercase) |
static java.lang.String | lemmaStatic(java.lang.String word, java.lang.String tag) Lemmatize the word, being sensitive to the tag. |
static java.lang.String | lemmaStatic(java.lang.String word, java.lang.String tag, boolean lowercase) Lemmatize the word, being sensitive to the tag. |
WordLemmaTag | lemmatize(WordTag wT) Lemmatize returning a WordLemmaTag. |
static WordLemmaTag | lemmatizeStatic(WordTag wT) |
static void | main(java.lang.String[] args) Run the morphological analyzer. |
Word | next() |
void | stem(CoreLabel label) Adds the LemmaAnnotation to the given CoreLabel. |
void | stem(CoreLabel label, java.lang.Class<? extends CoreAnnotation<java.lang.String>> ann) Adds stem under annotation ann to the given CoreLabel. |
java.lang.String | stem(java.lang.String word) |
Word | stem(Word w) |
static WordTag | stemStatic(java.lang.String word, java.lang.String tag) Return a new WordTag which has the lemma as the value of word(). |
static WordTag | stemStatic(WordTag wT) Return a new WordTag which has the lemma as the value of word(). |
public Morphology()
public Morphology(java.io.Reader in)
in - The Reader to read from
public Morphology(java.io.Reader in, int flags)
public Word next() throws java.io.IOException
java.io.IOException
public java.lang.String stem(java.lang.String word)
public java.lang.String lemma(java.lang.String word, java.lang.String tag)
public java.lang.String lemma(java.lang.String word, java.lang.String tag, boolean lowercase)
public void stem(CoreLabel label)
public void stem(CoreLabel label, java.lang.Class<? extends CoreAnnotation<java.lang.String>> ann)
Adds stem under annotation ann to the given CoreLabel. Assumes that it has a TextAnnotation and PartOfSpeechAnnotation.
public static WordTag stemStatic(java.lang.String word, java.lang.String tag)
public static java.lang.String lemmaStatic(java.lang.String word, java.lang.String tag)
word - The word to lemmatize
tag - What part of speech to assume for it.
public static java.lang.String lemmaStatic(java.lang.String word, java.lang.String tag, boolean lowercase)
word - The word to lemmatize
tag - What part of speech to assume for it.
lowercase - If this is true, words other than proper nouns will be changed to all lowercase.
public static WordTag stemStatic(WordTag wT)
public java.lang.Object apply(java.lang.Object in)
apply in interface java.util.function.Function
public WordLemmaTag lemmatize(WordTag wT)
Lemmatize returning a WordLemmaTag.
public static WordLemmaTag lemmatizeStatic(WordTag wT)
public static void main(java.lang.String[] args) throws java.io.IOException
java.io.IOException