edu.stanford.nlp.process
Class Morphology

java.lang.Object
  extended by edu.stanford.nlp.process.Morphology
All Implemented Interfaces:
Function

public class Morphology
extends java.lang.Object
implements Function

Morphology computes the base form of English words by removing inflections only (not derivational morphology). That is, it handles noun plurals, pronoun case, and verb endings, but not things like comparative adjectives or derived nominals. It is based on a finite-state transducer implemented by John Carroll et al., written in flex and publicly available. See: http://www.informatics.susx.ac.uk/research/nlp/carroll/morph.html .
There are several ways of invoking Morphology. One is to call the static methods WordTag stemStatic(String word, String tag) or WordTag stemStatic(WordTag wordTag). If a Morphology object has already been created, the instance methods WordTag stem(String word, String tag) or WordTag stem(WordTag wordTag) can be used instead.
Another way of using Morphology is to run it on an input file with java Morphology filename. In this case, POS tags MUST be separated from words by an underscore ("_").
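To make the underscore-separated input format concrete, the following is a minimal sketch of how a word_TAG token might be split before lemmatization. It does not use or reproduce Morphology's own parsing: the class TaggedTokenSketch and its helpers are hypothetical, and splitting on the last underscore (so that words containing underscores keep them) is an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: splits "word_TAG" tokens; not part of the Morphology API. */
final class TaggedTokenSketch {

    /** Hypothetical helper: returns {word, tag}, splitting on the LAST underscore (an assumption). */
    static String[] splitWordTag(String token) {
        int idx = token.lastIndexOf('_');
        // Reject tokens with no underscore, an empty word, or an empty tag.
        if (idx <= 0 || idx == token.length() - 1) {
            throw new IllegalArgumentException("Expected word_TAG but got: " + token);
        }
        return new String[] { token.substring(0, idx), token.substring(idx + 1) };
    }

    /** Splits a whitespace-separated line into {word, tag} pairs. */
    static List<String[]> parseLine(String line) {
        List<String[]> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(splitWordTag(token));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Each pair could then be passed to a lemmatizer as (word, tag).
        for (String[] pair : parseLine("The_DT dogs_NNS were_VBD barking_VBG")) {
            System.out.println(pair[0] + "\t" + pair[1]);
        }
    }
}
```

A token without a tag is rejected here rather than silently passed through; whether the real command-line mode behaves the same way is not stated in this documentation.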
Note that a single instance of Morphology is not thread-safe, because the underlying lexer object is not built to be re-entrant. One way around this is to build a new Morphology object for each set of calls. For example, the MorphaAnnotator builds a Morphology for each document it annotates. The other approach is to use the synchronized methods in this class.
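The "one instance per set of calls" advice can be sketched with a per-thread instance. The example below is self-contained and deliberately does not depend on this library: StatefulAnalyzer is a hypothetical stand-in for any non-re-entrant object such as Morphology, and its lowercasing body is a placeholder, not real lemmatization. In real code the ThreadLocal initializer would presumably construct a Morphology instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch only: gives each worker thread its own non-thread-safe analyzer. */
final class PerThreadInstanceSketch {

    /** Hypothetical stand-in for a stateful, non-re-entrant analyzer. */
    static final class StatefulAnalyzer {
        private final StringBuilder buffer = new StringBuilder(); // mutable state, unsafe to share

        String analyze(String word) {
            buffer.setLength(0);
            buffer.append(word.toLowerCase()); // placeholder for real work
            return buffer.toString();
        }
    }

    // One analyzer per thread, so no two threads ever touch the same buffer.
    private static final ThreadLocal<StatefulAnalyzer> ANALYZER =
            ThreadLocal.withInitial(StatefulAnalyzer::new);

    /** Analyzes all words on a fixed-size pool, preserving input order. */
    static List<String> analyzeAll(List<String> words, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String w : words) {
                Callable<String> task = () -> ANALYZER.get().analyze(w);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("Analysis failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(analyzeAll(Arrays.asList("Dogs", "WERE", "Barking"), 4));
    }
}
```

Compared with the synchronized static methods, per-thread instances avoid contention on a single shared lock at the cost of constructing one object per thread.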

Author:
Kristina Toutanova (kristina@cs.stanford.edu), Christopher Manning

Constructor Summary
Morphology()
           
Morphology(java.io.Reader in)
          Morphologically process words from a Reader.
Morphology(java.io.Reader in, int flags)
           
 
Method Summary
 java.lang.Object apply(java.lang.Object in)
          Converts a T1 to a different T2.
 java.lang.String lemma(java.lang.String word, java.lang.String tag)
           
 java.lang.String lemma(java.lang.String word, java.lang.String tag, boolean lowercase)
           
static java.lang.String lemmaStatic(java.lang.String word, java.lang.String tag, boolean lowercase)
           
static java.lang.String lemmaStaticSynchronized(java.lang.String word, java.lang.String tag, boolean lowercase)
           
 WordLemmaTag lemmatize(WordTag wT)
          Lemmatize, returning a WordLemmaTag.
static WordLemmaTag lemmatizeStatic(WordTag wT)
           
static void main(java.lang.String[] args)
          Run the morphological analyzer.
 Word next()
           
 void stem(CoreLabel label)
          Adds the LemmaAnnotation to the given CoreLabel.
 void stem(CoreLabel label, java.lang.Class<? extends CoreAnnotation<java.lang.String>> ann)
          Adds annotation ann to the given CoreLabel.
 java.lang.String stem(java.lang.String word)
           
 Word stem(Word w)
           
static WordTag stemStatic(java.lang.String word, java.lang.String tag)
          Return a new WordTag which has the lemma as the value of word().
static WordTag stemStatic(WordTag wT)
          Return a new WordTag which has the lemma as the value of word().
static WordTag stemStaticSynchronized(java.lang.String word, java.lang.String tag)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Morphology

public Morphology()

Morphology

public Morphology(java.io.Reader in)
Morphologically process words from a Reader.

Parameters:
in - The Reader to read from

Morphology

public Morphology(java.io.Reader in,
                  int flags)
Method Detail

next

public Word next()
          throws java.io.IOException
Throws:
java.io.IOException

stem

public Word stem(Word w)

stem

public java.lang.String stem(java.lang.String word)

lemma

public java.lang.String lemma(java.lang.String word,
                              java.lang.String tag)

lemma

public java.lang.String lemma(java.lang.String word,
                              java.lang.String tag,
                              boolean lowercase)

stem

public void stem(CoreLabel label)
Adds the LemmaAnnotation to the given CoreLabel.


stem

public void stem(CoreLabel label,
                 java.lang.Class<? extends CoreAnnotation<java.lang.String>> ann)
Adds annotation ann to the given CoreLabel. Assumes that the label already has a TextAnnotation and a PartOfSpeechAnnotation.


stemStatic

public static WordTag stemStatic(java.lang.String word,
                                 java.lang.String tag)
Return a new WordTag which has the lemma as the value of word(). The default is to lowercase non-proper-nouns, unless options have been set.
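The default casing rule described above can be illustrated as follows. This is a sketch under stated assumptions, not this class's implementation: LemmaCaseSketch and its methods are hypothetical, and it assumes that proper nouns are identified by Penn Treebank tags beginning with "NNP" (NNP, NNPS).

```java
import java.util.Locale;

/** Sketch only: lowercases a lemma unless its tag marks a proper noun. */
final class LemmaCaseSketch {

    /** Assumption: proper-noun tags start with "NNP" (Penn Treebank NNP and NNPS). */
    static boolean isProperNoun(String tag) {
        return tag != null && tag.startsWith("NNP");
    }

    /** Keeps the case of proper nouns and lowercases everything else. */
    static String applyDefaultCase(String lemma, String tag) {
        return isProperNoun(tag) ? lemma : lemma.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(applyDefaultCase("Dog", "NN"));
        System.out.println(applyDefaultCase("Stanford", "NNP"));
    }
}
```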


lemmaStatic

public static java.lang.String lemmaStatic(java.lang.String word,
                                           java.lang.String tag,
                                           boolean lowercase)

stemStaticSynchronized

public static WordTag stemStaticSynchronized(java.lang.String word,
                                             java.lang.String tag)

lemmaStaticSynchronized

public static java.lang.String lemmaStaticSynchronized(java.lang.String word,
                                                       java.lang.String tag,
                                                       boolean lowercase)

stemStatic

public static WordTag stemStatic(WordTag wT)
Return a new WordTag which has the lemma as the value of word(). The default is to lowercase non-proper-nouns, unless options have been set.


apply

public java.lang.Object apply(java.lang.Object in)
Description copied from interface: Function
Converts a T1 to a different T2. For example, a Parser will convert a Sentence to a Tree. A Tagger will convert a Sentence to a TaggedSentence.

Specified by:
apply in interface Function
Parameters:
in - The function's argument
Returns:
The function's evaluated value

lemmatize

public WordLemmaTag lemmatize(WordTag wT)
Lemmatize, returning a WordLemmaTag.


lemmatizeStatic

public static WordLemmaTag lemmatizeStatic(WordTag wT)

main

public static void main(java.lang.String[] args)
                 throws java.io.IOException
Run the morphological analyzer. Options are:

Throws:
java.io.IOException


Stanford NLP Group