edu.stanford.nlp.international.arabic.pipeline
Class DefaultLexicalMapper

java.lang.Object
  extended by edu.stanford.nlp.international.arabic.pipeline.DefaultLexicalMapper
All Implemented Interfaces:
Mapper, java.io.Serializable

public class DefaultLexicalMapper
extends java.lang.Object
implements Mapper, java.io.Serializable

Applies a default set of lexical transformations that have been empirically validated in various Arabic tasks. This class automatically detects the input encoding and applies the appropriate set of transformations.

Author:
Spence Green
See Also:
Serialized Form

Constructor Summary
DefaultLexicalMapper()
           
 
Method Summary
 boolean canChangeEncoding(java.lang.String parent, java.lang.String element)
          Indicates whether element can be converted to another encoding.
static void main(java.lang.String[] args)
           
 java.lang.String map(java.lang.String parent, java.lang.String element)
          Maps from one string representation to another.
 void setup(java.io.File path)
          Perform initialization prior to the first call to map.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DefaultLexicalMapper

public DefaultLexicalMapper()

Method Detail

map

public java.lang.String map(java.lang.String parent,
                            java.lang.String element)
Description copied from interface: Mapper
Maps from one string representation to another.

Specified by:
map in interface Mapper
Parameters:
parent - element's context (e.g., the parent node in a parse tree)
element - The string to be transformed.
Returns:
The transformed string

setup

public void setup(java.io.File path)
Description copied from interface: Mapper
Perform initialization prior to the first call to map.

Specified by:
setup in interface Mapper
Parameters:
path - The file on disk that holds data used during mapping
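
The call order described above (setup once, then map for each string) might look like the following minimal sketch. The Mapper interface here is a local stand-in that mirrors only the three signatures documented on this page. TatweelStripper, its tatweel-removal rule, and its failure when map is called before setup are all hypothetical and are not a description of what DefaultLexicalMapper does.

```java
import java.io.File;

public class MapperLifecycleSketch {

    // Local stand-in that mirrors only the three signatures documented on this page.
    interface Mapper {
        void setup(File path);
        String map(String parent, String element);
        boolean canChangeEncoding(String parent, String element);
    }

    // Hypothetical mapper: strips the tatweel (kashida) character, U+0640.
    static class TatweelStripper implements Mapper {
        private boolean ready = false;

        @Override
        public void setup(File path) {
            // A real mapper might read data from path; this toy only records the call.
            ready = true;
        }

        @Override
        public String map(String parent, String element) {
            // Assumption for illustration: calling map before setup is treated as an error.
            if (!ready) {
                throw new IllegalStateException("setup must be called before map");
            }
            return element.replace("\u0640", "");
        }

        @Override
        public boolean canChangeEncoding(String parent, String element) {
            return true;
        }
    }

    public static void main(String[] args) {
        Mapper mapper = new TatweelStripper();
        mapper.setup(new File("unused-path")); // once, before any call to map
        String[] tokens = {"\u0643\u0640\u062A\u0627\u0628", "\u0628\u0627\u0628"};
        for (String token : tokens) {
            System.out.println(mapper.map("NN", token));
        }
    }
}
```

Coding against the interface rather than a concrete class keeps the caller unchanged if a different Mapper implementation is substituted.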

canChangeEncoding

public boolean canChangeEncoding(java.lang.String parent,
                                 java.lang.String element)
Description copied from interface: Mapper
Indicates whether element can be converted to another encoding. In the ATB (Arabic Treebank), for example, a punctuation character labeled with the "PUNC" POS tag should not be converted from Buckwalter to UTF-8.

Specified by:
canChangeEncoding in interface Mapper
Parameters:
parent - element's context (e.g., the parent node in a parse tree)
element - The string to be transformed.
Returns:
true if the string encoding can be changed; false otherwise.
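
The PUNC case matters because Buckwalter transliteration reuses ASCII symbols such as `*` and `$` for Arabic letters, so a literal punctuation token would be corrupted by a blind conversion. The sketch below shows one way a caller might consult canChangeEncoding before converting. The Mapper interface is again a local stand-in for the signatures documented here. ToyMapper, the four-letter-plus-two-symbol table, and the convert helper are hypothetical and cover only a tiny fragment of the Buckwalter scheme.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

public class EncodingGuardSketch {

    // Local stand-in that mirrors only the three signatures documented on this page.
    interface Mapper {
        void setup(File path);
        String map(String parent, String element);
        boolean canChangeEncoding(String parent, String element);
    }

    // Hypothetical mapper: no lexical changes, and PUNC-tagged tokens keep their encoding.
    static class ToyMapper implements Mapper {
        @Override
        public void setup(File path) { /* nothing to load in this toy */ }

        @Override
        public String map(String parent, String element) {
            return element;
        }

        @Override
        public boolean canChangeEncoding(String parent, String element) {
            return !"PUNC".equals(parent);
        }
    }

    // Tiny fragment of the Buckwalter-to-Unicode table, for illustration only.
    private static final Map<Character, Character> TABLE = new HashMap<>();
    static {
        TABLE.put('k', '\u0643'); // kaf,   U+0643
        TABLE.put('t', '\u062A'); // teh,   U+062A
        TABLE.put('A', '\u0627'); // alef,  U+0627
        TABLE.put('b', '\u0628'); // beh,   U+0628
        TABLE.put('*', '\u0630'); // thal,  U+0630
        TABLE.put('$', '\u0634'); // sheen, U+0634
    }

    // Converts each character found in TABLE and leaves all others untouched.
    static String toUnicode(String buckwalter) {
        StringBuilder sb = new StringBuilder();
        for (char c : buckwalter.toCharArray()) {
            char out = TABLE.getOrDefault(c, c);
            sb.append(out);
        }
        return sb.toString();
    }

    // Applies the mapper, then converts the encoding only when the mapper allows it.
    static String convert(Mapper mapper, String parent, String element) {
        String mapped = mapper.map(parent, element);
        return mapper.canChangeEncoding(parent, element) ? toUnicode(mapped) : mapped;
    }

    public static void main(String[] args) {
        Mapper mapper = new ToyMapper();
        System.out.println(convert(mapper, "NN", "ktAb")); // converted to Arabic script
        System.out.println(convert(mapper, "PUNC", "*"));  // left as the literal asterisk
    }
}
```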

main

public static void main(java.lang.String[] args)


Stanford NLP Group