edu.stanford.nlp.trees.international.arabic
Class ArabicTreeNormalizer
java.lang.Object
edu.stanford.nlp.trees.TreeNormalizer
edu.stanford.nlp.trees.BobChrisTreeNormalizer
edu.stanford.nlp.trees.international.arabic.ArabicTreeNormalizer
- All Implemented Interfaces:
- Serializable
public class ArabicTreeNormalizer
- extends BobChrisTreeNormalizer
A first-version tree normalizer for the Arabic Penn Treebank.
Just like BobChrisTreeNormalizer but:
- Adds a ROOT node to the top of every tree
- Strips all the interesting stuff off of the POS tags.
- Can keep NP-TMP annotations (retainNPTmp parameter)
- Can keep whatever annotations there are on verbs that are sisters
to predicatively marked (-PRD) elements (markPRDverb parameter)
[Chris Nov 2006: I'm a bit unsure on that one!]
- Can keep categories unchanged, i.e., not mapped to basic categories
(changeNoLabels parameter)
- Counts pronoun deletions ("nullp" and "_") as empty; filters
- Author:
- Roger Levy, Anna Rafferty
- See Also:
- Serialized Form
Constructor Summary
ArabicTreeNormalizer()
ArabicTreeNormalizer(boolean retainNPTmp)
ArabicTreeNormalizer(boolean retainNPTmp, boolean markPRDverb)
ArabicTreeNormalizer(boolean retainNPTmp, boolean markPRDverb, boolean changeNoLabels)
ArabicTreeNormalizer(boolean retainNPTmp, boolean markPRDverb, boolean changeNoLabels, boolean collapsePreps)
ArabicTreeNormalizer(boolean retainNPTmp, boolean markPRDverb, boolean changeNoLabels, boolean collapsePreps, boolean retainNPSbj, boolean retainPPClr)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ArabicTreeNormalizer
public ArabicTreeNormalizer(boolean retainNPTmp,
boolean markPRDverb,
boolean changeNoLabels,
boolean collapsePreps,
boolean retainNPSbj,
boolean retainPPClr)
ArabicTreeNormalizer
public ArabicTreeNormalizer(boolean retainNPTmp,
boolean markPRDverb,
boolean changeNoLabels,
boolean collapsePreps)
ArabicTreeNormalizer
public ArabicTreeNormalizer(boolean retainNPTmp,
boolean markPRDverb,
boolean changeNoLabels)
ArabicTreeNormalizer
public ArabicTreeNormalizer(boolean retainNPTmp,
boolean markPRDverb)
ArabicTreeNormalizer
public ArabicTreeNormalizer(boolean retainNPTmp)
ArabicTreeNormalizer
public ArabicTreeNormalizer()
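The six constructors form a telescoping chain in which each shorter form fills in the trailing flags. This page does not say which defaults the shorter forms use, so the sketch below simply assumes every omitted flag is false. The class name NormalizerOptionsSketch is hypothetical and stands in for the real class only to show the delegation pattern.

```java
/** Hypothetical stand-in that mirrors the six constructor signatures listed above. */
class NormalizerOptionsSketch {
    final boolean retainNPTmp;
    final boolean markPRDverb;
    final boolean changeNoLabels;
    final boolean collapsePreps;
    final boolean retainNPSbj;
    final boolean retainPPClr;

    // ASSUMPTION: every omitted flag defaults to false. The page does not state the defaults.
    NormalizerOptionsSketch() {
        this(false);
    }

    NormalizerOptionsSketch(boolean retainNPTmp) {
        this(retainNPTmp, false);
    }

    NormalizerOptionsSketch(boolean retainNPTmp, boolean markPRDverb) {
        this(retainNPTmp, markPRDverb, false);
    }

    NormalizerOptionsSketch(boolean retainNPTmp, boolean markPRDverb, boolean changeNoLabels) {
        this(retainNPTmp, markPRDverb, changeNoLabels, false);
    }

    NormalizerOptionsSketch(boolean retainNPTmp, boolean markPRDverb, boolean changeNoLabels,
                            boolean collapsePreps) {
        this(retainNPTmp, markPRDverb, changeNoLabels, collapsePreps, false, false);
    }

    // The full form is the only constructor that actually assigns the fields.
    NormalizerOptionsSketch(boolean retainNPTmp, boolean markPRDverb, boolean changeNoLabels,
                            boolean collapsePreps, boolean retainNPSbj, boolean retainPPClr) {
        this.retainNPTmp = retainNPTmp;
        this.markPRDverb = markPRDverb;
        this.changeNoLabels = changeNoLabels;
        this.collapsePreps = collapsePreps;
        this.retainNPSbj = retainNPSbj;
        this.retainPPClr = retainPPClr;
    }
}
```

Under that assumption, new NormalizerOptionsSketch(true) keeps NP-TMP annotations and leaves every other option off.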
normalizeNonterminal
public String normalizeNonterminal(String category)
- Description copied from class:
BobChrisTreeNormalizer
- Normalizes a nonterminal's contents.
This implementation strips functional tags, etc. and interns the
nonterminal.
- Overrides:
normalizeNonterminal
in class BobChrisTreeNormalizer
- Parameters:
category
- The String that decorates this nonterminal node
- Returns:
- The normalized form of this nonterminal String
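The idea of stripping functional tags and interning the result can be illustrated with a small self-contained sketch. It is not the library's implementation: the class NonterminalSketch, the method stripFunctionalTags, and the rule "cut at the first '-' or '='" are assumptions made for illustration, and only the retainNPTmp option is modeled.

```java
/** Hypothetical, simplified illustration of functional-tag stripping. */
class NonterminalSketch {

    static String stripFunctionalTags(String category, boolean retainNPTmp) {
        if (category == null || category.isEmpty()) {
            return category;
        }
        // Labels that begin with '-' (for example "-NONE-") are left untouched.
        if (category.charAt(0) == '-') {
            return category.intern();
        }
        // Optionally keep the temporal annotation on noun phrases.
        if (retainNPTmp && category.startsWith("NP-TMP")) {
            return "NP-TMP".intern();
        }
        // ASSUMPTION: functional tags and indices start at the first '-' or '='.
        int cut = -1;
        for (int i = 1; i < category.length(); i++) {
            char c = category.charAt(i);
            if (c == '-' || c == '=') {
                cut = i;
                break;
            }
        }
        String basic = (cut < 0) ? category : category.substring(0, cut);
        // Interning lets equal labels share a single String instance.
        return basic.intern();
    }
}
```

With this sketch, "NP-SBJ" maps to "NP", "NP-TMP" maps to "NP" unless retainNPTmp is true, and "-NONE-" is returned unchanged.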
normalizeTerminal
public String normalizeTerminal(String leaf)
- Miscellany:
- Escapes out "/" and "*" tokens (this is ugly, should be fixed!)
todo: cdm 2009: Is this really needed for Arabic??
- Overrides:
normalizeTerminal
in class BobChrisTreeNormalizer
- Parameters:
leaf
- The String that decorates the leaf
- Returns:
- The normalized form of this leaf String
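The escaping step mentioned under Miscellany might look roughly like the sketch below. The page does not show the escaped form, so the backslash prefix used here (the Penn Treebank convention of writing "\/" and "\*") is an assumption, as are the class and method names.

```java
/** Hypothetical illustration of escaping the "/" and "*" tokens. */
class TerminalEscapeSketch {

    static String escapeTerminal(String leaf) {
        // ASSUMPTION: the escaped forms are the token prefixed with a backslash.
        if ("/".equals(leaf)) {
            return "\\/";
        }
        if ("*".equals(leaf)) {
            return "\\*";
        }
        // Every other leaf, including longer strings that contain these characters, passes through.
        return leaf;
    }
}
```

Only leaves that consist of exactly "/" or "*" are rewritten in this sketch.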
normalizeWholeTree
public Tree normalizeWholeTree(Tree tree,
TreeFactory tf)
- Description copied from class:
BobChrisTreeNormalizer
- Normalize a whole tree -- one can assume that this is the
root. This implementation deletes empty elements (ones with
nonterminal tag label '-NONE-') from the tree, and splices out
unary A over A nodes. It does work for a null tree.
- Overrides:
normalizeWholeTree
in class BobChrisTreeNormalizer
- Parameters:
tree
- The tree to be normalized
tf
- The TreeFactory to create new nodes (if needed)
- Returns:
- The normalized tree
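The two operations described above, deleting empty elements labeled '-NONE-' and splicing out unary A over A nodes, can be sketched on a toy tree type. The Node class below is a hypothetical stand-in, not the library's Tree, and no TreeFactory is involved. Removing a nonterminal once all of its children have been deleted is a further assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, self-contained illustration of whole-tree normalization. */
class WholeTreeSketch {

    /** Minimal tree node: a label plus an ordered list of children. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) {
                children.add(k);
            }
        }

        boolean isLeaf() {
            return children.isEmpty();
        }

        @Override
        public String toString() {
            if (isLeaf()) {
                return label;
            }
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node c : children) {
                sb.append(' ').append(c);
            }
            return sb.append(')').toString();
        }
    }

    /** Returns a normalized copy, or null when the input is null or entirely empty. */
    static Node normalizeWholeTree(Node t) {
        if (t == null) {
            return null; // a null tree is accepted and yields null
        }
        if (t.isLeaf()) {
            return new Node(t.label);
        }
        if (t.label.equals("-NONE-")) {
            return null; // drop the empty element together with its trace leaf
        }
        Node copy = new Node(t.label);
        for (Node c : t.children) {
            Node kept = normalizeWholeTree(c);
            if (kept != null) {
                copy.children.add(kept);
            }
        }
        // ASSUMPTION: a nonterminal left with no children is itself removed.
        if (copy.children.isEmpty()) {
            return null;
        }
        // Splice out a unary A over A node by returning the lower A.
        if (copy.children.size() == 1) {
            Node only = copy.children.get(0);
            if (!only.isLeaf() && only.label.equals(copy.label)) {
                return only;
            }
        }
        return copy;
    }
}
```

For the toy tree (S (NP (-NONE- *)) (VP (VP (VBD ran)))), this sketch returns (S (VP (VBD ran))): the empty NP disappears and the doubled VP collapses to one.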
Stanford NLP Group