edu.stanford.nlp.trees.international.pennchinese
Class CTBErrorCorrectingTreeNormalizer

java.lang.Object
  extended by edu.stanford.nlp.trees.TreeNormalizer
      extended by edu.stanford.nlp.trees.BobChrisTreeNormalizer
          extended by edu.stanford.nlp.trees.international.pennchinese.CTBErrorCorrectingTreeNormalizer
All Implemented Interfaces:
Serializable

public class CTBErrorCorrectingTreeNormalizer
extends BobChrisTreeNormalizer

This class was originally written to correct a few errors Galen found in CTB3. The hope was that the errors would be gone in CTB4, so that we could revert to BobChrisTreeNormalizer. Alas, CTB4 only contained more errors. The class has since been extended to allow some functional tags from CTB to be retained. So far this has been much easier than in NPTmpRetainingTN, since we don't do any tag percolation (helped by CTB marking temporal nouns).

See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class edu.stanford.nlp.trees.BobChrisTreeNormalizer
BobChrisTreeNormalizer.AOverAFilter, BobChrisTreeNormalizer.EmptyFilter
 
Field Summary
 
Fields inherited from class edu.stanford.nlp.trees.BobChrisTreeNormalizer
aOverAFilter, emptyFilter, tlp
 
Constructor Summary
CTBErrorCorrectingTreeNormalizer()
          Constructor with all options of the four-argument constructor set to false.
CTBErrorCorrectingTreeNormalizer(boolean splitNPTMP, boolean splitPPTMP, boolean splitXPTMP, boolean charTags)
          Build a CTBErrorCorrectingTreeNormalizer.
 
Method Summary
protected  String cleanUpLabel(String label)
          Remove things like hyphenated functional tags and equals signs from the end of a node label.
 Tree normalizeWholeTree(Tree tree, TreeFactory tf)
          Normalize a whole tree -- one can assume that this is the root.
 
Methods inherited from class edu.stanford.nlp.trees.BobChrisTreeNormalizer
normalizeNonterminal, normalizeTerminal
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CTBErrorCorrectingTreeNormalizer

public CTBErrorCorrectingTreeNormalizer()
Constructor with all options of the four-argument constructor set to false.


CTBErrorCorrectingTreeNormalizer

public CTBErrorCorrectingTreeNormalizer(boolean splitNPTMP,
                                        boolean splitPPTMP,
                                        boolean splitXPTMP,
                                        boolean charTags)
Build a CTBErrorCorrectingTreeNormalizer.

Parameters:
splitNPTMP - Whether to keep the temporal (TMP) annotation on NPs
splitPPTMP - Whether to keep the temporal (TMP) annotation on PPs
splitXPTMP - Whether to keep the temporal (TMP) annotation on any phrase marked in CTB
charTags - Whether to push POS tags down onto the characters of a word (for unsegmented text)
Method Detail

cleanUpLabel

protected String cleanUpLabel(String label)
Remove things like hyphenated functional tags and equals signs from the end of a node label. However, occasional functional tags, particularly TMP, are kept as determined by the constructor parameters.

Overrides:
cleanUpLabel in class BobChrisTreeNormalizer
Parameters:
label - The label to be cleaned up
Returns:
The cleaned up label (phrase structure category)
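
The label handling described above can be illustrated with a small standalone sketch. This is not the library's implementation: the class LabelCleanupSketch, the rule of cutting at the first '-' or '=' after position 0, and the way the three flags select which categories keep "-TMP" are all assumptions made for illustration.

```java
// Hypothetical sketch; does not use or reproduce the Stanford NLP classes.
final class LabelCleanupSketch {
    private final boolean splitNPTMP;
    private final boolean splitPPTMP;
    private final boolean splitXPTMP;

    LabelCleanupSketch(boolean splitNPTMP, boolean splitPPTMP, boolean splitXPTMP) {
        this.splitNPTMP = splitNPTMP;
        this.splitPPTMP = splitPPTMP;
        this.splitXPTMP = splitXPTMP;
    }

    String cleanUpLabel(String label) {
        // Assumption: null, empty, and '-'-initial labels such as -NONE- pass through unchanged.
        if (label == null || label.isEmpty() || label.charAt(0) == '-') {
            return label;
        }
        // Find the first '-' or '=' after the first character.
        int cut = label.length();
        for (int i = 1; i < label.length(); i++) {
            char c = label.charAt(i);
            if (c == '-' || c == '=') {
                cut = i;
                break;
            }
        }
        String basic = label.substring(0, cut);
        // Check whether TMP is among the stripped functional tags.
        boolean hasTmp = false;
        for (String tag : label.substring(cut).split("[-=]")) {
            if (tag.equals("TMP")) {
                hasTmp = true;
            }
        }
        // Keep TMP only where the flags ask for it.
        boolean keep = hasTmp && (splitXPTMP
                || (splitNPTMP && basic.equals("NP"))
                || (splitPPTMP && basic.equals("PP")));
        return keep ? basic + "-TMP" : basic;
    }

    public static void main(String[] args) {
        LabelCleanupSketch npOnly = new LabelCleanupSketch(true, false, false);
        System.out.println(npOnly.cleanUpLabel("NP-TMP"));   // NP-TMP
        System.out.println(npOnly.cleanUpLabel("PP-TMP"));   // PP
        System.out.println(npOnly.cleanUpLabel("NP-SBJ=1")); // NP
    }
}
```

With all flags false, as in the no-argument constructor, every label in this sketch reduces to its bare phrase structure category.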

normalizeWholeTree

public Tree normalizeWholeTree(Tree tree,
                               TreeFactory tf)
Description copied from class: BobChrisTreeNormalizer
Normalize a whole tree -- one can assume that this is the root. This implementation deletes empty elements (ones with nonterminal tag label '-NONE-') from the tree, and splices out unary A over A nodes. It also works when given a null tree.

Overrides:
normalizeWholeTree in class BobChrisTreeNormalizer
Parameters:
tree - The tree to be normalized
tf - The TreeFactory used to create new nodes (if needed)
Returns:
The normalized tree
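
The two operations named in the description, deleting '-NONE-' elements and splicing out unary A over A nodes, can be sketched on a toy tree type. The Node class and normalize method below are hypothetical and stand in for Tree and TreeFactory; the sketch also assumes that a node left with no children after pruning is itself removed, which the description above does not state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; does not use or reproduce the Stanford NLP classes.
final class TreeNormalizeSketch {
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) {
                children.add(k);
            }
        }

        boolean isLeaf() {
            return children.isEmpty();
        }

        @Override
        public String toString() {
            if (isLeaf()) {
                return label;
            }
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node c : children) {
                sb.append(' ').append(c);
            }
            return sb.append(')').toString();
        }
    }

    // Returns a normalized copy, or null if the input is null or entirely empty.
    static Node normalize(Node t) {
        if (t == null) {
            return null;
        }
        if (t.isLeaf()) {
            return t; // a word: keep as is
        }
        if (t.label.equals("-NONE-")) {
            return null; // empty element: delete
        }
        Node out = new Node(t.label);
        for (Node c : t.children) {
            Node n = normalize(c);
            if (n != null) {
                out.children.add(n);
            }
        }
        if (out.children.isEmpty()) {
            return null; // assumption: drop nodes that dominated only empties
        }
        if (out.children.size() == 1) {
            Node only = out.children.get(0);
            // Unary A over A: replace the parent with its same-labelled child.
            if (!only.isLeaf() && only.label.equals(out.label)) {
                return only;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Node tree = new Node("IP",
                new Node("NP", new Node("-NONE-", new Node("*pro*"))),
                new Node("VP", new Node("VP", new Node("VV", new Node("lai")))));
        System.out.println(normalize(tree)); // (IP (VP (VV lai)))
    }
}
```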


Stanford NLP Group