edu.stanford.nlp.trees
Class AbstractCollinsHeadFinder

java.lang.Object
  extended by edu.stanford.nlp.trees.AbstractCollinsHeadFinder
All Implemented Interfaces:
HeadFinder, java.io.Serializable
Direct Known Subclasses:
ArabicHeadFinder, BikelChineseHeadFinder, ChineseHeadFinder, CollinsHeadFinder, DybroFrenchHeadFinder, FrenchHeadFinder, NegraHeadFinder, SunJurafskyChineseHeadFinder, TueBaDZHeadFinder

public abstract class AbstractCollinsHeadFinder
extends java.lang.Object
implements HeadFinder

A base class for a HeadFinder similar to the one described in Michael Collins' 1999 thesis. For a given constituent we perform operations like the following (this is for "left" or "right"):

 for categoryList in categoryLists
   for index = 1 to n [or n to 1 if R->L]
     for category in categoryList
       if category equals daughter[index] choose it.
 

with a final default that goes with the direction (L->R or R->L). For most constituents, there will be only one category in the list, the exception being, in Collins' original version, NP.
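
The loop above can be sketched in plain Java. This is a minimal illustration that follows the pseudocode literally, not the class's actual implementation: the class name CategoryListSearch and the method findHead are hypothetical, daughters are represented by their category labels rather than by Tree objects, and the final direction-based default is left out (the method returns -1 when nothing matches).

```java
/** Hypothetical, self-contained sketch of the pseudocode above; not the Stanford code. */
final class CategoryListSearch {

  /**
   * Each rule is {direction, category, category, ...}, where direction is
   * "left" (scan daughters left to right) or "right" (scan right to left).
   * Returns the index of the chosen daughter, or -1 if no rule matched.
   */
  static int findHead(String[] daughters, String[][] rules) {
    int n = daughters.length;
    for (String[] rule : rules) {                 // for categoryList in categoryLists
      boolean leftToRight = rule[0].equals("left");
      for (int step = 0; step < n; step++) {      // for index = 1 to n [or n to 1]
        int index = leftToRight ? step : n - 1 - step;
        for (int c = 1; c < rule.length; c++) {   // for category in categoryList
          if (rule[c].equals(daughters[index])) {
            return index;                         // category equals daughter[index]
          }
        }
      }
    }
    return -1;                                    // caller would apply the default here
  }
}
```

For example, with daughters DT JJ NN and the single rule {"right", "NN", "NNS"}, the scan starts at the rightmost daughter and selects NN immediately.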

It is up to the subclass to initialize the map from constituent type to categoryLists, "nonTerminalInfo", in its constructor. Entries are presumed to be of type String[][]. Each String[] is a list of categories, except for the first entry, which specifies direction of traversal and must be one of the following:

Changes:

Author:
Christopher Manning, Galen Andrew
See Also:
Serialized Form

Field Summary
protected  java.lang.String[] defaultLeftRule
          These are built automatically from categoriesToAvoid and used in a fairly different fashion from defaultRule (above).
protected  java.lang.String[] defaultRightRule
           
protected  java.lang.String[] defaultRule
          Default direction if no rule is found for category (the head/parent).
protected  java.util.Map<java.lang.String,java.lang.String[][]> nonTerminalInfo
           
protected  TreebankLanguagePack tlp
           
 
Constructor Summary
protected AbstractCollinsHeadFinder(TreebankLanguagePack tlp, java.lang.String... categoriesToAvoid)
          Construct a HeadFinder.
 
Method Summary
 Tree determineHead(Tree t)
          Determine which daughter of the current parse tree is the head.
 Tree determineHead(Tree t, Tree parent)
          Determine which daughter of the current parse tree is the head.
protected  Tree determineNonTrivialHead(Tree t, Tree parent)
          Called by determineHead and may be overridden in subclasses if special treatment is necessary for particular categories.
protected  Tree findMarkedHead(Tree t)
          A way for subclasses for corpora with explicit head markings to return the explicitly marked head
protected  int postOperationFix(int headIdx, Tree[] daughterTrees)
          A way for subclasses to fix any heads under special conditions.
protected  Tree traverseLocate(Tree[] daughterTrees, java.lang.String[] how, boolean lastResort)
          Attempt to locate head daughter tree from among daughters.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

tlp

protected final TreebankLanguagePack tlp

nonTerminalInfo

protected java.util.Map<java.lang.String,java.lang.String[][]> nonTerminalInfo
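
As a rough illustration of the shape this map is expected to have, the sketch below builds a standalone Map<String, String[][]> in the layout the class description gives: each String[] starts with a direction and is followed by categories. The class name NonTerminalInfoSketch, the method buildRules, and the specific rules for PP and NP are assumptions made for the example; they are not taken from any shipped head finder, and only the "left" and "right" directions mentioned above are used.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the nonTerminalInfo layout; the rules are illustrative only. */
final class NonTerminalInfoSketch {

  static Map<String, String[][]> buildRules() {
    Map<String, String[][]> nonTerminalInfo = new HashMap<>();
    // One category list: scan right to left for IN or TO (assumed rule).
    nonTerminalInfo.put("PP", new String[][] {{"right", "IN", "TO"}});
    // Two category lists, tried in order (assumed rules).
    nonTerminalInfo.put("NP", new String[][] {
        {"right", "NN", "NNS"},
        {"left", "NP"}
    });
    return nonTerminalInfo;
  }
}
```

A subclass would perform the equivalent assignments on its inherited nonTerminalInfo field inside its constructor.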

defaultRule

protected java.lang.String[] defaultRule
Default direction if no rule is found for category (the head/parent). Subclasses can turn it on if they like. If they don't, it is an error if no rule is defined for a category (null is returned).


defaultLeftRule

protected java.lang.String[] defaultLeftRule
These are built automatically from categoriesToAvoid and used in a fairly different fashion from defaultRule (above). These are used for categories that do have defined rules but where none of them have matched. Rather than picking the rightmost or leftmost child, we will use these to pick the rightmost or leftmost child which isn't in categoriesToAvoid.


defaultRightRule

protected java.lang.String[] defaultRightRule
Constructor Detail

AbstractCollinsHeadFinder

protected AbstractCollinsHeadFinder(TreebankLanguagePack tlp,
                                    java.lang.String... categoriesToAvoid)
Construct a HeadFinder. The TreebankLanguagePack is used to get basic categories. The remaining arguments set categories which, if it comes to last resort processing (i.e., none of the rules matched), will be avoided as heads. In last resort processing, it will attempt to match the leftmost or rightmost constituent not in this set but will fall back to the leftmost or rightmost constituent if necessary.

Parameters:
tlp - TreebankLanguagePack used to determine basic category
categoriesToAvoid - Constituent types to avoid as head
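
The last resort behaviour described for this constructor can be sketched as follows. This is a minimal standalone illustration, assuming daughters are given as category labels; the class LastResortSketch and the method lastResortHead are hypothetical names, and the punctuation categories in the usage note are example values rather than defaults of this class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of last resort head selection; not the Stanford code. */
final class LastResortSketch {

  /**
   * Picks the leftmost (fromLeft == true) or rightmost daughter whose category
   * is not in categoriesToAvoid. If every daughter is to be avoided, falls back
   * to the plain leftmost or rightmost daughter. Returns -1 for no daughters.
   */
  static int lastResortHead(String[] daughters, boolean fromLeft,
                            String... categoriesToAvoid) {
    int n = daughters.length;
    if (n == 0) {
      return -1;
    }
    Set<String> avoid = new HashSet<>(Arrays.asList(categoriesToAvoid));
    for (int step = 0; step < n; step++) {
      int index = fromLeft ? step : n - 1 - step;
      if (!avoid.contains(daughters[index])) {
        return index;   // first acceptable daughter in the scan direction
      }
    }
    return fromLeft ? 0 : n - 1;   // everything was avoided: plain fallback
  }
}
```

With daughters `` NP . and the categories `` and . to avoid, the scan selects NP from either direction; if every daughter is in the avoided set, the outermost daughter in the scan direction is returned.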
Method Detail

findMarkedHead

protected Tree findMarkedHead(Tree t)
A way for subclasses for corpora with explicit head markings to return the explicitly marked head

Parameters:
t - a tree to find the head of
Returns:
The marked head, or null if there is no marked head

determineHead

public Tree determineHead(Tree t)
Determine which daughter of the current parse tree is the head.

Specified by:
determineHead in interface HeadFinder
Parameters:
t - The parse tree to examine the daughters of. If this is a leaf, null is returned
Returns:
The daughter parse tree that is the head of t
See Also:
for a routine to call this and spread heads throughout a tree

determineHead

public Tree determineHead(Tree t,
                          Tree parent)
Determine which daughter of the current parse tree is the head.

Specified by:
determineHead in interface HeadFinder
Parameters:
t - The parse tree to examine the daughters of. If this is a leaf, null is returned
parent - The parent of t
Returns:
The daughter parse tree that is the head of t. Returns null for leaf nodes.
See Also:
for a routine to call this and spread heads throughout a tree

determineNonTrivialHead

protected Tree determineNonTrivialHead(Tree t,
                                       Tree parent)
Called by determineHead and may be overridden in subclasses if special treatment is necessary for particular categories.

Parameters:
t - The tree to determine the head daughter of
parent - The parent of t (or may be null)
Returns:
The head daughter of t

traverseLocate

protected Tree traverseLocate(Tree[] daughterTrees,
                              java.lang.String[] how,
                              boolean lastResort)
Attempt to locate head daughter tree from among daughters. Go through daughterTrees looking for a daughter whose category is in (or, for some traversal types, not in) the set given by the contents of the array how. If none is found and lastResort is true, take the leftmost or rightmost daughter, possibly restricted to a matching one; otherwise return null.


postOperationFix

protected int postOperationFix(int headIdx,
                               Tree[] daughterTrees)
A way for subclasses to fix any heads under special conditions. The default does nothing.

Parameters:
headIdx - The index of the proposed head
daughterTrees - The array of daughter trees
Returns:
The new headIndex
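
To show how such a hook might be used, here is a small standalone sketch. It does not extend the real class: HeadHookSketch is a hypothetical stand-in whose postOperationFix takes category labels instead of Tree objects, and the coordination rule in CoordinationFixSketch is an assumption chosen for illustration, not documented behaviour of any subclass.

```java
/** Hypothetical stand-in for the hook; the default returns the index unchanged. */
class HeadHookSketch {
  protected int postOperationFix(int headIdx, String[] daughterLabels) {
    return headIdx;
  }
}

/** Illustrative override: an assumed rule for coordinated structures. */
final class CoordinationFixSketch extends HeadHookSketch {
  @Override
  protected int postOperationFix(int headIdx, String[] daughterLabels) {
    // Assumed rule: if the proposed head directly follows a conjunction (CC)
    // and a daughter exists before that conjunction, move the head to it.
    if (headIdx >= 2 && "CC".equals(daughterLabels[headIdx - 1])) {
      return headIdx - 2;
    }
    return headIdx;
  }
}
```

Under this assumed rule, a proposed head at index 2 of NP CC NP is moved to index 0, while the default hook leaves it at 2.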


Stanford NLP Group