edu.stanford.nlp.parser.lexparser
Class MLEDependencyGrammar

java.lang.Object
  extended by edu.stanford.nlp.parser.lexparser.AbstractDependencyGrammar
      extended by edu.stanford.nlp.parser.lexparser.MLEDependencyGrammar
All Implemented Interfaces:
DependencyGrammar, java.io.Serializable
Direct Known Subclasses:
ChineseSimWordAvgDepGrammar

public class MLEDependencyGrammar
extends AbstractDependencyGrammar

See Also:
Serialized Form

Field Summary
protected  ClassicCounter<IntDependency> argCounter
          Stores all the counts for dependencies (with and without the word being a wildcard) in the reduced tag space.
 double interp
          Interpolation between model that directly predicts aTW and model that predicts aT and then aW given aT.
protected static double MIN_PROBABILITY
           
protected  int numWordTokens
           
 double smooth_aPTW_aPT
           
 double smooth_aT_hTd
           
 double smooth_aT_hTWd
          Bayesian m-estimate prior for aT given hTWd against base distribution of aT given hTd.
 double smooth_aTW_aT
           
 double smooth_aTW_hTd
           
 double smooth_aTW_hTWd
          Bayesian m-estimate prior for aTW given hTWd against base distribution of aTW given hTd.
 double smooth_stop
           
protected  ClassicCounter<IntDependency> stopCounter
           
protected  java.util.List<IntTaggedWord> tagITWList
          The indices of this list are in the tag binned space.
 
Fields inherited from class edu.stanford.nlp.parser.lexparser.AbstractDependencyGrammar
coarseDistanceBins, directional, expandDependencyMap, lex, numTagBins, regDistanceBins, stopTW, tagBin, tagProjection, tlp, useCoarseDistance, useDistance, wildTW
 
Constructor Summary
MLEDependencyGrammar(TagProjection tagProjection, TreebankLangParserParams tlpParams, boolean directional, boolean useDistance, boolean useCoarseDistance)
           
MLEDependencyGrammar(TreebankLangParserParams tlpParams, boolean directional, boolean distance, boolean coarseDistance)
           
 
Method Summary
 void addRule(IntDependency dependency, double count)
          Add this dependency with the given count to the grammar.
 double countHistory(IntDependency dependency)
           
 void dumpSizes()
           
protected  void expandDependency(IntDependency dependency, double count)
          The dependency arg is still in the full tag space.
protected  double getStopProb(IntDependency dependency)
          Return the probability (as a real number between 0 and 1) of stopping rather than generating another argument at this position.
protected  double probTB(IntDependency dependency)
          Calculate the probability of a dependency as a real probability between 0 and 1 inclusive.
 boolean pruneTW(IntTaggedWord argTW)
           
 void readData(java.io.BufferedReader in)
          Populates data in this DependencyGrammar from the character stream given by the reader in.
 double scoreAll(java.util.Collection<IntDependency> deps)
           
 double scoreTB(IntDependency dependency)
          Score a tag binned dependency.
 java.lang.String toString()
           
protected static edu.stanford.nlp.parser.lexparser.MLEDependencyGrammar.EndHead treeToDependencyHelper(Tree tree, java.util.List<IntDependency> depList, int loc)
          Adds dependencies to list depList.
static java.util.List<IntDependency> treeToDependencyList(Tree tree)
          Returns the List of dependencies for a binarized Tree.
 void tune(java.util.Collection<Tree> trees)
          Tune the smoothing and interpolation parameters of the dependency grammar based on a tuning treebank.
 void writeData(java.io.PrintWriter out)
          Writes out data from this object to the writer out.
 
Methods inherited from class edu.stanford.nlp.parser.lexparser.AbstractDependencyGrammar
coarseDistanceBin, distanceBin, initTagBins, intern, numDistBins, numTagBins, regDistanceBin, rootTW, score, score, scoreTB, setLexicon, tagBin, tagNumberer, valenceBin, wordNumberer
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

numWordTokens

protected int numWordTokens

argCounter

protected ClassicCounter<IntDependency> argCounter
Stores all the counts for dependencies (with and without the word being a wildcard) in the reduced tag space.
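
To make "with and without the word being a wildcard" concrete, here is a minimal sketch of a counter that records each observation twice: once under the actual argument word and once under a wildcard word. The class WildcardCountSketch, its string keys, and the "*" marker are hypothetical stand-ins for illustration; the real field is a ClassicCounter keyed by IntDependency, and its key layout is not described on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the Stanford parser API.
class WildcardCountSketch {
    static final String WILD = "*"; // assumed wildcard marker for "any word"

    private final Map<String, Double> counts = new HashMap<>();

    /** Records the observation under the real word and again under the wildcard. */
    void add(String headTag, String argTag, String argWord, double count) {
        counts.merge(key(headTag, argTag, argWord), count, Double::sum);
        counts.merge(key(headTag, argTag, WILD), count, Double::sum);
    }

    /** Returns the stored count, or 0.0 if the key was never observed. */
    double get(String headTag, String argTag, String argWord) {
        return counts.getOrDefault(key(headTag, argTag, argWord), 0.0);
    }

    private static String key(String headTag, String argTag, String argWord) {
        return headTag + "|" + argTag + "|" + argWord;
    }
}
```

Looking up the wildcard key then gives the total for a tag pair regardless of the argument word, which is the kind of count a backed-off estimate would use.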


stopCounter

protected ClassicCounter<IntDependency> stopCounter

smooth_aT_hTWd

public double smooth_aT_hTWd
Bayesian m-estimate prior for aT given hTWd against base distribution of aT given hTd. TODO: Note that these values are overwritten in the constructor. Find what is best and then maybe remove these defaults!


smooth_aTW_hTWd

public double smooth_aTW_hTWd
Bayesian m-estimate prior for aTW given hTWd against base distribution of aTW given hTd.
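
As a rough illustration of what an m-estimate prior does, the sketch below pulls a relative-frequency estimate toward a base probability. The class MEstimateSketch and the method mEstimate are hypothetical and are not part of this class; the exact computation inside MLEDependencyGrammar may differ.

```java
// Hypothetical helper, not part of the Stanford parser API.
class MEstimateSketch {
    /**
     * Bayesian m-estimate: (count + m * base) / (total + m).
     * With m = 0 this is the plain relative frequency; as m grows,
     * the estimate moves toward the base probability.
     */
    static double mEstimate(double count, double total, double m, double base) {
        double denom = total + m;
        if (denom == 0.0) {
            return base; // no evidence and no prior weight: fall back to base
        }
        return (count + m * base) / denom;
    }

    public static void main(String[] args) {
        // 3 of 10 observations, prior weight 10, base probability 0.1
        System.out.println(mEstimate(3.0, 10.0, 10.0, 0.1));
    }
}
```

Under this reading, a field such as smooth_aTW_hTWd would play the role of m: the number of pseudo-observations granted to the less specific distribution.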


smooth_stop

public double smooth_stop

interp

public double interp
Interpolation weight between the model that directly predicts aTW and the model that predicts aT and then aW given aT. This fraction of the probability mass is placed on the model that directly predicts aTW.
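
The following sketch shows one way such an interpolation can be written. It is a simplified assumption: the conditioning context (head, direction, distance) is omitted, and InterpSketch and interpolate are hypothetical names that do not appear in this class.

```java
// Hypothetical helper, not part of the Stanford parser API.
class InterpSketch {
    /**
     * Mixes a direct estimate of P(aTW) with a factored estimate
     * P(aT) * P(aW | aT). The weight interp goes to the direct model,
     * and the remaining 1 - interp goes to the factored model.
     */
    static double interpolate(double interp, double pDirect,
                              double pTag, double pWordGivenTag) {
        if (!(interp >= 0.0 && interp <= 1.0)) {
            throw new IllegalArgumentException("interp must be in [0, 1]: " + interp);
        }
        return interp * pDirect + (1.0 - interp) * pTag * pWordGivenTag;
    }
}
```

Setting interp to 1.0 would rely entirely on the direct model, and 0.0 entirely on the factored one.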


smooth_aTW_aT

public double smooth_aTW_aT

smooth_aTW_hTd

public double smooth_aTW_hTd

smooth_aT_hTd

public double smooth_aT_hTd

smooth_aPTW_aPT

public double smooth_aPTW_aPT

tagITWList

protected transient java.util.List<IntTaggedWord> tagITWList
The indices of this list are in the tag binned space.


MIN_PROBABILITY

protected static final double MIN_PROBABILITY
See Also:
Constant Field Values
Constructor Detail

MLEDependencyGrammar

public MLEDependencyGrammar(TreebankLangParserParams tlpParams,
                            boolean directional,
                            boolean distance,
                            boolean coarseDistance)

MLEDependencyGrammar

public MLEDependencyGrammar(TagProjection tagProjection,
                            TreebankLangParserParams tlpParams,
                            boolean directional,
                            boolean useDistance,
                            boolean useCoarseDistance)
Method Detail

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

pruneTW

public boolean pruneTW(IntTaggedWord argTW)

treeToDependencyHelper

protected static edu.stanford.nlp.parser.lexparser.MLEDependencyGrammar.EndHead treeToDependencyHelper(Tree tree,
                                                                                                       java.util.List<IntDependency> depList,
                                                                                                       int loc)
Adds dependencies to the list depList. These are in terms of the original tag set, not the reduced (projected) tag set.


dumpSizes

public void dumpSizes()

treeToDependencyList

public static java.util.List<IntDependency> treeToDependencyList(Tree tree)
Returns the List of dependencies for a binarized Tree. In such a tree, one of the two children always shares its head with the parent. The dependencies are in terms of the original tag set, not the reduced (projected) tag set.

Parameters:
tree - A tree to be analyzed as dependencies
Returns:
The list of dependencies in the tree (int format)
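
The idea of reading dependencies off a binarized, head-annotated tree can be sketched without the parser's own types. In the example below, Node is a hypothetical stand-in for Tree, and the output is a list of plain strings rather than IntDependency objects. At each binary node, the child that shares the parent's head position is the head child and the other child supplies the argument. Whether the real method visits nodes in this order, or adds further entries such as a root dependency, is not stated on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types; the real method works on Tree and IntDependency.
class DepListSketch {
    static final class Node {
        final String word;  // head word of this constituent
        final int headPos;  // sentence position of the head word
        final Node left;    // null for a leaf
        final Node right;   // null for a leaf or a unary node

        Node(String word, int headPos, Node left, Node right) {
            this.word = word;
            this.headPos = headPos;
            this.left = left;
            this.right = right;
        }
    }

    /** Returns "head -> argument" strings, children before parents. */
    static List<String> dependencies(Node root) {
        List<String> deps = new ArrayList<>();
        collect(root, deps);
        return deps;
    }

    private static void collect(Node n, List<String> deps) {
        if (n == null || n.left == null) {
            return; // leaf: nothing to attach
        }
        collect(n.left, deps);
        if (n.right == null) {
            return; // unary node: no dependency is created here
        }
        collect(n.right, deps);
        // The child with the parent's head position is the head child.
        Node arg = (n.left.headPos == n.headPos) ? n.right : n.left;
        deps.add(n.word + " -> " + arg.word);
    }
}
```

Heads are compared by position rather than by word so that a sentence containing the same word twice is still handled unambiguously.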

scoreAll

public double scoreAll(java.util.Collection<IntDependency> deps)

tune

public void tune(java.util.Collection<Tree> trees)
Tune the smoothing and interpolation parameters of the dependency grammar based on a tuning treebank.

Specified by:
tune in interface DependencyGrammar
Overrides:
tune in class AbstractDependencyGrammar
Parameters:
trees - A Collection of Trees for setting parameters
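
This page does not say how the parameters are searched. One common approach, shown here purely as an assumption, is to try a handful of candidate values and keep the one that scores best on the tuning data. TuneSketch, bestValue, and the scoring function are hypothetical names.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical illustration of a one-dimensional grid search.
class TuneSketch {
    /**
     * Returns the candidate with the highest held-out score.
     * Ties keep the earliest candidate.
     */
    static double bestValue(double[] candidates, DoubleUnaryOperator heldOutScore) {
        if (candidates.length == 0) {
            throw new IllegalArgumentException("no candidates");
        }
        double best = candidates[0];
        double bestScore = heldOutScore.applyAsDouble(best);
        for (int i = 1; i < candidates.length; i++) {
            double score = heldOutScore.applyAsDouble(candidates[i]);
            if (score > bestScore) {
                bestScore = score;
                best = candidates[i];
            }
        }
        return best;
    }
}
```

In such a setup the scoring function would typically be the total log probability of the dependencies extracted from the tuning trees, evaluated with the candidate value installed in one of the smoothing fields.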

addRule

public void addRule(IntDependency dependency,
                    double count)
Add this dependency with the given count to the grammar. This is the main entry point used by MLEDependencyGrammarExtractor. The dependency is represented in the full tag space.


expandDependency

protected void expandDependency(IntDependency dependency,
                                double count)
The dependency arg is still in the full tag space.

Parameters:
dependency - An observed dependency
count - The weight of the dependency

countHistory

public double countHistory(IntDependency dependency)

scoreTB

public double scoreTB(IntDependency dependency)
Score a tag binned dependency.

Parameters:
dependency - The dependency object to be scored, where the tags in the dependency have already been mapped to a reduced space by a tagProjection function.
Returns:
The log probability given to the dependency by the grammar, which is a non-positive number. This may be Double.NEGATIVE_INFINITY for "impossible".
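
A score that can reach Double.NEGATIVE_INFINITY for an impossible event behaves like the natural logarithm of a probability. The sketch below shows that relationship with the standard Math.log call; LogScoreSketch and its methods are hypothetical names, and any weighting the real method may apply is not modelled.

```java
// Hypothetical helper, not part of the Stanford parser API.
class LogScoreSketch {
    /** Converts a probability in [0, 1] to a log score. */
    static double logScore(double prob) {
        if (!(prob >= 0.0 && prob <= 1.0)) {
            throw new IllegalArgumentException("not a probability: " + prob);
        }
        return Math.log(prob); // Math.log(0.0) is Double.NEGATIVE_INFINITY
    }

    /** Sums log scores, which corresponds to multiplying the probabilities. */
    static double sumLogScores(double[] probs) {
        double total = 0.0;
        for (double p : probs) {
            total += logScore(p);
        }
        return total;
    }
}
```

Because the scores are added, a single impossible dependency makes the whole sum Double.NEGATIVE_INFINITY.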

probTB

protected double probTB(IntDependency dependency)
Calculate the probability of a dependency as a real probability between 0 and 1 inclusive.

Parameters:
dependency - The dependency for which the probability is to be calculated. The tags in this dependency are in the reduced TagProjection space.
Returns:
The probability of the dependency

getStopProb

protected double getStopProb(IntDependency dependency)
Return the probability (as a real number between 0 and 1) of stopping rather than generating another argument at this position.

Parameters:
dependency - The dependency used as the basis for stopping on. Tags are assumed to be in the TagProjection space.
Returns:
The probability of stopping at this position
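
To show how a stop probability enters a generative dependency model, the sketch below computes the probability of a head producing a given sequence of arguments and then stopping. It assumes a single constant stop probability, which is a deliberate simplification: the real value depends on the dependency passed in. StopSketch and sequenceProb are hypothetical names.

```java
// Hypothetical illustration; assumes one constant stop probability.
class StopSketch {
    /**
     * Probability of choosing "continue" before each argument,
     * generating that argument, and finally choosing "stop".
     */
    static double sequenceProb(double pStop, double[] argProbs) {
        double p = 1.0;
        for (double argProb : argProbs) {
            p *= (1.0 - pStop) * argProb;
        }
        return p * pStop;
    }
}
```

With no arguments at all, the result is simply the stop probability itself.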

readData

public void readData(java.io.BufferedReader in)
              throws java.io.IOException
Populates data in this DependencyGrammar from the character stream given by the reader in.

Specified by:
readData in interface DependencyGrammar
Overrides:
readData in class AbstractDependencyGrammar
Throws:
java.io.IOException

writeData

public void writeData(java.io.PrintWriter out)
               throws java.io.IOException
Writes out data from this object to the writer out.

Specified by:
writeData in interface DependencyGrammar
Overrides:
writeData in class AbstractDependencyGrammar
Throws:
java.io.IOException
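
The PrintWriter and BufferedReader parameter types make it easy to exercise a write and read round trip entirely in memory. The sketch below does this for a map of counts using a made-up tab-separated line format; the actual text format used by this class is not documented here, and CountsIoSketch is a hypothetical name.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; the line format is invented for this example.
class CountsIoSketch {
    /** Writes one "key<TAB>count" line per entry. */
    static void writeData(Map<String, Double> counts, PrintWriter out) {
        for (Map.Entry<String, Double> e : counts.entrySet()) {
            out.println(e.getKey() + "\t" + e.getValue());
        }
        out.flush();
    }

    /** Reads lines written by writeData back into a map, in file order. */
    static Map<String, Double> readData(BufferedReader in) throws IOException {
        Map<String, Double> counts = new LinkedHashMap<>();
        String line;
        while ((line = in.readLine()) != null) {
            if (line.isEmpty()) {
                continue;
            }
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                throw new IOException("malformed line: " + line);
            }
            counts.put(line.substring(0, tab),
                       Double.parseDouble(line.substring(tab + 1)));
        }
        return counts;
    }

    /** Writes to an in-memory buffer and reads the result back. */
    static Map<String, Double> roundTrip(Map<String, Double> counts) {
        StringWriter buffer = new StringWriter();
        writeData(counts, new PrintWriter(buffer));
        try {
            return readData(new BufferedReader(new StringReader(buffer.toString())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The same wrapping pattern, a StringWriter inside a PrintWriter and a StringReader inside a BufferedReader, can be used to test any pair of methods with these signatures without touching the file system.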


Stanford NLP Group