edu.stanford.nlp.util
Class Distribution

java.lang.Object
  extended by edu.stanford.nlp.util.Distribution
All Implemented Interfaces:
Serializable

public class Distribution
extends Object
implements Serializable

Immutable class representing normalized, smoothed discrete distributions built from Counters. Smoothed distributions reserve probability mass for unseen items, so querying the probability of an unseen item returns a small positive value. #totalCount should always return 1. The Counter passed in at construction is copied, so later changes to that Counter do not affect the Distribution.

Author:
Galen Andrew (galand@cs.stanford.edu)
See Also:
Serialized Form

Method Summary
static Distribution addOneSmoothedCounter(Counter counter, int numberOfKeys)
          Creates an add-1 smoothed Distribution from the given counter, i.e., adds one count to every item, including unseen ones, and divides by the total count.
 Object argmax()
           
 boolean containsKey(Object key)
           
static Distribution distributionFromLogisticCounter(Counter cntr)
          Maps a counter representing the linear weights of a multiclass logistic regression model to the probabilities of each class.
 boolean equals(Object o)
           
 double getCount(Object key)
          Returns the current count for the given key, which is 0 if it hasn't been seen before.
 int getNumberOfKeys()
           
 double getReservedMass()
           
static Distribution goodTuringSmoothedCounter(Counter counter, int numberOfKeys)
          Creates a Good-Turing smoothed Distribution from the given counter.
 int hashCode()
           
static Distribution jeffreysPerksSmoothedCounter(Counter counter, int numberOfKeys)
          Creates a smoothed Distribution using the Jeffreys-Perks law, i.e., adds one half count to every item, including unseen ones, and divides by the total count.
 Set keySet()
           
static Distribution lidstoneSmoothedCounter(Counter counter, int numberOfKeys, double lambda)
          Creates a smoothed Distribution using Lidstone's law, i.e., adds lambda (typically between 0 and 1) to every item, including unseen ones, and divides by the total count.
static void main(String[] args)
          For internal testing purposes only.
static Distribution normalizedCounter(Counter counter)
          Creates a Distribution from the given counter, i.e., makes an internal copy of the counter and divides all counts by the total count.
static Distribution normalizedCounterWithDirichletPrior(Counter c, Distribution prior, double weight)
          Returns a Distribution that uses prior as a Dirichlet prior weighted by weight.
 double probabilityOf(Object key)
          Returns the normalized count of the given object.
 String toString()
           
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Method Detail

toString

public String toString()

getReservedMass

public double getReservedMass()

getNumberOfKeys

public int getNumberOfKeys()

keySet

public Set keySet()

containsKey

public boolean containsKey(Object key)

getCount

public double getCount(Object key)
Returns the current count for the given key, which is 0 if it hasn't been seen before. This is a convenience version of get that casts the result and extracts the primitive value.


normalizedCounter

public static Distribution normalizedCounter(Counter counter)
Creates a Distribution from the given counter, i.e., makes an internal copy of the counter and divides all counts by the total count.

Parameters:
counter -
Returns:
a new Distribution
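
The normalization described above can be sketched in a few lines. This is a stand-alone illustration, not the library's code: the class name NormalizeSketch and the use of a plain Map in place of Counter are assumptions made so that the example compiles on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; not the Stanford NLP implementation.
class NormalizeSketch {

    // Copies the counts and divides each one by the total, so the values sum to 1.
    static Map<String, Double> normalize(Map<String, Double> counts) {
        double total = 0.0;
        for (double c : counts.values()) {
            total += c;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("total count must be positive");
        }
        Map<String, Double> probs = new HashMap<>();
        for (Map.Entry<String, Double> e : counts.entrySet()) {
            probs.put(e.getKey(), e.getValue() / total);
        }
        return probs;
    }

    public static void main(String[] args) {
        Map<String, Double> counts = new HashMap<>();
        counts.put("a", 3.0);
        counts.put("b", 1.0);
        Map<String, Double> probs = normalize(counts);
        System.out.println("P(a) = " + probs.get("a"));
        System.out.println("P(b) = " + probs.get("b"));
    }
}
```

Because the result is a fresh map, later changes to the input counts do not affect it, which mirrors the copy semantics described for this class.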

addOneSmoothedCounter

public static Distribution addOneSmoothedCounter(Counter counter,
                                                 int numberOfKeys)
Creates an add-1 smoothed Distribution from the given counter, i.e., adds one count to every item, including unseen ones, and divides by the total count.

Parameters:
counter -
numberOfKeys -
Returns:
a new add-1 smoothed Distribution

jeffreysPerksSmoothedCounter

public static Distribution jeffreysPerksSmoothedCounter(Counter counter,
                                                        int numberOfKeys)
Creates a smoothed Distribution using the Jeffreys-Perks law, i.e., adds one half count to every item, including unseen ones, and divides by the total count.

Parameters:
counter -
numberOfKeys -
Returns:
a new Jeffreys-Perks smoothed Distribution

lidstoneSmoothedCounter

public static Distribution lidstoneSmoothedCounter(Counter counter,
                                                   int numberOfKeys,
                                                   double lambda)
Creates a smoothed Distribution using Lidstone's law, i.e., adds lambda (typically between 0 and 1) to every item, including unseen ones, and divides by the total count.

Parameters:
counter -
numberOfKeys -
lambda -
Returns:
a new Lidstone smoothed Distribution
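
Lidstone smoothing covers the two previous methods as special cases: lambda = 1 gives add-1 smoothing and lambda = 0.5 gives the Jeffreys-Perks law. The sketch below shows the arithmetic under two assumptions that the documentation does not state explicitly: numberOfKeys is taken to be the number of all possible keys, seen and unseen, and the divisor is the total count after lambda has been added to each of them. LidstoneSketch and its methods are hypothetical names, and a plain Map stands in for Counter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; not the Stanford NLP implementation.
class LidstoneSketch {

    // P(x) = (count(x) + lambda) / (total + lambda * numberOfKeys) for each seen key x.
    // Assumes numberOfKeys counts every possible key, seen or unseen.
    static Map<String, Double> smooth(Map<String, Double> counts, int numberOfKeys, double lambda) {
        double denominator = denominator(counts, numberOfKeys, lambda);
        Map<String, Double> probs = new HashMap<>();
        for (Map.Entry<String, Double> e : counts.entrySet()) {
            probs.put(e.getKey(), (e.getValue() + lambda) / denominator);
        }
        return probs;
    }

    // Total probability left over for the keys that never appear in counts.
    static double reservedMass(Map<String, Double> counts, int numberOfKeys, double lambda) {
        int unseenKeys = numberOfKeys - counts.size();
        return unseenKeys * lambda / denominator(counts, numberOfKeys, lambda);
    }

    private static double denominator(Map<String, Double> counts, int numberOfKeys, double lambda) {
        if (numberOfKeys < counts.size()) {
            throw new IllegalArgumentException("numberOfKeys must cover all seen keys");
        }
        double total = 0.0;
        for (double c : counts.values()) {
            total += c;
        }
        return total + lambda * numberOfKeys;
    }

    public static void main(String[] args) {
        Map<String, Double> counts = new HashMap<>();
        counts.put("a", 3.0);
        counts.put("b", 1.0);
        // Four possible keys, two of them unseen; lambda = 1 is add-1 smoothing.
        System.out.println("P(a) = " + smooth(counts, 4, 1.0).get("a"));
        System.out.println("reserved = " + reservedMass(counts, 4, 1.0));
    }
}
```

With counts a = 3 and b = 1, four possible keys, and lambda = 1, the divisor is 4 + 1 * 4 = 8, so the seen keys receive 4/8 and 2/8 and the two unseen keys share the remaining 2/8.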

goodTuringSmoothedCounter

public static Distribution goodTuringSmoothedCounter(Counter counter,
                                                     int numberOfKeys)
Creates a Good-Turing smoothed Distribution from the given counter.

Parameters:
counter -
numberOfKeys -
Returns:
a new Good-Turing smoothed Distribution
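
The documentation does not say which Good-Turing variant is used. As a rough illustration of the underlying idea only, the textbook estimate gives unseen items a total mass of N1 / N, where N1 is the number of keys seen exactly once and N is the total count. The sketch below computes just that quantity; GoodTuringSketch and unseenMass are hypothetical names, and integer counts in a plain Map are an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch of the textbook Good-Turing unseen mass;
// the library's actual variant is not described in this documentation.
class GoodTuringSketch {

    // Returns N1 / N: keys seen exactly once, divided by the total count.
    static double unseenMass(Map<String, Integer> counts) {
        long total = 0;
        long singletons = 0;
        for (int c : counts.values()) {
            total += c;
            if (c == 1) {
                singletons++;
            }
        }
        if (total == 0) {
            throw new IllegalArgumentException("total count must be positive");
        }
        return (double) singletons / total;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("a", 3);
        counts.put("b", 1);
        counts.put("c", 1);
        System.out.println("unseen mass = " + unseenMass(counts));
    }
}
```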

normalizedCounterWithDirichletPrior

public static Distribution normalizedCounterWithDirichletPrior(Counter c,
                                                               Distribution prior,
                                                               double weight)
Returns a Distribution that uses prior as a Dirichlet prior weighted by weight. Essentially adds "pseudo-counts" for each Object in prior, proportional to that Object's mass in prior times weight, then normalizes. WARNING: If c contains an item that is unseen in prior, the total may not be 1.

Parameters:
c -
prior -
weight - average "pseudo-count" of Objects in prior
Returns:
new Distribution
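
One way to read the description above is sketched below. It rests on assumptions that the documentation does not confirm: the pseudo-count for a key is taken to be weight times the number of keys in prior times that key's prior probability (so the average pseudo-count per key equals weight), and only keys present in prior receive a probability. DirichletPriorSketch is a hypothetical name, and plain Maps stand in for Counter and Distribution.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; not the Stanford NLP implementation.
class DirichletPriorSketch {

    // Assumed formula: P(x) = (count(x) + weight * K * prior(x)) / (total + weight * K),
    // where K is the number of keys in prior and prior sums to 1 over those keys.
    static Map<String, Double> smooth(Map<String, Double> counts,
                                      Map<String, Double> prior,
                                      double weight) {
        double total = 0.0;
        for (double c : counts.values()) {
            total += c;
        }
        int numberOfKeys = prior.size();
        double denominator = total + weight * numberOfKeys;
        Map<String, Double> probs = new HashMap<>();
        // Only keys known to the prior are emitted. A key that appears in counts
        // but not in prior still adds to the denominator, so the result can sum
        // to less than 1 in that case.
        for (Map.Entry<String, Double> e : prior.entrySet()) {
            double pseudoCount = weight * numberOfKeys * e.getValue();
            double observed = counts.getOrDefault(e.getKey(), 0.0);
            probs.put(e.getKey(), (observed + pseudoCount) / denominator);
        }
        return probs;
    }

    public static void main(String[] args) {
        Map<String, Double> prior = new HashMap<>();
        prior.put("a", 0.5);
        prior.put("b", 0.5);
        Map<String, Double> counts = new HashMap<>();
        counts.put("a", 3.0);
        counts.put("b", 1.0);
        Map<String, Double> probs = smooth(counts, prior, 1.0);
        System.out.println("P(a) = " + probs.get("a"));
        System.out.println("P(b) = " + probs.get("b"));
    }
}
```

In this sketch, adding a key to counts that prior does not contain makes the returned probabilities sum to less than 1, which is one plausible source of the warning; the library's actual behavior may differ.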

distributionFromLogisticCounter

public static Distribution distributionFromLogisticCounter(Counter cntr)
Maps a counter representing the linear weights of a multiclass logistic regression model to the probabilities of each class.


probabilityOf

public double probabilityOf(Object key)
Returns the normalized count of the given object.

Parameters:
key -
Returns:
the normalized count of the object
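
The class description says that unseen items receive a small positive probability, while getReservedMass and getNumberOfKeys expose the quantities involved. One plausible lookup rule, assumed here rather than taken from the source, is to split the reserved mass evenly among the keys that were never seen. ReservedMassSketch is a hypothetical name and a plain Map stands in for the internal Counter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; not the Stanford NLP implementation.
class ReservedMassSketch {
    private final Map<String, Double> probs;
    private final int numberOfKeys;
    private final double reservedMass;

    ReservedMassSketch(Map<String, Double> probs, int numberOfKeys, double reservedMass) {
        this.probs = new HashMap<>(probs); // defensive copy keeps the object immutable
        this.numberOfKeys = numberOfKeys;
        this.reservedMass = reservedMass;
    }

    // Seen keys return their stored probability; unseen keys share the
    // reserved mass equally (an assumption made for this sketch).
    double probabilityOf(String key) {
        Double p = probs.get(key);
        if (p != null) {
            return p;
        }
        int unseenKeys = numberOfKeys - probs.size();
        return unseenKeys > 0 ? reservedMass / unseenKeys : 0.0;
    }

    public static void main(String[] args) {
        Map<String, Double> probs = new HashMap<>();
        probs.put("a", 0.5);
        probs.put("b", 0.25);
        ReservedMassSketch d = new ReservedMassSketch(probs, 4, 0.25);
        System.out.println("P(a) = " + d.probabilityOf("a"));
        System.out.println("P(unseen) = " + d.probabilityOf("z"));
    }
}
```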

argmax

public Object argmax()

equals

public boolean equals(Object o)

hashCode

public int hashCode()

main

public static void main(String[] args)
For internal testing purposes only.



Stanford NLP Group