edu.stanford.nlp.stats

## Class Distributions

• java.lang.Object
• edu.stanford.nlp.stats.Distributions

• ```public class Distributions
extends java.lang.Object```
Static methods for operating on `Distribution`s. In general, if a method operates on a pair of Distribution objects, we assume that the set of possible keys for each Distribution is the same. Therefore we require that d1.numberOfKeys = d2.numberOfKeys and that the number of keys in the union of the two key sets <= numKeys.
Author:
Jeff Michels (jmichels@stanford.edu)
• ### Method Summary

All Methods
| Modifier and Type | Method and Description |
| --- | --- |
| `static <K> Distribution<K>` | `average(Distribution<K> d1, Distribution<K> d2)` |
| `protected static <K> java.util.Set<K>` | `getSetOfAllKeys(Distribution<K> d1, Distribution<K> d2)` |
| `static <K> double` | `informationRadius(Distribution<K> d1, Distribution<K> d2)` Calculates the information radius (aka the Jensen-Shannon divergence) between the two Distributions. |
| `static <K> double` | `jensenShannonDivergence(Distribution<K> d1, Distribution<K> d2)` Calculates the Jensen-Shannon divergence between the two distributions. |
| `static <K> double` | `klDivergence(Distribution<K> from, Distribution<K> to)` Calculates the KL divergence between the two distributions. |
| `static <K> double` | `overlap(Distribution<K> d1, Distribution<K> d2)` Returns a double between 0 and 1 representing the overlap of d1 and d2. |
| `static <K> double` | `skewDivergence(Distribution<K> d1, Distribution<K> d2, double skew)` Calculates the skew divergence between the two distributions. |
| `static <K> Distribution<K>` | `weightedAverage(Distribution<K> d1, double w1, Distribution<K> d2)` Returns a new Distribution with counts averaged from the two given Distributions. |
• ### Methods inherited from class java.lang.Object

`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`
• ### Method Detail

• #### getSetOfAllKeys

```protected static <K> java.util.Set<K> getSetOfAllKeys(Distribution<K> d1,
Distribution<K> d2)```
• #### overlap

```public static <K> double overlap(Distribution<K> d1,
Distribution<K> d2)```
Returns a double between 0 and 1 representing the overlap of d1 and d2. Equals 0 if there is no overlap, and equals 1 iff d1 == d2.
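One measure with the properties described above is the sum, over all keys, of the smaller of the two probabilities. The sketch below illustrates that idea only; it is an assumption about the definition, not the library's implementation. The class `OverlapSketch` is hypothetical and uses a plain `Map<K, Double>` of probabilities as a stand-in for `Distribution<K>`.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: a distribution is a Map from key to probability.
// This is not the Stanford NLP Distribution class.
class OverlapSketch {

  // Sums, over the union of keys, the smaller of the two probabilities.
  // A key missing from one map is treated as probability 0.0 (an assumption).
  static <K> double overlap(Map<K, Double> p1, Map<K, Double> p2) {
    Set<K> keys = new HashSet<>(p1.keySet());
    keys.addAll(p2.keySet());
    double result = 0.0;
    for (K key : keys) {
      result += Math.min(p1.getOrDefault(key, 0.0), p2.getOrDefault(key, 0.0));
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Double> a = Map.of("x", 0.5, "y", 0.5);
    Map<String, Double> b = Map.of("y", 0.5, "z", 0.5);
    System.out.println(overlap(a, a)); // identical distributions
    System.out.println(overlap(a, b)); // distributions sharing only the key "y"
  }
}
```

Under this definition, two distributions with disjoint support score 0, and the score reaches 1 only when every key carries the same probability in both.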
• #### weightedAverage

```public static <K> Distribution<K> weightedAverage(Distribution<K> d1,
double w1,
Distribution<K> d2)```
Returns a new Distribution with counts averaged from the two given Distributions. The average Distribution will contain the union of keys in both source Distributions, and each count will be the weighted average of the two source counts for that key. A missing count in one Distribution is treated as if it has probability equal to that returned by the probabilityOf() function.
Returns:
A new distribution with counts that are the mean of the respective counts in the given distributions, with the remaining probability mass adjusted accordingly.
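The weighting can be sketched as follows. This is a minimal illustration that assumes d2 receives the weight `1 - w1`, which the text above does not state explicitly. The class `WeightedAverageSketch` is hypothetical, represents a distribution as a plain `Map<K, Double>` of probabilities, and treats a missing key as probability 0.0 rather than calling `probabilityOf()`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: a distribution is a Map from key to probability.
// This is not the Stanford NLP Distribution class.
class WeightedAverageSketch {

  // Mixes p1 and p2 over the union of their keys.
  // Assumption: p2 is weighted by (1 - w1); missing keys count as 0.0.
  static <K> Map<K, Double> weightedAverage(Map<K, Double> p1, double w1, Map<K, Double> p2) {
    double w2 = 1.0 - w1;
    Set<K> keys = new HashSet<>(p1.keySet());
    keys.addAll(p2.keySet());
    Map<K, Double> result = new HashMap<>();
    for (K key : keys) {
      result.put(key, w1 * p1.getOrDefault(key, 0.0) + w2 * p2.getOrDefault(key, 0.0));
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Double> a = Map.of("x", 0.5, "y", 0.5);
    Map<String, Double> b = Map.of("y", 0.5, "z", 0.5);
    // The result contains the union of keys: x, y and z.
    System.out.println(weightedAverage(a, 0.25, b));
  }
}
```

Because the two weights sum to 1, the mixture of two normalized inputs is itself normalized.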
• #### average

```public static <K> Distribution<K> average(Distribution<K> d1,
Distribution<K> d2)```
• #### klDivergence

```public static <K> double klDivergence(Distribution<K> from,
Distribution<K> to)```
Calculates the KL divergence between the two distributions. That is, it calculates KL(from || to). In other words, how well can `from` be represented by `to`. If there is some value in `from` that gets zero probability in `to`, then positive infinity is returned.
Returns:
The KL divergence between the distributions
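The computation of KL(from || to) can be sketched as a sum of `p * log(p / q)` terms. The sketch below is an illustration rather than the library's code: the class `KlDivergenceSketch` is hypothetical, a distribution is represented as a plain `Map<K, Double>` of probabilities, and the choice of log base 2 is an assumption.

```java
import java.util.Map;

// Hypothetical stand-in: a distribution is a Map from key to probability.
// This is not the Stanford NLP Distribution class.
class KlDivergenceSketch {

  // Computes KL(from || to) in bits (log base 2 is an assumption of this sketch).
  static <K> double klDivergence(Map<K, Double> from, Map<K, Double> to) {
    double result = 0.0;
    for (Map.Entry<K, Double> entry : from.entrySet()) {
      double p = entry.getValue();
      if (p == 0.0) {
        continue; // a zero-probability key contributes nothing
      }
      double q = to.getOrDefault(entry.getKey(), 0.0);
      if (q == 0.0) {
        // "from" puts mass where "to" has none: the divergence is infinite
        return Double.POSITIVE_INFINITY;
      }
      result += p * (Math.log(p / q) / Math.log(2.0));
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Double> from = Map.of("a", 1.0);
    Map<String, Double> to = Map.of("a", 0.5, "b", 0.5);
    System.out.println(klDivergence(from, to)); // finite: "to" covers "a"
    System.out.println(klDivergence(to, from)); // infinite: "from" lacks "b"
  }
}
```

The two calls in `main` show that the measure is asymmetric, which is why the argument order of `from` and `to` matters.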
• #### jensenShannonDivergence

```public static <K> double jensenShannonDivergence(Distribution<K> d1,
Distribution<K> d2)```
Calculates the Jensen-Shannon divergence between the two distributions. That is, it calculates 1/2 [KL(d1 || avg(d1,d2)) + KL(d2 || avg(d1,d2))].
Returns:
The Jensen-Shannon divergence between the distributions
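The formula above can be sketched directly from its two ingredients, an unweighted average and a KL divergence. This is an illustration under stated assumptions, not the library's code: the class `JensenShannonSketch` and its helpers are hypothetical, a distribution is a plain `Map<K, Double>` of probabilities, missing keys count as 0.0, and logs are taken in base 2.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: a distribution is a Map from key to probability.
// This is not the Stanford NLP Distribution class.
class JensenShannonSketch {

  // KL(from || to) in bits; infinite if "from" has mass where "to" has none.
  static <K> double kl(Map<K, Double> from, Map<K, Double> to) {
    double result = 0.0;
    for (Map.Entry<K, Double> entry : from.entrySet()) {
      double p = entry.getValue();
      if (p == 0.0) {
        continue;
      }
      double q = to.getOrDefault(entry.getKey(), 0.0);
      if (q == 0.0) {
        return Double.POSITIVE_INFINITY;
      }
      result += p * (Math.log(p / q) / Math.log(2.0));
    }
    return result;
  }

  // Equal-weight mixture of p1 and p2 over the union of their keys.
  static <K> Map<K, Double> average(Map<K, Double> p1, Map<K, Double> p2) {
    Set<K> keys = new HashSet<>(p1.keySet());
    keys.addAll(p2.keySet());
    Map<K, Double> avg = new HashMap<>();
    for (K key : keys) {
      avg.put(key, 0.5 * p1.getOrDefault(key, 0.0) + 0.5 * p2.getOrDefault(key, 0.0));
    }
    return avg;
  }

  // 1/2 [KL(p1 || avg) + KL(p2 || avg)]
  static <K> double jensenShannonDivergence(Map<K, Double> p1, Map<K, Double> p2) {
    Map<K, Double> avg = average(p1, p2);
    return 0.5 * kl(p1, avg) + 0.5 * kl(p2, avg);
  }

  public static void main(String[] args) {
    Map<String, Double> a = Map.of("x", 1.0);
    Map<String, Double> b = Map.of("y", 1.0);
    System.out.println(jensenShannonDivergence(a, b)); // disjoint support
    System.out.println(jensenShannonDivergence(a, a)); // identical distributions
  }
}
```

Since the average assigns positive probability to every key that either input does, both KL terms stay finite, and swapping the arguments leaves the result unchanged.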
• #### skewDivergence

```public static <K> double skewDivergence(Distribution<K> d1,
Distribution<K> d2,
double skew)```
Calculates the skew divergence between the two distributions. That is, it calculates KL(d1 || (d2*skew + d1*(1-skew))). In other words, how well can d1 be represented by a "smoothed" d2.
Returns:
The skew divergence between the distributions
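The smoothing step can be sketched by first mixing d1 into d2 and then taking the KL divergence against that mixture. This is an illustration under stated assumptions, not the library's code: the class `SkewDivergenceSketch` and its helper are hypothetical, a distribution is a plain `Map<K, Double>` of probabilities, missing keys count as 0.0, and logs are taken in base 2.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: a distribution is a Map from key to probability.
// This is not the Stanford NLP Distribution class.
class SkewDivergenceSketch {

  // KL(from || to) in bits; infinite if "from" has mass where "to" has none.
  static <K> double kl(Map<K, Double> from, Map<K, Double> to) {
    double result = 0.0;
    for (Map.Entry<K, Double> entry : from.entrySet()) {
      double p = entry.getValue();
      if (p == 0.0) {
        continue;
      }
      double q = to.getOrDefault(entry.getKey(), 0.0);
      if (q == 0.0) {
        return Double.POSITIVE_INFINITY;
      }
      result += p * (Math.log(p / q) / Math.log(2.0));
    }
    return result;
  }

  // KL(p1 || skew * p2 + (1 - skew) * p1)
  static <K> double skewDivergence(Map<K, Double> p1, Map<K, Double> p2, double skew) {
    Set<K> keys = new HashSet<>(p1.keySet());
    keys.addAll(p2.keySet());
    Map<K, Double> smoothed = new HashMap<>();
    for (K key : keys) {
      double a = p1.getOrDefault(key, 0.0);
      double b = p2.getOrDefault(key, 0.0);
      smoothed.put(key, skew * b + (1.0 - skew) * a);
    }
    return kl(p1, smoothed);
  }

  public static void main(String[] args) {
    Map<String, Double> d1 = Map.of("x", 1.0);
    Map<String, Double> d2 = Map.of("y", 1.0);
    System.out.println(skewDivergence(d1, d2, 0.5)); // finite thanks to smoothing
    System.out.println(skewDivergence(d1, d2, 1.0)); // no smoothing: plain KL(d1 || d2)
  }
}
```

With `skew` strictly below 1, the mixture keeps some of d1's own mass on every key d1 uses, so the result stays finite even when d2 assigns zero probability there.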
```public static <K> double informationRadius(Distribution<K> d1,