edu.stanford.nlp.stats
Class Counters

java.lang.Object
  extended by edu.stanford.nlp.stats.Counters

public class Counters
extends Object

Static methods for operating on Counters.

All methods that change their arguments change only the first argument, and have "InPlace" in their name. This class also provides access to Comparators that can be used to sort the keys or entries of a Counter by their counts, in either ascending or descending order.

Author:
Galen Andrew (galand@cs.stanford.edu), Jeff Michels (jmichels@stanford.edu), dramage, daniel cer (http://dmcer.net), Christopher Manning, stefank (Optimized dot product)

Method Summary
static
<E> Counter<E>
absoluteDifference(Counter<E> c1, Counter<E> c2)
          Returns |c1 - c2|.
static
<E> void
addInPlace(Counter<E> target, Collection<E> arg)
          Sets each value of target to be target[k]+ num-of-times-it-occurs-in-collection if the key is present in the arg collection.
static
<E> void
addInPlace(Counter<E> target, Counter<E> arg)
          Sets each value of target to be target[k]+arg[k] for all keys k in arg.
static
<E> void
addInPlace(Counter<E> target, Counter<E> arg, double scale)
          Sets each value of target to be target[k]+scale*arg[k] for all keys k in target.
static
<E> void
addInPlace(double[] target, Counter<E> arg, Index<E> idx)
          Sets each value of double[] target to be target[idx.indexOf(k)]+arg.getCount(k) for all keys k in arg.
static
<T1,T2> void
addInPlace(TwoDimensionalCounter<T1,T2> target, TwoDimensionalCounter<T1,T2> arg)
          For all keys (u,v) in arg, sets target[u,v] to be target[u,v] + arg[u,v]
static
<T1,T2> void
addInPlace(TwoDimensionalCounter<T1,T2> target, TwoDimensionalCounter<T1,T2> arg, double scale)
          For all keys (u,v) in arg, sets target[u,v] to be target[u,v] + scale * arg[u,v]
static
<E> E
argmax(Counter<E> c)
          Finds and returns the key in the Counter with the largest count.
static
<E> E
argmin(Counter<E> c)
          Finds and returns the key in this Counter with the smallest count.
static
<E> Counter<E>
asCounter(Collection<E> c)
          Takes in a Collection of something and makes a counter, incrementing once for each object in the collection.
static
<E> Counter<E>
asCounter(FixedPrioritiesPriorityQueue<E> p)
          Returns a counter whose keys are the elements in this priority queue, and whose counts are the priorities in this queue.
static
<E> Map<E,Double>
asMap(Counter<E> counter)
          Returns a map view of the given counter.
static
<E> Counter<E>
average(Counter<E> c1, Counter<E> c2)
          Returns a new Counter with counts averaged from the two given Counters.
static
<E> double
cosine(Counter<E> c1, Counter<E> c2)
           
static
<E> double
crossEntropy(Counter<E> from, Counter<E> to)
          Note that this implementation doesn't normalize the "from" Counter.
static
<E> List<E>
deleteOutofRange(Counter<E> c, int top, int bottom)
          Deletes 'top' elements from the top and 'bottom' elements from the bottom of the counter, respectively.
static
<T> ClassicCounter<T>
deserializeCounter(String filename)
           
static
<T> Counter<T>
diff(Counter<T> goldFeatures, Counter<T> guessedFeatures)
           
static
<E> void
divideInPlace(Counter<E> target, Counter<E> denominator)
          Divides every non-zero count in target by the corresponding value in the denominator Counter.
static
<E> Counter<E>
divideInPlace(Counter<E> target, double divisor)
          Divides each value in target by the given divisor, in place.
static
<E> Counter<E>
division(Counter<E> c1, Counter<E> c2)
          Returns c1 divided by c2.
static
<E> double
dotProduct(Counter<E> c1, Counter<E> c2)
          Returns the dot product of c1 and c2.
static
<E> double
dotProduct(Counter<E> c, double[] a, Index<E> idx)
          Returns the dot product of Counter c and double[] a, using Index idx to map entries in c onto a.
static
<E> void
dotProductInPlace(Counter<E> target, Counter<E> term)
          Multiplies every count in target by the corresponding value in the term Counter.
static
<E> double
entropy(Counter<E> c)
          Calculates the entropy of the given counter (in bits).
static
<E> boolean
equals(Counter<E> o1, Counter<E> o2)
          Default equality comparison for two counters potentially backed by alternative implementations.
static
<T> Counter<T>
exp(Counter<T> c)
           
static
<T> void
expInPlace(Counter<T> c)
           
static
<E,N extends Number>
Counter<E>
fromMap(Map<E,N> map)
          Returns a counter view of the given map.
static
<E,N extends Number>
Counter<E>
fromMap(Map<E,N> map, Class<N> type)
          Returns a counter view of the given map.
static
<E> Counter<E>
getCopy(Counter<E> originalCounter)
           
static
<E> Counter<Double>
getCountCounts(Counter<E> c)
           
static
<E> int
hIndex(Counter<E> citationCounts)
          Calculate h-Index (Hirsch, 2005) of an author.
static
<E> Counter<E>
intersection(Counter<E> c1, Counter<E> c2)
          Returns a counter that is the intersection of c1 and c2.
static
<E> double
jaccardCoefficient(Counter<E> c1, Counter<E> c2)
          Returns the Jaccard Coefficient of the two counters.
static
<E> double
jensenShannonDivergence(Counter<E> c1, Counter<E> c2)
          Calculates the Jensen-Shannon divergence between the two counters.
static
<E> Set<E>
keysAbove(Counter<E> c, double countThreshold)
          Returns the set of keys whose counts are at or above the given threshold.
static
<E> Set<E>
keysAt(Counter<E> c, double count)
          Returns the set of keys that have exactly the given count.
static
<E> Set<E>
keysBelow(Counter<E> c, double countThreshold)
          Returns the set of keys whose counts are at or below the given threshold.
static
<E> double
klDivergence(Counter<E> from, Counter<E> to)
          Calculates the KL divergence between the two counters.
static
<E,C extends Counter<E>>
double
L1Norm(C c)
          Return the L1 norm of a counter.
static
<E,C extends Counter<E>>
double
L2Norm(C c)
          Return the l2 norm (Euclidean vector length) of a Counter.
static
<E,C extends Counter<E>>
C
L2Normalize(C c)
          L2 normalize a counter.
static
<E,C extends Counter<E>>
Counter<E>
L2NormalizeInPlace(Counter<E> c)
          L2 normalize a counter in place.
static
<E> Counter<E>
linearCombination(Counter<E> c1, double w1, Counter<E> c2, double w2)
          Returns a Counter which is a weighted average of c1 and c2.
static
<T1,T2> TwoDimensionalCounter<T1,T2>
load2DCounter(String filename, Class<T1> t1, Class<T2> t2)
           
static
<E> ClassicCounter<E>
loadCounter(String filename, Class<E> c)
          Loads a Counter from a text file.
static
<E> IntCounter<E>
loadIntCounter(String filename, Class<E> c)
          Loads a Counter from a text file.
static
<E> void
logInPlace(Counter<E> target)
           
static
<E> void
logNormalizeInPlace(Counter<E> c)
          Transform log space values into a probability distribution in place.
static
<E> double
logSum(Counter<E> c)
          Returns ArrayMath.logSum of the values in this counter.
static
<E> double
max(Counter<E> c)
          Returns the value of the maximum entry in this counter.
static
<E> double
mean(Counter<E> c)
          Returns the mean of all the counts (totalCount/size).
static
<E> double
min(Counter<E> c)
          Returns the value of the smallest entry in this counter.
static
<E> Counter<E>
multiplyInPlace(Counter<E> target, double multiplier)
          Multiplies each value in target by the given multiplier, in place.
static
<E> void
normalize(Counter<E> target)
          Normalizes the target counter in-place, so the sum of the resulting values equals 1.
static
<E> double
optimizedDotProduct(Counter<E> c1, Counter<E> c2)
          This method does not check entries for NaN or infinite values in the doubles returned.
static
<E,C extends Counter<E>>
C
perturbCounts(C c, Random random, double p)
           
static
<T1,T2> double
pointwiseMutualInformation(Counter<T1> var1Distribution, Counter<T2> var2Distribution, Counter<MutablePair<T1,T2>> jointDistribution, MutablePair<T1,T2> values)
           
static
<T> Counter<T>
pow(Counter<T> c, double temp)
           
static
<T> void
powInPlace(Counter<T> c, double temp)
           
static
<E> Counter<E>
powNormalized(Counter<E> c, double temp)
          Returns a counter where each element corresponds to the normalized count of the corresponding element in c raised to the given power.
static
<E> void
printCounterComparison(Counter<E> a, Counter<E> b)
          Great for debugging.
static
<E> void
printCounterComparison(Counter<E> a, Counter<E> b, PrintStream out)
          Great for debugging.
static
<E extends Comparable<E>>
void
printCounterSortedByKeys(Counter<E> c)
           
static
<E> Counter<E>
product(Counter<E> c1, Counter<E> c2)
          Returns the product of c1 and c2.
static
<E> void
removeKeys(Counter<E> counter, Collection<E> removeKeysCollection)
          Removes all entries with keys in the given collection
static
<E> E
restrictedArgMax(Counter<E> c, Collection<E> restriction)
           
static
<E> Set<E>
retainAbove(Counter<E> counter, double countThreshold)
          Removes all entries with counts below the given threshold, returning the set of removed entries.
static
<E> Set<E>
retainBelow(Counter<E> counter, double countMaxThreshold)
          Removes all entries with counts above the given threshold, returning the set of removed entries.
static
<E> void
retainBottom(Counter<E> c, int num)
          Removes all entries from c except for the bottom num
static Set<String> retainMatchingKeys(Counter<String> counter, List<Pattern> matchPatterns)
          Removes all entries with keys that do not match one of the given patterns.
static
<E> Set<E>
retainNonZeros(Counter<E> counter)
          Removes all entries with 0 count in the counter, returning the set of removed entries.
static
<E> void
retainTop(Counter<E> c, int num)
          Removes all entries from c except for the top num
static
<E,C extends Counter<E>>
double
saferL2Norm(C c)
          For counters with a large number of entries, this scales down each entry in the sum, to prevent an extremely large sum from building up and overwhelming the max double.
static
<E,C extends Counter<E>>
C
saferL2Normalize(C c)
          L2 normalize a counter, using the "safer" L2 normalizer.
static
<T> T
sample(Counter<T> c)
          Does not assume c is normalized.
static
<T> T
sample(Counter<T> c, Random rand)
          Does not assume c is normalized.
static
<T1,T2> void
save2DCounter(TwoDimensionalCounter<T1,T2> tdc, String filename)
           
static
<E> void
saveCounter(Counter<E> c, OutputStream stream)
          Saves a Counter as one key/count pair per line separated by white space to the given OutputStream.
static
<E> void
saveCounter(Counter<E> c, String filename)
          Saves a Counter to a text file.
static
<E,C extends Counter<E>>
C
scale(C c, double s)
          Returns a new Counter which is scaled by the given scale factor.
static
<T1,T2> TwoDimensionalCounter<T1,T2>
scale(TwoDimensionalCounter<T1,T2> c, double d)
          Creates a new TwoDimensionalCounter where all the counts are scaled by d.
static
<T> void
serializeCounter(Counter<T> c, String filename)
           
static
<E> double
skewDivergence(Counter<E> c1, Counter<E> c2, double skew)
          Calculates the skew divergence between the two counters.
static
<E> void
subtractInPlace(Counter<E> target, Counter<E> arg)
          Sets each value of target to be target[k]-arg[k] for all keys k in target.
static
<E> void
subtractInPlace(double[] target, Counter<E> arg, Index<E> idx)
          Sets each value of double[] target to be target[idx.indexOf(k)]-arg.getCount(k) for all keys k in arg.
static
<E> double
sumEntries(Counter<E> c1, Collection<E> entries)
           
static
<E,C extends Counter<E>>
C
tfLogScale(C c, double base)
          Returns a new Counter which is the input counter with log tf scaling
static
<E> String
toBiggestValuesFirstString(Counter<E> c)
           
static
<E> String
toBiggestValuesFirstString(Counter<E> c, int k)
           
static
<T> String
toBiggestValuesFirstString(Counter<Integer> c, int k, Index<T> index)
           
static
<E> Comparator<E>
toComparator(Counter<E> counter)
          Returns a comparator backed by this counter: two objects are compared by their associated values stored in the counter.
static
<E> Comparator<E>
toComparator(Counter<E> counter, boolean ascending, boolean useMagnitude)
          Returns a comparator suitable for sorting this Counter's keys or entries by their respective value or magnitude (by absolute value).
static
<E> Comparator<E>
toComparatorDescending(Counter<E> counter)
          Returns a comparator backed by this counter: two objects are compared by their associated values stored in the counter.
static
<T> Counter<T>
toCounter(double[] counts, Index<T> index)
           
static
<E> Counter<E>
toCounter(Map<Integer,? extends Number> counts, Index<E> index)
          Turns the given map and index into a counter instance.
static
<E> List<MutablePair<E,Double>>
toDescendingMagnitudeSortedListWithCounts(Counter<E> c)
           
static
<E> PriorityQueue<E>
toPriorityQueue(Counter<E> c)
          Returns a PriorityQueue whose elements are the keys of Counter c, and the score of each key in c becomes its priority.
static
<T extends Comparable<T>>
String
toSortedByKeysString(Counter<T> counter, String itemFormat, String joiner, String wrapperFormat)
          Returns a string representation of a Counter, where (key, value) pairs are sorted by key, and formatted as specified.
static
<E> List<E>
toSortedList(Counter<E> c)
          A List of the keys in c, sorted from highest count to lowest.
static
<E> List<MutablePair<E,Double>>
toSortedListWithCounts(Counter<E> c)
          A List of the keys in c, sorted from highest count to lowest, paired with counts
static
<T> String
toSortedString(Counter<T> counter, int k, String itemFormat, String joiner)
          Returns a string representation of a Counter, displaying the keys and their counts in decreasing order of count.
static
<T> String
toSortedString(Counter<T> counter, int k, String itemFormat, String joiner, String wrapperFormat)
          Returns a string representation of a Counter, displaying the keys and their counts in decreasing order of count.
static
<E> String
toString(Counter<E> counter, int maxKeysToPrint)
          Returns a string representation which includes no more than the maxKeysToPrint elements with largest counts.
static
<E> String
toString(Counter<E> counter, NumberFormat nf)
           
static
<E> String
toString(Counter<E> counter, NumberFormat nf, String preAppend, String postAppend, String keyValSeparator, String itemSeparator)
          Pretty print a Counter.
static
<E> String
toVerticalString(Counter<E> c)
           
static
<E> String
toVerticalString(Counter<E> c, int k)
           
static
<E> String
toVerticalString(Counter<E> c, int k, String fmt)
           
static
<E> String
toVerticalString(Counter<E> c, int k, String fmt, boolean swap)
          Returns a String representation of the k keys with the largest counts in the given Counter, using the given format string.
static
<E> String
toVerticalString(Counter<E> c, String fmt)
           
static
<T1,T2> Counter<T2>
transform(Counter<T1> c, Function<T1,T2> f)
          Returns the counter with keys modified according to function f.
static
<E,C extends Counter<E>>
C
union(C c1, C c2)
          Returns a Counter that is the union of the two Counters passed in (counts are added).
static
<T> Counter<T>
unmodifiableCounter(Counter<T> counter)
          Returns unmodifiable view of the counter.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

logSum

public static <E> double logSum(Counter<E> c)
Returns ArrayMath.logSum of the values in this counter.

Parameters:
c - Argument counter (which is not modified)
Returns:
ArrayMath.logSum of the values in this counter.

logNormalizeInPlace

public static <E> void logNormalizeInPlace(Counter<E> c)
Transform log space values into a probability distribution in place. On the assumption that the values in the Counter are in log space, this method calculates the log of the sum of their exponentials, and then subtracts that log-sum from each element. That is, if a counter has keys c1, c2, c3 with values v1, v2, v3, the value of c1 becomes v1 - log(e^v1 + e^v2 + e^v3). After this, the updated values satisfy e^v1 + e^v2 + e^v3 = 1.0, so Counters.logSum(c) = 0.0 (approximately).

Parameters:
c - The Counter to log normalize in place
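
The arithmetic described above can be sketched without the library. The following is only an illustration, not the library's implementation: it assumes a plain Map<String, Double> as a stand-in for a Counter, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of a Counter.
final class LogNormalizeSketch {
    // Computes log(sum(exp(v))) over the values; subtracting the max first avoids overflow.
    static double logSum(Map<String, Double> logValues) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : logValues.values()) max = Math.max(max, v);
        if (max == Double.NEGATIVE_INFINITY) return max; // empty map, or all values are -infinity
        double sum = 0.0;
        for (double v : logValues.values()) sum += Math.exp(v - max);
        return max + Math.log(sum);
    }

    // Subtracts the log-sum from every value, so that exp of the values sums to one.
    static void logNormalizeInPlace(Map<String, Double> logValues) {
        double total = logSum(logValues);
        logValues.replaceAll((k, v) -> v - total);
    }

    public static void main(String[] args) {
        Map<String, Double> m = new HashMap<>();
        m.put("a", Math.log(1.0));
        m.put("b", Math.log(3.0));
        logNormalizeInPlace(m);
        System.out.println(Math.exp(m.get("a")) + " " + Math.exp(m.get("b")));
    }
}
```

After normalization, logSum of the map should be approximately 0.0, mirroring the property stated for Counters.logSum.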

max

public static <E> double max(Counter<E> c)
Returns the value of the maximum entry in this counter. This is also the L-infinity norm. An empty counter is given a max value of Double.NEGATIVE_INFINITY.

Parameters:
c - The Counter to find the max of
Returns:
The maximum value of the Counter

asCounter

public static <E> Counter<E> asCounter(Collection<E> c)
Takes in a Collection of something and makes a counter, incrementing once for each object in the collection.

Parameters:
c - The Collection to turn into a counter
Returns:
The counter made out of the collection

min

public static <E> double min(Counter<E> c)
Returns the value of the smallest entry in this counter.

Parameters:
c - The Counter (not modified)
Returns:
The minimum value in the Counter

argmax

public static <E> E argmax(Counter<E> c)
Finds and returns the key in the Counter with the largest count. Returns null if the Counter is empty.

Parameters:
c - The Counter
Returns:
The key in the Counter with the largest count.
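
The empty-counter conventions for max and argmax can be illustrated with a small sketch. It assumes a Map<String, Double> as a stand-in for a Counter; the names are hypothetical, and the tie-breaking shown (first key encountered wins) is an assumption, since this page does not say how ties are resolved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of a Counter.
final class ArgmaxSketch {
    // Returns the key with the largest value, or null for an empty map.
    // Assumption: on ties, the first key seen in iteration order is kept.
    static String argmax(Map<String, Double> counts) {
        String best = null;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : counts.entrySet()) {
            if (best == null || e.getValue() > bestValue) {
                best = e.getKey();
                bestValue = e.getValue();
            }
        }
        return best;
    }

    // Returns the largest value, or Double.NEGATIVE_INFINITY for an empty map.
    static double max(Map<String, Double> counts) {
        double result = Double.NEGATIVE_INFINITY;
        for (double v : counts.values()) result = Math.max(result, v);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> counts = new HashMap<>();
        counts.put("a", 1.0);
        counts.put("b", 5.0);
        System.out.println(argmax(counts) + " " + max(counts));
    }
}
```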

argmin

public static <E> E argmin(Counter<E> c)
Finds and returns the key in this Counter with the smallest count.

Parameters:
c - The Counter
Returns:
The key in the Counter with the smallest count.

mean

public static <E> double mean(Counter<E> c)
Returns the mean of all the counts (totalCount/size).

Parameters:
c - The Counter to find the mean of.
Returns:
The mean of all the counts (totalCount/size).

addInPlace

public static <E> void addInPlace(Counter<E> target,
                                  Counter<E> arg,
                                  double scale)
Sets each value of target to be target[k]+scale*arg[k] for all keys k in target.

Parameters:
target - A Counter that is modified
arg - The Counter whose contents are added to target
scale - How the arg Counter is scaled before being added

addInPlace

public static <E> void addInPlace(Counter<E> target,
                                  Counter<E> arg)
Sets each value of target to be target[k]+arg[k] for all keys k in arg.
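
A minimal sketch of this update rule, assuming a Map<String, Double> as a stand-in for a Counter and treating a missing key as a count of zero (both assumptions, with hypothetical names), might look like this:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of a Counter.
final class AddInPlaceSketch {
    // For every key k in arg, sets target[k] to target[k] + arg[k].
    // Keys missing from target are treated as having count 0 (an assumption).
    static void addInPlace(Map<String, Double> target, Map<String, Double> arg) {
        for (Map.Entry<String, Double> e : arg.entrySet()) {
            target.merge(e.getKey(), e.getValue(), Double::sum);
        }
    }

    public static void main(String[] args) {
        Map<String, Double> target = new HashMap<>();
        target.put("a", 1.0);
        Map<String, Double> arg = new HashMap<>();
        arg.put("a", 2.0);
        arg.put("b", 3.0);
        addInPlace(target, arg);
        System.out.println(target);
    }
}
```

Only the first argument is modified, which matches the convention stated in the class description.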


addInPlace

public static <E> void addInPlace(double[] target,
                                  Counter<E> arg,
                                  Index<E> idx)
Sets each value of double[] target to be target[idx.indexOf(k)]+arg.getCount(k) for all keys k in arg.


addInPlace

public static <T1,T2> void addInPlace(TwoDimensionalCounter<T1,T2> target,
                                      TwoDimensionalCounter<T1,T2> arg,
                                      double scale)
For all keys (u,v) in arg, sets target[u,v] to be target[u,v] + scale * arg[u,v]


addInPlace

public static <T1,T2> void addInPlace(TwoDimensionalCounter<T1,T2> target,
                                      TwoDimensionalCounter<T1,T2> arg)
For all keys (u,v) in arg, sets target[u,v] to be target[u,v] + arg[u,v]


addInPlace

public static <E> void addInPlace(Counter<E> target,
                                  Collection<E> arg)
Sets each value of target to be target[k]+ num-of-times-it-occurs-in-collection if the key is present in the arg collection.


subtractInPlace

public static <E> void subtractInPlace(Counter<E> target,
                                       Counter<E> arg)
Sets each value of target to be target[k]-arg[k] for all keys k in target.


subtractInPlace

public static <E> void subtractInPlace(double[] target,
                                       Counter<E> arg,
                                       Index<E> idx)
Sets each value of double[] target to be target[idx.indexOf(k)]-arg.getCount(k) for all keys k in arg.


divideInPlace

public static <E> void divideInPlace(Counter<E> target,
                                     Counter<E> denominator)
Divides every non-zero count in target by the corresponding value in the denominator Counter. Beware that this can give NaN values for zero counts in the denominator counter!


dotProductInPlace

public static <E> void dotProductInPlace(Counter<E> target,
                                         Counter<E> term)
Multiplies every count in target by the corresponding value in the term Counter.


divideInPlace

public static <E> Counter<E> divideInPlace(Counter<E> target,
                                           double divisor)
Divides each value in target by the given divisor, in place.

Parameters:
target - The values in this Counter will be divided by the divisor
divisor - The number by which to divide each number in the Counter
Returns:
The target Counter is returned (for easier method chaining)

multiplyInPlace

public static <E> Counter<E> multiplyInPlace(Counter<E> target,
                                             double multiplier)
Multiplies each value in target by the given multiplier, in place.

Parameters:
target - The values in this Counter will be multiplied by the multiplier
multiplier - The number by which to multiply each number in the Counter

normalize

public static <E> void normalize(Counter<E> target)
Normalizes the target counter in-place, so the sum of the resulting values equals 1.


logInPlace

public static <E> void logInPlace(Counter<E> target)

deleteOutofRange

public static <E> List<E> deleteOutofRange(Counter<E> c,
                                           int top,
                                           int bottom)
Deletes 'top' elements from the top and 'bottom' elements from the bottom of the counter, respectively.


retainTop

public static <E> void retainTop(Counter<E> c,
                                 int num)
Removes all entries from c except for the top num


retainBottom

public static <E> void retainBottom(Counter<E> c,
                                    int num)
Removes all entries from c except for the bottom num


retainNonZeros

public static <E> Set<E> retainNonZeros(Counter<E> counter)
Removes all entries with 0 count in the counter, returning the set of removed entries.


retainAbove

public static <E> Set<E> retainAbove(Counter<E> counter,
                                     double countThreshold)
Removes all entries with counts below the given threshold, returning the set of removed entries.

Parameters:
counter - The counter.
countThreshold - The minimum count for an entry to be kept. Entries (strictly) less than this threshold are discarded.
Returns:
The set of discarded entries.
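
The strict-threshold behaviour described above (entries strictly below the threshold are removed, and the removed keys are returned) can be sketched as follows. This is only an illustration under the assumption that a Map<String, Double> stands in for a Counter; the names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: a Map<String, Double> plays the role of a Counter.
final class RetainAboveSketch {
    // Removes entries whose count is strictly less than the threshold
    // and returns the keys that were removed.
    static Set<String> retainAbove(Map<String, Double> counter, double countThreshold) {
        Set<String> removed = new HashSet<>();
        Iterator<Map.Entry<String, Double>> it = counter.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Double> e = it.next();
            if (e.getValue() < countThreshold) {
                removed.add(e.getKey());
                it.remove(); // safe removal while iterating
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Map<String, Double> counter = new HashMap<>();
        counter.put("a", 1.0);
        counter.put("b", 2.0);
        counter.put("c", 3.0);
        System.out.println(retainAbove(counter, 2.0) + " " + counter.keySet());
    }
}
```

An entry whose count equals the threshold is kept, which is the boundary case the parameter description calls out.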

retainBelow

public static <E> Set<E> retainBelow(Counter<E> counter,
                                     double countMaxThreshold)
Removes all entries with counts above the given threshold, returning the set of removed entries.

Parameters:
counter - The counter.
countMaxThreshold - The maximum count for an entry to be kept. Entries (strictly) more than this threshold are discarded.
Returns:
The set of discarded entries.

retainMatchingKeys

public static Set<String> retainMatchingKeys(Counter<String> counter,
                                             List<Pattern> matchPatterns)
Removes all entries with keys that do not match one of the given patterns.

Parameters:
counter - The counter.
matchPatterns - The patterns against which each key is matched
Returns:
The set of discarded entries.

removeKeys

public static <E> void removeKeys(Counter<E> counter,
                                  Collection<E> removeKeysCollection)
Removes all entries with keys in the given collection


keysAbove

public static <E> Set<E> keysAbove(Counter<E> c,
                                   double countThreshold)
Returns the set of keys whose counts are at or above the given threshold. This set may have 0 elements but will not be null.

Parameters:
c - The Counter to examine
countThreshold - Items equal to or above this number are kept
Returns:
A (non-null) Set of keys whose counts are at or above the given threshold.

keysBelow

public static <E> Set<E> keysBelow(Counter<E> c,
                                   double countThreshold)
Returns the set of keys whose counts are at or below the given threshold. This set may have 0 elements but will not be null.


keysAt

public static <E> Set<E> keysAt(Counter<E> c,
                                double count)
Returns the set of keys that have exactly the given count. This set may have 0 elements but will not be null.


transform

public static <T1,T2> Counter<T2> transform(Counter<T1> c,
                                            Function<T1,T2> f)
Returns the counter with keys modified according to function f. Eager evaluation.


toComparator

public static <E> Comparator<E> toComparator(Counter<E> counter)
Returns a comparator backed by this counter: two objects are compared by their associated values stored in the counter. This comparator returns keys by ascending numeric value. Note that this ordering is not fixed, but depends on the mutable values stored in the Counter. Doing this comparison does not depend on the type of the key, since it uses the numeric value, which is always Comparable.

Parameters:
counter - The Counter whose values are used for ordering the keys
Returns:
A Comparator using this ordering

toComparatorDescending

public static <E> Comparator<E> toComparatorDescending(Counter<E> counter)
Returns a comparator backed by this counter: two objects are compared by their associated values stored in the counter. This comparator returns keys by descending numeric value. Note that this ordering is not fixed, but depends on the mutable values stored in the Counter. Doing this comparison does not depend on the type of the key, since it uses the numeric value, which is always Comparable.

Parameters:
counter - The Counter whose values are used for ordering the keys
Returns:
A Comparator using this ordering

toComparator

public static <E> Comparator<E> toComparator(Counter<E> counter,
                                             boolean ascending,
                                             boolean useMagnitude)
Returns a comparator suitable for sorting this Counter's keys or entries by their respective value or magnitude (by absolute value). If ascending is true, smaller magnitudes will be returned first, otherwise higher magnitudes will be returned first.

Sample usage:

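 Counter<String> c = new ClassicCounter<String>();
 // add to the counter...
 // keys with the largest absolute counts first
 List<String> biggestAbsKeys = new ArrayList<String>(c.keySet());
 Collections.sort(biggestAbsKeys, Counters.toComparator(c, false, true));
 // keys with the smallest (signed) counts first
 List<String> smallestKeys = new ArrayList<String>(c.keySet());
 Collections.sort(smallestKeys, Counters.toComparator(c, true, false));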
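
To make the roles of the ascending and useMagnitude flags concrete, here is a library-free sketch of such a comparator. It assumes a Map<String, Double> as a stand-in for a Counter and a count of zero for missing keys; the class name is hypothetical and this is not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of a Counter.
final class ComparatorSketch {
    // Orders keys by their value, or by absolute value when useMagnitude is true.
    // The comparator reads the map at comparison time, so it reflects later changes.
    static Comparator<String> toComparator(Map<String, Double> counts,
                                           boolean ascending, boolean useMagnitude) {
        Comparator<String> byValue = Comparator.comparingDouble(k -> {
            double v = counts.getOrDefault(k, 0.0);
            return useMagnitude ? Math.abs(v) : v;
        });
        return ascending ? byValue : byValue.reversed();
    }

    public static void main(String[] args) {
        Map<String, Double> counts = new HashMap<>();
        counts.put("a", -5.0);
        counts.put("b", 2.0);
        counts.put("c", 3.0);
        List<String> keys = new ArrayList<>(counts.keySet());
        keys.sort(toComparator(counts, false, true)); // largest magnitude first
        System.out.println(keys);
    }
}
```

With useMagnitude set, a large negative count sorts alongside large positive counts, which is useful when ranking feature weights by importance.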
 


toSortedList

public static <E> List<E> toSortedList(Counter<E> c)
A List of the keys in c, sorted from highest count to lowest.

Returns:
A List of the keys in c, sorted from highest count to lowest.

toDescendingMagnitudeSortedListWithCounts

public static <E> List<MutablePair<E,Double>> toDescendingMagnitudeSortedListWithCounts(Counter<E> c)

toSortedListWithCounts

public static <E> List<MutablePair<E,Double>> toSortedListWithCounts(Counter<E> c)
A List of the keys in c, sorted from highest count to lowest, paired with counts

Returns:
A List of the keys in c, sorted from highest count to lowest, paired with their counts.

toPriorityQueue

public static <E> PriorityQueue<E> toPriorityQueue(Counter<E> c)
Returns a PriorityQueue whose elements are the keys of Counter c, and the score of each key in c becomes its priority.

Parameters:
c - Input Counter
Returns:
A PriorityQueue where the count is a key's priority

union

public static <E,C extends Counter<E>> C union(C c1,
                                               C c2)
Returns a Counter that is the union of the two Counters passed in (counts are added).

Returns:
A Counter that is the union of the two Counters passed in (counts are added).

intersection

public static <E> Counter<E> intersection(Counter<E> c1,
                                          Counter<E> c2)
Returns a counter that is the intersection of c1 and c2. If both c1 and c2 contain a key, the min of the two counts is used.

Returns:
A counter that is the intersection of c1 and c2

jaccardCoefficient

public static <E> double jaccardCoefficient(Counter<E> c1,
                                            Counter<E> c2)
Returns the Jaccard Coefficient of the two counters. Calculated as |c1 intersect c2| / ( |c1| + |c2| - |c1 intersect c2| ).

Returns:
The Jaccard Coefficient of the two counters
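
The formula above can be worked through in a short sketch. It assumes a Map<String, Double> as a stand-in for a Counter, uses the min-of-counts intersection described for intersection, and interprets |c| as the total count of c. That last interpretation is an assumption, since this page does not define |c|. All names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of a Counter.
final class JaccardSketch {
    // Keeps only keys present in both maps, with the smaller of the two counts.
    static Map<String, Double> intersection(Map<String, Double> c1, Map<String, Double> c2) {
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Double> e : c1.entrySet()) {
            Double other = c2.get(e.getKey());
            if (other != null) result.put(e.getKey(), Math.min(e.getValue(), other));
        }
        return result;
    }

    // Sum of all counts; used here as |c| (an assumption).
    static double totalCount(Map<String, Double> c) {
        double total = 0.0;
        for (double v : c.values()) total += v;
        return total;
    }

    // |c1 intersect c2| / (|c1| + |c2| - |c1 intersect c2|); NaN if both maps are empty.
    static double jaccardCoefficient(Map<String, Double> c1, Map<String, Double> c2) {
        double shared = totalCount(intersection(c1, c2));
        return shared / (totalCount(c1) + totalCount(c2) - shared);
    }

    public static void main(String[] args) {
        Map<String, Double> c1 = new HashMap<>();
        c1.put("a", 2.0);
        c1.put("b", 1.0);
        Map<String, Double> c2 = new HashMap<>();
        c2.put("a", 1.0);
        c2.put("c", 3.0);
        System.out.println(jaccardCoefficient(c1, c2));
    }
}
```

In the example, the intersection holds only "a" with count 1, so the coefficient is 1 / (3 + 4 - 1).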

product

public static <E> Counter<E> product(Counter<E> c1,
                                     Counter<E> c2)
Returns the product of c1 and c2.

Returns:
The product of c1 and c2.

dotProduct

public static <E> double dotProduct(Counter<E> c1,
                                    Counter<E> c2)
Returns the dot product of c1 and c2.

Returns:
The dot product of c1 and c2.

dotProduct

public static <E> double dotProduct(Counter<E> c,
                                    double[] a,
                                    Index<E> idx)
Returns the dot product of Counter c and double[] a, using Index idx to map entries in c onto a.

Returns:
The dot product of c and a.

sumEntries

public static <E> double sumEntries(Counter<E> c1,
                                    Collection<E> entries)

optimizedDotProduct

public static <E> double optimizedDotProduct(Counter<E> c1,
                                             Counter<E> c2)
This method does not check entries for NaN or infinite values in the doubles returned. It also iterates only over the counter with the smaller number of keys to help speed up computation. Pair this method with normalizing your counters beforehand and you have a reasonably quick implementation of cosine.

Returns:
The dot product of the two counters (as vectors)
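
The idea of iterating over the smaller counter, and of obtaining cosine from a dot product, can be sketched as follows. This is an illustration only: a Map<String, Double> stands in for a Counter, the names are hypothetical, and the library's actual code may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of a Counter.
final class DotProductSketch {
    // Sparse dot product: loops over the map with fewer keys and looks up the other.
    // No checks are made for NaN or infinite values.
    static double dotProduct(Map<String, Double> c1, Map<String, Double> c2) {
        Map<String, Double> small = c1.size() <= c2.size() ? c1 : c2;
        Map<String, Double> large = (small == c1) ? c2 : c1;
        double sum = 0.0;
        for (Map.Entry<String, Double> e : small.entrySet()) {
            Double other = large.get(e.getKey());
            if (other != null) sum += e.getValue() * other;
        }
        return sum;
    }

    // Cosine similarity as dot / (|c1| * |c2|); NaN if either vector has zero length.
    static double cosine(Map<String, Double> c1, Map<String, Double> c2) {
        double norms = Math.sqrt(dotProduct(c1, c1)) * Math.sqrt(dotProduct(c2, c2));
        return dotProduct(c1, c2) / norms;
    }

    public static void main(String[] args) {
        Map<String, Double> c1 = new HashMap<>();
        c1.put("a", 1.0);
        c1.put("b", 2.0);
        Map<String, Double> c2 = new HashMap<>();
        c2.put("b", 3.0);
        c2.put("c", 4.0);
        System.out.println(dotProduct(c1, c2) + " " + cosine(c1, c2));
    }
}
```

If both inputs are already L2-normalized, the dot product alone equals the cosine, which is the pairing the description suggests.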

absoluteDifference

public static <E> Counter<E> absoluteDifference(Counter<E> c1,
                                                Counter<E> c2)
Returns |c1 - c2|.

Returns:
A Counter holding the absolute difference between the counts of c1 and c2.

division

public static <E> Counter<E> division(Counter<E> c1,
                                      Counter<E> c2)
Returns c1 divided by c2. Note that this can create NaN if c1 has non-zero counts for keys for which c2 has zero counts.

Returns:
c1 divided by c2.

entropy

public static <E> double entropy(Counter<E> c)
Calculates the entropy of the given counter (in bits). This method internally uses normalized counts (so they sum to one), but the value returned is meaningless if some of the counts are negative.

Returns:
The entropy of the given counter (in bits)
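
As a worked illustration of entropy in bits over normalized counts, the following sketch assumes a Map<String, Double> as a stand-in for a Counter and non-negative counts. The names are hypothetical and this is not the library's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of a Counter.
final class EntropySketch {
    // Entropy in bits: -sum(p * log2(p)), where p = count / totalCount.
    // Assumes counts are non-negative; zero counts contribute nothing.
    static double entropy(Map<String, Double> counts) {
        double total = 0.0;
        for (double v : counts.values()) total += v;
        double h = 0.0;
        for (double v : counts.values()) {
            if (v == 0.0) continue;
            double p = v / total;
            h -= p * (Math.log(p) / Math.log(2.0));
        }
        return h;
    }

    public static void main(String[] args) {
        Map<String, Double> counts = new HashMap<>();
        counts.put("heads", 10.0);
        counts.put("tails", 10.0);
        System.out.println(entropy(counts));
    }
}
```

Two equally likely outcomes give one bit; the raw counts need not sum to one because they are normalized internally.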

crossEntropy

public static <E> double crossEntropy(Counter<E> from,
                                      Counter<E> to)
Note that this implementation doesn't normalize the "from" Counter. It does, however, normalize the "to" Counter. Result is meaningless if any of the counts are negative.

Returns:
The cross entropy of H(from, to)

klDivergence

public static <E> double klDivergence(Counter<E> from,
                                      Counter<E> to)
Calculates the KL divergence between the two counters. That is, it calculates KL(from || to), which measures how well from can be represented by to. This method internally uses normalized counts (so they sum to one), but the value returned is meaningless if any of the counts are negative. If some key with non-zero probability in from gets zero probability in to, positive infinity is returned.

Returns:
The KL divergence between the distributions
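
The behaviour described above, including the positive-infinity case, can be sketched as follows. The sketch assumes a Map<String, Double> as a stand-in for a Counter and uses base-2 logarithms; the logarithm base is an assumption, since this page does not state the unit for KL divergence. Names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of a Counter.
final class KlDivergenceSketch {
    static double totalCount(Map<String, Double> c) {
        double total = 0.0;
        for (double v : c.values()) total += v;
        return total;
    }

    // KL(from || to) over normalized counts, in bits (base 2 is an assumption).
    // Returns positive infinity if from gives mass to a key that to gives none.
    static double klDivergence(Map<String, Double> from, Map<String, Double> to) {
        double fromTotal = totalCount(from);
        double toTotal = totalCount(to);
        double kl = 0.0;
        for (Map.Entry<String, Double> e : from.entrySet()) {
            double p = e.getValue() / fromTotal;
            if (p == 0.0) continue;
            double q = to.getOrDefault(e.getKey(), 0.0) / toTotal;
            if (q == 0.0) return Double.POSITIVE_INFINITY;
            kl += p * (Math.log(p / q) / Math.log(2.0));
        }
        return kl;
    }

    public static void main(String[] args) {
        Map<String, Double> from = new HashMap<>();
        from.put("a", 1.0);
        Map<String, Double> to = new HashMap<>();
        to.put("a", 1.0);
        to.put("b", 1.0);
        System.out.println(klDivergence(from, to) + " " + klDivergence(to, from));
    }
}
```

The example shows the asymmetry: the first call is finite, while the reversed call is infinite because "b" has no mass in the one-key map.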

jensenShannonDivergence

public static <E> double jensenShannonDivergence(Counter<E> c1,
                                                 Counter<E> c2)
Calculates the Jensen-Shannon divergence between the two counters. That is, it calculates 1/2 [KL(c1 || avg(c1,c2)) + KL(c2 || avg(c1,c2))] .

Returns:
The Jensen-Shannon divergence between the distributions

skewDivergence

public static <E> double skewDivergence(Counter<E> c1,
                                        Counter<E> c2,
                                        double skew)
Calculates the skew divergence between the two counters. That is, it calculates KL(c1 || (c2*skew + c1*(1-skew))) . In other words, how well can c1 be represented by a "smoothed" c2.

Returns:
The skew divergence between the distributions
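
Both the Jensen-Shannon and skew divergences above are KL divergences against a mixture distribution, which the following sketch makes explicit. It is an illustration under assumptions, not the library's implementation: MixtureDivergenceSketch and its methods are hypothetical names, a Map<String, Double> stands in for Counter<E>, both inputs are normalized before mixing, and results are in bits.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: a Map<String, Double> plays the role of Counter<E>.
final class MixtureDivergenceSketch {

    // Returns a copy whose values sum to one.
    static Map<String, Double> normalize(Map<String, Double> m) {
        double total = 0.0;
        for (double v : m.values()) {
            total += v;
        }
        Map<String, Double> out = new HashMap<>();
        for (Map.Entry<String, Double> e : m.entrySet()) {
            out.put(e.getKey(), e.getValue() / total);
        }
        return out;
    }

    // Computes wp * p + wq * q over the union of keys.
    static Map<String, Double> mix(Map<String, Double> p, double wp,
                                   Map<String, Double> q, double wq) {
        Set<String> keys = new HashSet<>(p.keySet());
        keys.addAll(q.keySet());
        Map<String, Double> out = new HashMap<>();
        for (String k : keys) {
            out.put(k, wp * p.getOrDefault(k, 0.0) + wq * q.getOrDefault(k, 0.0));
        }
        return out;
    }

    // KL(p || q) in bits for already-normalized p and q.
    static double kl(Map<String, Double> p, Map<String, Double> q) {
        double result = 0.0;
        for (Map.Entry<String, Double> e : p.entrySet()) {
            double pk = e.getValue();
            if (pk == 0.0) {
                continue;
            }
            double qk = q.getOrDefault(e.getKey(), 0.0);
            if (qk == 0.0) {
                return Double.POSITIVE_INFINITY;
            }
            result += pk * (Math.log(pk / qk) / Math.log(2.0));
        }
        return result;
    }

    // 1/2 [KL(c1 || avg) + KL(c2 || avg)], with avg the equal-weight mixture.
    static double jensenShannon(Map<String, Double> c1, Map<String, Double> c2) {
        Map<String, Double> p = normalize(c1);
        Map<String, Double> q = normalize(c2);
        Map<String, Double> avg = mix(p, 0.5, q, 0.5);
        return 0.5 * (kl(p, avg) + kl(q, avg));
    }

    // KL(c1 || (c2 * skew + c1 * (1 - skew))).
    static double skew(Map<String, Double> c1, Map<String, Double> c2, double skew) {
        Map<String, Double> p = normalize(c1);
        Map<String, Double> q = normalize(c2);
        return kl(p, mix(q, skew, p, 1.0 - skew));
    }
}
```

In this sketch, any skew strictly below 1 keeps some mass from c1 in the mixture, so the skew divergence stays finite even where plain KL(c1 || c2) would be infinite.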

L2Norm

public static <E,C extends Counter<E>> double L2Norm(C c)
Returns the L2 norm (Euclidean vector length) of a Counter. Implementation note: The method name favors legibility of the L over the convention of using lowercase names for methods.

Parameters:
c - The Counter
Returns:
Its length

L1Norm

public static <E,C extends Counter<E>> double L1Norm(C c)
Return the L1 norm of a counter. Implementation note: The method name favors legibility of the L over the convention of using lowercase names for methods.

Parameters:
c - The Counter
Returns:
Its length

L2Normalize

public static <E,C extends Counter<E>> C L2Normalize(C c)
L2 normalize a counter.

Parameters:
c - The Counter to be L2 normalized. This counter is not modified.
Returns:
A new l2-normalized Counter based on c.

L2NormalizeInPlace

public static <E,C extends Counter<E>> Counter<E> L2NormalizeInPlace(Counter<E> c)
L2 normalize a counter in place.

Parameters:
c - The Counter to be L2 normalized. This counter is modified
Returns:
the passed in counter l2-normalized

saferL2Norm

public static <E,C extends Counter<E>> double saferL2Norm(C c)
For counters with a large number of entries, this scales down each entry in the sum to prevent an extremely large sum from building up and overflowing the maximum double value. This may also help reduce error by preventing loss of significant digits with extremely large values.

Type Parameters:
E -
C -
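
One common way to realize the scaling idea described above is to divide every entry by the largest absolute value before squaring, then multiply the result back. The text above does not say which scale factor the method uses, so the choice of the maximum absolute value is an assumption of this sketch, as are the names SaferNormSketch, naiveL2Norm, and saferL2Norm; a Map<String, Double> stands in for Counter<E>.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of Counter<E>.
final class SaferNormSketch {

    // Straightforward sqrt of the sum of squares; can overflow to infinity.
    static double naiveL2Norm(Map<String, Double> m) {
        double sumSquares = 0.0;
        for (double v : m.values()) {
            sumSquares += v * v;
        }
        return Math.sqrt(sumSquares);
    }

    // Scales entries by the largest magnitude first (assumed scale factor).
    static double saferL2Norm(Map<String, Double> m) {
        double maxAbs = 0.0;
        for (double v : m.values()) {
            maxAbs = Math.max(maxAbs, Math.abs(v));
        }
        if (maxAbs == 0.0) {
            return 0.0; // empty or all-zero input
        }
        double scaledSumSquares = 0.0;
        for (double v : m.values()) {
            double scaled = v / maxAbs; // each scaled entry lies in [-1, 1]
            scaledSumSquares += scaled * scaled;
        }
        return maxAbs * Math.sqrt(scaledSumSquares);
    }

    public static void main(String[] args) {
        Map<String, Double> huge = new HashMap<>();
        huge.put("a", 1e200);
        huge.put("b", 1e200);
        System.out.println(naiveL2Norm(huge)); // squares exceed the double range
        System.out.println(saferL2Norm(huge));
    }
}
```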

saferL2Normalize

public static <E,C extends Counter<E>> C saferL2Normalize(C c)
L2 normalize a counter, using the "safer" L2 normalizer.

Parameters:
c - The Counter to be L2 normalized. This counter is not modified.
Returns:
A new l2-normalized Counter based on c.

cosine

public static <E> double cosine(Counter<E> c1,
                                Counter<E> c2)

average

public static <E> Counter<E> average(Counter<E> c1,
                                     Counter<E> c2)
Returns a new Counter with counts averaged from the two given Counters. The average Counter will contain the union of keys in both source Counters, and each count will be the average of the two source counts for that key, where as usual a missing count in one Counter is treated as count 0.

Returns:
A new counter with counts that are the mean of the resp. counts in the given counters.

linearCombination

public static <E> Counter<E> linearCombination(Counter<E> c1,
                                               double w1,
                                               Counter<E> c2,
                                               double w2)
Returns a Counter which is a weighted average of c1 and c2. Counts from c1 are weighted with weight w1 and counts from c2 are weighted with w2.
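
The two combination methods above can be illustrated together, since the average is the special case with both weights equal to 0.5. This is a minimal sketch, not the library's code: LinearCombinationSketch and its methods are hypothetical names, a Map<String, Double> stands in for Counter<E>, and the sketch assumes the weights are applied as given, without renormalizing them to sum to one.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: a Map<String, Double> plays the role of Counter<E>.
final class LinearCombinationSketch {

    // Computes w1 * c1 + w2 * c2 over the union of keys; a missing key counts as 0.
    static Map<String, Double> linearCombination(Map<String, Double> c1, double w1,
                                                 Map<String, Double> c2, double w2) {
        Set<String> keys = new HashSet<>(c1.keySet());
        keys.addAll(c2.keySet());
        Map<String, Double> out = new HashMap<>();
        for (String k : keys) {
            out.put(k, w1 * c1.getOrDefault(k, 0.0) + w2 * c2.getOrDefault(k, 0.0));
        }
        return out;
    }

    // The mean of the two maps is the equal-weight combination.
    static Map<String, Double> average(Map<String, Double> c1, Map<String, Double> c2) {
        return linearCombination(c1, 0.5, c2, 0.5);
    }

    public static void main(String[] args) {
        Map<String, Double> c1 = new HashMap<>();
        c1.put("a", 2.0);
        Map<String, Double> c2 = new HashMap<>();
        c2.put("a", 4.0);
        c2.put("b", 6.0);
        System.out.println(average(c1, c2));
    }
}
```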


pointwiseMutualInformation

public static <T1,T2> double pointwiseMutualInformation(Counter<T1> var1Distribution,
                                                        Counter<T2> var2Distribution,
                                                        Counter<MutablePair<T1,T2>> jointDistribution,
                                                        MutablePair<T1,T2> values)

hIndex

public static <E> int hIndex(Counter<E> citationCounts)
Calculate h-Index (Hirsch, 2005) of an author. A scientist has index h if h of their Np papers have at least h citations each, and the other (Np − h) papers have at most h citations each.

Parameters:
citationCounts - Citation counts for each of the articles written by the author. The keys can be anything, but the values should be integers.
Returns:
The h-Index of the author.
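
The definition above translates directly into a sort-and-scan procedure: order the citation counts from largest to smallest and find the last rank h at which the paper in position h still has at least h citations. The sketch below is one such illustration; HIndexSketch is a hypothetical name and a Map<String, Double> stands in for Counter<E>.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of Counter<E>.
final class HIndexSketch {

    // Largest h such that h papers have at least h citations each.
    static int hIndex(Map<String, Double> citationCounts) {
        List<Double> sorted = new ArrayList<>(citationCounts.values());
        sorted.sort(Collections.reverseOrder()); // most-cited paper first
        int h = 0;
        while (h < sorted.size() && sorted.get(h) >= h + 1) {
            h++;
        }
        return h;
    }

    public static void main(String[] args) {
        Map<String, Double> papers = new HashMap<>();
        papers.put("p1", 10.0);
        papers.put("p2", 8.0);
        papers.put("p3", 5.0);
        papers.put("p4", 4.0);
        papers.put("p5", 3.0);
        System.out.println(hIndex(papers));
    }
}
```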

perturbCounts

public static <E,C extends Counter<E>> C perturbCounts(C c,
                                                       Random random,
                                                       double p)

printCounterComparison

public static <E> void printCounterComparison(Counter<E> a,
                                              Counter<E> b)
Great for debugging.


printCounterComparison

public static <E> void printCounterComparison(Counter<E> a,
                                              Counter<E> b,
                                              PrintStream out)
Great for debugging.


getCountCounts

public static <E> Counter<Double> getCountCounts(Counter<E> c)

scale

public static <E,C extends Counter<E>> C scale(C c,
                                               double s)
Returns a new Counter which is scaled by the given scale factor.

Parameters:
c - The counter to scale. It is not changed
s - The constant to scale the counter by
Returns:
A new Counter which is the argument scaled by the given scale factor.

tfLogScale

public static <E,C extends Counter<E>> C tfLogScale(C c,
                                                    double base)
Returns a new Counter which is the input counter with log tf scaling

Parameters:
c - The counter to scale. It is not changed
base - The base of the logarithm used for tf scaling by 1 + log tf
Returns:
A new Counter which is the argument with log tf scaling applied.
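
The "1 + log tf" scaling mentioned above can be sketched as follows. This is an illustration under assumptions rather than the library's code: TfLogScaleSketch is a hypothetical name, a Map<String, Double> stands in for Counter<E>, and mapping non-positive counts to 0 is a convention chosen for this sketch because the logarithm is undefined there.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of Counter<E>.
final class TfLogScaleSketch {

    // Returns a new map with each positive count tf replaced by 1 + log_base(tf).
    static Map<String, Double> tfLogScale(Map<String, Double> counts, double base) {
        double logBase = Math.log(base);
        Map<String, Double> out = new HashMap<>();
        for (Map.Entry<String, Double> e : counts.entrySet()) {
            double tf = e.getValue();
            // Assumption of this sketch: non-positive counts scale to 0.
            double scaled = tf > 0.0 ? 1.0 + Math.log(tf) / logBase : 0.0;
            out.put(e.getKey(), scaled);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> tf = new HashMap<>();
        tf.put("the", 100.0);
        tf.put("counter", 1.0);
        System.out.println(tfLogScale(tf, 10.0));
    }
}
```

The effect is to damp large term frequencies: a term that occurs 100 times ends up weighted only a few times more than a term that occurs once.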

printCounterSortedByKeys

public static <E extends Comparable<E>> void printCounterSortedByKeys(Counter<E> c)

loadCounter

public static <E> ClassicCounter<E> loadCounter(String filename,
                                                Class<E> c)
                                     throws RuntimeException
Loads a Counter from a text file. File must have the format of one key/count pair per line, separated by whitespace.

Parameters:
filename - the path to the file to load the Counter from
c - the Class to instantiate each member of the set. Must have a String constructor.
Returns:
The counter loaded from the file.
Throws:
RuntimeException
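
The text format described above (one key/count pair per line, separated by whitespace) can be illustrated with a small parser and writer that work on strings instead of files. This is a sketch under assumptions, not the library's loader: CounterTextFormatSketch, parse, and format are hypothetical names, keys are kept as plain Strings rather than instantiated through a Class argument, and blank lines are skipped.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of Counter<E>.
final class CounterTextFormatSketch {

    // Parses "key count" lines; whitespace between the two fields may be spaces or tabs.
    static Map<String, Double> parse(String text) {
        Map<String, Double> counts = new LinkedHashMap<>();
        for (String rawLine : text.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue; // assumption of this sketch: blank lines are ignored
            }
            String[] fields = line.split("\\s+");
            if (fields.length != 2) {
                throw new IllegalArgumentException("Expected 'key count' but got: " + line);
            }
            counts.put(fields[0], Double.parseDouble(fields[1]));
        }
        return counts;
    }

    // Writes one "key count" pair per line, in the map's iteration order.
    static String format(Map<String, Double> counts) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Double> e : counts.entrySet()) {
            sb.append(e.getKey()).append(' ').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Double> counts = parse("apple 3.0\nbanana\t1.5\n");
        System.out.print(format(counts));
    }
}
```

A format this simple cannot represent keys that themselves contain whitespace; the sketch rejects such lines instead of guessing where the key ends.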

loadIntCounter

public static <E> IntCounter<E> loadIntCounter(String filename,
                                               Class<E> c)
                                    throws Exception
Loads a Counter from a text file. File must have the format of one key/count pair per line, separated by whitespace.

Parameters:
filename - the path to the file to load the Counter from
c - the Class to instantiate each member of the set. Must have a String constructor.
Returns:
The counter loaded from the file.
Throws:
Exception

saveCounter

public static <E> void saveCounter(Counter<E> c,
                                   OutputStream stream)
Saves a Counter as one key/count pair per line separated by white space to the given OutputStream. Does not close the stream.


saveCounter

public static <E> void saveCounter(Counter<E> c,
                                   String filename)
                        throws IOException
Saves a Counter to a text file. Counter written as one key/count pair per line, separated by whitespace.

Throws:
IOException

load2DCounter

public static <T1,T2> TwoDimensionalCounter<T1,T2> load2DCounter(String filename,
                                                                 Class<T1> t1,
                                                                 Class<T2> t2)
                                                  throws RuntimeException
Throws:
RuntimeException

save2DCounter

public static <T1,T2> void save2DCounter(TwoDimensionalCounter<T1,T2> tdc,
                                         String filename)
                          throws IOException
Throws:
IOException

serializeCounter

public static <T> void serializeCounter(Counter<T> c,
                                        String filename)
                             throws IOException
Throws:
IOException

deserializeCounter

public static <T> ClassicCounter<T> deserializeCounter(String filename)
                                            throws Exception
Throws:
Exception

toSortedString

public static <T> String toSortedString(Counter<T> counter,
                                        int k,
                                        String itemFormat,
                                        String joiner,
                                        String wrapperFormat)
Returns a string representation of a Counter, displaying the keys and their counts in decreasing order of count. At most k keys are displayed. Note that this method subsumes many of the other toString methods, e.g.:

toString(c, k) and toBiggestValuesFirstString(c, k) => toSortedString(c, k, "%s=%f", ", ", "[%s]")
toVerticalString(c, k) => toSortedString(c, k, "%2$g\t%1$s", "\n", "%s\n")

Parameters:
counter - A Counter.
k - The number of keys to include. Use Integer.MAX_VALUE to include all keys.
itemFormat - The format string for key/count pairs, where the key is first and the value is second. To display the value first, use argument indices, e.g. "%2$f %1$s".
joiner - The string used between pairs of key/value strings.
wrapperFormat - The format string for wrapping text around the joined items, where the joined item string value is "%s".
Returns:
The top k values from the Counter, formatted as specified.
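
To show how the three formatting arguments fit together, the following sketch sorts entries by decreasing count, formats each pair with itemFormat (key first, value second), joins the pieces with joiner, and wraps the result with wrapperFormat. It is an illustration, not the library's implementation: SortedStringSketch is a hypothetical name, a Map<String, Double> stands in for Counter<T>, and applying the format strings through String.format with Locale.ROOT is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of Counter<T>.
final class SortedStringSketch {

    static String toSortedString(Map<String, Double> counts, int k,
                                 String itemFormat, String joiner, String wrapperFormat) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(counts.entrySet());
        // Largest count first; the order among equal counts is unspecified.
        entries.sort((x, y) -> Double.compare(y.getValue(), x.getValue()));
        List<String> items = new ArrayList<>();
        for (Map.Entry<String, Double> e : entries) {
            if (items.size() >= k) {
                break; // at most k keys are displayed
            }
            // The key is argument 1 and the count is argument 2 of the item format.
            items.add(String.format(Locale.ROOT, itemFormat, e.getKey(), e.getValue()));
        }
        return String.format(Locale.ROOT, wrapperFormat, String.join(joiner, items));
    }

    public static void main(String[] args) {
        Map<String, Double> counts = new HashMap<>();
        counts.put("a", 1.0);
        counts.put("b", 3.0);
        counts.put("c", 2.0);
        System.out.println(toSortedString(counts, 2, "%s=%.1f", ", ", "[%s]"));
        System.out.println(toSortedString(counts, Integer.MAX_VALUE, "%2$.1f %1$s", "\n", "%s"));
    }
}
```

The second call in main uses argument indices, as the itemFormat description suggests, to print the count before the key.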

toSortedString

public static <T> String toSortedString(Counter<T> counter,
                                        int k,
                                        String itemFormat,
                                        String joiner)
Returns a string representation of a Counter, displaying the keys and their counts in decreasing order of count. At most k keys are displayed.

Parameters:
counter - A Counter.
k - The number of keys to include. Use Integer.MAX_VALUE to include all keys.
itemFormat - The format string for key/count pairs, where the key is first and the value is second. To display the value first, use argument indices, e.g. "%2$f %1$s".
joiner - The string used between pairs of key/value strings.
Returns:
The top k values from the Counter, formatted as specified.

toSortedByKeysString

public static <T extends Comparable<T>> String toSortedByKeysString(Counter<T> counter,
                                                                    String itemFormat,
                                                                    String joiner,
                                                                    String wrapperFormat)
Returns a string representation of a Counter, where (key, value) pairs are sorted by key, and formatted as specified.

Parameters:
counter - The Counter.
itemFormat - The format string for key/count pairs, where the key is first and the value is second. To display the value first, use argument indices, e.g. "%2$f %1$s".
joiner - The string used between pairs of key/value strings.
wrapperFormat - The format string for wrapping text around the joined items, where the joined item string value is "%s".
Returns:
The Counter, formatted as specified.

toString

public static <E> String toString(Counter<E> counter,
                                  int maxKeysToPrint)
Returns a string representation which includes no more than the maxKeysToPrint elements with largest counts. If maxKeysToPrint is non-positive, all elements are printed.

Parameters:
counter - The Counter
maxKeysToPrint - Max keys to print
Returns:
A partial string representation

toString

public static <E> String toString(Counter<E> counter,
                                  NumberFormat nf)

toString

public static <E> String toString(Counter<E> counter,
                                  NumberFormat nf,
                                  String preAppend,
                                  String postAppend,
                                  String keyValSeparator,
                                  String itemSeparator)
Pretty print a Counter. This one has more flexibility in formatting, and doesn't sort the keys.


toBiggestValuesFirstString

public static <E> String toBiggestValuesFirstString(Counter<E> c)

toBiggestValuesFirstString

public static <E> String toBiggestValuesFirstString(Counter<E> c,
                                                    int k)

toBiggestValuesFirstString

public static <T> String toBiggestValuesFirstString(Counter<Integer> c,
                                                    int k,
                                                    Index<T> index)

toVerticalString

public static <E> String toVerticalString(Counter<E> c)

toVerticalString

public static <E> String toVerticalString(Counter<E> c,
                                          int k)

toVerticalString

public static <E> String toVerticalString(Counter<E> c,
                                          String fmt)

toVerticalString

public static <E> String toVerticalString(Counter<E> c,
                                          int k,
                                          String fmt)

toVerticalString

public static <E> String toVerticalString(Counter<E> c,
                                          int k,
                                          String fmt,
                                          boolean swap)
Returns a String representation of the k keys with the largest counts in the given Counter, using the given format string.

Parameters:
c - a Counter
k - how many keys to print
fmt - a format string, such as "%.0f\t%s" (do not include final "%n")
swap - whether the count should appear after the key

restrictedArgMax

public static <E> E restrictedArgMax(Counter<E> c,
                                     Collection<E> restriction)
Returns:
The key with the largest count in c among the keys in the restriction Collection

toCounter

public static <T> Counter<T> toCounter(double[] counts,
                                       Index<T> index)

toCounter

public static <E> Counter<E> toCounter(Map<Integer,? extends Number> counts,
                                       Index<E> index)
Turns the given map and index into a counter instance. For each entry in counts, its key is converted to a counter key via lookup in the given index.


scale

public static <T1,T2> TwoDimensionalCounter<T1,T2> scale(TwoDimensionalCounter<T1,T2> c,
                                                         double d)
Creates a new TwoDimensionalCounter where all the counts are scaled by d. Internally uses Counters.scale().

Returns:
The TwoDimensionalCounter

sample

public static <T> T sample(Counter<T> c,
                           Random rand)
Does not assume that c is normalized.

Returns:
A sample from c
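
Sampling from counts that are not normalized can be done by drawing a uniform number in [0, total) and walking the cumulative sum until it is exceeded. The sketch below illustrates that idea; it is not the library's implementation. SampleSketch is a hypothetical name, a Map<String, Double> stands in for Counter<T>, and all counts are assumed to be non-negative.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical stand-in: a Map<String, Double> plays the role of Counter<T>.
final class SampleSketch {

    // Draws a key with probability proportional to its (non-negative) count.
    static String sample(Map<String, Double> counts, Random rand) {
        double total = 0.0;
        for (double v : counts.values()) {
            total += v;
        }
        double r = rand.nextDouble() * total; // uniform in [0, total)
        double cumulative = 0.0;
        String last = null;
        for (Map.Entry<String, Double> e : counts.entrySet()) {
            cumulative += e.getValue();
            last = e.getKey();
            if (r < cumulative) {
                return last;
            }
        }
        return last; // reached only through round-off; null for an empty map
    }

    public static void main(String[] args) {
        Map<String, Double> counts = new HashMap<>();
        counts.put("a", 1.0);
        counts.put("b", 3.0);
        System.out.println(sample(counts, new Random(42)));
    }
}
```

Because the draw is scaled by the total, there is no need to normalize first; counts of 1 and 3 behave the same as probabilities 0.25 and 0.75.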

sample

public static <T> T sample(Counter<T> c)
Does not assume that c is normalized.

Returns:
A sample from c

powNormalized

public static <E> Counter<E> powNormalized(Counter<E> c,
                                           double temp)
Returns a counter where each element corresponds to the normalized count of the corresponding element in c raised to the given power.
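
One reading of the description above is that each count is first divided by the total and the resulting probability is then raised to the power temp, with no renormalization afterwards. The sketch below follows that reading, which is an assumption; PowNormalizedSketch is a hypothetical name and a Map<String, Double> stands in for Counter<E>.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a Map<String, Double> plays the role of Counter<E>.
final class PowNormalizedSketch {

    // Returns a new map holding (count / total) ^ temp for every key.
    static Map<String, Double> powNormalized(Map<String, Double> counts, double temp) {
        double total = 0.0;
        for (double v : counts.values()) {
            total += v;
        }
        Map<String, Double> out = new HashMap<>();
        for (Map.Entry<String, Double> e : counts.entrySet()) {
            // Assumed reading: normalize first, then exponentiate, no renormalization.
            out.put(e.getKey(), Math.pow(e.getValue() / total, temp));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> counts = new HashMap<>();
        counts.put("a", 1.0);
        counts.put("b", 3.0);
        System.out.println(powNormalized(counts, 2.0));
    }
}
```

Under this reading the output generally no longer sums to one unless temp is 1.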


pow

public static <T> Counter<T> pow(Counter<T> c,
                                 double temp)

powInPlace

public static <T> void powInPlace(Counter<T> c,
                                  double temp)

exp

public static <T> Counter<T> exp(Counter<T> c)

expInPlace

public static <T> void expInPlace(Counter<T> c)

diff

public static <T> Counter<T> diff(Counter<T> goldFeatures,
                                  Counter<T> guessedFeatures)

equals

public static <E> boolean equals(Counter<E> o1,
                                 Counter<E> o2)
Default equality comparison for two counters potentially backed by alternative implementations.


unmodifiableCounter

public static <T> Counter<T> unmodifiableCounter(Counter<T> counter)
Returns an unmodifiable view of the counter. Changes to the underlying Counter are written through to this view.

Parameters:
counter - The counter
Returns:
unmodifiable view of the counter

asCounter

public static <E> Counter<E> asCounter(FixedPrioritiesPriorityQueue<E> p)
Returns a counter whose keys are the elements in this priority queue, and whose counts are the priorities in this queue. In the event there are multiple instances of the same element in the queue, the counter's count will be the sum of the instances' priorities.


fromMap

public static <E,N extends Number> Counter<E> fromMap(Map<E,N> map)
Returns a counter view of the given map. Infers the numeric type of the values from the first element in map.values().


fromMap

public static <E,N extends Number> Counter<E> fromMap(Map<E,N> map,
                                                      Class<N> type)
Returns a counter view of the given map. The type parameter is the type of the values in the map, which because of Java's generics type erasure, can't be discovered by reflection if the map is currently empty.


asMap

public static <E> Map<E,Double> asMap(Counter<E> counter)
Returns a map view of the given counter.


getCopy

public static <E> Counter<E> getCopy(Counter<E> originalCounter)
Type Parameters:
E -
Parameters:
originalCounter -
Returns:
a copy of the original counter


Stanford NLP Group