edu.stanford.nlp.util
Class Counter

java.lang.Object
  |
  +--java.util.AbstractMap
        |
        +--java.util.HashMap
              |
              +--edu.stanford.nlp.util.Counter
All Implemented Interfaces:
Cloneable, Map, Serializable

public class Counter
extends HashMap
implements Serializable

Specialized Map for storing numeric counts for objects. Works like a normal Map but with extra methods for easily getting/setting/incrementing counts for objects and computing various functions with the counts. Any attempt to put a value that is not a Number into this Counter will result in an IllegalArgumentException being thrown; Numbers that are not Doubles are converted to Double (see put(Object,Object)). Note however that the Map constructor and putAll method can be used to copy another Counter's contents over (see also addAll(Counter)). This class also provides access to Comparators that can be used to sort the keys or entries of this Counter by the counts, in either ascending or descending order.

Author:
Dan Klein, Joseph Smarr (jsmarr@stanford.edu)
See Also:
Serialized Form

Constructor Summary
Counter()
          Constructs a new (empty) Counter.
Counter(Map m)
          Constructs a new Counter with the contents of the given Map.
 
Method Summary
 void add(Object o, double count)
          Deprecated. Use incrementCount instead.
 void addAll(Counter counter)
          Adds the counts in the given Counter to the counts in this Counter.
 void addCounter(Counter c)
          Deprecated. Use addAll instead.
 Object argmax()
          Finds and returns the key in this Counter with the largest count.
 Object argmax(Comparator tieBreaker)
          Finds and returns the key in this Counter with the largest count.
 Object argmin()
          Finds and returns the key in this Counter with the smallest count.
 Object argmin(Comparator tieBreaker)
          Finds and returns the key in this Counter with the smallest count.
 Counter average(Counter other)
          Returns a new Counter with counts averaged from this Counter and the given Counter.
 double averageCount()
          Returns the mean of all the counts (totalCount/size).
 void clear()
          Removes all counts from this Counter.
 Comparator comparator()
          Comparator that sorts objects by (increasing) count.
 Comparator comparator(boolean ascending)
          Returns a comparator suitable for sorting this Counter's keys or entries by their respective counts.
 double countOf(Object o)
          Deprecated. Use getCount instead.
 void decrementCount(Object key)
          Subtracts 1.0 from the count for the given key.
 void decrementCount(Object key, double count)
          Subtracts the given count from the current count for the given key.
 void decrementCounts(Collection keys)
          Subtracts 1.0 from the counts of each of the given keys.
 void decrementCounts(Collection keys, double count)
          Subtracts the given count from the current counts for each of the given keys.
 double entropy()
          Calculates the entropy of this counter (in bits).
 double getCount(Object key)
          Returns the current count for the given key, which is 0 if it hasn't been seen before.
 double getNormalizedCount(Object key)
          Returns the current count for the given key as a fraction of the total count in the counter.
 boolean hasSeen(Object o)
          Deprecated. Use the standard containsKey function of Map.
 void increment(Object o)
          Deprecated. Use incrementCount instead.
 void incrementCount(Object key)
          Adds 1.0 to the count for the given key.
 void incrementCount(Object key, double count)
          Adds the given count to the current count for the given key.
 void incrementCounts(Collection keys)
          Adds 1.0 to the counts for each of the given keys.
 void incrementCounts(Collection keys, double count)
          Adds the given count to the current counts for each of the given keys.
 double informationRadius(Counter other)
          Calculates the information radius (aka the Jensen-Shannon divergence) between this Counter and another counter.
 Set keysAbove(double countThreshold)
          Returns the set of keys whose counts are at or above the given threshold.
 Set keysAt(double count)
          Returns the set of keys that have exactly the given count.
 Set keysBelow(double countThreshold)
          Returns the set of keys whose counts are at or below the given threshold.
 double klDivergence(Counter other)
          Calculates the KL divergence between this counter and another counter.
static void main(String[] args)
          For internal debugging purposes only.
 double max()
          Finds and returns the largest count in this Counter.
 double min()
          Finds and returns the smallest count in this Counter.
 void normalize()
          Divides all the counts by the total so they collectively sum to 1.0.
 Object put(Object key, Object value)
          Adds a count for the given key if value is a Number.
 Object remove(Object key)
          Removes the given key from this Counter.
 void removeAll(Collection c)
          Removes all the given keys from this Counter.
 void removeZeroCounts()
          Removes all keys whose count is 0.
 Set seenSet()
          Deprecated. Use the standard keySet() function of Map.
 void setCount(Object key, double count)
          Sets the current count for the given key.
 void setCounts(Collection keys, double count)
          Sets the current count for each of the given keys.
 void subtractAll(Counter counter)
          Subtracts the counts in the given Counter from the counts in this Counter.
 double total()
          Deprecated. Use totalCount instead.
 double totalCount()
          Returns the current total count for all objects in this Counter.
 double totalCount(Filter filter)
          Returns the total count for all objects in this Counter that pass the given Filter.
 
Methods inherited from class java.util.HashMap
clone, containsKey, containsValue, entrySet, get, isEmpty, keySet, putAll, size, values
 
Methods inherited from class java.util.AbstractMap
equals, hashCode, toString
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface java.util.Map
equals, hashCode
 

Constructor Detail

Counter

public Counter()
Constructs a new (empty) Counter.


Counter

public Counter(Map m)
Constructs a new Counter with the contents of the given Map. The values in m must all be Doubles or an IllegalArgumentException will be thrown when they are added.

Method Detail

totalCount

public double totalCount()
Returns the current total count for all objects in this Counter. The total is maintained as counts are adjusted so it can be returned quickly without having to sum all the counts on demand.
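
The idea of maintaining the total as counts change can be sketched as follows. This is a minimal illustration built on a plain HashMap, not the Counter source; the class name RunningTotalSketch and its demo method are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in (not the Counter source): keeps a cached total
// next to the per-key counts so reading the total needs no iteration.
final class RunningTotalSketch {
    private final Map<Object, Double> counts = new HashMap<>();
    private double total = 0.0;

    void setCount(Object key, double count) {
        Double old = counts.put(key, count);
        // Adjust the cached total by the change in this key's count.
        total += count - (old == null ? 0.0 : old);
    }

    void incrementCount(Object key, double delta) {
        counts.merge(key, delta, Double::sum);
        total += delta;
    }

    double totalCount() {
        return total; // constant time, no summing on demand
    }

    static double demo() {
        RunningTotalSketch c = new RunningTotalSketch();
        c.incrementCount("a", 2.0);
        c.incrementCount("b", 3.0);
        c.setCount("a", 1.0); // replaces 2.0 with 1.0, so the total drops by 1.0
        return c.totalCount();
    }
}
```

Every mutating method has to update the cached total, which is the price paid for the fast read.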


totalCount

public double totalCount(Filter filter)
Returns the total count for all objects in this Counter that pass the given Filter. Passing in a filter that always returns true is equivalent to calling totalCount(), though the latter is faster since the total is maintained internally as counts are adjusted.


averageCount

public double averageCount()
Returns the mean of all the counts (totalCount/size).


getCount

public double getCount(Object key)
Returns the current count for the given key, which is 0 if it hasn't been seen before. This is a convenient version of get that casts and extracts the primitive value.


getNormalizedCount

public double getNormalizedCount(Object key)
Returns the current count for the given key as a fraction of the total count in the counter. This is equivalent to assuming that all the counts sum to one, but it doesn't actually change the raw counts.


setCount

public void setCount(Object key,
                     double count)
Sets the current count for the given key. This will wipe out any existing count for that key.

To add to a count instead of replacing it, use incrementCount(Object,double).


setCounts

public void setCounts(Collection keys,
                      double count)
Sets the current count for each of the given keys. This will wipe out any existing counts for these keys.

To add to the counts of a collection of objects instead of replacing them, use incrementCounts(Collection,double).


incrementCount

public void incrementCount(Object key,
                           double count)
Adds the given count to the current count for the given key. If the key hasn't been seen before, it is assumed to have count 0, and thus this method will set its count to the given amount. Negative increments are equivalent to calling decrementCount.

To more conveniently increment the count by 1.0, use incrementCount(Object).

To set a count to a specific value instead of incrementing it, use setCount(Object,double).
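
The increment semantics described above (an unseen key starts at 0, and a negative increment acts as a decrement) can be sketched with a plain Map. This is an illustration only; IncrementSketch and its methods are hypothetical names, not part of this class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the increment semantics on a plain Map.
final class IncrementSketch {
    // Adds delta to the key's count, treating an unseen key as 0.0.
    static void incrementCount(Map<String, Double> counts, String key, double delta) {
        counts.merge(key, delta, Double::sum);
    }

    static Map<String, Double> demo() {
        Map<String, Double> counts = new HashMap<>();
        incrementCount(counts, "the", 1.0);  // unseen key: count becomes 1.0
        incrementCount(counts, "the", 2.5);  // 1.0 + 2.5 = 3.5
        incrementCount(counts, "cat", -1.0); // negative increment on unseen key: -1.0
        return counts;
    }
}
```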


incrementCount

public void incrementCount(Object key)
Adds 1.0 to the count for the given key. If the key hasn't been seen before, it is assumed to have count 0, and thus this method will set its count to 1.0.

To increment the count by a value other than 1.0, use incrementCount(Object,double).

To set a count to a specific value instead of incrementing it, use setCount(Object,double).


incrementCounts

public void incrementCounts(Collection keys,
                            double count)
Adds the given count to the current counts for each of the given keys. If any of the keys haven't been seen before, they are assumed to have count 0, and thus this method will set their counts to the given amount. Negative increments are equivalent to calling decrementCounts.

To more conveniently increment the counts of a collection of objects by 1.0, use incrementCounts(Collection).

To set the counts of a collection of objects to a specific value instead of incrementing them, use setCounts(Collection,double).


incrementCounts

public void incrementCounts(Collection keys)
Adds 1.0 to the counts for each of the given keys. If any of the keys haven't been seen before, they are assumed to have count 0, and thus this method will set their counts to 1.0.

To increment the counts of a collection of objects by a value other than 1.0, use incrementCounts(Collection,double).

To set the counts of a collection of objects to a specific value instead of incrementing them, use setCounts(Collection,double).


decrementCount

public void decrementCount(Object key,
                           double count)
Subtracts the given count from the current count for the given key. If the key hasn't been seen before, it is assumed to have count 0, and thus this method will set its count to the negative of the given amount. Negative decrements are equivalent to calling incrementCount.

To more conveniently decrement the count by 1.0, use decrementCount(Object).

To set a count to a specific value instead of decrementing it, use setCount(Object,double).


decrementCount

public void decrementCount(Object key)
Subtracts 1.0 from the count for the given key. If the key hasn't been seen before, it is assumed to have count 0, and thus this method will set its count to -1.0.

To decrement the count by a value other than 1.0, use decrementCount(Object,double).

To set a count to a specific value instead of decrementing it, use setCount(Object,double).


decrementCounts

public void decrementCounts(Collection keys,
                            double count)
Subtracts the given count from the current counts for each of the given keys. If any of the keys haven't been seen before, they are assumed to have count 0, and thus this method will set their counts to the negative of the given amount. Negative decrements are equivalent to calling incrementCounts.

To more conveniently decrement the counts of a collection of objects by 1.0, use decrementCounts(Collection).

To set the counts of a collection of objects to a specific value instead of decrementing them, use setCounts(Collection,double).


decrementCounts

public void decrementCounts(Collection keys)
Subtracts 1.0 from the counts of each of the given keys. If any of the keys haven't been seen before, they are assumed to have count 0, and thus this method will set their counts to -1.0.

To decrement the counts of a collection of objects by a value other than 1.0, use decrementCounts(Collection,double).

To set the counts of a collection of objects to a specific value instead of decrementing them, use setCounts(Collection,double).


addAll

public void addAll(Counter counter)
Adds the counts in the given Counter to the counts in this Counter.

To copy the values from another Counter rather than adding them, use HashMap.putAll(Map) or Counter(Map).


subtractAll

public void subtractAll(Counter counter)
Subtracts the counts in the given Counter from the counts in this Counter.

To copy the values from another Counter rather than subtracting them, use HashMap.putAll(Map) or Counter(Map).


put

public Object put(Object key,
                  Object value)
           throws IllegalArgumentException
Adds a count for the given key if value is a Number. Throws an IllegalArgumentException otherwise. All of HashMap's various put functions route through this method, so it's the single choke point to ensure that only Object-->Double entries are contained in this map. If the value is a non-Double Number, its doubleValue is extracted and inserted in a new Double, so the values should always be of type Double, but this way adding an Integer doesn't break it. For normal use, use incrementCount instead.

Specified by:
put in interface Map
Overrides:
put in class HashMap
Throws:
IllegalArgumentException - if value is not a Number
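
A minimal sketch of the validation described for put, assuming a HashMap subclass. NumberOnlyMapSketch is a hypothetical name and this is not the Counter source. One caveat of the sketch: in current JDKs HashMap.putAll does not appear to call put, so this version only guards direct calls to put.

```java
import java.util.HashMap;

// Hypothetical stand-in for the check described above: accept any Number,
// store it as a Double, and reject everything else.
final class NumberOnlyMapSketch extends HashMap<Object, Object> {
    private static final long serialVersionUID = 1L;

    @Override
    public Object put(Object key, Object value) {
        if (!(value instanceof Number)) {
            throw new IllegalArgumentException("value must be a Number: " + value);
        }
        // Normalize to Double, so an Integer 3 is stored as the Double 3.0.
        return super.put(key, Double.valueOf(((Number) value).doubleValue()));
    }
}
```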

remove

public Object remove(Object key)
Removes the given key from this Counter. Its count will now be 0 and it will no longer be considered previously seen.

Specified by:
remove in interface Map
Overrides:
remove in class HashMap

removeAll

public void removeAll(Collection c)
Removes all the given keys from this Counter.


clear

public void clear()
Removes all counts from this Counter.

Specified by:
clear in interface Map
Overrides:
clear in class HashMap

normalize

public void normalize()
Divides all the counts by the total so they collectively sum to 1.0. This effectively turns this Counter into a probability distribution. The result is not a meaningful distribution if any of the counts are negative. Note that getNormalizedCount(Object) returns the same value that getCount(Object) would return after calling normalize, but without changing the raw counts.
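
In-place normalization can be sketched on a plain Map as follows. NormalizeSketch is a hypothetical name, and the behavior for a zero total (division by zero, giving NaN or infinite values here) is an assumption of the sketch rather than something this page specifies.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of normalizing counts in place.
final class NormalizeSketch {
    // Divides every count by the sum of all counts.
    static void normalize(Map<String, Double> counts) {
        double sum = 0.0;
        for (double v : counts.values()) {
            sum += v;
        }
        final double total = sum;
        counts.replaceAll((key, v) -> v / total);
    }

    static Map<String, Double> demo() {
        Map<String, Double> counts = new HashMap<>();
        counts.put("a", 1.0);
        counts.put("b", 3.0);
        normalize(counts); // a -> 0.25, b -> 0.75
        return counts;
    }
}
```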


removeZeroCounts

public void removeZeroCounts()
Removes all keys whose count is 0. After incrementing and decrementing counts or adding and subtracting Counters, there may be keys left whose count is 0, though normally this is undesirable. This method cleans up the map.

Maybe in the future we should try to do this more on-the-fly, though it's not clear whether a distinction should be made between "never seen" (i.e. null count) and "seen with 0 count". Certainly there's no distinction in getCount() but there is in containsKey().
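
The distinction between "never seen" and "seen with 0 count" can be illustrated on a plain Map. ZeroCountSketch and its methods are hypothetical names used only for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: a key whose count returns to 0.0 stays in the
// map until zero counts are removed explicitly.
final class ZeroCountSketch {
    static void removeZeroCounts(Map<String, Double> counts) {
        counts.values().removeIf(v -> v == 0.0);
    }

    // Returns { key present before cleanup, key present after cleanup }.
    static boolean[] demo() {
        Map<String, Double> counts = new HashMap<>();
        counts.merge("a", 1.0, Double::sum);
        counts.merge("a", -1.0, Double::sum); // count is now 0.0, key still present
        boolean presentBefore = counts.containsKey("a");
        removeZeroCounts(counts);
        boolean presentAfter = counts.containsKey("a");
        return new boolean[] { presentBefore, presentAfter };
    }
}
```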


max

public double max()
Finds and returns the largest count in this Counter.


min

public double min()
Finds and returns the smallest count in this Counter.


argmax

public Object argmax(Comparator tieBreaker)
Finds and returns the key in this Counter with the largest count. Ties are broken by comparing the objects using the given tie breaking Comparator, favoring Objects that are sorted to the front. This is useful if the keys are numeric and there is a bias to prefer smaller or larger values, and can be useful in other circumstances where random tie-breaking is not desirable. Returns null if this Counter is empty.


argmax

public Object argmax()
Finds and returns the key in this Counter with the largest count. Ties are broken according to the natural ordering of the objects. This will prefer smaller numeric keys and lexicographically earlier String keys. To use a different tie-breaking Comparator, use argmax(Comparator). Returns null if this Counter is empty.
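
Tie-breaking by natural ordering can be sketched as follows for String keys. ArgmaxSketch is a hypothetical name and the code is an illustration on a plain Map, not the Counter implementation.

```java
import java.util.Map;

// Hypothetical illustration of argmax with natural-order tie-breaking.
final class ArgmaxSketch {
    // Returns the key with the largest count; among tied keys, the one that
    // sorts first in natural order wins. Returns null for an empty map.
    static String argmax(Map<String, Double> counts) {
        String best = null;
        double bestCount = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : counts.entrySet()) {
            double c = e.getValue();
            boolean better = best == null
                    || c > bestCount
                    || (c == bestCount && e.getKey().compareTo(best) < 0);
            if (better) {
                best = e.getKey();
                bestCount = c;
            }
        }
        return best;
    }
}
```

Because the tie-break is deterministic, the result does not depend on the iteration order of the underlying map.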


argmin

public Object argmin(Comparator tieBreaker)
Finds and returns the key in this Counter with the smallest count. Ties are broken by comparing the objects using the given tie breaking Comparator, favoring Objects that are sorted to the front. This is useful if the keys are numeric and there is a bias to prefer smaller or larger values, and can be useful in other circumstances where random tie-breaking is not desirable. Returns null if this Counter is empty.


argmin

public Object argmin()
Finds and returns the key in this Counter with the smallest count. Ties are broken according to the natural ordering of the objects. This will prefer smaller numeric keys and lexicographically earlier String keys. To use a different tie-breaking Comparator, use argmin(Comparator). Returns null if this Counter is empty.


keysAbove

public Set keysAbove(double countThreshold)
Returns the set of keys whose counts are at or above the given threshold. This set may have 0 elements but will not be null.


keysBelow

public Set keysBelow(double countThreshold)
Returns the set of keys whose counts are at or below the given threshold. This set may have 0 elements but will not be null.


keysAt

public Set keysAt(double count)
Returns the set of keys that have exactly the given count. This set may have 0 elements but will not be null.


entropy

public double entropy()
Calculates the entropy of this counter (in bits). This method internally uses normalized counts (so they sum to one), but the value returned is meaningless if some of the counts are negative.

Returns:
The entropy of this counter (in bits)
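
Entropy in bits over normalized counts can be sketched as below. EntropySketch is a hypothetical name. Skipping zero counts (treating 0 * log 0 as 0) is an assumption of the sketch, and negative counts are not handled specially.

```java
import java.util.Map;

// Hypothetical illustration: H = -sum over keys of p * log2(p), where
// p = count / total.
final class EntropySketch {
    static double entropy(Map<String, Double> counts) {
        double total = 0.0;
        for (double v : counts.values()) {
            total += v;
        }
        double h = 0.0;
        for (double v : counts.values()) {
            if (v == 0.0) {
                continue; // assumption: a zero count contributes nothing
            }
            double p = v / total;
            h -= p * Math.log(p) / Math.log(2.0); // log base 2 gives bits
        }
        return h;
    }
}
```

For example, four keys with equal counts give an entropy of 2 bits, and a single key gives 0 bits.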

klDivergence

public double klDivergence(Counter other)
Calculates the KL divergence between this counter and another counter. That is, it calculates KL(this||other). This method internally uses normalized counts (so they sum to one), but the value returned is meaningless if some of the counts are negative.

Parameters:
other - The other Counter
Returns:
The KL divergence between the distributions
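
A sketch of KL(this||other) over normalized counts follows. KlSketch is a hypothetical name. The use of log base 2 (to match entropy()) is an assumption, since this page does not state the units; a key with positive mass in the first map and zero mass in the second yields infinity in this sketch.

```java
import java.util.Map;

// Hypothetical illustration: KL(p||q) = sum over keys of p * log2(p / q),
// with both maps normalized to sum to one.
final class KlSketch {
    private static double sum(Map<String, Double> counts) {
        double total = 0.0;
        for (double v : counts.values()) {
            total += v;
        }
        return total;
    }

    static double klDivergence(Map<String, Double> p, Map<String, Double> q) {
        double pTotal = sum(p);
        double qTotal = sum(q);
        double kl = 0.0;
        for (Map.Entry<String, Double> e : p.entrySet()) {
            double pi = e.getValue() / pTotal;
            if (pi == 0.0) {
                continue; // assumption: zero-probability terms contribute nothing
            }
            double qi = q.getOrDefault(e.getKey(), 0.0) / qTotal;
            kl += pi * Math.log(pi / qi) / Math.log(2.0);
        }
        return kl;
    }
}
```

Note that the measure is not symmetric: klDivergence(p, q) and klDivergence(q, p) generally differ.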

informationRadius

public double informationRadius(Counter other)
Calculates the information radius (aka the Jensen-Shannon divergence) between this Counter and another counter. This measure is defined as:
iRad(p,q) = D(p||(p+q)/2) + D(q||(p+q)/2)
where p is this Counter, q is the other counter, and D(p||q) is the KL divergence between p and q. Note that iRad(p,q) = iRad(q,p).

Parameters:
other - The other Counter
Returns:
The information radius between the distributions
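
The definition above can be sketched directly on plain Maps. InfoRadiusSketch and its helper methods are hypothetical names; the log base 2 and the normalization of both inputs before averaging are assumptions of the sketch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of iRad(p,q) = D(p||m) + D(q||m), m = (p+q)/2.
final class InfoRadiusSketch {
    private static Map<String, Double> normalized(Map<String, Double> counts) {
        double total = 0.0;
        for (double v : counts.values()) {
            total += v;
        }
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Double> e : counts.entrySet()) {
            result.put(e.getKey(), e.getValue() / total);
        }
        return result;
    }

    // KL divergence between two already-normalized distributions, in bits.
    private static double kl(Map<String, Double> p, Map<String, Double> m) {
        double kl = 0.0;
        for (Map.Entry<String, Double> e : p.entrySet()) {
            double pi = e.getValue();
            if (pi == 0.0) {
                continue;
            }
            // m is at least pi / 2 for this key, so the ratio is finite.
            kl += pi * Math.log(pi / m.get(e.getKey())) / Math.log(2.0);
        }
        return kl;
    }

    static double informationRadius(Map<String, Double> a, Map<String, Double> b) {
        Map<String, Double> p = normalized(a);
        Map<String, Double> q = normalized(b);
        Set<String> keys = new HashSet<>(p.keySet());
        keys.addAll(q.keySet());
        Map<String, Double> m = new HashMap<>();
        for (String k : keys) {
            m.put(k, (p.getOrDefault(k, 0.0) + q.getOrDefault(k, 0.0)) / 2.0);
        }
        return kl(p, m) + kl(q, m);
    }
}
```

Unlike the plain KL divergence, this quantity stays finite even when the two distributions share no keys, because the midpoint m has mass wherever either input does.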

average

public Counter average(Counter other)
Returns a new Counter with counts averaged from this Counter and the given Counter. The average Counter will contain the union of keys in both source Counters, and each count will be the average of the two source counts for that key, where as usual a missing count in one Counter is treated as count 0.
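
The averaging rule (union of keys, missing counts treated as 0) can be sketched on plain Maps. AverageSketch is a hypothetical name used only for this illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of averaging two count maps key by key.
final class AverageSketch {
    static Map<String, Double> average(Map<String, Double> a, Map<String, Double> b) {
        Set<String> keys = new HashSet<>(a.keySet());
        keys.addAll(b.keySet()); // union of keys from both sources
        Map<String, Double> result = new HashMap<>();
        for (String k : keys) {
            // A key missing from one map contributes a count of 0.0.
            result.put(k, (a.getOrDefault(k, 0.0) + b.getOrDefault(k, 0.0)) / 2.0);
        }
        return result;
    }
}
```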


comparator

public Comparator comparator(boolean ascending)
Returns a comparator suitable for sorting this Counter's keys or entries by their respective counts. If ascending is true, lower counts will be returned first, otherwise higher counts will be returned first.

Sample usage:

 Counter c = new Counter();
 // add to the counter...
 // Collections.sort sorts in place and returns void, so build each list first
 List biggestKeys = new ArrayList(c.keySet());
 Collections.sort(biggestKeys, c.comparator(false));
 List smallestEntries = new ArrayList(c.entrySet());
 Collections.sort(smallestEntries, c.comparator(true));
 


comparator

public Comparator comparator()
Comparator that sorts objects by (increasing) count. Shortcut for calling comparator(true).


main

public static void main(String[] args)
For internal debugging purposes only.


hasSeen

public boolean hasSeen(Object o)
Deprecated. Use the standard containsKey function of Map.

Returns whether count of object is non-zero. This normally corresponds to the object having been previously seen.


seenSet

public Set seenSet()
Deprecated. Use the standard keySet() function of Map.

Returns the set of objects with non-zero counts. This normally corresponds to the objects that have been previously seen.


countOf

public double countOf(Object o)
Deprecated. Use getCount instead.

Returns the current count for the given object, or 0 if it hasn't been seen before.


total

public double total()
Deprecated. Use totalCount instead.

Returns the total count.


add

public void add(Object o,
                double count)
Deprecated. Use incrementCount instead.

Adds the given count to the given object.


addCounter

public void addCounter(Counter c)
Deprecated. Use addAll instead.

Adds all the counts from the given Counter.


increment

public void increment(Object o)
Deprecated. Use incrementCount instead.

Adds 1.0 to the count of the given Object.



Stanford NLP Group