edu.stanford.nlp.math
Class SloppyMath

java.lang.Object
  extended by edu.stanford.nlp.math.SloppyMath

public final class SloppyMath
extends java.lang.Object

The class SloppyMath contains methods for performing basic numeric operations. In some cases, such as max and min, they cut a few corners in the implementation for the sake of efficiency. In particular, they may not handle special values like NaN and -0.0 correctly. This was the origin of the class name, but many other methods, such as logSum, are simply useful math additions. This class has only static math methods.

Author:
Christopher Manning

Method Summary
static double chiSquare2by2(int k, int n, int r, int m)
          Find a 2x2 chi-square value.
static double exactBinomial(int k, int n, double p)
          Find a one-tailed exact binomial test probability.
static double factorial(int x)
          Uses floating point so that it can represent the really big numbers that come up.
static double gamma(double n)
           
static double hypergeometric(int k, int n, int r, int m)
          Find a hypergeometric distribution.
static double intPow(double b, int e)
          Exponentiation like we learned in grade school: multiply b by itself e times.
static float intPow(float b, int e)
          Exponentiation like we learned in grade school: multiply b by itself e times.
static int intPow(int b, int e)
          Exponentiation like we learned in grade school: multiply b by itself e times.
static boolean isCloseTo(double a, double b)
           
static boolean isDangerous(double d)
          Returns true if the argument is a "dangerous" double to have around, namely one that is infinite, NaN or zero.
static boolean isVeryDangerous(double d)
          Returns true if the argument is a "very dangerous" double to have around, namely one that is infinite or NaN.
static double lgamma(double x)
           
static double log(double num, double base)
          Convenience method for log to a different base.
static double logAdd(double lx, double ly)
          Returns the log of the sum of two numbers, which are themselves input in log form.
static float logAdd(float lx, float ly)
          Returns the log of the sum of two numbers, which are themselves input in log form.
static void main(java.lang.String[] args)
          Tests the hypergeometric distribution code, or other functions provided in this module.
static int max(java.util.Collection<java.lang.Integer> vals)
           
static double max(double a, double b)
          Returns the greater of two double values.
static float max(float a, float b)
          Returns the greater of two float values.
static int max(int a, int b)
          Returns the greater of two int values.
static int max(int a, int b, int c)
          max() that works on three integers.
static double min(double a, double b)
          Returns the smaller of two double values.
static float min(float a, float b)
          Returns the smaller of two float values.
static int min(int a, int b, int c)
          Returns the minimum of three int values.
static int nChooseK(int n, int k)
          Computes n choose k in an efficient way.
static double oneTailedFishersExact(int k, int n, int r, int m)
          Find a one-tailed Fisher's exact probability.
static double poisson(int x, double lambda)
           
static double pow(double a, double b)
          Returns an approximation to Math.pow(a,b) that is ~27x faster with a margin of error possibly around ~10%.
static double round(double x)
          Round a double to the nearest integer, via conventional rules (.5 rounds up, .49 rounds down), and return the result, still as a double.
static double round(double x, int precision)
          Round a double to the given number of decimal places, rounding to the nearest value via conventional rules (5 rounds up, 49 rounds down).
static double sigmoid(double x)
          Compute the sigmoid function with mean zero.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

round

public static double round(double x)
Round a double to the nearest integer, via conventional rules (.5 rounds up, .49 rounds down), and return the result, still as a double.

Parameters:
x - What to round
Returns:
The rounded value

round

public static double round(double x,
                           int precision)
Round a double to the given number of decimal places, rounding to the nearest value via conventional rules (5 rounds up, 49 rounds down). E.g. round(3.1416, 2) == 3.14, round(431.5, -2) == 400, round(431.5, 0) == 432
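
The rule above can be sketched with a scale factor of a power of ten. This is only an illustration under stated assumptions: the class name RoundSketch is hypothetical, and the actual SloppyMath implementation may differ.

```java
final class RoundSketch {
    /**
     * Rounds x to the given number of decimal places, with .5 rounding up.
     * A negative precision rounds to tens, hundreds, and so on.
     */
    static double round(double x, int precision) {
        double scale = Math.pow(10.0, Math.abs(precision));
        if (precision >= 0) {
            // Shift the wanted digits left of the decimal point, round, shift back.
            return Math.floor(x * scale + 0.5) / scale;
        }
        // For negative precision, divide first so the scale stays an exact integer.
        return Math.floor(x / scale + 0.5) * scale;
    }

    public static void main(String[] args) {
        System.out.println(round(3.1416, 2));
        System.out.println(round(431.5, -2));
        System.out.println(round(431.5, 0));
    }
}
```

Because the arithmetic is binary floating point, results for inputs that sit almost exactly on a rounding boundary can differ from decimal intuition.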


max

public static int max(int a,
                      int b,
                      int c)
max() that works on three integers. Like many of the other max() functions in this class, it skips special checks such as those for NaN or -0.0f to save time.

Returns:
The maximum of three int values.

max

public static int max(java.util.Collection<java.lang.Integer> vals)

max

public static int max(int a,
                      int b)
Returns the greater of two int values. That is, the result is the argument closer to positive infinity. If the arguments have the same value, the result is that same value. Does none of the special checks for NaN or -0.0f that Math.max does.

Parameters:
a - an argument.
b - another argument.
Returns:
the larger of a and b.

max

public static float max(float a,
                        float b)
Returns the greater of two float values. That is, the result is the argument closer to positive infinity. If the arguments have the same value, the result is that same value. Does none of the special checks for NaN or -0.0f that Math.max does.

Parameters:
a - an argument.
b - another argument.
Returns:
the larger of a and b.

max

public static double max(double a,
                         double b)
Returns the greater of two double values. That is, the result is the argument closer to positive infinity. If the arguments have the same value, the result is that same value. Does none of the special checks for NaN or -0.0f that Math.max does.

Parameters:
a - an argument.
b - another argument.
Returns:
the larger of a and b.
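
To make the missing NaN and -0.0 checks concrete, the sketch below compares a plain comparison-based maximum with Math.max. The method sloppyMax is a hypothetical stand-in; it is assumed, not confirmed, that SloppyMath uses a similar single comparison.

```java
final class SloppyMaxSketch {
    /** A maximum built from one comparison, with no special-case handling. */
    static double sloppyMax(double a, double b) {
        return (a >= b) ? a : b;
    }

    public static void main(String[] args) {
        // Any comparison with NaN is false, so the result depends on argument order.
        System.out.println(sloppyMax(Double.NaN, 1.0));
        System.out.println(sloppyMax(1.0, Double.NaN));
        // Math.max propagates NaN regardless of order.
        System.out.println(Math.max(Double.NaN, 1.0));
        // -0.0 and 0.0 compare as equal, so the first argument is returned as is.
        System.out.println(sloppyMax(-0.0, 0.0));
        System.out.println(Math.max(-0.0, 0.0));
    }
}
```

Callers that may see NaN or signed zeros should prefer Math.max and Math.min.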

min

public static int min(int a,
                      int b,
                      int c)
Returns the minimum of three int values.


min

public static float min(float a,
                        float b)
Returns the smaller of two float values. That is, the result is the value closer to negative infinity. If the arguments have the same value, the result is that same value. Does none of the special checks for NaN or -0.0f that Math.min does.

Parameters:
a - an argument.
b - another argument.
Returns:
the smaller of a and b.

min

public static double min(double a,
                         double b)
Returns the smaller of two double values. That is, the result is the value closer to negative infinity. If the arguments have the same value, the result is that same value. Does none of the special checks for NaN or -0.0f that Math.min does.

Parameters:
a - an argument.
b - another argument.
Returns:
the smaller of a and b.

lgamma

public static double lgamma(double x)
Returns:
an approximation of the log of the Gamma function of x. Uses the Lanczos approximation. Reference: Numerical Recipes in C, http://www.library.cornell.edu/nr/cbookcpdf.html; from www.cs.berkeley.edu/~milch/blog/versions/blog-0.1.3/blog/distrib
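
For orientation, here is a sketch of the Lanczos approximation in the form given as gammln in Numerical Recipes in C. The class name LgammaSketch is hypothetical, and whether SloppyMath uses exactly these coefficients is an assumption.

```java
final class LgammaSketch {
    // Series coefficients from the Numerical Recipes gammln routine.
    private static final double[] COF = {
        76.18009172947146, -86.50532032941677, 24.01409824083091,
        -1.231739572450155, 0.1208650973866179e-2, -0.5395239384953e-5
    };

    /** Approximates ln(Gamma(x)) for x > 0. */
    static double lgamma(double x) {
        double tmp = x + 5.5;
        tmp -= (x + 0.5) * Math.log(tmp);
        double ser = 1.000000000190015;
        double y = x;
        for (double c : COF) {
            y += 1.0;
            ser += c / y;
        }
        // 2.5066282746310005 is sqrt(2 * pi).
        return -tmp + Math.log(2.5066282746310005 * ser / x);
    }

    public static void main(String[] args) {
        // Gamma(5) = 4! = 24, so the two printed values should be close.
        System.out.println(lgamma(5.0));
        System.out.println(Math.log(24.0));
    }
}
```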

isDangerous

public static boolean isDangerous(double d)
Returns true if the argument is a "dangerous" double to have around, namely one that is infinite, NaN or zero.


isVeryDangerous

public static boolean isVeryDangerous(double d)
Returns true if the argument is a "very dangerous" double to have around, namely one that is infinite or NaN.
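
The two checks can be expressed directly with the standard Double helpers. The sketch below follows the descriptions above; the class name DangerSketch is hypothetical and the real method bodies may be written differently.

```java
final class DangerSketch {
    /** True if d is infinite or NaN. */
    static boolean isVeryDangerous(double d) {
        return Double.isNaN(d) || Double.isInfinite(d);
    }

    /** True if d is infinite, NaN, or zero (either sign of zero). */
    static boolean isDangerous(double d) {
        return isVeryDangerous(d) || d == 0.0;
    }

    public static void main(String[] args) {
        System.out.println(isDangerous(0.0));
        System.out.println(isVeryDangerous(0.0));
        System.out.println(isVeryDangerous(1.0 / 0.0));
    }
}
```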


isCloseTo

public static boolean isCloseTo(double a,
                                double b)

gamma

public static double gamma(double n)

log

public static double log(double num,
                         double base)
Convenience method for log to a different base.
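
A logarithm to an arbitrary base follows from the change-of-base identity log_base(num) = ln(num) / ln(base). The sketch below assumes that identity is all the method needs; the class name LogBaseSketch is hypothetical.

```java
final class LogBaseSketch {
    /** Logarithm of num to the given base, via the change-of-base identity. */
    static double log(double num, double base) {
        return Math.log(num) / Math.log(base);
    }

    public static void main(String[] args) {
        System.out.println(log(8.0, 2.0));
        System.out.println(log(1000.0, 10.0));
    }
}
```

The quotient is subject to floating-point rounding, so exact powers of the base may come out a few ulps away from an integer.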


logAdd

public static float logAdd(float lx,
                           float ly)
Returns the log of the sum of two numbers, which are themselves input in log form. This uses natural logarithms. Reasonable care is taken to do this as efficiently as possible (under the assumption that the numbers might differ greatly in magnitude), with high accuracy, and without numerical overflow. Also, handle correctly the case of arguments being -Inf (e.g., probability 0).

Parameters:
lx - First number, in log form
ly - Second number, in log form
Returns:
log(exp(lx) + exp(ly))

logAdd

public static double logAdd(double lx,
                            double ly)
Returns the log of the sum of two numbers, which are themselves input in log form. This uses natural logarithms. Reasonable care is taken to do this as efficiently as possible (under the assumption that the numbers might differ greatly in magnitude), with high accuracy, and without numerical overflow. Also, handle correctly the case of arguments being -Inf (e.g., probability 0).

Parameters:
lx - First number, in log form
ly - Second number, in log form
Returns:
log(exp(lx) + exp(ly))
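
One common way to get the properties described above is to factor out the larger argument, so that exp is only ever applied to a non-positive number. The sketch below shows that technique; it is not necessarily how SloppyMath implements logAdd, and the class name LogAddSketch is hypothetical.

```java
final class LogAddSketch {
    /** Returns log(exp(lx) + exp(ly)) without overflowing for large inputs. */
    static double logAdd(double lx, double ly) {
        double max = Math.max(lx, ly);
        double min = Math.min(lx, ly);
        if (max == Double.NEGATIVE_INFINITY) {
            // Both inputs represent probability 0; avoid computing -Inf - -Inf.
            return Double.NEGATIVE_INFINITY;
        }
        // log(e^max + e^min) = max + log(1 + e^(min - max)), where min - max <= 0.
        return max + Math.log1p(Math.exp(min - max));
    }

    public static void main(String[] args) {
        // log(2) "plus" log(3) in log space should be close to log(5).
        System.out.println(logAdd(Math.log(2.0), Math.log(3.0)));
        System.out.println(Math.log(5.0));
    }
}
```

When the arguments differ greatly, exp(min - max) underflows to 0 and the result is simply the larger argument, which matches the mathematical limit.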

nChooseK

public static int nChooseK(int n,
                           int k)
Computes n choose k in an efficient way. Works with k == 0 or k == n, but the result is undefined if k < 0 or k > n.

Returns:
fact(n) / (fact(k) * fact(n-k))
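
An efficient way to evaluate the binomial coefficient without computing any factorial is the multiplicative formula, sketched below. The class name ChooseSketch and the long return type are assumptions made for illustration; the documented method returns int.

```java
final class ChooseSketch {
    /** Computes n choose k for 0 <= k <= n using the multiplicative formula. */
    static long nChooseK(int n, int k) {
        // Use symmetry C(n, k) == C(n, n - k) to shorten the loop.
        int kk = Math.min(k, n - k);
        long result = 1;
        for (int i = 1; i <= kk; i++) {
            // After step i, result equals C(n - kk + i, i), so the division is exact.
            result = result * (n - kk + i) / i;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(nChooseK(5, 2));
        System.out.println(nChooseK(52, 5));
    }
}
```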

pow

public static double pow(double a,
                         double b)
Returns an approximation to Math.pow(a,b) that is ~27x faster with a margin of error possibly around ~10%. From http://martin.ankerl.com/2007/10/04/optimized-pow-approximation-for-java-and-c-c/


intPow

public static int intPow(int b,
                         int e)
Exponentiation like we learned in grade school: multiply b by itself e times. Uses power of two trick. e must be nonnegative!!! no checking!!! For e <= 0, the exponent is treated as 0, and 1 is returned. 0^0 also returns 1.

Parameters:
b - base
e - exponent
Returns:
b^e
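
The "power of two trick" is usually exponentiation by squaring, which needs about log2(e) multiplications instead of e. The following sketch assumes that is the technique meant; the class name IntPowSketch is hypothetical.

```java
final class IntPowSketch {
    /** Computes b^e by repeated squaring; any e <= 0 yields 1. No overflow checks. */
    static int intPow(int b, int e) {
        int result = 1;
        int base = b;
        while (e > 0) {
            if ((e & 1) != 0) {
                // The current binary digit of e is set, so this power of b contributes.
                result *= base;
            }
            base *= base; // b, b^2, b^4, b^8, ...
            e >>= 1;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(intPow(2, 10));
        System.out.println(intPow(3, 4));
    }
}
```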

intPow

public static float intPow(float b,
                           int e)
Exponentiation like we learned in grade school: multiply b by itself e times. Uses power of two trick. e must be nonnegative!!! no checking!!!

Parameters:
b - base
e - exponent
Returns:
b^e

intPow

public static double intPow(double b,
                            int e)
Exponentiation like we learned in grade school: multiply b by itself e times. Uses power of two trick. e must be nonnegative!!! no checking!!!

Parameters:
b - base
e - exponent
Returns:
b^e

hypergeometric

public static double hypergeometric(int k,
                                    int n,
                                    int r,
                                    int m)
Find a hypergeometric distribution. This uses exact math, trying fairly hard to avoid numeric overflow by interleaving multiplications and divisions. (To do: make it even better at avoiding overflow, by using loops that will do either a multiply or divide based on the size of the intermediate result.)

Parameters:
k - The number of black balls drawn
n - The total number of balls
r - The number of black balls
m - The number of balls drawn
Returns:
The hypergeometric value
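
With the parameters above, the hypergeometric probability is C(r, k) * C(n - r, m - k) / C(n, m). The sketch below evaluates that formula in a simpler way than the interleaved scheme described; the class name and the choose helper are hypothetical and not part of SloppyMath.

```java
final class HypergeometricSketch {
    /** Binomial coefficient as a double; 0 when k is out of range. */
    static double choose(int n, int k) {
        if (k < 0 || k > n) {
            return 0.0;
        }
        double c = 1.0;
        for (int i = 1; i <= k; i++) {
            c = c * (n - k + i) / i; // multiply and divide in turn to limit growth
        }
        return c;
    }

    /** Probability of drawing exactly k black balls in m draws from n balls, r of them black. */
    static double hypergeometric(int k, int n, int r, int m) {
        return choose(r, k) * choose(n - r, m - k) / choose(n, m);
    }

    public static void main(String[] args) {
        // 10 balls, 4 black, 3 drawn, exactly 2 black: 6 * 6 / 120.
        System.out.println(hypergeometric(2, 10, 4, 3));
    }
}
```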

exactBinomial

public static double exactBinomial(int k,
                                   int n,
                                   double p)
Find a one-tailed exact binomial test probability. Finds the chance of this or a higher result.

Parameters:
k - number of successes
n - Number of trials
p - Probability of a success
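
"This or a higher result" means the upper tail: the sum of the binomial probabilities for i = k, ..., n successes. A direct sketch of that sum follows; the class name and the choose helper are hypothetical, and the library may compute the tail differently.

```java
final class ExactBinomialSketch {
    /** Binomial coefficient as a double, for 0 <= k <= n. */
    static double choose(int n, int k) {
        double c = 1.0;
        for (int i = 1; i <= k; i++) {
            c = c * (n - k + i) / i;
        }
        return c;
    }

    /** Probability of k or more successes in n trials with success probability p. */
    static double exactBinomial(int k, int n, double p) {
        double total = 0.0;
        for (int i = k; i <= n; i++) {
            total += choose(n, i) * Math.pow(p, i) * Math.pow(1.0 - p, n - i);
        }
        return total;
    }

    public static void main(String[] args) {
        // Two or more heads in three fair coin flips: (3 + 1) / 8.
        System.out.println(exactBinomial(2, 3, 0.5));
    }
}
```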

oneTailedFishersExact

public static double oneTailedFishersExact(int k,
                                           int n,
                                           int r,
                                           int m)
Find a one-tailed Fisher's exact probability. Chance of having seen this or a more extreme departure from what you would have expected given independence. I.e., k >= the value passed in. Warning: this was done just for collocations, where you are concerned with the case of k being larger than predicted. It doesn't correctly handle other cases, such as k being smaller than expected.

Parameters:
k - The number of black balls drawn
n - The total number of balls
r - The number of black balls
m - The number of balls drawn
Returns:
The Fisher's exact p-value
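
Under the upper-tail reading given above, the p-value is the sum of hypergeometric probabilities for every count from k up to the largest possible value, min(r, m). The sketch below spells that out; the class name and helpers are hypothetical and only illustrate the definition.

```java
final class FisherSketch {
    /** Binomial coefficient as a double; 0 when k is out of range. */
    static double choose(int n, int k) {
        if (k < 0 || k > n) {
            return 0.0;
        }
        double c = 1.0;
        for (int i = 1; i <= k; i++) {
            c = c * (n - k + i) / i;
        }
        return c;
    }

    /** Probability of exactly k black balls in m draws from n balls, r of them black. */
    static double hypergeometric(int k, int n, int r, int m) {
        return choose(r, k) * choose(n - r, m - k) / choose(n, m);
    }

    /** Upper-tail p-value: probability of seeing k or more black balls. */
    static double oneTailedFishersExact(int k, int n, int r, int m) {
        double total = 0.0;
        for (int i = k; i <= Math.min(r, m); i++) {
            total += hypergeometric(i, n, r, m);
        }
        return total;
    }

    public static void main(String[] args) {
        // P(2) + P(3) = 36/120 + 4/120 for n = 10, r = 4, m = 3.
        System.out.println(oneTailedFishersExact(2, 10, 4, 3));
    }
}
```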

chiSquare2by2

public static double chiSquare2by2(int k,
                                   int n,
                                   int r,
                                   int m)
Find a 2x2 chi-square value. Note: could do this more neatly using simplified formula for 2x2 case.

Parameters:
k - The number of black balls drawn
n - The total number of balls
r - The number of black balls
m - The number of balls drawn
Returns:
The chi-square value
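
The simplified 2x2 formula mentioned in the note is n * (a*d - b*c)^2 divided by the product of the four marginal totals, where a, b, c, d are the table cells. The sketch below derives the cells from k, n, r, m; the class name and cell names are assumptions for illustration, not the library's code.

```java
final class ChiSquareSketch {
    /** Pearson chi-square statistic for the 2x2 table implied by k, n, r, m (no continuity correction). */
    static double chiSquare2by2(int k, int n, int r, int m) {
        double a = k;             // black and drawn
        double b = m - k;         // not black, drawn
        double c = r - k;         // black, not drawn
        double d = n - r - m + k; // not black, not drawn
        double diff = a * d - b * c;
        // Marginal totals: r, n - r (colour) and m, n - m (drawn or not).
        double margins = (double) r * (n - r) * m * (n - m);
        return n * diff * diff / margins;
    }

    public static void main(String[] args) {
        System.out.println(chiSquare2by2(2, 10, 4, 3));
    }
}
```

The formula divides by zero when any marginal total is 0, so such tables need separate handling.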

sigmoid

public static double sigmoid(double x)
Compute the sigmoid function with mean zero. Care is taken to compute an accurate answer without numerical overflow. (Added by rajatr)

Parameters:
x - Point to compute sigmoid at.
Returns:
Value of the sigmoid, given by 1/(1+exp(-x))
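
A common way to avoid overflow is to choose the algebraic form in which exp only receives a non-positive argument. The sketch below shows that approach; whether SloppyMath does the same is an assumption, and the class name SigmoidSketch is hypothetical.

```java
final class SigmoidSketch {
    /** Computes 1 / (1 + exp(-x)) without overflowing in exp. */
    static double sigmoid(double x) {
        if (x >= 0.0) {
            // exp(-x) is at most 1 here.
            return 1.0 / (1.0 + Math.exp(-x));
        }
        // For negative x use the equivalent form exp(x) / (1 + exp(x)).
        double ex = Math.exp(x);
        return ex / (1.0 + ex);
    }

    public static void main(String[] args) {
        System.out.println(sigmoid(0.0));
        System.out.println(sigmoid(-1000.0));
        System.out.println(sigmoid(1000.0));
    }
}
```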

poisson

public static double poisson(int x,
                             double lambda)

factorial

public static double factorial(int x)
Uses floating point so that it can represent the really big numbers that come up.

Parameters:
x - Argument to take factorial of
Returns:
Factorial of argument
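
The reason for the double return type is easy to see in a sketch: 21! already exceeds the range of a long, while a double can hold factorials up to 170! (with rounding). The class name FactorialSketch is hypothetical, and the loop below is an assumed, straightforward implementation.

```java
final class FactorialSketch {
    /** Computes x! in floating point; values beyond 170! overflow to infinity. */
    static double factorial(int x) {
        double result = 1.0;
        for (int i = 2; i <= x; i++) {
            result *= i;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(factorial(5));
        System.out.println(factorial(25));
        System.out.println(factorial(171));
    }
}
```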

main

public static void main(java.lang.String[] args)
Tests the hypergeometric distribution code, or other functions provided in this module.

Parameters:
args - Either none, in which case the log add routines are tested, or the following 4 arguments: k (cell), n (total), r (row), m (col)


Stanford NLP Group