java.lang.Object
  |
  +--mark.nlp.features.CorpusCounter
Assume a sampling process, where each sample determines two discrete random variables, C and W. The range of C is the integers in [0, l(C) - 1], where l(C) is the number of values C can assume. The range of W is the set of distinct objects {W_0, ..., W_(l(W)-1)}, where l(W) is the number of values W can assume. A CorpusCounter maintains joint counts of C and W. Counts may be doubles, meaning that a count of 0.5, for example, is legal.
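The counting contract described above can be illustrated with a minimal, map-backed sketch. It is not the library's implementation: `MapCorpusCounterSketch` and its `add` helper are hypothetical names, the class deliberately does not extend `mark.nlp.features.CorpusCounter` so that it compiles on its own, and it assumes that `WLength()` can be taken as the number of distinct W values observed so far.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in that mirrors the CorpusCounter method names.
// It does not extend the real class, so the example is self-contained.
class MapCorpusCounterSketch {
    private final int cLength;          // l(C): number of values C can assume
    private final double[] cCounts;     // cCounts[c] = #(C = c)
    // For each observed w, an array indexed by c holding #(C = c, W = w).
    private final Map<Object, double[]> cwCounts = new HashMap<>();
    private double total;               // total (possibly fractional) sample count

    MapCorpusCounterSketch(int cLength) {
        this.cLength = cLength;
        this.cCounts = new double[cLength];
    }

    // Hypothetical helper (not in the documented API): record one observation
    // of (c, w) with a weight, which may be fractional such as 0.5.
    void add(int c, Object w, double weight) {
        cwCounts.computeIfAbsent(w, k -> new double[cLength])[c] += weight;
        cCounts[c] += weight;
        total += weight;
    }

    int CLength() { return cLength; }

    // Assumption: the values W "can assume" are the values seen so far.
    int WLength() { return cwCounts.size(); }

    double num() { return total; }

    double numC(int c) { return cCounts[c]; }

    double numCW(int c, Object w) {
        double[] row = cwCounts.get(w);
        return row == null ? 0.0 : row[c];
    }

    double numW(Object w) {
        double[] row = cwCounts.get(w);
        if (row == null) return 0.0;
        double sum = 0.0;
        for (double v : row) sum += v;
        return sum;
    }

    Iterator<Object> WIterator() {
        return new ArrayList<>(cwCounts.keySet()).iterator();
    }

    // Values of W that have a positive joint count with the given c.
    Iterator<Object> WIterator(int c) {
        List<Object> seen = new ArrayList<>();
        for (Map.Entry<Object, double[]> e : cwCounts.entrySet()) {
            if (e.getValue()[c] > 0.0) seen.add(e.getKey());
        }
        return seen.iterator();
    }
}
```

With this sketch, adding ("a", C = 0) with weight 1.0 and ("a", C = 1) with weight 0.5 gives numW("a") = 1.5 and num() = 1.5, which shows how fractional counts accumulate.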
| Constructor Summary | |
CorpusCounter()
|
|
| Method Summary | |
abstract int |
CLength()
Returns the number of values C can assume. |
Table |
countTable(int c,
java.lang.Object w)
Returns (smoothed) counts in a 2-by-2 contingency table. |
Table |
countTable(java.lang.Object w)
Returns (smoothed) counts in an l(C)-by-2 contingency table. |
abstract double |
num()
Returns the total number of samples. |
abstract double |
numC(int c)
Returns #(C = c). |
abstract double |
numCW(int c,
java.lang.Object w)
Returns #(C = c, W = w). |
abstract double |
numW(java.lang.Object w)
Returns #(W = w). |
abstract java.util.Iterator |
WIterator()
Returns an iterator over the values W can assume. |
abstract java.util.Iterator |
WIterator(int c)
Returns an iterator over the values W assumes in conjunction with a given value of C. |
abstract int |
WLength()
Returns the number of values W can assume. |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
public CorpusCounter()
| Method Detail |
public abstract java.util.Iterator WIterator()
public abstract java.util.Iterator WIterator(int c)
c - the C value.
public abstract int CLength()
public abstract int WLength()
public abstract double numCW(int c,
java.lang.Object w)
c - the C value.
w - the W value.
public abstract double numC(int c)
c - the C value.
public abstract double numW(java.lang.Object w)
w - the W value.
public abstract double num()
public Table countTable(int c,
java.lang.Object w)
    | ^w | w
----+----+----
 ^c |    |
----+----+----
 c  |    |
For example, the cell in row ^c and col ^w holds the number of samples
that are neither in class c nor equal to w. We use Lidstone smoothing.
c - the C value.
w - the W value.
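The four cells can be derived from the marginal and joint counts that the class already exposes. The following is a sketch under stated assumptions, not the library's code: `TwoByTwoSketch` and the parameter `lambda` are hypothetical, the result is a plain `double[][]` rather than the library's `Table`, and Lidstone smoothing is assumed to mean adding a constant `lambda` to every cell (the source does not state the constant or where it is applied).

```java
// Hypothetical helper: builds the 2-by-2 table from raw counts.
class TwoByTwoSketch {
    // Rows are {^c, c}; columns are {^w, w}, matching the layout above.
    // num = total samples, numC = #(C = c), numW = #(W = w),
    // numCW = #(C = c, W = w), lambda = assumed additive smoothing constant.
    static double[][] table(double num, double numC, double numW,
                            double numCW, double lambda) {
        double cAndW = numCW;                        // in class c and equal to w
        double cNotW = numC - numCW;                 // in class c, not w
        double notCW = numW - numCW;                 // not in class c, equal to w
        double notCNotW = num - numC - numW + numCW; // neither (inclusion-exclusion)
        return new double[][] {
            { notCNotW + lambda, notCW + lambda },
            { cNotW + lambda,    cAndW + lambda }
        };
    }
}
```

Before smoothing, the four cells sum to num(), which is a convenient sanity check on any implementation of this table.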
public Table countTable(java.lang.Object w)
     | ^w | w
-----+----+----
 c_0 |    |
-----+----+----
  .  | .  | .
  .  | .  | .
  .  | .  | .
-----+----+----
 c_n |    |
A cell in row c_i and col ^w indicates the number of samples in class
c_i that are not w. A cell in row c_i and col w indicates the number of
samples in class c_i that are w. We use Lidstone smoothing.
w - the W value.
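The l(C)-by-2 table follows the same pattern, with one row per class. This is again only a sketch: `ClassByWordSketch` and `lambda` are hypothetical names, the table is returned as a `double[][]` instead of the library's `Table`, and smoothing is assumed to add `lambda` to each cell.

```java
// Hypothetical helper: builds the l(C)-by-2 table for a fixed w.
class ClassByWordSketch {
    // numC[i] = #(C = i); numCW[i] = #(C = i, W = w) for the fixed w.
    // Column 0 is ^w and column 1 is w, matching the layout above.
    static double[][] table(double[] numC, double[] numCW, double lambda) {
        if (numC.length != numCW.length) {
            throw new IllegalArgumentException("numC and numCW must have length l(C)");
        }
        double[][] t = new double[numC.length][2];
        for (int i = 0; i < numC.length; i++) {
            t[i][0] = numC[i] - numCW[i] + lambda; // class c_i, not w
            t[i][1] = numCW[i] + lambda;           // class c_i, equal to w
        }
        return t;
    }
}
```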