L
- The type of the labels in the Classifier
F
- The type of the features in the Classifier
public class LinearClassifier<L,F> extends Object implements ProbabilisticClassifier<L,F>, RVFClassifier<L,F>
Authors: … (weightsAsMapOfCounters()), Angel Chang (added functions to get the top features and the number of features with weights above a certain threshold)
Modifier and Type | Field and Description |
---|---|
boolean | intern |
static String | TEXT_SERIALIZATION_DELIMITER |
Constructor and Description |
---|
LinearClassifier(Counter<? extends Pair<F,L>> weightCounter) |
LinearClassifier(Counter<? extends Pair<F,L>> weightCounter, Counter<L> thresholdsC) |
LinearClassifier(double[][] weights, Index<F> featureIndex, Index<L> labelIndex): Make a linear classifier from the parameters. |
LinearClassifier(double[][] weights, Index<F> featureIndex, Index<L> labelIndex, double[] thresholds) |
LinearClassifier(double[] weights, Index<Pair<F,L>> weightIndex) |
Modifier and Type | Method and Description |
---|---|
void | adaptWeights(Dataset<L,F> adapt, LinearClassifierFactory<L,F> lcf) |
L | classOf(Datum<L,F> example) |
L | classOf(RVFDatum<L,F> example): Deprecated. |
void | dump(): Print all features in the classifier and the weight that they assign to each class. |
void | dump(PrintWriter pw) |
void | dumpSorted(): Print all features in the classifier and the weight that they assign to each class. |
Index<F> | featureIndex() |
Collection<F> | features() |
int | getFeatureCount(double threshold, boolean useMagnitude): Returns the number of features with weight above a certain threshold (across all labels). |
int | getFeatureCount(Set<L> labels, double threshold, boolean useMagnitude): Returns the number of features with weight above a certain threshold. |
protected int | getFeatureCountLabelIndices(Set<Integer> iLabels, double threshold, boolean useMagnitude): Returns the number of features with weight above a certain threshold. |
protected Set<Integer> | getLabelIndices(Set<L> labels): Returns the indices of the given labels. |
List<Triple<F,L,Double>> | getTopFeatures(double threshold, boolean useMagnitude, int numFeatures): Returns a list of the top features with weight above a certain threshold (the list is descending and across all labels). |
List<Triple<F,L,Double>> | getTopFeatures(Set<L> labels, double threshold, boolean useMagnitude, int numFeatures, boolean descending): Returns a list of the top features with weight above a certain threshold. |
protected List<Triple<F,L,Double>> | getTopFeaturesLabelIndices(Set<Integer> iLabels, double threshold, boolean useMagnitude, int numFeatures, boolean descending): Returns a list of the top features with weight above a certain threshold. |
void | justificationOf(Datum<L,F> example) |
void | justificationOf(Datum<L,F> example, PrintWriter pw): Print all features active for a particular datum and the weight that the classifier assigns to each class for those features. |
void | justificationOf(Datum<L,F> example, PrintWriter pw, boolean sorted): Print all features active for a particular datum and the weight that the classifier assigns to each class for those features. |
<T> void | justificationOf(Datum<L,F> example, PrintWriter pw, java.util.function.Function<F,T> printer) |
<T> void | justificationOf(Datum<L,F> example, PrintWriter pw, java.util.function.Function<F,T> printer, boolean sortedByFeature): Print all features active for a particular datum and the weight that the classifier assigns to each class for those features. |
void | justificationOf(RVFDatum<L,F> example): Deprecated. |
void | justificationOf(RVFDatum<L,F> example, PrintWriter pw): Deprecated. |
Index<L> | labelIndex() |
Collection<L> | labels() |
Counter<L> | logProbabilityOf(Datum<L,F> example): Returns a counter mapping from each class name to the log probability of that class for a certain example. |
Counter<L> | logProbabilityOf(int[] features): Given a datum's features, returns a counter mapping from each class name to the log probability of that class. |
Counter<L> | logProbabilityOf(RVFDatum<L,F> example): Deprecated. |
Counter<L> | probabilityOf(Datum<L,F> example): Returns a counter mapping from each class name to the probability of that class for a certain example. |
Counter<L> | probabilityOf(int[] features) |
Counter<L> | probabilityOf(RVFDatum<L,F> example): Deprecated. |
static <L,F> LinearClassifier<L,F> | readClassifier(String loadPath): Loads a classifier from a file. |
void | saveToFilename(String file): Saves this classifier to a standard text file, instead of as a serialized Java object. |
double | scoreOf(Datum<L,F> example, L label): Returns the score of the Datum for the specified label. |
double | scoreOf(RVFDatum<L,F> example, L label): Deprecated. |
Counter<L> | scoresOf(Datum<L,F> example): Construct a counter with keys the labels of the classifier and values the score (unnormalized log probability) of each class. |
Counter<L> | scoresOf(Datum<L,F> example, Collection<L> possibleLabels) |
Counter<L> | scoresOf(int[] features): Given a datum's features, construct a counter with keys the labels and values the score (unnormalized log probability) for each class. |
Counter<L> | scoresOf(RVFDatum<L,F> example): Deprecated. |
void | setWeights(double[][] newWeights) |
String | toAllWeightsString() |
String | toBiggestWeightFeaturesString(boolean useMagnitude, int numFeatures, boolean printDescending): Return a String that prints features with large weights. |
String | toDistributionString(int threshold): Similar to the histogram, but prints the exact values of the weights to show whether there are many equal weights. |
String | toHistogramString() |
String | topFeaturesToString(List<Triple<F,L,Double>> topFeatures): Returns a string representation of a list of top features. |
String | toString(): Print out a partial representation of a linear classifier. |
String | toString(String style, int param): Print out a partial representation of a linear classifier in one of several ways. |
int | totalSize() |
double | weight(F feature, L label) |
double[][] | weights() |
Map<L,Counter<F>> | weightsAsMapOfCounters(): Returns a map from each label to a counter of feature weights for that label. |
static void | writeClassifier(LinearClassifier<?,?> classifier, String writePath): Convenience wrapper for IOUtils.writeObjectToFile. |
public boolean intern
public static final String TEXT_SERIALIZATION_DELIMITER
public LinearClassifier(double[][] weights, Index<F> featureIndex, Index<L> labelIndex)
weights
- The parameters of the classifier. The first index is the featureIndex value and second index is the labelIndex value.
featureIndex
- An index from F to integers used to index the features in the weights array
labelIndex
- An index from L to integers used to index the labels in the weights array
public LinearClassifier(double[][] weights, Index<F> featureIndex, Index<L> labelIndex, double[] thresholds) throws Exception
Exception
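The indexing convention described for the weights parameter (first index is the feature, second index is the label) can be illustrated with plain arrays. The following is a minimal stand-alone sketch, not the library's code: the two lists are hypothetical stand-ins for Index<F> and Index<L>, the class and method names are made up for illustration, and returning 0.0 for an unknown feature or label is an assumption.

```java
import java.util.Arrays;
import java.util.List;

final class WeightLayoutSketch {
    // Hypothetical stand-ins for Index<F> and Index<L>: list position is the index.
    static final List<String> FEATURES = Arrays.asList("contains_free", "contains_meeting");
    static final List<String> LABELS = Arrays.asList("spam", "ham");

    // weights[featureIndex][labelIndex], following the convention described above.
    static final double[][] WEIGHTS = {
        { 2.0, -1.0 },  // contains_free:    spam, ham
        { -0.5, 1.5 },  // contains_meeting: spam, ham
    };

    /** Looks up one weight by feature and label name. */
    static double weight(String feature, String label) {
        int f = FEATURES.indexOf(feature);
        int l = LABELS.indexOf(label);
        if (f < 0 || l < 0) {
            return 0.0; // assumption: unknown features or labels carry no weight
        }
        return WEIGHTS[f][l];
    }

    public static void main(String[] args) {
        System.out.println(weight("contains_free", "spam"));
        System.out.println(weight("contains_meeting", "ham"));
    }
}
```

Keeping the feature as the first index means each row of the array holds one feature's weights for every label.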
public Collection<L> labels()
labels
in interface Classifier<L,F>
public Collection<F> features()
public Counter<L> scoresOf(Datum<L,F> example)
scoresOf
in interface Classifier<L,F>
public Counter<L> scoresOf(int[] features)
public double scoreOf(Datum<L,F> example, L label)
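For a linear classifier, the score of a label is conventionally the sum of the weights of the features that are active in the datum. The sketch below shows that computation on plain arrays; it assumes binary (present or absent) features, ignores any per-label thresholds, and skips feature indices outside the weights array. The class and method names are hypothetical and this is not the library's implementation.

```java
final class LinearScoreSketch {
    /**
     * Sums, for each label, the weights of the active features.
     * weights[f][l] uses the feature-then-label index order.
     */
    static double[] scoresOf(int[] activeFeatures, double[][] weights, int numLabels) {
        double[] scores = new double[numLabels];
        for (int f : activeFeatures) {
            if (f < 0 || f >= weights.length) {
                continue; // assumption: unknown feature indices are skipped
            }
            for (int l = 0; l < numLabels; l++) {
                scores[l] += weights[f][l];
            }
        }
        return scores;
    }

    /** Index of the highest score; ties go to the lowest index. */
    static int argmax(double[] scores) {
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] weights = { { 2.0, -1.0 }, { -0.5, 1.5 }, { 0.25, 0.25 } };
        double[] scores = scoresOf(new int[] { 0, 2 }, weights, 2);
        System.out.println(scores[0] + " " + scores[1] + " best=" + argmax(scores));
    }
}
```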
@Deprecated public Counter<L> scoresOf(RVFDatum<L,F> example)
scoresOf
in interface RVFClassifier<L,F>
@Deprecated public double scoreOf(RVFDatum<L,F> example, L label)
public Counter<L> probabilityOf(Datum<L,F> example)
probabilityOf
in interface ProbabilisticClassifier<L,F>
@Deprecated public Counter<L> probabilityOf(RVFDatum<L,F> example)
public Counter<L> logProbabilityOf(Datum<L,F> example)
logProbabilityOf
in interface ProbabilisticClassifier<L,F>
public Counter<L> logProbabilityOf(int[] features)
@Deprecated public Counter<L> logProbabilityOf(RVFDatum<L,F> example)
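Because the scores are described as unnormalized log probabilities, one standard way to turn them into log probabilities is to subtract the log of the sum of their exponentials (a softmax). The sketch below shows this normalization on a plain array, with the usual max-subtraction trick for numerical stability. It is an illustration under that assumption, not the library's code, and the class and method names are hypothetical.

```java
final class SoftmaxSketch {
    /** Computes log(sum(exp(x))) stably by factoring out the maximum. */
    static double logSumExp(double[] xs) {
        double max = Double.NEGATIVE_INFINITY;
        for (double x : xs) {
            max = Math.max(max, x);
        }
        double sum = 0.0;
        for (double x : xs) {
            sum += Math.exp(x - max);
        }
        return max + Math.log(sum);
    }

    /** Log probability of each label: score minus the log normalizer. */
    static double[] logProbabilities(double[] scores) {
        double logZ = logSumExp(scores);
        double[] out = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            out[i] = scores[i] - logZ;
        }
        return out;
    }

    /** Probability of each label: exponentiated log probability. */
    static double[] probabilities(double[] scores) {
        double[] out = logProbabilities(scores);
        for (int i = 0; i < out.length; i++) {
            out[i] = Math.exp(out[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] probs = probabilities(new double[] { 2.0, 1.0, -1.0 });
        for (double p : probs) {
            System.out.println(p);
        }
    }
}
```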
protected Set<Integer> getLabelIndices(Set<L> labels)
labels
- Set of labels to get indices
public int getFeatureCount(double threshold, boolean useMagnitude)
threshold
- Threshold above which we will count the feature
useMagnitude
- Whether the notion of "large" should ignore the sign of the feature weight.
public int getFeatureCount(Set<L> labels, double threshold, boolean useMagnitude)
labels
- Set of labels we care about when counting features. Use null to get counts across all labels.
threshold
- Threshold above which we will count the feature
useMagnitude
- Whether the notion of "large" should ignore the sign of the feature weight.
protected int getFeatureCountLabelIndices(Set<Integer> iLabels, double threshold, boolean useMagnitude)
iLabels
- Set of label indices we care about when counting features. Use null to get counts across all labels.
threshold
- Threshold above which we will count the feature
useMagnitude
- Whether the notion of "large" should ignore the sign of the feature weight.
public List<Triple<F,L,Double>> getTopFeatures(double threshold, boolean useMagnitude, int numFeatures)
threshold
- Threshold above which we will count the feature
useMagnitude
- Whether the notion of "large" should ignore the sign of the feature weight.
numFeatures
- How many top features to return (-1 for unlimited)
public List<Triple<F,L,Double>> getTopFeatures(Set<L> labels, double threshold, boolean useMagnitude, int numFeatures, boolean descending)
labels
- Set of labels we care about when getting features. Use null to get features across all labels.
threshold
- Threshold above which we will count the feature
useMagnitude
- Whether the notion of "large" should ignore the sign of the feature weight.
numFeatures
- How many top features to return (-1 for unlimited)
descending
- Return weights in descending order
protected List<Triple<F,L,Double>> getTopFeaturesLabelIndices(Set<Integer> iLabels, double threshold, boolean useMagnitude, int numFeatures, boolean descending)
iLabels
- Set of label indices we care about when getting features. Use null to get features across all labels.
threshold
- Threshold above which we will count the feature
useMagnitude
- Whether the notion of "large" should ignore the sign of the feature weight.
numFeatures
- How many top features to return (-1 for unlimited)
descending
- Return weights in descending order
public String topFeaturesToString(List<Triple<F,L,Double>> topFeatures)
topFeatures
- List of triples indicating feature, label, weight
public String toBiggestWeightFeaturesString(boolean useMagnitude, int numFeatures, boolean printDescending)
useMagnitude
- Whether the notion of "large" should ignore the sign of the feature weight.
numFeatures
- How many top features to print
printDescending
- Print weights in descending order
public String toDistributionString(int threshold)
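The threshold, useMagnitude, numFeatures and descending parameters documented above for the feature-count and top-feature methods can be sketched on a plain weights array as follows. Several details are assumptions made for illustration rather than documented behavior: the comparison is strict (`>`), every (feature, label) weight entry is counted separately, and truncation to numFeatures happens before the order is reversed for ascending output. The `Entry` class is a hypothetical stand-in for Triple<F,L,Double>.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class TopFeaturesSketch {
    /** Hypothetical stand-in for Triple<F,L,Double>: feature index, label index, weight. */
    static final class Entry {
        final int feature;
        final int label;
        final double weight;

        Entry(int feature, int label, double weight) {
            this.feature = feature;
            this.label = label;
            this.weight = weight;
        }
    }

    /** Value compared against the threshold: |w| when useMagnitude is set, else w. */
    static double key(double w, boolean useMagnitude) {
        return useMagnitude ? Math.abs(w) : w;
    }

    /** Counts weight entries above the threshold (assumption: strict comparison). */
    static int countAbove(double[][] weights, double threshold, boolean useMagnitude) {
        int count = 0;
        for (double[] row : weights) {
            for (double w : row) {
                if (key(w, useMagnitude) > threshold) {
                    count++;
                }
            }
        }
        return count;
    }

    /** Collects entries above the threshold, keeps the largest numFeatures (-1 keeps all). */
    static List<Entry> topEntries(double[][] weights, double threshold,
                                  boolean useMagnitude, int numFeatures, boolean descending) {
        List<Entry> kept = new ArrayList<>();
        for (int f = 0; f < weights.length; f++) {
            for (int l = 0; l < weights[f].length; l++) {
                if (key(weights[f][l], useMagnitude) > threshold) {
                    kept.add(new Entry(f, l, weights[f][l]));
                }
            }
        }
        // Sort largest first so that truncation keeps the top entries.
        kept.sort(Comparator.comparingDouble((Entry e) -> key(e.weight, useMagnitude)).reversed());
        if (numFeatures >= 0 && numFeatures < kept.size()) {
            kept = new ArrayList<>(kept.subList(0, numFeatures));
        }
        if (!descending) {
            Collections.reverse(kept);
        }
        return kept;
    }

    public static void main(String[] args) {
        double[][] weights = { { 2.0, -3.0 }, { 0.5, 1.0 } };
        System.out.println(countAbove(weights, 0.75, true));
        for (Entry e : topEntries(weights, 0.75, true, 2, true)) {
            System.out.println(e.feature + "," + e.label + " -> " + e.weight);
        }
    }
}
```

With useMagnitude set, a large negative weight ranks alongside large positive ones, which is useful when looking for the features that most strongly push a decision in either direction.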
public int totalSize()
public String toHistogramString()
public String toString()
public String toString(String style, int param)
style
- Options are:
HighWeight: print out the param parameters with largest weights;
HighMagnitude: print out the param parameters for which the absolute value of their weight is largest;
AllWeights: print out the weights of all features;
WeightHistogram: print out a particular hard-coded textual histogram representation of a classifier;
WeightDistribution;
param
- Determines the number of things printed in certain styles
IllegalArgumentException
- if the style name is unrecognized
public String toAllWeightsString()
public void dump()
public void dump(PrintWriter pw)
@Deprecated public void justificationOf(RVFDatum<L,F> example)
@Deprecated public void justificationOf(RVFDatum<L,F> example, PrintWriter pw)
public <T> void justificationOf(Datum<L,F> example, PrintWriter pw, java.util.function.Function<F,T> printer)
public <T> void justificationOf(Datum<L,F> example, PrintWriter pw, java.util.function.Function<F,T> printer, boolean sortedByFeature)
example
- The datum for which features are to be printed
pw
- Where to print it to
printer
- If this is non-null, then it is applied to each feature to convert it to a more readable form
sortedByFeature
- Whether to sort by feature names
public Map<L,Counter<F>> weightsAsMapOfCounters()
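The label-to-feature-weights view that weightsAsMapOfCounters() is described as returning can be pictured as a regrouping of the weights[feature][label] array by label. The sketch below does that regrouping with nested java.util maps as a hypothetical stand-in for Map<L,Counter<F>>; the names are illustrative and this is not the library's implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class WeightsByLabelSketch {
    /** Regroups weights[feature][label] into label -> (feature -> weight). */
    static Map<String, Map<String, Double>> weightsByLabel(
            double[][] weights, List<String> features, List<String> labels) {
        Map<String, Map<String, Double>> byLabel = new LinkedHashMap<>();
        for (int l = 0; l < labels.size(); l++) {
            Map<String, Double> perFeature = new LinkedHashMap<>();
            for (int f = 0; f < features.size(); f++) {
                perFeature.put(features.get(f), weights[f][l]);
            }
            byLabel.put(labels.get(l), perFeature);
        }
        return byLabel;
    }

    public static void main(String[] args) {
        double[][] weights = { { 2.0, -1.0 }, { -0.5, 1.5 } };
        List<String> features = Arrays.asList("contains_free", "contains_meeting");
        List<String> labels = Arrays.asList("spam", "ham");
        System.out.println(weightsByLabel(weights, features, labels));
    }
}
```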
public void justificationOf(Datum<L,F> example, PrintWriter pw)
public void dumpSorted()
public void justificationOf(Datum<L,F> example, PrintWriter pw, boolean sorted)
@Deprecated public L classOf(RVFDatum<L,F> example)
classOf
in interface RVFClassifier<L,F>
public double[][] weights()
public void setWeights(double[][] newWeights)
public static <L,F> LinearClassifier<L,F> readClassifier(String loadPath)
public static void writeClassifier(LinearClassifier<?,?> classifier, String writePath)
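Since writeClassifier is described above as a convenience wrapper for IOUtils.writeObjectToFile, saving and loading a classifier this way amounts to a Java object serialization round trip. The sketch below shows that round trip with the standard java.io classes on a plain weights array. It is only an illustration: any buffering or compression the real helper applies is not shown, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

final class SerializationSketch {
    /** Writes a serializable object to the given path. */
    static void writeObject(Object obj, Path path) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(path))) {
            out.writeObject(obj);
        }
    }

    /** Reads back whatever object was serialized at the given path. */
    static Object readObject(Path path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(path))) {
            return in.readObject();
        }
    }

    /** Saves the weights to a temporary file, loads them again, and cleans up. */
    static double[][] roundTrip(double[][] weights) {
        try {
            Path tmp = Files.createTempFile("weights", ".ser");
            try {
                writeObject(weights, tmp);
                return (double[][]) readObject(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        double[][] restored = roundTrip(new double[][] { { 1.0, 2.0 }, { 3.0, 4.0 } });
        System.out.println(Arrays.deepToString(restored));
    }
}
```

A serialized object is tied to the classes that wrote it, which is one reason a plain-text alternative such as saveToFilename can be preferable for long-lived model files.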
public void saveToFilename(String file)
file
- String filepath to write out to.