edu.stanford.nlp.classify
Class NBLinearClassifierFactory<L,F>

java.lang.Object
  extended by edu.stanford.nlp.classify.AbstractLinearClassifierFactory<L,F>
      extended by edu.stanford.nlp.classify.NBLinearClassifierFactory<L,F>
Type Parameters:
L - The type of the labels in the Classifier
F - The type of the features in the Classifier
All Implemented Interfaces:
ClassifierFactory<L,F,Classifier<L,F>>, Serializable

public class NBLinearClassifierFactory<L,F>
extends AbstractLinearClassifierFactory<L,F>

Provides a medium-weight implementation of Bernoulli (or binary) Naive Bayes via a linear classifier. It is medium-weight in that it uses dense arrays for counts and calculation (though Naive Bayes is, in any case, efficient to estimate). Each feature is treated as an independent binary variable.

CDM Jun 2003: I added a dirty trick so that if there is a feature that is always on in the input examples, its weight is treated as a prior over classes rather than as evidence. (This will work well iff the feature is also always on at test time.) This is done for each such feature, so by including several such features one can obtain an integral prior boost.
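For orientation (not part of the original API documentation), a minimal sketch of training and applying a classifier built by this factory. It assumes edu.stanford.nlp.classify.Dataset and edu.stanford.nlp.ling.BasicDatum from the same toolkit; the feature and label strings are made up for illustration:

    import java.util.Arrays;

    import edu.stanford.nlp.classify.Classifier;
    import edu.stanford.nlp.classify.Dataset;
    import edu.stanford.nlp.classify.NBLinearClassifierFactory;
    import edu.stanford.nlp.ling.BasicDatum;

    public class NBLinearClassifierExample {
      public static void main(String[] args) {
        // Build a tiny training set; each datum is a set of binary features plus a label.
        Dataset<String, String> data = new Dataset<>();
        data.add(new BasicDatum<String, String>(Arrays.asList("sunny", "warm"), "beach"));
        data.add(new BasicDatum<String, String>(Arrays.asList("rainy", "cold"), "home"));

        // sigma = 1.0 gives add-one smoothing of the evidence counts.
        NBLinearClassifierFactory<String, String> factory =
            new NBLinearClassifierFactory<>(1.0);
        Classifier<String, String> classifier = factory.trainClassifier(data);

        // Classify an unseen datum.
        String label =
            classifier.classOf(new BasicDatum<String, String>(Arrays.asList("sunny", "cold")));
        System.out.println(label);
      }
    }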

Author:
Dan Klein, Sarah Spikes (sdspikes@cs.stanford.edu) (Templatization)
See Also:
Serialized Form

Constructor Summary
NBLinearClassifierFactory()
          Create a ClassifierFactory.
NBLinearClassifierFactory(double sigma)
          Create a ClassifierFactory.
NBLinearClassifierFactory(double sigma, boolean interpretAlwaysOnFeatureAsPrior)
          Create a ClassifierFactory.
 
Method Summary
 void setTuneSigmaCV(int folds)
          setTuneSigmaCV sets the tuneSigma flag: when turned on, the sigma is tuned by cross-validation.
protected  double[][] trainWeights(GeneralDataset<L,F> data)
           
 
Methods inherited from class edu.stanford.nlp.classify.AbstractLinearClassifierFactory
trainClassifier, trainClassifier, trainClassifier, trainClassifier
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

NBLinearClassifierFactory

public NBLinearClassifierFactory()
Create a ClassifierFactory.


NBLinearClassifierFactory

public NBLinearClassifierFactory(double sigma)
Create a ClassifierFactory.

Parameters:
sigma - The amount of add-sigma smoothing of evidence

NBLinearClassifierFactory

public NBLinearClassifierFactory(double sigma,
                                 boolean interpretAlwaysOnFeatureAsPrior)
Create a ClassifierFactory.

Parameters:
sigma - The amount of add-sigma smoothing of evidence
interpretAlwaysOnFeatureAsPrior - If true, a feature that is present in every data item is interpreted as an indication to include a prior factor over classes. (If there are multiple such features, an integral "prior boost" will occur.) If false, an always-on feature is interpreted as an evidence feature (and, following the standard math, it will have no effect on the model).
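For illustration (not in the original Javadoc), a hedged sketch of this constructor together with the always-on-feature trick described in the class comment. The feature name "CLASS_BIAS" is an arbitrary choice, not part of this API:

    // sigma = 0.5; an always-on feature is read as a class prior rather than as evidence.
    NBLinearClassifierFactory<String, String> factory =
        new NBLinearClassifierFactory<>(0.5, true);
    // If every training (and test) datum also contains the feature "CLASS_BIAS",
    // its learned weight then acts as a prior over classes.
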
Method Detail

trainWeights

protected double[][] trainWeights(GeneralDataset<L,F> data)
Specified by:
trainWeights in class AbstractLinearClassifierFactory<L,F>

setTuneSigmaCV

public void setTuneSigmaCV(int folds)
setTuneSigmaCV sets the tuneSigma flag: when turned on, sigma is tuned by cross-validation. If there are fewer data items than folds, leave-one-out cross-validation is used instead. The default for tuneSigma is false.

Parameters:
folds - Number of folds for cross validation
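
A minimal sketch (not in the original Javadoc) of tuning sigma by 5-fold cross-validation before training; data is assumed to be a GeneralDataset such as the Dataset built in the class-level example above:

    NBLinearClassifierFactory<String, String> factory = new NBLinearClassifierFactory<>();
    factory.setTuneSigmaCV(5);  // falls back to leave-one-out if data has fewer than 5 items
    Classifier<String, String> classifier = factory.trainClassifier(data);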

