edu.stanford.nlp.tmt.model.lda

GibbsLDA

class GibbsLDA extends LDA[HardAssignmentModelState, GibbsLDADocument, (String, Array[Int])] with HardAssignmentModel[LDAModelParams, LDADocumentParams, GibbsLDADocument]

Collapsed Gibbs sampler for LDA learning and inference. This class is not thread-safe for learning. It is thread-safe for inference, but repeatability is not guaranteed in a threaded environment if the number of threads differs between runs. This is because each thread is given its own random number generator to avoid synchronization overhead, so the sequence of random numbers seen on a particular document may be a function of the number of threads.
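The repeatability caveat can be illustrated with a minimal, self-contained sketch. It does not use the TMT classes: the `PerThreadRngSketch` object, the `draws` method, the round-robin assignment of documents to workers, and the `seed + t` seeding scheme are all assumptions made for illustration, and the actual sampler may assign generators differently.

```scala
import scala.util.Random

// Hypothetical illustration, not the TMT implementation.
object PerThreadRngSketch {
  // Draws one random number per document. Worker t owns its own generator
  // (no locking needed) and handles documents t, t + numThreads, ...
  def draws(numDocs: Int, numThreads: Int, seed: Long): Vector[Long] = {
    val out = new Array[Long](numDocs)
    val workers = (0 until numThreads).map { t =>
      val rng = new Random(seed + t) // private to worker t (assumed scheme)
      new Thread(new Runnable {
        def run(): Unit =
          for (d <- t until numDocs by numThreads) out(d) = rng.nextLong()
      })
    }
    workers.foreach(_.start())
    workers.foreach(_.join()) // join makes the workers' writes visible here
    out.toVector
  }
}
```

With the same seed and the same number of threads, `draws` returns the same vector on every run. Changing the number of threads changes which generator, and which position in that generator's sequence, each document sees, so the numbers for a given document will generally differ.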

linear super types: HardAssignmentModel[LDAModelParams, LDADocumentParams, GibbsLDADocument], LDA[HardAssignmentModelState, GibbsLDADocument, (String, Array[Int])], DirichletTopicSmoothing, DirichletTermSmoothing, ClosedTopicSet, TopicModel[LDAModelParams, HardAssignmentModelState, LDADocumentParams, GibbsLDADocument, (String, Array[Int])], RepCheck, Stateful[HardAssignmentModelState], AnyRef, Any

Instance constructors

  1. new GibbsLDA (params: LDAModelParams, seed: Long = …, inferParams: GibbsInferParams = …, log: (String) ⇒ Unit = …)

Value Members

  1. def != (arg0: AnyRef) : Boolean

    attributes: final
    definition classes: AnyRef
  2. def != (arg0: Any) : Boolean

    o != arg0 is the same as !(o == (arg0)).

    arg0

    the object to compare against this object for dis-equality.

    returns

    false if the receiver object is equivalent to the argument; true otherwise.

    attributes: final
    definition classes: Any
  3. def ## () : Int

    attributes: final
    definition classes: AnyRef → Any
  4. def $asInstanceOf [T0] () : T0

    attributes: final
    definition classes: AnyRef
  5. def $isInstanceOf [T0] () : Boolean

    attributes: final
    definition classes: AnyRef
  6. def == (arg0: AnyRef) : Boolean

    o == arg0 is the same as if (o eq null) arg0 eq null else o.equals(arg0).

    arg0

    the object to compare against this object for equality.

    returns

    true if the receiver object is equivalent to the argument; false otherwise.

    attributes: final
    definition classes: AnyRef
  7. def == (arg0: Any) : Boolean

    o == arg0 is the same as o.equals(arg0).

    arg0

    the object to compare against this object for equality.

    returns

    true if the receiver object is equivalent to the argument; false otherwise.

    attributes: final
    definition classes: Any
  8. def asInstanceOf [T0] : T0

    This method is used to cast the receiver object to be of type T0.

    Note that the success of a cast at runtime is modulo Scala's erasure semantics. Therefore the expression 1.asInstanceOf[String] will throw a ClassCastException at runtime, while the expression List(1).asInstanceOf[List[String]] will not. In the latter example, because the type argument is erased as part of compilation, it is not possible to check whether the contents of the list are of the requested type.

    returns

    the receiver object.

    attributes: final
    definition classes: Any
  9. var checkers : List[Function0[_]]

    attributes: protected
    definition classes: RepCheck
  10. def checkrep () : Unit

    Assert invariants.

    attributes: protected final
    definition classes: RepCheck
  11. def clone () : AnyRef

    This method creates and returns a copy of the receiver object.

    The default implementation of the clone method is platform dependent.

    returns

    a copy of the receiver object.

    attributes: protected
    definition classes: AnyRef
  12. def computeCrossEntropy (doc: LDADocumentParams) : (Double, Int)

    Computes the total cross-entropy of the terms in the second half of the document based on an estimate of theta from the terms in the first half of the document. Returns (sum crossEntropy, numTerms). This is used as the basis of computePerplexity.

    definition classes: LDA
  13. def computeLogPW (doc: GibbsLDADocument) : Double

    Computes the log probability for the current document. This measure treats the assignment to theta and the model counts as observed. Returns sum_i log P(w_i | theta*, beta*). Beta maps from (topic, term) to probability.

    definition classes: LDA
  14. def computePerplexity (docs: Traversable[LDADocumentParams]) : Double

    Computes the average per-word perplexity of the given dataset.

    definition classes: LDA
  15. val countTopic : Array[Int]

    How many times each topic is seen overall.

    definition classes: HardAssignmentModel
  16. val countTopicTerm : Array[Array[Int]]

    How many times each term is seen in each topic.

    definition classes: HardAssignmentModel
  17. def create (dp: LDADocumentParams) : GibbsLDADocument

    Creates a document from the given document parameters.

    definition classes: GibbsLDA → TopicModel
  18. def eq (arg0: AnyRef) : Boolean

    This method is used to test whether the argument (arg0) is a reference to the receiver object (this).

    The eq method implements an equivalence relation (http://en.wikipedia.org/wiki/Equivalence_relation) on non-null instances of AnyRef:

      * It is reflexive: for any non-null instance x of type AnyRef, x.eq(x) returns true.
      * It is symmetric: for any non-null instances x and y of type AnyRef, x.eq(y) returns true if and only if y.eq(x) returns true.
      * It is transitive: for any non-null instances x, y, and z of type AnyRef, if x.eq(y) returns true and y.eq(z) returns true, then x.eq(z) returns true.

    Additionally, the eq method has three other properties:

      * It is consistent: for any non-null instances x and y of type AnyRef, multiple invocations of x.eq(y) consistently return true or consistently return false.
      * For any non-null instance x of type AnyRef, x.eq(null) and null.eq(x) return false.
      * null.eq(null) returns true.

    When overriding the equals or hashCode methods, it is important to ensure that their behavior is consistent with reference equality. Therefore, if two objects are references to each other (o1 eq o2), they should be equal to each other (o1 == o2) and they should hash to the same value (o1.hashCode == o2.hashCode).

    arg0

    the object to compare against this object for reference equality.

    returns

    true if the argument is a reference to the receiver object; false otherwise.

    attributes: final
    definition classes: AnyRef
  19. def equals (arg0: Any) : Boolean

    This method is used to compare the receiver object (this) with the argument object (arg0) for equivalence.

    The default implementation of this method is an equivalence relation (http://en.wikipedia.org/wiki/Equivalence_relation):

      * It is reflexive: for any instance x of type Any, x.equals(x) should return true.
      * It is symmetric: for any instances x and y of type Any, x.equals(y) should return true if and only if y.equals(x) returns true.
      * It is transitive: for any instances x, y, and z of type AnyRef, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.

    If you override this method, you should verify that your implementation remains an equivalence relation. Additionally, when overriding this method it is often necessary to override hashCode to ensure that objects that are "equal" (o1.equals(o2) returns true) hash to the same scala.Int (o1.hashCode.equals(o2.hashCode)).

    arg0

    the object to compare against this object for equality.

    returns

    true if the receiver object is equivalent to the argument; false otherwise.

    definition classes: AnyRef → Any
  20. def finalize () : Unit

    This method is called by the garbage collector on the receiver object when garbage collection determines that there are no more references to the object.

    The details of when and if the finalize method is invoked, as well as the interaction between finalize and non-local returns and exceptions, are all platform dependent.

    attributes: protected
    definition classes: AnyRef
  21. def getClass () : java.lang.Class[_]

    Returns a representation that corresponds to the dynamic class of the receiver object.

    The nature of the representation is platform dependent.

    returns

    a representation that corresponds to the dynamic class of the receiver object.

    attributes: final
    definition classes: AnyRef
  22. def getTopicTermDistribution (topic: String) : Array[Double]

    Returns the distribution over terms for the given topic. The return value of this method is assumed to have already incorporated the corresponding getTermSmoothing to the appropriate extent.

    attributes: final
    definition classes: ClosedTopicSet
  23. def getTopicTermDistribution (topic: Int) : Array[Double]

    Returns the distribution over terms for the given topic. The return value of this method is assumed to have already incorporated the corresponding getTermSmoothing to the appropriate extent.

    definition classes: ClosedTopicSet
  24. def hashCode () : Int

    Returns a hash code value for the object.

    The default hashing algorithm is platform dependent.

    Note that it is allowed for two objects to have identical hash codes (o1.hashCode.equals(o2.hashCode)) yet not be equal (o1.equals(o2) returns false). A degenerate implementation could always return 0. However, it is required that if two objects are equal (o1.equals(o2) returns true) that they have identical hash codes (o1.hashCode.equals(o2.hashCode)). Therefore, when overriding this method, be sure to verify that the behavior is consistent with the equals method.

    returns

    the hash code value for the object.

    definition classes: AnyRef → Any
  25. def infer (doc: GibbsLDADocument) : Array[Double]

    Does inference on the given document until convergence.

    definition classes: GibbsLDA → LDA
  26. def infer (doc: String) : Array[Double]

    Does inference on the given document until convergence.

    definition classes: LDA
  27. def infer (doc: LDADocumentParams) : Array[Double]

    Does inference on the given document until convergence.

    definition classes: LDA
  28. val inferParams : GibbsInferParams

  29. def inferSampler : InferSampler

    Gets a thread-local inference sampler.

  30. val inferSamplerTL : ThreadLocal[InferSampler]

    attributes: protected
  31. def isInstanceOf [T0] : Boolean

    This method is used to test whether the dynamic type of the receiver object is T0.

    Note that the result of the test is modulo Scala's erasure semantics. Therefore the expression 1.isInstanceOf[String] will return false, while the expression List(1).isInstanceOf[List[String]] will return true. In the latter example, because the type argument is erased as part of compilation, it is not possible to check whether the contents of the list are of the requested type.

    returns

    true if the receiver object is an instance of the erasure of type T0; false otherwise.

    attributes: final
    definition classes: Any
  32. val learnSampler : LearnSampler

  33. val log : (String) ⇒ Unit

    Where log messages go. Defaults to System.err.println.

    definition classes: GibbsLDA → TopicModel
  34. def ne (arg0: AnyRef) : Boolean

    o.ne(arg0) is the same as !(o.eq(arg0)).

    arg0

    the object to compare against this object for reference dis-equality.

    returns

    false if the argument is not a reference to the receiver object; true otherwise.

    attributes: final
    definition classes: AnyRef
  35. def notify () : Unit

    Wakes up a single thread that is waiting on the receiver object's monitor.

    attributes: final
    definition classes: AnyRef
  36. def notifyAll () : Unit

    Wakes up all threads that are waiting on the receiver object's monitor.

    attributes: final
    definition classes: AnyRef
  37. val numTerms : Int

    The number of terms in the model.

    definition classes: LDA → TopicModel
  38. val numTopics : Int

    The number of topics in the model.

    definition classes: LDA → ClosedTopicSet
  39. def pTopicTerm (topic: Int, term: Int) : Double

    Returns the probability of the given term in the given topic.

    attributes: final
    definition classes: HardAssignmentModel → ClosedTopicSet
  40. def pTopicTerm (topic: String, term: String) : Double

    Returns the probability of the given term in the given topic.

    definition classes: ClosedTopicSet
  41. val params : LDAModelParams

    The parameters used to create this model.

    definition classes: GibbsLDA → LDA → TopicModel
  42. def registerCheck (check: Function0[_]) : Unit

    Registers a function as a checker of invariants.

    attributes: protected
    definition classes: RepCheck
  43. def reset () : Unit

    Resets to the default state.

    definition classes: HardAssignmentModel → Stateful
  44. def sampleInfer (doc: GibbsLDADocument) : Unit

  45. def sampleLearn (doc: GibbsLDADocument) : Unit

  46. val seed : Long

  47. def state : HardAssignmentModelState

    Gets the current state of this object.

    definition classes: HardAssignmentModel → Stateful
  48. def state_= (state: HardAssignmentModelState) : Unit

    Sets the current state of this object.

    definition classes: HardAssignmentModel → Stateful
  49. def summary : Iterator[String]

    Returns a human-readable summary of the current topic model.

    definition classes: HardAssignmentModel
  50. def synchronized [T0] (arg0: T0) : T0

    attributes: final
    definition classes: AnyRef
  51. def termIndex : Option[Index[String]]

    The term index describing which terms are in the model.

    attributes: final
    definition classes: TopicModel
  52. def termIndex_= (index: Option[Index[String]]) : Unit

    attributes: protected final
    definition classes: TopicModel
  53. def termSmoothDenom : Double

    attributes: protected
    definition classes: DirichletTermSmoothing
  54. def termSmoothing : Array[Double]

    Add-k prior counts for each term (eta in the model formulation).

    attributes: final
    definition classes: DirichletTermSmoothing
  55. def termSmoothing_= (smoothing: Array[Double]) : Unit

    attributes: protected
    definition classes: DirichletTermSmoothing
  56. def toString () : String

    Returns a string representation of the object.

    The default representation is platform dependent.

    returns

    a string representation of the object.

    definition classes: AnyRef → Any
  57. def tokenize (document: String) : Iterable[Int]

    Tokenizes the given input string using our stored tokenizer and term index, if available. Otherwise, throws an IllegalArgumentException.

    attributes: protected
    definition classes: TopicModel
  58. def tokenizer : Option[Tokenizer]

    The tokenizer used to break input documents into terms.

    attributes: final
    definition classes: TopicModel
  59. def tokenizer_= (tokenizer: Option[Tokenizer]) : Unit

    attributes: protected final
    definition classes: TopicModel
  60. var topicIndex : Option[Index[String]]

    The topic index describing which topics are in the model.

    definition classes: ClosedTopicSet
  61. def topicName (topic: Int) : String

    Gets the name for this topic.

    definition classes: ClosedTopicSet
  62. def topicSmoothing : Array[Double]

    Prior counts for each topic (alpha in the model formulation).

    attributes: final
    definition classes: DirichletTopicSmoothing
  63. def topicSmoothing_= (smoothing: Array[Double]) : Unit

    attributes: protected
    definition classes: DirichletTopicSmoothing
  64. def wait () : Unit

    attributes: final
    definition classes: AnyRef
  65. def wait (arg0: Long, arg1: Int) : Unit

    attributes: final
    definition classes: AnyRef
  66. def wait (arg0: Long) : Unit

    attributes: final
    definition classes: AnyRef
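
The relationship between computeCrossEntropy and computePerplexity described in the member list can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: it assumes the cross-entropy is measured in nats and that perplexity is exp(total cross-entropy / total terms). The `PerplexitySketch` object is hypothetical, and its `crossEntropy` stand-in takes already-known probabilities for the held-out terms instead of estimating theta from the first half of a document.

```scala
// Hypothetical illustration, not the TMT implementation.
object PerplexitySketch {
  // Stand-in for a per-document cross-entropy: returns
  // (sum of -log p over held-out terms, number of held-out terms).
  def crossEntropy(heldOutTermProbs: Seq[Double]): (Double, Int) =
    (heldOutTermProbs.map(p => -math.log(p)).sum, heldOutTermProbs.size)

  // Pools the per-document sums and term counts, then exponentiates the
  // average per-term cross-entropy (assumed definition of perplexity).
  def perplexity(docs: Iterable[Seq[Double]]): Double = {
    val (total, numTerms) = docs.map(crossEntropy).foldLeft((0.0, 0)) {
      case ((ceSum, n), (docCe, docN)) => (ceSum + docCe, n + docN)
    }
    require(numTerms > 0, "need at least one held-out term")
    math.exp(total / numTerms)
  }
}
```

Returning the pair (sum, count) per document, rather than a per-document average, lets the caller weight every term equally when pooling documents of different lengths.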

Inherited from HardAssignmentModel[LDAModelParams, LDADocumentParams, GibbsLDADocument]

Inherited from LDA[HardAssignmentModelState, GibbsLDADocument, (String, Array[Int])]

Inherited from DirichletTopicSmoothing

Inherited from DirichletTermSmoothing

Inherited from ClosedTopicSet

Inherited from TopicModel[LDAModelParams, HardAssignmentModelState, LDADocumentParams, GibbsLDADocument, (String, Array[Int])]

Inherited from RepCheck

Inherited from Stateful[HardAssignmentModelState]

Inherited from AnyRef

Inherited from Any
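
The members countTopic, countTopicTerm, and termSmoothing suggest how a smoothed topic-term probability could be derived from hard-assignment counts. The sketch below uses the common add-k estimate for collapsed Gibbs LDA, (count of term in topic + eta for the term) / (count of topic + sum of eta). Whether pTopicTerm computes exactly this is an assumption; the `SmoothedTopicTermSketch` object and its parameter list are hypothetical and only borrow the member names for readability.

```scala
// Hypothetical illustration, not the TMT implementation.
object SmoothedTopicTermSketch {
  // Add-k smoothed estimate of P(term | topic) from hard-assignment counts.
  def pTopicTerm(countTopicTerm: Array[Array[Int]],
                 countTopic: Array[Int],
                 termSmoothing: Array[Double],
                 topic: Int,
                 term: Int): Double = {
    val numerator = countTopicTerm(topic)(term) + termSmoothing(term)
    val denominator = countTopic(topic) + termSmoothing.sum
    numerator / denominator
  }
}
```

With this estimate, a topic that has no assigned terms falls back to the normalized prior, and the probabilities for any topic sum to one across terms as long as countTopic(topic) equals the sum of that topic's row in countTopicTerm.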