Online variational Bayes for latent Dirichlet allocation

References: Hoffman, Matthew D., Blei, David M., and Bach, Francis R. "Online Learning for Latent Dirichlet Allocation." Paper presented at NIPS, 2010.

Authors: Ilya Yaroshenko
  • struct LdaHoffman(F) if (isFloatingPoint!F);
    Batch variational Bayes for LDA with mini-batches.
  • this(size_t K, size_t W, size_t D, F alpha, F eta, F tau0, F kappa, F eps = 1e-05, TaskPool tp = taskPool());
    size_t K topic count
    size_t W dictionary size
    size_t D approximate total number of documents in a collection.
    F alpha Dirichlet document-topic prior (0.1)
    F eta Dirichlet word-topic prior (0.1)
    F tau0 𝞽0 ≧ 0 slows down the early iterations of the algorithm.
    F kappa 𝞳 ∈ (0.5, 1] controls the rate at which old values of 𝝺 are forgotten: 𝝺 = (1 - 𝞀(𝞽)) 𝝺 + 𝞀(𝞽) 𝝺', where 𝞀(𝞽) = (𝞽0 + 𝞽)^(-𝞳). Use 𝞳 = 0 for batch variational Bayes LDA.
    F eps Stop iterations if ||𝝺 - 𝝺'||_l1 < s * eps, where s is the number of documents in a batch.
    TaskPool tp task pool
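
The roles of tau0, kappa and eps can be illustrated with a minimal sketch in plain D. It is not taken from the library source: the function names `learningRate`, `blend`, `l1Distance` and `converged` are hypothetical, and 𝝺 is assumed to be flattened into a one-dimensional array for brevity.

```d
import std.math : abs, pow;

/// Learning rate rho(tau) = (tau0 + tau)^(-kappa).
/// Assumes tau0 + tau > 0 whenever kappa > 0; kappa = 0 always yields 1.
double learningRate(double tau0, double tau, double kappa)
{
    return pow(tau0 + tau, -kappa);
}

/// Blends in place: lambda = (1 - rho) * lambda + rho * lambdaNew.
void blend(double[] lambda, const double[] lambdaNew, double rho)
{
    assert(lambda.length == lambdaNew.length);
    foreach (i, ref v; lambda)
        v = (1 - rho) * v + rho * lambdaNew[i];
}

/// L1 distance between two equally sized arrays.
double l1Distance(const double[] a, const double[] b)
{
    assert(a.length == b.length);
    double sum = 0;
    foreach (i, v; a)
        sum += abs(v - b[i]);
    return sum;
}

/// True when ||a - b||_l1 < s * eps, where s is the batch size.
bool converged(const double[] a, const double[] b, size_t s, double eps)
{
    return l1Distance(a, b) < s * eps;
}
```

With kappa = 0 the rate is 1 for every tau, so `blend` replaces 𝝺 with 𝝺' outright, which matches the batch case described for kappa. Larger tau0 values make the early rates smaller, so the first mini-batches move 𝝺 less.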
  • void updateBeta();
  • @property Slice!(Contiguous, [2], F*) beta();
    Posterior over the topics
  • @property Slice!(Contiguous, [2], F*) lambda();
    Parameterized posterior over the topics.
  • tau
    const @property F tau();

    @property void tau(F v);
    Count of already seen documents. Slows down the iterations of the algorithm.
  • size_t putBatch(SliceKind kind, C, I, J)(Slice!(kind, [1], FieldIterator!(CompressedField!(C, I, J))) n, size_t maxIterations);
    Accepts a mini-batch and performs multiple E-step iterations for each document, followed by a single M-step.
    This implementation is optimized for sparse documents, which contain far fewer unique words than the dictionary.
    Slice!(kind, [1], FieldIterator!(CompressedField!(C, I, J))) n mini-batch, a collection of compressed documents.
    size_t maxIterations maximum number of E-step iterations for a single document in a batch.
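
To show why sparse documents are cheaper to process, here is a rough sketch in plain D. The `SparseDoc` struct and `sparseDot` function are hypothetical stand-ins for illustration only; they are not the layout of `CompressedField` and are not part of this module.

```d
/// Hypothetical sparse document: parallel arrays of word ids and counts.
struct SparseDoc
{
    size_t[] wordIds; // indices into the dictionary, each less than W
    double[] counts;  // number of occurrences of the corresponding word
}

/// Weighted sum over only the words present in the document.
/// The cost grows with the number of unique words, not with W.
double sparseDot(const SparseDoc doc, const double[] weights)
{
    assert(doc.wordIds.length == doc.counts.length);
    double sum = 0;
    foreach (i, id; doc.wordIds)
        sum += doc.counts[i] * weights[id];
    return sum;
}
```

A document with two unique words touches two entries of `weights` regardless of the dictionary size, whereas a dense representation would visit all W entries.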