Online variational Bayes for latent Dirichlet allocation

References: Hoffman, Matthew D., Blei, David M. and Bach, Francis R. "Online Learning for Latent Dirichlet Allocation." Paper presented at the meeting of the NIPS, 2010.

Authors: Ilya Yaroshenko
  • struct LdaHoffman(F) if (isFloatingPoint!F);
    Batch variational Bayes for LDA with mini-batches.
  • this(size_t K, size_t W, size_t D, F alpha, F eta, F tau0, F kappa, F eps = 1e-05, TaskPool tp = taskPool());
    size_t K: topic count
    size_t W: dictionary size
    size_t D: approximate total number of documents in the collection
    F alpha: Dirichlet document-topic prior (0.1)
    F eta: Dirichlet word-topic prior (0.1)
    F tau0: 𝞽0 ≧ 0, slows down the early iterations of the algorithm.
    F kappa: 𝞳 ∈ (0.5, 1], controls the rate at which old values of 𝝺 are forgotten: 𝝺 = (1 - 𝞀(𝞽)) 𝝺 + 𝞀(𝞽) 𝝺', where 𝞀(𝞽) = (𝞽0 + 𝞽)^(-𝞳). Use 𝞳 = 0 for batch variational Bayes LDA.
    F eps: stop iterating if ||𝝺 - 𝝺'||_l1 < s * eps, where s is the number of documents in a batch.
    TaskPool tp: task pool
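
    The role of tau0 and kappa in the update formula above can be illustrated with a small stand-alone sketch. It uses plain D arrays rather than mir types, and the names learningRate and blend are hypothetical helpers introduced here for illustration; they are not part of LdaHoffman and do not show how the library implements the update.

```d
import std.math : pow;
import std.stdio : writefln;

/// Learning rate rho(tau) = (tau0 + tau)^(-kappa), following the formula above.
double learningRate(double tau0, double tau, double kappa)
{
    return pow(tau0 + tau, -kappa);
}

/// Blends the current lambda with a mini-batch estimate in place:
/// lambda = (1 - rho) * lambda + rho * lambdaBatch.
void blend(double[] lambda, const(double)[] lambdaBatch, double rho)
{
    assert(lambda.length == lambdaBatch.length);
    foreach (i, ref l; lambda)
        l = (1 - rho) * l + rho * lambdaBatch[i];
}

void main()
{
    // Hypothetical settings, chosen only for illustration.
    immutable double tau0 = 64.0;
    immutable double kappa = 0.75;

    // rho shrinks as more documents are seen, so later batches change lambda less.
    foreach (tau; [0.0, 100.0, 10_000.0])
        writefln("tau = %s, rho = %s", tau, learningRate(tau0, tau, kappa));

    // With kappa = 0 the rate is always 1, so lambda is replaced by the
    // batch estimate on every step (the batch variational Bayes case).
    auto lambda = [1.0, 2.0];
    blend(lambda, [3.0, 4.0], learningRate(tau0, 0.0, 0.0));
    writefln("lambda after a kappa = 0 step: %s", lambda);
}
```

    A larger tau0 makes the first steps smaller, and a kappa closer to 1 makes the rate decay faster as tau grows.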
  • void updateBeta();
  • @property Slice!(F*, 2) beta();
    Posterior over the topics
  • @property Slice!(F*, 2) lambda();
    Parameterized posterior over the topics.
  • tau
    const @property F tau();

    @property void tau(F v);
    Count of already seen documents. Slows down the iterations of the algorithm.
  • size_t putBatch(SliceKind kind, C, I, J)(Slice!(ChopIterator!(J*, Series!(I*, C*)), 1, kind) n, size_t maxIterations);
    Accepts a mini-batch and performs multiple E-step iterations for each document and a single M-step.
    This implementation is optimized for sparse documents, which contain far fewer unique words than the dictionary.
    Slice!(ChopIterator!(J*, Series!(I*, C*)), 1, kind) n: mini-batch, a collection of compressed documents.
    size_t maxIterations: maximal number of E-step iterations for a single document in a batch.
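
    The mini-batch type suggests a compressed-row layout: a run of (word index, count) pairs for all documents, chopped into documents by an array of offsets. The sketch below shows that layout with plain D arrays only. It assumes that J, I and C play the roles of offset, word index and count types, it does not construct the mir Slice that putBatch expects, and the helper tokenCount is hypothetical.

```d
import std.stdio : writefln;

/// Total number of tokens in document d of a compressed mini-batch.
uint tokenCount(const size_t[] offsets, const uint[] counts, size_t d)
{
    uint total = 0;
    foreach (c; counts[offsets[d] .. offsets[d + 1]])
        total += c;
    return total;
}

void main()
{
    // Hypothetical mini-batch of two documents over a dictionary of 6 words.
    // Document 0: word 1 twice, word 4 once.
    // Document 1: word 0 once, word 2 three times, word 5 once.
    size_t[] offsets = [0, 2, 5];       // document d occupies [offsets[d], offsets[d + 1])
    size_t[] wordIds = [1, 4, 0, 2, 5]; // dictionary indices of the distinct words
    uint[] counts = [2, 1, 1, 3, 1];    // occurrences of each of those words

    foreach (d; 0 .. offsets.length - 1)
    {
        auto ids = wordIds[offsets[d] .. offsets[d + 1]];
        auto cnt = counts[offsets[d] .. offsets[d + 1]];
        writefln("document %s: words %s, counts %s, tokens %s",
                 d, ids, cnt, tokenCount(offsets, counts, d));
    }
}
```

    Only the words that occur in a document are stored, which is why a layout of this kind suits documents with far fewer unique words than the dictionary.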