Latent Dirichlet Allocation (LDA) is a generative model that is widely used as a topic model for text, among other applications.

Graphical model of LDA (figure from Wikipedia)

The random variables are as follows:

- θ : document-topic distribution
- φ : topic-word distribution
- Z : topic assigned to each word
- W : observed word
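To make the roles of θ, φ, Z and W concrete, here is a minimal sketch of the generative process in plain Python. It assumes symmetric Dirichlet priors with hyperparameters `alpha` and `beta`, a fixed document length, and integer word IDs; the function names and parameters are hypothetical and chosen only for illustration.

```python
import random


def sample_dirichlet(params, rng):
    """Draw one sample from a Dirichlet by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in params]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(num_docs, doc_len, K, V, alpha, beta, seed=0):
    """Generate documents from the LDA generative process.

    K is the number of topics and V the vocabulary size (assumed names).
    Returns (docs, topics, phi): W, Z and the topic-word distributions.
    """
    rng = random.Random(seed)
    # phi[k] : word distribution of topic k
    phi = [sample_dirichlet([beta] * V, rng) for _ in range(K)]
    docs, topics = [], []
    for _ in range(num_docs):
        # theta : topic distribution of this document
        theta = sample_dirichlet([alpha] * K, rng)
        # Z : one topic per word position, drawn from theta
        z = rng.choices(range(K), weights=theta, k=doc_len)
        # W : each word drawn from the word distribution of its topic
        w = [rng.choices(range(V), weights=phi[k])[0] for k in z]
        docs.append(w)
        topics.append(z)
    return docs, topics, phi


docs, topics, phi = generate_corpus(
    num_docs=3, doc_len=5, K=2, V=4, alpha=0.5, beta=0.5
)
```

In estimation only W (`docs`) is observed; θ, φ and Z have to be inferred.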

There are several popular estimation methods for LDA, and collapsed Gibbs sampling (CGS) is one of them.

This method integrates out all random variables except the word topics {z_mn} and draws each z_mn from its posterior.

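The posterior of z_mn is the following:

$$p(z_{mn} = z \mid \mathbf{z}_{-mn}, \mathbf{w}) \propto \left(n_{mz}^{-mn} + \alpha\right) \frac{n_{tz}^{-mn} + \beta}{n_{z}^{-mn} + V\beta}$$

where t = w_mn is the word at position n of document m, n_mz is the count of words in document m assigned to topic z, n_tz is the count of word t assigned to topic z, n_z is the total count of words assigned to topic z, V is the vocabulary size, α and β are the Dirichlet hyperparameters, and -mn means “except z_mn.”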
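The sampling step can be sketched in plain Python as follows. This is not the code from this blog, only a minimal illustration. It assumes symmetric hyperparameters `alpha` and `beta`, a vocabulary of size `V` with integer word IDs, and the usual unnormalized weight (n_mz + α)(n_tz + β) / (n_z + Vβ) computed with the current word excluded. The function name `gibbs_lda` is hypothetical.

```python
import random


def gibbs_lda(docs, K, V, alpha, beta, iterations, seed=0):
    """Collapsed Gibbs sampling for LDA over docs (lists of word IDs)."""
    rng = random.Random(seed)
    n_mz = [[0] * K for _ in docs]      # words in document m with topic z
    n_tz = [[0] * K for _ in range(V)]  # occurrences of word t with topic z
    n_z = [0] * K                       # all words with topic z
    z_mn = []

    # Random initialization of every word topic.
    for m, doc in enumerate(docs):
        zs = []
        for t in doc:
            z = rng.randrange(K)
            zs.append(z)
            n_mz[m][z] += 1
            n_tz[t][z] += 1
            n_z[z] += 1
        z_mn.append(zs)

    for _ in range(iterations):
        for m, doc in enumerate(docs):
            for n, t in enumerate(doc):
                # Remove z_mn from the counts ("-mn").
                z = z_mn[m][n]
                n_mz[m][z] -= 1
                n_tz[t][z] -= 1
                n_z[z] -= 1
                # Unnormalized posterior over topics for this word.
                weights = [
                    (n_mz[m][k] + alpha)
                    * (n_tz[t][k] + beta)
                    / (n_z[k] + V * beta)
                    for k in range(K)
                ]
                # random.choices normalizes the weights internally.
                z = rng.choices(range(K), weights=weights)[0]
                # Put the newly drawn topic back into the counts.
                z_mn[m][n] = z
                n_mz[m][z] += 1
                n_tz[t][z] += 1
                n_z[z] += 1
    return z_mn, n_mz, n_tz, n_z


toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 0, 1, 2]]
z_mn, n_mz, n_tz, n_z = gibbs_lda(
    toy_docs, K=2, V=4, alpha=0.5, beta=0.5, iterations=20
)
```

Because the weights only need to be proportional to the posterior, any factor that is the same for every topic can be dropped before sampling.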

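The estimation iterates until the perplexity converges or for a fixed number of iterations. The perplexity is

$$\mathrm{perplexity}(\mathbf{w}) = \exp\left(-\frac{\sum_{m}\sum_{n} \log p(w_{mn})}{\sum_{m} n_m}\right)$$

where

$$p(w_{mn}) = \sum_{z} \theta_{mz}\,\phi_{z,w_{mn}}, \qquad \theta_{mz} = \frac{n_{mz} + \alpha}{n_m + K\alpha}, \qquad \phi_{zt} = \frac{n_{tz} + \beta}{n_z + V\beta},$$

K is the number of topics, V is the vocabulary size, and n_m is the word count of document m.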
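As a rough sketch, the perplexity can be computed from the count tables in plain Python. It assumes perplexity is the exponential of the negative mean log-likelihood per word, with θ_mz estimated as (n_mz + α) / (n_m + Kα) and φ_zt as (n_tz + β) / (n_z + Vβ). The function name and argument layout are hypothetical: `n_mz[m][z]`, `n_tz[t][z]` and `n_z[z]` are nested lists of counts.

```python
import math


def perplexity(docs, n_mz, n_tz, n_z, alpha, beta):
    """Perplexity of docs (lists of word IDs) under the current counts."""
    K = len(n_z)   # number of topics
    V = len(n_tz)  # vocabulary size
    log_likelihood = 0.0
    total_words = 0
    for m, doc in enumerate(docs):
        n_m = len(doc)  # word count of document m
        theta = [(n_mz[m][k] + alpha) / (n_m + K * alpha) for k in range(K)]
        for t in doc:
            # p(w_mn) = sum over topics of theta_mz * phi_zt
            p = sum(
                theta[k] * (n_tz[t][k] + beta) / (n_z[k] + V * beta)
                for k in range(K)
            )
            log_likelihood += math.log(p)
        total_words += n_m
    # Note the minus sign: lower perplexity means a better fit.
    return math.exp(-log_likelihood / total_words)


# One document, one topic, two words seen once each: every word has
# probability 0.5, so the perplexity is close to 2.
value = perplexity([[0, 1]], [[2]], [[1], [1]], [2], alpha=1.0, beta=1.0)
```

A perplexity of about 2 here matches the intuition that the model is choosing uniformly between two words.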

Although perplexity usually decreases as learning progresses, my experiments showed a somewhat different tendency.

Continued in the next post.


Hi, thanks for your help. Why don’t you have a minus sign before exp in the perplexity?

Oh, I made a mistake ;(

I fixed it. Thank you very much!

Hi shuyo,

Your code looks good, although I haven’t tried it yet.

But before coding, I need to understand how the equations of LDA are actually derived.

I’m having trouble understanding these equations. All the papers I have been through involve messy mathematics, and somewhere in between I get stuck.

It would be very helpful to me and many others if you could write up a small concrete example explaining how topics are actually inferred for one or two iterations, alongside the equations.

Since you have written the code, it would be easy for you.

Thank you 🙂

Hi,

I cannot explain it plainly to someone without an understanding of probabilistic models and optimization.

I suggest you try working through “a small concrete example” yourself, as you mentioned.

I am wondering why we don’t need p(θ|α) in the perplexity(w) equation.

Thanks

Hi, thanks for explaining LDA. I’m wondering why you don’t normalize the prior term (theta_mz) in the posterior formula?

The term with beta has a denominator and a numerator, whereas the term with alpha only has a numerator.