HDP-LDA updates

Hierarchical Dirichlet Processes (Teh+ 2006) are a nonparametric Bayesian topic model that can handle an unbounded number of topics.
In particular, HDP-LDA is interesting as an extension of LDA.

(Teh+ 2006) gave the collapsed Gibbs sampling updates for the general HDP framework, but not specifically for HDP-LDA.
To obtain the updates for HDP-LDA, it is necessary to substitute the base measure H and the emission distribution F(phi) of the HDP-LDA setting into the equation below:

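$$
f_k^{-x_{ji}}(x_{ji}) = \frac{\int f(x_{ji} \mid \phi_k) \prod_{j'i' \neq ji,\, z_{j'i'} = k} f(x_{j'i'} \mid \phi_k) \, h(\phi_k) \, d\phi_k}{\int \prod_{j'i' \neq ji,\, z_{j'i'} = k} f(x_{j'i'} \mid \phi_k) \, h(\phi_k) \, d\phi_k},
$$
(eq. 30 in [Teh+ 2006])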

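where h is the probability density function of H and f is that of F.
In the case of HDP-LDA, H is a Dirichlet distribution over the vocabulary and F is a topic-word multinomial distribution, that is

$$
h(\phi_k) = \frac{\Gamma(V\beta)}{\Gamma(\beta)^V} \prod_{w} \phi_{kw}^{\beta - 1}
$$

where $\phi_k = (\phi_{kw})_{w=1}^{V}$, $\sum_w \phi_{kw} = 1$, and

$$
f(x = w \mid \phi_k) = \phi_{kw}.
$$

Here V is the vocabulary size and $\beta$ is the parameter of the (symmetric) Dirichlet distribution H.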

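Substituting these into equation (30), with V the vocabulary size and $\beta$ the parameter of the symmetric Dirichlet distribution H, we obtain

$$
\begin{aligned}
f_k^{-x_{ji}}(x_{ji} = w)
&= \frac{\int \phi_{kw} \prod_{v} \phi_{kv}^{n_{kv}^{-ji} + \beta - 1} \, d\phi_k}{\int \prod_{v} \phi_{kv}^{n_{kv}^{-ji} + \beta - 1} \, d\phi_k} \\
&= \frac{\Gamma(n_{kw}^{-ji} + \beta + 1) \prod_{v \neq w} \Gamma(n_{kv}^{-ji} + \beta)}{\Gamma(n_{k\cdot}^{-ji} + V\beta + 1)} \cdot \frac{\Gamma(n_{k\cdot}^{-ji} + V\beta)}{\prod_{v} \Gamma(n_{kv}^{-ji} + \beta)} \\
&= \frac{n_{kw}^{-ji} + \beta}{n_{k\cdot}^{-ji} + V\beta},
\end{aligned}
$$

where $n_{kw}^{-ji}$ is the number of tokens of word w assigned to topic k, excluding $x_{ji}$, and $n_{k\cdot}^{-ji} = \sum_w n_{kw}^{-ji}$. The normalizing constant of h cancels between the numerator and the denominator.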

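We also need f_k^new, which is required when t takes a new table. With a symmetric Dirichlet H (parameter $\beta$) over a vocabulary of size V, it is obtained as follows:

$$
f_{k^{new}}^{-x_{ji}}(x_{ji} = w) = \int f(x_{ji} = w \mid \phi) \, h(\phi) \, d\phi = \frac{\Gamma(V\beta)}{\Gamma(\beta)^V} \cdot \frac{\Gamma(\beta + 1) \, \Gamma(\beta)^{V-1}}{\Gamma(V\beta + 1)} = \frac{\beta}{V\beta} = \frac{1}{V}.
$$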

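As a rough illustration, the two per-word likelihoods above can be computed from a topic-word count table. This is a minimal sketch, not code from any particular HDP-LDA implementation: the names `word_likelihoods`, `n_kw`, `beta` and `V` are hypothetical, and it assumes a symmetric Dirichlet base measure with parameter `beta` and counts that already exclude the word being resampled.

```python
def word_likelihoods(n_kw, w, beta, V):
    """Return ([f_k(x_ji = w) for each existing topic k], f_knew).

    n_kw: list of per-topic word-count lists (hypothetical layout:
          n_kw[k][v] = count of word v in topic k, excluding x_ji).
    """
    existing = []
    for counts in n_kw:
        n_k = sum(counts)  # n_{k.}: total tokens assigned to topic k
        existing.append((counts[w] + beta) / (n_k + V * beta))
    # A new topic has no counts, so the ratio becomes beta / (V * beta).
    f_new = 1.0 / V
    return existing, f_new


# Toy example: 2 topics, vocabulary of 3 words.
n_kw = [[2, 0, 1],
        [0, 4, 0]]
f_existing, f_new = word_likelihoods(n_kw, w=0, beta=0.5, V=3)
```

For a fixed topic the values sum to 1 over the vocabulary, which is a quick sanity check on the counts.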
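It is also necessary to write down f_k(x_jt) for sampling k, where x_jt denotes all the words at table t of document j.

$$
f_k^{-x_{jt}}(x_{jt}) = \frac{\int \prod_{i:\, t_{ji} = t} f(x_{ji} \mid \phi_k) \prod_{j'i':\, z_{j'i'} = k,\, (j', t_{j'i'}) \neq (j, t)} f(x_{j'i'} \mid \phi_k) \, h(\phi_k) \, d\phi_k}{\int \prod_{j'i':\, z_{j'i'} = k,\, (j', t_{j'i'}) \neq (j, t)} f(x_{j'i'} \mid \phi_k) \, h(\phi_k) \, d\phi_k}
$$

For
$n_{kw}^{-jt}$ (it means “term count of word w with topic k”, excluding x_jt),
$n_{jtw}$ (the count of word w in x_jt),
$n_{k\cdot}^{-jt} = \sum_w n_{kw}^{-jt}$ and $n_{jt\cdot} = \sum_w n_{jtw}$,
with V the vocabulary size and $\beta$ the parameter of the symmetric Dirichlet distribution H,

$$
f_k^{-x_{jt}}(x_{jt}) = \frac{\Gamma(n_{k\cdot}^{-jt} + V\beta)}{\Gamma(n_{k\cdot}^{-jt} + n_{jt\cdot} + V\beta)} \prod_{w} \frac{\Gamma(n_{kw}^{-jt} + n_{jtw} + \beta)}{\Gamma(n_{kw}^{-jt} + \beta)}.
$$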

When implementing this in Python, it is faster to keep the Gamma functions as they are than to expand them into products of consecutive factors. In either case it is necessary to work with logarithms, otherwise the computation of f_k(x_jt) goes outside the floating-point range.
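The log-space evaluation of f_k(x_jt) might look like the sketch below. It is only an illustration under assumptions: the function name `log_f_k_table` and the dict-based count layout are hypothetical, H is taken to be a symmetric Dirichlet with parameter `beta`, and the counts passed in are assumed to exclude the words of the table already. It uses `math.lgamma` from the standard library; for NumPy arrays, `scipy.special.gammaln` is a vectorized alternative.

```python
import math
from collections import Counter


def log_f_k_table(n_kw_k, table_words, beta, V):
    """log f_k(x_jt) for one existing topic k.

    n_kw_k: dict word id -> count in topic k, excluding the words of table t.
    table_words: list of word ids seated at table t of document j.
    """
    n_jtw = Counter(table_words)   # n_{jtw}: word counts within the table
    n_jt = len(table_words)        # n_{jt.}: number of words at the table
    n_k = sum(n_kw_k.values())     # n_{k.}^{-jt}: topic total without the table
    log_f = math.lgamma(n_k + V * beta) - math.lgamma(n_k + n_jt + V * beta)
    # Words absent from the table contribute Gamma(a) / Gamma(a) = 1, so skip them.
    for w, c in n_jtw.items():
        base = n_kw_k.get(w, 0) + beta
        log_f += math.lgamma(base + c) - math.lgamma(base)
    return log_f


counts = {0: 2, 2: 1}
# A one-word table reduces to the per-word ratio (2 + 0.5) / (3 + 1.5).
one_word = math.exp(log_f_k_table(counts, [0], beta=0.5, V=3))
```

Since only the words that actually occur at the table enter the loop, the cost per topic depends on the table size rather than on V.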

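Finally, for a new topic, with $n_{jtw}$ the count of word w in x_jt, $n_{jt\cdot} = \sum_w n_{jtw}$, V the vocabulary size and $\beta$ the parameter of the symmetric Dirichlet distribution H,

$$
f_{k^{new}}(x_{jt}) = \int \prod_{i:\, t_{ji} = t} f(x_{ji} \mid \phi) \, h(\phi) \, d\phi = \frac{\Gamma(V\beta)}{\Gamma(V\beta + n_{jt\cdot})} \prod_{w} \frac{\Gamma(\beta + n_{jtw})}{\Gamma(\beta)}.
$$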

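The new-topic case can be sketched in the same log-space style. Again this is a hedged illustration rather than a reference implementation: `log_f_new_table` is a hypothetical name, and a symmetric Dirichlet with parameter `beta` over a vocabulary of size `V` is assumed.

```python
import math
from collections import Counter


def log_f_new_table(table_words, beta, V):
    """log f_knew(x_jt): likelihood of a table's words under a topic with no counts."""
    n_jtw = Counter(table_words)   # n_{jtw}: word counts within the table
    n_jt = len(table_words)        # n_{jt.}: number of words at the table
    log_f = math.lgamma(V * beta) - math.lgamma(V * beta + n_jt)
    for c in n_jtw.values():
        log_f += math.lgamma(beta + c) - math.lgamma(beta)
    return log_f


# A table holding a single word reduces to the per-word value 1 / V.
single = math.exp(log_f_new_table([1], beta=0.5, V=4))
# For a long table the log value stays finite although the probability is tiny.
long_table = log_f_new_table([0, 1, 2, 3] * 500, beta=0.5, V=4)
```

The single-word check ties this formula back to the 1/V result for f_k^new derived earlier.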

4 Responses to HDP-LDA updates

  1. ming says:

    Hi, thank you so much for your explanation here. I have a question about this process. I found that in chong wang’s code, he “sample_tables(d_state, q, f) ” for each document after he sampled all words in this doc. I am curious why he did this. Do you have any idea?

  2. Tim Hopper says:

    I’m trying to fill in the steps in your derivation of (30). Do you have any insight on the missing steps here: http://mathb.in/34749?key=f1b1b8e9c8ef6386abf89eb81f6a23347485e887. It is something to do with the conjugacy, right?
