Hierarchical Dirichlet Processes (Teh+ 2006) are a nonparametric Bayesian topic model that can handle an unbounded (potentially infinite) number of topics.
In particular, HDP-LDA is interesting as an extension of LDA.
(Teh+ 2006) derived the collapsed Gibbs sampling updates for the general HDP framework, but not specifically for HDP-LDA.
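To obtain the updates for HDP-LDA, we need to plug the base measure H and the emission distribution F(phi) of the HDP-LDA setting into the equation below:
$$ f_k^{-x_{ji}}(x_{ji}) = \frac{\int f(x_{ji} \mid \phi_k) \prod_{j'i' \neq ji,\, z_{j'i'} = k} f(x_{j'i'} \mid \phi_k)\, h(\phi_k)\, d\phi_k}{\int \prod_{j'i' \neq ji,\, z_{j'i'} = k} f(x_{j'i'} \mid \phi_k)\, h(\phi_k)\, d\phi_k} $$
, (eq. 30 in [Teh+ 2006])
where h is the probability density function of H and f is that of F.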
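In the case of HDP-LDA, H is a Dirichlet distribution over the vocabulary (taken here to be symmetric) and F is a topic-word multinomial distribution, that is
$$ h(\phi_k) = \frac{\Gamma(V\beta)}{\Gamma(\beta)^V} \prod_{w=1}^{V} \phi_{kw}^{\beta - 1}, \qquad f(x = w \mid \phi_k) = \phi_{kw}, $$
where $V$ is the vocabulary size and $\beta$ is the Dirichlet parameter.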
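Substituting these into equation (30), we obtain
$$ f_k^{-x_{ji}}(x_{ji} = w) = \frac{n_{kw}^{-ji} + \beta}{n_{k\cdot}^{-ji} + V\beta}, $$
where $n_{kw}^{-ji}$ is the number of times word $w$ is assigned to topic $k$, excluding $x_{ji}$, $n_{k\cdot}^{-ji} = \sum_w n_{kw}^{-ji}$, $V$ is the vocabulary size and $\beta$ is the parameter of the symmetric Dirichlet prior.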
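The substitution step can be spelled out as below. This is only a sketch of the standard Dirichlet-multinomial calculation; it assumes a symmetric Dirichlet prior with parameter $\beta$ over a vocabulary of size $V$, and writes $n_{kv}$ as shorthand (an assumption of this sketch) for the count of word $v$ in topic $k$ with $x_{ji}$ excluded. The prior's normalizing constant appears in both the numerator and the denominator of equation (30) and cancels.

```latex
\begin{align*}
f_k^{-x_{ji}}(x_{ji} = w)
  &= \frac{\int \phi_{kw} \prod_{v} \phi_{kv}^{\,n_{kv} + \beta - 1}\, d\phi_k}
          {\int \prod_{v} \phi_{kv}^{\,n_{kv} + \beta - 1}\, d\phi_k} \\
  &= \frac{\Gamma(n_{kw} + \beta + 1) \prod_{v \neq w} \Gamma(n_{kv} + \beta)}
          {\Gamma(n_{k\cdot} + V\beta + 1)}
     \cdot
     \frac{\Gamma(n_{k\cdot} + V\beta)}{\prod_{v} \Gamma(n_{kv} + \beta)} \\
  &= \frac{n_{kw} + \beta}{n_{k\cdot} + V\beta}
\end{align*}
```

The second line uses the Dirichlet integral $\int \prod_v \phi_v^{a_v - 1} d\phi = \prod_v \Gamma(a_v) / \Gamma(\sum_v a_v)$, and the last line uses $\Gamma(x + 1) = x\,\Gamma(x)$.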
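We also need f_k^new, which arises when x_ji is seated at a new table t. No data is assigned to a new topic yet, so it is the prior predictive under a symmetric Dirichlet prior with parameter $\beta$ over $V$ words:
$$ f_{k^{new}}(x_{ji} = w) = \int f(x_{ji} = w \mid \phi)\, h(\phi)\, d\phi = \frac{\beta}{V\beta} = \frac{1}{V}. $$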
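As an illustration, the per-word likelihoods could be computed as in the following minimal sketch. It assumes the smoothed-count form (n_kw + beta) / (n_k. + V beta) for existing topics and 1/V for a new topic, with a symmetric Dirichlet parameter `beta`. The function name `word_likelihoods` and the list-of-lists layout of `n_kw` are hypothetical choices for this example, not taken from any particular implementation.

```python
def word_likelihoods(n_kw, w, beta, V):
    """Return f_k(x_ji = w) for every existing topic, then f_knew as the last entry.

    n_kw : list of per-topic word-count lists (hypothetical layout); the word
           x_ji being resampled must already be removed from these counts.
    w    : vocabulary index of the word x_ji.
    beta : symmetric Dirichlet parameter (assumed symmetric).
    V    : vocabulary size.
    """
    f = [(counts[w] + beta) / (sum(counts) + V * beta) for counts in n_kw]
    f.append(1.0 / V)  # a new topic has no counts, so every word is equally likely
    return f


# Two existing topics over a three-word vocabulary.
n_kw = [[3, 0, 1], [0, 2, 2]]
print(word_likelihoods(n_kw, w=0, beta=0.5, V=3))
```

For each topic the values sum to one over the vocabulary, which is a quick sanity check when wiring this into a sampler.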
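It is also necessary to write down f_k(x_jt) for sampling k, where x_jt denotes all the words seated at table t of document j.
Let $n_{kw}^{-jt}$ be the term count of word $w$ with topic $k$, excluding $x_{jt}$, and let $n_{jtw}$ be the count of word $w$ in $x_{jt}$, with $n_{k\cdot}^{-jt} = \sum_w n_{kw}^{-jt}$ and $n_{jt\cdot} = \sum_w n_{jtw}$. With a symmetric Dirichlet parameter $\beta$ and vocabulary size $V$,
$$ f_k^{-x_{jt}}(x_{jt}) = \frac{\Gamma(n_{k\cdot}^{-jt} + V\beta)}{\Gamma(n_{k\cdot}^{-jt} + n_{jt\cdot} + V\beta)} \prod_{w} \frac{\Gamma(n_{kw}^{-jt} + n_{jtw} + \beta)}{\Gamma(n_{kw}^{-jt} + \beta)}. $$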
When implementing this in Python, it is faster to leave the Gamma functions as they are than to expand their ratios into products. In either case the computation has to be done with logarithms, or f_k(x_jt) will exceed the floating-point range.
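The two strategies can be compared with a small sketch. It assumes that f_k(x_jt) has the Dirichlet-multinomial form Gamma(n_k + V beta) / Gamma(n_k + n_jt + V beta) times the product over words of Gamma(n_kw + n_jtw + beta) / Gamma(n_kw + beta), with a symmetric `beta`. The function names and the flat count lists are hypothetical; only the standard-library `math.lgamma` and `math.log` are used.

```python
import math


def log_f_k_gamma(n_kw, n_jtw, beta):
    """log f_k(x_jt), keeping the Gamma functions and using log-gamma.

    n_kw  : counts of each word in topic k, with the words of x_jt removed.
    n_jtw : counts of each word among the words x_jt of table t.
    """
    V = len(n_kw)
    n_k, n_jt = sum(n_kw), sum(n_jtw)
    result = math.lgamma(n_k + V * beta) - math.lgamma(n_k + n_jt + V * beta)
    for c_k, c_t in zip(n_kw, n_jtw):
        if c_t:  # words absent from the table contribute log(1) = 0
            result += math.lgamma(c_k + c_t + beta) - math.lgamma(c_k + beta)
    return result


def log_f_k_unfolded(n_kw, n_jtw, beta):
    """Same quantity with each Gamma ratio expanded into a sum of logs,
    using Gamma(a + n) / Gamma(a) = a (a + 1) ... (a + n - 1)."""
    V = len(n_kw)
    n_k, n_jt = sum(n_kw), sum(n_jtw)
    result = -sum(math.log(n_k + V * beta + i) for i in range(n_jt))
    for c_k, c_t in zip(n_kw, n_jtw):
        result += sum(math.log(c_k + beta + i) for i in range(c_t))
    return result


n_kw = [3, 0, 1]   # topic counts, table excluded
n_jtw = [2, 1, 0]  # words at the table
print(log_f_k_gamma(n_kw, n_jtw, 0.5), log_f_k_unfolded(n_kw, n_jtw, 0.5))
```

Both functions return the same log-likelihood up to rounding. The log-gamma version does a fixed number of calls per distinct word, while the expanded version loops once per token at the table. Evaluating the Gamma functions without logarithms is not an option for realistic counts: `math.gamma` already raises `OverflowError` for arguments in the low hundreds.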
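Finally, for a new topic all the topic counts are zero, so with $n_{jtw}$ the count of word $w$ in $x_{jt}$, $n_{jt\cdot} = \sum_w n_{jtw}$, vocabulary size $V$ and a symmetric Dirichlet parameter $\beta$,
$$ f_{k^{new}}(x_{jt}) = \frac{\Gamma(V\beta)}{\Gamma(n_{jt\cdot} + V\beta)} \prod_{w} \frac{\Gamma(n_{jtw} + \beta)}{\Gamma(\beta)}. $$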