Sentence-level Topic Models

If you have a question about this talk, please contact Anita Verő.

We present two generative models of documents that generate whole sentences from underlying topics. This relaxes the word exchangeability assumption of traditional generative models of documents to sentence exchangeability, and can hence capture inter-word dependencies that LDA misses. Despite the additional model complexity, training and inference remain feasible using state-of-the-art approximate inference techniques. We show that both our proposed models achieve lower perplexities than a standard LDA topic model and a strong LSTM language model on held-out documents. We also manually inspect samples from the learnt topics and show that both models learn coherent topics. Finally, we show that on a shallow document classification task, LDA outperforms our models, and we analyse the reasons for this.
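To make the generative story concrete, here is a minimal toy sketch of sentence-level topic modelling: a per-document topic mixture is drawn once, then one topic is drawn per sentence (rather than per word), and a topic-specific sentence model emits the whole sentence. The talk does not specify its sentence generators, so the bigram sentence model, vocabulary, and all hyperparameters below are illustrative assumptions, not the speakers' implementation.

import numpy as np

rng = np.random.default_rng(0)

K = 3                    # number of topics (illustrative)
V = ["the", "model", "topic", "sentence", "learns", "data", "<eos>"]
alpha = np.full(K, 0.5)  # Dirichlet prior over per-document topic mixtures

# One toy bigram transition matrix per topic: P(next word | current word, topic).
# This stands in for whatever topic-conditioned sentence model the talk uses.
bigrams = rng.dirichlet(np.ones(len(V)), size=(K, len(V)))

def sample_sentence(k, max_len=8):
    """Generate a whole sentence from topic k. The topic is drawn once per
    sentence, so words within the sentence are not exchangeable: each word
    depends on its predecessor via the topic's bigram distribution."""
    words, w = [], int(rng.integers(len(V)))
    for _ in range(max_len):
        w = rng.choice(len(V), p=bigrams[k, w])
        if V[w] == "<eos>":
            break
        words.append(V[w])
    return " ".join(words)

def sample_document(n_sentences=4):
    theta = rng.dirichlet(alpha)       # per-document topic mixture
    doc = []
    for _ in range(n_sentences):
        z = rng.choice(K, p=theta)     # one topic per *sentence*, not per word
        doc.append(sample_sentence(z))
    return theta, doc

theta, doc = sample_document()
print("topic mixture:", np.round(theta, 2))
for s in doc:
    print("-", s)

In LDA the topic would be resampled for every word, which is what makes words exchangeable within a document; moving the topic draw outside the sentence loop is the sole change that lets the sentence model capture inter-word dependencies.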

This talk is part of the NLIP Seminar Series.
