A Hierarchical Bayesian Language Model based on Pitman-Yor Processes
If you have a question about this talk, please contact Hanna Wallach.
We propose a new hierarchical Bayesian n-gram model of natural languages. Our model makes use of a generalization of the commonly used Dirichlet distributions, called Pitman-Yor processes, which produce power-law distributions more closely resembling those found in natural languages. We show that an approximation to the hierarchical Pitman-Yor language model recovers the exact formulation of interpolated Kneser-Ney, one of the best smoothing methods for n-gram language models. Experiments verify that our model gives cross-entropy results superior to interpolated Kneser-Ney and comparable to modified Kneser-Ney.
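The power-law behaviour mentioned in the abstract can be illustrated with the Chinese restaurant process view of a Pitman-Yor process: each new customer either joins an existing table (with probability proportional to that table's count minus a discount d) or opens a new table (with probability proportional to theta + d times the number of tables). The sketch below is an illustration only, not the paper's hierarchical model; the parameter values and function name are assumptions.

```python
import random

def pitman_yor_crp(n, d=0.5, theta=1.0, seed=0):
    """Sample table sizes from a Pitman-Yor Chinese restaurant process.

    d: discount parameter in [0, 1); theta: concentration, theta > -d.
    With d > 0 the number of tables grows like n**d, giving the
    power-law (heavy-tailed) behaviour absent from the Dirichlet case (d = 0).
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of customers at table k
    for _ in range(n):
        total = sum(counts)
        # Probability of opening a new table.
        p_new = (theta + d * len(counts)) / (theta + total) if total else 1.0
        if rng.random() < p_new:
            counts.append(1)
        else:
            # Pick an existing table with probability proportional to (c_k - d).
            r = rng.random() * (total - d * len(counts))
            for k, c in enumerate(counts):
                r -= c - d
                if r < 0:
                    counts[k] += 1
                    break
    return counts

sizes = pitman_yor_crp(2000)
print(len(sizes), sorted(sizes, reverse=True)[:5])
```

Running this with d = 0.5 versus d = 0 (the Dirichlet-process case) shows the discounted version producing many more distinct tables with a heavier-tailed size distribution, which is the property the paper exploits for modelling word frequencies.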
This talk is part of the Machine Learning Journal Club series.