NLIP reading group: Bayesian Smoothing for Language Models

If you have a question about this talk, please contact Jimme Jardine.

Yee-Whye Teh will be visiting us on 27/1 to give the NLIP seminar. The abstract of his talk is at the bottom of this message. We were thinking it might be useful to read the paper he’s going to talk about beforehand:

> Smoothing is a central component of language modelling technologies.
> It attempts to improve probabilities estimated from language data by
> shifting mass from high probability areas to low or zero probability
> areas, thus “smoothing” the distribution. Many smoothing techniques
> have been proposed in the past based on a variety of principles and
> empirical observations.
>
> In this talk I will present a Bayesian statistical approach to smoothing.
> By using a hierarchical Bayesian methodology to effectively share
> information across the different parts of the language model, and by
> incorporating the prior knowledge that languages obey power-law
> behaviours using Pitman-Yor processes, we are able to construct
> language models with state-of-the-art results. Our approach also
> gives an interesting new interpretation of interpolated Kneser-Ney
> and why it works so well. Finally, we describe an extension of our
> model from finite n-grams to “infinite-grams”, which we call the
> sequence memoizer.
>
> This is joint work with Frank Wood, Jan Gasthaus, Cedric Archambeau
> and Lancelot James, and is based on work most recently reported in the
> Communications of the ACM (Feb 2011 issue).
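The abstract's core idea — shifting probability mass from seen events to unseen ones — can be illustrated with a minimal sketch of absolute discounting with interpolation, the mechanism underlying interpolated Kneser-Ney. This is a toy illustration under assumed counts and a placeholder uniform base distribution, not Teh's hierarchical Pitman-Yor model:

```python
from collections import Counter

def smoothed_prob(context_counts, word, base_prob, d=0.75):
    """Absolute discounting: subtract a discount d from each observed
    count and redistribute the freed mass via a base distribution --
    the interpolation at the heart of interpolated Kneser-Ney."""
    total = sum(context_counts.values())
    types = len(context_counts)  # distinct words seen in this context
    discounted = max(context_counts.get(word, 0) - d, 0) / total
    backoff_weight = d * types / total  # mass shifted to the base distribution
    return discounted + backoff_weight * base_prob(word)

# Hypothetical counts of words following the context "the"
counts = Counter({"cat": 3, "dog": 2, "mat": 1})
vocab = ["cat", "dog", "mat", "fish"]  # "fish" is unseen in this context

def uniform(w):
    return 1.0 / len(vocab)  # placeholder base distribution

probs = {w: smoothed_prob(counts, w, uniform) for w in vocab}
assert abs(sum(probs.values()) - 1.0) < 1e-9  # still a proper distribution
assert probs["fish"] > 0  # the unseen word now gets nonzero mass
```

In the Pitman-Yor view the talk describes, this discount-and-interpolate structure falls out of the predictive distribution of the process, which is one reason Kneser-Ney works so well.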

This talk is part of the Natural Language Processing Reading Group series.

