Applications of Lexicographic Semirings in Speech and Language Processing
If you have a question about this talk, please contact Bill Byrne.
Sandwiches will be provided.
In this talk, I’ll present a couple of applications of lexicographic semirings for encoding sequence models, which yield useful algorithms based on weighted finite-state determinization. A lexicographic semiring is defined over an ordered set of dimensions, each of which is itself a semiring. First, I’ll briefly introduce weighted finite-state automata and transducers, semirings, and lexicographic semirings, followed by a presentation of two special cases. The first lexicographic semiring we examine is a pair of tropical semirings, which provides an exact automata encoding of smoothed n-gram models using simple epsilon transitions rather than failure transitions. This allows for off-line optimization of exact models represented as large weighted finite-state transducers, in contrast to implicit (on-line) failure transition representations. The second lexicographic semiring pairs a tropical semiring with a new string semiring which we call a “categorial semiring”. The categorial semiring is inspired by categorial grammar and includes an operation of string division. This semiring allows us to use weighted finite-state determinization on a weighted transducer so that every input sequence has exactly one (minimum cost) output sequence. For example, a part-of-speech tagged word lattice can be determinized so that every word string in the original lattice has just one path in the tagged lattice, corresponding to the Viterbi-best POS-tag sequence for that word string. Tools based on both of these methods will be available as part of the new ngram library available from OpenGrm.org. (Joint work with Richard Sproat, Izhak Shafran and Mahsa Yarmohammadi.)
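As a rough illustration of the first case, the sketch below models weights in a lexicographic semiring over a pair of tropical semirings. It assumes the usual construction: ⊗ adds the two components independently, and ⊕ takes the lexicographic minimum of the pairs. The function names and the arc costs are hypothetical and chosen only for illustration; the way the first component is used to penalise a backoff arc is a simplification and not necessarily the exact encoding presented in the talk.

```python
import math
from typing import Tuple

# A weight is a pair of tropical weights (costs), compared lexicographically.
Weight = Tuple[float, float]

ZERO: Weight = (math.inf, math.inf)  # identity for plus, annihilator for times
ONE: Weight = (0.0, 0.0)             # identity for times


def plus(a: Weight, b: Weight) -> Weight:
    """Lexicographic min: Python compares tuples component by component."""
    return min(a, b)


def times(a: Weight, b: Weight) -> Weight:
    """Component-wise tropical times, i.e. addition of costs."""
    return (a[0] + b[0], a[1] + b[1])


# Hypothetical costs: a direct n-gram arc for a word, and a backoff path
# (epsilon arc followed by a lower-order arc) for the same word.
direct = (0.0, 5.0)
backoff_path = times((1.0, 0.5), (0.0, 2.0))  # epsilon arc, then word arc

# The backoff path is cheaper in the second component (2.5 < 5.0), but its
# first component is larger, so the lexicographic minimum keeps the direct arc.
best = plus(direct, backoff_path)
print(best)
```

The point of the design is that the first component acts as a tie-breaking priority: a path that takes the backoff epsilon when a direct arc exists always loses under ⊕, which is the behaviour a failure transition would otherwise have to enforce.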
Brian Roark is an Associate Professor in the Center for Spoken Language Understanding (CSLU) and Dept. of Biomedical Engineering at Oregon Health & Science University (OHSU). He received his PhD from Brown University in 2001 and spent 3 years in the Speech Algorithms Department at AT&T Labs – Research before joining CSLU. His research interests include natural language processing, language modeling for various applications, assistive technology, and spoken language understanding.
This talk is part of the Speech Seminars series.