Applications of Lexicographic Semirings in Speech and Language Processing
If you have a question about this talk, please contact Bill Byrne. Sandwiches will be provided.

In this talk, I’ll present a couple of applications of lexicographic semirings for encoding sequence models, which yield useful algorithms based on weighted finite-state determinization. Lexicographic semirings involve an ordered set of dimensions, each of which is itself a semiring. First, I’ll briefly introduce weighted finite-state automata and transducers, semirings, and lexicographic semirings, followed by a presentation of two special cases.

The first lexicographic semiring we examine involves a pair of tropical semirings, which provides an exact automata encoding of smoothed n-gram models using simple epsilon transitions rather than failure transitions. This allows for off-line optimization of exact models represented as large weighted finite-state transducers, in contrast to implicit (on-line) failure transition representations.

The second lexicographic semiring is a pair of a tropical semiring and a new string semiring which we call a “categorial semiring”. The categorial semiring is inspired by categorial grammar and includes an operation of string division. This semiring allows us to use weighted finite-state determinization on a weighted transducer so that every input sequence has exactly one (minimum cost) output sequence. For example, a part-of-speech tagged word lattice can be determinized so that every word string in the original lattice has just one path in the tagged lattice, corresponding to the Viterbi-best POS-tag sequence for that word string.

Tools based on both of these methods will be available as part of the new ngram library from OpenGrm.org. (Joint work with Richard Sproat, Izhak Shafran and Mahsa Yarmohammadi)

Brian Roark is an Associate Professor in the Center for Spoken Language Understanding (CSLU) and Dept. of Biomedical Engineering at Oregon Health & Science University (OHSU).
He received his PhD from Brown University in 2001 and spent 3 years in the Speech Algorithms Department at AT&T Labs – Research before joining CSLU. His research interests include natural language processing, language modeling for various applications, assistive technology, and spoken language understanding.

This talk is part of the Speech Seminars series.
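
As a rough illustration of the pair-of-tropical-semirings idea described in the abstract, the sketch below assumes the usual definition of a lexicographic semiring over two tropical dimensions: "plus" picks the lexicographically smaller pair and "times" adds componentwise. The class name `LexWeight`, the convention of marking backoff arcs with a 1 in the first dimension, and all of the numbers are assumptions made for this example; none of it is taken from the talk or from the OpenGrm tools.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class LexWeight:
    """Hypothetical weight in a <tropical, tropical> lexicographic semiring."""
    first: float   # assumed here to count backoff (epsilon) arcs taken
    second: float  # assumed here to hold the negative log probability

    def plus(self, other: "LexWeight") -> "LexWeight":
        # Lexicographic min: compare `first`, break ties on `second`.
        return min(self, other)

    def times(self, other: "LexWeight") -> "LexWeight":
        # Tropical "times" in each dimension is ordinary addition.
        return LexWeight(self.first + other.first, self.second + other.second)


ZERO = LexWeight(math.inf, math.inf)  # identity for plus
ONE = LexWeight(0.0, 0.0)             # identity for times

# Made-up costs: a direct n-gram arc versus a path through a backoff arc.
direct = LexWeight(0.0, 5.0)
via_backoff = LexWeight(1.0, 0.5).times(LexWeight(0.0, 1.25))

# The direct arc wins because its first component is smaller, even though
# the backoff path has the lower cost in the second dimension.
best = direct.plus(via_backoff)
```

Under these assumptions, a plain tropical semiring would compare only the costs (5.0 against 1.75) and prefer the backoff path, whereas the lexicographic ordering prefers the explicit n-gram arc whenever one exists, which is the behaviour a failure transition would otherwise enforce.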