Applications of Lexicographic Semirings in Speech and Language Processing
If you have a question about this talk, please contact Bill Byrne.
Sandwiches will be provided.
In this talk, I’ll present two applications of lexicographic semirings for encoding sequence models, both of which yield useful algorithms based on weighted finite-state determinization. A lexicographic semiring is built from an ordered set of dimensions, each of which is itself a semiring. I’ll first briefly introduce weighted finite-state automata and transducers, semirings, and lexicographic semirings, and then present two special cases.

The first lexicographic semiring we examine is a pair of tropical semirings, which provides an exact automata encoding of smoothed n-gram models using simple epsilon transitions rather than failure transitions. This allows off-line optimization of exact models represented as large weighted finite-state transducers, in contrast to implicit (on-line) failure transition representations.

The second lexicographic semiring is a pair consisting of a tropical semiring and a new string semiring, which we call a “categorial semiring”. The categorial semiring is inspired by categorial grammar and includes an operation of string division. This semiring allows us to apply weighted finite-state determinization to a weighted transducer so that every input sequence has exactly one (minimum cost) output sequence. For example, a part-of-speech tagged word lattice can be determinized so that every word string in the original lattice has just one path in the tagged lattice, corresponding to the Viterbi-best POS-tag sequence for that word string.

Tools based on both of these methods will be available as part of the new ngram library available from OpenGrm.org. (Joint work with Richard Sproat, Izhak Shafran and Mahsa Yarmohammadi)
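To make the pair-of-tropical-semirings idea concrete, here is a minimal Python sketch. It is an illustration only and not the OpenGrm implementation: the class name `LexTropical`, the constants `ZERO` and `ONE`, and the weighting scheme in the usage example (first dimension counts epsilon backoff arcs, second dimension carries the negative log probability) are assumptions chosen for this example. In this sketch, "plus" picks the lexicographically smaller pair and "times" adds the pairs component by component.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class LexTropical:
    """Hypothetical weight in a lexicographic pair of tropical semirings."""
    first: float   # assumed here: number of backoff (epsilon) arcs taken
    second: float  # assumed here: accumulated negative log probability

    def plus(self, other: "LexTropical") -> "LexTropical":
        # Semiring "plus": lexicographic minimum. order=True compares
        # (first, second) as a tuple, so `first` is decisive and
        # `second` only breaks ties.
        return min(self, other)

    def times(self, other: "LexTropical") -> "LexTropical":
        # Semiring "times": tropical times (addition) in each dimension.
        return LexTropical(self.first + other.first,
                           self.second + other.second)


ZERO = LexTropical(math.inf, math.inf)  # identity for plus
ONE = LexTropical(0.0, 0.0)             # identity for times

# Two competing paths for the same word (illustrative costs only):
# an explicit n-gram arc, and an epsilon backoff arc followed by a
# lower-order arc.
direct = LexTropical(0.0, 2.5)
backoff = LexTropical(1.0, 0.5).times(LexTropical(0.0, 1.0))

# The explicit n-gram path wins even though its cost is higher,
# because it uses fewer backoff arcs.
best = direct.plus(backoff)
print(best)
```

Under these assumptions, the backoff path has the cheaper second component (1.5 versus 2.5), yet `plus` still selects the direct path, because the first dimension takes priority. This is the kind of behaviour that lets ordinary epsilon transitions stand in for failure transitions: the backoff path is only chosen when no explicit arc exists.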
Brian Roark is an Associate Professor in the Center for Spoken Language Understanding (CSLU) and Dept. of Biomedical Engineering at Oregon Health & Science University (OHSU). He received his PhD from Brown University in 2001 and spent 3 years in the Speech Algorithms Department at AT&T Labs – Research before joining CSLU. His research interests include natural language processing, language modeling for various applications, assistive technology, and spoken language understanding.
This talk is part of the Speech Seminars series.