Automatic Speech Recognition in a State-of-Flux
If you have a question about this talk, please contact Dr Kate Knill.

Abstract: Initiated by the successful utilization of deep neural network modeling for large vocabulary automatic speech recognition (ASR), the last decade brought a considerable diversification of ASR architectures. Following the classical state-of-the-art hidden Markov model (HMM) based architecture, connectionist temporal classification (CTC), attention-based encoder-decoder, recurrent neural network transducer (RNN-T) and monotonic variants, as well as segmental approaches including inverted HMM architectures, were introduced. All these architectures show competitive performance, and the question arises which of them will finally prevail and define the new state of the art in large vocabulary ASR. In this presentation, a comparative review of current architectures in the context of the Bayes decision rule is provided. Relations and equivalences between architectures are derived, the utilization of data is considered, and in particular the role of language modeling within integrated end-to-end architectures is discussed.

Bio: Ralf Schlüter serves as Academic Director and Lecturer (Privatdozent) in the Department of Computer Science of the Faculty of Computer Science, Mathematics and Natural Sciences at RWTH Aachen University. He leads the Automatic Speech Recognition Group at the Lehrstuhl Informatik 6: Human Language Technology and Pattern Recognition. He studied physics at RWTH Aachen University and Edinburgh University and received his Diploma in Physics (1995), in Computer Science (2000) and Habilitation for Computer Science (2019), each at RWTH Aachen University. Dr. Schlüter works on all aspects of automatic speech recognition and has led the scientific work of the Lehrstuhl Informatik 6 in this area in many large national and international research projects, e.g. EU-Bridge and TC-STAR (EU), Babel (US-IARPA) and Quaero (French OSEO).
This talk is provided through the ISCA International Virtual Seminar Programme. This talk is part of the CUED Speech Group Seminars series.