Very deep convolutional neural networks for speech recognition
This talk is part of the CUED Speech Group Seminars series. If you have a question about this talk, please contact Anton Ragni.

Convolutional neural networks (CNNs) are one of the main drivers of the recent deep learning explosion, starting with the AlexNet (2012) result on the ImageNet competition and followed by models such as OverFeat (2013), VGG net (2014), GoogLeNet (2014), and residual networks (2015). In the speech recognition domain, CNNs with 2 convolutional layers were introduced around 2012 and have not seen major updates since. We will present a number of recent architectural advances in CNNs for speech recognition.

We introduce a very deep convolutional network architecture with up to 14 weight layers. There are multiple convolutional layers before each pooling layer, with small 3×3 kernels, inspired by the VGG ImageNet 2014 architecture. We will discuss the design choice of strided pooling and zero-padding along the time direction, which renders convolutional evaluation of sequences highly inefficient. This can be phrased in the computer vision terminology of classification versus dense pixelwise prediction. We define the architectural constraints that make efficient evaluation of full utterances possible. This allows batch normalization to be adopted during full-utterance sequence training, resulting in faster training and improved performance.

We show state-of-the-art results on the benchmark Switchboard 2000-hour dataset (Hub5 eval). We also adapted our architecture to the multilingual setting and obtained strong results on the Babel OP3 surprise language after multilingual training on 25 languages.
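The point about efficient evaluation of full utterances can be illustrated with a small sketch. It is not the speaker's architecture: it assumes a toy stack of four size-3 convolutions over a one-dimensional signal, with no zero-padding and no pooling along time, and the names `conv1d_valid`, `stack`, `windowed_eval`, and `dense_eval` are hypothetical. Under those assumptions, a single pass over the whole utterance (dense prediction) gives the same outputs as evaluating every context window separately (classification).

```python
import random


def conv1d_valid(x, w):
    """1-D cross-correlation along time with stride 1 and no zero-padding."""
    k = len(w)
    return [sum(x[t + i] * w[i] for i in range(k)) for t in range(len(x) - k + 1)]


def stack(x, kernels):
    """Apply each convolution in turn, with a ReLU after every layer."""
    for w in kernels:
        x = [max(v, 0.0) for v in conv1d_valid(x, w)]
    return x


def receptive_field(kernels):
    """Number of input frames that influence one output of the stack."""
    return 1 + sum(len(w) - 1 for w in kernels)


def windowed_eval(utterance, kernels):
    """Classification style: run the stack once per context window."""
    ctx = receptive_field(kernels)
    return [
        stack(utterance[t:t + ctx], kernels)[0]
        for t in range(len(utterance) - ctx + 1)
    ]


def dense_eval(utterance, kernels):
    """Dense-prediction style: run the stack once over the full utterance."""
    return stack(utterance, kernels)


# Toy data (assumption): random kernels and a random 50-frame "utterance".
random.seed(0)
kernels = [[random.gauss(0, 1) for _ in range(3)] for _ in range(4)]
utterance = [random.gauss(0, 1) for _ in range(50)]

windowed = windowed_eval(utterance, kernels)
dense = dense_eval(utterance, kernels)

# Both strategies produce one output per valid context window.
assert len(windowed) == len(dense)
assert all(abs(a - b) < 1e-12 for a, b in zip(windowed, dense))
```

In this sketch the windowed version recomputes the intermediate activations shared by overlapping windows, while the dense version computes each of them once. Zero-padding each window along time would change the values near the window edges, so the two evaluations would no longer agree.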