BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Tackling Multispeaker Conversation Processing based on Speaker Dia
 rization and Multispeaker Speech Recognition - Prof Shinji Watanabe\, Carn
 egie Mellon University (CMU)\, USA
DTSTART:20210209T120000Z
DTEND:20210209T130000Z
UID:TALK156679@talks.cam.ac.uk
CONTACT:Dr Kate Knill
DESCRIPTION:Abstract: Recently\, speech recognition and understanding stud
 ies have shifted their focus from single-speaker automatic speech recognit
 ion (ASR) in controlled scenarios to more challenging and realistic multis
 peaker conversation analysis based on ASR and speaker diarization. The CHi
 ME speech separation and recognition challenge is one of the attempts to t
 ackle these new paradigms. This talk first introduces the latest CHiME-6 c
 hallenge and its results\, focusing on recognizin
 g multispeaker conversations in a dinner party scenario. The second part o
 f this talk tackles this problem using an emergent technique based
  on an end-to-end neural architecture. We first introduce an end-to-end si
 ngle-microphone multispeaker ASR technique based on a recurrent neural net
 work and transformer to show the effectiveness of the proposed method. Sec
 ond\, we extend this approach to leverage the benefit of the multi-microph
 one input and realize simultaneous speech separation and recognition withi
 n a single neural network trained only with the ASR objective. Finally\, w
 e introduce our recent attempts at speaker diarization based on an end-t
 o-end neural architecture\, including basic concepts\, on-line extensions\
 , and handling unknown numbers of speakers.\n\nBio:\nShinji Watanabe is an
  Associate Professor at Carnegie Mellon University\, Pittsburgh\, PA. He r
 eceived his B.S.\, M.S.\, and Ph.D. (Dr. Eng.) degrees from Waseda Univers
 ity\, Tokyo\, Japan. He was a research scientist at NTT Communication Scie
 nce Laboratories\, Kyoto\, Japan\, from 2001 to 2011\, a visiting scholar 
 at the Georgia Institute of Technology\, Atlanta\, GA in 2009\, a Sen
 ior Princ
 ipal Research Scientist at Mitsubishi Electric Research Laboratories (MERL
 )\, Cambridge\, MA\, USA from 2012 to 2017\, and an associate resea
 rch profe
 ssor at Johns Hopkins University\, Baltimore\, MD from 2017 to 2020. His r
 esearch interests include automatic speech recognition\, speech enhancemen
 t\, spoken language understanding\, and machine learning for speech and la
 nguage processing. He has published more than 200 papers in peer-revi
 ewed journals and conferences and received several awards\, including the 
 best paper award from the IEEE ASRU in 2019. He served as an Associate Edi
 tor of the IEEE Transactions on Audio\, Speech\, and Language Proces
 sing. He w
 as/has been a member of several technical committees\, including the APSIP
 A Speech\, Language\, and Audio Technical Committee (SLA)\, IEEE Signal Pr
 ocessing Society Speech and Language Technical Committee (SLTC)\, and Mach
 ine Learning for Signal Processing Technical Committee (MLSP).\n\nThis tal
 k is made possible through the ISCA International Virtual Seminars.
LOCATION:Zoom: https://zoom.us/j/95352633552?pwd=RzJVK2UzOGZyNU5mVHd1Y1VPT
 2tDUT09
END:VEVENT
END:VCALENDAR