End-to-end contextual speech recognition with Tree-constrained pointer generator

Brian Sun, Cambridge University Engineering Department
Monday 19 February 2024, 12:00-13:00
Hybrid: JDB Teaching Room, Engineering Department or Zoom: https://cam-ac-uk.zoom.us/j/85153166972?pwd=bkgzN0lKejBjQ1BFSVVRbDAvbUdmUT09.

If you have a question about this talk, please contact Simon Webster McKnight.

Contextual knowledge is of vital importance to end-to-end automatic speech recognition (ASR) systems, especially for the long-tailed word problem, where systems suffer from degraded performance on rare or unseen words that are both relevant to the context and carry important information. Integrating contextual knowledge into end-to-end systems is both necessary and challenging, as contextual knowledge changes dynamically while neural systems adopt a static set of trained parameters. In ASR, dynamic contextual knowledge is often incorporated via contextual biasing, where a list of rare words or phrases that are likely to appear in a given context, referred to as a biasing list of biasing words, is supplied to the system. A word is more likely to be correctly recognised if it is included in the biasing list. This talk introduces the tree-constrained pointer generator (TCPGen) as an effective neural biasing component for end-to-end contextual ASR. TCPGen integrates contextual knowledge via a pointer generator mechanism and efficiently structures biasing lists into prefix trees. The talk covers the TCPGen approach in detail, the use of graph neural networks for tree encodings, and the application of TCPGen to Whisper models.
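As a rough illustration of the two ideas named in the abstract, the sketch below builds a prefix tree from a biasing list and mixes a model distribution with a pointer distribution restricted to that tree. It is a toy under stated assumptions, not the method presented in the talk: it works on characters rather than subword units, and the function names (`build_prefix_tree`, `valid_next_symbols`, `interpolate`), the `"$"` end marker, and all probabilities are hypothetical. The mixing rule is a generic pointer-generator style interpolation.

```python
from typing import Dict, List


def build_prefix_tree(words: List[str]) -> dict:
    """Build a nested-dict prefix tree; "$" marks the end of a biasing word.

    Assumption: character-level symbols, chosen only to keep the toy small.
    """
    root: dict = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = {}
    return root


def valid_next_symbols(tree: dict, prefix: str) -> List[str]:
    """Return the symbols that can extend `prefix` towards some biasing word."""
    node = tree
    for ch in prefix:
        if ch not in node:
            return []  # the prefix has left the tree, so nothing is biased
        node = node[ch]
    return sorted(node)


def interpolate(
    p_model: Dict[str, float],
    p_pointer: Dict[str, float],
    p_gen: float,
) -> Dict[str, float]:
    """Mix the two distributions: (1 - p_gen) * model + p_gen * pointer.

    Assumption: p_gen is a scalar in [0, 1]; in a neural system it would be
    predicted at each decoding step rather than fixed by hand.
    """
    symbols = set(p_model) | set(p_pointer)
    return {
        s: (1.0 - p_gen) * p_model.get(s, 0.0) + p_gen * p_pointer.get(s, 0.0)
        for s in symbols
    }


# Hypothetical biasing list of rare names.
tree = build_prefix_tree(["turner", "turing", "tucker"])
print(valid_next_symbols(tree, "tu"))  # ['c', 'r']

# Made-up probabilities; the pointer only puts mass on tree-valid symbols.
p_model = {"r": 0.2, "c": 0.1, "n": 0.7}
p_pointer = {"r": 0.5, "c": 0.5}
print(interpolate(p_model, p_pointer, p_gen=0.4))
```

Because the pointer distribution is confined to the symbols the tree allows after the current prefix, only a small part of the biasing list needs to be considered at each step, which is the efficiency argument the abstract makes for prefix trees.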

This talk is part of the CUED Speech Group Seminars series.
