SummaryMixing: A Linear-Time Attention Alternative

If you have a question about this talk, please contact Simon Webster McKnight.

Modern speech processing systems rely on self-attention. Unfortunately, self-attention takes quadratic time in the length of the speech utterance, making inference and training on long sequences slower and more memory-hungry. Cheaper alternatives to self-attention for speech recognition have been developed, but they degrade performance. We propose a novel linear-time alternative to self-attention that, for the first time, reaches better accuracy. Our model, SummaryMixing, computes a mean over the whole utterance and feeds this summary back to each time step.

Experiments are performed in three vital scenarios: an encoder-decoder offline model, an online streaming Transducer model, and a self-supervised model. In all three scenarios, SummaryMixing gives equal or better accuracy than self-attention, at lower cost.
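To make the mechanism concrete, here is a minimal PyTorch sketch of a layer following the description in the abstract. It is not the authors' implementation: the layer names, the ReLU activations, and the hidden size are illustrative assumptions. Each frame is transformed locally, a second per-frame transformation is averaged over time into one summary vector, and the summary is concatenated back onto every time step, giving cost linear in sequence length.

```python
import torch
import torch.nn as nn

class SummaryMixing(nn.Module):
    """Sketch of a SummaryMixing layer per the abstract (illustrative, not the authors' code)."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.local = nn.Linear(d_model, d_hidden)        # per-frame local transform (assumed linear + ReLU)
        self.summary = nn.Linear(d_model, d_hidden)      # per-frame transform averaged into the summary
        self.combine = nn.Linear(2 * d_hidden, d_model)  # combines local features with the summary

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, d_model)
        local = torch.relu(self.local(x))                                 # (B, T, H)
        summary = torch.relu(self.summary(x)).mean(dim=1, keepdim=True)   # mean over the whole utterance: (B, 1, H)
        summary = summary.expand(-1, x.size(1), -1)                       # feed the summary back to each time step
        return self.combine(torch.cat([local, summary], dim=-1))          # (B, T, d_model)

# Usage: a batch of 2 utterances, 100 frames, 80-dim features
layer = SummaryMixing(d_model=80, d_hidden=128)
out = layer(torch.randn(2, 100, 80))
print(out.shape)  # torch.Size([2, 100, 80])
```

Note that the only interaction across time steps is the mean, so the layer costs O(T) in the utterance length T, versus the O(T^2) pairwise scores of self-attention.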

This talk is part of the CUED Speech Group Seminars series.
