Neural Attention
If you have a question about this talk, please contact Robert Pinsler.
Sequence-to-sequence models have been very successful in natural language processing tasks such as machine translation. Self-attention – an attention mechanism that relates different positions of a single sequence – has proved to be an important technique that significantly improves the quality of model outputs. We will discuss attention in the context of neural machine translation and introduce the transformer model. Transformers rely entirely on self-attention to capture long-range dependencies, rather than recurrence or convolutions. By dispensing with recurrence, transformers are also more parallelizable than recurrent models, which are inherently sequential. We consider a number of case studies, most notably BERT, which is the major success story for self-attention. We also consider applications of attention to images and graph neural networks.
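To make the mechanism concrete, here is a minimal NumPy sketch of single-head scaled dot-product self-attention, the building block the transformer relies on. The function name, projection matrices and dimensions are illustrative assumptions, not code from the talk.

```python
import numpy as np

def scaled_dot_product_self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a sequence X of shape (seq_len, d_model).

    Wq, Wk, Wv are (illustrative) learned projections of shape (d_model, d_k).
    Each output position is a weighted sum of value vectors, with weights
    given by a softmax over scaled query-key dot products, so every position
    can attend directly to every other position in the sequence.
    """
    Q = X @ Wq                        # queries, (seq_len, d_k)
    K = X @ Wk                        # keys,    (seq_len, d_k)
    V = X @ Wv                        # values,  (seq_len, d_k)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # pairwise compatibilities, (seq_len, seq_len)
    # numerically stabilised softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                # (seq_len, d_k): each row mixes all positions

# Toy usage: 5 tokens, model dimension 8 (arbitrary sizes for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = scaled_dot_product_self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 8)
```

Because the attention weights are computed for all position pairs at once with matrix products, there is no sequential recurrence, which is what makes the computation parallelizable across the sequence.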
This talk is part of the Machine Learning Reading Group @ CUED series.