Disfluency detection in spoken learner English

Andrew Caines, DTAL, University of Cambridge
Friday 01 May 2015, 12:30-13:00
FW26, Computer Laboratory.

If you have a question about this talk, please contact Tamara Polajnar.

Spoken language is non-canonical: it contains filled pauses, hesitations, non-standard grammatical variations and other disfluencies. This, compounded by a lack of available training data, has made spoken language parsing a challenge for standard NLP tools. Recently the Redshift parser (Honnibal et al., CoNLL 2013) has been shown to be successful in identifying grammatical relations and certain disfluencies in native speaker spoken language, returning an unlabelled dependency accuracy of 90.5% and a disfluency F-measure of 84.1% (Honnibal & Johnson, TACL 2014). We investigate how this parser handles spoken data from learners of English at various proficiency levels. We find that Redshift’s parsing accuracy on non-native speech data is comparable to Honnibal & Johnson’s results, with 91.1% of dependency relations correctly identified. However, disfluency detection is markedly worse, with an F-measure of just 47.8%. We consider why this should be, and relate our findings to the use of NLP technology for automatic language assessment and computer-assisted language learning applications.
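
For readers unfamiliar with the two metrics quoted in the abstract, the sketch below shows one common way to compute them. It is an illustration only, not the evaluation code used in the work described: the function names, the toy utterance and its annotations are all hypothetical, and it assumes token-level disfluency labels and one head index per token (0 for the root).

```python
from typing import Sequence, Tuple


def unlabelled_accuracy(gold_heads: Sequence[int], pred_heads: Sequence[int]) -> float:
    """Fraction of tokens whose predicted head matches the gold head."""
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted head lists must have equal length")
    if not gold_heads:
        return 0.0
    correct = sum(g == p for g, p in zip(gold_heads, pred_heads))
    return correct / len(gold_heads)


def disfluency_prf(gold: Sequence[bool], pred: Sequence[bool]) -> Tuple[float, float, float]:
    """Token-level precision, recall and F-measure for the 'disfluent' label."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted label lists must have equal length")
    tp = sum(g and p for g, p in zip(gold, pred))        # disfluent, and detected
    fp = sum((not g) and p for g, p in zip(gold, pred))  # fluent, wrongly flagged
    fn = sum(g and (not p) for g, p in zip(gold, pred))  # disfluent, but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical toy utterance: "I want uh I want a ticket".
# Assumed gold annotation: the first "I want" and the filled pause "uh" are disfluent.
gold_labels = [True, True, True, False, False, False, False]
# Assumed system output: only the filled pause is detected.
pred_labels = [False, False, True, False, False, False, False]
precision, recall, f1 = disfluency_prf(gold_labels, pred_labels)

# Assumed heads for the fluent part "I want a ticket" (1-based token indices, 0 = root).
gold_heads = [2, 0, 4, 2]
pred_heads = [2, 0, 2, 2]  # "a" is attached to the wrong head
accuracy = unlabelled_accuracy(gold_heads, pred_heads)
```

In this toy case the system finds the filled pause but misses the repeated words, so precision stays high while recall, and with it the F-measure, drops. That shows how the F-measure responds to missed disfluencies; it is not a claim about why detection was lower on learner speech.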

This talk is part of the NLIP Seminar Series.
