A New Dataset and Method for Automatically Grading ESOL Texts

Helen Yannakoudakis, University of Cambridge
Friday 10 June 2011, 12:00-12:30
FW26, Computer Laboratory.

If you have a question about this talk, please contact Thomas Lippincott.

We demonstrate how supervised discriminative machine learning techniques can be used to automate the assessment of ‘English as a Second or Other Language’ (ESOL) examination scripts. In particular, we use rank preference learning to explicitly model the grade relationships between scripts. A number of different features are extracted and ablation tests are used to investigate their contribution to overall performance. A comparison between regression and rank preference models further supports our method. Experimental results on the first publicly available dataset show that our system can achieve levels of performance close to the upper bound for the task, as defined by the agreement between human examiners on the same corpus. Finally, using a set of ‘outlier’ texts, we test the validity of our model and identify cases where the model’s scores diverge from those of a human examiner.
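
As a rough illustration of what rank preference learning over graded scripts can look like, the sketch below trains a linear scorer on pairwise differences so that a higher-graded script receives a higher score than a lower-graded one. It is not the system described in the talk: the abstract does not name the learner or the features, so a simple perceptron is used here as a stand-in, and the two features, the toy grades, and all function names are hypothetical.

```python
import itertools


def pairwise_differences(features, grades):
    """Return x_i - x_j for every ordered pair where grade_i > grade_j."""
    diffs = []
    for (xi, gi), (xj, gj) in itertools.permutations(zip(features, grades), 2):
        if gi > gj:
            diffs.append([a - b for a, b in zip(xi, xj)])
    return diffs


def dot(w, x):
    """Linear score of feature vector x under weights w."""
    return sum(wk * xk for wk, xk in zip(w, x))


def train_rank_perceptron(features, grades, epochs=50):
    """Learn w so that w.x_i > w.x_j whenever grade_i > grade_j.

    Assumption: a perceptron update on difference vectors, chosen only
    for brevity; it stands in for whatever learner the talk used.
    """
    w = [0.0] * len(features[0])
    diffs = pairwise_differences(features, grades)
    for _ in range(epochs):
        mistakes = 0
        for d in diffs:
            if dot(w, d) <= 0:  # pair is mis-ordered (or tied)
                w = [wk + dk for wk, dk in zip(w, d)]
                mistakes += 1
        if mistakes == 0:  # every preference pair is satisfied
            break
    return w


# Hypothetical toy data: two made-up features per script and a grade.
scripts = [[0.9, 0.1], [0.6, 0.3], [0.4, 0.6], [0.2, 0.8]]
grades = [4, 3, 2, 1]

w = train_rank_perceptron(scripts, grades)
# Indices of scripts ordered from highest to lowest predicted score.
ranked = sorted(range(len(scripts)), key=lambda i: dot(w, scripts[i]), reverse=True)
```

Unlike a regression model, which tries to predict each grade value directly, this formulation only constrains the relative order of scripts, which is the distinction the abstract draws between the two model families.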

This talk is part of the NLIP Seminar Series.
