University of Cambridge > Talks.cam > NLIP Seminar Series > A New Dataset and Method for Automatically Grading ESOL Texts

A New Dataset and Method for Automatically Grading ESOL Texts

Add to your list(s) Download to your calendar using vCal

If you have a question about this talk, please contact Thomas Lippincott.

We demonstrate how supervised discriminative machine learning techniques can be used to automate the assessment of ‘English as a Second or Other Language’ (ESOL) examination scripts. In particular, we use rank preference learning to explicitly model the grade relationships between scripts. A number of different features are extracted and ablation tests are used to investigate their contribution to overall performance. A comparison between regression and rank preference models further supports our method. Experimental results on the first publically available dataset show that our system can achieve levels of performance close to the upper bound for the task, as defined by the agreement between human examiners on the same corpus. Finally, using a set of ‘outlier’ texts, we test the validity of our model and identify cases where the model’s scores diverge from that of a human examiner.

This talk is part of the NLIP Seminar Series series.

Tell a friend about this talk:

This talk is included in these lists:

Note that ex-directory lists are not shown.

 

© 2006-2017 Talks.cam, University of Cambridge. Contact Us | Help and Documentation | Privacy and Publicity