The English Profile Project: upgrading the Cambridge Learner Corpus and future directions for research.

If you have a question about this talk, please contact Teresa Parodi.

CHANGE OF VENUE TO GR06/07. This talk is a dry-run for a presentation to be given at the EPP annual meeting.

The ultimate goal of the English Profile Project (EPP) is to provide a set of reference level descriptions for English for all six levels of the CEFR (Common European Framework of Reference for Languages). In this talk we define our research goals and summarise ongoing and future work. Our goals are to deliver two products of direct relevance to the EPP: (i) a set of “criterial” features that characterise and distinguish the six levels of the CEFR with respect to English; and (ii) an assessment of the impact of different first languages (L1s) on performance at each of the levels and their interaction with the criterial features. The data for much of this work come from the Cambridge Learner Corpus (CLC), a 22-million-word corpus of written learner data, of which 11 million words have been manually error-coded and corrected. We will present preliminary results illustrating correlations between determiner errors and first languages, using the error-coded portion of the CLC. We will describe the part-of-speech tagging and syntactic parsing of the CLC and of the British National Corpus (BNC) using the RASP system, and demonstrate how this additional annotation allows us to address questions that go beyond specific lexical items and the error codes associated with them. In this context, we will report preliminary work on determiner-noun agreement errors and on verb subcategorisation and selectional disparities from native-speaker performance. We will also outline future work, including work on lexical choice errors and on the use of the CLC (and the BNC) to detect grammatical errors automatically.

This talk is part of the RCEAL Tuesday Colloquia series.
