Challenges and results of large-scale mapping of contemporary English dialects using online surveys

If you have a question about this talk, please contact Christopher Lucas.

Linguists are just beginning to mine the web (typically via Google) for primary linguistic data (cf. Nakov and Hearst 2005, "A Study of Using Search Engine Page Hits as a Proxy for n-gram Frequencies", or Nicholson and Baldwin 2006, "Interpretation of Compound Nominalisations using Corpus and Web Statistics"). Back in 1997, tired of saving hundreds of handwritten student surveys and having to present students with generalisations about English dialects that ceased being true 75 years ago, I decided to create an engine for collecting linguistic survey data via the web. My hope was to collect up-to-date dialect data and lots of it, in a form that could be directly dumped into a database for statistical analysis and geographic visualisation. Nine years and about eight surveys later, a host of interesting and surprising results have emerged concerning the current form of English varieties around the world. A number of unanticipated challenges have arisen as well. In this talk I present some of the most striking of these findings and problems.

This talk is part of the Cambridge University Linguistic Society (LingSoc) series.

