Predicting Strong Associations on the Basis of Corpus Data
- 👤 Speaker: Colin Kelly (Computer Laboratory)
- 📅 Date & Time: Monday 02 March 2009, 12:30 - 13:30
- 📍 Venue: GS15, Computer Laboratory
Abstract
At this session of the NLIP Reading Group we’ll be discussing the following paper:
Yves Peirsman and Dirk Geeraerts. 2009. Predicting Strong Associations on the Basis of Corpus Data. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL-09).
This paper doesn’t seem to be up on the Web yet; please contact the group organiser if you need a copy.
Paper abstract: Current approaches to the prediction of associations rely on just one type of information, generally taking the form of either word space models or collocation measures. At the moment, it is an open question how these approaches compare to one another. In this paper, we will investigate the performance of these two types of models and that of a new approach based on compounding. The best single predictor is the log-likelihood ratio, followed closely by the document-based word space model. We will show, however, that an ensemble method that combines these two best approaches with the compounding algorithm achieves an increase in performance of almost 30% over the current state of the art.
Series
This talk is part of the Natural Language Processing Reading Group series.
Included in Lists
- Cambridge Forum of Science and Humanities
- Cambridge Language Sciences
- Cambridge talks
- Chris Davis' list
- GS15, Computer Laboratory
- Guy Emerson's list
- Natural Language Processing Reading Group