Functional Distributional Semantics: Learning Linguistically Informed Representations from a Precisely Annotated Corpus
If you have a question about this talk, please contact Andrew Caines.

The aim of distributional semantics is to design computational techniques that can automatically learn the meanings of words from a body of text. The twin challenges are: how do we represent meaning, and how do we learn these representations? The current state of the art is to represent meanings as vectors, but vectors do not correspond to any traditional notion of meaning. In particular, there is no way to talk about truth, a crucial concept in logic and formal semantics.

In this dissertation, I develop a framework for distributional semantics which answers this challenge. The meanings of words are represented not as vectors but as functions, which map from entities to probabilities of truth. Such a function can be interpreted both in the machine learning sense of a classifier and in the formal semantic sense of a truth-conditional function. This simultaneously allows the use of machine learning techniques to exploit large datasets and the use of formal semantic techniques to manipulate the learnt representations.

I define a probabilistic graphical model which incorporates a probabilistic generalisation of model theory (allowing a strong connection with formal semantics) and which generates semantic dependency graphs (allowing it to be trained on a corpus). This graphical model provides a natural way to model logical inference, semantic composition, and context-dependent meanings, where Bayesian inference plays a crucial role. I demonstrate the feasibility of this approach by training a model on WikiWoods, a parsed version of the English Wikipedia, and evaluating it on three tasks. The results indicate that the model can learn information not captured by vector space models.
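To make the central idea concrete, here is a minimal sketch of a word meaning as a truth-conditional classifier: a function from an entity representation to a probability of truth. This is an illustration only, not the model presented in the talk; the fixed-dimensional entity vectors, the logistic link, and all names and parameter values are assumptions made for the example.

    # Illustrative sketch only: a word meaning as a function from an entity
    # to a probability of truth. The feature space, logistic link, and all
    # parameter values below are hypothetical, not the speaker's model.
    import numpy as np

    def make_semantic_function(weights: np.ndarray, bias: float):
        """Return a function mapping an entity vector to a probability of truth.

        Readable both as a classifier (machine learning) and as a
        truth-conditional function (formal semantics):
        P(predicate is true of entity | entity).
        """
        def semantic_function(entity: np.ndarray) -> float:
            # Logistic regression: sigmoid of a linear score over features.
            return float(1.0 / (1.0 + np.exp(-(weights @ entity + bias))))
        return semantic_function

    # Hypothetical usage: a 3-dimensional entity space and a made-up predicate.
    dog = make_semantic_function(weights=np.array([1.5, -0.5, 2.0]), bias=-1.0)
    entity = np.array([0.8, 0.1, 0.9])   # an entity's (assumed) feature vector
    print(dog(entity))                   # probability that "dog" is true of it

Under this reading, composition and inference can be framed probabilistically, which is where the graphical model and Bayesian inference described in the abstract come in.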
This talk is part of the NLIP Seminar Series.