Learning Latent Syntactic Representations with Joint Models

If you have a question about this talk, please contact Tamara Polajnar.

A human listener, charged with the difficult task of mapping language to meaning, must infer a rich hierarchy of linguistic structures before reaching an understanding of what was spoken. In much the same way, a complete natural language processing system must process many different layers of linguistic information in order to solve complex tasks, such as answering a question or translating a document.

One such prerequisite layer, syntactic structure, has proven useful for a wide variety of downstream NLP tasks, and the supervised training of syntactic parsers has received significant attention from the NLP community for decades. However, training such parsers requires access to large corpora of syntactically annotated data, which are costly to produce and unavailable for many widely spoken languages. Even when such data are available, their domain (often newswire) frequently differs from the domain of interest, leading to domain drift and lower performance on the downstream task.

To address these issues, we present a general framework for constructing and reasoning with joint graphical models. In a joint model, individual component models are coupled and inference is performed globally, allowing the beliefs of one model to influence the other and vice versa. While joint inference is traditionally pursued to limit the propagation of errors between components, here we use it for a different purpose: to train syntactic models without syntactically annotated data. We propose a novel marginalization-based training method in which end-task annotations guide the induction of a constrained latent syntactic representation, so that the resulting syntactic distribution is specially tailored to the desired end task. We find that across a number of NLP tasks (semantic role labeling, named entity recognition, relation extraction) this approach not only offers performance comparable to fully supervised training of the joint model (using syntactically annotated data), but in some instances even improves upon it by learning latent structures that are more appropriate for the task.
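
As a rough illustration of what marginalizing over a latent syntactic variable can look like, here is a minimal Python sketch. It is not the method presented in the talk: the scores, the two candidate structures, the label set, and all function names are hypothetical, and brute-force enumeration stands in for whatever inference procedure the joint graphical model actually uses. The sketch only shows the general idea that the training objective depends on the observed end-task label, while the syntactic structure is summed out.

```python
import math
from itertools import product

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

# Hypothetical toy scores (assumptions for illustration only).
# Latent syntactic choices z, never observed during training.
SYNTAX_SCORE = {"attach_left": 0.2, "attach_right": -0.1}
# Coupling between a syntactic choice z and an end-task label y.
TASK_SCORE = {
    ("attach_left", "ARG0"): 1.0,
    ("attach_left", "ARG1"): -0.5,
    ("attach_right", "ARG0"): -0.3,
    ("attach_right", "ARG1"): 0.8,
}
LABELS = ["ARG0", "ARG1"]

def joint_score(z, y):
    """Unnormalized log-score of a (syntax, label) pair."""
    return SYNTAX_SCORE[z] + TASK_SCORE[(z, y)]

def marginal_log_likelihood(y):
    """log p(y) = log sum_z exp(score(z, y)) - log sum_{z, y'} exp(score(z, y'))."""
    numerator = logsumexp([joint_score(z, y) for z in SYNTAX_SCORE])
    log_partition = logsumexp(
        [joint_score(z, label) for z, label in product(SYNTAX_SCORE, LABELS)]
    )
    return numerator - log_partition

def latent_posterior(y):
    """p(z | y): how the observed end-task label reshapes beliefs about syntax."""
    log_norm = logsumexp([joint_score(z, y) for z in SYNTAX_SCORE])
    return {z: math.exp(joint_score(z, y) - log_norm) for z in SYNTAX_SCORE}

if __name__ == "__main__":
    print(marginal_log_likelihood("ARG0"))
    print(latent_posterior("ARG0"))
```

Maximizing `marginal_log_likelihood` over the score parameters would require only end-task labels, and `latent_posterior` shows how those labels shift probability mass toward the syntactic choices that best explain them. In a realistic model the sums range over exponentially many structures, so enumeration would have to be replaced by structured inference.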

This is joint work with Mark Johnson, Sebastian Riedel, and David A. Smith.

This talk is part of the University of Cambridge NLIP Seminar Series.
