Relational-Realizational Syntax: An Architecture for Describing and Parsing Rich Morphosyntactic Descriptions

If you have a question about this talk, please contact Thomas Lippincott.

Precision grammars and treebank grammars present two alternatives for obtaining an accurate, consistent and maximally complete syntactic analysis of natural language sentences. For a long time these two research endeavors were pursued in separate communities and optimized for different goals: the former for rich and accurate descriptions of linguistic structures, the latter for efficient and accurate statistical parsing. Recently, the two communities have begun to acknowledge their usefulness to one another, borrowing terms, theoretical constructs and techniques from each other. In this talk I take a step back to consider the morphosyntactic analysis task from first principles and develop a novel architecture which remains faithful to both kinds of goals.

The architecture specifies rich morphosyntactic representations and supports learning the associated grammars from annotated data. The key idea underlying it is the application of the traditional notion of a “paradigm” to the syntactic domain. N-place predicates associated with paradigm cells are viewed as relational networks that are realized recursively by combining and ordering cells from other paradigms. The function of paradigm cells is mapped to forms in a recursive fashion, by means of realization rules that make reference both to the morphological and to the syntactic domains. This architecture, called Relational-Realizational, has a simple instantiation as a generative probabilistic model whose parameters can be statistically learned from treebank data, and which can be used for efficient parsing.
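
To make the idea of a generative probabilistic instantiation more concrete, here is a minimal sketch in Python. It is not the speaker's implementation. It assumes, as one reading of the abstract, that generating a clause factors into three choices: which relations a paradigm cell projects, how they are ordered, and which form realizes each relation. The toy clauses, the relation labels (SBJ, PRD, OBJ), the form labels and all function names are hypothetical, and parameters are estimated by plain relative frequency.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def relative_frequencies(events):
    """Estimate P(outcome | context) by relative frequency from (context, outcome) pairs."""
    counts = defaultdict(Counter)
    for context, outcome in events:
        counts[context][outcome] += 1
    return {
        context: {o: Fraction(c, sum(ctr.values())) for o, c in ctr.items()}
        for context, ctr in counts.items()
    }


# Hypothetical toy "treebank": each clause is a cell label plus an ordered
# list of (grammatical relation, realizing form) pairs. Invented for illustration.
TOY_CLAUSES = [
    ("S", [("SBJ", "NP"), ("PRD", "VP"), ("OBJ", "NP+acc")]),
    ("S", [("PRD", "VP"), ("SBJ", "NP"), ("OBJ", "NP+acc")]),
    ("S", [("SBJ", "NP"), ("PRD", "VP"), ("OBJ", "NP")]),
    ("S", [("SBJ", "NP"), ("PRD", "VP")]),
]


def train(clauses):
    """Collect events for the three assumed factors and estimate each one."""
    projection, configuration, realization = [], [], []
    for cell, daughters in clauses:
        relations = tuple(rel for rel, _ in daughters)
        network = frozenset(relations)  # unordered set of relations
        projection.append((cell, network))
        configuration.append(((cell, network), relations))
        for rel, form in daughters:
            realization.append(((cell, rel), form))
    return tuple(
        relative_frequencies(e) for e in (projection, configuration, realization)
    )


def clause_probability(model, cell, daughters):
    """P(clause) = P(network | cell) * P(order | cell, network) * prod P(form | cell, relation)."""
    proj, conf, real = model
    relations = tuple(rel for rel, _ in daughters)
    network = frozenset(relations)
    p = proj.get(cell, {}).get(network, Fraction(0))
    p *= conf.get((cell, network), {}).get(relations, Fraction(0))
    for rel, form in daughters:
        p *= real.get((cell, rel), {}).get(form, Fraction(0))
    return p


model = train(TOY_CLAUSES)
svo = [("SBJ", "NP"), ("PRD", "VP"), ("OBJ", "NP+acc")]
print(clause_probability(model, "S", svo))  # prints 1/3
```

Keeping the ordering factor separate from the realization factor means that, in this toy setting, word-order preferences and argument-marking preferences can be read off as separate probability tables. A real model would also need recursion into the daughters and smoothing for unseen events, both omitted here.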

An application of the model to Hebrew and Swedish allows for an accurate description of the word-order and argument-marking patterns of the different language types. The associated treebank grammar can be used for statistical parsing and is shown to improve state-of-the-art parsing results for the Semitic language Modern Hebrew. The availability of a simple, formal, robust, implementable and statistically interpretable working model opens new horizons in computational linguistics: at least in principle, we should now be able to quantify typological trends which have so far been stated informally or only tacitly reflected in corpus statistics.

This talk is part of the NLIP Seminar Series.