University of Cambridge > Talks.cam > NLIP Seminar Series > Improved Information Structure Analysis of Scientific Documents Through Discourse and Lexical Constraints

Improved Information Structure Analysis of Scientific Documents Through Discourse and Lexical Constraints

Add to your list(s) Download to your calendar using vCal

If you have a question about this talk, please contact Ekaterina Kochmar.

Inferring the information structure of scientific documents is useful for many down-stream applications. Existing feature-based machine learning approaches to this task require substantial training data and suffer from limited performance. Our idea is to guide feature-based models with declarative domain knowledge encoded as posterior distribution constraints. We explore a rich set of discourse and lexical constraints which we incorporate through the Generalized Expectation (GE) criterion. Our constrained model improves the performance of existing fully and lightly supervised models. Even a fully unsupervised version of this model outperforms lightly supervised feature-based models, showing that our approach can be useful even when no labeled data is available.

This talk is part of the NLIP Seminar Series series.

Tell a friend about this talk:

This talk is included in these lists:

Note that ex-directory lists are not shown.

 

© 2006-2024 Talks.cam, University of Cambridge. Contact Us | Help and Documentation | Privacy and Publicity