The Web as an Implicit Training Set: Application to Noun Compounds' Syntax and Semantics

Preslav Nakov - National University of Singapore
Friday 29 January 2010, 12:00-13:00
SW01, Computer Laboratory.

If you have a question about this talk, please contact Laura Rimell.

I will present Web-based approaches to the syntax and semantics of noun compounds (NCs), which can be used in query parsing, technical term understanding, etc. I will also describe an application to machine translation.

First, I will present a highly accurate, lightly supervised method based on surface features and paraphrases for making bracketing decisions for three-word noun compounds, e.g. “[[liver cell] antibody]” is left-bracketed, while “[liver [cell line]]” is right-bracketed. The enormous size of the Web makes such features frequent enough to be useful.
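
As a rough illustration of how counts can drive a bracketing decision, the sketch below uses a simple adjacency comparison: it checks whether the first two words or the last two words co-occur more often. This is a stand-in technique, not the method from the talk, which relies on surface features and paraphrases. The counts are made-up placeholders for Web frequencies, and `toy_count` and `bracket` are hypothetical names.

```python
from typing import Callable, Tuple

# Hypothetical bigram counts standing in for Web frequencies (illustrative only).
TOY_COUNTS = {
    ("liver", "cell"): 900,
    ("cell", "antibody"): 120,
    ("cell", "line"): 5000,
}


def toy_count(w1: str, w2: str) -> int:
    """Return the made-up count for the bigram (w1, w2), or 0 if unseen."""
    return TOY_COUNTS.get((w1, w2), 0)


def bracket(nc: Tuple[str, str, str],
            count: Callable[[str, str], int] = toy_count) -> str:
    """Bracket a three-word noun compound by comparing adjacent bigram counts."""
    w1, w2, w3 = nc
    left = count(w1, w2)   # evidence that w1 and w2 form a unit
    right = count(w2, w3)  # evidence that w2 and w3 form a unit
    if left >= right:
        return f"[[{w1} {w2}] {w3}]"
    return f"[{w1} [{w2} {w3}]]"


print(bracket(("liver", "cell", "antibody")))
print(bracket(("liver", "cell", "line")))
```

With real data, `count` would be replaced by a function that queries a large corpus or search index; the comparison logic stays the same.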

Second, I will introduce an unsupervised method for discovering the implicit predicates characterizing the semantic relations that hold in noun-noun compounds. For example, “malaria mosquito” is a “mosquito that carries/spreads/causes/transmits/brings/infects with/... malaria”.
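
One way to picture the idea of collecting implicit predicates is a pattern match of the form "head that VERB modifier" over text snippets. The sketch below is only an assumption about how such a step might look, not the talk's actual procedure: the snippets are invented, and `paraphrase_verbs` is a hypothetical helper.

```python
import re
from collections import Counter
from typing import Iterable

# Invented snippets standing in for text retrieved from the Web.
SNIPPETS = [
    "a mosquito that carries malaria",
    "the mosquito that spreads malaria was found",
    "mosquito that carries malaria",
    "a mosquito that transmits malaria",
    "a mosquito that bites people",
]


def paraphrase_verbs(modifier: str, head: str,
                     snippets: Iterable[str]) -> Counter:
    """Count the one- or two-word predicates found between 'head that' and 'modifier'."""
    pattern = re.compile(
        rf"\b{re.escape(head)} that (\w+(?: \w+)?) {re.escape(modifier)}\b",
        re.IGNORECASE,
    )
    counts: Counter = Counter()
    for text in snippets:
        for match in pattern.finditer(text):
            counts[match.group(1).lower()] += 1
    return counts


verbs = paraphrase_verbs("malaria", "mosquito", SNIPPETS)
print(verbs.most_common())
```

The resulting frequency distribution over predicates can then serve as a description of the relation between the two nouns.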

Finally, I will present a method for improving Statistical Machine Translation (SMT). Most modern SMT systems rely on aligned sentences from bilingual corpora for training. I will describe a method for expanding the training set with conceptually similar but syntactically different paraphrases at the NP level that involve NCs. The English-to-Spanish evaluation on the Europarl corpus shows an improvement equivalent to 33-50% of that obtained by doubling the amount of training data.

This talk is part of the NLIP Seminar Series.