High-dimensional variable selection when features are sparse

Jacob Bien (University of Southern California)
Thursday 26 April 2018, 11:00-12:00
Seminar Room 2, Newton Institute.

If you have a question about this talk, please contact INI IT.

STS - Statistical scalability

It is common in modern prediction problems for many predictor variables to be counts of rarely occurring events. This leads to design matrices in which a large number of columns are highly sparse. The challenge posed by such “rare features” has received little attention despite its prevalence in diverse areas, ranging from biology (e.g., rare species) to natural language processing (e.g., rare words). We show, both theoretically and empirically, that not explicitly accounting for the rareness of features can greatly reduce the effectiveness of an analysis. We next propose a framework for aggregating rare features into denser features in a flexible manner that creates better predictors of the response. An application to online hotel reviews demonstrates the gain in accuracy achievable by proper treatment of rare words. This is joint work with Xiaohan Yan.
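As a rough illustration of the idea of aggregating rare features into denser ones, the sketch below sums sparse count columns within groups and compares how often each column is nonzero before and after. The toy matrix, the grouping, and the helper names `density` and `aggregate` are all assumptions made for this example; the grouping is fixed by hand here, and this is not the framework presented in the talk.

```python
# Hypothetical toy data: rows are documents, columns are counts of rare words.
X = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]

# Hypothetical grouping of columns (an assumption, e.g. words of similar meaning).
groups = [[0, 1], [2, 3]]


def density(matrix, col):
    """Fraction of rows with a nonzero entry in column `col`."""
    return sum(1 for row in matrix if row[col] != 0) / len(matrix)


def aggregate(matrix, groups):
    """Sum the counts of the columns in each group into one new column."""
    return [[sum(row[j] for j in group) for group in groups] for row in matrix]


Z = aggregate(X, groups)
original_density = [density(X, j) for j in range(len(X[0]))]
aggregated_density = [density(Z, k) for k in range(len(groups))]

print("original column densities:  ", original_density)
print("aggregated column densities:", aggregated_density)
```

In this toy case each aggregated column is nonzero in at least as many rows as any of the columns it combines, which is the sense in which the aggregated features are denser.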

This talk is part of the Isaac Newton Institute Seminar Series.
