Improving Multiclass Text Classification with Error-Correcting Output Coding and Sub-class Partitions

If you have a question about this talk, please contact Marek Rei.

Stuart will present the following paper:

Improving Multiclass Text Classification with Error-Correcting Output Coding and Sub-class Partitions. 2010. Baoli Li and Carl Vogel.

http://www.springerlink.com/content/d546042162276641/

Error-Correcting Output Coding (ECOC) is a general framework for multi-class text classification with a set of binary classifiers. It can not only help a binary classifier solve multi-class classification problems, but also boost the performance of a multi-class classifier. When building each individual binary classifier in ECOC, the classes are randomly split into two disjoint groups: positive and negative. However, when such a binary classifier is trained, the sub-class distribution within the positive and negative groups is neglected. Utilizing this information is expected to improve a binary classifier. We thus design a simple binary classification strategy via multi-class categorization (2vM) that makes use of sub-class partition information and can lead to better performance than traditional binary classification. The proposed binary classification strategy is then applied to enhance ECOC. Experiments on document categorization and question classification show its effectiveness.
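
To make the abstract's two ideas concrete, here is a minimal sketch in Python, not the authors' implementation. It assumes a nearest-centroid base learner, invented 2-D toy data, a 7-bit random code matrix, and Hamming-distance decoding; all function and variable names (`fit_binary`, `fit_2vm`, `ecoc_predict`, and so on) are hypothetical, and `fit_2vm` is only one reading of the 2vM strategy as the abstract describes it.

```python
import random

# Invented toy data: four tight clusters, four points each.
OFFSETS = [(-0.5, 0.0), (0.5, 0.0), (0.0, -0.5), (0.0, 0.5)]
CENTERS = {0: (0.0, 0.0), 1: (10.0, 0.0), 2: (5.0, 0.0), 3: (5.0, 8.0)}
X = [(cx + dx, cy + dy) for cx, cy in CENTERS.values() for dx, dy in OFFSETS]
y = [label for label in CENTERS for _ in OFFSETS]

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def fit_centroids(points, labels):
    """Stand-in base learner (an assumption): one mean vector per distinct label."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return {label: tuple(sum(axis) / len(pts) for axis in zip(*pts))
            for label, pts in groups.items()}

def nearest(model, point):
    return min(model, key=lambda label: sq_dist(model[label], point))

def fit_binary(points, labels, column):
    """Traditional binary learner: collapse each class to its bit, then fit."""
    model = fit_centroids(points, [column[c] for c in labels])
    return lambda point: nearest(model, point)

def fit_2vm(points, labels, column):
    """2vM-style learner: fit on the original classes, then map the predicted class to its bit."""
    model = fit_centroids(points, labels)
    return lambda point: column[nearest(model, point)]

def random_code_matrix(n_classes, n_bits, rng):
    """Rows are class codewords; each column randomly splits the classes into two groups."""
    while True:
        rows = [tuple(rng.randint(0, 1) for _ in range(n_bits)) for _ in range(n_classes)]
        distinct = len(set(rows)) == n_classes
        useful = all(0 < sum(col) < n_classes for col in zip(*rows))
        if distinct and useful:
            return rows

def ecoc_predict(code, bit_predictors, point):
    """Decode to the class whose codeword is closest in Hamming distance."""
    bits = [predict(point) for predict in bit_predictors]
    return min(range(len(code)),
               key=lambda c: sum(b != cb for b, cb in zip(bits, code[c])))

def accuracy(predictions, truth):
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)

# A single binary problem: classes {0, 1} are positive, {2, 3} negative.
column = (1, 1, 0, 0)
bit_truth = [column[c] for c in y]
binary = fit_binary(X, y, column)
two_vm = fit_2vm(X, y, column)
binary_acc = accuracy([binary(x) for x in X], bit_truth)
two_vm_acc = accuracy([two_vm(x) for x in X], bit_truth)

# Full ECOC with both kinds of binary learner behind each column.
code = random_code_matrix(n_classes=4, n_bits=7, rng=random.Random(0))
columns = list(zip(*code))
ecoc_binary = [fit_binary(X, y, col) for col in columns]
ecoc_2vm = [fit_2vm(X, y, col) for col in columns]
ecoc_binary_acc = accuracy([ecoc_predict(code, ecoc_binary, x) for x in X], y)
ecoc_2vm_acc = accuracy([ecoc_predict(code, ecoc_2vm, x) for x in X], y)

print(binary_acc, two_vm_acc, ecoc_binary_acc, ecoc_2vm_acc)
```

The toy data is built so that the positive group {0, 1} straddles class 2: the pooled positive centroid lands on top of class 2, so the collapsed binary learner mislabels it, while the 2vM-style learner keeps the sub-classes apart. These numbers illustrate the mechanism only and say nothing about the paper's reported results. Note also that in this sketch the same deterministic multi-class model sits behind every ECOC column, so ECOC with 2vM simply decodes to that model's own prediction; the abstract does not say how the paper handles this.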

Anyone interested in more background material might want to look at http://arxiv.org/abs/cs/9501101, the original paper introducing the ECOC method.

This talk is part of the Natural Language Processing Reading Group series.
