Cobra - Building an Internet-Scale Publish/Subscribe System

If you have a question about this talk, please contact Minor Gordon.

Blogs and RSS feeds are becoming increasingly popular. The blogging site LiveJournal has over 11 million user accounts, and according to one report, over 1.6 million postings are made to blogs every day. The “Blogosphere” is a new hotbed of Internet-based media that represents a shift from mostly static content to dynamic, continuously updated discussions. The problem is that finding and tracking blogs with interesting content is an extremely cumbersome process. In this talk, I present our work on Cobra (Content-Based RSS Aggregator), a publish/subscribe system that crawls and filters vast numbers of RSS feeds, delivering to each user a personalised feed based on their interests. Cobra consists of a three-tiered network: crawlers that scan web feeds, filters that match crawled articles to user subscriptions, and reflectors that provide the recently matching articles for each subscription as an RSS feed, which can be browsed using a standard RSS reader. I will talk about the design, implementation and evaluation of Cobra in three settings: a dedicated cluster, the Emulab testbed and PlanetLab. I also present our performance study of the Cobra system, demonstrating that the system scales well to support a large number of source feeds and users; that the mean update detection latency is low (bounded by the crawler rate); and that an offline service provisioning step, combined with several performance optimisations, is effective at reducing memory usage and network load.
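
The three-tier flow described in the abstract can be illustrated with a minimal, single-process Python sketch. This is not Cobra's implementation: the class names, the in-memory data structures and the matching rule (a subscription matches when all of its keywords appear in an article) are assumptions made purely for illustration, and no real feeds are fetched.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    feed_url: str
    title: str
    body: str


class Crawler:
    """Hypothetical crawler tier: reports only articles not seen before."""

    def __init__(self):
        self._seen = set()

    def crawl(self, fetched):
        # `fetched` stands in for the result of polling the source feeds.
        fresh = []
        for article in fetched:
            if article not in self._seen:
                self._seen.add(article)
                fresh.append(article)
        return fresh


class Filter:
    """Hypothetical filter tier: matches articles to keyword subscriptions."""

    def __init__(self):
        self._subs = {}  # subscription id -> set of lowercase keywords

    def subscribe(self, sub_id, keywords):
        self._subs[sub_id] = {k.lower() for k in keywords}

    def match(self, article):
        words = set((article.title + " " + article.body).lower().split())
        # Assumed rule: every keyword of the subscription must appear.
        return [s for s, kws in self._subs.items() if kws <= words]


class Reflector:
    """Hypothetical reflector tier: keeps recent matches per subscription."""

    def __init__(self, max_items=10):
        self._feeds = defaultdict(lambda: deque(maxlen=max_items))

    def publish(self, sub_id, article):
        self._feeds[sub_id].appendleft(article)  # newest first

    def feed(self, sub_id):
        return list(self._feeds[sub_id])


def run_once(crawler, flt, reflector, fetched):
    """One pass: crawl, match each new article, publish the matches."""
    for article in crawler.crawl(fetched):
        for sub_id in flt.match(article):
            reflector.publish(sub_id, article)


crawler, flt, reflector = Crawler(), Filter(), Reflector()
flt.subscribe("alice", ["pubsub", "rss"])
batch = [
    Article("http://example.org/a.rss", "Scaling pubsub", "an rss aggregator design"),
    Article("http://example.org/b.rss", "Gardening", "tomatoes and basil"),
]
run_once(crawler, flt, reflector, batch)
run_once(crawler, flt, reflector, batch)  # second pass finds nothing new
print([a.title for a in reflector.feed("alice")])  # ['Scaling pubsub']
```

In this sketch the delay before a new article reaches a subscriber's feed is at most the interval between calls to `run_once`, which mirrors the abstract's remark that update detection latency is bounded by the crawler rate.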

This talk is part of the Computer Laboratory Opera Group Seminars series.


