Extracting and Querying Probabilistic Information in BayesStore

Daisy Zhe Wang, University of California, Berkeley
Tuesday 12 April 2011, 10:40-11:40
Small lecture theatre, Microsoft Research Ltd, 7 J J Thomson Avenue (Off Madingley Road), Cambridge.

If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.

This talk has been canceled/deleted

In the past few years, the number of applications that need to process large-scale data has grown remarkably. The data driving these applications is often uncertain, as is the analysis, which often involves probabilistic modeling and inference. Examples include sensor-based monitoring, information extraction and online advertising. Prior to our work, probabilistic database research advocated an approach in which uncertainty is modeled by attaching probabilities to data items. However, such systems do not and cannot take advantage of the wealth of Statistical Machine Learning (SML) research, because they are unable to represent and reason about the pervasive probabilistic correlations in the data.

In my thesis, I proposed, built, and evaluated BayesStore, a probabilistic database system that natively supports SML models and various inference algorithms to perform advanced data analysis. This marriage of database and SML technologies creates a declarative and efficient probabilistic processing framework for applications dealing with large-scale uncertain data. I have explored a variety of research challenges, including extending the database data model with probabilistic data and statistical models, defining relational operators (e.g., select, project, join) over probabilistic data and models, developing joint optimization of inference operators and the relational algebra, and devising novel query execution plans. I used information extraction over text as the driving application. My research shows that using in-database SML methods to extract and query probabilistic information can significantly improve answer quality. Moreover, it shows that optimizations for query-driven SML inference lead to orders-of-magnitude speed-up on large corpora.

This talk is part of the Microsoft Research Cambridge, public talks series.
