Parsing Images the UCLA Way
If you have a question about this talk, please contact Shakir Mohamed.
2-D images of natural scenes have a hierarchical structure: scenes are made of objects and other large elements, objects are made of parts, parts can be decomposed into textures and other features, and these components in turn yield the pixels we see on the screen. For decades, computer vision researchers have tried to write programs that recover this structure, which explains the whole image, as easily as humans can, but only recently have they achieved notable results. In this talk I will review some of the image parsing work of Song-Chun Zhu et al. at UCLA. Zhu is arguably one of the most successful of these image parsing pioneers, and my talk will focus on the stochastic grammars and probabilistic inference methods employed in his work.
Main paper:
Z. Tu, X. Chen, A. Yuille, S. Zhu. Image parsing: unifying segmentation, detection, and recognition. Intl. Journal Comp. Vis. 63(2):113-140, 2005.
http://www.stat.ucla.edu/%7Esczhu/papers/reprint_IJCV_parsing.pdf
Some material in the talk will also touch on the following papers:
S. Zhu, D. Mumford. A stochastic grammar of images. Foundations and Trends in Computer Graphics and Vision 2(4):259-362, 2007.
http://www.stat.ucla.edu/%7Esczhu/papers/Reprint_Grammar.pdf
F. Han, S. Zhu. Bottom-up/top-down image parsing with attribute grammar. IEEE PAMI 31(1), 2009.
http://www.stat.ucla.edu/%7Esczhu/papers/PAMI_grammar_reprint.pdf
This talk is part of the Machine Learning Reading Group @ CUED series.