Efficient Data Structures for Nonlinear Video Processing
If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.

Nonlinear techniques are used extensively in image and video processing, with applications ranging from low-level kernels such as denoising and detail enhancement to higher-level operations such as object manipulation and special effects. In this talk, we will describe two computationally efficient data structures that dramatically simplify and accelerate a variety of algorithms for video processing.

Our first data structure is the bilateral grid, an image representation that explicitly accounts for intensity edges. By interpreting brightness differences as Euclidean distances, the bilateral grid naturally encodes the notion of edge-awareness into filters defined on it: smooth functions defined on the bilateral grid are piecewise-smooth in image space. Within this framework, we derive efficient reinterpretations of a number of nonlinear filters commonly used in computational photography as operations on the bilateral grid, including the bilateral filter, edge-aware scattered-data interpolation, and local histogram equalization. We also show how these techniques can be easily parallelized on modern graphics hardware for real-time processing of high-definition video.

The second data structure we describe is the video mesh, designed as a flexible central data structure for general-purpose nonlinear video editing workflows. It represents objects in a video sequence as 2.5D “paper cutouts” and allows interactive editing of moving objects and modeling of depth, which enables 3D effects and post-exposure camera control. In our representation, motion and depth are sparsely encoded by a set of points tracked over time. The video mesh is a triangulation over this point set, and per-pixel information is obtained by interpolation. To handle occlusions and detailed object boundaries, we rely on the user to rotoscope the scene at a sparse set of frames using spline curves.
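The bilateral grid idea described above can be illustrated with a small sketch: pixels are accumulated ("splatted") into a coarse 3D grid indexed by downsampled position and intensity, an ordinary Gaussian blur is applied in the grid, and the result is read back ("sliced") at each pixel's grid position. This is a minimal, illustrative reimplementation, not the talk's GPU pipeline; the function names, parameters, and the nearest-neighbor slicing (the real method uses trilinear interpolation) are simplifications of my own.

```python
import numpy as np

def gauss1d(size=5, sigma=1.0):
    """Normalized 1D Gaussian kernel."""
    x = np.arange(size) - size // 2
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur3d(vol, sigma=1.0):
    """Separable Gaussian blur along all three grid axes."""
    k = gauss1d(5, sigma)
    for axis in range(3):
        vol = np.apply_along_axis(
            lambda v: np.convolve(v, k, mode="same"), axis, vol)
    return vol

def bilateral_grid_filter(img, sigma_s=8.0, sigma_r=0.1):
    """Edge-aware smoothing via a bilateral grid (illustrative sketch).

    img: 2D float array with values in [0, 1].
    sigma_s / sigma_r: spatial / range sampling rates of the grid.
    """
    h, w = img.shape
    gh = int(np.ceil((h - 1) / sigma_s)) + 2
    gw = int(np.ceil((w - 1) / sigma_s)) + 2
    gd = int(np.ceil(1.0 / sigma_r)) + 2

    grid = np.zeros((gh, gw, gd))   # accumulated intensities
    wgt = np.zeros((gh, gw, gd))    # homogeneous weight channel

    # Splat: each pixel lands in the cell (y/sigma_s, x/sigma_s, I/sigma_r),
    # so pixels with very different intensities end up in different cells.
    ys, xs = np.mgrid[0:h, 0:w]
    gy = np.round(ys / sigma_s).astype(int)
    gx = np.round(xs / sigma_s).astype(int)
    gz = np.round(img / sigma_r).astype(int)
    np.add.at(grid, (gy, gx, gz), img)
    np.add.at(wgt, (gy, gx, gz), 1.0)

    # Blur: a plain (linear) Gaussian blur in the 3D grid. Because intensity
    # is a grid axis, this behaves like an edge-aware filter in image space.
    grid = blur3d(grid)
    wgt = blur3d(wgt)

    # Slice: read the filtered value back at each pixel's grid position
    # (nearest-neighbor here for brevity).
    return grid[gy, gx, gz] / np.maximum(wgt[gy, gx, gz], 1e-8)
```

A constant image comes back unchanged, and a sharp step between two intensity levels is preserved rather than smeared, since the two sides of the step occupy well-separated intensity slabs of the grid.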
We introduce an algorithm to robustly and automatically cut the mesh into local layers with proper occlusion topology, and to propagate the splines to the remaining frames. Object boundaries are refined with per-pixel alpha mattes. At its core, the video mesh is a collection of texture-mapped triangles, which we can edit and render interactively using graphics hardware. We demonstrate the effectiveness of our representation with special effects such as 3D viewpoint changes, object insertion, depth-of-field manipulation, and 2D-to-3D video conversion.

This talk is part of the Microsoft Research Cambridge, public talks series.
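The video mesh's "per-pixel information by interpolation" step can be sketched as barycentric interpolation of per-vertex attributes (depth, motion, etc.) over one triangle of the mesh. This is a generic sketch of that standard technique, not code from the talk; the function names and attribute choice are illustrative.

```python
import numpy as np

def barycentric(p, a, b, c):
    """Barycentric coordinates of 2D point p in triangle (a, b, c)."""
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    v = (d11 * d20 - d01 * d21) / denom
    w = (d00 * d21 - d01 * d20) / denom
    return np.array([1.0 - v - w, v, w])

def interpolate_attribute(p, tri_xy, tri_attr):
    """Dense per-pixel value from sparse per-vertex attributes.

    tri_xy:   three (x, y) vertex positions of a mesh triangle.
    tri_attr: the attribute stored at each vertex (e.g. depth).
    """
    a, b, c = (np.asarray(v, float) for v in tri_xy)
    bary = barycentric(np.asarray(p, float), a, b, c)
    return float(bary @ np.asarray(tri_attr, float))
```

Sparse tracked points carry the attribute; every pixel inside a triangle gets a smoothly interpolated value, e.g. the depth at a triangle's centroid is the mean of its three vertex depths.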