Model-based EM Source Separation and Localization in Reverberant Mixtures
If you have a question about this talk, please contact Rachel Fogg.
This talk will describe our system for separating and localizing multiple sound sources from a reverberant two-channel recording. The talk begins with a characterization of the interaural spectrogram for single-source recordings, and a method for constructing a probabilistic model of interaural parameters that can be evaluated at individual spectrogram points. Multiple models can then be combined into a mixture model of sources and delays, which reduces the multi-source localization problem to a collection of single-source problems. The talk will then outline an expectation-maximization algorithm for finding the maximum-likelihood parameters of this mixture model, and show that these parameters correspond well with interaural parameters measured in isolation. As a byproduct of fitting this model, the algorithm creates probabilistic spectrogram masks that can be used for source separation. In experiments performed in simulated anechoic and reverberant environments, our system on average improved the direct-path signal-to-noise ratio of target sources by 2.1 dB more than two comparable algorithms.
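To make the idea of a mixture model over sources and delays concrete, here is a minimal sketch in Python. It is not the speakers' system: it is a toy illustration that keeps only the interaural phase difference (IPD), assumes a single fixed Gaussian spread `sigma` on the wrapped phase residual, assumes the sign convention that a delay `tau` produces an IPD of `-omega * tau`, and learns nothing but the mixing weights `psi[source, delay]`. The function names, the delay grid, the initial guesses and the synthetic data are all hypothetical. Because this toy model has no source-specific parameters other than the weights, the initial weights decide which source claims which delay.

```python
import numpy as np

def wrap_phase(x):
    """Wrap angles in radians to the interval [-pi, pi)."""
    return (x + np.pi) % (2 * np.pi) - np.pi

def em_delay_mixture(ipd, omega, taus, init_psi, sigma=0.5, n_iter=20):
    """Toy EM for a mixture over (source, delay) pairs.

    ipd      : (F, T) observed interaural phase differences, radians
    omega    : (F,)   angular frequency of each bin, radians per sample
    taus     : (K,)   candidate delays, samples
    init_psi : (I, K) non-negative initial weights (normalised here)
    Returns psi (I, K) and soft masks (I, F, T).
    """
    # Residual between the observed IPD and the IPD predicted by each delay.
    resid = wrap_phase(ipd[None, :, :] + omega[None, :, None] * taus[:, None, None])
    # Gaussian log-likelihood of the wrapped residual; the normalising
    # constant is dropped because sigma is shared and cancels in the E-step.
    loglik = -0.5 * (resid / sigma) ** 2              # shape (K, F, T)
    psi = init_psi / init_psi.sum()
    for _ in range(n_iter):
        # E-step: posterior of each (source, delay) pair at every point.
        log_post = np.log(psi + 1e-12)[:, :, None, None] + loglik[None, :, :, :]
        log_post -= log_post.max(axis=(0, 1), keepdims=True)
        nu = np.exp(log_post)
        nu /= nu.sum(axis=(0, 1), keepdims=True)      # shape (I, K, F, T)
        # M-step: new weights are the average posterior over all points.
        psi = nu.mean(axis=(2, 3))
    # Marginalising the posterior over delays gives one soft mask per source.
    masks = nu.sum(axis=1)
    return psi, masks

# Synthetic demo: every time-frequency point is owned by one of two sources.
rng = np.random.default_rng(0)
n_freq, n_frames = 64, 50
omega = np.pi * np.arange(1, n_freq + 1) / n_freq
taus = np.arange(-4, 5).astype(float)
true_delays = np.array([-2.0, 3.0])
owner = rng.integers(0, 2, size=(n_freq, n_frames))
noise = 0.2 * rng.standard_normal((n_freq, n_frames))
ipd = wrap_phase(-omega[:, None] * true_delays[owner] + noise)

# Rough initial delay guesses (hypothetical), one Gaussian bump per source.
guesses = np.array([-3.0, 2.0])
init_psi = np.exp(-0.5 * ((taus[None, :] - guesses[:, None]) / 1.5) ** 2)

psi, masks = em_delay_mixture(ipd, omega, taus, init_psi)
est_delays = taus[psi.argmax(axis=1)]
print("estimated delays (samples):", est_delays)
```

The masks returned here are posterior probabilities that each spectrogram point belongs to each source, so multiplying a mixture spectrogram by `masks[i]` would give a soft estimate of source `i`. Frequencies at which the two delays predict nearly the same phase remain ambiguous in this sketch, and a fuller model would also need per-source level and phase-residual parameters to resolve them.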
This talk is part of the Signal Processing and Communications Lab Seminars series.