Itakura-Saito nonnegative factorizations of the power spectrogram for music signal decomposition

Dr Cedric Fevotte, CNRS - TELECOM ParisTech
Thursday 18 March 2010, 14:15-15:15
LR5, Department of Engineering.

If you have a question about this talk, please contact Rachel Fogg.

Nonnegative matrix factorization (NMF) is a popular linear regression technique in machine learning and signal/image processing. Much of the research on this topic has been driven by audio applications. For example, NMF has been applied successfully to automatic music transcription and audio source separation, where the data is usually taken to be the magnitude spectrogram of the sound signal and either the Euclidean distance or the Kullback-Leibler divergence is used to measure the fit between the original spectrogram and its approximate factorization.
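
For readers unfamiliar with the setting, the factorization problem described above can be written out as follows. This is the standard textbook formulation, not notation taken from the talk itself: the symbols V, W, H and the dimensions F, N, K are assumptions introduced here for illustration.

```latex
% Assumed notation: V is the F x N nonnegative spectrogram,
% W (F x K) holds spectral patterns and H (K x N) their activations.
V \approx WH, \qquad
\min_{W \ge 0,\; H \ge 0} \; D(V \mid WH)
  = \sum_{f=1}^{F} \sum_{n=1}^{N} d\big([V]_{fn} \mid [WH]_{fn}\big)

% The two measures of fit mentioned above, for scalars x, y > 0:
d_{\mathrm{EUC}}(x \mid y) = \tfrac{1}{2}\,(x - y)^2, \qquad
d_{\mathrm{KL}}(x \mid y) = x \log\frac{x}{y} - x + y
```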

After a brief overview of NMF, this presentation will show evidence that it is relevant to factorize the power spectrogram using the Itakura-Saito (IS) divergence. Indeed, IS-NMF is shown to be connected to maximum likelihood inference of variance parameters in a well-defined statistical model of superimposed Gaussian components, and this model is in turn shown to be well suited to audio. Furthermore, the statistical setting opens the door to Bayesian approaches and to a variety of computational inference techniques. We discuss in particular model order selection strategies and Markov regularization of the activation matrix, which accounts for time-persistence in audio.
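
The link between the IS divergence and maximum likelihood variance estimation can be checked numerically on a single coefficient. The sketch below is illustrative only and is not taken from the talk: it uses the usual definition d_IS(x|y) = x/y - log(x/y) - 1, and it assumes that the Gaussian model is a zero-mean circular complex Gaussian on each time-frequency coefficient. The coefficient value and the candidate variances are hypothetical.

```python
import math


def is_divergence(p, v):
    """Itakura-Saito divergence d_IS(p | v) between two positive scalars."""
    r = p / v
    return r - math.log(r) - 1.0


def gaussian_nll(x, v):
    """Negative log-density at x of a zero-mean circular complex Gaussian
    with variance v (assumed model, see the lead-in)."""
    return math.log(math.pi * v) + abs(x) ** 2 / v


# Hypothetical time-frequency coefficient and candidate variances.
x = complex(0.8, -1.3)
power = abs(x) ** 2
candidates = [0.1, 0.5, power, 5.0, 50.0]

# The gap between the two criteria should not depend on the variance v,
# so minimising one over v is the same as minimising the other.
gaps = [gaussian_nll(x, v) - is_divergence(power, v) for v in candidates]
expected_gap = math.log(math.pi * power) + 1.0

# Both criteria pick the same candidate, namely v = |x|^2.
best_by_nll = min(candidates, key=lambda v: gaussian_nll(x, v))
best_by_is = min(candidates, key=lambda v: is_divergence(power, v))
```

Under this assumed model, a sum of independent Gaussian components is again Gaussian with the component variances added, which is why fitting a sum of variances to the power spectrogram under the IS divergence can be read as maximum likelihood estimation.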

This presentation will also address extensions of NMF to the multichannel case, for both instantaneous and convolutive recordings, possibly underdetermined, leading to nonnegative tensor factorizations under novel structures. In particular, we will present audio source separation results on real-world stereo musical excerpts.

References:

C. Févotte, N. Bertin and J.-L. Durrieu. “Nonnegative matrix factorization with the Itakura-Saito divergence. With application to music analysis,” Neural Computation, vol. 21, no. 3, Mar. 2009 http://www.tsi.enst.fr/~fevotte/Journals/neco09_is-nmf.pdf

A. Ozerov and C. Févotte. “Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation,” IEEE Trans. Audio, Speech and Language Processing, 2010 (to appear) http://www.tsi.enst.fr/~fevotte/TechRep/techrep09_multinmf.pdf

This talk is part of the Probabilistic Systems, Information, and Inference Group Seminars series.
