Itakura-Saito nonnegative factorizations of the power spectrogram for music signal decomposition
If you have a question about this talk, please contact Rachel Fogg.

Nonnegative matrix factorization (NMF) is a popular linear regression technique in the fields of machine learning and signal/image processing. Much research on this topic has been driven by applications in audio. NMF has, for example, been applied successfully to automatic music transcription and audio source separation, where the data is usually taken to be the magnitude spectrogram of the sound signal, and the Euclidean distance or the Kullback-Leibler divergence is used as the measure of fit between the original spectrogram and its approximate factorization.

After a brief overview of NMF, this presentation will show evidence of the relevance of factorizing the power spectrogram instead, using the Itakura-Saito (IS) divergence. Indeed, IS-NMF is shown to be connected to maximum likelihood inference of variance parameters in a well-defined statistical model of superimposed Gaussian components, and this model is in turn shown to be well suited to audio. Furthermore, the statistical setting opens doors to Bayesian approaches and to a variety of computational inference techniques. We discuss in particular model order selection strategies and Markov regularization of the activation matrix, to account for time-persistence in audio.

This presentation will also address extensions of NMF to the multichannel case, for both instantaneous and convolutive recordings, possibly underdetermined, leading to nonnegative tensor factorizations under novel structures. We will present in particular audio source separation results on real-world stereo musical excerpts.

References:

C. Févotte, N. Bertin and J.-L. Durrieu. “Nonnegative matrix factorization with the Itakura-Saito divergence. With application to music analysis,” Neural Computation, vol. 21, no. 3, Mar. 2009. http://www.tsi.enst.fr/fevotte/Journals/neco09_is-nmf.pdf

A. Ozerov and C. Févotte. “Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation,” IEEE Trans. Audio, Speech and Language Processing, 2010 (to appear). http://www.tsi.enst.fr/fevotte/TechRep/techrep09_multinmf.pdf

This talk is part of the Signal Processing and Communications Lab Seminars series.
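As a rough illustration of the kind of factorization the abstract describes, the following is a minimal sketch of single-channel IS-NMF using the widely used multiplicative-update heuristic. It is not taken from the talk and is not necessarily the algorithm presented there. The function names `is_divergence` and `is_nmf`, the parameters `n_components`, `n_iter`, `eps` and `seed`, and the random initialization are all assumptions made for this example. The input `V` stands for a strictly positive power spectrogram (frequency bins by time frames).

```python
import numpy as np


def is_divergence(V, V_hat):
    """Itakura-Saito divergence summed over all entries (inputs must be > 0)."""
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))


def is_nmf(V, n_components, n_iter=200, eps=1e-12, seed=0):
    """Approximate V by W @ H under the IS divergence (illustrative sketch).

    V: (F, N) array of strictly positive values, e.g. a power spectrogram.
    Returns the dictionary W (F, K), the activations H (K, N), and the
    divergence recorded after each iteration.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    # Hypothetical initialization: random positive factors.
    W = rng.random((F, n_components)) + eps
    H = rng.random((n_components, N)) + eps
    costs = []
    for _ in range(n_iter):
        # Multiplicative update of the activations H.
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat))
        # Multiplicative update of the dictionary W.
        V_hat = W @ H + eps
        W *= ((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T)
        costs.append(is_divergence(V, W @ H + eps))
    return W, H, costs
```

Because the updates only multiply by ratios of nonnegative quantities, `W` and `H` stay nonnegative throughout. In practice `V` would be the squared magnitude of a short-time Fourier transform; the Markov regularization and multichannel extensions mentioned in the abstract are not covered by this sketch.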