Automatic Identification of Samples in Hip-Hop Music via Deep Metric Learning and an Artificial Dataset
If you have a question about this talk, please contact jbf43.

Abstract

Sampling, the practice of reusing recorded music or sounds from another source in a new work, is common in popular music genres like hip-hop and rap. Numerous services have emerged that allow users to identify connections between samples and the songs that incorporate them, with the goal of enhancing music recommendation. Designing a system that can perform the same task automatically is challenging, however, as samples are commonly altered with audio effects like pitch shifting or filtering, and may be only several seconds long. Progress on this task has also been hindered by the limited availability of training data. Here we show that a convolutional neural network trained on an artificial dataset can identify real-world samples in commercial hip-hop music. We extract vocal, harmonic, and percussive elements from several databases of non-commercial music recordings using audio source separation, and train the model to fingerprint a subset of these elements in transformed versions of the original audio. We optimize the model using a joint classification and metric learning loss and show that it achieves 13% greater precision on real-world instances of sampling than a fingerprinting system based on acoustic landmarks, and that it can recognize samples that have been both pitch shifted and time stretched. We also show that, for half of the commercial music recordings we tested, our model can locate the position of a sample to within five seconds. More broadly, our results demonstrate how machine listening models can perform audio retrieval tasks previously reserved for experts.

Biography

Huw Cheston is a PhD student at the Centre for Music and Science, University of Cambridge, focussing on music information retrieval. His PhD research uses large-scale quantitative and computational methods to investigate performance style in improvised music, drawing on audio signal processing, machine learning, data science, and corpus analysis. He is also interested in developing reusable software, models, and datasets that can be deployed by researchers across a broad variety of audio-related domains. His research has been published in journals including Royal Society Open Science, Transactions of the International Society for Music Information Retrieval, and Music Perception. The work Huw will present at this seminar derives from research completed as an intern in Spotify’s Audio Intelligence laboratory during Summer 2024.

Zoom link

https://zoom.us/j/99433440421?pwd=ZWxCQXFZclRtbjNXa0s2K1Q2REVPZz09 (Meeting ID: 994 3344 0421; Passcode: 714277)

This talk is part of the CMS seminar series in the Faculty of Music.
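The abstract mentions optimizing the model with a joint classification and metric learning loss. As a rough, purely illustrative sketch of what such an objective can look like in PyTorch (the class name JointLoss, the weighting term alpha, and the specific choice of a triplet margin loss are assumptions for illustration, not details confirmed by the talk):

```python
# Illustrative sketch only: a combined classification + metric learning objective,
# assuming a PyTorch model that outputs class logits and an embedding per audio clip.
import torch
import torch.nn as nn
import torch.nn.functional as F


class JointLoss(nn.Module):
    """Cross-entropy on source-element logits plus a triplet margin loss on embeddings."""

    def __init__(self, margin: float = 0.2, alpha: float = 0.5):
        super().__init__()
        self.triplet = nn.TripletMarginLoss(margin=margin)
        self.alpha = alpha  # hypothetical weight balancing the two terms

    def forward(self, logits, labels, anchor_emb, positive_emb, negative_emb):
        # Classification term: predict which source element the clip was drawn from.
        ce = F.cross_entropy(logits, labels)
        # Metric term: pull transformed versions of the same element together in
        # embedding space and push different elements apart.
        tm = self.triplet(
            F.normalize(anchor_emb, dim=-1),
            F.normalize(positive_emb, dim=-1),
            F.normalize(negative_emb, dim=-1),
        )
        return ce + self.alpha * tm
```

In this kind of setup, the anchor and positive embeddings would come from differently transformed (e.g. pitch-shifted or time-stretched) versions of the same source element, and the negative from a different element; the actual architecture and loss used in the presented work may differ.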