{"id":2227,"date":"2024-09-20T08:42:25","date_gmt":"2024-09-20T06:42:25","guid":{"rendered":"https:\/\/reach.ircam.fr\/?p=2227"},"modified":"2024-10-07T08:47:13","modified_gmt":"2024-10-07T06:47:13","slug":"simultaneous-music-separation-and-generation-using-multi-track-latent-diffusion-models","status":"publish","type":"post","link":"https:\/\/reach.ircam.fr\/index.php\/2024\/09\/20\/simultaneous-music-separation-and-generation-using-multi-track-latent-diffusion-models\/","title":{"rendered":"Simultaneous Music Separation and Generation Using Multi-Track Latent Diffusion Models"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"2227\" class=\"elementor elementor-2227\">\n\t\t\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-5903b18 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"5903b18\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-0c464cd\" data-id=\"0c464cd\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-d1eb687 elementor-widget elementor-widget-text-editor\" data-id=\"d1eb687\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p><span class=\"chakra-text css-0\">Read original:\u00a0<a class=\"chakra-link css-1j13i2a\" href=\"https:\/\/arxiv.org\/abs\/2409.12346\" target=\"_blank\" rel=\"noopener\"><span class=\"chakra-text css-1081t4c\">arXiv:2409.12346<\/span><\/a> &#8211; Published 20\/09\/2024 by Tornike Karchkhadze, Mohammad Rasool Izadi, Shlomo Dubnov.<\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-2d9dce3 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"2d9dce3\" data-element_type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-4d0a516\" data-id=\"4d0a516\" data-element_type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-a0fc346 elementor-widget elementor-widget-text-editor\" data-id=\"a0fc346\" data-element_type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p><span style=\"color: rgb(0, 0, 0); font-family: &quot;Lucida Grande&quot;, Helvetica, Arial, sans-serif; font-size: 13.608px; font-style: normal;\"><b>Abstract<\/b><\/span><span style=\"color: #000000; font-family: 'Lucida Grande', Helvetica, Arial, sans-serif; font-size: 13.608px; font-style: normal; font-weight: 400;\">: Diffusion models have recently shown strong potential in both music generation and music source separation tasks. Although in early stages, a trend is emerging towards integrating these tasks into a single framework, as both involve generating musically aligned parts and can be seen as facets of the same generative process. In this work, we introduce a latent diffusion-based multi-track generation model capable of both source separation and multi-track music synthesis by learning the joint probability distribution of tracks sharing a musical context. Our model also enables arrangement generation by creating any subset of tracks given the others. We trained our model on the Slakh2100 dataset, compared it with an existing simultaneous generation and separation model, and observed significant improvements across objective metrics for source separation, music, and arrangement generation tasks. Sound examples are available at&nbsp;<\/span><a class=\"link-external link-https\" style=\"font-weight: 400; font-family: 'Lucida Grande', Helvetica, Arial, sans-serif; font-size: 13.608px; font-style: normal;\" href=\"https:\/\/msg-ld.github.io\/\" rel=\"external noopener nofollow\">this https URL<\/a><span style=\"color: #000000; font-family: 'Lucida Grande', Helvetica, Arial, sans-serif; font-size: 13.608px; font-style: normal; font-weight: 400;\">.<\/span><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Read original:\u00a0arXiv:2409.12346 &#8211; Published 20\/09\/2024 by Tornike Karchkhadze, Mohammad Rasool Izadi, Shlomo Dubnov. Abstract: Diffusion models have recently shown strong potential in both music generation and music source separation tasks. Although in early stages, a trend is emerging towards integrating these tasks into a single framework, as both involve generating musically aligned parts and can [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":2229,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[46],"tags":[],"class_list":["post-2227","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-publications-research"],"aioseo_notices":[],"blog_post_layout_featured_media_urls":{"thumbnail":["https:\/\/reach.ircam.fr\/wp-content\/uploads\/2024\/10\/x1-150x150.png",150,150,true],"full":["https:\/\/reach.ircam.fr\/wp-content\/uploads\/2024\/10\/x1.png",1578,401,false]},"categories_names":{"46":{"name":"Publications","link":"https:\/\/reach.ircam.fr\/index.php\/category\/research\/publications-research\/"}},"tags_names":[],"comments_number":"0","wpmagazine_modules_lite_featured_media_urls":{"thumbnail":["https:\/\/reach.ircam.fr\/wp-content\/uploads\/2024\/10\/x1-150x150.png",150,150,true],"cvmm-medium":["https:\/\/reach.ircam.fr\/wp-content\/uploads\/2024\/10\/x1-300x300.png",300,300,true],"cvmm-medium-plus":["https:\/\/reach.ircam.fr\/wp-content\/uploads\/2024\/10\/x1-305x207.png",305,207,true],"cvmm-portrait":["https:\/\/reach.ircam.fr\/wp-content\/uploads\/2024\/10\/x1-400x401.png",400,401,true],"cvmm-medium-square":["https:\/\/reach.ircam.fr\/wp-content\/uploads\/2024\/10\/x1-600x401.png",600,401,true],"cvmm-large":["https:\/\/reach.ircam.fr\/wp-content\/uploads\/2024\/10\/x1-1024x401.png",1024,401,true],"cvmm-small":["https:\/\/reach.ircam.fr\/wp-content\/uploads\/2024\/10\/x1-130x95.png",130,95,true],"full":["https:\/\/reach.ircam.fr\/wp-content\/uploads\/2024\/10\/x1.png",1578,401,false]},"_links":{"self":[{"href":"https:\/\/reach.ircam.fr\/index.php\/wp-json\/wp\/v2\/posts\/2227","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/reach.ircam.fr\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/reach.ircam.fr\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/reach.ircam.fr\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/reach.ircam.fr\/index.php\/wp-json\/wp\/v2\/comments?post=2227"}],"version-history":[{"count":4,"href":"https:\/\/reach.ircam.fr\/index.php\/wp-json\/wp\/v2\/posts\/2227\/revisions"}],"predecessor-version":[{"id":2232,"href":"https:\/\/reach.ircam.fr\/index.php\/wp-json\/wp\/v2\/posts\/2227\/revisions\/2232"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/reach.ircam.fr\/index.php\/wp-json\/wp\/v2\/media\/2229"}],"wp:attachment":[{"href":"https:\/\/reach.ircam.fr\/index.php\/wp-json\/wp\/v2\/media?parent=2227"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/reach.ircam.fr\/index.php\/wp-json\/wp\/v2\/categories?post=2227"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/reach.ircam.fr\/index.php\/wp-json\/wp\/v2\/tags?post=2227"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}