Multitrack Music Transformer

Hao-Wen Dong, K. Chen, S. Dubnov, J. McAuley and T. Berg-Kirkpatrick, "Multitrack Music Transformer," ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 2023, pp. 1-5, doi: 10.1109/ICASSP49357.2023.10094628.

Full publication: https://ieeexplore.ieee.org/document/10094628

Abstract: Existing approaches for generating multitrack music with transformer models have been limited in terms of the number of instruments, the length of the music segments, and slow inference. This is partly due to the memory requirements of the lengthy input sequences necessitated by existing representations. In this work, we propose a new multitrack music representation that allows a diverse set of instruments while keeping a short sequence length. Our proposed Multitrack Music Transformer (MMT) achieves comparable performance with state-of-the-art systems, landing in between two recently proposed models in a subjective listening test, while achieving substantial speedups and memory reductions over both, making the method attractive for real-time improvisation or near-real-time creative applications. Further, we propose a new measure for analyzing musical self-attention and show that the trained model attends more to notes that form a consonant interval with the current note and to notes that are 4N beats away from the current step.