AugmentedNet: A Roman Numeral Analysis Network with Synthetic Training Examples and Additional Tonal Tasks

López, Néstor Nápoles; Gotham, Mark; Fujinaga, Ichiro

Authors

Néstor Nápoles López

Ichiro Fujinaga

Mark Gotham



Contributors

Jin Ha Lee
Editor

Alexander Lerch
Editor

Zhiyao Duan
Editor

Juhan Nam
Editor

Preeti Rao
Editor

Peter van Kranenburg
Editor

Ajay Srinivasamurthy
Editor

Abstract

AugmentedNet is a new convolutional recurrent neural network for predicting Roman numeral labels. The network architecture is characterized by a separate convolutional block for each of the bass and chromagram inputs. This layout is further enhanced by using synthetic training examples for data augmentation, and by solving a greater number of tonal tasks simultaneously via multitask learning. This paper reports the improved performance achieved by combining these ideas. The additional tonal tasks strengthen the shared representation learned through multitask learning. The synthetic examples, in turn, complement key transposition, which is often the only technique used for data augmentation in similar problems related to tonal music. The name ‘AugmentedNet’ speaks to the increased number of both training examples and tonal tasks. We report on tests across six relevant and publicly available datasets: ABC, BPS, HaydnSun, TAVERN, When-in-Rome, and WTC. In our tests, our model outperforms recent methods for functional harmony analysis, such as other convolutional neural networks and Transformer-based models. Finally, we show a new method for reconstructing the full Roman numeral label, based on common Roman numeral classes, which leads to better results than previous methods. © 2021 Proceedings of the 22nd International Conference on Music Information Retrieval, ISMIR 2021. All Rights Reserved.
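
The abstract mentions key transposition as the usual data augmentation technique for tonal music. As a rough illustration only, the sketch below transposes a training example by rotating 12-bin pitch-class (chroma) vectors and shifting the tonic label. The 12-bin encoding, the integer tonic label, and the function names `transpose_chroma`, `transpose_example`, and `augment` are assumptions made for this example; they are not taken from the paper, whose actual input encoding may differ.

```python
from typing import List, Sequence, Tuple

Chroma = List[float]  # assumed encoding: 12 bins, index 0 = C, 1 = C#, ..., 11 = B


def transpose_chroma(chroma: Sequence[float], semitones: int) -> Chroma:
    """Rotate a 12-bin pitch-class vector upward by `semitones`."""
    if len(chroma) != 12:
        raise ValueError("expected a 12-bin pitch-class vector")
    shift = semitones % 12
    # The energy that was in bin (i - shift) moves to bin i.
    return [chroma[(i - shift) % 12] for i in range(12)]


def transpose_example(
    frames: Sequence[Sequence[float]], tonic_pc: int, semitones: int
) -> Tuple[List[Chroma], int]:
    """Transpose every frame and the tonic pitch class by the same interval."""
    new_frames = [transpose_chroma(frame, semitones) for frame in frames]
    new_tonic = (tonic_pc + semitones) % 12
    return new_frames, new_tonic


def augment(
    frames: Sequence[Sequence[float]], tonic_pc: int
) -> List[Tuple[List[Chroma], int]]:
    """Return the example in all 12 transpositions (shift 0 is the original)."""
    return [transpose_example(frames, tonic_pc, s) for s in range(12)]


# Usage: a single frame holding a C major triad (C, E, G), in the key of C.
c_major = [1.0 if pc in (0, 4, 7) else 0.0 for pc in range(12)]
d_frames, d_tonic = transpose_example([c_major], tonic_pc=0, semitones=2)
```

Because Roman numerals are defined relative to the key, the Roman numeral label of a frame stays the same under transposition; only the absolute pitch content and the key label change. That is what lets one annotated passage stand in for several training examples.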

Citation

López, N. N., Gotham, M., & Fujinaga, I. (2021). AugmentedNet: A Roman Numeral Analysis Network with Synthetic Training Examples and Additional Tonal Tasks. In J. H. Lee, A. Lerch, Z. Duan, J. Nam, P. Rao, P. van Kranenburg, & A. Srinivasamurthy (Eds.), Proceedings of the 22nd International Society for Music Information Retrieval Conference (pp. 404-411).

Conference Name: ISMIR 2021: 22nd International Society for Music Information Retrieval Conference
Conference Location: Online
Start Date: Nov 7, 2021
End Date: Nov 12, 2021
Acceptance Date: Jun 1, 2021
Online Publication Date: Nov 7, 2021
Publication Date: Nov 7, 2021
Deposit Date: Feb 29, 2024
Pages: 404-411
Series Title: Proceedings of the 22nd International Conference on Music Information Retrieval, ISMIR 2021
Book Title: Proceedings of the 22nd International Society for Music Information Retrieval Conference
ISBN: 9781732729902
Public URL: https://durham-repository.worktribe.com/output/2273228
Publisher URL: https://www.ismir.net/conferences/ismir2021.html
Related Public URLs: https://zenodo.org/records/5624533