Musical Genre Recognition Based on Deep Descriptors of Harmony, Instrumentation, and Segments
Vatolkin, Igor; Gotham, Mark; López, Néstor Nápoles; Ostermann, Fabian
Authors
Mark Gotham
Néstor Nápoles López
Fabian Ostermann
Contributors
Colin Johnson (Editor)
Nereida Rodríguez-Fernández (Editor)
Sérgio M. Rebelo (Editor)
Abstract
Deep learning has recently established itself as the method of choice for almost all classification tasks in music information retrieval. However, despite very good classification performance, it brings disadvantages: long training times and high energy costs, low interpretability of the resulting models, and an increased risk of overfitting on small training sets due to the very large number of trainable parameters. In this paper, we investigate a combination of deep and shallow algorithms for the recognition of musical genres using a transfer learning approach. We train deep classification models once to predict harmonic, instrumental, and segment properties from datasets with corresponding annotations. Their predictions for another dataset with annotated genres then serve as features for shallow classification methods, which can be retrained quickly for different categories and are particularly useful when training sets are small, as in the real-world scenario where listeners define their own musical categories by selecting only a few prototype tracks. The experiments show the potential of the proposed approach for genre recognition. In particular, when combined with evolutionary feature selection, which identifies the most relevant deep feature dimensions, classification errors were significantly lower in almost all cases than both a baseline based on MFCCs and results reported in previous work.
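The pipeline described in the abstract (predictions of pre-trained deep models used as features for a shallow classifier, pruned by evolutionary feature selection) can be sketched as follows. This is a minimal illustration under assumed stand-ins, not the authors' implementation: synthetic random vectors replace the deep harmony/instrumentation/segment descriptors, a nearest-centroid rule replaces the shallow classifier, and a simple (1+1) evolutionary algorithm with bit-flip mutation replaces the paper's feature selection method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for deep descriptors: in the paper these would be
# outputs of models pre-trained on harmony/instrumentation/segment labels.
n_tracks, n_features = 200, 40
X = rng.normal(size=(n_tracks, n_features))
# Only a few dimensions are informative about the (binary) genre label.
informative = [3, 11, 27]
y = (X[:, informative].sum(axis=1) > 0).astype(int)

# Train/test split for a small-training-set scenario.
split = n_tracks // 2
X_tr, X_te = X[:split], X[split:]
y_tr, y_te = y[:split], y[split:]


def nearest_centroid_error(X_tr, y_tr, X_te, y_te):
    """Error rate of a minimal shallow classifier: nearest class centroid."""
    c0 = X_tr[y_tr == 0].mean(axis=0)
    c1 = X_tr[y_tr == 1].mean(axis=0)
    d0 = np.linalg.norm(X_te - c0, axis=1)
    d1 = np.linalg.norm(X_te - c1, axis=1)
    pred = (d1 < d0).astype(int)
    return float(np.mean(pred != y_te))


def fitness(mask):
    """Validation error using only the feature dimensions selected by mask."""
    if not mask.any():
        return 1.0  # empty feature set: worst possible fitness
    return nearest_centroid_error(X_tr[:, mask], y_tr, X_te[:, mask], y_te)


# (1+1) EA: keep a single parent mask, mutate by flipping each bit with
# probability 1/n, and accept the child if it is at least as good.
mask = rng.random(n_features) < 0.5
best_err = fitness(mask)
for _ in range(300):
    child = mask.copy()
    flips = rng.random(n_features) < (1.0 / n_features)
    child[flips] = ~child[flips]
    err = fitness(child)
    if err <= best_err:
        mask, best_err = child, err

full_err = fitness(np.ones(n_features, dtype=bool))
print(f"all features: {full_err:.3f}, selected subset: {best_err:.3f}")
```

The point of the sketch is the division of labour: the expensive deep models are trained once, while the cheap shallow classifier and the feature-subset search can be rerun for every new listener-defined category.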
Citation
Vatolkin, I., Gotham, M., López, N. N., & Ostermann, F. (2023, April). Musical Genre Recognition Based on Deep Descriptors of Harmony, Instrumentation, and Segments. Presented at EvoMUSART 2023: Artificial Intelligence in Music, Sound, Art and Design, Brno, Czech Republic.
| Field | Value |
| --- | --- |
| Presentation Conference Type | Conference Paper (published) |
| Conference Name | EvoMUSART 2023: Artificial Intelligence in Music, Sound, Art and Design |
| Start Date | Apr 12, 2023 |
| End Date | Apr 14, 2023 |
| Acceptance Date | Feb 1, 2023 |
| Online Publication Date | Apr 1, 2023 |
| Publication Date | 2023 |
| Deposit Date | Feb 15, 2024 |
| Print ISSN | 0302-9743 |
| Publisher | Springer |
| Volume | 13988 |
| Pages | 413-427 |
| Series Title | Lecture Notes in Computer Science |
| Book Title | Artificial Intelligence in Music, Sound, Art and Design |
| ISBN | 9783031299568 |
| DOI | https://doi.org/10.1007/978-3-031-29956-8_27 |
| Public URL | https://durham-repository.worktribe.com/output/2256006 |