Hindustani raga and singer classification using 2D and 3D pose estimation from video recordings

Clayton, Martin; Li, Jin; Clarke, Alison; Weinzierl, Marion

Authors

Martin Clayton

Jin Li

Alison Clarke

Marion Weinzierl



Abstract

Using pose estimation with video recordings, we apply an action recognition machine learning algorithm to demonstrate the use of the movement information to classify singers and the ragas (melodic modes) they perform. Movement information is derived from a specially recorded video dataset of solo Hindustani (North Indian) raga recordings by three professional singers each performing the same nine ragas, a smaller duo dataset (one singer with tabla accompaniment) as well as recordings of concert performances by the same singers. Data is extracted using pose estimation algorithms, both 2D (OpenPose) and 3D. A two-pathway convolutional neural network structure is proposed for skeleton action recognition to train a model to classify 12-second clips by singer and raga. The model is capable of distinguishing the three singers on the basis of movement information alone. For each singer, it is capable of distinguishing between the nine ragas with a mean accuracy of 38.2% (with the most successful model). The model trained on solo recordings also proved effective at classifying duo and concert recordings. These findings are consistent with the view that while the gesturing of Indian singers is idiosyncratic, it remains tightly linked to patterns of melodic movement: indeed we show that in some cases different ragas are distinguishable on the basis of movement information alone. A series of technical challenges are identified and addressed, with code shared alongside audiovisual data to accompany the paper.
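The abstract describes classifying 12-second clips of pose data. As a rough illustration only, the clip segmentation step could be sketched as below. This is not the authors' code: the function name `split_into_clips`, the 25 fps frame rate, the 25-joint layout, and the use of non-overlapping windows are all assumptions made for the example.

```python
def split_into_clips(frames, fps=25.0, clip_seconds=12.0):
    """Split a per-frame pose sequence into non-overlapping fixed-length clips.

    `frames` is a list with one entry per video frame (for example, a list of
    (x, y) keypoints). Trailing frames that do not fill a whole clip are dropped.
    The default fps is an assumption, not a value taken from the paper.
    """
    frames_per_clip = int(round(fps * clip_seconds))
    if frames_per_clip <= 0:
        raise ValueError("fps * clip_seconds must give at least one frame")
    n_clips = len(frames) // frames_per_clip
    return [
        frames[i * frames_per_clip:(i + 1) * frames_per_clip]
        for i in range(n_clips)
    ]


# Hypothetical input: 60 seconds of 2D keypoints, 25 joints per frame, at 25 fps.
demo_frames = [[(0.0, 0.0)] * 25 for _ in range(1500)]
clips = split_into_clips(demo_frames)
print(len(clips), len(clips[0]))
```

Each resulting clip would then be labelled with its singer and raga before being passed to a classifier. Overlapping windows would be a reasonable alternative if more training examples were needed.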

Citation

Clayton, M., Li, J., Clarke, A., & Weinzierl, M. (online). Hindustani raga and singer classification using 2D and 3D pose estimation from video recordings. Journal of New Music Research, 1-16. https://doi.org/10.1080/09298215.2024.2331788

Journal Article Type: Article
Acceptance Date: Mar 10, 2024
Online Publication Date: Apr 3, 2024
Deposit Date: Apr 26, 2024
Publicly Available Date: Apr 29, 2024
Journal: Journal of New Music Research
Print ISSN: 0929-8215
Electronic ISSN: 1744-5027
Publisher: Taylor and Francis Group
Peer Reviewed: Peer Reviewed
Pages: 1-16
DOI: https://doi.org/10.1080/09298215.2024.2331788
Public URL: https://durham-repository.worktribe.com/output/2397628
