Semi-supervised Speech Confidence Detection using Pseudo-labelling and Whisper Embeddings

Wynn, Adam; Wang, Jingyun; Tan, Xiangyu

doi:10.1007/978-3-031-98465-5_34

Authors

Adam Wynn adam.t.wynn@durham.ac.uk
PGR Student Doctor of Philosophy

Dr Jingyun Wang jingyun.wang@durham.ac.uk
Assistant Professor

Xiangyu Tan

Abstract

Understanding speaker confidence is crucial in educational settings, as it can enhance personalised feedback and improve learning outcomes. This study introduces a novel framework for detecting speaker confidence by integrating human-engineered features with embeddings from the Whisper encoder. To address data limitations, a pseudo-labelling technique is employed to expand the labelled dataset, allowing the model to learn from both human-annotated and model-generated labels. The framework combines traditional speech features (pitch, volume, rate of speech, and the presence of disfluencies and stress) with Whisper embeddings, and uses a co-attention mechanism to fuse these representations. It achieves an overall accuracy of 75%. This study contributes to advancing speech analysis, enabling applications that support personalised learning and speaking skill development.
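
The pseudo-labelling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how model-generated labels are selected, so the confidence threshold used here is an assumption (a common variant of pseudo-labelling), and the function names `select_pseudo_labels` and `expand_labelled_set` are hypothetical, not taken from the paper.

```python
def select_pseudo_labels(probs, threshold=0.9):
    """Pick unlabelled samples the model is confident about.

    probs: one list of class probabilities per unlabelled sample.
    threshold: assumed cut-off; the paper's actual rule is not given here.
    Returns a list of (sample_index, predicted_class) pairs.
    """
    selected = []
    for index, row in enumerate(probs):
        top_prob = max(row)
        if top_prob >= threshold:
            # Keep the sample and use its most probable class as the label.
            selected.append((index, row.index(top_prob)))
    return selected


def expand_labelled_set(labelled, unlabelled, probs, threshold=0.9):
    """Append confidently pseudo-labelled samples to the labelled set.

    labelled: list of (sample, label) pairs annotated by humans.
    unlabelled: list of samples, aligned with the rows of probs.
    """
    pseudo = [
        (unlabelled[index], label)
        for index, label in select_pseudo_labels(probs, threshold)
    ]
    return list(labelled) + pseudo
```

In a loop of this kind, a model trained on the human-annotated data would score the unlabelled recordings, the confident predictions would be added as labels, and the model would be retrained on the expanded set.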

Citation

Wynn, A., Wang, J., & Tan, X. (2025, July). Semi-supervised Speech Confidence Detection using Pseudo-labelling and Whisper Embeddings. Presented at The 26th International Conference on Artificial Intelligence in Education, Palermo, Italy

Presentation Conference Type	Conference Paper (published)
Conference Name	The 26th International Conference on Artificial Intelligence in Education
Start Date	Jul 22, 2025
End Date	Jul 26, 2025
Acceptance Date	Apr 4, 2025
Online Publication Date	Feb 20, 2025
Publication Date	Feb 20, 2025
Deposit Date	Jul 4, 2025
Peer Reviewed	Peer Reviewed
Book Title	Artificial Intelligence in Education
DOI	https://doi.org/10.1007/978-3-031-98465-5_34
Public URL	https://durham-repository.worktribe.com/output/4253149

