Adam Wynn adam.t.wynn@durham.ac.uk
PGR Student Doctor of Philosophy
Semi-supervised Speech Confidence Detection using Pseudo-labelling and Whisper Embeddings
Wynn, Adam; Wang, Jingyun; Tan, Xiangyu
Authors
Dr Jingyun Wang jingyun.wang@durham.ac.uk
Assistant Professor
Xiangyu Tan
Abstract
Understanding speaker confidence is crucial in educational settings, as it can enhance personalised feedback and improve learning outcomes. This study introduces a novel framework for detecting speaker confidence by integrating human-engineered features with embeddings from the Whisper encoder. To address data limitations, a pseudo-labelling technique is employed to expand the labelled dataset, allowing the model to learn from both human-annotated and model-generated labels. The framework combines traditional speech features (pitch, volume, rate of speech, and the presence of disfluencies and stress) with Whisper embeddings, using a co-attention mechanism to fuse the two representations. It achieves an overall accuracy of 75%. This study contributes to advancing speech analysis, enabling applications that support personalised learning and speaking skill development.
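As a rough illustration of the pseudo-labelling idea mentioned in the abstract, the sketch below keeps only those model predictions on unlabelled clips whose top class probability clears a confidence threshold. It is a minimal sketch under stated assumptions and not the authors' implementation: the function name `select_pseudo_labels`, the threshold of 0.9, the two-class setup, and the example probabilities are all hypothetical.

```python
from typing import List, Sequence, Tuple


def select_pseudo_labels(
    probabilities: Sequence[Sequence[float]],
    threshold: float = 0.9,  # assumed cut-off, not taken from the paper
) -> List[Tuple[int, int]]:
    """Return (sample_index, predicted_class) for confident predictions."""
    selected = []
    for index, class_probs in enumerate(probabilities):
        # Pick the class with the highest predicted probability.
        best_class = max(range(len(class_probs)), key=lambda c: class_probs[c])
        # Keep the prediction only if the model is sufficiently sure.
        if class_probs[best_class] >= threshold:
            selected.append((index, best_class))
    return selected


# Hypothetical model outputs for three unlabelled clips
# (assumed columns: class 0 = not confident, class 1 = confident).
unlabelled_probs = [
    [0.05, 0.95],
    [0.55, 0.45],
    [0.92, 0.08],
]
pseudo_labels = select_pseudo_labels(unlabelled_probs, threshold=0.9)
print(pseudo_labels)  # [(0, 1), (2, 0)]
```

In a semi-supervised loop of this kind, the selected pairs would typically be added to the human-annotated set before retraining, while low-confidence clips such as the second one stay unlabelled.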
Citation
Wynn, A., Wang, J., & Tan, X. (2025, July). Semi-supervised Speech Confidence Detection using Pseudo-labelling and Whisper Embeddings. Presented at The 26th International Conference on Artificial Intelligence in Education, Palermo, Italy.
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | The 26th International Conference on Artificial Intelligence in Education |
Start Date | Jul 22, 2025 |
End Date | Jul 26, 2025 |
Acceptance Date | Apr 4, 2025 |
Online Publication Date | Feb 20, 2025 |
Publication Date | Feb 20, 2025 |
Deposit Date | Jul 4, 2025 |
Peer Reviewed | Peer Reviewed |
Book Title | Artificial Intelligence in Education |
DOI | https://doi.org/10.1007/978-3-031-98465-5_34 |
Public URL | https://durham-repository.worktribe.com/output/4253149 |