Robust Statistical Methods for Rapid Data Labelling
Godwin, J.; Matthews, P.C.
Authors
J. Godwin
Dr Peter Matthews p.c.matthews@durham.ac.uk
Associate Professor
Contributors
V. Bhatnagar
Editor
Abstract
Labelling data is an expensive, labour-intensive, and time-consuming process; as a result, vast quantities of data go unexploited when analysis is performed through data mining. This chapter presents a new paradigm that uses robust multivariate statistical methods to encapsulate normal operational behaviour (rather than failure behaviour) and so autonomously derive unsupervised classifier labels for previously collected data in a rapid, cost-effective manner. This enables traditional machine learning to take place on a much richer dataset. Two case studies from the mechanical engineering domain are presented: a wind turbine gearbox and a rolling element bearing. The chapter contributes a statistically sound and robust methodology for rapid data labelling that enables traditional data mining techniques. Model development is detailed, along with a comparative evaluation of the metrics. Robust derivatives are presented and their superiority is shown. Example “R” code is given in the appendix, allowing readers to employ the techniques discussed. The derived statistical approaches show high levels of agreement with the underlying condition of the components, demonstrating the practical benefit of this approach.
Citation
Godwin, J., & Matthews, P. (2014). Robust Statistical Methods for Rapid Data Labelling. In V. Bhatnagar (Ed.), Data mining and analysis in the engineering field (107-141). IGI Global. https://doi.org/10.4018/978-1-4666-6086-1.ch007
Field | Value |
---|---|
Publication Date | May 1, 2014 |
Deposit Date | Sep 28, 2015 |
Publisher | IGI Global |
Pages | 107-141 |
Book Title | Data mining and analysis in the engineering field. |
Chapter Number | 7 |
ISBN | 9781466660861 |
DOI | https://doi.org/10.4018/978-1-4666-6086-1.ch007 |
Public URL | https://durham-repository.worktribe.com/output/1668898 |
Contract Date | Sep 28, 2015 |