Robust Statistical Methods for Empirical Software Engineering
Kitchenham, Barbara; Madeyski, Lech; Budgen, David; Keung, Jacky; Brereton, Pearl; Charters, Stuart; Gibbs, Shirley; Pohthong, Amnart
Authors
Barbara Kitchenham
Lech Madeyski
David Budgen david.budgen@durham.ac.uk
Emeritus Professor
Jacky Keung
Pearl Brereton
Stuart Charters
Shirley Gibbs
Amnart Pohthong
Abstract
There have been many changes in statistical theory in the past 30 years, including increased evidence that non-robust methods may fail to detect important results. The statistical advice available to software engineering researchers needs to be updated to address these issues. This paper aims both to explain the new results in the area of robust analysis methods and to provide a large-scale worked example of the new methods. We summarise the results of analyses of the Type 1 error, efficiency and power of standard parametric and non-parametric statistical tests when applied to non-normal data sets. We identify parametric and non-parametric methods that are robust to non-normality. We present an analysis of a large-scale software engineering experiment to illustrate their use. We illustrate the use of kernel density plots, and parametric and non-parametric methods using four different software engineering data sets. We explain why the methods are necessary and the rationale for selecting a specific analysis. We suggest using kernel density plots rather than box plots to visualise data distributions. For parametric analysis, we recommend trimmed means, which can support reliable tests of the differences between the central location of two or more samples. When the distribution of the data differs among groups, or we have ordinal scale data, we recommend non-parametric methods such as Cliff’s δ or a robust rank-based ANOVA-like method.
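As a rough illustration of two of the statistics the abstract recommends, the sketch below computes a trimmed mean and Cliff’s δ in plain Python. It is not the authors' code: the function names, the 20% default trimming proportion, and the sample values are assumptions made up for this example, and it computes point estimates only, not the significance tests or confidence intervals discussed in the paper.

```python
from itertools import product


def trimmed_mean(values, proportion=0.2):
    """Mean of the values left after dropping `proportion` of the
    sorted observations from each tail (0.2 is an assumed default)."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    cut = int(proportion * len(ordered))  # observations removed per tail
    kept = ordered[cut:len(ordered) - cut]
    return sum(kept) / len(kept)


def cliffs_delta(xs, ys):
    """Cliff's delta: the proportion of cross-group pairs with x > y
    minus the proportion with x < y. Ranges from -1 to 1; ties count as 0."""
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    less = sum(1 for x, y in product(xs, ys) if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Hypothetical task-completion times (minutes) for two groups;
# group_a contains one extreme value that inflates the ordinary mean.
group_a = [12, 14, 15, 15, 16, 18, 19, 21, 22, 95]
group_b = [10, 11, 12, 13, 13, 14, 15, 16, 17, 18]

print("mean A:", sum(group_a) / len(group_a))
print("20% trimmed mean A:", trimmed_mean(group_a))
print("Cliff's delta (A vs B):", cliffs_delta(group_a, group_b))
```

The trimmed mean ignores the single extreme value in the first group, while Cliff’s δ depends only on the ordering of observations across groups, which is why it remains usable for ordinal data.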
Citation
Kitchenham, B., Madeyski, L., Budgen, D., Keung, J., Brereton, P., Charters, S., …Pohthong, A. (2016). Robust Statistical Methods for Empirical Software Engineering. Empirical Software Engineering, 22(2), 579-630. https://doi.org/10.1007/s10664-016-9437-5
Journal Article Type | Article |
---|---|
Acceptance Date | May 4, 2016 |
Online Publication Date | Jun 16, 2016 |
Publication Date | Jun 16, 2016 |
Deposit Date | May 5, 2016 |
Publicly Available Date | May 6, 2016 |
Journal | Empirical Software Engineering |
Print ISSN | 1382-3256 |
Electronic ISSN | 1573-7616 |
Publisher | Springer |
Peer Reviewed | Peer Reviewed |
Volume | 22 |
Issue | 2 |
Pages | 579-630 |
DOI | https://doi.org/10.1007/s10664-016-9437-5 |
Public URL | https://durham-repository.worktribe.com/output/1413110 |
Files
Published Journal Article
(2.6 Mb)
PDF
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/
Published Journal Article (Advance online version)
(2.6 Mb)
PDF
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/
Copyright Statement
Advance online version
Accepted Journal Article
(2.2 Mb)
PDF
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/
Copyright Statement
© The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.