Dr Jonathan Cumming j.a.cumming@durham.ac.uk
Associate Professor
For many large-scale datasets it is necessary to reduce dimensionality to the point where further exploration and analysis can take place. Principal variables are a subset of the original variables and preserve, to some extent, the structure and information carried by the original variables. Dimension reduction using principal variables is considered and a novel algorithm for determining such principal variables is proposed. This method is tested and compared with 11 other variable selection methods from the literature in a simulation study and is shown to be highly effective. Extensions to this procedure are also developed, including a method to determine longitudinal principal variables for repeated measures data, and a technique for incorporating utilities in order to modify the selection process. The method is further illustrated with real datasets, including some larger UK data relating to patient outcome after total knee replacement.
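The idea of choosing a subset of the original variables that retains most of their joint structure can be sketched with a simple greedy forward selection on partial covariances. This is only an illustration under stated assumptions, not the algorithm proposed in the paper, which is not reproduced on this record page. The function names `correlation_matrix` and `greedy_principal_variables`, the trace-reduction score, and the toy data are all hypothetical choices made for this sketch. It also assumes that no column is constant.

```python
import random


def correlation_matrix(rows):
    """Sample correlation matrix for a list of equal-length observation rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    sd = [cov[j][j] ** 0.5 for j in range(p)]  # assumes no constant column
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(p)] for i in range(p)]


def greedy_principal_variables(rows, k, tol=1e-12):
    """Greedily pick up to k column indices (illustrative criterion only).

    At each step the variable whose removal most reduces the trace of the
    current partial covariance matrix is selected, and its linear effect is
    then partialled out of all remaining variables.
    """
    S = correlation_matrix(rows)
    p = len(S)
    selected = []
    for _ in range(k):
        best, best_score = None, float("-inf")
        for j in range(p):
            # Skip chosen variables and those with no residual variance left.
            if j in selected or S[j][j] <= tol:
                continue
            # Trace reduction from conditioning on variable j.
            score = sum(S[i][j] ** 2 for i in range(p)) / S[j][j]
            if score > best_score:
                best, best_score = j, score
        if best is None:
            break  # nothing left to explain
        selected.append(best)
        # Partial covariance update: S <- S - s s^T / s_bb.
        s = [S[i][best] for i in range(p)]
        S = [[S[i][j] - s[i] * s[j] / s[best] for j in range(p)]
             for i in range(p)]
    return selected


# Toy data: columns 0 and 1 are near-duplicates, column 2 is independent.
random.seed(0)
data = []
for _ in range(500):
    a, b = random.gauss(0, 1), random.gauss(0, 1)
    data.append([a, a + 0.01 * random.gauss(0, 1), b])

chosen = greedy_principal_variables(data, 2)
```

In this toy setting the sketch should keep one of the two near-duplicate columns and then the independent column, since the second duplicate carries almost no residual variance once the first has been partialled out.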
Cumming, J., & Wooff, D. (2007). Dimension reduction via principal variables. Computational Statistics & Data Analysis, 52(1), 550-565. https://doi.org/10.1016/j.csda.2007.02.012
Journal Article Type | Article |
---|---|
Publication Date | Sep 15, 2007 |
Deposit Date | Feb 15, 2008 |
Publicly Available Date | Aug 8, 2016 |
Journal | Computational Statistics & Data Analysis |
Print ISSN | 0167-9473 |
Electronic ISSN | 1872-7352 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 52 |
Issue | 1 |
Pages | 550-565 |
DOI | https://doi.org/10.1016/j.csda.2007.02.012 |
Keywords | Variable selection, Principal components, Partial correlation, Partial covariance, Utility. |
Public URL | https://durham-repository.worktribe.com/output/1571476 |
Accepted Journal Article (PDF, 270 KB)
Copyright Statement
NOTICE: this is the author’s version of a work that was accepted for publication in Computational Statistics & Data Analysis. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Computational Statistics & Data Analysis, 52, 15 September 2007, 10.1016/j.csda.2007.02.012.