Dr Konstantinos Perrakis konstantinos.perrakis@durham.ac.uk
Assistant Professor
Thomas Lartigue
Frank Dondelinger
Sach Mukherjee
Regularized regression models are well studied and, under appropriate conditions, offer fast and statistically interpretable results. However, large data in many applications are heterogeneous in the sense of harboring distributional differences between latent groups. Then, the assumption that the conditional distribution of response Y given features X is the same for all samples may not hold. Furthermore, in scientific applications, the covariance structure of the features may contain important signals and its learning is also affected by latent group structure. We propose a class of mixture models for paired data (X, Y) that couples together the distribution of X (using sparse graphical models) and the conditional Y | X (using sparse regression models). The regression and graphical models are specific to the latent groups and model parameters are estimated jointly. This allows signals in either or both of the feature distribution and regression model to inform learning of latent structure and provides automatic control of confounding by such structure. Estimation is handled via an expectation-maximization algorithm, whose convergence is established theoretically. We illustrate the key ideas via empirical examples. An R package is available at https://github.com/k-perrakis/regjmix.
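As an illustration of the coupling described in the abstract, the following is a minimal sketch of the E-step of an expectation-maximization algorithm for a joint mixture over (X, Y). It is not the authors' implementation and does not use the regjmix package. It assumes, for illustration only, Gaussian group-specific distributions for X and Gaussian linear regressions for Y | X; the function names `log_gauss` and `responsibilities` and all parameter names are hypothetical. The regularized M-step (sparse regression and sparse graphical model updates) is not shown.

```python
import numpy as np

def log_gauss(X, mean, cov):
    """Row-wise log density of a multivariate normal N(mean, cov)."""
    p = X.shape[1]
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    # Squared Mahalanobis distance of each row from the mean.
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (p * np.log(2 * np.pi) + logdet + maha)

def responsibilities(X, y, weights, means, covs, intercepts, coefs, noise_vars):
    """E-step: posterior probability of each latent group for each sample.

    Assumed (illustrative) model for group k:
        X     ~ N(means[k], covs[k])
        Y | X ~ N(intercepts[k] + X @ coefs[k], noise_vars[k])
    """
    n, K = X.shape[0], len(weights)
    log_r = np.empty((n, K))
    for k in range(K):
        resid = y - intercepts[k] - X @ coefs[k]
        log_y = -0.5 * (np.log(2 * np.pi * noise_vars[k]) + resid**2 / noise_vars[k])
        # Both the feature density and the regression density enter the score.
        log_r[:, k] = np.log(weights[k]) + log_gauss(X, means[k], covs[k]) + log_y
    # Normalize in log space for numerical stability.
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    return r / r.sum(axis=1, keepdims=True)
```

Because each group's score adds the log density of X to the log density of Y | X, a difference between groups in either the feature distribution or the regression relationship can separate the samples, which is the behavior the abstract describes.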
Perrakis, K., Lartigue, T., Dondelinger, F., & Mukherjee, S. (2023). Regularized joint mixture models. Journal of Machine Learning Research, 24, 1-47
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Nov 10, 2022 |
Online Publication Date | Jan 1, 2023 |
Publication Date | 2023 |
Deposit Date | Nov 21, 2022 |
Publicly Available Date | May 2, 2023 |
Journal | Journal of Machine Learning Research |
Print ISSN | 1532-4435 |
Electronic ISSN | 1533-7928 |
Publisher | Journal of Machine Learning Research |
Peer Reviewed | Peer Reviewed |
Volume | 24 |
Article Number | 19 |
Pages | 1-47 |
Public URL | https://durham-repository.worktribe.com/output/1186170 |
Publisher URL | https://jmlr.org/papers/v24/ |
Related Public URLs | https://arxiv.org/pdf/1908.07869.pdf |
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/
Copyright Statement
© 2023 Konstantinos Perrakis, Thomas Lartigue, Frank Dondelinger and Sach Mukherjee.
License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v24/21-0796.html.