Homogeneity pursuit and variable selection in regression models for multivariate abundance data.
Hui, Francis K C; Maestrini, Luca; Welsh, Alan H.
Affiliations
  • Hui FKC; Research School of Finance, Actuarial Studies and Statistics, Australian National University, Canberra, ACT 2601, Australia.
  • Maestrini L; Research School of Finance, Actuarial Studies and Statistics, Australian National University, Canberra, ACT 2601, Australia.
  • Welsh AH; Research School of Finance, Actuarial Studies and Statistics, Australian National University, Canberra, ACT 2601, Australia.
Biometrics; 80(1), 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38364807
ABSTRACT
When building regression models for multivariate abundance data in ecology, it is important to allow for the fact that the species are correlated with each other. Moreover, there is often evidence that species exhibit some degree of homogeneity in their responses to each environmental predictor, and that most species are informed by only a subset of predictors. We propose a generalized estimating equation (GEE) approach for simultaneous homogeneity pursuit (i.e., grouping species with similar coefficient values while allowing differing groups for different covariates) and variable selection in regression models for multivariate abundance data. Using GEEs allows us to straightforwardly account for between-response correlations through a (reduced-rank) working correlation matrix. We augment the GEE with both adaptive fused lasso-type and adaptive lasso-type penalties, which aim, respectively, to cluster the species-specific coefficients within each covariate and to encourage differing levels of sparsity across the covariates. Numerical studies demonstrate the strong finite-sample performance of the proposed method relative to several existing approaches for modeling multivariate abundance data. Applying the proposed method to presence-absence records collected along the Great Barrier Reef in Australia reveals a substantial degree of both homogeneity and sparsity in species-environment relationships. We show that this leads to a more parsimonious model for understanding the environmental drivers of seabed biodiversity, and results in stronger out-of-sample predictive performance relative to methods that do not accommodate such features.
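The abstract does not state the penalty formula, so the following is only a minimal sketch of how a combined adaptive fused lasso and adaptive lasso penalty could be evaluated, assuming a generic pairwise form with weights built from an initial unpenalized estimate. The function name `combined_penalty`, the arguments `lam_fuse`, `lam_sparse`, `gamma`, and `eps`, and the pairwise weighting scheme are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def combined_penalty(beta, beta_init, lam_fuse, lam_sparse, gamma=1.0, eps=1e-8):
    """Evaluate an illustrative adaptive fused lasso + adaptive lasso penalty.

    beta, beta_init: arrays of shape (n_species, n_covariates); beta_init is
    an assumed initial (unpenalized) estimate used to build adaptive weights.
    All names and the pairwise form are assumptions made for this sketch.
    """
    beta = np.asarray(beta, dtype=float)
    beta_init = np.asarray(beta_init, dtype=float)
    n_species, n_covariates = beta.shape
    # Index pairs (j, j') with j < j' over species.
    rows, cols = np.triu_indices(n_species, k=1)

    total = 0.0
    for k in range(n_covariates):
        b, b0 = beta[:, k], beta_init[:, k]
        # Fusion term: shrinks differences between species' coefficients
        # for covariate k, so similar species can share one value.
        w_fuse = 1.0 / (np.abs(b0[rows] - b0[cols]) + eps) ** gamma
        fuse = np.sum(w_fuse * np.abs(b[rows] - b[cols]))
        # Sparsity term: shrinks individual coefficients toward zero.
        w_sparse = 1.0 / (np.abs(b0) + eps) ** gamma
        sparse = np.sum(w_sparse * np.abs(b))
        total += lam_fuse * fuse + lam_sparse * sparse
    return total
```

In this sketch, a larger `lam_fuse` pushes species toward shared coefficient values within each covariate, while a larger `lam_sparse` pushes individual coefficients to zero; in a full method such a penalty would be attached to the estimating equations rather than evaluated on its own.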
Full text: 1 | Collections: 01-international | Database: MEDLINE | Language: English | Journal: Biometrics | Publication year: 2024 | Document type: Article | Affiliation country: Australia