Addressing dispersion in mis-measured multivariate binomial outcomes: A novel statistical approach for detecting differentially methylated regions in bisulfite sequencing data.
Zhao, Kaiqiong; Oualkacha, Karim; Zeng, Yixiao; Shen, Cathy; Klein, Kathleen; Lakhal-Chaieb, Lajmi; Labbe, Aurélie; Pastinen, Tomi; Hudson, Marie; Colmegna, Inés; Bernatsky, Sasha; Greenwood, Celia M T.
Affiliation
  • Zhao K; Department of Mathematics and Statistics, York University, Toronto, Ontario, Canada.
  • Oualkacha K; Département de Mathématiques, Université du Québec à Montréal, Montreal, Quebec, Canada.
  • Zeng Y; Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada.
  • Shen C; Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada.
  • Klein K; Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada.
  • Lakhal-Chaieb L; Département de Mathématiques et de Statistique, Université Laval, Quebec, Quebec, Canada.
  • Labbe A; Département de Sciences de la Décision, HEC Montréal, Montreal, Quebec, Canada.
  • Pastinen T; Genomic Medicine Center, Children's Mercy, Independence, Missouri, USA.
  • Hudson M; Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada.
  • Colmegna I; Department of Medicine, McGill University, Montreal, Quebec, Canada.
  • Bernatsky S; Department of Medicine, McGill University, Montreal, Quebec, Canada.
  • Greenwood CMT; The Research Institute of the McGill University Health Centre, Montreal, Quebec, Canada.
Stat Med; 2024 Jun 26.
Article in English | MEDLINE | ID: mdl-38932470
ABSTRACT
Motivated by a DNA methylation application, this article addresses the problem of fitting, and drawing inference from, a multivariate binomial regression model for outcomes that are contaminated by errors and exhibit extra-parametric variation, also known as dispersion. While dispersion in univariate binomial regression has been extensively studied, addressing dispersion in the context of multivariate outcomes remains a complex and relatively unexplored task. The complexity arises from a noteworthy characteristic of our motivating dataset: non-constant yet correlated dispersion across outcomes. To address this challenge and account for possible measurement error, we propose a novel hierarchical quasi-binomial varying coefficient mixed model, which enables flexible dispersion patterns through a combination of additive and multiplicative dispersion components. To maximize the Laplace-approximated quasi-likelihood of our model, we further develop a specialized two-stage expectation-maximization (EM) algorithm, in which a plug-in estimate for the multiplicative scale parameter enhances the speed and stability of the EM iterations. Simulations demonstrated that our approach yields accurate inference for smooth covariate effects and exhibits excellent power in detecting non-zero effects. Additionally, we applied our proposed method to investigate the association between DNA methylation, measured across the genome through targeted custom capture sequencing of whole blood, and levels of anti-citrullinated protein antibodies (ACPA), a preclinical marker for rheumatoid arthritis (RA) risk. Our analysis revealed 23 significant genes that potentially contribute to ACPA-related differential methylation, highlighting the relevance of cell signaling and collagen metabolism in RA. We implemented our method in the R Bioconductor package "SOMNiBUS".
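As a rough illustration of what a multiplicative dispersion component means for binomial counts, the sketch below computes the textbook Pearson dispersion estimate for an intercept-only binomial model. It is not the authors' hierarchical model, their EM algorithm, or the SOMNiBUS API; the function name `pearson_dispersion` and the toy read counts are assumptions made up for this example.

```python
def pearson_dispersion(methylated, coverage):
    """Pearson estimate of the multiplicative dispersion (phi) for an
    intercept-only binomial model. Hypothetical helper for illustration;
    it is not part of SOMNiBUS.

    methylated: methylated read counts, one per sample
    coverage:   total read counts (binomial sizes), one per sample
    """
    if len(methylated) != len(coverage) or len(coverage) < 2:
        raise ValueError("need two or more samples with matching lengths")

    # Pooled methylation proportion: the fitted mean under an intercept-only model.
    p_hat = sum(methylated) / sum(coverage)
    if not 0.0 < p_hat < 1.0:
        raise ValueError("pooled proportion must lie strictly between 0 and 1")

    # Pearson chi-square: squared residuals scaled by the binomial variance.
    chi2 = sum(
        (y - n * p_hat) ** 2 / (n * p_hat * (1.0 - p_hat))
        for y, n in zip(methylated, coverage)
    )

    # Divide by residual degrees of freedom (samples minus one fitted parameter).
    return chi2 / (len(coverage) - 1)


# Toy data (assumed): counts far more variable than a binomial would allow.
phi = pearson_dispersion([0, 10, 0, 10], [10, 10, 10, 10])
```

Under a plain binomial model the variance of a count is n·p·(1−p); a quasi-binomial model inflates it to φ·n·p·(1−p). An estimate of φ well above 1, as in the toy data here, signals overdispersion.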
Full text: 1 Collection: 01-international Database: MEDLINE Language: En Journal: Stat Med Year: 2024 Document type: Article Country of affiliation: Canada