Statistical method for modeling sequencing data from different technologies in longitudinal studies with application to Huntington disease.
Fuady, Angga M; van Roon-Mom, Willeke M C; Kielbasa, Szymon M; Uh, Hae-Won; Houwing-Duistermaat, Jeanine J.
Affiliation
  • Fuady AM; Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands.
  • van Roon-Mom WMC; Department of Biostatistics and Research Support, Div. Julius Centrum, University Medical Center Utrecht, Utrecht, the Netherlands.
  • Kielbasa SM; Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands.
  • Uh HW; Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands.
  • Houwing-Duistermaat JJ; Department of Biostatistics and Research Support, Div. Julius Centrum, University Medical Center Utrecht, Utrecht, the Netherlands.
Biom J; 63(4): 745-760, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33350510
ABSTRACT
Advancement of gene expression measurements in longitudinal studies enables the identification of genes associated with disease severity over time. However, problems arise when the technology used to measure gene expression differs between time points. Observed differences between the results obtained at different time points can be caused by technical differences. Modeling the two measurements jointly over time might provide insight into the causes of these different results. Our work is motivated by a study of gene expression data of blood samples from Huntington disease patients, which were obtained using two different sequencing technologies. At time point 1, DeepSAGE technology was used to measure the gene expression, with a subsample also measured using RNA-Seq technology. At time point 2, all samples were measured using RNA-Seq technology. Significant associations between gene expression measured by DeepSAGE and disease severity using data from the first time point could not be replicated by the RNA-Seq data from the second time point. We modeled the relationship between the two sequencing technologies using the data from the overlapping samples. We used linear mixed models with either DeepSAGE or RNA-Seq measurements as the dependent variable and disease severity as the independent variable. In conclusion, (1) for one out of 14 genes, the initial significant result could be replicated with both technologies using data from both time points; (2) statistical efficiency is lost due to disagreement between the two technologies, measurement error when predicting gene expressions, and the need to include additional parameters to account for possible differences.

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Huntington Disease | Study type: Observational_studies / Prognostic_studies / Risk_factors_studies | Limits: Humans | Language: En | Journal: Biom J | Year: 2021 | Document type: Article | Affiliation country: Netherlands
