Shall genomic correlation structure be considered in copy number variants detection?
Qin, Fei; Luo, Xizhi; Cai, Guoshuai; Xiao, Feifei.
Affiliation
  • Qin F; Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina (USC), Discovery 449, 915 Greene St, Columbia, SC 29208, USA.
  • Luo X; Department of Epidemiology and Biostatistics, Arnold School of Public Health, USC, Discovery 449, 915 Greene St, Columbia, SC 29208, USA.
  • Cai G; Department of Environmental Health Science, Arnold School of Public Health, USC, Discovery 449, 915 Greene St, Columbia, SC 29208, USA.
  • Xiao F; Department of Epidemiology and Biostatistics, Arnold School of Public Health, USC, Discovery 449, 915 Greene St, Columbia, SC 29208, USA.
Brief Bioinform; 22(6), 2021 Nov 05.
Article in English | MEDLINE | ID: mdl-34114005
ABSTRACT
Copy number variation has been identified as a major source of genomic variation associated with disease susceptibility. With the advent of whole-exome sequencing (WES) technology, massive WES data have been generated, allowing for the identification of copy number variants (CNVs) in the protein-coding regions with direct functional interpretation. We have previously shown evidence of genomic correlation structure in array data and developed a novel chromosomal breakpoint detection algorithm, LDcnv, which showed significantly improved detection power by integrating the correlation structure in a systematic modeling manner. However, it remains unexplored whether genomic correlation exists in WES data and how integrating such correlation structure can improve CNV detection accuracy. In this study, we first explored the correlation structure of WES data using the 1000 Genomes Project data. Both raw read depth and median-normalized data showed strong evidence of correlation structure. Motivated by this finding, we proposed a correlation-based method, CORRseq, as a new release of the LDcnv algorithm for profiling WES data. The performance of CORRseq was evaluated in extensive simulation studies and in real data analysis from the 1000 Genomes Project. CORRseq outperformed the existing methods in detecting medium and large CNVs. In conclusion, modeling genomic correlation structure is advantageous when detecting relatively long CNVs. This study provides insights for the methodological development of CNV detection with next-generation sequencing (NGS) data.
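The kind of correlation check mentioned in the abstract can be illustrated with a minimal sketch. This is not the CORRseq or LDcnv implementation: the function names (`simulate_depth`, `median_normalize`, `lag_correlation`) are hypothetical, the read depths are synthetic (log-normal with an assumed autoregressive dependence between neighboring targets), and per-sample median scaling is only one plausible reading of "median-normalized". The sketch shows how one might measure whether read depth at nearby exonic targets is correlated across samples.

```python
import numpy as np


def simulate_depth(n_samples=200, n_targets=500, rho=0.6, seed=0):
    """Synthetic read depth with AR(1) dependence between neighboring targets.

    Assumption for illustration only: real WES depth is not generated this way.
    Returns an array of shape (n_samples, n_targets).
    """
    rng = np.random.default_rng(seed)
    noise = np.empty((n_samples, n_targets))
    noise[:, 0] = rng.standard_normal(n_samples)
    for j in range(1, n_targets):
        innov = rng.standard_normal(n_samples)
        # Each target inherits a fraction rho of its left neighbor's deviation.
        noise[:, j] = rho * noise[:, j - 1] + np.sqrt(1.0 - rho**2) * innov
    # Positive depths centered near 100 reads per target.
    return 100.0 * np.exp(0.1 * noise)


def median_normalize(depth):
    """Divide each sample's depths by that sample's median depth."""
    med = np.median(depth, axis=1, keepdims=True)
    return depth / med


def lag_correlation(depth, lag):
    """Average Pearson correlation, across samples, between targets `lag` apart."""
    # Standardize each target (column) across samples.
    z = depth - depth.mean(axis=0)
    z = z / z.std(axis=0)
    # Mean of products of standardized columns equals Pearson r for each pair.
    r = (z[:, :-lag] * z[:, lag:]).mean(axis=0)
    return float(r.mean())


if __name__ == "__main__":
    raw = simulate_depth()
    norm = median_normalize(raw)
    for lag in (1, 5, 50):
        print(lag, lag_correlation(raw, lag), lag_correlation(norm, lag))
```

In this synthetic setting the average correlation decays as the distance between targets grows, both before and after normalization. A pattern of that kind in real data is what would motivate modeling the correlation rather than treating targets as independent.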
Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Genetic Testing / Genetic Predisposition to Disease / Genomics / Genetic Association Studies / DNA Copy Number Variations | Study type: Diagnostic_studies / Prognostic_studies | Limit: Humans | Language: En | Journal: Brief Bioinform | Journal subject: BIOLOGY / MEDICAL INFORMATICS | Year: 2021 | Document type: Article | Affiliation country: United States
