A comprehensive framework for detecting copy number variants from single nucleotide polymorphism data: 'rCNV', a versatile r package for paralogue and CNV detection.
Karunarathne, Piyal; Zhou, Qiujie; Schliep, Klaus; Milesi, Pascal.
Affiliation
  • Karunarathne P; Plant Ecology and Evolution, Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden.
  • Zhou Q; Science for Life Laboratory (SciLifeLab), Uppsala, Sweden.
  • Schliep K; Institute of Population Genetics, Heinrich Heine University, Düsseldorf, Germany.
  • Milesi P; Plant Ecology and Evolution, Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden.
Mol Ecol Resour; 23(8): 1772-1789, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37515483
ABSTRACT
Recent studies have highlighted the significant role of copy number variants (CNVs) in phenotypic diversity, environmental adaptation and species divergence across eukaryotes. The presence of CNVs also has the potential to introduce genotyping biases, which can pose challenges to accurate population and quantitative genetic analyses. However, detecting CNVs in genomes, particularly in non-model organisms, presents a formidable challenge. To address this issue, we have developed a statistical framework and an accompanying r software package that leverage allelic-read depth from single nucleotide polymorphism (SNP) data for accurate CNV detection. Our framework capitalises on two key principles. First, it exploits the distribution of allelic-read depth ratios in heterozygotes for individual SNPs by comparing it against an expected distribution based on binomial sampling. Second, it identifies SNPs exhibiting an apparent excess of heterozygotes under Hardy-Weinberg equilibrium. By employing multiple statistical tests, our method not only enhances sensitivity to sampling effects but also effectively addresses reference biases, resulting in optimised SNP classification. Our framework is compatible with various NGS technologies (e.g. RADseq, Exome-capture). This versatility enables CNV calling from genomes of diverse complexities. To streamline the analysis process, we have implemented our framework in the user-friendly r package 'rCNV', which automates the entire workflow seamlessly. We trained our models using simulated data and validated their performance on four datasets derived from different sequencing technologies, including RADseq (Chinook salmon-Oncorhynchus tshawytscha), Rapture (American lobster-Homarus americanus), Exome-capture (Norway spruce-Picea abies) and WGS (Malaria mosquito-Anopheles gambiae).
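The two principles described in the abstract can be illustrated with a minimal Python sketch. This is not the rCNV implementation or its API: the function names, the fixed expected allelic ratio of 0.5 and the example counts are all assumptions made for illustration. The first function tests whether a heterozygote's alternative-allele read count departs from binomial sampling, and the second compares the observed number of heterozygotes with the number expected under Hardy-Weinberg proportions.

```python
from math import comb


def binom_two_sided_p(alt_reads, total_reads, expected_ratio=0.5):
    """Exact two-sided binomial p-value for one heterozygote.

    Sums the probabilities of all outcomes that are no more likely than
    the observed alternative-allele read count (hypothetical helper).
    """
    pmf = [
        comb(total_reads, i)
        * expected_ratio ** i
        * (1 - expected_ratio) ** (total_reads - i)
        for i in range(total_reads + 1)
    ]
    observed = pmf[alt_reads]
    # Small tolerance so ties are not lost to floating-point rounding.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))


def het_excess(n_ref_hom, n_het, n_alt_hom):
    """Return (observed, expected) heterozygote counts for one SNP.

    The expectation uses Hardy-Weinberg proportions, 2pq * N, with the
    allele frequency p estimated from the genotype counts.
    """
    n = n_ref_hom + n_het + n_alt_hom
    p = (2 * n_ref_hom + n_het) / (2 * n)
    expected_het = 2 * p * (1 - p) * n
    return n_het, expected_het


# A balanced heterozygote (10 of 20 reads) versus a skewed one (18 of 20).
balanced_p = binom_two_sided_p(10, 20)
skewed_p = binom_two_sided_p(18, 20)

# A SNP with 80 heterozygotes among 100 individuals (made-up counts).
observed_het, expected_het = het_excess(10, 80, 10)
```

In this toy case, the skewed read ratio and the surplus of heterozygotes are the kinds of signal that the abstract says are combined to classify a SNP as a putative CNV; the actual framework applies multiple statistical tests and handles reference bias, which this sketch does not attempt.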

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Polymorphism, Single Nucleotide / DNA Copy Number Variations | Study type: Diagnostic_studies | Limits: Animals | Country/Region as subject: Europe | Language: En | Journal: Mol Ecol Resour | Publication year: 2023 | Document type: Article | Affiliation country: Sweden
