ABSTRACT
High-throughput genotyping enables the large-scale analysis of genetic diversity in population genomics and genome-wide association studies that combine the genotypic and phenotypic characterization of large collections of accessions. Sequencing-based approaches for genotyping are progressively replacing traditional genotyping methods because of their lower ascertainment bias. However, genome-wide genotyping based on sequencing becomes expensive in species with large genomes and a high proportion of repetitive DNA. Here we describe the use of CRISPR-Cas9 technology to deplete repetitive elements in the 3.76-Gb genome of lentil (Lens culinaris), 84% of which consists of repeats, thus concentrating the sequencing data on coding and regulatory regions (single-copy regions). We designed a custom set of 566,766 gRNAs targeting 2.9 Gbp of repeats and excluding repetitive regions overlapping annotated genes and putative regulatory elements based on ATAC-seq data. The novel depletion method removed ~40% of reads mapping to repeats, increasing those mapping to single-copy regions by ~2.6-fold. When analyzing 25 million fragments, this repeat-to-single-copy shift in the sequencing data increased the number of genotyped bases by ~10-fold compared to nondepleted libraries. Under the same conditions, we were also able to identify ~12-fold more genetic variants in the single-copy regions and increased the genotyping accuracy by rescuing thousands of heterozygous variants that would otherwise be missed because of low coverage. The method performed similarly regardless of the multiplexing level, type of library or genotype, including different cultivars and a closely related species (L. orientalis). Our results showed that CRISPR-Cas9-driven repeat depletion focuses sequencing data on single-copy regions, thus improving high-density and genome-wide genotyping in large and repetitive genomes.
Subjects
CRISPR-Cas Systems , Genome-Wide Association Study , Genotype , Plant Genome , Genotyping Techniques , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods
ABSTRACT
The common bean (Phaseolus vulgaris L.) is a crucial legume crop and an ideal evolutionary model to study adaptive diversity in wild and domesticated populations. Here, we present a common bean pan-genome based on five high-quality genomes and whole-genome reads representing 339 genotypes. It reveals ~234 Mb of additional sequences containing 6,905 protein-coding genes missing from the reference, constituting 49% of all presence/absence variants (PAVs). More non-synonymous mutations are found in PAVs than in core genes, probably reflecting the lower effective population size of PAVs and fitness advantages due to the purging effect of gene loss. Our results suggest that pan-genome shrinkage occurred during wild range expansion. Selection signatures provide evidence that partial or complete gene loss was a key adaptive genetic change in common bean populations, with major implications for plant adaptation. The pan-genome is a valuable resource for food legume research and breeding for climate change mitigation and sustainable agriculture.
Subjects
Domestication , Plant Genome , Phaseolus , Phaseolus/genetics , Physiological Adaptation/genetics , Genotype , Genetic Variation , Agricultural Crops/genetics , Genetic Selection , Molecular Evolution , Mutation , Plant Breeding/methods
ABSTRACT
High-throughput chromosome conformation capture (Hi-C) is widely used for scaffolding in de novo assembly because it produces highly contiguous genomes, but its indirect statistical approach can introduce connection errors. We employed optical mapping (Bionano Genomics) as an orthogonal scaffolding technology to assess the structural solidity of Hi-C reconstructed scaffolds. Optical maps were used to assess the correctness of five de novo genome assemblies based on long-read sequencing for contig generation and Hi-C for scaffolding. Hundreds of inconsistencies were found between the reconstructions generated using the Hi-C and optical mapping approaches. Manual inspection, exploiting raw long-read sequencing data and optical maps, confirmed that several of these conflicts were derived from Hi-C joining errors. Such misjoins were widespread, involved the connection of both small and large contigs, and even overlapped annotated genes. We conclude that the integration of optical mapping data after, not before, Hi-C-based scaffolding, improves the quality of the assembly and limits reconstruction errors by highlighting misjoins that can then be subjected to further investigation.
ABSTRACT
Background: During citizen-science expeditions to the Ulu Temburong National Park, Brunei, several individuals of a semi-slug species of the genus Microparmarion were collected; based on morphology and in-the-field DNA barcoding, they were found to represent an undescribed species. New information: In this paper, we describe Microparmarion sallehi Wu, Ezzwan & Hamdani, n. sp., named after field centre supervisor Md Salleh Abdullah Bat. We provide details on the external and internal reproductive morphology, the shell and the ecology of the type locality, as well as a diagnosis comparing it with related species. DNA barcodes were generated for five individuals and used for a phylogenetic reconstruction. Microparmarion sallehi sp. n. and M. exquadratus Schilthuizen et al., 2019 are so far the only Bornean species of the genus that live in lowland forest; other species are found in montane forests.