Results 1 - 20 of 32
1.
medRxiv ; 2024 Mar 18.
Article in English | MEDLINE | ID: mdl-38562723

ABSTRACT

Comprehending the mechanism behind human diseases with an established heritable component represents the forefront of personalized medicine. Nevertheless, numerous medically important genes are inaccurately represented in short-read sequencing data analysis due to their complexity and repetitiveness or the so-called 'dark regions' of the human genome. The advent of PacBio as a long-read platform has provided new insights, yet the cost of HiFi whole-genome sequencing (WGS) is frequently prohibitive. We introduce a targeted sequencing and analysis framework, Twist Alliance Dark Genes Panel (TADGP), designed to offer phased variants across 389 medically important yet complex autosomal genes. We highlight TADGP accuracy across eleven control samples and compare it to WGS. This demonstrates that TADGP achieves variant calling accuracy comparable to HiFi-WGS data, but at a fraction of the cost. This enables scalability and broad applicability for studying rare diseases or for complementing previously sequenced samples to gain insights into these complex genes. TADGP revealed several candidate variants across all cases and provided insight into LPA diversity when tested on samples from rare disease and cardiovascular disease cohorts. In both cohorts, we identified novel variants affecting individual disease-associated genes (e.g., IKZF1, KCNE1). Nevertheless, the annotation of the variants across these 389 medically important genes remains challenging due to their underrepresentation in ClinVar and gnomAD. Consequently, we also offer an annotation resource to enhance the evaluation and prioritization of these variants. Overall, we demonstrate that TADGP offers a cost-efficient and scalable approach to routinely assess the dark regions of the human genome with clinical relevance.

2.
Nat Genet ; 55(2): 301-311, 2023 02.
Article in English | MEDLINE | ID: mdl-36658436

ABSTRACT

Ixodes spp. and related ticks transmit prevalent infections, although knowledge of their biology and development of anti-tick measures have been hindered by the lack of a high-quality genome. Here, we present the assembly of a 2.23-Gb Ixodes scapularis genome by sequencing two haplotypes within one individual, complemented by chromosome-level scaffolding and full-length RNA isoform sequencing, yielding a fully reannotated genome featuring thousands of new protein-coding genes and various RNA species. Analyses of the repetitive DNA identified transposable elements, whereas the examination of tick-associated bacterial sequences yielded an improved Rickettsia buchneri genome. We demonstrate how the Ixodes genome advances tick science by contributing to new annotations, gene models and epigenetic functions, expansion of gene families, development of in-depth proteome catalogs and deciphering of genetic variations in wild ticks. Overall, we report critical genetic resources and biological insights impacting our understanding of tick biology and future interventions against tick-transmitted infections.


Subject(s)
Ixodes , Animals , Ixodes/genetics , Ixodes/microbiology , Genome/genetics , Bacteria/genetics , Base Sequence , RNA
3.
Front Plant Sci ; 12: 720670, 2021.
Article in English | MEDLINE | ID: mdl-34567033

ABSTRACT

A defining component of agroforestry parklands across Sahelo-Sudanian Africa (SSA), the shea tree (Vitellaria paradoxa) is central to sustaining local livelihoods and the farming environments of rural communities. Despite its economic and cultural value, however, not to mention the ecological roles it plays as a dominant parkland species, shea remains semi-domesticated with virtually no history of systematic genetic improvement. In truth, shea's extended juvenile period makes traditional breeding approaches untenable; but the opportunity for genome-assisted breeding is immense, provided the foundational resources are available. Here we report the development and public release of such resources. Using the FALCON-Phase workflow, 162.6 Gb of long-read PacBio sequence data were assembled into a 658.7 Mbp, chromosome-scale reference genome annotated with 38,505 coding genes. Whole genome duplication (WGD) analysis based on this gene space revealed clear signatures of two ancient WGD events in shea's evolutionary past, one prior to the Asterid-Rosid divergence (116-126 Mya) and the other at the root of the order Ericales (65-90 Mya). In a first genome-wide look at the suite of fatty acid (FA) biosynthesis genes that likely govern stearin content, the primary determinant of shea butter quality, relatively high copy numbers of six key enzymes were found (KASI, KASIII, FATB, FAD2, FAD3, and FAX2), some likely originating in shea's more recent WGD event. To help translate these findings into practical tools for characterization, selection, and genome-wide association studies (GWAS), resequencing data from a shea diversity panel were used to develop a database of more than 3.5 million functionally annotated, physically anchored SNPs. Two smaller, more curated sets of suggested SNPs, one for GWAS (104,211 SNPs) and the other targeting FA biosynthesis genes (90 SNPs), are also presented. With these resources, the hope is to support national programs across the shea belt in the strategic, genome-enabled conservation and long-term improvement of the shea tree for SSA.

4.
Nat Commun ; 12(1): 1935, 2021 04 28.
Article in English | MEDLINE | ID: mdl-33911078

ABSTRACT

Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. To date, these assemblies have been best created with complex protocols, such as cultured cells that contain a single-haplotype (haploid) genome, single cells where haplotypes are separated, or co-sequencing of parental genomes in a trio-based approach. These approaches are impractical in most situations. To address this issue, we present FALCON-Phase, a phasing tool that uses ultra-long-range Hi-C chromatin interaction data to extend phase blocks of partially phased diploid assemblies to chromosome or scaffold scale. FALCON-Phase uses the inherent phasing information in Hi-C reads, skipping variant calling, and reduces the computational complexity of phasing. Our method is validated on three benchmark datasets generated as part of the Vertebrate Genomes Project (VGP), including human, cow, and zebra finch, for which high-quality, fully haplotype-resolved assemblies are available using the trio-based approach. FALCON-Phase is accurate without having parental data and performance is better in samples with higher heterozygosity. For cow and zebra finch the accuracy is 97% compared to 80-91% for human. FALCON-Phase is applicable to any draft assembly that contains long primary contigs and phased associate contigs.


Subject(s)
Contig Mapping/methods , Genome, Human/genetics , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Algorithms , Animals , Cattle , Haplotypes/genetics , Humans , Polymorphism, Single Nucleotide/genetics , Zebrafish/genetics
5.
Plant Genome ; 14(1): e20072, 2021 03.
Article in English | MEDLINE | ID: mdl-33605092

ABSTRACT

Hop (Humulus lupulus L. var. lupulus) is a diploid, dioecious plant with a history of cultivation spanning more than one thousand years. Hop cones are valued for their use in brewing and contain compounds of therapeutic interest including xanthohumol. Efforts to determine how biochemical pathways responsible for desirable traits are regulated have been challenged by the large (2.8 Gb), repetitive, and heterozygous genome of hop. We present a draft haplotype-phased assembly of the Cascade cultivar genome. Our draft assembly and annotation of the Cascade genome is the most extensive representation of the hop genome to date. PacBio long-read sequences from hop were assembled with FALCON and partially phased with FALCON-Unzip. Comparative analysis of haplotype sequences provides insight into selective pressures that have driven evolution in hop. We discovered genes with greater sequence divergence enriched for stress-response, growth, and flowering functions in the draft phased assembly. With improved resolution of long terminal repeat (LTR) retrotransposons due to long-read sequencing, we found that hop is over 70% repetitive. We identified a homolog of cannabidiolic acid synthase (CBDAS) that is expressed in multiple tissues. The approaches we developed to analyze the draft phased assembly serve to deepen our understanding of the genomic landscape of hop and may have broader applicability to the study of other large, complex genomes.


Subject(s)
Humulus , Diploidy , Genome, Plant , Genomics , Haplotypes , Humulus/genetics
6.
Nat Commun ; 11(1): 2071, 2020 04 29.
Article in English | MEDLINE | ID: mdl-32350247

ABSTRACT

Inbred animals were historically chosen for genome analysis to circumvent assembly issues caused by haplotype variation, but this resulted in a composite of the two genomes. Here we report a haplotype-aware scaffolding and polishing pipeline which was used to create haplotype-resolved, chromosome-level genome assemblies of Angus (taurine) and Brahman (indicine) cattle subspecies from contigs generated by the trio binning method. These assemblies reveal structural and copy number variants that differentiate the subspecies and show that variant detection is sensitive to the specific reference genome chosen. Six genes with immune-related functions have additional copies in the indicine compared with the taurine lineage, and an indicus-specific extra copy of fatty acid desaturase is under positive selection. The haplotyped genomes also enable transcripts to be phased to detect allele-specific expression. This work exemplifies the value of haplotype-resolved genomes to better explore evolutionary and functional variations.


Subject(s)
Cattle/genetics , Genetic Variation , Genome , Haplotypes/genetics , Alleles , Allelic Imbalance , Animals , Base Sequence , Chromosomes, Mammalian/genetics , Female , Genetic Loci , INDEL Mutation/genetics , Male , Molecular Sequence Annotation , Polymorphism, Single Nucleotide/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Repetitive Sequences, Nucleic Acid/genetics
7.
Gigascience ; 8(10), 2019 10 01.
Article in English | MEDLINE | ID: mdl-31609423

ABSTRACT

BACKGROUND: A high-quality reference genome is an essential tool for applied and basic research on arthropods. Long-read sequencing technologies may be used to generate more complete and contiguous genome assemblies than alternate technologies; however, long-read methods have historically had greater input DNA requirements and higher costs than next-generation sequencing, which are barriers to their use on many samples. Here, we present a 2.3 Gb de novo genome assembly of a field-collected adult female spotted lanternfly (Lycorma delicatula) using a single Pacific Biosciences SMRT Cell. The spotted lanternfly is an invasive species recently discovered in the northeastern United States that threatens to damage economically important crop plants in the region. RESULTS: The DNA from 1 individual was used to make 1 standard, size-selected library with an average DNA fragment size of ∼20 kb. The library was run on 1 Sequel II SMRT Cell 8M, generating a total of 132 Gb of long-read sequences, of which 82 Gb were from unique library molecules, representing ∼36× coverage of the genome. The assembly had high contiguity (contig N50 length = 1.5 Mb), completeness, and sequence level accuracy as estimated by conserved gene set analysis (96.8% of conserved genes both complete and without frame shift errors). Furthermore, it was possible to segregate more than half of the diploid genome into the 2 separate haplotypes. The assembly also recovered 2 microbial symbiont genomes known to be associated with L. delicatula, each microbial genome being assembled into a single contig. CONCLUSIONS: We demonstrate that field-collected arthropods can be used for the rapid generation of high-quality genome assemblies, an attractive approach for projects on emerging invasive species, disease vectors, or conservation efforts of endangered species.
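
The coverage figure quoted in the Results can be checked with simple arithmetic: fold coverage is the number of sequenced bases divided by the genome size. The sketch below is only an illustration of that calculation; the function name `fold_coverage` is hypothetical and not taken from the paper, and the genome size is approximated by the 2.3 Gb assembly size given in the abstract.

```python
def fold_coverage(sequenced_bases: float, genome_size: float) -> float:
    """Return fold coverage: sequenced bases divided by genome size (same units)."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sequenced_bases / genome_size


# Values from the abstract, in bases (1 Gb = 1e9 bases).
unique_bases = 82e9    # bases from unique library molecules
total_bases = 132e9    # all long-read bases generated
genome_size = 2.3e9    # assembly size, used here as a stand-in for genome size

unique_cov = fold_coverage(unique_bases, genome_size)
total_cov = fold_coverage(total_bases, genome_size)

print(f"unique-molecule coverage: {unique_cov:.1f}x")
print(f"total-read coverage: {total_cov:.1f}x")
```

Rounding the unique-molecule value to the nearest integer gives the approximately 36× coverage reported in the abstract.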


Subject(s)
Diptera/genetics , Genome, Insect , Genomics/methods , Animals , Female , Gene Library , Introduced Species , Sequence Analysis, DNA
8.
BMC Plant Biol ; 19(1): 319, 2019 Jul 16.
Article in English | MEDLINE | ID: mdl-31311507

ABSTRACT

BACKGROUND: Non-host resistance (NHR) presents a compelling long-term plant protection strategy for global food security, yet the genetic basis of NHR remains poorly understood. For many diseases, including stem rust of wheat [causal organism Puccinia graminis (Pg)], NHR is largely unexplored due to the inherent challenge of developing a genetically tractable system within which the resistance segregates. The present study turns to the pathogen's alternate host, barberry (Berberis spp.), to overcome this challenge. RESULTS: In this study, an interspecific mapping population derived from a cross between Pg-resistant Berberis thunbergii (Bt) and Pg-susceptible B. vulgaris was developed to investigate the Pg-NHR exhibited by Bt. To facilitate QTL analysis and subsequent trait dissection, the first genetic linkage maps for the two parental species were constructed and a chromosome-scale reference genome for Bt was assembled (PacBio + Hi-C). QTL analysis resulted in the identification of a single 13 cM region (~ 5.1 Mbp spanning 13 physical contigs) on the short arm of Bt chromosome 3. Differential gene expression analysis, combined with sequence variation analysis between the two parental species, led to the prioritization of several candidate genes within the QTL region, some of which belong to gene families previously implicated in disease resistance. CONCLUSIONS: Foundational genetic and genomic resources developed for Berberis spp. enabled the identification and annotation of a QTL associated with Pg-NHR. Although subsequent validation and fine mapping studies are needed, this study demonstrates the feasibility of and lays the groundwork for dissecting Pg-NHR in the alternate host of one of agriculture's most devastating pathogens.


Subject(s)
Basidiomycota/physiology , Berberis/genetics , Berberis/microbiology , Plant Diseases/genetics , Chromosome Mapping , Chromosomes, Plant , Disease Resistance/genetics , Gene Expression Profiling , Genome, Plant , Hybridization, Genetic , Inheritance Patterns , Phenotype , Plant Diseases/microbiology , Plant Stems/microbiology , Quantitative Trait Loci
9.
Genes (Basel) ; 10(1), 2019 01 18.
Article in English | MEDLINE | ID: mdl-30669388

ABSTRACT

A high-quality reference genome is a fundamental resource for functional genetics, comparative genomics, and population genomics, and is increasingly important for conservation biology. PacBio Single Molecule, Real-Time (SMRT) sequencing generates long reads with uniform coverage and high consensus accuracy, making it a powerful technology for de novo genome assembly. Improvements in throughput and concomitant reductions in cost have made PacBio an attractive core technology for many large genome initiatives; however, relatively high DNA input requirements (~5 µg for standard library protocol) have placed PacBio out of reach for many projects on small organisms that have lower DNA content, or on projects with limited input DNA for other reasons. Here we present a high-quality de novo genome assembly from a single Anopheles coluzzii mosquito. A modified SMRTbell library construction protocol without DNA shearing and size selection was used to generate a SMRTbell library from just 100 ng of starting genomic DNA. The sample was run on the Sequel System with chemistry 3.0 and software v6.0, generating, on average, 25 Gb of sequence per SMRT Cell with 20 h movies, followed by diploid de novo genome assembly with FALCON-Unzip. The resulting curated assembly had high contiguity (contig N50 3.5 Mb) and completeness (more than 98% of conserved genes were present and full-length). In addition, this single-insect assembly now places 667 (>90%) of formerly unplaced genes into their appropriate chromosomal contexts in the AgamP4 PEST reference. We were also able to resolve maternal and paternal haplotypes for over 1/3 of the genome. By sequencing and assembling material from a single diploid individual, only two haplotypes were present, simplifying the assembly process compared to samples from multiple pooled individuals. The method presented here can be applied to samples with starting DNA amounts as low as 100 ng per 1 Gb genome size. This new low-input approach puts PacBio-based assemblies in reach for small highly heterozygous organisms that comprise much of the diversity of life.


Subject(s)
Anopheles/genetics , Genome, Insect , Sequence Analysis, DNA/methods , Animals , Contig Mapping/methods , Contig Mapping/standards , Ploidies , Polymorphism, Genetic , Sequence Analysis, DNA/standards
10.
Nat Commun ; 10(1): 260, 2019 01 16.
Article in English | MEDLINE | ID: mdl-30651564

ABSTRACT

Rapid innovation in sequencing technologies and improvement in assembly algorithms have enabled the creation of highly contiguous mammalian genomes. Here we report a chromosome-level assembly of the water buffalo (Bubalus bubalis) genome using single-molecule sequencing and chromatin conformation capture data. PacBio Sequel reads, with a mean length of 11.5 kb, helped to resolve repetitive elements and generate sequence contiguity. All five B. bubalis sub-metacentric chromosomes were correctly scaffolded with centromeres spanned. Although the index animal was partly inbred, 58% of the genome was haplotype-phased by FALCON-Unzip. This new reference genome improves the contig N50 of the previous short-read-based buffalo assembly more than a thousand-fold and contains only 383 gaps. It surpasses the human and goat references in sequence contiguity and facilitates the annotation of hard-to-assemble gene clusters such as the major histocompatibility complex (MHC).


Subject(s)
Buffaloes/genetics , Chromosomes, Mammalian/genetics , Contig Mapping/methods , Genome/genetics , Goats/genetics , Animals , Chromatin/chemistry , Chromatin/genetics , Female , Genomics/methods , Haplotypes , High-Throughput Nucleotide Sequencing , Humans , Major Histocompatibility Complex/genetics , Molecular Sequence Annotation/methods , Multigene Family/genetics , Repetitive Sequences, Nucleic Acid/genetics , Whole Genome Sequencing
11.
Elife ; 7, 2018 12 13.
Article in English | MEDLINE | ID: mdl-30543325

ABSTRACT

During speciation, sex chromosomes often accumulate interspecific genetic incompatibilities faster than the rest of the genome. The drive theory posits that sex chromosomes are susceptible to recurrent bouts of meiotic drive and suppression, causing the evolutionary build-up of divergent cryptic sex-linked drive systems and, incidentally, genetic incompatibilities. To assess the role of drive during speciation, we combine high-resolution genetic mapping of X-linked hybrid male sterility with population genomics analyses of divergence and recent gene flow between the fruitfly species, Drosophila mauritiana and D. simulans. Our findings reveal a high density of genetic incompatibilities and a corresponding dearth of gene flow on the X chromosome. Surprisingly, we find that a known drive element recently migrated between species and, rather than contributing to interspecific divergence, caused a strong reduction in local sequence divergence, undermining the evolution of hybrid sterility. Gene flow can therefore mediate the effects of selfish genetic elements during speciation.


Subject(s)
Biological Evolution , Genetic Speciation , X Chromosome/genetics , Y Chromosome/genetics , Animals , Drosophila/genetics , Drosophila simulans/genetics , Gene Flow , Infertility, Male/genetics , Male , Meiosis/genetics , Species Specificity
12.
Nature ; 563(7732): 501-507, 2018 11.
Article in English | MEDLINE | ID: mdl-30429615

ABSTRACT

Female Aedes aegypti mosquitoes infect more than 400 million people each year with dangerous viral pathogens including dengue, yellow fever, Zika and chikungunya. Progress in understanding the biology of mosquitoes and developing the tools to fight them has been slowed by the lack of a high-quality genome assembly. Here we combine diverse technologies to produce the markedly improved, fully re-annotated AaegL5 genome assembly, and demonstrate how it accelerates mosquito science. We anchored physical and cytogenetic maps, doubled the number of known chemosensory ionotropic receptors that guide mosquitoes to human hosts and egg-laying sites, provided further insight into the size and composition of the sex-determining M locus, and revealed copy-number variation among glutathione S-transferase genes that are important for insecticide resistance. Using high-resolution quantitative trait locus and population genomic analyses, we mapped new candidates for dengue vector competence and insecticide resistance. AaegL5 will catalyse new biological insights and intervention strategies to fight this deadly disease vector.


Subject(s)
Aedes/genetics , Arbovirus Infections/virology , Arboviruses , Genome, Insect/genetics , Genomics/standards , Insect Control , Mosquito Vectors/genetics , Mosquito Vectors/virology , Aedes/virology , Animals , Arbovirus Infections/transmission , Arboviruses/isolation & purification , DNA Copy Number Variations/genetics , Dengue Virus/isolation & purification , Female , Genetic Variation/genetics , Genetics, Population , Glutathione Transferase/genetics , Insecticide Resistance/drug effects , Male , Molecular Sequence Annotation , Multigene Family/genetics , Pyrethrins/pharmacology , Reference Standards , Sex Determination Processes/genetics
13.
Nat Biotechnol ; 2018 Oct 22.
Article in English | MEDLINE | ID: mdl-30346939

ABSTRACT

Complex allelic variation hampers the assembly of haplotype-resolved sequences from diploid genomes. We developed trio binning, an approach that simplifies haplotype assembly by resolving allelic variation before assembly. In contrast with prior approaches, the effectiveness of our method improved with increasing heterozygosity. Trio binning uses short reads from two parental genomes to first partition long reads from an offspring into haplotype-specific sets. Each haplotype is then assembled independently, resulting in a complete diploid reconstruction. We used trio binning to recover both haplotypes of a diploid human genome and identified complex structural variants missed by alternative approaches. We sequenced an F1 cross between the cattle subspecies Bos taurus taurus and Bos taurus indicus and completely assembled both parental haplotypes with NG50 haplotig sizes of >20 Mb and 99.998% accuracy, surpassing the quality of current cattle reference genomes. We suggest that trio binning improves diploid genome assembly and will facilitate new studies of haplotype variation and inheritance.
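
The read-partitioning step described above can be illustrated with a toy sketch. It is not the authors' implementation: it assumes error-free reads, ignores reverse complements, and compares raw counts of parent-specific k-mers, whereas the published method differs in detail. All function names (`kmers`, `parent_specific_kmers`, `bin_read`) and the example sequences are hypothetical.

```python
def kmers(seq: str, k: int) -> set:
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(maternal_reads, paternal_reads, k):
    """Return (maternal-only, paternal-only) k-mer sets from parental short reads."""
    maternal = set().union(*(kmers(r, k) for r in maternal_reads))
    paternal = set().union(*(kmers(r, k) for r in paternal_reads))
    return maternal - paternal, paternal - maternal


def bin_read(read, maternal_only, paternal_only, k):
    """Assign an offspring long read to a haplotype bin by parent-specific k-mer counts."""
    read_kmers = kmers(read, k)
    m = len(read_kmers & maternal_only)
    p = len(read_kmers & paternal_only)
    if m > p:
        return "maternal"
    if p > m:
        return "paternal"
    return "unassigned"  # no informative k-mers, or a tie


# Toy data: the two parents differ at a single base.
mat_only, pat_only = parent_specific_kmers(["ACGTACGTAA"], ["ACGTTCGTAA"], k=4)
for read in ["GTACGT", "GTTCGT", "ACGT"]:
    print(read, bin_read(read, mat_only, pat_only, k=4))
```

Each bin would then be assembled independently, which is why the approach benefits from higher heterozygosity: more parent-specific k-mers means fewer unassigned reads.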

14.
Genes (Basel) ; 9(8), 2018 Aug 01.
Article in English | MEDLINE | ID: mdl-30071683

ABSTRACT

Genome-level data can provide researchers with unprecedented precision to examine the causes and genetic consequences of population declines, which can inform conservation management. Here, we present a high-quality, long-read, de novo genome assembly for one of the world's most endangered bird species, the 'Alala (Corvus hawaiiensis; Hawaiian crow). As the only remaining native crow species in Hawai'i, the 'Alala survived solely in a captive-breeding program from 2002 until 2016, at which point a long-term reintroduction program was initiated. The high-quality genome assembly was generated to lay the foundation for both comparative genomics studies and the development of population-level genomic tools that will aid conservation and recovery efforts. We illustrate how the quality of this assembly places it amongst the very best avian genomes assembled to date, comparable to intensively studied model systems. We describe the genome architecture in terms of repetitive elements and runs of homozygosity, and we show that compared with more outbred species, the 'Alala genome is substantially more homozygous. We also provide annotations for a subset of immunity genes that are likely to be important in conservation management, and we discuss how this genome is currently being used as a roadmap for downstream conservation applications.

15.
Curr Biol ; 28(8): 1289-1295.e4, 2018 04 23.
Article in English | MEDLINE | ID: mdl-29606420

ABSTRACT

Crossing over between homologous chromosomes during meiosis repairs programmed DNA double-strand breaks, ensures proper segregation at meiosis I [1], shapes the genomic distribution of nucleotide variability in populations, and enhances the efficacy of natural selection among genetically linked sites [2]. Between closely related Drosophila species, large differences exist in the rate and chromosomal distribution of crossing over. Little, however, is known about the molecular genetic changes or population genetic forces that mediate evolved differences in recombination between species [3, 4]. Here, we show that a meiosis gene with a history of rapid evolution acts as a trans-acting modifier of species differences in crossing over. In transgenic flies, the dicistronic gene, mei-217/mei-218, recapitulates a large part of the species differences in the rate and chromosomal distribution of crossing over. These phenotypic differences appear to result from changes in protein sequence not gene expression. Our population genetics analyses show that the protein-coding sequence of mei-218, but not mei-217, has a history of recurrent positive natural selection. By modulating the intensity of centromeric and telomeric suppression of crossing over, evolution at mei-217/-218 has incidentally shaped gross differences in the chromosomal distribution of nucleotide variability between species. We speculate that recurrent bouts of adaptive evolution at mei-217/-218 might reflect a history of coevolution with selfish genetic elements.


Subject(s)
Cell Cycle Proteins/genetics , Crossing Over, Genetic/genetics , Drosophila Proteins/genetics , Meiosis/genetics , Amino Acid Sequence , Animals , Animals, Genetically Modified/genetics , Cell Cycle Proteins/metabolism , Cell Cycle Proteins/physiology , Centromere/genetics , Centromere/physiology , DNA Breaks, Double-Stranded , Drosophila/genetics , Drosophila Proteins/metabolism , Drosophila Proteins/physiology , Drosophila melanogaster/genetics , Evolution, Molecular , Gene Expression/genetics , Recombination, Genetic/genetics , Selection, Genetic , Species Specificity
16.
Gigascience ; 6(11): 1-7, 2017 11 01.
Article in English | MEDLINE | ID: mdl-29069494

ABSTRACT

Common bread wheat, Triticum aestivum, has one of the most complex genomes known to science, with 6 copies of each chromosome, enormous numbers of near-identical sequences scattered throughout, and an overall haploid size of more than 15 billion bases. Multiple past attempts to assemble the genome have produced assemblies that were well short of the estimated genome size. Here we report the first near-complete assembly of T. aestivum, using deep sequencing coverage from a combination of short Illumina reads and very long Pacific Biosciences reads. The final assembly contains 15 344 693 583 bases and has a weighted average (N50) contig size of 232 659 bases. This represents by far the most complete and contiguous assembly of the wheat genome to date, providing a strong foundation for future genetic studies of this important food crop. We also report how we used the recently published genome of Aegilops tauschii, the diploid ancestor of the wheat D genome, to identify 4 179 762 575 bp of T. aestivum that correspond to its D genome components.
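
The N50 statistic quoted above is more precisely a length-weighted median than an average: it is the contig length L such that contigs of length L or longer together contain at least half of the assembled bases. A minimal sketch of that definition follows; the function name `n50` and the example lengths are illustrative and are not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half of the assembly size.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer-safe test for running >= total / 2
            return length


# Toy assembly: total 54 bases, so the half-way point is 27 bases.
contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(contigs))  # 10 + 9 + 8 = 27 reaches the half-way point, so N50 is 8
```

Because the statistic depends only on contig lengths, it measures contiguity and says nothing about whether the contigs are correctly assembled.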


Subject(s)
Genome, Plant , Triticum/genetics , Molecular Sequence Annotation , Polyploidy , Whole Genome Sequencing
17.
Gigascience ; 6(10): 1-16, 2017 10 01.
Article in English | MEDLINE | ID: mdl-29020750

ABSTRACT

Reference-quality genomes are expected to provide a resource for studying gene structure, function, and evolution. However, often genes of interest are not completely or accurately assembled, leading to unknown errors in analyses or additional cloning efforts for the correct sequences. A promising solution is long-read sequencing. Here we tested PacBio-based long-read sequencing and diploid assembly for potential improvements to the Sanger-based intermediate-read zebra finch reference and Illumina-based short-read Anna's hummingbird reference, 2 vocal learning avian species widely studied in neuroscience and genomics. With DNA of the same individuals used to generate the reference genomes, we generated diploid assemblies with the FALCON-Unzip assembler, resulting in contigs with no gaps in the megabase range, representing 150-fold and 200-fold improvements over the current zebra finch and hummingbird references, respectively. These long-read and phased assemblies corrected and resolved what we discovered to be numerous misassemblies in the references, including missing sequences in gaps, erroneous sequences flanking gaps, base call errors in difficult-to-sequence regions, complex repeat structure errors, and allelic differences between the 2 haplotypes. These improvements were validated by single long-genome and transcriptome reads and resulted for the first time in completely resolved protein-coding genes widely studied in neuroscience and specialized in vocal learning species. These findings demonstrate the impact of long reads, sequencing of previously difficult-to-sequence regions, and phasing of haplotypes on generating the high-quality assemblies necessary for understanding gene structure, function, and evolution.


Subject(s)
Birds/genetics , Animals , Avian Proteins/genetics , Dual Specificity Phosphatase 1/genetics , Early Growth Response Protein 1/genetics , Female , Forkhead Transcription Factors/genetics , Genome , Male , Nerve Tissue Proteins/genetics , Sequence Analysis, DNA
19.
PLoS One ; 10(4): e0118621, 2015.
Article in English | MEDLINE | ID: mdl-25874895

ABSTRACT

Secondary contact between divergent populations or incipient species may result in the exchange and introgression of genomic material. We develop a simple DNA sequence measure, called Gmin, which is designed to identify genomic regions experiencing introgression in a secondary contact model. Gmin is defined as the ratio of the minimum between-population number of nucleotide differences in a genomic window to the average number of between-population differences. Although it is conceptually simple, one advantage of Gmin is that it is computationally inexpensive relative to model-based methods for detecting gene flow and it scales easily to the level of whole-genome analysis. We compare the sensitivity and specificity of Gmin to those of the widely used index of population differentiation, FST, and suggest a simple statistical test for identifying genomic outliers. Extensive computer simulations demonstrate that Gmin has both greater sensitivity and specificity for detecting recent introgression than does FST. Furthermore, we find that the sensitivity of Gmin is robust with respect to both the population mutation and recombination rates. Finally, a scan of Gmin across the X chromosome of Drosophila melanogaster identifies candidate regions of introgression between sub-Saharan African and cosmopolitan populations that were previously missed by other methods. These results show that Gmin is a biologically straightforward, yet powerful, alternative to FST, as well as to more computationally intensive model-based methods for detecting gene flow.
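
The definition of Gmin given above translates directly into a few lines of code: within a genomic window, count the nucleotide differences for every between-population pair of sequences, then divide the minimum count by the mean count. The sketch below assumes aligned, equal-length haplotype sequences with no missing data; the function names (`pairwise_differences`, `gmin`) and the example sequences are hypothetical, and the outlier test the authors suggest is not reproduced here.

```python
import math
from itertools import product


def pairwise_differences(a: str, b: str) -> int:
    """Count nucleotide differences between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def gmin(pop1, pop2) -> float:
    """Return min / mean of between-population pairwise differences in one window.

    Values near 1 mean all between-population pairs are about equally diverged;
    values well below 1 mean at least one pair is unusually similar, the
    signature expected under recent introgression.
    """
    diffs = [pairwise_differences(a, b) for a, b in product(pop1, pop2)]
    if not diffs:
        raise ValueError("both populations need at least one sequence")
    mean = sum(diffs) / len(diffs)
    if mean == 0:
        return math.nan  # no between-population differences in this window
    return min(diffs) / mean


# Toy window: between-population differences are 4, 1, 5 and 2 (min 1, mean 3).
pop_a = ["AAAAAAAA", "AAAAAAAC"]
pop_b = ["TTTTAAAA", "TAAAAAAA"]
print(gmin(pop_a, pop_b))
```

Only pairwise counts are needed per window, which is consistent with the abstract's point that the measure is computationally inexpensive and scales to whole-genome scans.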


Subject(s)
Gene Flow , Genetics, Population/methods , Metagenomics/methods , Models, Genetic , Animal Migration , Animals , Computer Simulation , Drosophila melanogaster/genetics , Evolution, Molecular , France , Genes, Insect , Haplotypes/genetics , Hybridization, Genetic/genetics , Population Density , Reproductive Isolation , Rwanda , Sequence Alignment , Sequence Homology, Nucleic Acid , Species Specificity
20.
Genome Biol Evol ; 6(9): 2444-58, 2014 Sep 04.
Article in English | MEDLINE | ID: mdl-25193308

ABSTRACT

Drosophila mauritiana is an Indian Ocean island endemic species that diverged from its two sister species, Drosophila simulans and Drosophila sechellia, approximately 240,000 years ago. Multiple forms of incomplete reproductive isolation have evolved among these species, including sexual, gametic, ecological, and intrinsic postzygotic barriers, with crosses among all three species conforming to Haldane's rule: F1 hybrid males are sterile and F1 hybrid females are fertile. Extensive genetic resources and the fertility of hybrid females have made D. mauritiana, in particular, an important model for speciation genetics. Analyses between D. mauritiana and both of its siblings have shown that the X chromosome makes a disproportionate contribution to hybrid male sterility. But why the X plays a special role in the evolution of hybrid sterility in these, and other, species remains an unsolved problem. To complement functional genetic analyses, we have investigated the population genomics of D. mauritiana, giving special attention to differences between the X and the autosomes. We present a de novo genome assembly of D. mauritiana annotated with RNAseq data and a whole-genome analysis of polymorphism and divergence from ten individuals. Our analyses show that, relative to the autosomes, the X chromosome has reduced nucleotide diversity but elevated nucleotide divergence; an excess of recurrent adaptive evolution at its protein-coding genes; an excess of recent, strong selective sweeps; and a large excess of satellite DNA. Interestingly, one of two centimorgan-scale selective sweeps on the D. mauritiana X chromosome spans a region containing two sex-ratio meiotic drive elements and a high concentration of satellite DNA. Furthermore, genes with roles in reproduction and chromosome biology are enriched among genes that have histories of recurrent adaptive protein evolution. Together, these genome-wide analyses suggest that genetic conflict and frequent positive natural selection on the X chromosome have shaped the molecular evolutionary history of D. mauritiana, refining our understanding of the possible causes of the large X-effect in speciation.


Subject(s)
Chromosomes, Insect/genetics , Drosophila/genetics , Evolution, Molecular , Genetic Variation , Genome, Insect , Animals , Drosophila/physiology , Female , Genetic Speciation , Genome , Male , Models, Genetic , Reproduction