Results 1 - 12 of 12
1.
Nature ; 541(7636): 212-216, 2017 01 12.
Article in English | MEDLINE | ID: mdl-28024298

ABSTRACT

Ash trees (genus Fraxinus, family Oleaceae) are widespread throughout the Northern Hemisphere, but are being devastated in Europe by the fungus Hymenoscyphus fraxineus, causing ash dieback, and in North America by the herbivorous beetle Agrilus planipennis. Here we sequence the genome of a low-heterozygosity Fraxinus excelsior tree from Gloucestershire, UK, annotating 38,852 protein-coding genes of which 25% appear ash specific when compared with the genomes of ten other plant species. Analyses of paralogous genes suggest a whole-genome duplication shared with olive (Olea europaea, Oleaceae). We also re-sequence 37 F. excelsior trees from Europe, finding evidence for apparent long-term decline in effective population size. Using our reference sequence, we re-analyse association transcriptomic data, yielding improved markers for reduced susceptibility to ash dieback. Surveys of these markers in British populations suggest that reduced susceptibility to ash dieback may be more widespread in Great Britain than in Denmark. We also present evidence that susceptibility of trees to H. fraxineus is associated with their iridoid glycoside levels. This rapid, integrated, multidisciplinary research response to an emerging health threat in a non-model organism opens the way for mitigation of the epidemic.


Subject(s)
Fraxinus/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation , Plant Genome/genetics , Plant Diseases/genetics , Trees/genetics , Ascomycota/pathogenicity , Conserved Sequence/genetics , Denmark , Fraxinus/microbiology , Plant Genes/genetics , Genomics , Iridoid Glycosides/metabolism , Plant Diseases/microbiology , Plant Diseases/prevention & control , Plant Proteins/genetics , Population Density , DNA Sequence Analysis , Species Specificity , Transcriptome , Trees/microbiology , United Kingdom
2.
Genome Res ; 27(5): 885-896, 2017 05.
Article in English | MEDLINE | ID: mdl-28420692

ABSTRACT

Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78% of the genome with a scaffold N50 of 88.8 kb that has a high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known and identified one novel genome rearrangements. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop.


Subject(s)
Contig Mapping/methods , Plant Genome , Molecular Sequence Annotation/methods , Plant Proteins/genetics , Genetic Translocation , Triticum/genetics , Algorithms , Contig Mapping/standards , Molecular Sequence Annotation/standards , Genetic Polymorphism , Polyploidy
3.
Bioinformatics ; 33(4): 574-576, 2017 02 15.
Article in English | MEDLINE | ID: mdl-27797770

ABSTRACT

Motivation: De novo assembly of whole genome shotgun (WGS) next-generation sequencing (NGS) data benefits from high-quality input with high coverage. However, in practice, determining the quality and quantity of useful reads quickly and in a reference-free manner is not trivial. Gaining a better understanding of the WGS data, and how that data is utilized by assemblers, provides useful insights that can inform the assembly process and result in better assemblies. Results: We present the K-mer Analysis Toolkit (KAT): a multi-purpose software toolkit for reference-free quality control (QC) of WGS reads and de novo genome assemblies, primarily via their k-mer frequencies and GC composition. KAT enables users to assess levels of errors, bias and contamination at various stages of the assembly process. In this paper we highlight KAT's ability to provide valuable insights into assembly composition and quality of genome assemblies through pairwise comparison of k-mers present in both input reads and the assemblies. Availability and Implementation: KAT is available under the GPLv3 license at: https://github.com/TGAC/KAT . Contact: bernardo.clavijo@earlham.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
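
The k-mer comparison described in this abstract can be illustrated with a minimal sketch. It is not KAT's implementation or interface; the function names (`kmer_counts`, `spectrum`, `comparison_matrix`, `gc_fraction`) are hypothetical and chosen only for illustration. The sketch counts canonical k-mers in reads and in an assembly, then tabulates how often each read frequency co-occurs with each assembly copy number.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)

def kmer_counts(seqs, k):
    """Count canonical k-mers across sequences, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts

def spectrum(counts):
    """Map each frequency to the number of distinct k-mers seen that many times."""
    return Counter(counts.values())

def comparison_matrix(read_counts, assembly_counts):
    """Tabulate (frequency in reads, copies in assembly) over all distinct k-mers."""
    matrix = Counter()
    for kmer in set(read_counts) | set(assembly_counts):
        matrix[(read_counts.get(kmer, 0), assembly_counts.get(kmer, 0))] += 1
    return matrix

def gc_fraction(kmer):
    """Fraction of G and C bases in a k-mer."""
    return (kmer.count("G") + kmer.count("C")) / len(kmer)
```

In such a table, k-mers that are frequent in the reads but have zero copies in the assembly point to missing content, while low-frequency read k-mers absent from the assembly are more likely sequencing errors.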


Subject(s)
Plant Genome , High-Throughput Nucleotide Sequencing/standards , Quality Control , DNA Sequence Analysis/standards , Software , Fraxinus/genetics , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods
4.
Plant J ; 82(4): 680-92, 2015 May.
Article in English | MEDLINE | ID: mdl-25759247

ABSTRACT

The medicinal plant Madagascar periwinkle, Catharanthus roseus (L.) G. Don, produces hundreds of biologically active monoterpene-derived indole alkaloid (MIA) metabolites and is the sole source of the potent, expensive anti-cancer compounds vinblastine and vincristine. Access to a genome sequence would enable insights into the biochemistry, control, and evolution of genes responsible for MIA biosynthesis. However, generation of a near-complete, scaffolded genome is prohibitive to small research communities due to the expense, time, and expertise required. In this study, we generated a genome assembly for C. roseus that provides a near-comprehensive representation of the genic space that revealed the genomic context of key points within the MIA biosynthetic pathway including physically clustered genes, tandem gene duplication, expression sub-functionalization, and putative neo-functionalization. The genome sequence also facilitated high resolution co-expression analyses that revealed three distinct clusters of co-expression within the components of the MIA pathway. Coordinated biosynthesis of precursors and intermediates throughout the pathway appears to be a feature of vinblastine/vincristine biosynthesis. The C. roseus genome also revealed localization of enzyme-rich genic regions and transporters near known biosynthetic enzymes, highlighting how even a draft genome sequence can empower the study of high-value specialized metabolites.


Subject(s)
Biological Products/metabolism , Catharanthus/metabolism , Plant Gene Expression Regulation , Plant Genome/genetics , Vinblastine/metabolism
5.
Brief Bioinform ; 14(5): 548-55, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23793381

ABSTRACT

Next-generation sequencing (NGS) is increasingly being adopted as the backbone of biomedical research. With the commercialization of various affordable desktop sequencers, NGS will be reached by increasing numbers of cellular and molecular biologists, necessitating community consensus on bioinformatics protocols to tackle the exponential increase in quantity of sequence data. The current resources for NGS informatics are extremely fragmented. Finding a centralized synthesis is difficult. A multitude of tools exist for NGS data analysis; however, none of these satisfies all possible uses and needs. This gap in functionality could be filled by integrating different methods in customized pipelines, an approach helped by the open-source nature of many NGS programmes. Drawing from community spirit and with the use of the Wikipedia framework, we have initiated a collaborative NGS resource: The NGS WikiBook. We have collected a sufficient amount of text to incentivize a broader community to contribute to it. Users can search, browse, edit and create new content, so as to facilitate self-learning and feedback to the community. The overall structure and style for this dynamic material is designed for the bench biologists and non-bioinformaticians. The flexibility of online material allows the readers to ignore details in a first read, yet have immediate access to the information they need. Each chapter comes with practical exercises so readers may familiarize themselves with each step. The NGS WikiBook aims to create a collective laboratory book and protocol that explains the key concepts and describes best practices in this fast-evolving field.


Subject(s)
Computational Biology/education , Computer-Assisted Instruction/methods , High-Throughput Nucleotide Sequencing/statistics & numerical data , Cooperative Behavior , Internet , Teaching
6.
Bioinformatics ; 30(4): 566-8, 2014 Feb 15.
Article in English | MEDLINE | ID: mdl-24297520

ABSTRACT

SUMMARY: Illumina's recently released Nextera Long Mate Pair (LMP) kit enables production of jumping libraries of up to 12 kb. The LMP libraries are an invaluable resource for carrying out complex assemblies and other downstream bioinformatics analyses such as the characterization of structural variants. However, LMP libraries are intrinsically noisy and to maximize their value, post-sequencing data analysis is required. Standardizing laboratory protocols and the selection of sequenced reads for downstream analysis are non-trivial tasks. NextClip is a tool for analyzing reads from LMP libraries, generating a comprehensive quality report and extracting good quality trimmed and deduplicated reads. AVAILABILITY AND IMPLEMENTATION: Source code, user guide and example data are available from https://github.com/richardmleggett/nextclip/.


Subject(s)
Arabidopsis Proteins/genetics , Computational Biology/methods , Genomic Library , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods , Software , Arabidopsis/genetics
7.
G3 (Bethesda) ; 10(6): 1823-1827, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32241919

ABSTRACT

Barley (Hordeum vulgare) is one of the most important crops worldwide and is also considered a research model for the large-genome small grain temperate cereals. Despite genomic resources improving all the time, they are limited for the cv Golden Promise, the most efficient genotype for genetic transformation. We have developed a barley cv Golden Promise reference assembly integrating Illumina paired-end reads, long mate-pair reads, Dovetail Chicago in vitro proximity ligation libraries and chromosome conformation capture sequencing (Hi-C) libraries into a contiguous reference assembly. The assembled genome, comprising 7 chromosomes and 4.13 Gb in size, has a super-scaffold N50 after Chicago libraries of 4.14 Mb and contains only 2.2% gaps. Using BUSCO (benchmarking universal single copy orthologous genes) for evaluation, the genome assembly contains 95.2% of complete and single copy genes from the plant database. A high-quality Golden Promise reference assembly will be useful to the whole barley research community and will prove particularly useful for CRISPR-Cas9 experiments.


Subject(s)
Hordeum , Genome , Genomics , Genotype , High-Throughput Nucleotide Sequencing , Hordeum/genetics
8.
F1000Res ; 8: 1490, 2019.
Article in English | MEDLINE | ID: mdl-31723420

ABSTRACT

The Sequence Distance Graph (SDG) framework works with genome assembly graphs and raw data from paired, linked and long reads. It includes a simple deBruijn graph module, and can import graphs using the graphical fragment assembly (GFA) format. It also maps raw reads onto graphs, and provides a Python application programming interface (API) to navigate the graph, access the mapped and raw data and perform interactive or scripted analyses. Its complete workspace can be dumped to and loaded from disk, decoupling mapping from analysis and supporting multi-stage pipelines. We present the design and implementation of the framework, and example analyses scaffolding a short read graph with long reads, and navigating paths in a heterozygous graph for a simulated parent-offspring trio dataset. SDG is freely available under the MIT license at https://github.com/bioinfologics/sdg.
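
For readers unfamiliar with the de Bruijn graphs mentioned in this abstract, the following is a minimal sketch of the general idea in plain Python. It does not use or reproduce the SDG API; `de_bruijn` and `unitig_from` are hypothetical names, and the sketch ignores reverse complements and sequencing errors for brevity.

```python
from collections import defaultdict

def de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are the k-mers seen in reads."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def unitig_from(graph, start):
    """Extend a sequence from `start` while the path is unambiguous.

    The walk stops at a node with zero or several successors, at a successor
    with more than one predecessor, or when it would revisit a node.
    """
    in_degree = defaultdict(int)
    for successors in graph.values():
        for successor in successors:
            in_degree[successor] += 1

    sequence, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        successor = next(iter(graph[node]))
        if in_degree[successor] != 1 or successor in seen:
            break
        sequence += successor[-1]  # each step adds one new base
        seen.add(successor)
        node = successor
    return sequence
```

Two overlapping reads such as `ACGTT` and `CGTTA` with `k=3` collapse into a single unambiguous path, which is the basic mechanism by which short reads are merged into longer sequences.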


Subject(s)
DNA Sequence Analysis , Software , Genomics
9.
Genome Biol ; 20(1): 69, 2019 04 15.
Article in English | MEDLINE | ID: mdl-30982471

ABSTRACT

BACKGROUND: Sequence exchange between homologous chromosomes through crossing over and gene conversion is highly conserved among eukaryotes, contributing to genome stability and genetic diversity. A lack of recombination limits breeding efforts in crops; therefore, increasing recombination rates can reduce linkage drag and generate new genetic combinations. RESULTS: We use computational analysis of 13 recombinant inbred mapping populations to assess crossover and gene conversion frequency in the hexaploid genome of wheat (Triticum aestivum). We observe that high-frequency crossover sites are shared between populations and that closely related parents lead to populations with more similar crossover patterns. We demonstrate that gene conversion is more prevalent and covers more of the genome in wheat than in other plants, making it a critical process in the generation of new haplotypes, particularly in centromeric regions where crossovers are rare. We identify quantitative trait loci for altered gene conversion and crossover frequency and confirm functionality for a novel RecQ helicase gene that belongs to an ancient clade that is missing in some plant lineages including Arabidopsis. CONCLUSIONS: This is the first gene to be demonstrated to be involved in gene conversion in wheat. Harnessing the RecQ helicase has the potential to break linkage drag utilizing widespread gene conversions.


Subject(s)
Genetic Crossing Over , Gene Conversion , Triticum/genetics , Plant Genome , Polyploidy , Whole Genome Sequencing
10.
Nat Ecol Evol ; 2(6): 1000-1008, 2018 06.
Article in English | MEDLINE | ID: mdl-29686237

ABSTRACT

Accelerating international trade and climate change make pathogen spread an increasing concern. Hymenoscyphus fraxineus, the causal agent of ash dieback, is a fungal pathogen that has been moving across continents and hosts from Asian to European ash. Most European common ash trees (Fraxinus excelsior) are highly susceptible to H. fraxineus, although a minority (~5%) have partial resistance to dieback. Here, we assemble and annotate a H. fraxineus draft genome, which approaches chromosome scale. Pathogen genetic diversity across Europe and in Japan, reveals a strong bottleneck in Europe, though a signal of adaptive diversity remains in key host interaction genes. We find that the European population was founded by two divergent haploid individuals. Divergence between these haplotypes represents the ancestral polymorphism within a large source population. Subsequent introduction from this source would greatly increase adaptive potential of the pathogen. Thus, further introgression of H. fraxineus into Europe represents a potential threat and Europe-wide biological security measures are needed to manage this disease.


Subject(s)
Ascomycota/genetics , Fraxinus/microbiology , Fungal Genome , Plant Diseases/microbiology , Europe , Haplotypes/genetics
11.
Gigascience ; 6(11): 1-7, 2017 11 01.
Article in English | MEDLINE | ID: mdl-29069494

ABSTRACT

Common bread wheat, Triticum aestivum, has one of the most complex genomes known to science, with 6 copies of each chromosome, enormous numbers of near-identical sequences scattered throughout, and an overall haploid size of more than 15 billion bases. Multiple past attempts to assemble the genome have produced assemblies that were well short of the estimated genome size. Here we report the first near-complete assembly of T. aestivum, using deep sequencing coverage from a combination of short Illumina reads and very long Pacific Biosciences reads. The final assembly contains 15 344 693 583 bases and has a weighted average (N50) contig size of 232 659 bases. This represents by far the most complete and contiguous assembly of the wheat genome to date, providing a strong foundation for future genetic studies of this important food crop. We also report how we used the recently published genome of Aegilops tauschii, the diploid ancestor of the wheat D genome, to identify 4 179 762 575 bp of T. aestivum that correspond to its D genome components.
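
The N50 statistic quoted in this and several other abstracts in the list can be computed as in the sketch below. This follows the common definition (the length of the contig at which the cumulative sum of lengths, taken from longest to shortest, first reaches half the total assembly size); the function name `n50` is illustrative and the authors' own tooling is not shown here.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the sum of
    all lengths. Returns 0 for an empty collection.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0
```

For example, contigs of lengths 2, 3, 4, 5 and 6 sum to 20; the two longest (6 and 5) already cover 11 bases, more than half, so the N50 is 5.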


Subject(s)
Plant Genome , Triticum/genetics , Molecular Sequence Annotation , Polyploidy , Whole Genome Sequencing
12.
Front Genet ; 4: 288, 2013 Dec 17.
Article in English | MEDLINE | ID: mdl-24381581

ABSTRACT

The processes of quality assessment and control are an active area of research at The Genome Analysis Centre (TGAC). Unlike other sequencing centers that often concentrate on a certain species or technology, TGAC applies expertise in genomics and bioinformatics to a wide range of projects, often requiring bespoke wet lab and in silico workflows. TGAC is fortunate to have access to a diverse range of sequencing and analysis platforms, and we are at the forefront of investigations into library quality and sequence data assessment. We have developed and implemented a number of algorithms, tools, pipelines and packages to ascertain, store, and expose quality metrics across a number of next-generation sequencing platforms, allowing rapid and in-depth cross-platform Quality Control (QC) bioinformatics. In this review, we describe these tools as a vehicle for data-driven informatics, offering the potential to provide richer context for downstream analysis and to inform experimental design.
