Results 1 - 20 of 122
1.
Cell ; 174(5): 1188-1199.e14, 2018 08 23.
Article in English | MEDLINE | ID: mdl-30057118

ABSTRACT

In stationary-phase Escherichia coli, Dps (DNA-binding protein from starved cells) is the most abundant protein component of the nucleoid. Dps compacts DNA into a dense complex and protects it from damage. Dps has also been proposed to act as a global regulator of transcription. Here, we directly examine the impact of Dps-induced compaction of DNA on the activity of RNA polymerase (RNAP). Strikingly, deleting the dps gene decompacted the nucleoid but did not significantly alter the transcriptome and only mildly altered the proteome during stationary phase. Complementary in vitro assays demonstrated that Dps blocks restriction endonucleases but not RNAP from binding DNA. Single-molecule assays demonstrated that Dps dynamically condenses DNA around elongating RNAP without impeding its progress. We conclude that Dps forms a dynamic structure that excludes some DNA-binding proteins yet allows RNAP free access to the buried genes, a behavior characteristic of phase-separated organelles.


Subject(s)
Bacterial DNA , Escherichia coli Proteins/metabolism , Escherichia coli/metabolism , Bacterial Gene Expression Regulation , Genetic Transcription , Bacterial Outer Membrane Proteins/metabolism , DNA Restriction Enzymes/metabolism , DNA-Binding Proteins/metabolism , DNA-Directed RNA Polymerases/metabolism , Holoenzymes/metabolism , Fluorescence Microscopy , Polystyrenes/chemistry , Proteome , RNA Sequence Analysis , Mechanical Stress , Transcriptome
2.
PLoS Genet ; 20(7): e1011336, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38950081

ABSTRACT

Increasing natural resistance and resilience in plants is key for ensuring food security within a changing climate. Breeders improve these traits by crossing cultivars with their wild relatives and introgressing specific alleles through meiotic recombination. However, some genomic regions are devoid of recombination especially in crosses between divergent genomes, limiting the combinations of desirable alleles. Here, we used pooled-pollen sequencing to build a map of recombinant and non-recombinant regions between tomato and five wild relatives commonly used for introgressive tomato breeding. We detected hybrid-specific recombination coldspots that underscore the role of structural variations in modifying recombination patterns and maintaining genetic linkage in interspecific crosses. Crossover regions and coldspots show strong association with specific TE superfamilies exhibiting differentially accessible chromatin between somatic and meiotic cells. About two-thirds of the genome are conserved coldspots, located mostly in the pericentromeres and enriched with retrotransposons. The coldspots also harbor genes associated with agronomic traits and stress resistance, revealing undesired consequences of linkage drag and possible barriers to breeding. We presented examples of linkage drag that can potentially be resolved by pairing tomato with other wild species. Overall, this catalogue will help breeders better understand crossover localization and make informed decisions on generating new tomato varieties.

3.
Plant J ; 117(4): 1281-1297, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37965720

ABSTRACT

Phytoplasmas are pathogenic bacteria that reprogram plant host development for their own benefit. Previous studies have characterized a few different phytoplasma effector proteins that destabilize specific plant transcription factors. However, these are only a small fraction of the potential effectors used by phytoplasmas; therefore, the molecular mechanisms through which phytoplasmas modulate their hosts require further investigation. To obtain further insights into the phytoplasma infection mechanisms, we generated a protein-protein interaction network between a broad set of phytoplasma effectors and a large, unbiased collection of Arabidopsis thaliana transcription factors and transcriptional regulators. We found widespread, but specific, interactions between phytoplasma effectors and host transcription factors, especially those related to host developmental processes. In particular, many unrelated effectors target specific sets of TCP transcription factors, which regulate plant development and immunity. Comparison with other host-pathogen protein interaction networks shows that phytoplasma effectors have unusual targets, indicating that phytoplasmas have evolved a unique and unusual infection strategy. This study contributes a rich and solid data source that guides further investigations of the functions of individual effectors, as demonstrated for some herein. Moreover, the dataset provides insights into the underlying molecular mechanisms of phytoplasma infection.


Subject(s)
Arabidopsis , Phytoplasma , Transcription Factors/genetics , Transcription Factors/metabolism , Plants/metabolism , Arabidopsis/metabolism , Protein Interaction Mapping , Plant Diseases/microbiology
4.
Nucleic Acids Res ; 51(5): 2363-2376, 2023 03 21.
Article in English | MEDLINE | ID: mdl-36718935

ABSTRACT

It has been known for decades that codon usage contributes to translation efficiency and hence to protein production levels. However, its role in protein synthesis is still only partly understood. This lack of understanding hampers the design of synthetic genes for efficient protein production. In this study, we generated a synonymous codon-randomized library of the complete coding sequence of red fluorescent protein. Protein production levels and the full coding sequences were determined for 1459 gene variants in Escherichia coli. Using different machine learning approaches, these data were used to reveal correlations between codon usage and protein production. Interestingly, protein production levels can be relatively accurately predicted (Pearson correlation of 0.762) by a Random Forest model that only relies on the sequence information of the first eight codons. In this region, close to the translation initiation site, mRNA secondary structure rather than Codon Adaptation Index (CAI) is the key determinant of protein production. This study clearly demonstrates the key role of codons at the start of the coding sequence. Furthermore, these results imply that commonly used CAI-based codon optimization of the full coding sequence is not a very effective strategy. One should rather focus on optimizing protein production via reducing mRNA secondary structure formation with the first few codons.
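The feature extraction and the correlation metric mentioned in this abstract can be illustrated with a minimal sketch. It assumes that "the first eight codons" means the first eight triplets of the coding sequence as given, and it uses a plain one-hot encoding. Both are illustrative assumptions and not necessarily the encoding used in the study. The function names are hypothetical.

```python
import math
from itertools import product

# All 64 codons in a fixed order, so each codon has a stable feature index.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]
CODON_INDEX = {codon: i for i, codon in enumerate(CODONS)}


def first_codons(cds: str, n: int = 8) -> list[str]:
    """Split the start of a coding sequence into its first n codons."""
    cds = cds.upper()
    if len(cds) < 3 * n:
        raise ValueError("sequence is shorter than n codons")
    return [cds[i:i + 3] for i in range(0, 3 * n, 3)]


def one_hot(codons: list[str]) -> list[int]:
    """Encode codons as a flat 0/1 vector of length 64 * len(codons)."""
    vec = [0] * (64 * len(codons))
    for pos, codon in enumerate(codons):
        vec[pos * 64 + CODON_INDEX[codon]] = 1
    return vec


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation between predicted and measured values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

A vector produced by `one_hot(first_codons(cds))` could serve as input to any regression model, and `pearson` compares that model's predictions with measured production levels.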


Subject(s)
Escherichia coli , Machine Learning , Random Allocation , Codon/genetics , Codon/metabolism , Messenger RNA/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Protein Biosynthesis
5.
Article in English | MEDLINE | ID: mdl-38648121

ABSTRACT

The selective pressure of pathogen-host symbiosis drives adaptations. How these interactions shape the metabolism of pathogens is largely unknown. Here, we use comparative genomics to systematically analyse the metabolic networks of oomycetes, a diverse group of eukaryotes that includes saprotrophs as well as animal and plant pathogens, the latter causing devastating diseases with significant economic and/or ecological impact. In our analyses of 44 oomycete species, we uncover considerable variation in metabolism that can be linked to lifestyle differences. Comparisons of metabolic gene content reveal that plant pathogenic oomycetes have a bipartite metabolism consisting of a conserved core and an accessory set. The accessory set can be associated with the degradation of defence compounds produced by plants when challenged by pathogens. Obligate biotrophic oomycetes have smaller metabolic networks, and taxonomically distantly related biotrophic lineages display convergent evolution by repeated gene losses in both the conserved and the accessory set of metabolism. When investigating to what extent the metabolic networks in obligate biotrophs differ from those in hemibiotrophic plant pathogens, we observe that the losses of metabolic enzymes in obligate biotrophs are not random and that gene losses predominantly influence the terminal branches of the metabolic networks. Our analyses represent the first metabolism-focused comparison of oomycetes at this scale and will contribute to a better understanding of the evolution of oomycete metabolism in relation to lifestyle adaptation.

6.
Plant J ; 112(5): 1298-1315, 2022 12.
Article in English | MEDLINE | ID: mdl-36239071

ABSTRACT

Photosynthesis is a key process in sustaining plant and human life. Improving the photosynthetic capacity of agricultural crops is an attractive means to increase their yields. While the core mechanisms of photosynthesis are highly conserved in C3 plants, these mechanisms are very flexible, allowing considerable diversity in photosynthetic properties. Among this diversity is the maintenance of high photosynthetic light-use efficiency at high irradiance as identified in a small number of exceptional C3 species. Hirschfeldia incana, a member of the Brassicaceae family, is such an exceptional species, and because it is easy to grow, it is an excellent model for studying the genetic and physiological basis of this trait. Here, we present a reference genome of H. incana and confirm its high photosynthetic light-use efficiency. While H. incana has the highest photosynthetic rates found so far in the Brassicaceae, the light-saturated assimilation rates of closely related Brassica rapa and Brassica nigra are also high. The H. incana genome has extensively diversified from that of B. rapa and B. nigra through large chromosomal rearrangements, species-specific transposon activity, and differential retention of duplicated genes. Duplicated genes in H. incana, B. rapa, and B. nigra that are involved in photosynthesis and/or photoprotection show a positive correlation between copy number and gene expression, providing leads into the mechanisms underlying the high photosynthetic efficiency of these species. Our work demonstrates that the H. incana genome serves as a valuable resource for studying the evolution of high photosynthetic light-use efficiency and enhancing photosynthetic rates in crop species.


Subject(s)
Brassica rapa , Brassicaceae , Humans , Brassicaceae/metabolism , Photosynthesis/genetics , Agricultural Crops , Phenotype
7.
Mol Biol Evol ; 39(1)2022 01 07.
Article in English | MEDLINE | ID: mdl-34597400

ABSTRACT

Meiotic recombination is a biological process of key importance in breeding, to generate genetic diversity and develop novel or agronomically relevant haplotypes. In crop tomato, recombination is curtailed as manifested by linkage disequilibrium decay over a longer distance and reduced diversity compared with wild relatives. Here, we compared domesticated and wild populations of tomato and found an overall conserved recombination landscape, with local changes in effective recombination rate in specific genomic regions. We also studied the dynamics of recombination hotspots resulting from domestication and found that loss of such hotspots is associated with selective sweeps, most notably in the pericentromeric heterochromatin. We detected footprints of genetic changes and structural variants, some of them associated with transposable elements, linked with hotspot divergence during domestication, likely causing fine-scale alterations to recombination patterns and resulting in linkage drag.


Subject(s)
Domestication , Solanum lycopersicum , DNA Transposable Elements/genetics , Solanum lycopersicum/genetics , Plant Breeding , Genetic Recombination
8.
Bioinformatics ; 38(18): 4403-4405, 2022 09 15.
Article in English | MEDLINE | ID: mdl-35861394

ABSTRACT

SUMMARY: The ever-increasing number of sequenced genomes necessitates the development of pangenomic approaches for comparative genomics. Introduced in 2016, PanTools is a platform that allows pangenome construction, homology grouping and pangenomic read mapping. The use of graph database technology makes PanTools versatile, applicable from small viral genomes like SARS-CoV-2 up to large plant or animal genomes like tomato or human. Here, we present our third major update to PanTools that enables the integration of functional annotations and provides both gene-level analyses and phylogenetics. AVAILABILITY AND IMPLEMENTATION: PanTools is implemented in Java 8 and released under the GNU GPLv3 license. Software and documentation are available at https://git.wur.nl/bioinformatics/pantools. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , Phylogeny , SARS-CoV-2/genetics , Software , Viral Genome
9.
PLoS Genet ; 16(9): e1009027, 2020 09.
Article in English | MEDLINE | ID: mdl-32966296

ABSTRACT

The availability of genomes for many species has advanced our understanding of the non-protein-coding fraction of the genome. Comparative genomics has proven itself to be an invaluable approach for the systematic, genome-wide identification of conserved non-protein-coding elements (CNEs). However, for many non-mammalian model species, including chicken, our capability to interpret the functional importance of variants overlapping CNEs has been limited by current genomic annotations, which rely on a single information type (e.g. conservation). We here studied CNEs in chicken using a combination of population genomics and comparative genomics. To investigate the functional importance of variants found in CNEs we developed a ch(icken) Combined Annotation-Dependent Depletion (chCADD) model, a variant effect prediction tool first introduced for humans and later on for mouse and pig. We show that 73 Mb of the chicken genome has been conserved across more than 280 million years of vertebrate evolution. The vast majority of the conserved elements are in non-protein-coding regions, which display SNP densities and allele frequency distributions characteristic of genomic regions constrained by purifying selection. By annotating SNPs with the chCADD score we are able to pinpoint specific subregions of the CNEs to be of higher functional importance, as supported by the finding that SNPs in these subregions are associated with known disease genes in humans, mice, and rats. Taken together, our findings indicate that CNEs harbor variants of functional significance that should be the object of further investigation along with protein-coding mutations. We therefore anticipate chCADD to be of great use to the scientific community and breeding companies in future functional studies in chicken.


Subject(s)
Chickens/genetics , Intergenic DNA/genetics , Genomics/methods , Alleles , Animals , Conserved Sequence/genetics , Intergenic DNA/metabolism , Molecular Evolution , Gene Frequency/genetics , Genetic Variation/genetics , Genome/genetics , Introns/genetics , Metagenomics/methods , Single Nucleotide Polymorphism/genetics , Sequence Analysis/methods
10.
Haematologica ; 107(1): 143-153, 2022 01 01.
Article in English | MEDLINE | ID: mdl-33596640

ABSTRACT

T-cell prolymphocytic leukemia (T-PLL) is mostly characterized by aberrant expansion of small- to medium-sized prolymphocytes with a mature post-thymic phenotype, high aggressiveness of the disease and poor prognosis. However, T-PLL is more heterogeneous with a wide range of clinical, morphological, and molecular features, which occasionally impedes the diagnosis. We hypothesized that T-PLL consists of phenotypic and/or genotypic subgroups that may explain the heterogeneity of the disease. Multi-dimensional immuno-phenotyping and gene expression profiling did not reveal clear T-PLL subgroups, and no clear T-cell receptor α or β CDR3 skewing was observed between different T-PLL cases. We revealed that the expression of microRNA (miRNA) is aberrant and often heterogeneous in T-PLL. We identified 35 miRNA that were aberrantly expressed in T-PLL with miR-200c/141 as the most differentially expressed cluster. High miR-200c/141 and miR-181a/181b expression was significantly correlated with increased white blood cell counts and poor survival. Furthermore, we found that overexpression of miR-200c/141 correlated with downregulation of their targets ZEB2 and TGFβR3 and aberrant TGFβ1-induced phosphorylated SMAD2 (p-SMAD2) and p-SMAD3, indicating that the TGFβ pathway is affected in T-PLL. Our results thus highlight the potential role for aberrantly expressed oncogenic miRNA in T-PLL and pave the way for new therapeutic targets in this disease.


Subject(s)
T-Cell Prolymphocytic Leukemia , MicroRNAs , Gene Expression Profiling , Humans , T-Cell Prolymphocytic Leukemia/diagnosis , T-Cell Prolymphocytic Leukemia/genetics , T-Cell Prolymphocytic Leukemia/therapy , Lymphocytes , MicroRNAs/genetics , Transforming Growth Factor beta , Zinc Finger E-box Binding Homeobox 2/genetics
11.
PLoS Comput Biol ; 17(3): e1008197, 2021 03.
Article in English | MEDLINE | ID: mdl-33750949

ABSTRACT

Sesquiterpene synthases (STSs) catalyze the formation of a large class of plant volatiles called sesquiterpenes. While thousands of putative STS sequences from diverse plant species are available, only a small number of them have been functionally characterized. Sequence identity-based screening for desired enzymes, often used in biotechnological applications, is difficult to apply here as STS sequence similarity is strongly affected by species. This calls for more sophisticated computational methods for functionality prediction. We investigate the specificity of precursor cation formation in these elusive enzymes. By inspecting multi-product STSs, we demonstrate that STSs have a strong selectivity towards one precursor cation. We use a machine learning approach combining sequence and structure information to accurately predict precursor cation specificity for STSs across all plant species. We combine this with a co-evolutionary analysis on the wealth of uncharacterized putative STS sequences, to pinpoint residues and distant functional contacts influencing cation formation and reaction pathway selection. These structural factors can be used to predict and engineer enzymes with specific functions, as we demonstrate by predicting and characterizing two novel STSs from Citrus bergamia.


Subject(s)
Alkyl and Aryl Transferases/metabolism , Molecular Evolution , Machine Learning , Plants/enzymology , Sesquiterpenes/metabolism , Alkyl and Aryl Transferases/chemistry , Amino Acid Sequence , Cations , Protein Conformation , Amino Acid Sequence Homology , Substrate Specificity
12.
Genomics ; 113(4): 2229-2239, 2021 07.
Article in English | MEDLINE | ID: mdl-34022350

ABSTRACT

The genotype-phenotype link is a major research topic in the life sciences but remains highly complex to disentangle. Part of the complexity arises from the number of genes contributing to the observed phenotype. Despite the vast increase of molecular data, pinpointing the causal variant underlying a phenotype of interest is still challenging. In this study, we present an approach to map causal variation and molecular pathways underlying important phenotypes in pigs. We prioritize variation by utilizing and integrating predicted variant impact scores (pCADD), functional genomic information, and associated phenotypes in other mammalian species. We demonstrate the efficacy of our approach by reporting known and novel causal variants, of which many affect non-coding sequences. Our approach allows the disentangling of the biology behind important phenotypes by accelerating the discovery of novel causal variants and molecular mechanisms affecting important phenotypes in pigs. This information on molecular mechanisms could be applicable in other mammalian species, including humans.


Subject(s)
Genetic Variation , Genomics , Animals , Genotype , Mammals , Phenotype , Swine/genetics
13.
Biophys J ; 120(16): 3253-3260, 2021 08 17.
Article in English | MEDLINE | ID: mdl-34237288

ABSTRACT

Förster resonance energy transfer (FRET) is a useful phenomenon in biomolecular investigations, as it can be leveraged for nanoscale measurements. The optical signals produced by such experiments can be analyzed by fitting a statistical model. Several software tools exist to fit such models in an unsupervised manner but lack the flexibility to adapt to different experimental setups and require local installations. Here, we propose to fit models to optical signals more intuitively by adopting a semisupervised approach, in which the user interactively guides the model to fit a given data set, and introduce FRETboard, a web tool that allows users to provide such guidance. We show that our approach is able to closely reproduce ground truth FRET statistics in a wide range of simulated single-molecule scenarios and correctly estimate parameters for up to 11 states. On in vitro data, we retrieve parameters identical to those obtained by laborious manual classification in a fraction of the required time. Moreover, we designed FRETboard to be easily extendable to other models, allowing it to adapt to future developments in FRET measurement and analysis.


Subject(s)
Fluorescence Resonance Energy Transfer , Software , Nanotechnology
14.
Plant J ; 102(3): 480-492, 2020 05.
Article in English | MEDLINE | ID: mdl-31820490

ABSTRACT

Genome wide screening of pooled pollen samples from a single interspecific F1 hybrid obtained from a cross between tomato, Solanum lycopersicum and its wild relative, Solanum pimpinellifolium using linked read sequencing of the haploid nuclei, allowed profiling of the crossover (CO) and gene conversion (GC) landscape. We observed a striking overlap between cold regions of CO in the male gametes and our previously established F6 recombinant inbred lines (RILs) population. COs were overrepresented in non-coding regions in the gene promoter and 5'UTR regions of genes. Poly-A/T and AT rich motifs were found enriched in 1 kb promoter regions flanking the CO sites. Non-crossover associated allelic and ectopic GCs were detected in most chromosomes, confirming that besides CO, GC represents also a source for genetic diversity and genome plasticity in tomato. Furthermore, we identified processed break junctions pointing at the involvement of both homology directed and non-homology directed repair pathways, suggesting a recombination machinery in tomato that is more complex than currently anticipated.


Subject(s)
Meiosis/physiology , Solanum lycopersicum/cytology , Solanum lycopersicum/genetics , 5' Untranslated Regions/genetics , Plant Chromosomes/genetics , Genetic Crossing Over , Plant Genome/genetics , Genotype , Meiosis/genetics , Genetic Promoter Regions/genetics , DNA Sequence Analysis
15.
BMC Genomics ; 22(1): 265, 2021 Apr 14.
Article in English | MEDLINE | ID: mdl-33849459

ABSTRACT

BACKGROUND: Bacterial plant pathogens of the Pectobacterium genus are responsible for a wide spectrum of diseases in plants, including important crops such as potato, tomato, lettuce, and banana. Investigation of the genetic diversity underlying virulence and host specificity can be performed at genome level by using a comprehensive comparative approach called pangenomics. A pangenomic approach, using newly developed functionalities in PanTools, was applied to analyze the complex phylogeny of the Pectobacterium genus. We specifically used the pangenome to investigate genetic differences between virulent and avirulent strains of P. brasiliense, a potato blackleg causing species dominantly present in Western Europe. RESULTS: Here we generated a multilevel pangenome for Pectobacterium, comprising 197 strains across 19 species, including type strains, with a focus on P. brasiliense. The extensive phylogenetic analysis of the Pectobacterium genus showed robust distinct clades, with most detail provided by 452,388 parsimony-informative single-nucleotide polymorphisms identified in single-copy orthologs. The average Pectobacterium genome consists of 47% core genes, 1% unique genes, and 52% accessory genes. Using the pangenome, we zoomed in on differences between virulent and avirulent P. brasiliense strains and identified 86 genes associated with virulent strains. We found that the organization of genes is highly structured and linked with gene conservation, function, and transcriptional orientation. CONCLUSION: The pangenome analysis demonstrates that evolution in Pectobacteria is a highly dynamic process, including gene acquisitions partly in clusters, genome rearrangements, and loss of genes. Pectobacterium species are typically not characterized by a set of species-specific genes, but instead present themselves using new gene combinations from the shared gene pool. A multilevel pangenomic approach, fusing DNA, protein, biological function, taxonomic group, and phenotypes, facilitates studies in a flexible taxonomic context.
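The core, accessory and unique gene categories reported in this abstract can be illustrated with a small sketch. It assumes a common convention: a homology group is core if it occurs in every genome, unique if it occurs in exactly one, and accessory otherwise. The exact thresholds used in the study may differ, and the function name and input layout are hypothetical. This is not PanTools code.

```python
def classify_groups(groups: dict[str, set[str]], genomes: set[str]) -> dict[str, str]:
    """Label each homology group as 'core', 'unique' or 'accessory'.

    groups maps a group identifier to the set of genomes that contain
    at least one member gene; genomes is the full set under study.
    """
    labels = {}
    for group_id, members in groups.items():
        present = members & genomes
        if not present:
            continue  # group has no member in the genomes of interest
        if present == genomes:
            labels[group_id] = "core"
        elif len(present) == 1:
            labels[group_id] = "unique"
        else:
            labels[group_id] = "accessory"
    return labels


def category_fractions(labels: dict[str, str]) -> dict[str, float]:
    """Fraction of labelled groups that fall into each category."""
    total = len(labels)
    fractions = {"core": 0.0, "accessory": 0.0, "unique": 0.0}
    for label in labels.values():
        fractions[label] += 1 / total
    return fractions
```

Restricting `genomes` to a subset, for example only the strains of one species, gives the same classification at a different taxonomic level.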


Subject(s)
Pectobacterium , Solanum tuberosum , Europe , Gene Pool , Pectobacterium/genetics , Phylogeny , Plant Diseases , Solanum tuberosum/genetics
16.
Bioinformatics ; 36(Suppl_2): i718-i725, 2020 12 30.
Article in English | MEDLINE | ID: mdl-33381814

ABSTRACT

MOTIVATION: As the number of experimentally solved protein structures rises, it becomes increasingly appealing to use structural information for predictive tasks involving proteins. Due to the large variation in protein sizes, folds and topologies, an attractive approach is to embed protein structures into fixed-length vectors, which can be used in machine learning algorithms aimed at predicting and understanding functional and physical properties. Many existing embedding approaches are alignment based, which is both time-consuming and ineffective for distantly related proteins. On the other hand, library- or model-based approaches depend on a small library of fragments or require the use of a trained model, both of which may not generalize well. RESULTS: We present Geometricus, a novel and universally applicable approach to embedding proteins in a fixed-dimensional space. The approach is fast, accurate, and interpretable. Geometricus uses a set of 3D moment invariants to discretize fragments of protein structures into shape-mers, which are then counted to describe the full structure as a vector of counts. We demonstrate the applicability of this approach in various tasks, ranging from fast structure similarity search, unsupervised clustering and structure classification across proteins from different superfamilies as well as within the same family. AVAILABILITY AND IMPLEMENTATION: Python code available at https://git.wur.nl/durai001/geometricus.
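The idea of discretizing fragments into shape-mers and counting them can be sketched as follows. The sketch assumes that the moment invariants of each fragment have already been computed, and it uses a simple floor-based binning chosen for illustration. It does not reproduce the Geometricus implementation or its API, and all names are hypothetical.

```python
import math
from collections import Counter

Invariants = tuple[float, ...]
ShapeMer = tuple[int, ...]


def to_shapemer(invariants: Invariants, resolution: float = 1.0) -> ShapeMer:
    """Discretize a fragment's real-valued invariants into integer bins."""
    return tuple(math.floor(v * resolution) for v in invariants)


def build_vocabulary(proteins: list[list[Invariants]], resolution: float = 1.0) -> list[ShapeMer]:
    """Collect every shape-mer seen in any protein, in a fixed sorted order."""
    seen = {to_shapemer(f, resolution) for fragments in proteins for f in fragments}
    return sorted(seen)


def count_vector(fragments: list[Invariants], vocabulary: list[ShapeMer],
                 resolution: float = 1.0) -> list[int]:
    """Embed one protein as shape-mer counts over a shared vocabulary."""
    counts = Counter(to_shapemer(f, resolution) for f in fragments)
    return [counts[shapemer] for shapemer in vocabulary]
```

Because every protein is counted against the same vocabulary, the resulting vectors have equal length regardless of protein size, which is what makes them usable as input for clustering or classification.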


Subject(s)
Algorithms , Proteins , Cluster Analysis , Machine Learning
17.
BMC Bioinformatics ; 21(1): 253, 2020 Jun 18.
Article in English | MEDLINE | ID: mdl-32552661

ABSTRACT

BACKGROUND: Haplotype information is essential for many genetic and genomic analyses, including genotype-phenotype associations in humans, animals and plants. Haplotype assembly is a method for reconstructing haplotypes from DNA sequencing reads. With the advent of new sequencing technologies, new algorithms are needed to ensure long and accurate haplotypes. While a few linked-read haplotype assembly algorithms are available for diploid genomes, to the best of our knowledge, no algorithms have yet been proposed for polyploids specifically exploiting linked reads. RESULTS: The first haplotyping algorithm designed for linked reads generated from a polyploid genome is presented, built on a typical short-read haplotyping method, SDhaP. Using the input aligned reads and called variants, the haplotype-relevant information is extracted. Next, reads with the same barcodes are combined to produce molecule-specific fragments. Then, these fragments are clustered into strongly connected components which are then used as input of a haplotype assembly core in order to estimate accurate and long haplotypes. CONCLUSIONS: Hap10 is a novel algorithm for haplotype assembly of polyploid genomes using linked reads. The performance of the algorithms is evaluated in a number of simulation scenarios and its applicability is demonstrated on a real dataset of sweet potato.
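The step in which reads with the same barcode are combined into molecule-specific fragments can be sketched as below. This is a simplified illustration and not the Hap10 implementation. It assumes each read has already been reduced to its allele calls at variant sites, that one barcode corresponds to one molecule, and that conflicting calls are settled by majority vote with ties dropped. All names are hypothetical.

```python
from collections import Counter, defaultdict

# A read is (barcode, {variant_site_index: allele}).
Read = tuple[str, dict[int, int]]


def merge_by_barcode(reads: list[Read]) -> dict[str, dict[int, int]]:
    """Combine the allele calls of all reads that share a barcode."""
    votes: dict[str, dict[int, Counter]] = defaultdict(lambda: defaultdict(Counter))
    for barcode, calls in reads:
        for site, allele in calls.items():
            votes[barcode][site][allele] += 1

    fragments = {}
    for barcode, sites in votes.items():
        fragment = {}
        for site, counter in sites.items():
            (top_allele, top_count), *rest = counter.most_common(2)
            if rest and rest[0][1] == top_count:
                continue  # tie between alleles: leave the site uncalled
            fragment[site] = top_allele
        fragments[barcode] = fragment
    return fragments
```

The merged fragments span more variant sites than any single short read, which is what allows longer haplotype blocks to be connected in the later clustering step.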


Subject(s)
Human Genome/genetics , Haplotypes/physiology , Polyploidy , Algorithms , Humans
18.
Mol Plant Microbe Interact ; 33(5): 742-753, 2020 May.
Article in English | MEDLINE | ID: mdl-32237964

ABSTRACT

Along with Plasmopara destructor, Peronospora belbahrii has arguably been the economically most important newly emerging downy mildew pathogen of the past two decades. Originating from Africa, it has started devastating basil production throughout the world, most likely due to the distribution of infested seed material. Here, we present the genome of this pathogen and results from comparisons of its genomic features to other oomycetes. The assembly of the nuclear genome was around 35.4 Mbp in length, with an N50 scaffold length of around 248 kbp and an L50 scaffold count of 46. The circular mitochondrial genome consisted of around 40.1 kbp. From the repeat-masked genome, 9,049 protein-coding genes were predicted, out of which 335 were predicted to have extracellular functions, representing the smallest secretome so far found in peronosporalean oomycetes. About 16% of the genome consists of repetitive sequences, and, based on simple sequence repeat regions, we provide a set of microsatellites that could be used for population genetic studies of P. belbahrii. P. belbahrii has undergone a high degree of convergent evolution with other obligate parasitic pathogen groups, reflecting its obligate biotrophic lifestyle. Features of its secretome, signaling networks, and promoters are presented, and some patterns are hypothesized to reflect the high degree of host specificity in Peronospora species. In addition, we suggest the presence of additional virulence factors apart from classical effector classes that are promising candidates for future functional studies.
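The N50 and L50 statistics quoted in this abstract follow a standard definition that can be written out in a few lines. The sketch below uses that definition: scaffolds are sorted from longest to shortest, and N50 is the length of the scaffold at which the running total first reaches half of the assembly size, while L50 is the number of scaffolds needed to get there. The function name is hypothetical and the input lengths in any usage are made up for illustration.

```python
def n50_l50(lengths: list[int]) -> tuple[int, int]:
    """Return (N50, L50) for a list of scaffold lengths."""
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable: running total always reaches half")
```

For example, scaffolds of 80, 70, 50, 40, 30 and 20 kbp total 290 kbp, and the two longest already cover at least 145 kbp, so the N50 is 70 kbp and the L50 is 2.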


Subject(s)
Mitochondrial Genome , Peronospora/genetics , Genomics , Plant Diseases/microbiology , Genetic Promoter Regions
19.
Brief Bioinform ; 19(3): 387-403, 2018 05 01.
Article in English | MEDLINE | ID: mdl-28065918

ABSTRACT

Haplotypes are the units of inheritance in an organism, and many genetic analyses depend on their precise determination. Methods for haplotyping single individuals use the phasing information available in next-generation sequencing reads, by matching overlapping single-nucleotide polymorphisms while penalizing post hoc nucleotide corrections made. Haplotyping diploids is relatively easy, but the complexity of the problem increases drastically for polyploid genomes, which are found in both model organisms and in economically relevant plant and animal species. Although a number of tools are available for haplotyping polyploids, the effects of the genomic makeup and the sequencing strategy followed on the accuracy of these methods have hitherto not been thoroughly evaluated. We developed the simulation pipeline haplosim to evaluate the performance of three haplotype estimation algorithms for polyploids: HapCompass, HapTree and SDhaP, in settings varying in sequencing approach, ploidy levels and genomic diversity, using tetraploid potato as the model. Our results show that sequencing depth is the major determinant of haplotype estimation quality, that 1 kb PacBio circular consensus sequencing reads and Illumina reads with large insert-sizes are competitive and that all methods fail to produce good haplotypes when ploidy levels increase. Comparing the three methods, HapTree produces the most accurate estimates, but also consumes the most resources. There is clearly room for improvement in polyploid haplotyping algorithms.


Subject(s)
Computer Simulation , Haplotypes , High-Throughput Nucleotide Sequencing/methods , Polyploidy , DNA Sequence Analysis/methods , Solanum tuberosum/genetics , Algorithms , Plant Genome , Genomics
20.
Bioinformatics ; 35(15): 2663-2664, 2019 08 01.
Article in English | MEDLINE | ID: mdl-30590415

ABSTRACT

SUMMARY: Nanopore sequencing is a novel development in nucleic acid analysis. As such, nanopore-sequencing hardware and software are updated frequently and extensively, which quickly renders peer-reviewed publications on analysis pipeline benchmarking efforts outdated. To provide the user community with a faster, more flexible alternative to peer-reviewed benchmark papers for de novo assembly tool performance we constructed poreTally, a comprehensive benchmarking tool. poreTally automatically assembles a given read set using several often-used assembly pipelines, analyzes the resulting assemblies for correctness and continuity, and finally generates a quality report, which can immediately be published on Github/Gitlab. AVAILABILITY AND IMPLEMENTATION: poreTally is available on Github at https://github.com/cvdelannoy/poreTally, under an MIT license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Nanopores , Benchmarking , High-Throughput Nucleotide Sequencing , DNA Sequence Analysis , Software