Results 1 - 20 of 53
1.
IMA Fungus ; 15(1): 13, 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38849861

ABSTRACT

The Terminal Fusarium Clade (TFC) is a group in the Nectriaceae family with agricultural and clinical relevance. In recent years, various phylogenies have been presented in the literature, showing disagreement in the topologies, but only a few studies have conducted analyses on the divergence time scale of the group. Therefore, the evolutionary history of this group is still being determined. This study aimed to understand the evolutionary history of the TFC from a phylogenomic perspective. To achieve this objective, we performed a phylogenomic analysis using the available genomes in GenBank and ran eight different pipelines. We presented a new robust topology of the TFC that differs at some nodes from previous studies. These new relationships allowed us to formulate new hypotheses about the evolutionary history of the TFC. We also inferred new divergence time estimates, which differ from those of previous studies due to topology discordances and taxon sampling. The results suggested an important diversification process in the Neogene period, likely associated with the diversification and predominance of terrestrial ecosystems by angiosperms. In conclusion, we presented a robust time-scale phylogeny that allowed us to formulate new hypotheses regarding the evolutionary history of the TFC.

2.
Gigascience ; 13, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38206589

ABSTRACT

BACKGROUND: Structural variants (SVs) are genomic polymorphisms defined by their length (>50 bp). The usual types of SVs are deletions, insertions, translocations, inversions, and copy number variants. SV detection and genotyping are fundamental given the role of SVs in phenomena such as phenotypic variation and evolutionary events. Thus, methods to identify SVs using long-read sequencing data have been recently developed. FINDINGS: We present an accurate and efficient algorithm to predict germline SVs from long-read sequencing data. The algorithm starts by collecting evidence (signatures) of SVs from read alignments. Then, signatures are clustered based on a Euclidean graph with coordinates calculated from lengths and genomic positions. Clustering is performed by the DBSCAN algorithm, which provides the advantage of delimiting clusters with high resolution. Clusters are transformed into SVs, and a Bayesian model genotypes the SVs precisely based on their supporting evidence. This algorithm is integrated into the single sample variants detector of the Next Generation Sequencing Experience Platform, which facilitates the integration with other functionalities for genomics analysis. We performed multiple benchmark experiments, including simulation and real data, representing different genome profiles, sequencing technologies (PacBio HiFi, ONT), and read depths. CONCLUSION: The results show that our approach outperformed state-of-the-art tools on germline SV calling and genotyping, especially at low depths and in error-prone repetitive regions. We believe this work significantly contributes to the development of bioinformatic strategies to maximize the use of long-read sequencing technologies.


Subject(s)
Algorithms , Benchmarking , Bayes Theorem , Genotype , Cluster Analysis
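Illustrative note on record 2: the clustering step described in the abstract (SV signatures placed as points by genomic position and length, then grouped with DBSCAN) can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the implementation in the published tool: the signature tuples, the unscaled Euclidean distance, and the `eps` and `min_pts` values are all hypothetical choices made for illustration.

```python
from math import hypot


def dbscan(points, eps, min_pts):
    """Label each 2D point with a cluster id (0, 1, ...) or -1 for noise."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)

    def neighbors(i):
        # All points (including i itself) within distance eps of point i.
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = noise  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical signatures as (genomic position, SV length) pairs.
signatures = [(1000, 300), (1005, 310), (998, 295),
              (5000, 80), (5003, 82), (5001, 79),
              (9000, 1200)]
labels = dbscan(signatures, eps=20, min_pts=3)
print(labels)
```

Each resulting cluster would correspond to one candidate SV, and the isolated signature is left as noise. In practice the two coordinates would likely need scaling so that position and length contribute comparably to the distance; that choice is not specified in the abstract.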
3.
Sci Rep ; 14(1): 2054, 2024 01 24.
Article in English | MEDLINE | ID: mdl-38267502

ABSTRACT

Chagas is an endemic disease in tropical regions of Latin America, caused by the parasite Trypanosoma cruzi. High intraspecies variability and genome complexity have been challenges to assemble high quality genomes needed for studies in evolution, population genomics, diagnosis and drug development. Here we present a chromosome-level phased assembly of a TcI T. cruzi strain (Dm25). While 29 chromosomes show a large collinearity with the assembly of the Brazil A4 strain, three chromosomes show both large heterozygosity and large divergence, compared to previous assemblies of TcI T. cruzi strains. Nucleotide and protein evolution statistics indicate that T. cruzi Marinkellei separated before the diversification of T. cruzi in the known DTUs. Interchromosomal paralogs of dispersed gene families and histones appeared before but at the same time have a more strict purifying selection, compared to other repeat families. Previously unreported large tandem arrays of protein kinases and histones were identified in this assembly. Over one million variants obtained from Illumina reads aligned to the primary assembly clearly separate the main DTUs. We expect that this new assembly will be a valuable resource for further studies on evolution and functional genomics of Trypanosomatids.


Subject(s)
Chagas Disease , Trypanosoma cruzi , Humans , Trypanosoma cruzi/genetics , Colombia , Histones , Brazil
4.
Commun Biol ; 6(1): 803, 2023 08 02.
Article in English | MEDLINE | ID: mdl-37532823

ABSTRACT

The domestication process in lima bean (Phaseolus lunatus L.) involves two independent events, within the Mesoamerican and Andean gene pools. This makes lima bean an excellent model to understand convergent evolution. The mechanisms of adaptation followed by Mesoamerican and Andean landraces are largely unknown. Genes related to these adaptations can be selected by identification of selective sweeps within gene pools. Previous genetic analyses in lima bean have relied on Single Nucleotide Polymorphism (SNP) loci, and have ignored transposable elements (TEs). Here we show the analysis of whole-genome sequencing data from 61 lima bean accessions to characterize a genomic variation database including TEs and SNPs, to associate selective sweeps with variable TEs and to predict candidate domestication genes. A small percentage of genes under selection are shared among gene pools, suggesting that domestication followed different genetic avenues in both gene pools. About 75% of TEs are located close to genes, which shows their potential to affect gene functions. The genetic structure inferred from variable TEs is consistent with that obtained from SNP markers, suggesting that TE dynamics can be related to the demographic history of wild and domesticated lima bean and its adaptive processes, in particular selection processes during domestication.


Subject(s)
Phaseolus , Phaseolus/genetics , DNA Transposable Elements/genetics , Polymorphism, Single Nucleotide , Population Dynamics
5.
Appl Plant Sci ; 11(4): e11520, 2023.
Article in English | MEDLINE | ID: mdl-37601317

ABSTRACT

Premise: Transposable elements (TEs) make up more than half of the genomes of complex plant species and can modulate the expression of neighboring genes, producing significant variability of agronomically relevant traits. The availability of long-read sequencing technologies allows the building of genome assemblies for plant species with large and complex genomes. Unfortunately, TE annotation currently represents a bottleneck in the annotation of genome assemblies. Methods and Results: We present a new functionality of the Next-Generation Sequencing Experience Platform (NGSEP) to perform efficient homology-based TE annotation. Sequences in a reference library are treated as long reads and mapped to an input genome assembly. A hierarchical annotation is then assigned by homology using the annotation of the reference library. We tested the performance of our algorithm on genome assemblies of different plant species, including Arabidopsis thaliana, Oryza sativa, Coffea humblotiana, and Triticum aestivum (bread wheat). Our algorithm outperforms traditional homology-based annotation tools in speed by a factor of three to >20, reducing the annotation time of the T. aestivum genome from months to hours, and recovering up to 80% of TEs annotated with RepeatMasker with a precision of up to 0.95. Conclusions: NGSEP allows rapid analysis of TEs, especially in very large and TE-rich plant genomes.

6.
Article in English | MEDLINE | ID: mdl-37188652

ABSTRACT

BACKGROUND: Danger-associated molecular patterns (DAMPs) may be implicated in the pathophysiological pathways associated with an unfavorable outcome after acute brain injury (ABI). METHODS: We collected samples of ventricular cerebrospinal fluid (vCSF) for 5 days in 50 consecutive patients at risk of intracranial hypertension after traumatic and nontraumatic ABI. Differences in vCSF protein expression over time were evaluated using linear models and selected for functional network analysis using the PANTHER and STRING databases. The primary exposure of interest was the type of brain injury (traumatic vs. nontraumatic), and the primary outcome was the vCSF expression of DAMPs. Secondary exposures of interest included the occurrence of intracranial pressure ≥20 or ≥30 mm Hg during the 5 days post-ABI, intensive care unit (ICU) mortality, and neurological outcome (assessed using the Glasgow Outcome Score) at 3 months post-ICU discharge. Secondary outcomes included associations of these exposures with the vCSF expression of DAMPs. RESULTS: A network of 6 DAMPs (DAMP_trauma; protein-protein interaction [PPI] P=0.04) was differentially expressed in patients with ABI of traumatic origin compared with those with nontraumatic ABI. ABI patients with intracranial pressure ≥30 mm Hg differentially expressed a set of 38 DAMPs (DAMP_ICP30; PPI P<0.001). Proteins in DAMP_ICP30 are involved in cellular proteolysis, complement pathway activation, and post-translational modifications. There were no relationships between DAMP expression and ICU mortality or unfavorable versus favorable outcomes. CONCLUSIONS: Specific patterns of vCSF DAMP expression differentiated between traumatic and nontraumatic types of ABI and were associated with increased episodes of severe intracranial hypertension.

7.
Food Res Int ; 165: 112555, 2023 03.
Article in English | MEDLINE | ID: mdl-36869541

ABSTRACT

The global chocolate market has grown during the last decade and is expected to reach a value of USD 200 billion by 2028. Chocolate is obtained from different varieties of Theobroma cacao L., a plant domesticated more than 4000 years ago in the Amazon rainforest. However, chocolate production is a complex process requiring extensive post-harvest processing, mainly involving cocoa bean fermentation, drying, and roasting. These steps have a critical impact on chocolate quality. Standardizing and better understanding cocoa processing is, therefore, a current challenge to boost the global production of high-quality cocoa. This knowledge can also help cocoa producers improve cocoa processing management and obtain a better chocolate. Several recent studies have been conducted to dissect cocoa processing via omics analysis. A vast amount of data has been produced by omics studies of cocoa processing performed worldwide. This review systematically analyzes the current data on cocoa omics using data mining techniques and discusses opportunities and gaps for cocoa processing standardization from this data. First, we observed that metagenomics studies recurrently report species of the fungal genera Candida and Pichia as well as bacteria from the genera Lactobacillus, Acetobacter, and Bacillus. Second, our analyses of the available metabolomics data showed clear differences in the identified metabolites in cocoa and chocolate from different geographical origins, cocoa types, and processing stages. Finally, our analysis of peptidomics data revealed characteristic patterns in the gathered data, including higher diversity and lower size distribution of peptides in fine-flavor cocoa. In addition, we discuss the current challenges in cocoa omics research. More research is still required to fill gaps in central matters in chocolate production, such as starter cultures for cocoa fermentation, flavor evolution of cocoa, and the role of peptides in the development of specific flavor notes. We also offer the most comprehensive collection of multi-omics data in cocoa processing gathered from different research articles.


Subject(s)
Bacillus , Cacao , Chocolate , Food , Candida
8.
Life Sci Alliance ; 6(5), 2023 05.
Article in English | MEDLINE | ID: mdl-36813568

ABSTRACT

Building de novo genome assemblies for complex genomes is possible thanks to long-read DNA sequencing technologies. However, maximizing the quality of assemblies based on long reads is a challenging task that requires the development of specialized data analysis techniques. We present new algorithms for assembling long DNA sequencing reads from haploid and diploid organisms. The assembly algorithm builds an undirected graph with two vertices for each read based on minimizers selected by a hash function derived from the k-mer distribution. Statistics collected during the graph construction are used as features to build layout paths by selecting edges, ranked by a likelihood function. For diploid samples, we integrated a reimplementation of the ReFHap algorithm to perform molecular phasing. We ran the implemented algorithms on PacBio HiFi and Nanopore sequencing data taken from haploid and diploid samples of different species. Our algorithms showed competitive accuracy and computational efficiency, compared with other currently used software. We expect that this new development will be useful for researchers building genome assemblies for different species.


Subject(s)
Algorithms , High-Throughput Nucleotide Sequencing , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Genome , Software
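Illustrative note on record 8: the abstract mentions minimizers selected by a hash function as the basis for building the read graph. A rough sketch of window minimizer selection follows. It is a sketch under stated assumptions only: plain lexicographic order stands in for the ranking (the abstract describes a hash derived from the k-mer distribution, which is not reproduced here), and the function name, the `rank` parameter, and the `k` and `w` values are hypothetical.

```python
def minimizers(seq, k, w, rank=None):
    """Return sorted (position, k-mer) pairs, one minimizer per window.

    A window is w consecutive k-mers; its minimizer is the k-mer with the
    smallest rank. `rank` defaults to the k-mer itself (lexicographic order),
    an assumption made here in place of a real hash function.
    """
    if rank is None:
        rank = lambda kmer: kmer
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    selected = set()
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        # Ties are broken by taking the leftmost position.
        best = min(window, key=lambda i: (rank(kmers[i]), i))
        selected.add(best)
    return [(i, kmers[i]) for i in sorted(selected)]


# Toy read; real reads are thousands of bases long.
result = minimizers("ACGTTGCA", k=3, w=3)
print(result)
```

Two reads that overlap tend to share minimizers in the overlapping region, which is what makes minimizers useful for finding candidate edges between reads without comparing every k-mer.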
9.
Methods Mol Biol ; 2590: 273-286, 2023.
Article in English | MEDLINE | ID: mdl-36335504

ABSTRACT

The ultimate goal of de novo assembly of reads sequenced from a diploid individual is the separate reconstruction of the sequences corresponding to the two copies of each chromosome. Unfortunately, the allele linkage information needed to perform phased genome assemblies has been difficult to generate. Hence, most current genome assemblies are a haploid mixture of the two underlying chromosome copies present in the sequenced individual. Sequencing technologies providing long (20 kb) and accurate reads are the basis to generate phased genome assemblies. This chapter provides a brief overview of the main milestones in traditional genome assembly, focusing on the bioinformatic techniques developed to generate haplotype information from different specialized protocols. Using these techniques as a knowledge background, the chapter reviews the current algorithms to generate phased assemblies from long reads with low error rates. Current techniques perform haplotype-aware error correction steps to increase the quality of the raw reads. In addition, variations on the traditional overlap-layout-consensus (OLC) graph have been developed in an effort to eliminate edges between reads sequenced from different chromosome copies. This allows for large presence-absence variants between the chromosome copies to be taken into account. The development of these algorithms, along with the improved sequencing technologies has been crucial to finish chromosome-level assemblies of complex genomes.


Subject(s)
Algorithms , Computational Biology , Sequence Analysis, DNA/methods , Haplotypes , Alleles , High-Throughput Nucleotide Sequencing/methods
10.
Mol Ecol Resour ; 23(3): 712-724, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36377253

ABSTRACT

Whole-genome alignment allows researchers to understand the genomic structure and variation among genomes. Approaches based on direct pairwise comparisons of DNA sequences require large computational capacities. As a consequence, pipelines combining tools for orthologous gene identification and synteny have been developed. In this manuscript, we present the latest functionalities implemented in NGSEP 4 to identify orthogroups and perform whole-genome alignments. NGSEP implements functionalities for identification of clusters of homologous genes, synteny analysis, and whole-genome alignment. Our results showed that the NGSEP algorithm for orthogroup identification has competitive accuracy and efficiency in comparison to commonly used tools. The implementation also includes a visualization of the whole-genome alignment based on synteny of the orthogroups that were identified, and a reconstruction of the pangenome based on frequencies of the orthogroups among the genomes. NGSEP 4 also includes a new graphical user interface based on the JavaFX technology. We expect that these new developments will be very useful for several studies in evolutionary biology and population genomics.


Subject(s)
Genome , Software , Genomics/methods , Algorithms , Metagenomics
11.
Food Chem ; 397: 133845, 2022 Dec 15.
Article in English | MEDLINE | ID: mdl-35940096

ABSTRACT

The impact of cocoa lipid content on chocolate quality has been extensively described. Nevertheless, few studies have elucidated the cocoa lipid composition and their bioactive properties, focusing only on specific lipids. In the present study the lipidome of fine-flavor cocoa fermentation was analyzed using LC-MS-QTOF and a Machine Learning model to assess potential bioactivity was developed. Our results revealed that the cocoa lipidome, comprised mainly of fatty acyls and glycerophospholipids, remains stable during fine-flavor cocoa fermentations. Also, several Machine Learning algorithms were trained to explore potential biological activity among the identified lipids. We found that K-Nearest Neighbors had the best performance. This model was used to classify the identified lipids as bioactive or non-bioactive, nominating 28 molecules as potential bioactive lipids. None of these compounds have been previously reported as bioactive. Our work is the first untargeted lipidomic study and systematic effort to investigate potential bioactivity in fine-flavor cocoa lipids.


Subject(s)
Cacao , Chocolate , Fermentation , Lipidomics , Lipids , Taste
12.
New Phytol ; 235(6): 2454-2465, 2022 09.
Article in English | MEDLINE | ID: mdl-35708662

ABSTRACT

Fruit development has been central in the evolution and domestication of flowering plants. In common bean (Phaseolus vulgaris), the principal global grain legume staple, two main production categories are distinguished by fibre deposition in pods: dry beans, with fibrous, stringy pods; and stringless snap/green beans, with reduced fibre deposition, which frequently revert to the ancestral stringy state. Here, we identify genetic and developmental patterns associated with pod fibre deposition. Transcriptional, anatomical, epigenetic and genetic regulation of pod strings were explored through RNA-seq, RT-qPCR, fluorescence microscopy, bisulfite sequencing and whole-genome sequencing. Overexpression of the INDEHISCENT ('PvIND') orthologue was observed in stringless types compared with isogenic stringy lines, associated with overspecification of weak dehiscence-zone cells throughout the pod vascular sheath. No differences in DNA methylation were correlated with this phenotype. Nonstringy varieties showed a tandemly direct duplicated PvIND and a Ty1-copia retrotransposon inserted between the two repeats. These sequence features are lost during pod reversion and are predictive of pod phenotype in diverse materials, supporting their role in PvIND overexpression and reversible string phenotype. Our results give insight into reversible gain-of-function mutations and possible genetic solutions to the reversion problem, of considerable economic value for green bean production.


Subject(s)
Phaseolus , Domestication , Gene Duplication , Phaseolus/genetics , Phenotype , Retroelements/genetics
13.
Neurocrit Care ; 37(2): 463-470, 2022 10.
Article in English | MEDLINE | ID: mdl-35523916

ABSTRACT

BACKGROUND: Quantitative analysis of ventricular cerebrospinal fluid (vCSF) proteins following acute brain injury (ABI) may help identify pathophysiological pathways and potential biomarkers that can predict unfavorable outcome. METHODS: In this prospective proteomic analysis study, consecutive patients with severe ABI expected to require intraventricular catheterization for intracranial pressure (ICP) monitoring for at least 5 days and patients without ABI admitted for elective clipping of an unruptured cerebral aneurysm were included. vCSF samples were collected within the first 24 h after ABI and ventriculostomy insertion and then every 24 h for 5 days. In patients without ABI, a single vCSF sample was collected at the time of elective clipping. Data-independent acquisition and sequential window acquisition of all theoretical spectra (SWATH) mass spectrometry were used to compare differences in protein expression in patients with ABI and patients without ABI and in patients with traumatic and nontraumatic ABI. Differences in protein expression according to different ICP values, intensive care unit outcome, subarachnoid hemorrhage (SAH) versus traumatic brain injury (TBI), and good versus poor 3-month functional status (assessed by using the Glasgow Outcome Scale) were also evaluated. vCSF proteins with significant differences between groups were compared by using linear models and selected for gene ontology analysis using R Language and the Panther database. RESULTS: We included 50 patients with ABI (SAH n = 23, TBI n = 15, intracranial hemorrhage n = 6, ischemic stroke n = 3, others n = 3) and 12 patients without ABI. There were significant differences in the expression of 255 proteins between patients with and without ABI (p < 0.01). There were intraday and interday differences in expression of seven proteins related to increased inflammation, apoptosis, oxidative stress, and cellular response to hypoxia and injury. 
Among these, glial fibrillary acidic protein expression was higher in patients with ABI with severe intracranial hypertension (ICH) (ICP ≥ 30 mm Hg) or death compared to those without (log2 fold change: +2.4; p < 0.001), suggesting extensive primary astroglial injury or death. There were differences in the expression of 96 proteins between patients with traumatic and nontraumatic ABI (p < 0.05); intraday and interday differences were observed for six proteins related to structural damage, complement activation, and cholesterol metabolism. Thirty-nine vCSF proteins were associated with an increased risk of severe ICH (ICP ≥ 30 mm Hg) in patients with traumatic compared with nontraumatic ABI (p < 0.05). No significant differences were found in protein expression between patients with SAH versus TBI or between those with good versus poor 3-month Glasgow Outcome Scale score. CONCLUSIONS: Dysregulated vCSF protein expression after ABI may be associated with an increased risk of severe ICH and death.


Subject(s)
Brain Injuries, Traumatic , Brain Injuries , Intracranial Hypertension , Subarachnoid Hemorrhage , Biomarkers , Cholesterol , Glial Fibrillary Acidic Protein , Humans , Intracranial Hypertension/etiology , Intracranial Pressure/physiology , Prospective Studies , Proteomics , Subarachnoid Hemorrhage/complications
15.
Hum Mutat ; 43(4): 449-460, 2022 04.
Article in English | MEDLINE | ID: mdl-35143088

ABSTRACT

The growing use of next-generation sequencing technologies in genetic diagnosis has produced an exponential increase in the number of variants of uncertain significance (VUS). In this manuscript, we compare three machine learning methods to classify VUS as pathogenic or non-pathogenic, implementing a Random Forest (RF), a Support Vector Machine (SVM), and a Multilayer Perceptron. To train the models, we extracted high-quality variants from ClinVar that were previously classified as VUS. For each variant, we retrieved nine conservation scores, the loss-of-function tool, and allele frequencies. For the RF and SVM models, hyperparameters were tuned using cross-validation with a grid search. The three models were tested on a nonoverlapping set of variants that had been classified as VUS over the last 3 years, but had been reclassified in August 2020. The three models yielded superior accuracy on this set compared to the benchmarked tools. The RF-based model yielded the best performance across different variant types and was used to create VusPrize, an open-source software tool for prioritization of VUS. We believe that our model can improve the process of genetic diagnosis in research and clinical settings.


Subject(s)
High-Throughput Nucleotide Sequencing , Machine Learning , High-Throughput Nucleotide Sequencing/methods , Humans , Neural Networks, Computer , Software , Support Vector Machine
16.
Mol Ecol Resour ; 22(1): 439-454, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34288487

ABSTRACT

Genotyping-by-sequencing (GBS) is a widely used and cost-effective technique for obtaining large numbers of genetic markers from populations by sequencing regions adjacent to restriction cut sites. Although a standard reference-based pipeline can be followed to analyse GBS reads, a reference genome is still not available for a large number of species. Hence, reference-free approaches are required to generate the genetic variability information that can be obtained from a GBS experiment. Unfortunately, available tools to perform de novo analysis of GBS reads face issues of usability, accuracy and performance. Furthermore, few available tools are suitable for analysing data sets from polyploid species. In this manuscript, we describe a novel algorithm to perform reference-free variant detection and genotyping from GBS reads. Nonexact searches on a dynamic hash table of consensus sequences allow for efficient read clustering and sorting. This algorithm was integrated in the Next Generation Sequencing Experience Platform (NGSEP) to integrate the state-of-the-art variant detector already implemented in this tool. We performed benchmark experiments with three different empirical data sets of plants and animals with different population structures and ploidies, and sequenced with different GBS protocols at different read depths. These experiments show that NGSEP has comparable and in some cases better accuracy and always better computational efficiency compared to existing solutions. We expect that this new development will be useful for many research groups conducting population genetic studies in a wide variety of species.


Subject(s)
Diploidy , Polyploidy , Genomics , Genotype , Humans , Software
17.
Front Plant Sci ; 12: 730251, 2021.
Article in English | MEDLINE | ID: mdl-34745164

ABSTRACT

Solanum betaceum is a tree from the Andean region bearing edible fruits, considered an exotic export. Although there has been renewed interest in its commercialization, sustainability and disease management have been limiting factors. Phytophthora betacei is a recently described species that causes late blight in S. betaceum. There is no general study of the response of S. betaceum to this pathogen, particularly of the changes in expression of pathogenesis-related genes. In this manuscript, we present a comprehensive RNA-seq time-series study of the plant response to infection by P. betacei. Following six time points of infection, the differentially expressed genes (DEGs) involved in the defense by the plant were contextualized in a sequential manner. We documented 5,628 DEGs across all time points. From 6 to 24 h post-inoculation, we highlighted DEGs involved in the recognition of the pathogen by the likely activation of pattern-triggered immunity (PTI) genes. We also describe the possible effect of the pathogen effectors in the host during the effector-triggered response. Finally, we reveal genes related to the susceptible outcome of the interaction caused by the onset of necrotrophy and the sharp transcriptional changes as a response to the pathogen. This is the first report of the transcriptome of the tree tomato in response to the newly described pathogen P. betacei.

18.
Front Plant Sci ; 12: 694859, 2021.
Article in English | MEDLINE | ID: mdl-34484261

ABSTRACT

Recent developments in High Throughput Sequencing (HTS) technologies and bioinformatics, including improved read lengths and genome assemblers, allow the reconstruction of complex genomes with unprecedented quality and contiguity. Sugarcane has one of the most complicated genomes among grasses, with a haploid length of 1 Gbp and a ploidy between 8 and 12. In this work, we present a genome assembly of the Colombian sugarcane hybrid CC 01-1940. Three types of sequencing technologies were combined for this assembly: PacBio long reads, Illumina paired short reads, and Hi-C reads. We achieved a median contig length of 34.94 Mbp and a total genome assembly of 903.2 Mbp. We annotated a total of 63,724 protein coding genes and performed a reconstruction and comparative analysis of the sucrose metabolism pathway. Nucleotide evolution measurements between orthologs with close species suggest that divergence between Saccharum officinarum and Saccharum spontaneum occurred <2 million years ago. Synteny analysis between CC 01-1940 and the S. spontaneum genome confirms the presence of translocation events between the species and a random contribution throughout the entire genome in current sugarcane hybrids. Analysis of RNA-Seq data from leaf and root tissue of contrasting sugarcane genotypes subjected to water stress treatments revealed 17,490 differentially expressed genes, of which 3,633 correspond to genes expressed exclusively in tolerant genotypes. We expect the resources presented here to serve as a source of information to improve the selection processes of new varieties in sugarcane breeding programs.

19.
Front Chem ; 9: 700802, 2021.
Article in English | MEDLINE | ID: mdl-34422762

ABSTRACT

Fragment-based drug design (FBDD) and pharmacophore modeling have proven to be efficient tools to discover novel drugs. However, these approaches may become limited if the collection of fragments is highly repetitive, poorly diverse, or excessively simple. In this article, we propose that combining pharmacophore modeling and a non-classical type of fragmentation (herein called non-extensive) to screen a natural product (NP) library may provide fragments predicted as potent, diverse, and developable. Initially, we applied retrosynthetic combinatorial analysis procedure (RECAP) rules in two versions, extensive and non-extensive, in order to deconstruct a virtual library of NPs formed by the databases Traditional Chinese Medicine (TCM), AfroDb (African Medicinal Plants database), NuBBE (Nuclei of Bioassays, Biosynthesis, and Ecophysiology of Natural Products), and UEFS (Universidade Estadual de Feira de Santana). We then developed a virtual screening (VS) using two groups of natural-product-derived fragments (extensive and non-extensive NPDFs) and two overlapping pharmacophore models for each of 20 different proteins of therapeutic interest. Molecular weight, lipophilicity, and molecular complexity were estimated and compared for both types of NPDFs (and their original NPs) before and after the VS procedure. As a result, we found that non-extensive NPDFs exhibited a much higher number of chemical entities compared to extensive NPDFs (45,355 vs. 11,525 compounds), accounting for the larger part of the hits recovered and being far less repetitive than extensive NPDFs. The structural diversity of both types of NPDFs and the NPs was shown to diminish slightly after VS procedures. Finally, and most interestingly, the pharmacophore fit score of the non-extensive NPDFs proved to be not only higher, on average, than that of extensive NPDFs (56% of cases) but also higher than that of their original NPs (69% of cases) when all of them were also recognized as hits after the VS.
The findings obtained in this study indicated that the proposed cascade approach was useful to enhance the probability of identifying innovative chemical scaffolds, which deserve further development to become drug-sized candidate compounds. We consider that the knowledge about the deconstruction degree required to produce NPDFs of interest represents a good starting point for eventual synthesis, characterization, and biological activity studies.

20.
mBio ; 12(2), 2021 03 23.
Article in English | MEDLINE | ID: mdl-33758086

ABSTRACT

tRNAs are encoded by a large gene family, usually with several isogenic tRNAs interacting with the same codon. Mutations in the anticodon region of other tRNAs can overcome specific tRNA deficiencies. Phylogenetic analysis suggests that such mutations have occurred in evolution, but the driving force is unclear. We show that in yeast suppressor mutations in other tRNAs are able to overcome deficiency of the essential TRT2-encoded tRNAThrCGU at high temperature (40°C). Surprisingly, these tRNA suppressor mutations were obtained after whole-genome transformation with DNA from thermotolerant Kluyveromyces marxianus or Ogataea polymorpha strains but from which the mutations did apparently not originate. We suggest that transient presence of donor DNA in the host facilitates proliferation at high temperature and thus increases the chances for occurrence of spontaneous mutations suppressing defective growth at high temperature. Whole-genome sequence analysis of three transformants revealed only four to five nonsynonymous mutations of which one causing TRT2 anticodon stem stabilization and two anticodon mutations in non-threonyl-tRNAs, tRNALysCUU and tRNAeMetCAU, were causative. Both anticodon mutations suppressed lethality of TRT2 deletion and apparently caused the respective tRNAs to become novel substrates for threonyl-tRNA synthetase. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) data could not detect any significant mistranslation, and reverse transcription-quantitative PCR results contradicted induction of the unfolded protein response. We suggest that stress conditions have been a driving force in evolution for the selection of anticodon-switching mutations in tRNAs as revealed by phylogenetic analysis. IMPORTANCE: In this work, we have identified for the first time the causative elements in a eukaryotic organism introduced by applying whole-genome transformation and responsible for the selectable trait of interest, i.e., high temperature tolerance.
Surprisingly, the whole-genome transformants contained just a few single nucleotide polymorphisms (SNPs), which were unrelated to the sequence of the donor DNA. In each of three independent transformants, we have identified a SNP in a tRNA, either stabilizing the essential tRNAThrCGU at high temperature or switching the anticodon of tRNALysCUU or tRNAeMetCAU into CGU, which is apparently enough for in vivo recognition by threonyl-tRNA synthetase. LC-MS/MS analysis indeed indicated absence of significant mistranslation. Phylogenetic analysis showed that similar mutations have occurred throughout evolution and we suggest that stress conditions may have been a driving force for their selection. The low number of SNPs introduced by whole-genome transformation may favor its application for improvement of industrial yeast strains.


Subject(s)
Anticodon/antagonists & inhibitors , Genome, Fungal , Kluyveromyces/genetics , Mutation , RNA, Transfer/genetics , Stress, Physiological/genetics , Suppression, Genetic , Anticodon/genetics , Chromatography, Liquid , Kluyveromyces/classification , Phylogeny , Polymorphism, Single Nucleotide , Tandem Mass Spectrometry , Whole Genome Sequencing