Results 1 - 20 of 73
1.
ISME J ; 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38913498

ABSTRACT

Nitrous oxide (N2O) is a potent greenhouse gas of primarily microbial origin. Oxic and anoxic emissions are commonly ascribed to autotrophic nitrification and heterotrophic denitrification, respectively. Beyond this established dichotomy, we quantitatively show that heterotrophic denitrification can significantly contribute to aerobic nitrogen turnover and N2O emissions in complex microbiomes exposed to frequent oxic/anoxic transitions. Two planktonic, nitrification-inhibited enrichment cultures were established under continuous organic carbon and nitrate feeding, and cyclic oxygen availability. Over a third of the influent organic substrate was respired with nitrate as electron acceptor at high oxygen concentrations (> 6.5 mg/L). N2O accounted for up to one quarter of the nitrate reduced under oxic conditions. The enriched microorganisms maintained a constitutive abundance of denitrifying enzymes due to the oxic/anoxic frequencies exceeding their protein turnover - a common scenario in natural and engineered ecosystems. The aerobic denitrification rates are ascribed primarily to the residual activity of anaerobically synthesized enzymes. From an ecological perspective, the selection of organisms capable of sustaining significant denitrifying activity during aeration shows their competitive advantage over other heterotrophs under varying oxygen availabilities. Ultimately, we propose that the contribution of heterotrophic denitrification to aerobic nitrogen turnover and N2O emissions is currently underestimated in dynamic environments.

2.
Bioinformatics ; 40(6), 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38775729

ABSTRACT

MOTIVATION: Today, we know the function of only a small fraction of the protein sequences predicted from genomic data. This problem is even more salient for bacteria, which represent some of the most phylogenetically and metabolically diverse taxa on Earth. This low rate of bacterial gene annotation is compounded by the fact that most function prediction algorithms have focused on eukaryotes, and conventional annotation approaches rely on the presence of similar sequences in existing databases. However, often there are no such sequences for novel bacterial proteins. Thus, we need improved gene function prediction methods tailored for bacteria. Recently, transformer-based language models, adopted from the natural language processing field, have been used to obtain new representations of proteins, to replace amino acid sequences. These representations, referred to as protein embeddings, have shown promise for improving annotation of eukaryotes, but there have been only limited applications on bacterial genomes. RESULTS: To predict gene functions in bacteria, we developed SAFPred, a novel synteny-aware gene function prediction tool based on protein embeddings from state-of-the-art protein language models. SAFPred also leverages the unique operon structure of bacteria through conserved synteny. SAFPred outperformed both conventional sequence-based annotation methods and state-of-the-art methods on multiple bacterial species, including for distant homolog detection, where the sequence similarity to the proteins in the training set was as low as 40%. Using SAFPred to identify gene functions across diverse enterococci, of which some species are major clinical threats, we identified 11 previously unrecognized putative novel toxins, with potential significance to human and animal health. AVAILABILITY AND IMPLEMENTATION: https://github.com/AbeelLab/safpred.


Subjects
Algorithms , Bacterial Proteins , Bacterial Genome , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Software , Bacteria/genetics , Synteny , Computational Biology/methods , Molecular Sequence Annotation/methods
3.
NPJ Syst Biol Appl ; 10(1): 63, 2024 May 31.
Article in English | MEDLINE | ID: mdl-38821949

ABSTRACT

Yeast metabolism can be engineered to produce xenobiotic compounds, such as cannabinoids, the principal isoprenoids of the plant Cannabis sativa, through heterologous metabolic pathways. However, yeast cell factories continue to have low cannabinoid production. This study employed an integrated omics approach to investigate the physiological effects of cannabidiol on S. cerevisiae CENPK2-1C yeast cultures. We treated the experimental group with 0.5 mM CBD and monitored CENPK2-1C cultures. We observed a latent-stationary phase post-diauxic shift in the experimental group and harvested samples in the inflection point of this growth phase for transcriptomic and metabolomic analysis. We compared the transcriptomes of the CBD-treated yeast and the positive control, identifying eight significantly overexpressed genes with a log fold change of at least 1.5 and a significant adjusted p-value. Three notable genes were PDR5 (an ABC-steroid and cation transporter), CIS1, and YGR035C. These genes are all regulated by pleiotropic drug resistance linked promoters. Knockout and rescue of PDR5 showed that it is a causal factor in the post-diauxic shift phenotype. Metabolomic analysis revealed 48 significant spectra associated with CBD-fed cell pellets, 20 of which were identifiable as non-CBD compounds, including fatty acids, glycerophospholipids, and phosphate-salvage indicators. Our results suggest that mitochondrial regulation and lipidomic remodeling play a role in yeast's response to CBD, which are employed in tandem with pleiotropic drug resistance (PDR). We conclude that bioengineers should account for off-target product C-flux, energy use from ABC-transport, and post-stationary phase cell growth when developing cannabinoid-biosynthetic yeast strains.


Subjects
Cannabidiol , Lipidomics , Saccharomyces cerevisiae Proteins , Saccharomyces cerevisiae , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/drug effects , Saccharomyces cerevisiae/metabolism , Cannabidiol/pharmacology , Lipidomics/methods , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , Metabolomics/methods , ATP-Binding Cassette Transporters/genetics , ATP-Binding Cassette Transporters/metabolism , Transcriptome/genetics , Transcriptome/drug effects , Fungal Gene Expression Regulation/drug effects , Fungal Drug Resistance/genetics , Gene Expression Profiling/methods
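The gene-selection step described in the abstract above (a log fold change of at least 1.5 together with a significant adjusted p-value) can be illustrated with a minimal sketch. It assumes a base-2 logarithm and a 0.05 significance cutoff, neither of which is stated in the abstract; the gene names and numbers are invented placeholders, not data from the study.

```python
import math

# Hypothetical thresholds: the abstract gives the fold-change cutoff (1.5)
# but not the log base or the p-value cutoff, so both are assumptions here.
MIN_LOG2_FOLD_CHANGE = 1.5
MAX_ADJUSTED_P = 0.05

def log2_fold_change(control, treated):
    """Return log2(treated / control) for two positive expression values."""
    return math.log2(treated / control)

def select_overexpressed(records):
    """Keep genes that pass both the fold-change and adjusted p-value filters."""
    selected = []
    for gene, control, treated, adjusted_p in records:
        lfc = log2_fold_change(control, treated)
        if lfc >= MIN_LOG2_FOLD_CHANGE and adjusted_p < MAX_ADJUSTED_P:
            selected.append(gene)
    return selected

# Invented placeholder values: (gene, control mean, treated mean, adjusted p)
records = [
    ("GENE_A", 10.0, 40.0, 0.001),  # log2FC = 2.0, significant: kept
    ("GENE_B", 10.0, 20.0, 0.001),  # log2FC = 1.0, below cutoff: dropped
    ("GENE_C", 5.0, 40.0, 0.2),     # log2FC = 3.0, not significant: dropped
]
selected = select_overexpressed(records)
```

Requiring both filters guards against genes with a large but statistically unreliable change, and against genes with a reliable but small change.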
4.
Proc Natl Acad Sci U S A ; 121(10): e2310852121, 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38416678

ABSTRACT

Enterococci are gut microbes of most land animals. Likely appearing first in the guts of arthropods as they moved onto land, they diversified over hundreds of millions of years adapting to evolving hosts and host diets. Over 60 enterococcal species are now known. Two species, Enterococcus faecalis and Enterococcus faecium, are common constituents of the human microbiome. They are also now leading causes of multidrug-resistant hospital-associated infection. The basis for host association of enterococcal species is unknown. To begin identifying traits that drive host association, we collected 886 enterococcal strains from widely diverse hosts, ecologies, and geographies. This identified 18 previously undescribed species expanding genus diversity by >25%. These species harbor diverse genes including toxins and systems for detoxification and resource acquisition. Enterococcus faecalis and E. faecium were isolated from diverse hosts highlighting their generalist properties. Most other species showed a more restricted distribution indicative of specialized host association. The expanded species diversity permitted the Enterococcus genus phylogeny to be viewed with unprecedented resolution, allowing features to be identified that distinguish its four deeply rooted clades, and the entry of genes associated with range expansion such as B-vitamin biosynthesis and flagellar motility to be mapped to the phylogeny. This work provides an unprecedentedly broad and deep view of the genus Enterococcus, including insights into its evolution, potential new threats to human health, and where substantial additional enterococcal diversity is likely to be found.


Subjects
Enterococcus faecium , Gram-Positive Bacterial Infections , Animals , Humans , Enterococcus/genetics , Anti-Bacterial Agents/pharmacology , Enterococcus faecium/genetics , Enterococcus faecalis/genetics , Phylogeny , Microbial Sensitivity Tests , Bacterial Drug Resistance
5.
Front Microbiol ; 14: 1308363, 2023.
Article in English | MEDLINE | ID: mdl-38143860

ABSTRACT

Background: Enteric methane from cow burps, which results from microbial fermentation of high-fiber feed in the rumen, is a significant contributor to greenhouse gas emissions. A promising strategy to address this problem is microbiome-based precision feed, which involves identifying key microorganisms for methane production. While machine learning algorithms have shown success in associating human gut microbiome with various human diseases, there have been limited efforts to employ these algorithms to establish microbial biomarkers for methane emissions in ruminants. Methods: In this study, we aim to identify potential methane biomarkers for methane emission from ruminants by employing regression algorithms commonly used in human microbiome studies, coupled with different feature selection methods. To achieve this, we analyzed the microbiome compositions and identified possible confounding metadata variables in two large public datasets of Holstein cows. Using both the microbiome features and identified metadata variables, we trained different regressors to predict methane emission. With the optimized models, permutation tests were used to determine feature importance to find informative microbial features. Results: Among the regression algorithms tested, random forest regression outperformed others and allowed the identification of several crucial microbial taxa for methane emission as members of the native rumen microbiome, including the genera Piromyces, Succinivibrionaceae UCG-002, and Acetobacter. Additionally, our results revealed that certain herd locations and feed composition markers, such as the lipid intake and neutral-detergent fiber intake, are also predictive features for methane emissions. Conclusion: We demonstrated that machine learning, particularly regression algorithms, can effectively predict cow methane emissions and identify relevant rumen microorganisms. 
Our findings offer valuable insights for the development of microbiome-based precision feed strategies aiming at reducing methane emissions.
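The abstract above reports using permutation tests on optimized regressors to find informative features. A minimal sketch of one common variant, permutation feature importance, follows. It is not the authors' pipeline: the fixed toy predictor stands in for a trained regressor, and the data, function names, and parameters are all hypothetical.

```python
import random

def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            error = mean_squared_error(y, [predict(row) for row in X_perm])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

def toy_model(row):
    """Stand-in for a trained regressor: uses feature 0 and ignores feature 1."""
    return 2.0 * row[0]

# Invented toy data: the target depends only on feature 0.
X = [[float(i), float((7 * i) % 50)] for i in range(50)]
y = [2.0 * row[0] for row in X]
importances = permutation_importance(toy_model, X, y)
```

Shuffling the feature the model relies on raises the error, while shuffling the ignored feature leaves it unchanged, which is what makes the score usable for ranking microbial taxa or metadata variables.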

6.
BMC Bioinformatics ; 24(1): 400, 2023 Oct 26.
Article in English | MEDLINE | ID: mdl-37884897

ABSTRACT

BACKGROUND: Pan-genome graphs are gaining importance in the field of bioinformatics as data structures to represent and jointly analyze multiple genomes. Compacted de Bruijn graphs are inherently suited for this purpose, as their graph topology naturally reveals similarity and divergence within the pan-genome. Most state-of-the-art pan-genome graphs are represented explicitly in terms of nodes and edges. Recently, an alternative, implicit graph representation was proposed that builds directly upon the unidirectional FM-index. As such, a memory-efficient graph data structure is obtained that inherits the FM-index' backward search functionality. However, this representation suffers from a number of shortcomings in terms of functionality and algorithmic performance. RESULTS: We present a data structure for a pan-genome, compacted de Bruijn graph that aims to address these shortcomings. It is built on the bidirectional FM-index, extending the ability of its unidirectional counterpart to navigate and search the graph in both directions. All basic graph navigation steps can be performed in constant time. Based on these features, we implement subgraph visualization as well as lossless approximate pattern matching to the graph using search schemes. We demonstrate that we can retrieve all occurrences corresponding to a read within a certain edit distance in a very efficient manner. Through a case study, we show the potential of exploiting the information embedded in the graph's topology through visualization and sequence alignment. CONCLUSIONS: We propose a memory-efficient representation of the pan-genome graph that supports subgraph visualization and lossless approximate pattern matching of reads against the graph using search schemes. The C++ source code of our software, called Nexus, is available at https://github.com/biointec/nexus under AGPL-3.0 license.


Subjects
Algorithms , Genome , DNA Sequence Analysis , Software , Computational Biology
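The abstract above builds on the backward search functionality of the FM-index. As a rough illustration of that basic operation only, the following sketch implements exact-match backward search on a plain unidirectional FM-index with a naively built suffix array. It is not the Nexus data structure (which is bidirectional and graph-aware); all function names are hypothetical.

```python
def build_fm_index(text):
    """Build a toy FM-index: suffix array, C table, and occurrence table."""
    text += "$"  # sentinel, lexicographically smaller than A, C, G, T
    sa = sorted(range(len(text)), key=lambda i: text[i:])  # naive construction
    bwt = "".join(text[i - 1] for i in sa)  # i == 0 wraps around to the sentinel
    alphabet = sorted(set(text))
    # C[c] = number of characters in the text strictly smaller than c
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return sa, C, occ

def backward_search(pattern, sa, C, occ):
    """Return sorted start positions of exact matches of pattern."""
    lo, hi = 0, len(sa)  # half-open suffix array interval [lo, hi)
    for ch in reversed(pattern):  # extend the match one character to the left
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])

sa, C, occ = build_fm_index("ACGTACGA")
hits = backward_search("ACG", sa, C, occ)
```

Each pattern character costs a constant number of table lookups, so the search time depends on the pattern length rather than on the text length. A bidirectional index additionally allows extending a match to the right.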
7.
Bioinformatics ; 39(10), 2023 Oct 03.
Article in English | MEDLINE | ID: mdl-37796811

ABSTRACT

MOTIVATION: Plasmids are carriers for antimicrobial resistance (AMR) genes and can exchange genetic material with other structures, contributing to the spread of AMR. There is no reliable approach to identify the transfer of AMR genes across plasmids. This is mainly due to the absence of a method to assess the phylogenetic distance of plasmids, as they show large DNA sequence variability. Identifying and quantifying such transfer can provide novel insight into the role of small mobile elements and resistant plasmid regions in the spread of AMR. RESULTS: We developed SHIP, a novel method to quantify plasmid similarity based on the dynamics of plasmid evolution. This allowed us to find conserved fragments containing AMR genes in structurally different and phylogenetically distant plasmids, which is evidence for lateral transfer. Our results show that regions carrying AMR genes are highly mobilizable between plasmids through transposons, integrons, and recombination events, and contribute to the spread of AMR. Identified transferred fragments include a multi-resistant complex class 1 integron in Escherichia coli and Klebsiella pneumoniae, and a region encoding tetracycline resistance transferred through recombination in Enterococcus faecalis. AVAILABILITY AND IMPLEMENTATION: The code developed in this work is available at https://github.com/AbeelLab/plasmidHGT.


Subjects
Anti-Bacterial Agents , Bacterial Drug Resistance , Anti-Bacterial Agents/pharmacology , Phylogeny , Bacterial Drug Resistance/genetics , Plasmids/genetics , Escherichia coli/genetics , Integrons/genetics , Horizontal Gene Transfer
8.
ACS Synth Biol ; 12(9): 2588-2599, 2023 Sep 15.
Article in English | MEDLINE | ID: mdl-37616156

ABSTRACT

Combinatorial pathway optimization is an important tool in metabolic flux optimization. Simultaneous optimization of a large number of pathway genes often leads to combinatorial explosions. Strain optimization is therefore often performed using iterative design-build-test-learn (DBTL) cycles. The aim of these cycles is to develop a product strain iteratively, every time incorporating learning from the previous cycle. Machine learning methods provide a potentially powerful tool to learn from data and propose new designs for the next DBTL cycle. However, due to the lack of a framework for consistently testing the performance of machine learning methods over multiple DBTL cycles, evaluating the effectiveness of these methods remains a challenge. In this work, we propose a mechanistic kinetic model-based framework to test and optimize machine learning for iterative combinatorial pathway optimization. Using this framework, we show that gradient boosting and random forest models outperform the other tested methods in the low-data regime. We demonstrate that these methods are robust for training set biases and experimental noise. Finally, we introduce an algorithm for recommending new designs using machine learning model predictions. We show that when the number of strains to be built is limited, starting with a large initial DBTL cycle is favorable over building the same number of strains for every cycle.


Subjects
Algorithms , Metabolic Engineering , Kinetics , Machine Learning , Random Forest
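The combinatorial explosion mentioned in the abstract above can be made concrete with a small sketch. It assumes, purely for illustration, that each pathway gene can be set to one of a fixed number of discrete expression levels; the gene counts and level counts below are hypothetical and not taken from the paper.

```python
from itertools import product

def design_space_size(levels_per_gene):
    """Number of distinct strain designs: the product of levels over all genes."""
    size = 1
    for n_levels in levels_per_gene:
        size *= n_levels
    return size

def enumerate_designs(levels_per_gene):
    """List every design as a tuple of level indices (feasible for small spaces only)."""
    return list(product(*(range(n) for n in levels_per_gene)))

# Hypothetical example: 3 genes with 5 expression levels each.
small_space = enumerate_designs([5, 5, 5])

# Hypothetical example: 10 genes with 5 levels each is too large to build exhaustively.
large_space_size = design_space_size([5] * 10)
```

With three genes the 125 designs could all be built, but ten genes already give close to ten million combinations, which is why only a subset of strains is built and tested in each cycle.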
9.
bioRxiv ; 2023 Jul 12.
Article in English | MEDLINE | ID: mdl-37293047

ABSTRACT

Enterococci are commensal gut microbes of most land animals. They diversified over hundreds of millions of years adapting to evolving hosts and host diets. Of over 60 known enterococcal species, Enterococcus faecalis and E. faecium uniquely emerged in the antibiotic era among leading causes of multidrug resistant hospital-associated infection. The basis for the association of particular enterococcal species with a host is largely unknown. To begin deciphering enterococcal species traits that drive host association, and to assess the pool of Enterococcus-adapted genes from which known facile gene exchangers such as E. faecalis and E. faecium may draw, we collected 886 enterococcal strains from nearly 1,000 specimens representing widely diverse hosts, ecologies and geographies. This provided data on the global occurrence and host associations of known species, identifying 18 new species in the process expanding genus diversity by >25%. The novel species harbor diverse genes associated with toxins, detoxification, and resource acquisition. E. faecalis and E. faecium were isolated from a wide diversity of hosts highlighting their generalist properties, whereas most other species exhibited more restricted distributions indicative of specialized host associations. The expanded species diversity permitted the Enterococcus genus phylogeny to be viewed with unprecedented resolution, allowing features to be identified that distinguish its four deeply rooted clades as well as genes associated with range expansion, such as B-vitamin biosynthesis and flagellar motility. Collectively, this work provides an unprecedentedly broad and deep view of the genus Enterococcus, potential threats to human health, and new insights into its evolution.

10.
bioRxiv ; 2023 Nov 21.
Article in English | MEDLINE | ID: mdl-37205418

ABSTRACT

Motivation: Today, we know the function of only a small fraction of the protein sequences predicted from genomic data. This problem is even more salient for bacteria, which represent some of the most phylogenetically and metabolically diverse taxa on Earth. This low rate of bacterial gene annotation is compounded by the fact that most function prediction algorithms have focused on eukaryotes, and conventional annotation approaches rely on the presence of similar sequences in existing databases. However, often there are no such sequences for novel bacterial proteins. Thus, we need improved gene function prediction methods tailored for prokaryotes. Recently, transformer-based language models - adopted from the natural language processing field - have been used to obtain new representations of proteins, to replace amino acid sequences. These representations, referred to as protein embeddings, have shown promise for improving annotation of eukaryotes, but there have been only limited applications on bacterial genomes. Results: To predict gene functions in bacteria, we developed SAP, a novel synteny-aware gene function prediction tool based on protein embeddings from state-of-the-art protein language models. SAP also leverages the unique operon structure of bacteria through conserved synteny. SAP outperformed both conventional sequence-based annotation methods and state-of-the-art methods on multiple bacterial species, including for distant homolog detection, where the sequence similarity to the proteins in the training set was as low as 40%. Using SAP to identify gene functions across diverse enterococci, of which some species are major clinical threats, we identified 11 previously unrecognized putative novel toxins, with potential significance to human and animal health.

11.
Antonie Van Leeuwenhoek ; 116(7): 667-685, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37156983

ABSTRACT

The transformation of environmental microorganisms by extracellular DNA is an overlooked mechanism of horizontal gene transfer and evolution. It initiates the acquisition of exogenous genes and propagates antimicrobial resistance alongside vertical and conjugative transfers. We combined mixed-culture biotechnology and Hi-C sequencing to elucidate the transformation of wastewater microorganisms with a synthetic plasmid encoding GFP and kanamycin resistance genes, in the mixed culture of chemostats exposed to kanamycin at concentrations representing wastewater, gut and polluted environments (0.01-2.5-50-100 mg L-1). We found that the phylogenetically distant Gram-negative Runella (102 Hi-C links), Bosea (35), Gemmobacter (33) and Zoogloea (24) spp., and Gram-positive Microbacterium sp. (90) were transformed by the foreign plasmid, under high antibiotic exposure (50 mg L-1). In addition, the antibiotic pressure shifted the origin of aminoglycoside resistance genes from genomic DNA to mobile genetic elements on plasmids accumulating in microorganisms. These results reveal the power of Hi-C sequencing to catch and surveil the transfer of xenogenetic elements inside microbiomes.


Subjects
Microbiota , Wastewater , Anti-Bacterial Agents/therapeutic use , Plasmids/genetics , DNA , Horizontal Gene Transfer , Genetic Conjugation
12.
Front Plant Sci ; 14: 1160645, 2023.
Article in English | MEDLINE | ID: mdl-37035076

ABSTRACT

Global soft fruit supply chains rely on trustworthy descriptions of product quality. However, crucial criteria such as sweetness and firmness cannot be accurately established without destroying the fruit. Since traditional alternatives are subjective assessments by human experts, it is desirable to obtain quality estimations in a consistent and non-destructive manner. The majority of research on fruit quality measurements analyzed fruits in the lab with uniform data collection. However, this approach is laborious and expensive to scale up to the level of the whole yield. The "harvest-first, analysis-second" method also comes too late to inform adjustments to harvesting schedules. In this research, we validated the hypothesis that in-field data acquirable via commodity hardware can yield acceptable accuracies. The primary case studied is the sweetness of strawberries, described by the juice's total soluble solids (TSS) content (unit: °Brix or Brix). We benchmarked the accuracy of strawberry Brix prediction using convolutional neural networks (CNN), variational autoencoders (VAE), principal component analysis (PCA), kernelized ridge regression (KRR), support vector regression (SVR), and multilayer perceptron (MLP), based on fusions of image data, environmental records, plant load information, and other inputs. Our results suggest that: (i) models trained on environment and plant load data can reliably predict aggregated Brix values, with the lowest RMSE at 0.59; (ii) using image data can further improve the Brix predictions of individual fruits from (i), lowering the RMSE from 1.27 to as low as 1.10, although image data by themselves are not sufficiently reliable.
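The abstract above reports prediction accuracy as root-mean-square error (RMSE) in Brix units. A minimal sketch of that metric follows; the measured and predicted values are invented for illustration and are not data from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Invented example values in °Brix: every prediction is off by exactly 0.5.
measured_brix = [8.0, 9.5, 7.0]
predicted_brix = [8.5, 9.0, 7.5]
error = rmse(measured_brix, predicted_brix)
```

Because the errors are squared before averaging, RMSE is expressed in the same unit as the target and penalizes large individual misses more heavily than a mean absolute error would.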

13.
BMC Genomics ; 24(1): 143, 2023 Mar 23.
Article in English | MEDLINE | ID: mdl-36959546

ABSTRACT

Genomes of four Streptomyces isolates, two putative new species (Streptomyces sp. JH14 and Streptomyces sp. JH34) and two non thaxtomin-producing pathogens (Streptomyces sp. JH002 and Streptomyces sp. JH010) isolated from potato fields in Colombia were selected to investigate their taxonomic classification, their pathogenicity, and the production of unique secondary metabolites of Streptomycetes inhabiting potato crops in this region. The average nucleotide identity (ANI) value calculated between Streptomyces sp. JH34 and its closest relatives (92.23%) classified this isolate as a new species. However, Streptomyces sp. JH14 could not be classified as a new species due to the lack of genomic data of closely related strains. Phylogenetic analysis based on 231 single-copy core genes, confirmed that the two pathogenic isolates (Streptomyces sp. JH010 and JH002) belong to Streptomyces pratensis and Streptomyces xiamenensis, respectively, are distant from the most well-known pathogenic species, and belong to two different lineages. We did not find orthogroups of protein-coding genes characteristic of scab-causing Streptomycetes shared by all known pathogenic species. Most genes involved in biosynthesis of known virulence factors are not present in the scab-causing isolates (Streptomyces sp. JH002 and Streptomyces sp. JH010). However, Tat-system substrates likely involved in pathogenicity in Streptomyces sp. JH002 and Streptomyces sp. JH010 were identified. Lastly, the presence of a putative mono-ADP-ribosyl transferase, homologous to the virulence factor scabin, was confirmed in Streptomyces sp. JH002. The described pathogenic isolates likely produce virulence factors uncommon in Streptomyces species, including a histidine phosphatase and a metalloprotease potentially produced by Streptomyces sp. JH002, and a pectinesterase, potentially produced by Streptomyces sp. JH010. 
Biosynthetic gene clusters (BGCs) showed the presence of clusters associated with the synthesis of medicinal compounds and BGCs potentially linked to pathogenicity in Streptomyces sp. JH010 and JH002. Interestingly, BGCs that have not been previously reported were also found. Our findings suggest that the four isolates produce novel secondary metabolites and metabolites with medicinal properties.


Subjects
Solanum tuberosum , Streptomyces , Virulence/genetics , Phylogeny , Virulence Factors/genetics , Virulence Factors/metabolism , Streptomyces/genetics , Streptomyces/metabolism , Genomics , Plant Diseases
14.
Front Microbiol ; 13: 1066995, 2022.
Article in English | MEDLINE | ID: mdl-36532424

ABSTRACT

The success of antibiotics as therapeutic agents has led to their ineffectiveness. Their continuous use and misuse in clinical and non-clinical areas have led to the emergence and spread of antibiotic-resistant bacteria and their genetic determinants. This is a multi-dimensional problem that has now become a global health crisis. Antibiotic resistance research has primarily focused on the clinical healthcare sectors while overlooking the non-clinical sectors. Increasing antibiotic usage in the environment (including animals, plants, soil, and water) drives antibiotic resistance, and these compartments function as transmission routes for antibiotic-resistant pathogens and as sources of resistance genes. These natural compartments are interconnected with each other and with humans, allowing the spread of antibiotic resistance via horizontal gene transfer between commensal and pathogenic bacteria. Identifying and understanding genetic exchange within and between natural compartments can provide insight into the transmission, dissemination, and emergence mechanisms. The development of high-throughput DNA sequencing technologies has made antibiotic resistance research more accessible and feasible. In particular, the combination of metagenomics and powerful bioinformatic tools and platforms has facilitated the identification of microbial communities and has allowed access to genomic data by bypassing the need for isolating and culturing microorganisms. This review aimed to reflect on the different sequencing techniques, metagenomic approaches, and bioinformatics tools and pipelines with their respective advantages and limitations for antibiotic resistance research. These approaches can provide insight into resistance mechanisms, the microbial population, emerging pathogens, resistance genes, and their dissemination. This information can influence policies, help develop preventative measures, and alleviate the burden caused by antibiotic resistance.

15.
Bioinformatics ; 38(24): 5352-5359, 2022 Dec 13.
Article in English | MEDLINE | ID: mdl-36308461

ABSTRACT

MOTIVATION: Haplotypes are the set of alleles co-occurring on a single chromosome and inherited together to the next generation. Because a monoploid reference genome loses this co-occurrence information, it has limited use in associating phenotypes with allelic combinations of genotypes. Therefore, methods to reconstruct the complete haplotypes from DNA sequencing data are crucial. Recently, several attempts have been made at haplotype reconstructions, but significant limitations remain. High-quality continuous haplotypes cannot be created reliably, particularly when there are few differences between the homologous chromosomes. RESULTS: Here, we introduce HAT, a haplotype assembly tool that exploits short and long reads along with a reference genome to reconstruct haplotypes. HAT tries to take advantage of the accuracy of short reads and the length of the long reads to reconstruct haplotypes. We tested HAT on the aneuploid yeast strain Saccharomyces pastorianus CBS1483 and multiple simulated polyploid datasets of the same strain, showing that it outperforms existing tools. AVAILABILITY AND IMPLEMENTATION: https://github.com/AbeelLab/hat/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
High-Throughput Nucleotide Sequencing , Tool Use Behavior , Haplotypes , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods , Alleles , Algorithms
16.
J Immunol ; 209(8): 1555-1565, 2022 Oct 15.
Article in English | MEDLINE | ID: mdl-36096642

ABSTRACT

Tuberculosis (TB) remains one of the deadliest infectious diseases worldwide, posing great social and economic burden to affected countries. Novel vaccine approaches are needed to increase protective immunity against the causative agent Mycobacterium tuberculosis (Mtb) and to reduce the development of active TB disease in latently infected individuals. Donor-unrestricted T cell responses represent such novel potential vaccine targets. HLA-E-restricted T cell responses have been shown to play an important role in protection against TB and other infections, and recent studies have demonstrated that these cells can be primed in vitro. However, the identification of novel pathogen-derived HLA-E binding peptides presented by infected target cells has been limited by the lack of accurate prediction algorithms for HLA-E binding. In this study, we developed an improved HLA-E binding peptide prediction algorithm and implemented it to identify (to our knowledge) novel Mtb-derived peptides with capacity to induce CD8+ T cell activation and that were recognized by specific HLA-E-restricted T cells in Mycobacterium-exposed humans. Altogether, we present a novel algorithm for the identification of pathogen- or self-derived HLA-E-presented peptides.


Subjects
Mycobacterium tuberculosis , Tuberculosis , Bacterial Antigens , CD8-Positive T-Lymphocytes , T-Lymphocyte Epitopes , Histocompatibility Antigens Class I , Humans , Peptides , HLA-E Antigens
17.
Water Res ; 219: 118571, 2022 Jul 01.
Article in English | MEDLINE | ID: mdl-35576763

ABSTRACT

In the One Health context, wastewater treatment plants (WWTPs) are central to safeguarding water resources. Nonetheless, many questions remain about their effectiveness in preventing antimicrobial resistance (AMR) dissemination. Most surveillance studies monitor the levels and removal of selected antibiotic resistance genes (ARGs) and mobile genetic elements (MGEs) in intracellular DNA (iDNA) extracted from WWTP influents and effluents. The role of extracellular free DNA (exDNA) in wastewater is mostly overlooked. This study analyzed the transfer of ARGs and MGEs in a full-scale Nereda® reactor removing nutrients with aerobic granular sludge. We tracked the composition and fate of the iDNA and exDNA pools of influent, sludge, and effluent samples. Metagenomics was used to profile the microbiome, resistome, and mobilome signatures of iDNA and exDNA extracts. Selected ARGs and MGEs were analyzed by qPCR. From 2,840 ARGs identified, the genes arr-3 (2%), tetC (1.6%), sul1 (1.5%), oqxB (1.2%), and aph(3")-Ib (1.2%) were the most abundant among all sampling points and bioaggregates. Pseudomonas, Acinetobacter, Aeromonas, Acidovorax, Rhodoferax, and Streptomyces populations were the main potential hosts of ARGs in the sludge. In the effluent, 478 resistance determinants were detected, of which 89% were from exDNA potentially released by cell lysis during aeration in the reactor. MGEs and multiple ARGs were co-localized on the same extracellular genetic contigs. Total intracellular ARGs decreased 3-42% due to wastewater treatment. However, the ermB and sul1 genes increased by 2 and 1 log gene copies mL-1, respectively, in exDNA from influent to effluent. The exDNA fractions need to be considered in AMR surveillance, risk assessment, and mitigation strategies.


Subjects
Sewage , Water Purification , Anti-Bacterial Agents/pharmacology , DNA , Microbial Drug Resistance/genetics , Bacterial Genes , Metagenomics , Wastewater
18.
Genome Biol ; 23(1): 74, 2022 03 07.
Article in English | MEDLINE | ID: mdl-35255937

ABSTRACT

Human-associated microbial communities comprise not only complex mixtures of bacterial species, but also mixtures of conspecific strains, the implications of which are mostly unknown since strain level dynamics are underexplored due to the difficulties of studying them. We introduce the Strain Genome Explorer (StrainGE) toolkit, which deconvolves strain mixtures and characterizes component strains at the nucleotide level from short-read metagenomic sequencing with higher sensitivity and resolution than other tools. StrainGE is able to identify strains at 0.1x coverage and detect variants for multiple conspecific strains within a sample from coverages as low as 0.5x.
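For a sense of scale, the coverage figures quoted above can be related to read counts with the usual approximation: mean coverage equals total sequenced bases divided by genome length. The sketch below is an illustration only; it is not part of StrainGE, and the genome size and read length used are hypothetical.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_length: int) -> float:
    """Approximate mean coverage: total sequenced bases / genome length."""
    return num_reads * read_length / genome_length


def reads_needed(target_coverage: float, read_length: int, genome_length: int) -> int:
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_length / read_length)


# Hypothetical example: a 5 Mb genome sequenced with 150 bp reads.
genome_length = 5_000_000
read_length = 150
print(reads_needed(0.1, read_length, genome_length))  # reads for 0.1x coverage
print(reads_needed(0.5, read_length, genome_length))  # reads for 0.5x coverage
```

With these assumed values, 0.1x coverage corresponds to only a few thousand reads from the strain in question.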


Subjects
Microbiota , Bacteria/genetics , Humans , Metagenome , Metagenomics , Microbiota/genetics
19.
Gigascience ; 12, 2022 Dec 28.
Article in English | MEDLINE | ID: mdl-38000912

ABSTRACT

BACKGROUND: Assembly algorithm choice should be a deliberate, well-justified decision when researchers create genome assemblies for eukaryotic organisms from third-generation sequencing technologies. While third-generation sequencing by Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) has overcome the disadvantages of short read lengths specific to next-generation sequencing (NGS), third-generation sequencers are known to produce more error-prone reads, thereby generating a new set of challenges for assembly algorithms and pipelines. However, the introduction of HiFi reads, which offer substantially reduced error rates, has provided a promising solution for more accurate assembly outcomes. Since the introduction of third-generation sequencing technologies, many tools have been developed that aim to take advantage of the longer reads, and researchers need to choose the correct assembler for their projects. RESULTS: We benchmarked state-of-the-art long-read de novo assemblers to help readers make a balanced choice for the assembly of eukaryotes. To this end, we used 12 real and 64 simulated datasets from different eukaryotic genomes, with different read length distributions, imitating PacBio continuous long-read (CLR), PacBio high-fidelity (HiFi), and ONT sequencing to evaluate the assemblers. We include 5 commonly used long-read assemblers in our benchmark: Canu, Flye, Miniasm, Raven, and wtdbg2 for ONT and PacBio CLR reads. For PacBio HiFi reads, we include 5 state-of-the-art HiFi assemblers: HiCanu, Flye, Hifiasm, LJA, and MBG. Evaluation categories address the following metrics: reference-based metrics, assembly statistics, misassembly count, BUSCO completeness, runtime, and RAM usage. Additionally, we investigated the effect of increased read length on the quality of the assemblies and report that read length can, but does not always, positively impact assembly quality. CONCLUSIONS: Our benchmark concludes that there is no assembler that performs the best in all the evaluation categories. However, our results show that overall Flye is the best-performing assembler for PacBio CLR and ONT reads, both on real and simulated data. Meanwhile, the best-performing PacBio HiFi assemblers are Hifiasm and LJA. Finally, benchmarking with longer reads shows that increased read length improves assembly quality, but the extent to which that can be achieved depends on the size and complexity of the reference genome.
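The abstract lists "assembly statistics" among its evaluation categories without naming them. As an illustration, N50 is one widely reported statistic of this kind; whether it was used in this benchmark is an assumption. A minimal sketch of its computation from a list of contig lengths:

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return N50: the largest length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: the running sum always reaches half the total")


# Hypothetical contig lengths (in bp), used only to show the calculation.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

A higher N50 indicates a more contiguous assembly, but it says nothing about correctness, which is why benchmarks such as this one pair it with misassembly counts and completeness measures.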


Subjects
Genome , Nanopores , DNA Sequence Analysis/methods , Algorithms , High-Throughput Nucleotide Sequencing/methods
20.
Forensic Sci Int Genet ; 56: 102632, 2022 01.
Article in English | MEDLINE | ID: mdl-34839075

ABSTRACT

Machine learning obtains good accuracy in determining the number of contributors (NOC) in short tandem repeat (STR) mixture DNA profiles. However, the models used so far are not understandable to users as they only output a prediction without any reasoning for that conclusion. Therefore, we leverage techniques from the field of explainable artificial intelligence (XAI) to help users understand why specific predictions are made. Where previous attempts at explainability for NOC estimation have relied upon using simpler, more understandable models that achieve lower accuracy, we use techniques that can be applied to any machine learning model. Our explanations incorporate SHAP values and counterfactual examples for each prediction into a single visualization. Existing methods for generating counterfactuals focus on uncorrelated features. This makes them inappropriate for the highly correlated features derived from STR data for NOC estimation, as these techniques simulate combinations of features that could not have resulted from an STR profile. For this reason, we have constructed a new counterfactual method, Realistic Counterfactuals (ReCo), which generates realistic counterfactual explanations for correlated data. We show that ReCo outperforms state-of-the-art methods on traditional metrics, as well as on a novel realism score. A user evaluation of the visualization shows positive opinions of end-users, which is ultimately the most appropriate metric in assessing explanations for real-world settings.


Subjects
Artificial Intelligence , Machine Learning , DNA/genetics , Forensic Medicine , Humans