1.
Front Microbiol ; 15: 1330814, 2024.
Article in English | MEDLINE | ID: mdl-38495515

ABSTRACT

Introduction: Shotgun metagenomics has previously proven effective in the investigation of foodborne outbreaks by providing rapid and comprehensive insights into the microbial contaminant. However, culture enrichment of the sample has remained a prerequisite, despite the potential impact on pathogen detection resulting from growth competition. To circumvent the need for culture enrichment, we explored the use of adaptive sampling using various databases for targeted nanopore sequencing, compared to shotgun metagenomics alone. Methods: The adaptive sampling method was first tested on DNA of mashed potatoes mixed with DNA of a Staphylococcus aureus strain previously associated with a foodborne outbreak. The selective sequencing was used to either deplete the potato sequencing reads or enrich for the pathogen sequencing reads, and compared to shotgun sequencing. Then, living S. aureus were spiked at 10^5 CFU into 25 g of mashed potatoes. Three DNA extraction kits were tested, in combination with enrichment using adaptive sampling, following whole genome amplification. After data analysis, the possibility to characterize the contaminant with the different sequencing and extraction methods, without culture enrichment, was assessed. Results: Overall, the adaptive sampling outperformed the shotgun sequencing. While the use of a host removal DNA extraction kit and targeted sequencing using a database of foodborne pathogens allowed rapid detection of the pathogen, the most complete characterization was achieved when using solely a database of S. aureus combined with a conventional DNA extraction kit, enabling accurate placement of the strain on a phylogenetic tree alongside outbreak cases. Discussion: This method shows great potential for strain-level analysis of foodborne outbreaks without the need for culture enrichment, thereby enabling faster investigations and facilitating precise pathogen characterization.
The integration of adaptive sampling with metagenomics presents a valuable strategy for more efficient and targeted analysis of microbial communities in foodborne outbreaks, contributing to improved food safety and public health.

2.
bioRxiv ; 2024 Jan 19.
Article in English | MEDLINE | ID: mdl-37808782

ABSTRACT

Cancer is a heterogeneous disease that demands precise molecular profiling for better understanding and management. Recently, deep learning has shown potential for cost-efficient prediction of molecular alterations from histology images. While transformer-based deep learning architectures have enabled significant progress in non-medical domains, their application to histology images remains limited due to small dataset sizes coupled with the explosion of trainable parameters. Here, we develop SEQUOIA, a transformer model to predict cancer transcriptomes from whole-slide histology images. To enable the full potential of transformers, we first pre-train the model using data from 1,802 normal tissues. Then, we fine-tune and evaluate the model in 4,331 tumor samples across nine cancer types. The prediction performance is assessed at individual gene levels and pathway levels through Pearson correlation analysis and root mean square error. The generalization capacity is validated across two independent cohorts comprising 1,305 tumors. In predicting the expression levels of 25,749 genes, the highest performance is observed in cancers from breast, kidney and lung, where SEQUOIA accurately predicts the expression of 11,069, 10,086 and 8,759 genes, respectively. The accurately predicted genes are associated with the regulation of inflammatory response, the cell cycle and metabolism. While the model is trained at the tissue level, we showcase its potential in predicting spatial gene expression patterns using spatial transcriptomics datasets. Leveraging the prediction performance, we develop a digital gene expression signature that predicts the risk of recurrence in breast cancer. SEQUOIA deciphers clinically relevant gene expression patterns from histology images, opening avenues for improved cancer management and personalized therapies.
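
The gene-level evaluation described in this abstract (Pearson correlation and root mean square error between predicted and measured expression) can be sketched as follows. This is a minimal illustration only: the function names, the dictionary layout, and the toy values are assumptions, and the abstract does not state which thresholds define an "accurately predicted" gene, so none is applied here.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")  # correlation is undefined for constant input
    return cov / (sd_x * sd_y)


def rmse(x, y):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def per_gene_metrics(measured, predicted):
    """Return {gene: (pearson_r, rmse)} for every gene in `measured`.

    Both arguments map a gene name to its expression values across the
    same ordered list of samples (hypothetical layout).
    """
    return {
        gene: (pearson(measured[gene], predicted[gene]),
               rmse(measured[gene], predicted[gene]))
        for gene in measured
    }


# Toy data: two hypothetical genes measured in four samples.
measured = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [1.0, 2.0, 3.0, 4.0]}
predicted = {"GENE_A": [2.0, 4.0, 6.0, 8.0], "GENE_B": [4.0, 3.0, 2.0, 1.0]}
metrics = per_gene_metrics(measured, predicted)
```

GENE_A shows why the two metrics are reported together: its prediction is perfectly correlated with the measurement yet has a large RMSE, because correlation ignores scale while RMSE does not.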

3.
Sci Rep ; 13(1): 19656, 2023 11 11.
Article in English | MEDLINE | ID: mdl-37952062

ABSTRACT

Rapid, accurate and comprehensive diagnostics are essential for outbreak prevention and pathogen surveillance. Real-time, on-site metagenomics on miniaturized devices, such as Oxford Nanopore Technologies MinION sequencing, could provide a promising approach. However, current sample preparation protocols often require substantial equipment and dedicated laboratories, limiting their use. In this study, we developed a rapid on-site applicable DNA extraction and library preparation approach for nanopore sequencing, using portable devices. The optimized method consists of a portable mechanical lysis approach followed by magnetic bead-based DNA purification and automated sequencing library preparation, and resulted in a throughput comparable to a current optimal, laboratory-based protocol using enzymatic digestion to lyse cells. By using spike-in reference communities, we compared the on-site method with other workflows, and demonstrated reliable taxonomic profiling, despite method-specific biases. We also demonstrated the added value of long-read sequencing by recovering reads containing full-length antimicrobial resistance genes, and attributing them to a host species based on the additional genomic information they contain. Our method may provide a rapid, widely-applicable approach for microbial detection and surveillance in a variety of on-site settings.


Subject(s)
Anti-Bacterial Agents , Nanopores , Workflow , Drug Resistance, Bacterial/genetics , Metagenome , Metagenomics/methods , High-Throughput Nucleotide Sequencing/methods , DNA , Sequence Analysis, DNA/methods
4.
BMC Biol ; 21(1): 244, 2023 11 06.
Article in English | MEDLINE | ID: mdl-37926805

ABSTRACT

BACKGROUND: Sterile-fertile heteroblasty is a common phenomenon observed in ferns, where the leaf shape of a fern sporophyll, responsible for sporangium production, differs from that of a regular trophophyll. However, due to the large size and complexity of most fern genomes, the molecular mechanisms that regulate the formation of these functionally different leaf types have remained elusive. To shed light on these mechanisms, we generated a full-length transcriptome of Ceratopteris chingii with PacBio Iso-Seq from five tissue samples. By integrating Illumina short reads, we identified the genes exhibiting the most significant differential expression between sporophylls and trophophylls. RESULTS: The long reads were assembled, resulting in a total of 24,024 gene models. The genes differentially expressed between the two leaf types were primarily involved in reproduction and cell wall composition, with a particular focus on expansin genes. Reconstructing the phylogeny of expansin genes across 19 plant species, ranging from green algae to seed plants, we identified four ortholog groups for expansins. The observed high expression of expansin genes in the young sporophylls of C. chingii emphasizes their role in the development of heteroblastic leaves. Through gene coexpression analysis, we identified highly divergent expressions of expansin genes both within and between species. CONCLUSIONS: The specific regulatory interactions and accompanying expression patterns of expansin genes are associated with variations in leaf shapes between sporophylls and trophophylls.


Subject(s)
Cell Wall , Fertility , Phylogeny , Plant Leaves/genetics , Reproduction , Plant Proteins/genetics , Gene Expression Regulation, Plant
5.
Appl Environ Microbiol ; 89(10): e0115523, 2023 10 31.
Article in English | MEDLINE | ID: mdl-37819078

ABSTRACT

While the evolution of antimicrobial resistance is well studied in free-living bacteria, information on resistance development in dense and diverse biofilm communities is largely lacking. Therefore, we explored how the social interactions in a duo-species biofilm composed of the brewery isolates Pseudomonas rhodesiae and Raoultella terrigena influence the adaptation to the broad-spectrum antimicrobial sulfathiazole. Previously, we showed that the competition between these brewery isolates enhances the antimicrobial tolerance of P. rhodesiae. Here, we found that this enhanced tolerance in duo-species biofilms is associated with a strongly increased antimicrobial resistance development in P. rhodesiae. Whereas P. rhodesiae was not able to evolve resistance against sulfathiazole in monospecies conditions, it rapidly evolved resistance in the majority of the duo-species communities. Although the initial presence of R. terrigena was thus required for P. rhodesiae to acquire resistance, the resistance mechanisms did not depend on the presence of R. terrigena. Whole genome sequencing of resistant P. rhodesiae clones showed no clear mutational hot spots. This indicates that the acquired resistance phenotype depends on complex interactions between low-frequency mutations in the genetic background of the strains. We hypothesize that the increased tolerance in duo-species conditions promotes resistance by enhancing the selection of partially resistant mutants and opening up novel evolutionary trajectories that enable such genetic interactions. This hypothesis is reinforced by experimentally excluding potential effects of increased initial population size, enhanced mutation rate, and horizontal gene transfer. 
Altogether, our observations suggest that the community mode of life and the social interactions therein strongly affect the accessible evolutionary pathways toward antimicrobial resistance. IMPORTANCE: Antimicrobial resistance is one of the most studied bacterial properties due to its enormous clinical and industrial relevance; however, most research focuses on resistance development of a single species in isolation. In the present study, we showed that resistance evolution of brewery isolates can differ greatly between single- and mixed-species conditions. Specifically, we observed that the development of antimicrobial resistance in certain species can be significantly enhanced in co-culture as compared to the single-species conditions. Overall, the current study emphasizes the need to consider bacterial interactions within microbial communities when evaluating antimicrobial treatments and resistance evolution.


Subject(s)
Anti-Infective Agents , Anti-Infective Agents/pharmacology , Biofilms , Bacteria/genetics , Phenotype , Sulfathiazoles/pharmacology , Anti-Bacterial Agents/pharmacology
6.
Front Microbiol ; 14: 1204630, 2023.
Article in English | MEDLINE | ID: mdl-37520372

ABSTRACT

Introduction: Shiga toxin-producing Escherichia coli (STEC) is a gastrointestinal pathogen causing foodborne outbreaks. Whole Genome Sequencing (WGS) in STEC surveillance holds promise in outbreak prevention and confinement, in broadening STEC epidemiology and in contributing to risk assessment and source attribution. However, despite international recommendations, WGS is often restricted to assisting outbreak investigation and is not yet fully implemented in food safety surveillance across all European countries, in contrast to, for example, the United States. Methods: In this study, WGS was retrospectively applied to isolates collected within the context of Belgian food safety surveillance and combined with data from clinical isolates to evaluate its benefits. A cross-sector WGS-based collection of 754 strains from 1998 to 2020 was analyzed. Results: We confirmed that WGS in food safety surveillance allows accurate detection of genomic relationships between human cases and strains isolated from food samples, including those dispersed over time and geographical locations. Identifying these links can reveal new insights into outbreaks and direct epidemiological investigations to facilitate outbreak management. Complete WGS-based isolate characterization enabled expanding epidemiological insights related to circulating serotypes, virulence genes and antimicrobial resistance across different reservoirs. Moreover, associations between virulence genes and severe disease were determined by incorporating human metadata into the data analysis. Gaps in the surveillance system were identified and suggestions for optimization related to sample centralization, harmonizing isolation methods, and expanding sampling strategies were formulated. Discussion: This study contributes to developing a representative WGS-based collection of circulating STEC strains and by illustrating its benefits, it aims to incite policymakers to support WGS uptake in food safety surveillance.

7.
Cancer Res ; 83(17): 2970-2984, 2023 09 01.
Article in English | MEDLINE | ID: mdl-37352385

ABSTRACT

In prostate cancer, there is an urgent need for objective prognostic biomarkers that identify the metastatic potential of a tumor at an early stage. While recent analyses indicated TP53 mutations as candidate biomarkers, molecular profiling in a clinical setting is complicated by tumor heterogeneity. Deep learning models that predict the spatial presence of TP53 mutations in whole slide images (WSI) offer the potential to mitigate this issue. To assess the potential of WSIs as proxies for spatially resolved profiling and as biomarkers for aggressive disease, we developed TiDo, a deep learning model that achieves state-of-the-art performance in predicting TP53 mutations from WSIs of primary prostate tumors. In an independent multifocal cohort, the model showed successful generalization at both the patient and lesion level. Analysis of model predictions revealed that false positive (FP) predictions could at least partially be explained by TP53 deletions, suggesting that some FP carry an alteration that leads to the same histological phenotype as TP53 mutations. Comparative expression and histologic cell type analyses identified a TP53-like cellular phenotype triggered by expression of pathways affecting stromal composition. Together, these findings indicate that WSI-based models might not be able to perfectly predict the spatial presence of individual TP53 mutations but they have the potential to elucidate the prognosis of a tumor by depicting a downstream phenotype associated with aggressive disease biomarkers. SIGNIFICANCE: Deep learning models predicting TP53 mutations from whole slide images of prostate cancer capture histologic phenotypes associated with stromal composition, lymph node metastasis, and biochemical recurrence, indicating their potential as in silico prognostic biomarkers. See related commentary by Bordeleau, p. 2809.


Subject(s)
Prostatic Neoplasms , Male , Humans , Mutation , Prostatic Neoplasms/genetics , Prostatic Neoplasms/pathology , Prognosis , Prostate/pathology , Phenotype , Tumor Suppressor Protein p53/genetics
8.
BMC Genomics ; 24(1): 247, 2023 May 09.
Article in English | MEDLINE | ID: mdl-37161318

ABSTRACT

BACKGROUND: The Human Leukocyte Antigen (HLA) genes are a group of highly polymorphic genes that are located in the Major Histocompatibility Complex (MHC) region on chromosome 6. The HLA genotype affects the presentability of tumour antigens to the immune system. While knowledge of these genotypes is of utmost importance to study differences in immune responses between cancer patients, gold standard, PCR-derived genotypes are rarely available in large Next Generation Sequencing (NGS) datasets. Therefore, a variety of methods for in silico NGS-based HLA genotyping have been developed, bypassing the need to determine these genotypes with separate experiments. However, there is currently no consensus on the best performing tool. RESULTS: We evaluated 13 MHC class I and/or class II HLA callers that are currently available for free academic use and run on either Whole Exome Sequencing (WES) or RNA sequencing data. Computational resource requirements were highly variable between these tools. Three orthogonal approaches were used to evaluate the accuracy on several large publicly available datasets: a direct benchmark using PCR-derived gold standard HLA calls, a correlation analysis with population-based allele frequencies and an analysis of the concordance between the different tools. The highest MHC-I calling accuracies were found for Optitype (98.0%) and arcasHLA (99.4%) on WES and RNA sequencing data respectively, while for MHC-II HLA-HD was the most accurate tool for both data types (96.2% and 99.4% on WES and RNA data respectively). CONCLUSION: The optimal strategy for HLA genotyping from NGS data depends on the availability of either WES or RNA data, the size of the dataset and the available computational resources. If sufficient resources are available, we recommend Optitype and HLA-HD for MHC-I and MHC-II genotype calling respectively.
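
The direct benchmark against gold-standard calls mentioned in this abstract can be illustrated with a small sketch. The abstract does not say exactly how accuracy was computed, so the allele-level definition below (fraction of gold-standard alleles recovered, ignoring allele order within a gene) is an assumption, as are the function names and the toy genotypes.

```python
from collections import Counter


def allele_matches(called, truth):
    """Count alleles shared by a call and the gold standard, ignoring order."""
    # Counter intersection keeps the minimum count per allele, so a
    # homozygous truth (A, A) matched by a call (A, B) scores 1, not 2.
    return sum((Counter(called) & Counter(truth)).values())


def calling_accuracy(calls, gold):
    """Fraction of gold-standard alleles recovered by a caller.

    Both arguments map (sample, gene) to a tuple of allele names.
    A (sample, gene) pair missing from `calls` counts as zero matches.
    """
    matched = 0
    total = 0
    for key, truth in gold.items():
        total += len(truth)
        matched += allele_matches(calls.get(key, ()), truth)
    return matched / total if total else float("nan")


# Toy data with hypothetical samples.
gold = {
    ("s1", "HLA-A"): ("A*01:01", "A*02:01"),
    ("s2", "HLA-A"): ("A*03:01", "A*03:01"),
}
calls = {
    ("s1", "HLA-A"): ("A*02:01", "A*01:01"),  # same alleles, other order
    ("s2", "HLA-A"): ("A*03:01", "A*24:02"),  # one allele wrong
}
accuracy = calling_accuracy(calls, gold)
```

Counting per allele rather than per genotype gives partial credit when only one of the two alleles is wrong. A stricter per-genotype definition is equally plausible and would give a lower figure on the same data.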


Subject(s)
Benchmarking , HLA Antigens , Humans , Major Histocompatibility Complex , Genotype , High-Throughput Nucleotide Sequencing
9.
Microb Genom ; 9(1)2023 01.
Article in English | MEDLINE | ID: mdl-36748573

ABSTRACT

For antimicrobial resistance (AMR) surveillance, it is important not only to detect AMR genes, but also to determine their plasmidic or chromosomal location, as this will impact their spread differently. Whole-genome sequencing (WGS) is increasingly used for AMR surveillance. However, determining the genetic context of AMR genes using only short-read sequencing is complicated. The combination with long-read sequencing offers a potential solution, as it allows hybrid assemblies. Nevertheless, its use in surveillance has so far been limited. This study aimed to demonstrate its added value for AMR surveillance based on a case study of extended-spectrum beta-lactamases (ESBLs). ESBL genes have been reported to occur also on plasmids. To gain insight into the diversity and genetic context of ESBL genes detected in clinical isolates received by the Belgian National Reference Center between 2013 and 2018, 100 ESBL-producing Shigella and 31 ESBL-producing Salmonella were sequenced with MiSeq and a representative selection of 20 Shigella and six Salmonella isolates additionally with MinION technology, allowing hybrid assembly. The bla CTX-M-15 gene was found to be responsible for a rapid rise in the ESBL Shigella phenotype from 2017. This gene was mostly detected on multi-resistance-carrying IncFII plasmids. Based on clustering, these plasmids were determined to be distinct from the circulating plasmids before 2017. They were spread to different Shigella species and within Shigella sonnei between multiple genotypes. Another similar IncFII plasmid was detected after 2017 containing bla CTX-M-27 for which only clonal expansion occurred. Matches of up to 99 % to plasmids of various bacterial hosts from all over the world were found, but global alignments indicated that direct or recent ESBL-plasmid transfers did not occur. It is most likely that travellers introduced these in Belgium and subsequently spread them domestically. 
However, a clear link to a specific country could not be made. Moreover, integration of bla CTX-M in the chromosome of two Shigella isolates was determined for the first time, and shown to be related to ISEcp1. In contrast, in Salmonella, ESBL genes were only found on plasmids, of which bla CTX-M-55 and IncHI2 were the most prevalent, respectively. No matching ESBL plasmids or cassettes were detected between clinical Shigella and Salmonella isolates. The hybrid assembly data allowed us to check the accuracy of plasmid prediction tools. MOB-suite showed the highest accuracy. However, these tools cannot replace the accuracy of long-read and hybrid assemblies. This study illustrates the added value of hybrid assemblies for AMR surveillance and shows that a strategy in which even just representative isolates of a collection are used for hybrid assemblies could improve international AMR surveillance as it allows plasmid tracking.


Subject(s)
Shigella , beta-Lactamases , Belgium , beta-Lactamases/genetics , Microbial Sensitivity Tests , Plasmids/genetics , Shigella/genetics , Salmonella/genetics
11.
Mol Biol Evol ; 39(12)2022 12 05.
Article in English | MEDLINE | ID: mdl-36480297

ABSTRACT

Antibiotic cycling has been proposed as a promising approach to slow down resistance evolution against currently employed antibiotics. It remains unclear, however, to which extent the decreased resistance evolution is the result of collateral sensitivity, an evolutionary trade-off where resistance to one antibiotic enhances the sensitivity to the second, or due to additional effects of the evolved genetic background, in which mutations accumulated during treatment with a first antibiotic alter the emergence and spread of resistance against a second antibiotic via other mechanisms. Also, the influence of antibiotic exposure patterns on the outcome of drug cycling is unknown. Here, we systematically assessed the effects of the evolved genetic background by focusing on the first switch between two antibiotics against Salmonella Typhimurium, with cefotaxime fixed as the first and a broad variety of other drugs as the second antibiotic. By normalizing the antibiotic concentrations to eliminate the effects of collateral sensitivity, we demonstrated a clear contribution of the evolved genetic background beyond collateral sensitivity, which either enhanced or reduced the adaptive potential depending on the specific drug combination. We further demonstrated that the gradient strength with which cefotaxime was applied affected both cefotaxime resistance evolution and adaptation to second antibiotics, an effect that was associated with higher levels of clonal interference and reduced cost of resistance in populations evolved under weaker cefotaxime gradients. Overall, our work highlights that drug cycling can affect resistance evolution independently of collateral sensitivity, in a manner that is contingent on the antibiotic exposure pattern.


Subject(s)
Anti-Bacterial Agents , Drug Collateral Sensitivity , Anti-Bacterial Agents/pharmacology , Drug Resistance, Multiple, Bacterial/genetics , Microbial Sensitivity Tests , Cefotaxime/pharmacology , Drug Resistance, Bacterial/genetics
12.
Foods ; 11(21)2022 Oct 25.
Article in English | MEDLINE | ID: mdl-36359961

ABSTRACT

In this proof-of-concept study on food contaminated with norovirus, we investigated the feasibility of metagenomics as a new method to obtain the whole genome sequence of the virus, perform strain-level characterization, and relate it to human cases in order to resolve foodborne outbreaks. We tested several preparation methods to determine if a more open sequencing approach, i.e., shotgun metagenomics, or a more targeted approach, including hybrid capture, was the most appropriate. The genetic material was sequenced using Oxford Nanopore technologies with or without adaptive sampling, and the data were analyzed with an in-house bioinformatics workflow. We showed that a viral genome sequence could be obtained for phylogenetic analysis with shotgun metagenomics if the contamination load was sufficiently high or after hybrid capture for lower contamination. Relatedness to human cases goes well beyond the results obtained with the current qPCR methods. This workflow was also tested on a publicly available dataset of food spiked with norovirus and hepatitis A virus. This allowed us to prove that we could detect even fewer genome copies and two viruses present in a sample using shotgun metagenomics. We share the lessons learnt on the satisfactory and unsatisfactory results in an attempt to advance the field.

13.
Plant J ; 112(3): 830-846, 2022 11.
Article in English | MEDLINE | ID: mdl-36123806

ABSTRACT

Both gene duplication and alternative splicing (AS) drive the functional diversity of gene products in plants, yet the relative contributions of the two key mechanisms to the evolution of gene function are largely unclear. Here, we studied AS in two closely related lotus plants, Nelumbo lutea and Nelumbo nucifera, and the outgroup Arabidopsis thaliana, for both single-copy and duplicated genes. We show that most splicing events evolved rapidly between orthologs and that the origin of lineage-specific splice variants or isoforms contributed to gene functional changes during species divergence within Nelumbo. Single-copy genes contain more isoforms, have more AS events conserved across species, and show more complex tissue-dependent expression patterns than their duplicated counterparts. This suggests that expression divergence through isoforms is a mechanism to extend the expression breadth of genes with low copy numbers. As compared to isoforms of local, small-scale duplicates, isoforms of whole-genome duplicates are less conserved and display a less conserved tissue bias, pointing towards their contribution to subfunctionalization. Through comparative analysis of isoform expression networks, we identified orthologous genes of which the expression of at least some of their isoforms displays a conserved tissue bias across species, indicating a strong selection pressure for maintaining a stable expression pattern of these isoforms. Overall, our study shows that both AS and gene duplication contributed to the diversity of gene function during the evolution of lotus.


Subject(s)
Arabidopsis , Lotus , Nelumbo , Lotus/genetics , Gene Duplication , Genes, Duplicate , Protein Isoforms/genetics , Arabidopsis/genetics , Nelumbo/genetics , Gene Expression , Evolution, Molecular
14.
DNA Res ; 29(4)2022 Jun 25.
Article in English | MEDLINE | ID: mdl-35904558

ABSTRACT

With the decreasing cost of sequencing and availability of larger numbers of sequenced genomes, comparative genomics is becoming increasingly attractive to complement experimental techniques for the task of transcription factor (TF) binding site identification. In this study, we redesigned BLSSpeller, a motif discovery algorithm, to cope with larger sequence datasets. BLSSpeller was used to identify novel motifs in Zea mays in a comparative genomics setting with 16 monocot lineages. We discovered 61 motifs of which 20 matched previously described motif models in Arabidopsis. In addition, novel, yet uncharacterized motifs were detected, several of which are supported by available sequence-based and/or functional data. Instances of the predicted motifs were enriched around transcription start sites and contained signatures of selection. Moreover, the enrichment of the predicted motif instances in open chromatin and TF binding sites indicates their functionality, supported by the fact that genes carrying instances of these motifs were often found to be co-expressed and/or enriched in similar GO functions. Overall, our study unveiled several novel candidate motifs that might help our understanding of the genotype to phenotype association in crops.


Subject(s)
Arabidopsis , Zea mays , Algorithms , Arabidopsis/genetics , Binding Sites , Genomics/methods , Nucleotide Motifs , Protein Binding , Zea mays/genetics
15.
Bioinformatics ; 38(12): 3245-3251, 2022 06 13.
Article in English | MEDLINE | ID: mdl-35552634

ABSTRACT

MOTIVATION: Network-based driver identification methods that can exploit mutual exclusivity typically fail to detect rare drivers because of their statistical rigor. Propagation-based methods in contrast allow recovering rare driver genes, but the interplay between network topology and high-scoring nodes often results in spurious predictions. The specificity of driver gene detection can be improved by taking into account both gene-specific and gene-set properties. Combining these requires a formalism that can adjust gene-set properties depending on the exact network context within which a gene is analyzed. RESULTS: We developed OMEN: a logic programming framework based on random walk semantics. OMEN presents a number of novel concepts. In particular, its design is unique in that it presents an effective approach to combine both gene-specific driver properties and gene-set properties, and includes a novel method to avoid restrictive, a priori filtering of genes by exploiting the gene-set property of mutual exclusivity, expressed in terms of the functional impact scores of mutations, rather than in terms of simple binary mutation calls. Applying OMEN to a benchmark dataset derived from TCGA illustrates how OMEN is able to robustly identify driver genes and modules of driver genes as proxies of driver pathways. AVAILABILITY AND IMPLEMENTATION: The source code is freely available for download at www.github.com/DriesVanDaele/OMEN. The dataset is archived at https://doi.org/10.5281/zenodo.6419097 and the code at https://doi.org/10.5281/zenodo.6419764. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology , Neoplasms , Humans , Computational Biology/methods , Algorithms , Neoplasms/genetics , Software , Mutation , Gene Regulatory Networks
16.
Cell Rep Methods ; 2(2): 100171, 2022 02 28.
Article in English | MEDLINE | ID: mdl-35474966

ABSTRACT

We present deep link prediction (DLP), a method for the interpretation of loss-of-function screens. Our approach uses representation-based link prediction to reprioritize phenotypic readouts by integrating screening experiments with gene-gene interaction networks. We validate on 2 different loss-of-function technologies, RNAi and CRISPR, using datasets obtained from DepMap. Extensive benchmarking shows that DLP-DeepWalk outperforms other methods in recovering cell-specific dependencies, achieving an average precision well above 90% across 7 different cancer types and on both RNAi and CRISPR data. We show that the genes ranked highest by DLP-DeepWalk are appreciably more enriched in drug targets compared to the ranking based on original screening scores. Interestingly, this enrichment is more pronounced on RNAi data compared to CRISPR data, consistent with the greater inherent noise of RNAi screens. Finally, we demonstrate how DLP-DeepWalk can infer the molecular mechanism through which putative targets trigger cell line mortality.
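
The benchmark figure quoted in this abstract is an average precision over a gene ranking. The sketch below computes that metric in its standard form for one ranked list. It is not the DLP method itself, and the function name, gene labels, and reference set are placeholders.

```python
def average_precision(ranked_genes, true_dependencies):
    """Average precision of a ranked gene list against a reference set.

    ranked_genes: genes ordered from highest to lowest predicted score.
    true_dependencies: set of genes known to be dependencies.
    """
    if not true_dependencies:
        return float("nan")
    hits = 0
    precision_sum = 0.0
    for rank, gene in enumerate(ranked_genes, start=1):
        if gene in true_dependencies:
            hits += 1
            # Precision at the rank where this true dependency appears.
            precision_sum += hits / rank
    return precision_sum / len(true_dependencies)


# Toy ranking: true dependencies g1 and g3 sit at ranks 1 and 3.
ap = average_precision(["g1", "g2", "g3", "g4"], {"g1", "g3"})
```

Here the result is (1/1 + 2/3) / 2, about 0.83. The score reaches 1.0 only when every true dependency is ranked above every other gene.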


Subject(s)
Neoplasms , Humans , Neoplasms/genetics , RNA Interference , Cell Line
17.
Gigascience ; 11(1)2022 01 12.
Article in English | MEDLINE | ID: mdl-35022699

ABSTRACT

BACKGROUND: The accurate detection of somatic variants from sequencing data is of key importance for cancer treatment and research. Somatic variant calling requires a high sequencing depth of the tumor sample, especially when the detection of low-frequency variants is also desired. In turn, this leads to large volumes of raw sequencing data to process and hence, large computational requirements. For example, calling the somatic variants according to the GATK best practices guidelines requires days of computing time for a typical whole-genome sequencing sample. FINDINGS: We introduce Halvade Somatic, a framework for somatic variant calling from DNA sequencing data that takes advantage of multi-node and/or multi-core compute platforms to reduce runtime. It relies on Apache Spark to provide scalable I/O and to create and manage data streams that are processed on different CPU cores in parallel. Halvade Somatic contains all required steps to process the tumor and matched normal sample according to the GATK best practices recommendations: read alignment (BWA), sorting of reads, preprocessing steps such as marking duplicate reads and base quality score recalibration (GATK), and, finally, calling the somatic variants (Mutect2). Our approach reduces the runtime on a single 36-core node to 19.5 h compared to a runtime of 84.5 h for the original pipeline, a speedup of 4.3 times. Runtime can be further decreased by scaling to multiple nodes, e.g., we observe a runtime of 1.36 h using 16 nodes, an additional speedup of 14.4 times. Halvade Somatic supports variant calling from both whole-genome sequencing and whole-exome sequencing data and also supports Strelka2 as an alternative or complementary variant calling tool. We provide a Docker image to facilitate single-node deployment. Halvade Somatic can be executed on a variety of compute platforms, including Amazon EC2 and Google Cloud. 
CONCLUSIONS: To our knowledge, Halvade Somatic is the first somatic variant calling pipeline that leverages Big Data processing platforms and provides reliable, scalable performance. Source code is freely available.


Subject(s)
High-Throughput Nucleotide Sequencing, Software, High-Throughput Nucleotide Sequencing/methods, Polymorphism, Single Nucleotide, Sequence Analysis, DNA/methods, Exome Sequencing, Whole Genome Sequencing
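
The runtimes quoted in the Halvade Somatic abstract can be checked with a few lines of arithmetic. The sketch below is not part of the tool; the function names and the parallel-efficiency measure are illustrative assumptions. Recomputing from the rounded runtimes gives roughly 14.3 for the multi-node step, slightly below the reported 14.4, presumably because the abstract's figures were derived from unrounded timings.

```python
# Illustrative check of the speedups reported in the abstract.
# Only the three runtimes come from the abstract; all names are hypothetical.

def speedup(baseline_hours: float, improved_hours: float) -> float:
    """Ratio of the baseline runtime to the improved runtime."""
    return baseline_hours / improved_hours


def parallel_efficiency(speedup_value: float, workers: int) -> float:
    """Speedup divided by the number of workers (1.0 means perfect scaling)."""
    return speedup_value / workers


ORIGINAL_PIPELINE_H = 84.5  # original pipeline, single 36-core node
SINGLE_NODE_H = 19.5        # Halvade Somatic, single 36-core node
SIXTEEN_NODE_H = 1.36       # Halvade Somatic, 16 nodes

single_node_speedup = speedup(ORIGINAL_PIPELINE_H, SINGLE_NODE_H)
multi_node_speedup = speedup(SINGLE_NODE_H, SIXTEEN_NODE_H)
overall_speedup = speedup(ORIGINAL_PIPELINE_H, SIXTEEN_NODE_H)
node_efficiency = parallel_efficiency(multi_node_speedup, 16)

print(f"single-node speedup: {single_node_speedup:.1f}x")
print(f"additional 16-node speedup: {multi_node_speedup:.1f}x")
print(f"overall speedup: {overall_speedup:.1f}x")
print(f"scaling efficiency across 16 nodes: {node_efficiency:.0%}")
```

The efficiency figure is a derived quantity, not a result stated by the authors; it only indicates how close the 16-node run comes to linear scaling under these rounded numbers.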
18.
Front Microbiol ; 12: 752883, 2021.
Article in English | MEDLINE | ID: mdl-34956117

ABSTRACT

The increasing worldwide prevalence of extended-spectrum beta-lactamase (ESBL) producing Escherichia coli constitutes a serious threat to global public health. Surgical site infections are associated with high morbidity and mortality rates in developing countries, fueled by the limited availability of effective antibiotics. We used whole-genome sequencing (WGS) to evaluate antimicrobial resistance and the phylogenomic relationships of 19 ESBL-positive E. coli isolates collected from surgical site infections in patients across public hospitals in Benin in 2019. Isolates were identified by MALDI-TOF mass spectrometry and phenotypically tested for susceptibility to 16 antibiotics. Core-genome multi-locus sequence typing and single-nucleotide polymorphism-based phylogenomic methods were used to investigate the relatedness between samples. The broader phylogenetic context was characterized through the inclusion of publicly available genome data. Among the 19 isolates, 13 different sequence types (STs) were observed, including ST131 (n = 2), ST38 (n = 2), ST410 (n = 2), ST405 (n = 2), ST617 (n = 2), and ST1193 (n = 2). The blaCTX-M-15 gene encoding ESBL resistance was found in 15 isolates (78.9%), as well as other genes associated with ESBL, such as blaOXA-1 (n = 14) and blaTEM-1 (n = 9). Additionally, we frequently observed genes encoding resistance against aminoglycosides [aac-(6')-Ib-cr, n = 14], quinolones (qnrS1, n = 4), tetracyclines [tet(B), n = 14], sulfonamides (sul2, n = 14), and trimethoprim (dfrA17, n = 13). Nonsynonymous chromosomal mutations in the housekeeping genes parC and gyrA associated with resistance to fluoroquinolones were also detected in multiple isolates. Although the phylogenomic investigation did not reveal evidence of hospital-acquired transmissions, we observed two very similar strains collected from patients in different hospitals.
By characterizing a set of multidrug-resistant isolates collected from a largely unexplored environment, this study highlights the added value of WGS as an effective early warning system for emerging pathogens and antimicrobial resistance.

19.
Front Microbiol ; 12: 738284, 2021.
Article in English | MEDLINE | ID: mdl-34803953

ABSTRACT

The current routine laboratory practices to investigate food samples in case of foodborne outbreaks still rely on attempts to isolate the pathogen in order to characterize it. In this study, we present a proof of concept, using food samples spiked with Shiga toxin-producing Escherichia coli, for a strain-level metagenomics foodborne outbreak investigation method using the MinION and Flongle flow cells from Oxford Nanopore Technologies, and we compared this to Illumina short-read-based metagenomics. After 12 h of MinION sequencing, strain-level characterization could be achieved, linking the food containing a pathogen to the related human isolate of the affected patient, by means of a single-nucleotide polymorphism (SNP)-based phylogeny. The inferred strain harbored the same virulence genes as the spiked isolate and could be serotyped. This was achieved by applying a bioinformatics method on the long reads using reference-based classification. The same result could be obtained after 24 h of sequencing on the more recent, lower-output Flongle flow cell, on an extract treated with eukaryotic host DNA removal. Moreover, an alternative approach based on in silico DNA walking allowed rapid confirmation of the presence of a putative pathogen in the food sample. The DNA fragment harboring characteristic virulence genes could be matched to the E. coli genus after sequencing only 1 h with the MinION, 1 h with the Flongle if using a host DNA removal extraction, or 5 h with the Flongle with a classical DNA extraction. This paves the way towards the use of metagenomics as a rapid, simple, one-step method for foodborne pathogen detection and for fast outbreak investigation that can be implemented in routine laboratories on samples prepared with the current standard practices.
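
The abstract above links the food sample to the patient isolate through an SNP-based phylogeny. As a rough illustration of the quantity such a phylogeny is built on, the sketch below counts pairwise SNP differences between pre-aligned sequences. It is not the authors' pipeline: the function name, the sample labels, and the ten-base toy sequences are all hypothetical, and real analyses work on genome-scale alignments with dedicated tools.

```python
# Toy pairwise SNP distance on an existing alignment (illustrative only).
from itertools import combinations

VALID_BASES = set("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned sequences carry different bases.

    Positions with a gap or ambiguous base in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


# Hypothetical aligned fragments; the labels are placeholders, not study data.
alignment = {
    "food_metagenome": "ACGTACGTAC",
    "patient_isolate": "ACGTACGTAC",
    "unrelated_strain": "ACGAACCTAT",
}

distances = {
    (name_a, name_b): snp_distance(alignment[name_a], alignment[name_b])
    for name_a, name_b in combinations(alignment, 2)
}

for (name_a, name_b), distance in distances.items():
    print(f"{name_a} vs {name_b}: {distance} SNPs")
```

In this toy data the food-derived sequence and the patient isolate differ at no position, while the unrelated strain differs at three, which is the kind of contrast a distance-based tree would render as a tight cluster against a long branch.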

20.
Front Microbiol ; 12: 750278, 2021.
Article in English | MEDLINE | ID: mdl-34795649

ABSTRACT

Through staphylococcal enterotoxin (SE) production, Staphylococcus aureus is a common cause of food poisoning. Detection of staphylococcal food poisoning (SFP) is mostly performed using immunoassays, which, however, only detect five of 27 SEs described to date. Polymerase chain reactions are, therefore, frequently used in complement to identify a bigger arsenal of SE at the gene level (se) but are labor-intensive. Complete se profiling of isolates from different sources, i.e., food and human cases, is, however, important to provide an indication of their potential link within foodborne outbreak investigation. In addition to complete se gene profiling, relatedness between isolates is determined with more certainty using pulsed-field gel electrophoresis, Staphylococcus protein A gene typing and other methods, but these are shown to lack resolution. We evaluated how whole genome sequencing (WGS) can offer a solution to these shortcomings. By WGS analysis of a selection of S. aureus isolates, including some belonging to a confirmed foodborne outbreak, its added value as the ultimate multiplexing method was demonstrated. In contrast to PCR-based se gene detection for which primers are sometimes shown to be non-specific, WGS enabled complete se gene profiling with high performance, provided that a database containing reference sequences for all se genes was constructed and employed. The custom compiled database and applied parameters were made publicly available in an online user-friendly interface. As an all-in-one approach with high resolution, WGS additionally allowed inferring correct isolate relationships. The different DNA extraction kits that were tested affected neither se gene profiling nor relatedness determination, which is interesting for data sharing during SFP outbreak investigation. 
Although confirming the production of enterotoxins remains important for SFP investigation, we delivered a proof-of-concept that WGS is a valid alternative and/or complementary tool for outbreak investigation.
