Results 1 - 20 of 8,931
1.
Proc Natl Acad Sci U S A ; 121(44): e2411413121, 2024 Oct 29.
Article in English | MEDLINE | ID: mdl-39432787

ABSTRACT

Methane (CH4) is a potent greenhouse gas but also an important carbon and energy substrate for some lake food webs. Understanding how CH4 incorporates into food webs is, therefore, crucial for unraveling CH4 cycling and its impacts on climate and ecosystems. However, CH4-fueled lake food webs from pre-Holocene intervals, particularly during greenhouse climates in Earth history, have received relatively little attention. Here, we present a long-term record of CH4-fueled pelagic food webs across the Cretaceous Oceanic Anoxic Event 1a (~120 Mya) that serves as a geological analog to future warming. We show an exceptionally strong expansion of both methanogens and CH4-oxidizing bacteria (up to 87% of hopanoid-producing bacteria) during this Event. Grazing on CH4-oxidizing bacteria by zooplankton (up to 47% of ciliate diets) within the chemocline transferred substantial CH4-derived carbon to the higher trophic levels, representing an important CH4 sink in the water column. Our findings suggest that as Earth warms, microbial CH4 cycling could restructure food webs and fundamentally alter carbon and energy flows and trophic pathways in lake ecosystems.


Subjects
Food Chain , Lakes , Methane , Zooplankton , Methane/metabolism , Lakes/microbiology , Zooplankton/metabolism , Animals , Ecosystem , Greenhouse Gases/metabolism , Greenhouse Gases/analysis , Bacteria/metabolism , Greenhouse Effect
2.
Proc Natl Acad Sci U S A ; 121(28): e2403888121, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38968102

ABSTRACT

Real-world communication frequently requires language producers to address more than one comprehender at once, yet most psycholinguistic research focuses on one-on-one communication. As the audience size grows, interlocutors face new challenges that do not arise in dyads. They must consider multiple perspectives and weigh multiple sources of feedback to build shared understanding. Here, we ask which properties of the group's interaction structure facilitate successful communication. We used a repeated reference game paradigm in which directors instructed between one and five matchers to choose specific targets out of a set of abstract figures. Across 313 games (N = 1,319 participants), we manipulated several key constraints on the group's interaction, including the amount of feedback that matchers could give to directors and the availability of peer interaction between matchers. Across groups of different sizes and interaction constraints, describers produced increasingly efficient utterances and matchers made increasingly accurate selections. Critically, however, we found that smaller groups and groups with less-constrained interaction structures ("thick channels") showed stronger convergence to group-specific conventions than large groups with constrained interaction structures ("thin channels"), which struggled with convention formation. Overall, these results shed light on the core structural factors that enable communication to thrive in larger groups.


Subjects
Communication , Humans , Male , Female , Adult , Language , Group Processes , Interpersonal Relations , Young Adult , Psycholinguistics
3.
Hum Mol Genet ; 33(14): 1207-1214, 2024 Jul 06.
Article in English | MEDLINE | ID: mdl-38643062

ABSTRACT

Genotype imputation is widely used in genome-wide association studies (GWAS). However, both the genotyping chips and imputation reference panels are dependent on next-generation sequencing (NGS). Due to the nature of NGS, some regions of the genome are inaccessible to sequencing. To date, there has been no complete evaluation of these regions and their impact on the identification of associations in GWAS remains unclear. In this study, we systematically assess the extent to which variants in inaccessible regions are underrepresented on genotyping chips and imputation reference panels, in GWAS results and in variant databases. We also determine the proportion of genes located in inaccessible regions and compare the results across variant masks defined by the 1000 Genomes Project and the TOPMed program. Overall, fewer variants were observed in inaccessible regions in all categories analyzed. Depending on the mask used and normalized for region size, only 4%-17% of the genotyped variants are located in inaccessible regions and 52 to 581 genes were almost completely inaccessible. From the Cooperative Health Research in South Tyrol (CHRIS) study, we present a case study of an association located in an inaccessible region that is driven by genotyped variants and cannot be reproduced by imputation in GRCh37. We conclude that genotyping, NGS, genotype imputation and downstream analyses such as GWAS and fine mapping are systematically biased in inaccessible regions, due to missed variants and spurious associations. To help researchers assess gene and variant accessibility, we provide an online application (https://gab.gm.eurac.edu).


Subjects
Human Genome , Genome-Wide Association Study , Genotype , High-Throughput Nucleotide Sequencing , Single Nucleotide Polymorphism , Humans , Genome-Wide Association Study/methods , High-Throughput Nucleotide Sequencing/methods , Single Nucleotide Polymorphism/genetics
4.
Am J Hum Genet ; 110(2): 251-272, 2023 Feb 02.
Article in English | MEDLINE | ID: mdl-36669495

ABSTRACT

For neurodevelopmental disorders (NDDs), a molecular diagnosis is key for management, predicting outcome, and counseling. Often, routine DNA-based tests fail to establish a genetic diagnosis in NDDs. Transcriptome analysis (RNA sequencing [RNA-seq]) promises to improve the diagnostic yield but has not been applied to NDDs in routine diagnostics. Here, we explored the diagnostic potential of RNA-seq in 96 individuals including 67 undiagnosed subjects with NDDs. We performed RNA-seq on single individuals' cultured skin fibroblasts, with and without cycloheximide treatment, and used modified OUTRIDER Z scores to detect gene expression outliers and mis-splicing by exonic and intronic outliers. Analysis was performed by a user-friendly web application, and candidate pathogenic transcriptional events were confirmed by secondary assays. We identified intragenic deletions, monoallelic expression, and pseudoexonic insertions but also synonymous and non-synonymous variants with deleterious effects on transcription, increasing the diagnostic yield for NDDs by 13%. We found that cycloheximide treatment and exonic/intronic Z score analysis increased detection and resolution of aberrant splicing. Importantly, in one individual mis-splicing was found in a candidate gene nearly matching the individual's specific phenotype. However, pathogenic splicing occurred in another neuronal-expressed gene and provided a molecular diagnosis, stressing the need to customize RNA-seq. Lastly, our web browser application allowed custom analysis settings that facilitate diagnostic application and ranked pathogenic transcripts as top candidates. Our results demonstrate that RNA-seq is a complementary method in the genomic diagnosis of NDDs and, by providing accessible analysis with improved sensitivity, our transcriptome analysis approach facilitates wider implementation of RNA-seq in routine genome diagnostics.


Subjects
Gene Expression Profiling , Neurodevelopmental Disorders , Humans , RNA-Seq , Cycloheximide , RNA Sequence Analysis/methods , Neurodevelopmental Disorders/diagnosis , Neurodevelopmental Disorders/genetics
5.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38770717

ABSTRACT

Drug therapy is vital in cancer treatment. Accurate analysis of drug sensitivity for specific cancers can guide healthcare professionals in prescribing drugs, leading to improved patient survival and quality of life. However, there is a lack of web-based tools that offer comprehensive visualization and analysis of pancancer drug sensitivity. We gathered cancer drug sensitivity data from publicly available databases (GEO, TCGA and GDSC) and developed a web tool called Comprehensive Pancancer Analysis of Drug Sensitivity (CPADS) using Shiny. CPADS currently includes transcriptomic data from over 29 000 samples, encompassing 44 types of cancer, 288 drugs and more than 9000 gene perturbations. It allows easy execution of various analyses related to cancer drug sensitivity. With its large sample size and diverse drug range, CPADS offers a range of analysis methods, such as differential gene expression, gene correlation, pathway analysis, drug analysis and gene perturbation analysis. Additionally, it provides several visualization approaches. CPADS significantly aids physicians and researchers in exploring primary and secondary drug resistance at both gene and pathway levels. The integration of drug resistance and gene perturbation data also presents novel perspectives for identifying pivotal genes influencing drug resistance. Access CPADS at https://smuonco.shinyapps.io/CPADS/ or https://robinl-lab.com/CPADS.


Subjects
Neoplasm Drug Resistance , Internet , Neoplasms , Software , Humans , Neoplasms/drug therapy , Neoplasms/genetics , Neoplasm Drug Resistance/genetics , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Computational Biology/methods , Genetic Databases , Transcriptome , Gene Expression Profiling/methods
6.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38975894

ABSTRACT

Chimeric antigen receptor (CAR) therapy has emerged as a ground-breaking advancement in cancer treatment, harnessing the power of engineered human immune cells to target and eliminate cancer cells. The escalating interest and investment in CAR therapy in recent years emphasize its profound significance in clinical research, positioning it as a rapidly expanding frontier in the field of personalized cancer therapies. A crucial step in CAR therapy design is choosing the right target, as it determines the therapy's effectiveness, safety and specificity against cancer cells while sparing healthy tissues. Herein, we propose a suite of tools for the identification and analysis of potential CAR targets leveraging expression data from The Cancer Genome Atlas and the Genotype-Tissue Expression Project, which are implemented in the CARTAR website. These tools focus on pinpointing tumor-associated antigens, ensuring target selectivity and assessing specificity to avoid off-tumor toxicities, and can be used to rationally design dual CARs. In addition, candidate target expression can be explored in cancer cell lines using expression data from the Cancer Cell Line Encyclopedia. To the best of our knowledge, CARTAR is the first website dedicated to the systematic search for suitable candidate targets for CAR therapy. CARTAR is publicly accessible at https://gmxenomica.github.io/CARTAR/.


Subjects
Neoplasms , Chimeric Antigen Receptors , Humans , Chimeric Antigen Receptors/genetics , Chimeric Antigen Receptors/metabolism , Chimeric Antigen Receptors/immunology , Neoplasms/therapy , Neoplasms/genetics , Adoptive Immunotherapy/methods , Software , Internet , Computational Biology/methods , Genetic Databases
7.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39193916

ABSTRACT

Haxe is a general-purpose, object-oriented programming language supporting syntactic macros. The Haxe compiler is well known for its ability to translate the source code of Haxe programs into the source code of a variety of other programming languages including Java, C++, JavaScript, and Python. Although Haxe is increasingly used for a variety of purposes, including games, it has not yet attracted much attention from bioinformaticians. This is surprising, as Haxe allows generating different versions of the same program (e.g. a graphical user interface version in JavaScript running in a web browser for beginners and a command-line version in C++ or Python for increased performance) while maintaining a single codebase, a feature that should be of interest for many bioinformatic applications. To demonstrate the usefulness of Haxe in bioinformatics, we present here the case study of the program SeqPHASE, written originally in Perl (with a CGI version running on a server) and published in 2010. As Perl+CGI is no longer desirable for security reasons, we decided to rewrite the SeqPHASE program in Haxe and to host it at GitHub Pages (https://eeg-ebe.github.io/SeqPHASE), thereby alleviating the need to configure and maintain a dedicated server. Using SeqPHASE as an example, we discuss the advantages and disadvantages of Haxe's source code conversion functionality when it comes to implementing bioinformatic software.


Subjects
Computational Biology , Programming Languages , Software , Computational Biology/methods
8.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38487850

ABSTRACT

The screening of enzymes for catalyzing specific substrate-product pairs is often constrained in the realms of metabolic engineering and synthetic biology. Existing tools based on substrate and reaction similarity predominantly rely on prior knowledge, demonstrating limited extrapolative capabilities and an inability to incorporate custom candidate-enzyme libraries. Addressing these limitations, we have developed the Substrate-product Pair-based Enzyme Promiscuity Prediction (SPEPP) model. This innovative approach utilizes transfer learning and transformer architecture to predict enzyme promiscuity, thereby elucidating the intricate interplay between enzymes and substrate-product pairs. SPEPP exhibited robust predictive ability, eliminating the need for prior knowledge of reactions and allowing users to define their own candidate-enzyme libraries. It can be seamlessly integrated into various applications, including metabolic engineering, de novo pathway design, and hazardous material degradation. To better assist metabolic engineers in designing and refining biochemical pathways, particularly those without programming skills, we also designed EnzyPick, an easy-to-use web server for enzyme screening based on SPEPP. EnzyPick is accessible at http://www.biosynther.com/enzypick/.

9.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38261341

ABSTRACT

Ribonucleic acids (RNAs) play important roles in cellular regulation. Consequently, dysregulation of both coding and non-coding RNAs has been implicated in several disease conditions in the human body. In this regard, there is growing interest in probing the potential of RNAs to act as drug targets in disease conditions. To accelerate this search for disease-associated novel RNA targets and their small molecular inhibitors, machine learning models for binding affinity prediction were developed specific to six RNA subtypes, namely aptamers, miRNAs, repeats, ribosomal RNAs, riboswitches and viral RNAs. We found that differences in RNA sequence composition, flexibility and polar nature of RNA-binding ligands are important for predicting the binding affinity. Our method showed an average Pearson correlation (r) of 0.83 and a mean absolute error of 0.66 upon evaluation using the jack-knife test, indicating its reliability despite the low amount of data available for several RNA subtypes. Further, the models were validated with external blind test datasets, on which they outperformed other existing quantitative structure-activity relationship (QSAR) models. We have developed a web server to host the models, RNA-Small molecule binding Affinity Predictor, which is freely available at: https://web.iitm.ac.in/bioinfo2/RSAPred/.


Subjects
MicroRNAs , Humans , Reproducibility of Results , Cell Cycle , Machine Learning , Quantitative Structure-Activity Relationship
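
The evaluation named in the abstract above (jack-knife testing scored by Pearson correlation and mean absolute error) can be sketched as follows. This is only an illustrative sketch, not the authors' models: the single-descriptor linear fit, the function names and the toy numbers are all assumptions made for the example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b for a single descriptor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def jackknife_predictions(xs, ys):
    """Leave-one-out: predict each point from a model fitted on all the others."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a * xs[i] + b)
    return preds


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def mean_absolute_error(u, v):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(a - b) for a, b in zip(u, v))


# Hypothetical toy data: one descriptor value and one affinity per complex.
descriptor = [0.2, 0.9, 1.4, 2.1, 2.8, 3.5]
affinity = [4.1, 4.9, 5.2, 6.3, 6.6, 7.8]
predicted = jackknife_predictions(descriptor, affinity)
print(pearson_r(affinity, predicted), mean_absolute_error(affinity, predicted))
```

Because each prediction comes from a model that never saw the held-out point, both scores estimate performance on unseen data, which is why this scheme suits small datasets.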
10.
Brief Bioinform ; 25(6)2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39431517

ABSTRACT

Spatially and temporally regulated protein degradation through the ubiquitin-proteasome system is essential for many cellular processes. E3 ligases and degradation signals (degrons), the sequences they recognize in the target proteins, are key parts of ubiquitin-mediated proteolysis, and their interactions determine the degradation specificity and maintain cellular homeostasis. To date, only a limited number of targeted degron instances have been identified, and their properties are not yet fully characterized. To tackle this challenge, here we develop a novel deep-learning framework, namely MetaDegron, for predicting E3 ligase-targeted degrons by integrating the protein language model and comprehensive featurization strategies. Through extensive evaluations using benchmark datasets and comparison with existing methods, such as Degpred, we demonstrate the superior performance of MetaDegron. Among its functional features, MetaDegron allows batch prediction of targeted degrons of 21 E3 ligases, and provides functional annotations and visualization of multiple degron-related structural and physicochemical features. MetaDegron is freely available at http://modinfor.com/MetaDegron/. We anticipate that MetaDegron will serve as a useful tool for the clinical and translational community to elucidate the mechanisms of regulation of protein homeostasis, cancer research, and drug development.


Subjects
Proteolysis , Ubiquitin-Protein Ligases , Ubiquitin-Protein Ligases/metabolism , Humans , Computational Biology/methods , Deep Learning , Software , Protein Databases , Degrons
11.
Mol Cell Proteomics ; 23(1): 100693, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38097182

ABSTRACT

Large-scale omics studies have generated a wealth of mass spectrometry-based proteomics data, which provide additional insights into disease biology spanning genomic boundaries. However, there is a notable lack of web-based analysis and visualization tools that facilitate the reutilization of these data. Given this challenge, we present iProPhos, a user-friendly web server to deliver interactive and customizable functionalities. iProPhos incorporates a large number of samples, including 1444 tumor samples and 746 normal samples across 12 cancer types, sourced from the Clinical Proteomic Tumor Analysis Consortium. Additionally, users can also upload their own proteomics/phosphoproteomics data for analysis and visualization. In iProPhos, users can perform profiling plotting and differential expression, patient survival, clinical feature-related, and correlation analyses, including protein-protein, mRNA-protein, and kinase-substrate correlations. Furthermore, functional enrichment, protein-protein interaction network, and kinase-substrate enrichment analyses are accessible. iProPhos displays the analytical results in interactive figures and tables with various selectable parameters. It is freely accessible at http://longlab-zju.cn/iProPhos without login requirement. We present two case studies to demonstrate that iProPhos can identify potential drug targets and upstream kinases contributing to site-specific phosphorylation. Ultimately, iProPhos allows end-users to leverage the value of big data in cancer proteomics more effectively and accelerates the discovery of novel therapeutic targets.


Subjects
Neoplasms , Proteome , Humans , Proteomics/methods , Software , Neoplasms/genetics , Internet
12.
Proc Natl Acad Sci U S A ; 120(3): e2212649120, 2023 Jan 17.
Article in English | MEDLINE | ID: mdl-36623193

ABSTRACT

The World Wide Web (WWW) empowers people in developing regions by eradicating illiteracy, supporting women, and generating economic opportunities. However, their reliance on limited bandwidth and low-end phones leaves them with a poorer browsing experience compared to privileged users across the digital divide. To evaluate the extent of this phenomenon, we sent participants to 56 cities to measure the cost of mobile data and the average page load time. We found the cost to be orders of magnitude greater, and the average page load time to be four times slower, in some locations compared to others. Analyzing how popular webpages have changed over the past years suggests that they are increasingly designed with high processing power in mind, effectively leaving the less fortunate users behind. Addressing this digital inequality through new infrastructure takes years to complete and billions of dollars to finance. A more practical solution is to make the webpages more accessible by reducing their size and optimizing their load time. To this end, we developed a solution called Lite-Web and evaluated it in the Gilgit-Baltistan province of Pakistan, demonstrating that it transforms the browsing experience of a Pakistani villager using a low-end phone to almost that of a Dubai resident using a flagship phone. A user study in two high schools in Pakistan confirms that the performance gains come at no expense to the pages' look and functionality. These findings suggest that deploying Lite-Web at scale would constitute a major step toward a WWW without digital inequality.


Subjects
Employment , Internet , Humans , Female , Pakistan
13.
Proc Natl Acad Sci U S A ; 120(1): e2216701120, 2023 Jan 03.
Article in English | MEDLINE | ID: mdl-36574678

ABSTRACT

The marine pelagic compartment spans numerous trophic levels and consists of numerous reticulate connections between species from primary producers to iconic apex predators, while the benthic compartment is perceived to be simpler in structure and comprised of only low trophic level species. Here, we challenge this paradigm by illustrating that the benthic compartment is home to a subweb of similar structure and complexity to that of the pelagic realm, including the benthic equivalent to iconic polar bears: megafaunal-predatory sea stars.


Subjects
Ursidae , Animals , Predatory Behavior , Food Chain , Ecosystem
14.
J Biol Chem ; 300(5): 107279, 2024 May.
Article in English | MEDLINE | ID: mdl-38588808

ABSTRACT

Actin bundling proteins crosslink filaments into polarized structures that shape and support membrane protrusions including filopodia, microvilli, and stereocilia. In the case of epithelial microvilli, mitotic spindle positioning protein (MISP) is an actin bundler that localizes specifically to the basal rootlets, where the pointed ends of core bundle filaments converge. Previous studies established that MISP is prevented from binding more distal segments of the core bundle by competition with other actin-binding proteins. Yet whether MISP holds a preference for binding directly to rootlet actin remains an open question. By immunostaining native intestinal tissue sections, we found that microvillar rootlets are decorated with the severing protein, cofilin, suggesting high levels of ADP-actin in these structures. Using total internal reflection fluorescence microscopy assays, we also found that purified MISP exhibits a binding preference for ADP- versus ADP-Pi-actin-containing filaments. Consistent with this, assays with actively growing actin filaments revealed that MISP binds at or near their pointed ends. Moreover, although substrate attached MISP assembles filament bundles in parallel and antiparallel configurations, in solution MISP assembles parallel bundles consisting of multiple filaments exhibiting uniform polarity. These discoveries highlight nucleotide state sensing as a mechanism for sorting actin bundlers along filaments and driving their accumulation near filament ends. Such localized binding might drive parallel bundle formation and/or locally modulate bundle mechanical properties in microvilli and related protrusions.


Subjects
Actins , Animals , Actin Cytoskeleton/metabolism , Actin Depolymerizing Factors/metabolism , Actins/metabolism , Adenosine Diphosphate/metabolism , Cell Cycle Proteins/metabolism , Microfilament Proteins/metabolism , Microvilli/metabolism , Protein Binding
15.
RNA ; 29(5): 570-583, 2023 May.
Article in English | MEDLINE | ID: mdl-36750372

ABSTRACT

Antisense oligomers (ASOs), such as peptide nucleic acids (PNAs), designed to inhibit the translation of essential bacterial genes, have emerged as attractive sequence- and species-specific programmable RNA antibiotics. Yet, potential drawbacks include unwanted side effects caused by their binding to transcripts other than the intended target. To facilitate the design of PNAs with minimal off-target effects, we developed MASON (make antisense oligomers now), a web server for the design of PNAs that target bacterial mRNAs. MASON generates PNA sequences complementary to the translational start site of a bacterial gene of interest and reports critical sequence attributes and potential off-target sites. We based MASON's off-target predictions on experiments in which we treated Salmonella enterica serovar Typhimurium with a series of 10-mer PNAs derived from a PNA targeting the essential gene acpP but carrying two serial mismatches. Growth inhibition and RNA-sequencing (RNA-seq) data revealed that PNAs with terminal mismatches are still able to target acpP, suggesting wider off-target effects than anticipated. Comparison of these results to an RNA-seq data set from uropathogenic Escherichia coli (UPEC) treated with eleven different PNAs confirmed that our findings are not unique to Salmonella. We believe that MASON's off-target assessment will improve the design of specific PNAs and other ASOs.


Subjects
Peptide Nucleic Acids , Messenger RNA/genetics , Messenger RNA/chemistry , Peptide Nucleic Acids/genetics , Peptide Nucleic Acids/pharmacology , Peptide Nucleic Acids/chemistry , Antisense Oligonucleotides/pharmacology , Bacteria/genetics , RNA , Salmonella typhimurium/genetics
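
The general idea described in the abstract above (an antisense sequence complementary to the translational start site, plus a scan for near-matching off-target sites) can be sketched as follows. This is a minimal sketch and not MASON's actual algorithm: the window placement (`upstream=5`), the mismatch threshold, the function names and the toy sequences are all hypothetical, and sequences are written in the DNA alphabet for simplicity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_antisense(transcript, start_codon_index, upstream=5, length=10):
    """Pick a window around the start codon and return (target site, antisense).

    `upstream` and `length` are illustrative defaults, not published settings.
    """
    begin = start_codon_index - upstream
    if begin < 0 or begin + length > len(transcript):
        raise ValueError("window falls outside the transcript")
    target = transcript[begin:begin + length]
    return target, reverse_complement(target)


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(c1 != c2 for c1, c2 in zip(a, b))


def find_off_targets(target_site, other_transcript, max_mismatches=2):
    """List (position, site, mismatches) for near-matches in another transcript."""
    n = len(target_site)
    hits = []
    for i in range(len(other_transcript) - n + 1):
        window = other_transcript[i:i + n]
        mismatches = hamming(target_site, window)
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


# Hypothetical toy transcript with its ATG start codon at index 10.
site, antisense = design_antisense("GGAGGTAACCATGAGCACT", start_codon_index=10)
print(site, antisense)
print(find_off_targets(site, "CCTAACCATGTGCC"))
```

Comparing the intended target site against windows of other transcripts is equivalent to comparing the antisense sequence against their complements, so the scan works on the sense strand directly.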
16.
Brief Bioinform ; 24(5)2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37738401

ABSTRACT

Cracking the entangling code of protein-ligand interaction (PLI) is of great importance to structure-based drug design and discovery. Different physical and biochemical representations can be used to describe PLI such as energy terms and interaction fingerprints, which can be analyzed by machine learning (ML) algorithms to create ML-based scoring functions (MLSFs). Here, we propose the ML-based PLI capturer (ML-PLIC), a web platform that automatically characterizes PLI and generates MLSFs to identify the potential binders of a specific protein target through virtual screening (VS). ML-PLIC comprises five modules, including Docking for ligand docking, Descriptors for PLI generation, Modeling for MLSF training, Screening for VS and Pipeline for the integration of the aforementioned functions. We validated the MLSFs constructed by ML-PLIC in three benchmark datasets (Directory of Useful Decoys-Enhanced, Active as Decoys and TocoDecoy), demonstrating accuracy outperforming traditional docking tools and competitive performance to the deep learning-based SF, and provided a case study of the Serine/threonine-protein kinase WEE1 in which MLSFs were developed by using the ML-based VS pipeline in ML-PLIC. Underpinning the latest version of ML-PLIC is a powerful platform that incorporates physical and biological knowledge about PLI, leveraging PLI characterization and MLSF generation into the design of structure-based VS pipeline. The ML-PLIC web platform is now freely available at http://cadd.zju.edu.cn/plic/.


Subjects
Algorithms , Benchmarking , Ligands , Drug Design , Machine Learning
17.
Brief Bioinform ; 24(6)2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37742051

ABSTRACT

Single-base substitution (SBS) mutational signatures have become standard practice in cancer genomics. In lieu of de novo signature extraction, reference signature assignment allows users to estimate the activities of pre-established SBS signatures within individual malignancies. Several tools have been developed for this purpose, each with differing methodologies. However, due to a lack of standardization, there may be inter-tool variability in signature assignment. We deeply characterized three assignment strategies and five SBS signature assignment tools. We observed that assignment strategy choice can significantly influence results and interpretations. Despite varying recommendations by tools, Refit performed best by reducing overfitting and maximizing reconstruction of the original mutational spectra. Even after uniform application of Refit, tools varied remarkably in signature assignments both qualitatively (Jaccard index = 0.38-0.83) and quantitatively (Kendall tau-b = 0.18-0.76). This phenomenon was exacerbated for 'flat' signatures such as the homologous recombination deficiency signature SBS3. An ensemble approach (EnsembleFit), which leverages output from all five tools, increased SBS3 assignment accuracy in BRCA1/2-deficient breast carcinomas. After generating synthetic mutational profiles for thousands of pan-cancer tumors, EnsembleFit reduced signature activity assignment error by 15.9-24.7% on average using Catalogue of Somatic Mutations In Cancer and non-standard reference signature sets. We have also released the EnsembleFit web portal (https://www.ensemblefit.pittlabgenomics.com) for users to generate or download ensemble-based SBS signature assignments using any strategy and combination of tools. Overall, we show that signature assignment heterogeneity across tools and strategies is non-negligible and propose a viable, ensemble solution.


Subjects
BRCA1 Protein , BRCA2 Protein , BRCA1 Protein/genetics , BRCA2 Protein/genetics , Mutation
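
The two agreement measures quoted in the abstract above can be sketched in a few lines. The definitions are the standard ones (Jaccard index as intersection over union of the assigned signature sets; Kendall tau-b as a rank correlation corrected for ties), but the signature sets and activity values below are invented for illustration and do not come from the study.

```python
from itertools import combinations
from math import sqrt


def jaccard_index(a, b):
    """|A intersect B| / |A union B| for two collections of assigned signatures."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def kendall_tau_b(x, y):
    """Kendall rank correlation with the tau-b correction for ties."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: excluded from the denominator
        elif dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical output of two assignment tools for one tumor.
tool_a = {"SBS1": 120, "SBS3": 300, "SBS5": 80}
tool_b = {"SBS1": 100, "SBS5": 150, "SBS40": 230}

# Qualitative agreement: which signatures were assigned at all.
print(jaccard_index(tool_a, tool_b))

# Quantitative agreement: rank correlation of activities over all signatures,
# treating an unassigned signature as zero activity.
names = sorted(set(tool_a) | set(tool_b))
print(kendall_tau_b([tool_a.get(n, 0) for n in names],
                    [tool_b.get(n, 0) for n in names]))
```

The tau-b variant is the natural choice here because unassigned signatures produce many tied zero activities, which plain tau-a does not account for.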
18.
Brief Bioinform ; 24(1)2023 Jan 19.
Article in English | MEDLINE | ID: mdl-36549921

ABSTRACT

Cancer initiation and progression are likely caused by the dysregulation of biological pathways. Gene set analysis (GSA) could improve the signal-to-noise ratio and identify potential biological insights on the gene set level. However, platforms exploring cancer multi-omics data using GSA methods are lacking. In this study, we upgraded our GSCALite to GSCA (gene set cancer analysis, http://bioinfo.life.hust.edu.cn/GSCA) for cancer GSA at genomic, pharmacogenomic and immunogenomic levels. In this improved GSCA, we integrated expression, mutation, drug sensitivity and clinical data from four public data sources for 33 cancer types. We introduced useful features to GSCA, including associations of immune infiltration with gene expression and genomic variations, and associations between gene set expression/mutation and clinical outcomes. GSCA has four main functional modules for cancer GSA to explore, analyze and visualize expression, genomic variations, tumor immune infiltration, drug sensitivity and their associations with clinical outcomes. We used case studies of three gene sets: (i) seven cell cycle genes, (ii) tumor suppressor genes of PI3K pathway and (iii) oncogenes of PI3K pathway to demonstrate the advantage of GSCA over single gene analysis. We found novel associations of gene set expression and mutation with clinical outcomes in different cancer types at the gene set level, whereas the corresponding associations were not significant at the single-gene level. In conclusion, GSCA is a user-friendly web server and a useful resource for conducting hypothesis tests by using GSA methods at genomic, pharmacogenomic and immunogenomic levels.


Subjects
Neoplasms , Pharmacogenetics , Humans , Phosphatidylinositol 3-Kinases/genetics , Genomics/methods , Neoplasms/drug therapy , Neoplasms/genetics , Oncogenes
19.
Brief Bioinform ; 24(2)2023 Mar 19.
Article in English | MEDLINE | ID: mdl-36869843

ABSTRACT

Recently, lysine lactylation (Kla), a novel post-translational modification (PTM), which can be stimulated by lactate, has been found to regulate gene expression and life activities. Therefore, it is imperative to accurately identify Kla sites. Currently, mass spectrometry is the fundamental method for identifying PTM sites. However, it is expensive and time-consuming to achieve this through experiments alone. Herein, we proposed a novel computational model, Auto-Kla, to quickly and accurately predict Kla sites in gastric cancer cells based on automated machine learning (AutoML). With stable and reliable performance, our model outperforms the recently published model in the 10-fold cross-validation. To investigate the generalizability and transferability of our approach, we evaluated the performance of our models trained on two other widely studied types of PTM, including phosphorylation sites in host cells infected with SARS-CoV-2 and lysine crotonylation sites in HeLa cells. The results show that our models achieve comparable or better performance than current outstanding models. We believe that this method will become a useful analytical tool for PTM prediction and provide a reference for the future development of related models. The web server and source code are available at http://tubic.org/Kla and https://github.com/tubic/Auto-Kla, respectively.


Subjects
COVID-19 , Lysine , Humans , Lysine/metabolism , HeLa Cells , SARS-CoV-2/metabolism , Machine Learning
20.
Brief Bioinform ; 25(1)2023 Nov 22.
Article in English | MEDLINE | ID: mdl-38113076

ABSTRACT

In clinical treatment, two or more drugs (i.e. a drug combination) are used simultaneously or successively for therapy, primarily to enhance therapeutic efficacy or reduce drug side effects. However, an inappropriate drug combination may not only fail to improve efficacy, but even lead to adverse reactions. Therefore, according to the basic principle of improving efficacy and/or reducing adverse reactions, we should study drug-drug interactions (DDIs) comprehensively and thoroughly so as to use drug combinations rationally. In this review, we first introduced the basic concepts and classification of DDIs. Further, some important publicly available databases and web servers about experimentally verified or predicted DDIs were briefly described. As an effective auxiliary tool, computational models for predicting DDIs can not only save the cost of biological experiments, but also provide relevant guidance for combination therapy to some extent. Therefore, we summarized three types of prediction models (including traditional machine learning-based models, deep learning-based models and score function-based models) proposed in recent years and discussed their advantages as well as limitations. In addition, we pointed out the problems that need to be solved in future research on DDI prediction and provided corresponding suggestions.


Subjects
Drug-Related Side Effects and Adverse Reactions , Humans , Drug Interactions , Factual Databases , Computer Simulation , Drug Combinations