1.
Proc Natl Acad Sci U S A ; 121(28): e2403888121, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38968102

ABSTRACT

Real-world communication frequently requires language producers to address more than one comprehender at once, yet most psycholinguistic research focuses on one-on-one communication. As the audience size grows, interlocutors face new challenges that do not arise in dyads. They must consider multiple perspectives and weigh multiple sources of feedback to build shared understanding. Here, we ask which properties of the group's interaction structure facilitate successful communication. We used a repeated reference game paradigm in which directors instructed between one and five matchers to choose specific targets out of a set of abstract figures. Across 313 games (N = 1,319 participants), we manipulated several key constraints on the group's interaction, including the amount of feedback that matchers could give to directors and the availability of peer interaction between matchers. Across groups of different sizes and interaction constraints, describers produced increasingly efficient utterances and matchers made increasingly accurate selections. Critically, however, we found that smaller groups and groups with less-constrained interaction structures ("thick channels") showed stronger convergence to group-specific conventions than large groups with constrained interaction structures ("thin channels"), which struggled with convention formation. Overall, these results shed light on the core structural factors that enable communication to thrive in larger groups.


Subject(s)
Communication, Humans, Male, Female, Adult, Language, Group Processes, Interpersonal Relations, Young Adult, Psycholinguistics
2.
Hum Mol Genet ; 33(14): 1207-1214, 2024 Jul 06.
Article in English | MEDLINE | ID: mdl-38643062

ABSTRACT

Genotype imputation is widely used in genome-wide association studies (GWAS). However, both the genotyping chips and imputation reference panels are dependent on next-generation sequencing (NGS). Due to the nature of NGS, some regions of the genome are inaccessible to sequencing. To date, there has been no complete evaluation of these regions and their impact on the identification of associations in GWAS remains unclear. In this study, we systematically assess the extent to which variants in inaccessible regions are underrepresented on genotyping chips and imputation reference panels, in GWAS results and in variant databases. We also determine the proportion of genes located in inaccessible regions and compare the results across variant masks defined by the 1000 Genomes Project and the TOPMed program. Overall, fewer variants were observed in inaccessible regions in all categories analyzed. Depending on the mask used and normalized for region size, only 4%-17% of the genotyped variants are located in inaccessible regions and 52 to 581 genes were almost completely inaccessible. From the Cooperative Health Research in South Tyrol (CHRIS) study, we present a case study of an association located in an inaccessible region that is driven by genotyped variants and cannot be reproduced by imputation in GRCh37. We conclude that genotyping, NGS, genotype imputation and downstream analyses such as GWAS and fine mapping are systematically biased in inaccessible regions, due to missed variants and spurious associations. To help researchers assess gene and variant accessibility, we provide an online application (https://gab.gm.eurac.edu).


Subject(s)
Human Genome, Genome-Wide Association Study, Genotype, High-Throughput Nucleotide Sequencing, Single Nucleotide Polymorphism, Humans, Genome-Wide Association Study/methods, High-Throughput Nucleotide Sequencing/methods, Single Nucleotide Polymorphism/genetics
3.
Am J Hum Genet ; 110(2): 251-272, 2023 Feb 02.
Article in English | MEDLINE | ID: mdl-36669495

ABSTRACT

For neurodevelopmental disorders (NDDs), a molecular diagnosis is key for management, predicting outcome, and counseling. Often, routine DNA-based tests fail to establish a genetic diagnosis in NDDs. Transcriptome analysis (RNA sequencing [RNA-seq]) promises to improve the diagnostic yield but has not been applied to NDDs in routine diagnostics. Here, we explored the diagnostic potential of RNA-seq in 96 individuals including 67 undiagnosed subjects with NDDs. We performed RNA-seq on single individuals' cultured skin fibroblasts, with and without cycloheximide treatment, and used modified OUTRIDER Z scores to detect gene expression outliers and mis-splicing by exonic and intronic outliers. Analysis was performed by a user-friendly web application, and candidate pathogenic transcriptional events were confirmed by secondary assays. We identified intragenic deletions, monoallelic expression, and pseudoexonic insertions but also synonymous and non-synonymous variants with deleterious effects on transcription, increasing the diagnostic yield for NDDs by 13%. We found that cycloheximide treatment and exonic/intronic Z score analysis increased detection and resolution of aberrant splicing. Importantly, in one individual mis-splicing was found in a candidate gene nearly matching the individual's specific phenotype. However, pathogenic splicing occurred in another neuronal-expressed gene and provided a molecular diagnosis, stressing the need to customize RNA-seq. Lastly, our web browser application allowed custom analysis settings that facilitate diagnostic application and ranked pathogenic transcripts as top candidates. Our results demonstrate that RNA-seq is a complementary method in the genomic diagnosis of NDDs and, by providing accessible analysis with improved sensitivity, our transcriptome analysis approach facilitates wider implementation of RNA-seq in routine genome diagnostics.
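As a rough illustration of the outlier idea mentioned in this abstract, the following minimal sketch flags samples whose expression of one gene deviates strongly from the cohort. It uses a plain Z score on already-normalized values and is not the modified OUTRIDER procedure the authors describe; the function name, the threshold, and the sample labels are assumptions made for illustration only.

```python
from statistics import mean, stdev


def z_score_outliers(expr_by_sample, threshold=2.5):
    """Return {sample: z} for samples whose |z| exceeds `threshold`.

    `expr_by_sample` maps a sample label to a normalized expression value
    for a single gene (hypothetical input format).
    """
    values = list(expr_by_sample.values())
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return {}  # no variation, so nothing can be called an outlier
    outliers = {}
    for sample, value in expr_by_sample.items():
        z = (value - mu) / sigma
        if abs(z) > threshold:
            outliers[sample] = z
    return outliers


# Hypothetical cohort: nine samples near 10 and one strongly reduced sample.
cohort = {
    "S1": 10, "S2": 11, "S3": 9, "S4": 10, "S5": 10,
    "S6": 11, "S7": 9, "S8": 10, "S9": 10, "S10": 2,
}
flagged = z_score_outliers(cohort)
```

Note that in small cohorts the largest attainable Z score is bounded by the sample size, so a threshold has to be chosen with the cohort size in mind.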


Subject(s)
Gene Expression Profiling, Neurodevelopmental Disorders, Humans, RNA-Seq, Cycloheximide, RNA Sequence Analysis/methods, Neurodevelopmental Disorders/diagnosis, Neurodevelopmental Disorders/genetics
4.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-38975894

ABSTRACT

Chimeric antigen receptor (CAR) therapy has emerged as a ground-breaking advancement in cancer treatment, harnessing the power of engineered human immune cells to target and eliminate cancer cells. The escalating interest and investment in CAR therapy in recent years emphasize its profound significance in clinical research, positioning it as a rapidly expanding frontier in the field of personalized cancer therapies. A crucial step in CAR therapy design is choosing the right target, as it determines the therapy's effectiveness, safety and specificity against cancer cells while sparing healthy tissues. Herein, we propose a suite of tools for the identification and analysis of potential CAR targets leveraging expression data from The Cancer Genome Atlas and the Genotype-Tissue Expression Project, which are implemented in the CARTAR website. These tools focus on pinpointing tumor-associated antigens, ensuring target selectivity and assessing specificity to avoid off-tumor toxicities, and can be used to rationally design dual CARs. In addition, candidate target expression can be explored in cancer cell lines using expression data from the Cancer Cell Line Encyclopedia. To the best of our knowledge, CARTAR is the first website dedicated to the systematic search of suitable candidate targets for CAR therapy. CARTAR is publicly accessible at https://gmxenomica.github.io/CARTAR/.


Subject(s)
Neoplasms, Chimeric Antigen Receptors, Humans, Chimeric Antigen Receptors/genetics, Chimeric Antigen Receptors/metabolism, Chimeric Antigen Receptors/immunology, Neoplasms/therapy, Neoplasms/genetics, Adoptive Immunotherapy/methods, Software, Internet, Computational Biology/methods, Genetic Databases
5.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38770717

ABSTRACT

Drug therapy is vital in cancer treatment. Accurate analysis of drug sensitivity for specific cancers can guide healthcare professionals in prescribing drugs, leading to improved patient survival and quality of life. However, there is a lack of web-based tools that offer comprehensive visualization and analysis of pancancer drug sensitivity. We gathered cancer drug sensitivity data from publicly available databases (GEO, TCGA and GDSC) and developed a web tool called Comprehensive Pancancer Analysis of Drug Sensitivity (CPADS) using Shiny. CPADS currently includes transcriptomic data from over 29 000 samples, encompassing 44 types of cancer, 288 drugs and more than 9000 gene perturbations. It allows easy execution of various analyses related to cancer drug sensitivity. With its large sample size and diverse drug range, CPADS offers a range of analysis methods, such as differential gene expression, gene correlation, pathway analysis, drug analysis and gene perturbation analysis. Additionally, it provides several visualization approaches. CPADS significantly aids physicians and researchers in exploring primary and secondary drug resistance at both gene and pathway levels. The integration of drug resistance and gene perturbation data also presents novel perspectives for identifying pivotal genes influencing drug resistance. Access CPADS at https://smuonco.shinyapps.io/CPADS/ or https://robinl-lab.com/CPADS.


Subject(s)
Antineoplastic Drug Resistance, Internet, Neoplasms, Software, Humans, Neoplasms/drug therapy, Neoplasms/genetics, Antineoplastic Drug Resistance/genetics, Antineoplastic Agents/pharmacology, Antineoplastic Agents/therapeutic use, Computational Biology/methods, Genetic Databases, Transcriptome, Gene Expression Profiling/methods
6.
Brief Bioinform ; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38261341

ABSTRACT

Ribonucleic acids (RNAs) play important roles in cellular regulation. Consequently, dysregulation of both coding and non-coding RNAs has been implicated in several disease conditions in the human body. In this regard, there is growing interest in probing the potential of RNAs to act as drug targets in disease conditions. To accelerate this search for disease-associated novel RNA targets and their small molecular inhibitors, machine learning models for binding affinity prediction were developed specific to six RNA subtypes, namely aptamers, miRNAs, repeats, ribosomal RNAs, riboswitches and viral RNAs. We found that differences in RNA sequence composition, flexibility and polar nature of RNA-binding ligands are important for predicting the binding affinity. Our method showed an average Pearson correlation (r) of 0.83 and a mean absolute error of 0.66 upon evaluation using the jack-knife test, indicating the models' reliability despite the low amount of data available for several RNA subtypes. Further, the models were validated with external blind test datasets, where they outperformed other existing quantitative structure-activity relationship (QSAR) models. We have developed a web server to host the models, RNA-Small molecule binding Affinity Predictor, which is freely available at: https://web.iitm.ac.in/bioinfo2/RSAPred/.
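The evaluation metrics named in this abstract (Pearson correlation and mean absolute error under a jack-knife, i.e. leave-one-out, test) can be sketched as follows. This is a minimal illustration under stated assumptions: the one-variable least-squares line stands in for whatever models the authors trained, and all function names and data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def jackknife_predictions(xs, ys):
    """Predict each point from a model trained on all the other points."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        preds.append(slope * xs[i] + intercept)
    return preds


# Hypothetical descriptor values and binding affinities (exactly linear here).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
affinity = [3.0, 5.0, 7.0, 9.0, 11.0]
predicted = jackknife_predictions(descriptor, affinity)
r = pearson_r(affinity, predicted)
mae = mean_absolute_error(affinity, predicted)
```

Because every prediction comes from a model that never saw the held-out point, the resulting r and mean absolute error are less optimistic than values computed on the training data, which matters when data are scarce.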


Subject(s)
MicroRNAs, Humans, Reproducibility of Results, Cell Cycle, Machine Learning, Quantitative Structure-Activity Relationship
7.
Brief Bioinform ; 25(5), 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39193916

ABSTRACT

Haxe is a general-purpose, object-oriented programming language supporting syntactic macros. The Haxe compiler is well known for its ability to translate the source code of Haxe programs into the source code of a variety of other programming languages including Java, C++, JavaScript, and Python. Although Haxe is increasingly used for a variety of purposes, including games, it has not yet attracted much attention from bioinformaticians. This is surprising, as Haxe allows generating different versions of the same program (e.g. a graphical user interface version in JavaScript running in a web browser for beginners and a command-line version in C++ or Python for increased performance) while maintaining a single code base, a feature that should be of interest for many bioinformatic applications. To demonstrate the usefulness of Haxe in bioinformatics, we present here the case study of the program SeqPHASE, written originally in Perl (with a CGI version running on a server) and published in 2010. As Perl+CGI is no longer desirable for security reasons, we decided to rewrite the SeqPHASE program in Haxe and to host it on GitHub Pages (https://eeg-ebe.github.io/SeqPHASE), thereby alleviating the need to configure and maintain a dedicated server. Using SeqPHASE as an example, we discuss the advantages and disadvantages of Haxe's source code conversion functionality when it comes to implementing bioinformatic software.


Subject(s)
Computational Biology, Programming Languages, Software, Computational Biology/methods
8.
Brief Bioinform ; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38487850

ABSTRACT

The screening of enzymes for catalyzing specific substrate-product pairs is often constrained in the realms of metabolic engineering and synthetic biology. Existing tools based on substrate and reaction similarity predominantly rely on prior knowledge, demonstrating limited extrapolative capabilities and an inability to incorporate custom candidate-enzyme libraries. Addressing these limitations, we have developed the Substrate-product Pair-based Enzyme Promiscuity Prediction (SPEPP) model. This innovative approach utilizes transfer learning and transformer architecture to predict enzyme promiscuity, thereby elucidating the intricate interplay between enzymes and substrate-product pairs. SPEPP exhibited robust predictive ability, eliminating the need for prior knowledge of reactions and allowing users to define their own candidate-enzyme libraries. It can be seamlessly integrated into various applications, including metabolic engineering, de novo pathway design, and hazardous material degradation. To better assist metabolic engineers in designing and refining biochemical pathways, particularly those without programming skills, we also designed EnzyPick, an easy-to-use web server for enzyme screening based on SPEPP. EnzyPick is accessible at http://www.biosynther.com/enzypick/.

9.
Mol Cell Proteomics ; 23(1): 100693, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38097182

ABSTRACT

Large-scale omics studies have generated a wealth of mass spectrometry-based proteomics data, which provide additional insights into disease biology spanning genomic boundaries. However, there is a notable lack of web-based analysis and visualization tools that facilitate the reutilization of these data. Given this challenge, we present iProPhos, a user-friendly web server to deliver interactive and customizable functionalities. iProPhos incorporates a large number of samples, including 1444 tumor samples and 746 normal samples across 12 cancer types, sourced from the Clinical Proteomic Tumor Analysis Consortium. Additionally, users can also upload their own proteomics/phosphoproteomics data for analysis and visualization. In iProPhos, users can perform profiling plotting and differential expression, patient survival, clinical feature-related, and correlation analyses, including protein-protein, mRNA-protein, and kinase-substrate correlations. Furthermore, functional enrichment, protein-protein interaction network, and kinase-substrate enrichment analyses are accessible. iProPhos displays the analytical results in interactive figures and tables with various selectable parameters. It is freely accessible at http://longlab-zju.cn/iProPhos without login requirement. We present two case studies to demonstrate that iProPhos can identify potential drug targets and upstream kinases contributing to site-specific phosphorylation. Ultimately, iProPhos allows end-users to leverage the value of big data in cancer proteomics more effectively and accelerates the discovery of novel therapeutic targets.


Subject(s)
Neoplasms, Proteome, Humans, Proteomics/methods, Software, Neoplasms/genetics, Internet
10.
Proc Natl Acad Sci U S A ; 120(3): e2212649120, 2023 Jan 17.
Article in English | MEDLINE | ID: mdl-36623193

ABSTRACT

The World Wide Web (WWW) empowers people in developing regions by eradicating illiteracy, supporting women, and generating economic opportunities. However, their reliance on limited bandwidth and low-end phones leaves them with a poorer browsing experience compared to privileged users across the digital divide. To evaluate the extent of this phenomenon, we sent participants to 56 cities to measure the cost of mobile data and the average page load time. We found the cost to be orders of magnitude greater, and the average page load time to be four times slower, in some locations compared to others. Analyzing how popular webpages have changed over the past years suggests that they are increasingly designed with high processing power in mind, effectively leaving the less fortunate users behind. Addressing this digital inequality through new infrastructure takes years to complete and billions of dollars to finance. A more practical solution is to make the webpages more accessible by reducing their size and optimizing their load time. To this end, we developed a solution called Lite-Web and evaluated it in the Gilgit-Baltistan province of Pakistan, demonstrating that it transforms the browsing experience of a Pakistani villager using a low-end phone to almost that of a Dubai resident using a flagship phone. A user study in two high schools in Pakistan confirms that the performance gains come at no expense to the pages' look and functionality. These findings suggest that deploying Lite-Web at scale would constitute a major step toward a WWW without digital inequality.


Subject(s)
Employment, Internet, Humans, Female, Pakistan
11.
Proc Natl Acad Sci U S A ; 120(1): e2216701120, 2023 Jan 03.
Article in English | MEDLINE | ID: mdl-36574678

ABSTRACT

The marine pelagic compartment spans numerous trophic levels and consists of numerous reticulate connections between species from primary producers to iconic apex predators, while the benthic compartment is perceived to be simpler in structure and comprised of only low trophic level species. Here, we challenge this paradigm by illustrating that the benthic compartment is home to a subweb of similar structure and complexity to that of the pelagic realm, including the benthic equivalent to iconic polar bears: megafaunal-predatory sea stars.


Subject(s)
Ursidae, Animals, Predatory Behavior, Food Chain, Ecosystem
12.
J Biol Chem ; 300(5): 107279, 2024 May.
Article in English | MEDLINE | ID: mdl-38588808

ABSTRACT

Actin bundling proteins crosslink filaments into polarized structures that shape and support membrane protrusions including filopodia, microvilli, and stereocilia. In the case of epithelial microvilli, mitotic spindle positioning protein (MISP) is an actin bundler that localizes specifically to the basal rootlets, where the pointed ends of core bundle filaments converge. Previous studies established that MISP is prevented from binding more distal segments of the core bundle by competition with other actin-binding proteins. Yet whether MISP holds a preference for binding directly to rootlet actin remains an open question. By immunostaining native intestinal tissue sections, we found that microvillar rootlets are decorated with the severing protein, cofilin, suggesting high levels of ADP-actin in these structures. Using total internal reflection fluorescence microscopy assays, we also found that purified MISP exhibits a binding preference for ADP- versus ADP-Pi-actin-containing filaments. Consistent with this, assays with actively growing actin filaments revealed that MISP binds at or near their pointed ends. Moreover, although substrate attached MISP assembles filament bundles in parallel and antiparallel configurations, in solution MISP assembles parallel bundles consisting of multiple filaments exhibiting uniform polarity. These discoveries highlight nucleotide state sensing as a mechanism for sorting actin bundlers along filaments and driving their accumulation near filament ends. Such localized binding might drive parallel bundle formation and/or locally modulate bundle mechanical properties in microvilli and related protrusions.


Subject(s)
Actins, Animals, Actin Cytoskeleton/metabolism, Actin Depolymerizing Factors/metabolism, Actins/metabolism, Adenosine Diphosphate/metabolism, Cell Cycle Proteins/metabolism, Microfilament Proteins/metabolism, Microvilli/metabolism, Protein Binding
13.
RNA ; 29(5): 570-583, 2023 May.
Article in English | MEDLINE | ID: mdl-36750372

ABSTRACT

Antisense oligomers (ASOs), such as peptide nucleic acids (PNAs), designed to inhibit the translation of essential bacterial genes, have emerged as attractive sequence- and species-specific programmable RNA antibiotics. Yet, potential drawbacks include unwanted side effects caused by their binding to transcripts other than the intended target. To facilitate the design of PNAs with minimal off-target effects, we developed MASON (make antisense oligomers now), a web server for the design of PNAs that target bacterial mRNAs. MASON generates PNA sequences complementary to the translational start site of a bacterial gene of interest and reports critical sequence attributes and potential off-target sites. We based MASON's off-target predictions on experiments in which we treated Salmonella enterica serovar Typhimurium with a series of 10-mer PNAs derived from a PNA targeting the essential gene acpP but carrying two serial mismatches. Growth inhibition and RNA-sequencing (RNA-seq) data revealed that PNAs with terminal mismatches are still able to target acpP, suggesting wider off-target effects than anticipated. Comparison of these results to an RNA-seq data set from uropathogenic Escherichia coli (UPEC) treated with eleven different PNAs confirmed that our findings are not unique to Salmonella. We believe that MASON's off-target assessment will improve the design of specific PNAs and other ASOs.
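The design step described in this abstract, producing an oligomer complementary to the region around a translational start site and checking candidate off-target sites for mismatches, can be sketched as below. This is only an illustration under stated assumptions and not MASON's implementation: the window placement (`upstream`, `length`), the function names, and the example transcript are all hypothetical, and sequences are written in the DNA alphabet.

```python
# Complement table; U is accepted on input and treated like T.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence (DNA alphabet)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_for_start_site(mrna, start_index, upstream=5, length=10):
    """Antisense oligomer for a window beginning `upstream` nt before the start codon."""
    begin = start_index - upstream
    if begin < 0 or begin + length > len(mrna):
        raise ValueError("window falls outside the transcript")
    return reverse_complement(mrna[begin:begin + length])


def count_mismatches(oligo, site):
    """Number of positions at which `oligo` fails to pair with `site`."""
    if len(oligo) != len(site):
        raise ValueError("oligo and site must have the same length")
    perfect = reverse_complement(site)
    return sum(a != b for a, b in zip(oligo, perfect))


# Hypothetical transcript fragment; the start codon ATG begins at index 9.
transcript = "GGAGGAACCATGAGCACT"
oligo = antisense_for_start_site(transcript, start_index=9)
on_target = count_mismatches(oligo, "GAACCATGAG")
off_target = count_mismatches(oligo, "GAACCATGTC")  # differs at two adjacent bases
```

A screen for off-target sites would apply `count_mismatches` to every start-site window in the genome and report those at or below a chosen mismatch limit; the abstract's finding that terminal mismatches are tolerated suggests such a limit should not be set too strictly.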


Subject(s)
Peptide Nucleic Acids, Messenger RNA/genetics, Messenger RNA/chemistry, Peptide Nucleic Acids/genetics, Peptide Nucleic Acids/pharmacology, Peptide Nucleic Acids/chemistry, Antisense Oligonucleotides/pharmacology, Bacteria/genetics, RNA, Salmonella typhimurium/genetics
14.
Brief Bioinform ; 24(5), 2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37738401

ABSTRACT

Cracking the entangling code of protein-ligand interaction (PLI) is of great importance to structure-based drug design and discovery. Different physical and biochemical representations can be used to describe PLI such as energy terms and interaction fingerprints, which can be analyzed by machine learning (ML) algorithms to create ML-based scoring functions (MLSFs). Here, we propose the ML-based PLI capturer (ML-PLIC), a web platform that automatically characterizes PLI and generates MLSFs to identify the potential binders of a specific protein target through virtual screening (VS). ML-PLIC comprises five modules, including Docking for ligand docking, Descriptors for PLI generation, Modeling for MLSF training, Screening for VS and Pipeline for the integration of the aforementioned functions. We validated the MLSFs constructed by ML-PLIC in three benchmark datasets (Directory of Useful Decoys-Enhanced, Active as Decoys and TocoDecoy), demonstrating accuracy outperforming traditional docking tools and competitive performance to the deep learning-based SF, and provided a case study of the Serine/threonine-protein kinase WEE1 in which MLSFs were developed by using the ML-based VS pipeline in ML-PLIC. Underpinning the latest version of ML-PLIC is a powerful platform that incorporates physical and biological knowledge about PLI, leveraging PLI characterization and MLSF generation into the design of structure-based VS pipeline. The ML-PLIC web platform is now freely available at http://cadd.zju.edu.cn/plic/.


Subject(s)
Algorithms, Benchmarking, Ligands, Drug Design, Machine Learning
15.
Brief Bioinform ; 24(1), 2023 Jan 19.
Article in English | MEDLINE | ID: mdl-36549921

ABSTRACT

Cancer initiation and progression are likely caused by the dysregulation of biological pathways. Gene set analysis (GSA) could improve the signal-to-noise ratio and identify potential biological insights on the gene set level. However, platforms exploring cancer multi-omics data using GSA methods are lacking. In this study, we upgraded our GSCALite to GSCA (gene set cancer analysis, http://bioinfo.life.hust.edu.cn/GSCA) for cancer GSA at genomic, pharmacogenomic and immunogenomic levels. In this improved GSCA, we integrated expression, mutation, drug sensitivity and clinical data from four public data sources for 33 cancer types. We introduced useful features to GSCA, including associations between immune infiltration and gene expression or genomic variations, and associations between gene set expression/mutation and clinical outcomes. GSCA has four main functional modules for cancer GSA to explore, analyze and visualize expression, genomic variations, tumor immune infiltration, drug sensitivity and their associations with clinical outcomes. We used case studies of three gene sets, (i) seven cell cycle genes, (ii) tumor suppressor genes of the PI3K pathway and (iii) oncogenes of the PI3K pathway, to demonstrate the advantage of GSCA over single gene analysis. We found novel associations of gene set expression and mutation with clinical outcomes in different cancer types at the gene set level, whereas these associations were not significant at the single-gene level. In conclusion, GSCA is a user-friendly web server and a useful resource for conducting hypothesis tests by using GSA methods at genomic, pharmacogenomic and immunogenomic levels.


Subject(s)
Neoplasms, Pharmacogenetics, Humans, Phosphatidylinositol 3-Kinases/genetics, Genomics/methods, Neoplasms/drug therapy, Neoplasms/genetics, Oncogenes
16.
Brief Bioinform ; 24(2), 2023 Mar 19.
Article in English | MEDLINE | ID: mdl-36869843

ABSTRACT

Recently, lysine lactylation (Kla), a novel post-translational modification (PTM), which can be stimulated by lactate, has been found to regulate gene expression and life activities. Therefore, it is imperative to accurately identify Kla sites. Currently, mass spectrometry is the fundamental method for identifying PTM sites. However, it is expensive and time-consuming to achieve this through experiments alone. Herein, we proposed a novel computational model, Auto-Kla, to quickly and accurately predict Kla sites in gastric cancer cells based on automated machine learning (AutoML). With stable and reliable performance, our model outperforms the recently published model in the 10-fold cross-validation. To investigate the generalizability and transferability of our approach, we evaluated the performance of our models trained on two other widely studied types of PTM, including phosphorylation sites in host cells infected with SARS-CoV-2 and lysine crotonylation sites in HeLa cells. The results show that our models achieve comparable or better performance than current outstanding models. We believe that this method will become a useful analytical tool for PTM prediction and provide a reference for the future development of related models. The web server and source code are available at http://tubic.org/Kla and https://github.com/tubic/Auto-Kla, respectively.


Subject(s)
COVID-19, Lysine, Humans, Lysine/metabolism, HeLa Cells, SARS-CoV-2/metabolism, Machine Learning
17.
Brief Bioinform ; 24(6), 2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37742051

ABSTRACT

Single-base substitution (SBS) mutational signatures have become standard practice in cancer genomics. In lieu of de novo signature extraction, reference signature assignment allows users to estimate the activities of pre-established SBS signatures within individual malignancies. Several tools have been developed for this purpose, each with differing methodologies. However, due to a lack of standardization, there may be inter-tool variability in signature assignment. We deeply characterized three assignment strategies and five SBS signature assignment tools. We observed that assignment strategy choice can significantly influence results and interpretations. Despite varying recommendations by tools, Refit performed best by reducing overfitting and maximizing reconstruction of the original mutational spectra. Even after uniform application of Refit, tools varied remarkably in signature assignments both qualitatively (Jaccard index = 0.38-0.83) and quantitatively (Kendall tau-b = 0.18-0.76). This phenomenon was exacerbated for 'flat' signatures such as the homologous recombination deficiency signature SBS3. An ensemble approach (EnsembleFit), which leverages output from all five tools, increased SBS3 assignment accuracy in BRCA1/2-deficient breast carcinomas. After generating synthetic mutational profiles for thousands of pan-cancer tumors, EnsembleFit reduced signature activity assignment error 15.9-24.7% on average using Catalogue of Somatic Mutations In Cancer and non-standard reference signature sets. We have also released the EnsembleFit web portal (https://www.ensemblefit.pittlabgenomics.com) for users to generate or download ensemble-based SBS signature assignments using any strategy and combination of tools. Overall, we show that signature assignment heterogeneity across tools and strategies is non-negligible and propose a viable, ensemble solution.
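The two agreement measures quoted in this abstract can be computed as in the following minimal sketch: the Jaccard index compares which signatures two tools assign (qualitative agreement), and Kendall tau-b compares how they rank signature activities (quantitative agreement, with a correction for ties). The function names and the example assignments are hypothetical and are not taken from the study.

```python
from math import sqrt


def jaccard_index(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty sets agree completely
    return len(a & b) / len(a | b)


def kendall_tau_b(xs, ys):
    """Kendall rank correlation with the tau-b correction for ties."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            dx = xs[i] - xs[j]
            dy = ys[i] - ys[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from both terms
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = sqrt(
        (concordant + discordant + ties_x) * (concordant + discordant + ties_y)
    )
    if denom == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denom


# Hypothetical signature calls from two tools for one tumor.
tool_a = {"SBS1", "SBS3", "SBS5"}
tool_b = {"SBS1", "SBS5", "SBS13"}
overlap = jaccard_index(tool_a, tool_b)

# Hypothetical activities of the same four signatures from two tools.
tau = kendall_tau_b([1, 2, 2, 3], [1, 2, 3, 3])
```

The pairwise loop is quadratic in the number of signatures, which is acceptable for reference sets of this size; for larger inputs a library routine such as `scipy.stats.kendalltau` would be the usual choice.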


Subject(s)
BRCA1 Protein, BRCA2 Protein, BRCA1 Protein/genetics, BRCA2 Protein/genetics, Mutation
18.
Brief Bioinform ; 25(1), 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-38113076

ABSTRACT

In clinical treatment, two or more drugs (i.e. drug combination) are simultaneously or successively used for therapy with the purpose of primarily enhancing the therapeutic efficacy or reducing drug side effects. However, inappropriate drug combination may not only fail to improve efficacy, but even lead to adverse reactions. Therefore, according to the basic principle of improving the efficacy and/or reducing adverse reactions, we should study drug-drug interactions (DDIs) comprehensively and thoroughly so as to reasonably use drug combination. In this review, we first introduced the basic conception and classification of DDIs. Further, some important publicly available databases and web servers about experimentally verified or predicted DDIs were briefly described. As an effective auxiliary tool, computational models for predicting DDIs can not only save the cost of biological experiments, but also provide relevant guidance for combination therapy to some extent. Therefore, we summarized three types of prediction models (including traditional machine learning-based models, deep learning-based models and score function-based models) proposed during recent years and discussed the advantages as well as limitations of them. Besides, we pointed out the problems that need to be solved in the future research of DDIs prediction and provided corresponding suggestions.


Subject(s)
Drug-Related Side Effects and Adverse Reactions, Humans, Drug Interactions, Factual Databases, Computer Simulation, Drug Combinations
19.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37529934

ABSTRACT

Adequate reporting is essential for evaluating the performance and clinical utility of a prognostic prediction model. Previous studies indicated a prevalence of incomplete or suboptimal reporting in translational and clinical studies involving the development of multivariable prediction models for prognosis, which limited the potential applications of these models. While reporting templates introduced by the established guidelines provide an invaluable framework for reporting prognostic studies uniformly, there is a widespread lack of qualified adherence, which may be due to miscellaneous challenges in manually reporting extensive model details, especially in the era of precision medicine. Here, we present ReProMSig (Reproducible Prognosis Molecular Signature), a web-based integrative platform providing an analysis framework for the development, validation and application of a multivariable prediction model for cancer prognosis, using clinicopathological features and/or molecular profiles. The ReProMSig platform supports transparent reporting by presenting both methodology details and analysis results in a strictly structured reporting file, following the guideline checklist with minimal manual input needed. The generated reporting file can be published together with a developed prediction model to allow thorough interrogation and external validation, as well as online application to prospective cases. We demonstrated the utility of ReProMSig by developing prognostic molecular signatures for stage II and stage III colorectal cancer, respectively, in comparison with a published signature reproduced by ReProMSig. Together, ReProMSig provides an integrated framework for the development, evaluation and application of prognostic/predictive biomarkers for cancer in a more transparent and reproducible way, which would be a useful resource for health care professionals and biomedical researchers.


Subject(s)
Checklist , Neoplasms , Humans , Precision Medicine , Neoplasms/diagnosis , Neoplasms/genetics , Neoplasms/therapy
20.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37587836

ABSTRACT

Recent studies have demonstrated the significant role that circRNA plays in the progression of human diseases. Identifying circRNA-disease associations (CDA) efficiently can offer crucial insights into disease diagnosis. While traditional biological experiments can be time-consuming and labor-intensive, computational methods have emerged as a viable alternative in recent years. However, these methods are often limited by data sparsity and their inability to explore high-order information. In this paper, we introduce a novel method named Knowledge Graph Encoder from Transformer for predicting CDA (KGETCDA). Specifically, KGETCDA first integrates more than 10 databases to construct a large heterogeneous non-coding RNA dataset, which contains multiple relationships among circRNA, miRNA, lncRNA and disease. Then, a biological knowledge graph is created from this dataset, and Transformer-based knowledge representation learning and attentive propagation layers are applied to obtain high-quality embeddings that accurately capture high-order interaction information. Finally, a multilayer perceptron is used to predict the matching scores of CDA from their embeddings. Our empirical results demonstrate that KGETCDA significantly outperforms other state-of-the-art models. To enhance user experience, we have developed an interactive web-based platform named HNRBase that allows users to visualize and download data and to make predictions using KGETCDA with ease. The code and datasets are publicly available at https://github.com/jinyangwu/KGETCDA.
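
The final scoring step described in this abstract, a multilayer perceptron that maps a pair of embeddings to a matching score, can be sketched generically as follows. This is not the KGETCDA implementation: the embedding size, layer widths, random untrained weights, concatenation of the two embeddings, and the sigmoid output are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random


def init_layer(n_out, n_in, rng):
    """Random, untrained weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(weights, bias, vec):
    """Dense layer: one weighted sum plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def relu(vec):
    return [max(0.0, v) for v in vec]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matching_score(circrna_emb, disease_emb, layers):
    """Concatenate the two embeddings and pass them through the MLP.

    Hidden layers use ReLU; the single output unit is squashed to (0, 1).
    """
    hidden = list(circrna_emb) + list(disease_emb)
    for weights, bias in layers[:-1]:
        hidden = relu(linear(weights, bias, hidden))
    weights, bias = layers[-1]
    (logit,) = linear(weights, bias, hidden)
    return sigmoid(logit)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
dim = 4                 # assumed embedding size
layers = [init_layer(8, 2 * dim, rng), init_layer(1, 8, rng)]

# Stand-ins for learned embeddings of one circRNA and one disease.
circrna_emb = [rng.gauss(0.0, 1.0) for _ in range(dim)]
disease_emb = [rng.gauss(0.0, 1.0) for _ in range(dim)]

score = matching_score(circrna_emb, disease_emb, layers)
print(f"matching score: {score:.3f}")
```

With untrained weights the score itself is meaningless. The sketch only shows the data flow from a pair of embeddings to a single value between 0 and 1.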


Subject(s)
RNA, Circular , RNA, Long Noncoding , Humans , Pattern Recognition, Automated , Learning , Databases, Factual , Knowledge Bases , Computational Biology