Results 1 - 20 of 267
1.
bioRxiv ; 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38746239

ABSTRACT

Advancements in genomic and proteomic technologies have powered the use of gene and protein networks ("interactomes") for understanding genotype-phenotype translation. However, the proliferation of interactomes complicates the selection of networks for specific applications. Here, we present a comprehensive evaluation of 46 current human interactomes, encompassing protein-protein interactions as well as gene regulatory, signaling, colocalization, and genetic interaction networks. Our analysis shows that large composite networks such as HumanNet, STRING, and FunCoup are most effective for identifying disease genes, while smaller networks such as DIP and SIGNOR demonstrate strong interaction prediction performance. These findings provide a benchmark for interactomes across diverse network biology applications and clarify factors that influence network performance. Furthermore, our evaluation pipeline paves the way for continued assessment of emerging and updated interaction networks in the future.

2.
Nat Commun ; 15(1): 3636, 2024 May 06.
Article in English | MEDLINE | ID: mdl-38710699

ABSTRACT

Polypharmacology drugs (compounds that inhibit multiple proteins) have many applications but are difficult to design. To address this challenge, we have developed POLYGON, an approach to polypharmacology based on generative reinforcement learning. POLYGON embeds chemical space and iteratively samples it to generate new molecular structures; these are rewarded by the predicted ability to inhibit each of two protein targets and by drug-likeness and ease of synthesis. In binding data for >100,000 compounds, POLYGON correctly recognizes polypharmacology interactions with 82.5% accuracy. We subsequently generate de novo compounds targeting ten pairs of proteins with documented co-dependency. Docking analysis indicates that top structures bind their two targets with low free energies and similar 3D orientations to canonical single-protein inhibitors. We synthesize 32 compounds targeting MEK1 and mTOR, with most yielding >50% reduction in each protein activity and in cell viability when dosed at 1-10 µM. These results support the potential of generative modeling for polypharmacology.


Subjects
Molecular Docking Simulation , Humans , TOR Serine-Threonine Kinases/metabolism , Polypharmacology , MAP Kinase Kinase 1/antagonists & inhibitors , MAP Kinase Kinase 1/metabolism , MAP Kinase Kinase 1/chemistry , Protein Kinase Inhibitors/pharmacology , Protein Kinase Inhibitors/chemistry , Protein Binding , Drug Discovery/methods , Drug Design , Cell Survival/drug effects
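
The POLYGON abstract in record 2 describes a reward that combines predicted inhibition of two targets with drug-likeness and ease of synthesis. The sketch below shows one way such a composite reward could be formed. It is an illustration only: the function name `composite_reward`, the assumption that every term is pre-scaled to [0, 1], the default weights, and the weighted-average form are all hypothetical and are not taken from the abstract or from the POLYGON software.

```python
def composite_reward(p_inhibit_a, p_inhibit_b, drug_likeness, synth_ease,
                     weights=(1.0, 1.0, 0.5, 0.5)):
    """Weighted average of four scores, each assumed to lie in [0, 1].

    Hypothetical illustration of a multi-objective reward; the weights and
    the averaging scheme are assumptions, not POLYGON's actual reward.
    """
    terms = (p_inhibit_a, p_inhibit_b, drug_likeness, synth_ease)
    if len(weights) != len(terms):
        raise ValueError("one weight is required per reward term")
    for value in terms:
        if not 0.0 <= value <= 1.0:
            raise ValueError("each reward term must lie in [0, 1]")
    # Normalise by the total weight so the reward also stays in [0, 1].
    return sum(w * t for w, t in zip(weights, terms)) / sum(weights)
```

Under this scheme a candidate that scores well on only one target is penalized relative to one that scores moderately on both, which is the behaviour a dual-target reward needs.
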
3.
bioRxiv ; 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38464225

ABSTRACT

Genome-wide association studies (GWAS) have identified hundreds of common variants associated with alcohol consumption. In contrast, rare variants have only begun to be studied for their role in alcohol consumption. No studies have examined whether common and rare variants implicate the same genes and molecular networks. To address this knowledge gap, we used publicly available alcohol consumption GWAS summary statistics (GSCAN, N=666,978) and whole exome sequencing data (Genebass, N=393,099) to identify a set of common and rare variants for alcohol consumption. Gene-based analyses of the two datasets implicated 294 (common variants) and 35 (rare variants) genes, including ethanol metabolizing genes ADH1B and ADH1C, which were identified by both analyses, and ANKRD12, GIGYF1, KIF21B, and STK31, which were identified only by rare variant analysis, but have been associated with related psychiatric traits. We then used a network colocalization procedure to propagate the common and rare gene sets onto a shared molecular network, revealing significant overlap. The shared network identified gene families that function in alcohol metabolism, including ADH, ALDH, CYP, and UGT. Seventy-four of the genes in the network were previously implicated in comorbid psychiatric or substance use disorders, but had not previously been identified for alcohol-related behaviors, including EXOC2, EPM2A, CACNB3, and CACNG4. Differential gene expression analysis showed enrichment in the liver and several brain regions, supporting the role of network genes in alcohol consumption. Thus, genes implicated by common and rare variants identify shared functions relevant to alcohol consumption, which also underlie psychiatric traits and substance use disorders that are comorbid with alcohol use.

4.
Nat Cancer ; 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38443662

ABSTRACT

Cyclin-dependent kinase 4 and 6 inhibitors (CDK4/6is) have revolutionized breast cancer therapy. However, <50% of patients have an objective response, and nearly all patients develop resistance during therapy. To elucidate the underlying mechanisms, we constructed an interpretable deep learning model of the response to palbociclib, a CDK4/6i, based on a reference map of multiprotein assemblies in cancer. The model identifies eight core assemblies that integrate rare and common alterations across 90 genes to stratify palbociclib-sensitive versus palbociclib-resistant cell lines. Predictions translate to patients and patient-derived xenografts, whereas single-gene biomarkers do not. Most predictive assemblies can be shown by CRISPR-Cas9 genetic disruption to regulate the CDK4/6i response. Validated assemblies relate to cell-cycle control, growth factor signaling and a histone regulatory complex that we show promotes S-phase entry through the activation of the histone modifiers KAT6A and TBL1XR1 and the transcription factor RUNX1. This study enables an integrated assessment of how a tumor's genetic profile modulates CDK4/6i resistance.

5.
Cancer Discov ; 14(3): 508-523, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38236062

ABSTRACT

Rapid proliferation is a hallmark of cancer associated with sensitivity to therapeutics that cause DNA replication stress (RS). Many tumors exhibit drug resistance, however, via molecular pathways that are incompletely understood. Here, we develop an ensemble of predictive models that elucidate how cancer mutations impact the response to common RS-inducing (RSi) agents. The models implement recent advances in deep learning to facilitate multidrug prediction and mechanistic interpretation. Initial studies in tumor cells identify 41 molecular assemblies that integrate alterations in hundreds of genes for accurate drug response prediction. These cover roles in transcription, repair, cell-cycle checkpoints, and growth signaling, of which 30 are shown by loss-of-function genetic screens to regulate drug sensitivity or replication restart. The model translates to cisplatin-treated cervical cancer patients, highlighting an RTK-JAK-STAT assembly governing resistance. This study defines a compendium of mechanisms by which mutations affect therapeutic responses, with implications for precision medicine. SIGNIFICANCE: Zhao and colleagues use recent advances in machine learning to study the effects of tumor mutations on the response to common therapeutics that cause RS. The resulting predictive models integrate numerous genetic alterations distributed across a constellation of molecular assemblies, facilitating a quantitative and interpretable assessment of drug response. This article is featured in Selected Articles from This Issue, p. 384.


Subjects
Uterine Cervical Neoplasms , Humans , Female , Mutation , Signal Transduction , Cisplatin/pharmacology , Cisplatin/therapeutic use , Machine Learning
6.
Cell Genom ; 4(1): 100466, 2024 Jan 10.
Article in English | MEDLINE | ID: mdl-38190108

ABSTRACT

The data-intensive fields of genomics and machine learning (ML) are in an early stage of convergence. Genomics researchers increasingly seek to harness the power of ML methods to extract knowledge from their data; conversely, ML scientists recognize that genomics offers a wealth of large, complex, and well-annotated datasets that can be used as a substrate for developing biologically relevant algorithms and applications. The National Human Genome Research Institute (NHGRI) inquired with researchers working in these two fields to identify common challenges and receive recommendations to better support genomic research efforts using ML approaches. Those included increasing the amount and variety of training datasets by integrating genomic with multiomics, context-specific (e.g., by cell type), and social determinants of health datasets; reducing the inherent biases of training datasets; prioritizing transparency and interpretability of ML methods; and developing privacy-preserving technologies for research participants' data.


Subjects
Bioethics , Genomics , Humans , Algorithms , Privacy , Machine Learning
7.
bioRxiv ; 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38076945

ABSTRACT

Translating high-confidence (hc) autism spectrum disorder (ASD) genes into viable treatment targets remains elusive. We constructed a foundational protein-protein interaction (PPI) network in HEK293T cells involving 100 hcASD risk genes, revealing over 1,800 PPIs (87% novel). Interactors, expressed in the human brain and enriched for ASD but not schizophrenia genetic risk, converged on protein complexes involved in neurogenesis, tubulin biology, transcriptional regulation, and chromatin modification. A PPI map of 54 patient-derived missense variants identified differential physical interactions, and we leveraged AlphaFold-Multimer predictions to prioritize direct PPIs and specific variants for interrogation in Xenopus tropicalis and human forebrain organoids. A mutation in the transcription factor FOXP1 led to reconfiguration of DNA binding sites and altered development of deep cortical layer neurons in forebrain organoids. This work offers new insights into molecular mechanisms underlying ASD and describes a powerful platform to develop and test therapeutic strategies for many genetically-defined conditions.

8.
ArXiv ; 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-37731657

ABSTRACT

Gene set analysis is a mainstay of functional genomics, but it relies on curated databases of gene functions that are incomplete. Here we evaluate five Large Language Models (LLMs) for their ability to discover the common biological functions represented by a gene set, substantiated by supporting rationale, citations and a confidence assessment. Benchmarking against canonical gene sets from the Gene Ontology, GPT-4 confidently recovered the curated name or a more general concept (73% of cases), while benchmarking against random gene sets correctly yielded zero confidence. Gemini-Pro and Mixtral-Instruct showed ability in naming but were falsely confident for random sets, whereas Llama2-70b had poor performance overall. In gene sets derived from 'omics data, GPT-4 identified novel functions not reported by classical functional enrichment (32% of cases), which independent review indicated were largely verifiable and not hallucinations. The ability to rapidly synthesize common gene functions positions LLMs as valuable 'omics assistants.

9.
Pac Symp Biocomput ; 29: 661-665, 2024.
Article in English | MEDLINE | ID: mdl-38160316

ABSTRACT

Cells consist of large components, such as organelles, that recursively factor into smaller systems, such as condensates and protein complexes, forming a dynamic multi-scale structure of the cell. Recent technological innovations have paved the way for systematic interrogation of subcellular structures, yielding unprecedented insights into their roles and interactions. In this workshop, we discuss progress, challenges, and collaboration to marshal various computational approaches toward assembling an integrated structural map of the human cell.


Subjects
Computational Biology , Organelles , Humans , Organelles/chemistry , Organelles/metabolism , Organelles/ultrastructure
10.
bioRxiv ; 2023 Dec 09.
Article in English | MEDLINE | ID: mdl-38106096

ABSTRACT

DNA methylation marks have recently been used to build models known as "epigenetic clocks" which predict calendar age. As methylation of cytosine promotes C-to-T mutations, we hypothesized that the methylation changes observed with age should reflect the accrual of somatic mutations, and the two should yield analogous aging estimates. In analysis of multimodal data from 9,331 human individuals, we find that CpG mutations indeed coincide with changes in methylation, not only at the mutated site but also with pervasive remodeling of the methylome out to ±10 kilobases. This one-to-many mapping enables mutation-based predictions of age that agree with epigenetic clocks, including which individuals are aging faster or slower than expected. Moreover, genomic loci where mutations accumulate with age also tend to have methylation patterns that are especially predictive of age. These results suggest a close coupling between the accumulation of sporadic somatic mutations and the widespread changes in methylation observed over the course of life.

11.
Res Sq ; 2023 Sep 18.
Article in English | MEDLINE | ID: mdl-37790547

ABSTRACT

Gene set analysis is a mainstay of functional genomics, but it relies on manually curated databases of gene functions that are incomplete and unaware of biological context. Here we evaluate the ability of OpenAI's GPT-4, a Large Language Model (LLM), to develop hypotheses about common gene functions from its embedded biomedical knowledge. We created a GPT-4 pipeline to label gene sets with names that summarize their consensus functions, substantiated by analysis text and citations. Benchmarking against named gene sets in the Gene Ontology, GPT-4 generated very similar names in 50% of cases, while in most remaining cases it recovered the name of a more general concept. In gene sets discovered in 'omics data, GPT-4 names were more informative than gene set enrichment, with supporting statements and citations that largely verified in human review. The ability to rapidly synthesize common gene functions positions LLMs as valuable functional genomics assistants.

12.
bioRxiv ; 2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37786690

ABSTRACT

Desmosomes are transmembrane protein complexes that contribute to cell-cell adhesion in epithelia and other tissues. Here, we report the discovery of frequent genetic alterations in the desmosome in human cancers, with the strongest signal seen in cutaneous melanoma where desmosomes are mutated in over 70% of cases. In primary but not metastatic melanoma biopsies, the burden of coding mutations on desmosome genes associates with a strong reduction in desmosome gene expression. Analysis by spatial transcriptomics suggests that these expression decreases occur in keratinocytes in the microenvironment rather than in primary melanoma tumor cells. In further support of a microenvironmental origin, we find that loss-of-function knockdowns of the desmosome in keratinocytes yield markedly increased proliferation of adjacent melanocytes in keratinocyte/melanocyte co-cultures. Thus, gradual accumulation of desmosome mutations in neighboring cells may prime melanocytes for neoplastic transformation.

13.
Cell Rep ; 42(8): 112873, 2023 Aug 29.
Article in English | MEDLINE | ID: mdl-37527041

ABSTRACT

A vexing observation in genome-wide association studies (GWASs) is that parallel analyses in different species may not identify orthologous genes. Here, we demonstrate that cross-species translation of GWASs can be greatly improved by an analysis of co-localization within molecular networks. Using body mass index (BMI) as an example, we show that the genes associated with BMI in humans lack significant agreement with those identified in rats. However, the networks interconnecting these genes show substantial overlap, highlighting common mechanisms including synaptic signaling, epigenetic modification, and hormonal regulation. Genetic perturbations within these networks cause abnormal BMI phenotypes in mice, too, supporting their broad conservation across mammals. Other mechanisms appear species specific, including carbohydrate biosynthesis (humans) and glycerolipid metabolism (rodents). Finally, network co-localization also identifies cross-species convergence for height/body length. This study advances a general paradigm for determining whether and how phenotypes measured in model species recapitulate human biology.


Subjects
Body Mass Index , Gene Regulatory Networks , Genome-Wide Association Study , Humans , Animals , Rats , Body Size , Mice , Species Specificity
14.
bioRxiv ; 2023 Aug 04.
Article in English | MEDLINE | ID: mdl-37577681

ABSTRACT

Understanding the consequences of single amino acid substitutions in cancer driver genes remains an unmet need. Perturb-seq provides a tool to investigate the effects of individual mutations on cellular programs. Here we deploy SEUSS, a Perturb-seq-like approach, to generate and assay mutations at physical interfaces of the RUNX1 Runt domain. We measured the impact of 115 mutations on RNA profiles in single myelogenous leukemia cells and used the profiles to categorize mutations into three functionally distinct groups: wild-type (WT)-like, loss-of-function (LOF)-like and hypomorphic. Notably, the largest concentration of functional mutations (non-WT-like) clustered at the DNA binding site and contained many of the more frequently observed mutations in human cancers. Hypomorphic variants shared characteristics with loss-of-function variants but had gene expression profiles indicative of response to neural growth factor and cytokine recruitment of neutrophils. Additionally, DNA accessibility changes upon perturbations were enriched for RUNX1 binding motifs, particularly near differentially expressed genes. Overall, our work demonstrates the potential of targeting protein interaction interfaces to better define the landscape of prospective phenotypes reachable by amino acid substitutions.

15.
Cancer Discov ; 13(10): 2270-2291, 2023 Oct 05.
Article in English | MEDLINE | ID: mdl-37553760

ABSTRACT

Oncogenes can initiate tumors only in certain cellular contexts, which is referred to as oncogenic competence. In melanoma, whether cells in the microenvironment can endow such competence remains unclear. Using a combination of zebrafish transgenesis coupled with human tissues, we demonstrate that GABAergic signaling between keratinocytes and melanocytes promotes melanoma initiation by BRAF V600E. GABA is synthesized in melanoma cells and then acts on GABA-A receptors in keratinocytes. Electron microscopy demonstrates specialized cell-cell junctions between keratinocytes and melanoma cells, and multielectrode array analysis shows that GABA acts to inhibit electrical activity in melanoma/keratinocyte cocultures. Genetic and pharmacologic perturbation of GABA synthesis abrogates melanoma initiation in vivo. These data suggest that GABAergic signaling across the skin microenvironment regulates the ability of oncogenes to initiate melanoma. SIGNIFICANCE: This study shows evidence of GABA-mediated regulation of electrical activity between melanoma cells and keratinocytes, providing a new mechanism by which the microenvironment promotes tumor initiation. This provides insights into the role of the skin microenvironment in early melanomas while identifying GABA as a potential therapeutic target in melanoma. See related commentary by Ceol, p. 2128. This article is featured in Selected Articles from This Issue, p. 2109.


Subjects
Melanoma , Animals , Humans , Melanoma/drug therapy , Melanoma/genetics , Melanoma/pathology , Zebrafish , Melanocytes/pathology , Skin , Keratinocytes , Neoplastic Cell Transformation/genetics , gamma-Aminobutyric Acid , Tumor Microenvironment
16.
Sci Rep ; 13(1): 7678, 2023 May 11.
Article in English | MEDLINE | ID: mdl-37169829

ABSTRACT

Cell-cycle control is accomplished by cyclin-dependent kinases (CDKs), motivating extensive research into CDK targeting small-molecule drugs as cancer therapeutics. Here we use combinatorial CRISPR/Cas9 perturbations to uncover an extensive network of functional interdependencies among CDKs and related factors, identifying 43 synthetic-lethal and 12 synergistic interactions. We dissect CDK perturbations using single-cell RNAseq, for which we develop a novel computational framework to precisely quantify cell-cycle effects and diverse cell states orchestrated by specific CDKs. While pairwise disruption of CDK4/6 is synthetic-lethal, only CDK6 is required for normal cell-cycle progression and transcriptional activation. Multiple CDKs (CDK1/7/9/12) are synthetic-lethal in combination with PRMT5, independent of cell-cycle control. In-depth analysis of mRNA expression and splicing patterns provides multiple lines of evidence that the CDK-PRMT5 dependency is due to aberrant transcriptional regulation resulting in premature termination. These inter-dependencies translate to drug-drug synergies, with therapeutic implications in cancer and other diseases.


Subjects
Neoplasms , Humans , Cell Cycle Checkpoints , Cell Cycle/genetics , Neoplasms/drug therapy , Protein-Arginine N-Methyltransferases/pharmacology
17.
PLoS One ; 18(5): e0286064, 2023.
Article in English | MEDLINE | ID: mdl-37228113

ABSTRACT

Many disease-causing genetic variants converge on common biological functions and pathways. Precisely how to incorporate pathway knowledge in genetic association studies is not yet clear, however. Previous approaches employ a two-step approach, in which a regular association test is first performed to identify variants associated with the disease phenotype, followed by a test for functional enrichment within the genes implicated by those variants. Here we introduce a concise one-step approach, Hierarchical Genetic Analysis (Higana), which directly computes phenotype associations against each function in the large hierarchy of biological functions documented by the Gene Ontology. Using this approach, we identify risk genes and functions for Chronic Obstructive Pulmonary Disease (COPD), highlighting microtubule transport, muscle adaptation, and nicotine receptor signaling pathways. Microtubule transport has not been previously linked to COPD, as it integrates genetic variants spread over numerous genes. All associations validate strongly in a second COPD cohort.


Subjects
Genetic Predisposition to Disease , Chronic Obstructive Pulmonary Disease , Humans , Chronic Obstructive Pulmonary Disease/genetics , Genetic Association Studies , Genome-Wide Association Study , Single Nucleotide Polymorphism
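
The abstract in record 17 contrasts its one-step method with the earlier two-step approach, whose second step tests for functional enrichment among the implicated genes. The sketch below illustrates that second step with a one-sided hypergeometric test, a common choice for enrichment. The abstract does not name a specific test, so this choice and the function name `enrichment_p_value` are assumptions, and the code is not the Higana method.

```python
from math import comb


def enrichment_p_value(n_background, n_in_function, n_hits, n_overlap):
    """One-sided hypergeometric p-value: P(overlap >= n_overlap).

    n_background  : total number of genes tested
    n_in_function : genes annotated to the function of interest
    n_hits        : genes implicated by the association step
    n_overlap     : implicated genes that carry the annotation
    """
    max_overlap = min(n_in_function, n_hits)
    total = comb(n_background, n_hits)
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(
        comb(n_in_function, k) * comb(n_background - n_in_function, n_hits - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total
```

For example, `enrichment_p_value(20000, 150, 300, 12)` would ask how surprising it is that 12 of 300 implicated genes fall in a 150-gene function, given 20,000 background genes; these numbers are made up for illustration.
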
18.
Cell Syst ; 14(6): 447-463.e8, 2023 Jun 21.
Article in English | MEDLINE | ID: mdl-37220749

ABSTRACT

The DNA damage response (DDR) ensures error-free DNA replication and transcription and is disrupted in numerous diseases. An ongoing challenge is to determine the proteins orchestrating DDR and their organization into complexes, including constitutive interactions and those responding to genomic insult. Here, we use multi-conditional network analysis to systematically map DDR assemblies at multiple scales. Affinity purifications of 21 DDR proteins, with/without genotoxin exposure, are combined with multi-omics data to reveal a hierarchical organization of 605 proteins into 109 assemblies. The map captures canonical repair mechanisms and proposes new DDR-associated proteins extending to stress, transport, and chromatin functions. We find that protein assemblies closely align with genetic dependencies in processing specific genotoxins and that proteins in multiple assemblies typically act in multiple genotoxin responses. Follow-up by DDR functional readouts newly implicates 12 assembly members in double-strand-break repair. The DNA damage response assemblies map is available for interactive visualization and query (ccmi.org/ddram/).


Subjects
Chromatin , DNA Repair , DNA Repair/genetics , Chromatin/genetics , DNA Damage/genetics
19.
Bioinformatics ; 39(3), 2023 Mar 01.
Article in English | MEDLINE | ID: mdl-36882166

ABSTRACT

MOTIVATION: The investigation of sets of genes using biological pathways is a common task for researchers and is supported by a wide variety of software tools. This type of analysis generates hypotheses about the biological processes that are active or modulated in a specific experimental context. RESULTS: The Network Data Exchange Integrated Query (NDEx IQuery) is a new tool for network and pathway-based gene set interpretation that complements or extends existing resources. It combines novel sources of pathways, integration with Cytoscape, and the ability to store and share analysis results. The NDEx IQuery web application performs multiple gene set analyses based on diverse pathways and networks stored in NDEx. These include curated pathways from WikiPathways and SIGNOR, published pathway figures from the last 27 years, machine-assembled networks using the INDRA system, and the new NCI-PID v2.0, an updated version of the popular NCI Pathway Interaction Database. NDEx IQuery's integration with MSigDB and cBioPortal now provides pathway analysis in the context of these two resources. AVAILABILITY AND IMPLEMENTATION: NDEx IQuery is available at https://www.ndexbio.org/iquery and is implemented in Javascript and Java.


Subjects
Computational Biology , Software , Computational Biology/methods , Protein Interaction Maps , Publications , Factual Databases , Internet
20.
Nat Protoc ; 18(6): 1745-1759, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36653526

ABSTRACT

A longstanding goal of biomedicine is to understand how alterations in molecular and cellular networks give rise to the spectrum of human diseases. For diseases with shared etiology, understanding the common causes allows for improved diagnosis of each disease, development of new therapies and more comprehensive identification of disease genes. Accordingly, this protocol describes how to evaluate the extent to which two diseases, each characterized by a set of mapped genes, are colocalized in a reference gene interaction network. This procedure uses network propagation to measure the network 'distance' between gene sets. For colocalized diseases, the network can be further analyzed to extract common gene communities at progressive granularities. In particular, we show how to: (1) obtain input gene sets and a reference gene interaction network; (2) identify common subnetworks of genes that encompass or are in close proximity to all gene sets; (3) use multiscale community detection to identify systems and pathways represented by each common subnetwork to generate a network colocalized systems map; (4) validate identified genes and systems using a mouse variant database; and (5) visualize and further investigate select genes, interactions and systems for relevance to phenotype(s) of interest. We demonstrate the utility of this approach by identifying shared biological mechanisms underlying autism and congenital heart disease. However, this protocol is general and can be applied to any gene sets attributed to diseases or other phenotypes with suspected joint association. A typical NetColoc run takes less than an hour. Software and documentation are available at https://github.com/ucsd-ccbb/NetColoc .


Subjects
Gene Regulatory Networks , Software , Humans , Factual Databases , Computational Biology/methods
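
The protocol abstract in record 20 says that network propagation is used to measure the network 'distance' between two gene sets. The sketch below illustrates the general idea with a random walk with restart on a toy undirected network. It is not the NetColoc implementation: the function names, the restart parameter `alpha`, the fixed iteration count, and the per-gene product used as a toy proximity score are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate(edges, seeds, alpha=0.5, iterations=50):
    """Random walk with restart: spread 'heat' from seed genes over a network."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = sorted(neighbors)
    # Initial heat is uniform over the seed genes that occur in the network.
    present = [g for g in seeds if g in neighbors]
    start = {n: (1.0 / len(present) if n in present else 0.0) for n in nodes}
    heat = dict(start)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbor passes on its heat, split evenly over its own edges.
            inflow = sum(heat[m] / len(neighbors[m]) for m in neighbors[n])
            updated[n] = alpha * inflow + (1 - alpha) * start[n]
        heat = updated
    return heat


def proximity_scores(edges, seeds_a, seeds_b):
    """Toy per-gene score: high only where heat from both gene sets arrives."""
    heat_a = propagate(edges, seeds_a)
    heat_b = propagate(edges, seeds_b)
    return {n: heat_a[n] * heat_b[n] for n in heat_a}
```

On a chain of genes with one seed set at each end, the heat from each set decays with distance from its seeds, so genes near both sets receive the highest combined score.
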