Results 1 - 20 of 39
1.
Cell ; 185(15): 2617-2620, 2022 Jul 21.
Article in English | MEDLINE | ID: mdl-35868264

ABSTRACT

With recent dramatic advances in various techniques used for protein structure research, we asked researchers to comment on the next exciting questions for the field and about how these techniques will advance our knowledge not only about proteins but also about human health and diseases.

2.
Nat Methods ; 2024 May 14.
Article in English | MEDLINE | ID: mdl-38744917

ABSTRACT

AlphaFold2 revolutionized structural biology with the ability to predict protein structures with exceptionally high accuracy. Its implementation, however, lacks the code and data required to train new models. These are necessary to (1) tackle new tasks, like protein-ligand complex structure prediction, (2) investigate the process by which the model learns and (3) assess the model's capacity to generalize to unseen regions of fold space. Here we report OpenFold, a fast, memory-efficient and trainable implementation of AlphaFold2. We train OpenFold from scratch, matching the accuracy of AlphaFold2. Having established parity, we find that OpenFold is remarkably robust at generalizing even when the size and diversity of its training set are deliberately limited, including near-complete elisions of classes of secondary structure elements. By analyzing intermediate structures produced during training, we also gain insights into the hierarchical manner in which OpenFold learns to fold. In sum, our studies demonstrate the power and utility of OpenFold, which we believe will prove to be a crucial resource for the protein modeling community.

3.
Nat Methods ; 18(10): 1169-1180, 2021 10.
Article in English | MEDLINE | ID: mdl-34608321

ABSTRACT

Deep learning using neural networks relies on a class of machine-learnable models constructed using 'differentiable programs'. These programs can combine mathematical equations specific to a particular domain of natural science with general-purpose, machine-learnable components trained on experimental data. Such programs are having a growing impact on molecular and cellular biology. In this Perspective, we describe an emerging 'differentiable biology' in which phenomena ranging from the small and specific (for example, one experimental assay) to the broad and complex (for example, protein folding) can be modeled effectively and efficiently, often by exploiting knowledge about basic natural phenomena to overcome the limitations of sparse, incomplete and noisy data. By distilling differentiable biology into a small set of conceptual primitives and illustrative vignettes, we show how it can help to address long-standing challenges in integrating multimodal data from diverse experiments across biological scales. This promises to benefit fields as diverse as biophysics and functional genomics.


Subjects
Biophysics/methods; Computational Biology/instrumentation; Computational Biology/methods; Deep Learning; Neural Networks, Computer; Computational Chemistry; Models, Chemical; Pattern Recognition, Automated; Protein Conformation; Proteins/chemistry
4.
Nat Methods ; 17(2): 175-183, 2020 02.
Article in English | MEDLINE | ID: mdl-31907444

ABSTRACT

In mammalian cells, much of signal transduction is mediated by weak protein-protein interactions between globular peptide-binding domains (PBDs) and unstructured peptidic motifs in partner proteins. The number and diversity of these PBDs (over 1,800 are known), their low binding affinities and the sensitivity of binding properties to minor sequence variation represent a substantial challenge to experimental and computational analysis of PBD specificity and the networks PBDs create. Here, we introduce a bespoke machine-learning approach, hierarchical statistical mechanical modeling (HSM), capable of accurately predicting the affinities of PBD-peptide interactions across multiple protein families. By synthesizing biophysical priors within a modern machine-learning framework, HSM outperforms existing computational methods and high-throughput experimental assays. HSM models are interpretable in familiar biophysical terms at three spatial scales: the energetics of protein-peptide binding, the multidentate organization of protein-protein interactions and the global architecture of signaling networks.


Subjects
Machine Learning; Peptides/metabolism; Proteins/metabolism; Signal Transduction; Biophysical Phenomena; Humans; Protein Binding; Reproducibility of Results; src Homology Domains
5.
J Chem Inf Model ; 63(17): 5457-5472, 2023 09 11.
Article in English | MEDLINE | ID: mdl-37595065

ABSTRACT

Kinases have been the focus of drug discovery programs for three decades leading to over 70 therapeutic kinase inhibitors and biophysical affinity measurements for over 130,000 kinase-compound pairs. Nonetheless, the precise target spectrum for many kinases remains only partly understood. In this study, we describe a computational approach to unlocking qualitative and quantitative kinome-wide binding measurements for structure-based machine learning. Our study has three components: (i) a Kinase Inhibitor Complex (KinCo) data set comprising in silico predicted kinase structures paired with experimental binding constants, (ii) a machine learning loss function that integrates qualitative and quantitative data for model training, and (iii) a structure-based machine learning model trained on KinCo. We show that our approach outperforms methods trained on crystal structures alone in predicting binary and quantitative kinase-compound interaction affinities; relative to structure-free methods, our approach also captures known kinase biochemistry and more successfully generalizes to distant kinase sequences and compound scaffolds.


Subjects
Drug Discovery; Machine Learning; Protein Kinase Inhibitors/pharmacology
6.
Nat Methods ; 16(12): 1315-1322, 2019 12.
Article in English | MEDLINE | ID: mdl-31636460

ABSTRACT

Rational protein engineering requires a holistic understanding of protein function. Here, we apply deep learning to unlabeled amino-acid sequences to distill the fundamental features of a protein into a statistical representation that is semantically rich and structurally, evolutionarily and biophysically grounded. We show that the simplest models built on top of this unified representation (UniRep) are broadly applicable and generalize to unseen regions of sequence space. Our data-driven approach predicts the stability of natural and de novo designed proteins, and the quantitative function of molecularly diverse mutants, competitively with the state-of-the-art methods. UniRep further enables two orders of magnitude efficiency improvement in a protein engineering task. UniRep is a versatile summary of fundamental protein features that can be applied across protein engineering informatics.


Subjects
Deep Learning; Protein Engineering/methods; Amino Acid Sequence; Mutation; Protein Stability
7.
Cell Commun Signal ; 20(1): 76, 2022 05 30.
Article in English | MEDLINE | ID: mdl-35637461

ABSTRACT

BACKGROUND: Acute kidney injury (AKI) is associated with a severe decline in kidney function caused by abnormalities within the podocytes' glomerular matrix. Recently, AKI has been linked to alterations in glycolysis and the activity of glycolytic enzymes, including pyruvate kinase M2 (PKM2). However, the contribution of this enzyme to AKI remains largely unexplored. METHODS: Cre-loxP technology was used to examine the effects of PKM2-specific deletion in podocytes on the activation status of key signaling pathways involved in the pathophysiology of AKI by lipopolysaccharides (LPS). In addition, we used lentiviral shRNA to generate murine podocytes deficient in PKM2 and investigated the molecular mechanisms mediating PKM2 actions in vitro. RESULTS: Specific PKM2 deletion in podocytes ameliorated LPS-induced protein excretion and alleviated LPS-induced alterations in blood urea nitrogen and serum albumin levels. In addition, PKM2 deletion in podocytes alleviated LPS-induced structural and morphological alterations to the tubules and to the brush borders. At the molecular level, PKM2 deficiency in podocytes suppressed LPS-induced inflammation and apoptosis. In vitro, PKM2 knockdown in murine podocytes diminished LPS-induced apoptosis. These effects were concomitant with a reduction in LPS-induced activation of β-catenin and the loss of Wilms' Tumor 1 (WT1) and nephrin. Notably, the overexpression of a constitutively active mutant of β-catenin abolished the protective effect of PKM2 knockdown. Conversely, PKM2 knockdown cells reconstituted with the phosphotyrosine binding-deficient PKM2 mutant (K433E) recapitulated the effect of PKM2 depletion on LPS-induced apoptosis, β-catenin activation, and reduction in WT1 expression. CONCLUSIONS: Taken together, our data demonstrate that PKM2 plays a key role in podocyte injury and suggest that targeting PKM2 in podocytes could serve as a promising therapeutic strategy for AKI. TRIAL REGISTRATION: Not applicable.
Video abstract.


Subjects
Acute Kidney Injury; Leukemia, Myeloid, Acute; Podocytes; Acute Kidney Injury/metabolism; Animals; Leukemia, Myeloid, Acute/metabolism; Lipopolysaccharides/pharmacology; Mice; Pyruvate Kinase/genetics; Pyruvate Kinase/metabolism; Pyruvate Kinase/pharmacology; beta Catenin/metabolism
8.
Nature ; 596(7873): 487-488, 2021 08.
Article in English | MEDLINE | ID: mdl-34426694
9.
Nature ; 577(7792): 627-628, 2020 01.
Article in English | MEDLINE | ID: mdl-31988401
10.
Int J Mol Sci ; 22(3), 2021 Jan 25.
Article in English | MEDLINE | ID: mdl-33503959

ABSTRACT

Pyruvate kinase is a key regulator in glycolysis through the conversion of phosphoenolpyruvate (PEP) into pyruvate. Pyruvate kinase exists in various isoforms that can exhibit diverse biological functions and outcomes. The pyruvate kinase isoenzyme type M2 (PKM2) controls cell progression and survival through the regulation of key signaling pathways. In cancer cells, the dimer form of PKM2 predominates and plays an integral role in cancer metabolism. This predominance of the inactive dimeric form promotes the accumulation of phosphometabolites, allowing cancer cells to engage in high levels of synthetic processing to enhance their proliferative capacity. PKM2 has been recognized for its role in regulating gene expression and transcription factors critical for health and disease. This role enables PKM2 to exert profound regulatory effects that promote cancer cell metabolism, proliferation, and migration. In addition to its role in cancer, PKM2 regulates aspects essential to cellular homeostasis in non-cancer tissues and, in some cases, promotes tissue-specific pathways in health and diseases. In pursuit of understanding the diverse tissue-specific roles of PKM2, investigations targeting tissues such as the kidney, liver, adipose, and pancreas have been conducted. Findings from these studies enhance our understanding of PKM2 functions in various diseases beyond cancer. Therefore, there is substantial interest in PKM2 modulation as a potential therapeutic target for the treatment of multiple conditions. Indeed, a vast plethora of research has focused on identifying therapeutic strategies for targeting PKM2. Recently, targeting PKM2 through its regulatory microRNAs, long non-coding RNAs (lncRNAs), and circular RNAs (circRNAs) has gathered increasing interest. Thus, the goal of this review is to highlight recent advancements in PKM2 research, with a focus on PKM2 regulatory microRNAs and lncRNAs and their subsequent physiological significance.


Subjects
Carrier Proteins/genetics; Carrier Proteins/metabolism; Cellular Reprogramming; Energy Metabolism; Gene Expression Regulation; Membrane Proteins/genetics; Membrane Proteins/metabolism; Thyroid Hormones/genetics; Thyroid Hormones/metabolism; Animals; Carrier Proteins/antagonists & inhibitors; Cell Transformation, Neoplastic/genetics; Cell Transformation, Neoplastic/metabolism; Cellular Reprogramming/genetics; Disease Susceptibility; Drug Development; Drug Evaluation, Preclinical; Energy Metabolism/genetics; Enzyme Inhibitors/pharmacology; Enzyme Inhibitors/therapeutic use; Homeostasis; Humans; Membrane Proteins/antagonists & inhibitors; Mutation; Protein Transport; Pyruvate Kinase/genetics; Pyruvate Kinase/metabolism; RNA Interference; RNA, Long Noncoding/genetics; Research; Thyroid Hormone-Binding Proteins
11.
Bioinformatics ; 35(22): 4862-4865, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31116374

ABSTRACT

SUMMARY: Computational prediction of protein structure from sequence is broadly viewed as a foundational problem of biochemistry and one of the most difficult challenges in bioinformatics. Once every two years the Critical Assessment of protein Structure Prediction (CASP) experiments are held to assess the state of the art in the field in a blind fashion, by presenting predictor groups with protein sequences whose structures have been solved but have not yet been made publicly available. The first CASP was organized in 1994, and the latest, CASP13, took place last December, when for the first time the industrial laboratory DeepMind entered the competition. DeepMind's entry, AlphaFold, placed first in the Free Modeling (FM) category, which assesses methods on their ability to predict novel protein folds (the Zhang group placed first in the Template-Based Modeling (TBM) category, which assesses methods on predicting proteins whose folds are related to ones already in the Protein Data Bank). DeepMind's success generated significant public interest. Their approach builds on two ideas developed in the academic community during the preceding decade: (i) the use of co-evolutionary analysis to map residue co-variation in protein sequence to physical contact in protein structure, and (ii) the application of deep neural networks to robustly identify patterns in protein sequence and co-evolutionary couplings and convert them into contact maps. In this Letter, we contextualize the significance of DeepMind's entry within the broader history of CASP, relate AlphaFold's methodological advances to prior work, and speculate on the future of this important problem.


Subjects
Computational Biology; Software; Databases, Protein; Models, Molecular; Protein Conformation; Proteins
12.
Cell Commun Signal ; 18(1): 126, 2020 08 14.
Article in English | MEDLINE | ID: mdl-32795297

ABSTRACT

BACKGROUND: Current pharmacological therapies and treatments targeting pancreatic neuroendocrine tumors (PNETs) have proven ineffective far too often. Therefore, there is an urgent need for alternative therapeutic approaches. Zyflamend, a combination of anti-inflammatory herbal extracts that has proven to be effective in various in vitro and in vivo cancer platforms, shows promise. However, its effects on pancreatic cancer, in particular, remain largely unexplored. METHODS: In the current study, we investigated the effects of Zyflamend on the survival of beta-TC-6 pancreatic insulinoma cells (β-TC6) and conducted a detailed analysis of the underlying molecular mechanisms. RESULTS: Herein, we demonstrate that Zyflamend treatment decreased cell proliferation in a dose-dependent manner, concomitant with increased apoptotic cell death and cell cycle arrest at the G2/M phase. At the molecular level, treatment with Zyflamend led to the induction of ER stress, autophagy, and the activation of the c-Jun N-terminal kinase (JNK) pathway. Notably, pharmacological inhibition of JNK abrogated the pro-apoptotic effects of Zyflamend. Furthermore, Zyflamend exacerbated the effects of streptozotocin and adriamycin-induced ER stress, autophagy, and apoptosis. CONCLUSION: The current study identifies Zyflamend as a potential novel adjuvant in the treatment of pancreatic cancer via modulation of the JNK pathway. Video abstract.


Subjects
Apoptosis; MAP Kinase Signaling System; Pancreatic Neoplasms/enzymology; Pancreatic Neoplasms/pathology; Plant Extracts/pharmacology; Animals; Apoptosis/drug effects; Autophagy/drug effects; Cell Cycle Checkpoints/drug effects; Cell Line, Tumor; Cell Proliferation/drug effects; Doxorubicin/pharmacology; Endoplasmic Reticulum Stress/drug effects; Inflammation/pathology; MAP Kinase Signaling System/drug effects; Mice; Models, Biological; Rats; Streptozocin/pharmacology
13.
BMC Bioinformatics ; 20(1): 311, 2019 Jun 11.
Article in English | MEDLINE | ID: mdl-31185886

ABSTRACT

BACKGROUND: Rapid progress in deep learning has spurred its application to bioinformatics problems including protein structure prediction and design. In classic machine learning problems like computer vision, progress has been driven by standardized data sets that facilitate fair assessment of new methods and lower the barrier to entry for non-domain experts. While data sets of protein sequence and structure exist, they lack certain components critical for machine learning, including high-quality multiple sequence alignments and insulated training/validation splits that account for deep but only weakly detectable homology across protein space. RESULTS: We created the ProteinNet series of data sets to provide a standardized mechanism for training and assessing data-driven models of protein sequence-structure relationships. ProteinNet integrates sequence, structure, and evolutionary information in programmatically accessible file formats tailored for machine learning frameworks. Multiple sequence alignments of all structurally characterized proteins were created using substantial high-performance computing resources. Standardized data splits were also generated to emulate the difficulty of past CASP (Critical Assessment of protein Structure Prediction) experiments by resetting protein sequence and structure space to the historical states that preceded six prior CASPs. Utilizing sensitive evolution-based distance metrics to segregate distantly related proteins, we have additionally created validation sets distinct from the official CASP sets that faithfully mimic their difficulty. CONCLUSION: ProteinNet represents a comprehensive and accessible resource for training and assessing machine-learned models of protein structure.


Subjects
Machine Learning; Proteins/chemistry; Amino Acid Sequence; Databases, Protein; Reference Standards; Sequence Alignment
14.
J Comput Chem ; 40(7): 885-892, 2019 03 15.
Article in English | MEDLINE | ID: mdl-30614534

ABSTRACT

The conversion of polymer parameterization from internal coordinates (bond lengths, angles, and torsions) to Cartesian coordinates is a fundamental task in molecular modeling, often performed using the natural extension reference frame (NeRF) algorithm. NeRF can be parallelized to process multiple polymers simultaneously, but is not parallelizable along the length of a single polymer. A mathematically equivalent algorithm, pNeRF, has been derived that is parallelizable along a polymer's length. Empirical analysis demonstrates an order-of-magnitude speed up using modern GPUs and CPUs. In machine learning-based workflows, in which partial derivatives are backpropagated through NeRF equations and neural network primitives, switching to pNeRF can reduce the fractional computational cost of coordinate conversion from over two-thirds to around 10%. An optimized TensorFlow-based implementation of pNeRF is available on GitHub at https://github.com/aqlaboratory/pnerf © 2018 Wiley Periodicals, Inc.


Subjects
Algorithms; Polymers/chemistry; Machine Learning; Models, Molecular
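
For readers unfamiliar with the coordinate conversion described in this abstract, the following is a minimal sketch of the serial NeRF step only: placing one atom from the three preceding atoms plus a bond length, bond angle, and torsion. It is not the pNeRF algorithm and is not taken from the linked repository; the function names `place_atom` and `build_chain` are hypothetical, angles are assumed to be in radians, and the torsion sign is assumed to follow the usual convention in which 0 is cis and pi is trans.

```python
import math

def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _unit(u):
    # Fails with ZeroDivisionError if u has zero length (e.g. collinear atoms).
    norm = math.sqrt(u[0] ** 2 + u[1] ** 2 + u[2] ** 2)
    return (u[0] / norm, u[1] / norm, u[2] / norm)

def place_atom(a, b, c, bond_length, bond_angle, torsion):
    """Return D given atoms A, B, C, |CD|, angle B-C-D and torsion A-B-C-D."""
    bc = _unit(_sub(c, b))               # axis along the B->C bond
    n = _unit(_cross(_sub(b, a), bc))    # normal to the A-B-C plane
    m = _cross(n, bc)                    # completes the local orthonormal frame
    # Coordinates of D in the local frame (bc, m, n).
    x = -bond_length * math.cos(bond_angle)
    y = bond_length * math.sin(bond_angle) * math.cos(torsion)
    z = bond_length * math.sin(bond_angle) * math.sin(torsion)
    return tuple(c[i] + x * bc[i] + y * m[i] + z * n[i] for i in range(3))

def build_chain(seed, internals):
    """Extend three seed atoms using (length, angle, torsion) triples."""
    coords = list(seed)
    for length, angle, torsion in internals:
        # Each new atom needs the three atoms placed just before it.
        coords.append(place_atom(coords[-3], coords[-2], coords[-1],
                                 length, angle, torsion))
    return coords
```

The loop in `build_chain` makes the dependency visible: every atom is placed relative to the three before it, which is why this serial form does not parallelize along a single chain.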
15.
BMC Bioinformatics ; 16: 390, 2015 Nov 19.
Article in English | MEDLINE | ID: mdl-26586237

ABSTRACT

BACKGROUND: Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. DESCRIPTION: We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
CONCLUSIONS: This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.


Subjects
DNA-Binding Proteins/genetics; DNA/genetics; Algorithms; DNA-Binding Proteins/metabolism; Databases, Factual; Transcription Factors
16.
Proc Natl Acad Sci U S A ; 108(36): 14819-24, 2011 Sep 06.
Article in English | MEDLINE | ID: mdl-21825146

ABSTRACT

Compressed sensing has revolutionized signal acquisition, by enabling complex signals to be measured with remarkable fidelity using a small number of so-called incoherent sensors. We show that molecular interactions, e.g., protein-DNA interactions, can be analyzed in a directly analogous manner and with similarly remarkable results. Specifically, mesoscopic molecular interactions act as incoherent sensors that measure the energies of microscopic interactions between atoms. We combine concepts from compressed sensing and statistical mechanics to determine the interatomic interaction energies of a molecular system exclusively from experimental measurements, resulting in a "de novo" energy potential. In contrast, conventional methods for estimating energy potentials are based on theoretical models premised on a priori assumptions and extensive domain knowledge. We determine the de novo energy potential for pairwise interactions between protein and DNA atoms from (i) experimental measurements of the binding affinity of protein-DNA complexes and (ii) crystal structures of the complexes. We show that the de novo energy potential can be used to predict the binding specificity of proteins to DNA with approximately 90% accuracy, compared to approximately 60% for the best performing alternative computational methods applied to this fundamental problem. This de novo potential method is directly extendable to other biomolecule interaction domains (enzymes and signaling molecule interactions) and to other classes of molecular interactions.


Subjects
Computer Simulation; DNA-Binding Proteins/chemistry; DNA/chemistry; Models, Chemical; Crystallography, X-Ray; Protein Structure, Tertiary; Thermodynamics
17.
Nat Biotechnol ; 2024 May 23.
Article in English | MEDLINE | ID: mdl-38783148

ABSTRACT

Single-nucleotide variants (SNVs) in key T cell genes can drive clinical pathologies and could be repurposed to improve cellular cancer immunotherapies. Here, we perform massively parallel base-editing screens to generate thousands of variants at gene loci annotated with known or potential clinical relevance. We discover a broad landscape of putative gain-of-function (GOF) and loss-of-function (LOF) mutations, including in PIK3CD and the gene encoding its regulatory subunit, PIK3R1, LCK, SOS1, AKT1 and RHOA. Base editing of PIK3CD and PIK3R1 variants in T cells with an engineered T cell receptor specific to a melanoma epitope or in different generations of CD19 chimeric antigen receptor (CAR) T cells demonstrates that discovered GOF variants, but not LOF or silent mutation controls, enhanced signaling, cytokine production and lysis of cognate melanoma and leukemia cell models, respectively. Additionally, we show that generations of CD19 CAR T cells engineered with PIK3CD GOF mutations demonstrate enhanced antigen-specific signaling, cytokine production and leukemia cell killing, including when benchmarked against other recent strategies.

18.
Proteins ; 81(3): 426-42, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23042633

ABSTRACT

The energetics of protein-DNA interactions are often modeled using so-called statistical potentials, that is, energy models derived from the atomic structures of protein-DNA complexes. Many statistical protein-DNA potentials based on differing theoretical assumptions have been investigated, but little attention has been paid to the types of data and the parameter estimation process used in deriving the statistical potentials. We describe three enhancements to statistical potential inference that significantly improve the accuracy of predicted protein-DNA interactions: (i) incorporation of binding energy data of protein-DNA complexes, in conjunction with their X-ray crystal structures, (ii) use of spatially-aware parameter fitting, and (iii) use of ensemble-based parameter fitting. We apply these enhancements to three widely-used statistical potentials and use the resulting enhanced potentials in a structure-based prediction of the DNA binding sites of proteins. These enhancements are directly applicable to all statistical potentials used in protein-DNA modeling, and we show that they can improve the accuracy of predicted DNA binding sites by up to 21%.


Subjects
DNA/chemistry; Protein Interaction Mapping/methods; Proteins/chemistry; Algorithms; Artificial Intelligence; Base Sequence; Binding Sites; Consensus Sequence; Crystallography, X-Ray; DNA-Binding Proteins/chemistry; Data Interpretation, Statistical; Entropy; Models, Molecular; Models, Statistical; Protein Binding; Sensitivity and Specificity
19.
Genome Biol ; 24(1): 110, 2023 05 09.
Article in English | MEDLINE | ID: mdl-37161576

ABSTRACT

Understanding coding mutations is important for many applications in biology and medicine but the vast mutation space makes comprehensive experimental characterisation impossible. Current predictors are often computationally intensive and difficult to scale, including recent deep learning models. We introduce Sequence UNET, a highly scalable deep learning architecture that classifies and predicts variant frequency from sequence alone using multi-scale representations from a fully convolutional compression/expansion architecture. It achieves comparable pathogenicity prediction to recent methods. We demonstrate scalability by analysing 8.3B variants in 904,134 proteins detected through large-scale proteomics. Sequence UNET runs on modest hardware with a simple Python package.


Subjects
Data Compression; Deep Learning; Mutation; Proteomics
20.
Protein Eng Des Sel ; 36, 2023 Jan 21.
Article in English | MEDLINE | ID: mdl-38102755

ABSTRACT

Numerous cellular functions rely on protein-protein interactions. Efforts to comprehensively characterize them remain challenged however by the diversity of molecular recognition mechanisms employed within the proteome. Deep learning has emerged as a promising approach for tackling this problem by exploiting both experimental data and basic biophysical knowledge about protein interactions. Here, we review the growing ecosystem of deep learning methods for modeling protein interactions, highlighting the diversity of these biophysically informed models and their respective trade-offs. We discuss recent successes in using representation learning to capture complex features pertinent to predicting protein interactions and interaction sites, geometric deep learning to reason over protein structures and predict complex structures, and generative modeling to design de novo protein assemblies. We also outline some of the outstanding challenges and promising new directions. Opportunities abound to discover novel interactions, elucidate their physical mechanisms, and engineer binders to modulate their functions using deep learning and, ultimately, unravel how protein interactions orchestrate complex cellular behaviors.


Subjects
Deep Learning; Protein Interaction Mapping; Proteins; Proteins/chemistry; Protein Interaction Mapping/methods