1.
Nat Chem ; 2024 May 16.
Article in English | MEDLINE | ID: mdl-38755312

ABSTRACT

Several peptide dual agonists of the human glucagon receptor (GCGR) and the glucagon-like peptide-1 receptor (GLP-1R) are in development for the treatment of type 2 diabetes, obesity and their associated complications. Candidates must have high potency at both receptors, but it is unclear whether the limited experimental data available can be used to train models that accurately predict the activity at both receptors of new peptide variants. Here we use peptide sequence data labelled with in vitro potency at human GCGR and GLP-1R to train several models, including a deep multi-task neural-network model using multiple loss optimization. Model-guided sequence optimization was used to design three groups of peptide variants, with distinct ranges of predicted dual activity. We found that three of the model-designed sequences are potent dual agonists with superior biological activity. With our designs we were able to achieve up to sevenfold potency improvement at both receptors simultaneously compared to the best dual agonist in the training set.

3.
Nature ; 625(7996): 832-839, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37956700

ABSTRACT

AlphaFold2 (ref. 1) has revolutionized structural biology by accurately predicting single structures of proteins. However, a protein's biological function often depends on multiple conformational substates (ref. 2), and disease-causing point mutations often cause population changes within these substates (refs. 3,4). We demonstrate that clustering a multiple-sequence alignment by sequence similarity enables AlphaFold2 to sample alternative states of known metamorphic proteins with high confidence. Using this method, named AF-Cluster, we investigated the evolutionary distribution of predicted structures for the metamorphic protein KaiB (ref. 5) and found that predictions of both conformations were distributed in clusters across the KaiB family. We used nuclear magnetic resonance spectroscopy to confirm an AF-Cluster prediction: a cyanobacteria KaiB variant is stabilized in the opposite state compared with the more widely studied variant. To test AF-Cluster's sensitivity to point mutations, we designed and experimentally verified a set of three mutations predicted to flip KaiB from Rhodobacter sphaeroides from the ground to the fold-switched state. Finally, screening for alternative states in protein families without known fold switching identified a putative alternative state for the oxidoreductase Mpt53 in Mycobacterium tuberculosis. Further development of such bioinformatic methods in tandem with experiments will probably have a considerable impact on predicting protein energy landscapes, essential for illuminating biological function.


Subject(s)
Cluster Analysis , Machine Learning , Protein Conformation , Protein Folding , Proteins , Sequence Alignment , Mutation , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Rhodobacter sphaeroides , Bacterial Proteins/chemistry , Bacterial Proteins/metabolism
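Note on record 3: the abstract describes clustering a multiple-sequence alignment by sequence similarity before structure prediction. A minimal sketch of that idea follows. It assumes a greedy threshold clustering on Hamming distance, which is an illustrative choice and not a method stated in the abstract; the names `hamming`, `cluster_msa` and `max_dist`, and the toy alignment, are hypothetical.

```python
def hamming(a, b):
    """Number of aligned positions at which two equal-length rows differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_msa(aligned, max_dist):
    """Greedy clustering (an assumed stand-in, not the published procedure).

    Each row joins the first cluster whose representative (its first member)
    is within max_dist mismatches; otherwise it starts a new cluster.
    """
    clusters = []
    for seq in aligned:
        for members in clusters:
            if hamming(seq, members[0]) <= max_dist:
                members.append(seq)
                break
        else:
            # No existing representative was close enough.
            clusters.append([seq])
    return clusters


# Toy alignment (hypothetical rows, '-' marks a gap).
msa = ["MKV-LA", "MKVALA", "MRV-LA", "GTPEEW", "GTPEDW"]
groups = cluster_msa(msa, max_dist=2)
for members in groups:
    print(len(members), members)
```

In a workflow of the kind the abstract describes, each cluster would then be passed to the structure predictor as its own, smaller alignment.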
4.
Int J Ment Health Nurs ; 33(1): 202-212, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37788130

ABSTRACT

This article aims to draw attention to increasing genericism in nurse education in the United Kingdom, which sees less specialist mental health education for mental health nursing students and offers opposition to such direction. In 2018, the Nursing and Midwifery Council produced the 'Future Nurse' standards which directed changes to pre-registration nurse education. This led to dissatisfaction from many mental health nurses, specifically regarding reduced mental health content for students studying mental health nursing. Concerns have been raised through public forum and evolved into a grassroots national movement 'Mental Health Deserves Better' (#MHDeservesBetter). This is a position paper which presents the perspective of many mental health nurse academics working at universities within the United Kingdom. Mental health nurse academics collaborated to develop ideas and articulate arguments and perspectives which present a strong position on the requirement for specialist pre-registration mental health nurse education. The key themes explored are: a conflict of ideologies in nursing, no parity of esteem, physical health care needs to be contextualized, the unique nature of mental health nursing, ethical tensions and values conflict, implications for practice, necessary improvements overlooked and the dangers of honesty and academic 'freedom'. The paper concludes by asserting a strong position on the need for a change of direction away from genericism and calls on mental health nurses to rise from the ashes to advocate for a quality education necessary to ensure quality care delivery. The quality of mental health care provided by mental health nurses has many influences, yet the foundation offered through pre-registration education is one of the most valuable. If the education of mental health nurses does not attend to the distinct and unique role of the mental health nurse, standards of mental health care may diminish without assertive action from mental health nurses and allies.


Subject(s)
Education, Nursing, Baccalaureate , Psychiatric Nursing , Humans , Mental Health , United Kingdom , Health Education
5.
Elife ; 12, 2023 02 27.
Article in English | MEDLINE | ID: mdl-36847334

ABSTRACT

Predicting the function of a protein from its amino acid sequence is a long-standing challenge in bioinformatics. Traditional approaches use sequence alignment to compare a query sequence either to thousands of models of protein families or to large databases of individual protein sequences. Here we introduce ProteInfer, which instead employs deep convolutional neural networks to predict a variety of protein functions, namely Enzyme Commission (EC) numbers and Gene Ontology (GO) terms, directly from an unaligned amino acid sequence. This approach provides precise predictions which complement alignment-based methods, and the computational efficiency of a single neural network permits novel and lightweight software interfaces, which we demonstrate with an in-browser graphical interface for protein function prediction in which all computation is performed on the user's personal computer with no data uploaded to remote servers. Moreover, these models place full-length amino acid sequences into a generalised functional space, facilitating downstream analysis and interpretation. To read the interactive version of this paper, please visit https://google-research.github.io/proteinfer/.


Subject(s)
Algorithms , Neural Networks, Computer , Proteins/genetics , Proteins/chemistry , Amino Acid Sequence , Software , Computational Biology/methods
6.
Nat Biotechnol ; 41(8): 1073-1074, 2023 Aug.
Article in English | MEDLINE | ID: mdl-36702894
7.
Methods Mol Biol ; 2586: 49-77, 2023.
Article in English | MEDLINE | ID: mdl-36705898

ABSTRACT

Here we detail the LandscapeFold secondary structure prediction algorithm and how it is used. The algorithm was previously described and tested in (Kimchi O et al., Biophys J 117(3):520-532, 2019), though it was not named there. The algorithm directly enumerates all possible secondary structures into which up to two RNA or single-stranded DNA sequences can fold. It uses a polymer physics model to estimate the configurational entropy of structures including complex pseudoknots. We detail each of these steps and ways in which the user can adjust the algorithm as desired. The code is available in the GitHub repository https://github.com/ofer-kimchi/LandscapeFold.


Subject(s)
Algorithms , RNA , Nucleic Acid Conformation , RNA/genetics , Entropy , DNA, Single-Stranded
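Note on record 7: the abstract describes direct enumeration of all possible secondary structures. The sketch below illustrates exhaustive enumeration in a deliberately simplified setting: a single RNA strand, nested (non-pseudoknotted) base pairs only, and no energy or entropy model. It is not the LandscapeFold code, which according to the abstract also handles pseudoknots and up to two strands; the function name, the `min_loop` parameter and the allowed-pair table are assumptions made for illustration.

```python
# Watson-Crick and G-U wobble pairs (assumed pairing rules for this sketch).
ALLOWED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def enumerate_structures(seq, min_loop=3):
    """Return every nested structure of seq as a frozenset of (i, j) pairs.

    min_loop is the minimum number of unpaired bases enclosed by a pair.
    """
    def rec(i, j):
        # Structures on the inclusive interval seq[i..j].
        if i >= j:
            return [frozenset()]
        # Case 1: base i is unpaired.
        results = list(rec(i + 1, j))
        # Case 2: base i pairs with some k, splitting the interval in two.
        for k in range(i + min_loop + 1, j + 1):
            if (seq[i], seq[k]) in ALLOWED:
                for inner in rec(i + 1, k - 1):
                    for outer in rec(k + 1, j):
                        results.append(inner | outer | {(i, k)})
        return results

    return rec(0, len(seq) - 1)


structures = enumerate_structures("GAAAC")
for s in structures:
    print(sorted(s))
```

The number of structures grows very quickly with sequence length, so a direct enumeration like this is only practical for short sequences.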
8.
Nucleic Acids Res ; 51(D1): D753-D759, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36477304

ABSTRACT

The MGnify platform (https://www.ebi.ac.uk/metagenomics) facilitates the assembly, analysis and archiving of microbiome-derived nucleic acid sequences. The platform provides access to taxonomic assignments and functional annotations for nearly half a million analyses covering metabarcoding, metatranscriptomic, and metagenomic datasets, which are derived from a wide range of different environments. Over the past 3 years, MGnify has not only grown in terms of the number of datasets contained but also increased the breadth of analyses provided, such as the analysis of long-read sequences. The MGnify protein database now exceeds 2.4 billion non-redundant sequences predicted from metagenomic assemblies. This collection is now organised into a relational database making it possible to understand the genomic context of the protein through navigation back to the source assembly and sample metadata, marking a major improvement. To extend beyond the functional annotations already provided in MGnify, we have applied deep learning-based annotation methods. The technology underlying MGnify's Application Programming Interface (API) and website has been upgraded, and we have enabled downstream analysis of the MGnify data by introducing a coupled Jupyter Lab environment.


Subject(s)
Microbiota , Sequence Analysis , Genomics/methods , Metagenome , Metagenomics/methods , Microbiota/genetics , Software , Sequence Analysis/methods
9.
Nucleic Acids Res ; 51(D1): D418-D427, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350672

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. Here, we report recent developments with InterPro (version 90.0) and its associated software, including updates to data content and to the website. These developments extend and enrich the information provided by InterPro, and provide more user-friendly access to the data. Additionally, we have worked on adding Pfam website features to the InterPro website, as the Pfam website will be retired in late 2022. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB. Moreover, we report the development of a card game as a method of engaging the non-scientific community. Finally, we discuss the benefits and challenges brought by the use of artificial intelligence for protein structure prediction.


Subject(s)
Databases, Protein , Humans , Amino Acid Sequence , Artificial Intelligence , Internet , Proteins/chemistry , Software
10.
Database (Oxford) ; 2022, 2022 08 12.
Article in English | MEDLINE | ID: mdl-35961013

ABSTRACT

Over the last 25 years, biology has entered the genomic era and is becoming a science of 'big data'. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by more than 500 000 genomes sequenced to date. By different estimates, only half the predicted sequenced proteins carry an accurate functional annotation, and this percentage varies drastically between different organismal lineages. Such a large gap in knowledge hampers all aspects of biological enterprise and, thereby, is standing in the way of genomic biology reaching its full potential. A brainstorming meeting to address this issue funded by the National Science Foundation was held during 3-4 February 2022. Bringing together data scientists, biocurators, computational biologists and experimentalists within the same venue allowed for a comprehensive assessment of the current state of functional annotations of protein families. Further, major issues that were obstructing the field were identified and discussed, which ultimately allowed for the proposal of solutions on how to move forward.


Subject(s)
Genomics , Proteins , Base Sequence , Computational Biology , Genome , Molecular Sequence Annotation
11.
Biophys J ; 121(16): 3023-3033, 2022 08 16.
Article in English | MEDLINE | ID: mdl-35859421

ABSTRACT

Collagen fibrils are the major constituents of the extracellular matrix, which provides structural support to vertebrate connective tissues. It is widely assumed that the superstructure of collagen fibrils is encoded in the primary sequences of the molecular building blocks. However, the interplay between large-scale architecture and small-scale molecular interactions makes the ab initio prediction of collagen structure challenging. Here, we propose a model that allows us to predict the periodic structure of collagen fibers and the axial offset between the molecules, purely on the basis of simple predictive rules for the interaction between amino acid residues. With our model, we identify the sequence-dependent collagen fiber geometries with the lowest free energy and validate the predicted geometries against the available experimental data. We propose a procedure for searching for optimal staggering distances. Finally, we build a classification algorithm and use it to scan 11 data sets of vertebrate fibrillar collagens, and predict the periodicity of the resulting assemblies. We analyze the experimentally observed variance of the optimal stagger distances across species, and find that these distances, and the resulting fibrillar phenotypes, are evolutionarily well preserved. Moreover, we observe that the energy minimum at the optimal stagger distance is broad in all cases, suggesting a further evolutionary adaptation designed to improve the assembly kinetics. Our periodicity predictions are in good agreement with the experimental data on collagen molecular staggering, not only for all collagen types analyzed but also for synthetic peptides. We argue that, with our model, it becomes possible to design tailor-made, periodic collagen structures, thereby enabling the design of novel biomimetic materials based on collagen-mimetic trimers.


Subject(s)
Biomimetic Materials , Collagen , Biomimetic Materials/chemistry , Collagen/metabolism , Extracellular Matrix/metabolism , Fibrillar Collagens , Peptides/chemistry
12.
Nat Biotechnol ; 40(6): 932-937, 2022 06.
Article in English | MEDLINE | ID: mdl-35190689

ABSTRACT

Understanding the relationship between amino acid sequence and protein function is a long-standing challenge with far-reaching scientific and translational implications. State-of-the-art alignment-based techniques cannot predict function for one-third of microbial protein sequences, hampering our ability to exploit data from diverse organisms. Here, we train deep learning models to accurately predict functional annotations for unaligned amino acid sequences across rigorous benchmark assessments built from the 17,929 families of the protein families database Pfam. The models infer known patterns of evolutionary substitutions and learn representations that accurately cluster sequences from unseen families. Combining deep models with existing methods significantly improves remote homology detection, suggesting that the deep models learn complementary information. This approach extends the coverage of Pfam by >9.5%, exceeding additions made over the last decade, and predicts function for 360 human reference proteome proteins with no previous Pfam annotation. These results suggest that deep learning models will be a core component of future protein annotation tools.


Subject(s)
Deep Learning , Amino Acid Sequence , Databases, Protein , Humans , Molecular Sequence Annotation , Proteome/metabolism , Proteomics
13.
Cell Syst ; 12(11): 1019-1020, 2021 11 17.
Article in English | MEDLINE | ID: mdl-34793698

ABSTRACT

Machine-learning-guided protein design is rapidly emerging as a strategy to find high-fitness multi-mutant variants. In this issue of Cell Systems, Wittman et al. analyze the impact of design decisions for machine-learning-assisted directed evolution (MLDE) on its ability to navigate a fitness landscape and reliably find global optima.


Subject(s)
Machine Learning , Proteins
14.
Nat Biotechnol ; 39(6): 691-696, 2021 06.
Article in English | MEDLINE | ID: mdl-33574611

ABSTRACT

Modern experimental technologies can assay large numbers of biological sequences, but engineered protein libraries rarely exceed the sequence diversity of natural protein families. Machine learning (ML) models trained directly on experimental data without biophysical modeling provide one route to accessing the full potential diversity of engineered proteins. Here we apply deep learning to design highly diverse adeno-associated virus 2 (AAV2) capsid protein variants that remain viable for packaging of a DNA payload. Focusing on a 28-amino acid segment, we generated 201,426 variants of the AAV2 wild-type (WT) sequence yielding 110,689 viable engineered capsids, 57,348 of which surpass the average diversity of natural AAV serotype sequences, with 12-29 mutations across this region. Even when trained on limited data, deep neural network models accurately predict capsid viability across diverse variants. This approach unlocks vast areas of functional but previously unreachable sequence space, with many potential applications for the generation of improved viral vectors and protein therapeutics.


Subject(s)
Capsid Proteins/genetics , Dependovirus/genetics , Machine Learning , Genetic Vectors , HeLa Cells , Humans
15.
Nat Biotechnol ; 38(8): 989-999, 2020 08.
Article in English | MEDLINE | ID: mdl-32284585

ABSTRACT

A central challenge in expanding the genetic code of cells to incorporate noncanonical amino acids into proteins is the scalable discovery of aminoacyl-tRNA synthetase (aaRS)-tRNA pairs that are orthogonal in their aminoacylation specificity. Here we computationally identify candidate orthogonal tRNAs from millions of sequences and develop a rapid, scalable approach, named tRNA Extension (tREX), to determine the in vivo aminoacylation status of tRNAs. Using tREX, we test 243 candidate tRNAs in Escherichia coli and identify 71 orthogonal tRNAs, covering 16 isoacceptor classes, and 23 functional orthogonal tRNA-cognate aaRS pairs. We discover five orthogonal pairs, including three highly active amber suppressors, and evolve new amino acid substrate specificities for two pairs. Finally, we use tREX to characterize a matrix of 64 orthogonal synthetase-orthogonal tRNA specificities. This work expands the number of orthogonal pairs available for genetic code expansion and provides a pipeline for the discovery of additional orthogonal pairs and a foundation for encoding the cellular synthesis of noncanonical biopolymers.


Subject(s)
Amino Acyl-tRNA Synthetases/metabolism , RNA, Transfer/metabolism , Amino Acid Sequence , Amino Acyl-tRNA Synthetases/genetics , Computer Simulation , Escherichia coli , Gene Expression Regulation, Bacterial , Green Fluorescent Proteins , Protein Binding , Substrate Specificity
16.
Sci Rep ; 10(1): 3397, 2020 02 25.
Article in English | MEDLINE | ID: mdl-32099005

ABSTRACT

Collagen fibrils are central to the molecular organization of the extracellular matrix (ECM) and to defining the cellular microenvironment. Glycation of collagen fibrils is known to affect cell adhesion and migration in the context of cancer, and in model studies glycation of collagen molecules has been shown to affect the binding of other ECM components to collagen. Here we use TEM to show that ribose-5-phosphate (R5P) glycation of collagen fibrils, potentially important in the microenvironment of actively dividing cells such as cancer cells, disrupts the longitudinal ordering of the molecules in collagen fibrils and, using KFM and FLiM, that R5P-glycated collagen fibrils have a more negative surface charge than unglycated fibrils. Altered molecular arrangement can be expected to affect the accessibility of cell adhesion sites, and altered fibril surface charge to affect the integrity of the extracellular matrix structure surrounding glycated collagen fibrils. Both effects are highly relevant for cell adhesion and migration within the tumour microenvironment.


Subject(s)
Collagen Type I/chemistry , Extracellular Matrix/chemistry , Ribosemonophosphates/chemistry , Animals , Collagen Type I/metabolism , Extracellular Matrix/metabolism , Glycosylation , Humans , Ribosemonophosphates/metabolism
17.
J Chem Inf Model ; 60(1): 56-62, 2020 01 27.
Article in English | MEDLINE | ID: mdl-31825609

ABSTRACT

The structured nature of chemical data means machine-learning models trained to predict protein-ligand binding risk overfitting the data, impairing their ability to generalize and make accurate predictions for novel candidate ligands. Data debiasing algorithms, which systematically partition the data to reduce bias and provide a more accurate metric of model performance, have the potential to address this issue. When models are trained using debiased data splits, the reward for simply memorizing the training data is reduced, suggesting that the ability of the model to make accurate predictions for novel candidate ligands will improve. To test this hypothesis, we use distance-based data splits to measure how well a model can generalize. We first confirm that models perform better for randomly split held-out sets than for distant held-out sets. We then debias the data and find, surprisingly, that debiasing typically reduces the ability of models to make accurate predictions for distant held-out test sets and that model performance measured after debiasing is not representative of the ability of a model to generalize. These results suggest that debiasing reduces the information available to a model, impairing its ability to generalize.


Subject(s)
Proteins/chemistry , Algorithms , Ligands , Models, Chemical , Protein Binding
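Note on record 17: the abstract contrasts randomly split held-out sets with distant held-out sets. A minimal sketch of one way to build a distant held-out set follows. It assumes each ligand is represented by a numeric feature vector and holds out the points farthest from the data centroid; this particular rule, the function name `distant_split` and the toy points are illustrative assumptions, not the procedure used in the paper.

```python
import math


def distant_split(points, holdout_fraction=0.2):
    """Split indices into (train, test), holding out the most distant points.

    'Distant' here means farthest from the centroid of all points, which is
    one simple assumed definition among many possible ones.
    """
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    # Indices ordered from nearest to farthest from the centroid.
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], centroid))
    n_hold = max(1, int(len(points) * holdout_fraction))
    train = sorted(order[:-n_hold])
    test = sorted(order[-n_hold:])
    return train, test


# Toy feature vectors (hypothetical): four near the origin, one outlier.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
train_idx, test_idx = distant_split(points, holdout_fraction=0.2)
print("train:", train_idx, "test:", test_idx)
```

Comparing a model's score on such a split with its score on a random split of the same size gives a rough indication of how much performance depends on test points resembling the training data.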
18.
Brief Bioinform ; 21(5): 1549-1567, 2020 09 25.
Article in English | MEDLINE | ID: mdl-31626279

ABSTRACT

Antibodies are proteins that recognize the molecular surfaces of potentially noxious molecules to mount an adaptive immune response or, in the case of autoimmune diseases, molecules that are part of healthy cells and tissues. Due to their binding versatility, antibodies are currently the largest class of biotherapeutics, with five monoclonal antibodies ranked in the top 10 blockbuster drugs. Computational advances in protein modelling and design can have a tangible impact on antibody-based therapeutic development. Antibody-specific computational protocols currently benefit from an increasing volume of data provided by next generation sequencing and application to related drug modalities based on traditional antibodies, such as nanobodies. Here we present a structured overview of available databases, methods and emerging trends in computational antibody analysis and contextualize them towards the engineering of candidate antibody therapeutics.


Subject(s)
Antibodies, Monoclonal/chemistry , Antibodies, Monoclonal/immunology , Antibodies, Monoclonal/therapeutic use , Computational Biology/methods , Databases, Protein , Molecular Docking Simulation , Protein Conformation
19.
J Comput Biol ; 27(8): 1219-1231, 2020 08.
Article in English | MEDLINE | ID: mdl-31874057

ABSTRACT

In many application domains, neural networks are highly accurate and have been deployed at large scale. However, users often do not have good tools for understanding how these models arrive at their predictions. This has hindered adoption in fields such as the life and medical sciences, where researchers require that models base their decisions on underlying biological phenomena rather than peculiarities of the dataset. We propose a set of methods for critiquing deep learning models and demonstrate their application for protein family classification, a task for which high-accuracy models have considerable potential impact. Our methods extend the Sufficient Input Subsets (SIS) technique, which we use to identify subsets of features in each protein sequence that are alone sufficient for classification. Our suite of tools analyzes these subsets to shed light on the decision-making criteria employed by models trained on this task. These tools show that while deep models may perform classification for biologically relevant reasons, their behavior varies considerably across the choice of network architecture and parameter initialization. While the techniques that we develop are specific to the protein sequence classification task, the approach taken generalizes to a broad set of scientific contexts in which model interpretability is essential.


Subject(s)
Computational Biology , Models, Biological , Multigene Family/genetics , Proteins/classification , Deep Learning , Humans , Machine Learning , Neural Networks, Computer , Proteins/genetics
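Note on record 19: the abstract describes identifying subsets of sequence features that are alone sufficient for classification. The sketch below shows a backward-selection procedure in that spirit, returning a single sufficient subset for one input. It is a simplified illustration and not the authors' tooling: the toy scoring function `toy_model`, the mask symbol "X" and the threshold are hypothetical, and any real use would substitute a trained classifier's confidence for the toy score.

```python
def back_select(f, x, mask_value):
    """Order positions by repeatedly masking the one whose removal hurts f least."""
    remaining = list(range(len(x)))
    removed = []
    while remaining:
        def score_without(i):
            kept = set(remaining) - {i}
            return f([x[j] if j in kept else mask_value for j in range(len(x))])
        best = max(remaining, key=score_without)
        remaining.remove(best)
        removed.append(best)
    return removed


def find_sufficient_subset(f, x, threshold, mask_value):
    """Return positions that alone keep f at or above threshold, or None."""
    if f(x) < threshold:
        return None
    subset = set()
    # Add positions back, most important (last removed) first.
    for i in reversed(back_select(f, x, mask_value)):
        subset.add(i)
        masked = [x[j] if j in subset else mask_value for j in range(len(x))]
        if f(masked) >= threshold:
            break
    return sorted(subset)


def toy_model(seq):
    """Hypothetical score: depends only on 'W' at index 3 and 'K' at index 4."""
    return 0.5 * (seq[3] == "W") + 0.5 * (seq[4] == "K")


sequence = list("ACDWKH")
sis = find_sufficient_subset(toy_model, sequence, threshold=0.9, mask_value="X")
print(sis)
```

With the toy score, the two motif positions are the only ones that matter, so they are the last to be masked and the first to be restored.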
20.
Phys Rev Lett ; 123(23): 238102, 2019 Dec 06.
Article in English | MEDLINE | ID: mdl-31868483

ABSTRACT

Collagen consists of three peptides twisted together through a periodic array of hydrogen bonds. Here we use this as inspiration to find design rules for programmed specific interactions for self-assembling synthetic collagenlike triple helices, starting from disordered configurations. The assembly generically nucleates defects in the triple helix, the characteristics of which can be manipulated by spatially varying the enthalpy of helix formation. Defect formation slows assembly, evoking kinetic pathologies that have been attributed to mutations in the primary collagen amino acid sequence. The controlled formation and interaction between defects gives a route for hierarchical self-assembly of bundles of twisted filaments.


Subject(s)
Collagen/chemistry , Models, Chemical , Amino Acid Sequence , Models, Molecular , Nanostructures/chemistry , Peptides/chemistry , Protein Conformation, alpha-Helical