Search | BVS Aleitamento Materno

1.

A highly conserved program of neuronal microexons is misregulated in autistic brains.

Irimia, Manuel; Weatheritt, Robert J; Ellis, Jonathan D; Parikshak, Neelroop N; Gonatopoulos-Pournatzis, Thomas; Babor, Mariana; Quesnel-Vallières, Mathieu; Tapial, Javier; Raj, Bushra; O'Hanlon, Dave; Barrios-Rodiles, Miriam; Sternberg, Michael J E; Cordes, Sabine P; Roth, Frederick P; Wrana, Jeffrey L; Geschwind, Daniel H; Blencowe, Benjamin J.

Cell ; 159(7): 1511-23, 2014 Dec 18.

Article in English | MEDLINE | ID: mdl-25525873

ABSTRACT

Alternative splicing (AS) generates vast transcriptomic and proteomic complexity. However, which of the myriad of detected AS events provide important biological functions is not well understood. Here, we define the largest program of functionally coordinated, neural-regulated AS described to date in mammals. Relative to all other types of AS within this program, 3-15 nucleotide "microexons" display the most striking evolutionary conservation and switch-like regulation. These microexons modulate the function of interaction domains of proteins involved in neurogenesis. Most neural microexons are regulated by the neuronal-specific splicing factor nSR100/SRRM4, through its binding to adjacent intronic enhancer motifs. Neural microexons are frequently misregulated in the brains of individuals with autism spectrum disorder, and this misregulation is associated with reduced levels of nSR100. The results thus reveal a highly conserved program of dynamic microexon regulation associated with the remodeling of protein-interaction networks during neurogenesis, the misregulation of which is linked to autism.

Subjects

Alternative Splicing; Child Development Disorders, Pervasive/pathology; Nerve Tissue Proteins/metabolism; Neurons/metabolism; Animals; Child Development Disorders, Pervasive/metabolism; Humans; Mice; Models, Molecular; Nerve Tissue Proteins/chemistry; Nerve Tissue Proteins/genetics; Neurogenesis; Protein Interaction Domains and Motifs; Sequence Analysis, RNA; Temporal Lobe/pathology

2.

3DLigandSite: structure-based prediction of protein-ligand binding sites.

McGreig, Jake E; Uri, Hannah; Antczak, Magdalena; Sternberg, Michael J E; Michaelis, Martin; Wass, Mark N.

Nucleic Acids Res ; 50(W1): W13-W20, 2022 07 05.

Article in English | MEDLINE | ID: mdl-35412635

ABSTRACT

3DLigandSite is a web tool for the prediction of ligand-binding sites in proteins. Here, we report a significant update since the first release of 3DLigandSite in 2010. The overall methodology remains the same, with candidate binding sites in proteins inferred using known binding sites in related protein structures as templates. However, the initial structural modelling step now uses the newly available structures from the AlphaFold database or alternatively Phyre2 when AlphaFold structures are not available. Further, a sequence-based search using HHSearch has been introduced to identify template structures with bound ligands that are used to infer the ligand-binding residues in the query protein. Finally, we introduced a machine learning element as the final prediction step, which improves the accuracy of predictions and provides a confidence score for each residue predicted to be part of a binding site. Validation of 3DLigandSite on a set of 6416 binding sites obtained 92% recall at 75% precision for non-metal binding sites and 52% recall at 75% precision for metal binding sites. 3DLigandSite is available at https://www.wass-michaelislab.org/3dligandsite. Users submit either a protein sequence or structure. Results are displayed in multiple formats including an interactive Mol* molecular visualization of the protein and the predicted binding sites.

Subjects

Databases, Protein; Proteins; Binding Sites; Ligands; Machine Learning; Protein Binding; Proteins/chemistry

3.

Genome3D: integrating a collaborative data pipeline to expand the depth and breadth of consensus protein structure annotation.

Sillitoe, Ian; Andreeva, Antonina; Blundell, Tom L; Buchan, Daniel W A; Finn, Robert D; Gough, Julian; Jones, David; Kelley, Lawrence A; Paysan-Lafosse, Typhaine; Lam, Su Datt; Murzin, Alexey G; Pandurangan, Arun Prasad; Salazar, Gustavo A; Skwark, Marcin J; Sternberg, Michael J E; Velankar, Sameer; Orengo, Christine.

Nucleic Acids Res ; 48(D1): D314-D319, 2020 01 08.

Article in English | MEDLINE | ID: mdl-31733063

ABSTRACT

Genome3D (https://www.genome3d.eu) is a freely available resource that provides consensus structural annotations for representative protein sequences taken from a selection of model organisms. Since the last NAR update in 2015, the method of data submission has been overhauled, with annotations now being 'pushed' to the database via an API. As a result, contributing groups are now able to manage their own structural annotations, making the resource more flexible and maintainable. The new submission protocol brings a number of additional benefits, including instant validation of data and avoiding the requirement to synchronise releases between resources. It also makes it possible to implement the submission of these structural annotations as an automated part of existing internal workflows. In turn, these improvements facilitate Genome3D being opened up to new prediction algorithms and groups. For the latest release of Genome3D (v2.1), the underlying dataset of sequences used as prediction targets has been updated using the latest reference proteomes available in UniProtKB. A number of new reference proteomes of particular interest to the wider scientific community have also been added: cow, pig, wheat and Mycobacterium tuberculosis. These additions, along with improvements to the underlying predictions from contributing resources, have ensured that the number of annotations in Genome3D has nearly doubled since the last NAR update article. The new API has also been used to facilitate the dissemination of Genome3D data into InterPro, thereby widening the visibility of both the annotation data and annotation algorithms.

Subjects

Proteins/chemistry; Databases, Protein; Proteins/classification; Proteins/genetics; User-Computer Interface

4.

Missense3D-DB web catalogue: an atom-based analysis and repository of 4M human protein-coding genetic variants.

Khanna, Tarun; Hanna, Gordon; Sternberg, Michael J E; David, Alessia.

Hum Genet ; 140(5): 805-812, 2021 May.

Article in English | MEDLINE | ID: mdl-33502607

ABSTRACT

The interpretation of human genetic variation is one of the greatest challenges of modern genetics. New approaches are urgently needed to prioritize variants, especially those that are rare or lack a definitive clinical interpretation. We examined 10,136,597 human missense genetic variants from GnomAD, ClinVar and UniProt. We were able to perform large-scale atom-based mapping and phenotype interpretation of 3,960,015 of these variants onto 18,874 experimental and 84,818 in-house predicted three-dimensional coordinates of the human proteome. We demonstrate that 14% of amino acid substitutions from the GnomAD database that could be structurally analysed are predicted to affect protein structure (n = 568,548, of which 566,439 are rare or extremely rare) and may, therefore, have a yet unknown disease-causing effect. The same is true for 19.0% (n = 6266) of variants of unknown clinical significance or conflicting interpretation reported in the ClinVar database. The results of the structural analysis are available in the dedicated web catalogue Missense3D-DB (http://missense3d.bc.ic.ac.uk/). For each of the 4M variants, the results of the structural analysis are presented in a friendly, concise format that can be included in clinical genetic reports. A detailed report of the structural analysis is also available for non-experts in structural biology. Population frequency and predictions from SIFT and PolyPhen are included for a more comprehensive variant interpretation. This is the first large-scale atom-based structural interpretation of human genetic variation and offers geneticists and the biomedical community a new approach to genetic variant interpretation.

Subjects

Chromosome Mapping/methods; Computational Biology/methods; Databases, Genetic; Mutation, Missense/genetics; Amino Acid Substitution/genetics; Gene Frequency/genetics; Humans; Protein Conformation; Proteome/genetics

5.

Application of docking methodologies to modeled proteins.

Singh, Amar; Dauzhenka, Taras; Kundrotas, Petras J; Sternberg, Michael J E; Vakser, Ilya A.

Proteins ; 88(9): 1180-1188, 2020 09.

Article in English | MEDLINE | ID: mdl-32170770

ABSTRACT

Protein docking is essential for structural characterization of protein interactions. Besides providing the structure of protein complexes, modeling of proteins and their complexes is important for understanding the fundamental principles and specific aspects of protein interactions. The accuracy of protein modeling, in general, is still less than that of the experimental approaches. Thus, it is important to investigate the applicability of docking techniques to modeled proteins. We present new comprehensive benchmark sets of protein models for the development and validation of protein docking, as well as a systematic assessment of free and template-based docking techniques on these sets. As opposed to previous studies, the benchmark sets reflect the real-case modeling/docking scenario where the accuracy of the models is assessed by the modeling procedure, without reference to the native structure (which would be unknown in practical applications). We also expanded the analysis to include docking of protein pairs where proteins have different structural accuracy. The results show that, in general, the template-based docking is less sensitive to the structural inaccuracies of the models than the free docking. The near-native docking poses generated by the template-based approach, typically, also have higher ranks than those produced by the free docking (although the free docking is indispensable in modeling the multiplicity of protein interactions in a crowded cellular environment). The results show that docking techniques are applicable to protein models in a broad range of modeling accuracy. The study provides clear guidelines for practical applications of docking to protein models.

Subjects

Benchmarking/statistics & numerical data; Molecular Docking Simulation; Proteins/chemistry; Software; Amino Acid Sequence; Binding Sites; Databases, Protein; Protein Binding; Protein Structure, Secondary

6.

Phylotranscriptomic Insights into the Diversification of Endothermic Thunnus Tunas.

Ciezarek, Adam G; Osborne, Owen G; Shipley, Oliver N; Brooks, Edward J; Tracey, Sean R; McAllister, Jaime D; Gardner, Luke D; Sternberg, Michael J E; Block, Barbara; Savolainen, Vincent.

Mol Biol Evol ; 36(1): 84-96, 2019 01 01.

Article in English | MEDLINE | ID: mdl-30364966

ABSTRACT

Birds, mammals, and certain fishes, including tunas, opahs and lamnid sharks, are endothermic, conserving internally generated, metabolic heat to maintain body or tissue temperatures above that of the environment. Bluefin tunas are commercially important fishes worldwide, and some populations are threatened. They are renowned for their endothermy, maintaining elevated temperatures of the oxidative locomotor muscle, viscera, brain and eyes, and occupying cold, productive high-latitude waters. Less cold-tolerant tunas, such as yellowfin tuna, by contrast, remain in warm-temperate to tropical waters year-round, reproducing more rapidly than most temperate bluefin tuna populations, providing resiliency in the face of large-scale industrial fisheries. Despite the importance of these traits to not only fisheries but also habitat utilization and responses to climate change, little is known of the genetic processes underlying the diversification of tunas. In collecting and analyzing sequence data across 29,556 genes, we found that parallel selection on standing genetic variation is associated with the evolution of endothermy in bluefin tunas. This includes two shared substitutions in genes encoding glycerol-3-phosphate dehydrogenase, an enzyme that contributes to thermogenesis in bumblebees and mammals, as well as four genes involved in the Krebs cycle, oxidative phosphorylation, β-oxidation, and superoxide removal. Using phylogenetic techniques, we further illustrate that the eight Thunnus species are genetically distinct, but found evidence of mitochondrial genome introgression across two species. Phylogeny-based metrics highlight conservation needs for some of these species.

Subjects

Biological Evolution; Thermogenesis/genetics; Tuna/genetics; Animals; Endangered Species; Genome, Mitochondrial; Hybridization, Genetic; Mutation; Selection, Genetic; Tuna/metabolism

7.

Identification of disease-associated loci using machine learning for genotype and network data integration.

Leal, Luis G; David, Alessia; Jarvelin, Marjo-Riita; Sebert, Sylvain; Männikkö, Minna; Karhunen, Ville; Seaby, Eleanor; Hoggart, Clive; Sternberg, Michael J E.

Bioinformatics ; 35(24): 5182-5190, 2019 12 15.

Article in English | MEDLINE | ID: mdl-31070705

ABSTRACT

MOTIVATION: Integration of different omics data could markedly help to identify biological signatures, understand the missing heritability of complex diseases and ultimately achieve personalized medicine. Standard regression models used in Genome-Wide Association Studies (GWAS) identify loci with a strong effect size, whereas GWAS meta-analyses are often needed to capture weak loci contributing to the missing heritability. Development of novel machine learning algorithms for merging genotype data with other omics data is highly needed as it could enhance the prioritization of weak loci. RESULTS: We developed cNMTF (corrected non-negative matrix tri-factorization), an integrative algorithm based on clustering techniques of biological data. This method assesses the inter-relatedness between genotypes, phenotypes, the damaging effect of the variants and gene networks in order to identify loci-trait associations. cNMTF was used to prioritize genes associated with lipid traits in two population cohorts. We replicated 129 genes reported in GWAS world-wide and provided evidence that supports 85% of our findings (226 out of 265 genes), including recent associations in literature (NLGN1), regulators of lipid metabolism (DAB1) and pleiotropic genes for lipid traits (CARM1). Moreover, cNMTF performed efficiently against strong population structures by accounting for the individuals' ancestry. As the method is flexible in the incorporation of diverse omics data sources, it can be easily adapted to the user's research needs. AVAILABILITY AND IMPLEMENTATION: An R package (cnmtf) is available at https://lgl15.github.io/cnmtf_web/index.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Genome-Wide Association Study; Machine Learning; Gene Regulatory Networks; Genotype; Humans; Phenotype; Polymorphism, Single Nucleotide

8.

PhenoRank: reducing study bias in gene prioritization through simulation.

Cornish, Alex J; David, Alessia; Sternberg, Michael J E.

Bioinformatics ; 34(12): 2087-2095, 2018 06 15.

Article in English | MEDLINE | ID: mdl-29360927

ABSTRACT

Motivation: Genome-wide association studies have identified thousands of loci associated with human disease, but identifying the causal genes at these loci is often difficult. Several methods prioritize genes most likely to be disease causing through the integration of biological data, including protein-protein interaction and phenotypic data. Data availability is not the same for all genes, however, potentially influencing the performance of these methods. Results: We demonstrate that whilst disease genes tend to be associated with greater numbers of data, this may be at least partially a result of them being better studied. With this observation we develop PhenoRank, which prioritizes disease genes whilst avoiding being biased towards genes with more available data. Bias is avoided by comparing gene scores generated for the query disease against gene scores generated using simulated sets of phenotype terms, which ensures that differences in data availability do not affect the ranking of genes. We demonstrate that whilst existing prioritization methods are biased by data availability, PhenoRank is not similarly biased. Avoiding this bias allows PhenoRank to effectively prioritize genes with fewer available data and improves its overall performance. PhenoRank outperforms three available prioritization methods in cross-validation (PhenoRank area under receiver operating characteristic curve [AUC] = 0.89, DADA AUC = 0.87, EXOMISER AUC = 0.71, PRINCE AUC = 0.83, P < 2.2 × 10^-16). Availability and implementation: PhenoRank is freely available for download at https://github.com/alexjcornish/PhenoRank. Supplementary information: Supplementary data are available at Bioinformatics online.

Subjects

Genetic Predisposition to Disease; Genome-Wide Association Study/methods; Polymorphism, Genetic; Software; Animals; Bias; Humans; Mice; Phenotype; Protein Interaction Maps; ROC Curve
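
The simulation-based correction described in the PhenoRank abstract above can be illustrated with an empirical p-value: a gene's score for the query disease is compared against scores obtained for simulated phenotype sets. The following is a minimal sketch under stated assumptions, not PhenoRank's implementation; the function name, gene labels and all scores are hypothetical.

```python
def empirical_p_value(observed, simulated):
    """Share of simulated scores at least as large as the observed score.

    The +1 in numerator and denominator is a common correction that
    keeps the estimate strictly above zero.
    """
    exceed = sum(1 for s in simulated if s >= observed)
    return (exceed + 1) / (len(simulated) + 1)


# Hypothetical scores: the well-studied gene scores highly for almost any
# phenotype set, so its high observed score is not informative.
genes = {
    "well_studied_gene": (0.90, [0.85, 0.92, 0.88, 0.95, 0.91, 0.80, 0.93, 0.89, 0.94]),
    "poorly_studied_gene": (0.40, [0.05, 0.10, 0.02, 0.08, 0.12, 0.03, 0.07, 0.09, 0.04]),
}

p_values = {name: empirical_p_value(obs, sim) for name, (obs, sim) in genes.items()}

# Rank by empirical p-value (smaller is better) rather than by raw score.
ranking = sorted(p_values, key=p_values.get)
print(p_values)
print(ranking)
```

In this toy case the raw scores would rank the well-studied gene first, whereas the empirical p-values favour the gene whose score is unusual relative to its own simulated background.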

9.

k-SLAM: accurate and ultra-fast taxonomic classification and gene identification for large metagenomic data sets.

Ainsworth, David; Sternberg, Michael J E; Raczy, Come; Butcher, Sarah A.

Nucleic Acids Res ; 45(4): 1649-1656, 2017 02 28.

Article in English | MEDLINE | ID: mdl-27965413

ABSTRACT

k-SLAM is a highly efficient algorithm for the characterization of metagenomic data. Unlike other ultra-fast metagenomic classifiers, full sequence alignment is performed allowing for gene identification and variant calling in addition to accurate taxonomic classification. A k-mer based method provides greater taxonomic accuracy than other classifiers and a three orders of magnitude speed increase over alignment based approaches. The use of alignments to find variants and genes along with their taxonomic origins enables novel strains to be characterized. k-SLAM's speed allows a full taxonomic classification and gene identification to be tractable on modern large data sets. A pseudo-assembly method is used to increase classification accuracy by up to 40% for species which have high sequence homology within their genus.

Subjects

Computational Biology/methods; DNA Barcoding, Taxonomic/methods; Metagenome; Metagenomics/methods; Algorithms; Case-Control Studies; Computational Biology/standards; DNA Barcoding, Taxonomic/standards; Gastrointestinal Microbiome; Genome, Bacterial; Humans; Liver Cirrhosis/microbiology; Metagenomics/standards; Reproducibility of Results; Shiga-Toxigenic Escherichia coli/classification; Shiga-Toxigenic Escherichia coli/genetics
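
The k-SLAM abstract above refers to a k-mer based method. As a generic illustration of why k-mers make a fast first screen before alignment, the sketch below scores a read against candidate references by the fraction of shared k-mers. It is an assumption-laden toy, not k-SLAM's algorithm (which, per the abstract, goes on to perform full sequence alignment); the sequences, taxon labels and function names are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(read, reference, k):
    """Fraction of the read's distinct k-mers that also occur in the reference."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return 0.0
    return len(read_kmers & kmers(reference, k)) / len(read_kmers)


# Hypothetical reference sequences and read.
references = {
    "taxon_A": "ACGTACGTGGA",
    "taxon_B": "TTTTGGGGCCCC",
}
read = "CGTACGTG"
k = 4

scores = {name: shared_kmer_fraction(read, ref, k) for name, ref in references.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Set intersection on k-mers avoids comparing the read to every position of every reference, which is the usual reason such screens are much faster than alignment alone.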

10.

Properties of human genes guided by their enrichment in rare and common variants.

Alhuzimi, Eman; Leal, Luis G; Sternberg, Michael J E; David, Alessia.

Hum Mutat ; 39(3): 365-370, 2018 03.

Article in English | MEDLINE | ID: mdl-29197136

ABSTRACT

We analyzed 563,099 common (minor allele frequency, MAF ≥ 0.01) and rare (MAF < 0.01) genetic variants annotated in ExAC and UniProt and 26,884 disease-causing variants from ClinVar and UniProt occurring in the coding region of 17,975 human protein-coding genes. Three novel sets of genes were identified: those enriched in rare variants (n = 32 genes), in common variants (n = 282 genes), and in disease-causing variants (n = 800 genes). Genes enriched in rare variants have far greater similarities in terms of biological and network properties to genes enriched in disease-causing variants, than to genes enriched in common variants. However, in half of the genes enriched in rare variants (AOC2, MAMDC4, ANKHD1, CDC42BPB, SPAG5, TRRAP, TANC2, IQCH, USP54, SRRM2, DOPEY2, and PITPNM1), no disease-causing variants have been identified in major, publicly available databases. Thus, genetic variants in these genes are strong candidates for disease and their identification, as part of sequencing studies, should prompt further in vitro analyses.

Subjects

Genes; Genetic Variation; Disease/genetics; Genes, Essential; Humans; Mutation/genetics
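
The common/rare split used in the abstract above (MAF ≥ 0.01 versus MAF < 0.01) can be written as a one-line rule. The sketch below applies it to a handful of made-up variants and tallies them per gene; the variant records and gene labels are hypothetical, and the enrichment statistics used in the paper are not reproduced here.

```python
from collections import Counter

MAF_THRESHOLD = 0.01  # cut-off stated in the abstract


def classify_variant(maf):
    """Label a variant 'common' if MAF >= 0.01, otherwise 'rare'."""
    return "common" if maf >= MAF_THRESHOLD else "rare"


# Hypothetical (gene, minor allele frequency) records.
variants = [
    ("GENE1", 0.25),
    ("GENE1", 0.0004),
    ("GENE2", 0.01),
    ("GENE2", 0.0099),
    ("GENE2", 0.00001),
]

counts = Counter((gene, classify_variant(maf)) for gene, maf in variants)
print(dict(counts))
```

Note that the boundary value 0.01 itself falls in the common class under this rule.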

11.

Landscape of Pleiotropic Proteins Causing Human Disease: Structural and System Biology Insights.

Ittisoponpisan, Sirawit; Alhuzimi, Eman; Sternberg, Michael J E; David, Alessia.

Hum Mutat ; 38(3): 289-296, 2017 03.

Article in English | MEDLINE | ID: mdl-27957775

ABSTRACT

Pleiotropy is the phenomenon by which the same gene can result in multiple phenotypes. Pleiotropic proteins are emerging as important contributors to rare and common disorders. Nevertheless, little is known about the mechanisms underlying pleiotropy and the characteristics of pleiotropic proteins. We analyzed disease-causing proteins reported in UniProt and observed that 12% are pleiotropic (variants in the same protein cause more than one disease). Pleiotropic proteins were enriched in deleterious and rare variants, but not in common variants. Pleiotropic proteins were more likely to be involved in the pathogenesis of neoplasms, neurological, and circulatory diseases and congenital malformations, whereas non-pleiotropic proteins were more likely to be involved in endocrine and metabolic disorders. Pleiotropic proteins were more essential and had a higher number of interacting partners compared with non-pleiotropic proteins. Significantly more pleiotropic than non-pleiotropic proteins contained at least one long intrinsically disordered region (P < 0.001). Deleterious variants occurring in structurally disordered regions were more commonly found in pleiotropic, rather than non-pleiotropic proteins. In conclusion, pleiotropic proteins are an important contributor to human disease. They represent a biologically different class of proteins compared with non-pleiotropic proteins and a better understanding of their characteristics and genetic variants can greatly aid in the interpretation of genetic studies and drug design.

Subjects

Genetic Association Studies; Genetic Pleiotropy; Genetic Predisposition to Disease; Computational Biology; Databases, Genetic; Homeodomain Proteins/chemistry; Homeodomain Proteins/genetics; Homeodomain Proteins/metabolism; Humans; Intrinsically Disordered Proteins/chemistry; Intrinsically Disordered Proteins/genetics; Intrinsically Disordered Proteins/metabolism; Models, Molecular; Odds Ratio; Protein Binding; Protein Conformation; Proteins/chemistry; Proteins/genetics; Proteins/metabolism; Signal Transduction; Structure-Activity Relationship; Systems Biology/methods; Vinculin/chemistry; Vinculin/genetics; Vinculin/metabolism

12.

Genome3D: exploiting structure to help users understand their sequences.

Lewis, Tony E; Sillitoe, Ian; Andreeva, Antonina; Blundell, Tom L; Buchan, Daniel W A; Chothia, Cyrus; Cozzetto, Domenico; Dana, José M; Filippis, Ioannis; Gough, Julian; Jones, David T; Kelley, Lawrence A; Kleywegt, Gerard J; Minneci, Federico; Mistry, Jaina; Murzin, Alexey G; Ochoa-Montaño, Bernardo; Oates, Matt E; Punta, Marco; Rackham, Owen J L; Stahlhacke, Jonathan; Sternberg, Michael J E; Velankar, Sameer; Orengo, Christine.

Nucleic Acids Res ; 43(Database issue): D382-6, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25348407

ABSTRACT

Genome3D (http://www.genome3d.eu) is a collaborative resource that provides predicted domain annotations and structural models for key sequences. Since introducing Genome3D in a previous NAR paper, we have substantially extended and improved the resource. We have annotated representatives from Pfam families to improve coverage of diverse sequences and added a fast sequence search to the website to allow users to find Genome3D-annotated sequences similar to their own. We have improved and extended the Genome3D data, enlarging the source data set from three model organisms to 10, and adding VIVACE, a resource new to Genome3D. We have analysed and updated Genome3D's SCOP/CATH mapping. Finally, we have improved the superposition tools, which now give users a more powerful interface for investigating similarities and differences between structural models.

Subjects

Databases, Protein; Molecular Sequence Annotation; Protein Structure, Tertiary; Algorithms; Genomics; Internet; Models, Molecular; Protein Structure, Tertiary/genetics; Sequence Analysis, Protein

13.

AlloPred: prediction of allosteric pockets on proteins using normal mode perturbation analysis.

Greener, Joe G; Sternberg, Michael J E.

BMC Bioinformatics ; 16: 335, 2015 Oct 23.

Article in English | MEDLINE | ID: mdl-26493317

ABSTRACT

BACKGROUND: Despite being hugely important in biological processes, allostery is poorly understood and no universal mechanism has been discovered. Allosteric drugs are a largely unexplored prospect with many potential advantages over orthosteric drugs. Computational methods to predict allosteric sites on proteins are needed to aid the discovery of allosteric drugs, as well as to advance our fundamental understanding of allostery. RESULTS: AlloPred, a novel method to predict allosteric pockets on proteins, was developed. AlloPred uses perturbation of normal modes alongside pocket descriptors in a machine learning approach that ranks the pockets on a protein. AlloPred ranked an allosteric pocket top for 23 out of 40 known allosteric proteins, showing comparable and complementary performance to two existing methods. In 28 of 40 cases an allosteric pocket was ranked first or second. The AlloPred web server, freely available at http://www.sbg.bio.ic.ac.uk/allopred/home, allows visualisation and analysis of predictions. The source code and dataset information are also available from this site. CONCLUSIONS: Perturbation of normal modes can enhance our ability to predict allosteric sites on proteins. Computational methods such as AlloPred assist drug discovery efforts by suggesting sites on proteins for further experimental study.

Subjects

Proteins/metabolism; Algorithms; Allosteric Site; Ligands; Protein Conformation

14.

Modeling protein interactions and complexes in CAPRI: Seventh CAPRI evaluation meeting, April 3-5 EMBL-EBI, Hinxton, UK.

Wodak, Shoshana J; Velankar, Sameer; Sternberg, Michael J E.

Proteins ; 88(8): 913-915, 2020 08.

Article in English | MEDLINE | ID: mdl-32077154

Subjects

Molecular Docking Simulation; Oligosaccharides/chemistry; Peptides/chemistry; Proteins/chemistry; Software; Humans; Oligosaccharides/metabolism; Peptides/metabolism; Protein Interaction Domains and Motifs; Protein Interaction Mapping; Proteins/metabolism

15.

Genome3D: a UK collaborative project to annotate genomic sequences with predicted 3D structures based on SCOP and CATH domains.

Lewis, Tony E; Sillitoe, Ian; Andreeva, Antonina; Blundell, Tom L; Buchan, Daniel W A; Chothia, Cyrus; Cuff, Alison; Dana, Jose M; Filippis, Ioannis; Gough, Julian; Hunter, Sarah; Jones, David T; Kelley, Lawrence A; Kleywegt, Gerard J; Minneci, Federico; Mitchell, Alex; Murzin, Alexey G; Ochoa-Montaño, Bernardo; Rackham, Owen J L; Smith, James; Sternberg, Michael J E; Velankar, Sameer; Yeats, Corin; Orengo, Christine.

Nucleic Acids Res ; 41(Database issue): D499-507, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23203986

ABSTRACT

Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker's yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs).

Subjects

Databases, Protein; Protein Structure, Tertiary; Genomics; Humans; Internet; Molecular Sequence Annotation; Proteins/chemistry; Proteins/classification; Proteins/genetics; Software

16.

Proteomic analysis of the Plasmodium male gamete reveals the key role for glycolysis in flagellar motility.

Talman, Arthur M; Prieto, Judith H; Marques, Sara; Ubaida-Mohien, Ceereena; Lawniczak, Mara; Wass, Mark N; Xu, Tao; Frank, Roland; Ecker, Andrea; Stanway, Rebecca S; Krishna, Sanjeev; Sternberg, Michael J E; Christophides, Georges K; Graham, David R; Dinglasan, Rhoel R; Yates, John R; Sinden, Robert E.

Malar J ; 13: 315, 2014 Aug 13.

Article in English | MEDLINE | ID: mdl-25124718

ABSTRACT

BACKGROUND: Gametogenesis and fertilization play crucial roles in malaria transmission. While male gametes are thought to be amongst the simplest eukaryotic cells and are proven targets of transmission blocking immunity, little is known about their molecular organization. For example, the pathway of energy metabolism that powers motility, a feature that facilitates gamete encounter and fertilization, is unknown. METHODS: Plasmodium berghei microgametes were purified and analysed by whole-cell proteomic analysis for the first time. Data are available via ProteomeXchange with identifier PXD001163. RESULTS: 615 proteins were recovered; they included all male gamete proteins described thus far. Amongst them were the 11 enzymes of the glycolytic pathway. The hexose transporter was localized to the gamete plasma membrane and it was shown that microgamete motility can be suppressed effectively by inhibitors of this transporter and of the glycolytic pathway. CONCLUSIONS: This study describes the first whole-cell proteomic analysis of the malaria male gamete. It identifies glycolysis as the likely exclusive source of energy for flagellar beat, and provides new insights into original features of Plasmodium flagellar organization.

Subjects

Energy Metabolism; Flagella/physiology; Germ Cells/chemistry; Glycolysis; Plasmodium berghei/chemistry; Plasmodium berghei/physiology; Proteome/analysis; Animals; Female; Locomotion; Male; Mice

17.

CombFunc: predicting protein function using heterogeneous data sources.

Wass, Mark N; Barton, Geraint; Sternberg, Michael J E.

Nucleic Acids Res ; 40(Web Server issue): W466-70, 2012 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-22641853

RESUMO

Only a small fraction of known proteins have been functionally characterized, making protein function prediction essential to propose annotations for uncharacterized proteins. In recent years many function prediction methods have been developed using various sources of biological data from protein sequence and structure to gene expression data. Here we present the CombFunc web server, which makes Gene Ontology (GO)-based protein function predictions. CombFunc incorporates ConFunc, our existing function prediction method, with other approaches for function prediction that use protein sequence, gene expression and protein-protein interaction data. In benchmarking on a set of 1686 proteins CombFunc obtains precision and recall of 0.71 and 0.64 respectively for gene ontology molecular function terms. For biological process GO terms precision of 0.74 and recall of 0.41 is obtained. CombFunc is available at http://www.sbg.bio.ic.ac.uk/combfunc.

Subjects

Proteins/physiology , Software , Algorithms , Gene Expression , Internet , Protein Interaction Maps , Proteins/chemistry , Proteins/genetics , Sequence Analysis, Protein

18.

Missense3D-TM: Predicting the Effect of Missense Variants in Helical Transmembrane Protein Regions Using 3D Protein Structures.

Hanna, Gordon; Khanna, Tarun; Islam, Suhail A; David, Alessia; Sternberg, Michael J E.

J Mol Biol ; 436(2): 168374, 2024 Jan 15.

Article in English | MEDLINE | ID: mdl-38182301

ABSTRACT

Variant effect predictors assess whether a substitution is pathogenic or benign. Most predictors, including those that are structure-based, are designed for globular proteins in aqueous environments and do not consider that the variant residue is located within the membrane. We report Missense3D-TM, which provides a structure-based assessment of the impact of a missense variant located within a membrane. On a dataset of 2,078 pathogenic and 1,060 benign variants, spanning 711 proteins from 706 structures, Missense3D-TM achieved an accuracy of 66%, Matthews correlation coefficient of 0.37, sensitivity of 58% and specificity of 81%. Missense3D-TM performed similarly to mCSM-membrane: accuracy 66% vs 61% (p = 0.02) on an unbalanced test set and 70% vs 67% (p = 0.20) on a balanced test set. The Missense3D-TM website provides an analysis of the structural effects of the variant along with its predicted position within the membrane. The web server is available at http://missense3d.bc.ic.ac.uk/.

Subjects

Membrane Proteins , Mutation, Missense , Protein Domains , Imaging, Three-Dimensional , Datasets as Topic , Membrane Proteins/chemistry , Membrane Proteins/genetics

19.

Functional significance of mutations in the Snf2 domain of ATRX.

Mitson, Matthew; Kelley, Lawrence A; Sternberg, Michael J E; Higgs, Douglas R; Gibbons, Richard J.

Hum Mol Genet ; 20(13): 2603-10, 2011 Jul 01.

Article in English | MEDLINE | ID: mdl-21505078

ABSTRACT

ATRX is a member of the Snf2 family of chromatin-remodelling proteins and is mutated in an X-linked mental retardation syndrome associated with alpha-thalassaemia (ATR-X syndrome). We have carried out an analysis of 21 disease-causing mutations within the Snf2 domain of ATRX by quantifying the expression of the ATRX protein and placing all missense mutations in their structural context by homology modelling. While demonstrating the importance of protein dosage to the development of ATR-X syndrome, we also identified three mutations which primarily affect function rather than protein structure. We show that all three of these mutant proteins are defective in translocating along DNA while one mutant, uniquely for a human disease-causing mutation, partially uncouples adenosine triphosphate (ATP) hydrolysis from DNA binding. Our results highlight important mechanistic aspects in the development of ATR-X syndrome and identify crucial functional residues within the Snf2 domain of ATRX. These findings are important for furthering our understanding of how ATP hydrolysis is harnessed as useful work in chromatin remodelling proteins and the wider family of nucleic acid translocating motors.

Subjects

DNA Helicases/genetics , DNA Helicases/metabolism , Mutation/genetics , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Ubiquitin-Protein Ligases/genetics , Amino Acid Sequence , Animals , Cell Line , DNA Helicases/chemistry , Enzyme Activation/physiology , Humans , Insects , X-Linked Intellectual Disability/enzymology , X-Linked Intellectual Disability/genetics , Models, Molecular , Molecular Sequence Data , Nuclear Proteins/chemistry , Protein Conformation , Protein Stability , Sequence Alignment , Translocation, Genetic/genetics , Ubiquitin-Protein Ligases/chemistry , X-linked Nuclear Protein , alpha-Thalassemia/enzymology , alpha-Thalassemia/genetics

20.

PINALOG: a novel approach to align protein interaction networks--implications for complex detection and function prediction.

Phan, Hang T T; Sternberg, Michael J E.

Bioinformatics ; 28(9): 1239-45, 2012 May 01.

Article in English | MEDLINE | ID: mdl-22419782

ABSTRACT

MOTIVATION: Analysis of protein-protein interaction networks (PPINs) at the system level has become increasingly important in understanding biological processes. Comparison of the interactomes of different species not only provides a better understanding of species evolution but also helps with detecting conserved functional components and in function prediction. METHOD AND RESULTS: Here we report a PPIN alignment method, called PINALOG, which combines information from protein sequence, function and network topology. Alignment of human and yeast PPINs reveals several conserved subnetworks between them that participate in similar biological processes, notably the proteasome and transcription-related processes. PINALOG has been tested for its power in protein complex prediction as well as function prediction. Comparison with PSI-BLAST in predicting protein function in the twilight zone also shows that PINALOG is valuable in predicting protein function. AVAILABILITY AND IMPLEMENTATION: The PINALOG web-server is freely available from http://www.sbg.bio.ic.ac.uk/~pinalog. The PINALOG program and associated data are available from the Download section of the web-server. CONTACT: m.sternberg@imperial.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Algorithms , Protein Interaction Maps , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Animals , Humans , Proteins/chemistry , Proteins/metabolism , Yeasts/genetics
