Search | BVS Infectious and Parasitic Diseases

1.

From structure to systems: high-resolution, quantitative genetic analysis of RNA polymerase II.

Braberg, Hannes; Jin, Huiyan; Moehle, Erica A; Chan, Yujia A; Wang, Shuyi; Shales, Michael; Benschop, Joris J; Morris, John H; Qiu, Chenxi; Hu, Fuqu; Tang, Leung K; Fraser, James S; Holstege, Frank C P; Hieter, Philip; Guthrie, Christine; Kaplan, Craig D; Krogan, Nevan J.

Cell ; 154(4): 775-88, 2013 Aug 15.

Article in English | MEDLINE | ID: mdl-23932120

ABSTRACT

RNA polymerase II (RNAPII) lies at the core of dynamic control of gene expression. Using 53 RNAPII point mutants, we generated a point mutant epistatic miniarray profile (pE-MAP) comprising ~60,000 quantitative genetic interactions in Saccharomyces cerevisiae. This analysis enabled functional assignment of RNAPII subdomains and uncovered connections between individual regions and other protein complexes. Using splicing microarrays and mutants that alter elongation rates in vitro, we found an inverse relationship between RNAPII speed and in vivo splicing efficiency. Furthermore, the pE-MAP classified fast and slow mutants that favor upstream and downstream start site selection, respectively. The striking coordination of polymerization rate with transcription initiation and splicing suggests that transcription rate is tuned to regulate multiple gene expression steps. The pE-MAP approach provides a powerful strategy to understand other multifunctional machines at amino acid resolution.

Subjects

Epistasis, Genetic , RNA Polymerase II/genetics , RNA Polymerase II/metabolism , Saccharomyces cerevisiae/enzymology , Saccharomyces cerevisiae/genetics , Alleles , Genome-Wide Association Study , Point Mutation , RNA Polymerase II/chemistry , RNA Splicing , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Transcription Factors/metabolism , Transcription Initiation Site , Transcription, Genetic , Transcriptome

2.

The scalable precision medicine open knowledge engine (SPOKE): a massive knowledge graph of biomedical information.

Morris, John H; Soman, Karthik; Akbas, Rabia E; Zhou, Xiaoyuan; Smith, Brett; Meng, Elaine C; Huang, Conrad C; Cerono, Gabriel; Schenk, Gundolf; Rizk-Jackson, Angela; Harroud, Adil; Sanders, Lauren; Costes, Sylvain V; Bharat, Krish; Chakraborty, Arjun; Pico, Alexander R; Mardirossian, Taline; Keiser, Michael; Tang, Alice; Hardi, Josef; Shi, Yongmei; Musen, Mark; Israni, Sharat; Huang, Sui; Rose, Peter W; Nelson, Charlotte A; Baranzini, Sergio E.

Bioinformatics ; 39(2), 2023 Feb 03.

Article in English | MEDLINE | ID: mdl-36759942

ABSTRACT

MOTIVATION: Knowledge graphs (KGs) are being adopted in industry, commerce and academia. Biomedical KGs present a challenge due to the complexity, size and heterogeneity of the underlying information. RESULTS: In this work, we present the Scalable Precision Medicine Open Knowledge Engine (SPOKE), a biomedical KG connecting millions of concepts via semantically meaningful relationships. SPOKE contains 27 million nodes of 21 different types and 53 million edges of 55 types downloaded from 41 databases. The graph is built on the framework of 11 ontologies that maintain its structure, enable mappings and facilitate navigation. SPOKE is built weekly by Python scripts which download each resource, check for integrity and completeness, and then create a 'parent table' of nodes and edges. Graph queries are translated by a REST API and users can submit searches directly via an API or a graphical user interface. CONCLUSIONS/SIGNIFICANCE: SPOKE enables the integration of seemingly disparate information to support precision medicine efforts. AVAILABILITY AND IMPLEMENTATION: The SPOKE neighborhood explorer is available at https://spoke.rbvi.ucsf.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
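
The build step described in the abstract (collect typed nodes and edges from several sources, check them, then assemble a 'parent table') can be illustrated with a small sketch. This is not SPOKE's code: the `Node` and `Edge` classes, the `build_parent_table` function, and the integrity rule used here (an edge is kept only if both endpoints are known nodes) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    identifier: str  # hypothetical accession, e.g. "G1"
    node_type: str   # hypothetical type label, e.g. "Gene"
    source: str      # name of the resource the record came from


@dataclass(frozen=True)
class Edge:
    start: str       # identifier of the start node
    end: str         # identifier of the end node
    edge_type: str   # hypothetical relationship label
    source: str


def build_parent_table(nodes, edges):
    """Merge per-source records and reject edges with unknown endpoints.

    Assumed integrity rule: an edge is valid only when both of its
    endpoints appear in the merged node table.
    """
    node_table = {}
    for node in nodes:
        # Keep the first record seen for each identifier.
        node_table.setdefault(node.identifier, node)

    edge_table, rejected = [], []
    for edge in edges:
        if edge.start in node_table and edge.end in node_table:
            edge_table.append(edge)
        else:
            rejected.append(edge)
    return node_table, edge_table, rejected
```

Returning the rejected edges separately, rather than silently dropping them, makes it possible to report which source failed the check before a build is published.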

Subjects

Pattern Recognition, Automated , Precision Medicine , Databases, Factual

3.

Hemin-Induced Death Models Hemorrhagic Stroke and Is a Variant of Classical Neuronal Ferroptosis.

Zille, Marietta; Oses-Prieto, Juan A; Savage, Sara R; Karuppagounder, Saravanan S; Chen, Yingxin; Kumar, Amit; Morris, John H; Scheidt, Karl A; Burlingame, Alma L; Ratan, Rajiv R.

J Neurosci ; 42(10): 2065-2079, 2022 Mar 09.

Article in English | MEDLINE | ID: mdl-34987108

ABSTRACT

Ferroptosis is a caspase-independent, iron-dependent form of regulated necrosis extant in traumatic brain injury, Huntington disease, and hemorrhagic stroke. It can be activated by cystine deprivation leading to glutathione depletion, the insufficiency of the antioxidant glutathione peroxidase-4, and the hemolysis products hemoglobin and hemin. A cardinal feature of ferroptosis is extracellular signal-regulated kinase (ERK)1/2 activation culminating in its translocation to the nucleus. We have previously confirmed that the mitogen-activated protein (MAP) kinase kinase (MEK) inhibitor U0126 inhibits persistent ERK1/2 phosphorylation and ferroptosis. Here, we show that hemin exposure, a model of secondary injury in brain hemorrhage and ferroptosis, activated ERK1/2 in mouse neurons. Accordingly, MEK inhibitor U0126 protected against hemin-induced ferroptosis. Unexpectedly, U0126 prevented hemin-induced ferroptosis independent of its ability to inhibit ERK1/2 signaling. In contrast to classical ferroptosis in neurons or cancer cells, chemically diverse inhibitors of MEK did not block hemin-induced ferroptosis, nor did the forced expression of the ERK-selective MAP kinase phosphatase (MKP)3. We conclude that hemin or hemoglobin-induced ferroptosis, unlike glutathione depletion, is ERK1/2-independent. Together with recent studies, our findings suggest the existence of a novel subtype of neuronal ferroptosis relevant to bleeding in the brain that is 5-lipoxygenase-dependent, ERK-independent, and transcription-independent. Remarkably, our unbiased phosphoproteome analysis revealed dramatic differences in phosphorylation induced by two ferroptosis subtypes. 
As U0126 also reduced cell death and improved functional recovery after hemorrhagic stroke in male mice, our analysis also provides a template on which to build a search for U0126's effects in a variant of neuronal ferroptosis. SIGNIFICANCE STATEMENT: Ferroptosis is an iron-dependent mechanism of regulated necrosis that has been linked to hemorrhagic stroke. Common features of ferroptotic death induced by diverse stimuli are the depletion of the antioxidant glutathione, production of lipoxygenase-dependent reactive lipids, sensitivity to iron chelation, and persistent activation of extracellular signal-regulated kinase (ERK) signaling. Unlike classical ferroptosis induced in neurons or cancer cells, here we show that ferroptosis induced by hemin is ERK-independent. Paradoxically, the canonical MAP kinase kinase (MEK) inhibitor U0126 blocks brain hemorrhage-induced death. Altogether, these data suggest that a variant of ferroptosis is unleashed in hemorrhagic stroke. We present the first, unbiased phosphoproteomic analysis of ferroptosis as a template on which to understand distinct paths to cell death that meet the definition of ferroptosis.

Subjects

Ferroptosis , Hemorrhagic Stroke , Animals , Antioxidants/metabolism , Extracellular Signal-Regulated MAP Kinases/metabolism , Glutathione/metabolism , Hemin/metabolism , Hemin/pharmacology , Hemoglobins/metabolism , Intracranial Hemorrhages/metabolism , Iron/metabolism , Male , Mice , Mitogen-Activated Protein Kinase Kinases/metabolism , Necrosis/metabolism , Neurons/metabolism , Phosphorylation

4.

clusterMaker2: a major update to clusterMaker, a multi-algorithm clustering app for Cytoscape.

Utriainen, Maija; Morris, John H.

BMC Bioinformatics ; 24(1): 134, 2023 Apr 05.

Article in English | MEDLINE | ID: mdl-37020209

ABSTRACT

BACKGROUND: Since the initial publication of clusterMaker, the need for tools to analyze large biological datasets has only increased. New datasets are significantly larger than a decade ago, and new experimental techniques such as single-cell transcriptomics continue to drive the need for clustering or classification techniques to focus on portions of datasets of interest. While many libraries and packages exist that implement various algorithms, there remains the need for clustering packages that are easy to use, integrated with visualization of the results, and integrated with other commonly used tools for biological data analysis. clusterMaker2 has added several new algorithms, including two entirely new categories of analyses: node ranking and dimensionality reduction. Furthermore, many of the new algorithms have been implemented using the Cytoscape jobs API, which provides a mechanism for executing remote jobs from within Cytoscape. Together, these advances facilitate meaningful analyses of modern biological datasets despite their ever-increasing size and complexity. RESULTS: The use of clusterMaker2 is exemplified by reanalyzing the yeast heat shock expression experiment that was included in our original paper; however, here we explored this dataset in significantly more detail. Combining this dataset with the yeast protein-protein interaction network from STRING, we were able to perform a variety of analyses and visualizations from within clusterMaker2, including Leiden clustering to break the entire network into smaller clusters, hierarchical clustering to look at the overall expression dataset, dimensionality reduction using UMAP to find correlations between our hierarchical visualization and the UMAP plot, fuzzy clustering, and cluster ranking. Using these techniques, we were able to explore the highest-ranking cluster and determine that it represents a strong contender for proteins working together in response to heat shock. 
We found a series of clusters that, when re-explored as fuzzy clusters, provide a better presentation of mitochondrial processes. CONCLUSIONS: clusterMaker2 represents a significant advance over the previously published version, and most importantly, provides an easy-to-use tool to perform clustering and to visualize clusters within the Cytoscape network context. The new algorithms should be welcome to the large population of Cytoscape users, particularly the new dimensionality reduction and fuzzy clustering techniques.

Subjects

Mobile Applications , Saccharomyces cerevisiae , Algorithms , Protein Interaction Maps , Cluster Analysis

5.

Cytoscape stringApp 2.0: Analysis and Visualization of Heterogeneous Biological Networks.

Doncheva, Nadezhda T; Morris, John H; Holze, Henrietta; Kirsch, Rebecca; Nastou, Katerina C; Cuesta-Astroz, Yesid; Rattei, Thomas; Szklarczyk, Damian; von Mering, Christian; Jensen, Lars J.

J Proteome Res ; 22(2): 637-646, 2023 Feb 03.

Article in English | MEDLINE | ID: mdl-36512705

ABSTRACT

Biological networks are often used to represent complex biological systems, which can contain several types of entities. Analysis and visualization of such networks is supported by the Cytoscape software tool and its many apps. While earlier versions of stringApp focused on providing intraspecies protein-protein interactions from the STRING database, the new stringApp 2.0 greatly improves the support for heterogeneous networks. Here, we highlight new functionality that makes it possible to create networks that contain proteins and interactions from STRING as well as other biological entities and associations from other sources. We exemplify this by complementing a published SARS-CoV-2 interactome with interactions from STRING. We have also extended stringApp with new data and query functionality for protein-protein interactions between eukaryotic parasites and their hosts. We show how this can be used to retrieve and visualize a cross-species network for a malaria parasite, its host, and its vector. Finally, the latest stringApp version has an improved user interface, allows retrieval of both functional associations and physical interactions, and supports group-wise enrichment analysis of different parts of a network to aid biological interpretation. stringApp is freely available at https://apps.cytoscape.org/apps/stringapp.

Subjects

COVID-19 , Humans , SARS-CoV-2 , Software , Proteins , Eukaryota

6.

IntAct App: a Cytoscape application for molecular interaction network visualization and analysis.

Ragueneau, Eliot; Shrivastava, Anjali; Morris, John H; Del-Toro, Noemi; Hermjakob, Henning; Porras, Pablo.

Bioinformatics ; 37(20): 3684-3685, 2021 Oct 25.

Article in English | MEDLINE | ID: mdl-33961020

ABSTRACT

SUMMARY: IntAct App is a Cytoscape 3 application that grants in-depth access to IntAct's molecular interaction data. It builds networks where nodes are interacting molecules (mainly proteins, but also genes, RNA, chemicals) and edges represent evidence of interaction. Users can query a network by providing its molecules, identified by different fields, and optionally include all their interacting partners in the resulting network. The app offers three visualizations: one only displaying interactions, another representing every evidence and the last one emphasizing evidence where mutated versions of proteins were used. Users can also filter networks and click on nodes and edges to access all their related details. Finally, the application supports automation of its main features via Cytoscape commands. AVAILABILITY AND IMPLEMENTATION: Implementation available at https://apps.cytoscape.org/apps/intactapp, while the source code is available at https://github.com/EBI-IntAct/IntactApp.
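
The difference between a view that shows every piece of evidence and a view that shows only interactions can be sketched as a simple grouping step. The function below is a hypothetical illustration, not the app's implementation: the name `summarize_evidence`, the tuple layout, and the choice to treat interactions as undirected are all assumptions.

```python
from collections import defaultdict


def summarize_evidence(evidence_edges):
    """Collapse per-evidence edges into one summary edge per molecule pair.

    `evidence_edges` is an iterable of (source, target, evidence_id) tuples
    (an assumed layout). Interactions are treated as undirected here, so
    (A, B) and (B, A) land on the same summary edge.
    """
    summary = defaultdict(list)
    for source, target, evidence_id in evidence_edges:
        pair = tuple(sorted((source, target)))
        summary[pair].append(evidence_id)
    return dict(summary)
```

With this grouping, the evidence view corresponds to the input list and the interaction-only view corresponds to the keys of the returned dictionary, with the list lengths giving the number of supporting evidences per pair.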

7.

STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets.

Szklarczyk, Damian; Gable, Annika L; Lyon, David; Junge, Alexander; Wyder, Stefan; Huerta-Cepas, Jaime; Simonovic, Milan; Doncheva, Nadezhda T; Morris, John H; Bork, Peer; Jensen, Lars J; Mering, Christian von.

Nucleic Acids Res ; 47(D1): D607-D613, 2019 Jan 08.

Article in English | MEDLINE | ID: mdl-30476243

ABSTRACT

Proteins and their functional interactions form the backbone of the cellular machinery. Their connectivity network needs to be considered for the full understanding of biological phenomena, but the available information on protein-protein associations is incomplete and exhibits varying levels of annotation granularity and reliability. The STRING database aims to collect, score and integrate all publicly available sources of protein-protein interaction information, and to complement these with computational predictions. Its goal is to achieve a comprehensive and objective global network, including direct (physical) as well as indirect (functional) interactions. The latest version of STRING (11.0) more than doubles the number of organisms it covers, to 5090. The most important new feature is an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input. For the enrichment analysis, STRING implements well-known classification systems such as Gene Ontology and KEGG, but also offers additional, new classification systems based on high-throughput text-mining as well as on a hierarchical clustering of the association network itself. The STRING resource is available online at https://string-db.org/.

Subjects

Genomics/methods , Protein Interaction Mapping/methods , Software , Animals , Databases, Genetic , Gene Ontology , Humans

8.

Ten simple rules to create biological network figures for communication.

Marai, G Elisabeta; Pinaud, Bruno; Bühler, Katja; Lex, Alexander; Morris, John H.

PLoS Comput Biol ; 15(9): e1007244, 2019 Sep.

Article in English | MEDLINE | ID: mdl-31557157

ABSTRACT

Biological network figures are ubiquitous in the biology and medical literature. On the one hand, a good network figure can quickly provide information about the nature and degree of interactions between items and enable inferences about the reason for those interactions. On the other hand, good network figures are difficult to create. In this paper, we outline 10 simple rules for creating biological network figures for communication, from choosing layouts, to applying color or other channels to show attributes, to the use of layering and separation. These rules are accompanied by illustrative examples. We also provide a concise set of references and additional resources for each rule.

Subjects

Computational Biology/methods , Computer Graphics , Attention , Color , Humans , Protein Interaction Maps/physiology , Signal Transduction/physiology , Visual Perception

9.

Cytoscape StringApp: Network Analysis and Visualization of Proteomics Data.

Doncheva, Nadezhda T; Morris, John H; Gorodkin, Jan; Jensen, Lars J.

J Proteome Res ; 18(2): 623-632, 2019 Feb 01.

Article in English | MEDLINE | ID: mdl-30450911

ABSTRACT

Protein networks have become a popular tool for analyzing and visualizing the often long lists of proteins or genes obtained from proteomics and other high-throughput technologies. One of the most popular sources of such networks is the STRING database, which provides protein networks for more than 2000 organisms, including both physical interactions from experimental data and functional associations from curated pathways, automatic text mining, and prediction methods. However, its web interface is mainly intended for inspection of small networks and their underlying evidence. The Cytoscape software, on the other hand, is much better suited for working with large networks and offers greater flexibility in terms of network analysis, import, and visualization of additional data. To include both resources in the same workflow, we created stringApp, a Cytoscape app that makes it easy to import STRING networks into Cytoscape, retains the appearance and many of the features of STRING, and integrates data from associated databases. Here, we introduce many of the stringApp features and show how they can be used to carry out complex network analysis and visualization tasks on a typical proteomics data set, all through the Cytoscape user interface. stringApp is freely available from the Cytoscape app store: http://apps.cytoscape.org/apps/stringapp.

Subjects

Data Analysis , Proteomics/methods , Software , Computational Biology/methods , Internet , Protein Interaction Maps , User-Computer Interface

10.

The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible.

Szklarczyk, Damian; Morris, John H; Cook, Helen; Kuhn, Michael; Wyder, Stefan; Simonovic, Milan; Santos, Alberto; Doncheva, Nadezhda T; Roth, Alexander; Bork, Peer; Jensen, Lars J; von Mering, Christian.

Nucleic Acids Res ; 45(D1): D362-D368, 2017 Jan 04.

Article in English | MEDLINE | ID: mdl-27924014

ABSTRACT

A system-wide understanding of cellular function requires knowledge of all functional interactions between the expressed proteins. The STRING database aims to collect and integrate this information, by consolidating known and predicted protein-protein association data for a large number of organisms. The associations in STRING include direct (physical) interactions, as well as indirect (functional) interactions, as long as both are specific and biologically meaningful. Apart from collecting and reassessing available experimental data on protein-protein interactions, and importing known pathways and protein complexes from curated databases, interaction predictions are derived from the following sources: (i) systematic co-expression analysis, (ii) detection of shared selective signals across genomes, (iii) automated text-mining of the scientific literature and (iv) computational transfer of interaction knowledge between organisms based on gene orthology. In the latest version 10.5 of STRING, the biggest changes are concerned with data dissemination: the web frontend has been completely redesigned to reduce dependency on outdated browser technologies, and the database can now also be queried from inside the popular Cytoscape software framework. Further improvements include automated background analysis of user inputs for functional enrichments, and streamlined download options. The STRING resource is available online, at http://string-db.org/.

Subjects

Computational Biology/methods , Databases, Protein , Software , Models, Molecular , Protein Binding , Protein Conformation , Protein Interaction Mapping , Protein Interaction Maps , Proteins/chemistry , Proteins/metabolism , Structure-Activity Relationship , User-Computer Interface , Web Browser

11.

An Atlas of Peroxiredoxins Created Using an Active Site Profile-Based Approach to Functionally Relevant Clustering of Proteins.

Harper, Angela F; Leuthaeuser, Janelle B; Babbitt, Patricia C; Morris, John H; Ferrin, Thomas E; Poole, Leslie B; Fetrow, Jacquelyn S.

PLoS Comput Biol ; 13(2): e1005284, 2017 Feb.

Article in English | MEDLINE | ID: mdl-28187133

ABSTRACT

Peroxiredoxins (Prxs or Prdxs) are a large protein superfamily of antioxidant enzymes that rapidly detoxify damaging peroxides and/or affect signal transduction and, thus, have roles in proliferation, differentiation, and apoptosis. Prx superfamily members are widespread across phylogeny and multiple methods have been developed to classify them. Here we present an updated atlas of the Prx superfamily identified using a novel method called MISST (Multi-level Iterative Sequence Searching Technique). MISST is an iterative search process developed to be both agglomerative, to add sequences containing similar functional site features, and divisive, to split groups when functional site features suggest distinct functionally-relevant clusters. Superfamily members need not be identified initially; MISST begins with a minimal representative set of known structures and searches GenBank iteratively. Further, the method's novelty lies in the manner in which isofunctional groups are selected; rather than use a single or shifting threshold to identify clusters, the groups are deemed isofunctional when they pass a self-identification criterion, such that the group identifies itself and nothing else in a search of GenBank. The method was preliminarily validated on the Prxs, as the Prxs presented challenges of both agglomeration and division. For example, previous sequence analysis clustered the Prx functional families Prx1 and Prx6 into one group. Subsequent expert analysis clearly identified Prx6 as a distinct functionally relevant group. The MISST process distinguishes these two closely related, though functionally distinct, families. Through MISST search iterations, over 38,000 Prx sequences were identified, which the method divided into six isofunctional clusters, consistent with previous expert analysis. The results represent the most complete computational functional analysis of proteins comprising the Prx superfamily. 
The feasibility of this novel method is demonstrated by the Prx superfamily results, laying the foundation for potential functionally relevant clustering of the universe of protein sequences.
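
The self-identification criterion stated in the abstract (a group is accepted when a search seeded with the group returns the group and nothing else) can be written as a set comparison. The sketch below is a toy illustration under stated assumptions, not the MISST implementation: the in-memory `database` of motif sets stands in for GenBank, and the search rule (a hit must contain every motif shared by all group members) is invented for the example.

```python
def make_profile_search(database):
    """Build a toy search function over `database`.

    `database` maps a sequence id to a set of functional-site motifs
    (an illustrative stand-in for a real sequence database). A sequence
    is a hit when it contains every motif shared by all group members.
    """
    def search(group):
        profile = set.intersection(*(database[seq_id] for seq_id in group))
        return {seq_id for seq_id, motifs in database.items() if profile <= motifs}
    return search


def is_self_identifying(group, search):
    """Return True when searching with the group retrieves exactly the group."""
    return set(search(group)) == set(group)
```

In this toy setting, a group that retrieves extra sequences is a candidate for agglomeration (add the extra hits and search again), which mirrors the iterative character described in the abstract without reproducing its scoring.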

Subjects

Databases, Protein , Peroxiredoxins/chemistry , Peroxiredoxins/classification , Protein Interaction Mapping/methods , Sequence Analysis, Protein/methods , Sequence Homology, Amino Acid , Amino Acid Sequence , Binding Sites , Database Management Systems , Enzyme Activation , High-Throughput Screening Assays/methods , Molecular Sequence Data , Multigene Family , Peroxiredoxins/ultrastructure , Protein Binding

12.

Global landscape of HIV-human protein complexes.

Jäger, Stefanie; Cimermancic, Peter; Gulbahce, Natali; Johnson, Jeffrey R; McGovern, Kathryn E; Clarke, Starlynn C; Shales, Michael; Mercenne, Gaelle; Pache, Lars; Li, Kathy; Hernandez, Hilda; Jang, Gwendolyn M; Roth, Shoshannah L; Akiva, Eyal; Marlett, John; Stephens, Melanie; D'Orso, Iván; Fernandes, Jason; Fahey, Marie; Mahon, Cathal; O'Donoghue, Anthony J; Todorovic, Aleksandar; Morris, John H; Maltby, David A; Alber, Tom; Cagney, Gerard; Bushman, Frederic D; Young, John A; Chanda, Sumit K; Sundquist, Wesley I; Kortemme, Tanja; Hernandez, Ryan D; Craik, Charles S; Burlingame, Alma; Sali, Andrej; Frankel, Alan D; Krogan, Nevan J.

Nature ; 481(7381): 365-70, 2011 Dec 21.

Article in English | MEDLINE | ID: mdl-22190034

ABSTRACT

Human immunodeficiency virus (HIV) has a small genome and therefore relies heavily on the host cellular machinery to replicate. Identifying which host proteins and complexes come into physical contact with the viral proteins is crucial for a comprehensive understanding of how HIV rewires the host's cellular machinery during the course of infection. Here we report the use of affinity tagging and purification mass spectrometry to determine systematically the physical interactions of all 18 HIV-1 proteins and polyproteins with host proteins in two different human cell lines (HEK293 and Jurkat). Using a quantitative scoring system that we call MiST, we identified with high confidence 497 HIV-human protein-protein interactions involving 435 individual human proteins, with ~40% of the interactions being identified in both cell types. We found that the host proteins hijacked by HIV, especially those found interacting in both cell types, are highly conserved across primates. We uncovered a number of host complexes targeted by viral proteins, including the finding that HIV protease cleaves eIF3d, a subunit of eukaryotic translation initiation factor 3. This host protein is one of eleven identified in this analysis that act to inhibit HIV replication. This data set facilitates a more comprehensive and detailed understanding of how the host machinery is manipulated during the course of HIV infection.

Subjects

HIV-1/chemistry , HIV-1/metabolism , Host-Pathogen Interactions , Human Immunodeficiency Virus Proteins/metabolism , Protein Interaction Mapping/methods , Protein Interaction Maps/physiology , Affinity Labels , Amino Acid Sequence , Conserved Sequence , Eukaryotic Initiation Factor-3/chemistry , Eukaryotic Initiation Factor-3/metabolism , HEK293 Cells , HIV Infections/metabolism , HIV Infections/virology , HIV Protease/metabolism , HIV-1/physiology , Human Immunodeficiency Virus Proteins/analysis , Human Immunodeficiency Virus Proteins/chemistry , Human Immunodeficiency Virus Proteins/isolation & purification , Humans , Immunoprecipitation , Jurkat Cells , Mass Spectrometry , Protein Binding , Reproducibility of Results , Virus Replication

13.

DASP3: identification of protein sequences belonging to functionally relevant groups.

Leuthaeuser, Janelle B; Morris, John H; Harper, Angela F; Ferrin, Thomas E; Babbitt, Patricia C; Fetrow, Jacquelyn S.

BMC Bioinformatics ; 17(1): 458, 2016 Nov 11.

Article in English | MEDLINE | ID: mdl-27835946

ABSTRACT

BACKGROUND: Development of automatable processes for clustering proteins into functionally relevant groups is a critical hurdle as an increasing number of sequences are deposited into databases. Experimental function determination is exceptionally time-consuming and can't keep pace with the identification of protein sequences. A tool, DASP (Deacon Active Site Profiler), was previously developed to identify protein sequences with active site similarity to a query set. Development of two iterative, automatable methods for clustering proteins into functionally relevant groups exposed algorithmic limitations to DASP. RESULTS: The accuracy and efficiency of DASP was significantly improved through six algorithmic enhancements implemented in two stages: DASP2 and DASP3. Validation demonstrated DASP3 provides greater score separation between true positives and false positives than earlier versions. In addition, DASP3 shows similar performance to previous versions in clustering protein structures into isofunctional groups (validated against manual curation), but DASP3 gathers and clusters protein sequences into isofunctional groups more efficiently than DASP and DASP2. CONCLUSIONS: DASP algorithmic enhancements resulted in improved efficiency and accuracy of identifying proteins that contain active site features similar to those of the query set. These enhancements provide incremental improvement in structure database searches and initial sequence database searches; however, the enhancements show significant improvement in iterative sequence searches, suggesting DASP3 is an appropriate tool for the iterative processes required for clustering proteins into isofunctional groups.

Subjects

Algorithms , Sequence Analysis, Protein/methods , Amino Acid Motifs , Amino Acid Sequence , Catalytic Domain , Cluster Analysis , Databases, Protein , Proteins/chemistry

14.

cddApp: a Cytoscape app for accessing the NCBI conserved domain database.

Morris, John H; Wu, Allan; Yamashita, Roxanne A; Marchler-Bauer, Aron; Ferrin, Thomas E.

Bioinformatics ; 31(1): 134-6, 2015 Jan 01.

Article in English | MEDLINE | ID: mdl-25212755

ABSTRACT

MOTIVATION: cddApp is a Cytoscape extension that supports the annotation of protein networks with information about domains and specific functional sites from the National Center for Biotechnology Information's conserved domain database (CDD). CDD information is loaded for nodes annotated with NCBI numbers or UniProt identifiers and (optionally) Protein Data Bank structures. cddApp integrates with the Cytoscape apps structureViz2 and enhancedGraphics. Together, these three apps provide powerful tools to annotate nodes with CDD domain and site information and visualize that information in both network and structural contexts. AVAILABILITY AND IMPLEMENTATION: cddApp is written in Java and freely available for download from the Cytoscape app store (http://apps.cytoscape.org). Documentation is provided at http://www.rbvi.ucsf.edu/cytoscape, and the source is publicly available from GitHub http://github.com/RBVI/cddApp.

Subjects

Bacterial Proteins/metabolism , Computational Biology/instrumentation , Metabolic Networks and Pathways , Molecular Sequence Annotation/methods , Sequence Analysis, Protein/methods , Software , Algorithms , Bacillus , Bacterial Proteins/chemistry , Conserved Sequence , Databases, Protein , Humans , Protein Conformation , Protein Interaction Mapping

15.

Enhancing UCSF Chimera through web services.

Huang, Conrad C; Meng, Elaine C; Morris, John H; Pettersen, Eric F; Ferrin, Thomas E.

Nucleic Acids Res ; 42(Web Server issue): W478-84, 2014 Jul.

Article in English | MEDLINE | ID: mdl-24861624

ABSTRACT

Integrating access to web services with desktop applications allows for an expanded set of application features, including performing computationally intensive tasks and convenient searches of databases. We describe how we have enhanced UCSF Chimera (http://www.rbvi.ucsf.edu/chimera/), a program for the interactive visualization and analysis of molecular structures and related data, through the addition of several web services (http://www.rbvi.ucsf.edu/chimera/docs/webservices.html). By streamlining access to web services, including the entire job submission, monitoring and retrieval process, Chimera makes it simpler for users to focus on their science projects rather than data manipulation. Chimera uses Opal, a toolkit for wrapping scientific applications as web services, to provide scalable and transparent access to several popular software packages. We illustrate Chimera's use of web services with an example workflow that interleaves use of these services with interactive manipulation of molecular sequences and structures, and we provide an example Python program to demonstrate how easily Opal-based web services can be accessed from within an application. Web server availability: http://webservices.rbvi.ucsf.edu/opal2/dashboard?command=serviceList.

Subjects

Molecular Structure , Software , Internet , Molecular Models

16.

The Structure-Function Linkage Database.

Akiva, Eyal; Brown, Shoshana; Almonacid, Daniel E; Barber, Alan E; Custer, Ashley F; Hicks, Michael A; Huang, Conrad C; Lauck, Florian; Mashiyama, Susan T; Meng, Elaine C; Mischel, David; Morris, John H; Ojha, Sunil; Schnoes, Alexandra M; Stryke, Doug; Yunes, Jeffrey M; Ferrin, Thomas E; Holliday, Gemma L; Babbitt, Patricia C.

Nucleic Acids Res ; 42(Database issue): D521-30, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24271399

ABSTRACT

The Structure-Function Linkage Database (SFLD, http://sfld.rbvi.ucsf.edu/) is a manually curated classification resource describing structure-function relationships for functionally diverse enzyme superfamilies. Members of such superfamilies are diverse in their overall reactions yet share a common ancestor and some conserved active site features associated with conserved functional attributes such as a partial reaction. Thus, despite their different functions, members of these superfamilies 'look alike', making them easy to misannotate. To address this complexity and enable rational transfer of functional features to unknowns only for those members for which we have sufficient functional information, we subdivide superfamily members into subgroups using sequence information, and lastly into families, sets of enzymes known to catalyze the same reaction using the same mechanistic strategy. Browsing and searching options in the SFLD provide access to all of these levels. The SFLD offers manually curated as well as automatically classified superfamily sets, both accompanied by search and download options for all hierarchical levels. Additional information includes multiple sequence alignments, tab-separated files of functional and other attributes, and sequence similarity networks. The latter provide a new and intuitively powerful way to visualize functional trends mapped to the context of sequence similarity.

Subjects

Protein Databases , Enzymes/chemistry , Enzymes/classification , Enzymes/metabolism , Internet , Molecular Sequence Annotation , Sequence Alignment , Structure-Activity Relationship

17.

Translating desktop success to the web in the cytoscape project.

Pratt, Dexter; Pillich, Rudolf T; Morris, John H.

Front Bioinform ; 3: 1125949, 2023.

Article in English | MEDLINE | ID: mdl-37035036

ABSTRACT

Cytoscape is an open-source bioinformatics environment for the analysis, integration, visualization, and query of biological networks. In this perspective piece, we describe our project to bring the Cytoscape desktop application to the web while explaining our strategy in ways relevant to others in the bioinformatics community. We examine opportunities and challenges in developing bioinformatics software that spans both the desktop and web, and we describe our ongoing efforts to build a Cytoscape web application, highlighting the principles that guide our development.

18.

UCSF ChimeraX: Tools for structure building and analysis.

Meng, Elaine C; Goddard, Thomas D; Pettersen, Eric F; Couch, Greg S; Pearson, Zach J; Morris, John H; Ferrin, Thomas E.

Protein Sci ; 32(11): e4792, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37774136

ABSTRACT

Advances in computational tools for atomic model building are leading to accurate models of large molecular assemblies seen in electron microscopy, often at challenging resolutions of 3-4 Å. We describe new methods in the UCSF ChimeraX molecular modeling package that take advantage of machine-learning structure predictions, provide likelihood-based fitting in maps, and compute per-residue scores to identify modeling errors. Additional model-building tools assist analysis of mutations, post-translational modifications, and interactions with ligands. We present the latest ChimeraX model-building capabilities, including several community-developed extensions. ChimeraX is available free of charge for noncommercial use at https://www.rbvi.ucsf.edu/chimerax.

Subjects

Software , Cryoelectron Microscopy/methods , Likelihood Functions , Molecular Models , Electron Microscopy , Protein Conformation

19.

Improving the quality of protein similarity network clustering algorithms using the network edge weight distribution.

Apeltsin, Leonard; Morris, John H; Babbitt, Patricia C; Ferrin, Thomas E.

Bioinformatics ; 27(3): 326-33, 2011 Feb 01.

Article in English | MEDLINE | ID: mdl-21118823

ABSTRACT

MOTIVATION: Clustering protein sequence data into functionally specific families is a difficult but important problem in biological research. One useful approach for tackling this problem involves representing the sequence dataset as a protein similarity network, and afterwards clustering the network using advanced graph analysis techniques. Although a multitude of such network clustering algorithms have been developed over the past few years, comparing algorithms is often difficult because performance is affected by the specifics of network construction. We investigate an important aspect of network construction used in analyzing protein superfamilies and present a heuristic approach for improving the performance of several algorithms. RESULTS: We analyzed how the performance of network clustering algorithms relates to thresholding the network prior to clustering. Our results, over four different datasets, show that for each input dataset there exists an optimal threshold range over which an algorithm generates its most accurate clustering output. Our results further show that the optimal threshold range correlates with the shape of the edge weight distribution for the input similarity network. We used this correlation to develop an automated threshold selection heuristic in order to optimally filter a similarity network prior to clustering. This heuristic allows researchers to process their protein datasets with runtime-efficient network clustering algorithms without sacrificing the clustering accuracy of the final results. AVAILABILITY: Python code for implementing the automated threshold selection heuristic, together with the datasets used in our analysis, is available at http://www.rbvi.ucsf.edu/Research/cytoscape/threshold_scripts.zip.
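
As a rough illustration of what thresholding a similarity network before clustering means, the sketch below drops edges under a cutoff and counts the connected components that remain. It is not the authors' threshold selection heuristic, which the abstract does not specify. The function names, the toy edge list, and the use of connected components as a stand-in for a clustering algorithm are all assumptions made for this example.

```python
from collections import defaultdict


def threshold_edges(edges, cutoff):
    """Keep only edges whose similarity weight is at least `cutoff`."""
    return [(u, v, w) for u, v, w in edges if w >= cutoff]


def connected_components(nodes, edges):
    """Group nodes into connected components with a simple union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, _ in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return list(groups.values())


def component_counts(nodes, edges, cutoffs):
    """Number of connected components left at each candidate cutoff."""
    return {
        c: len(connected_components(nodes, threshold_edges(edges, c)))
        for c in cutoffs
    }


# Hypothetical toy network: (protein, protein, similarity weight).
toy_nodes = ["A", "B", "C", "D", "E"]
toy_edges = [
    ("A", "B", 0.90),
    ("B", "C", 0.80),
    ("C", "D", 0.20),
    ("D", "E", 0.85),
]

if __name__ == "__main__":
    for cutoff, count in component_counts(toy_nodes, toy_edges, [0.1, 0.5, 0.95]).items():
        print(f"cutoff={cutoff}: {count} component(s)")
```

In this toy case a low cutoff leaves everything in one group, a high cutoff isolates every node, and an intermediate cutoff separates two groups, which is the kind of threshold dependence the abstract describes.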

Subjects

Algorithms , Automated Pattern Recognition/methods , Proteins/chemistry , Protein Sequence Analysis/methods , Amino Acid Sequence , Artificial Intelligence , Cluster Analysis , Proteins/metabolism , Software

20.

Computational tools for the interactive exploration of proteomic and structural data.

Morris, John H; Meng, Elaine C; Ferrin, Thomas E.

Mol Cell Proteomics ; 9(8): 1703-15, 2010 Aug.

Article in English | MEDLINE | ID: mdl-20525940

ABSTRACT

Linking proteomics and structural data is critical to our understanding of cellular processes, and interactive exploration of these complementary data sets can be extremely valuable for developing or confirming hypotheses in silico. However, few computational tools facilitate linking these types of data interactively. In addition, the tools that do exist are neither well understood nor widely used by the proteomics or structural biology communities. We briefly describe several relevant tools, and then, using three scenarios, we present in depth two tools for the integrated exploration of proteomics and structural data.

Subjects

Protein Databases , Proteins/chemistry , Proteomics/methods , Animals , Humans , Molecular Models , Mutant Proteins/chemistry , Protein Binding , Saccharomyces cerevisiae/enzymology , Software
