Results 1 - 18 of 18
1.
J Struct Biol ; 215(3): 108011, 2023 09.
Article in English | MEDLINE | ID: mdl-37562586

ABSTRACT

Leucine Rich Repeat (LRR) domains are present in hundreds of thousands of proteins across all kingdoms of life and are typically involved in protein-protein interactions and ligand recognition. LRR domains are classified into eight classes and, when examined in three dimensions, seven of them form curved solenoid-like super-helices, also described as toruses, with a beta sheet on the concave (inside) face and stacked alpha-helices on the convex (outside) face of the torus. Here we present an overview of the least characterized 8th class of LRR proteins, the TpLRR-like LRRs, named after the Treponema pallidum protein Tp0225. Proteins from the TpLRR class differ from the proteins in all other known LRR classes by having a flipped curvature, with the beta sheet on the convex side of the torus and irregular secondary structure instead of helices on the opposite, now concave, side. TpLRR proteins also show highly divergent sequence patterns in individual repeats and can associate with specific types of additional domains. Several of the characterized proteins from this class, specifically the BspA-like proteins, were found in human bacterial and protozoan pathogens, playing an important role in the interactions between the pathogens and the host immune system. In this paper we surveyed all existing experimental structures and selected AlphaFold models of the best-known proteins containing this class of LRR repeats, analyzing the relation between the pattern of conserved residues, specific structural features and functions of these proteins.


Subject(s)
Leucine-Rich Repeat Proteins, Proteins, Humans, Proteins/chemistry, Protein Domains, Protein Secondary Structure, Bacteria/chemistry
2.
Emerg Infect Dis ; 29(5)2023 05.
Article in English | MEDLINE | ID: mdl-37054986

ABSTRACT

Since late 2020, SARS-CoV-2 variants have regularly emerged with competitive and phenotypic differences from previously circulating strains, sometimes with the potential to escape from immunity produced by prior exposure and infection. The Early Detection group is one of the constituent groups of the US National Institutes of Health National Institute of Allergy and Infectious Diseases SARS-CoV-2 Assessment of Viral Evolution program. The group uses bioinformatic methods to monitor the emergence, spread, and potential phenotypic properties of emerging and circulating strains to identify the most relevant variants for experimental groups within the program to phenotypically characterize. Since April 2021, the group has prioritized variants monthly. Prioritization successes include rapidly identifying most major variants of SARS-CoV-2 and providing experimental groups within the National Institutes of Health program easy access to regularly updated information on the recent evolution and epidemiology of SARS-CoV-2 that can be used to guide phenotypic investigations.


Subject(s)
COVID-19, SARS-CoV-2, United States/epidemiology, Humans, SARS-CoV-2/genetics, COVID-19/epidemiology, National Institutes of Health (U.S.)
3.
Arch Biochem Biophys ; 739: 109579, 2023 05 01.
Article in English | MEDLINE | ID: mdl-36933758

ABSTRACT

Both gender and smoking are correlated with prevalence and outcomes in many types of cancers. Tobacco smoke is a known carcinogen through its genotoxicity but can also affect cancer progression through its effect on the immune system. In this study, we aim to evaluate the hypothesis that the effects of smoking on the tumor immune microenvironment will be influenced differently by gender using large-scale analysis of publicly available cancer datasets. We used The Cancer Genome Atlas (TCGA) datasets (n = 2724) to analyze effects of smoking on different cancer immune subtypes and the relative abundance of immune cell types between male and female cancer patients. We further validated our results by analyzing additional datasets, including the Expression Project for Oncology (expO) bulk RNA-seq dataset (n = 1118) and a single-cell RNA-seq dataset (n = 14). Results of our study indicate that in female patients, two immune subtypes, C1 and C2, are respectively over- and under-abundant in smokers vs. never smokers. In males, the only significant difference is underabundance of the C6 subtype in smokers. We identified gender-specific differences in the population of immune cell types between smokers and never smokers in all TCGA and expO cancer types. Increased plasma cell population was identified as the most consistent feature distinguishing smokers and never smokers, especially in current female smokers based on both TCGA and expO data. Our analysis of existing single-cell RNA-seq data further revealed that smoking differentially affects the gene expression profile of cancer patients based on the immune cell type and gender. In our analysis, female and male smokers show different smoking-induced patterns of immune cells in the tumor microenvironment. In addition, our results suggest cancer tissues directly exposed to tobacco smoke undergo the most significant changes, but all other tissue types are affected as well.
The findings of the current study also indicate that changes in the populations of plasma cells and their correlations to survival outcomes are stronger in female current smokers, with implications for cancer immunotherapy of women smokers. In conclusion, results of this study can be used to develop personalized treatment plans for cancer patients who smoke, particularly women smokers, taking into account the unique immune cell profile of their tumors.


Subject(s)
Lung Neoplasms, Tobacco Smoke Pollution, Humans, Male, Female, Tumor Microenvironment, Sex Factors, Smoking/adverse effects, Lung Neoplasms/pathology
4.
Protein Sci ; 31(7): e4325, 2022 07.
Article in English | MEDLINE | ID: mdl-35762711

ABSTRACT

Proteins sample a multitude of different conformations by undergoing small- and large-scale conformational changes that are often intrinsic to their functions. Information about these changes is often captured in the Protein Data Bank by the apparently redundant deposition of independent structural solutions of identical proteins. Here, we mine these data to examine the conservation of large-scale conformational changes between homologous proteins. This is important for both practical reasons, such as predicting alternative conformations of a protein by comparative modeling, and conceptual reasons, such as understanding the extent of conservation of different features in evolution. To study this question, we introduce a novel approach to compare conformational changes between proteins by the comparison of their difference distance maps (DDMs). We found that proteins undergoing similar conformational changes have similar DDMs and that this similarity could be quantified by the correlation between the DDMs. By comparing the DDMs of homologous protein pairs, we found that large-scale conformational changes show a high level of conservation across a broad range of sequence identities. This shows that conformational space is usually conserved between homologs, even relatively distant ones.


Subject(s)
Proteins, Protein Databases, Protein Conformation, Proteins/chemistry, Proteins/genetics
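
The comparison of difference distance maps (DDMs) described in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the use of one 3D point per residue (for example a C-alpha atom), and the choice of Pearson correlation over the upper triangle are all assumptions made for illustration. It also assumes the two conformations of a protein list the same residues in the same order.

```python
import math


def distance_map(coords):
    """Pairwise Euclidean distances between residue positions."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def difference_distance_map(coords_a, coords_b):
    """DDM: element-wise difference between two conformations' distance maps."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both conformations need the same number of residues")
    map_a = distance_map(coords_a)
    map_b = distance_map(coords_b)
    n = len(coords_a)
    return [[map_a[i][j] - map_b[i][j] for j in range(n)] for i in range(n)]


def ddm_correlation(ddm_x, ddm_y):
    """Pearson correlation over the upper triangles of two equally sized DDMs.

    Returns 0.0 when either map has no variation (assumed convention).
    """
    if len(ddm_x) != len(ddm_y):
        raise ValueError("DDMs must have the same size")
    n = len(ddm_x)
    xs = [ddm_x[i][j] for i in range(n) for j in range(i + 1, n)]
    ys = [ddm_y[i][j] for i in range(n) for j in range(i + 1, n)]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Toy data (hypothetical): a four-residue chain, straight vs. bent at the end.
straight = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
ddm = difference_distance_map(straight, bent)
print(ddm_correlation(ddm, ddm))
```

Comparing homologs this way would additionally need a residue-level alignment so that the two DDMs cover equivalent positions; that step is omitted here.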
5.
PLoS Comput Biol ; 17(7): e1009147, 2021 07.
Article in English | MEDLINE | ID: mdl-34237054

ABSTRACT

The unprecedented pace of the sequencing of the SARS-CoV-2 virus genomes provides us with unique information about the genetic changes in a single pathogen during an ongoing pandemic. By analyzing close to 200,000 genomes, we show that the patterns of the SARS-CoV-2 virus mutations along its genome are closely correlated with the structural and functional features of the encoded proteins. Requirements of foldability of proteins' 3D structures and the conservation of their key functional regions, such as protein-protein interaction interfaces, are the dominant factors driving evolutionary selection in protein-coding genes. At the same time, avoidance of the host immunity leads to the abundance of mutations in other regions, resulting in high variability of the missense mutation rate along the genome. "Unexplained" peaks and valleys in the mutation rate provide hints about the function of as yet uncharacterized genomic regions and the specific protein structural and functional features they code for. Some of these observations have immediate practical implications for the selection of target regions for PCR-based COVID-19 tests and for evaluating the risk of mutations in epitopes targeted by specific antibodies and vaccine design strategies.


Subject(s)
Biological Evolution, SARS-CoV-2/physiology, Viral Genes, Mutation, SARS-CoV-2/genetics, Viral Proteins/physiology
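
One simple way to picture the "peaks and valleys" in mutation rate along a genome, as discussed in the abstract above, is to bin mutation positions into fixed-size windows and report mutations per site. The sketch below is illustrative only: the abstract does not describe how the rate was computed, and the function name, the 1-based position convention, the window size and the toy data are all assumptions.

```python
def windowed_mutation_rate(positions, genome_length, window=100):
    """Mutations per site in consecutive, non-overlapping windows.

    positions: 1-based genome coordinates of observed mutations
               (repeated positions count once per occurrence).
    The last window may be shorter than `window`; its rate is
    normalized by its actual length.
    """
    if genome_length <= 0 or window <= 0:
        raise ValueError("genome_length and window must be positive")
    n_windows = -(-genome_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        if not 1 <= pos <= genome_length:
            raise ValueError(f"position {pos} is outside the genome")
        counts[(pos - 1) // window] += 1
    rates = []
    for index, count in enumerate(counts):
        start = index * window
        size = min(window, genome_length - start)
        rates.append(count / size)
    return rates


# Toy data (hypothetical): a 25-site genome scanned in windows of 10 sites.
print(windowed_mutation_rate([1, 5, 10, 11, 25], genome_length=25, window=10))
```

Windows whose rate stands well above or below the genome-wide average would be the candidate peaks and valleys; deciding what counts as "well above" is a statistical question this sketch leaves open.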
6.
J Mol Biol ; 433(11): 166828, 2021 05 28.
Article in English | MEDLINE | ID: mdl-33972023

ABSTRACT

There is a wide, and continuously widening, gap between the number of proteins known only by their amino acid sequence versus those structurally characterized by direct experiment. To close this gap, we mostly rely on homology-based inference and modeling to reason about the structures of the uncharacterized proteins by using structures of homologous proteins as templates. With the rapidly growing size of the Protein Data Bank, there are often multiple choices of templates, including multiple sets of coordinates from the same protein. The substantial conformational differences observed between different experimental structures of the same protein often reflect function related structural flexibility. Thus, depending on the questions being asked, using distant homologs, or coordinate sets with lower resolution but solved in the appropriate functional form, as templates may be more informative. The ModFlex server (https://modflex.org/) addresses this seldom mentioned gap in the standard homology modeling approach by providing the user with an interface with multiple options and tools to select the most relevant template and explore the range of structural diversity in the available templates. ModFlex is closely integrated with a range of other programs and servers developed in our group for the analysis and visualization of protein structural flexibility and divergence.


Subject(s)
Molecular Models, Proteins/metabolism, Software, Humans, Lactoferrin/chemistry, Protein Conformation, Proteins/chemistry, Structural Homology of Proteins, User-Computer Interface
7.
bioRxiv ; 2020 Aug 10.
Article in English | MEDLINE | ID: mdl-32817947

ABSTRACT

Fast evolution of the SARS-CoV-2 virus provides us with unique information about the patterns of genetic changes in a single pathogen on a timescale of months. This data is used extensively to track the phylodynamics of the pandemic's spread and its split into distinct clades. Here we show that the patterns of SARS-CoV-2 virus mutations along its genome are closely correlated with the structural features of the coded proteins. We show that the foldability of proteins' 3D structures and conservation of their functions are the universal factors driving evolutionary selection in protein-coding genes. Insights from the analysis of mutation distribution in the context of the SARS-CoV-2 proteins' structures and functions have practical implications, including evaluating potential antigen epitopes and selecting primers for PCR-based COVID-19 tests.

8.
Nucleic Acids Res ; 48(W1): W60-W64, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32469061

ABSTRACT

The FATCAT 2.0 server (http://fatcat.godziklab.org/) provides access to a flexible protein structure alignment algorithm developed in our group. In such an alignment, rotations and translations between elements in the structure are allowed to minimize the overall root mean square deviation (RMSD) between the compared structures. This makes it possible to compare protein structures effectively even if they underwent structural rearrangements in different functional forms, under different crystallization conditions or as a result of mutations. The major update for the server introduces a new graphical interface, much faster database searches and several new options for visualization of the structural differences between proteins.


Subject(s)
Software, Structural Homology of Proteins, Algorithms, Protein Databases, Molecular Models, Proteins/chemistry
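
To make the RMSD idea in the abstract above concrete, the following sketch computes RMSD over matched residue positions and then shows, on toy data, how giving each fragment its own translation lowers the value. It is not the FATCAT algorithm: FATCAT also optimizes rotations and chooses where to break the chain, whereas here the breakpoints are supplied by hand and only translations are applied. All function names and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation over matched positions, with no superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def centered(coords):
    """Translate a fragment so that its centroid sits at the origin."""
    n = len(coords)
    centroid = tuple(sum(c[k] for c in coords) / n for k in range(3))
    return [tuple(c[k] - centroid[k] for k in range(3)) for c in coords]


def fragment_rmsd(coords_a, coords_b, breakpoints):
    """RMSD when each fragment is translated independently (no rotations).

    breakpoints: indices at which the chain is split; they must produce
    non-empty fragments.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    bounds = [0, *breakpoints, len(coords_a)]
    total = 0.0
    for start, end in zip(bounds, bounds[1:]):
        frag_a = centered(coords_a[start:end])
        frag_b = centered(coords_b[start:end])
        total += sum(math.dist(a, b) ** 2 for a, b in zip(frag_a, frag_b))
    return math.sqrt(total / len(coords_a))


# Toy data: the second half of the chain is shifted by 5 units along y.
form_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
form_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 5.0, 0.0), (3.0, 5.0, 0.0)]
print(rmsd(form_a, form_b))                # rigid comparison
print(fragment_rmsd(form_a, form_b, [2]))  # one break after the second residue
```

In the toy case the rigid comparison is dominated by the displaced half, while the fragment-wise comparison shows that each half is internally unchanged.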
9.
Bioinformatics ; 36(15): 4360-4362, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32470119

ABSTRACT

MOTIVATION: As the COVID-19 pandemic is spreading around the world, the SARS-CoV-2 virus is evolving with mutations that potentially change and fine-tune functions of the proteins coded in its genome. RESULTS: Coronavirus3D website integrates data on the SARS-CoV-2 virus mutations with information about 3D structures of its proteins, allowing users to visually analyze the mutations in their 3D context. AVAILABILITY AND IMPLEMENTATION: Coronavirus3D server is freely available at https://coronavirus3d.org.


Subject(s)
Coronavirus Infections, Viral Genome, Pandemics, Viral Pneumonia, Betacoronavirus, COVID-19, Genomics, Humans, SARS-CoV-2
10.
PLoS One ; 15(3): e0226702, 2020.
Article in English | MEDLINE | ID: mdl-32163442

ABSTRACT

Protein structures, usually visualized in various highly idealized forms focusing on the three-dimensional arrangements of secondary structure elements, can also be described as lists of interacting residues or atoms and visualized as two-dimensional distance or contact maps. We show that contact maps provide an ideal tool to describe and analyze differences between structures of proteins in different conformations. Expanding functionality of the PDBFlex server and database developed previously in our group, we describe how analysis of difference contact maps (DCMs) can be used to identify critical interactions stabilizing alternative protein conformations, recognize residues and positions controlling protein functions and build hypotheses as to molecular mechanisms of disease mutations.


Subject(s)
Computational Biology/methods, Molecular Models, Protein Secondary Structure, Proteins/chemistry, Algorithms, Binding Sites, Ligands, Protein Binding, Proteins/metabolism
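
The difference contact maps (DCMs) mentioned in the abstract above can be sketched as a set comparison between the contacts of two conformations. This is an illustration under stated assumptions, not the PDBFlex implementation: the 8.0 distance cutoff, the rule of skipping residues adjacent in sequence, the one-point-per-residue representation and all names are hypothetical choices.

```python
import math


def contact_map(coords, cutoff=8.0, min_separation=2):
    """Set of residue index pairs (i, j), i < j, lying within `cutoff`.

    Pairs closer than `min_separation` along the chain are skipped,
    since sequence neighbours are always in contact.
    """
    n = len(coords)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + min_separation, n)
        if math.dist(coords[i], coords[j]) <= cutoff
    }


def difference_contact_map(coords_a, coords_b, cutoff=8.0, min_separation=2):
    """Contacts lost, gained and kept when going from conformation A to B."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both conformations need the same number of residues")
    contacts_a = contact_map(coords_a, cutoff, min_separation)
    contacts_b = contact_map(coords_b, cutoff, min_separation)
    return {
        "lost": contacts_a - contacts_b,
        "gained": contacts_b - contacts_a,
        "shared": contacts_a & contacts_b,
    }


# Toy data: an extended chain whose last residue folds back towards the start.
extended = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
folded = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 3.8, 0.0)]
print(difference_contact_map(extended, folded))
```

The "lost" and "gained" sets are the candidates for interactions that stabilize one conformation but not the other.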
11.
Nucleic Acids Res ; 47(D1): D895-D899, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30407596

ABSTRACT

Our knowledge of cancer genomics has exploded in the last several years, providing us with detailed knowledge of genetic alterations in almost all cancer types. Analysis of this data gave us new insights into molecular aspects of cancer, the most important being the amazing diversity of molecular abnormalities in individual cancers. The most important question in cancer research today is how to classify this diversity to identify subtypes that are most relevant for treatment and outcome prediction for individual patients. The Cancer3D database at http://www.cancer3d.org gives an open and user-friendly way to analyze cancer missense mutations in the context of the structures of the proteins they are found in and in relation to patients' clinical data. This approach allows users to find novel candidate driver regions for specific subgroups that often cannot be found when similar analyses are done on the whole-gene level and for large, diverse cohorts. The interactive interface allows users to visualize the distribution of mutations in subgroups defined by cancer type and stage, gender and age brackets, or patient's ethnicity or, vice versa, to find the dominant cancer type, gender or age groups for specific three-dimensional mutation patterns.


Subject(s)
Protein Databases, Missense Mutation, Neoplasms/genetics, Protein Conformation, Proteins/genetics, Humans, Protein Domains
12.
Bioinformatics ; 32(4): 602-4, 2016 Feb 15.
Article in English | MEDLINE | ID: mdl-26515826

ABSTRACT

Protael is a JavaScript library for creating interactive visualizations of biological sequences and various associated data. It allows users to generate high-quality vector graphics (SVG) and integrate them into web pages. AVAILABILITY AND IMPLEMENTATION: Protael distribution, documentation and examples are freely available at http://protael.org; source code is hosted at https://github.com/sanshu/protaeljs.


Subject(s)
Computer Graphics, Internet, Proteins/chemistry, Software, Humans, Programming Languages
13.
Nucleic Acids Res ; 44(D1): D423-8, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26615193

ABSTRACT

The PDBFlex database, available freely and with no login requirements at http://pdbflex.org, provides information on flexibility of protein structures as revealed by the analysis of variations between depositions of different structural models of the same protein in the Protein Data Bank (PDB). PDBFlex collects information on all instances of such depositions, identifying them by a 95% sequence identity threshold, performs analysis of their structural differences and clusters them according to their structural similarities for easy analysis. The PDBFlex contains tools and viewers enabling in-depth examination of structural variability including: 2D-scaling visualization of RMSD distances between structures of the same protein, graphs of average local RMSD in the aligned structures of protein chains, graphical presentation of differences in secondary structure and observed structural disorder (unresolved residues), difference distance maps between all sets of coordinates and 3D views of individual structures and simulated transitions between different conformations, the latter displayed using JSMol visualization software.


Subject(s)
Protein Databases, Protein Conformation, Ligands, Molecular Models
14.
Nucleic Acids Res ; 42(Web Server issue): W430-5, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24957597

ABSTRACT

PubServer, available at http://pubserver.burnham.org/, is a tool to automatically collect, filter and analyze publications associated with groups of homologous proteins. Protein entries in databases such as Entrez Protein database at NCBI contain information about publications associated with a given protein. The scope of these publications varies a lot: they include studies focused on biochemical functions of individual proteins, but also reports from genome sequencing projects that introduce tens of thousands of proteins. Collecting and analyzing publications related to sets of homologous proteins help in functional annotation of novel protein families and in improving annotations of well-studied protein families or individual genes. However, performing such collection and analysis manually is a tedious and time-consuming process. PubServer automatically collects identifiers of homologous proteins using PSI-Blast, retrieves literature references from corresponding database entries and filters out publications unlikely to contain useful information about individual proteins. It also prepares simple vocabulary statistics from titles, abstracts and MeSH terms to identify the most frequently occurring keywords, which may help to quickly identify common themes in these publications. The filtering criteria applied to collected publications are user-adjustable. The results of the server are presented as an interactive page that allows re-filtering and different presentations of the output.


Subject(s)
Data Mining/methods, Amino Acid Sequence Homology, Software, Internet, Molecular Sequence Annotation, Protein Tertiary Structure, Proteins/classification, Proteins/genetics, PubMed, Protein Sequence Analysis
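
The "simple vocabulary statistics" step described in the abstract above can be pictured as counting, for each word, how many titles or abstracts it appears in. The sketch below is a minimal stand-in rather than PubServer's code: the tokenizer, the short stop-word list, the function name and the example titles are all assumptions.

```python
import re
from collections import Counter

# A deliberately tiny stop-word list; a real one would be much longer.
STOPWORDS = frozenset({"of", "in", "the", "and", "for", "to", "an", "on", "by", "with"})


def keyword_counts(texts, top_n=5):
    """Return the `top_n` words ranked by the number of texts containing them.

    Each word is counted at most once per text, so a single long abstract
    cannot dominate the ranking.
    """
    counts = Counter()
    for text in texts:
        # Words of two or more characters, starting with a letter.
        words = set(re.findall(r"[a-z][a-z0-9-]+", text.lower()))
        counts.update(words - STOPWORDS)
    return counts.most_common(top_n)


# Hypothetical titles, invented for this example.
titles = [
    "Structure of a lactate utilization protein",
    "Lactate utilization operon in bacteria",
    "Genome sequence of a soil bacterium",
]
print(keyword_counts(titles, top_n=2))
```

Words that recur across many publications linked to a group of homologs are the ones most likely to point at a shared theme, which is the use the abstract describes.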
15.
BMC Bioinformatics ; 14: 341, 2013 Nov 26.
Article in English | MEDLINE | ID: mdl-24274019

ABSTRACT

BACKGROUND: A novel highly conserved protein domain, DUF162 [Pfam: PF02589], can be mapped to two proteins: LutB and LutC. Both proteins are encoded by a highly conserved LutABC operon, which has been implicated in lactate utilization in bacteria. Based on our analysis of its sequence, structure, and recent experimental evidence reported by other groups, we hereby redefine DUF162 as the LUD domain family. RESULTS: JCSG solved the first crystal structure [PDB:2G40] from the LUD domain family: LutC protein, encoded by ORF DR_1909, of Deinococcus radiodurans. LutC shares features with domains in the functionally diverse ISOCOT superfamily. We have observed that the LUD domain has an increased abundance in the human gut microbiome. CONCLUSIONS: We propose a model for the substrate and cofactor binding and regulation in LUD domain. The significance of LUD-containing proteins in the human gut microbiome, and the implication of lactate metabolism in the radiation-resistance of Deinococcus radiodurans are discussed.


Subject(s)
Bacterial Proteins/metabolism, Deinococcus/chemistry, Deinococcus/metabolism, Lactic Acid/metabolism, Amino Acid Sequence, Bacterial Proteins/chemistry, Bacterial Proteins/genetics, X-Ray Crystallography, Deinococcus/genetics, Humans, Microbiota/radiation effects, Molecular Sequence Data, Protein Tertiary Structure
16.
BMC Syst Biol ; 5: 7, 2011 Jan 14.
Article in English | MEDLINE | ID: mdl-21235794

ABSTRACT

BACKGROUND: Understanding the immune response mechanisms of a pathogen-infected host requires multi-scale analysis of genome-wide data. Data integration methods have proved useful to the study of biological processes in model organisms, but their systematic application to the study of host immune system response to a pathogen and human disease is still in the initial stage. RESULTS: To study host-pathogen interaction on the systems biology level, an extension to the previously described BiologicalNetworks system is proposed. The developed methods and data integration and querying tools simplify and streamline the integration of diverse experimental data types, including molecular interactions and phylogenetic classifications, genomic sequences and protein structure information, gene expression and virulence data for pathogen-related studies. The data can be integrated from the databases and user's files for both public and private use. CONCLUSIONS: The developed system can be used for the systems-level analysis of host-pathogen interactions, including host molecular pathways that are induced/repressed during the infections, co-expressed genes, and conserved transcription factor binding sites. Genes not previously known to be associated with influenza infection were identified and suggested for further investigation as potential drug targets. Developed methods and data are available through the Java application (from BiologicalNetworks program at http://www.biologicalnetworks.org) and web interface (at http://flu.sdsc.edu).


Subject(s)
Host-Pathogen Interactions, Systems Biology/methods, Animals, Antiviral Agents/metabolism, Antiviral Agents/pharmacology, Antiviral Agents/therapeutic use, Data Mining, Factual Databases, Humans, Mice, Orthomyxoviridae/drug effects, Orthomyxoviridae/physiology, Orthomyxoviridae Infections/drug therapy, Orthomyxoviridae Infections/metabolism, Phylogeography, Rats, Software, User-Computer Interface
17.
BMC Bioinformatics ; 11: 610, 2010 Dec 29.
Article in English | MEDLINE | ID: mdl-21190573

ABSTRACT

BACKGROUND: A significant problem in the study of mechanisms of an organism's development is the elucidation of interrelated factors that act at different levels of the organism, such as genes, biological molecules, cells, and cell systems. Numerous sources of heterogeneous data which exist for these subsystems are still not integrated well enough to let researchers analyze them together within a single study. Systematic application of data integration methods is also hampered by factors such as the orthogonal nature of the integrated data and naming problems. RESULTS: Here we report on a new version of BiologicalNetworks, a research environment for the integral visualization and analysis of heterogeneous biological data. BiologicalNetworks can be queried for properties of thousands of different types of biological entities (genes/proteins, promoters, COGs, pathways, binding sites, and others) and their relations (interactions, co-expression, co-citations, and others). The system includes the build-pathways infrastructure for molecular interactions/relations and module discovery in high-throughput experiments. Also implemented in BiologicalNetworks are the Integrated Genome Viewer and Comparative Genomics Browser applications, which allow for the search and analysis of gene regulatory regions and their conservation in multiple species in conjunction with molecular pathways/networks, experimental data and functional annotations. CONCLUSIONS: The new release of BiologicalNetworks together with its back-end database introduces extensive functionality for a more efficient integrated multi-level analysis of microarray, sequence, regulatory, and other data. BiologicalNetworks is freely available at http://www.biologicalnetworks.org.


Subject(s)
Database Management Systems, Genetic Databases, Genomics/methods, Computational Biology/methods, Genome, Internet, Molecular Sequence Annotation, Oligonucleotide Array Sequence Analysis, DNA Sequence Analysis
18.
Nucleic Acids Res ; 34(Web Server issue): W466-71, 2006 Jul 01.
Article in English | MEDLINE | ID: mdl-16845051

ABSTRACT

Systems level investigation of genomic scale information requires the development of truly integrated databases dealing with heterogeneous data, which can be queried for simple properties of genes or other database objects as well as for complex network level properties, for the analysis and modelling of complex biological processes. Towards that goal, we recently constructed PathSys, a data integration platform for systems biology, which provides dynamic integration over a diverse set of databases [Baitaluk et al. (2006) BMC Bioinformatics 7, 55]. Here we describe a server, BiologicalNetworks, which provides visualization, analysis services and an information management framework over PathSys. The server allows easy retrieval, construction and visualization of complex biological networks, including genome-scale integrated networks of protein-protein, protein-DNA and genetic interactions. Most importantly, BiologicalNetworks addresses the need for systematic presentation and analysis of high-throughput expression data by mapping and analysis of expression profiles of genes or proteins simultaneously on to regulatory, metabolic and cellular networks. BiologicalNetworks Server is available at http://brak.sdsc.edu/pub/BiologicalNetworks.


Subject(s)
Genomics/methods, Software, Systems Biology/methods, Computer Graphics, Genetic Databases, Gene Expression Profiling, Internet, Oligonucleotide Array Sequence Analysis, Systems Integration, User-Computer Interface