Search | BVS España Search Portal

PANTHER version 16: a revised family classification, tree-based classification tool, enhancer regions and extensive API.

Mi, Huaiyu; Ebert, Dustin; Muruganujan, Anushya; Mills, Caitlin; Albou, Laurent-Philippe; Mushayahama, Tremayne; Thomas, Paul D.

Nucleic Acids Res ; 49(D1): D394-D403, 2021 01 08.

Article in English | MEDLINE | ID: mdl-33290554

ABSTRACT

PANTHER (Protein Analysis Through Evolutionary Relationships, http://www.pantherdb.org) is a resource for the evolutionary and functional classification of protein-coding genes from all domains of life. The evolutionary classification is based on a library of over 15,000 phylogenetic trees, and the functional classifications include Gene Ontology terms and pathways. Here, we analyze the current coverage of genes from genomes in different taxonomic groups, so that users can better understand what to expect when analyzing a gene list using PANTHER tools. We also describe extensive improvements to PANTHER made in the past two years. The PANTHER Protein Class ontology has been completely refactored, and 6101 PANTHER families have been manually assigned to a Protein Class, providing a high level classification of protein families and their genes. Users can access the TreeGrafter tool to add their own protein sequences to the reference phylogenetic trees in PANTHER, to infer evolutionary context as well as fine-grained annotations. We have added human enhancer-gene links that associate non-coding regions with the annotated human genes in PANTHER. We have also expanded the available services for programmatic access to PANTHER tools and data via application programming interfaces (APIs). Other improvements include additional plant genomes and an updated PANTHER GO-slim.

Subject(s)

Computational Biology/methods, Genetic Enhancer Elements/genetics, Phylogeny, Software, User-Computer Interface, Molecular Evolution, Gene Ontology, Genome, Molecular Sequence Annotation, Open Reading Frames/genetics

Reactome and the Gene Ontology: digital convergence of data resources.

Good, Benjamin M; Van Auken, Kimberly; Hill, David P; Mi, Huaiyu; Carbon, Seth; Balhoff, James P; Albou, Laurent-Philippe; Thomas, Paul D; Mungall, Christopher J; Blake, Judith A; D'Eustachio, Peter.

Bioinformatics ; 37(19): 3343-3348, 2021 Oct 11.

Article in English | MEDLINE | ID: mdl-33964129

ABSTRACT

MOTIVATION: Gene Ontology Causal Activity Models (GO-CAMs) assemble individual associations of gene products with cellular components, molecular functions and biological processes into causally linked activity flow models. Pathway databases such as the Reactome Knowledgebase create detailed molecular process descriptions of reactions and assemble them, based on sharing of entities between individual reactions into pathway descriptions. RESULTS: To convert the rich content of Reactome into GO-CAMs, we have developed a software tool, Pathways2GO, to convert the entire set of normal human Reactome pathways into GO-CAMs. This conversion yields standard GO annotations from Reactome content and supports enhanced quality control for both Reactome and GO, yielding a nearly seamless conversion between these two resources for the bioinformatics community. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Ancestral Genomes: a resource for reconstructed ancestral genes and genomes across the tree of life.

Huang, Xiaosong; Albou, Laurent-Philippe; Mushayahama, Tremayne; Muruganujan, Anushya; Tang, Haiming; Thomas, Paul D.

Nucleic Acids Res ; 47(D1): D271-D279, 2019 01 08.

Article in English | MEDLINE | ID: mdl-30371900

ABSTRACT

A growing number of whole genome sequencing projects, in combination with development of phylogenetic methods for reconstructing gene evolution, have provided us with a window into genomes that existed millions, and even billions, of years ago. Ancestral Genomes (http://ancestralgenomes.org) is a resource for comprehensive reconstructions of these 'fossil genomes'. Comprehensive sets of protein-coding genes have been reconstructed for 78 genomes of now-extinct species that were the common ancestors of extant species from across the tree of life. The reconstructed genes are based on the extensive library of over 15 000 gene family trees from the PANTHER database, and are updated on a yearly basis. For each ancestral gene, we assign a stable identifier, and provide additional information designed to facilitate analysis: an inferred name, a reconstructed protein sequence, a set of inferred Gene Ontology (GO) annotations, and a 'proxy gene' for each ancestral gene, defined as the least-diverged descendant of the ancestral gene in a given extant genome. On the Ancestral Genomes website, users can browse the Ancestral Genomes by selecting nodes in a species tree, and can compare an extant genome with any of its reconstructed ancestors to understand how the genome evolved.
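
The 'proxy gene' definition above can be illustrated with a minimal sketch. It assumes that divergence is measured as the summed branch length from the ancestral node to each descendant leaf; the abstract does not state the metric, so this is an assumption, and the `Node` class, the `proxy_gene` function, and the example tree are hypothetical names invented for illustration rather than part of the Ancestral Genomes resource.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A node in a gene tree (hypothetical structure for illustration)."""
    name: str
    branch_length: float = 0.0      # length of the branch leading to this node
    genome: Optional[str] = None    # set only on leaves (extant genes)
    children: List["Node"] = field(default_factory=list)


def proxy_gene(ancestor: Node, genome: str) -> Optional[Tuple[str, float]]:
    """Return (leaf name, distance) for the descendant of `ancestor` in
    `genome` with the smallest summed branch length, or None if the
    genome has no descendant of this ancestral gene.

    Assumption: "least diverged" means shortest path length in the tree.
    """
    best: Optional[Tuple[str, float]] = None
    # Depth-first walk, carrying the distance accumulated from the ancestor.
    stack = [(child, child.branch_length) for child in ancestor.children]
    while stack:
        node, dist = stack.pop()
        if not node.children:
            if node.genome == genome and (best is None or dist < best[1]):
                best = (node.name, dist)
        else:
            stack.extend((c, dist + c.branch_length) for c in node.children)
    return best


# Toy tree: one ancestral gene with two human descendants and one mouse one.
ancestor = Node("ANC", children=[
    Node("HUMAN_A", 0.3, "HUMAN"),
    Node("dup", 0.1, children=[
        Node("HUMAN_B", 0.1, "HUMAN"),
        Node("MOUSE_B", 0.4, "MOUSE"),
    ]),
])

print(proxy_gene(ancestor, "HUMAN"))
```

In this toy tree the human proxy is `HUMAN_B`, because its path from the ancestor is shorter than that of `HUMAN_A`. A genome with no descendant of the ancestral gene yields `None`.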

Subject(s)

Genetic Databases, Molecular Evolution, Genes, Genome, Phylogeny, Animals, Eukaryota/genetics, Biological Extinction, Archaeal Genes, Bacterial Genes, Protozoan Genes, Molecular Sequence Annotation, Software

M-ORBIS: mapping of molecular binding sites and surfaces.

Albou, Laurent-Philippe; Poch, Olivier; Moras, Dino.

Nucleic Acids Res ; 39(1): 30-43, 2011 Jan.

Article in English | MEDLINE | ID: mdl-20813758

ABSTRACT

M-ORBIS is a Molecular Cartography approach that performs integrative high-throughput analysis of structural data to localize all types of binding sites and associated partners by homology and to characterize their properties and behaviors in a systemic way. The robustness of our binding site inferences was compared to four curated datasets corresponding to protein heterodimers and homodimers and protein-DNA/RNA assemblies. The Molecular Cartographies of structurally well-detailed proteins show that 44% of their surfaces interact with non-solvent partners. Residue contact frequencies with water suggest that ~86% of their surfaces are transiently solvated, whereas only 15% are specifically solvated. Our analysis also reveals the existence of two major binding site families: specific binding sites, which can only bind one type of molecule (protein, DNA, RNA, etc.), and polyvalent binding sites, which can bind several distinct types of molecule. Specific homodimer binding sites are, for instance, nearly twice as hydrophobic as previously described and more closely resemble the protein core, while polyvalent binding sites able to form homo- and heterodimers more closely resemble the surfaces involved in crystal packing. Similarly, the regions able to bind DNA and to alternatively form homodimers are more hydrophobic and less polar than previously described DNA binding sites.

Subject(s)

Protein Conformation, Binding Sites, Computational Biology, Dimerization, Molecular Models, Protein Binding, Proteins/chemistry, Water/chemistry

PANTHER: Making genome-scale phylogenetics accessible to all.

Thomas, Paul D; Ebert, Dustin; Muruganujan, Anushya; Mushayahama, Tremayne; Albou, Laurent-Philippe; Mi, Huaiyu.

Protein Sci ; 31(1): 8-22, 2022 01.

Article in English | MEDLINE | ID: mdl-34717010

ABSTRACT

Phylogenetics is a powerful tool for analyzing protein sequences, by inferring their evolutionary relationships to other proteins. However, phylogenetics analyses can be challenging: they are computationally expensive and must be performed carefully in order to avoid systematic errors and artifacts. Protein Analysis THrough Evolutionary Relationships (PANTHER; http://pantherdb.org) is a publicly available, user-focused knowledgebase that stores the results of an extensive phylogenetic reconstruction pipeline that includes computational and manual processes and quality control steps. First, fully reconciled phylogenetic trees (including ancestral protein sequences) are reconstructed for a set of "reference" protein sequences obtained from fully sequenced genomes of organisms across the tree of life. Second, the resulting phylogenetic trees are manually reviewed and annotated with function evolution events: inferred gains and losses of protein function along branches of the phylogenetic tree. Here, we describe in detail the current contents of PANTHER, how those contents are generated, and how they can be used in a variety of applications. The PANTHER knowledgebase can be downloaded or accessed via an extensive API. In addition, PANTHER provides software tools to facilitate the application of the knowledgebase to common protein sequence analysis tasks: exploring an annotated genome by gene function; performing "enrichment analysis" of lists of genes; annotating a single sequence or large batch of sequences by homology; and assessing the likelihood that a genetic variant at a particular site in a protein will have deleterious effects.
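
The "enrichment analysis" of gene lists mentioned above can be illustrated with a minimal sketch. The abstract does not say which statistical test is used, so the one-sided hypergeometric over-representation test below is an assumption chosen because it is a common choice for this task; it is not presented as PANTHER's implementation. The function name and parameters are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def overrepresentation_p(n_reference: int, n_annotated: int,
                         n_list: int, n_hits: int) -> float:
    """One-sided hypergeometric p-value P(X >= n_hits).

    n_reference: genes in the reference set (e.g. a whole genome)
    n_annotated: reference genes carrying the annotation of interest
    n_list:      genes in the user's list (a subset of the reference)
    n_hits:      genes in the user's list carrying the annotation
    """
    max_hits = min(n_annotated, n_list)
    total = comb(n_reference, n_list)  # all ways to draw the gene list
    # Count draws with k annotated genes, for every k at least as extreme
    # as the observed number of hits.
    tail = sum(
        comb(n_annotated, k) * comb(n_reference - n_annotated, n_list - k)
        for k in range(n_hits, max_hits + 1)
    )
    return tail / total


# Toy example: 20 reference genes, 5 annotated, a list of 4 genes, 3 hits.
p = overrepresentation_p(20, 5, 4, 3)
print(f"p = {p:.4f}")
```

A small p-value suggests that the annotation appears in the list more often than expected by chance. In a real analysis, one test is run per annotation term, so the p-values would need correction for multiple testing.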

Subject(s)

Protein Databases, Molecular Evolution, Phylogeny, Proteins, Protein Sequence Analysis, Software, Molecular Sequence Annotation, Proteins/chemistry, Proteins/genetics

SM2PH-db: an interactive system for the integrated analysis of phenotypic consequences of missense mutations in proteins involved in human genetic diseases.

Friedrich, Anne; Garnier, Nicolas; Gagnière, Nicolas; Nguyen, Hoan; Albou, Laurent-Philippe; Biancalana, Valérie; Bettler, Emmanuel; Deléage, Gilbert; Lecompte, Odile; Muller, Jean; Moras, Dino; Mandel, Jean-Louis; Toursel, Thierry; Moulinier, Luc; Poch, Olivier.

Hum Mutat ; 31(2): 127-35, 2010 Feb.

Article in English | MEDLINE | ID: mdl-19921752

ABSTRACT

Understanding how genetic alterations affect gene products at the molecular level represents a first step in the elucidation of the complex relationships between genotypic and phenotypic variations, and is thus a major challenge in the postgenomic era. Here, we present SM2PH-db (http://decrypthon.igbmc.fr/sm2ph), a new database designed to investigate structural and functional impacts of missense mutations and their phenotypic effects in the context of human genetic diseases. A wealth of up-to-date interconnected information is provided for each of the 2,249 disease-related entry proteins (August 2009), including data retrieved from biological databases and data generated from a Sequence-Structure-Evolution Inference in Systems-based approach, such as multiple alignments, three-dimensional structural models, and multidimensional (physicochemical, functional, structural, and evolutionary) characterizations of mutations. SM2PH-db provides a robust infrastructure associated with interactive analysis tools supporting in-depth study and interpretation of the molecular consequences of mutations, with the longer-term goal of elucidating the chain of events leading from a molecular defect to its pathology. The entire content of SM2PH-db is regularly and automatically updated thanks to the computational grid data federation facilities provided in the context of the Decrypthon program.

Subject(s)

Protein Databases, Inborn Genetic Diseases/genetics, Missense Mutation/genetics, Software, Humans, Internet, Phenotype, Proteins, User-Computer Interface

Defining and characterizing protein surface using alpha shapes.

Albou, Laurent-Philippe; Schwarz, Benjamin; Poch, Olivier; Wurtz, Jean Marie; Moras, Dino.

Proteins ; 76(1): 1-12, 2009 Jul.

Article in English | MEDLINE | ID: mdl-19089982

ABSTRACT

The alpha shape of a molecule is a geometrical representation that provides a unique surface decomposition and a means to filter atomic contacts. We used it to revisit and unify the definition and computation of surface residues, contiguous patches, and curvature. These descriptors are evaluated and compared with former approaches on 85 proteins for which both bound and unbound forms are available. Based on the local density of interactions, the detection of surface residues shows a sensitivity of 98% while preserving a well-formed protein core. A novel conception of surface patch is defined by traveling along the surface from a central residue or atom. By construction, all surface patches are contiguous and therefore avoid the common problems of wrong selection and nonselection of neighbors. In the case of protein-binding site prediction, this new definition has improved the signal-to-noise ratio by 2.6 times compared with a widely used approach. With most common approaches, the computation of surface curvature can be locally biased by the presence of subsurface cavities and local variations of atomic densities. A novel notion of surface curvature is specifically developed to avoid such bias and is parametrizable to emphasize either local or global features. It defines a molecular landscape composed on average of 38% knobs and 62% clefts where interacting residues (IR) are 30% more frequent in knobs. A statistical analysis shows that residues in knobs are more charged, less hydrophobic and less aromatic than residues in clefts. IR in knobs are, however, much more hydrophobic and aromatic and less charged than noninteracting residues (non-IR) in knobs. Furthermore, IR are shown to be more accessible than non-IR both in clefts and knobs. The use of the alpha shape as a unifying framework allows for formal definitions, and fast and robust computations desirable in large-scale projects. This swiftness is not achieved to the detriment of quality, as proven by valid improvements compared with former approaches. In addition, our approach is general enough to be applied to nucleic acids and any other biomolecules.

Subject(s)

Computational Biology/methods, Proteins/chemistry, Amino Acids/chemistry, Binding Sites, Computer Simulation, Molecular Models, Nucleic Acids/chemistry, Protein Binding, Protein Conformation

Gene Ontology Causal Activity Modeling (GO-CAM) moves beyond GO annotations to structured descriptions of biological functions and systems.

Thomas, Paul D; Hill, David P; Mi, Huaiyu; Osumi-Sutherland, David; Van Auken, Kimberly; Carbon, Seth; Balhoff, James P; Albou, Laurent-Philippe; Good, Benjamin; Gaudet, Pascale; Lewis, Suzanna E; Mungall, Christopher J.

Nat Genet ; 51(10): 1429-1433, 2019 10.

Article in English | MEDLINE | ID: mdl-31548717

Subject(s)

Computational Biology/methods, Gene Ontology, Biological Models, Molecular Sequence Annotation, Signal Transduction, Genetic Databases, Humans, Phenotype
