Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 19 de 19
Filtrar
1.
Nucleic Acids Res ; 50(W1): W57-W65, 2022 07 05.
Artigo em Inglês | MEDLINE | ID: mdl-35640593

RESUMO

The Annotation Query (AnnoQ) (http://annoq.org/) is designed to provide comprehensive and up-to-date functional annotations for human genetic variants. The system is supported by an annotation database with ∼39 million human variants from the Haplotype Reference Consortium (HRC) pre-annotated with sequence feature annotations by WGSA and functional annotations to Gene Ontology (GO) and pathways in PANTHER. The database operates on an optimized Elasticsearch framework to support real-time complex searches. This implementation enables users to annotate data with the most up-to-date functional annotations via simple queries instead of setting up individual tools. A web interface allows users to interactively browse the annotations, annotate variants and search variant data. Its easy-to-use interface and search capabilities are well-suited for scientists with fewer bioinformatics skills such as bench scientists and statisticians. AnnoQ also has an API for users to access and annotate the data programmatically. Packages for programming languages, such as the R package, are available for users to embed the annotation queries in their scripts. AnnoQ serves researchers with a wide range of backgrounds and research interests as an integrated annotation platform.


Assuntos
Variação Genética , Anotação de Sequência Molecular , Software , Humanos , Bases de Dados Genéticas , Internet , Anotação de Sequência Molecular/métodos , Interface Usuário-Computador , Variação Genética/genética , Haplótipos/genética , Linguagens de Programação
2.
Nucleic Acids Res ; 49(D1): D394-D403, 2021 01 08.
Artigo em Inglês | MEDLINE | ID: mdl-33290554

RESUMO

PANTHER (Protein Analysis Through Evolutionary Relationships, http://www.pantherdb.org) is a resource for the evolutionary and functional classification of protein-coding genes from all domains of life. The evolutionary classification is based on a library of over 15,000 phylogenetic trees, and the functional classifications include Gene Ontology terms and pathways. Here, we analyze the current coverage of genes from genomes in different taxonomic groups, so that users can better understand what to expect when analyzing a gene list using PANTHER tools. We also describe extensive improvements to PANTHER made in the past two years. The PANTHER Protein Class ontology has been completely refactored, and 6101 PANTHER families have been manually assigned to a Protein Class, providing a high level classification of protein families and their genes. Users can access the TreeGrafter tool to add their own protein sequences to the reference phylogenetic trees in PANTHER, to infer evolutionary context as well as fine-grained annotations. We have added human enhancer-gene links that associate non-coding regions with the annotated human genes in PANTHER. We have also expanded the available services for programmatic access to PANTHER tools and data via application programming interfaces (APIs). Other improvements include additional plant genomes and an updated PANTHER GO-slim.


Assuntos
Biologia Computacional/métodos , Elementos Facilitadores Genéticos/genética , Filogenia , Software , Interface Usuário-Computador , Evolução Molecular , Ontologia Genética , Genoma , Anotação de Sequência Molecular , Fases de Leitura Aberta/genética
3.
Nucleic Acids Res ; 47(D1): D419-D426, 2019 01 08.
Artigo em Inglês | MEDLINE | ID: mdl-30407594

RESUMO

PANTHER (Protein Analysis Through Evolutionary Relationships, http://pantherdb.org) is a resource for the evolutionary and functional classification of genes from organisms across the tree of life. We report the improvements we have made to the resource during the past two years. For evolutionary classifications, we have added more prokaryotic and plant genomes to the phylogenetic gene trees, expanding the representation of gene evolution in these lineages. We have refined many protein family boundaries, and have aligned PANTHER with the MEROPS resource for protease and protease inhibitor families. For functional classifications, we have developed an entirely new PANTHER GO-slim, containing over four times as many Gene Ontology terms as our previous GO-slim, as well as curated associations of genes to these terms. Lastly, we have made substantial improvements to the enrichment analysis tools available on the PANTHER website: users can now analyze over 900 different genomes, using updated statistical tests with false discovery rate corrections for multiple testing. The overrepresentation test is also available as a web service, for easy addition to third-party sites.


Assuntos
Bases de Dados Genéticas , Genoma , Proteínas/classificação , Animais , Evolução Molecular , Ontologia Genética , Genes , Genoma Microbiano , Genoma de Planta , Peptídeo Hidrolases/classificação , Filogenia , Proteínas/genética , Software
4.
Nucleic Acids Res ; 47(D1): D271-D279, 2019 01 08.
Artigo em Inglês | MEDLINE | ID: mdl-30371900

RESUMO

A growing number of whole genome sequencing projects, in combination with development of phylogenetic methods for reconstructing gene evolution, have provided us with a window into genomes that existed millions, and even billions, of years ago. Ancestral Genomes (http://ancestralgenomes.org) is a resource for comprehensive reconstructions of these 'fossil genomes'. Comprehensive sets of protein-coding genes have been reconstructed for 78 genomes of now-extinct species that were the common ancestors of extant species from across the tree of life. The reconstructed genes are based on the extensive library of over 15 000 gene family trees from the PANTHER database, and are updated on a yearly basis. For each ancestral gene, we assign a stable identifier, and provide additional information designed to facilitate analysis: an inferred name, a reconstructed protein sequence, a set of inferred Gene Ontology (GO) annotations, and a 'proxy gene' for each ancestral gene, defined as the least-diverged descendant of the ancestral gene in a given extant genome. On the Ancestral Genomes website, users can browse the Ancestral Genomes by selecting nodes in a species tree, and can compare an extant genome with any of its reconstructed ancestors to understand how the genome evolved.


Assuntos
Bases de Dados Genéticas , Evolução Molecular , Genes , Genoma , Filogenia , Animais , Eucariotos/genética , Extinção Biológica , Genes Arqueais , Genes Bacterianos , Genes de Protozoários , Anotação de Sequência Molecular , Software
5.
Nucleic Acids Res ; 45(D1): D183-D189, 2017 01 04.
Artigo em Inglês | MEDLINE | ID: mdl-27899595

RESUMO

The PANTHER database (Protein ANalysis THrough Evolutionary Relationships, http://pantherdb.org) contains comprehensive information on the evolution and function of protein-coding genes from 104 completely sequenced genomes. PANTHER software tools allow users to classify new protein sequences, and to analyze gene lists obtained from large-scale genomics experiments. In the past year, major improvements include a large expansion of classification information available in PANTHER, as well as significant enhancements to the analysis tools. Protein subfamily functional classifications have more than doubled due to progress of the Gene Ontology Phylogenetic Annotation Project. For human genes (as well as a few other organisms), PANTHER now also supports enrichment analysis using pathway classifications from the Reactome resource. The gene list enrichment tools include a new 'hierarchical view' of results, enabling users to leverage the structure of the classifications/ontologies; the tools also allow users to upload genetic variant data directly, rather than requiring prior conversion to a gene list. The updated coding single-nucleotide polymorphisms (SNP) scoring tool uses an improved algorithm. The hidden Markov model (HMM) search tools now use HMMER3, dramatically reducing search times and improving accuracy of E-value statistics. Finally, the PANTHER Tree-Attribute Viewer has been implemented in JavaScript, with new views for exploring protein sequence evolution.


Assuntos
Biologia Computacional/métodos , Genômica/métodos , Software , Curadoria de Dados , Bases de Dados Genéticas , Ontologia Genética , Filogenia , Polimorfismo de Nucleotídeo Único , Navegador
6.
Nucleic Acids Res ; 44(D1): D336-42, 2016 Jan 04.
Artigo em Inglês | MEDLINE | ID: mdl-26578592

RESUMO

PANTHER (Protein Analysis THrough Evolutionary Relationships, http://pantherdb.org) is a widely used online resource for comprehensive protein evolutionary and functional classification, and includes tools for large-scale biological data analysis. Recent development has been focused in three main areas: genome coverage, functional information ('annotation') coverage and accuracy, and improved genomic data analysis tools. The latest version of PANTHER, 10.0, includes almost 5000 new protein families (for a total of over 12 000 families), each with a reference phylogenetic tree including protein-coding genes from 104 fully sequenced genomes spanning all kingdoms of life. Phylogenetic trees now include inference of horizontal transfer events in addition to speciation and gene duplication events. Functional annotations are regularly updated using the models generated by the Gene Ontology Phylogenetic Annotation Project. For the data analysis tools, PANTHER has expanded the number of different 'functional annotation sets' available for functional enrichment testing, allowing analyses to access all Gene Ontology annotations--updated monthly from the Gene Ontology database--in addition to the annotations that have been inferred through evolutionary relationships. The Prowler (data browser) has been updated to enable users to more efficiently browse the entire database, and to create custom gene lists using the multiple axes of classification in PANTHER.


Assuntos
Bases de Dados de Proteínas , Proteínas/classificação , Genômica , Humanos , Anotação de Sequência Molecular , Filogenia , Proteínas/genética , Proteínas/fisiologia , Software
7.
Nucleic Acids Res ; 42(Database issue): D677-84, 2014 Jan.
Artigo em Inglês | MEDLINE | ID: mdl-24285306

RESUMO

PortEco (http://porteco.org) aims to collect, curate and provide data and analysis tools to support basic biological research in Escherichia coli (and eventually other bacterial systems). PortEco is implemented as a 'virtual' model organism database that provides a single unified interface to the user, while integrating information from a variety of sources. The main focus of PortEco is to enable broad use of the growing number of high-throughput experiments available for E. coli, and to leverage community annotation through the EcoliWiki and GONUTS systems. Currently, PortEco includes curated data from hundreds of genome-wide RNA expression studies, from high-throughput phenotyping of single-gene knockouts under hundreds of annotated conditions, from chromatin immunoprecipitation experiments for tens of different DNA-binding factors and from ribosome profiling experiments that yield insights into protein expression. Conditions have been annotated with a consistent vocabulary, and data have been consistently normalized to enable users to find, compare and interpret relevant experiments. PortEco includes tools for data analysis, including clustering, enrichment analysis and exploration via genome browsers. PortEco search and data analysis tools are extensively linked to the curated gene, metabolic pathway and regulation content at its sister site, EcoCyc.


Assuntos
Bases de Dados Genéticas , Escherichia coli/genética , Alelos , Proteínas de Ligação a DNA/metabolismo , Escherichia coli/metabolismo , Proteínas de Escherichia coli/metabolismo , Genes Bacterianos , Genoma Bacteriano , Sequenciamento de Nucleotídeos em Larga Escala , Internet , Fenótipo , RNA Mensageiro/metabolismo , Ribossomos/metabolismo , Software
8.
Nucleic Acids Res ; 41(Database issue): D377-86, 2013 Jan.
Artigo em Inglês | MEDLINE | ID: mdl-23193289

RESUMO

The data and tools in PANTHER-a comprehensive, curated database of protein families, trees, subfamilies and functions available at http://pantherdb.org-have undergone continual, extensive improvement for over a decade. Here, we describe the current PANTHER process as a whole, as well as the website tools for analysis of user-uploaded data. The main goals of PANTHER remain essentially unchanged: the accurate inference (and practical application) of gene and protein function over large sequence databases, using phylogenetic trees to extrapolate from the relatively sparse experimental information from a few model organisms. Yet the focus of PANTHER has continually shifted toward more accurate and detailed representations of evolutionary events in gene family histories. The trees are now designed to represent gene family evolution, including inference of evolutionary events, such as speciation and gene duplication. Subfamilies are still curated and used to define HMMs, but gene ontology functional annotations can now be made at any node in the tree, and are designed to represent gain and loss of function by ancestral genes during evolution. Finally, PANTHER now includes stable database identifiers for inferred ancestral genes, which are used to associate inferred gene attributes with particular genes in the common ancestral genomes of extant species.


Assuntos
Bases de Dados Genéticas , Família Multigênica , Filogenia , Proteínas/classificação , Análise por Conglomerados , Evolução Molecular , Genes , Internet , Modelos Genéticos , Anotação de Sequência Molecular , Proteínas/genética , Software
9.
Bioinformatics ; 27(24): 3437-8, 2011 Dec 15.
Artigo em Inglês | MEDLINE | ID: mdl-22021903

RESUMO

MOTIVATION: BioPAX is a standard language for representing and exchanging models of biological processes at the molecular and cellular levels. It is widely used by different pathway databases and genomics data analysis software. Currently, the primary source of BioPAX data is direct exports from the curated pathway databases. It is still uncommon for wet-lab biologists to share and exchange pathway knowledge using BioPAX. Instead, pathways are usually represented as informal diagrams in the literature. In order to encourage formal representation of pathways, we describe a software package that allows users to create pathway diagrams using CellDesigner, a user-friendly graphical pathway-editing tool and save the pathway data in BioPAX Level 3 format. AVAILABILITY: The plug-in is freely available and can be downloaded at ftp://ftp.pantherdb.org/CellDesigner/plugins/BioPAX/ CONTACT: huaiyumi@usc.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Assuntos
Modelos Biológicos , Software , Animais , Biologia Computacional , Bases de Dados Factuais , Humanos , Redes e Vias Metabólicas , Transdução de Sinais
10.
Nucleic Acids Res ; 38(Database issue): D204-10, 2010 Jan.
Artigo em Inglês | MEDLINE | ID: mdl-20015972

RESUMO

Protein Analysis THrough Evolutionary Relationships (PANTHER) is a comprehensive software system for inferring the functions of genes based on their evolutionary relationships. Phylogenetic trees of gene families form the basis for PANTHER and these trees are annotated with ontology terms describing the evolution of gene function from ancestral to modern day genes. One of the main applications of PANTHER is in accurate prediction of the functions of uncharacterized genes, based on their evolutionary relationships to genes with functions known from experiment. The PANTHER website, freely available at http://www.pantherdb.org, also includes software tools for analyzing genomic data relative to known and inferred gene functions. Since 2007, there have been several new developments to PANTHER: (i) improved phylogenetic trees, explicitly representing speciation and gene duplication events, (ii) identification of gene orthologs, including least diverged orthologs (best one-to-one pairs), (iii) coverage of more genomes (48 genomes, up to 87% of genes in each genome; see http://www.pantherdb.org/panther/summaryStats.jsp), (iv) improved support for alternative database identifiers for genes, proteins and microarray probes and (v) adoption of the SBGN standard for display of biological pathways. In addition, PANTHER trees are being annotated with gene function as part of the Gene Ontology Reference Genome project, resulting in an increasing number of curated functional annotations.


Assuntos
Biologia Computacional/métodos , Bases de Dados Genéticas , Bases de Dados de Ácidos Nucleicos , Algoritmos , Animais , Biologia Computacional/tendências , Bases de Dados de Proteínas , Evolução Molecular , Humanos , Armazenamento e Recuperação da Informação/métodos , Internet , Filogenia , Estrutura Terciária de Proteína , Análise de Sequência de Proteína , Homologia de Sequência de Aminoácidos , Software
11.
Protein Sci ; 31(1): 8-22, 2022 01.
Artigo em Inglês | MEDLINE | ID: mdl-34717010

RESUMO

Phylogenetics is a powerful tool for analyzing protein sequences, by inferring their evolutionary relationships to other proteins. However, phylogenetics analyses can be challenging: they are computationally expensive and must be performed carefully in order to avoid systematic errors and artifacts. Protein Analysis THrough Evolutionary Relationships (PANTHER; http://pantherdb.org) is a publicly available, user-focused knowledgebase that stores the results of an extensive phylogenetic reconstruction pipeline that includes computational and manual processes and quality control steps. First, fully reconciled phylogenetic trees (including ancestral protein sequences) are reconstructed for a set of "reference" protein sequences obtained from fully sequenced genomes of organisms across the tree of life. Second, the resulting phylogenetic trees are manually reviewed and annotated with function evolution events: inferred gains and losses of protein function along branches of the phylogenetic tree. Here, we describe in detail the current contents of PANTHER, how those contents are generated, and how they can be used in a variety of applications. The PANTHER knowledgebase can be downloaded or accessed via an extensive API. In addition, PANTHER provides software tools to facilitate the application of the knowledgebase to common protein sequence analysis tasks: exploring an annotated genome by gene function; performing "enrichment analysis" of lists of genes; annotating a single sequence or large batch of sequences by homology; and assessing the likelihood that a genetic variant at a particular site in a protein will have deleterious effects.


Assuntos
Bases de Dados de Proteínas , Evolução Molecular , Filogenia , Proteínas , Análise de Sequência de Proteína , Software , Anotação de Sequência Molecular , Proteínas/química , Proteínas/genética
12.
PLoS One ; 15(12): e0243791, 2020.
Artigo em Inglês | MEDLINE | ID: mdl-33320871

RESUMO

Enhancers are powerful and versatile agents of cell-type specific gene regulation, which are thought to play key roles in human disease. Enhancers are short DNA elements that function primarily as clusters of transcription factor binding sites that are spatially coordinated to regulate expression of one or more specific target genes. These regulatory connections between enhancers and target genes can therefore be characterized as enhancer-gene links that can affect development, disease, and homeostatic cellular processes. Despite their implication in disease and the establishment of cell identity during development, most enhancer-gene links remain unknown. Here we introduce a new, publicly accessible database of predicted enhancer-gene links, PEREGRINE. The PEREGRINE human enhancer-gene links interactive web interface incorporates publicly available experimental data from ChIA-PET, eQTL, and Hi-C assays across 78 cell and tissue types to link 449,627 enhancers to 17,643 protein-coding genes. These enhancer-gene links are made available through the new Enhancer module of the PANTHER database and website where the user may easily access the evidence for each enhancer-gene link, as well as query by target gene and enhancer location.


Assuntos
Elementos Facilitadores Genéticos/genética , Genômica/métodos , Linhagem Celular , Bases de Dados Genéticas , Locos de Características Quantitativas/genética
13.
Plant Direct ; 4(12): e00293, 2020 Dec.
Artigo em Inglês | MEDLINE | ID: mdl-33392435

RESUMO

We aim to enable the accurate and efficient transfer of knowledge about gene function gained from Arabidopsis thaliana and other model organisms to other plant species. This knowledge transfer is frequently challenging in plants due to duplications of individual genes and whole genomes in plant lineages. Such duplications result in complex evolutionary relationships between related genes, which may have similar sequences but highly divergent functions. In such cases, functional inference requires more than a simple sequence similarity calculation. We have developed an online resource, PhyloGenes (phylogenes.org), that displays precomputed phylogenetic trees for plant gene families along with experimentally validated function information for individual genes within the families. A total of 40 plant genomes and 10 non-plant model organisms are represented in over 8,000 gene families. Evolutionary events such as speciation and duplication are clearly labeled on gene trees to distinguish orthologs from paralogs. Nearly 6,000 families have at least one member with an experimentally supported annotation to a Gene Ontology (GO) molecular function or biological process term. By displaying experimentally validated gene functions associated to individual genes within a tree, PhyloGenes enables functional inference for genes of uncharacterized function, based on their evolutionary relationships to experimentally studied genes, in a visually traceable manner. For the many families containing genes that have evolved to perform different functions, PhyloGenes facilitates the use of evolutionary history to determine the most likely function of genes that have not been experimentally characterized. Future work will enrich the resource by incorporating additional gene function datasets such as plant gene expression atlas data.

14.
Nat Protoc ; 14(3): 703-721, 2019 03.
Artigo em Inglês | MEDLINE | ID: mdl-30804569

RESUMO

The PANTHER classification system ( http://www.pantherdb.org ) is a comprehensive system that combines genomes, gene function classifications, pathways and statistical analysis tools to enable biologists to analyze large-scale genome-wide experimental data. The current system (PANTHER v.14.0) covers 131 complete genomes organized into gene families and subfamilies; evolutionary relationships between genes are represented in phylogenetic trees, multiple sequence alignments and statistical models (hidden Markov models (HMMs)). The families and subfamilies are annotated with Gene Ontology (GO) terms, and sequences are assigned to PANTHER pathways. A suite of tools has been built to allow users to browse and query gene functions and analyze large-scale experimental data with a number of statistical tests. PANTHER is widely used by bench scientists, bioinformaticians, computer scientists and systems biologists. Since the protocol for using this tool (v.8.0) was originally published in 2013, there have been substantial improvements and updates in the areas of data quality, data coverage, statistical algorithms and user experience. This Protocol Update provides detailed instructions on how to analyze genome-wide experimental data in the PANTHER classification system.


Assuntos
Genes , Genômica/métodos , Software , Animais , Bases de Dados Genéticas , Ontologia Genética , Genótipo , Humanos , Anotação de Sequência Molecular , Filogenia , Estatística como Assunto
15.
Nucleic Acids Res ; 34(Web Server issue): W645-50, 2006 Jul 01.
Artigo em Inglês | MEDLINE | ID: mdl-16912992

RESUMO

The vast amount of protein sequence data now available, together with accumulating experimental knowledge of protein function, enables modeling of protein sequence and function evolution. The PANTHER database was designed to model evolutionary sequence-function relationships on a large scale. There are a number of applications for these data, and we have implemented web services that address three of them. The first is a protein classification service. Proteins can be classified, using only their amino acid sequences, to evolutionary groups at both the family and subfamily levels. Specific subfamilies, and often families, are further classified when possible according to their functions, including molecular function and the biological processes and pathways they participate in. The second application, then, is an expression data analysis service, where functional classification information can help find biological patterns in the data obtained from genome-wide experiments. The third application is a coding single-nucleotide polymorphism scoring service. In this case, information about evolutionarily related proteins is used to assess the likelihood of a deleterious effect on protein function arising from a single substitution at a specific amino acid position in the protein. All three web services are available at http://www.pantherdb.org/tools.


Assuntos
Bases de Dados de Proteínas , Evolução Molecular , Polimorfismo de Nucleotídeo Único , Proteínas/genética , Proteínas/fisiologia , Análise de Sequência de Proteína , Software , Substituição de Aminoácidos , Animais , Gráficos por Computador , Interpretação Estatística de Dados , Drosophila melanogaster/genética , Genômica , Humanos , Internet , Cadeias de Markov , Camundongos , Proteínas/classificação , RNA Mensageiro/metabolismo , Ratos , Interface Usuário-Computador
16.
Nucleic Acids Res ; 33(Database issue): D284-8, 2005 Jan 01.
Artigo em Inglês | MEDLINE | ID: mdl-15608197

RESUMO

PANTHER is a large collection of protein families that have been subdivided into functionally related subfamilies, using human expertise. These subfamilies model the divergence of specific functions within protein families, allowing more accurate association with function (ontology terms and pathways), as well as inference of amino acids important for functional specificity. Hidden Markov models (HMMs) are built for each family and subfamily for classifying additional protein sequences. The latest version, 5.0, contains 6683 protein families, divided into 31,705 subfamilies, covering approximately 90% of mammalian protein-coding genes. PANTHER 5.0 includes a number of significant improvements over previous versions, most notably (i) representation of pathways (primarily signaling pathways) and association with subfamilies and individual protein sequences; (ii) an improved methodology for defining the PANTHER families and subfamilies, and for building the HMMs; (iii) resources for scoring sequences against PANTHER HMMs both over the web and locally; and (iv) a number of new web resources to facilitate analysis of large gene lists, including data generated from high-throughput expression experiments. Efforts are underway to add PANTHER to the InterPro suite of databases, and to make PANTHER consistent with the PIRSF database. PANTHER is now publicly available without restriction at http://panther.appliedbiosystems.com.


Assuntos
Bases de Dados de Proteínas , Proteínas/classificação , Análise de Sequência de Proteína , Animais , Bases de Dados de Proteínas/estatística & dados numéricos , Perfilação da Expressão Gênica , Humanos , Internet , Cadeias de Markov , Camundongos , Proteínas/química , Proteínas/fisiologia , Ratos , Transdução de Sinais , Integração de Sistemas , Interface Usuário-Computador
17.
Nucleic Acids Res ; 31(1): 334-41, 2003 Jan 01.
Artigo em Inglês | MEDLINE | ID: mdl-12520017

RESUMO

The PANTHER database was designed for high-throughput analysis of protein sequences. One of the key features is a simplified ontology of protein function, which allows browsing of the database by biological functions. Biologist curators have associated the ontology terms with groups of protein sequences rather than individual sequences. Statistical models (Hidden Markov Models, or HMMs) are built from each of these groups. The advantage of this approach is that new sequences can be automatically classified as they become available. To ensure accurate functional classification, HMMs are constructed not only for families, but also for functionally distinct subfamilies. Multiple sequence alignments and phylogenetic trees, including curator-assigned information, are available for each family. The current version of the PANTHER database includes training sequences from all organisms in the GenBank non-redundant protein database, and the HMMs have been used to classify gene products across the entire genomes of human, and Drosophila melanogaster. The ontology terms and protein families and subfamilies, as well as Drosophila gene c;assifications, can be browsed and searched for free. Due to outstanding contractual obligations, access to human gene classifications and to protein family trees and multiple sequence alignments will temporarily require a nominal registration fee. PANTHER is publicly available on the web at http://panther.celera.com.


Assuntos
Bases de Dados de Proteínas , Proteínas/classificação , Proteínas/fisiologia , Animais , Humanos , Filogenia , Estrutura Terciária de Proteína , Proteínas/química , Proteínas/genética , Alinhamento de Sequência
18.
Nat Protoc ; 8(8): 1551-66, 2013 Aug.
Artigo em Inglês | MEDLINE | ID: mdl-23868073

RESUMO

The PANTHER (protein annotation through evolutionary relationship) classification system (http://www.pantherdb.org/) is a comprehensive system that combines gene function, ontology, pathways and statistical analysis tools that enable biologists to analyze large-scale, genome-wide data from sequencing, proteomics or gene expression experiments. The system is built with 82 complete genomes organized into gene families and subfamilies, and their evolutionary relationships are captured in phylogenetic trees, multiple sequence alignments and statistical models (hidden Markov models or HMMs). Genes are classified according to their function in several different ways: families and subfamilies are annotated with ontology terms (Gene Ontology (GO) and PANTHER protein class), and sequences are assigned to PANTHER pathways. The PANTHER website includes a suite of tools that enable users to browse and query gene functions, and to analyze large-scale experimental data with a number of statistical tests. It is widely used by bench scientists, bioinformaticians, computer scientists and systems biologists. In the 2013 release of PANTHER (v.8.0), in addition to an update of the data content, we redesigned the website interface to improve both user experience and the system's analytical capability. This protocol provides a detailed description of how to analyze genome-wide experimental data with the PANTHER classification system.


Assuntos
Anotação de Sequência Molecular/métodos , Filogenia , Proteínas/classificação , Software , Bases de Dados de Proteínas , Internet , Proteínas/química , Estatísticas não Paramétricas
19.
Genome Res ; 13(9): 2129-41, 2003 Sep.
Artigo em Inglês | MEDLINE | ID: mdl-12952881

RESUMO

In the genomic era, one of the fundamental goals is to characterize the function of proteins on a large scale. We describe a method, PANTHER, for relating protein sequence relationships to function relationships in a robust and accurate way. PANTHER is composed of two main components: the PANTHER library (PANTHER/LIB) and the PANTHER index (PANTHER/X). PANTHER/LIB is a collection of "books," each representing a protein family as a multiple sequence alignment, a Hidden Markov Model (HMM), and a family tree. Functional divergence within the family is represented by dividing the tree into subtrees based on shared function, and by subtree HMMs. PANTHER/X is an abbreviated ontology for summarizing and navigating molecular functions and biological processes associated with the families and subfamilies. We apply PANTHER to three areas of active research. First, we report the size and sequence diversity of the families and subfamilies, characterizing the relationship between sequence divergence and functional divergence across a wide range of protein families. Second, we use the PANTHER/X ontology to give a high-level representation of gene function across the human and mouse genomes. Third, we use the family HMMs to rank missense single nucleotide polymorphisms (SNPs), on a database-wide scale, according to their likelihood of affecting protein function.


Assuntos
Algoritmos , Bases de Dados de Proteínas , Proteínas/classificação , Proteínas/fisiologia , Sequência de Aminoácidos , Animais , Biologia Computacional/métodos , Biologia Computacional/estatística & dados numéricos , Sistemas de Gerenciamento de Base de Dados , Variação Genética , Humanos , Cadeias de Markov , Camundongos , Mutação de Sentido Incorreto/genética , Biblioteca de Peptídeos , Polimorfismo de Nucleotídeo Único/genética , Valor Preditivo dos Testes , Proteínas/genética , Alinhamento de Sequência , Software , Terminologia como Assunto
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA