Results 1 - 20 of 140
1.
Brief Bioinform ; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38436561

ABSTRACT

Enrichment analysis (EA) is a common approach to gain functional insights from genome-scale experiments. As a consequence, a large number of EA methods have been developed, yet it is unclear from previous studies which method is the best for a given dataset. The main issues with previous benchmarks include the complexity of correctly assigning true pathways to a test dataset, and lack of generality of the evaluation metrics, for which the rank of a single target pathway is commonly used. We here provide a generalized EA benchmark and apply it to the most widely used EA methods, representing all four categories of current approaches. The benchmark employs a new set of 82 curated gene expression datasets from DNA microarray and RNA-Seq experiments for 26 diseases, of which only 13 are cancers. In order to address the shortcomings of the single target pathway approach and to enhance the sensitivity evaluation, we present the Disease Pathway Network, in which related Kyoto Encyclopedia of Genes and Genomes pathways are linked. We introduce a novel approach to evaluate pathway EA by combining sensitivity and specificity to provide a balanced evaluation of EA methods. This approach identifies Network Enrichment Analysis methods as the overall top performers compared with overlap-based methods. By using randomized gene expression datasets, we explore the null hypothesis bias of each method, revealing that most of them produce skewed P-values.
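As a rough illustration of the overlap-based category mentioned in this abstract, the sketch below computes a one-sided hypergeometric enrichment P-value for a query gene set against a single pathway. It is not the benchmark's code: the function names and the toy gene identifiers are assumptions made for illustration only.

```python
from math import comb


def hypergeom_pvalue(universe_size, pathway_size, query_size, overlap):
    """P(X >= overlap) when query_size genes are drawn without replacement."""
    max_overlap = min(pathway_size, query_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def overlap_enrichment(query, pathway, universe):
    """Overlap-based enrichment of `query` in `pathway` (hypothetical helper)."""
    universe = set(universe)
    query = set(query) & universe      # ignore genes outside the background
    pathway = set(pathway) & universe
    return hypergeom_pvalue(
        len(universe), len(pathway), len(query), len(query & pathway)
    )
```

Under a well-calibrated method, P-values computed this way on randomized inputs should be roughly uniform; a skewed distribution is the kind of null-hypothesis bias the abstract refers to.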


Subjects
Benchmarking; RNA-Seq
2.
Nucleic Acids Res ; 50(W1): W398-W404, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35609981

ABSTRACT

Accurate inference of gene regulatory networks (GRN) is an essential component of systems biology, and there is a constant development of new inference methods. The most common approach to assess accuracy for publications is to benchmark the new method against a selection of existing algorithms. This often leads to a very limited comparison, potentially biasing the results, which may stem from tuning the benchmark's properties or incorrect application of other methods. These issues can be avoided by a web server with a broad range of data properties and inference algorithms, that makes it easy to perform comprehensive benchmarking of new methods, and provides a more objective assessment. Here we present https://GRNbenchmark.org/ - a new web server for benchmarking GRN inference methods, which provides the user with a set of benchmarks with several datasets, each spanning a range of properties including multiple noise levels. As soon as the web server has performed the benchmarking, the accuracy results are made privately available to the user via interactive summary plots and underlying curves. The user can then download these results for any purpose, and decide whether or not to make them public to share with the community.


Subjects
Benchmarking; Gene Regulatory Networks; Algorithms; Computers; Systems Biology/methods
3.
Nucleic Acids Res ; 50(W1): W623-W632, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35552456

ABSTRACT

The Orthology Benchmark Service (https://orthology.benchmarkservice.org) is the gold standard for orthology inference evaluation, supported and maintained by the Quest for Orthologs Consortium. It is an essential resource for comparing existing and new methods of orthology inference (the bedrock of many comparative genomics and phylogenetic analyses) over a standard dataset and through common procedures. The Quest for Orthologs Consortium is dedicated to keeping the resource up to date, through regular updates of the Reference Proteomes and increasingly accessible data through the OpenEBench platform. For this update, we have added a new benchmark based on curated orthology assertions from the Vertebrate Gene Nomenclature Committee, and provided an example meta-analysis of the public predictions present on the platform.


Subjects
Benchmarking; Genomics; Phylogeny; Genomics/methods; Proteome
4.
Bioinformatics ; 38(10): 2918-2919, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35561192

ABSTRACT

SUMMARY: Predicting orthologs, genes in different species having shared ancestry, is an important task in bioinformatics. Orthology prediction tools are required to make accurate and fast predictions, in order to analyze large amounts of data within a feasible time frame. InParanoid is a well-known algorithm for orthology analysis, shown to perform well in benchmarks, but having the major limitation of long runtimes on large datasets. Here, we present an update to the InParanoid algorithm that can use the faster tool DIAMOND instead of BLAST for the homolog search step. We show that it reduces the runtime by 94%, while still obtaining similar performance in the Quest for Orthologs benchmark. AVAILABILITY AND IMPLEMENTATION: The source code is available at (https://bitbucket.org/sonnhammergroup/inparanoid). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms; Software
5.
Bioinformatics ; 38(8): 2263-2268, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35176145

ABSTRACT

MOTIVATION: Inferring an accurate gene regulatory network (GRN) has long been a key goal in the field of systems biology. To do this, it is important to find a suitable balance between the maximum number of true-positive and the minimum number of false-positive interactions. Another key requirement is that the inference method can handle the large size of modern experimental data, meaning the method needs to be both fast and accurate. The Least Squares Cut-Off (LSCO) method can fulfill both these criteria; however, because it is based on least squares, it is vulnerable to the known issue of amplifying extreme values, small or large. In GRN inference this manifests as genes that are erroneously hyper-connected to a large fraction of all genes due to fold changes of extremely low value. RESULTS: We developed a GRN inference method called Least Squares Cut-Off with Normalization (LSCON) that tackles this problem. LSCON extends the LSCO algorithm by regularization to avoid hyper-connected genes and thereby reduce false positives. The regularization used is based on normalization, which removes effects of extreme values on the fit. We benchmarked LSCON and compared it to Genie3, LASSO, LSCO and Ridge regression, in terms of accuracy, speed and tendency to predict hyper-connected genes. The results show that LSCON achieves better or equal accuracy compared to LASSO, the best existing method, especially for data with extreme values. Thanks to the speed of least squares regression, LSCON does this an order of magnitude faster than LASSO. AVAILABILITY AND IMPLEMENTATION: Data: https://bitbucket.org/sonnhammergrni/lscon; Code: https://bitbucket.org/sonnhammergrni/genespider. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms; Gene Regulatory Networks; Least-Squares Analysis; Systems Biology; Benchmarking
6.
Bioinformatics ; 38(9): 2659-2660, 2022 04 28.
Article in English | MEDLINE | ID: mdl-35266519

ABSTRACT

MOTIVATION: Pathway annotation tools are indispensable for the interpretation of a wide range of experiments in life sciences. Network-based algorithms have recently been developed which are more sensitive than traditional overlap-based algorithms, but there is still a lack of good online tools for network-based pathway analysis. RESULTS: We present PathwAX II, a pathway analysis web tool based on network crosstalk analysis using the BinoX algorithm. It offers several new features compared with the first version, including interactive graphical network visualization of the crosstalk between a query gene set and an enriched pathway, and the addition of Reactome pathways. AVAILABILITY AND IMPLEMENTATION: PathwAX II is available at http://pathwax.sbc.su.se. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms; Software; Cell Physiological Phenomena
7.
Nucleic Acids Res ; 49(D1): D412-D419, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33125078

ABSTRACT

The Pfam database is a widely used resource for classifying protein sequences into families and domains. Since Pfam was last described in this journal, over 350 new families have been added in Pfam 33.1 and numerous improvements have been made to existing entries. To facilitate research on COVID-19, we have revised the Pfam entries that cover the SARS-CoV-2 proteome, and built new entries for regions that were not covered by Pfam. We have reintroduced Pfam-B which provides an automatically generated supplement to Pfam and contains 136 730 novel clusters of sequences that are not yet matched by a Pfam family. The new Pfam-B is based on a clustering by the MMseqs2 software. We have compared all of the regions in the RepeatsDB to those in Pfam and have started to use the results to build and refine Pfam repeat families. Pfam is freely available for browsing and download at http://pfam.xfam.org/.


Subjects
Computational Biology/statistics & numerical data; Databases, Protein; Proteins/metabolism; Proteome/metabolism; Animals; COVID-19/epidemiology; COVID-19/prevention & control; COVID-19/virology; Computational Biology/methods; Epidemics; Humans; Internet; Models, Molecular; Protein Structure, Tertiary; Proteins/chemistry; Proteins/genetics; Proteome/classification; Proteome/genetics; Repetitive Sequences, Amino Acid/genetics; SARS-CoV-2/genetics; SARS-CoV-2/physiology; Sequence Analysis, Protein/methods
8.
Mol Biol Evol ; 38(8): 3033-3045, 2021 07 29.
Article in English | MEDLINE | ID: mdl-33822172

ABSTRACT

Accurate determination of the evolutionary relationships between genes is a foundational challenge in biology. Homology (evolutionary relatedness) is in many cases readily determined based on sequence similarity analysis. By contrast, determining whether two genes directly descended from a common ancestor by a speciation event (orthologs) or a duplication event (paralogs) is more challenging, yet provides critical information on the history of a gene. Since 2009, this task has been the focus of the Quest for Orthologs (QFO) Consortium. The sixth QFO meeting took place in Okazaki, Japan in conjunction with the 67th National Institute for Basic Biology conference. Here, we report recent advances, applications, and oncoming challenges that were discussed during the conference. Steady progress has been made toward standardization and scalability of new and existing tools. A feature of the conference was the presentation of a panel of accessible tools for phylogenetic profiling and several developments to bring orthology beyond the gene unit, from domains to networks. This meeting brought to light several challenges to come: leveraging orthology computations to get the most out of the incoming avalanche of genomic data, integrating orthology from domain to biological network levels, building better gene models, and adapting orthology approaches to the broad evolutionary and genomic diversity recognized in different forms of life and viruses.


Subjects
Genetic Speciation; Genomics/trends; Phylogeny; Genome, Viral; Genomics/methods
9.
Brief Bioinform ; 21(4): 1224-1237, 2020 07 15.
Article in English | MEDLINE | ID: mdl-31281921

ABSTRACT

The vast amount of experimental data from recent advances in the field of high-throughput biology begs for integration into more complex data structures such as genome-wide functional association networks. Such networks have been used for elucidation of the interplay of intra-cellular molecules to make advances ranging from the basic science understanding of evolutionary processes to the more translational field of precision medicine. The allure of the field has resulted in rapid growth of the number of available network resources, each with unique attributes exploitable to answer different biological questions. Unfortunately, the high volume of network resources makes it impossible for the intended user to select an appropriate tool for their particular research question. The aim of this paper is to provide an overview of the underlying data and representative network resources as well as to mention methods of integration, allowing a customized approach to resource selection. Additionally, this report will provide a primer for researchers venturing into the field of network integration.


Subjects
Computational Biology/methods; Genome; Databases, Genetic
10.
Bioinformatics ; 37(20): 3553-3559, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-33978748

ABSTRACT

MOTIVATION: Accurate inference of gene regulatory interactions is of importance for understanding the mechanisms of underlying biological processes. For gene expression data gathered from targeted perturbations, gene regulatory network (GRN) inference methods that use the perturbation design are the top performing methods. However, the connection between the perturbation design and gene expression can be obfuscated due to problems, such as experimental noise or off-target effects, limiting the methods' ability to reconstruct the true GRN. RESULTS: In this study, we propose an algorithm, IDEMAX, to infer the effective perturbation design from gene expression data in order to eliminate the potential risk of fitting a disconnected perturbation design to gene expression. We applied IDEMAX to synthetic data from two different data generation tools, GeneNetWeaver and GeneSPIDER, and assessed its effect on the experiment design matrix as well as the accuracy of the GRN inference, followed by application to a real dataset. The results show that our approach consistently improves the accuracy of GRN inference compared to using the intended perturbation design when much of the signal is hidden by noise, which is often the case for real data. AVAILABILITY AND IMPLEMENTATION: https://bitbucket.org/sonnhammergrni/idemax. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

11.
Nucleic Acids Res ; 48(W1): W538-W545, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32374845

ABSTRACT

The identification of orthologs (genes in different species which descended from the same gene in their last common ancestor) is a prerequisite for many analyses in comparative genomics and molecular evolution. Numerous algorithms and resources have been conceived to address this problem, but benchmarking and interpreting them is fraught with difficulties (need to compare them on a common input dataset, absence of ground truth, computational cost of calling orthologs). To address this, the Quest for Orthologs consortium maintains a reference set of proteomes and provides a web server for continuous orthology benchmarking (http://orthology.benchmarkservice.org). Furthermore, consensus ortholog calls derived from public benchmark submissions are provided on the Alliance of Genome Resources website, the joint portal of NIH-funded model organism databases.


Subjects
Multigene Family; Proteome; Software; Animals; Benchmarking; Consensus; Genomics; Humans; Mice; Phylogeny; Rats
12.
Nucleic Acids Res ; 47(D1): D427-D432, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30357350

ABSTRACT

The last few years have witnessed significant changes in Pfam (https://pfam.xfam.org). The number of families has grown substantially to a total of 17,929 in release 32.0. New additions have been coupled with efforts to improve existing families, including refinement of domain boundaries, their classification into Pfam clans, as well as their functional annotation. We recently began to collaborate with the RepeatsDB resource to improve the definition of tandem repeat families within Pfam. We carried out a significant comparison to the structural classification database, namely the Evolutionary Classification of Protein Domains (ECOD) that led to the creation of 825 new families based on their set of uncharacterized families (EUFs). Furthermore, we also connected Pfam entries to the Sequence Ontology (SO) through mapping of the Pfam type definitions to SO terms. Since Pfam has many community contributors, we recently enabled the linking between authorship of all Pfam entries with the corresponding authors' ORCID identifiers. This effectively permits authors to claim credit for their Pfam curation and link them to their ORCID record.


Subjects
Databases, Protein; Proteins/classification; Molecular Sequence Annotation; Protein Domains; Proteins/chemistry; Repetitive Sequences, Amino Acid
13.
Int J Mol Sci ; 22(2), 2021 Jan 14.
Article in English | MEDLINE | ID: mdl-33466918

ABSTRACT

DNA methylation changes may predispose becoming IgE-sensitized to allergens. We analyzed whether DNA methylation in peripheral blood mononuclear cells (PBMC) is associated with IgE sensitization at 5 years of age (5Y). DNA methylation was measured in 288 PBMC samples from 74 mother/child pairs from the birth cohort ALADDIN (Assessment of Lifestyle and Allergic Disease During INfancy) using the HumanMethylation450BeadChip (Illumina). PBMCs were obtained from the mothers during pregnancy and from their children in cord blood, at 2 years and 5Y. DNA methylation levels at each time point were compared between children with and without IgE sensitization to allergens at 5Y. For replication, CpG sites associated with IgE sensitization in ALADDIN were evaluated in whole blood DNA of 256 children, 4 years old, from the BAMSE (Swedish abbreviation for Children, Allergy, Milieu, Stockholm, Epidemiology) cohort. We found 34 differentially methylated regions (DMRs) associated with IgE sensitization to airborne allergens and 38 DMRs associated with sensitization to food allergens in children at 5Y (Sidak p ≤ 0.05). Genes associated with airborne sensitization were enriched in the pathway of endocytosis, while genes associated with food sensitization were enriched in focal adhesion, the bacterial invasion of epithelial cells, and leukocyte migration. Furthermore, 25 DMRs in maternal PBMCs were associated with IgE sensitization to airborne allergens in their children at 5Y, which were functionally annotated to the mTOR (mammalian Target of Rapamycin) signaling pathway. This study supports that DNA methylation is associated with IgE sensitization early in life and revealed new candidate genes for atopy. Moreover, our study provides evidence that maternal DNA methylation levels are associated with IgE sensitization in the child supporting early in utero effects on atopy predisposition.


Subjects
CpG Islands/genetics; DNA Methylation; Immunoglobulin E/blood; Leukocytes, Mononuclear/metabolism; Mothers/statistics & numerical data; Adult; Allergens/immunology; Cells, Cultured; Child, Preschool; Cohort Studies; Female; Fetal Blood/immunology; Genetic Predisposition to Disease/genetics; Humans; Hypersensitivity/genetics; Hypersensitivity/immunology; Immunoglobulin E/immunology; Leukocytes, Mononuclear/cytology; Leukocytes, Mononuclear/immunology; Male; Pregnancy
14.
Mol Biol Evol ; 36(10): 2157-2164, 2019 10 01.
Article in English | MEDLINE | ID: mdl-31241141

ABSTRACT

Gene families evolve by the processes of speciation (creating orthologs), gene duplication (paralogs), and horizontal gene transfer (xenologs), in addition to sequence divergence and gene loss. Orthologs in particular play an essential role in comparative genomics and phylogenomic analyses. With the continued sequencing of organisms across the tree of life, the data are available to reconstruct the unique evolutionary histories of tens of thousands of gene families. Accurate reconstruction of these histories, however, is a challenging computational problem, and the focus of the Quest for Orthologs Consortium. We review the recent advances and outstanding challenges in this field, as revealed at a symposium and meeting held at the University of Southern California in 2017. Key advances have been made both at the level of orthology algorithm development and with respect to coordination across the community of algorithm developers and orthology end-users. Applications spanned a broad range, including gene function prediction, phylostratigraphy, genome evolution, and phylogenomics. The meetings highlighted the increasing use of meta-analyses integrating results from multiple different algorithms, and discussed ongoing challenges in orthology inference as well as the next steps toward improvement and integration of orthology resources.


Subjects
Evolution, Molecular; Genomics/trends; Multigene Family; Algorithms; Animals; Genomics/methods; Humans
15.
Bioinformatics ; 35(6): 1026-1032, 2019 03 15.
Article in English | MEDLINE | ID: mdl-30169550

ABSTRACT

MOTIVATION: Inference of gene regulatory networks (GRNs) from perturbation data can give detailed mechanistic insights of a biological system. Many inference methods exist, but the resulting GRN is generally sensitive to the choice of method-specific parameters. Even though the inferred GRN is optimal given the parameters, many links may be wrong or missing if the data is not informative. To make GRN inference reliable, a method is needed to estimate the support of each predicted link as the method parameters are varied. RESULTS: To achieve this we have developed a method called nested bootstrapping, which applies a bootstrapping protocol to GRN inference, and by repeated bootstrap runs assesses the stability of the estimated support values. To translate bootstrap support values to false discovery rates we run the same pipeline with shuffled data as input. This provides a general method to control the false discovery rate of GRN inference that can be applied to any setting of inference parameters, noise level, or data properties. We evaluated nested bootstrapping on a simulated dataset spanning a range of such properties, using the LASSO, Least Squares, RNI, GENIE3 and CLR inference methods. An improved inference accuracy was observed in almost all situations. Nested bootstrapping was incorporated into the GeneSPIDER package, which was also used for generating the simulated networks and data, as well as running and analyzing the inferences. AVAILABILITY AND IMPLEMENTATION: https://bitbucket.org/sonnhammergrni/genespider/src/NB/%2BMethods/NestBoot.m.
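The idea of bootstrap support for a predicted link can be sketched as follows. This is a single bootstrap layer with a deliberately simple stand-in inference step (a correlation threshold), not the NestBoot.m implementation; the names `toy_infer` and `bootstrap_support`, the threshold and the seed are all assumptions made for illustration. The nested variant described in the abstract would repeat this whole procedure several times, and run it again on shuffled data, to judge how stable the support values are.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def toy_infer(samples, threshold=0.8):
    """Stand-in inference: link genes whose |correlation| exceeds threshold."""
    genes = list(zip(*samples))  # one tuple of expression values per gene
    return {
        (i, j)
        for i, j in combinations(range(len(genes)), 2)
        if abs(pearson(genes[i], genes[j])) > threshold
    }


def bootstrap_support(samples, infer, n_boot=100, seed=0):
    """Fraction of bootstrap resamples in which each link is inferred."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_boot):
        # Resample experiments (rows) with replacement.
        resampled = [rng.choice(samples) for _ in samples]
        for link in infer(resampled):
            counts[link] = counts.get(link, 0) + 1
    return {link: c / n_boot for link, c in counts.items()}
```

Any inference function that maps a list of samples to a set of links can be passed as `infer`, which mirrors the abstract's point that the approach is independent of the underlying method.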


Subjects
Algorithms; Gene Regulatory Networks
16.
Nucleic Acids Res ; 46(D1): D601-D607, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29165593

ABSTRACT

This release of the FunCoup database (http://funcoup.sbc.su.se) is the fourth generation of one of the most comprehensive databases for genome-wide functional association networks. These functional associations are inferred via integrating various data types using a naive Bayesian algorithm and orthology based information transfer across different species. This approach provides high coverage of the included genomes as well as high quality of inferred interactions. In this update of FunCoup we introduce four new eukaryotic species: Schizosaccharomyces pombe, Plasmodium falciparum, Bos taurus, Oryza sativa and open the database to the prokaryotic domain by including networks for Escherichia coli and Bacillus subtilis. The latter allows us to also introduce a new class of functional association between genes - co-occurrence in the same operon. We also supplemented the existing classes of functional association: metabolic, signaling, complex and physical protein interaction with up-to-date information. In this release we switched to InParanoid v8 as the source of orthology and base for calculation of phylogenetic profiles. While populating all other evidence types with new data we introduce a new evidence type based on quantitative mass spectrometry data. Finally, the new JavaScript based network viewer provides the user an intuitive and responsive platform to further evaluate the results.
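Naive Bayesian integration of several evidence types can be illustrated with a short sketch. It is not FunCoup's code: the function name, the prior and the example likelihood ratios are hypothetical, and the sketch only assumes that each evidence type contributes an independent likelihood ratio for a gene pair being functionally associated.

```python
from math import exp, log


def integrate_evidence(likelihood_ratios, prior=0.01):
    """Posterior probability of association under a naive Bayes assumption.

    Each likelihood ratio is P(evidence | associated) / P(evidence | not
    associated). Independence between evidence types lets the log ratios
    be summed and added to the prior log odds.
    """
    log_odds = log(prior / (1 - prior)) + sum(log(lr) for lr in likelihood_ratios)
    return 1 / (1 + exp(-log_odds))
```

For example, `integrate_evidence([3.0], prior=0.5)` moves even prior odds to odds of 3:1, a posterior of about 0.75.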


Subjects
Databases, Genetic; Animals; Cattle; Gene Regulatory Networks; Genome-Wide Association Study; Genomics; Humans; Operon; Oryza/genetics; Phylogeny; Plasmodium falciparum/genetics; Protein Interaction Maps; Proteomics; Schizosaccharomyces/genetics; User-Computer Interface
17.
BMC Bioinformatics ; 20(1): 523, 2019 Oct 28.
Article in English | MEDLINE | ID: mdl-31660857

ABSTRACT

BACKGROUND: Orthology inference is normally based on full-length protein sequences. However, most proteins contain independently folding and recurring regions, domains. The domain architecture of a protein is vital for its function, and recombination events mean individual domains can have different evolutionary histories. It has previously been shown that orthologous proteins may differ in domain architecture, creating challenges for orthology inference methods operating on full-length sequences. We have developed Domainoid, a new tool aiming to overcome these challenges faced by full-length orthology methods by inferring orthology on the domain level. It employs the InParanoid algorithm on single domains separately, to infer groups of orthologous domains. RESULTS: This domain-oriented approach allows detection of discordant domain orthologs, cases where different domains on the same protein have different evolutionary histories. In addition to domain level analysis, protein level orthology based on the fraction of domains that are orthologous can be inferred. Domainoid orthology assignments were compared to those yielded by the conventional full-length approach InParanoid, and were validated in a standard benchmark. CONCLUSIONS: Our results show that domain-based orthology inference can reveal many orthologous relationships that are not found by full-length sequence approaches. AVAILABILITY: https://bitbucket.org/sonnhammergroup/domainoid/.


Subjects
Proteins/analysis; Algorithms; Biological Evolution; Proteins/genetics; Software
18.
Nat Methods ; 13(5): 425-30, 2016 05.
Article in English | MEDLINE | ID: mdl-27043882

ABSTRACT

Achieving high accuracy in orthology inference is essential for many comparative, evolutionary and functional genomic analyses, yet the true evolutionary history of genes is generally unknown and orthologs are used for very different applications across phyla, requiring different precision-recall trade-offs. As a result, it is difficult to assess the performance of orthology inference methods. Here, we present a community effort to establish standards and an automated web-based service to facilitate orthology benchmarking. Using this service, we characterize 15 well-established inference methods and resources on a battery of 20 different benchmarks. Standardized benchmarking provides a way for users to identify the most effective methods for the problem at hand, sets a minimum requirement for new tools and resources, and guides the development of more accurate orthology inference methods.


Subjects
Computational Biology/standards; Genomics/standards; Phylogeny; Proteomics/standards; Archaea/classification; Archaea/genetics; Bacteria/classification; Bacteria/genetics; Computational Biology/methods; Databases, Genetic; Eukaryota/classification; Eukaryota/genetics; Gene Ontology; Genomics/methods; Models, Genetic; Proteomics/methods; Sequence Analysis, Protein; Sequence Homology; Species Specificity
19.
Bioinformatics ; 34(2): 323-329, 2018 Jan 15.
Article in English | MEDLINE | ID: mdl-28968857

ABSTRACT

The Quest for Orthologs (QfO) is an open collaboration framework for experts in comparative phylogenomics and related research areas who have an interest in highly accurate orthology predictions and their applications. We here report highlights and discussion points from the QfO meeting 2015 held in Barcelona. Achievements in recent years have established a basis to support developments for improved orthology prediction and to explore new approaches. Central to the QfO effort is proper benchmarking of methods and services, as well as design of standardized datasets and standardized formats to allow sharing and comparison of results. Simultaneously, analysis pipelines have been improved, evaluated and adapted to handle large datasets. All this would not have occurred without the long-term collaboration of Consortium members. Meeting regularly to review and coordinate complementary activities from a broad spectrum of innovative researchers clearly benefits the community. Highlights of the meeting include addressing sources of and legitimacy of disagreements between orthology calls, the context dependency of orthology definitions, special challenges encountered when analyzing very anciently rooted orthologies, orthology in the light of whole-genome duplications, and the concept of orthologous versus paralogous relationships at different levels, including domain-level orthology. Furthermore, particular needs for different applications (e.g. plant genomics, ancient gene families and others) and the infrastructure for making orthology inferences available (e.g. interfaces with model organism databases) were discussed, with several ongoing efforts that are expected to be reported on during the upcoming 2017 QfO meeting.

20.
Nucleic Acids Res ; 45(2): e8, 2017 01 25.
Article in English | MEDLINE | ID: mdl-27664219

ABSTRACT

Analyzing gene expression patterns is a mainstay to gain functional insights of biological systems. A plethora of tools exist to identify significant enrichment of pathways for a set of differentially expressed genes. Most tools analyze gene overlap between gene sets and are therefore severely hampered by the current state of pathway annotation, yet at the same time they run a high risk of false assignments. A way to improve both true positive and false positive rates (FPRs) is to use a functional association network and instead look for enrichment of network connections between gene sets. We present a new network crosstalk analysis method BinoX that determines the statistical significance of network link enrichment or depletion between gene sets, using the binomial distribution. This is a much more appropriate statistical model than previous methods have employed, and as a result BinoX yields substantially better true positive and FPRs than was possible before. A number of benchmarks were performed to assess the accuracy of BinoX and competing methods. We demonstrate examples of how BinoX finds many biologically meaningful pathway annotations for gene sets from cancer and other diseases, which are not found by other methods. BinoX is available at http://sonnhammer.org/BinoX.
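A binomial test for link enrichment or depletion between two gene sets can be sketched as below. This is only an illustration of the statistical idea named in the abstract, not the BinoX implementation: the function names are hypothetical, and how the number of trials and the per-trial link probability are derived from the network is left as an assumption supplied by the caller.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def crosstalk_pvalues(observed, trials, p_link):
    """Return (enrichment, depletion) P-values for an observed link count.

    enrichment = P(X >= observed), depletion = P(X <= observed),
    with X ~ Binomial(trials, p_link).
    """
    enrichment = sum(binom_pmf(k, trials, p_link) for k in range(observed, trials + 1))
    depletion = sum(binom_pmf(k, trials, p_link) for k in range(0, observed + 1))
    return enrichment, depletion
```

Reporting both tails reflects the abstract's statement that the method assesses enrichment as well as depletion of links between gene sets.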


Subjects
Computational Biology/methods; Gene Regulatory Networks; Metabolic Networks and Pathways; Signal Transduction; Software; Algorithms; Genome-Wide Association Study; Genomics/methods; Humans