1.
Brief Bioinform ; 12(1): 22-32, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21278374

ABSTRACT

Finding the most promising genes among large lists of candidate genes has been defined as the gene prioritization problem. It is a recurrent problem in genetics, in which genetic conditions are reported to be associated with chromosomal regions. In the last decade, several different computational approaches have been developed to tackle this challenging task. In this study, we review 19 computational solutions for human gene prioritization that are freely accessible as web tools and illustrate their differences. We summarize the various biological problems to which they have been successfully applied. Finally, we describe several research directions that could increase the quality and applicability of the tools. In addition, we developed a website (http://www.esat.kuleuven.be/gpp) containing detailed information about these and other tools, which is regularly updated. Together, this review and the associated website constitute a guide to help users select the gene prioritization strategy that best suits their needs.


Subjects
Computational Biology/methods , Genes , Software , Humans , Internet
2.
Nucleic Acids Res ; 39(Web Server issue): W334-8, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21602267

ABSTRACT

PINTA (available at http://www.esat.kuleuven.be/pinta/; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes based on the differential expression of their neighborhood in a genome-wide protein-protein interaction network. Our strategy is meant for biological and medical researchers aiming to identify novel disease genes using disease-specific expression data. PINTA supports both candidate gene prioritization (starting from a user-defined set of candidate genes) and genome-wide gene prioritization, and is available for five species (human, mouse, rat, worm and yeast). As input, PINTA requires only disease-specific expression data, and various platforms (e.g. Affymetrix) are supported. As output, PINTA computes a gene ranking and presents the results as a table that can easily be browsed and downloaded by the user.
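
The abstract does not give PINTA's scoring formula, so the following is only a minimal sketch of the general idea of ranking candidates by the differential expression of their network neighborhood. The gene names, the toy network, the log fold changes and the choice of a plain mean over direct neighbors are all assumptions made for illustration, not PINTA's actual method or API.

```python
from statistics import mean

# Hypothetical toy interaction network: gene -> set of direct interaction partners.
network = {
    "G1": {"G2", "G3"},
    "G2": {"G1"},
    "G3": {"G1", "G4"},
    "G4": {"G3"},
}

# Hypothetical differential expression (e.g. log2 fold change, disease vs. control).
log_fold_change = {"G1": 0.1, "G2": 2.0, "G3": -1.5, "G4": 0.2}


def neighborhood_score(gene, network, log_fold_change):
    """Mean absolute differential expression over the gene's direct neighbors."""
    neighbors = [n for n in network.get(gene, ()) if n in log_fold_change]
    if not neighbors:
        return 0.0
    return mean(abs(log_fold_change[n]) for n in neighbors)


def rank_candidates(candidates, network, log_fold_change):
    """Return (gene, score) pairs sorted from most to least promising."""
    scored = [(g, neighborhood_score(g, network, log_fold_change)) for g in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Candidate gene prioritization: start from a user-defined candidate set.
ranking = rank_candidates(["G1", "G2", "G4"], network, log_fold_change)
for position, (gene, score) in enumerate(ranking, start=1):
    print(position, gene, round(score, 3))
```

In this toy data, G1 itself is barely differentially expressed, yet it ranks first because both of its neighbors are strongly affected. Passing every gene in the network as the candidate list would correspond to the genome-wide mode described above.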


Subjects
Disease/genetics , Gene Expression Profiling , Protein Interaction Mapping , Software , Animals , Genes , Humans , Internet , Mice , Rats
3.
BMC Bioinformatics ; 11: 460, 2010 Sep 14.
Article in English | MEDLINE | ID: mdl-20840752

ABSTRACT

BACKGROUND: Discovering novel disease genes is still challenging for diseases for which no prior knowledge (such as known disease genes or disease-related pathways) is available. Genetic studies frequently produce large lists of candidate genes, of which only a few can be followed up for further investigation. We have recently developed a computational method for constitutional genetic disorders that identifies the most promising candidate genes by replacing prior knowledge with experimental data on differential gene expression between affected and healthy individuals. To improve the performance of our prioritization strategy, we have extended our previous work by applying different machine learning approaches that identify promising candidate genes by determining whether a gene is surrounded by highly differentially expressed genes in a functional association or protein-protein interaction network. RESULTS: We have proposed three strategies for scoring disease candidate genes that rely on network-based machine learning approaches, such as kernel ridge regression, heat kernel, and Arnoldi kernel approximation. For comparison purposes, a local measure based on the expression of the direct neighbors is also computed. We have benchmarked these strategies on 40 publicly available knockout experiments in mice, and performance was assessed against results obtained using a standard procedure in genetics that ranks candidate genes based solely on their differential expression levels (Simple Expression Ranking). Our results showed that all four strategies could outperform this standard procedure. The best results were obtained using the Heat Kernel Diffusion Ranking, which led to an average ranking position of 8 out of 100 genes, an AUC value of 92.3% and an error reduction of 52.8% relative to the standard procedure, which ranked the knockout gene on average at position 17 with an AUC value of 83.7%.
CONCLUSION: In this study, we identified promising candidate genes using network-based machine learning approaches even when no knowledge is available about the disease or phenotype.
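
The abstract does not state how the error reduction is computed. As a worked check, the short sketch below assumes the error is defined as 1 - AUC and the reduction is taken relative to the baseline error; under that assumption the two reported AUC values reproduce the reported 52.8%. The function name is hypothetical.

```python
def error_reduction(auc_baseline, auc_method):
    """Relative reduction of the ranking error (assumed to be 1 - AUC) versus a baseline."""
    error_baseline = 1.0 - auc_baseline
    error_method = 1.0 - auc_method
    return (error_baseline - error_method) / error_baseline


# AUC values reported above: Simple Expression Ranking vs. Heat Kernel Diffusion Ranking.
reduction = error_reduction(auc_baseline=0.837, auc_method=0.923)
print(f"{reduction:.1%}")
```

That is, the error drops from 16.3% to 7.7%, and (16.3 - 7.7) / 16.3 is approximately 0.528.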


Subjects
Artificial Intelligence , Gene Expression Profiling/methods , Animals , Genetic Databases , Mice , Phenotype
4.
PLoS One ; 4(5): e5526, 2009.
Article in English | MEDLINE | ID: mdl-19436755

ABSTRACT

Genetic studies (in particular linkage and association studies) identify chromosomal regions involved in a disease or phenotype of interest, but those regions often contain many candidate genes, only a few of which can be followed up for biological validation. Recently, computational methods to identify (prioritize) the most promising candidates within a region have been proposed, but they are usually not applicable to cases where little is known about the phenotype (no or few confirmed disease genes, fragmentary understanding of the biological cascades involved). We seek to overcome this limitation by replacing knowledge about the biological process with experimental data on differential gene expression between affected and healthy individuals. Considering the problem from the perspective of a gene/protein network, we assess a candidate gene by the level of differential expression in its neighborhood, under the assumption that strong candidates will tend to be surrounded by differentially expressed neighbors. We define a notion of soft neighborhood in which each gene is given a contributing weight that decreases with the distance from the candidate gene on the protein network. To account for multiple paths between genes, we define the distance using the Laplacian exponential diffusion kernel. We score candidates by aggregating the differential expression of neighbors, weighted as a function of distance. Through a randomization procedure, we rank candidates by p-values. We illustrate our approach on four monogenic diseases and successfully prioritize the known disease-causing genes.
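
The steps described above (soft neighborhood weights from a Laplacian exponential diffusion kernel, aggregation of neighbor expression, ranking by randomization p-values) can be sketched on a toy network as follows. This is an illustrative sketch, not the authors' implementation: the path-shaped network, the expression values, the diffusion parameter `beta`, the exclusion of the candidate's own expression, and the randomization scheme (shuffling expression values across genes) are all assumptions. The kernel K = exp(-beta * L) is approximated with a truncated Taylor series in pure Python, which is only adequate for very small graphs.

```python
import random


def laplacian(adjacency):
    """Graph Laplacian L = D - A for a symmetric adjacency matrix."""
    n = len(adjacency)
    return [
        [(sum(adjacency[i]) if i == j else 0.0) - adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def diffusion_kernel(adjacency, beta=0.5, terms=30):
    """Approximate K = exp(-beta * L) with a truncated Taylor series."""
    n = len(adjacency)
    lap = laplacian(adjacency)
    step = [[-beta * lap[i][j] for j in range(n)] for i in range(n)]
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]
    for k in range(1, terms + 1):
        product = mat_mul(term, step)
        term = [[value / k for value in row] for row in product]  # (-beta L)^k / k!
        kernel = [[kernel[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return kernel


def neighborhood_scores(kernel, abs_expression):
    """Kernel-weighted sum of differential expression over all other genes."""
    n = len(kernel)
    return [
        sum(kernel[i][j] * abs_expression[j] for j in range(n) if j != i)
        for i in range(n)
    ]


def permutation_p_values(kernel, abs_expression, n_permutations=1000, seed=0):
    """Empirical p-values obtained by shuffling expression values across genes."""
    rng = random.Random(seed)
    observed = neighborhood_scores(kernel, abs_expression)
    exceed = [0] * len(observed)
    shuffled = list(abs_expression)
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        for i, score in enumerate(neighborhood_scores(kernel, shuffled)):
            if score >= observed[i]:
                exceed[i] += 1
    return [(count + 1) / (n_permutations + 1) for count in exceed]


# Hypothetical toy path network 0 - 1 - 2 - 3 - 4.
adjacency = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
# Hypothetical absolute differential expression per gene.
abs_expression = [0.1, 2.0, 1.5, 0.1, 0.1]

kernel = diffusion_kernel(adjacency, beta=0.5)
scores = neighborhood_scores(kernel, abs_expression)
p_values = permutation_p_values(kernel, abs_expression)

# Rank two candidate genes (0 and 4) by p-value.
for gene in sorted((0, 4), key=lambda g: p_values[g]):
    print(gene, round(scores[gene], 3), round(p_values[gene], 3))
```

Genes 0 and 4 sit at opposite ends of the path and have the same low expression themselves, but gene 0 lies next to the strongly differentially expressed genes, so it receives the higher neighborhood score. With only five genes the permutation p-values are coarse; a realistic network would need many more genes and a numerical library for the matrix exponential.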


Subjects
Computational Biology/methods , Gene Expression , Gene Regulatory Networks/genetics , Genetic Predisposition to Disease/genetics , Genetic Databases , Female , Fragile X Syndrome/genetics , Gene Expression Profiling/methods , Human Genome , Humans , Polycystic Ovary Syndrome/genetics