1.
PeerJ Comput Sci ; 9: e1710, 2023.
Article in English | MEDLINE | ID: mdl-38077536

ABSTRACT

Topic-based search systems retrieve items by contextualizing the information-seeking process around a topic of interest to the user. A key issue in topic-based search of text resources is how to automatically generate multiple queries that reflect the topic of interest while achieving precision, recall, and diversity. Multi-Objective Evolutionary Algorithms can address the problem of generating topic-based queries effectively and have shown promising results. However, two common problems with such an approach are loss of diversity and low global recall when combining results from multiple queries. This work proposes a family of Multi-Objective Genetic Programming strategies based on objective functions that attempt to maximize precision and recall while minimizing the similarity among the retrieved results. To this end, we define three novel objective functions based on result set similarity and on the information-theoretic notion of entropy. Extensive experiments allow us to conclude that, while the proposed strategies significantly improve precision after a few generations, only some of them are able to maintain or improve global recall. A comparative analysis against previous strategies based on Multi-Objective Evolutionary Algorithms indicates that the proposed approach is superior in terms of precision and global recall. Furthermore, when compared to query-term-selection methods based on existing state-of-the-art term-weighting schemes, the presented Multi-Objective Genetic Programming strategies demonstrate significantly higher levels of precision, recall, and F1-score, while maintaining competitive global recall. Finally, we identify the strengths and limitations of the strategies and conclude that the choice of objectives to be maximized or minimized should be guided by the application at hand.
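
The quantities named in this abstract (precision, recall, global recall over combined results, and similarity among result sets) can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual objective functions, so everything below is an assumption made for illustration: the function names are hypothetical, result sets are modeled as sets of document IDs, and mean pairwise Jaccard similarity merely stands in for "similarity among the retrieved results". The entropy-based objective is not reproduced.

```python
from itertools import combinations


def precision(retrieved: set, relevant: set) -> float:
    """Fraction of retrieved items that are relevant."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


def recall(retrieved: set, relevant: set) -> float:
    """Fraction of relevant items that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def global_recall(result_sets: list, relevant: set) -> float:
    """Recall of the union of the results of several queries."""
    combined = set().union(*result_sets)
    return recall(combined, relevant)


def mean_jaccard(result_sets: list) -> float:
    """Mean pairwise Jaccard similarity (an assumed stand-in for the
    similarity objective; lower values mean more diverse results)."""
    pairs = list(combinations(result_sets, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        union = a | b
        total += len(a & b) / len(union) if union else 0.0
    return total / len(pairs)


# Made-up example: two queries evaluated against four relevant documents.
relevant_docs = {1, 2, 3, 4}
query_results = [{1, 2, 9}, {3, 9}]
print(global_recall(query_results, relevant_docs))  # 0.75
print(mean_jaccard(query_results))                  # 0.25
```

In a multi-objective setting of the kind the abstract describes, a candidate set of queries would be scored on several such values at once, with precision and recall maximized and the similarity term minimized.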

2.
Genes (Basel) ; 14(6)2023 06 11.
Article in English | MEDLINE | ID: mdl-37372430

ABSTRACT

The likelihood of being diagnosed with thyroid cancer has increased in recent years; it is the fastest-growing cancer in the United States, and its incidence has tripled in the last three decades. In particular, Papillary Thyroid Carcinoma (PTC) is the most common type of cancer affecting the thyroid. It is a slow-growing cancer and, thus, it can usually be cured. However, given the worrying increase in the diagnosis of this type of cancer, the discovery of new genetic markers for accurate treatment and prognosis is crucial. The aim of the present study is to identify putative genes that may be specifically relevant in PTC through bioinformatic analysis of several public gene expression datasets and clinical information. Two datasets from the Gene Expression Omnibus (GEO) and one from The Cancer Genome Atlas (TCGA) were studied. Statistical and machine learning methods were applied sequentially to retrieve a final small cluster of genes of interest: PTGFR, ZMAT3, GABRB2, and DPP6. Kaplan-Meier plots were used to assess how expression levels relate to overall survival and relapse-free survival. Furthermore, a manual bibliographic search was carried out for each gene, and a Protein-Protein Interaction (PPI) network was built to verify existing associations among them, followed by a new enrichment analysis. The results revealed that all four genes are highly relevant in the context of thyroid cancer; notably, PTGFR and DPP6 have not previously been associated with the disease, which makes them worthy of further investigation regarding their relationship to PTC.


Subjects
Gene Expression Regulation, Neoplastic ; Thyroid Neoplasms ; Humans ; Thyroid Cancer, Papillary/metabolism ; Neoplasm Recurrence, Local/genetics ; Thyroid Neoplasms/pathology ; Computational Biology ; Gene Expression
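
The Kaplan-Meier analysis mentioned in the abstract of record 2 rests on the product-limit estimator, which can be sketched in a few lines of Python. This is a generic textbook version, not the study's own code; the function name `kaplan_meier` and the toy follow-up data are assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each patient
    events -- 1 if the event (death or relapse) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # events occurring exactly at time t
        leaving = 0    # patients leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up data: four patients, the second one censored at time 2.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
# [(1, 0.75), (3, 0.375), (4, 0.0)]
```

To compare expression groups, one common approach (assumed here, not stated in the abstract) is to split patients into high- and low-expression groups for a gene and estimate one curve per group.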
3.
Biosystems ; 150: 1-12, 2016 Dec.
Article in English | MEDLINE | ID: mdl-27521767

ABSTRACT

Detection of crosstalk among pathways is a challenging task, which requires the identification of different types of interactions associated with cellular processes. A common strategy used in bioinformatics consists of extrapolating pathway associations from the pairwise analysis of some of their related genes, using gene expression data and topological information. PET, the method proposed in this paper, goes a step further by incorporating a strategy, based on biclustering analysis, for detecting correlation across conditions between differentially expressed genes. To evaluate the performance of this new approach, it was compared with two recently published algorithms. The methods were compared on the inference of pathway associations from Alzheimer's disease datasets, where the new proposal achieves a higher crosstalk discovery rate. Finally, an analysis of the biological relevance of the pathway associations inferred by PET has shown the soundness of the extracted knowledge.


Subjects
Databases, Genetic ; Gene Expression Profiling/methods ; Gene Expression Regulation ; Algorithms ; Alzheimer Disease/diagnosis ; Alzheimer Disease/genetics ; Cluster Analysis ; Humans
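
The idea of correlation "across conditions" in the abstract of record 3 can be illustrated with a small Python sketch: two genes may look unrelated over all samples yet be strongly correlated over a subset of conditions, which is the kind of local pattern biclustering is designed to find. This is not the PET algorithm, whose details the abstract does not give; the helper names, the hand-picked condition subset, and the expression values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant profile
    return cov / sqrt(var_x * var_y)


def correlation_on_conditions(gene_a, gene_b, conditions):
    """Correlation restricted to a subset of condition indices."""
    return pearson([gene_a[c] for c in conditions],
                   [gene_b[c] for c in conditions])


# Made-up expression profiles over five conditions.
gene_a = [1, 2, 3, 10, 0]
gene_b = [2, 4, 6, 0, 9]
print(correlation_on_conditions(gene_a, gene_b, range(5)))   # negative overall
print(correlation_on_conditions(gene_a, gene_b, [0, 1, 2]))  # close to 1.0
```

A biclustering method would search for such condition subsets automatically rather than receiving them as input.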
4.
Brief Bioinform ; 17(5): 758-70, 2016 09.
Article in English | MEDLINE | ID: mdl-26438418

ABSTRACT

Gene expression measurements represent the most important source of biological data used to unveil the interaction and functionality of genes. In this regard, several data mining and machine learning algorithms have been proposed that require, in a number of cases, some kind of data discretization to perform the inference. The choice of discretization process has a major impact on the design and outcome of the inference algorithms, and a number of relevant issues need to be considered when making it. This study presents a review of the current state-of-the-art discretization techniques, together with the key issues that need to be considered when designing or selecting a discretization approach for gene expression data.


Subjects
Gene Expression ; Algorithms ; Data Mining ; Gene Expression Profiling
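
As a concrete picture of what discretizing gene expression data means in record 4, the following Python sketch shows two simple and widely used schemes: equal-width binning and thresholding at the mean. The abstract does not say which techniques the review covers, so these two are chosen purely for illustration, and the function names and sample values are hypothetical.

```python
def equal_width(values, bins=3):
    """Assign each value to one of `bins` equally wide intervals (0-based)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # constant profile: a single bin
    width = (hi - lo) / bins
    # The maximum value falls on the upper edge, so clamp it to the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mean_threshold(values):
    """Binary discretization: 1 if above the profile mean, else 0."""
    mean = sum(values) / len(values)
    return [1 if v > mean else 0 for v in values]


# Made-up expression profile of one gene across five samples.
profile = [0.0, 1.0, 2.0, 3.0, 6.0]
print(equal_width(profile, bins=3))  # [0, 0, 1, 1, 2]
print(mean_threshold(profile))       # [0, 0, 0, 1, 1]
```

The two schemes disagree on the fourth sample's relative level, which shows in miniature why the choice of discretization can affect downstream inference.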