Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 4 de 4
Filtrar
Mais filtros

Base de dados
Tipo de documento
Intervalo de ano de publicação
1.
BMC Bioinformatics ; 15 Suppl 12: S1, 2014.
Artigo em Inglês | MEDLINE | ID: mdl-25474588

RESUMO

BACKGROUND: Currently, most people use NCBI's PubMed to search the MEDLINE database, an important bibliographical information source for life science and biomedical information. However, PubMed has some drawbacks that make it difficult to find relevant publications pertaining to users' individual intentions, especially for non-expert users. To ameliorate the disadvantages of PubMed, we developed G-Bean, a graph based biomedical search engine, to search biomedical articles in MEDLINE database more efficiently. METHODS: G-Bean addresses PubMed's limitations with three innovations: (1) Parallel document index creation: a multithreaded index creation strategy is employed to generate the document index for G-Bean in parallel; (2) Ontology-graph based query expansion: an ontology graph is constructed by merging four major UMLS (Version 2013AA) vocabularies, MeSH, SNOMEDCT, CSP and AOD, to cover all concepts in National Library of Medicine (NLM) database; a Personalized PageRank algorithm is used to compute concept relevance in this ontology graph and the Term Frequency - Inverse Document Frequency (TF-IDF) weighting scheme is used to re-rank the concepts. The top 500 ranked concepts are selected for expanding the initial query to retrieve more accurate and relevant information; (3) Retrieval and re-ranking of documents based on user's search intention: after the user selects any article from the existing search results, G-Bean analyzes user's selections to determine his/her true search intention and then uses more relevant and more specific terms to retrieve additional related articles. The new articles are presented to the user in the order of their relevance to the already selected articles. RESULTS: Performance evaluation with 106 OHSUMED benchmark queries shows that G-Bean returns more relevant results than PubMed does when using these queries to search the MEDLINE database. PubMed could not even return any search result for some OHSUMED queries because it failed to form the appropriate Boolean query statement automatically from the natural language query strings. G-Bean is available at http://bioinformatics.clemson.edu/G-Bean/index.php. CONCLUSIONS: G-Bean addresses PubMed's limitations with ontology-graph based query expansion, automatic document indexing, and user search intention discovery. It shows significant advantages in finding relevant articles from the MEDLINE database to meet the information need of the user.


Assuntos
Ontologias Biológicas , Armazenamento e Recuperação da Informação/métodos , PubMed , Software , Algoritmos , Internet , MEDLINE
2.
Proteome Sci ; 10 Suppl 1: S4, 2012 Jun 21.
Artigo em Inglês | MEDLINE | ID: mdl-22759581

RESUMO

BACKGROUND: Protein synthetic lethal genetic interactions are useful to define functional relationships between proteins and pathways. However, the molecular mechanism of synthetic lethal genetic interactions remains unclear. RESULTS: In this study we used the clusters of short polypeptide sequences, which are typically shorter than the classically defined protein domains, to characterize the functionalities of proteins. We developed a framework to identify significant short polypeptide clusters from yeast protein sequences, and then used these short polypeptide clusters as features to predict yeast synthetic lethal genetic interactions. The short polypeptide clusters based approach provides much higher coverage for predicting yeast synthetic lethal genetic interactions. Evaluation using experimental data sets showed that the short polypeptide clusters based approach is superior to the previous protein domain based one. CONCLUSION: We were able to achieve higher performance in yeast synthetic lethal genetic interactions prediction using short polypeptide clusters as features. Our study suggests that the short polypeptide cluster may help better understand the functionalities of proteins.

3.
PLoS One ; 10(3): e0121893, 2015.
Artigo em Inglês | MEDLINE | ID: mdl-25811466

RESUMO

In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes were shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention.


Assuntos
Citrus/genética , Genes de Plantas , Estudo de Associação Genômica Ampla , Hibridização Genética , Motivos de Aminoácidos , Sequência de Aminoácidos , Citrus sinensis/genética , Sequência Conservada/genética , Elementos de DNA Transponíveis/genética , Conversão Gênica , Duplicação Gênica , Funções Verossimilhança , Dados de Sequência Molecular , Família Multigênica , Mutação/genética , Filogenia , Proteínas de Plantas/química , Proteínas de Plantas/genética , Estrutura Terciária de Proteína , Reprodutibilidade dos Testes , Análise de Sequência de DNA , Homologia de Sequência de Aminoácidos
4.
Artigo em Inglês | MEDLINE | ID: mdl-26356015

RESUMO

The rapid development of gene ontology (GO) and huge amount of biomedical data annotated by GO terms necessitate computation of semantic similarity of GO terms and, in turn, measurement of functional similarity of genes based on their annotations. In this paper we propose a novel and efficient method to measure the semantic similarity of GO terms. The proposed method addresses the limitations in existing GO term similarity measurement techniques; it computes the semantic content of a GO term by considering the information content of all of its ancestor terms in the graph. The aggregate information content (AIC) of all ancestor terms of a GO term implicitly reflects the GO term's location in the GO graph and also represents how human beings use this GO term and all its ancestor terms to annotate genes. We show that semantic similarity of GO terms obtained by our method closely matches the human perception. Extensive experimental studies show that this novel method also outperforms all existing methods in terms of the correlation with gene expression data. We have developed web services for measuring semantic similarity of GO terms and functional similarity of genes using the proposed AIC method and other popular methods. These web services are available at http://bioinformatics.clemson.edu/G-SESAME.


Assuntos
Biologia Computacional/métodos , Ontologia Genética , Semântica , Perfilação da Expressão Gênica , Humanos , Anotação de Sequência Molecular
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA