1.
Database (Oxford) ; 2013: bat080, 2013.
Article in English | MEDLINE | ID: mdl-24288140

ABSTRACT

Improving the prediction of chemical toxicity is a goal common to both environmental health research and pharmaceutical drug development. To improve safety detection assays, it is critical to have a reference set of molecules with well-defined toxicity annotations for training and validation purposes. Here, we describe a collaboration between safety researchers at Pfizer and the research team at the Comparative Toxicogenomics Database (CTD) to text mine and manually review a collection of 88,629 articles relating over 1,200 pharmaceutical drugs to their potential involvement in cardiovascular, neurological, renal and hepatic toxicity. In 1 year, CTD biocurators curated 254,173 toxicogenomic interactions (152,173 chemical-disease, 58,572 chemical-gene, 5,345 gene-disease and 38,083 phenotype interactions). All chemical-gene-disease interactions are fully integrated with public CTD, and phenotype interactions can be downloaded. We describe Pfizer's text-mining process to collate the articles, and CTD's curation strategy, performance metrics, enhanced data content and new module to curate phenotype information. As well, we show how data integration can connect phenotypes to diseases. This curation can be leveraged for information about toxic endpoints important to drug safety and help develop testable hypotheses for drug-disease events. The availability of these detailed, contextualized, high-quality annotations curated from seven decades' worth of the scientific literature should help facilitate new mechanistic screening assays for pharmaceutical compound survival. This unique partnership demonstrates the importance of resource sharing and collaboration between public and private entities and underscores the complementary needs of the environmental health science and pharmaceutical communities. Database URL: http://ctdbase.org/
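
The interaction counts reported in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the figures are copied from the abstract, while the variable names are assumptions introduced here and are not part of CTD or any CTD tooling.

```python
# Interaction counts as reported in the abstract above.
# The dictionary keys and variable names are illustrative, not CTD identifiers.
reported_counts = {
    "chemical-disease": 152_173,
    "chemical-gene": 58_572,
    "gene-disease": 5_345,
    "phenotype": 38_083,
}
reported_total = 254_173  # total stated in the abstract

# Sum the four categories and compare with the stated total.
computed_total = sum(reported_counts.values())
print("categories sum to reported total:", computed_total == reported_total)

# Share of each interaction type in the curated set.
shares = {name: count / computed_total for name, count in reported_counts.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The four categories add up to the stated total, and chemical-disease interactions make up the majority of the curated set.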


Subjects
Cooperative Behavior , Data Mining , Databases, Factual , Drug Industry , Pharmaceutical Preparations/metabolism , Publications , Toxicogenetics , Disease , Humans , Phenotype
2.
Database (Oxford) ; 2013: bat033, 2013.
Article in English | MEDLINE | ID: mdl-23707966

ABSTRACT

The vast collection of biomedical literature and its continued expansion has presented a number of challenges to researchers who require structured findings to stay abreast of and analyze molecular mechanisms relevant to their domain of interest. By structuring literature content into topic-specific machine-readable databases, the aggregate data from multiple articles can be used to infer trends that can be compared and contrasted with similar findings from topic-independent resources. Our study presents a generalized procedure for semi-automatically creating a custom topic-specific molecular interaction database through the use of text mining to assist manual curation. We apply the procedure to capture molecular events that underlie 'pain', a complex phenomenon with a large societal burden and unmet medical need. We describe how existing text mining solutions are used to build a pain-specific corpus, extract molecular events from it, add context to the extracted events and assess their relevance. The pain-specific corpus contains 765 692 documents from Medline and PubMed Central, from which we extracted 356 499 unique normalized molecular events, with 261 438 single protein events and 93 271 molecular interactions supplied by BioContext. Event chains are annotated with negation, speculation, anatomy, Gene Ontology terms, mutations, pain and disease relevance, which collectively provide detailed insight into how that event chain is associated with pain. The extracted relations are visualized in a wiki platform (wiki-pain.org) that enables efficient manual curation and exploration of the molecular mechanisms that underlie pain. Curation of 1500 grouped event chains ranked by pain relevance revealed 613 accurately extracted unique molecular interactions that in the future can be used to study the underlying mechanisms involved in pain. 
Our approach demonstrates that combining existing text mining tools with domain-specific terms and wiki-based visualization can facilitate rapid curation of molecular interactions to create a custom database. Database URL: •••


Subjects
Catalogs as Topic , Data Mining/methods , Pain/genetics , Signal Transduction , Animals , Automation , Dictionaries as Topic , Humans , Information Storage and Retrieval , Mice , Rats , Signal Transduction/genetics , Software
3.
BMC Bioinformatics ; 12 Suppl 8: S4, 2011 Oct 03.
Article in English | MEDLINE | ID: mdl-22151968

ABSTRACT

BACKGROUND: The BioCreative challenge evaluation is a community-wide effort for evaluating text mining and information extraction systems applied to the biological domain. The biocurator community, as an active user of biomedical literature, provides a diverse and engaged end user group for text mining tools. Earlier BioCreative challenges involved many text mining teams in developing basic capabilities relevant to biological curation, but they did not address the issues of system usage, insertion into the workflow and adoption by curators. Thus in BioCreative III (BC-III), the InterActive Task (IAT) was introduced to address the utility and usability of text mining tools for real-life biocuration tasks. To support the aims of the IAT in BC-III, involvement of both developers and end users was solicited, and the development of a user interface to address the tasks interactively was requested. RESULTS: A User Advisory Group (UAG) actively participated in the IAT design and assessment. The task focused on gene normalization (identifying gene mentions in the article and linking these genes to standard database identifiers), gene ranking based on the overall importance of each gene mentioned in the article, and gene-oriented document retrieval (identifying full text papers relevant to a selected gene). Six systems participated and all processed and displayed the same set of articles. The articles were selected based on content known to be problematic for curation, such as ambiguity of gene names, coverage of multiple genes and species, or introduction of a new gene name. Members of the UAG curated three articles for training and assessment purposes, and each member was assigned a system to review. A questionnaire related to the interface usability and task performance (as measured by precision and recall) was answered after systems were used to curate articles. 
Although the limited number of articles analyzed and users involved in the IAT experiment precluded rigorous quantitative analysis of the results, a qualitative analysis provided valuable insight into some of the problems encountered by users when using the systems. The overall assessment indicates that the system usability features appealed to most users, but the system performance was suboptimal (mainly due to low accuracy in gene normalization). Some of the issues included failure of species identification and gene name ambiguity in the gene normalization task leading to an extensive list of gene identifiers to review, which, in some cases, did not contain the relevant genes. The document retrieval suffered from the same shortfalls. The UAG favored achieving high performance (measured by precision and recall), but strongly recommended the addition of features that facilitate the identification of correct gene and its identifier, such as contextual information to assist in disambiguation. DISCUSSION: The IAT was an informative exercise that advanced the dialog between curators and developers and increased the appreciation of challenges faced by each group. A major conclusion was that the intended users should be actively involved in every phase of software development, and this will be strongly encouraged in future tasks. The IAT Task provides the first steps toward the definition of metrics and functional requirements that are necessary for designing a formal evaluation of interactive curation systems in the BioCreative IV challenge.
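
The abstract measures task performance by precision and recall. As a minimal sketch of how those two measures are typically computed for gene normalization, assuming a system's output and a curator's gold standard are both available as sets of database identifiers: the function name and the identifiers below are hypothetical and do not come from the BioCreative evaluation itself.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for a set of predicted identifiers
    compared against a gold-standard set of identifiers."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: fraction of the system's identifiers that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of the gold-standard identifiers the system found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical identifiers for one article (not real database entries).
gold_ids = {"GENE:1", "GENE:2", "GENE:3", "GENE:4"}
system_ids = {"GENE:1", "GENE:2", "GENE:9"}

p, r = precision_recall(system_ids, gold_ids)
print(f"precision={p:.2f} recall={r:.2f}")
```

A long candidate list that misses the relevant genes, the failure mode described in the abstract, lowers both measures at once: the extra identifiers reduce precision and the missing ones reduce recall.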


Subjects
Data Mining/methods , Genes , Animals , Computational Biology/methods , Periodicals as Topic , Plants/genetics , Plants/metabolism
4.
Nucleic Acids Res ; 37(3): 771-7, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19074486

ABSTRACT

Molecular perturbations provide a powerful toolset for biomedical researchers to scrutinize the contributions of individual molecules in biological systems. Perturbations qualify the context of experimental results and, despite their diversity, share properties in different dimensions in ways that can be formalized. We propose a formal framework to describe and classify perturbations that allows accumulation of knowledge in order to inform the process of biomedical scientific experimentation and target analysis. We apply this framework to develop a novel algorithm for automatic detection and characterization of perturbations in text and show its relevance in the study of gene-phenotype associations and protein-protein interactions in diabetes and cancer. Analyzing perturbations introduces a novel view of the multivariate landscape of biological systems.


Subjects
Algorithms , Disease/genetics , Classification/methods , Diabetes Mellitus/genetics , Diabetes Mellitus/metabolism , Gene Regulatory Networks , Humans , MEDLINE , Neoplasms/genetics , Neoplasms/metabolism , Phenotype , Protein Interaction Mapping
5.
Pac Symp Biocomput ; : 592-603, 2008.
Article in English | MEDLINE | ID: mdl-18229718

ABSTRACT

Drug development generates information needs from groups throughout a company. Knowing where to look for high-quality information is essential for minimizing costs and remaining competitive. Using 1131 research requests that came to our library between 2001 and 2007, we show that drugs, diseases, and genes/proteins are the most frequently searched subjects, and journal articles, patents, and competitive intelligence literature are the most frequently consulted textual resources.


Subjects
Computational Biology , Drug Design , Information Storage and Retrieval , Biotechnology , Databases, Factual , Drug Industry , Libraries, Medical
6.
Brief Bioinform ; 7(4): 399-406, 2006 Dec.
Article in English | MEDLINE | ID: mdl-17032698

ABSTRACT

Currently, literature is integrated in systems biology studies in three ways. Hand-curated pathways have been sufficient for assembling models in numerous studies. Second, literature is frequently accessed in a derived form, such as the concepts represented by the Medical Subject Headings (MeSH) and Gene Ontologies (GO), or functional relationships captured in protein-protein interaction (PPI) databases; both of these are convenient, consistent reductions of more complex concepts expressed as free text in the literature. Moreover, their contents are easily integrated into computational processes required for dealing with large data sets. Last, mining text directly for specific types of information is on the rise as text analytics methods become more accurate and accessible. These uses of literature, specifically manual curation, derived concepts captured in ontologies and databases, and indirect and direct application of text mining, will be discussed as they pertain to systems biology.


Subjects
Databases as Topic , Information Storage and Retrieval , Systems Biology , Animals , Databases, Genetic , Humans , Medical Subject Headings , Periodicals as Topic , Vocabulary, Controlled
7.
Curr Opin Drug Discov Devel ; 8(3): 323-8, 2005 May.
Article in English | MEDLINE | ID: mdl-15892247

ABSTRACT

The automated extraction of biological and chemical information has improved over the past year, with advances in access to content, entity extraction of genes, chemicals, kinetic data and relationships, and algorithms for generating and testing hypotheses. As the systems for reading and understanding scientific literature grow more powerful, so must the infrastructure in which to assemble information. Advances in infrastructure systems are discussed in this review. Research efforts have flourished as a result of text analytics competitions that attract participants from various disciplines, from computer science to bioinformatics.


Subjects
Computational Biology , Drug Design , Information Storage and Retrieval , Animals , Humans , Natural Language Processing , Pharmaceutical Preparations , Quantitative Structure-Activity Relationship