Results 1 - 16 of 16
1.
Drug Discov Today Technol ; 14: 11-6, 2015 Jul.
Article in English | MEDLINE | ID: mdl-26194582

ABSTRACT

Protein databases are a gold mine of potential new drug targets. The ready access to a complete overview of all aspects of protein biology provides the most benefit at the outset of drug discovery pipelines. Ideally, curation strategies used to move from the raw data to the validated knowledge should contain the checks and balances necessary for accuracy. The neXtProt human protein knowledgebase is used here as an example to give insight into these methods.


Subject(s)
Databases, Protein , Drug Discovery , Data Mining , Humans , Protein Conformation
2.
Drug Discov Today ; 20(4): 399-405, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25463038

ABSTRACT

Modern data-driven drug discovery requires integrated resources to support decision-making and enable new discoveries. The Open PHACTS Discovery Platform (http://dev.openphacts.org) was built to address this requirement by focusing on drug discovery questions that are of high priority to the pharmaceutical industry. Although complex, most of these frequently asked questions (FAQs) revolve around the combination of data concerning compounds, targets, pathways and diseases. Computational drug discovery using workflow tools and the integrated resources of Open PHACTS can deliver answers to most of these questions. Here, we report on a selection of workflows used for solving these use cases and discuss some of the research challenges. The workflows are accessible online from myExperiment (http://www.myexperiment.org) and are available for reuse by the scientific community.


Subject(s)
Computational Biology , Databases, Chemical , Databases, Pharmaceutical , Decision Support Techniques , Drug Discovery/methods , Pharmaceutical Preparations/chemistry , Workflow , Access to Information , Data Mining , Humans , Molecular Structure , Signal Transduction/drug effects , Structure-Activity Relationship , Systems Integration
3.
PLoS One ; 9(12): e115460, 2014.
Article in English | MEDLINE | ID: mdl-25522365

ABSTRACT

Integration of open access, curated, high-quality information from multiple disciplines in the Life and Biomedical Sciences provides a holistic understanding of the domain. Additionally, the effective linking of diverse data sources can unearth hidden relationships and guide potential research strategies. However, given the lack of consistency between descriptors and identifiers used in different resources and the absence of a simple mechanism to link them, gathering and combining relevant, comprehensive information from diverse databases remains a challenge. The Open Pharmacological Concepts Triple Store (Open PHACTS) is an Innovative Medicines Initiative project that uses semantic web technology approaches to enable scientists to easily access and process data from multiple sources to solve real-world drug discovery problems. The project draws together sources of publicly-available pharmacological, physicochemical and biomolecular data, represents it in a stable infrastructure and provides well-defined information exploration and retrieval methods. Here, we highlight the utility of this platform in conjunction with workflow tools to solve pharmacological research questions that require interoperability between target, compound, and pathway data. Use cases presented herein cover 1) the comprehensive identification of chemical matter for a dopamine receptor drug discovery program, 2) the identification of compounds active against all targets in the Epidermal growth factor receptor (ErbB) signaling pathway that have a relevance to disease, and 3) the evaluation of established targets in the Vitamin D metabolism pathway to aid novel Vitamin D analogue design. The example workflows presented illustrate how the Open PHACTS Discovery Platform can be used to exploit existing knowledge and generate new hypotheses in the process of drug discovery.


Subject(s)
Databases as Topic , Drug Discovery/organization & administration , Software , Drug Discovery/methods , Drug Discovery/statistics & numerical data
4.
Drug Discov Today Technol ; 12: e47-54, 2014 Jun.
Article in English | MEDLINE | ID: mdl-25027375

ABSTRACT

Transport proteins represent an eminent class of drug targets and ADMET (absorption, distribution, metabolism, excretion, toxicity) associated genes. There exists a large number of distinct activity assays for transport proteins, depending not only on the measurement needed (e.g. transport activity, strength of ligand-protein interaction) but also on the heterogeneous assay setups used by different research groups. Efforts to systematically organize this (divergent) bioassay data have large potential impact in Public-Private partnership and conventional commercial drug discovery. In this short review, we highlight some of the frequently used high-throughput assays for transport proteins, and we discuss emerging assay ontologies and their application to this field. Focusing on human P-glycoprotein (Multidrug resistance protein 1; gene name: ABCB1, MDR1), we exemplify how annotation of bioassay data per target class could improve and add to existing ontologies, and we propose to include an additional layer of metadata supporting data fusion across different bioassays.


Subject(s)
Biological Ontologies , Drug Discovery/methods , High-Throughput Screening Assays , Membrane Transport Proteins , Membrane Transport Proteins/chemistry , Membrane Transport Proteins/classification , Membrane Transport Proteins/metabolism , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/metabolism
5.
Drug Discov Today ; 18(17-18): 843-52, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23702085

ABSTRACT

Molecular information systems play an important part in modern data-driven drug discovery. They do not only support decision making but also enable new discoveries via association and inference. In this review, we outline the scientific requirements identified by the Innovative Medicines Initiative (IMI) Open PHACTS consortium for the design of an open pharmacological space (OPS) information system. The focus of this work is the integration of compound-target-pathway-disease/phenotype data for public and industrial drug discovery research. Typical scientific competency questions provided by the consortium members will be analyzed based on the underlying data concepts and associations needed to answer the questions. Publicly available data sources used to target these questions as well as the need for and potential of semantic web-based technology will be presented.


Subject(s)
Databases, Chemical , Databases, Pharmaceutical , Drug Discovery/methods , Information Systems , Semantics , Systems Integration , Data Mining , Databases, Chemical/standards , Databases, Pharmaceutical/standards , Drug Discovery/standards , Guidelines as Topic , Information Systems/standards , Knowledge Bases , Molecular Structure , Structure-Activity Relationship
6.
Drug Discov Today ; 17(21-22): 1188-98, 2012 Nov.
Article in English | MEDLINE | ID: mdl-22683805

ABSTRACT

Open PHACTS is a public-private partnership between academia, publishers, small and medium sized enterprises and pharmaceutical companies. The goal of the project is to deliver and sustain an 'open pharmacological space' using and enhancing state-of-the-art semantic web standards and technologies. It is focused on practical and robust applications to solve specific questions in drug discovery research. OPS is intended to facilitate improvements in drug discovery in academia and industry and to support open innovation and in-house non-public drug discovery research. This paper lays out the challenges and how the Open PHACTS project is hoping to address these challenges technically and socially.


Subject(s)
Drug Discovery/organization & administration , Drug Industry/organization & administration , Public-Private Sector Partnerships/organization & administration , Drug Design , Humans , Information Storage and Retrieval/methods , Internet , Organizational Innovation , Research/organization & administration , Semantics
7.
Mol Inform ; 31(8): 599-609, 2012 Aug.
Article in English | MEDLINE | ID: mdl-23293680

ABSTRACT

Huge amounts of small compound bioactivity data have been entering the public domain as a consequence of open innovation initiatives. It is now the time to carefully analyse existing bioassay data and give it a systematic structure. Our study aims to annotate prominent in vitro assays used for the determination of bioactivities of human P-glycoprotein inhibitors and substrates as they are represented in the ChEMBL and TP-search open source databases. Furthermore, the ability of data, determined in different assays, to be combined with each other is explored. As a result of this study, it is suggested that for inhibitors of human P-glycoprotein it is possible to combine data coming from the same assay type, if the cell lines used are also identical and the fluorescent or radiolabeled substrates have overlapping binding sites. In addition, it demonstrates that there is a need for larger, chemically diverse datasets that have been measured in a panel of different assays. This would certainly alleviate the search for other inter-correlations between bioactivity data yielded by different assay setups.

8.
Nat Genet ; 43(4): 281-3, 2011 Mar 29.
Article in English | MEDLINE | ID: mdl-21445068

ABSTRACT

Data citation and the derivation of semantic constructs directly from datasets have now both found their place in scientific communication. The social challenge facing us is to maintain the value of traditional narrative publications and their relationship to the datasets they report upon while at the same time developing appropriate metrics for citation of data and data constructs.


Subject(s)
Databases, Genetic , Communication , Genetic Variation , Humans , Knowledge Bases , Publishing
9.
PLoS One ; 4(11): e7894, 2009 Nov 18.
Article in English | MEDLINE | ID: mdl-19924298

ABSTRACT

We have developed a method that predicts Protein-Protein Interactions (PPIs) based on the similarity of the context in which proteins appear in literature. This method outperforms previously developed PPI prediction algorithms that rely on the conjunction of two protein names in MEDLINE abstracts. We show significant increases in coverage (76% versus 32%) and sensitivity (66% versus 41% at a specificity of 95%) for the prediction of PPIs currently archived in 6 PPI databases. A retrospective analysis shows that PPIs can efficiently be predicted before they enter PPI databases and before their interaction is explicitly described in the literature. The practical value of the method for discovery of novel PPIs is illustrated by the experimental confirmation of the inferred physical interaction between CAPN3 and PARVB, which was based on frequent co-occurrence of both proteins with concepts like Z-disc, dysferlin, and alpha-actinin. The relationships between proteins predicted by our method are broader than PPIs, and include proteins in the same complex or pathway. Dependent on the type of relationships deemed useful, the precision of our method can be as high as 90%. The full set of predicted interactions is available in a downloadable matrix and through the webtool Nermal, which lists the most likely interaction partners for a given protein. Our framework can be used for prioritizing potential interaction partners, hitherto undiscovered, for follow-up studies and to aid the generation of accurate protein interaction maps.


Subject(s)
Computational Biology/methods , Protein Interaction Mapping/methods , Proteins/chemistry , Algorithms , Animals , Calpain/metabolism , Cloning, Molecular , Humans , MEDLINE , Mice , Models, Statistical , Muscle Proteins/metabolism , Protein Binding , Proteins/metabolism , ROC Curve , Recombinant Proteins/chemistry , Reproducibility of Results , United States
10.
Genome Biol ; 9(5): R89, 2008.
Article in English | MEDLINE | ID: mdl-18507872

ABSTRACT

WikiProteins enables community annotation in a Wiki-based system. Extracts of major data sources have been fused into an editable environment that links out to the original sources. Data from community edits create automatic copies of the original data. Semantic technology captures concepts co-occurring in one sentence and thus potential factual statements. In addition, indirect associations between concepts have been calculated. We call on a 'million minds' to annotate a 'million concepts' and to collect facts from the literature with the reward of collaborative knowledge discovery. The system is available for beta testing at http://www.wikiprofessional.org.


Subject(s)
Databases, Protein , Proteins/genetics , Software , Information Storage and Retrieval , Internet
11.
Proteomics ; 7(6): 921-31, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17370270

ABSTRACT

Attribution of the most probable functions to proteins identified by proteomics is a significant challenge that requires extensive literature analysis. We have developed a system for automated prediction of implicit and explicit biologically meaningful functions for a proteomics study of the nucleolus. This approach uses a set of vocabulary terms to map and integrate the information from the entire MEDLINE database. Based on a combination of cross-species sequence homology searches and the corresponding literature, our approach facilitated the direct association between sequence data and information from biological texts describing function. Comparison of our automated functional assignment to manual annotation demonstrated our method to be highly effective. To establish the sensitivity, we defined the functional subtleties within a family containing a highly conserved sequence. Clustering of the DEAD-box protein family of RNA helicases confirmed that these proteins shared similar morphology although functional subfamilies were accurately identified by our approach. We visualized the nucleolar proteome in terms of protein functions using multi-dimensional scaling, showing functional associations between nucleolar proteins that were not previously realized. Finally, by clustering the functional properties of the established nucleolar proteins, we predicted novel nucleolar proteins. Subsequently, nonproteomics studies confirmed the predictions of previously unidentified nucleolar proteins.


Subject(s)
MEDLINE , Nuclear Proteins , Amino Acid Sequence , Animals , DEAD-box RNA Helicases/chemistry , DEAD-box RNA Helicases/genetics , DEAD-box RNA Helicases/metabolism , Databases, Protein , Humans , Molecular Sequence Data , Nuclear Proteins/chemistry , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Proteome
12.
Int J Med Inform ; 76(2-3): 195-200, 2007.
Article in English | MEDLINE | ID: mdl-16815739

ABSTRACT

PROBLEM: Key word assignment has been largely used in MEDLINE to provide an indicative "gist" of the content of articles and to help retrieve biomedical articles. Abstracts are also used for this purpose. However, with usually more than 300 words, MEDLINE abstracts can still be regarded as long documents; therefore we design a system to select a unique key sentence. This key sentence must be indicative of the article's content and we assume that abstract's conclusions are good candidates. We design and assess the performance of an automatic key sentence selector, which classifies sentences into four argumentative moves: PURPOSE, METHODS, RESULTS and CONCLUSION. METHODS: We rely on Bayesian classifiers trained on automatically acquired data. Features representation, selection and weighting are reported and classification effectiveness is evaluated on the four classes using confusion matrices. We also explore the use of simple heuristics to take the position of sentences into account. Recall, precision and F-scores are computed for the CONCLUSION class. For the CONCLUSION class, the F-score reaches 84%. Automatic argumentative classification using Bayesian learners is feasible on MEDLINE abstracts and should help user navigation in such repositories.


Subject(s)
Abstracting and Indexing , Information Storage and Retrieval/methods , Libraries, Digital , MEDLINE , Natural Language Processing , Artificial Intelligence , Bayes Theorem , Bibliometrics , Periodicals as Topic , Terminology as Topic , Vocabulary, Controlled
13.
Int J Med Inform ; 75(6): 488-95, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16165395

ABSTRACT

The aim of this study is to investigate the relationships between citations and the scientific argumentation found in abstracts. We design a related article search task and observe how the argumentation can affect the search results. We extracted citation lists from a set of 3200 full-text papers originating from a narrow domain. In parallel, we recovered the corresponding MEDLINE records for analysis of the argumentative moves. Our argumentative model is founded on four classes: PURPOSE, METHODS, RESULTS and CONCLUSION. A Bayesian classifier trained on explicitly structured MEDLINE abstracts generates these argumentative categories. The categories are used to generate four different argumentative indexes. A fifth index contains the complete abstract, together with the title and the list of Medical Subject Headings (MeSH) terms. To appraise the relationship of the moves to the citations, the citation lists were used as the criteria for determining relatedness of articles, establishing a benchmark; that is, two articles are considered "related" if they share a significant set of co-citations. Our results show that the average precision of queries with the PURPOSE and CONCLUSION features is the highest, while the precision of the RESULTS and METHODS features was relatively low. A linear weighting combination of the moves is proposed, which significantly improves retrieval of related articles.


Subject(s)
Abstracting and Indexing/methods , Information Storage and Retrieval/methods , Libraries, Digital , MEDLINE , Natural Language Processing , Terminology as Topic , Vocabulary, Controlled , Algorithms , Artificial Intelligence , Benchmarking , Bibliometrics , Database Management Systems , Information Storage and Retrieval/standards , Periodicals as Topic
14.
Mass Spectrom Rev ; 25(2): 215-34, 2006.
Article in English | MEDLINE | ID: mdl-16211575

ABSTRACT

Nucleoli are plurifunctional nuclear domains involved in the regulation of several major cellular processes such as ribosome biogenesis, the biogenesis of non-ribosomal ribonucleoprotein complexes, cell cycle, and cellular aging. Until recently, the protein content of nucleoli was poorly described. Several proteomic analyses have been undertaken to discover the molecular bases of the biological roles fulfilled by nucleoli. These studies have led to the identification of more than 700 proteins. Extensive bibliographic and bioinformatic analyses allowed the classification of the identified proteins into functional groups and suggested potential functions of 150 human proteins previously uncharacterized. The combination of improvements in mass spectrometry technologies, the characterization of protein complexes, and data mining will assist in furthering our understanding of the role of nucleoli in different physiological and pathological cell states.


Subject(s)
Cell Nucleolus/metabolism , Nuclear Proteins/analysis , Proteome , Proteomics/methods , Humans
15.
Stud Health Technol Inform ; 116: 835-40, 2005.
Article in English | MEDLINE | ID: mdl-16160362

ABSTRACT

PROBLEM: Key word assignment has been largely used in MEDLINE to provide an indicative "gist" of the content of articles. Abstracts are also used for this purpose. However with usually more than 300 words, abstracts can still be regarded as long documents; therefore we design a system to select a unique key sentence. This key sentence must be indicative of the article's content and we assume that abstract's conclusions are good candidates. We design and assess the performance of an automatic key sentence selector, which classifies sentences into 4 argumentative moves: PURPOSE, METHODS, RESULTS and CONCLUSION. METHODS: We rely on Bayesian classifiers trained on automatically acquired data. Features representation, selection and weighting are reported and classification effectiveness is evaluated on the four classes using confusion matrices. We also explore the use of simple heuristics to take the position of sentences into account. Recall, precision and F-scores are computed for the CONCLUSION class. For the CONCLUSION class, the F-score reaches 84%. Automatic argumentative classification is feasible on MEDLINE abstracts and should help user navigation in such repositories.


Subject(s)
Bayes Theorem , MEDLINE , Humans , Natural Language Processing
16.
Comput Biol Chem ; 27(1): 29-35, 2003 Feb.
Article in English | MEDLINE | ID: mdl-12798037

ABSTRACT

Proteomics enforces the reverse chronological order on the gene to protein dogma and imposes amino acid sequences as a starting point of an investigation relative to function. By this approach, proteomics data can confirm the presence of multiple forms of a protein. Notwithstanding variations attributed to specific individual features of organisms and tissues, from two to over ten protein forms can be identified in a given sample. The present work describes some guidelines for tracking the origin of alternative protein forms and attempts to tag the details of sequence data in the literature. Working via these guidelines we have uncovered a third alternative form of the Pim subfamily of oncogenes. The term form is here combined with the qualification alternative to describe any product of a given gene including closely related paralogs. This paper also emphasizes the need for consistency checks in annotation processes, such as gene clustering, to avoid losing important details describing protein alternative forms. By identifying alternative protein forms, we illustrate the fact that rationalizing of protein function via the identification of protein-protein interactions should in reality be that of identifying (alternative) form-form interactions.


Subject(s)
Proteomics/standards , Proto-Oncogene Proteins/genetics , Amino Acid Sequence/genetics , Animals , Computational Biology/methods , Computational Biology/standards , DNA, Complementary/classification , DNA, Complementary/genetics , Databases, Protein/statistics & numerical data , Expressed Sequence Tags , Genetic Variation , Humans , Molecular Sequence Data , Multigene Family/genetics , Protein Serine-Threonine Kinases/chemistry , Protein Serine-Threonine Kinases/classification , Protein Serine-Threonine Kinases/genetics , Proteomics/methods , Proto-Oncogene Proteins/chemistry , Proto-Oncogene Proteins/classification , Proto-Oncogene Proteins c-pim-1 , Quality Control , Sequence Alignment , Sequence Homology, Amino Acid , Swine
...