AutoBind: automatic extraction of protein-ligand-binding affinity data from biological literature.
Bioinformatics; 28(16): 2162-8, 2012 Aug 15.
Article | En | MEDLINE | ID: mdl-22753780
ABSTRACT
MOTIVATION: Determining the binding affinity of a protein-ligand complex is important for quantifying whether a particular small molecule will bind to the target protein. In addition, comprehensive datasets of protein-ligand complexes and their corresponding binding affinities are crucial for developing accurate scoring functions that predict the binding affinities of previously unknown protein-ligand complexes. Over the past decades, several databases of protein-ligand-binding affinities have been created via visual extraction from the literature. However, such approaches are time-consuming, and most of these databases are updated only a few times per year. Hence, there is an immediate demand for an automatic, high-precision extraction method for collecting binding affinities.
RESULTS: We have created a new database of protein-ligand-binding affinity data, AutoBind, based on automatic information retrieval. We first compiled a collection of 1586 articles in which the binding affinities had been marked manually. Based on this annotated collection, we designed four sentence patterns that are used to scan full-text articles, as well as a scoring function to rank the sentences that match our patterns. The proposed sentence patterns can effectively identify the binding affinities in full-text articles. Our assessment shows that AutoBind achieved 84.22% precision and 79.07% recall on the testing corpus. Currently, 13 616 protein-ligand complexes and the corresponding binding affinities have been deposited in AutoBind from 17 221 articles.
AVAILABILITY: AutoBind is automatically updated on a monthly basis, and it is freely available at http://autobind.csie.ncku.edu.tw/ and http://autobind.mc.ntu.edu.tw/. All of the deposited binding affinities have been refined and approved manually before being released.
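The abstract does not spell out the four sentence patterns or the scoring function, so the following Python sketch is only an illustration of the general idea: scan sentences with a pattern for an affinity measure followed by a value and a unit, then rank the matching sentences. The regular expression, the cue-word list, the scoring rule, and all names (`AFFINITY_RE`, `CUE_WORDS`, `extract_affinities`) are assumptions made for this example, not AutoBind's actual method.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern (not one of AutoBind's four):
# "<measure> [values] (=|of|was|is|:) <number> <unit>", e.g. "Kd of 5.2 nM".
AFFINITY_RE = re.compile(
    r"\b(?P<measure>K[di]|IC50|EC50)\b"
    r"(?:\s+values?)?"
    r"\s*(?:=|of|was|is|:)\s*"
    r"(?P<value>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>[pnuµμm]?M)\b"
)

# Hypothetical cue words used to boost sentences that discuss binding.
CUE_WORDS = ("binding", "affinity", "inhibit", "dissociation")


@dataclass
class Hit:
    sentence: str
    measure: str
    value: float
    unit: str
    score: int


def split_sentences(text: str) -> list[str]:
    # Naive splitter: break after ., ! or ? when followed by whitespace
    # and an uppercase letter, so decimals such as "5.2" stay intact.
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)
    return [p.strip() for p in parts if p.strip()]


def extract_affinities(text: str) -> list[Hit]:
    """Return pattern matches, highest-scoring sentences first."""
    hits = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        # Assumed scoring rule: 1 for the match plus 1 per cue word present.
        score = 1 + sum(cue in lowered for cue in CUE_WORDS)
        for m in AFFINITY_RE.finditer(sentence):
            hits.append(
                Hit(sentence, m["measure"], float(m["value"]), m["unit"], score)
            )
    # sorted() is stable, so ties keep their order of appearance.
    return sorted(hits, key=lambda h: h.score, reverse=True)


if __name__ == "__main__":
    sample = (
        "The compound bound the kinase with a Kd of 5.2 nM. "
        "We thank our colleagues. "
        "Inhibition was weaker, with IC50 = 1.8 uM against the mutant."
    )
    for hit in extract_affinities(sample):
        print(hit.score, hit.measure, hit.value, hit.unit)
```

A single regular expression like this will miss many phrasings found in real articles (ranges, scientific notation, values reported in tables), which is presumably why the authors derived several patterns from an annotated corpus and reviewed the results manually before release.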
Full text: 1
Collection: 01-international
Database: MEDLINE
Main subject: Protein Binding / Software / Databases, Factual / Information Storage and Retrieval / Ligands
Study type: Prognostic_studies
Language: En
Journal: Bioinformatics
Journal subject: MEDICAL INFORMATICS
Year: 2012
Document type: Article
Country of affiliation: Taiwan