1.
J Cheminform ; 4(1): 15, 2012 Aug 07.
Article in English | MEDLINE | ID: mdl-22870956

ABSTRACT

This paper introduces a subdomain chemistry format for storing computational chemistry data called CompChem. It has been developed based on the design, concepts and methodologies of Chemical Markup Language (CML) by adding computational chemistry semantics on top of the CML Schema. The format allows a wide range of ab initio quantum chemistry calculations of individual molecules to be stored. These calculations include, for example, single point energy calculation, molecular geometry optimization, and vibrational frequency analysis. The paper also describes the supporting infrastructure, such as processing software, dictionaries, validation tools and database repositories. In addition, some of the challenges and difficulties in developing common computational chemistry dictionaries are discussed. The uses of CompChem are illustrated by two practical applications.

2.
J Chem Inf Model ; 52(10): 2494-500, 2012 Oct 22.
Article in English | MEDLINE | ID: mdl-22900941

ABSTRACT

A plethora of articles on naive Bayes classifiers, where the chemical compounds to be classified are represented by binary-valued (absent or present type) descriptors, have appeared in the cheminformatics literature over the past decade. The principal goal of this paper is to describe how a naive Bayes classifier based on binary descriptors (NBCBBD) can be employed as a feature selector in an efficient manner suitable for cheminformatics. In the process, we point out a fact well documented in other disciplines that NBCBBD is a linear classifier and is therefore intrinsically suboptimal for classifying compounds that are nonlinearly separable in their binary descriptor space. We investigate the performance of the proposed algorithm on classifying a subset of the MDDR data set, a standard molecular benchmark data set, into active and inactive compounds.
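The linearity claim in this abstract can be checked numerically. The sketch below is a minimal illustration, not the paper's algorithm: it fits a Bernoulli naive Bayes model on a made-up toy fingerprint set, then rewrites the log-odds as `w . x + b`. The function names, the Laplace smoothing constant `alpha`, and the toy data are all assumptions introduced here.

```python
import math

def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate class priors and per-bit probabilities with Laplace smoothing."""
    n_bits = len(X[0])
    params = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        # p[j] = P(bit j is set | class c); smoothing keeps it away from 0 and 1
        p = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
             for j in range(n_bits)]
        params[c] = (prior, p)
    return params

def log_odds_direct(x, params):
    """log P(x, active) - log P(x, inactive), straight from the likelihoods."""
    def log_joint(c):
        prior, p = params[c]
        return math.log(prior) + sum(
            math.log(pj) if xj else math.log(1.0 - pj)
            for xj, pj in zip(x, p))
    return log_joint(1) - log_joint(0)

def linear_form(params):
    """Rewrite the same log-odds as a weight vector w and a bias b."""
    (prior0, p0), (prior1, p1) = params[0], params[1]
    # Each weight is the difference of the per-class logits of one bit.
    w = [math.log(q1 / (1 - q1)) - math.log(q0 / (1 - q0))
         for q1, q0 in zip(p1, p0)]
    bias = math.log(prior1 / prior0) + sum(
        math.log((1 - q1) / (1 - q0)) for q1, q0 in zip(p1, p0))
    return w, bias

# Hypothetical toy data: rows are binary descriptors, labels are 1 = active.
X = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0], [1, 1, 1], [0, 0, 0]]
y = [1, 1, 0, 0, 1, 0]

params = fit_bernoulli_nb(X, y)
w, bias = linear_form(params)
for x in X:
    linear = sum(wj * xj for wj, xj in zip(w, x)) + bias
    assert abs(log_odds_direct(x, params) - linear) < 1e-9
```

Because the decision rule reduces to the sign of `w . x + b`, the decision boundary is a hyperplane in descriptor space. Ranking bits by `abs(w[j])` is one plausible way to use such a model for feature selection, though whether that matches the paper's procedure is not stated in the abstract.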


Subject(s)
Algorithms; Biological Products/chemistry; Enzyme Inhibitors/chemistry; Bayes Theorem; Biological Products/pharmacology; Enzyme Inhibitors/pharmacology; Humans; Informatics; Models, Molecular; Structure-Activity Relationship
3.
J Cheminform ; 3(1): 39, 2011 Oct 14.
Article in English | MEDLINE | ID: mdl-21999395

ABSTRACT

CMLLite is a collection of definitions and processes which provide strong and flexible validation for a document in Chemical Markup Language (CML). It consists of an updated CML schema (schema3), conventions specifying rules in both human and machine-understandable forms and a validator available both online and offline to check conformance. This article explores the rationale behind the changes which have been made to the schema, explains how conventions interact and how they are designed, formulated, implemented and tested, and gives an overview of the validation service.

4.
J Cheminform ; 3(1): 42, 2011 Oct 14.
Article in English | MEDLINE | ID: mdl-21999475

ABSTRACT

The World-Wide Molecular Matrix (WWMM) is a ten-year project to create a peer-to-peer (P2P) system for the publication and collection of chemical objects, including over 250,000 molecules. It has now been instantiated in a number of repositories which include data encoded in Chemical Markup Language (CML) and linked by URIs and RDF. The technical specification and implementation are now complete. We discuss the types of architecture required to implement nodes in the WWMM and consider the social issues involved in adoption.

5.
J Cheminform ; 3: 43, 2011 Oct 14.
Article in English | MEDLINE | ID: mdl-21999509

ABSTRACT

The semantic architecture of CML consists of conventions, dictionaries and units. The conventions conform to a top-level specification and each convention can constrain compliant documents through machine-processing (validation). Dictionaries conform to a dictionary specification which also imposes machine validation on the dictionaries. Each dictionary can also be used to validate data in a CML document, and provide human-readable descriptions. An additional set of conventions and dictionaries is used to support scientific units. All conventions, dictionaries and dictionary elements are identifiable and addressable through unique URIs.
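As a rough illustration of URI-addressable dictionary entries, the sketch below expands prefixed `dictRef` values in a CML-like fragment into full URIs using the document's own namespace declarations. The `ex` prefix, the `http://example.org/...` dictionary namespace, the `totalEnergy` entry and the numeric value are hypothetical, and joining namespace and local name by plain concatenation is an assumption rather than a rule taken from the abstract.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical CML-like fragment; only the CML schema namespace is real.
DOC = """<cml xmlns="http://www.xml-cml.org/schema"
     xmlns:ex="http://example.org/dictionary/demo/">
  <property dictRef="ex:totalEnergy">
    <scalar>-76.02</scalar>
  </property>
</cml>"""

def resolve_dict_refs(xml_text):
    """Return the full URI for every prefixed dictRef attribute in the document."""
    prefixes = {}   # prefix -> namespace URI (simplified: no scoping or shadowing)
    resolved = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
        else:
            ref = payload.get("dictRef")
            if ref and ":" in ref:
                prefix, local = ref.split(":", 1)
                if prefix in prefixes:
                    # Assumption: entry URI = namespace URI + local name
                    resolved.append(prefixes[prefix] + local)
    return resolved

print(resolve_dict_refs(DOC))
```

A real validator would go further and fetch the dictionary at that URI to check the value against the entry's declared type and units; this sketch stops at building the address.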

6.
J Cheminform ; 3: 45, 2011 Oct 14.
Article in English | MEDLINE | ID: mdl-21999587

ABSTRACT

The Ami project was a six-month Rapid Innovation project sponsored by JISC to explore the Virtual Research Environment space. The project brainstormed with chemists and decided to investigate ways to facilitate monitoring and collection of experimental data. A frequently encountered use case was identified in which the chemist reaches the end of an experiment and finds an unexpected result. The ability to replay events can significantly help make sense of how things progressed. The project therefore concentrated on collecting a variety of dimensions of ancillary data, that is, data that would not normally be collected due to practicality constraints. There were three main areas of investigation: 1) Development of a monitoring tool using infrared and ultrasonic sensors; 2) Time-lapse motion video capture (for example, videoing 5 seconds in every 60); and 3) Activity-driven video monitoring of the fume cupboard environs. The Ami client application was developed to control these separate logging functions. The application builds up a timeline of the events in the experiment and around the fume cupboard. The videos and data logs can then be reviewed after the experiment in order to help the chemist determine the exact timings and conditions used. The project experimented with ways in which a Microsoft Kinect could be used in a laboratory setting. Investigations suggest that it would not be an ideal device for controlling a mouse, but it shows promise for uses such as manipulating virtual molecules.
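The "5 seconds in every 60" time-lapse schedule mentioned above can be expressed as a simple duty cycle. This is a minimal sketch under stated assumptions: the function names and defaults are invented for illustration, and nothing here reflects how the Ami client actually schedules its cameras.

```python
def in_capture_window(elapsed_s, period_s=60.0, capture_s=5.0):
    """True while recording should be on, given seconds since the experiment began."""
    return (elapsed_s % period_s) < capture_s

def capture_windows(total_s, period_s=60.0, capture_s=5.0):
    """List the (start, end) recording intervals over an experiment of total_s seconds."""
    windows = []
    start = 0.0
    while start < total_s:
        # Clip the final window if the experiment ends mid-capture.
        windows.append((start, min(start + capture_s, total_s)))
        start += period_s
    return windows

print(capture_windows(130))
```

Keeping the intervals as explicit `(start, end)` pairs makes it straightforward to place them on the same timeline as sensor events when reviewing an experiment afterwards.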

7.
J Chem Inf Model ; 50(2): 251-61, 2010 Feb 22.
Article in English | MEDLINE | ID: mdl-20088574

ABSTRACT

The SPECTRa-T project has developed text-mining tools to extract named chemical entities (NCEs), such as chemical names and terms, and chemical objects (COs), e.g., experimental spectral assignments and physical chemistry properties, from electronic theses (e-theses). Although NCEs were readily identified within the two major document formats studied, only the use of structured documents enabled identification of chemical objects and their association with the relevant chemical entity (e.g., systematic chemical name). A corpus of theses was analyzed and it is shown that a high degree of semantic information can be extracted from structured documents. This integrated information has been deposited in a persistent Resource Description Framework (RDF) triple-store that allows users to conduct semantic searches. The strengths and weaknesses of several document formats are reviewed.


Subject(s)
Academic Dissertations as Topic; Chemistry/education; Data Mining/methods; Software; Databases, Factual; Electronic Data Processing; False Positive Reactions
8.
Org Biomol Chem ; 2(22): 3294-300, 2004 Nov 21.
Article in English | MEDLINE | ID: mdl-15534707

ABSTRACT

Automatically extracting chemical information from documents is a challenging task, but an essential one for dealing with the vast quantity of data that is available. The task is least difficult for structured documents, such as chemistry department web pages or the output of computational chemistry programs, but requires increasingly sophisticated approaches for less structured documents, such as chemical papers. The identification of key units of information, such as chemical names, makes the extraction of useful information from unstructured documents possible.


Subject(s)
Chemistry/methods; Electronic Data Processing/methods; Software; Internet; Terminology as Topic