ABSTRACT
Knowledge about the 3-dimensional structure, orientation and interaction of chemical compounds is important in many areas of science and technology. X-ray crystallography is one of the experimental techniques capable of providing a large amount of structural information for a given compound, and it is widely used for characterisation of organic and metal-organic molecules. The method provides precise 3D coordinates of atoms inside crystals; however, it does not directly deliver information about certain chemical characteristics such as bond orders, delocalization, charges, lone electron pairs or lone electrons. These aspects of a molecular model have to be derived from crystallographic data using refined information about interatomic distances and atom types, together with general chemical knowledge. This publication describes a curated automatic pipeline for the derivation of chemical attributes of molecules from crystallographic models. The method is applied to build a catalogue of chemical entities in an open-access crystallographic database, the Crystallography Open Database (COD). The catalogue of such chemical entities is provided openly as a derived database. The content of this catalogue and the problems arising in the fully automated pipeline are discussed, along with the possibilities to introduce manual data curation into the process.
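As a rough illustration of how connectivity can be inferred from refined interatomic distances, the sketch below marks two atoms as bonded when their distance is below the sum of their covalent radii plus a tolerance. This is a common textbook heuristic and not the pipeline described in the abstract; the small radii table, the 0.4 Å tolerance, and the function name `perceive_bonds` are assumptions chosen for the example.

```python
from itertools import combinations
from math import dist

# Approximate single-bond covalent radii in angstroms (illustrative subset).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def perceive_bonds(atoms, tolerance=0.4):
    """Return index pairs (i, j) of atoms closer than r_i + r_j + tolerance.

    `atoms` is a list of (element, (x, y, z)) tuples in Cartesian angstroms.
    """
    bonds = []
    for (i, (el_i, xyz_i)), (j, (el_j, xyz_j)) in combinations(enumerate(atoms), 2):
        cutoff = COVALENT_RADII[el_i] + COVALENT_RADII[el_j] + tolerance
        if dist(xyz_i, xyz_j) < cutoff:
            bonds.append((i, j))
    return bonds


# Hypothetical water geometry: one O and two H atoms.
water = [
    ("O", (0.0, 0.0, 0.0)),
    ("H", (0.9572, 0.0, 0.0)),
    ("H", (-0.24, 0.927, 0.0)),
]
print(perceive_bonds(water))
```

A distance test of this kind only yields connectivity. Bond orders, charges and delocalization would need additional rules based on atom types and valence, which is where general chemical knowledge enters.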
ABSTRACT
We present an efficient algorithm for substructure search in combinatorial libraries defined by synthons, i.e., substructures with connection points. Our method improves on existing approaches by introducing powerful heuristics and fast fingerprint screening to quickly eliminate branches of nonmatching combinations of synthons. With this, we achieve typical response times of a few seconds on a standard desktop computer for searches in large combinatorial libraries like the Enamine REAL Space. We published the Java source as part of the OpenChemLib under the BSD license, and we implemented tools to enable substructure search in custom combinatorial libraries.
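The fingerprint screening idea can be sketched in a few lines: a candidate can only contain the query as a substructure if every fingerprint bit set for the query is also set for the candidate. The code below is a generic illustration using Python integers as bit sets; it is not the OpenChemLib implementation, and the names `may_contain` and `screen` as well as the 4-bit fingerprints are hypothetical.

```python
def may_contain(candidate_fp: int, query_fp: int) -> bool:
    """True if every bit set in the query is also set in the candidate."""
    return (candidate_fp & query_fp) == query_fp


def screen(fingerprints: dict, query_fp: int) -> list:
    """Return the names of entries that survive the fingerprint pre-filter."""
    return [name for name, fp in fingerprints.items() if may_contain(fp, query_fp)]


# Hypothetical 4-bit fingerprints for three synthons.
synthons = {"a": 0b1011, "b": 0b0011, "c": 0b1110}
survivors = screen(synthons, 0b1010)
print(survivors)
```

Entries that pass the filter are only candidates and still need an exact graph match. The filter's value lies in discarding most nonmatching entries cheaply. How a query is split across several synthons is not covered by this sketch.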
Subject(s)
Algorithms; Combinatorial Chemistry Techniques; Gene Library
ABSTRACT
In drug discovery, molecules are optimized towards desired properties. In this context, machine learning is used for extrapolation in drug discovery projects. The limits of extrapolation for regression models are known. However, a systematic analysis of the effectiveness of extrapolation in drug discovery has not yet been performed. In response, this study examined the capabilities of six machine learning algorithms to extrapolate from 243 datasets. The response values calculated from the molecules in the datasets were molecular weight, cLogP, and the number of sp3-atoms. Three experimental setups were chosen for the response values. Shuffled data were used for interpolation, whereas data for extrapolation were sorted from high to low values, and the reverse. Extrapolation with sorted data resulted in much larger prediction errors than interpolation with shuffled data. Additionally, this study demonstrated that linear machine learning methods are preferable for extrapolation.
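The three setups can be pictured as three ways of ordering the samples before cutting off a test set. The sketch below is an assumption about how such splits might be built, not the study's code; the function name `split_indices`, the 20% test fraction and the mode labels are hypothetical.

```python
import random


def split_indices(y, mode, test_fraction=0.2, seed=0):
    """Return (train, test) index lists for one of three split modes.

    "shuffled"    : random order, so test values lie inside the training range.
    "low_to_high" : train on the lowest responses, test on the highest.
    "high_to_low" : train on the highest responses, test on the lowest.
    """
    idx = list(range(len(y)))
    if mode == "shuffled":
        random.Random(seed).shuffle(idx)
    elif mode == "low_to_high":
        idx.sort(key=lambda i: y[i])
    elif mode == "high_to_low":
        idx.sort(key=lambda i: y[i], reverse=True)
    else:
        raise ValueError(f"unknown mode: {mode}")
    n_test = max(1, round(len(y) * test_fraction))
    return idx[:-n_test], idx[-n_test:]


# Hypothetical response values, e.g. molecular weights scaled to small numbers.
y = [5, 1, 4, 2, 3, 10, 7, 8, 6, 9]
train, test = split_indices(y, "low_to_high")
print(sorted(y[i] for i in test))
```

With a sorted split, every test value lies outside the range seen in training, so a model has to extrapolate. With a shuffled split, the test values usually fall inside that range.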
ABSTRACT
Synthetically accessible chemical spaces provide a valuable source to search for small-molecule analogues or new starting points in drug discovery projects. Having a toolbox at hand that can automatically create searchable representations of such spaces, using reaction definitions and building blocks as inputs, is a prerequisite to put this approach into practice. Herein, we present a tool kit to create such virtual chemical spaces. It is part of the OpenChemLib, an open-source cheminformatics tool kit. Furthermore, we demonstrate the creation of a chemical space of several billion molecules from commercial building blocks and a list of common organic chemistry reactions.
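To see why a modest number of building blocks and reactions can span billions of molecules, it helps to count products without enumerating them. The sketch below assumes each reaction is described only by the reactant roles it needs and that every building block matching a role is compatible; the function `space_size` and the example data are hypothetical and do not reflect the OpenChemLib API.

```python
from math import prod


def space_size(reactions, blocks_by_role):
    """Count the products: for each reaction, multiply the block counts per role."""
    return sum(
        prod(len(blocks_by_role.get(role, ())) for role in roles)
        for roles in reactions.values()
    )


# Hypothetical two-component reactions and building blocks grouped by role.
reactions = {
    "amide_coupling": ("acid", "amine"),
    "suzuki_coupling": ("boronic_acid", "aryl_halide"),
}
blocks_by_role = {
    "acid": ["A1", "A2", "A3"],
    "amine": ["N1", "N2"],
    "boronic_acid": ["B1"],
    "aryl_halide": ["X1", "X2", "X3", "X4"],
}
print(space_size(reactions, blocks_by_role))
```

Because the count grows multiplicatively per reaction, a searchable representation that stores synthons rather than enumerated products keeps the storage cost close to the number of building blocks.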
Subject(s)
Drug Discovery
ABSTRACT
BACKGROUND: Analyses of few gene-sets in epilepsy showed a potential to unravel key disease associations. We set out to investigate the burden of ultra-rare variants (URVs) in a comprehensive range of biologically informed gene-sets presumed to be implicated in epileptogenesis. METHODS: The burden of 12 URV types in 92 gene-sets was compared between cases and controls using whole exome sequencing data from individuals of European descent with developmental and epileptic encephalopathies (DEE, n = 1,003), genetic generalized epilepsy (GGE, n = 3,064), or non-acquired focal epilepsy (NAFE, n = 3,522), collected by the Epi25 Collaborative, compared to 3,962 ancestry-matched controls. FINDINGS: Missense URVs in highly constrained regions were enriched in neuron-specific and developmental genes, whereas genes not expressed in brain were not affected. GGE featured a higher burden in gene-sets derived from inhibitory vs. excitatory neurons or associated receptors, whereas the opposite was found for NAFE, and DEE featured a burden in both. Top-ranked susceptibility genes from recent genome-wide association studies (GWAS) and gene-sets derived from generalized vs. focal epilepsies revealed specific enrichment patterns of URVs in GGE vs. NAFE. INTERPRETATION: Missense URVs affecting highly constrained sites differentially impact genes expressed in inhibitory vs. excitatory pathways in generalized vs. focal epilepsies. The excess of URVs in top-ranked GWAS risk-genes suggests a convergence of rare deleterious and common risk-variants in the pathogenesis of generalized and focal epilepsies. FUNDING: DFG Research Unit FOR-2715 (Germany), FNR (Luxembourg), NHGRI (US), NHLBI (US), DAAD (Germany).
Subject(s)
Epilepsies, Partial/genetics; Epilepsy, Generalized/genetics; Genetic Predisposition to Disease/genetics; Genetic Variation/genetics; Case-Control Studies; Exome/genetics; Female; Genome-Wide Association Study/methods; Humans; Male; Exome Sequencing/methods
ABSTRACT
Increasing evidence indicates the pathogenetic relevance of regulatory genomic motifs for variability in the manifestation of brain disorders. In this context, cis-regulatory effects of single nucleotide polymorphisms (SNPs) on gene expression can contribute to changing transcript levels of excitability-relevant molecules and episodic seizure manifestation in epilepsy. Biopsy specimens of patients undergoing epilepsy surgery for seizure relief provide unique insights into the impact of promoter SNPs on corresponding mRNA expression. Here, we have scrutinized whether two linked regulatory SNPs (rs2744575; 4779C > G and rs4646830; 4854C > G) located in the aldehyde dehydrogenase 5a1 (succinic semialdehyde dehydrogenase; ALDH5A1) gene promoter are associated with expression of corresponding mRNAs in epileptic hippocampi (n = 43). The minor ALDH5A1-GG haplotype associates with significantly lower ALDH5A1 transcript abundance. Complementary in vitro analyses in neural cell cultures confirm this difference and further reveal a significantly constricted range for the minor ALDH5A1 haplotype of promoter activity regulation through the key epileptogenesis transcription factor Egr1 (early growth response 1). The present data suggest systematic analyses in human hippocampal tissue as a useful approach to unravel the impact of epilepsy candidate SNPs on associated gene expression. Aberrant ALDH5A1 promoter regulation in functional terms can contribute to impaired γ-aminobutyric acid homeostasis and thereby network excitability and seizure propensity.
Subject(s)
Epilepsy, Temporal Lobe/genetics; Hippocampus/metabolism; Neurons/metabolism; RNA, Messenger/metabolism; Succinate-Semialdehyde Dehydrogenase/genetics; Animals; Cell Line; Early Growth Response Protein 1/metabolism; Epilepsy, Temporal Lobe/pathology; Epilepsy, Temporal Lobe/surgery; Gene Expression Profiling; Haplotypes; Hippocampus/pathology; Humans; In Vitro Techniques; Mice; Polymorphism, Single Nucleotide; Promoter Regions, Genetic/genetics; Rats; Sclerosis
ABSTRACT
Aim: Pharmacoresistance is a major burden in epilepsy treatment. We aimed to identify genetic biomarkers of response to specific antiepileptic drugs (AEDs) in genetic generalized epilepsies (GGE). Materials & methods: We conducted a genome-wide association study (GWAS) of 3.3 million autosomal SNPs in 893 European subjects with GGE who were responsive or nonresponsive to lamotrigine, levetiracetam and valproic acid. Results: Our GWAS of AED response revealed suggestive evidence for association at 29 genomic loci (p < 10^-5) but no significant association, reflecting its limited power. The suggestive associations highlight candidate genes that are implicated in epileptogenesis and neurodevelopment. Conclusion: This first GWAS of AED response in GGE provides a comprehensive reference of SNP associations for hypothesis-driven candidate gene analyses in upcoming pharmacogenetic studies.
Subject(s)
Anticonvulsants/therapeutic use; Epilepsy, Generalized/drug therapy; Epilepsy, Generalized/genetics; Genome-Wide Association Study/methods; Adolescent; Case-Control Studies; Child; Cohort Studies; Epilepsy, Generalized/epidemiology; Europe/epidemiology; Female; Humans; Lamotrigine/therapeutic use; Levetiracetam/therapeutic use; Male; Retrospective Studies; Treatment Outcome; Valproic Acid/therapeutic use
ABSTRACT
There have been decades of research on determining and predicting acid dissociation constants (pKa) and tautomer ratios, both experimentally and theoretically. However, the lack of an extensive publicly available database of measured tautomeric ratios in water and nonaqueous solvents poses a challenge for researchers interested in theoretical studies related to tautomers. Here, we present Tautobase, to date and to the best of our knowledge the first extensive open-source tautomer database of measured and estimated tautomer ratios, mainly in water, containing 1680 unique tautomer pairs.
Subject(s)
Models, Theoretical; Water; Acids; Isomerism; Solvents
ABSTRACT
PURPOSE: To evaluate whether PET/CT could be used to assess the extent of colorectal peritoneal metastases. METHODS: All patients who underwent a PET/CT scan before a CRS-HIPEC procedure between January 1, 2010 and December 31, 2013 were retrospectively included (n = 35). Two nuclear medicine physicians (observer 1 and observer 2) separately reviewed the scans for intraperitoneal abnormalities. A simplified PCI was used to compare the extent of rPCI versus sPCI. RESULTS: Included patients had a median age of 60.6 years. Histology of the primary tumors was adenocarcinoma in 51.5%, mucinous adenocarcinoma in 37.1%, and SRCC in 11.4%. Median sPCI was 9.5 (5.0-11.8) and median rPCI was 5.0 (3.0-7.0) for observer 1 and 4.0 (3.0-6.0) for observer 2 (p = 0.02 and p = 0.01, respectively). When compared to the surgical data, PET/CT showed a poor correlation for assessing the extent of PC for both adenocarcinoma (observer 1 rho -0.17, p = 0.51 and observer 2 rho 0.13, p = 0.61) and mucinous carcinoma or SRCC (observer 1 rho 0.44, p = 0.08 and observer 2 rho 0.38, p = 0.14). CONCLUSION: PET/CT underestimates the extent of PC during surgery in both mucinous and non-mucinous CRC and is not recommended for intraperitoneal tumor scoring.
Subject(s)
Colorectal Neoplasms/pathology; Peritoneal Neoplasms/diagnostic imaging; Peritoneal Neoplasms/secondary; Positron Emission Tomography Computed Tomography/methods; Female; Fluorodeoxyglucose F18; Humans; Male; Middle Aged; Preoperative Period; Radiation Dosage; Radiopharmaceuticals; Retrospective Studies
ABSTRACT
The Platinum dataset of protein-bound ligand conformations was used to benchmark the ability of the MMFF94s force field to generate bioactive conformations by minimization of randomly generated conformers. Torsion angle parameters that generally caused wrong geometries were reparameterized by conducting dihedral scans using ab initio calculations at the MP2 level. This reparameterization resulted in a systematic improvement of generated conformations.
ABSTRACT
Mutations in several genes encoding ion channels can cause the long-QT (LQT) syndrome with cardiac arrhythmias, syncope and sudden death. Recently, mutations in some of these genes were also identified to cause epileptic seizures in these patients, and the sudden unexplained death in epilepsy (SUDEP) was considered to be the pathologic overlap between the two clinical conditions. For LQT-associated KCNQ1 mutations, only a few investigations reported the coincidence of cardiac dysfunction and epileptic seizures. Here, we report the clinical, electrophysiological and genetic characterization of a large pedigree (n = 241 family members) with LQT syndrome caused by a 12-base-pair duplication in exon 8 of the KCNQ1 gene duplicating four amino acids in the carboxyterminal KCNQ1 domain (KCNQ1dup12; p.R360_Q361dupQKQR, NM_000218.2, hg19). Electrophysiological recordings revealed no substantial KCNQ1-like currents. The mutation did not exhibit a dominant negative effect on wild-type KCNQ1 channel function. Most likely, the mutant protein was not functionally expressed and thus not incorporated into a heteromeric channel tetramer. Many LQT family members suffered from syncope or sudden death, often after physical activity. Of 26 family members with LQT, seizures were present in 14 (LQTplus seizure trait). Molecular genetic analyses confirmed a causative role of the novel KCNQ1dup12 mutation for the LQT trait and revealed a strong link also with the LQTplus seizure trait. Genome-wide parametric multipoint linkage analyses identified a second strong genetic modifier locus for the LQTplus seizure trait in the chromosomal region 10p14. The linkage results suggest a two-locus inheritance model for the LQTplus seizure trait in which both the KCNQ1dup12 mutation and the 10p14 risk haplotype are necessary for the occurrence of LQT-associated seizures. The data strongly support emerging concepts that KCNQ1 mutations may increase the risk of epilepsy, but additional genetic modifiers are necessary for the clinical manifestation of epileptic seizures.
ABSTRACT
Molecular complexity is an important characteristic of organic molecules for drug discovery. How to calculate molecular complexity has been discussed in the scientific literature for decades. It was known from early on that the numbers of substructures that can be cut out of a molecular graph are of importance for this task. However, it was never realized that the cut-out substructures show self-similarity to the parent structures. A successive removal of one bond and one atom returns a series of fragments with decreasing size. Such a series shows self-similarity akin to that of fractal objects. Here we used the number of distinct fragments to calculate the fractal dimension of the molecule. The fractal dimension of a molecule is a new matter constant that incorporates all features that are currently known to be important for describing molecular complexity. Furthermore, this is the first work that reveals the fractal nature of organic molecules.
ABSTRACT
Juvenile myoclonic epilepsy (JME) is a common syndrome of genetic generalized epilepsies (GGEs). Linkage and association studies suggest that the gene encoding the bromodomain-containing protein 2 (BRD2) may increase risk of JME. The present methylation and association study followed up a recent report highlighting that the BRD2 promoter CpG island (CpG76) is differentially hypermethylated in lymphoblastoid cells from Caucasian patients with JME compared to patients with other GGE subtypes and unaffected relatives. In contrast, we found a uniform low average percentage of methylation (<4.5%) for 13 CpG76-CpGs in whole blood cells from 782 unrelated European Caucasians, including 116 JME patients, 196 patients with genetic absence epilepsies, and 470 control subjects. We also failed to confirm an allelic association of the BRD2 promoter single nucleotide polymorphism (SNP) rs3918149 with JME (Armitage trend test, P = 0.98), and we did not detect a substantial impact of SNP rs3918149 on CpG76 methylation in either 116 JME patients (methylation quantitative trait loci [meQTL], P = 0.29) or 470 German control subjects (meQTL, P = 0.55). Our results do not support the previous observation that a high DNA methylation level of the BRD2 promoter CpG76 island is a prevalent epigenetic motif associated with JME in Caucasians.
Subject(s)
CpG Islands/genetics; DNA Methylation; Myoclonic Epilepsy, Juvenile/genetics; Promoter Regions, Genetic/genetics; Transcription Factors/genetics; Epilepsy, Absence/epidemiology; Epilepsy, Absence/genetics; Europe; Female; Humans; Leukocytes/chemistry; Male; Myoclonic Epilepsy, Juvenile/blood; Myoclonic Epilepsy, Juvenile/epidemiology; Polymorphism, Single Nucleotide
ABSTRACT
Genetic Generalized Epilepsy (GGE) and benign epilepsy with centro-temporal spikes or Rolandic Epilepsy (RE) are common forms of genetic epilepsies. Rare copy number variants have been recognized as important risk factors in brain disorders. We performed a systematic survey of rare deletions affecting protein-coding genes derived from exome data of patients with common forms of genetic epilepsies. We analysed exomes from 390 European patients (196 GGE and 194 RE) and 572 population controls to identify low-frequency genic deletions. We found that 75 (32 GGE and 43 RE) patients out of 390, i.e. ~19%, carried rare genic deletions. In particular, large deletions (>400 kb) represent a higher burden in both GGE and RE syndromes as compared to controls. The detected low-frequency deletions (1) share genes with brain-expressed exons that are under negative selection, (2) overlap with known autism and epilepsy-associated candidate genes, (3) are enriched for CNV intolerant genes recorded by the Exome Aggregation Consortium (ExAC) and (4) coincide with likely disruptive de novo mutations from the NPdenovo database. Employing several knowledge databases, we discuss the most prominent epilepsy candidate genes and their protein-protein networks for GGE and RE.
Subject(s)
Epilepsy, Rolandic/genetics; Gene Deletion; Genetic Association Studies; Genetic Predisposition to Disease; Autistic Disorder/genetics; Autistic Disorder/metabolism; Chromosome Deletion; Comparative Genomic Hybridization; DNA Copy Number Variations; Epilepsy, Generalized/genetics; Epilepsy, Rolandic/metabolism; Exome; Genetic Association Studies/methods; Humans; Mutation; Protein Interaction Mapping; Protein Interaction Maps; Reproducibility of Results; Workflow
ABSTRACT
Rolandic epilepsy (RE) is the most common focal epilepsy in childhood. To date no hypothesis-free exome-wide mutational screen has been conducted for RE and atypical RE (ARE). Here we report on whole-exome sequencing of 194 unrelated patients with RE/ARE and 567 ethnically matched population controls. We identified an exome-wide significantly enriched burden for deleterious and loss-of-function variants only for the established RE/ARE gene GRIN2A. The statistical significance of the enrichment disappeared after removing ARE patients. For several disease-related gene-sets, an odds ratio >1 was detected for loss-of-function variants.
Subject(s)
Epilepsy, Rolandic/genetics; Loss of Function Mutation; Receptors, N-Methyl-D-Aspartate/genetics; Adolescent; Child; Epilepsy, Rolandic/pathology; Exome; Female; Humans; Male
ABSTRACT
Emerging evidence emphasizes the strong impact of regulatory genomic elements in neurodevelopmental processes and the complex pathways of brain disorders. The present genome-wide quantitative trait loci analyses explore the cis-regulatory effects of single-nucleotide polymorphisms (SNPs) on DNA methylation (meQTL) and gene expression (eQTL) in 110 human hippocampal biopsies. We identify cis-meQTLs at 14,118 CpG methylation sites and cis-eQTLs for 302 3'-mRNA transcripts of 288 genes. Hippocampal cis-meQTL-CpGs are enriched in flanking regions of active promoters, CpG island shores, binding sites of the transcription factor CTCF and brain eQTLs. Cis-acting SNPs of hippocampal meQTLs and eQTLs significantly overlap schizophrenia-associated SNPs. Correlations of CpG methylation and RNA expression are found for 34 genes. Our comprehensive maps of cis-acting hippocampal meQTLs and eQTLs provide a link between disease-associated SNPs and the regulatory genome that will improve the functional interpretation of non-coding genetic variants in the molecular genetic dissection of brain disorders.
Subject(s)
DNA Methylation; Epilepsy, Temporal Lobe/genetics; Gene Expression Profiling; Genome-Wide Association Study/methods; Hippocampus/metabolism; Adolescent; Adult; Aged; Child; Child, Preschool; Female; Humans; Infant; Male; Middle Aged; Polymorphism, Single Nucleotide; Quantitative Trait Loci/genetics; Young Adult
ABSTRACT
In this case study on an essential instrument of modern drug discovery, we summarize our successful efforts in the last four years toward enhancing the Actelion screening compound collection. A key organizational step was the establishment of the Compound Library Committee (CLC) in September 2013. This cross-functional team consisting of computational scientists, medicinal chemists and a biologist was endowed with a significant annual budget for regular new compound purchases. Based on an initial library analysis performed in 2013, the CLC developed a New Library Strategy. The established continuous library turn-over mode, and the screening library size of 300'000 compounds were maintained, while the structural library quality was increased. This was achieved by shifting the selection criteria from 'druglike' to 'leadlike' structures, enriching for non-flat structures, aiming for compound novelty, and increasing the ratio of higher cost 'Premium Compounds'. Novel chemical space was gained by adding natural compounds, macrocycles, designed and focused libraries to the collection, and through mutual exchanges of proprietary compounds with agrochemical companies. A comparative analysis in 2016 provided evidence for the positive impact of these measures. Screening the improved library has provided several highly promising hits, including a macrocyclic compound, that are currently followed up in different Hit-to-Lead and Lead Optimization programs. It is important to state that the goal of the CLC was not to achieve higher HTS hit rates, but to increase the chances of identified hits to serve as the basis of successful early drug discovery programs. The experience gathered so far legitimates the New Library Strategy.
Subject(s)
Drug Discovery; Drug Evaluation, Preclinical; Algorithms; High-Throughput Screening Assays; Small Molecule Libraries
ABSTRACT
Benign Familial Infantile Epilepsy (BFIE) is clinically characterized by clusters of brief partial seizures progressing to secondarily generalized seizures, with onset at the age of 3-7 months and a favorable outcome. PRRT2 mutations are the most common cause of BFIE and are found in about 80% of BFIE families. In this study, we analyzed a large multiplex BFIE family by linkage and whole exome sequencing (WES) analyses. Genome-wide linkage analysis revealed significant evidence for linkage in the chromosomal region 19p12-q13 (LOD score 3.48). Mutation screening of positional candidate genes identified a synonymous SCN1B variant (c.492T>C, p.Tyr164Tyr) as the most likely causative mutation; in silico analysis showed that it affects splicing by removing a splicing silencer sequence. In addition, the PRRT2 frameshift mutation (c.649dupC/p.Arg217Profs*8) was observed, showing incomplete but high segregation with the phenotype. An in vitro splicing assay of SCN1B expression confirmed the in silico findings, showing a splicing imbalance between wild-type and mutant exons. Herein, the involvement of the SCN1B gene in the etiology of BFIE, contributing to the disease phenotype as a modifier or as part of an oligogenic predisposition, is shown for the first time.