Search | VHL Regional Portal

The ToxCast pipeline: updates to curve-fitting approaches and database structure.

Feshuk, M; Kolaczkowski, L; Dunham, K; Davidson-Fritz, S E; Carstens, K E; Brown, J; Judson, R S; Paul Friedman, K.

Front Toxicol ; 5: 1275980, 2023.

Article in English | MEDLINE | ID: mdl-37808181

ABSTRACT

Introduction: The US Environmental Protection Agency Toxicity Forecaster (ToxCast) program makes in vitro medium- and high-throughput screening assay data publicly available for prioritization and hazard characterization of thousands of chemicals. The assays employ a variety of technologies to evaluate the effects of chemical exposure on diverse biological targets, from distinct proteins to more complex cellular processes like mitochondrial toxicity, nuclear receptor signaling, immune responses, and developmental toxicity. The ToxCast data pipeline (tcpl) is an open-source R package that stores, manages, curve-fits, and visualizes ToxCast data and populates the linked MySQL Database, invitrodb. Methods: Herein we describe major updates to tcpl and invitrodb to accommodate a new curve-fitting approach. The original tcpl curve-fitting models (constant, Hill, and gain-loss models) have been expanded to include Polynomial 1 (Linear), Polynomial 2 (Quadratic), Power, Exponential 2, Exponential 3, Exponential 4, and Exponential 5 based on BMDExpress and encoded by the R package dependency, tcplfit2. Inclusion of these models impacted invitrodb (beta version v4.0) and tcpl v3 in several ways: (1) long-format storage of generic modeling parameters to permit additional curve-fitting models; (2) updated logic for winning model selection; (3) continuous hit calling logic; and (4) removal of redundant endpoints as a result of bidirectional fitting. Results and discussion: Overall, the hit call and potency estimates were largely consistent between invitrodb v3.5 and 4.0. Tcpl and invitrodb provide a standard for consistent and reproducible curve-fitting and data management for diverse, targeted in vitro assay data with readily available documentation, thus enabling sharing and use of these data in myriad toxicology applications. 
The software and database updates described herein promote comparability across multiple tiers of data within the US Environmental Protection Agency CompTox Blueprint.
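
As an illustration of the Hill model named among the original curve-fitting models in the abstract above, here is a minimal Python sketch. It assumes a common three-parameter form, response = top / (1 + (ac50 / conc)^slope); the function name `hill`, the parameter names, and the example values are hypothetical and are not taken from tcpl or tcplfit2.

```python
def hill(conc, top, ac50, slope):
    """Hill concentration-response curve (assumed form, not the tcplfit2 API).

    top   : maximal response approached at high concentration
    ac50  : concentration giving half of the maximal response
    slope : Hill coefficient, assumed positive here
    """
    if conc <= 0:
        # For a positive slope the response tends to 0 as conc approaches 0.
        return 0.0
    return top / (1.0 + (ac50 / conc) ** slope)


# Hypothetical concentration series (arbitrary units) and parameters.
concs = [0.1, 1.0, 10.0, 100.0]
responses = [hill(c, top=100.0, ac50=10.0, slope=1.0) for c in concs]

# At conc == ac50 the response is half of top by construction.
half_max = hill(10.0, top=100.0, ac50=10.0, slope=1.0)
print(responses)
print(half_max)
```

The sketch only evaluates the curve; fitting it to data and selecting a winning model among several candidates, as the abstract describes, would need an optimizer and a selection criterion that are not shown here.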

Evaluating structure-based activity in a high-throughput assay for steroid biosynthesis.

Foster, M J; Patlewicz, G; Shah, I; Haggard, D E; Judson, R S; Paul Friedman, K.

Comput Toxicol ; 24: 1-23, 2022 Nov 01.

Article in English | MEDLINE | ID: mdl-37841081

ABSTRACT

Data from a high-throughput human adrenocortical carcinoma assay (HT-H295R) for steroid hormone biosynthesis are available for >2000 chemicals in single concentration and 654 chemicals in multi-concentration (mc). Previously, a metric describing the effect size of a chemical on the biosynthesis of 11 hormones was derived using mc data referred to as the maximum mean Mahalanobis distance (maxmMd). However, mc HT-H295R assay data remain unavailable for many chemicals. This work leverages existing HT-H295R assay data by constructing structure-activity relationships to make predictions for data-poor chemicals, including: (1) identification of individual structural descriptors, known as ToxPrint chemotypes, associated with increased odds of affecting estrogen or androgen synthesis; (2) a random forest (RF) classifier using physicochemical property descriptors to predict HT-H295R maxmMd binary (positive or negative) outcomes; and, (3) a local approach to predict maxmMd binary outcomes using nearest neighbors (NNs) based on two types of chemical fingerprints (chemotype or Morgan). Individual chemotypes demonstrated high specificity (85-98%) for modulators of estrogen and androgen synthesis but with low sensitivity. The best RF model for maxmMd classification included 13 predicted physicochemical descriptors, yielding a balanced accuracy (BA) of 71% with only modest improvement when hundreds of structural features were added. The best two NN models for binary maxmMd prediction demonstrated BAs of 85 and 81% using chemotype and Morgan fingerprints, respectively. Using an external test set of 6302 chemicals (lacking HT-H295R data), 1241 were identified as putative estrogen and androgen modulators. Combined results across the three classification models (global RF model and two local NN models) predict that 1033 of the 6302 chemicals would be more likely to affect HT-H295R bioactivity. 
Together, these in silico approaches can efficiently prioritize thousands of untested chemicals for screening to further evaluate their effects on steroid biosynthesis.
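
The abstract above reports sensitivity, specificity, and balanced accuracy (BA) for its classifiers. The following Python sketch shows how BA is conventionally computed as the mean of sensitivity and specificity. The function name and the confusion-matrix counts are hypothetical and do not come from the study.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives recovered
    return (sensitivity + specificity) / 2.0


# Hypothetical counts: 40 active and 100 inactive chemicals.
ba = balanced_accuracy(tp=30, fn=10, tn=90, fp=10)
print(round(ba, 3))
```

Because BA weights both classes equally, it is less flattering than plain accuracy when inactive chemicals greatly outnumber active ones, which is a common situation in screening data.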

An automated curation procedure for addressing chemical errors and inconsistencies in public datasets used in QSAR modelling.

Mansouri, K; Grulke, C M; Richard, A M; Judson, R S; Williams, A J.

SAR QSAR Environ Res ; 27(11): 939-965, 2016 Nov.

Article in English | MEDLINE | ID: mdl-27885862

ABSTRACT

The increasing availability of large collections of chemical structures and associated experimental data provides an opportunity to build robust QSAR models for applications in different fields. One common concern is the quality of both the chemical structure information and associated experimental data. Here we describe the development of an automated KNIME workflow to curate and correct errors in the structure and identity of chemicals using the publicly available PHYSPROP physicochemical properties and environmental fate datasets. The workflow first assembles structure-identity pairs using up to four provided chemical identifiers, including chemical name, CASRNs, SMILES, and MolBlock. Problems detected included errors and mismatches in chemical structure formats, identifiers and various structure validation issues, including hypervalency and stereochemistry descriptions. Subsequently, a machine learning procedure was applied to evaluate the impact of this curation process. The performance of QSAR models built on only the highest-quality subset of the original dataset was compared with the larger curated and corrected dataset. The latter showed statistically improved predictive performance. The final workflow was used to curate the full list of PHYSPROP datasets, and is being made publicly available for further usage and integration by the scientific community.

Subject(s)

Data Curation/methods , Databases, Chemical/standards , Datasets as Topic/standards , Quantitative Structure-Activity Relationship , Machine Learning , Molecular Structure

Development of a consumer product ingredient database for chemical exposure screening and prioritization.

Goldsmith, M-R; Grulke, C M; Brooks, R D; Transue, T R; Tan, Y M; Frame, A; Egeghy, P P; Edwards, R; Chang, D T; Tornero-Velez, R; Isaacs, K; Wang, A; Johnson, J; Holm, K; Reich, M; Mitchell, J; Vallero, D A; Phillips, L; Phillips, M; Wambaugh, J F; Judson, R S; Buckley, T J; Dary, C C.

Food Chem Toxicol ; 65: 269-79, 2014 Mar.

Article in English | MEDLINE | ID: mdl-24374094

ABSTRACT

Consumer products are a primary source of chemical exposures, yet little structured information is available on the chemical ingredients of these products and the concentrations at which ingredients are present. To address this data gap, we created a database of chemicals in consumer products using product Material Safety Data Sheets (MSDSs) publicly provided by a large retailer. The resulting database represents 1797 unique chemicals mapped to 8921 consumer products and a hierarchy of 353 consumer product "use categories" within a total of 15 top-level categories. We examine the utility of this database and discuss ways in which it will support (i) exposure screening and prioritization, (ii) generic or framework formulations for several indoor/consumer product exposure modeling initiatives, (iii) candidate chemical selection for monitoring near field exposure from proximal sources, and (iv) as activity tracers or ubiquitous exposure sources using "chemical space" map analyses. Chemicals present at high concentrations and across multiple consumer products and use categories that hold high exposure potential are identified. Our database is publicly available to serve regulators, retailers, manufacturers, and the public for predictive screening of chemicals in new and existing consumer products on the basis of exposure and risk.

Subject(s)

Consumer Product Safety , Database Management Systems , Environmental Exposure

Haplotypes of the cholesteryl ester transfer protein gene predict lipid-modifying response to statin therapy.

Winkelmann, B R; Hoffmann, M M; Nauck, M; Kumar, A M; Nandabalan, K; Judson, R S; Boehm, B O; Tall, A R; Ruaño, G; März, W.

Pharmacogenomics J ; 3(5): 284-96, 2003.

Article in English | MEDLINE | ID: mdl-14583798

ABSTRACT

Cholesteryl ester transfer protein (CETP) plays a central role in high-density lipoprotein (HDL) metabolism. Single nucleotide polymorphisms (SNPs) and haplotypes in the CETP gene were determined in 98 patients with untreated dyslipidemias and analyzed for associations with plasma CETP and plasma lipids before and during statin treatment. Individual CETP SNPs and haplotypes were both significantly associated with CETP enzyme mass and activity. However, only certain CETP haplotypes, but not individual SNPs, significantly predicted the magnitude of change in HDL cholesterol (HDL-C) and triglycerides. After adjusting for covariates and multiple testing, the TTCAAA haplotype showed a gene-dose effect in predicting the HDL-C increase (P=0.03), while the TTCAAAGGG and AAAGGG haplotypes predicted a decrease in triglycerides (P=0.04 both). This is the first study to demonstrate that SNP haplotypes derived from allelic SNP combinations in the CETP gene were more informative than single SNPs in predicting the response to lipid-modifying therapy with statins.

Subject(s)

Carrier Proteins/genetics , Glycoproteins , Haplotypes/genetics , Hydroxymethylglutaryl-CoA Reductase Inhibitors/therapeutic use , Lipids/genetics , Aged , Cardiovascular Diseases/blood , Cardiovascular Diseases/drug therapy , Cardiovascular Diseases/genetics , Cholesterol Ester Transfer Proteins , Cohort Studies , Female , Genetic Variation/genetics , Humans , Lipids/blood , Male , Middle Aged , Predictive Value of Tests

Haplotype variation and linkage disequilibrium in 313 human genes.

Stephens, J C; Schneider, J A; Tanguay, D A; Choi, J; Acharya, T; Stanley, S E; Jiang, R; Messer, C J; Chew, A; Han, J H; Duan, J; Carr, J L; Lee, M S; Koshy, B; Kumar, A M; Zhang, G; Newell, W R; Windemuth, A; Xu, C; Kalbfleisch, T S; Shaner, S L; Arnold, K; Schulz, V; Drysdale, C M; Nandabalan, K; Judson, R S; Ruano, G; Vovis, G F.

Science ; 293(5529): 489-93, 2001 Jul 20.

Article in English | MEDLINE | ID: mdl-11452081

ABSTRACT

Variation within genes has important implications for all biological traits. We identified 3899 single nucleotide polymorphisms (SNPs) that were present within 313 genes from 82 unrelated individuals of diverse ancestry, and we organized the SNPs into 4304 different haplotypes. Each gene had several variable SNPs and haplotypes that were present in all populations, as well as a number that were population-specific. Pairs of SNPs exhibited variability in the degree of linkage disequilibrium that was a function of their location within a gene, distance from each other, population distribution, and population frequency. Haplotypes generally had more information content (heterozygosity) than did individual SNPs. Our analysis of the pattern of variation strongly supports the recent expansion of the human population.

Subject(s)

Genetic Variation , Haplotypes , Linkage Disequilibrium , Polymorphism, Single Nucleotide , Alleles , Animals , Asian People/genetics , Black People/genetics , Dinucleoside Phosphates/genetics , Evolution, Molecular , Female , Heterozygote , Hispanic or Latino/genetics , Humans , Male , Mutation , Pan troglodytes/genetics , White People/genetics , X Chromosome/genetics
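
The abstract above equates information content with heterozygosity. As a hedged illustration, the Python sketch below uses the standard expected-heterozygosity formula, H = 1 - sum(p_i^2), without any sample-size correction; the abstract does not state which estimator the authors used, and the frequencies shown are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity H = 1 - sum(p_i ** 2) for variant frequencies.

    This is the uncorrected textbook form, assumed here for illustration.
    """
    total = sum(freqs)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


# Hypothetical example: one biallelic SNP versus four equally common haplotypes.
snp_h = expected_heterozygosity([0.5, 0.5])
haplotype_h = expected_heterozygosity([0.25, 0.25, 0.25, 0.25])
print(snp_h, haplotype_h)
```

A biallelic SNP can never exceed H = 0.5 under this formula, whereas a locus with several haplotypes can, which is consistent with the abstract's statement that haplotypes generally carried more information than individual SNPs.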

Complex promoter and coding region beta 2-adrenergic receptor haplotypes alter receptor expression and predict in vivo responsiveness.

Drysdale, C M; McGraw, D W; Stack, C B; Stephens, J C; Judson, R S; Nandabalan, K; Arnold, K; Ruano, G; Liggett, S B.

Proc Natl Acad Sci U S A ; 97(19): 10483-8, 2000 Sep 12.

Article in English | MEDLINE | ID: mdl-10984540

ABSTRACT

The human beta(2)-adrenergic receptor gene has multiple single-nucleotide polymorphisms (SNPs), but the relevance of chromosomally phased SNPs (haplotypes) is not known. The phylogeny and the in vitro and in vivo consequences of variations in the 5' upstream and ORF were delineated in a multiethnic reference population and an asthmatic cohort. Thirteen SNPs were found organized into 12 haplotypes out of the theoretically possible 8,192 combinations. Deep divergence in the distribution of some haplotypes was noted in Caucasian, African-American, Asian, and Hispanic-Latino ethnic groups with >20-fold differences among the frequencies of the four major haplotypes. The relevance of the five most common beta(2)-adrenergic receptor haplotype pairs was determined in vivo by assessing the bronchodilator response to beta agonist in asthmatics. Mean responses by haplotype pair varied by >2-fold, and response was significantly related to the haplotype pair (P = 0.007) but not to individual SNPs. Expression vectors representing two of the haplotypes differing at eight of the SNP loci and associated with divergent in vivo responsiveness to agonist were used to transfect HEK293 cells. beta(2)-adrenergic receptor mRNA levels and receptor density in cells transfected with the haplotype associated with the greater physiologic response were approximately 50% greater than those transfected with the lower response haplotype. The results indicate that the unique interactions of multiple SNPs within a haplotype ultimately can affect biologic and therapeutic phenotype and that individual SNPs may have poor predictive power as pharmacogenetic loci.

Subject(s)

Haplotypes , Promoter Regions, Genetic , Receptors, Adrenergic, beta-2/genetics , Base Sequence , Cell Line, Transformed , DNA/genetics , Genotype , Humans , Phylogeny , Polymorphism, Single Nucleotide
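
The figure of 8,192 theoretically possible combinations in the abstract of this record follows from 13 SNPs if each SNP is assumed to be biallelic (2^13). The short Python sketch below reproduces that arithmetic; the variable names are illustrative only.

```python
# Counts reported in the abstract.
n_snps = 13
observed_haplotypes = 12

# Assumption: each SNP has exactly two alleles, so the number of possible
# allele combinations across all sites is 2 ** n_snps.
alleles_per_snp = 2
possible_haplotypes = alleles_per_snp ** n_snps

# Fraction of the theoretical haplotype space actually observed.
fraction_observed = observed_haplotypes / possible_haplotypes
print(possible_haplotypes)
print(f"{fraction_observed:.4%}")
```

Seeing only 12 of the possible combinations is what makes haplotype-level analysis tractable: the SNPs are strongly correlated rather than independently assorted.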

A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae.

Uetz, P; Giot, L; Cagney, G; Mansfield, T A; Judson, R S; Knight, J R; Lockshon, D; Narayan, V; Srinivasan, M; Pochart, P; Qureshi-Emili, A; Li, Y; Godwin, B; Conover, D; Kalbfleisch, T; Vijayadamodar, G; Yang, M; Johnston, M; Fields, S; Rothberg, J M.

Nature ; 403(6770): 623-7, 2000 Feb 10.

Article in English | MEDLINE | ID: mdl-10688190

ABSTRACT

Two large-scale yeast two-hybrid screens were undertaken to identify protein-protein interactions between full-length open reading frames predicted from the Saccharomyces cerevisiae genome sequence. In one approach, we constructed a protein array of about 6,000 yeast transformants, with each transformant expressing one of the open reading frames as a fusion to an activation domain. This array was screened by a simple and automated procedure for 192 yeast proteins, with positive responses identified by their positions in the array. In a second approach, we pooled cells expressing one of about 6,000 activation domain fusions to generate a library. We used a high-throughput screening procedure to screen nearly all of the 6,000 predicted yeast proteins, expressed as Gal4 DNA-binding domain fusion proteins, against the library, and characterized positives by sequence analysis. These approaches resulted in the detection of 957 putative interactions involving 1,004 S. cerevisiae proteins. These data reveal interactions that place functionally unclassified proteins in a biological context, interactions between proteins involved in the same biological function, and interactions that link biological functions together into larger cellular processes. The results of these screens are shown here.

Subject(s)

Fungal Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Fungal Proteins/genetics , Open Reading Frames , Peptide Library , Protein Binding , Protein Structure, Tertiary , Two-Hybrid System Techniques

State-to-State Rates for the D + H2(v = 1, j = 1) → HD(v', j') + H Reaction: Predictions and Measurements.

Neuhauser, D; Judson, R S; Kouri, D J; Adelman, D E; Shafer, N E; Kliner, D A; Zare, R N.

Science ; 257(5069): 519-22, 1992 Jul 24.

Article in English | MEDLINE | ID: mdl-17778685

ABSTRACT

A fully quantal wavepacket approach to reactive scattering in which the best available H3 potential energy surface was used enabled a comparison with experimentally determined rates for the D + H2(v = 1, j = 1) → HD(v' = 0, 1, 2; j') + H reaction at significantly higher total energies (1.4 to 2.25 electron volts) than previously possible. The theoretical results are obtained over a sufficient range of conditions that a detailed simulation of the experiment was possible, thus making this a definitive comparison of experiment and theory. Good to excellent agreement is found for the vibrational branching ratios and for the rotational distributions within each product vibrational level. However, the calculated rotational distributions are slightly hotter than the experimentally measured ones. This small discrepancy is more marked for products for which a larger fraction of the total energy appears in translation. The most likely explanation for this behavior is that refinements are needed in the potential energy surface.
