Search | BVS España Search Portal

1.

Integrating Concentration-Dependent Toxicity Data and Toxicokinetics To Inform Hepatotoxicity Response Pathways.

Russo, Daniel P; Aleksunes, Lauren M; Goyak, Katy; Qian, Hua; Zhu, Hao.

Environ Sci Technol ; 57(33): 12291-12301, 2023 08 22.

Article in English | MEDLINE | ID: mdl-37566783

ABSTRACT

Failure of animal models to predict hepatotoxicity in humans has created a push to develop biological pathway-based alternatives, such as those that use in vitro assays. Public screening programs (e.g., ToxCast/Tox21 programs) have tested thousands of chemicals using in vitro high-throughput screening (HTS) assays. Developing pathway-based models for simple biological pathways, such as endocrine disruption, has proven successful, but development remains a challenge for complex toxicities like hepatotoxicity, due to the many biological events involved. To this goal, we aimed to develop a computational strategy for developing pathway-based models for complex toxicities. Using a database of 2171 chemicals with human hepatotoxicity classifications, we identified 157 out of 1600+ ToxCast/Tox21 HTS assays to be associated with human hepatotoxicity. Then, a computational framework was used to group these assays by biological target or mechanisms into 52 key event (KE) models of hepatotoxicity. KE model output is a KE score summarizing chemical potency against a hepatotoxicity-relevant biological target or mechanism. Grouping hepatotoxic chemicals based on the chemical structure revealed chemical classes with high KE scores plausibly informing their hepatotoxicity mechanisms. Using KE scores and supervised learning to predict in vivo hepatotoxicity, including toxicokinetic information, improved the predictive performance. This new approach can be a universal computational toxicology strategy for various chemical toxicity evaluations.

Subject(s)

Chemical and Drug Induced Liver Injury , High-Throughput Screening Assays , Animals , Humans , Toxicokinetics , Databases, Factual , Biological Assay

2.

Data-Driven Quantitative Structure-Activity Relationship Modeling for Human Carcinogenicity by Chronic Oral Exposure.

Chung, Elena; Russo, Daniel P; Ciallella, Heather L; Wang, Yu-Tang; Wu, Min; Aleksunes, Lauren M; Zhu, Hao.

Environ Sci Technol ; 57(16): 6573-6588, 2023 04 25.

Article in English | MEDLINE | ID: mdl-37040559

ABSTRACT

Traditional methodologies for assessing chemical toxicity are expensive and time-consuming. Computational modeling approaches have emerged as low-cost alternatives, especially those used to develop quantitative structure-activity relationship (QSAR) models. However, conventional QSAR models have limited training data, leading to low predictivity for new compounds. We developed a data-driven modeling approach for constructing carcinogenicity-related models and used these models to identify potential new human carcinogens. To this goal, we used a probe carcinogen dataset from the US Environmental Protection Agency's Integrated Risk Information System (IRIS) to identify relevant PubChem bioassays. Responses of 25 PubChem assays were significantly relevant to carcinogenicity. Eight assays inferred carcinogenicity predictivity and were selected for QSAR model training. Using 5 machine learning algorithms and 3 types of chemical fingerprints, 15 QSAR models were developed for each PubChem assay dataset. These models showed acceptable predictivity during 5-fold cross-validation (average CCR = 0.71). Using our QSAR models, we can correctly predict and rank 342 IRIS compounds' carcinogenic potentials (PPV = 0.72). The models predicted potential new carcinogens, which were validated by a literature search. This study portends an automated technique that can be applied to prioritize potential toxicants using validated QSAR models based on extensive training sets from public data resources.

Subject(s)

Algorithms , Quantitative Structure-Activity Relationship , Humans , Computer Simulation , Carcinogens/toxicity , Biological Assay

3.

Integrating structure annotation and machine learning approaches to develop graphene toxicity models.

Wang, Tong; Russo, Daniel P; Bitounis, Dimitrios; Demokritou, Philip; Jia, Xuelian; Huang, Heng; Zhu, Hao.

Carbon N Y ; 204: 484-494, 2023 Feb.

Article in English | MEDLINE | ID: mdl-36845527

ABSTRACT

Modern nanotechnology provides efficient and cost-effective nanomaterials (NMs). The increasing usage of NMs raises great concerns regarding nanotoxicity in humans. Traditional animal testing of nanotoxicity is expensive and time-consuming. Modeling studies using machine learning (ML) approaches are promising alternatives to direct evaluation of nanotoxicity based on nanostructure features. However, NMs, including two-dimensional nanomaterials (2DNMs) such as graphenes, have complex structures, making it difficult to annotate and quantify the nanostructures for modeling purposes. To address this issue, we constructed a virtual graphene library using nanostructure annotation techniques. The irregular graphene structures were generated by modifying virtual nanosheets. The nanostructures were digitalized from the annotated graphenes. Based on the annotated nanostructures, geometrical nanodescriptors were computed using a Delaunay tessellation approach for ML modeling. The partial least squares regression (PLSR) models for the graphenes were built and validated using a leave-one-out cross-validation (LOOCV) procedure. The resulting models showed good predictivity in four toxicity-related endpoints, with the coefficient of determination (R²) ranging from 0.558 to 0.822. This study provides a novel nanostructure annotation strategy that can be applied to generate high-quality nanodescriptors for ML model development, which can be widely applied to nanoinformatics studies of graphenes and other NMs.

4.

Predicting Prenatal Developmental Toxicity Based On the Combination of Chemical Structures and Biological Data.

Ciallella, Heather L; Russo, Daniel P; Sharma, Swati; Li, Yafan; Sloter, Eddie; Sweet, Len; Huang, Heng; Zhu, Hao.

Environ Sci Technol ; 56(9): 5984-5998, 2022 05 03.

Article in English | MEDLINE | ID: mdl-35451820

ABSTRACT

For hazard identification, classification, and labeling purposes, animal testing guidelines are required by law to evaluate the developmental toxicity potential of new and existing chemical products. However, guideline developmental toxicity studies are costly, time-consuming, and require many laboratory animals. Computational modeling has emerged as a promising, animal-sparing, and cost-effective method for evaluating the developmental toxicity potential of chemicals, such as endocrine disruptors, without the use of animals. We aimed to develop a predictive and explainable computational model for developmental toxicants. To this end, a comprehensive dataset of 1244 chemicals with developmental toxicity classifications was curated from public repositories and literature sources. Data from 2140 toxicological high-throughput screening assays were extracted from PubChem and the ToxCast program for this dataset and combined with information about 834 chemical fragments to group assays based on their chemical-mechanistic relationships. This effort revealed two assay clusters containing 83 and 76 assays, respectively, with high positive predictive rates for developmental toxicants identified with animal testing guidelines (PPV = 72.4 and 77.3% during cross-validation). These two assay clusters can be used as developmental toxicity models and were applied to predict new chemicals for external validation. This study provides a new strategy for constructing alternative chemical developmental toxicity evaluations that can be replicated for other toxicity modeling studies.

Subject(s)

High-Throughput Screening Assays , Toxicity Tests , Animals , Biological Assay , Female , Hazardous Substances , High-Throughput Screening Assays/methods , Pregnancy , Risk Assessment , Toxicity Tests/methods

5.

Predictive modeling of estrogen receptor agonism, antagonism, and binding activities using machine- and deep-learning approaches.

Ciallella, Heather L; Russo, Daniel P; Aleksunes, Lauren M; Grimm, Fabian A; Zhu, Hao.

Lab Invest ; 101(4): 490-502, 2021 04.

Article in English | MEDLINE | ID: mdl-32778734

ABSTRACT

As defined by the World Health Organization, an endocrine disruptor is an exogenous substance or mixture that alters function(s) of the endocrine system and consequently causes adverse health effects in an intact organism, its progeny, or (sub)populations. Traditional experimental testing regimens to identify toxicants that induce endocrine disruption can be expensive and time-consuming. Computational modeling has emerged as a promising and cost-effective alternative method for screening and prioritizing potentially endocrine-active compounds. The efficient identification of suitable chemical descriptors and machine-learning algorithms, including deep learning, is a considerable challenge for computational toxicology studies. Here, we sought to apply classic machine-learning algorithms and deep-learning approaches to a panel of over 7500 compounds tested against 18 Toxicity Forecaster assays related to nuclear estrogen receptor (ERα and ERβ) activity. Three binary fingerprints (Extended Connectivity FingerPrints, Functional Connectivity FingerPrints, and Molecular ACCess System) were used as chemical descriptors in this study. Each descriptor was combined with four machine-learning and two deep-learning (normal and multitask neural networks) approaches to construct models for all 18 ER assays. The resulting model performance was evaluated using the area under the receiver-operating curve (AUC) values obtained from a fivefold cross-validation procedure. The results showed that individual models have AUC values that range from 0.56 to 0.86. External validation was conducted using two additional sets of compounds (n = 592 and n = 966) with established interactions with nuclear ER demonstrated through experimentation. An agonist, antagonist, or binding score was determined for each compound by averaging its predicted probabilities in relevant assay models as an external validation, yielding AUC values ranging from 0.63 to 0.91. The results suggest that multitask neural networks offer advantages when modeling mechanistically related endpoints. Consensus predictions based on the average values of individual models remain the best modeling strategy for computational toxicity evaluations.
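
The scoring step described in this abstract, averaging a compound's predicted probabilities across the relevant assay models, can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `consensus_scores`, the model labels, the compound labels, and the probability values are all hypothetical.

```python
from statistics import mean


def consensus_scores(predictions):
    """Average each compound's predicted probability over all models.

    `predictions` maps a model label to a dict of {compound: probability}.
    A compound missing from some models is averaged over the models
    that did score it.
    """
    compounds = set()
    for per_model in predictions.values():
        compounds.update(per_model)
    return {
        compound: mean(
            per_model[compound]
            for per_model in predictions.values()
            if compound in per_model
        )
        for compound in sorted(compounds)
    }


# Hypothetical probabilities from two agonist-related assay models.
example = {
    "agonist_model_a": {"compound_1": 0.9, "compound_2": 0.2},
    "agonist_model_b": {"compound_1": 0.7, "compound_2": 0.4},
}
print(consensus_scores(example))
```

Averaging probabilities instead of binary calls keeps the score continuous, so it can be ranked and evaluated with a threshold-free measure such as AUC.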

Subject(s)

Machine Learning , Models, Statistical , Receptors, Estrogen , Algorithms , Animals , Computational Biology , Databases, Chemical , Deep Learning , Endocrine Disruptors/metabolism , Endocrine Disruptors/toxicity , Humans , Mice , Protein Binding , Receptors, Estrogen/antagonists & inhibitors , Receptors, Estrogen/drug effects , Receptors, Estrogen/metabolism

6.

Revealing Adverse Outcome Pathways from Public High-Throughput Screening Data to Evaluate New Toxicants by a Knowledge-Based Deep Neural Network Approach.

Ciallella, Heather L; Russo, Daniel P; Aleksunes, Lauren M; Grimm, Fabian A; Zhu, Hao.

Environ Sci Technol ; 55(15): 10875-10887, 2021 08 03.

Article in English | MEDLINE | ID: mdl-34304572

ABSTRACT

Traditional experimental testing to identify endocrine disruptors that enhance estrogenic signaling relies on expensive and labor-intensive experiments. We sought to design a knowledge-based deep neural network (k-DNN) approach to reveal and organize public high-throughput screening data for compounds with nuclear estrogen receptor α and β (ERα and ERβ) binding potentials. The target activity was rodent uterotrophic bioactivity driven by ERα/ERβ activations. After training, the resultant network successfully inferred critical relationships among ERα/ERβ target bioassays, shown as weights of 6521 edges between 1071 neurons. The resultant network uses an adverse outcome pathway (AOP) framework to mimic the signaling pathway initiated by ERα and identify compounds that mimic endogenous estrogens (i.e., estrogen mimetics). The k-DNN can predict estrogen mimetics by activating neurons representing several events in the ERα/ERβ signaling pathway. Therefore, this virtual pathway model, starting from a compound's chemistry initiating ERα activation and ending with rodent uterotrophic bioactivity, can efficiently and accurately prioritize new estrogen mimetics (AUC = 0.864-0.927). This k-DNN method is a potential universal computational toxicology strategy to utilize public high-throughput screening data to characterize hazards and prioritize potentially toxic compounds.

Subject(s)

Adverse Outcome Pathways , Estrogen Receptor beta , Estrogen Receptor alpha , Estrogens , High-Throughput Screening Assays , Neural Networks, Computer

7.

Viral markers in nasopharyngeal carcinoma: A systematic review and meta-analysis on the detection of p16^INK4a, human papillomavirus (HPV), and Ebstein-Barr virus (EBV).

Tham, Tristan; Machado, Rosalie; Russo, Daniel P; Herman, Saori Wendy; Teegala, Sushma; Costantino, Peter.

Am J Otolaryngol ; 42(1): 102762, 2021.

Article in English | MEDLINE | ID: mdl-33202328

ABSTRACT

PURPOSE: This study aimed to conduct a meta-analysis to investigate the distribution of EBV and HPV stratified according to histological NPC type. MATERIALS & METHODS: We performed a meta-analysis to produce pooled prevalence estimates in a random-effects model. We also performed calculations for attributable fractions of viral combinations in NPC, stratified according to histological type. RESULTS: There was a higher prevalence of HPV DNA in WHO Type I (34.4%) versus WHO Type II/III (18.4%). The attributable fractions of WHO Type I NPC was predominantly double negative EBV(-) HPV(-) NPC (56.4%), and EBV(-) HPV(+) NPC (21.5%), in contrast to the predominant infection in WHO Type II/III which was EBV(+) HPV(-) NPC (87.5%). Co-infection of both EBV and HPV was uncommon, and double-negative infection was more common in WHO Type I NPC. CONCLUSION: A significant proportion of WHO Type I NPC was either double-negative EBV(-)HPV(-) or EBV(-)HPV(+).

Subject(s)

Alphapapillomavirus/isolation & purification , Cyclin-Dependent Kinase Inhibitor p16/isolation & purification , Epstein-Barr Virus Infections/diagnosis , Herpesvirus 4, Human/isolation & purification , Nasopharyngeal Carcinoma/virology , Nasopharyngeal Neoplasms/virology , Papillomavirus Infections/diagnosis , Biomarkers , Epstein-Barr Virus Infections/virology , Humans , Nasopharyngeal Carcinoma/pathology , Nasopharyngeal Neoplasms/pathology , Papillomavirus Infections/virology , Prognosis

8.

Virtual Molecular Projections and Convolutional Neural Networks for the End-to-End Modeling of Nanoparticle Activities and Properties.

Russo, Daniel P; Yan, Xiliang; Shende, Sunil; Huang, Heng; Yan, Bing; Zhu, Hao.

Anal Chem ; 92(20): 13971-13979, 2020 10 20.

Article in English | MEDLINE | ID: mdl-32970421

ABSTRACT

Digitalizing complex nanostructures into data structures suitable for machine learning modeling without losing nanostructure information has been a major challenge. Deep learning frameworks, particularly convolutional neural networks (CNNs), are especially adept at handling multidimensional and complex inputs. In this study, CNNs were applied for the modeling of nanoparticle activities exclusively from nanostructures. The nanostructures were represented by virtual molecular projections, a multidimensional digitalization of nanostructures, and used as input data to train CNNs. To this end, 77 nanoparticles with various activities and/or physicochemical property results were used for modeling. The resulting CNN model predictions show high correlations with the experimental results. An analysis of a trained CNN quantitatively showed that neurons were able to recognize distinct nanostructure features critical to activities and physicochemical properties. This "end-to-end" deep learning approach is well suited to digitalize complex nanostructures for data-driven machine learning modeling and can be broadly applied to rationally design nanoparticles with desired activities.

9.

Exploiting machine learning for end-to-end drug discovery and development.

Ekins, Sean; Puhl, Ana C; Zorn, Kimberley M; Lane, Thomas R; Russo, Daniel P; Klein, Jennifer J; Hickey, Anthony J; Clark, Alex M.

Nat Mater ; 18(5): 435-441, 2019 05.

Article in English | MEDLINE | ID: mdl-31000803

ABSTRACT

A variety of machine learning methods such as naive Bayesian, support vector machines and more recently deep neural networks are demonstrating their utility for drug discovery and development. These leverage the generally bigger datasets created from high-throughput screening data and allow prediction of bioactivities for targets and molecular properties with increased levels of accuracy. We have only just begun to exploit the potential of these techniques but they may already be fundamentally changing the research process for identifying new molecules and/or repurposing old drugs. The integrated application of such machine learning models for end-to-end (E2E) application is broadly relevant and has considerable implications for developing future therapies and their targeting.

Subject(s)

Computational Biology/methods , Machine Learning , Algorithms , Bayes Theorem , Computer Simulation , Drug Design , Drug Development , Drug Discovery , Drug Repositioning , Humans , Nanomedicine , Neural Networks, Computer , Support Vector Machine , Technology, Pharmaceutical/trends

10.

Machine Learning Models for Estrogen Receptor Bioactivity and Endocrine Disruption Prediction.

Zorn, Kimberley M; Foil, Daniel H; Lane, Thomas R; Russo, Daniel P; Hillwalker, Wendy; Feifarek, David J; Jones, Frank; Klaren, William D; Brinkman, Ashley M; Ekins, Sean.

Environ Sci Technol ; 54(19): 12202-12213, 2020 10 06.

Article in English | MEDLINE | ID: mdl-32857505

ABSTRACT

The U.S. Environmental Protection Agency (EPA) periodically releases in vitro data across a variety of targets, including the estrogen receptor (ER). In 2015, the EPA used these data to construct mathematical models of ER agonist and antagonist pathways to prioritize chemicals for endocrine disruption testing. However, mathematical models require in vitro data prior to predicting estrogenic activity, but machine learning methods are capable of prospective prediction from the molecular structure alone. The current study describes the generation and evaluation of Bayesian machine learning models grouped by the EPA's ER agonist pathway model using multiple data types with proprietary software, Assay Central. External predictions with three test sets of in vitro and in vivo reference chemicals with agonist activity classifications were compared to previous mathematical model publications. Training data sets were subjected to additional machine learning algorithms and compared with rank normalized scores of internal five-fold cross-validation statistics. External predictions were found to be comparable or superior to previous studies published by the EPA. When assessing six additional algorithms for the training data sets, Assay Central performed similarly at a reduced computational cost. This study demonstrates that machine learning can prioritize chemicals for future in vitro and in vivo testing of ER agonism.

Subject(s)

Endocrine Disruptors , Receptors, Estrogen , Bayes Theorem , Endocrine Disruptors/toxicity , Machine Learning , Prospective Studies

11.

The effect of race in head and neck cancer: A meta-analysis controlling for socioeconomic status.

Russo, Daniel P; Tham, Tristan; Bardash, Yonatan; Kraus, Dennis.

Am J Otolaryngol ; 41(6): 102624, 2020.

Article in English | MEDLINE | ID: mdl-32663732

ABSTRACT

PURPOSE: To investigate the association between race and ethnicity and prognosis in head and neck cancers (HNC), while controlling for socioeconomic status (SES). MATERIALS AND METHODS: Medline, Scopus, EMBASE, and the Cochrane Library were used to identify studies for inclusion, from database inception till March 5th 2019. Studies that analyzed the role of race and ethnicity in overall survival (OS) for malignancies of the head and neck were included in this study. For inclusion, the study needed to report a multivariate analysis controlling for some proxy of SES (for example household income or employment status). Pooled estimates were generated using a random effects model. Subgroup analysis by tumor sub-site, meta-regression, and sensitivity analyses were also performed. RevMan 5.3, Meta Essentials, and OpenMeta[Analyst] were used for statistical analysis. RESULTS: Ten studies from 2004 to 2019 with a total of 108,990 patients were included for analysis in this study. After controlling for SES, tumor stage, and treatment variables, blacks were found to have a poorer survival compared to whites (HR = 1.27, 95%CI: 1.18-1.36, p < 0.00001). Subgroup analysis by sub-site and sensitivity analysis agreed with the primary result. No differences in survival across sub-sites were observed. Meta-regression did not identify any factors associated with the pooled estimate. CONCLUSIONS: In HNC, blacks have poorer OS compared to whites even after controlling for socioeconomic factors.
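
The pooling step named in this abstract (a random-effects model applied to reported hazard ratios) can be illustrated with a short sketch. The abstract does not say which estimator was used, so the DerSimonian-Laird method below is an assumption, as is recovering each study's standard error from its 95% confidence interval. The function name and the input values are hypothetical and are not data from the included studies.

```python
import math

Z_95 = 1.959963984540054  # normal quantile for a two-sided 95% interval


def pool_hazard_ratios(studies):
    """Pool (hr, ci_low, ci_high) tuples with DerSimonian-Laird random effects."""
    # Work on the log scale, where the confidence interval is symmetric.
    effects = [math.log(hr) for hr, _, _ in studies]
    variances = [
        ((math.log(hi) - math.log(lo)) / (2 * Z_95)) ** 2
        for _, lo, hi in studies
    ]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))

    # Between-study variance tau^2, truncated at zero.
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (len(studies) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return {
        "hr": math.exp(pooled),
        "ci_low": math.exp(pooled - Z_95 * se),
        "ci_high": math.exp(pooled + Z_95 * se),
        "tau2": tau2,
    }


# Hypothetical study-level hazard ratios with 95% confidence intervals.
print(pool_hazard_ratios([(1.10, 0.95, 1.27), (1.45, 1.20, 1.75), (1.30, 1.05, 1.61)]))
```

When the studies agree closely, tau² is truncated to zero and the result coincides with the fixed-effect estimate. Larger heterogeneity widens the pooled interval.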

Subject(s)

Head and Neck Neoplasms/ethnology , Head and Neck Neoplasms/mortality , Racial Groups , Social Class , Humans , Prognosis , Survival Rate

12.

Multiple Machine Learning Comparisons of HIV Cell-based and Reverse Transcriptase Data Sets.

Zorn, Kimberley M; Lane, Thomas R; Russo, Daniel P; Clark, Alex M; Makarov, Vadim; Ekins, Sean.

Mol Pharm ; 16(4): 1620-1632, 2019 04 01.

Article in English | MEDLINE | ID: mdl-30779585

ABSTRACT

The human immunodeficiency virus (HIV) causes over a million deaths every year and has a huge economic impact in many countries. The first class of drugs approved were nucleoside reverse transcriptase inhibitors. A newer generation of reverse transcriptase inhibitors have become susceptible to drug resistant strains of HIV, and hence, alternatives are urgently needed. We have recently pioneered the use of Bayesian machine learning to generate models with public data to identify new compounds for testing against different disease targets. The current study has used the NIAID ChemDB HIV, Opportunistic Infection and Tuberculosis Therapeutics Database for machine learning studies. We curated and cleaned data from HIV-1 wild-type cell-based and reverse transcriptase (RT) DNA polymerase inhibition assays. Compounds from this database with ≤1 µM HIV-1 RT DNA polymerase activity inhibition and cell-based HIV-1 inhibition are correlated (Pearson r = 0.44, n = 1137, p < 0.0001). Models were trained using multiple machine learning approaches (Bernoulli Naive Bayes, AdaBoost Decision Tree, Random Forest, support vector classification, k-Nearest Neighbors, and deep neural networks as well as consensus approaches) and then their predictive abilities were compared. Our comparison of different machine learning methods demonstrated that support vector classification, deep learning, and a consensus were generally comparable and not significantly different from each other using 5-fold cross validation and using 24 training and test set combinations. This study demonstrates findings in line with our previous studies for various targets that training and testing with multiple data sets does not demonstrate a significant difference between support vector machine and deep neural networks.

Subject(s)

Anti-HIV Agents/pharmacology , HIV Infections/drug therapy , HIV Reverse Transcriptase/antagonists & inhibitors , HIV/drug effects , Machine Learning , Reverse Transcriptase Inhibitors/pharmacology , Bayes Theorem , Databases, Factual , Decision Trees , Drug Discovery , HIV Infections/virology , Humans , Neural Networks, Computer , Support Vector Machine

13.

CIIPro: a new read-across portal to fill data gaps using public large-scale chemical and biological data.

Russo, Daniel P; Kim, Marlene T; Wang, Wenyi; Pinolini, Daniel; Shende, Sunil; Strickland, Judy; Hartung, Thomas; Zhu, Hao.

Bioinformatics ; 33(3): 464-466, 2017 02 01.

Article in English | MEDLINE | ID: mdl-28172359

ABSTRACT

Summary: We have developed a public Chemical In vitro-In vivo Profiling (CIIPro) portal, which can automatically extract in vitro biological data from public resources (i.e. PubChem) for user-supplied compounds. For compounds with in vivo target activity data (e.g. animal toxicity testing results), the integrated cheminformatics algorithm will optimize the extracted biological data using in vitro-in vivo correlations. The resulting in vitro biological data for target compounds can be used for read-across risk assessment of target compounds. Additionally, the CIIPro portal can identify the most similar compounds based on their optimized bioprofiles. The CIIPro portal provides new powerful assessment capabilities to the scientific community and can be easily integrated with other cheminformatics tools. Availability and Implementation: ciipro.rutgers.edu. Contact: danrusso@scarletmail.rutgers.edu or hao.zhu99@rutgers.edu

Subject(s)

Computational Biology/methods , Software , Toxicology/methods , Animals , Biosimilar Pharmaceuticals , Risk Assessment/methods

14.

Comparing Multiple Machine Learning Algorithms and Metrics for Estrogen Receptor Binding Prediction.

Russo, Daniel P; Zorn, Kimberley M; Clark, Alex M; Zhu, Hao; Ekins, Sean.

Mol Pharm ; 15(10): 4361-4370, 2018 10 01.

Article in English | MEDLINE | ID: mdl-30114914

ABSTRACT

Many chemicals that disrupt endocrine function have been linked to a variety of adverse biological outcomes. However, screening for endocrine disruption using in vitro or in vivo approaches is costly and time-consuming. Computational methods, e.g., quantitative structure-activity relationship models, have become more reliable due to bigger training sets, increased computing power, and advanced machine learning algorithms, such as multilayered artificial neural networks. Machine learning models can be used to predict compounds for endocrine disrupting capabilities, such as binding to the estrogen receptor (ER), and allow for prioritization and further testing. In this work, an exhaustive comparison of multiple machine learning algorithms, chemical spaces, and evaluation metrics for ER binding was performed on public data sets curated using in-house cheminformatics software (Assay Central). Chemical features utilized in modeling consisted of binary fingerprints (ECFP6, FCFP6, ToxPrint, or MACCS keys) and continuous molecular descriptors from RDKit. Each feature set was subjected to classic machine learning algorithms (Bernoulli Naive Bayes, AdaBoost Decision Tree, Random Forest, Support Vector Machine) and Deep Neural Networks (DNN). Models were evaluated using a variety of metrics: recall, precision, F1-score, accuracy, area under the receiver operating characteristic curve, Cohen's Kappa, and Matthews correlation coefficient. For predicting compounds within the training set, DNN has an accuracy higher than that of other methods; however, in 5-fold cross validation and external test set predictions, DNN and most classic machine learning models perform similarly regardless of the data set or molecular descriptors used. We have also used the rank normalized scores as a performance-criteria for each machine learning method, and Random Forest performed best on the validation set when ranked by metric or by data sets. These results suggest classic machine learning algorithms may be sufficient to develop high quality predictive models of ER activity.
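
Most of the evaluation metrics listed in this abstract follow directly from a binary confusion matrix. The sketch below uses the standard textbook definitions. The function name and the example counts are hypothetical and are not results from the study. The area under the ROC curve is omitted because it needs ranked scores, not counts.

```python
import math


def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = _ratio(tp + tn, total)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)  # also called sensitivity
    specificity = _ratio(tn, tn + fp)
    f1 = _ratio(2 * tp, 2 * tp + fp + fn)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = _ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), total ** 2)
    kappa = _ratio(accuracy - expected, 1 - expected)

    # Matthews correlation coefficient.
    mcc = _ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical counts: 40 true positives, 10 false positives,
# 45 true negatives, 5 false negatives.
for name, value in classification_metrics(tp=40, fp=10, tn=45, fn=5).items():
    print(f"{name}: {value:.3f}")
```

Reporting several metrics together matters on imbalanced data sets, where accuracy alone can stay high while precision, kappa, and MCC reveal weak performance on the minority class.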

Subject(s)

Machine Learning , Receptors, Estrogen/metabolism , Algorithms , Animals , Bayes Theorem , Humans , Protein Binding , Software , Support Vector Machine

15.

Comparing and Validating Machine Learning Models for Mycobacterium tuberculosis Drug Discovery.

Lane, Thomas; Russo, Daniel P; Zorn, Kimberley M; Clark, Alex M; Korotcov, Alexandru; Tkachenko, Valery; Reynolds, Robert C; Perryman, Alexander L; Freundlich, Joel S; Ekins, Sean.

Mol Pharm ; 15(10): 4346-4360, 2018 10 01.

Article in English | MEDLINE | ID: mdl-29672063

ABSTRACT

Tuberculosis is a global health dilemma. In 2016, the WHO reported 10.4 million incidences and 1.7 million deaths. The need to develop new treatments for those infected with Mycobacterium tuberculosis (Mtb) has led to many large-scale phenotypic screens and many thousands of new active compounds identified in vitro. However, with limited funding, efforts to discover new active molecules against Mtb need to be more efficient. Several computational machine learning approaches have been shown to have good enrichment and hit rates. We have curated small molecule Mtb data and developed new models with a total of 18,886 molecules with activity cutoffs of 10 µM, 1 µM, and 100 nM. These data sets were used to evaluate different machine learning methods (including deep learning) and metrics and to generate predictions for additional molecules published in 2017. One Mtb model, a combined in vitro and in vivo data Bayesian model at a 100 nM activity cutoff, yielded the following metrics for 5-fold cross validation: accuracy = 0.88, precision = 0.22, recall = 0.91, specificity = 0.88, kappa = 0.31, and MCC = 0.41. We have also curated an evaluation set (n = 153 compounds) published in 2017, and when used to test our model, it showed comparable statistics (accuracy = 0.83, precision = 0.27, recall = 1.00, specificity = 0.81, kappa = 0.36, and MCC = 0.47). We have also compared these models with additional machine learning algorithms showing Bayesian machine learning models constructed with literature Mtb data generated by different laboratories generally were equivalent to or outperformed deep neural networks with external test sets. Finally, we have also compared our training and test sets to show they were suitably diverse and different in order to represent useful evaluation sets. Such Mtb machine learning models could help prioritize compounds for testing in vitro and in vivo.

Subject(s)

Antitubercular Agents/pharmacology , Mycobacterium tuberculosis/drug effects , Bayes Theorem , Drug Discovery , Machine Learning , Support Vector Machine

16.

Comparison of Deep Learning With Multiple Machine Learning Methods and Metrics Using Diverse Drug Discovery Data Sets.

Korotcov, Alexandru; Tkachenko, Valery; Russo, Daniel P; Ekins, Sean.

Mol Pharm ; 14(12): 4462-4475, 2017 12 04.

Article in English | MEDLINE | ID: mdl-29096442

ABSTRACT

Machine learning methods have been applied to many data sets in pharmaceutical research for several decades. The relative ease and availability of fingerprint type molecular descriptors paired with Bayesian methods resulted in the widespread use of this approach for a diverse array of end points relevant to drug discovery. Deep learning is the latest machine learning algorithm attracting attention for many pharmaceutical applications from docking to virtual screening. Deep learning is based on an artificial neural network with multiple hidden layers and has found considerable traction for many artificial intelligence applications. We have previously suggested the need for a comparison of different machine learning methods with deep learning across an array of varying data sets that is applicable to pharmaceutical research. End points relevant to pharmaceutical research include absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) properties, as well as activity against pathogens and drug discovery data sets. In this study, we have used data sets for solubility, probe-likeness, hERG, KCNQ1, bubonic plague, Chagas, tuberculosis, and malaria to compare different machine learning methods using FCFP6 fingerprints. These data sets represent whole cell screens, individual proteins, physicochemical properties as well as a data set with a complex end point. Our aim was to assess whether deep learning offered any improvement in testing when assessed using an array of metrics including AUC, F1 score, Cohen's kappa, Matthews correlation coefficient and others. Based on ranked normalized scores for the metrics or data sets, Deep Neural Networks (DNN) ranked higher than SVM, which in turn was ranked higher than all the other machine learning methods. Visualizing these properties for training and test sets using radar type plots indicates when models are inferior or perhaps over trained. These results also suggest the need for assessing deep learning further using multiple metrics with much larger scale comparisons, prospective testing as well as assessment of different fingerprints and DNN architectures beyond those used.

Subject(s)

Drug Discovery/methods , Machine Learning , Neural Networks, Computer , Bayes Theorem , Datasets as Topic
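
Three of the metrics named in the abstract above (F1 score, Cohen's kappa, and the Matthews correlation coefficient) can all be derived from a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions; the function name `binary_metrics` and the example counts are illustrative assumptions and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return F1, Cohen's kappa, and MCC for a 2x2 confusion matrix.

    tp, fp, tn, fn are counts of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs).
    """
    n = tp + fp + tn + fn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    # F1: harmonic mean of precision and recall.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = ((p_observed - p_chance) / (1 - p_chance)
             if p_chance != 1 else 0.0)

    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"f1": f1, "kappa": kappa, "mcc": mcc}


if __name__ == "__main__":
    # Illustrative counts only, not data from the paper.
    print(binary_metrics(tp=40, fp=10, tn=40, fn=10))
```

Reporting several such metrics side by side, as the abstract describes, helps expose models that look strong on one measure but weak on another.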

17.

Correction to "Comparing and Validating Machine Learning Models for Mycobacterium tuberculosis Drug Discovery".

Lane, Thomas; Russo, Daniel P; Zorn, Kimberley M; Clark, Alex M; Korotcov, Alexandru; Tkachenko, Valery; Reynolds, Robert C; Perryman, Alexander L; Freundlich, Joel S; Ekins, Sean.

Mol Pharm ; 18(7): 2833, 2021 Jul 05.

Article in English | MEDLINE | ID: mdl-34137624

18.

High-Throughput Screening Assay Profiling for Large Chemical Databases.

Russo, Daniel P; Zhu, Hao.

Methods Mol Biol ; 2474: 125-132, 2022.

Article in English | MEDLINE | ID: mdl-35294761

ABSTRACT

High-throughput screening (HTS) techniques are increasingly being adopted by a variety of fields of toxicology. Notably, large-scale research efforts from government, industrial, and academic laboratories are screening millions of chemicals against a variety of biomolecular targets, producing an enormous amount of publicly available HTS assay data. These HTS assay data provide toxicologists with important information on how chemicals interact with different biomolecular targets and provide illustrations of potential toxicity mechanisms. Open public data repositories, such as the National Institutes of Health's PubChem ( http://pubchem.ncbi.nlm.nih.gov ), were established to accept, store, and share HTS data. Through the PubChem website, users can rapidly obtain the PubChem assay results for compounds by using different chemical identifiers (including SMILES, InChIKey, IUPAC names, etc.). However, obtaining these data in a user-friendly format suitable for modeling and other informatics analysis (e.g., gathering PubChem data for hundreds or thousands of chemicals in a modeling-friendly format) directly through the PubChem web portal is not feasible. This chapter aims to introduce two approaches to obtain the HTS assay results for large datasets of compounds from the PubChem portal. First, programmatic access via PubChem's PUG-REST web service using the Python programming language will be described. Second, users who lack programming skills can directly obtain PubChem data for a large set of compounds by using the freely available Chemical In vitro-In vivo Profiling (CIIPro) portal ( http://www.ciipro.rutgers.edu ).

Subject(s)

Databases, Chemical , High-Throughput Screening Assays , Programming Languages
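
The programmatic route mentioned in the abstract above can be sketched roughly as follows. This is not the chapter's own code: it is a minimal illustration that assumes the PUG-REST `assaysummary` operation for compounds identified by PubChem CID, and the helper names (`assay_summary_url`, `chunked`, `fetch_assay_summary`) and the batch size are hypothetical choices made for this example.

```python
import json
import urllib.request

# Base address of PubChem's PUG-REST service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def assay_summary_url(cids, output="JSON"):
    """Build a PUG-REST URL requesting the assay summary for a list of CIDs."""
    cid_list = ",".join(str(int(c)) for c in cids)
    return f"{PUG_REST_BASE}/compound/cid/{cid_list}/assaysummary/{output}"


def chunked(items, size):
    """Yield successive batches so that each request stays reasonably small."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_assay_summary(cids, timeout=30):
    """Download and parse the assay summary for one batch (needs network)."""
    with urllib.request.urlopen(assay_summary_url(cids), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    compound_ids = [2244, 3672, 1983]  # example CIDs chosen for illustration
    # Batch size of 2 is an arbitrary assumption for demonstration.
    for batch in chunked(compound_ids, 2):
        print(assay_summary_url(batch))
```

Only the URL construction is exercised when the script runs; calling `fetch_assay_summary` performs the actual request, and for large compound lists it would be sensible to pause between batches to respect PubChem's usage limits.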

19.

Automatic Quantitative Structure-Activity Relationship Modeling to Fill Data Gaps in High-Throughput Screening.

Ciallella, Heather L; Chung, Elena; Russo, Daniel P; Zhu, Hao.

Methods Mol Biol ; 2474: 169-187, 2022.

Article in English | MEDLINE | ID: mdl-35294765

ABSTRACT

Advances in high-throughput screening (HTS) revolutionized the environmental and health sciences data landscape. However, new compounds still need to be experimentally synthesized and tested to obtain HTS data, which remains costly and time-consuming when a large set of new compounds needs to be studied against many tests. Quantitative structure-activity relationship (QSAR) modeling is a standard method to fill data gaps for new compounds. The major challenge for many toxicologists, especially those with limited computational backgrounds, is efficiently developing optimized QSAR models for each assay with missing data for certain test compounds. This chapter aims to introduce a freely available and user-friendly QSAR modeling workflow, which trains and optimizes models using five algorithms without the need for a programming background.

Subject(s)

High-Throughput Screening Assays , Quantitative Structure-Activity Relationship , Algorithms , Biological Assay

20.

Mechanism-driven modeling of chemical hepatotoxicity using structural alerts and an in vitro screening assay.

Jia, Xuelian; Wen, Xia; Russo, Daniel P; Aleksunes, Lauren M; Zhu, Hao.

J Hazard Mater ; 436: 129193, 2022 08 15.

Article in English | MEDLINE | ID: mdl-35739723

ABSTRACT

Traditional experimental approaches to evaluate hepatotoxicity are expensive and time-consuming. As an advanced framework of risk assessment, adverse outcome pathways (AOPs) describe the sequence of molecular and cellular events underlying chemical toxicities. We aimed to develop an AOP that can be used to predict hepatotoxicity by leveraging computational modeling and in vitro assays. We curated 869 compounds with known hepatotoxicity classifications as a modeling set and extracted assay data from PubChem. The antioxidant response element (ARE) assay, which quantifies transcriptional responses to oxidative stress, showed a high correlation to hepatotoxicity (PPV=0.82). Next, we developed quantitative structure-activity relationship (QSAR) models to predict ARE activation for compounds lacking testing results. Potential toxicity alerts were identified and used to construct a mechanistic hepatotoxicity model. For experimental validation, 16 compounds in the modeling set and 12 new compounds were selected and tested using an in-house ARE-luciferase assay in HepG2-C8 cells. The mechanistic model showed good hepatotoxicity predictivity (accuracy = 0.82) for these compounds. Potential false positive hepatotoxicity predictions by only using ARE results can be corrected by incorporating structural alerts and vice versa. This mechanistic model illustrates a potential toxicity pathway for hepatotoxicity, and this strategy can be expanded to develop predictive models for other complex toxicities.

Subject(s)

Adverse Outcome Pathways , Chemical and Drug Induced Liver Injury , Biological Assay , Computer Simulation , Hep G2 Cells , Humans , Quantitative Structure-Activity Relationship
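
The abstract above reports a positive predictive value (PPV) for the ARE assay and an accuracy for the mechanistic model. As a reminder of how these two quantities are defined, here is a minimal sketch; the function names and the counts in the example are hypothetical and do not reproduce the study's data.

```python
def ppv(tp, fp):
    """Positive predictive value: share of assay-active compounds
    that are truly hepatotoxic, TP / (TP + FP)."""
    if tp + fp == 0:
        raise ValueError("no positive predictions")
    return tp / (tp + fp)


def accuracy(tp, fp, tn, fn):
    """Share of all compounds classified correctly, (TP + TN) / N."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("no compounds")
    return (tp + tn) / total


if __name__ == "__main__":
    # Illustrative counts only: 9 true positives, 3 false positives,
    # 6 true negatives, 2 false negatives.
    print(ppv(9, 3))
    print(accuracy(9, 3, 6, 2))
```

A high PPV means that an active assay result is a reliable flag for hepatotoxicity, but it says nothing about toxic compounds the assay misses, which is one reason a second line of evidence such as structural alerts can be useful.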
