Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 3 de 3
Filtrar
Mais filtros










Base de dados
Intervalo de ano de publicação
1.
Toxics ; 11(9)2023 Sep 15.
Artigo em Inglês | MEDLINE | ID: mdl-37755795

RESUMO

In silico (quantitative) structure-activity relationship modeling is an approach that provides a fast and cost-effective alternative to assess the genotoxic potential of chemicals. However, one of the limiting factors for model development is the availability of consolidated experimental datasets. In the present study, we collected experimental data on micronuclei in vitro and in vivo, utilizing databases and conducting a PubMed search, aided by text mining using the BioBERT large language model. Chemotype enrichment analysis on the updated datasets was performed to identify enriched substructures. Additionally, chemotypes common for both endpoints were found. Five machine learning models in combination with molecular descriptors, twelve fingerprints and two data balancing techniques were applied to construct individual models. The best-performing individual models were selected for the ensemble construction. The curated final dataset consists of 981 chemicals for micronuclei in vitro and 1309 for mouse micronuclei in vivo, respectively. Out of 18 chemotypes enriched in micronuclei in vitro, only 7 were found to be relevant for in vivo prediction. The ensemble model exhibited high accuracy and sensitivity when applied to an external test set of in vitro data. A good balanced predictive performance was also achieved for the micronucleus in vivo endpoint.

2.
J Cheminform ; 14(1): 69, 2022 Oct 14.
Artigo em Inglês | MEDLINE | ID: mdl-36242073

RESUMO

Collecting labeled data for many important tasks in chemoinformatics is time consuming and requires expensive experiments. In recent years, machine learning has been used to learn rich representations of molecules using large scale unlabeled molecular datasets and transfer the knowledge to solve the more challenging tasks with limited datasets. Variational autoencoders are one of the tools that have been proposed to perform the transfer for both chemical property prediction and molecular generation tasks. In this work we propose a simple method to improve chemical property prediction performance of machine learning models by incorporating additional information on correlated molecular descriptors in the representations learned by variational autoencoders. We verify the method on three property prediction tasks. We explore the impact of the number of incorporated descriptors, correlation between the descriptors and the target properties, sizes of the datasets etc. Finally, we show the relation between the performance of property prediction models and the distance between property prediction dataset and the larger unlabeled dataset in the representation space.

3.
Sci Data ; 6(1): 96, 2019 06 17.
Artigo em Inglês | MEDLINE | ID: mdl-31209213

RESUMO

Health care is one of the most exciting frontiers in data mining and machine learning. Successful adoption of electronic health records (EHRs) created an explosion in digital clinical data available for analysis, but progress in machine learning for healthcare research has been difficult to measure because of the absence of publicly available benchmark data sets. To address this problem, we propose four clinical prediction benchmarks using data derived from the publicly available Medical Information Mart for Intensive Care (MIMIC-III) database. These tasks cover a range of clinical problems including modeling risk of mortality, forecasting length of stay, detecting physiologic decline, and phenotype classification. We propose strong linear and neural baselines for all four tasks and evaluate the effect of deep supervision, multitask training and data-specific architectural modifications on the performance of neural models.


Assuntos
Benchmarking , Registros Eletrônicos de Saúde , Aprendizado de Máquina , Mineração de Dados , Bases de Dados Factuais , Humanos
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA
...