Search | BVS Aleitamento Materno

ANN multiscale model of anti-HIV drugs activity vs AIDS prevalence in the US at county level based on information indices of molecular graphs and social networks.

González-Díaz, Humberto; Herrera-Ibatá, Diana María; Duardo-Sánchez, Aliuska; Munteanu, Cristian R; Orbegozo-Medina, Ricardo Alfredo; Pazos, Alejandro.

J Chem Inf Model ; 54(3): 744-55, 2014 Mar 24.

Article in English | MEDLINE | ID: mdl-24521170

ABSTRACT

This work describes the workflow for a methodology that combines chemoinformatics and pharmacoepidemiology methods and reports the first predictive model developed with this methodology. The new model is able to predict complex networks of AIDS prevalence in US counties, taking into consideration the social determinants and the activity/structure of anti-HIV drugs in preclinical assays. We trained different Artificial Neural Networks (ANNs) using as input information indices of social networks and molecular graphs. We used a Shannon information index based on the Gini coefficient to quantify the effect of income inequality in the social network. We obtained the data on AIDS prevalence and the Gini coefficient from the AIDSVu database of Emory University. We also used the Balaban information indices to quantify changes in the chemical structure of anti-HIV drugs. We obtained the data on anti-HIV drug activity and structure (SMILES codes) from the ChEMBL database. Last, we used Box-Jenkins moving average operators to quantify information about the deviations of drugs with respect to reference data subsets (targets, organisms, experimental parameters, protocols). The best model found was a Linear Neural Network (LNN) with values of Accuracy, Specificity, and Sensitivity above 0.76 and AUROC > 0.80 in training and external validation series. This model generates a complex network of AIDS prevalence in the US at county level with respect to the activity of anti-HIV drugs in preclinical assays. To train/validate the model and predict the complex network we needed to analyze 43,249 data points including values of AIDS prevalence in 2,310 counties in the US vs ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4,856 protocols, and 10 possible experimental measures.
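The abstract does not give the formula for the Box-Jenkins moving average operators. As an illustration only, the sketch below assumes one common reading: the operator is the deviation of a descriptor value from the mean of that descriptor over the reference subset the record belongs to (for example, all assays against the same target). The function name, the record layout, and the field names `index` and `target` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def moving_average_deviations(records, value_key, group_key):
    """Return, for each record, value - mean(value over the record's subset).

    Assumption: this deviation-from-subset-mean form stands in for the
    moving average operator; the paper's exact definition is not stated
    in the abstract.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        sums[rec[group_key]] += rec[value_key]
        counts[rec[group_key]] += 1
    # Mean of the descriptor inside each reference subset.
    means = {group: sums[group] / counts[group] for group in sums}
    return [rec[value_key] - means[rec[group_key]] for rec in records]


# Hypothetical toy records: one information index per assay, grouped by target.
assays = [
    {"index": 1.0, "target": "A"},
    {"index": 3.0, "target": "A"},
    {"index": 5.0, "target": "B"},
]
deviations = moving_average_deviations(assays, "index", "target")
```

The same helper could be called once per grouping variable (target, organism, experimental parameter, protocol) to obtain one deviation term for each reference subset.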

Subjects

Acquired Immunodeficiency Syndrome/drug therapy; Acquired Immunodeficiency Syndrome/epidemiology; Anti-HIV Agents/therapeutic use; Algorithms; Animals; Anti-HIV Agents/chemistry; Databases, Factual; Drug Evaluation, Preclinical; HIV/drug effects; HIV/isolation & purification; Humans; Models, Statistical; Neural Networks, Computer; Prevalence; Social Support; United States/epidemiology

Mapping chemical structure-activity information of HAART-drug cocktails over complex networks of AIDS epidemiology and socioeconomic data of U.S. counties.

Herrera-Ibatá, Diana María; Pazos, Alejandro; Orbegozo-Medina, Ricardo Alfredo; Romero-Durán, Francisco Javier; González-Díaz, Humberto.

Biosystems ; 132-133: 20-34, 2015 Jun.

Article in English | MEDLINE | ID: mdl-25916548

ABSTRACT

Using computational algorithms to design tailored drug cocktails for highly active antiretroviral therapy (HAART) on specific populations is a goal of major importance for both the pharmaceutical industry and public health policy institutions. New combinations of compounds need to be predicted in order to design HAART cocktails. On the one hand, there are the biomolecular factors related to the drugs in the cocktail (experimental measure, chemical structure, drug target, assay organisms, etc.); on the other hand, there are the socioeconomic factors of the specific population (income inequalities, employment levels, fiscal pressure, education, migration, population structure, etc.) to study the relationship between the socioeconomic status and the disease. In this context, machine learning algorithms, able to seek models for problems with multi-source data, have to be used. In this work, the first artificial neural network (ANN) model is proposed for the prediction of HAART cocktails, to halt AIDS on epidemic networks of U.S. counties using information indices that codify both biomolecular and several socioeconomic factors. The data were obtained from at least three major sources. The first dataset included assays of anti-HIV chemical compounds released to ChEMBL. The second dataset is the AIDSVu database of Emory University. AIDSVu compiled AIDS prevalence for >2300 U.S. counties. The third dataset included socioeconomic data from the U.S. Census Bureau. Three scales or levels were employed to group the counties according to the location or population structure codes: state, rural urban continuum code (RUCC) and urban influence code (UIC). An analysis of >130,000 pairs (network links) was performed, corresponding to AIDS prevalence in 2310 counties in the U.S. vs. drug cocktails made up of combinations of ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4856 protocols, and 10 possible experimental measures.
The best model found with the original data was a linear neural network (LNN) with AUROC > 0.80 and accuracy, specificity, and sensitivity ≈ 77% in training and external validation series. The change of the spatial and population structure scale (State, UIC, or RUCC codes) does not affect the quality of the model. Imbalance was detected in all the models found when comparing positive/negative cases and linear/non-linear model accuracy ratios. Using the synthetic minority over-sampling technique (SMOTE), data pre-processing, and machine-learning algorithms implemented in the WEKA software, more balanced models were found. In particular, a multilayer perceptron (MLP) with AUROC = 97.4% and precision, recall, and F-measure > 90% was found.
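The core idea of SMOTE mentioned above is to create synthetic minority-class cases by interpolating between a minority sample and one of its nearest minority neighbours. The following is a minimal pure-Python sketch of that interpolation step, not the WEKA implementation used by the authors. The function name `smote_samples`, its parameters, and the toy data are assumptions made for illustration.

```python
import random


def smote_samples(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, neighbour)))
    return synthetic


# Hypothetical toy minority class: the corners of the unit square.
minority_cases = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_cases = smote_samples(minority_cases, n_new=6)
```

Because every synthetic point is a convex combination of two existing minority samples, the new cases stay inside the region already spanned by the minority class, which is why the technique rebalances the classes without simply duplicating records.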

Subjects

Acquired Immunodeficiency Syndrome/drug therapy; Acquired Immunodeficiency Syndrome/epidemiology; Anti-HIV Agents/chemistry; Anti-HIV Agents/therapeutic use; Antiretroviral Therapy, Highly Active/statistics & numerical data; Models, Statistical; Acquired Immunodeficiency Syndrome/economics; Algorithms; Antiretroviral Therapy, Highly Active/economics; Computer Simulation; Data Mining/methods; Databases, Factual; Educational Status; Employment; Humans; Machine Learning; Prevalence; Social Media/statistics & numerical data; Socioeconomic Factors; Structure-Activity Relationship; Treatment Outcome; United States/epidemiology
