Results 1 - 17 of 17
1.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35325021

ABSTRACT

Prediction of antimicrobial resistance based on whole-genome sequencing data has attracted growing attention due to its rapidity and convenience. Numerous machine learning-based studies have used genetic variants to predict drug resistance in Mycobacterium tuberculosis (MTB) on the assumption that variants are homogeneous. Most of these studies, however, have ignored the essential correlation between variants and their corresponding genes when encoding variants, and have used only a limited number of variants as prediction input. In this study, taking advantage of genome-wide variants for drug-resistance prediction and inspired by natural language processing, we frame drug resistance prediction as document classification, in which variants are treated as words, mutated genes in an isolate as sentences, and an isolate as a document. We propose a novel hierarchical attentive neural network model (HANN) that helps discover drug resistance-related genes and variants and yields more interpretable biological results. It captures the interactions among variants in a mutated gene as well as among mutated genes in an isolate. Our results show that for the four first-line drugs isoniazid (INH), rifampicin (RIF), ethambutol (EMB) and pyrazinamide (PZA), the HANN achieves optimal areas under the ROC curve of 97.90, 99.05, 96.44 and 95.14% and optimal sensitivities of 94.63, 96.31, 92.56 and 87.05%, respectively. In addition, without any domain knowledge, the model identifies drug resistance-related genes and variants consistent with those confirmed by previous studies and, more importantly, discovers one additional potential drug resistance-related gene.
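The document-classification analogy in this abstract (variants as words, mutated genes as sentences, an isolate as a document) can be illustrated with a two-level attention pooling step. The sketch below is not the HANN model itself: the gene names, the 2-dimensional toy embeddings and the two context vectors are all hypothetical, and a real model would learn the embeddings and context vectors and would typically add encoder layers before each pooling step.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Weight each vector by softmax(dot(vector, context)).

    Returns the weighted sum of the vectors and the attention weights.
    """
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


# Hypothetical isolate: each mutated gene maps to its variant embeddings.
isolate = {
    "gene_a": [[1.0, 0.0], [0.0, 1.0]],
    "gene_b": [[0.5, 0.5]],
}
variant_context = [1.0, 0.0]  # assumption: would be learned in a real model
gene_context = [0.0, 1.0]     # assumption: would be learned in a real model

# Level 1: pool variants ("words") into one vector per gene ("sentence").
gene_vectors = {}
for gene, variants in isolate.items():
    gene_vectors[gene], _ = attention_pool(variants, variant_context)

# Level 2: pool gene vectors into one vector per isolate ("document").
isolate_vector, gene_weights = attention_pool(list(gene_vectors.values()), gene_context)
```

The attention weights at each level are what make such a model inspectable: a large weight on a variant or gene indicates that it contributed strongly to the isolate-level representation.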


Subjects
Mycobacterium tuberculosis , Antitubercular Agents/pharmacology , Antitubercular Agents/therapeutic use , Drug Resistance , Microbial Sensitivity Tests , Mutation , Neural Networks, Computer
2.
Brief Bioinform ; 21(1): 318-328, 2020 Jan 17.
Article in English | MEDLINE | ID: mdl-30496338

ABSTRACT

Drug resistance is one of the most intractable obstacles to successful treatment in current clinical practice. Although many mutations contributing to drug resistance have been identified, the relationship between the mutations and the related pharmacological profile of drug candidates has yet to be fully elucidated. Understanding this relationship is valuable both for the molecular dissection of drug resistance mechanisms and for suggesting promising treatment strategies to counter resistance. Hence, an effective approach for predicting the sensitivity of mutations to agents offers a new opportunity to counter drug resistance and is of high interest in pharmaceutical research. However, this task is hampered by the limited number of known resistance training samples and by the difficulty of accurately estimating binding affinity. To address this challenge, we developed Auto In Silico Macromolecular Mutation Scanning (AIMMS), a web server for computer-aided de novo drug resistance prediction for any ligand-protein system. AIMMS can qualitatively estimate the free energy consequences of any mutation through a fast mutagenesis scanning calculation based on a single molecular dynamics trajectory, and it is distinguished from other web services by a statistical learning system. The AIMMS suite is available at http://chemyang.ccnu.edu.cn/ccb/server/AIMMS/.

3.
Article in English | MEDLINE | ID: mdl-32122893

ABSTRACT

In this retrospective study, whole-genome sequencing (WGS) data generated on an Ion Torrent platform was used to predict phenotypic drug resistance profiles for first- and second-line drugs among Swedish clinical Mycobacterium tuberculosis isolates from 2016 to 2018. The accuracy was ∼99% for all first-line drugs and 100% for four second-line drugs. Our analysis supports the introduction of WGS into routine diagnostics, which might, at least in Sweden, replace phenotypic drug susceptibility testing in the future.


Subjects
Drug Resistance, Multiple, Bacterial/drug effects , Drug Resistance, Multiple, Bacterial/genetics , Mycobacterium tuberculosis/drug effects , Mycobacterium tuberculosis/genetics , Tuberculosis/microbiology , Whole Genome Sequencing , Humans , Sweden , Tuberculosis, Multidrug-Resistant/drug therapy
4.
BMC Bioinformatics ; 20(1): 410, 2019 Jul 30.
Article in English | MEDLINE | ID: mdl-31362714

ABSTRACT

BACKGROUND: Antiretroviral drugs are a very effective therapy against HIV infection. However, the high mutation rate of HIV permits the emergence of variants that can be resistant to the drug treatment. Predicting the drug resistance of previously unobserved variants is therefore very important for optimal medical treatment. In this paper, we propose the use of weighted categorical kernel functions to predict drug resistance from virus sequence data. These kernel functions are very simple to implement, can take into account particularities of HIV data such as allele mixtures, and can weigh the importance of each protein residue, as it is known that not all positions contribute equally to resistance. RESULTS: We analyzed 21 drugs of four classes: protease inhibitors (PI), integrase inhibitors (INI), nucleoside reverse transcriptase inhibitors (NRTI) and non-nucleoside reverse transcriptase inhibitors (NNRTI). We compared two categorical kernel functions, Overlap and Jaccard, against two well-known noncategorical kernel functions (Linear and RBF) and Random Forest (RF). Weighted versions of these kernels were also considered, where the weights were obtained from the RF decrease in node impurity. The Jaccard kernel was the best method, either in its weighted or unweighted form, for 20 out of the 21 drugs. CONCLUSIONS: Results show that kernels that take into account both the categorical nature of the data and the presence of mixtures consistently result in the best prediction model. The advantage of including weights depended on the protein targeted by the drug. In the case of reverse transcriptase, weights based on the relative importance of each position clearly increased the prediction performance, while the improvement for the protease was much smaller. This seems to be related to the distribution of weights, as measured by the Gini index. All methods described, together with documentation and examples, are freely available at https://bitbucket.org/elies_ramon/catkern.
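To make the two categorical kernels named in this abstract concrete, here is a minimal sketch under stated assumptions: each sequence is represented as a list with one set of residues per position (a set with more than one residue stands for an allele mixture), Overlap scores a position 1 only when the two sets are identical, and Jaccard scores it by intersection over union. The function names, the toy sequences and the weighted-average normalisation are illustrative choices, not taken from the authors' catkern package, whose exact definitions may differ.

```python
def _check(x, y, weights):
    """Validate inputs and return a weight list (uniform if none is given)."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same number of positions")
    if weights is None:
        weights = [1.0] * len(x)
    if len(weights) != len(x):
        raise ValueError("one weight per position is required")
    return weights


def overlap_kernel(x, y, weights=None):
    """Weighted fraction of positions whose residue sets are identical."""
    weights = _check(x, y, weights)
    score = sum(w for a, b, w in zip(x, y, weights) if a == b)
    return score / sum(weights)


def jaccard_kernel(x, y, weights=None):
    """Weighted mean over positions of |A & B| / |A | B|."""
    weights = _check(x, y, weights)
    score = sum(w * len(a & b) / len(a | b) for a, b, w in zip(x, y, weights))
    return score / sum(weights)


# Toy sequences of three positions; the second position of seq_1 is a mixture.
seq_1 = [{"L"}, {"M", "I"}, {"V"}]
seq_2 = [{"L"}, {"M"}, {"A"}]

k_overlap = overlap_kernel(seq_1, seq_2)
k_jaccard = jaccard_kernel(seq_1, seq_2)
# Hypothetical position weights, e.g. derived from a Random Forest importance.
k_jaccard_weighted = jaccard_kernel(seq_1, seq_2, weights=[0.2, 0.6, 0.2])
```

Unlike Overlap, the Jaccard score gives partial credit at the mixture position, which is one way to read the abstract's claim that handling mixtures improves prediction.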


Subjects
Algorithms , Computational Biology/methods , Drug Resistance, Viral/genetics , HIV-1/genetics , Anti-HIV Agents/pharmacology , Drug Resistance, Viral/drug effects , HIV Infections/virology , HIV-1/drug effects , HIV-1/isolation & purification , Humans , Linear Models , Principal Component Analysis
5.
BMC Bioinformatics ; 18(1): 369, 2017 Aug 15.
Article in English | MEDLINE | ID: mdl-28810826

ABSTRACT

BACKGROUND: Drug resistance in HIV treatment is still a worldwide problem. Predicting resistance to antiretrovirals (ARVs) before starting any treatment is important. Prediction accuracy is essential, as low-accuracy predictions increase the risk of prescribing sub-optimal drug regimens, leading to patients developing resistance sooner. Artificial Neural Networks (ANNs) are a powerful tool that can assist in drug resistance prediction. In this study, we constrained the dataset to subtype B, sacrificing generalizability for higher predictive performance, and demonstrated that the predictive quality of the ANN regression models improved for most ARVs. RESULTS: Trained regression ANNs were optimized for eight protease inhibitors, six nucleoside reverse transcriptase (RT) inhibitors and four non-nucleoside RT inhibitors by experimenting with combinations of rare variant filtering (none versus 1 residue occurrence) and ANN topologies (1-3 hidden layers with 2, 4, 6, 8 and 10 nodes per layer). Single hidden layers (5-20 nodes) were used for training where overfitting was detected. 5-fold cross-validation produced mean R² values over 0.95 and standard deviations lower than 0.04 for all but two antiretrovirals. CONCLUSIONS: Overall, higher accuracies and lower variances (compared to results published in 2016) were obtained by experimenting with various preprocessing methods, while focusing on the most prevalent subtype in the raw dataset (subtype B). We thus highlight the need to develop and make available subtype-specific datasets for developing more accurate drug-resistance prediction methods.
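The evaluation protocol reported here (5-fold cross-validation summarised as a mean R² and its standard deviation) can be sketched in a few lines. This is only an illustration of the protocol: a one-variable least-squares line stands in for the ANN regression models, the toy data are synthetic, and the function names are hypothetical.

```python
import statistics


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (stand-in model)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def cross_validated_r2(xs, ys, k=5):
    """Return the mean and sample standard deviation of R² over k folds."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [slope * xs[i] + intercept for i in test]
        scores.append(r_squared([ys[i] for i in test], preds))
    return statistics.mean(scores), statistics.stdev(scores)


# Synthetic data: a linear trend with a small alternating perturbation.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + 0.1 * (-1) ** i for i, x in enumerate(xs)]
mean_r2, sd_r2 = cross_validated_r2(xs, ys, k=5)
```

In practice the folds would usually be shuffled before splitting; contiguous folds are used here only to keep the sketch deterministic.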


Subjects
Drug Resistance, Viral , HIV Protease/metabolism , HIV Reverse Transcriptase/metabolism , HIV-1/metabolism , Neural Networks, Computer , Databases, Factual , HIV Infections/drug therapy , HIV Protease/chemistry , HIV Protease Inhibitors/pharmacology , HIV Protease Inhibitors/therapeutic use , HIV Reverse Transcriptase/chemistry , HIV-1/drug effects , Humans , Reverse Transcriptase Inhibitors/pharmacology , Reverse Transcriptase Inhibitors/therapeutic use
6.
J Clin Microbiol ; 55(6): 1871-1882, 2017 06.
Article in English | MEDLINE | ID: mdl-28381603

ABSTRACT

Whole-genome sequencing (WGS) is a newer alternative for tuberculosis (TB) diagnostics and is capable of providing rapid drug resistance profiles while performing species identification and capturing the data necessary for genotyping. Our laboratory developed and validated a comprehensive and sensitive WGS assay to characterize Mycobacterium tuberculosis and other M. tuberculosis complex (MTBC) strains, composed of a novel DNA extraction, optimized library preparation, paired-end WGS, and an in-house-developed bioinformatics pipeline. This new assay was assessed using 608 MTBC isolates, with 146 isolates during the validation portion of this study and 462 samples received prospectively. In February 2016, this assay was implemented to test all clinical cases of MTBC in New York State, including isolates and early positive Bactec mycobacterial growth indicator tube (MGIT) 960 cultures from primary specimens. Since the inception of the assay, we have assessed the accuracy of identification of MTBC strains to the species level, concordance with culture-based drug susceptibility testing (DST), and turnaround time. Species identification by WGS was determined to be 99% accurate. Concordance between drug resistance profiles generated by WGS and culture-based DST methods was 96% for eight drugs, with an average resistance-predictive value of 93% and susceptible-predictive value of 96%. This single comprehensive WGS assay has replaced seven molecular assays and has resulted in resistance profiles being reported to physicians an average of 9 days sooner than with culture-based DST for first-line drugs and 32 days sooner for second-line drugs.
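The agreement figures quoted in this abstract (concordance, resistance-predictive value and susceptible-predictive value) can be computed from paired calls as sketched below. The definitions are assumptions: the resistance-predictive value is taken to be the fraction of WGS-resistant calls confirmed by culture-based DST, and the susceptible-predictive value the fraction of WGS-susceptible calls confirmed by DST. The call lists are invented toy data, not results from the study.

```python
def agreement_metrics(wgs_calls, dst_calls):
    """Compare WGS predictions with DST results ("R" = resistant, "S" = susceptible)."""
    if len(wgs_calls) != len(dst_calls):
        raise ValueError("call lists must have the same length")
    pairs = list(zip(wgs_calls, dst_calls))
    tp = sum(1 for w, d in pairs if w == "R" and d == "R")  # both resistant
    fp = sum(1 for w, d in pairs if w == "R" and d == "S")  # WGS resistant only
    tn = sum(1 for w, d in pairs if w == "S" and d == "S")  # both susceptible
    fn = sum(1 for w, d in pairs if w == "S" and d == "R")  # DST resistant only

    def ratio(num, den):
        # Return None rather than dividing by zero when a category is empty.
        return num / den if den else None

    return {
        "concordance": ratio(tp + tn, len(pairs)),
        "resistance_predictive_value": ratio(tp, tp + fp),
        "susceptible_predictive_value": ratio(tn, tn + fn),
    }


# Invented toy calls for six isolates and a single drug.
wgs = ["R", "R", "S", "S", "S", "R"]
dst = ["R", "S", "S", "S", "R", "R"]
metrics = agreement_metrics(wgs, dst)
```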


Subjects
Drug Resistance, Bacterial , Genotyping Techniques/methods , Microbial Sensitivity Tests/methods , Mycobacterium tuberculosis/genetics , Tuberculosis/diagnosis , Whole Genome Sequencing/methods , Computational Biology/methods , Humans , New York , Prospective Studies , Retrospective Studies , Tuberculosis/microbiology
7.
BMC Bioinformatics ; 17 Suppl 8: 278, 2016 Aug 31.
Article in English | MEDLINE | ID: mdl-27586700

ABSTRACT

BACKGROUND: HIV/AIDS is a serious threat to public health. The emergence of drug resistance mutations diminishes the effectiveness of drug therapy for HIV/AIDS. Developing a computational prediction of drug resistance phenotype will enable efficient and timely selection of the best treatment regimens. RESULTS: A unified encoding of protein sequence and structure was used as the feature vector for predicting phenotypic resistance from genotype data. Two machine learning algorithms, Random Forest and K-nearest neighbor, were used. The prediction accuracies were examined by five-fold cross-validation on the genotype-phenotype datasets. A supervised machine learning approach for automatic prediction of drug resistance was developed to handle genotype-phenotype datasets of HIV protease (PR) and reverse transcriptase (RT). It predicts the drug resistance phenotype and its relative severity from a query sequence. The accuracy of the classification was higher than 0.973 for eight PR inhibitors and 0.986 for ten RT inhibitors, respectively. The overall cross-validated regression R² values for the severity of drug resistance were 0.772-0.953 for eight PR inhibitors and 0.773-0.995 for ten RT inhibitors. CONCLUSIONS: Machine learning using a unified encoding of sequence and protein structure as a feature vector provides an accurate prediction of drug resistance from genotype data. A practical webserver for clinicians has been implemented.


Subjects
Computational Biology/methods , Databases, Genetic , Drug Resistance, Viral/genetics , HIV Reverse Transcriptase/antagonists & inhibitors , HIV-1/genetics , Algorithms , Anti-HIV Agents/pharmacology , Automation , Drug Resistance, Viral/drug effects , Genotype , HIV Infections/drug therapy , HIV Protease/genetics , HIV Protease Inhibitors/pharmacology , HIV Reverse Transcriptase/chemistry , HIV-1/drug effects , Humans , Reverse Transcriptase Inhibitors/pharmacology
8.
Microb Genom ; 9(8)2023 08.
Article in English | MEDLINE | ID: mdl-37552534

ABSTRACT

Tuberculosis is a global pandemic disease with a rising burden of antimicrobial resistance. As a result, the World Health Organization (WHO) has a goal of enabling universal access to drug susceptibility testing (DST). Given the slowness of and infrastructure requirements for phenotypic DST, whole-genome sequencing, followed by genotype-based prediction of DST, now provides a route to achieving this. Since a central component of genotypic DST is to detect the presence of any known resistance-causing mutations, a natural approach is to use a reference graph that allows encoding of known variation. We have developed DrPRG (Drug resistance Prediction with Reference Graphs) using the bacterial reference graph method Pandora. First, we outline the construction of a Mycobacterium tuberculosis drug resistance reference graph. The graph is built from a global dataset of isolates with varying drug susceptibility profiles, thus capturing common and rare resistance- and susceptible-associated haplotypes. We benchmark DrPRG against the existing graph-based tool Mykrobe and the haplotype-based approach of TBProfiler using 44 709 and 138 publicly available Illumina and Nanopore samples with associated phenotypes. We find that DrPRG has significantly improved sensitivity and specificity for some drugs compared to these tools, with no significant decreases. It uses significantly less computational memory than both tools, and provides significantly faster runtimes, except when runtime is compared to Mykrobe with Nanopore data. We discover and discuss novel insights into resistance-conferring variation for M. tuberculosis - including deletion of genes katG and pncA - and suggest mutations that may warrant reclassification as associated with resistance.
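The core idea stated in this abstract, that genotypic DST detects the presence of known resistance-causing mutations, reduces at its simplest to a catalogue lookup. The sketch below shows only that lookup step and is not how DrPRG, Mykrobe or TBProfiler work internally (they operate on sequencing reads and, for DrPRG, a reference graph). The three-entry catalogue is a tiny illustrative subset of commonly cited mutations, and the unmatched variant in the example isolate is made up.

```python
# Illustrative catalogue: (gene, amino acid change) -> drug. An assumption for
# this sketch, not a curated resource.
CATALOGUE = {
    ("katG", "S315T"): "isoniazid",
    ("rpoB", "S450L"): "rifampicin",
    ("embB", "M306V"): "ethambutol",
}


def predict_resistance(observed_variants, catalogue=CATALOGUE):
    """Call each catalogued drug "R" if any matching variant is observed, else "S".

    Returns the per-drug calls and the variants supporting each "R" call.
    """
    calls = {drug: "S" for drug in set(catalogue.values())}
    evidence = {}
    for variant in observed_variants:
        drug = catalogue.get(variant)
        if drug is not None:
            calls[drug] = "R"
            evidence.setdefault(drug, []).append(variant)
    return calls, evidence


# Hypothetical isolate: one catalogued variant and one made-up, uncatalogued one.
isolate_variants = [("katG", "S315T"), ("hypothetical_gene", "A10V")]
calls, evidence = predict_resistance(isolate_variants)
```

A lookup like this can only ever report what the catalogue already contains, which is the limitation that motivates both richer variant representations and machine learning approaches.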


Subjects
Mycobacterium tuberculosis , Tuberculosis, Multidrug-Resistant , Tuberculosis , Humans , Antitubercular Agents/pharmacology , Antitubercular Agents/therapeutic use , Tuberculosis, Multidrug-Resistant/genetics , Microbial Sensitivity Tests , Drug Resistance, Multiple, Bacterial/genetics , Tuberculosis/microbiology
9.
Braz J Infect Dis ; 26(1): 102332, 2022.
Article in English | MEDLINE | ID: mdl-35176257

ABSTRACT

Tuberculosis (TB), caused by Mycobacterium tuberculosis (MTB), is one of the top 10 causes of death worldwide. Drug-resistant tuberculosis (DR-TB) poses a major threat to the World Health Organization's "End TB" strategy, which has defined its target as the year 2035. In 2019, there were close to 0.5 million cases of DR-TB, of which 78% were resistant to multiple TB drugs. The traditional culture-based drug susceptibility test (DST - the current gold standard) often takes multiple weeks, and the necessary laboratory facilities are not readily available in low-income countries. Whole genome sequencing (WGS) technology is rapidly becoming an important tool in clinical and research applications, including transmission detection and prediction of DR-TB. For the latter, many tools have recently been developed using curated database(s) of known resistance-conferring mutations. However, documenting all the mutations and their effects is a time-consuming and continuous process, and therefore Machine Learning (ML) techniques can be useful for predicting the presence of DR-TB based on WGS data. This can pave the way to earlier detection of drug resistance and consequently more efficient treatment when compared to the traditional DST.


Subjects
Mycobacterium tuberculosis , Tuberculosis, Multidrug-Resistant , Tuberculosis , Antitubercular Agents/pharmacology , Antitubercular Agents/therapeutic use , Drug Resistance , Humans , Machine Learning , Microbial Sensitivity Tests , Mycobacterium tuberculosis/genetics , Tuberculosis/drug therapy , Tuberculosis/microbiology , Tuberculosis, Multidrug-Resistant/drug therapy , Tuberculosis, Multidrug-Resistant/microbiology
10.
Curr Med Chem ; 29(10): 1664-1676, 2022.
Article in English | MEDLINE | ID: mdl-34238145

ABSTRACT

Acquired immunodeficiency syndrome (AIDS) has long been a chronic, life-threatening disease. Although a broad range of antiretroviral drug regimens is available for the successful suppression of virus replication in people infected with human immunodeficiency virus type 1 (HIV-1), mutation-induced drug resistance during the treatment of AIDS has forced researchers to look continuously for new antiviral agents. HIV-1 integrase (IN) and reverse transcriptase associated ribonuclease (RT-RNase H), two pivotal enzymes in the HIV-1 replication process, have gained popularity as druggable targets for designing novel HIV-1 antiviral drugs. During the development of HIV-1 IN and/or RT-RNase H inhibitors, computer-aided drug design (CADD), including homology modeling, pharmacophore modeling, docking, molecular dynamics (MD) simulation and binding free energy calculation, represents a significant tool to accelerate the discovery of new drug candidates and reduce costs in antiviral drug development. In this review, we summarize recent advances in the design of single- and dual-target inhibitors against HIV-1 IN and/or RT-RNase H, as well as in the prediction of mutation-induced drug resistance based on computational methods. We highlight the results reported in the literature and propose some perspectives on the design of novel and more effective antiviral drugs in the future.


Subjects
Acquired Immunodeficiency Syndrome , Anti-HIV Agents , HIV Infections , HIV Integrase Inhibitors , HIV Integrase , Anti-HIV Agents/chemistry , Antiviral Agents/pharmacology , Computers , Drug Design , HIV Infections/drug therapy , HIV Integrase/metabolism , HIV Integrase Inhibitors/chemistry , HIV Reverse Transcriptase , Humans , Reverse Transcriptase Inhibitors/chemistry , Reverse Transcriptase Inhibitors/pharmacology , Ribonuclease H
11.
IDCases ; 26: e01308, 2021.
Article in English | MEDLINE | ID: mdl-34745885

ABSTRACT

A 44-year-old woman undergoing therapy for acute promyelocytic leukemia (APL) developed disseminated tuberculosis. Mycobacterium tuberculosis (TB) was isolated from the blood and sputum. Initial drug susceptibility testing (DST) of the blood isolate revealed resistance to isoniazid and ethambutol, but the sputum isolate showed no resistance. Due to drug resistance concerns, the patient was treated with multiple second- and third-line drugs and suffered from drug side effects. To further investigate the DST discrepancies, whole genome sequencing (WGS) was performed on both isolates. No known resistance mutations to first-line or second-line drugs were identified in either isolate, which was confirmed by additional susceptibility testing performed by a different reference laboratory and the California Department of Public Health (CDPH) laboratory. As a result of these investigations, treatment was reduced to a simpler and less toxic regimen. WGS is shown to be a valuable tool for resolving discordant phenotypic DST results of TB isolates and has the potential to provide accurate and timely results guiding appropriate therapy in the clinical setting.

12.
Front Chem ; 8: 243, 2020.
Article in English | MEDLINE | ID: mdl-32411655

ABSTRACT

In silico methodologies have opened new avenues of research for understanding and predicting drug resistance, a pressing health issue that keeps rising at an alarming pace. Sequence-based interpretation systems are routinely applied in a clinical context in an attempt to predict mutation-based drug resistance and thus aid the choice of the most adequate antibiotic and antiviral therapy. An important limitation of approaches based exclusively on genotypic data is that mutations are not considered in the context of the three-dimensional (3D) structure of the target. Structure-based in silico methodologies are inherently more suitable for interpreting and predicting the impact of mutations on target-drug interactions, at the cost of higher computational and time demands when compared with sequence-based approaches. Herein, we present a fast, computationally inexpensive, sequence-to-structure-based approach to drug resistance prediction, which makes use of 3D protein structures encoded by input target sequences to draw binding-site comparisons with susceptible templates. Rather than performing atom-by-atom comparisons between input target and template structures, our workflow generates and compares Molecular Interaction Fields (MIFs) that map the areas of energetically favorable interactions between several chemical probe types and the target binding site. Quantitative, pairwise dissimilarity measurements between the target and the template binding sites are thus produced. The method is particularly suited to understanding changes to the 3D structure and the physicochemical environment introduced by mutations into the target binding site. Furthermore, the workflow relies exclusively on freeware, making it accessible to anyone. Using four datasets of known HIV-1 protease sequences as a case study, we show that our approach is capable of correctly classifying resistant and susceptible sequences given as input. Guided by ROC curve analyses, we fine-tuned a dissimilarity threshold of classification that results in remarkable discriminatory performance (accuracy ≈ ROC AUC ≈ 0.99), illustrating the high potential of sequence-to-structure-, MIF-based approaches in the context of drug resistance prediction. We discuss the complementarity of the proposed methodology to existing prediction algorithms based on genotypic data. The present work represents a new step toward a more comprehensive and structurally informed interpretation of the impact of genetic variability on the response to HIV-1 therapies.
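The last step described in this abstract, choosing a dissimilarity threshold with the help of a ROC curve, can be sketched as follows. The abstract does not say which criterion was used to pick the threshold, so Youden's J statistic (true positive rate minus false positive rate) is swapped in here as one common choice. The dissimilarity scores and labels are invented toy values; a higher score is assumed to mean a binding site less similar to the susceptible template.

```python
def roc_points(scores, labels):
    """Return (threshold, TPR, FPR) for each distinct score, highest first.

    labels: 1 = resistant, 0 = susceptible. A sequence is predicted
    resistant when its dissimilarity score is >= the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((threshold, tp / positives, fp / negatives))
    return points


def best_threshold(scores, labels):
    """Threshold maximising Youden's J = TPR - FPR (an assumed criterion)."""
    return max(roc_points(scores, labels), key=lambda p: p[1] - p[2])[0]


def roc_auc(scores, labels):
    """Area under the ROC curve by the trapezoidal rule."""
    curve = [(0.0, 0.0)] + [(fpr, tpr) for _, tpr, fpr in roc_points(scores, labels)]
    area = 0.0
    for (x1, y1), (x2, y2) in zip(curve, curve[1:]):
        area += (x2 - x1) * (y1 + y2) / 2.0
    return area


# Invented dissimilarity scores: three susceptible, then three resistant.
toy_scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 1, 1]
threshold = best_threshold(toy_scores, toy_labels)
auc_value = roc_auc(toy_scores, toy_labels)
```

On perfectly separated toy data like this, any threshold between the two groups classifies every sequence correctly; real data would show the trade-off between sensitivity and specificity that the ROC curve is meant to expose.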

13.
Exp Hematol Oncol ; 6: 3, 2017.
Article in English | MEDLINE | ID: mdl-28097046

ABSTRACT

BACKGROUND: The concept of precision medicine in cancer includes individual molecular studies to predict clinical outcomes. In the present N = 1 case, we retrospectively analysed lymphoma tissue by exome sequencing and global gene expression in a patient with unexpected long-term remission following relapse. The goals were to phenotype the diagnostic and relapsed lymphoma tissue and evaluate its pattern. A further goal was to identify mutations available for targeted therapy and to use gene expression to predict specific drug effects by resistance gene signatures (REGS) for R-CHOP as described at http://www.hemaclass.org. We expected that such a study could generate therapeutic information and a framework for future individual evaluation of molecular resistance detected at clinical relapse. CASE PRESENTATION: The patient was diagnosed with a transformed high-grade non-Hodgkin lymphoma stage III and treated with conventional R-CHOP [rituximab (R), cyclophosphamide (C), doxorubicin (H), vincristine (O) and prednisone (P)]. Unfortunately, she suffered from severe toxicity but recovered during the following 6 months' remission until biopsy-verified relapse. The patient refused second-line combination chemotherapy, but accepted 3 months' palliation with R and chlorambucil. Unexpectedly, she obtained continuous complete remission and is at present >9 years after primary diagnosis. Molecular studies and data evaluation by principal component analysis, mutation screening and copy number variations of the primary and relapsed tumor identified a pattern of branched lymphoma evolution, most likely diverging from an in situ follicular lymphoma. Accordingly, the primary diagnosed transformed lymphoma was classified as a diffuse large B cell lymphoma (DLBCL) of the GCB/centrocytic subtype by "cell of origin BAGS" assignment and as R sensitive and C, H, O and P resistant by "drug specific REGS" assignment. The relapsed DLBCL was classified as NC/memory subtype and as R, C, H sensitive but O and P resistant. CONCLUSIONS: Thorough analysis of the tumor DNA and RNA documented a branched evolution of the two clinically diagnosed tFL, most likely transformed from an unknown in situ lymphoma. Classification of the malignant tissue for drug-specific resistance did not explain the unexpected long-term remission and potential cure. However, it is tempting to consider the anti-CD20 immunotherapy as the curative intervention in the two independent tumors of this case.

14.
Braz. j. infect. dis ; 26(1): 102332, 2022. graf
Article in English | LILACS-Express | LILACS | ID: biblio-1364546

ABSTRACT

Tuberculosis (TB), caused by Mycobacterium tuberculosis (MTB), is one of the top 10 causes of death worldwide. Drug-resistant tuberculosis (DR-TB) poses a major threat to the World Health Organization's "End TB" strategy, which has defined its target as the year 2035. In 2019, there were close to 0.5 million cases of DR-TB, of which 78% were resistant to multiple TB drugs. The traditional culture-based drug susceptibility test (DST - the current gold standard) often takes multiple weeks, and the necessary laboratory facilities are not readily available in low-income countries. Whole genome sequencing (WGS) technology is rapidly becoming an important tool in clinical and research applications, including transmission detection and prediction of DR-TB. For the latter, many tools have recently been developed using curated database(s) of known resistance-conferring mutations. However, documenting all the mutations and their effects is a time-consuming and continuous process, and therefore Machine Learning (ML) techniques can be useful for predicting the presence of DR-TB based on WGS data. This can pave the way to earlier detection of drug resistance and consequently more efficient treatment when compared to the traditional DST.

16.
Article in English | MEDLINE | ID: mdl-29226916

ABSTRACT

Effective machine-learning handles large datasets efficiently. One key feature of handling large data is the use of databases such as MySQL. The freeware fuzzy decision tree induction tool, FDT, is a scalable supervised-classification software tool implementing fuzzy decision trees. It is based on an optimized fuzzy ID3 (FID3) algorithm. FDT 2.0 improves upon FDT 1.0 by bridging the gap between data science and data engineering: it combines a robust decisioning tool with data retention for future decisions, so that the tool does not need to be recalibrated from scratch every time a new decision is required. In this paper we briefly review the analytical capabilities of the freeware FDT tool and its major features and functionalities; examples of large biological datasets from HIV, microRNAs and sRNAs are included. This work shows how to integrate fuzzy decision algorithms with modern database technology. In addition, we show that integrating the fuzzy decision tree induction tool with database storage allows for optimal user satisfaction in today's Data Analytics world.

17.
Proc SIAM Int Conf Data Min ; 2013: 342-349, 2013.
Article in English | MEDLINE | ID: mdl-24910813

ABSTRACT

HIV rapidly evolves drug resistance in response to antiviral drugs used in AIDS therapy. Estimating the specific resistance of a given strain of HIV to individual drugs from sequence data has important benefits for both the therapy of individual patients and the development of novel drugs. We have developed an accurate classification method based on sparse representation theory, and demonstrate that this method is highly effective with HIV-1 protease. The protease structure is represented using our newly proposed encoding method based on Delaunay triangulation, and combined with the mutated amino acid sequences of known drug-resistant strains to train a machine-learning algorithm for both classification and regression of drug-resistant mutations. An overall cross-validated classification accuracy of 97% is obtained when trained on a publicly available database of approximately 1.5×10⁴ known sequences (Stanford HIV database http://hivdb.stanford.edu/cgi-bin/GenoPhenoDS.cgi). Resistance to four FDA-approved drugs is computed, and comparisons with other algorithms demonstrate that our method shows significant improvements in classification accuracy.
