Search | VHL Regional Portal

1.

An efficient hybrid deep learning architecture for predicting short antimicrobial peptides.

Nguyen, Quang H; Nguyen-Vo, Thanh-Hoang; Do, Trang T T; Nguyen, Binh P.

Proteomics ; 24(14): e2300382, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38837544

ABSTRACT

Short-length antimicrobial peptides (AMPs) have been demonstrated to have intensified antimicrobial activities against a wide spectrum of microbes. Therefore, exploration of novel and promising short AMPs is highly essential in developing various types of antimicrobial drugs or treatments. In addition to experimental approaches, computational methods have been developed to improve screening efficiency. Although existing computational methods have achieved satisfactory performance, there is still much room for model improvement. In this study, we proposed iAMP-DL, an efficient hybrid deep learning architecture, for predicting short AMPs. The model was constructed using two well-known deep learning architectures: the long short-term memory architecture and convolutional neural networks. To fairly assess the performance of the model, we compared our model with existing state-of-the-art methods using the same independent test set. Our comparative analysis shows that iAMP-DL outperformed other methods. Furthermore, to assess the robustness and stability of our model, the experiments were repeated 10 times to observe the variation in prediction efficiency. The results demonstrate that iAMP-DL is an effective, robust, and stable framework for detecting promising short AMPs. Another comparative study of different negative data sampling methods also confirms the effectiveness of our method and demonstrates that it can also be used to develop a robust model for predicting AMPs in general. The proposed framework was also deployed as an online web server with a user-friendly interface to support the research community in identifying short AMPs.

Subject(s)

Antimicrobial Peptides , Deep Learning , Antimicrobial Peptides/chemistry , Antimicrobial Peptides/pharmacology , Neural Networks, Computer , Computational Biology/methods , Antimicrobial Cationic Peptides/chemistry , Antimicrobial Cationic Peptides/pharmacology

2.

eMIC-AntiKP: Estimating minimum inhibitory concentrations of antibiotics towards Klebsiella pneumoniae using deep learning.

Nguyen, Quang H; Ngo, Hoang H; Nguyen-Vo, Thanh-Hoang; Do, Trang T T; Rahardja, Susanto; Nguyen, Binh P.

Comput Struct Biotechnol J ; 21: 751-757, 2023.

Article in English | MEDLINE | ID: mdl-36659924

ABSTRACT

Nowadays, antibiotic resistance has become one of the most concerning problems that directly affects the recovery process of patients. For years, numerous efforts have been made to efficiently use antimicrobial drugs with appropriate doses not only to exterminate microbes but also stringently constrain any chances for bacterial evolution. However, choosing proper antibiotics is not a straightforward and time-effective process because well-defined drugs can only be given to patients after determining microbic taxonomy and evaluating minimum inhibitory concentrations (MICs). Besides conventional methods, numerous computer-aided frameworks have been recently developed using computational advances and public data sources of clinical antimicrobial resistance. In this study, we introduce eMIC-AntiKP, a computational framework specifically designed to predict the MIC values of 20 antibiotics towards Klebsiella pneumoniae. Our prediction models were constructed using convolutional neural networks and k-mer counting-based features. The model for cefepime has the most limited performance with a test 1-tier accuracy of 0.49, while the model for ampicillin has the highest performance with a test 1-tier accuracy of 1.00. Most models have satisfactory performance, with test accuracies ranging from about 0.70-0.90. The significance of eMIC-AntiKP is the effective utilization of computing resources to make it a compact and portable tool for most moderately configured computers. We provide users with two options, including an online web server for basic analysis and an offline package for deeper analysis and technical modification.
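The k-mer counting-based features mentioned in the abstract can be illustrated with a minimal sketch. The function name `kmer_counts`, the choice of k, and the decision to ignore k-mers containing symbols outside `ACGT` are assumptions for illustration; the abstract does not state the settings used in eMIC-AntiKP.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k=3, alphabet="ACGT"):
    """Return a fixed-length vector of k-mer counts for one sequence.

    Hypothetical helper: k-mers containing symbols outside `alphabet`
    (for example N) are simply not counted.
    """
    sequence = sequence.upper()
    # Slide a window of width k over the sequence and tally each substring.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Enumerate every possible k-mer in a fixed order so that all sequences
    # share the same feature layout (4**k columns for DNA).
    vocabulary = ["".join(p) for p in product(alphabet, repeat=k)]
    return [counts.get(kmer, 0) for kmer in vocabulary]


features = kmer_counts("ACGTACGT", k=2)
print(len(features), sum(features))
```

A vector of this kind grows as 4^k, so small values of k keep the feature matrix compact, which fits the abstract's emphasis on moderately configured computers.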

3.

iNSP-GCAAP: Identifying nonclassical secreted proteins using global composition of amino acid properties.

Do, Trang T T; Nguyen-Vo, Thanh-Hoang; Pham, Hung T; Trinh, Quang H; Nguyen, Binh P.

Proteomics ; 23(1): e2100134, 2023 01.

Article in English | MEDLINE | ID: mdl-36401584

ABSTRACT

Nonclassical secreted proteins (NSPs) refer to a group of proteins released into the extracellular environment under the facilitation of different biological transporting pathways apart from the Sec/Tat system. As experimental determination of NSPs is often costly and requires skilled handling techniques, computational approaches are necessary. In this study, we introduce iNSP-GCAAP, a computational prediction framework, to identify NSPs. We propose using global composition of a customized set of amino acid properties to encode sequence data and use the random forest (RF) algorithm for classification. We used the training dataset introduced by Zhang et al. (Bioinformatics, 36(3), 704-712, 2020) to develop our model and test it with the independent test set in the same study. The area under the receiver operating characteristic curve on that test set was 0.9256, which outperformed other state-of-the-art methods using the same datasets. Our framework is also deployed as a user-friendly web-based application to support the research community to predict NSPs.
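For readers unfamiliar with the metric reported above, the area under the receiver operating characteristic curve can be read as the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
print(auc)
```

The quadratic loop is fine for a small test set; a rank-based formulation gives the same value more efficiently on large ones.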

Subject(s)

Amino Acids , Proteins , Amino Acids/metabolism , Proteins/chemistry , Software , Computational Biology/methods , Algorithms

4.

Predicting Antimalarial Activity in Natural Products Using Pretrained Bidirectional Encoder Representations from Transformers.

Nguyen-Vo, Thanh-Hoang; Trinh, Quang H; Nguyen, Loc; Do, Trang T T; Chua, Matthew Chin Heng; Nguyen, Binh P.

J Chem Inf Model ; 62(21): 5050-5058, 2022 Nov 14.

Article in English | MEDLINE | ID: mdl-36373285

ABSTRACT

Malaria is a threatening disease that has claimed many lives and has a high prevalence rate annually. Through the past decade, there have been many studies to uncover effective antimalarial compounds to combat this disease. Alongside chemically synthesized chemicals, a number of natural compounds have also been proven to be as effective in their antimalarial properties. Besides experimental approaches to investigate antimalarial activities in natural products, computational methods have been developed with satisfactory outcomes obtained. In this study, we propose a novel molecular encoding scheme based on Bidirectional Encoder Representations from Transformers and used our pretrained encoding model called NPBERT with four machine learning algorithms, including k-Nearest Neighbors (k-NN), Support Vector Machines (SVM), eXtreme Gradient Boosting (XGB), and Random Forest (RF), to develop various prediction models to identify antimalarial natural products. The results show that SVM models are the best-performing classifiers, followed by the XGB, k-NN, and RF models. Additionally, comparative analysis between our proposed molecular encoding scheme and existing state-of-the-art methods indicates that NPBERT is more effective compared to the others. Moreover, the deployment of transformers in constructing molecular encoders is not limited to this study but can be utilized for other biomedical applications.

Subject(s)

Antimalarials , Biological Products , Antimalarials/pharmacology , Antimalarials/chemistry , Biological Products/pharmacology , Support Vector Machine , Machine Learning , Algorithms

5.

Identifying Transcription Factors That Prefer Binding to Methylated DNA Using Reduced G-Gap Dipeptide Composition.

Nguyen, Quang H; Tran, Hoang V; Nguyen, Binh P; Do, Trang T T.

ACS Omega ; 7(36): 32322-32330, 2022 Sep 13.

Article in English | MEDLINE | ID: mdl-36119976

ABSTRACT

Transcription factors (TFs) play an important role in gene expression and regulation of 3D genome conformation. TFs have the ability to bind to specific DNA fragments called enhancers and promoters. Some TFs bind to promoter DNA fragments which are near the transcription initiation site and form complexes that allow polymerase enzymes to bind to initiate transcription. Previous studies showed that methylated DNAs had the ability to inhibit and prevent TFs from binding to DNA fragments. However, recent studies have found that there were TFs that could bind to methylated DNA fragments. The identification of these TFs is an important stepping stone to a better understanding of cellular gene expression mechanisms. However, as experimental methods are often time-consuming and labor-intensive, developing computational methods is essential. In this study, we propose two machine learning methods for two problems: (1) identifying TFs and (2) identifying TFs that prefer binding to methylated DNA targets (TFPMs). For the TF identification problem, the proposed method uses the position-specific scoring matrix for data representation and a deep convolutional neural network for modeling. This method achieved 90.56% sensitivity, 83.96% specificity, and an area under the receiver operating characteristic curve (AUC) of 0.9596 on an independent test set. For the TFPM identification problem, we propose to use the reduced g-gap dipeptide composition for data representation and the support vector machine algorithm for modeling. This method achieved 82.61% sensitivity, 64.86% specificity, and an AUC of 0.8486 on another independent test set. These results are higher than those of other studies on the same problems.
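The g-gap dipeptide composition named in the abstract counts pairs of residues separated by g intervening positions and normalizes the counts to frequencies. The sketch below is a minimal illustration over the full 20-letter amino acid alphabet; the function name is hypothetical, and the "reduced" step (clustering amino acids into fewer groups before counting) is omitted because the abstract does not give the grouping used.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_gap_dipeptide_composition(sequence, g=1, alphabet=AMINO_ACIDS):
    """Frequencies of residue pairs (i, i + g + 1) along a protein sequence.

    g = 0 gives the ordinary adjacent-dipeptide composition.
    Pairs containing symbols outside `alphabet` are skipped.
    """
    pairs = ["".join(p) for p in product(alphabet, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(sequence) - g - 1):
        pair = sequence[i] + sequence[i + g + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    # Normalize so that sequences of different lengths are comparable.
    return [counts[p] / total if total else 0.0 for p in pairs]


vector = g_gap_dipeptide_composition("ACACA", g=1)
print(len(vector), vector[0])
```

Passing a smaller `alphabet` of group labels, after mapping each residue to its group, would shrink the vector from 400 entries to the square of the number of groups.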

6.

Testing early warning and response systems through a full-scale exercise in Vietnam.

Clara, Alexey; Dao, Anh T P; Tran, Quy; Tran, Phu D; Dang, Tan Q; Nguyen, Huong T; Tran, Quang D; Rzeszotarski, Peter; Talbert, Karen; Stehling-Ariza, Tasha; Veasey, Frances; Clemens, Lynne; Mounts, Anthony W; Lofgren, Hannah; Balajee, S Arunmozhi; Do, Trang T.

BMC Public Health ; 21(1): 409, 2021 02 26.

Article in English | MEDLINE | ID: mdl-33637080

ABSTRACT

BACKGROUND: Simulation exercises can functionally validate World Health Organization (WHO) International Health Regulations (IHR 2005) core capacities. In 2018, the Vietnam Ministry of Health (MOH) conducted a full-scale exercise (FSX) in response to cases of severe viral pneumonia with subsequent laboratory confirmation for Middle East Respiratory Syndrome Coronavirus (MERS-CoV) to evaluate the country's early warning and response capabilities for high-risk events. METHODS: An exercise planning team designed a complex fictitious scenario beginning with one case of severe viral pneumonia presenting at the hospital level and developed all the materials required for the exercise. Actors, controllers and evaluators were trained. In August 2018, a 3-day exercise was conducted in Quang Ninh province and Hanoi city, with participation of public health partners at the community, district, province, regional and national levels. Immediate debriefings and an after-action review were conducted after all exercise activities. Participants assessed overall exercise design, conduction and usefulness. RESULTS: FSX findings demonstrated that the event-based surveillance component of the MOH surveillance system worked optimally at different administrative levels. Detection and reporting of signals at the community and health facility levels were appropriate. Triage, verification and risk assessment were successfully implemented to identify a high-risk event and trigger timely response. The FSX identified infection control, coordination with internal and external response partners and process documentation as response challenges. Participants positively evaluated the exercise training and design. CONCLUSIONS: This exercise documents the value of exercising surveillance capabilities as part of a real-time operational scenario before facing a true emergency. 
The timing of this exercise and choice of disease scenario was particularly fortuitous given the subsequent appearance of COVID-19. As a result of this exercise and subsequent improvements made by the MOH, the country may have been better able to deal with the emergence of SARS-CoV-2 and contain it.

Subject(s)

Disease Outbreaks/prevention & control , Public Health Surveillance/methods , COVID-19/epidemiology , COVID-19/prevention & control , Coronavirus Infections/epidemiology , Coronavirus Infections/prevention & control , Humans , Middle East Respiratory Syndrome Coronavirus/isolation & purification , Pneumonia, Viral/epidemiology , Pneumonia, Viral/prevention & control , Vietnam/epidemiology , World Health Organization

7.

Developing monitoring and evaluation tools for event-based surveillance: experience from Vietnam.

Clara, Alexey; Dao, Anh T P; Mounts, Anthony W; Bernadotte, Christina; Nguyen, Huyen T; Tran, Quy M; Tran, Quang D; Dang, Tan Q; Merali, Sharifa; Balajee, S Arunmozhi; Do, Trang T.

Global Health ; 16(1): 38, 2020 04 30.

Article in English | MEDLINE | ID: mdl-32354353

ABSTRACT

BACKGROUND: In 2016-2017, Vietnam's Ministry of Health (MoH) implemented an event-based surveillance (EBS) pilot project in six provinces as part of Global Health Security Agenda (GHSA) efforts. This manuscript describes development and design of tools for monitoring and evaluation (M&E) of EBS in Vietnam. METHODS: A strategic EBS framework was developed based on the EBS implementation pilot project's goals and objectives. The main process and outcome components were identified and included input, activities, outputs, and outcome indicators. M&E tools were developed to collect quantitative and qualitative data. The tools included a supervisory checklist, a desk review tool, a key informant interview guide, a focus group discussion guide, a timeliness form, and an online acceptability survey. An evaluation team conducted field visits for assessment of EBS 5-9 months after implementation. RESULTS: The quantitative data collected provided evidence on the number and type of events that were being reported, the timeliness of the system, and the event-to-signal ratio. The qualitative and subjective data collected helped to increase understanding of the system's field utility and acceptance by field staff, reasons for non-compliance with established guidelines, and other factors influencing implementation. CONCLUSIONS: The use of M&E tools for the EBS pilot project in Vietnam provided data on signals and events reported, timeliness of reporting and response, perceptions and opinions of implementers, and fidelity of EBS implementation. These data were valuable for Vietnam's MoH to understand the function of the EBS program, and the success and challenges of implementing this project in Vietnam.

Subject(s)

Epidemiological Monitoring , Global Health , Disease Outbreaks , Humans , Pilot Projects , Surveys and Questionnaires , Vietnam

8.

iEnhancer-ECNN: identifying enhancers and their strength using ensembles of convolutional neural networks.

Nguyen, Quang H; Nguyen-Vo, Thanh-Hoang; Le, Nguyen Quoc Khanh; Do, Trang T T; Rahardja, Susanto; Nguyen, Binh P.

BMC Genomics ; 20(Suppl 9): 951, 2019 Dec 24.

Article in English | MEDLINE | ID: mdl-31874637

ABSTRACT

BACKGROUND: Enhancers are non-coding DNA fragments which are crucial in gene regulation (e.g. transcription and translation). Having high locational variation and free scattering in 98% of non-encoding genomes, enhancer identification is, therefore, more complicated than other genetic factors. To address this biological issue, several in silico studies have been done to identify and classify enhancer sequences among a myriad of DNA sequences using computational advances. Although recent studies have come up with improved performance, shortfalls in these learning models still remain. To overcome limitations of existing learning models, we introduce iEnhancer-ECNN, an efficient prediction framework using one-hot encoding and k-mers for data transformation and ensembles of convolutional neural networks for model construction, to identify enhancers and classify their strength. The benchmark dataset from Liu et al.'s study was used to develop and evaluate the ensemble models. A comparative analysis between iEnhancer-ECNN and existing state-of-the-art methods was done to fairly assess the model performance. RESULTS: Our experimental results demonstrate that iEnhancer-ECNN has better performance compared to other state-of-the-art methods using the same dataset. The accuracies of the ensemble model for enhancer identification (layer 1) and enhancer classification (layer 2) are 0.769 and 0.678, respectively. Compared to other related studies, improvements in the Area Under the Receiver Operating Characteristic Curve (AUC), sensitivity, and Matthews correlation coefficient (MCC) of our models are remarkable, especially for the model of layer 2 with about 11.0%, 46.5%, and 65.0%, respectively. CONCLUSIONS: iEnhancer-ECNN outperforms other previously proposed methods with significant improvement in most of the evaluation metrics. Strong growths in the MCC of both layers are highly meaningful in assuring the stability of our models.
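The one-hot encoding step mentioned in the abstract turns each nucleotide into a four-element indicator vector, giving a length-by-4 matrix that a convolutional network can consume. The sketch below is an assumption-laden illustration: the function name, the A, C, G, T column order, and the all-zero row for unknown bases are choices made here, not details taken from iEnhancer-ECNN.

```python
def one_hot_encode(sequence, alphabet="ACGT"):
    """Encode a DNA sequence as a list of indicator rows, one per base."""
    index = {base: i for i, base in enumerate(alphabet)}
    encoded = []
    for base in sequence.upper():
        row = [0] * len(alphabet)
        # Unknown symbols (for example N) are left as an all-zero row.
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


matrix = one_hot_encode("ACGTN")
for row in matrix:
    print(row)
```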

Subject(s)

Enhancer Elements, Genetic , Neural Networks, Computer , Sequence Analysis, DNA/methods

9.

Predicting the onset of type 2 diabetes using wide and deep learning with electronic health records.

Nguyen, Binh P; Pham, Hung N; Tran, Hop; Nghiem, Nhung; Nguyen, Quang H; Do, Trang T T; Tran, Cao Truong; Simpson, Colin R.

Comput Methods Programs Biomed ; 182: 105055, 2019 Dec.

Article in English | MEDLINE | ID: mdl-31505379

ABSTRACT

OBJECTIVE: Diabetes is responsible for considerable morbidity, healthcare utilisation and mortality in both developed and developing countries. Currently, methods of treating diabetes are inadequate and costly so prevention becomes an important step in reducing the burden of diabetes and its complications. Electronic health records (EHRs) for each individual or a population have become important tools in understanding developing trends of diseases. Using EHRs to predict the onset of diabetes could improve the quality and efficiency of medical care. In this paper, we apply a wide and deep learning model that combines the strength of a generalised linear model with various features and a deep feed-forward neural network to improve the prediction of the onset of type 2 diabetes mellitus (T2DM). MATERIALS AND METHODS: The proposed method was implemented by training various models with a logistic loss function using stochastic gradient descent. We applied this model using public hospital record data provided by the Practice Fusion EHRs for the United States population. The dataset consists of de-identified electronic health records for 9948 patients, of which 1904 have been diagnosed with T2DM. Prediction of diabetes in 2012 was based on data obtained from previous years (2009-2011). Class imbalance was handled using the Synthetic Minority Oversampling Technique (SMOTE) for each cross-validation training fold to analyse the performance when synthetic examples for the minority class are created. We used SMOTE of 150 and 300 percent, in which 300 percent means that three new synthetic instances are created for each minority class instance. This results in the approximated diabetes:non-diabetes distributions in the training set of 1:2 and 1:1, respectively. RESULTS: Our final ensemble model not using SMOTE obtained an accuracy of 84.28%, area under the receiver operating characteristic curve (AUC) of 84.13%, sensitivity of 31.17% and specificity of 96.85%. 
Using SMOTE of 150 and 300 percent did not improve AUC (83.33% and 82.12%, respectively) but increased sensitivity (49.40% and 71.57%, respectively) with a moderate decrease in specificity (90.16% and 76.59%, respectively). DISCUSSION AND CONCLUSIONS: Our algorithm has further optimised the prediction of diabetes onset using a novel state-of-the-art machine learning algorithm: the wide and deep learning neural network architecture.
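The approximate 1:2 and 1:1 class distributions quoted in the abstract follow from simple arithmetic on the reported counts. The sketch below reproduces that arithmetic on the full dataset (9948 patients, 1904 with T2DM); this is an approximation, since the study applied SMOTE within each cross-validation training fold, and the function name is hypothetical.

```python
def class_ratio_after_smote(minority, majority, smote_percent):
    """Return (new minority size, majority-to-minority ratio) after oversampling.

    smote_percent = 300 means three synthetic instances per original
    minority instance, as described in the abstract.
    """
    synthetic = minority * smote_percent // 100
    new_minority = minority + synthetic
    return new_minority, majority / new_minority


total_patients = 9948
t2dm = 1904
non_t2dm = total_patients - t2dm  # 8044 patients without a T2DM diagnosis

for percent in (150, 300):
    size, ratio = class_ratio_after_smote(t2dm, non_t2dm, percent)
    print(f"SMOTE {percent}%: {size} diabetes vs {non_t2dm} non-diabetes, "
          f"about 1:{ratio:.2f}")
```

At 150 percent the ratio is about 1:1.69, which the abstract rounds to 1:2; at 300 percent it is about 1:1.06, or roughly 1:1.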

Subject(s)

Deep Learning , Diabetes Mellitus, Type 2/diagnosis , Electronic Health Records , Humans , Machine Learning

10.

iPseU-NCP: Identifying RNA pseudouridine sites using random forest and NCP-encoded features.

Nguyen-Vo, Thanh-Hoang; Nguyen, Quang H; Do, Trang T T; Nguyen, Thien-Ngan; Rahardja, Susanto; Nguyen, Binh P.

BMC Genomics ; 20(Suppl 10): 971, 2019 Dec 30.

Article in English | MEDLINE | ID: mdl-31888464

ABSTRACT

BACKGROUND: Pseudouridine modification is the most commonly found among the various kinds of RNA modification occurring in both prokaryotes and eukaryotes. This biochemical event has been proved to occur in multiple types of RNAs, including rRNA, mRNA, tRNA, and nuclear/nucleolar RNA. Hence, gaining a holistic understanding of pseudouridine modification can contribute to the development of drug discovery and gene therapies. Although some laboratory techniques have come up with moderately good outcomes in pseudouridine identification, they are costly and require skilled work experience. We propose iPseU-NCP - an efficient computational framework to predict pseudouridine sites using the Random Forest (RF) algorithm combined with nucleotide chemical properties (NCP) generated from RNA sequences. The benchmark dataset collected from Chen et al. (2016) was used to develop iPseU-NCP and fairly compare its performances with other methods. RESULTS: Under the same experimental settings, compared with three state-of-the-art methods including iPseU-CNN, PseUI, and iRNA-PseU, the Matthews correlation coefficient (MCC) of our model increased by about 20.0%, 55.0%, and 109.0% when tested on the H. sapiens (H_200) dataset and by about 6.5%, 35.0%, and 150.0% when tested on the S. cerevisiae (S_200) dataset, respectively. This significant growth in MCC is very important since it ensures the stability and performance of our model. With those two independent test datasets, our model also presented higher accuracy with a success rate boosted by 7.0%, 13.0%, and 20.0% and 2.0%, 9.5%, and 25.0% when compared to iPseU-CNN, PseUI, and iRNA-PseU, respectively. For the majority of other evaluation metrics, iPseU-NCP demonstrated superior performance as well. CONCLUSIONS: iPseU-NCP combining the RF and NCP-encoded features showed better performances than other existing state-of-the-art methods in the identification of pseudouridine sites. 
This also shows an optimistic view in addressing biological issues related to human diseases.
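Nucleotide chemical property (NCP) encoding represents each RNA base by three binary attributes. The sketch below uses a commonly cited convention (ring structure: purine = 1; functional group: amino = 1; hydrogen bonding: weak = 1); whether iPseU-NCP uses exactly this ordering or adds further features is an assumption, and the function name is hypothetical.

```python
# (ring structure, functional group, hydrogen bonding)
NCP_TABLE = {
    "A": (1, 1, 1),  # purine, amino, weak hydrogen bonding
    "C": (0, 1, 0),  # pyrimidine, amino, strong hydrogen bonding
    "G": (1, 0, 0),  # purine, keto, strong hydrogen bonding
    "U": (0, 0, 1),  # pyrimidine, keto, weak hydrogen bonding
}


def ncp_encode(rna_sequence):
    """Flatten an RNA sequence into a 3 * length feature vector."""
    features = []
    for base in rna_sequence.upper():
        # Unknown symbols are encoded as (0, 0, 0) in this sketch.
        features.extend(NCP_TABLE.get(base, (0, 0, 0)))
    return features


encoded = ncp_encode("GAUC")
print(encoded)
```

A flat vector of this form can be passed directly to a tree-based classifier such as a random forest, since every sequence window of the same length yields the same number of features.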

Subject(s)

Computational Biology/methods , Pseudouridine/metabolism , RNA/metabolism , RNA/genetics , Software

11.

Factors Influencing Community Event-based Surveillance: Lessons Learned from Pilot Implementation in Vietnam.

Clara, Alexey; Dao, Anh T P; Do, Trang T; Tran, Phu D; Tran, Quang D; Ngu, Nghia D; Ngo, Tu H; Phan, Hung C; Nguyen, Thuy T P; Bernadotte-Schmidt, Christina; Nguyen, Huyen T; Alroy, Karen Ann; Balajee, S Arunmozhi; Mounts, Anthony W.

Health Secur ; 16(S1): S66-S75, 2018.

Article in English | MEDLINE | ID: mdl-30480498

ABSTRACT

Community event-based surveillance aims to enhance the early detection of emerging public health threats and thus build health security. The Ministry of Health of Vietnam launched a community event-based surveillance pilot program in 6 provinces to improve the early warning functions of the existing surveillance system. An evaluation of the pilot program took place in 2017 and 2018. Data from this evaluation were analyzed to determine which factors were associated with increased detection and reporting. Results show that a number of small, local events were detected and reported through community event-based surveillance, supporting the notion that it would also facilitate the rapid detection and reporting of potentially larger events or outbreaks. The study showed the value of supportive supervision and monitoring to sustain community health worker reporting and the importance of conducting evaluations for community event-based surveillance programs to identify barriers to effective implementation.

Subject(s)

Disease Outbreaks/prevention & control , Population Surveillance/methods , Program Evaluation , Public Health , Global Health , Humans , Pilot Projects , Security Measures , Vietnam

12.

Event-Based Surveillance at Community and Healthcare Facilities, Vietnam, 2016-2017.

Clara, Alexey; Do, Trang T; Dao, Anh T P; Tran, Phu D; Dang, Tan Q; Tran, Quang D; Ngu, Nghia D; Ngo, Tu H; Phan, Hung C; Nguyen, Thuy T P; Lai, Anh T; Nguyen, Dung T; Nguyen, My K; Nguyen, Hieu T M; Becknell, Steven; Bernadotte, Christina; Nguyen, Huyen T; Nguyen, Quoc C; Mounts, Anthony W; Balajee, S Arunmozhi.

Emerg Infect Dis ; 24(9): 1649-1658, 2018 09.

Article in English | MEDLINE | ID: mdl-30124198

ABSTRACT

Surveillance and outbreak reporting systems in Vietnam required improvements to function effectively as early warning and response systems. Accordingly, the Ministry of Health of Vietnam, in collaboration with the US Centers for Disease Control and Prevention, launched a pilot project in 2016 focusing on community and hospital event-based surveillance. The pilot was implemented in 4 of Vietnam's 63 provinces. The pilot demonstrated that event-based surveillance resulted in early detection and reporting of outbreaks, improved collaboration between the healthcare facilities and preventive sectors of the ministry, and increased community participation in surveillance and reporting.

Subject(s)

Communicable Disease Control , Disease Outbreaks/prevention & control , Population Surveillance , Health Facilities , Hospitals , Humans , Vietnam/epidemiology

13.

Sustainable Model for Public Health Emergency Operations Centers for Global Settings.

Balajee, S Arunmozhi; Pasi, Omer G; Etoundi, Alain Georges M; Rzeszotarski, Peter; Do, Trang T; Hennessee, Ian; Merali, Sharifa; Alroy, Karen A; Phu, Tran Dac; Mounts, Anthony W.

Emerg Infect Dis ; 23(13)2017 10.

Article in English | MEDLINE | ID: mdl-29155649

ABSTRACT

Capacity to receive, verify, analyze, assess, and investigate public health events is essential for epidemic intelligence. Public health Emergency Operations Centers (PHEOCs) can be epidemic intelligence hubs by 1) having the capacity to receive, analyze, and visualize multiple data streams, including surveillance and 2) maintaining a trained workforce that can analyze and interpret data from real-time emerging events. Such PHEOCs could be physically located within a ministry of health epidemiology, surveillance, or equivalent department rather than exist as a stand-alone space and serve as operational hubs during nonoutbreak times but in emergencies can scale up according to the traditional Incident Command System structure.

Subject(s)

Disease Outbreaks/prevention & control , Global Health , Models, Organizational , Public Health Administration , Cameroon , Emergencies , Humans , Organizational Case Studies , Population Surveillance , Public Health Administration/methods , Vietnam , Workforce
