Results 1 - 9 of 9
1.
Methods ; 207: 90-96, 2022 11.
Article in English | MEDLINE | ID: mdl-36174933

ABSTRACT

Adaptor proteins (APs) are a family of proteins that aid in intracellular membrane trafficking, and their impairments or defects are closely related to various disorders. Traditional methods to identify and classify APs require time and complex techniques; machine learning and computational approaches were later introduced to facilitate the AP recognition task. However, most studies focused on recognizing individual members of the AP family or on separating APs in general from non-APs, and lacked a comprehensive strategy to distinguish the complexes of AP subtypes. Herein, we proposed a novel method for a new task, discriminating the AP complexes within the AP family, utilizing an interpretable deep neural network architecture on sequence-based encoding features. This work also introduced a benchmark data set of AP complexes originating from the UniProt and GeneOntology databases. To assess the robustness of our proposed method, we compared its performance with that of various machine learning algorithms and feature extraction strategies. Furthermore, the model's predictions were interpreted using t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP), and SHapley Additive exPlanations (SHAP) analysis to show the distribution of AP complexes on optimal features. The promising performance of our architecture can assist scientists not only in distinguishing AP complexes but also in analyzing protein sequences in general. Moreover, we have made our work publicly available on GitHub: https://github.com/khanhlee/adaptor-dnn.
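The abstract above does not say which sequence-based encodings were used. As a purely illustrative sketch, amino acid composition (the fraction of each of the 20 standard residues) is one common encoding of this kind; the function name and the toy sequence below are hypothetical and are not taken from the paper or its repository.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every sequence
# maps to a feature vector of the same length.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    Illustrative encoding only; non-standard characters are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real adaptor protein.
    features = amino_acid_composition("MKTAYIAKQR")
    print(len(features), round(sum(features), 6))
```

A fixed-length vector like this could be fed to any classifier; the choice of encoding here is an assumption made for illustration.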


Subject(s)
Deep Learning , Neural Networks, Computer , Machine Learning , Algorithms , Amino Acid Sequence , Proteins
2.
Sensors (Basel) ; 23(8)2023 Apr 13.
Article in English | MEDLINE | ID: mdl-37112302

ABSTRACT

Possible drug-food constituent interactions (DFIs) could change the intended efficiency of particular therapeutics in medical practice. The increasing number of multiple-drug prescriptions leads to a rise in drug-drug interactions (DDIs) and DFIs. These adverse interactions have further consequences, e.g., a decline in a medication's effect, the withdrawal of various medications, and harmful impacts on patients' health. However, the importance of DFIs remains underestimated, as the number of studies on this topic is limited. Recently, scientists have applied artificial intelligence-based models to study DFIs. However, there were still some limitations in data mining, input, and detailed annotations. This study proposed a novel prediction model to address the limitations of previous studies. In detail, we extracted 70,477 food compounds from the FooDB database and 13,580 drugs from the DrugBank database. We extracted 3780 features from each drug-food compound pair. The optimal model was eXtreme Gradient Boosting (XGBoost). We also validated the performance of our model on one external test set from a previous study, which contained 1922 DFIs. Finally, we applied our model to recommend whether a drug should or should not be taken with some food compounds based on their interactions. The model can provide highly accurate and clinically relevant recommendations, especially for DFIs that may cause severe adverse events and even death. Our proposed model can contribute to developing more robust predictive models to help patients, under the supervision and consultation of physicians, avoid adverse DFI effects when combining drugs and foods for therapy.


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Food-Drug Interactions , Humans , Artificial Intelligence , Machine Learning
3.
J Digit Imaging ; 36(3): 911-922, 2023 06.
Article in English | MEDLINE | ID: mdl-36717518

ABSTRACT

Malignant tumors share some common morphological characteristics. Radiomics is not only images but also data; we hypothesized that a set of radiomics signatures extracted from CT scan images of one cancer type in one specific organ could also be utilized for overall survival prediction in different types of cancers in different organs. The retrospective study enrolled four data sets of cancer patients in three different organs (420, 157, 137, and 191 patients for lung 1 training, lung 2 testing, and two external validation sets: kidney and head and neck, respectively). In the training set, radiomics features were obtained from CT scan images, and essential features were chosen by the LASSO algorithm. Univariable and multivariable analyses were then conducted to find a radiomics signature via Cox proportional hazard regression. Kaplan-Meier analysis was performed based on the risk score. The integrated time-dependent area under the ROC curve (iAUC) was calculated for each predictive model. In the training set, the Kaplan-Meier curves separated patients into high- and low-risk groups (p-value < 0.001; log-rank test). The risk score of the radiomics signature was locked and independently evaluated in the testing set and the two external validation sets, which showed significant differences (p-value < 0.05; log-rank test). A combined model (radiomics + clinical) showed improved iAUCs in the lung 1, lung 2, head and neck, and kidney data sets of 0.621 (95% CI 0.588, 0.654), 0.736 (95% CI 0.654, 0.819), 0.732 (95% CI 0.655, 0.809), and 0.834 (95% CI 0.722, 0.946), respectively. We believe that CT-based radiomics signatures for predicting overall survival in various cancer sites may exist.


Subject(s)
Neoplasms , Humans , Retrospective Studies , Neoplasms/diagnostic imaging , Tomography, X-Ray Computed/methods , Neck , Kidney
4.
J Chem Inf Model ; 62(19): 4820-4826, 2022 10 10.
Article in English | MEDLINE | ID: mdl-36166351

ABSTRACT

Background: SNARE proteins play a vital role in membrane fusion, cellular physiology, and pathological processes. Many potential therapeutics for mental diseases and even cancer based on SNAREs have also been developed. Therefore, there is a pressing need to predict SNAREs for further manipulation of these essential proteins, which demands new and efficient approaches. Methods: Some computational frameworks were proposed to tackle the hurdles of biological methods, which require considerable time and budget to identify SNAREs. However, the performance of existing frameworks was unsatisfactory, as they failed to retain the SNARE sequence order and capture the many hidden features of SNAREs. This paper proposed a novel model built on a multiscan convolutional neural network (CNN) and position-specific scoring matrix (PSSM) profiles to address these limitations. We trained our model on the benchmark dataset with fivefold cross-validation and applied it to two different independent datasets. Results: Overall, the multiscan CNN was cross-validated on the training set and excelled in SNARE classification, reaching 0.963 in AUC and 0.955 in AUPRC. In addition, with a sensitivity, specificity, accuracy, and MCC of 0.842, 0.968, 0.955, and 0.767, respectively, our proposed framework outperformed previous models in the SNARE recognition task. Conclusions: We believe that our model can contribute to the discrimination of SNARE proteins and general proteins.
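For readers unfamiliar with the four threshold metrics reported above, the following is a minimal sketch of how sensitivity, specificity, accuracy, and the Matthews correlation coefficient (MCC) are computed from binary confusion-matrix counts. The function name and the example counts are hypothetical; they are not the paper's data.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from counts.

    tp/fp/tn/fn are true positives, false positives, true negatives,
    and false negatives. Undefined ratios fall back to 0.0.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the formulas.
    for name, value in binary_metrics(tp=8, fp=2, tn=88, fn=2).items():
        print(f"{name}: {value:.3f}")
```

On imbalanced data, accuracy can stay high while MCC is noticeably lower, which is why such studies typically report both.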


Subject(s)
Neural Networks, Computer , SNARE Proteins , Position-Specific Scoring Matrices
5.
Int J Mol Sci ; 22(17)2021 Aug 26.
Article in English | MEDLINE | ID: mdl-34502160

ABSTRACT

Early identification of epidermal growth factor receptor (EGFR) and Kirsten rat sarcoma viral oncogene homolog (KRAS) mutations is crucial for selecting a therapeutic strategy for patients with non-small-cell lung cancer (NSCLC). We proposed a machine learning-based model for feature selection and prediction of EGFR and KRAS mutations in patients with NSCLC by including the smallest number of the most semantic radiomics features. We included a cohort of 161 of 211 patients with NSCLC from The Cancer Imaging Archive (TCIA) and analyzed 161 low-dose computed tomography (LDCT) images for detecting EGFR and KRAS mutations. A total of 851 radiomics features, which were classified into 9 categories, were obtained through manual segmentation and radiomics feature extraction from LDCT. We evaluated our models using a validation set consisting of 18 patients derived from the same TCIA dataset. The results showed that the genetic algorithm plus XGBoost classifier exhibited the most favorable performance, with accuracies of 0.836 and 0.86 for detecting EGFR and KRAS mutations, respectively. We demonstrated that a noninvasive machine learning-based model including the smallest number of the most semantic radiomics signatures could robustly predict EGFR and KRAS mutations in patients with NSCLC.


Subject(s)
Carcinoma, Non-Small-Cell Lung/diagnostic imaging , Carcinoma, Non-Small-Cell Lung/genetics , Lung Neoplasms/diagnostic imaging , Lung Neoplasms/genetics , Machine Learning , Mutation , Proto-Oncogene Proteins p21(ras)/genetics , Aged , Aged, 80 and over , Algorithms , Biomarkers , Carcinoma, Non-Small-Cell Lung/pathology , ErbB Receptors/genetics , Female , Humans , Lung Neoplasms/pathology , Male , Middle Aged , Neoplasm Staging , ROC Curve , Reproducibility of Results , Supervised Machine Learning , Tomography, X-Ray Computed
6.
Med Biol Eng Comput ; 61(10): 2699-2712, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37432527

ABSTRACT

Lower-grade gliomas (LGG) can eventually progress to glioblastoma (GBM) and death. In the context of the transfer learning approach, we aimed to train and test an MRI-based radiomics model for predicting survival in GBM patients and validate it in LGG patients. From each patient's 704 MRI-based radiomics features, we chose seventeen optimal radiomics signatures in the GBM training set (n = 71) and used these features in both the GBM testing set (n = 31) and LGG validation set (n = 107) for further analysis. Each patient's risk score, calculated based on those optimal radiomics signatures, was chosen to represent the radiomics model. We compared the radiomics model with clinical and gene status models and with a combined model integrating radiomics, clinical, and gene status data in predicting survival. The average iAUCs of the combined models in the training, testing, and validation sets were 0.804, 0.878, and 0.802, respectively, and those of the radiomics models were 0.798, 0.867, and 0.717. The average iAUCs of the gene status and clinical models ranged from 0.522 to 0.735 in all three sets. The radiomics model trained in GBM patients can effectively predict the overall survival of GBM and LGG patients, and the combined model improved this ability.


Subject(s)
Brain Neoplasms , Glioblastoma , Glioma , Humans , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/genetics , Retrospective Studies , Glioma/diagnostic imaging , Magnetic Resonance Imaging , Machine Learning
7.
Comput Struct Biotechnol J ; 20: 2112-2123, 2022.
Article in English | MEDLINE | ID: mdl-35832629

ABSTRACT

Over the past decade, polypharmacy has been common in the treatment of multiple diseases. However, unwanted drug-drug interactions (DDIs) that might cause unexpected adverse drug events (ADEs) in multiple-regimen therapy remain a significant issue. Since artificial intelligence (AI) is ubiquitous today, many AI prediction models have been developed to predict DDIs to support clinicians in pharmacotherapy-related decisions. However, even though DDI prediction models have great potential for assisting physicians in polypharmacy decisions, there are still concerns regarding the reliability of AI models due to their black-box nature. Building AI models with explainable mechanisms can augment their transparency to address the above issue. Explainable AI (XAI) promotes safety and clarity by showing how decisions are made in AI models, especially in critical tasks like DDI predictions. This review provides a comprehensive overview of AI-based DDI prediction, including the publicly available sources for AI-DDI studies, the methods used in data manipulation and feature preprocessing, the XAI mechanisms that promote trust in AI, especially for critical tasks such as DDI prediction, and the modeling methods. Limitations and future directions of XAI in DDIs are also discussed.

8.
Cancers (Basel) ; 13(14)2021 Jul 19.
Article in English | MEDLINE | ID: mdl-34298828

ABSTRACT

This study aimed to create a risk score generated from CT-based radiomics signatures that could be used to predict overall survival in patients with non-small cell lung cancer (NSCLC). We retrospectively enrolled three sets of NSCLC patients (336, 84, and 157 patients for the training, testing, and validation sets, respectively). A total of 851 radiomics features were extracted from each patient's CT images for further analyses. The most important features (strongly linked with overall survival) were chosen by pairwise correlation analysis, a Least Absolute Shrinkage and Selection Operator (LASSO) regression model, and univariate Cox proportional hazard regression. Multivariate Cox proportional hazard survival analysis was used to create a risk score for each patient, and Kaplan-Meier analysis was used to separate patients into two groups: high-risk and low-risk. ROC curves were used to assess the ability of the risk score model to predict overall survival compared to clinical parameters. The risk score, which was developed from a model of ten radiomics signatures, was found to be independent of age, gender, and stage for predicting overall survival in NSCLC patients (HR, 2.99; 95% CI, 2.27-3.93; p < 0.001), and its overall survival prediction ability (AUC) was 0.696 (95% CI, 0.635-0.758), 0.705 (95% CI, 0.649-0.762), and 0.657 (95% CI, 0.589-0.726) for 1, 3, and 5 years, respectively, in the training set. The risk score is more likely to have better accuracy in predicting survival at 1, 3, and 5 years than clinical parameters in the training set: the corresponding AUCs were 0.57 (95% CI, 0.499-0.64), 0.552 (95% CI, 0.489-0.616), and 0.621 (95% CI, 0.544-0.689) for age; 0.554, 0.546, and 0.566 for gender; and 0.527, 0.501, and 0.459 for stage. In the training set, the Kaplan-Meier curve revealed that NSCLC patients in the high-risk group had a lower overall survival time than those in the low-risk group (p < 0.001). We also obtained similar, statistically significant results in the testing and validation sets. In conclusion, risk scores developed from a model of ten radiomics signatures have great potential to predict overall survival in NSCLC patients compared to clinical parameters. This model was able to stratify NSCLC patients into high-risk and low-risk groups with respect to overall survival.
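The survival curves mentioned above rest on the Kaplan-Meier product-limit estimator. The following is a minimal, dependency-free sketch of that estimator for a single group; the function name and the toy follow-up data are hypothetical, and this is not the study's analysis code (which would normally use a dedicated survival analysis library and a log-rank test to compare groups).

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival_probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # S(t) = S(previous t) * (1 - d_t / n_t)
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        n_at_risk -= removed
    return curve


if __name__ == "__main__":
    # Hypothetical toy data: the patient at time 2 is censored.
    for t, s in kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]):
        print(t, round(s, 3))
```

To compare high-risk and low-risk groups, one would compute a curve per group; how the risk score cut-off is chosen is not stated in the abstract, so any threshold used with this sketch would be an assumption.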

9.
Cancers (Basel) ; 13(21)2021 Oct 27.
Article in English | MEDLINE | ID: mdl-34771562

ABSTRACT

The prognosis and treatment plans for patients diagnosed with low-grade gliomas (LGGs) may be significantly improved if there is evidence of chromosome 1p/19q codeletion. Many studies have shown that the codeletion status of 1p/19q enhances the sensitivity of the tumor to different types of therapeutics. However, the current clinical gold standard for detecting this chromosomal mutation remains invasive and poses implicit risks to patients. Radiomics features derived from medical images have been used as a new approach for non-invasive diagnosis and clinical decisions. This study proposed an eXtreme Gradient Boosting (XGBoost)-based model to predict the 1p/19q codeletion status in a binary classification task. We trained our model on a public data set extracted from The Cancer Imaging Archive (TCIA), including 159 LGG patients with known 1p/19q codeletion status. XGBoost was the baseline algorithm, and we combined it with SHapley Additive exPlanations (SHAP) analysis to select the seven optimal radiomics features used to build the final predictive model. Our final model achieved an accuracy of 87% and 82.8% on the training set and external test set, respectively. With seven wavelet radiomics features, our XGBoost-based model can identify the 1p/19q codeletion status in LGG-diagnosed patients for better management and address the drawbacks of invasive gold-standard tests in clinical practice.
