1.
Materials (Basel) ; 17(17), 2024 Sep 08.
Article in English | MEDLINE | ID: mdl-39274811

ABSTRACT

We employ machine learning (ML) to predict the yield stress and plastic strain of body-centered cubic (BCC) high-entropy alloys (HEAs) in the compression test. Our machine learning model leverages currently available databases of BCC and BCC+B2 entropy alloys, using feature engineering to capture electronic factors, atomic ordering from mixing enthalpy, and the D parameter related to stacking fault energy. The model achieves low Root Mean Square Errors (RMSE). Utilizing Random Forest Regression (RFR) and Genetic Algorithms for feature selection, our model excels in both predictive accuracy and interpretability. Rigorous 10-fold cross-validation ensures robust generalization. Our discussion delves into feature importance, highlighting key predictors and their impact on mechanical properties. This work is an important step toward designing high-performance structural high-entropy alloys, offering a powerful tool for predicting mechanical properties and identifying new alloys with superior strength and ductility.
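
The evaluation protocol named in this abstract (Random Forest Regression scored by RMSE under 10-fold cross-validation, followed by a look at feature importance) can be sketched roughly as follows. This is a minimal illustration assuming scikit-learn and synthetic data; the feature matrix, target, and hyperparameters are hypothetical stand-ins, not the authors' alloy descriptors or settings, and the genetic-algorithm feature selection is not reproduced.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for an alloy feature table (hypothetical data):
# 200 samples, 5 features, target driven mostly by features 0 and 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = RandomForestRegressor(n_estimators=50, random_state=0)

# 10-fold cross-validation; scikit-learn reports negated RMSE, so flip the sign.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
neg_rmse = cross_val_score(model, X, y, cv=cv,
                           scoring="neg_root_mean_squared_error")
rmse_per_fold = -neg_rmse
mean_rmse = rmse_per_fold.mean()

# Fit on all data to inspect impurity-based feature importances.
model.fit(X, y)
importances = model.feature_importances_
top_two = set(np.argsort(importances)[-2:].tolist())

print(f"mean RMSE over 10 folds: {mean_rmse:.3f}")
print("two most important feature indices:", sorted(top_two))
```

Reporting the per-fold RMSE alongside the mean makes it easier to see whether a low average hides a few poorly predicted folds.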

2.
J Big Data ; 10(1): 55, 2023.
Article in English | MEDLINE | ID: mdl-37193361

ABSTRACT

Background: Multiple organ dysfunction syndrome (MODS) is one of the leading causes of death in critically ill patients. MODS results from a dysregulated inflammatory response that can be triggered by various causes. Because no effective treatment exists for patients with MODS, early identification and intervention are the most effective strategies. We therefore developed a variety of early warning models whose predictions can be interpreted by Kernel SHapley Additive exPlanations (Kernel-SHAP) and reversed by diverse counterfactual explanations (DiCE). This allows us to predict the probability of MODS 12 h in advance, quantify the risk factors, and automatically recommend relevant interventions.

Methods: We used various machine learning algorithms to complete the early risk assessment of MODS, and used a stacked ensemble to improve the prediction performance. The Kernel-SHAP algorithm was used to quantify the positive and negative factors contributing to each individual prediction, and the DiCE method was then used to automatically recommend interventions. We trained and tested the models on the MIMIC-III and MIMIC-IV databases; the sample features used in training included the patients' vital signs, laboratory test results, test reports, and data related to the use of ventilators.

Results: The customizable model called SuperLearner, which integrates multiple machine learning algorithms, had the highest screening validity: its Youden index (YI), sensitivity, accuracy, and utility_score on the MIMIC-IV test set were 0.813, 0.884, 0.893, and 0.763, respectively, each the highest among the eleven models. The deep-wide neural network (DWNN) model achieved an area under the curve of 0.960 and a specificity of 0.935 on the MIMIC-IV test set, both the highest among all models. Kernel-SHAP combined with SuperLearner indicated that the minimum Glasgow Coma Scale (GCS) value in the current hour (OR = 0.609, 95% CI 0.606-0.612), the maximum MODS score corresponding to GCS in the past 24 h (OR = 2.632, 95% CI 2.588-2.676), and the maximum MODS score corresponding to creatinine in the past 24 h (OR = 3.281, 95% CI 3.267-3.295) were generally the most influential factors.

Conclusion: The MODS early warning model based on machine learning algorithms has considerable application value, and the prediction performance of SuperLearner is superior to that of SubSuperLearner, DWNN, and eight other common machine learning models. Because the attribution analysis of Kernel-SHAP is a static analysis of the prediction results, we introduce the DiCE algorithm to automatically recommend counterfactuals that reverse the prediction results, which is an important step towards the practical application of automatic early MODS intervention.

Supplementary Information: The online version contains supplementary material available at 10.1186/s40537-023-00719-2.
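
The screening metrics reported in the Results (sensitivity, specificity, accuracy, and the Youden index, defined as sensitivity + specificity - 1) follow directly from a confusion matrix. The sketch below shows the arithmetic on made-up counts; the function name and the numbers are hypothetical and are not taken from the study, and the paper's utility_score is not reproduced because the abstract does not define it.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute basic screening metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    youden = sensitivity + specificity - 1.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "youden_index": youden,
    }


# Hypothetical counts for illustration only.
metrics = screening_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the Youden index weights sensitivity and specificity equally, a model can lead on it without leading on specificity alone, which is consistent with different models topping different metrics in the abstract.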

3.
Patterns (N Y) ; 3(8): 100536, 2022 Aug 12.
Article in English | MEDLINE | ID: mdl-36033591

ABSTRACT

Single-cell technologies generate large, high-dimensional datasets encompassing a diversity of omics. Dimensionality reduction captures the structure and heterogeneity of the original dataset, creating low-dimensional visualizations that contribute to the human understanding of data. Existing algorithms are typically unsupervised, using measured features to generate manifolds, disregarding known biological labels such as cell type or experimental time point. We repurpose the classification algorithm, linear discriminant analysis (LDA), for supervised dimensionality reduction of single-cell data. LDA identifies linear combinations of predictors that optimally separate a priori classes, enabling the study of specific aspects of cellular heterogeneity. We implement feature selection by hybrid subset selection (HSS) and demonstrate that this computationally efficient approach generates non-stochastic, interpretable axes amenable to diverse biological processes such as differentiation over time and cell cycle. We benchmark HSS-LDA against several popular dimensionality-reduction algorithms and illustrate its utility and versatility for the exploration of single-cell mass cytometry, transcriptomics, and chromatin accessibility data.
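
Using LDA as a supervised dimensionality-reduction step, as this abstract describes, might look like the following minimal sketch. It assumes scikit-learn and synthetic three-class data standing in for labeled single-cell measurements; the hybrid subset selection (HSS) step is not implemented here, and all variable names are illustrative.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic stand-in for labeled cells (hypothetical data):
# three classes of 50 "cells" each, measured on 6 features.
rng = np.random.default_rng(0)
n_per_class, n_features = 50, 6
centers = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [4.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 4.0, 0.0, 0.0, 0.0, 0.0],
])
X = np.vstack([
    rng.normal(loc=c, scale=1.0, size=(n_per_class, n_features))
    for c in centers
])
labels = np.repeat([0, 1, 2], n_per_class)

# With 3 classes, LDA yields at most 2 discriminant axes.
lda = LinearDiscriminantAnalysis(n_components=2)
embedding = lda.fit_transform(X, labels)   # shape: (150, 2)

# Each axis is a linear combination of the input features; the
# coefficients in lda.scalings_ indicate which features drive it.
train_accuracy = lda.score(X, labels)
print("embedding shape:", embedding.shape)
print(f"training accuracy: {train_accuracy:.2f}")
```

Unlike stochastic embeddings, refitting this on the same data returns the same axes, which is the non-stochastic property the abstract highlights.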

4.
Brief Bioinform ; 22(3), 2021 May 20.
Article in English | MEDLINE | ID: mdl-34020542

ABSTRACT

Machine learning methods have been widely applied to big data analysis in genomics and epigenomics research. Although accuracy and efficiency are common goals in many modeling tasks, model interpretability is especially important in these studies for understanding the underlying molecular and cellular mechanisms. Deep neural networks (DNNs) have recently gained popularity in various types of genomic and epigenomic studies due to their ability to utilize large-scale high-throughput bioinformatics data and achieve high accuracy in predictions and classifications. However, DNNs are often criticized for their limited ability to explain their predictions, owing to their black-box nature. In this review, we present current developments in the model interpretation of DNNs, focusing on their applications in genomics and epigenomics. We first describe state-of-the-art DNN interpretation methods in representative machine learning fields. We then summarize the DNN interpretation methods in recent studies on genomics and epigenomics, focusing on current data- and computing-intensive topics such as sequence motif identification, genetic variations, gene expression, chromatin interactions and non-coding RNAs. We also present the biological discoveries that resulted from these interpretation methods. We finally discuss the advantages and limitations of current interpretation approaches in the context of genomic and epigenomic studies. Contact: xiaoman@mail.ucf.edu, haihu@cs.ucf.edu.


Subjects
Deep Learning, Genetic Epigenesis, Genomics, Computer Neural Networks, Chromatin/metabolism, Computational Biology/methods, DNA/genetics, Gene Expression, Protein Binding, RNA/genetics
5.
Artif Intell Med ; 99: 101690, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31606112

ABSTRACT

To gain insight into oligogenic disorders, understanding those involving bi-locus variant combinations appears to be key. In prior work, we showed that features at multiple biological scales can already be used to discriminate between two types, i.e. disorders involving true digenic and modifier combinations. The current study expands this machine learning work towards dual molecular diagnosis cases, providing a classifier able to effectively distinguish between these three types. To reach this goal and gain an in-depth understanding of the decision process, game theory and tree decomposition techniques are applied to random forest predictors to investigate the relevance of feature combinations in the prediction. A machine learning model with high discrimination capabilities was developed, effectively differentiating the three classes in a biologically meaningful manner. Combining prediction interpretation and statistical analysis, we propose a biologically meaningful characterization of each class relying on specific feature strengths. Understanding how biological characteristics shift samples towards one of three classes provides clinically relevant insight into the underlying biological processes as well as the disease itself.


Subjects
Game Theory, Genetic Predisposition to Disease/genetics, Machine Learning, Multifactorial Inheritance/genetics, Decision Trees, Humans
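
The abstract above mentions applying game theory to random forest predictors to assess the relevance of feature combinations. One common game-theoretic attribution is the Shapley value; the sketch below computes exact Shapley values for a small toy value function by averaging marginal contributions over all feature orderings. It illustrates the general idea only: the value function and feature names are hypothetical, and nothing here is claimed to be the authors' actual procedure.

```python
from itertools import permutations


def shapley_values(value, players):
    """Exact Shapley values: average marginal contribution over all orderings."""
    players = list(players)
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical model output given a set of 'present' features.

    Feature "a" adds 1, feature "b" adds 2, having both adds an
    interaction bonus of 2, and feature "c" contributes nothing.
    """
    score = 0.0
    if "a" in coalition:
        score += 1.0
    if "b" in coalition:
        score += 2.0
    if "a" in coalition and "b" in coalition:
        score += 2.0
    return score


phi = shapley_values(toy_value, ["a", "b", "c"])
print(phi)
```

The interaction bonus is split evenly between "a" and "b", and the attributions sum to the value of the full feature set, which is the efficiency property that makes such scores convenient for comparing feature combinations. Enumerating all orderings is only feasible for a handful of features; practical tools rely on approximations or tree-specific algorithms.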