1.
Brief Bioinform ; 20(3): 985-994, 2019 05 21.
Article in English | MEDLINE | ID: mdl-29112707

ABSTRACT

MOTIVATION: One of the main challenges in machine learning (ML) is choosing an appropriate normalization method. Here, we examine the effect of various normalization methods on analyzing FPKM upper quartile (FPKM-UQ) RNA sequencing data sets. We collect the HTSeq-FPKM-UQ files of patients with colon adenocarcinoma from the TCGA-COAD project. We compare the three most common normalization methods (scaling, standardizing using z-score, and vector normalization) by visualizing the normalized data set and evaluating the performance of 12 supervised learning algorithms on the normalized data set. Additionally, for each of these normalization methods, we use two different normalization strategies: normalizing samples (files) or normalizing features (genes). RESULTS: Regardless of the normalization method, a support vector machine (SVM) model with the radial basis function kernel had the maximum accuracy (78%) in predicting the vital status of the patients. However, the fitting time of the SVM depended on the normalization method, and it reached its minimum when files were normalized to unit length. Furthermore, among all 12 learning algorithms and 6 normalization techniques, the Bernoulli naive Bayes model after standardizing files had the best performance in terms of maximizing the accuracy as well as minimizing the fitting time. We also investigated the effect of dimensionality reduction methods on the performance of the supervised ML algorithms. Reducing the dimension of the data set did not increase the maximum accuracy of 78%. However, it led to the discovery of 7SK RNA gene expression as a predictor of survival in patients with colon adenocarcinoma, with an accuracy of 78%.
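
The three normalization methods and two strategies named in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the row/column layout (rows as samples, columns as genes), and the use of the population standard deviation for the z-score are all assumptions made for illustration.

```python
import math

def min_max_scale(values):
    """Scaling: map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant vector: no spread to scale
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Standardizing: subtract the mean, divide by the (population) std."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

def unit_length(values):
    """Vector normalization: divide by the Euclidean (L2) norm."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0:
        return [0.0 for _ in values]
    return [v / norm for v in values]

def normalize(matrix, method, by="samples"):
    """Apply `method` per row (samples/files) or per column (features/genes).

    `matrix` is assumed to be a list of rows, one row per sample.
    """
    if by == "samples":
        return [method(list(row)) for row in matrix]
    if by == "features":
        columns = [method(list(col)) for col in zip(*matrix)]
        return [list(row) for row in zip(*columns)]
    raise ValueError("by must be 'samples' or 'features'")

# Toy expression matrix (made-up numbers): 2 samples x 2 genes.
toy = [[3.0, 4.0], [0.0, 5.0]]
by_sample = normalize(toy, unit_length, by="samples")
by_feature = normalize(toy, min_max_scale, by="features")
```

Three methods times two strategies gives the six normalization techniques the abstract refers to; which one works best for a given classifier is an empirical question, as the reported results suggest.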


Subjects
Adenocarcinoma/pathology , Algorithms , Colonic Neoplasms/pathology , Machine Learning , RNA/genetics , Sequence Analysis, RNA/methods , Adenocarcinoma/genetics , Colonic Neoplasms/genetics , Female , Humans , Male , Survival Analysis
2.
Chemosphere ; 352: 141328, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38296215

ABSTRACT

Due to the extensive use of tetracycline antibiotics (TCs) to treat various infectious diseases in humans and animals, their presence in the environment has created many challenges for human societies. Therefore, providing green and cost-effective solutions for their effective removal has become an urgent need. Here, we introduce 2D/2D p-n heterostructures that exhibit excellent sonophotocatalytic/photocatalytic properties for water-soluble pollutant removal. In this contribution, for the first time, β-Ni(OH)2 nanosheets were synthesized through visible-light-induced photodeposition of different amounts of nickel on ZnO nanosheets (β-Ni(x)/ZNs) to fabricate 2D/2D p-n heterostructures. The PXRD patterns confirmed the formation of the wurtzite phase for ZNs and the hexagonal crystal structure of β-Ni(OH)2. The FESEM and TEM micrographs showed that the β-Ni(OH)2 sheets were dispersed on the surface of ZNs and formed a 2D/2D p-n heterojunction in the β-Ni(x)/ZNs samples. With the photodeposition of β-Ni(OH)2 nanosheets on ZNs, the surface area, pore volume, and pore diameter of the β-Ni(x)/ZNs heterostructures increased compared to ZNs, which can have a positive effect on the sonophotocatalytic/photocatalytic performance of ZNs. The degradation experiments showed that β-Ni(0.1)/ZNs and β-Ni(0.4)/ZNs have the highest degradation percentage in photocatalytic (51%) and sonophotocatalytic (71%) degradation of TC, respectively. Finally, the sonophotocatalytic/photocatalytic degradation process of TC was systematically validated through modeling with three supervised machine learning algorithms: Support Vector Regression (SVR), Artificial Neural Networks (ANNs), and Stochastic Gradient Boosting (SGB). Five statistical criteria (R², SAE, MSE, SSE, and RMSE) were calculated for model validation. The developed SGB algorithm was the most reliable model for predicting the degradation percentage of TC. The results revealed that using the fabricated 2D/2D p-n heterojunctions (β-Ni(x)/ZNs) is more sustainable than conventional ZnO photocatalytic systems in practical applications.
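
The five validation criteria named in the abstract above can be computed as in the following minimal sketch. The abstract does not define them, so the usual textbook definitions are assumed here (SAE as the sum of absolute errors, SSE as the sum of squared errors); the function name and the sample numbers are hypothetical and are not data from the study.

```python
import math

def regression_metrics(observed, predicted):
    """Return R^2, SAE, MSE, SSE and RMSE for paired observations.

    Assumed definitions: SSE = sum of squared residuals, SAE = sum of
    absolute residuals, MSE = SSE / n, RMSE = sqrt(MSE),
    R^2 = 1 - SSE / SST with SST the total sum of squares.
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    sse = sum(r * r for r in residuals)
    sae = sum(abs(r) for r in residuals)
    mse = sse / n
    rmse = math.sqrt(mse)
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - sse / sst if sst > 0 else float("nan")
    return {"R2": r2, "SAE": sae, "MSE": mse, "SSE": sse, "RMSE": rmse}

# Made-up degradation percentages, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(observed, predicted)
```

Lower SAE, MSE, SSE, and RMSE and a higher R² indicate a closer fit, so comparing models such as SVR, ANN, and SGB amounts to computing this dictionary for each model on the same held-out data.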


Subjects
Zinc Oxide , Humans , Zinc Oxide/chemistry , Nickel/chemistry , Anti-Bacterial Agents/chemistry , Tetracycline , Neural Networks, Computer
3.
MethodsX ; 10: 102059, 2023.
Article in English | MEDLINE | ID: mdl-36851982

ABSTRACT

Predictive models are statistical representations that indicate, based on the historical data analysis, the probability of triggering a given phenomenon in the future. In geosciences, such models have been essential to predict the occurrence of adverse phenomena commonly associated with environmental disasters, such as gully erosion. Therefore, this paper presents a method for producing gully erosion predictive models based on geoenvironmental data and machine learning techniques. The method's effectiveness test was produced in a region of approximately 40,000 km² in southeastern Brazil and compared the predictive performance of four models designed with different machine learning algorithms. The results demonstrated that the technique is capable of producing models with high predictive ability, with emphasis on the random forest algorithm, which, in addition to having achieved the highest levels of accuracy, also produced highly realistic maps for the study area.
• The method is straightforward and may be applied to predict other geological processes.
• The application of the method does not require knowledge of programming language.
• The models produced achieved high predictive performance.

4.
Sci Justice ; 62(3): 288-309, 2022 05.
Article in English | MEDLINE | ID: mdl-35598923

ABSTRACT

Sex estimation standards are population specific; however, we argue that machine learning (ML) techniques may enhance biological sex determination in trans-population applications. Linear discriminant analysis (LDA) was compared with nine ML algorithms, including quadratic discriminant analysis (QDA), support vector machine (SVM), Decision Tree (DT), Gaussian process (GPC), Naïve Bayesian (NBC), K-Nearest Neighbor (KNN), Random Forest (RFM) and Adaptive boosting (Adaboost). The experiments involve two contemporary populations: Turkish (n = 300) and Egyptian (n = 100), used for training and validation, respectively. Base models were calibrated using isotonic and sigmoid calibration schemes. Results were analyzed at posterior probability (pp) thresholds >0.95 and >0.80. At pp = 0.5, ML algorithms yielded comparable accuracies in the training (90% to 97%) and test sets (81% to 88%), which were not modified after employing the calibration techniques. At pp >0.95, the raw RFM, LDA, QDA, and SVM models showed the best performance; however, calibration techniques improved the performance of various classifiers, especially NBC and Adaboost. By contrast, the performance of the GPC, KNN, and QDA models was worsened by calibration. RFM showed the best performance among all models at both thresholds, whereas LDA benefited the most from using both calibration methods at pp >0.80. Complex ML models do not necessarily achieve better performance metrics. LDA and QDA remain the fastest and simplest classifiers. We demonstrated the capability of enhancing sex estimation using ML on an independent population sample; however, differences in the underlying probability distribution generated by the models were detected, which warrant more cautious application by forensic practitioners.
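
Evaluating a classifier at a posterior probability threshold, as described in the abstract above, can be sketched as follows. This is an assumed reading rather than the authors' procedure: an individual is classified only when the larger of the two class posteriors exceeds the threshold and is otherwise left indeterminate, and accuracy is computed over the classified individuals only. The function name, the 0/1 label coding, and the sample values are hypothetical.

```python
def evaluate_at_threshold(posteriors, true_labels, threshold):
    """Accuracy and coverage when classifying only confident cases.

    posteriors:  P(class 1) for each individual (binary problem assumed).
    true_labels: 0 or 1 for each individual.
    threshold:   classify only if max(p, 1 - p) > threshold (strict).
    Returns (accuracy over classified cases or None, fraction classified).
    """
    if not posteriors or len(posteriors) != len(true_labels):
        raise ValueError("inputs must be non-empty and of equal length")
    classified = 0
    correct = 0
    for p, label in zip(posteriors, true_labels):
        confidence = max(p, 1.0 - p)
        if confidence > threshold:
            classified += 1
            predicted = 1 if p >= 0.5 else 0
            if predicted == label:
                correct += 1
    coverage = classified / len(posteriors)
    accuracy = correct / classified if classified else None
    return accuracy, coverage

# Made-up posteriors and labels, for illustration only.
posteriors = [0.99, 0.97, 0.85, 0.60, 0.10, 0.03, 0.45, 0.90]
labels = [1, 1, 1, 0, 0, 0, 1, 0]
acc_95, cov_95 = evaluate_at_threshold(posteriors, labels, 0.95)
acc_80, cov_80 = evaluate_at_threshold(posteriors, labels, 0.80)
```

Raising the threshold typically trades coverage for accuracy, which is why calibration matters here: a model whose posteriors are poorly calibrated may classify too few or too many individuals at a given threshold.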


Subjects
Algorithms , Support Vector Machine , Bayes Theorem , Egypt , Femur , Humans