Results 1 - 20 of 1,952
1.
Brief Bioinform ; 25(5), 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39101500

ABSTRACT

Genomic selection (GS) has emerged as an effective technology to accelerate crop hybrid breeding by enabling early selection prior to phenotype collection. Genomic best linear unbiased prediction (GBLUP) is a robust method that has been routinely used in GS breeding programs. However, GBLUP assumes that markers contribute equally to the total genetic variance, which may not be the case. In this study, we developed a novel GS method called GA-GBLUP that leverages the genetic algorithm (GA) to select markers related to the target trait. To improve predictability, we defined four fitness functions for optimization (AIC, BIC, R2, and HAT), and we binned adjacent markers based on the principle of linkage disequilibrium to reduce model dimension. The results demonstrate that the GA-GBLUP model, equipped with the R2 and HAT fitness functions, produces much higher predictability than GBLUP for most traits in rice and maize datasets, particularly for traits with low heritability. Moreover, we have developed a user-friendly R package, GAGBLUP, for GS, and the package is freely available on CRAN (https://CRAN.R-project.org/package=GAGBLUP).
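
The marker-selection idea in this abstract can be illustrated with a minimal genetic algorithm over bit masks. This is only a sketch under stated assumptions: the function names (`pearson_r2`, `fitness`, `ga_select`), the toy fitness (squared correlation of the summed selected markers with the phenotype, minus a per-marker penalty) and all parameter values are hypothetical. They are not taken from the GAGBLUP package or from the paper, whose fitness functions (AIC, BIC, R2, HAT) are defined on a GBLUP model.

```python
import random

def pearson_r2(x, y):
    """Squared Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)

def fitness(mask, genotypes, phenotype, penalty=0.01):
    """Toy fitness (an assumption): R^2 of summed selected markers vs. phenotype,
    minus a small penalty for every marker kept."""
    scores = [sum(g for g, keep in zip(row, mask) if keep) for row in genotypes]
    return pearson_r2(scores, phenotype) - penalty * sum(mask)

def ga_select(genotypes, phenotype, pop_size=30, generations=40,
              mutation_rate=0.05, seed=0):
    """Evolve bit masks (1 = keep marker) and return (best_mask, best_fitness)."""
    rng = random.Random(seed)
    n = len(genotypes[0])  # number of markers; must be at least 2

    def score(mask):
        return fitness(mask, genotypes, phenotype)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = [pop[0][:]]  # elitism: carry the current best forward unchanged
        while len(nxt) < pop_size:
            # Tournament selection of two parents
            p1 = max(rng.sample(pop, 3), key=score)
            p2 = max(rng.sample(pop, 3), key=score)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=score)
    return best, score(best)
```

Because of elitism, the best fitness in the population never decreases from one generation to the next. In practice the toy fitness would be replaced by a model-based criterion evaluated on held-out data.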


Subject(s)
Algorithms, Genomics, Genetic Selection, Zea mays, Genomics/methods, Zea mays/genetics, Oryza/genetics, Genetic Models, Plant Breeding/methods, Linkage Disequilibrium, Phenotype, Quantitative Trait Loci, Plant Genome, Single Nucleotide Polymorphism, Software
2.
Brief Bioinform ; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38388680

ABSTRACT

CRISPR-Cas9 is a groundbreaking genome-editing tool that harnesses bacterial defense systems to alter DNA sequences accurately. This innovative technology holds vast promise in multiple domains such as biotechnology, agriculture and medicine. However, such power does not come without its own peril, and one such issue is the potential for unintended modifications (Off-Target), which highlights the need for accurate prediction and mitigation strategies. Though previous studies have demonstrated improvement in Off-Target prediction capability with the application of deep learning, they often struggle with the precision-recall trade-off, which limits their effectiveness, and they do not provide a proper interpretation of the complex decision-making process of their models. To address these limitations, we have thoroughly explored deep learning networks, particularly recurrent neural network-based models, leveraging their established success in handling sequence data. Furthermore, we have employed a genetic algorithm for hyperparameter tuning to optimize these models' performance. The results from our experiments demonstrate significant performance improvement compared with the current state-of-the-art in Off-Target prediction, highlighting the efficacy of our approach. In addition, using the integrated gradients method, we interpret our models, resulting in a detailed analysis of the underlying factors that contribute to Off-Target predictions, in particular the presence of two sub-regions in the seed region of the single guide RNA, which extends the established biological hypothesis of Off-Target effects. To the best of our knowledge, our model is the first to combine high efficacy, interpretability and a desirable balance between precision and recall.


Subject(s)
CRISPR-Cas Systems, Deep Learning, Gene Editing/methods, Guide RNA (CRISPR-Cas Systems), Computer Neural Networks
3.
Brief Bioinform ; 24(1), 2023 01 19.
Article in English | MEDLINE | ID: mdl-36460620

ABSTRACT

Lysine succinylation is a kind of post-translational modification (PTM) that plays a crucial role in regulating the cellular processes. Aberrant succinylation may cause inflammation, cancers, metabolism diseases and nervous system diseases. The experimental methods to detect succinylation sites are time-consuming and costly. This thus calls for computational models with high efficacy, and attention has been given in the literature to develop such models, albeit with only moderate success in the context of different evaluation metrics. One crucial aspect in this context is the biochemical and physicochemical properties of amino acids, which appear to be useful as features for such computational predictors. However, some of the existing computational models did not use the biochemical and physicochemical properties of amino acids. In contrast, some others used them without considering the inter-dependency among the properties. The combinations of biochemical and physicochemical properties derived through our optimization process achieve better results than the results achieved by combining all the properties. We propose three deep learning architectures: CNN+Bi-LSTM (CBL), Bi-LSTM+CNN (BLC) and their combination (CBL_BLC). We find that CBL_BLC outperforms the other two. Ensembling of different models successfully improves the results. Notably, tuning the threshold of the ensemble classifiers further improves the results. Upon comparing our work with other existing works on two datasets, we successfully achieve better sensitivity and specificity by varying the threshold value.
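
The threshold-tuning step mentioned at the end of this abstract can be sketched as follows. This is a minimal illustration, assuming soft voting (averaged probabilities) for the ensemble and Youden's J as the selection criterion. The function names and the criterion are assumptions made for the example and are not taken from the paper.

```python
def ensemble_probs(model_probs):
    """Average per-sample probabilities across models (simple soft voting)."""
    return [sum(col) / len(col) for col in zip(*model_probs)]

def sensitivity_specificity(probs, labels, threshold):
    """Return (sensitivity, specificity) when predicting 1 for probs >= threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= threshold)
    fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < threshold)
    tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < threshold)
    fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def best_threshold(probs, labels, candidates):
    """Pick the candidate maximizing Youden's J = sensitivity + specificity - 1."""
    def youden(t):
        sens, spec = sensitivity_specificity(probs, labels, t)
        return sens + spec - 1
    return max(candidates, key=youden)
```

Raising the threshold trades sensitivity for specificity, so the chosen value should be tuned on validation data rather than on the test set. The labels must contain at least one positive and one negative sample, otherwise the ratios are undefined.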


Subject(s)
Algorithms, Lysine, Lysine/metabolism, Amino Acids/chemistry, Sensitivity and Specificity, Post-Translational Protein Processing
4.
Brief Bioinform ; 24(2), 2023 03 19.
Article in English | MEDLINE | ID: mdl-36869849

ABSTRACT

Drug resistance is one of the principal limiting factors for cancer treatment. Several mechanisms, especially mutation, have been shown to be implicated in drug resistance. In addition, drug resistance is heterogeneous, which creates an urgent need to explore the personalized driver genes of drug resistance. Here, we propose an approach, DRdriver, to identify drug resistance driver genes in the individual-specific networks of resistant patients. First, we identified the differential mutations for each resistant patient. Next, the individual-specific network, which included the genes with differential mutations and their targets, was constructed. Then, a genetic algorithm was used to identify the drug resistance driver genes, defined as those that regulated the most differentially expressed genes and the fewest non-differentially expressed genes. In total, we identified 1202 drug resistance driver genes for 8 cancer types and 10 drugs. We also demonstrated that the identified driver genes were mutated more frequently than other genes and tended to be associated with the development of cancer and drug resistance. Based on the mutational signatures of all driver genes and the enriched pathways of driver genes in brain lower grade glioma treated with temozolomide, drug resistance subtypes were identified. Additionally, the subtypes showed great diversity in epithelial-mesenchymal transition, DNA damage repair and tumor mutation burden. In summary, this study developed a method, DRdriver, for identifying personalized drug resistance driver genes, which provides a framework for unlocking the molecular mechanism and heterogeneity of drug resistance.
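
The selection criterion described here (cover many differentially expressed targets, few non-differentially expressed ones) can be written as a simple objective that a genetic algorithm could maximize. The sketch below is hypothetical: the name `driver_fitness`, the network representation and the plain "hits minus misses" score are assumptions for illustration, and DRdriver's actual objective may be weighted differently.

```python
def driver_fitness(candidates, targets, de_genes):
    """Score a candidate driver set.

    candidates: iterable of gene names proposed as drivers
    targets:    dict mapping each gene to the set of genes it regulates
    de_genes:   set of differentially expressed genes
    Returns (#regulated DE genes) - (#regulated non-DE genes).
    """
    regulated = set()
    for gene in candidates:
        regulated |= targets.get(gene, set())
    hits = len(regulated & de_genes)      # DE genes covered by the candidate set
    misses = len(regulated - de_genes)    # non-DE genes covered (penalized)
    return hits - misses
```

For example, with `targets = {"A": {"x", "y"}, "B": {"y", "z"}, "C": {"w"}}` and `de_genes = {"x", "y", "z"}`, the set `{"A", "B"}` scores higher than `{"A", "C"}` because it covers every differentially expressed gene without touching `"w"`.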


Subject(s)
Gene Regulatory Networks, Neoplasms, Humans, Neoplasms/drug therapy, Neoplasms/genetics, Neoplasms/pathology, Mutation, Oncogenes, Drug Resistance
5.
Brief Bioinform ; 24(3), 2023 05 19.
Article in English | MEDLINE | ID: mdl-36941113

ABSTRACT

Traditional Chinese medicine (TCM) has accumulated thousands of years of knowledge in herbal therapy, but the use of herbal formulas is still characterized by reliance on personal experience. Due to the complex mechanism of herbal actions, it is challenging to discover effective herbal formulas for diseases by integrating traditional experience and the modern pharmacological mechanisms of multi-target interactions. In this study, we propose a herbal formula prediction approach (TCMFP) that combines the therapeutic experience of TCM, artificial intelligence and network science algorithms to efficiently screen optimal herbal formulas for diseases. It integrates a herb score (Hscore) based on the importance of network targets, a pair score (Pscore) based on empirical learning and a herbal formula predictive score (FmapScore) based on intelligent optimization and a genetic algorithm. The validity of Hscore, Pscore and FmapScore was verified by functional similarity and network topological evaluation. Moreover, TCMFP was used successfully to generate herbal formulae for three diseases, i.e. Alzheimer's disease, asthma and atherosclerosis. Functional enrichment and network analyses indicate the efficacy of the targets of the predicted optimal herbal formulas. The proposed TCMFP may provide a new strategy for the optimization of herbal formulas, TCM herbal therapy and drug development.


Subject(s)
Asthma, Chinese Herbal Drugs, Humans, Chinese Herbal Drugs/therapeutic use, Chinese Herbal Drugs/pharmacology, Artificial Intelligence, Traditional Chinese Medicine/methods, Asthma/drug therapy, Supervised Machine Learning
6.
BMC Bioinformatics ; 25(1): 183, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38724908

ABSTRACT

BACKGROUND: In recent years, gene clustering analysis has become a widely used tool for studying gene functions, efficiently categorizing genes with similar expression patterns to aid in identifying gene functions. Caenorhabditis elegans is commonly used in embryonic research due to its consistent cell lineage from fertilized egg to adulthood. Biologists use 4D confocal imaging to observe gene expression dynamics at the single-cell level. However, on the one hand, the observed tree-shaped time-series datasets have characteristics such as non-pairwise data points between different individuals. On the other hand, the influence of cell type heterogeneity should also be considered during clustering, in order to obtain more biologically significant clustering results. RESULTS: A biclustering model is proposed for tree-shaped single-cell gene expression data of Caenorhabditis elegans. Specifically, a tree-shaped piecewise polynomial function is first employed to fit non-pairwise gene expression time series data. Then, four factors are considered in the objective function: Pearson correlation coefficients capturing gene correlations, p-values from the Kolmogorov-Smirnov test measuring the similarity between cells, gene expression size and bicluster overlapping size. After that, a genetic algorithm is used to optimize the function. CONCLUSION: The results on the small-scale dataset validate the feasibility and effectiveness of our model and are superior to those of existing classical biclustering models. In addition, gene enrichment analysis is employed to assess the results on the complete real dataset, confirming that the discovered biclusters hold significant biological relevance.


Subject(s)
Caenorhabditis elegans, Single-Cell Analysis, Caenorhabditis elegans/genetics, Caenorhabditis elegans/metabolism, Animals, Single-Cell Analysis/methods, Cluster Analysis, Gene Expression Profiling/methods, Algorithms
7.
J Neurophysiol ; 132(1): 136-146, 2024 07 01.
Article in English | MEDLINE | ID: mdl-38863430

ABSTRACT

Deep brain stimulation (DBS) of the subthalamic nucleus (STN) is an effective treatment for Parkinson's disease, but its mechanisms of action remain unclear. Detailed multicompartment computational models of STN neurons are often used to study how DBS electric fields modulate the neurons. However, currently available STN neuron models have some limitations in their biophysical realism. In turn, the goal of this study was to update a detailed rodent STN neuron model originally developed by Gillies and Willshaw in 2006. Our design requirements consisted of explicitly representing an axon connected to the neuron and updating the ion channel distributions based on the experimental literature to match established electrophysiological features of rodent STN neurons. We found that adding an axon to the STN neuron model substantially altered its firing characteristics. We then used a genetic algorithm to optimize biophysical parameters of the model. The updated model exhibited spontaneous firing, action potential shape, hyperpolarization response, and frequency-current curve that aligned well with experimental recordings from STN neurons. Subsequently, we evaluated the general compatibility of the updated biophysics by applying them to 26 different STN neuron morphologies derived from three-dimensional anatomical reconstructions. The different morphologies affected the firing behavior of the model, but the updated biophysics were robustly capable of maintaining the desired electrophysiological features. The new STN neuron model developed in this work offers a valuable tool for studying STN neuron firing properties and may find application in simulating STN local field potentials and analyzing the effects of STN DBS. NEW & NOTEWORTHY: This study presents an anatomically and biophysically realistic rodent STN neuron model. The work showcases the use of a genetic algorithm to optimize the model parameters. We noted a substantial influence of the axon on the electrophysiological characteristics of STN neurons. The updated model offers a valuable tool to investigate the firing of STN neurons and their modulation by intrinsic and/or extrinsic factors.


Subject(s)
Action Potentials, Neurological Models, Neurons, Subthalamic Nucleus, Subthalamic Nucleus/physiology, Subthalamic Nucleus/cytology, Animals, Neurons/physiology, Action Potentials/physiology, Rats, Axons/physiology, Deep Brain Stimulation
8.
BMC Biotechnol ; 24(1): 68, 2024 Sep 27.
Article in English | MEDLINE | ID: mdl-39334143

ABSTRACT

INTRODUCTION: Developing somatic embryogenesis is one of the main steps in successful in vitro propagation and gene transformation in the carrot. However, somatic embryogenesis is influenced by different intrinsic (genetics, genotype, and explant) and extrinsic (e.g., plant growth regulators (PGRs), medium composition, and gelling agent) factors, which cause challenges in developing the somatic embryogenesis protocol. Therefore, optimizing somatic embryogenesis is a tedious, time-consuming, and costly process. Novel data mining approaches through a hybrid of artificial neural networks (ANNs) and optimization algorithms can facilitate modeling and optimizing in vitro culture processes and thereby reduce large experimental treatments and combinations. Carrot is a model plant in genetic engineering and recombinant drug research, and it is therefore an important research plant. In this research, embryogenesis in carrot (Daucus carota L.) was analyzed for the first time using a genetic algorithm (GA) and data mining technology. MATERIALS AND METHODS: In the current study, a data mining approach through multilayer perceptron (MLP) and radial basis function (RBF), two well-known ANNs, was employed to model and predict embryogenic callus production in carrot based on eight input variables: carrot cultivars, agar, magnesium sulfate (MgSO4), calcium dichloride (CaCl2), manganese (II) sulfate (MnSO4), 2,4-dichlorophenoxyacetic acid (2,4-D), 6-benzylaminopurine (BAP), and kinetin (KIN). To confirm the reliability and accuracy of the developed model, the results obtained from the RBF-GA model were tested in the laboratory. RESULTS: The results showed that RBF had better prediction efficiency than MLP. Then, the developed model was linked to a genetic algorithm (GA) to optimize the system. To confirm the reliability and accuracy of the developed model, the result of RBF-GA was experimentally tested in the lab as a validation experiment. The result showed that there was no significant difference between the predicted optimized result and the experimental result. CONCLUSIONS: Generally, the results of this study suggest that data mining through RBF-GA can be considered a robust approach, alongside experimental methods, to model and optimize in vitro culture systems. According to the RBF-GA result, the highest somatic embryogenesis rate (62.5%) can be obtained from the Nantes improved cultivar cultured on medium containing 195.23 mg/l MgSO4, 330.07 mg/l CaCl2, 18.3 mg/l MnSO4, 0.46 mg/l 2,4-D, 0.03 mg/l BAP, and 0.88 mg/l KIN. These results were also confirmed in the laboratory.


Subject(s)
Culture Media, Data Mining, Daucus carota, Plant Somatic Embryogenesis Techniques, Daucus carota/genetics, Daucus carota/embryology, Data Mining/methods, Plant Somatic Embryogenesis Techniques/methods, Culture Media/chemistry, Algorithms, Computer Neural Networks, Plant Growth Regulators/pharmacology
9.
Biostatistics ; 24(2): 295-308, 2023 04 14.
Article in English | MEDLINE | ID: mdl-34494086

ABSTRACT

Support vector regression (SVR) is particularly beneficial when the outcome and predictors are nonlinearly related. However, when many covariates are available, the method's flexibility can lead to overfitting and an overall loss in predictive accuracy. To overcome this drawback, we develop a feature selection method for SVR based on a genetic algorithm that iteratively searches across potential subsets of covariates to find those that yield the best performance according to a user-defined fitness function. We evaluate the performance of our feature selection method for SVR, comparing it to alternate methods including LASSO and random forest, in a simulation study. We find that our method yields higher predictive accuracy than SVR without feature selection. Our method outperforms LASSO when the relationship between covariates and outcome is nonlinear. Random forest performs equivalently to our method in some scenarios, but more poorly when covariates are correlated. We apply our method to predict donor kidney function 1 year after transplant using data from the United Network for Organ Sharing national registry.


Subject(s)
Algorithms, Regression Analysis, Humans, Support Vector Machine
10.
Small ; 20(6): e2305375, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37771186

ABSTRACT

Nanoparticles (NPs) have been employed as drug delivery systems (DDSs) for several decades, primarily as passive carriers, with limited selectivity. However, recent publications have shed light on the emerging phenomenon of NPs exhibiting selective cytotoxicity against cancer cell lines, attributable to distinct metabolic disparities between healthy and pathological cells. This study revisits the concept of NPs selective cytotoxicity, and for the first time proposes a high-throughput in silico screening approach to massive targeted discovery of selectively cytotoxic inorganic NPs. In the first step, this work trains a gradient boosting regression model to predict viability of NP-treated cell lines. The model achieves mean cross-validation (CV) Q2 = 0.80 and root mean square error (RMSE) of 13.6. In the second step, this work develops a machine learning (ML) reinforced genetic algorithm (GA), capable of screening >14 900 candidates/min, to identify the best-performing selectively cytotoxic NPs. As proof-of-concept, DDS candidates for the treatment of liver cancer are screened on HepG2 and hepatocytes cell lines resulting in Ag NPs with selective toxicity score of 42%. This approach opens the door for clinical translation of NPs, expanding their therapeutic application to a wider range of chemical space of NPs and living organisms such as bacteria and fungi.
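
The two regression metrics quoted in this abstract can be computed from observed values and cross-validated predictions as sketched below. This assumes the common definition Q2 = 1 - PRESS/TSS, with the total sum of squares taken around the mean of the observed values. The paper may use a variant (for example, fold-wise training means), so treat the function names and the definition as assumptions.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def q2(observed, cv_predicted):
    """Cross-validated coefficient of determination: 1 - PRESS / TSS.

    cv_predicted[i] must come from a model that did not see sample i in training.
    """
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, cv_predicted))
    tss = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / tss
```

Unlike R2 on training data, Q2 can be negative when the cross-validated predictions are worse than simply predicting the mean.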


Subject(s)
Antineoplastic Agents, Liver Neoplasms, Nanoparticles, Humans, Nanoparticles/chemistry, Machine Learning, Algorithms
11.
Brief Bioinform ; 23(5), 2022 09 20.
Article in English | MEDLINE | ID: mdl-36088543

ABSTRACT

Ensemble learning is a kind of machine learning method that integrates multiple basic learners to achieve higher accuracy. Recently, single machine learning methods have been established to predict survival for patients with cancer. However, a robust ensemble learning model with high accuracy for picking out high-risk patients is still lacking. To achieve this, we proposed a novel genetic algorithm-aided three-stage ensemble learning method (3S score) for survival prediction. During the process of constructing the 3S score, double training sets were used to avoid over-fitting; the gene-pairing method was applied to reduce batch effect; a genetic algorithm was employed to select the best basic learner combination. When used to predict the survival state of glioma patients, this model achieved the highest C-index (0.697) as well as the highest areas under the receiver operating characteristic curve (ROC-AUCs) (first year = 0.705, third year = 0.825 and fifth year = 0.839) in the combined test set (n = 1191), compared with 12 other baseline models. Furthermore, the 3S score can distinguish survival significantly in eight of the nine independent test cohorts (P < 0.05), achieving significant improvement of ROC-AUCs. Notably, ablation experiments demonstrated that the gene-pairing method, double training sets and genetic algorithm ensure the robustness and effectiveness of the 3S score. The performance exploration on pan-cancer data showed that the 3S score has excellent survival prediction ability in five kinds of cancers, which was verified by Cox regression, survival curves and ROC curves together. To enable its clinical adoption, we implemented the 3S score and two other clinical factors as an easy-to-use web tool for risk scoring and therapy stratification in glioma patients.
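
The C-index reported here measures how often a model ranks patients correctly by risk. The sketch below implements Harrell's concordance index in its basic form, under the simplifying assumptions that pairs with tied survival times are skipped and tied risk scores count as half-concordant. The function name and argument layout are illustrative and are not the authors' code.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times:  observed follow-up time per subject
    events: 1 if the event (e.g. death) was observed, 0 if censored
    risks:  predicted risk score (higher = expected to fail sooner)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported 0.697 in context.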


Subject(s)
Glioma, Machine Learning, Glioma/genetics, Humans, ROC Curve, Risk Factors
12.
Brief Bioinform ; 23(1), 2022 01 17.
Article in English | MEDLINE | ID: mdl-34929743

ABSTRACT

Recently, deep learning (DL)-based de novo drug design has become a new trend in pharmaceutical research, and numerous DL-based methods have been developed for the generation of novel compounds with desired properties. However, a comprehensive understanding of the advantages and disadvantages of these methods is still lacking. In this study, the performances of different generative models were evaluated by analyzing the properties of the generated molecules in different scenarios, such as goal-directed (rediscovery, optimization and scaffold hopping of active compounds) and target-specific (generation of novel compounds for a given target) tasks. Overall, the DL-based models have significant advantages over the baseline models built by traditional methods in learning the physicochemical property distributions of the training sets and may be more suitable for target-specific tasks. However, neither the baselines nor the DL-based generative models can fully exploit the scaffolds of the training sets, and the molecules generated by the DL-based methods even have lower scaffold diversity than those generated by the traditional models. Moreover, our assessment illustrates that the DL-based methods do not exhibit obvious advantages over the genetic algorithm-based baselines in goal-directed tasks. We believe that our study provides valuable guidance for the effective use of generative models in de novo drug design.


Subject(s)
Drug Design, Drug Discovery/methods, Algorithms, Deep Learning
13.
J Transl Med ; 22(1): 353, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622716

ABSTRACT

Recent studies have increasingly revealed the connection between metabolic reprogramming and tumor progression. However, the specific impact of metabolic reprogramming on inter-patient heterogeneity and prognosis in lung adenocarcinoma (LUAD) still requires further exploration. Here, we introduced a cellular hierarchy framework according to a malignant and metabolic gene set, named malignant & metabolism reprogramming (MMR), to reanalyze 178,739 single-cell reference profiles. Furthermore, we proposed a three-stage ensemble learning pipeline, aided by genetic algorithm (GA), for survival prediction across 9 LUAD cohorts (n = 2066). Throughout the pipeline of developing the three stage-MMR (3 S-MMR) score, double training sets were implemented to avoid over-fitting; the gene-pairing method was utilized to remove batch effect; GA was harnessed to pinpoint the optimal basic learner combination. The novel 3 S-MMR score reflects various aspects of LUAD biology, provides new insights into precision medicine for patients, and may serve as a generalizable predictor of prognosis and immunotherapy response. To facilitate the clinical adoption of the 3 S-MMR score, we developed an easy-to-use web tool for risk scoring as well as therapy stratification in LUAD patients. In summary, we have proposed and validated an ensemble learning model pipeline within the framework of metabolic reprogramming, offering potential insights for LUAD treatment and an effective approach for developing prognostic models for other diseases.


Subject(s)
Adenocarcinoma of Lung, Lung Neoplasms, Humans, Metabolic Reprogramming, Adenocarcinoma of Lung/genetics, Lung Neoplasms/genetics, Machine Learning, Algorithms, Prognosis
14.
New Phytol ; 2024 Aug 25.
Article in English | MEDLINE | ID: mdl-39183371

ABSTRACT

Phenotypic plasticity describes a genotype's ability to produce different phenotypes in response to different environments. Breeding crops that exhibit appropriate levels of plasticity for future climates will be crucial to meeting global demand, but knowledge of the critical environmental factors is limited to a handful of well-studied major crops. Using 727 maize (Zea mays L.) hybrids phenotyped for grain yield in 45 environments, we investigated the ability of a genetic algorithm and two other methods to identify environmental determinants of grain yield from a large set of candidate environmental variables constructed using minimal assumptions. The genetic algorithm identified pre- and postanthesis maximum temperature, mid-season solar radiation, and whole season net evapotranspiration as the four most important variables from a candidate set of 9150. Importantly, these four variables are supported by previous literature. After calculating reaction norms for each environmental variable, candidate genes were identified and gene annotations investigated to demonstrate how this method can generate insights into phenotypic plasticity. The genetic algorithm successfully identified known environmental determinants of hybrid maize grain yield. This demonstrates that the methodology could be applied to other less well-studied phenotypes and crops to improve understanding of phenotypic plasticity and facilitate breeding crops for future climates.

15.
Chemphyschem ; 25(4): e202300800, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38083816

ABSTRACT

In this work, an unbiased global search with a homemade genetic algorithm was performed to investigate the structural evolution and electronic properties of Snx⁻ (x = 21-35) clusters with density functional theory (DFT) calculations. The ground-state structures of all these Snx⁻ (x = 21-35) clusters have been confirmed by comparing the experimental and simulated photoelectron spectra (PESs). It has been revealed that all Snx⁻ (x = 21-35) clusters are tricapped trigonal prism (TTP)-based structures consisting of two (for sizes x = 21-28) or three (for x = 29-35) TTP units, with the remaining atoms adsorbed on the surface or inserted between TTP units. The gradually decreasing HOMO-LUMO gaps indicate that these clusters are undergoing a semiconductor-to-metal transformation. The average binding energies show that the structural stabilities of Snx⁻ clusters are not as good as those of silicon and germanium clusters. It was found that sizes x = 23, 25, 29 and 33 show high relative stability.

16.
Biotechnol Bioeng ; 121(5): 1583-1595, 2024 May.
Article in English | MEDLINE | ID: mdl-38247359

ABSTRACT

As a non-destructive sensing technique, Raman spectroscopy is often combined with regression models for real-time detection of key components in microbial cultivation processes. However, achieving accurate model predictions often requires a large amount of offline measurement data for training, which is both time-consuming and labor-intensive. In order to overcome the limitations of traditional models that rely on large datasets and complex spectral preprocessing, in addition to the difficulty of training models with limited samples, we have explored a genetic algorithm-based semi-supervised convolutional neural network (GA-SCNN). GA-SCNN integrates unsupervised process spectral labeling, feature extraction, regression prediction, and transfer learning. Using only an extremely small number of offline samples of the target protein, this framework can accurately predict protein concentration, which represents a significant challenge for other models. The effectiveness of the framework has been validated in a system of Escherichia coli expressing recombinant ProA5M protein. By utilizing the labeling technique of this framework, the available dataset for glucose, lactate, ammonium ions, and optical density at 600 nm (OD600) has been expanded from 52 samples to 1302 samples. Furthermore, by introducing a small component of offline detection data for recombinant proteins into the OD600 model through transfer learning, a model for target protein detection has been retrained, providing a new direction for the development of associated models. Comparative analysis with traditional algorithms demonstrates that the GA-SCNN framework exhibits good adaptability when there is no complex spectral preprocessing. Cross-validation results confirm the robustness and high accuracy of the framework, with the predicted values of the model highly consistent with the offline measurement results.


Subject(s)
Escherichia coli, Computer Neural Networks, Fermentation, Escherichia coli/genetics, Algorithms, Recombinant Proteins/genetics
17.
BMC Med Res Methodol ; 24(1): 50, 2024 Feb 27.
Article in English | MEDLINE | ID: mdl-38413856

ABSTRACT

INTRODUCTION: The determination of identity factors such as age and sex has gained significance in both criminal and civil cases. Paranasal sinuses like frontal and maxillary sinuses, are resistant to trauma and can aid profiling. We developed a deep learning (DL) model optimized by an evolutionary algorithm (genetic algorithm/GA) to determine sex and age using paranasal sinus parameters based on cone-beam computed tomography (CBCT). METHODS: Two hundred and forty CBCT images (including 129 females and 111 males, aged 18-52) were included in this study. CBCT images were captured using the Newtom3G device with specific exposure parameters. These images were then analyzed in ITK-SNAP 3.6.0 beta software to extract four paranasal sinus parameters: height, width, length, and volume for both the frontal and maxillary sinuses. A hybrid model, Genetic Algorithm-Deep Neural Network (GADNN), was proposed for feature selection and classification. Traditional statistical methods and machine learning models, including logistic regression (LR), random forest (RF), multilayer perceptron neural network (MLP), and deep learning (DL) were evaluated for their performance. The synthetic minority oversampling technique was used to deal with the unbalanced data. RESULTS: GADNN showed superior accuracy in both sex determination (accuracy of 86%) and age determination (accuracy of 68%), outperforming other models. Also, DL and RF were the second and third superior methods in sex determination (accuracy of 78% and 71% respectively) and age determination (accuracy of 92% and 57%). CONCLUSIONS: The study introduces a novel approach combining DL and GA to enhance sex determination and age determination accuracy. The potential of DL in forensic dentistry is highlighted, demonstrating its efficiency in improving accuracy for sex determination and age determination. The study contributes to the burgeoning field of DL in dentistry and forensic sciences.
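
The synthetic minority oversampling technique mentioned in the methods creates new minority-class samples by interpolating between a sample and one of its nearest minority neighbours. The following is a simplified sketch of that interpolation step using only the standard library. The function name `smote`, its parameters and the brute-force neighbour search are assumptions for illustration. It is not the implementation the authors used, nor the API of any particular library.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points from a list of minority-class vectors.

    Requires at least two minority points. Each synthetic point lies on the
    segment between a random minority point and one of its k nearest
    minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Nearest minority neighbours of x (excluding x itself), by squared
        # Euclidean distance
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        lam = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + lam * (b - a) for a, b in zip(x, neighbour)])
    return synthetic
```

Oversampling should be applied to the training split only, so that synthetic points derived from test samples do not leak into model fitting.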


Subject(s)
Deep Learning , Male , Female , Humans , Cone-Beam Computed Tomography/methods , Maxillary Sinus/diagnostic imaging , Software , Neural Networks, Computer
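
The abstract above describes a genetic algorithm used for feature selection ahead of a classifier but gives no implementation details. The following is only a minimal sketch of that general idea, not the authors' GADNN: it assumes synthetic toy data, a nearest-centroid classifier standing in for the deep neural network, and a small per-feature penalty in the fitness. All function names and parameter values (`make_toy_data`, `centroid_accuracy`, `ga_select`, population size, mutation rate) are hypothetical.

```python
import random


def make_toy_data(n_samples=120, n_features=8, informative=(0, 3), seed=0):
    """Synthetic two-class data; only the `informative` columns depend on the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # shift informative features by class
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X, y, mask):
    """Training accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    sums = {0: [0.0] * len(cols), 1: [0.0] * len(cols)}
    counts = {0: 0, 1: 0}
    for row, label in zip(X, y):
        counts[label] += 1
        for k, j in enumerate(cols):
            sums[label][k] += row[j]
    centroids = {c: [s / max(counts[c], 1) for s in sums[c]] for c in (0, 1)}
    correct = 0
    for row, label in zip(X, y):
        dists = {
            c: sum((row[j] - centroids[c][k]) ** 2 for k, j in enumerate(cols))
            for c in (0, 1)
        }
        correct += min(dists, key=dists.get) == label
    return correct / len(y)


def ga_select(X, y, pop_size=20, generations=25, mutation_rate=0.1,
              penalty=0.01, seed=1):
    """Evolve binary feature masks: tournament selection, one-point crossover,
    bit-flip mutation, and elitism (the best mask always survives)."""
    rng = random.Random(seed)
    n = len(X[0])
    cache = {}

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            # Reward accuracy, lightly penalise the number of selected features.
            cache[key] = centroid_accuracy(X, y, mask) - penalty * sum(mask)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        new_pop = [best[:]]
        while len(new_pop) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randint(1, n - 1)
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, score = ga_select(X, y)
    print("selected feature mask:", mask, "fitness:", round(score, 3))
```

In a real study the fitness would normally be estimated on held-out data (for example by cross-validation) rather than on the training set used here for brevity.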
18.
J Chem Inf Model ; 64(6): 1794-1805, 2024 03 25.
Article in English | MEDLINE | ID: mdl-38485516

ABSTRACT

As the number of determined and predicted protein structures and the size of drug-like 'make-on-demand' libraries soar, the time-consuming nature of structure-based computer-aided drug design calls for innovative computational algorithms. De novo drug design introduces in silico heuristics to accelerate searching in the vast chemical space. This review focuses on recent advances in structure-based de novo drug design, ranging from conventional fragment-based methods, evolutionary algorithms, and Metropolis Monte Carlo methods to deep generative models. Because de novo drug design has historically struggled to generate readily available drug-like molecules, we highlight the synthetic accessibility efforts in each category and the benchmarking strategies used to validate the proposed frameworks.


Subject(s)
Algorithms , Drug Design
19.
Environ Res ; 244: 117964, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38135102

ABSTRACT

In this study, we evaluate the efficiency of two novel nanostructured adsorbents - chitosan-graphitic carbon nitride@magnetite (CS-g-CN@Fe₃O₄) and graphitic carbon nitride@copper/zinc nanocomposite (g-CN@Cu/Zn NC) - for the rapid removal of methylparaben (MPB) from water. Our characterization methods, aimed at understanding the adsorbents' structures and surface areas, informed our systematic examination of influential parameters including sonication time, adsorbent dosage, initial MPB concentration, and temperature. We applied advanced modeling techniques, such as response surface methodology (RSM), generalized regression neural network (GRNN), and radial basis function neural network (RBFNN), to evaluate the adsorption process. The adsorbents proved highly effective, achieving maximum adsorption capacities of 255 mg g⁻¹ for CS-g-CN@Fe₃O₄ and 218 mg g⁻¹ for g-CN@Cu/Zn NC. Through genetic algorithm (GA) optimization, we identified the optimal conditions for the highest MPB removal efficiency: a sonication period of 12.00 min and an adsorbent dose of 0.010 g for CS-g-CN@Fe₃O₄ NC, with an MPB concentration of 17.20 mg L⁻¹ at 42.85 °C; and a sonication time of 10.25 min and a 0.011 g dose for g-CN@Cu/Zn NC, with an MPB concentration of 13.45 mg L⁻¹ at 36.50 °C. The predictive accuracy of the RBFNN and GRNN models was confirmed to be satisfactory. Our findings demonstrate the significant capabilities of these synthesized adsorbents in effectively removing MPB from water, paving the way for optimized applications in water purification.


Subject(s)
Graphite , Nitrogen Compounds , Parabens , Water Pollutants, Chemical , Water Purification , Copper/chemistry , Temperature , Water/chemistry , Adsorption , Water Pollutants, Chemical/chemistry , Kinetics , Hydrogen-Ion Concentration , Water Purification/methods
20.
Network ; : 1-28, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38647219

ABSTRACT

Brain tumours can be cured if they are screened early and patients receive timely treatment. This work proposes a transform- and windowing-based optimization strategy for detecting and segmenting the tumour region in brain images. The proposed pipeline comprises preprocessing, transformation, feature extraction, feature optimization, classification, and segmentation. The Gabor transform is first applied to the brain test image to convert pixels from the spatial domain into a multi-resolution domain. Multi-level features are then extracted from the Gabor-transformed brain image. Next, a Genetic Algorithm (GA) optimizes the extracted features, and a Neuro Fuzzy System (NFS) classifies the resulting optimized features. Finally, the tumour region in the brain images is located and segmented using the normalized segmentation algorithm. The proposed GA-based NFS classification approach is evaluated in terms of sensitivity, specificity, and accuracy. The experiments yielded averages of 99.37% sensitivity, 98.9% specificity, 99.21% accuracy, 97.8% PPV, 91.8% NPV, 96.8% FPR, and 90.4% FNR.
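
The abstract mentions applying a Gabor transform to the image before feature extraction but does not specify the filter parameters. As a rough illustration only, the sketch below builds the real part of a standard 2-D Gabor kernel and applies a small bank of orientations to an image stored as a list of lists. The function names (`gabor_kernel`, `filter_valid`, `gabor_bank_responses`) and every parameter value are assumptions made for this example, not details taken from the paper.

```python
import math


def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor filter on a size x size grid (size must be odd)."""
    if size % 2 == 0:
        raise ValueError("size must be odd")
    half = size // 2
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid runs along direction theta.
            x_rot = x * cos_t + y * sin_t
            y_rot = -x * sin_t + y * cos_t
            envelope = math.exp(-(x_rot ** 2 + (gamma * y_rot) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * x_rot / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_valid(image, kernel):
    """Slide the kernel over the image (cross-correlation, no padding)."""
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    out = []
    for i in range(rows - k + 1):
        out_row = []
        for j in range(cols - k + 1):
            acc = 0.0
            for u in range(k):
                for v in range(k):
                    acc += image[i + u][j + v] * kernel[u][v]
            out_row.append(acc)
        out.append(out_row)
    return out


def gabor_bank_responses(image, orientations=4, size=5, sigma=2.0, wavelength=4.0):
    """One response map per orientation, evenly spaced over [0, pi)."""
    thetas = [n * math.pi / orientations for n in range(orientations)]
    return [filter_valid(image, gabor_kernel(size, sigma, t, wavelength)) for t in thetas]
```

Summary statistics of each response map (for example its mean and variance) are one common way to turn such a filter bank into a feature vector that a later optimization or classification stage could consume.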
