Results 1 - 20 of 1,972
1.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39101500

ABSTRACT

Genomic selection (GS) has emerged as an effective technology to accelerate crop hybrid breeding by enabling early selection prior to phenotype collection. Genomic best linear unbiased prediction (GBLUP) is a robust method that has been routinely used in GS breeding programs. However, GBLUP assumes that markers contribute equally to the total genetic variance, which may not be the case. In this study, we developed a novel GS method called GA-GBLUP that leverages the genetic algorithm (GA) to select markers related to the target trait. We defined four fitness functions for optimization, including AIC, BIC, R2, and HAT, to improve the predictability and bin adjacent markers based on the principle of linkage disequilibrium to reduce model dimension. The results demonstrate that the GA-GBLUP model, equipped with R2 and HAT fitness function, produces much higher predictability than GBLUP for most traits in rice and maize datasets, particularly for traits with low heritability. Moreover, we have developed a user-friendly R package, GAGBLUP, for GS, and the package is freely available on CRAN (https://CRAN.R-project.org/package=GAGBLUP).


Subjects
Algorithms, Genomics, Genetic Selection, Zea mays, Genomics/methods, Zea mays/genetics, Oryza/genetics, Genetic Models, Plant Breeding/methods, Linkage Disequilibrium, Phenotype, Quantitative Trait Loci, Plant Genome, Single Nucleotide Polymorphism, Software
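
The abstract above lists AIC and BIC among the fitness functions used by the genetic algorithm. As a minimal sketch of how such criteria trade goodness of fit against model size, assuming a Gaussian linear model whose residual sum of squares (RSS) is already known, the two scores can be computed as below. The function names and the toy numbers are hypothetical and are not taken from the GAGBLUP package.

```python
import math

def gaussian_aic(rss, n, k):
    """AIC for a Gaussian model, up to an additive constant: n*ln(RSS/n) + 2k."""
    return n * math.log(rss / n) + 2 * k

def gaussian_bic(rss, n, k):
    """BIC for a Gaussian model, up to an additive constant: n*ln(RSS/n) + k*ln(n)."""
    return n * math.log(rss / n) + k * math.log(n)

# Hypothetical comparison: the larger marker subset fits slightly better
# (lower RSS) but pays a penalty for its extra parameters.
n_samples = 200
small = {"rss": 120.0, "k": 10}
large = {"rss": 115.0, "k": 40}

small_bic = gaussian_bic(small["rss"], n_samples, small["k"])
large_bic = gaussian_bic(large["rss"], n_samples, large["k"])
```

Lower values are better for both criteria, so a GA that maximizes fitness would typically use the negated score. In this toy case the smaller subset wins under both AIC and BIC.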
2.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38388680

ABSTRACT

CRISPR Cas-9 is a groundbreaking genome-editing tool that harnesses bacterial defense systems to alter DNA sequences accurately. This innovative technology holds vast promise in multiple domains like biotechnology, agriculture and medicine. However, such power does not come without its own peril, and one such issue is the potential for unintended modifications (Off-Target), which highlights the need for accurate prediction and mitigation strategies. Though previous studies have demonstrated improvement in Off-Target prediction capability with the application of deep learning, they often struggle with the precision-recall trade-off, limiting their effectiveness and do not provide proper interpretation of the complex decision-making process of their models. To address these limitations, we have thoroughly explored deep learning networks, particularly the recurrent neural network based models, leveraging their established success in handling sequence data. Furthermore, we have employed genetic algorithm for hyperparameter tuning to optimize these models' performance. The results from our experiments demonstrate significant performance improvement compared with the current state-of-the-art in Off-Target prediction, highlighting the efficacy of our approach. Furthermore, leveraging the power of the integrated gradient method, we make an effort to interpret our models resulting in a detailed analysis and understanding of the underlying factors that contribute to Off-Target predictions, in particular the presence of two sub-regions in the seed region of single guide RNA which extends the established biological hypothesis of Off-Target effects. To the best of our knowledge, our model can be considered as the first model combining high efficacy, interpretability and a desirable balance between precision and recall.


Subjects
CRISPR-Cas Systems, Deep Learning, Gene Editing/methods, CRISPR-Cas Systems Guide RNA, Computer Neural Networks
3.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36460620

ABSTRACT

Lysine succinylation is a kind of post-translational modification (PTM) that plays a crucial role in regulating the cellular processes. Aberrant succinylation may cause inflammation, cancers, metabolism diseases and nervous system diseases. The experimental methods to detect succinylation sites are time-consuming and costly. This thus calls for computational models with high efficacy, and attention has been given in the literature to develop such models, albeit with only moderate success in the context of different evaluation metrics. One crucial aspect in this context is the biochemical and physicochemical properties of amino acids, which appear to be useful as features for such computational predictors. However, some of the existing computational models did not use the biochemical and physicochemical properties of amino acids. In contrast, some others used them without considering the inter-dependency among the properties. The combinations of biochemical and physicochemical properties derived through our optimization process achieve better results than the results achieved by combining all the properties. We propose three deep learning architectures: CNN+Bi-LSTM (CBL), Bi-LSTM+CNN (BLC) and their combination (CBL_BLC). We find that CBL_BLC outperforms the other two. Ensembling of different models successfully improves the results. Notably, tuning the threshold of the ensemble classifiers further improves the results. Upon comparing our work with other existing works on two datasets, we successfully achieve better sensitivity and specificity by varying the threshold value.


Subjects
Algorithms, Lysine, Lysine/metabolism, Amino Acids/chemistry, Sensitivity and Specificity, Post-Translational Protein Processing
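
The abstract above reports that tuning the decision threshold of the ensemble classifiers changes sensitivity and specificity. A minimal sketch of that trade-off follows, assuming the ensemble is a plain average of per-model probabilities (soft voting). The abstract does not state the combination rule, and all data here are hypothetical.

```python
def ensemble_scores(model_probs):
    """Average per-sample probabilities across models (simple soft voting)."""
    n_models = len(model_probs)
    return [sum(col) / n_models for col in zip(*model_probs)]

def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when score >= threshold means positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical outputs of two models on six candidate sites (1 = succinylated).
probs = [
    [0.9, 0.6, 0.4, 0.3, 0.7, 0.1],
    [0.7, 0.4, 0.6, 0.1, 0.5, 0.3],
]
labels = [1, 1, 1, 0, 0, 0]
scores = ensemble_scores(probs)

low_threshold = sensitivity_specificity(scores, labels, 0.45)
high_threshold = sensitivity_specificity(scores, labels, 0.70)
```

On this toy data the lower threshold recovers every positive site at the cost of one false positive, while the higher threshold removes the false positive but misses two positives.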
4.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36869849

ABSTRACT

Drug resistance is one of the principal limiting factors in cancer treatment. Several mechanisms, especially mutation, have been implicated in drug resistance. In addition, drug resistance is heterogeneous, which creates an urgent need to identify personalized driver genes of drug resistance. Here, we propose an approach, DRdriver, to identify drug resistance driver genes in the individual-specific networks of resistant patients. First, we identified the differential mutations for each resistant patient. Next, the individual-specific network, which included the genes with differential mutations and their targets, was constructed. Then, a genetic algorithm was used to identify the drug resistance driver genes, which regulated the most differentially expressed genes and the fewest non-differentially expressed genes. In total, we identified 1202 drug resistance driver genes for 8 cancer types and 10 drugs. We also demonstrated that the identified driver genes were mutated more frequently than other genes and tended to be associated with the development of cancer and drug resistance. Based on the mutational signatures of all driver genes and the enriched pathways of driver genes in brain lower grade glioma treated with temozolomide, drug resistance subtypes were identified. Additionally, the subtypes showed great diversity in epithelial-mesenchymal transition, DNA damage repair and tumor mutation burden. In summary, this study developed a method, DRdriver, for identifying personalized drug resistance driver genes, which provides a framework for unlocking the molecular mechanisms and heterogeneity of drug resistance.


Subjects
Gene Regulatory Networks, Neoplasms, Humans, Neoplasms/drug therapy, Neoplasms/genetics, Neoplasms/pathology, Mutation, Oncogenes, Drug Resistance
5.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-36941113

ABSTRACT

Traditional Chinese medicine (TCM) has accumulated thousands of years of knowledge in herbal therapy, but the use of herbal formulas is still characterized by reliance on personal experience. Due to the complex mechanisms of herbal action, it is challenging to discover effective herbal formulas for diseases by integrating traditional experience with the modern pharmacological mechanisms of multi-target interactions. In this study, we propose a herbal formula prediction approach (TCMFP) that combines the therapeutic experience of TCM, artificial intelligence and network science algorithms to screen optimal herbal formulas for diseases efficiently. It integrates a herb score (Hscore) based on the importance of network targets, a pair score (Pscore) based on empirical learning and a herbal formula predictive score (FmapScore) based on intelligent optimization and a genetic algorithm. The validity of Hscore, Pscore and FmapScore was verified by functional similarity and network topological evaluation. Moreover, TCMFP was used successfully to generate herbal formulas for three diseases: Alzheimer's disease, asthma and atherosclerosis. Functional enrichment and network analysis indicate the efficacy of the targets of the predicted optimal herbal formulas. The proposed TCMFP may provide a new strategy for the optimization of herbal formulas, TCM herbal therapy and drug development.


Subjects
Asthma, Chinese Herbal Drugs, Humans, Chinese Herbal Drugs/therapeutic use, Chinese Herbal Drugs/pharmacology, Artificial Intelligence, Traditional Chinese Medicine/methods, Asthma/drug therapy, Supervised Machine Learning
6.
BMC Bioinformatics ; 25(1): 183, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38724908

ABSTRACT

BACKGROUND: In recent years, gene clustering analysis has become a widely used tool for studying gene functions, efficiently categorizing genes with similar expression patterns to aid in identifying gene functions. Caenorhabditis elegans is commonly used in embryonic research due to its consistent cell lineage from fertilized egg to adulthood. Biologists use 4D confocal imaging to observe gene expression dynamics at the single-cell level. However, on the one hand, the observed tree-shaped time-series datasets have characteristics such as non-pairwise data points between different individuals. On the other hand, the influence of cell type heterogeneity should also be considered during clustering, in order to obtain more biologically significant clustering results. RESULTS: A biclustering model is proposed for tree-shaped single-cell gene expression data of Caenorhabditis elegans. Specifically, a tree-shaped piecewise polynomial function is first employed to fit non-pairwise gene expression time series data. Then, four factors are considered in the objective function: Pearson correlation coefficients capturing gene correlations, p-values from the Kolmogorov-Smirnov test measuring the similarity between cells, gene expression size and bicluster overlapping size. After that, a genetic algorithm is used to optimize the function. CONCLUSION: The results on the small-scale dataset validate the feasibility and effectiveness of our model and are superior to those of existing classical biclustering models. In addition, gene enrichment analysis is employed to assess the results on the complete real dataset, confirming that the discovered biclusters hold significant biological relevance.


Subjects
Caenorhabditis elegans, Single-Cell Analysis, Caenorhabditis elegans/genetics, Caenorhabditis elegans/metabolism, Animals, Single-Cell Analysis/methods, Cluster Analysis, Gene Expression Profiling/methods, Algorithms
7.
J Neurophysiol ; 132(1): 136-146, 2024 07 01.
Article in English | MEDLINE | ID: mdl-38863430

ABSTRACT

Deep brain stimulation (DBS) of the subthalamic nucleus (STN) is an effective treatment for Parkinson's disease, but its mechanisms of action remain unclear. Detailed multicompartment computational models of STN neurons are often used to study how DBS electric fields modulate the neurons. However, currently available STN neuron models have some limitations in their biophysical realism. In turn, the goal of this study was to update a detailed rodent STN neuron model originally developed by Gillies and Willshaw in 2006. Our design requirements consisted of explicitly representing an axon connected to the neuron and updating the ion channel distributions based on the experimental literature to match established electrophysiological features of rodent STN neurons. We found that adding an axon to the STN neuron model substantially altered its firing characteristics. We then used a genetic algorithm to optimize biophysical parameters of the model. The updated model exhibited spontaneous firing, action potential shape, hyperpolarization response, and frequency-current curve that aligned well with experimental recordings from STN neurons. Subsequently, we evaluated the general compatibility of the updated biophysics by applying them to 26 different STN neuron morphologies derived from three-dimensional anatomical reconstructions. The different morphologies affected the firing behavior of the model, but the updated biophysics were robustly capable of maintaining the desired electrophysiological features. The new STN neuron model developed in this work offers a valuable tool for studying STN neuron firing properties and may find application in simulating STN local field potentials and analyzing the effects of STN DBS.

NEW & NOTEWORTHY: This study presents an anatomically and biophysically realistic rodent STN neuron model. The work showcases the use of a genetic algorithm to optimize the model parameters. We noted a substantial influence of the axon on the electrophysiological characteristics of STN neurons. The updated model offers a valuable tool to investigate the firing of STN neurons and their modulation by intrinsic and/or extrinsic factors.


Subjects
Action Potentials, Neurological Models, Neurons, Subthalamic Nucleus, Subthalamic Nucleus/physiology, Subthalamic Nucleus/cytology, Animals, Neurons/physiology, Action Potentials/physiology, Rats, Axons/physiology, Deep Brain Stimulation
8.
BMC Biotechnol ; 24(1): 83, 2024 Oct 29.
Article in English | MEDLINE | ID: mdl-39468527

ABSTRACT

Optimizing extraction conditions can help maximize the efficiency and yield of the extraction process while minimizing negative impacts on the environment and human health. In the current study, an artificial neural network (ANN) combined with a genetic algorithm (GA) was used to optimize the extraction conditions of Hypericum spectabile. The main objective was to obtain the highest possible total antioxidant status (TAS) in the extracts. In addition, the conditions yielding the extract with maximum activity were determined, and the biological activity of the extract obtained under these conditions was analyzed. TAS values were obtained from extracts prepared using extraction temperatures of 30-60 °C, extraction times of 4-10 h, and extract concentrations of 0.25-2 mg/mL. The best model selected from the established ANN models had a mean absolute percentage error (MAPE) value of 0.643%, a mean squared error (MSE) value of 0.004, and a correlation coefficient (R) value of 0.996. The genetic algorithm proposed optimal extraction conditions of an extraction temperature of 59.391 °C, an extraction time of 8.841 h, and an extraction concentration of 1.951 mg/mL. It was concluded that the integration of ANN-GA can successfully be used to optimize extraction parameters of Hypericum spectabile. The total antioxidant value of the extract obtained under optimum conditions was determined as 9.306 ± 0.080 mmol/L, the total oxidant value as 13.065 ± 0.112 µmol/L, and the oxidative stress index as 0.140 ± 0.001. Total phenolic content (TPC) was 109.34 ± 1.29 mg/g, and total flavonoid content (TFC) was measured as 148.34 ± 1.48 mg/g. The anti-AChE value was determined as 30.68 ± 0.77 µg/mL, and the anti-BChE value as 41.30 ± 0.48 µg/mL. It was also observed that the extract exhibited strong, concentration-dependent antiproliferative activity. LC-MS/MS analysis of the phenolic content of the extract produced under optimum conditions detected fumaric, gallic, protocatechuic, 4-hydroxybenzoic, caffeic and 2-hydroxycinnamic acids, quercetin and kaempferol. As a result, it was determined that the H. spectabile extract produced under optimum conditions had significant effects in terms of biological activity.


Subjects
Algorithms, Antioxidants, Hypericum, Computer Neural Networks, Plant Extracts, Hypericum/chemistry, Plant Extracts/chemistry, Plant Extracts/pharmacology, Antioxidants/chemistry, Antioxidants/pharmacology, Humans
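
The abstract above judges the ANN by MAPE, MSE and the correlation coefficient R, and reports an oxidative stress index (OSI) alongside the total oxidant (TOS) and total antioxidant (TAS) values. The sketch below shows how these quantities are commonly computed. The OSI convention used here (TOS divided by TAS, both in µmol/L, times 100) is an assumption, since the abstract does not state its formula, although it does reproduce the reported 0.140.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def oxidative_stress_index(tos_umol_per_l, tas_mmol_per_l):
    """OSI = TOS / TAS x 100 with both in umol/L (assumed convention)."""
    return tos_umol_per_l / (tas_mmol_per_l * 1000.0) * 100.0

# Values reported in the abstract: TOS 13.065 umol/L, TAS 9.306 mmol/L.
osi = oxidative_stress_index(13.065, 9.306)
```

Rounded to three decimals, `osi` is 0.140, the value given in the abstract.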
9.
BMC Biotechnol ; 24(1): 68, 2024 Sep 27.
Article in English | MEDLINE | ID: mdl-39334143

ABSTRACT

INTRODUCTION: Developing somatic embryogenesis is one of the main steps in successful in vitro propagation and gene transformation in the carrot. However, somatic embryogenesis is influenced by different intrinsic (genetics, genotype, and explant) and extrinsic (e.g., plant growth regulators (PGRs), medium composition, and gelling agent) factors, which cause challenges in developing the somatic embryogenesis protocol. Therefore, optimizing somatic embryogenesis is a tedious, time-consuming, and costly process. Novel data mining approaches through a hybrid of artificial neural networks (ANNs) and optimization algorithms can facilitate modeling and optimizing in vitro culture processes and thereby reduce large experimental treatments and combinations. Carrot is a model plant in genetic engineering and recombinant drug research, and it is therefore an important research plant. In this research, embryogenesis in carrot (Daucus carota L.) is analyzed for the first time using a genetic algorithm (GA) and data mining. MATERIALS AND METHODS: In the current study, a data mining approach using multilayer perceptron (MLP) and radial basis function (RBF), two well-known ANNs, was employed to model and predict embryogenic callus production in carrot based on eight input variables: carrot cultivar, agar, magnesium sulfate (MgSO4), calcium dichloride (CaCl2), manganese (II) sulfate (MnSO4), 2,4-dichlorophenoxyacetic acid (2,4-D), 6-benzylaminopurine (BAP), and kinetin (KIN). To confirm the reliability and accuracy of the developed model, the results obtained from the RBF-GA model were tested in the laboratory. RESULTS: The results showed that RBF had better prediction efficiency than MLP. Then, the developed model was linked to a genetic algorithm (GA) to optimize the system. To confirm the reliability and accuracy of the developed model, the result of RBF-GA was experimentally tested in the lab as a validation experiment. The result showed that there was no significant difference between the predicted optimized result and the experimental result. CONCLUSIONS: Generally, the results of this study suggest that data mining through RBF-GA can be considered a robust approach, alongside experimental methods, to model and optimize in vitro culture systems. According to the RBF-GA result, the highest somatic embryogenesis rate (62.5%) can be obtained from the Nantes improved cultivar cultured on medium containing 195.23 mg/l MgSO4, 330.07 mg/l CaCl2, 18.3 mg/l MnSO4, 0.46 mg/l 2,4-D, 0.03 mg/l BAP, and 0.88 mg/l KIN. These results were also confirmed in the laboratory.


Subjects
Culture Media, Data Mining, Daucus carota, Plant Somatic Embryogenesis Techniques, Daucus carota/genetics, Daucus carota/embryology, Data Mining/methods, Plant Somatic Embryogenesis Techniques/methods, Culture Media/chemistry, Algorithms, Computer Neural Networks, Plant Growth Regulators/pharmacology
10.
Biostatistics ; 24(2): 295-308, 2023 04 14.
Article in English | MEDLINE | ID: mdl-34494086

ABSTRACT

Support vector regression (SVR) is particularly beneficial when the outcome and predictors are nonlinearly related. However, when many covariates are available, the method's flexibility can lead to overfitting and an overall loss in predictive accuracy. To overcome this drawback, we develop a feature selection method for SVR based on a genetic algorithm that iteratively searches across potential subsets of covariates to find those that yield the best performance according to a user-defined fitness function. We evaluate the performance of our feature selection method for SVR, comparing it to alternate methods including LASSO and random forest, in a simulation study. We find that our method yields higher predictive accuracy than SVR without feature selection. Our method outperforms LASSO when the relationship between covariates and outcome is nonlinear. Random forest performs equivalently to our method in some scenarios, but more poorly when covariates are correlated. We apply our method to predict donor kidney function 1 year after transplant using data from the United Network for Organ Sharing national registry.


Subjects
Algorithms, Regression Analysis, Humans, Support Vector Machine
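
The abstract above describes a genetic algorithm that iteratively searches covariate subsets according to a user-defined fitness function. A minimal sketch of that search pattern follows, using a basic elitist GA over binary masks. The operators (tournament selection, one-point crossover, bit-flip mutation) and the toy fitness are assumptions for illustration. In the study itself the fitness would come from SVR performance, which is not reproduced here.

```python
import random

def ga_select(n_features, fitness, pop_size=30, generations=40,
              mutation_rate=0.05, seed=0):
    """Search binary feature masks with a basic elitist genetic algorithm.

    Returns (best_mask, best_score, history), where history holds the best
    score found in each generation.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        next_pop = [ranked[0][:]]                  # elitism: keep the best mask
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n_features)     # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    return best, fitness(best), history

# Hypothetical fitness: covariates 0, 2 and 5 are informative, and each
# uninformative covariate that is selected costs half a point.
INFORMATIVE = {0, 2, 5}

def toy_fitness(mask):
    hits = sum(1 for i, g in enumerate(mask) if g and i in INFORMATIVE)
    extras = sum(1 for i, g in enumerate(mask) if g and i not in INFORMATIVE)
    return hits - 0.5 * extras

best_mask, best_score, history = ga_select(8, toy_fitness)
```

Because the best mask is always carried into the next generation, the best score per generation can never decrease. Swapping `toy_fitness` for a cross-validated model score is the only change needed to use a different criterion.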
11.
Small ; 20(6): e2305375, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37771186

ABSTRACT

Nanoparticles (NPs) have been employed as drug delivery systems (DDSs) for several decades, primarily as passive carriers, with limited selectivity. However, recent publications have shed light on the emerging phenomenon of NPs exhibiting selective cytotoxicity against cancer cell lines, attributable to distinct metabolic disparities between healthy and pathological cells. This study revisits the concept of NPs selective cytotoxicity, and for the first time proposes a high-throughput in silico screening approach to massive targeted discovery of selectively cytotoxic inorganic NPs. In the first step, this work trains a gradient boosting regression model to predict viability of NP-treated cell lines. The model achieves mean cross-validation (CV) Q2 = 0.80 and root mean square error (RMSE) of 13.6. In the second step, this work develops a machine learning (ML) reinforced genetic algorithm (GA), capable of screening >14 900 candidates/min, to identify the best-performing selectively cytotoxic NPs. As proof-of-concept, DDS candidates for the treatment of liver cancer are screened on HepG2 and hepatocytes cell lines resulting in Ag NPs with selective toxicity score of 42%. This approach opens the door for clinical translation of NPs, expanding their therapeutic application to a wider range of chemical space of NPs and living organisms such as bacteria and fungi.


Subjects
Antineoplastic Agents, Liver Neoplasms, Nanoparticles, Humans, Nanoparticles/chemistry, Machine Learning, Algorithms
12.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-36088543

ABSTRACT

Ensemble learning is a kind of machine learning method that integrates multiple basic learners to achieve higher accuracy. Recently, single machine learning methods have been established to predict survival for patients with cancer. However, a robust ensemble learning model with high accuracy for picking out high-risk patients is still lacking. To achieve this, we proposed a novel genetic algorithm-aided three-stage ensemble learning method (3S score) for survival prediction. During the process of constructing the 3S score, double training sets were used to avoid over-fitting; the gene-pairing method was applied to reduce batch effect; a genetic algorithm was employed to select the best basic learner combination. When used to predict the survival state of glioma patients, this model achieved the highest C-index (0.697) as well as the highest areas under the receiver operating characteristic curve (ROC-AUCs) (first year = 0.705, third year = 0.825 and fifth year = 0.839) in the combined test set (n = 1191), compared with 12 other baseline models. Furthermore, the 3S score can distinguish survival significantly in eight of the nine independent test cohorts (P < 0.05), achieving significant improvement of ROC-AUCs. Notably, ablation experiments demonstrated that the gene-pairing method, double training sets and genetic algorithm ensure the robustness and effectiveness of the 3S score. The performance exploration on pan-cancer showed that the 3S score has excellent ability in survival prediction in five kinds of cancers, which was verified by Cox regression, survival curves and ROC curves together. To enable its clinical adoption, we implemented the 3S score and two other clinical factors as an easy-to-use web tool for risk scoring and therapy stratification in glioma patients.


Subjects
Glioma, Machine Learning, Glioma/genetics, Humans, ROC Curve, Risk Factors
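
The abstract above reports a C-index of 0.697 for the 3S score. As a minimal sketch of what that metric measures, the function below computes Harrell's concordance index on right-censored survival data. The data are hypothetical, and the implementation is a plain O(n²) illustration rather than the authors' code.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] == 1) and a shorter time than subject j. It is concordant when
    the subject with the shorter time has the higher predicted risk; ties in
    risk count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue                      # censored subjects cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical cohort: survival time in years, event flag (0 = censored),
# and a model risk score where higher means shorter expected survival.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.5, 0.1]
c_index = concordance_index(times, events, risks)
```

A value of 1.0 means every comparable pair is ordered correctly, 0.5 corresponds to random ranking, and 0.0 to a fully reversed ranking.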
13.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34929743

ABSTRACT

Deep learning (DL)-based de novo drug design has recently become a new trend in pharmaceutical research, and numerous DL-based methods have been developed for the generation of novel compounds with desired properties. However, a comprehensive understanding of the advantages and disadvantages of these methods is still lacking. In this study, the performances of different generative models were evaluated by analyzing the properties of the generated molecules in different scenarios, such as goal-directed (rediscovery, optimization and scaffold hopping of active compounds) and target-specific (generation of novel compounds for a given target) tasks. Overall, the DL-based models have significant advantages over the baseline models built by the traditional methods in learning the physicochemical property distributions of the training sets and may be more suitable for target-specific tasks. However, neither the baselines nor the DL-based generative models can fully exploit the scaffolds of the training sets, and the molecules generated by the DL-based methods even have lower scaffold diversity than those generated by the traditional models. Moreover, our assessment illustrates that the DL-based methods do not exhibit obvious advantages over the genetic algorithm-based baselines in goal-directed tasks. We believe that our study provides valuable guidance for the effective use of generative models in de novo drug design.


Subjects
Drug Design, Drug Discovery/methods, Algorithms, Deep Learning
14.
J Transl Med ; 22(1): 353, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622716

ABSTRACT

Recent studies have increasingly revealed the connection between metabolic reprogramming and tumor progression. However, the specific impact of metabolic reprogramming on inter-patient heterogeneity and prognosis in lung adenocarcinoma (LUAD) still requires further exploration. Here, we introduced a cellular hierarchy framework according to a malignant and metabolic gene set, named malignant & metabolism reprogramming (MMR), to reanalyze 178,739 single-cell reference profiles. Furthermore, we proposed a three-stage ensemble learning pipeline, aided by a genetic algorithm (GA), for survival prediction across 9 LUAD cohorts (n = 2066). Throughout the pipeline for developing the three-stage MMR (3S-MMR) score, double training sets were implemented to avoid over-fitting; the gene-pairing method was utilized to remove batch effect; GA was harnessed to pinpoint the optimal basic learner combination. The novel 3S-MMR score reflects various aspects of LUAD biology, provides new insights into precision medicine for patients, and may serve as a generalizable predictor of prognosis and immunotherapy response. To facilitate the clinical adoption of the 3S-MMR score, we developed an easy-to-use web tool for risk scoring as well as therapy stratification in LUAD patients. In summary, we have proposed and validated an ensemble learning model pipeline within the framework of metabolic reprogramming, offering potential insights for LUAD treatment and an effective approach for developing prognostic models for other diseases.


Subjects
Adenocarcinoma of Lung, Lung Neoplasms, Humans, Metabolic Reprogramming, Adenocarcinoma of Lung/genetics, Lung Neoplasms/genetics, Machine Learning, Algorithms, Prognosis
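
The abstract above uses a gene-pairing method to remove batch effects. The abstract does not spell out the construction, so the following is only a sketch of one common form of the idea: each feature records which of two genes is more highly expressed within the same sample, so any per-sample shift or rescaling that preserves the ordering leaves the features unchanged. Gene names and values are hypothetical.

```python
from itertools import combinations

def gene_pair_features(expression):
    """Map {gene: value} for one sample to {(gene_a, gene_b): 0 or 1}.

    Each feature is 1 when gene_a is expressed more highly than gene_b within
    the same sample, so only the within-sample ordering matters.
    """
    genes = sorted(expression)
    return {(a, b): int(expression[a] > expression[b])
            for a, b in combinations(genes, 2)}

# Hypothetical sample measured in two batches: the second batch reports the
# same ordering on a shifted and rescaled intensity scale.
sample_batch1 = {"GENE_A": 5.0, "GENE_B": 2.0, "GENE_C": 7.5}
sample_batch2 = {g: 3.0 * v + 10.0 for g, v in sample_batch1.items()}

features1 = gene_pair_features(sample_batch1)
features2 = gene_pair_features(sample_batch2)
```

Both batches yield identical binary features, which is why pairwise comparisons are attractive when cohorts come from different platforms.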
15.
New Phytol ; 244(2): 618-634, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39183371

ABSTRACT

Phenotypic plasticity describes a genotype's ability to produce different phenotypes in response to different environments. Breeding crops that exhibit appropriate levels of plasticity for future climates will be crucial to meeting global demand, but knowledge of the critical environmental factors is limited to a handful of well-studied major crops. Using 727 maize (Zea mays L.) hybrids phenotyped for grain yield in 45 environments, we investigated the ability of a genetic algorithm and two other methods to identify environmental determinants of grain yield from a large set of candidate environmental variables constructed using minimal assumptions. The genetic algorithm identified pre- and postanthesis maximum temperature, mid-season solar radiation, and whole season net evapotranspiration as the four most important variables from a candidate set of 9150. Importantly, these four variables are supported by previous literature. After calculating reaction norms for each environmental variable, candidate genes were identified and gene annotations investigated to demonstrate how this method can generate insights into phenotypic plasticity. The genetic algorithm successfully identified known environmental determinants of hybrid maize grain yield. This demonstrates that the methodology could be applied to other less well-studied phenotypes and crops to improve understanding of phenotypic plasticity and facilitate breeding crops for future climates.


Subjects
Algorithms, Climate, Phenotype, Zea mays, Zea mays/genetics, Zea mays/physiology, Zea mays/growth & development, Plant Breeding/methods, Environment, Agricultural Crops/genetics, Agricultural Crops/growth & development, Agricultural Crops/physiology, Genotype, Edible Grain/genetics, Edible Grain/physiology, Edible Grain/growth & development
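
The abstract above calculates reaction norms for each selected environmental variable. One simple way to express a reaction norm, assumed here because the abstract does not give the functional form, is a per-hybrid least-squares line of yield against the environmental variable, with the slope serving as a plasticity measure. All values are hypothetical.

```python
def reaction_norm(env_values, phenotypes):
    """Ordinary least-squares line phenotype = intercept + slope * env.

    Returns (intercept, slope); the slope is one simple measure of plasticity.
    """
    n = len(env_values)
    mean_x = sum(env_values) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in env_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(env_values, phenotypes))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical yields (t/ha) of two hybrids across four environments that
# differ in maximum temperature (degrees C).
max_temp = [28.0, 30.0, 32.0, 34.0]
stable_hybrid = [9.0, 9.0, 9.0, 9.0]
plastic_hybrid = [10.0, 9.0, 8.0, 7.0]

stable_fit = reaction_norm(max_temp, stable_hybrid)
plastic_fit = reaction_norm(max_temp, plastic_hybrid)
```

Here the stable hybrid has a slope of zero, while the plastic hybrid loses half a tonne per hectare for each additional degree.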
16.
Chemphyschem ; 25(4): e202300800, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38083816

ABSTRACT

In this work, an unbiased global search with a homemade genetic algorithm was performed to investigate the structural evolution and electronic properties of Snₓ⁻ (x = 21-35) clusters with density functional theory (DFT) calculations. The ground-state structures of all these Snₓ⁻ (x = 21-35) clusters have been confirmed by comparing the experimental and simulated photoelectron spectra (PESs). It has been revealed that all Snₓ⁻ (x = 21-35) clusters are tricapped trigonal prism (TTP)-based structures consisting of two (for sizes x = 21-28) or three (for x = 29-35) TTP units, with the remaining atoms adsorbed on the surface or inserted between TTP units. The gradually decreasing HOMO-LUMO gaps indicate that these clusters are undergoing a semiconductor-to-metal transformation. The average binding energies show that Snₓ⁻ clusters are less structurally stable than silicon and germanium clusters. It was found that sizes x = 23, 25, 29 and 33 show high relative stability.

17.
Biotechnol Bioeng ; 121(5): 1583-1595, 2024 May.
Article in English | MEDLINE | ID: mdl-38247359

ABSTRACT

As a non-destructive sensing technique, Raman spectroscopy is often combined with regression models for real-time detection of key components in microbial cultivation processes. However, achieving accurate model predictions often requires a large amount of offline measurement data for training, which is both time-consuming and labor-intensive. In order to overcome the limitations of traditional models that rely on large datasets and complex spectral preprocessing, in addition to the difficulty of training models with limited samples, we have explored a genetic algorithm-based semi-supervised convolutional neural network (GA-SCNN). GA-SCNN integrates unsupervised process spectral labeling, feature extraction, regression prediction, and transfer learning. Using only an extremely small number of offline samples of the target protein, this framework can accurately predict protein concentration, which represents a significant challenge for other models. The effectiveness of the framework has been validated in a system of Escherichia coli expressing recombinant ProA5M protein. By utilizing the labeling technique of this framework, the available dataset for glucose, lactate, ammonium ions, and optical density at 600 nm (OD600) has been expanded from 52 samples to 1302 samples. Furthermore, by introducing a small component of offline detection data for recombinant proteins into the OD600 model through transfer learning, a model for target protein detection has been retrained, providing a new direction for the development of associated models. Comparative analysis with traditional algorithms demonstrates that the GA-SCNN framework exhibits good adaptability when there is no complex spectral preprocessing. Cross-validation results confirm the robustness and high accuracy of the framework, with the predicted values of the model highly consistent with the offline measurement results.


Subjects
Escherichia coli , Neural Networks, Computer , Fermentation , Escherichia coli/genetics , Algorithms , Recombinant Proteins/genetics
18.
BMC Med Res Methodol ; 24(1): 50, 2024 Feb 27.
Article in English | MEDLINE | ID: mdl-38413856

ABSTRACT

INTRODUCTION: The determination of identity factors such as age and sex has gained significance in both criminal and civil cases. Paranasal sinuses, such as the frontal and maxillary sinuses, are resistant to trauma and can aid profiling. We developed a deep learning (DL) model optimized by an evolutionary algorithm (genetic algorithm, GA) to determine sex and age using paranasal sinus parameters based on cone-beam computed tomography (CBCT). METHODS: Two hundred and forty CBCT images (129 females and 111 males, aged 18-52) were included in this study. CBCT images were captured using the Newtom3G device with specific exposure parameters. These images were then analyzed in ITK-SNAP 3.6.0 beta software to extract four paranasal sinus parameters (height, width, length, and volume) for both the frontal and maxillary sinuses. A hybrid model, Genetic Algorithm-Deep Neural Network (GADNN), was proposed for feature selection and classification. Traditional statistical methods and machine learning models, including logistic regression (LR), random forest (RF), multilayer perceptron neural network (MLP), and deep learning (DL), were evaluated for their performance. The synthetic minority oversampling technique was used to deal with the unbalanced data. RESULTS: GADNN showed superior accuracy in both sex determination (accuracy of 86%) and age determination (accuracy of 68%), outperforming other models. DL and RF ranked second and third in sex determination (accuracy of 78% and 71% respectively) and age determination (accuracy of 92% and 57%). CONCLUSIONS: The study introduces a novel approach combining DL and GA to enhance the accuracy of sex and age determination. The potential of DL in forensic dentistry is highlighted, demonstrating its efficiency in improving accuracy for sex and age determination. The study contributes to the burgeoning field of DL in dentistry and forensic sciences.


Subjects
Deep Learning , Male , Female , Humans , Cone-Beam Computed Tomography/methods , Maxillary Sinus/diagnostic imaging , Software , Neural Networks, Computer
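
The feature-selection half of a GA-plus-classifier hybrid such as the one described in this abstract can be illustrated with a minimal sketch. This is not the authors' GADNN: the data are synthetic, a nearest-centroid classifier stands in for the deep network, fitness is training accuracy minus a per-feature penalty, and every name and parameter (`make_data`, `ga_select`, `penalty`, `p_mut`) is a hypothetical choice made for illustration.

```python
import random

def make_data(n=120, n_features=8, seed=0):
    """Synthetic two-class data; by construction only features 0 and 2 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 2.0 * label
        row[2] -= 2.0 * label
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X, y, mask):
    """Training accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    cents = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X, y):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, cents[c])) for c in cents}
        correct += min(dist, key=dist.get) == t
    return correct / len(y)

def fitness(X, y, mask, penalty=0.01):
    """Accuracy minus a small cost per selected feature (hypothetical weighting)."""
    return centroid_accuracy(X, y, mask) - penalty * sum(mask)

def ga_select(X, y, pop_size=20, generations=15, p_mut=0.1, seed=1):
    """Evolve binary feature masks with tournament selection, crossover, mutation, elitism."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(X, y, m), reverse=True)
        history.append(fitness(X, y, scored[0]))
        nxt = [scored[0][:]]  # elitism: the best mask always survives
        while len(nxt) < pop_size:
            a = max(rng.sample(scored, 3), key=lambda m: fitness(X, y, m))
            b = max(rng.sample(scored, 3), key=lambda m: fitness(X, y, m))
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=lambda m: fitness(X, y, m))
    history.append(fitness(X, y, best))
    return best, history

X, y = make_data()
best_mask, history = ga_select(X, y)
print("selected features:", [j for j, keep in enumerate(best_mask) if keep])
```

Because the best mask is copied unchanged into each new generation, the best fitness recorded in `history` never decreases. A real study would score each mask on held-out data rather than on training accuracy.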
19.
J Chem Inf Model ; 64(6): 1794-1805, 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38485516

ABSTRACT

As the number of determined and predicted protein structures and the size of druglike 'make-on-demand' libraries soar, the time-consuming nature of structure-based computer-aided drug design calls for innovative computational algorithms. De novo drug design introduces in silico heuristics to accelerate searching in the vast chemical space. This review focuses on recent advances in structure-based de novo drug design, ranging from conventional fragment-based methods, evolutionary algorithms, and Metropolis Monte Carlo methods to deep generative models. Because de novo drug design has historically been limited in generating readily available drug-like molecules, we highlight the synthetic accessibility efforts in each category and the benchmarking strategies taken to validate the proposed frameworks.


Subjects
Algorithms , Drug Design
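
Of the search heuristics this review names, the Metropolis Monte Carlo acceptance rule is the simplest to show in code. The sketch below is a toy under stated assumptions: a "molecule" is a hypothetical bit string of fragment choices, and the score is the Hamming distance to a hidden target, standing in for a real scoring function such as a docking score. It does not reproduce any method from the review.

```python
import math
import random

def score(state, target):
    """Toy score to minimize: number of positions that differ from the target."""
    return sum(a != b for a, b in zip(state, target))

def metropolis_search(target, steps=500, temperature=0.5, seed=0):
    """Single-bit-flip Metropolis search; returns (best_state, best_score, initial_score)."""
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in target]
    current = score(state, target)
    initial = current
    best_state, best = state[:], current
    for _ in range(steps):
        i = rng.randrange(len(state))
        state[i] ^= 1  # propose: flip one randomly chosen bit
        proposed = score(state, target)
        delta = proposed - current
        # Metropolis criterion: always accept improvements, and accept
        # worse moves with probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = proposed
            if current < best:
                best_state, best = state[:], current
        else:
            state[i] ^= 1  # reject: undo the flip
    return best_state, best, initial

target = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
best_state, best, initial = metropolis_search(target)
print("initial score:", initial, "best score:", best)
```

Occasionally accepting worse moves is what lets the search leave local minima; lowering `temperature` makes it greedier.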
20.
Network ; : 1-28, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38647219

ABSTRACT

Brain tumours can be cured if they are detected early and patients receive timely treatment. This work proposes a transform- and windowing-based optimization strategy for exposing and segmenting the tumour region in brain images. The image-processing pipeline comprises preprocessing, transformation, feature extraction, feature optimization, classification, and segmentation. The Gabor transform is first applied to the brain test image to convert pixels from the spatial domain into a multi-resolution domain. The Gabor-transformed brain image is then used to extract the multi-level feature parameters. Next, a Genetic Algorithm (GA) optimizes the extracted features, and a Neuro Fuzzy System (NFS) classifies the optimized prominent section. Finally, the tumour region in the brain images is located and segmented using the normalized segmentation algorithm. The proposed GA-based NFS classification approach is evaluated for brain tumour detection and classification in terms of sensitivity, specificity, and accuracy. The experimental results show averages of 99.37% sensitivity, 98.9% specificity, 99.21% accuracy, 97.8% PPV, 91.8% NPV, 96.8% FPR, and 90.4% FNR.
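
For reference, the metrics listed in this abstract have standard definitions in terms of confusion-matrix counts. The sketch below computes them; the function name and the example counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),          # true positive rate
        "specificity": tn / (tn + fp),          # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),                  # positive predictive value
        "npv": tn / (tn + fn),                  # negative predictive value
        "fpr": fp / (fp + tn),                  # false positive rate
        "fnr": fn / (fn + tp),                  # false negative rate
    }

# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, FPR equals 1 minus specificity and FNR equals 1 minus sensitivity.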
