ABSTRACT
O-linked glycosylation is a complex post-translational modification (PTM) in human proteins that plays a critical role in regulating various cellular metabolic and signaling pathways. In contrast to N-linked glycosylation, O-linked glycosylation lacks specific sequence features and maintains an unstable core structure. Identifying O-linked threonine glycosylation sites (OTGs) remains challenging, requiring extensive experimental tests. While bioinformatics tools have emerged for predicting OTGs, their reliance on limited conventional features and absence of well-defined feature selection strategies limit their effectiveness. To address these limitations, we introduced HOTGpred (Human O-linked Threonine Glycosylation predictor), employing a multi-stage feature selection process to identify the optimal feature set for accurately identifying OTGs. Initially, we assessed 25 different feature sets derived from various pretrained protein language model (PLM)-based embeddings and conventional feature descriptors using nine classifiers. Subsequently, we integrated the top five embeddings linearly and determined the most effective scoring function for ranking hybrid features, identifying the optimal feature set through a process of sequential forward search. Among the classifiers, the extreme gradient boosting (XGBT)-based model, using the optimal feature set (HOTGpred), achieved 92.03 % accuracy on the training dataset and 88.25 % on the balanced independent dataset. Notably, HOTGpred significantly outperformed the current state-of-the-art methods on both the balanced and imbalanced independent datasets, demonstrating its superior prediction capabilities. Additionally, SHapley Additive exPlanations (SHAP) and ablation analyses were conducted to identify the features contributing most significantly to HOTGpred. 
Finally, we developed an easy-to-navigate web server, accessible at https://balalab-skku.org/HOTGpred/, to support glycobiologists in their research on glycosylation structure and function.
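The sequential forward search described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature names, the `evaluate` callback, and the toy scoring rule are hypothetical. In practice `evaluate` would return a cross-validated metric of a classifier (such as the XGBT model named in the abstract) trained on the candidate subset.

```python
from typing import Callable, List, Sequence, Tuple


def sequential_forward_search(
    ranked: Sequence[str],
    evaluate: Callable[[List[str]], float],
    step: int = 1,
) -> Tuple[List[str], float]:
    """Grow the feature set in ranked order and keep the best-scoring prefix."""
    best_subset: List[str] = []
    best_score = float("-inf")
    for end in range(step, len(ranked) + 1, step):
        subset = list(ranked[:end])
        score = evaluate(subset)
        if score > best_score:  # keep the first prefix that reaches the maximum
            best_subset, best_score = subset, score
    return best_subset, best_score


# Toy scorer (assumption): informative features add 1, every feature costs 0.1.
INFORMATIVE = {"f1", "f3"}


def toy_evaluate(subset: List[str]) -> float:
    return sum(f in INFORMATIVE for f in subset) - 0.1 * len(subset)


ranked_features = ["f1", "f3", "f2", "f4"]  # hypothetical ranking, best first
selected, selected_score = sequential_forward_search(ranked_features, toy_evaluate)
print(selected, round(selected_score, 2))
```

Because the search only ever extends the ranked list, its cost is linear in the number of features, which is why the quality of the ranking function matters so much.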
Subjects
Threonine; Glycosylation; Humans; Threonine/metabolism; Threonine/chemistry; Protein Processing, Post-Translational; Software; Computational Biology/methods; Databases, Protein; Proteins/chemistry; Proteins/metabolism
ABSTRACT
RNA N4-acetylcytidine (ac4C) is a highly conserved RNA modification that plays a crucial role in controlling mRNA stability, processing, and translation. Consequently, accurate identification of ac4C sites across the genome is critical for understanding gene expression regulation mechanisms. In this study, we have developed ac4C-AFL, a bioinformatics tool that precisely identifies ac4C sites from primary RNA sequences. In ac4C-AFL, we identified the optimal sequence length for model building and implemented an adaptive feature representation strategy that is capable of extracting the most representative features from RNA. To identify the most relevant features, we proposed a novel ensemble feature importance scoring strategy to rank features effectively. We then used this ranking to conduct a sequential forward search, which determines the optimal feature set for each of the 16 sequence-derived feature descriptors individually. Utilizing these optimal feature descriptors, we constructed 176 baseline models using 11 popular classifiers. The most efficient baseline models were identified using the two-step feature selection approach; their predicted scores were then integrated and used to train an appropriate classifier, yielding the final prediction model. Our rigorous cross-validations and independent tests demonstrate that ac4C-AFL surpasses contemporary tools in predicting ac4C sites. Moreover, we have developed a publicly accessible web server at https://balalab-skku.org/ac4C-AFL/.
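One simple way to build an ensemble feature importance ranking is to average the rank each feature receives from several scoring methods. The sketch below assumes this mean-rank scheme; the abstract does not specify how ac4C-AFL combines its scores, and the feature names and score values are hypothetical.

```python
from typing import Dict, List, Sequence


def ensemble_rank(score_tables: Sequence[Dict[str, float]]) -> List[str]:
    """Order features by their mean rank across several importance tables."""
    features = list(score_tables[0])
    mean_rank = {f: 0.0 for f in features}
    for table in score_tables:
        # Rank 1 goes to the highest importance score within this table.
        ordered = sorted(features, key=lambda f: table[f], reverse=True)
        for position, feature in enumerate(ordered, start=1):
            mean_rank[feature] += position / len(score_tables)
    return sorted(features, key=lambda f: mean_rank[f])


# Hypothetical importance scores from three different scoring methods.
scores_a = {"kmer": 0.9, "onehot": 0.2, "pse": 0.5}
scores_b = {"kmer": 0.4, "onehot": 0.1, "pse": 0.8}
scores_c = {"kmer": 0.7, "onehot": 0.3, "pse": 0.6}

ranking = ensemble_rank([scores_a, scores_b, scores_c])
print(ranking)
```

Averaging ranks rather than raw scores avoids having to put importance values from different methods on a common scale.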
ABSTRACT
BACKGROUND: Horse gram (Macrotyloma uniflorum (Lam.) Verdc.) is an underutilized pulse crop with good drought resistance traits. It is a rich source of protein. Conventional breeding methods for high-yielding and abiotic stress-tolerant germplasm are hampered by the scarcity of morphological data sets. Thus, the horse gram cultivars considered in this study were classified based on prevailing growth factors showing homogeneous genotypes in various agro-ecological zones. Nowadays, several machine learning (ML) methods are used in the field of plant phenotyping. RESULTS: We applied the K-means clustering algorithm, an unsupervised learning technique, to analyze variation among germplasm accessions in important morphological traits: plant shoot length, total plant height, flowering percentage, number of pods per plant, pod length, number of seeds per plant, and seed length. Unsupervised clustering grouped the 20 germplasm accessions into four clusters, with high-yielding traits predominantly observed in cluster 2. CONCLUSION: These findings could guide ML-based classification to characterize suitable germplasms on the basis of high-yielding varieties for different agro-ecological zones. © 2020 Society of Chemical Industry.
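The K-means step can be illustrated with a small pure-Python version of Lloyd's algorithm. This is a sketch under stated assumptions: the six two-trait measurements are synthetic placeholders (not data from the study), the initial centroids are simply the first k points, and real trait data would normally be standardized before clustering.

```python
import math
from typing import List, Sequence, Tuple


def kmeans(
    points: Sequence[Sequence[float]], k: int, max_iter: int = 100
) -> Tuple[List[int], List[List[float]]]:
    """Lloyd's algorithm with deterministic initialization (first k points)."""
    centroids = [list(p) for p in points[:k]]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids


# Synthetic (plant height in cm, pods per plant) pairs, for illustration only.
accessions = [(30, 10), (90, 40), (32, 12), (88, 42), (31, 11), (91, 39)]
labels, centroids = kmeans(accessions, k=2)
print(labels)
print(centroids)
```

With real data, the choice of k (four in the study) is usually checked with a criterion such as within-cluster variance across several values of k.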
Subjects
Fabaceae/classification; Fabaceae/genetics; Machine Learning; Algorithms; Droughts; Fabaceae/growth & development; Fabaceae/physiology; Genotype; Phenotype; Plant Breeding; Quantitative Trait Loci; Seeds/classification; Seeds/genetics; Seeds/growth & development; Seeds/physiology; Stress, Physiological
ABSTRACT
In this paper, synchronization of an inertial neural network with time-varying delays is investigated. Using the variable transformation method, we transform the second-order differential equations into first-order differential equations. Then, using suitable Lyapunov-Krasovskii functionals and Jensen's inequality, the synchronization criteria are established in terms of linear matrix inequalities. Moreover, a feedback controller is designed to attain synchronization between the master and slave models, and to ensure that the error model is globally asymptotically stable. Numerical examples and simulations are presented to demonstrate the effectiveness of the proposed method. In addition, an image encryption algorithm is proposed based on the piecewise linear chaotic map and the chaotic inertial neural network. The chaotic signals obtained from the inertial neural network are utilized for the encryption process. Statistical analyses are provided to evaluate the effectiveness of the proposed encryption algorithm. The results confirm that the proposed encryption algorithm is efficient and reliable for secure communication applications.
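The piecewise linear chaotic map (PWLCM) mentioned above has a standard definition, and a toy keystream cipher built on it can be sketched as follows. This is only an illustration of the map: the paper's algorithm also uses chaotic signals from the inertial neural network, which are not modeled here, and the key values, burn-in length, and byte quantization are assumptions. The sketch is not a secure cipher.

```python
def pwlcm(x: float, p: float) -> float:
    """One PWLCM iteration for x in [0, 1) and control parameter 0 < p < 0.5."""
    if x >= 0.5:  # the map is symmetric about x = 0.5
        x = 1.0 - x
    if x < p:
        return x / p
    return (x - p) / (0.5 - p)


def keystream(x0: float, p: float, n: int, burn_in: int = 100) -> bytes:
    """Iterate the map from the key (x0, p) and quantize each state to a byte."""
    x = x0
    for _ in range(burn_in):  # discard the transient (burn-in length is assumed)
        x = pwlcm(x, p)
    out = bytearray()
    for _ in range(n):
        x = pwlcm(x, p)
        out.append(int(x * 256) % 256)
    return bytes(out)


def xor_cipher(data: bytes, x0: float, p: float) -> bytes:
    """XOR data with the chaotic keystream; the same call decrypts."""
    stream = keystream(x0, p, len(data))
    return bytes(d ^ k for d, k in zip(data, stream))


pixels = bytes(range(16))  # stand-in for a row of image pixel values
cipher = xor_cipher(pixels, x0=0.3141, p=0.2718)  # hypothetical key
recovered = xor_cipher(cipher, x0=0.3141, p=0.2718)
print(recovered == pixels)
```

Because encryption is an XOR with a key-determined stream, applying the same function with the same key restores the original bytes.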
ABSTRACT
This paper addresses the problem of exponential synchronization of neural networks with time-varying delays. A sampled-data controller with stochastically varying sampling intervals is considered. The novelty of this paper lies in accounting for control packet loss between the controller and the actuator, which may occur in many real-world situations. Sufficient conditions for exponential synchronization in the mean square sense are derived in terms of linear matrix inequalities (LMIs) by constructing a proper Lyapunov-Krasovskii functional that involves more information about the delay bounds and by employing some inequality techniques. Moreover, the feasibility of the obtained LMIs can be easily checked with any of the available MATLAB toolboxes. Numerical examples are provided to validate the theoretical results.
Subjects
Neural Networks, Computer; Stochastic Processes; Computer Simulation; Humans; Time Factors
ABSTRACT
This paper presents a new design scheme for the passivity and passification of a class of memristor-based recurrent neural networks (MRNNs) with additive time-varying delays. The usual assumptions on the boundedness and Lipschitz continuity of the activation functions are formulated. The systems considered here are based on a recently suggested time-delay model that includes additive time-varying delay components in the state. The connection between the time-varying delay and its upper bound is taken into account when estimating the upper bound of the derivative of the Lyapunov functional. It is shown that the passivity condition can be expressed in linear matrix inequality (LMI) form by using the characteristic function method. For state feedback passification, it is verified that it makes no difference whether immediate or delayed state feedback is used. By constructing a Lyapunov-Krasovskii functional and employing Jensen's inequality and the reciprocally convex combination technique, together with a tighter estimate of the upper bound of the cross-product terms arising from the derivatives of the Lyapunov functional, less conservative delay-dependent passivity criteria are established in terms of LMIs. Moreover, a second-order reciprocally convex approach is employed to derive the upper bound for terms with inverses of squared convex parameters. The memristor-based model with additive time-varying delays widens the application scope for the design of neural networks. Finally, pertinent examples are given to show the advantages of the derived passivity criteria and the significant improvement of the theoretical approaches.
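For reference, the integral form of Jensen's inequality commonly used in this kind of delay-dependent analysis is stated below, followed by the way it leads to the reciprocally convex terms. The symbols R, h, τ(t), α, ζ₁ and ζ₂ are notation introduced here for illustration, and a single delay τ(t) ∈ (0, h) is assumed rather than the additive delays treated in the paper.

```latex
% Jensen's integral inequality: for R = R^{T} > 0, a scalar h > 0,
% and a differentiable x : [t-h, t] \to \mathbb{R}^{n},
-h \int_{t-h}^{t} \dot{x}^{T}(s)\, R\, \dot{x}(s)\, ds
  \;\le\; -\,[x(t) - x(t-h)]^{T} R\, [x(t) - x(t-h)].

% With a time-varying delay 0 < \tau(t) < h, split the integral at t - \tau(t)
% and apply the inequality to each part. Writing \alpha = \tau(t)/h,
% \zeta_{1} = x(t) - x(t-\tau(t)), \quad \zeta_{2} = x(t-\tau(t)) - x(t-h):
-h \int_{t-h}^{t} \dot{x}^{T}(s)\, R\, \dot{x}(s)\, ds
  \;\le\; -\frac{1}{\alpha}\, \zeta_{1}^{T} R\, \zeta_{1}
          \;-\; \frac{1}{1-\alpha}\, \zeta_{2}^{T} R\, \zeta_{2}.
```

The factors 1/α and 1/(1−α) are the inverse convex parameters that the reciprocally convex combination technique is designed to bound.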