ABSTRACT
Predicting promoter strength and guiding the directed evolution of promoters are crucial tasks in synthetic biology. This approach significantly reduces the experimental costs of conventional promoter engineering. Previous studies employing machine learning or deep learning methods have shown some success in this task, but their results have been unsatisfactory, primarily because they neglect evolutionary information. In this paper, we introduce the Chaos-Attention net for Promoter Evolution (CAPE) to address the limitations of existing methods. We comprehensively extract evolutionary information within promoters using a merged chaos game representation and process the overall information with modified DenseNet and Transformer structures. Our model achieves state-of-the-art results on two distinct kinds of tasks related to prokaryotic promoter strength prediction. Incorporating evolutionary information improves the model's accuracy, and transfer learning further extends its adaptability. Furthermore, experimental results confirm CAPE's efficacy in simulating in silico directed evolution of promoters, marking a significant advance in predictive modeling of prokaryotic promoter strength. Our paper also presents a user-friendly website for the practical implementation of in silico directed evolution of promoters. The source code implemented in this study and instructions for accessing the website can be found in our GitHub repository: https://github.com/BobYHY/CAPE.
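For readers unfamiliar with the chaos game representation (CGR) mentioned above, the following is a minimal sketch of the classical CGR of a DNA sequence. It is not the merged variant used by CAPE, whose details are not given in this abstract. The corner assignment, the function names `cgr_points` and `cgr_grid`, and the grid resolution parameter `k` are assumptions made for illustration only.

```python
# Classical chaos game representation (CGR) of a DNA sequence.
# Assumption: this corner layout is one common convention, not
# necessarily the one used by CAPE.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence):
    """Return the list of CGR points (x, y) for a DNA sequence.

    Starting from the centre of the unit square, each nucleotide moves
    the current point halfway towards that nucleotide's corner.
    Characters other than A, C, G, T are skipped.
    """
    x, y = 0.5, 0.5
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_grid(sequence, k=3):
    """Count CGR points in a 2^k x 2^k grid (a coarse image of the sequence)."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(sequence):
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    print(cgr_points("AG"))
    print(cgr_grid("ACGT", k=1))
```

A grid of this kind turns a variable-length sequence into a fixed-size image, which is one reason CGR-style encodings pair naturally with convolutional architectures such as DenseNet.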
Subject(s)
Deep Learning; Promoter Regions, Genetic; Algorithms; Evolution, Molecular; Computer Simulation; Nonlinear Dynamics; Computational Biology/methods
ABSTRACT
In this article, we propose a novel class of neural stochastic differential equations (SDEs) driven by noisy sequential observations, called the neural projection filter (NPF), under the continuous state-space model (SSM) framework. The contributions of this work are both theoretical and algorithmic. On the one hand, we investigate the approximation capacity of the NPF, i.e., a universal approximation theorem for the NPF. More explicitly, under some natural assumptions, we prove that the solution of an SDE driven by a semimartingale can be well approximated by the solution of the NPF. In particular, an explicit estimation bound is given. On the other hand, as an important application of this result, we develop a novel data-driven filter based on the NPF. Under certain conditions, we also prove convergence of the algorithm; i.e., the dynamics of the NPF converge to the target dynamics. Finally, we systematically compare the NPF with existing filters. We verify the convergence theorem in the linear case and experimentally demonstrate that the NPF outperforms existing filters in the nonlinear case in terms of robustness and efficiency. Furthermore, the NPF can handle high-dimensional systems in real time, even for the 100-D cubic sensor, whereas the state-of-the-art (SOTA) filter fails to do so.
ABSTRACT
In this article, we investigate the approximation ability of recurrent neural networks (RNNs) with stochastic inputs in state-space model form. More explicitly, we prove that open dynamical systems with stochastic inputs can be well approximated by a special class of RNNs under some natural assumptions, and we carefully analyze the asymptotic approximation error as time goes to infinity. In addition, as an important application of this result, we construct an RNN-based filter and prove that it can well approximate finite-dimensional filters, which include the Kalman filter (KF) and the Beneš filter as special cases. The efficiency of the RNN-based filter is also verified in two numerical experiments by comparison with the optimal KF.
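As background for the Kalman filter baseline mentioned above, here is a minimal sketch of a discrete-time scalar Kalman filter. It is the textbook predict/update recursion, not the RNN-based filter of the article. The model form x_{k+1} = a x_k + w_k, y_k = c x_k + v_k, the function name `kalman_1d`, and all default parameter values are assumptions chosen for illustration.

```python
def kalman_1d(observations, a=1.0, c=1.0, q=0.01, r=0.1, x0=0.0, p0=1.0):
    """Scalar discrete-time Kalman filter (illustrative sketch).

    Assumed model (hypothetical parameters):
        x_{k+1} = a * x_k + w_k,  w_k ~ N(0, q)
        y_k     = c * x_k + v_k,  v_k ~ N(0, r)
    Returns a list of (state estimate, error variance) pairs,
    one per observation.
    """
    x, p = x0, p0
    estimates = []
    for y in observations:
        # Predict: propagate the estimate and its variance through the model.
        x_pred = a * x
        p_pred = a * a * p + q
        # Update: blend the prediction with the new observation.
        gain = p_pred * c / (c * c * p_pred + r)
        x = x_pred + gain * (y - c * x_pred)
        p = (1.0 - gain * c) * p_pred
        estimates.append((x, p))
    return estimates


if __name__ == "__main__":
    # Noise-free constant observations: the estimate should approach 1.0.
    for x_hat, var in kalman_1d([1.0] * 5):
        print(round(x_hat, 4), round(var, 4))
```

For linear Gaussian models this recursion is the optimal filter, which is why it serves as the reference against which approximate filters are compared.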
ABSTRACT
Metal-organic frameworks (MOFs) with aggregation-induced emission (AIE) activity have potential applications in energy and biomedical technology. However, the controllable synthesis of MOFs with varied particle sizes not only affects their AIE activity but also restricts their application scenarios. In this work, Eu-MOFs of varied particle sizes are synthesized by adjusting the synthesis process parameters, and the rules governing their variation are studied by combining single-factor analysis with machine learning. Based on its R² score (0.9535), the gradient boosting decision tree (GBDT) regression model is employed to calculate the weights of, and correlations between, the different synthesis process parameters. The results show that all these parameters have synergistic effects on the particle size of Eu-MOFs and that the Eu-precursor concentration dominates the synthesis process. Furthermore, the results indicate that a large particle size and strong structural stability contribute to the high AIE activity of Eu-MOFs. Finally, a screen-printed pattern is fabricated using sample "120-0.3-6", and this pattern exhibits bright red fluorescence under UV light. More importantly, this kind of Eu-MOF can also be used to identify various ions (Fe³⁺, F⁻, I⁻, SO₄²⁻, CO₃²⁻, and PO₄³⁻) and citric acid.