ABSTRACT
We introduce the minimum free energy-based Gaussian Self-Benchmarking (MFE-GSB) framework, a new approach designed to address the many biases inherent in RNA-seq data. Central to our methodology is the MFE concept, which enables a Gaussian distribution model tailored to mitigate all co-existing biases within a k-mer counting scheme. The MFE-GSB framework operates on a dual-model system, juxtaposing modeling data with a uniform k-mer distribution against real, observed sequencing data characterized by nonuniform k-mer distributions. The framework applies a Gaussian function, guided by predetermined parameters (mean and SD) derived from the modeling data, to fit unknown sequencing data. This dual comparison allows accurate prediction of k-mer abundances across MFE categories, enabling simultaneous correction of biases at the single-k-mer level. Through validation with both engineered RNA constructs and human tissue RNA samples, we demonstrate its wide-ranging efficacy and applicability.
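The abstract does not spell out how the modeling-data Gaussian is used to correct observed counts, so the following is a minimal sketch of one plausible reading: k-mer counts in an MFE category are mapped onto the modeling-data Gaussian by a quantile transform. The parameters, placeholder data, and the correction rule itself are illustrative assumptions, not the authors' published algorithm.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Modeling data: k-mer counts for one MFE category under an assumed uniform
# k-mer distribution, summarized by a Gaussian mean and SD.
model_counts = rng.normal(loc=100.0, scale=8.0, size=5000)    # placeholder
mu_model, sd_model = model_counts.mean(), model_counts.std(ddof=1)

# Observed sequencing data for the same MFE category (biased, nonuniform).
obs_counts = rng.normal(loc=130.0, scale=20.0, size=5000)     # placeholder
mu_obs, sd_obs = obs_counts.mean(), obs_counts.std(ddof=1)

# One plausible single-k-mer correction: map each observed count onto the
# modeling-data Gaussian through its quantile (probability-integral transform).
u = norm.cdf(obs_counts, loc=mu_obs, scale=sd_obs)
corrected = norm.ppf(np.clip(u, 1e-12, 1 - 1e-12), loc=mu_model, scale=sd_model)

print(f"observed  mean/SD: {mu_obs:.1f}/{sd_obs:.1f}")
print(f"corrected mean/SD: {corrected.mean():.1f}/{corrected.std(ddof=1):.1f}")
```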
Subjects
RNA-Seq , Humans , RNA-Seq/methods , Benchmarking , Sequence Analysis, RNA/methods , RNA/chemistry , RNA/genetics , Algorithms , Normal Distribution , Computational Biology/methods , Bias
ABSTRACT
The parabolic convection-diffusion-reaction problem is examined in this work, where the diffusion and convection terms are multiplied by two small parameters, respectively. The proposed approach is based on a fitted operator finite difference method. In the first step, the Crank-Nicolson method on a uniform mesh is used to discretize the time variable. A two-point Gaussian quadrature rule is then used to discretize the resulting semi-discrete problems in space, with second-order interpolation of the first derivatives. The value of the fitting factor, which accounts for abrupt changes in the solution, is calculated using singular perturbation theory. The developed scheme is shown to be second-order accurate and uniformly convergent. The method's applicability is validated on two examples, for which it yields more accurate results than several other methods in the literature.
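As a reference point for the time discretization described above, the sketch below applies a plain Crank-Nicolson/central-difference scheme on a uniform mesh to a 1D parabolic convection-diffusion-reaction problem. It is not the paper's fitted-operator or Gaussian-quadrature scheme; the parameters, reaction coefficient, source term, and boundary/initial data are all assumptions.

```python
import numpy as np

eps, mu, b = 1e-2, 1e-1, 1.0            # two small parameters, reaction coefficient
N, M, T = 200, 400, 1.0                 # space cells, time steps, final time
x = np.linspace(0.0, 1.0, N + 1)
h, dt = x[1] - x[0], T / M

def f(x, t):                            # source term (assumed)
    return np.sin(np.pi * x) * np.exp(-t)

# Spatial operator L u = eps*u_xx - mu*u_x - b*u on interior nodes
# (homogeneous Dirichlet boundary conditions assumed).
n = N - 1
sub = eps / h**2 + mu / (2 * h)         # coefficient of u_{i-1}
dia = -2 * eps / h**2 - b               # coefficient of u_i
sup = eps / h**2 - mu / (2 * h)         # coefficient of u_{i+1}
L = np.zeros((n, n))
np.fill_diagonal(L, dia)
np.fill_diagonal(L[1:], sub)
np.fill_diagonal(L[:, 1:], sup)

I = np.eye(n)
A = I - 0.5 * dt * L                    # implicit half of Crank-Nicolson
B = I + 0.5 * dt * L                    # explicit half

u = np.sin(np.pi * x[1:-1])             # initial condition (assumed)
t = 0.0
for _ in range(M):
    rhs = B @ u + dt * f(x[1:-1], t + 0.5 * dt)
    u = np.linalg.solve(A, rhs)
    t += dt
print("max |u(x, T)| =", np.abs(u).max())
```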
Subjects
Algorithms , Normal Distribution , Diffusion , Models, Theoretical , Convection
ABSTRACT
The air pollution problem has attracted worldwide attention due to its multifaceted harm to human health. Characterizing air pollution concentrations and improving their forecasting are therefore important worldwide. In this research, we analyze air pollution concentrations in Southern Thailand and compare them with the central region. We also propose a methodology based on the lag-dependent Gaussian process (LDGP), a Bayesian non-parametric machine learning model, with a stable optimization approach: a cluster-based multi-start technique built on the Nelder-Mead optimizer. This model also provides a confidence band for forecasted values. We additionally used autoregressive deep neural network (AR-DNN), autoregressive random forest (AR-RF), gradient boosting (GB), and K-nearest neighbors (KNN) models. The models were compared on daily air pollution data collected from the southern provinces and the capital of Thailand between 1 January 2018 and 31 December 2022, using well-established performance evaluation measures. To assess bias due to overfitting, we performed tenfold cross-validation for all pollutants in each region and compared the models to choose the best one. We also explored the concentration of air pollution in these regions. Descriptive analysis revealed that Bangkok had a much higher concentration of air pollution than the southern region; however, relative to WHO recommendations, the southern region had higher exposure to PM air pollutants, as well as higher exposure to O3 and CO levels. The proposed LDGP model outperformed the other machine learning models in forecasting all air pollutants, and is therefore recommended for further research and studies with different kernel functions. This research is also expected to contribute to local government planning and prevention, and to worldwide use of the same methodology for the sustainability of public health.
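To make the LDGP idea concrete, here is a hedged sketch: Gaussian-process regression on lagged values of a series, with kernel hyperparameters tuned by multi-start Nelder-Mead. The RBF kernel, the lag order, and random (rather than cluster-based) starting points are simplifying assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
series = np.sin(np.arange(200) / 10.0) + 0.1 * rng.standard_normal(200)

p = 3                                             # lag order (assumed)
X = np.column_stack([series[i:len(series) - p + i] for i in range(p)])
y = series[p:]

def nll(log_params, X, y):
    """Negative log marginal likelihood of a GP with an RBF kernel."""
    sf, ls, sn = np.exp(log_params)               # signal SD, length-scale, noise SD
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = sf**2 * np.exp(-0.5 * d2 / ls**2) + sn**2 * np.eye(len(y))
    try:
        L = np.linalg.cholesky(K)
    except np.linalg.LinAlgError:
        return 1e10                               # reject non-PD proposals
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * len(y) * np.log(2 * np.pi)

# Multi-start Nelder-Mead: keep the best of several local searches.
best = None
for _ in range(5):
    x0 = rng.normal(0.0, 1.0, size=3)             # random start (paper: cluster-based)
    res = minimize(nll, x0, args=(X, y), method="Nelder-Mead")
    if best is None or res.fun < best.fun:
        best = res
print("optimized (signal SD, length-scale, noise SD):", np.exp(best.x))
```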
Subjects
Air Pollutants , Air Pollution , Environmental Monitoring , Forecasting , Thailand , Air Pollution/statistics & numerical data , Environmental Monitoring/methods , Air Pollutants/analysis , Normal Distribution , Bayes Theorem , Particulate Matter/analysis
ABSTRACT
This paper introduces a model for longitudinal functional data analysis that accounts for pointwise skewness. The proposed procedure decouples the marginal pointwise variation from the complex longitudinal and functional dependence using copula methodology. Pointwise variation is described through parametric distribution functions that capture varying skewness and change smoothly both in time and over the functional argument. Joint dependence is quantified through a Gaussian copula with a low-rank approximation-based covariance. The introduced class of models provides a unifying platform for both pointwise quantile estimation and prediction of complete trajectories at new times. We investigate the methods numerically in simulations and discuss their application to a diffusion tensor imaging study of multiple sclerosis patients. This approach is implemented in the R package sLFDA that is publicly available on GitHub.
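A minimal sketch of the decoupling step, under stated assumptions: skew-normal marginals fitted independently at each grid point (without the paper's smoothing over time and the functional argument), and a Gaussian copula correlation estimated from normal scores. The data are placeholders.

```python
import numpy as np
from scipy.stats import skewnorm, norm

n_subj, n_grid = 60, 25
t = np.linspace(0, 1, n_grid)
# Placeholder functional data: skewed noise around a smooth mean curve.
Y = np.sin(2 * np.pi * t) + skewnorm.rvs(4, scale=0.5,
                                         size=(n_subj, n_grid), random_state=2)

# 1) Pointwise marginals: fit a skew-normal at every grid point.
params = [skewnorm.fit(Y[:, j]) for j in range(n_grid)]

# 2) Transform to normal scores and estimate the copula correlation.
U = np.column_stack([skewnorm.cdf(Y[:, j], *params[j]) for j in range(n_grid)])
Z = norm.ppf(np.clip(U, 1e-6, 1 - 1e-6))
R = np.corrcoef(Z, rowvar=False)                  # Gaussian copula correlation

# Pointwise 90th-percentile curve, directly from the fitted marginals.
q90 = np.array([skewnorm.ppf(0.9, *params[j]) for j in range(n_grid)])
print("copula correlation, first two grid points:", round(R[0, 1], 3))
print("pointwise q90 at t=0:", round(q90[0], 3))
```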
Subjects
Computer Simulation , Diffusion Tensor Imaging , Models, Statistical , Multiple Sclerosis , Humans , Longitudinal Studies , Multiple Sclerosis/diagnostic imaging , Diffusion Tensor Imaging/statistics & numerical data , Normal Distribution , Data Interpretation, Statistical , Biometry/methods
ABSTRACT
In regression-based analyses of group-level neuroimage data, researchers typically fit a series of marginal general linear models to image outcomes at each spatially referenced pixel. Spatial regularization of effects of interest is usually induced indirectly by applying spatial smoothing to the data during preprocessing. While this procedure often works well, the resulting inference can be poorly calibrated. Spatial modeling of effects of interest leads to more powerful analyses; however, the number of locations in a typical neuroimage can preclude standard computing methods in this setting. Here, we contribute a Bayesian spatial regression model for group-level neuroimaging analyses. We induce regularization of spatially varying regression coefficient functions through Gaussian process priors. When combined with a simple non-stationary model for the error process, our prior hierarchy can lead to more data-adaptive smoothing than standard methods. We achieve computational tractability through a Vecchia-type approximation of our prior that retains full spatial rank and can be constructed for a wide class of spatial correlation functions. We outline several ways to work with our model in practice and compare performance against standard vertex-wise analyses and several alternatives. Finally, we illustrate our methods in an analysis of cortical surface functional magnetic resonance imaging task contrast data from a large cohort of children enrolled in the adolescent brain cognitive development study.
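The Vecchia-type idea can be illustrated with a small sketch: order the locations, and let each observation condition only on its m nearest previously ordered neighbors, so the log-likelihood decomposes into small conditional Gaussian terms. The kernel, ordering, and data are assumptions; the paper's construction additionally retains full spatial rank.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 500, 10
locs = rng.uniform(0, 1, size=(n, 2))             # pixel/vertex coordinates
locs = locs[np.argsort(locs[:, 0])]               # a simple coordinate ordering

def kern(a, b, sf=1.0, ls=0.2):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return sf**2 * np.exp(-0.5 * d2 / ls**2)

# Simulate y once from the full GP, for demonstration only.
K = kern(locs, locs) + 1e-6 * np.eye(n)
y = np.linalg.cholesky(K) @ rng.standard_normal(n)

def vecchia_loglik(locs, y, m):
    ll = 0.0
    for i in range(len(y)):
        nn = np.arange(i)                         # candidates: earlier points
        if i > m:
            d = ((locs[:i] - locs[i]) ** 2).sum(1)
            nn = np.argsort(d)[:m]                # m nearest earlier neighbors
        kii = kern(locs[[i]], locs[[i]])[0, 0] + 1e-6
        if nn.size == 0:
            mu, var = 0.0, kii
        else:
            Knn = kern(locs[nn], locs[nn]) + 1e-6 * np.eye(nn.size)
            kin = kern(locs[nn], locs[[i]])[:, 0]
            w = np.linalg.solve(Knn, kin)
            mu, var = w @ y[nn], kii - w @ kin
        ll += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mu) ** 2 / var)
    return ll

print("Vecchia log-likelihood (m=10):", round(vecchia_loglik(locs, y, m), 2))
```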
Subjects
Bayes Theorem , Neuroimaging , Humans , Normal Distribution , Neuroimaging/methods , Neuroimaging/statistics & numerical data , Magnetic Resonance Imaging/methods , Magnetic Resonance Imaging/statistics & numerical data , Child , Cerebral Cortex/diagnostic imaging , Computer Simulation , Regression Analysis , Models, Statistical , Image Processing, Computer-Assisted/methods
ABSTRACT
In this contribution, we use Gaussian posterior probability densities to characterize local estimates from distributed sensors, and assume that they all belong to the Riemannian manifold of Gaussian distributions. Our starting point is to introduce a proper Lie algebraic structure for the Gaussian submanifold with a fixed mean vector; the average dissimilarity between the fused density and the local posterior densities can then be measured by the norm of a Lie algebraic vector. Under Gaussian assumptions, a geodesic-projection-based algebraic fusion method is proposed that obtains the fused density by taking this norm as the loss. It provides a robust fixed-point iterative algorithm for the mean fusion with theoretical convergence, and gives an analytical form for the fused covariance matrix. The effectiveness of the proposed fusion method is illustrated by numerical examples.
Subjects
Algorithms , Normal Distribution , Models, Theoretical
ABSTRACT
Long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) are two typical types of non-coding RNAs that interact and play important regulatory roles in many animal organisms. Exploring the unknown interactions between lncRNAs and miRNAs contributes to a better understanding of their functional involvement. Currently, studying lncRNA-miRNA interactions relies heavily on laborious biological experiments, so a computational method for predicting them is needed. In this work, we propose MPGK-LMI, a method that uses a graph attention network (GAT) to predict lncRNA-miRNA interactions in animals. First, we construct a meta-path similarity matrix based on known lncRNA-miRNA interaction information. Then, we use a GAT to aggregate the constructed meta-path similarity matrix and the computed Gaussian kernel similarity matrix, updating the feature matrix with neighbourhood information. Finally, a scoring module makes the predictions. Compared with three state-of-the-art algorithms, MPGK-LMI achieves the best performance, with an AUC of 0.9077, AUPR of 0.9327, ACC of 0.9080, F1-score of 0.9143, and precision of 0.8739. These results validate the effectiveness and reliability of MPGK-LMI. Additionally, we conduct detailed case studies to demonstrate the effectiveness and feasibility of our approach in practical applications. Through these empirical results, we gain deeper insights into the functional roles and mechanisms of lncRNA-miRNA interactions. In summary, our method not only outperforms others in predictive performance but also demonstrates its practicality and reliability in biological research through real-case analysis, offering support and guidance for future studies and applications.
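The Gaussian kernel similarity used as one of the inputs is, in this literature, usually the Gaussian interaction-profile (GIP) kernel computed from the known interaction matrix; whether MPGK-LMI uses exactly this bandwidth convention is an assumption, but the form below is standard.

```python
import numpy as np

rng = np.random.default_rng(4)
A = (rng.random((40, 60)) < 0.05).astype(float)   # lncRNA x miRNA adjacency (toy)

def gaussian_kernel_similarity(profiles):
    """K[i, j] = exp(-gamma * ||p_i - p_j||^2) with gamma = 1 / mean(||p_i||^2)."""
    sq = (profiles**2).sum(axis=1)
    gamma = 1.0 / max(sq.mean(), 1e-12)
    d2 = sq[:, None] + sq[None, :] - 2.0 * profiles @ profiles.T
    return np.exp(-gamma * np.clip(d2, 0.0, None))

K_lnc = gaussian_kernel_similarity(A)             # similarity between lncRNAs
K_mi = gaussian_kernel_similarity(A.T)            # similarity between miRNAs
print(K_lnc.shape, K_mi.shape)                    # (40, 40) (60, 60)
```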
Subjects
Algorithms , Computational Biology , MicroRNAs , RNA, Long Noncoding , RNA, Long Noncoding/genetics , MicroRNAs/genetics , Computational Biology/methods , Animals , Humans , Gene Regulatory Networks , Normal Distribution
ABSTRACT
BACKGROUND: RNA sequencing is a vital technique for analyzing RNA behavior in cells, but it often suffers from various biases that distort the data. Traditional methods to address these biases are typically empirical and handle them individually, limiting their effectiveness. Our study introduces the Gaussian Self-Benchmarking (GSB) framework, a novel approach that leverages the natural distribution patterns of guanine (G) and cytosine (C) content in RNA to mitigate multiple biases simultaneously. This method is grounded in a theoretical model, organizing k-mers based on their GC content and applying a Gaussian model for alignment to ensure empirical sequencing data closely match their theoretical distribution. RESULTS: The GSB framework demonstrated superior performance in mitigating sequencing biases compared to existing methods. Testing with synthetic RNA constructs and real human samples showed that the GSB approach not only addresses individual biases more effectively but also manages co-existing biases jointly. The framework's reliance on accurately pre-determined parameters like mean and standard deviation of GC content distribution allows for a more precise representation of RNA samples. This results in improved accuracy and reliability of RNA sequencing data, enhancing our understanding of RNA behavior in health and disease. CONCLUSIONS: The GSB framework presents a significant advancement in RNA sequencing analysis by providing a well-validated, multi-bias mitigation strategy. It functions independently from previously identified dataset flaws and sets a new standard for unbiased RNA sequencing results. This development enhances the reliability of RNA studies, broadening the potential for scientific breakthroughs in medicine and biology, particularly in genetic disease research and the development of targeted treatments.
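A hedged sketch of the core GSB idea under stated assumptions (k = 6, a uniform theoretical model, placeholder biased counts): group k-mers by GC content, derive the theoretical Gaussian from the uniform model, and compare the observed GC distribution against it.

```python
import numpy as np
from itertools import product

k = 6
kmers = ["".join(p) for p in product("ACGT", repeat=k)]
gc = np.array([(s.count("G") + s.count("C")) / k for s in kmers])

# Theoretical model: all 4^k k-mers equally likely, so the GC fraction is
# Binomial(k, 0.5)/k, well approximated by a Gaussian with these parameters.
mu_th, sd_th = 0.5, np.sqrt(0.25 / k)

# Observed data: placeholder biased counts that over-represent GC-rich k-mers.
rng = np.random.default_rng(5)
counts = rng.poisson(100 * np.exp(1.5 * (gc - 0.5)))

# Benchmark: compare observed GC moments against the theoretical Gaussian.
obs_mean = np.average(gc, weights=counts)
obs_sd = np.sqrt(np.average((gc - obs_mean) ** 2, weights=counts))
print(f"theoretical GC mean/SD: {mu_th:.3f}/{sd_th:.3f}")
print(f"observed    GC mean/SD: {obs_mean:.3f}/{obs_sd:.3f}  (bias shifts the mean)")
```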
Subjects
Base Composition , RNA-Seq , Humans , RNA-Seq/methods , Normal Distribution , Sequence Analysis, RNA/methods , Bias , RNA/genetics
ABSTRACT
Thermal proteome profiling (TPP) is a proteome-wide technology that enables unbiased detection of protein drug interactions as well as changes in the post-translational state of proteins between different biological conditions. Statistical analysis of temperature range TPP (TPP-TR) datasets relies on comparing protein melting curves, which describe the amount of non-denatured protein as a function of temperature, between different conditions (e.g. presence or absence of a drug). However, state-of-the-art models are restricted to sigmoidal melting behaviours, while unconventional melting curves, representing up to 50% of TPP-TR datasets, have recently been shown to carry important biological information. We present a novel statistical framework, based on hierarchical Gaussian process models and named GPMelt, that makes TPP-TR analysis unbiased with respect to the melting profiles of proteins. GPMelt scales to multiple conditions, and extending the model to deeper hierarchies (i.e. with additional sub-levels) makes it possible to handle complex TPP-TR protocols. Collectively, our statistical framework extends the analysis of TPP-TR datasets to both protein- and peptide-level melting curves, offering access to thousands of previously excluded melting curves and thus substantially increasing the coverage and the ability of TPP to uncover new biology.
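The building block of such an analysis, shown below as a minimal sketch, is a Gaussian process fit to a single melting curve that imposes no sigmoidal shape. GPMelt itself is hierarchical across proteins, conditions, and replicates; the kernel choice and the toy data here are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# One protein's melting curve: soluble fraction vs temperature (toy values
# with a non-sigmoidal bump, the kind of curve sigmoid models exclude).
temps = np.array([37, 41, 44, 47, 50, 53, 56, 59, 63, 67], dtype=float)
frac = np.array([1.0, 0.98, 0.93, 0.80, 0.55, 0.35, 0.45, 0.30, 0.12, 0.08])

kernel = 1.0 * RBF(length_scale=5.0) + WhiteKernel(noise_level=1e-2)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(temps[:, None], frac)

grid = np.linspace(37, 67, 7)[:, None]
mean, sd = gp.predict(grid, return_std=True)
for T, m_, s in zip(grid.ravel(), mean, sd):
    print(f"T = {T:4.1f} C  fitted fraction = {m_:.2f} +/- {s:.2f}")
```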
Subjects
Proteome , Proteome/metabolism , Normal Distribution , Computational Biology/methods , Proteomics/methods , Models, Statistical , Algorithms
ABSTRACT
Complexome profiling is an experimental approach to identify interactions by integrating native separation of protein complexes and quantitative mass spectrometry. In a typical complexome profile, thousands of proteins are detected across ≤100 fractions. This relatively low resolution leads to similar abundance profiles between proteins that are not necessarily interaction partners. To address this challenge, we introduce the Gaussian Interaction Profiler (GIP), a Gaussian mixture modeling-based clustering workflow that assigns protein clusters by modeling the migration profile of each cluster. Uniquely, the GIP offers a way to prioritize actual interactors over spuriously comigrating proteins. Using previously analyzed human fibroblast complexome profiles, we show good performance of the GIP compared to other state-of-the-art tools. We further demonstrate the GIP's utility by applying it to complexome profiles from the transmissible lifecycle stage of malaria parasites, unveiling promising novel associations for future experimental verification, including an interaction between the vaccine target Pfs47 and the hypothetical protein PF3D7_0417000. Taken together, the GIP provides methodological advances that facilitate more accurate and automated detection of protein complexes, setting the stage for more varied and nuanced analyses in the field of complexome profiling. The complexome profiling data have been deposited to the ProteomeXchange Consortium with the dataset identifier PXD050751.
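A minimal sketch of the clustering core, under assumptions (placeholder profiles, a fixed number of components): fit a Gaussian mixture model to normalized migration profiles and read off hard labels and soft membership probabilities. The GIP adds per-cluster migration-profile modeling and interactor prioritization on top of this step.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(6)
n_fracs = 60
centers = np.stack([np.exp(-0.5 * ((np.arange(n_fracs) - c) / 3.0) ** 2)
                    for c in (15, 30, 45)])
# Placeholder complexome profile: 120 proteins comigrating around three peaks.
profiles = np.repeat(centers, 40, axis=0) + 0.05 * rng.standard_normal((120, n_fracs))
profiles = np.clip(profiles, 0, None)
profiles /= profiles.sum(axis=1, keepdims=True)   # normalize each migration profile

gmm = GaussianMixture(n_components=3, covariance_type="diag", random_state=0)
labels = gmm.fit_predict(profiles)
resp = gmm.predict_proba(profiles)                # soft cluster memberships

print("cluster sizes:", np.bincount(labels))
print("max membership probability, protein 0:", round(resp[0].max(), 3))
```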
Subjects
Plasmodium falciparum , Protozoan Proteins , Plasmodium falciparum/metabolism , Plasmodium falciparum/chemistry , Protozoan Proteins/chemistry , Protozoan Proteins/metabolism , Protozoan Proteins/analysis , Humans , Proteomics/methods , Normal Distribution , Mass Spectrometry/methods , Protein Interaction Mapping/methods , Cluster Analysis , Proteome/analysis
ABSTRACT
Smoothing filters are widely used in EEG signal processing to remove noise while preserving signal features. Inspired by our recent work on Upscale and Downscale Representation (UDR), this paper proposes a cascade arrangement of effective image-processing techniques for signal filtering in the image domain. The UDR concept is to visualize an EEG signal at an appropriate line width and convert it to a binary image. Smoothing is then performed by skeletonizing the signal object to unit width and projecting it back to the time domain. Two successive UDRs can yield better smoothing performance, but the repeated binary image conversion is computationally expensive, especially at larger line width values. Cascaded Thinning UDR (CTUDR) is therefore proposed, exploiting morphological operations to perform a two-stage upscale and downscale within a single binary image representation. CTUDR is verified on signal smoothing and classification tasks and compared with conventional techniques such as the Moving Average, Binomial, Median, and Savitzky-Golay filters. Simulated EEG data with added white Gaussian noise is employed for the former, while cognitive conflict data obtained from a 3D object selection task is utilized for the latter. CTUDR outperforms its counterparts, achieving the best fitting error and correlation coefficient in signal smoothing and the highest gain in Accuracy (0.7640%) and F-measure (0.7607%) when used as a smoothing filter for the training data of EEGNet.
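To illustrate the UDR concept (not the CTUDR cascade itself), the sketch below draws a noisy signal as a thick binary ribbon, skeletonizes it, and reads the skeleton back column by column. Image height, line width, and the readback rule are assumptions.

```python
import numpy as np
from skimage.morphology import skeletonize

rng = np.random.default_rng(7)
clean = np.sin(np.linspace(0, 4 * np.pi, 300))
sig = clean + 0.15 * rng.standard_normal(300)

H, half_w = 200, 4                                # image height, half line width
rows = ((sig - sig.min()) / np.ptp(sig) * (H - 1)).astype(int)

img = np.zeros((H, len(sig)), dtype=bool)
for x in range(len(sig)):                         # draw a thick, connected ribbon
    r_prev = rows[x] if x == 0 else rows[x - 1]
    lo, hi = sorted((r_prev, rows[x]))
    img[max(lo - half_w, 0):min(hi + half_w + 1, H), x] = True

skel = skeletonize(img)                           # reduce the ribbon to ~1 px width

smoothed = np.empty(len(sig))
for x in range(len(sig)):                         # read back: mean skeleton row
    on = np.flatnonzero(skel[:, x])
    smoothed[x] = on.mean() if on.size else rows[x]
smoothed = smoothed / (H - 1) * np.ptp(sig) + sig.min()

print("residual SD vs clean signal, before/after:",
      round(np.std(sig - clean), 3), round(np.std(smoothed - clean), 3))
```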
Subjects
Algorithms , Electroencephalography , Signal Processing, Computer-Assisted , Electroencephalography/methods , Humans , Signal-To-Noise Ratio , Normal Distribution , Computer Simulation , Image Processing, Computer-Assisted/methods
ABSTRACT
Polyphenol oxidase (PPO) plays a key role in the enzymatic browning process, and this study employed Gaussian-accelerated molecular dynamics (GaMD) simulations to investigate the catalytic efficiency mechanisms of lotus root PPO with different substrates, including catechin, epicatechin, and chlorogenic acid, as well as the inhibitor oxalic acid. Key findings reveal significant conformational changes in PPO that correlate with its enzymatic activity. Upon substrate binding, the alpha-helix in the Q53-D63 region near the copper ion extends, likely stabilizing the active site and enhancing catalysis. In contrast, this helix is disrupted in the presence of the inhibitor, resulting in a decrease in enzymatic efficiency. Additionally, the F350-V378 region, which covers the substrate-binding site, forms an alpha-helix upon substrate binding, further stabilizing the substrate and promoting catalytic function. However, this alpha-helix does not form when the inhibitor is bound, destabilizing the binding site and contributing to inhibition. These findings offer new insights into the substrate-specific and inhibitor-induced structural dynamics of lotus root PPO, providing valuable information for enhancing food processing and preservation techniques.
Subjects
Catechol Oxidase , Lotus , Molecular Dynamics Simulation , Plant Roots , Lotus/enzymology , Catechol Oxidase/metabolism , Catechol Oxidase/chemistry , Plant Roots/enzymology , Substrate Specificity , Markov Chains , Catalytic Domain , Plant Proteins/metabolism , Plant Proteins/chemistry , Catechin/chemistry , Catechin/metabolism , Binding Sites , Normal Distribution
ABSTRACT
We present a new method for constructing valid covariance functions of Gaussian processes for spatial analysis in irregular, non-convex domains such as bodies of water. Standard covariance functions based on geodesic distances are not guaranteed to be positive definite on such domains, while existing non-Euclidean approaches fail to respect the partially Euclidean nature of these domains where the geodesic distance agrees with the Euclidean distances for some pairs of points. Using a visibility graph on the domain, we propose a class of covariance functions that preserve Euclidean-based covariances between points that are connected in the domain while incorporating the non-convex geometry of the domain via conditional independence relationships. We show that the proposed method preserves the partially Euclidean nature of the intrinsic geometry on the domain while maintaining validity (positive definiteness) and marginal stationarity of the covariance function over the entire parameter space, properties which are not always fulfilled by existing approaches to construct covariance functions on non-convex domains. We provide useful approximations to improve computational efficiency, resulting in a scalable algorithm. We compare the performance of our method with those of competing state-of-the-art methods using simulation studies on synthetic non-convex domains. The method is applied to data regarding acidity levels in the Chesapeake Bay, showing its potential for ecological monitoring in real-world spatial applications on irregular domains.
Subjects
Algorithms , Computer Simulation , Spatial Analysis , Models, Statistical , Normal Distribution , Biometry/methods
ABSTRACT
A promising approach for scalable Gaussian processes (GPs) is the Karhunen-Loève (KL) decomposition, in which the GP kernel is represented by a set of basis functions which are the eigenfunctions of the kernel operator. Such decomposed kernels have the potential to be very fast and do not depend on the selection of a reduced set of inducing points. However, KL decompositions lead to high dimensionality, so variable selection becomes paramount. This paper reports a new method of forward variable selection, enabled by the ordered nature of the basis functions in the KL expansion of the Bayesian Smoothing Spline ANOVA kernel (BSS-ANOVA), coupled with fast Gibbs sampling in a fully Bayesian approach. It quickly and effectively limits the number of terms, yielding a method with competitive accuracy and training and inference times for tabular datasets of low feature-set dimensionality. Theoretical computational complexities are [Formula: see text] in training and [Formula: see text] per point in inference, where N is the number of instances and P the number of expansion terms. The inference speed and accuracy make the method especially useful for dynamic systems identification: the dynamics are modeled in the tangent space as a static problem, and the learned dynamics are then integrated using a high-order scheme. The methods are demonstrated on two dynamic datasets: a 'Susceptible, Infected, Recovered' (SIR) toy problem and the experimental 'Cascaded Tanks' benchmark dataset. Comparisons on the static prediction of time derivatives are made with a random forest (RF), a residual neural network (ResNet), and the Orthogonal Additive Kernel (OAK) inducing-points scalable GP, while for the time-series prediction comparisons are made with LSTM and GRU recurrent neural networks (RNNs) along with the SINDy package.
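A hedged sketch of forward selection over an ordered basis expansion: the GP is written as y ≈ Φw with ordered basis functions, and the best prefix of terms is chosen by BIC. A cosine basis stands in for the BSS-ANOVA eigenfunctions, and ridge regression stands in for the paper's Gibbs sampling.

```python
import numpy as np

rng = np.random.default_rng(8)
n, P_max, lam = 300, 30, 1e-3
x = rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x) + 0.5 * np.cos(6 * np.pi * x) + 0.1 * rng.standard_normal(n)

# Ordered basis (stand-in for the BSS-ANOVA eigenfunctions).
Phi = np.column_stack([np.cos(np.pi * j * x) for j in range(1, P_max + 1)])

def bic(Phi_p, y, lam):
    """BIC of a ridge fit using the first p ordered basis terms."""
    w = np.linalg.solve(Phi_p.T @ Phi_p + lam * np.eye(Phi_p.shape[1]), Phi_p.T @ y)
    rss = ((y - Phi_p @ w) ** 2).sum()
    return len(y) * np.log(rss / len(y)) + Phi_p.shape[1] * np.log(len(y))

# Forward selection in the basis's natural order: keep the best-BIC prefix.
best_p, best_bic = 1, bic(Phi[:, :1], y, lam)
for p in range(2, P_max + 1):
    b = bic(Phi[:, :p], y, lam)
    if b < best_bic:
        best_p, best_bic = p, b
print(f"selected {best_p} of {P_max} ordered basis terms (BIC = {best_bic:.1f})")
```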
Subjects
Algorithms , Bayes Theorem , Normal Distribution
ABSTRACT
Large-scale studies of gene expression are commonly influenced by biological and technical sources of expression variation, including batch effects, sample characteristics, and environmental impacts. Learning the causal relationships between observable variables is challenging in the presence of unobserved confounders, and many high-dimensional regression techniques perform worse there as well. Controlling for unobserved confounding variables is therefore essential, and many deconfounding methods have been suggested for a variety of situations. The main contribution of this article is a two-stage deconfounding procedure based on Bow-free Acyclic Paths (BAP) search, developed within the framework of Structural Equation Models (SEM) and implemented as SEMbap(). In the first stage, an exhaustive search of missing edges with significant covariance is performed via Shipley d-separation tests; in the second stage, a Constrained Gaussian Graphical Model (CGGM) is fitted, or a low-dimensional representation of the bow-free edge structure is obtained via Graph Laplacian Principal Component Analysis (gLPCA). We compare four popular deconfounding methods to the BAP search approach with applications to simulated and observed expression data. In the simulations, different structures of the hidden covariance matrix were replicated. Compared to existing methods, the BAP search algorithm correctly identifies hidden confounding while controlling the false positive rate and achieving good fitting and perturbation metrics.
Subjects
Algorithms , Computational Biology , Computational Biology/methods , Humans , Principal Component Analysis , Computer Simulation , Gene Expression Profiling/methods , Gene Expression Profiling/statistics & numerical data , Models, Statistical , Correlation of Data , Normal Distribution
ABSTRACT
Gaussian graphical models (GGMs) are useful for understanding the complex relationships between biological entities. Transfer learning can improve the estimation of GGMs in a target dataset by incorporating relevant information from related source studies. However, biomedical research often involves intrinsic and latent heterogeneity within a study, such as heterogeneous subpopulations. This heterogeneity can make it difficult to identify informative source studies or lead to negative transfer if the source study is improperly used. To address this challenge, we developed a heterogeneous latent transfer learning (Latent-TL) approach that accounts for both within-sample and between-sample heterogeneity. The idea behind this approach is to "learn from the alike" by leveraging the similarities between source and target GGMs within each subpopulation. The Latent-TL algorithm simultaneously identifies common subpopulation structures among samples and facilitates the learning of target GGMs using source samples from the same subpopulation. Through extensive simulations and real data application, we have shown that the proposed method outperforms single-site learning and standard transfer learning that ignores the latent structures. We have also demonstrated the applicability of the proposed algorithm in characterizing gene co-expression networks in breast cancer patients, where the inferred genetic networks identified many biologically meaningful gene-gene interactions.
Subjects
Algorithms , Breast Neoplasms , Computer Simulation , Models, Statistical , Normal Distribution , Humans , Breast Neoplasms/genetics , Female , Machine Learning , Gene Regulatory Networks
ABSTRACT
Breast cancer detection and differentiation of breast tissues are critical for accurate diagnosis and treatment planning. This study addresses the challenge of distinguishing between invasive ductal carcinoma (IDC), normal glandular breast tissue (nGBT), and adipose tissue using electrical impedance spectroscopy combined with a Gaussian relaxation-time distribution (EIS-GRTD). The primary objective is to investigate the relaxation-time characteristics of these tissues and their potential to differentiate between normal and abnormal breast tissues. We applied a single-point EIS-GRTD measurement to ten mastectomy specimens across a frequency range f = 4 Hz to 5 MHz. The method calculates the differential ratio of the relaxation-time distribution function γ between IDC and nGBT, denoted Δγ_IDC-nGBT, and between IDC and adipose tissue, denoted Δγ_IDC-adipose. As a result, Δγ_IDC-nGBT is 0.36 and Δγ_IDC-adipose is 0.27, both falling within the α-dispersion at τ_peak1 = 0.033 ± 0.001 s. In all specimens, the relaxation-time distribution function γ of IDC (γ_IDC) is higher than, and does not intersect, those of nGBT (γ_nGBT) and adipose tissue (γ_adipose). The difference in γ suggests potential variations in relaxation properties at the molecular or structural level within each breast tissue that contribute to the overall relaxation response. The average mean percentage error δ for IDC, nGBT, and adipose tissues is 5.90%, 6.33%, and 8.07%, respectively, demonstrating the model's accuracy and reliability. This study provides novel insights into the use of relaxation-time characteristics for differentiating breast tissue types, offering potential advances in diagnostic methods. Future research will focus on correlating EIS-GRTD findings with pathological results from the same test sites to further validate the method's efficacy.
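A generic sketch of recovering a relaxation-time distribution from impedance data is given below: a Debye kernel over a log-spaced τ grid, fit by regularized non-negative least squares. The grid, regularization strength, and synthetic spectrum are assumptions, and the paper's GRTD specifically uses Gaussian-shaped distribution terms rather than this free-form grid.

```python
import numpy as np
from scipy.optimize import nnls

freqs = np.logspace(0.6, 6.7, 80)                # ~4 Hz to 5 MHz
omega = 2 * np.pi * freqs
taus = np.logspace(-7, 0, 60)                    # candidate relaxation times

def debye(omega, tau):
    """Unit Debye relaxation term 1 / (1 + j*omega*tau)."""
    return 1.0 / (1.0 + 1j * omega * tau)

# Synthetic "measured" impedance: two Debye relaxations (noiseless for clarity).
Z = 0.7 * debye(omega, 3e-2) + 0.3 * debye(omega, 1e-5)

K = debye(omega[:, None], taus[None, :])         # kernel matrix over the tau grid
A = np.vstack([K.real, K.imag])                  # fit real and imaginary parts jointly
b = np.concatenate([Z.real, Z.imag])
lam = 1e-2                                       # Tikhonov strength (assumed)
A_reg = np.vstack([A, lam * np.eye(len(taus))])  # ridge rows shrink gamma toward 0
b_reg = np.concatenate([b, np.zeros(len(taus))])
gamma, _ = nnls(A_reg, b_reg)

for i in np.argsort(gamma)[-2:][::-1]:           # two largest recovered weights
    print(f"peak near tau = {taus[i]:.2e} s, weight {gamma[i]:.2f}")
```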
Subjects
Adipose Tissue , Breast Neoplasms , Carcinoma, Ductal, Breast , Dielectric Spectroscopy , Humans , Dielectric Spectroscopy/methods , Female , Carcinoma, Ductal, Breast/pathology , Normal Distribution , Breast/diagnostic imaging , Electric Impedance , Mastectomy
ABSTRACT
In the field of image processing, common noise types include Gaussian noise, salt-and-pepper noise, speckle noise, uniform noise, and pulse noise. Different types of noise require different denoising algorithms and techniques to maintain image quality and fidelity. Traditional image denoising methods not only remove image noise but also lose detail in the image: they cannot guarantee clean removal of the noise while preserving the true signal. To address these issues, an image denoising method combining an improved threshold function with the wavelet transform is proposed. Unlike traditional threshold functions, the improved threshold function is continuous, which avoids the pseudo-Gibbs effect after denoising and improves image quality. First, the output of a finite ridgelet transform is combined with the wavelet transform to improve denoising performance. Then, the improved threshold function is introduced to enhance the quality of the reconstructed image. To evaluate the performance of different algorithms, Gaussian noise of different densities was added to black-and-white and color Lena images. The results showed that when 0.01-variance Gaussian noise was added to black-and-white images, the peak signal-to-noise ratio of the proposed method increased by 2.58 dB, and the mean square error decreased by 0.10 dB. The proposed method had a minimum denoising time of only 13 ms, saving 9 ms and 3 ms compared with the hard-threshold algorithm (Hard TA) and soft-threshold algorithm (Soft TA), respectively. It also exhibited higher stability, with the average similarity error fluctuating within 0.89%. These results indicate that the method achieves smaller errors and better stability in image denoising and can be applied in digital image denoising, effectively promoting the development of image denoising technology.
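As an illustration of a continuous ("improved") threshold function inside a standard wavelet denoising pipeline, the sketch below uses the non-negative garrote rule, one well-known continuous compromise between hard and soft thresholding; the paper's exact threshold function and its ridgelet preprocessing are not reproduced, and the test image and noise level are assumptions.

```python
import numpy as np
import pywt

def garrote(c, lam):
    """Continuous threshold: 0 inside [-lam, lam], c - lam^2/c outside."""
    safe = np.where(c == 0, 1.0, c)               # avoid division by zero
    return np.where(np.abs(c) > lam, c - lam**2 / safe, 0.0)

rng = np.random.default_rng(10)
img = np.kron(rng.random((8, 8)), np.ones((32, 32)))      # blocky 256x256 test image
noisy = img + np.sqrt(0.01) * rng.standard_normal(img.shape)

coeffs = pywt.wavedec2(noisy, "sym8", level=3)
sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745        # Donoho noise estimate
lam = sigma * np.sqrt(2 * np.log(noisy.size))             # universal threshold

den = [coeffs[0]] + [tuple(garrote(d, lam) for d in lvl) for lvl in coeffs[1:]]
rec = pywt.waverec2(den, "sym8")

mse = lambda a, b: ((a - b) ** 2).mean()
psnr = lambda a, b: 10 * np.log10(1.0 / mse(a, b))        # peak value taken as 1.0
print(f"PSNR noisy: {psnr(img, noisy):.2f} dB, denoised: {psnr(img, rec):.2f} dB")
```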
Subjects
Algorithms , Image Processing, Computer-Assisted , Signal-To-Noise Ratio , Wavelet Analysis , Image Processing, Computer-Assisted/methods , Normal Distribution
ABSTRACT
Models intended to describe the time evolution of a gene network must somehow include transcription, the DNA-templated synthesis of RNA, and translation, the RNA-templated synthesis of proteins. In eukaryotes, the DNA template for transcription can be very long, often consisting of tens of thousands of nucleotides, and lengthy pauses may punctuate this process. Accordingly, transcription can last for many minutes, in some cases hours. There is a long history of introducing delays in gene expression models to take the transcription and translation times into account. Here we study a family of detailed transcription models that includes initiation, elongation, and termination reactions. We establish a framework for computing the distribution of transcription times, and work out these distributions for some typical cases. For elongation, a fixed delay is a good model provided elongation is fast compared to initiation and termination, and there are no sites where long pauses occur. The initiation and termination phases of the model then generate a nontrivial delay distribution, and elongation shifts this distribution by an amount corresponding to the elongation delay. When initiation and termination are relatively fast, the distribution of elongation times can be approximated by a Gaussian. A convolution of this Gaussian with the initiation and termination time distributions gives another analytic approximation to the transcription time distribution. If there are long pauses during elongation, because of the modularity of the family of models considered, the elongation phase can be partitioned into reactions generating a simple delay (elongation through regions where there are no long pauses), and reactions whose distribution of waiting times must be considered explicitly (initiation, termination, and motion through regions where long pauses are likely). In these cases, the distribution of transcription times again involves a nontrivial part and a shift due to fast elongation processes.
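The Gaussian-plus-convolution approximation described above is easy to compute numerically. In the sketch below, initiation and termination times are exponential, the elongation time is Gaussian, and the transcription-time density is their convolution; all rates and the elongation mean/SD are illustrative assumptions.

```python
import numpy as np

dt = 0.01                                         # minutes
t = np.arange(0, 60, dt)

k_init, k_term = 0.5, 2.0                         # 1/min (assumed)
init = k_init * np.exp(-k_init * t)               # exponential initiation time
term = k_term * np.exp(-k_term * t)               # exponential termination time

mu_el, sd_el = 20.0, 1.5                          # Gaussian elongation time (assumed)
elong = np.exp(-0.5 * ((t - mu_el) / sd_el) ** 2) / (sd_el * np.sqrt(2 * np.pi))

# Density of the sum = convolution of the three densities.
total = np.convolve(np.convolve(init, term) * dt, elong) * dt
t_tot = np.arange(total.size) * dt

mean = (t_tot * total).sum() * dt
print(f"mean transcription time ~ {mean:.1f} min "
      f"(theory: 1/k_init + mu_el + 1/k_term = {1/k_init + mu_el + 1/k_term:.1f})")
```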
Subjects
Models, Genetic , Transcription, Genetic , Gene Regulatory Networks , Computer Simulation , Algorithms , Normal Distribution , Protein Biosynthesis , DNA/genetics , Time Factors , RNA/genetics , Humans
ABSTRACT
Binding of partners and mutations strongly affect the conformational dynamics of KRAS4B, which is significant for a deeper understanding of its function. Gaussian accelerated molecular dynamics (GaMD) simulations followed by deep learning (DL) and principal component analysis (PCA) were carried out to probe the effect of G12C and of the binding of three partners, NF1, RAF1, and SOS1, on the conformational alterations of KRAS4B. DL reveals that G12C and partner binding alter the contacts of key structural domains, such as the switch domains SW1 and SW2 together with the loops L4, L5, and the P-loop. Binding of NF1, RAF1, and SOS1 constrains the structural fluctuation of SW1, SW2, L4, and L5; on the contrary, G12C leads to instability of these four structural domains. The analyses of free energy landscapes (FELs) and PCA also show that partner binding maintains the stability of the conformational states of KRAS4B, while G12C induces greater mobility of the switch domains SW1 and SW2, which significantly affects the interactions of GTP with SW1, L4, and L5. Our findings suggest that partner binding and G12C play important roles in the activity and allosteric regulation of KRAS4B, which may theoretically aid in further understanding the function of KRAS4B.
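A minimal sketch of the PCA/FEL part of such an analysis, with random placeholder coordinates standing in for GaMD trajectory frames of KRAS4B: project centered frames onto the top principal components and convert the 2D histogram into a free energy landscape.

```python
import numpy as np

rng = np.random.default_rng(11)
n_frames, n_atoms = 2000, 50
traj = rng.standard_normal((n_frames, n_atoms, 3)).cumsum(axis=0) * 0.01  # placeholder

X = traj.reshape(n_frames, -1)
X = X - X.mean(axis=0)                            # remove the mean structure
U, S, Vt = np.linalg.svd(X, full_matrices=False)
pcs = X @ Vt[:2].T                                # projections onto PC1, PC2

kT = 0.593                                        # kcal/mol near 298 K
H, xe, ye = np.histogram2d(pcs[:, 0], pcs[:, 1], bins=30)
with np.errstate(divide="ignore"):
    F = -kT * np.log(H / H.max())                 # FEL up to an additive constant

print("explained variance PC1, PC2:", np.round(S[:2] ** 2 / (S**2).sum(), 3))
print("FEL depth (kcal/mol): 0 to", round(float(np.max(F[np.isfinite(F)])), 2))
```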