Results 1 - 20 of 44
1.
Front Microbiol ; 15: 1339156, 2024.
Article in English | MEDLINE | ID: mdl-38572227

ABSTRACT

Traditional alignment-based methods face serious challenges in genome sequence comparison and phylogeny reconstruction due to their high computational complexity. Here, we propose a new alignment-free method to analyze the phylogenetic relationships (classification) among species. In our method, the dynamical language (DL) model and the chaos game representation (CGR) method are used to characterize the frequency information and the context information of k-mers in a sequence, respectively. For each DNA or protein sequence in a dataset, our method converts the sequence into a feature vector, based on CGR weighted by the DL model, that represents the sequence information and is used to infer phylogenetic relationships. We name our method CGRWDL. Its performance was tested on both DNA and protein sequences from 8 virus datasets by constructing phylogenetic trees. For each dataset, we compared the Robinson-Foulds (RF) distance between the phylogenetic tree constructed by CGRWDL and the reference tree with that of other advanced methods. The results show that the phylogenetic trees constructed by CGRWDL can accurately classify the viruses, and that the RF distances between these trees and the reference trees are smaller than those obtained with other methods.
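
The chaos game representation mentioned in this abstract can be illustrated with a minimal Python sketch. It shows only the plain CGR trajectory and a k-mer grid count; the corner assignment, the function names `cgr_points` and `fcgr`, and the handling of ambiguous bases are assumptions made for illustration, and the DL-model weighting used by CGRWDL is not reproduced here.

```python
from collections import Counter

# Corner assignment is a common convention, assumed here (not taken from the paper).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the chaos game trajectory of a DNA sequence in the unit square."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # assumption: silently skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0  # move halfway to the base's corner
        points.append((x, y))
    return points


def fcgr(seq, k):
    """Count CGR points per cell of a 2^k x 2^k grid (one cell per k-mer)."""
    size = 2 ** k
    counts = Counter()
    for i, (x, y) in enumerate(cgr_points(seq)):
        if i + 1 < k:
            continue  # the first k-1 points do not yet encode a full k-mer
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        counts[(row, col)] += 1
    return counts
```

Flattening the grid returned by `fcgr` in a fixed cell order gives a feature vector of length 4^k, which is the kind of object an alignment-free distance can then be computed on.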

2.
Chaos ; 34(1)2024 Jan 01.
Article in English | MEDLINE | ID: mdl-38198680

ABSTRACT

The significance of accurate long-term forecasting of air quality for a long-term policy decision for controlling air pollution and for evaluating its impacts on human health has attracted greater attention recently. This paper proposes an ensemble multi-scale framework to refine the previous version with ensemble empirical mode decomposition (EMD) and nonstationary oscillation resampling (NSOR) for long-term forecasting. Within the proposed ensemble multi-scale framework, we on one hand apply modified EMD to produce more regular and stable EMD components, allowing the long-range oscillation characteristics of the original time series to be better captured. On the other hand, we provide an ensemble mechanism to alleviate the error propagation problem in forecasts caused by iterative implementation of NSOR at all lead times and name it improved NSOR. Application of the proposed multi-scale framework to long-term forecasting of the daily PM2.5 at 14 monitoring stations in Hong Kong demonstrates that it can effectively capture the long-term variation in air pollution processes and significantly increase the forecasting performance. Specifically, the framework can, respectively, reduce the average root-mean-square error and the mean absolute error over all 14 stations by 8.4% and 9.2% for a lead time of 100 days, compared to previous studies. Additionally, better robustness can be obtained by the proposed ensemble framework for 180-day and 365-day long-term forecasting scenarios. It should be emphasized that the proposed ensemble multi-scale framework is a feasible framework, which is applicable for long-term time series forecasting in general.

3.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37401373

ABSTRACT

Recent advances and achievements of artificial intelligence (AI) as well as deep and graph learning models have established their usefulness in biomedical applications, especially in drug-drug interactions (DDIs). DDIs refer to a change in the effect of one drug due to the presence of another drug in the human body, and they play an essential role in drug discovery and clinical research. Predicting DDIs through traditional clinical trials and experiments is an expensive and time-consuming process. To correctly apply advanced AI and deep learning, developers and users face various challenges, such as the availability and encoding of data resources and the design of computational methods. This review summarizes chemical structure based, network based, natural language processing based and hybrid methods, providing an updated and accessible guide for the broad research and development community with different domain knowledge. We introduce widely used molecular representations and describe the theoretical frameworks of graph neural network models for representing molecular structures. We present the advantages and disadvantages of deep and graph learning methods by performing comparative experiments. We discuss the potential technical challenges and highlight future directions of deep and graph learning models for accelerating DDI prediction.


Subject(s)
Artificial Intelligence , Neural Networks, Computer , Humans , Drug Interactions , Natural Language Processing , Drug Discovery
4.
Front Immunol ; 14: 1160397, 2023.
Article in English | MEDLINE | ID: mdl-37377963

ABSTRACT

Introduction: Substantial links between autoimmune diseases have been shown by an increasing number of studies, and one hypothesis for this comorbidity is that there is a common genetic cause. Methods: In this paper, a large-scale cross-trait genome-wide association study (GWAS) was conducted to investigate the genetic overlap among rheumatoid arthritis, multiple sclerosis, inflammatory bowel disease and type 1 diabetes. Results and discussion: Through the local genetic correlation analysis, 2 regions with locally significant genetic associations between rheumatoid arthritis and multiple sclerosis, and 4 regions with locally significant genetic associations between rheumatoid arthritis and type 1 diabetes were discovered. By cross-trait meta-analysis, 58 independent loci associated with rheumatoid arthritis and multiple sclerosis, 86 independent loci associated with rheumatoid arthritis and inflammatory bowel disease, and 107 independent loci associated with rheumatoid arthritis and type 1 diabetes were identified with genome-wide significance. In addition, 82 common risk genes were found through genetic identification. Based on gene set enrichment analysis, it was found that shared genes are enriched in exposed dermal system, calf, musculoskeletal, subcutaneous fat, thyroid and other tissues, and are also significantly enriched in 35 biological pathways. To verify the association between diseases, Mendelian randomization analysis was performed, which shows possible causal associations between rheumatoid arthritis and multiple sclerosis, and between rheumatoid arthritis and type 1 diabetes. The common genetic structure of rheumatoid arthritis, multiple sclerosis, inflammatory bowel disease and type 1 diabetes was explored by these studies, and it is believed that this important discovery will lead to new ideas for clinical treatment.


Subject(s)
Arthritis, Rheumatoid , Autoimmune Diseases , Diabetes Mellitus, Type 1 , Inflammatory Bowel Diseases , Multiple Sclerosis , Humans , Genome-Wide Association Study , Diabetes Mellitus, Type 1/genetics , Genetic Predisposition to Disease , Autoimmune Diseases/genetics , Arthritis, Rheumatoid/genetics , Genetic Loci , Multiple Sclerosis/genetics , Inflammatory Bowel Diseases/genetics
5.
Front Cell Infect Microbiol ; 13: 1117421, 2023.
Article in English | MEDLINE | ID: mdl-36779183

ABSTRACT

Introduction: The species diversity of microbiomes is a cutting-edge concept in metagenomic research. In this study, we propose a multifractal analysis for metagenomic research. Method and Results: Firstly, we visualized the chaos game representation (CGR) of simulated metagenomes and real metagenomes. We find that the visualized metagenomes exhibit self-similarity. Then we defined and calculated the multifractal dimension for the visualized plots of simulated and real metagenomes, respectively. By analyzing the Pearson correlation coefficients between the multifractal dimension and the traditional species diversity indices, we find that the correlation coefficients between the multifractal dimension and the species richness index and Shannon diversity index reached their maximum values when q = 0, 1, and the correlation coefficient between the multifractal dimension and the Simpson diversity index reached its maximum value when q = 5. Finally, we apply our method to real metagenomes of the gut microbiota of 100 infants who are newborn and 4 and 12 months old. The results show that the multifractal dimensions of an infant's gut microbiomes can distinguish age differences. Conclusion and Discussion: There is self-similarity among the CGRs of WGS of metagenomes, and the multifractal spectrum is an important characteristic for metagenomes. The traditional diversity indicators can be unified under the framework of multifractal analysis. These results coincided with similar results in macrobial ecology. The multifractal spectrum of infants' gut microbiomes is related to the development of the infants.


Subject(s)
Gastrointestinal Microbiome , Microbiota , Humans , Infant , Infant, Newborn , Metagenome , Microbiota/genetics , Gastrointestinal Microbiome/genetics , Metagenomics/methods , Ecology
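
The statement in the abstract above that traditional diversity indicators can be unified under one q-dependent framework has a well-known analogue in ecology, the Hill numbers, which the following Python sketch illustrates. This is not the authors' multifractal dimension of CGR plots; it is the standard order-q diversity, shown only because it is built from the same sum of p_i^q that appears in generalized dimensions. The function name `hill_number` is an assumption.

```python
import math


def hill_number(abundances, q):
    """Order-q diversity (Hill number) of a community given species abundances.

    q = 0 gives species richness, q = 1 the exponential of Shannon entropy,
    and q = 2 the inverse Simpson index.
    """
    total = float(sum(abundances))
    p = [a / total for a in abundances if a > 0]
    if q == 1:
        # Limit q -> 1 of the general formula: exp of Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))
```

For a perfectly even community all orders agree with the number of species, while for an uneven one the value decreases as q grows, because larger q puts more weight on dominant species.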
6.
Brief Funct Genomics ; 21(5): 399-407, 2022 09 16.
Article in English | MEDLINE | ID: mdl-35942693

ABSTRACT

Identification and classification of enhancers are highly significant because they play crucial roles in controlling gene transcription. Recently, several deep learning-based methods for identifying enhancers and their strengths have been developed. However, existing methods are usually limited because they use only local or only global features. The combination of local and global features is critical to further improve the prediction performance. In this work, we propose a novel deep learning-based method, called iEnhancer-DLRA, to identify enhancers and their strengths. iEnhancer-DLRA extracts local and multi-scale global features of sequences by using a residual convolutional network and two bidirectional long short-term memory networks. Then, a self-attention fusion strategy is proposed to deeply integrate these local and global features. The experimental results on the independent test dataset indicate that iEnhancer-DLRA performs better than nine existing state-of-the-art methods in both identification and classification of enhancers in almost all metrics. iEnhancer-DLRA achieves 13.8% (for identifying enhancers) and 12.6% (for classifying strengths) improvement in accuracy compared with the best existing state-of-the-art method. This is the first time that the accuracy of an enhancer identifier exceeds 0.9 and the accuracy of the enhancer classifier exceeds 0.8 on the independent test set. Moreover, iEnhancer-DLRA achieves superior predictive performance on the rice dataset compared with the state-of-the-art method RiceENN.


Subject(s)
Attention , Enhancer Elements, Genetic , Enhancer Elements, Genetic/genetics
7.
Interdiscip Sci ; 14(2): 439-451, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35106702

ABSTRACT

N4-Acetylcytidine (ac4C) is a highly conserved and widespread post-transcriptional RNA modification that plays versatile roles in cellular processes. Due to the limitations of techniques and knowledge, large-scale identification of ac4C is still a challenging task. RNA sequences are like sentences containing semantics in natural language. Inspired by the semantics of language, we proposed a hybrid model for ac4C prediction. The model used long short-term memory and a convolutional neural network to extract the semantic features hidden in the sequences. The semantic features and two traditional features (k-nucleotide frequencies and pseudo tri-tuple nucleotide composition) were combined to represent ac4C or non-ac4C sequences. eXtreme Gradient Boosting was used as the learning algorithm. Five-fold cross-validation over the training set consisting of 1160 ac4C and 10,855 non-ac4C sequences obtained an area under the receiver operating characteristic curve (AUROC) of 0.9004, and the independent test over 469 ac4C and 4343 non-ac4C sequences reached an AUROC of 0.8825. The model obtained a sensitivity of 0.6474 in the five-fold cross-validation and 0.6290 in the independent test, outperforming two state-of-the-art methods. The performance of semantic features alone was better than that of k-nucleotide frequencies and pseudo tri-tuple nucleotide composition, implying that ac4C sequences carry semantic information. The proposed hybrid model was implemented in a user-friendly web server which is freely available to scientific communities: http://47.113.117.61/ac4c/ . The presented model and tool are useful for identifying ac4C on a large scale.


Subject(s)
Cytidine , Nucleotides , Algorithms , Cytidine/analogs & derivatives , Cytidine/genetics , ROC Curve
8.
Front Genet ; 12: 752732, 2021.
Article in English | MEDLINE | ID: mdl-34764983

ABSTRACT

Knowledge about protein-protein interactions is beneficial in understanding cellular mechanisms. Protein-protein interactions are usually determined according to their protein-protein interaction sites. Due to the limitations of current techniques, it is still a challenging task to detect protein-protein interaction sites. In this article, we presented a method based on deep learning and XGBoost (called DeepPPISP-XGB) for predicting protein-protein interaction sites. The deep learning model served as a feature extractor to remove redundant information from protein sequences. The Extreme Gradient Boosting algorithm was used to construct a classifier for predicting protein-protein interaction sites. The DeepPPISP-XGB achieved the following results: area under the receiver operating characteristic curve of 0.681, a recall of 0.624, and area under the precision-recall curve of 0.339, being competitive with the state-of-the-art methods. We also validated the positive role of global features in predicting protein-protein interaction sites.

9.
Front Genet ; 12: 766496, 2021.
Article in English | MEDLINE | ID: mdl-34745231

ABSTRACT

Alignment-based methods are at a disadvantage in sequence comparison and phylogeny reconstruction because of their high computational costs in both time and space. On the other hand, alignment-free methods incur low computational costs and have recently gained popularity in the field of bioinformatics. Here we propose a new alignment-free method for phylogenetic tree reconstruction based on whole genome sequences. A key component is a measure called information-entropy position-weighted k-mer relative measure (IEPWRMkmer), which combines the position-weighted measure of k-mers proposed by our group and the information entropy of the frequency of k-mers. The Manhattan distance is used to calculate the pairwise distance between species. Finally, we use the Neighbor-Joining method to construct the phylogenetic tree. To evaluate the performance of this method, we perform phylogenetic analysis on two datasets used by other researchers. The results demonstrate that the IEPWRMkmer method is efficient and reliable. The source code of our method is provided at https://github.com/wuyaoqun37/IEPWRMkmer.
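
The distance step described in this abstract can be sketched in a few lines of Python. The sketch below uses plain normalized k-mer frequencies as the feature vector, which is an assumption made for illustration; the position weighting and entropy terms of IEPWRMkmer are not reproduced, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k, alphabet="ACGT"):
    """Return normalized k-mer frequencies in a fixed lexicographic k-mer order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[m] for m in kmers)  # ignores k-mers with other symbols
    return [counts[m] / total if total else 0.0 for m in kmers]


def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))
```

Computing `manhattan` for every pair of genomes yields the pairwise distance matrix that a tree-building method such as Neighbor-Joining takes as input.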

10.
Biomedicines ; 9(9)2021 Sep 03.
Article in English | MEDLINE | ID: mdl-34572337

ABSTRACT

Abnormal miRNA functions are widely involved in many diseases recorded in the database of experimentally supported human miRNA-disease associations (HMDD). Some of the associations are complicated: there can be up to five heterogeneous association types of miRNA with the same disease, including genetics type, epigenetics type, circulating miRNAs type, miRNA tissue expression type and miRNA-target interaction type. When one type of association is known for an miRNA-disease pair, it is important to predict any other types of the association for a better understanding of the disease mechanism. It is even more important to reveal associations for currently unassociated miRNAs and diseases. Methods have been recently proposed to make predictions on the association types of miRNA-disease pairs through restricted Boltzmann machines, label propagation theories and tensor completion algorithms. None of them has exploited the non-linear characteristics in the miRNA-disease association network to improve the performance. We propose to use attributed multi-layer heterogeneous network embedding to learn the latent representations of miRNAs and diseases from each association type and then to predict the existence of the association type for all the miRNA-disease pairs. The performance of our method is compared with the two most recent methods via 10-fold cross-validation on the database HMDD v3.2 to demonstrate the superior prediction achieved by our method under different settings. Moreover, our real predictions made beyond the HMDD database can all be validated by the NCBI literature, confirming that our method is capable of accurately predicting new associations of miRNAs with diseases and their association types as well.

11.
Biomed Res Int ; 2021: 9923112, 2021.
Article in English | MEDLINE | ID: mdl-34159204

ABSTRACT

Lysine succinylation is a typical protein post-translational modification and plays a crucial regulatory role in cellular processes. Identifying succinylation sites is fundamental to exploring its functions. Although many computational methods have been developed to deal with this challenge, few have considered the semantic relationship between residues. We combined long short-term memory (LSTM) and a convolutional neural network (CNN) into a deep learning method for predicting succinylation sites. The proposed method obtained a Matthews correlation coefficient of 0.2508 on the independent test, outperforming state-of-the-art methods. We also performed an enrichment analysis of succinylation proteins. The results showed that functions of succinylation were conserved across species but differed to a certain extent between species. On the basis of the proposed method, we developed a user-friendly web server for predicting succinylation sites.


Subject(s)
Algorithms , Deep Learning , Neural Networks, Computer , Succinic Acid/chemistry , Animals , Area Under Curve , Computational Biology/methods , Escherichia coli , Humans , Internet , Protein Processing, Post-Translational , Proteins/metabolism , Reproducibility of Results , Sensitivity and Specificity
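
The Matthews correlation coefficient reported in the abstract above is computed from the four cells of a binary confusion matrix. The Python sketch below shows the standard formula together with sensitivity; the function names and the convention of returning 0.0 when the denominator vanishes are assumptions, and the counts in the usage note are invented for illustration, not taken from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumed convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


def sensitivity(tp, fn):
    """Fraction of true sites that are predicted as sites (recall)."""
    return tp / (tp + fn)
```

For example, `matthews_corrcoef(50, 50, 0, 0)` is 1.0 for a perfect classifier and `matthews_corrcoef(25, 25, 25, 25)` is 0.0 for one that is no better than chance. Unlike accuracy, the coefficient stays low when a classifier simply favours the majority class, which is why it is often reported for imbalanced site-prediction data.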
12.
Phys Rev E ; 103(4-1): 043303, 2021 Apr.
Article in English | MEDLINE | ID: mdl-34005996

ABSTRACT

Among various algorithms of multifractal analysis (MFA) for complex networks, the sandbox MFA algorithm behaves with the best computational efficiency. However, the existing sandbox algorithm is still computationally expensive for MFA of large-scale networks with tens of millions of nodes. It is also not clear whether MFA results can be improved by a largely increased size of a theoretical network. To tackle these challenges, a computationally efficient sandbox algorithm (CESA) is presented in this paper for MFA of large-scale networks. Distinct from the existing sandbox algorithm that uses the shortest-path distance matrix to obtain the required information for MFA of networks, our CESA employs the compressed sparse row format of the adjacency matrix and the breadth-first search technique to directly search the neighbor nodes of each layer of center nodes, and then to retrieve the required information. A theoretical analysis reveals that the CESA reduces the time complexity of the existing sandbox algorithm from cubic to quadratic, and also improves the space complexity from quadratic to linear. Then the CESA is demonstrated to be effective, efficient, and feasible through the MFA results of (u,v)-flower model networks from the fifth to the 12th generations. It enables us to study the multifractality of networks of the size of about 11 million nodes with a normal desktop computer. Furthermore, we have also found that increasing the size of (u,v)-flower model network does improve the accuracy of MFA results. Finally, our CESA is applied to a few typical real-world networks of large scale.
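
The core idea described in this abstract, replacing a full shortest-path distance matrix with a breadth-first search over a compressed sparse row (CSR) adjacency structure, can be sketched in pure Python as follows. This is a minimal illustration under stated assumptions, not the CESA implementation: the function names are hypothetical, the graph is assumed undirected and unweighted, and the later step of fitting the scaling of the sandbox counts over many random centers is omitted.

```python
from collections import deque


def to_csr(n_nodes, edges):
    """Build CSR arrays (indptr, indices) for an undirected graph."""
    degree = [0] * n_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    indptr = [0] * (n_nodes + 1)
    for i in range(n_nodes):
        indptr[i + 1] = indptr[i] + degree[i]
    indices = [0] * indptr[-1]
    cursor = list(indptr[:-1])  # next free slot in each node's neighbor block
    for u, v in edges:
        indices[cursor[u]] = v
        cursor[u] += 1
        indices[cursor[v]] = u
        cursor[v] += 1
    return indptr, indices


def sandbox_counts(indptr, indices, center, max_radius):
    """Return M(r) for r = 0..max_radius: number of nodes within r hops of center."""
    dist = {center: 0}
    per_layer = [0] * (max_radius + 1)
    per_layer[0] = 1
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == max_radius:
            continue  # do not expand beyond the largest sandbox
        for v in indices[indptr[u]:indptr[u + 1]]:
            if v not in dist:
                dist[v] = dist[u] + 1
                per_layer[dist[v]] += 1
                queue.append(v)
    counts, running = [], 0
    for c in per_layer:
        running += c  # cumulative sum over BFS layers
        counts.append(running)
    return counts
```

Memory use here is proportional to the number of nodes plus edges, in contrast to the quadratic storage a full distance matrix would need.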

13.
Environ Pollut ; 271: 116381, 2021 Feb 15.
Article in English | MEDLINE | ID: mdl-33421843

ABSTRACT

Air quality forecasting for Hong Kong is a challenge. Even taking the advantages of auto-regressive integrated moving average and some state-of-the-art numerical models, a recently-developed hybrid method for one-day (two- and three-day) ahead forecasting performs similarly to (slightly better than) a simple persistence forecasting. Long-term forecasting also remains an important issue, especially for policy decision for better control of air pollution and for evaluation of the long-term impacts on public health. Given the well-recognized negative effects of PM2.5, NO2 and O3 on public health, we study their time series under the multi-scale framework with empirical mode decomposition and nonstationary oscillation resampling to explore the possibility of long-term forecasting and to improve short-term forecasts in Hong Kong. Applied to a dataset from January 2016 to December 2018, the long-term forecasting (with lead time about 100 days) of the multi-scale framework has the root-mean-square error (RMSE) comparable with that of the short-term (with lead time of one or two days) forecasting by the persistence method, while its improvement for short-term forecasting (with lead time of one, two or three days) is quite substantial over the persistence forecasting, with RMSEs reduced by respectively 44%-47%, 30%-45%, and 40%-60% for PM2.5, NO2, and O3. Compared to the hybrid method, it turns out that, for short-term forecasting for the same data, the multi-scale framework can reduce RMSE by about 25% (respectively 30%) for PM2.5 (respectively NO2 and O3). In addition, we find no significant difference in the forecasting performance of the multi-scale framework among different types of stations. The multi-scale framework is feasible for time series forecasting and applicable to other pollutants in other cities.


Subject(s)
Air Pollutants , Air Pollution , Air Pollutants/analysis , Air Pollution/analysis , Cities , Forecasting , Hong Kong , Particulate Matter/analysis
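
The persistence baseline and the RMSE comparisons quoted in the abstract above can be made concrete with a short Python sketch. It assumes the usual definition of persistence forecasting (the value at time t is used as the forecast for time t + lead); the function names and the toy series in the usage note are hypothetical and not from the study.

```python
import math


def persistence_forecast(series, lead):
    """Pair each persistence forecast with the observation it targets.

    The forecast for time t + lead is simply the value observed at time t.
    Returns (predictions, observations) as two aligned lists.
    """
    if lead < 1 or lead >= len(series):
        raise ValueError("lead must satisfy 1 <= lead < len(series)")
    return list(series[:-lead]), list(series[lead:])


def rmse(pred, obs):
    """Root-mean-square error between aligned predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def relative_reduction(baseline_rmse, model_rmse):
    """Percentage by which a model lowers RMSE relative to a baseline."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse
```

With a toy series `[1, 2, 3, 4, 5]` and `lead=1`, the persistence RMSE is 1.0; a model with RMSE 0.5 on the same data would correspond to a 50% reduction, which is how percentages such as those in the abstract are typically computed.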
14.
Chaos ; 30(11): 113123, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33261323

ABSTRACT

In this study, we focus on the fractal properties of recurrence networks constructed from two-dimensional fractional Brownian motion (2D fBm), i.e., the inter-system recurrence network, the joint recurrence network, the cross-joint recurrence network, and the multidimensional recurrence network, which are variants of classic recurrence networks extended to multiple time series. Generally, the fractal dimension of these recurrence networks can only be estimated numerically. The numerical analysis identifies the existence of fractality in these constructed recurrence networks. Furthermore, it is found that the numerically estimated fractal dimension of these networks can be connected to the theoretical fractal dimension of the 2D fBm graphs, because both fractal dimensions are associated piecewise with the Hurst exponent H in a highly similar pattern, i.e., a linear decrease (as H varies from 0 to 0.5) followed by an inversely proportional-like decay (as H changes from 0.5 to 1). Although their fractal dimensions are not exactly identical, their difference can be accounted for by a single parameter with a value around 1. Therefore, it can be concluded that these recurrence networks constructed from 2D fBms must inherit some fractal properties of their associated 2D fBms with respect to the fBm graphs.

15.
Chaos ; 30(5): 053113, 2020 May.
Article in English | MEDLINE | ID: mdl-32491907

ABSTRACT

A novel general randomized method is proposed to investigate multifractal properties of long time series. Based on multifractal temporally weighted detrended fluctuation analysis (MFTWDFA), we obtain randomized multifractal temporally weighted detrended fluctuation analysis (RMFTWDFA). The innovation of this algorithm is the use of randomization when dividing the series into multiple intervals to find the local trend. To test the performance of the RMFTWDFA algorithm, we apply it, together with MFTWDFA, to artificially generated time series and real genomic sequences. For three types of artificially generated time series, consistency tests are performed on the estimated h(q), and all results indicate that there is no significant difference in the estimated h(q) of the two methods. Meanwhile, for different sequence lengths, the running time of RMFTWDFA is reduced by more than a factor of ten. We use large-scale prokaryote genomic sequences as real examples. The results obtained by RMFTWDFA demonstrate that these genomic sequences show fractal characteristics, and we leverage the estimated exponents to study phylogenetic relationships between species. The final clustering results are consistent with real relationships. All the results reflect that RMFTWDFA is significantly effective and timesaving for long time series, while obtaining an accuracy statistically comparable to other methods.


Subject(s)
Fractals , Phylogeny , Algorithms , Bacteria/genetics , Databases, Genetic
16.
Chaos ; 30(2): 023134, 2020 Feb.
Article in English | MEDLINE | ID: mdl-32113234

ABSTRACT

Fractal and multifractal properties of various systems have been studied extensively. In this paper, first, the multivariate multifractal detrend cross-correlation analysis (MMXDFA) is proposed to investigate the multifractal features in multivariate time series. MMXDFA may produce oscillations in the fluctuation function and spurious cross correlations. In order to overcome these problems, we then propose the multivariate multifractal temporally weighted detrended cross-correlation analysis (MMTWXDFA). In relation to the multivariate detrended cross-correlation analysis and multifractal temporally weighted detrended cross-correlation analysis, an innovation of MMTWXDFA is the application of the signed Manhattan distance to calculate the local detrended covariance function. To evaluate the performance of the MMXDFA and MMTWXDFA methods, we apply them on some artificially generated multivariate series. Several numerical tests demonstrate that both methods can identify their fractality, but MMTWXDFA can detect long-range cross correlations and simultaneously quantify the levels of cross correlation between two multivariate series more accurately.

17.
Sci Rep ; 9(1): 2474, 2019 02 21.
Article in English | MEDLINE | ID: mdl-30792474

ABSTRACT

A growing body of research has indicated that microRNAs (miRNAs) play indispensable roles in the pathogenesis of diseases. Detecting miRNA-disease associations by experimental techniques in biology is expensive and time-consuming. Hence, it is important to propose reliable and accurate computational methods for exploring potential disease-related miRNAs. In our work, we develop a novel method (BRWHNHA) to uncover potential miRNAs associated with diseases based on a hybrid recommendation algorithm and unbalanced bi-random walk. We first integrate the Gaussian interaction profile kernel similarity into the miRNA functional similarity network and the disease semantic similarity network. Then we calculate the transition probability matrix of the bipartite network by using the hybrid recommendation algorithm. Finally, we adopt unbalanced bi-random walk on the heterogeneous network to infer undiscovered miRNA-disease relationships. We tested BRWHNHA on 22 diseases using five-fold cross-validation, and it achieved reliable performance with an average area under the ROC curve (AUC) of 0.857, with AUC values ranging from 0.807 to 0.924. As a result, BRWHNHA significantly improves the performance of inferring potential miRNA-disease associations compared with previous methods. Moreover, the case studies on lung neoplasms and prostate neoplasms also illustrate that BRWHNHA is superior to previous prediction methods and is more advantageous in exploring potential disease-related miRNAs. All source code can be downloaded from https://github.com/myl446/BRWHNHA .


Subject(s)
Genetic Association Studies/methods , Genetic Predisposition to Disease/genetics , MicroRNAs/genetics , Algorithms , Computational Biology , Computer Simulation , Humans , Models, Genetic
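
The Gaussian interaction profile kernel similarity named in the abstract above can be sketched in Python as follows. The sketch uses the form commonly found in the literature, where the bandwidth is normalized by the mean squared norm of the interaction profiles; that this paper uses exactly this normalization, and the parameter name `gamma_prime`, are assumptions.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between rows of a 0/1 association matrix.

    profiles[i] is the interaction profile of entity i (for example, one miRNA's
    known associations across all diseases).
    """
    n = len(profiles)
    sq_norms = [sum(x * x for x in row) for row in profiles]
    mean_sq = sum(sq_norms) / n
    # Bandwidth normalized by the average squared profile norm (assumed convention).
    gamma = gamma_prime / mean_sq if mean_sq > 0 else 0.0
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * d2)
    return kernel
```

Applying the function to the rows of the association matrix gives a miRNA-miRNA similarity, and applying it to the columns gives a disease-disease similarity; either can then be blended with functional or semantic similarity before the random walk step.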
18.
Front Genet ; 10: 1325, 2019.
Article in English | MEDLINE | ID: mdl-32117407

ABSTRACT

Butyrylation plays a crucial role in cellular processes. Owing to technical limitations, it is a challenging task to identify histone butyrylation sites on a large scale. To fill this gap, we propose an approach based on information entropy and machine learning for computationally identifying histone butyrylation sites. The proposed method achieves an area under the receiver operating characteristic (ROC) curve of 0.92 over the training set by 3-fold cross-validation and 0.80 over the testing set by independent test. Feature analysis implies that amino acid residues downstream and upstream of butyrylation sites exhibit a specific sequence motif to a certain extent. Functional analysis suggests that histone butyrylation is most likely associated with four pathways (systemic lupus erythematosus, alcoholism, viral carcinogenesis and transcriptional misregulation in cancer), is involved in binding with other molecules and in processes of biosynthesis, assembly, arrangement or disassembly, and is located in complexes consisting of DNA, RNA, protein, etc. The proposed method is useful for predicting histone butyrylation sites. The feature and functional analyses improve understanding of histone butyrylation and increase knowledge of the functions of butyrylated histones.
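
The abstract above mentions information entropy as the basis of its features without giving the exact construction. As a hedged illustration only, the Python sketch below computes the Shannon entropy of each position in a set of equal-length peptide windows centred on candidate sites; the window layout, the function names, and the example peptides in the test are all assumptions, not the authors' feature definition.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy in bits of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def positional_entropy(peptides):
    """Entropy of each column across equal-length peptide windows."""
    return [shannon_entropy(column) for column in zip(*peptides)]
```

A position with low entropy is one where the same residues recur across windows, which is the kind of signal a sequence motif around a modification site would produce.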

19.
BMC Syst Biol ; 11(Suppl 4): 81, 2017 Sep 21.
Article in English | MEDLINE | ID: mdl-28950903

ABSTRACT

BACKGROUND: Molecular interaction data at the proteomic and genetic levels provide physical and functional insights into a molecular biosystem and complement each other in the construction of pathway structures. Despite advances in inferring biological pathways from genetic interaction data, developed models such as activity pathway networks (APN) still show weaknesses when integrating data from the proteomic and genetic levels. It is necessary to develop new methods that infer pathway structure from both kinds of interaction data. RESULTS: We used a probabilistic graphical model to develop a new method that integrates genetic interaction and protein interaction data and infers exquisitely detailed pathway structure. We modeled the pathway network as a Bayesian network and applied this model to infer pathways for the coherent subsets of the global genetic interaction profiles and for the available data set of endoplasmic reticulum genes. The protein interaction data were derived from the BioGRID database. Our method can accurately reconstruct known cellular pathway structures, including the SWR complex, ER-Associated Degradation (ERAD) pathway, N-Glycan biosynthesis pathway, Elongator complex, Retromer complex, and Urmylation pathway. By comparing the N-Glycan biosynthesis pathway and Urmylation pathway identified by our approach with those from APN, we found that our method is able to overcome the weakness of APN (certain edges are inexplicable). According to the underlying protein interaction network, we defined a simple scoring function that only adopts genetic interaction information to avoid the balance difficulty in the APN. Using the effective stochastic simulation algorithm, the performance of our proposed method is significantly high. CONCLUSION: We developed a new method based on Bayesian networks to infer detailed pathway structures from interaction data at the proteomic and genetic levels. The results indicate that the developed method performs better in predicting signaling pathways than previously described models.


Subject(s)
Computational Biology/methods , Gene Regulatory Networks , Models, Statistical , Protein Interaction Mapping , Bayes Theorem , Models, Biological
20.
Chaos ; 27(6): 063111, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28679233

ABSTRACT

A new method, multifractal temporally weighted detrended cross-correlation analysis (MF-TWXDFA), is proposed in this paper to investigate multifractal cross-correlations. This new method is based on multifractal temporally weighted detrended fluctuation analysis and multifractal cross-correlation analysis (MFCCA). An innovation of the method is applying geographically weighted regression to estimate local trends in the nonstationary time series. We also take into consideration the sign of the fluctuations in computing the corresponding detrended cross-covariance function. To test the performance of the MF-TWXDFA algorithm, we apply it and the MFCCA method to simulated and actual series. Numerical tests on artificially simulated series demonstrate that our method can accurately detect long-range cross-correlations for two simultaneously recorded series. To further show the utility of MF-TWXDFA, we apply it to time series from stock markets and find that the power-law cross-correlation between stock returns is significantly multifractal. A new coefficient, the MF-TWXDFA cross-correlation coefficient, is also defined to quantify the levels of cross-correlation between two time series.
