Results 1 - 20 of 45
1.
Brief Bioinform ; 24(1), 2023 Jan 19.
Article in English | MEDLINE | ID: mdl-36534961

ABSTRACT

The inference of large-scale gene regulatory networks is essential for understanding comprehensive interactions among genes. Most existing methods are limited to reconstructing networks with a few hundred nodes. Therefore, parallel computing paradigms must be leveraged to construct large networks. We propose a generic parallel framework that enables any existing method, without re-engineering, to infer large networks in parallel, guaranteeing quality output. The framework is tested on 15 inference methods (though it is not limited to these) using in silico benchmarks and real-world large expression matrices, followed by qualitative and speedup assessment. The framework does not compromise the quality of the base serial inference method. We rank the candidate methods and use the top-performing method to infer an Alzheimer's Disease (AD) affected network from large expression profiles of a triple transgenic mouse model consisting of 45,101 genes. The resultant network is further explored to obtain hub genes that emerge functionally related to the disease. We partition the network into 41 modules and conduct pathway enrichment analysis, revealing that many participating genes are collectively responsible for several brain disorders, including AD. Finally, we extract the interactions of a few known AD genes and observe that they are periphery genes connected to the network's hub genes. Availability: The R implementation of the framework is downloadable from https://github.com/Netralab/GenericParallelFramework.


Subjects
Alzheimer Disease , Gene Regulatory Networks , Animals , Mice , Alzheimer Disease/genetics , Brain , Genetically Modified Animals , Algorithms
2.
BMC Bioinformatics ; 25(1): 245, 2024 Jul 19.
Article in English | MEDLINE | ID: mdl-39030497

ABSTRACT

BACKGROUND: Inference of Gene Regulatory Networks (GRNs) is a difficult and long-standing question in Systems Biology. Numerous approaches have been proposed with the latest methods exploring the richness of single-cell data. One of the current difficulties lies in the fact that many methods of GRN inference do not result in one proposed GRN but in a collection of plausible networks that need to be further refined. In this work, we present a Design of Experiment strategy to use as a second stage after the inference process. It is specifically fitted for identifying the next most informative experiment to perform for deciding between multiple network topologies, in the case where proposed GRNs are executable models. This strategy first performs a topological analysis to reduce the number of perturbations that need to be tested, then predicts the outcome of the retained perturbations by simulation of the GRNs and finally compares predictions with novel experimental data. RESULTS: We apply this method to the results of our divide-and-conquer algorithm called WASABI, adapt its gene expression model to produce perturbations and compare our predictions with experimental results. We show that our networks were able to produce in silico predictions on the outcome of a gene knock-out, which were qualitatively validated for 48 out of 49 genes. Finally, we eliminate as many as two thirds of the candidate networks for which we could identify an incorrect topology, thus greatly improving the accuracy of our predictions. CONCLUSION: These results both confirm the inference accuracy of WASABI and show how executable gene expression models can be leveraged to further refine the topology of inferred GRNs. We hope this strategy will help systems biologists further explore their data and encourage the development of more executable GRN models.


Subjects
Algorithms , Gene Regulatory Networks , Gene Regulatory Networks/genetics , Systems Biology/methods , Computational Biology/methods , Computer Simulation , Genetic Models
3.
BMC Bioinformatics ; 24(1): 84, 2023 Mar 06.
Article in English | MEDLINE | ID: mdl-36879188

ABSTRACT

BACKGROUND: A cell exhibits a variety of responses to internal and external cues. These responses are possible, in part, due to the presence of an elaborate gene regulatory network (GRN) in every single cell. In the past 20 years, many groups worked on reconstructing the topological structure of GRNs from large-scale gene expression data using a variety of inference algorithms. Insights gained about participating players in GRNs may ultimately lead to therapeutic benefits. Mutual information (MI) is a widely used metric within this inference/reconstruction pipeline as it can detect any correlation (linear and non-linear) between any number of variables (n-dimensions). However, the use of MI with continuous data (for example, normalized fluorescence intensity measurement of gene expression levels) is sensitive to data size, correlation strength and underlying distributions, and often requires laborious and, at times, ad hoc optimization. RESULTS: In this work, we first show that estimating MI of a bi- and tri-variate Gaussian distribution using k-nearest neighbor (kNN) MI estimation results in significant error reduction as compared to commonly used methods based on fixed binning. Second, we demonstrate that implementing the MI-based kNN Kraskov-Stögbauer-Grassberger (KSG) algorithm leads to a significant improvement in GRN reconstruction for popular inference algorithms, such as Context Likelihood of Relatedness (CLR). Finally, through extensive in silico benchmarking we show that a new inference algorithm CMIA (Conditional Mutual Information Augmentation), inspired by CLR, in combination with the KSG-MI estimator, outperforms commonly used methods. CONCLUSIONS: Using three canonical datasets containing 15 synthetic networks, the newly developed method for GRN reconstruction, which combines CMIA and the KSG-MI estimator, achieves an improvement of 20-35% in precision-recall measures over the current gold standard in the field. This new method will enable researchers to discover new gene interactions or better choose gene candidates for experimental validations.
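
The CLR step named in the abstract above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it uses a fixed-binning MI estimator (the kind the abstract reports as less accurate than kNN-based KSG estimation) purely because it fits in a short NumPy example, and the function names `binned_mi` and `clr_scores`, the bin count, and the synthetic data are all assumptions made for this sketch.

```python
import numpy as np

def binned_mi(x, y, bins=8):
    """Mutual information (in nats) estimated from a fixed-bin 2D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def clr_scores(mi):
    """CLR: z-score each MI value against the background of both genes."""
    mi = np.asarray(mi, dtype=float)
    mean = mi.mean(axis=1, keepdims=True)
    std = mi.std(axis=1, keepdims=True)
    std[std == 0] = 1.0                   # guard against constant rows
    z = np.maximum(0.0, (mi - mean) / std)
    return np.sqrt(z ** 2 + z.T ** 2)     # combine the two gene-wise z-scores

# Synthetic data (assumption): gene 1 depends on gene 0, genes 2 and 3 are noise.
rng = np.random.default_rng(0)
n = 500
g0 = rng.normal(size=n)
g1 = g0 + 0.3 * rng.normal(size=n)
expr = np.vstack([g0, g1, rng.normal(size=n), rng.normal(size=n)])

p = expr.shape[0]
mi = np.zeros((p, p))                     # diagonal left at zero
for i in range(p):
    for j in range(i + 1, p):
        mi[i, j] = mi[j, i] = binned_mi(expr[i], expr[j])

scores = clr_scores(mi)
```

Swapping `binned_mi` for a kNN-based estimator would leave `clr_scores` unchanged, since CLR only consumes the pairwise MI matrix.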


Subjects
Algorithms , Gene Regulatory Networks , Cluster Analysis
4.
BMC Biol ; 20(1): 253, 2022 Nov 09.
Article in English | MEDLINE | ID: mdl-36352408

ABSTRACT

BACKGROUND: Without the availability of disease-modifying drugs, there is an unmet therapeutic need for osteoarthritic patients. During osteoarthritis, the homeostasis of articular chondrocytes is dysregulated and a phenotypical transition called hypertrophy occurs, leading to cartilage degeneration. Targeting this phenotypic transition has emerged as a potential therapeutic strategy. Chondrocyte phenotype maintenance and switch are controlled by an intricate network of intracellular factors, each influenced by a myriad of feedback mechanisms, making it challenging to intuitively predict treatment outcomes, while in silico modeling can help unravel that complexity. In this study, we aim to develop a virtual articular chondrocyte to guide experiments in order to rationalize the identification of potential drug targets via screening of combination therapies through computational modeling and simulations. RESULTS: We developed a signal transduction network model using knowledge-based and data-driven (machine learning) modeling technologies. The in silico high-throughput screening of (pairwise) perturbations operated with that network model highlighted conditions potentially affecting the hypertrophic switch. A selection of promising combinations was further tested in a murine cell line and primary human chondrocytes, which notably highlighted a previously unreported synergistic effect between the protein kinase A and the fibroblast growth factor receptor 1. CONCLUSIONS: Here, we provide a virtual articular chondrocyte in the form of a signal transduction interactive knowledge base and of an executable computational model. Our in silico-in vitro strategy opens new routes for developing osteoarthritis targeting therapies by refining the early stages of drug target discovery.


Subjects
Articular Cartilage , Osteoarthritis , Humans , Mice , Animals , Articular Cartilage/metabolism , Osteoarthritis/drug therapy , Osteoarthritis/genetics , Osteoarthritis/metabolism , Chondrocytes/metabolism , Hypertrophy/metabolism , Signal Transduction
5.
BMC Bioinformatics ; 23(1): 429, 2022 Oct 17.
Article in English | MEDLINE | ID: mdl-36245002

ABSTRACT

BACKGROUND: Gene expression is regulated at different molecular levels, including chromatin accessibility, transcription, RNA maturation, and transport. These regulatory mechanisms have strong connections with cellular metabolism. In order to study the cellular system and its functioning, omics data at each molecular level can be generated and efficiently integrated. Here, we propose BRANENET, a novel multi-omics integration framework for multilayer heterogeneous networks. BRANENET is an expressive, scalable, and versatile method to learn node embeddings, leveraging random walk information within a matrix factorization framework. Our goal is to efficiently integrate multi-omics data to study different regulatory aspects of multilayered processes that occur in organisms. We evaluate our framework using multi-omics data of Saccharomyces cerevisiae, a well-studied yeast model organism. RESULTS: We test BRANENET on transcriptomics (RNA-seq) and targeted metabolomics (NMR) data for wild-type yeast strain during a heat-shock time course of 0, 20, and 120 min. Our framework learns features for differentially expressed bio-molecules showing heat stress response. We demonstrate the applicability of the learned features for targeted omics inference tasks: transcription factor (TF)-target prediction, integrated omics network (ION) inference, and module identification. The performance of BRANENET is compared to existing network integration methods. Our model outperforms baseline methods by achieving high prediction scores for a variety of downstream tasks.


Subjects
RNA , Saccharomyces cerevisiae , Chromatin , RNA-Seq , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism
6.
BMC Bioinformatics ; 22(1): 153, 2021 Mar 24.
Article in English | MEDLINE | ID: mdl-33761871

ABSTRACT

BACKGROUND: Given expression data, gene regulatory network (GRN) inference approaches try to determine regulatory relations. However, current inference methods ignore the inherent topological characteristics of GRNs to some extent, leading to structures that lack clear biological explanation. To increase the biophysical meaning of inferred networks, this study performed data-driven module detection before network inference. Gene modules were identified by decomposition-based methods. RESULTS: ICA-decomposition based module detection methods have been used to detect functional modules directly from transcriptomic data. Experiments on time-series expression, curated, and scRNA-seq datasets suggested advantages of the proposed ModularBoost method over established methods, especially in efficiency and accuracy. For scRNA-seq datasets, the ModularBoost method outperformed other candidate inference algorithms. CONCLUSIONS: As a complicated task, GRN inference can be decomposed into several tasks of reduced complexity. Using identified gene modules as topological constraints, the initial inference problem can be accomplished by inferring intra-modular and inter-modular interactions respectively. Experimental outcomes suggest that the proposed ModularBoost method can improve the accuracy and efficiency of inference algorithms by introducing topological constraints.


Subjects
Algorithms , Gene Regulatory Networks , Computational Biology
7.
Plant J ; 101(3): 716-730, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31571287

ABSTRACT

Predicting gene regulatory networks (GRNs) from expression profiles is a common approach for identifying important biological regulators. Despite the increased use of inference methods, existing computational approaches often do not integrate RNA-sequencing data analysis, are not automated or are restricted to users with bioinformatics backgrounds. To address these limitations, we developed tuxnet, a user-friendly platform that can process raw RNA-sequencing data from any organism with an existing reference genome using a modified tuxedo pipeline (hisat 2 + cufflinks package) and infer GRNs from these processed data. tuxnet is implemented as a graphical user interface and can mine gene regulations, either by applying a dynamic Bayesian network (DBN) inference algorithm, genist, or a regression tree-based pipeline, rtp-star. We obtained time-course expression data of a PERIANTHIA (PAN) inducible line and inferred a GRN using genist to illustrate the use of tuxnet while gaining insight into the regulations downstream of the Arabidopsis root stem cell regulator PAN. Using rtp-star, we inferred the network of ATHB13, a downstream gene of PAN, for which we obtained wild-type and mutant expression profiles. Additionally, we generated two networks using temporal data from developmental leaf data and spatial data from root cell-type data to highlight the use of tuxnet to form new testable hypotheses from previously explored data. Our case studies feature the versatility of tuxnet when using different types of gene expression data to infer networks and its accessibility as a pipeline for non-bioinformaticians to analyze transcriptome data, predict causal regulations, assess network topology and identify key regulators.


Subjects
Arabidopsis/genetics , Computational Biology , Plant Gene Expression Regulation , Gene Regulatory Networks/genetics , Plant Genome/genetics , Transcriptome , Algorithms , Bayes Theorem , RNA Sequence Analysis
8.
BMC Genomics ; 22(1): 387, 2021 May 26.
Article in English | MEDLINE | ID: mdl-34039282

ABSTRACT

BACKGROUND: High-throughput transcriptomic datasets are often examined to discover new actors and regulators of a biological response. To this end, graphical interfaces have been developed and allow a broad range of users to conduct standard analyses from RNA-seq data, even with little programming experience. Although existing solutions usually provide adequate procedures for normalization, exploration or differential expression, more advanced features, such as gene clustering or regulatory network inference, are often missing or do not reflect current state-of-the-art methodologies. RESULTS: We developed here a user interface called DIANE (Dashboard for the Inference and Analysis of Networks from Expression data) designed to harness the potential of multi-factorial expression datasets from any organism through a precise set of methods. DIANE's interactive workflow provides normalization, dimensionality reduction, differential expression and ontology enrichment. Gene clustering can be performed and explored via configurable Mixture Models, and Random Forests are used to infer gene regulatory networks. DIANE also includes a novel procedure to assess the statistical significance of regulator-target influence measures based on permutations for Random Forest importance metrics. All along the pipeline, session reports and results can be downloaded to ensure clear and reproducible analyses. CONCLUSIONS: We demonstrate the value and the benefits of DIANE using a recently published data set describing the transcriptional response of Arabidopsis thaliana under the combination of temperature, drought and salinity perturbations. We show that DIANE can intuitively carry out informative exploration and statistical procedures with RNA-Seq data, perform model-based clustering of gene expression profiles and go further into gene network reconstruction, providing relevant candidate genes or signalling pathways to explore. DIANE is available as a web service ( https://diane.bpmp.inrae.fr ), or can be installed and locally launched as a complete R package.


Subjects
Gene Expression Profiling , Gene Regulatory Networks , Cluster Analysis , Computational Biology , Software , Transcriptome
9.
BMC Bioinformatics ; 21(1): 308, 2020 Jul 14.
Article in English | MEDLINE | ID: mdl-32664870

ABSTRACT

BACKGROUND: Inferring gene regulatory networks (GRNs) from gene expression data remains a challenge in systems biology. In the past decade, numerous methods have been developed for the inference of GRNs. It remains a challenge due to the fact that the data is noisy and high dimensional, and there exists a large number of potential interactions. RESULTS: We present a novel method, namely priori-fused boosting network inference method (PFBNet), to infer GRNs from time-series expression data by using the non-linear model of Boosting and the prior information (e.g., the knockout data) fusion scheme. Specifically, PFBNet first calculates the confidences of the regulation relationships using the boosting-based model, where the information about the accumulation impact of the gene expressions at previous time points is taken into account. Then, a newly defined strategy is applied to fuse the information from the prior data by elevating the confidences of the regulation relationships from the corresponding regulators. CONCLUSIONS: The experiments on the benchmark datasets from the DREAM challenge as well as the E. coli datasets show that PFBNet achieves significantly better performance than other state-of-the-art methods (Jump3, GENIE3-lag, HiDi, iRafNet and BiXGBoost).


Subjects
Algorithms , Gene Regulatory Networks , Area Under Curve , Computational Biology , Escherichia coli/genetics , Escherichia coli/metabolism , Gene Expression , ROC Curve
10.
BMC Bioinformatics ; 19(1): 376, 2018 Oct 12.
Article in English | MEDLINE | ID: mdl-30314469

ABSTRACT

BACKGROUND: Relatively small changes to gene expression data dramatically affect co-expression networks inferred from that data which, in turn, can significantly alter the subsequent biological interpretation. This error propagation is an underappreciated problem that, while hinted at in the literature, has not yet been thoroughly explored. Resampling methods (e.g. bootstrap aggregation, random subspace method) are hypothesized to alleviate variability in network inference methods by minimizing outlier effects and distilling persistent associations in the data. But the efficacy of the approach assumes the generalization from statistical theory holds true in biological network inference applications. RESULTS: We evaluated the effect of bootstrap aggregation on inferred networks using commonly applied network inference methods in terms of stability, or resilience to perturbations in the underlying expression data, a metric for accuracy, and functional enrichment of edge interactions. CONCLUSION: Bootstrap aggregation results in improved stability and, depending on the size of the input dataset, a marginal improvement to accuracy assessed by each method's ability to link genes in the same functional pathway.
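
Bootstrap aggregation of a co-expression network, as evaluated in the abstract above, can be illustrated with a short sketch. It assumes a plain thresholded Pearson correlation as the base inference method, which is only one of the "commonly applied" methods the abstract alludes to; the function names, the 0.6 threshold, and the synthetic data are hypothetical choices for this example.

```python
import numpy as np

def coexpression_edges(expr, threshold=0.6):
    """Boolean adjacency from absolute Pearson correlation (rows are genes)."""
    corr = np.corrcoef(expr)
    adj = np.abs(corr) >= threshold
    np.fill_diagonal(adj, False)          # no self-edges
    return adj

def bagged_network(expr, n_boot=200, threshold=0.6, seed=0):
    """Fraction of bootstrap resamples (over samples) in which each edge appears."""
    rng = np.random.default_rng(seed)
    n_genes, n_samples = expr.shape
    counts = np.zeros((n_genes, n_genes))
    for _ in range(n_boot):
        cols = rng.integers(0, n_samples, size=n_samples)  # sample with replacement
        counts += coexpression_edges(expr[:, cols], threshold)
    return counts / n_boot

# Synthetic data (assumption): gene 1 tracks gene 0, gene 2 is independent.
rng = np.random.default_rng(1)
g0 = rng.normal(size=40)
expr = np.vstack([g0, g0 + 0.3 * rng.normal(size=40), rng.normal(size=40)])

freq = bagged_network(expr)
```

Edges with a frequency near 1 persist across resamples, while edges that appear in only a few resamples are the unstable associations that aggregation is meant to filter out.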


Subjects
Gene Expression/genetics , Gene Regulatory Networks/genetics , Algorithms , Humans
11.
Arch Toxicol ; 91(6): 2343-2352, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28032149

ABSTRACT

Unravelling gene regulatory networks (GRNs) influenced by chemicals is a major challenge in systems toxicology. Because toxicant-induced GRNs evolve over time and dose, the analysis of global gene expression data measured at multiple time points and doses will provide insight in the adverse effects of compounds. Therefore, there is a need for mathematical methods for GRN identification from time-over-dose-dependent data. One of the current approaches for GRN inference is Time Series Network Identification (TSNI). TSNI is based on ordinary differential equations (ODE), describing the time evolution of the expression of each gene, which is assumed to be dependent on the expression of other genes and an external perturbation (i.e. chemical exposure). Here, we present Dose-Time Network Identification (DTNI), a method extending TSNI by including ODE describing how the expression of each gene evolves with dose, which is supposed to depend on the expression of other genes and the exposure time. We also adapted TSNI in order to enable inclusion of time-over-dose-dependent data from multiple compounds. Here, we show that DTNI outperforms TSNI in inferring a toxicant-induced GRN. Moreover, we show that DTNI is a suitable method to infer a GRN dose- and time-dependently induced by a group of compounds influencing a common biological process. Applying DTNI on experimental data from TG-GATEs, we demonstrate that DTNI provides in-depth information on the mode of action of compounds, in particular key events and potential molecular initiating events. Furthermore, DTNI also discloses several unknown interactions which have to be verified experimentally.
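
The ODE-based idea described in the abstract above (each gene's rate of change depends on the other genes and on an external perturbation) can be sketched as follows. The sketch assumes a linear form, dx/dt = A x + B u, which the abstract itself does not spell out, and it covers only the time dimension, not the dose dimension that DTNI adds. The matrices `A_true` and `B_true`, the step size, and the least-squares fit are illustrative assumptions, not the published procedure.

```python
import numpy as np

# Hypothetical two-gene system: gene 0 responds to the exposure u,
# gene 1 is activated by gene 0; both decay over time.
A_true = np.array([[-1.0, 0.0],
                   [0.8, -0.5]])
B_true = np.array([[1.0],
                   [0.0]])

dt, steps = 0.01, 400
u = np.ones((1, steps))                   # constant exposure
X = np.zeros((2, steps + 1))              # expression starts at zero
for k in range(steps):
    # Forward Euler step of dx/dt = A x + B u
    X[:, k + 1] = X[:, k] + dt * (A_true @ X[:, k] + B_true @ u[:, k])

# Network identification: regress finite-difference derivatives on [x; u].
dX = (X[:, 1:] - X[:, :-1]) / dt          # shape (2, steps)
Z = np.vstack([X[:, :-1], u])             # shape (3, steps)
theta, *_ = np.linalg.lstsq(Z.T, dX.T, rcond=None)   # shape (3, 2)
A_est = theta.T[:, :2]                    # gene-gene interaction estimates
B_est = theta.T[:, 2:]                    # perturbation effect estimates
```

Because the data here are noise-free and generated by the same Euler scheme, the fit recovers `A_true` and `B_true` almost exactly; real expression data would need smoothing or regularization before such a regression is meaningful.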


Subjects
Drug-Related Side Effects and Adverse Reactions/genetics , Gene Expression/drug effects , Gene Regulatory Networks/drug effects , Hazardous Substances/toxicity , Biological Models , Toxicogenetics/methods , Algorithms , Chemical and Drug Induced Liver Injury/etiology , Chemical and Drug Induced Liver Injury/genetics , Computer Simulation , Drug Dose-Response Relationship , Drug-Related Side Effects and Adverse Reactions/etiology , Hepatocytes/drug effects , Hepatocytes/metabolism , Humans , Regression Analysis , Reproducibility of Results , Signal Transduction/drug effects , Signal Transduction/genetics , Time Factors
12.
BMC Bioinformatics ; 17(1): 545, 2016 Dec 28.
Article in English | MEDLINE | ID: mdl-28031031

ABSTRACT

BACKGROUND: Inferring the topology of gene regulatory networks (GRNs) from microarray gene expression data has many potential applications, such as identifying candidate drug targets and providing valuable insights into the biological processes. It remains a challenge due to the fact that the data is noisy and high dimensional, and there exists a large number of potential interactions. RESULTS: We introduce an ensemble gene regulatory network inference method PLSNET, which decomposes the GRN inference problem with p genes into p subproblems and solves each of the subproblems by using Partial least squares (PLS) based feature selection algorithm. Then, a statistical technique is used to refine the predictions in our method. The proposed method was evaluated on the DREAM4 and DREAM5 benchmark datasets and achieved higher accuracy than the winners of those competitions and other state-of-the-art GRN inference methods. CONCLUSIONS: Superior accuracy achieved on different benchmark datasets, including both in silico and in vivo networks, shows that PLSNET reaches state-of-the-art performance.
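
The decomposition described in the abstract above (one GRN problem with p genes split into p per-gene subproblems) can be sketched as below. As a loudly stated assumption, the sketch substitutes ordinary least squares for the PLS-based feature selection that PLSNET uses, and it omits the ensemble and refinement steps; the function name `per_gene_regression_scores` and the synthetic data are hypothetical.

```python
import numpy as np

def per_gene_regression_scores(expr):
    """expr: samples x genes. Returns W where W[i, j] scores the edge i -> j.

    Each target gene j defines one subproblem: predict gene j from all
    other genes. Ordinary least squares stands in for PLS here.
    """
    n_samples, p = expr.shape
    Xs = (expr - expr.mean(axis=0)) / expr.std(axis=0)   # standardize genes
    W = np.zeros((p, p))
    for j in range(p):
        others = [i for i in range(p) if i != j]
        coef, *_ = np.linalg.lstsq(Xs[:, others], Xs[:, j], rcond=None)
        W[others, j] = np.abs(coef)       # coefficient magnitude as confidence
    return W

# Synthetic data (assumption): gene 2 is driven by gene 0, the rest is noise.
rng = np.random.default_rng(1)
expr = rng.normal(size=(200, 4))
expr[:, 2] = 0.9 * expr[:, 0] + 0.2 * rng.normal(size=200)

W = per_gene_regression_scores(expr)
```

Note that a regression of this kind also assigns a high score to the reverse edge (gene 2 to gene 0), so steady-state data alone do not identify the direction of regulation.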


Subjects
Gene Regulatory Networks , Genetic Models , Oligonucleotide Array Sequence Analysis/methods , Gene Expression , Least-Squares Analysis
13.
Biophys Rev ; 16(1): 57-67, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38495440

ABSTRACT

Learning how multicellular organs are developed from single cells to different cell types is a fundamental problem in biology. With the high-throughput scRNA-seq technology, computational methods have been developed to reveal the temporal dynamics of single cells from transcriptomic data, from phenomena on cell trajectories to the underlying mechanism that formed the trajectory. There are several distinct families of computational methods including Trajectory Inference (TI), Lineage Tracing (LT), and Gene Regulatory Network (GRN) Inference which are involved in such studies. This review summarizes these computational approaches which use scRNA-seq data to study cell differentiation and cell fate specification as well as the advantages and limitations of different methods. We further discuss how GRNs can potentially affect cell fate decisions and trajectory structures. Supplementary Information: The online version contains supplementary material available at 10.1007/s12551-023-01090-5.

14.
Genome Biol ; 25(1): 88, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38589899

ABSTRACT

Inferring gene regulatory networks (GRNs) from single-cell data is challenging due to heuristic limitations. Existing methods also lack estimates of uncertainty. Here we present Probabilistic Matrix Factorization for Gene Regulatory Network Inference (PMF-GRN). Using single-cell expression data, PMF-GRN infers latent factors capturing transcription factor activity and regulatory relationships. Using variational inference allows hyperparameter search for principled model selection and direct comparison to other generative models. We extensively test and benchmark our method using real single-cell datasets and synthetic data. We show that PMF-GRN infers GRNs more accurately than current state-of-the-art single-cell GRN inference methods, offering well-calibrated uncertainty estimates.


Subjects
Algorithms , Gene Regulatory Networks
15.
Interdiscip Sci ; 2024 May 23.
Article in English | MEDLINE | ID: mdl-38778003

ABSTRACT

Gene regulatory network (GRN) inference based on single-cell RNA sequencing (scRNA-seq) data plays a crucial role in understanding the regulatory mechanisms between genes. Various computational methods have been employed for GRN inference, but their network accuracy and model generalization are unsatisfactory, largely because of high-dimensional data and network sparsity. In this paper, we propose CVGAE, a self-supervised method for gene regulatory network inference using single-cell RNA sequencing data. CVGAE uses a graph neural network for inductive representation learning, which merges gene expression data and observed topology into a low-dimensional vector space. The trained vectors are used to calculate the distance between genes and, in turn, to predict interactions between genes. In the overall framework, FastICA is used to relieve the computational complexity caused by high-dimensional data, and CVGAE adopts multi-stacked GraphSAGE layers as an encoder and an improved decoder to overcome network sparsity. CVGAE is evaluated on several single-cell datasets containing four related ground-truth networks, and the results show that CVGAE achieves better performance than the comparative methods. To validate its learning and generalization capabilities, CVGAE is applied in a few-shot setting by changing the ratio of the training set to the test set. In the few-shot setting, CVGAE obtains comparable or superior performance.

16.
Comput Struct Biotechnol J ; 23: 1036-1050, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38464935

ABSTRACT

Melanoma, the deadliest form of skin cancer, can metastasize to different organs. Molecular differences between brain and extracranial melanoma metastases are poorly understood. Here, promoter methylation and gene expression of 11 heterogeneous patient-matched pairs of brain and extracranial metastases were analyzed using melanoma-specific gene regulatory networks learned from public transcriptome and methylome data followed by network-based impact propagation of patient-specific alterations. This innovative data analysis strategy allowed prediction of the potential impacts of patient-specific driver candidate genes on other genes and pathways. The patient-matched metastasis pairs clustered into three robust subgroups with specific downstream targets with known roles in cancer, including melanoma (SG1: RBM38, BCL11B, SG2: GATA3, FES, SG3: SLAMF6, PYCARD). Patient subgroups and ranking of target gene candidates were confirmed in a validation cohort. In summary, computational network-based impact analyses of heterogeneous metastasis pairs predicted individual regulatory differences in melanoma brain metastases, cumulating into three consistent subgroups with specific downstream target genes.

17.
bioRxiv ; 2023 Nov 13.
Article in English | MEDLINE | ID: mdl-38014297

ABSTRACT

Reconstruction of gene regulatory networks (GRNs) from expression data is a significant open problem. Common approaches train a machine learning (ML) model to predict a gene's expression using transcription factors' (TFs') expression as features and designate important features/TFs as regulators of the gene. Here, we present an entirely different paradigm, where GRN edges are directly predicted by the ML model. The new approach, named "SPREd" is a simulation-supervised neural network for GRN inference. Its inputs comprise expression relationships (e.g., correlation, mutual information) between the target gene and each TF and between pairs of TFs. The output includes binary labels indicating whether each TF regulates the target gene. We train the neural network model using synthetic expression data generated by a biophysics-inspired simulation model that incorporates linear as well as non-linear TF-gene relationships and diverse GRN configurations. We show SPREd to outperform state-of-the-art GRN reconstruction tools GENIE3, ENNET, PORTIA and TIGRESS on synthetic datasets with high co-expression among TFs, similar to that seen in real data. A key advantage of the new approach is its robustness to relatively small numbers of conditions (columns) in the expression matrix, which is a common problem faced by existing methods. Finally, we evaluate SPREd on real data sets in yeast that represent gold standard benchmarks of GRN reconstruction and show it to perform significantly better than or comparably to existing methods. In addition to its high accuracy and speed, SPREd marks a first step towards incorporating biophysics principles of gene regulation into ML-based approaches to GRN reconstruction.

18.
Comput Struct Biotechnol J ; 21: 21-33, 2023.
Article in English | MEDLINE | ID: mdl-36514338

ABSTRACT

Hematopoietic stem cell (HSC) aging is a multifactorial event leading to changes in HSC properties and functions, which are intrinsically coordinated and affect the early hematopoiesis. To better understand the mechanisms and factors controlling these changes, we developed an original strategy to construct a Boolean model of HSC differentiation. Based on our previous scRNA-seq data, we exhaustively characterized active transcription modules or regulons along the differentiation trajectory and constructed an influence graph between 15 selected components involved in the dynamics of the process. Then we defined dynamical constraints between observed cellular states along the trajectory and using answer set programming with in silico perturbation analysis, we obtained a Boolean model explaining the early priming of HSCs. Finally, perturbations of the model based on age-related changes revealed important deregulations, such as the overactivation of Egr1 and Junb or the loss of Cebpa activation by Gata2. These new regulatory mechanisms were found to be relevant for the myeloid bias of aged HSC and explain the decreased transcriptional priming of HSCs to all mature cell types except megakaryocytes.
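
The kind of Boolean model and in silico perturbation described in the abstract above can be illustrated with a toy example. Only the Gata2-activates-Cebpa edge is taken from the abstract; the readout node `X`, the update rules, the synchronous update scheme, and the initial state are hypothetical, and the answer set programming used by the authors to infer their model is not reproduced here.

```python
def step(state, rules):
    """Synchronous update: every node is recomputed from the previous state."""
    return {node: rule(state) for node, rule in rules.items()}

def run_to_fixed_point(state, rules, max_steps=50):
    """Iterate until the state stops changing, or fail if it never settles."""
    for _ in range(max_steps):
        nxt = step(state, rules)
        if nxt == state:
            return state
        state = nxt
    raise RuntimeError("no fixed point reached (the dynamics may cycle)")

# Toy rules (assumption): Gata2 is an input, Gata2 activates Cebpa,
# and Cebpa activates a placeholder downstream readout X.
rules = {
    "Gata2": lambda s: s["Gata2"],
    "Cebpa": lambda s: s["Gata2"],
    "X": lambda s: s["Cebpa"],
}

# In silico perturbation: remove the activation of Cebpa by Gata2.
perturbed_rules = dict(rules, Cebpa=lambda s: False)

start = {"Gata2": True, "Cebpa": False, "X": False}
baseline = run_to_fixed_point(start, rules)
perturbed = run_to_fixed_point(start, perturbed_rules)
```

Comparing `baseline` with `perturbed` shows how removing a single regulatory edge changes the stable state reached from the same starting condition, which is the basic logic behind perturbation analysis of a Boolean model.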

19.
Genes (Basel) ; 14(2), 2023 Jan 20.
Article in English | MEDLINE | ID: mdl-36833196

ABSTRACT

Context: Inferring gene regulatory networks (GRN) from high-throughput gene expression data is a challenging task for which different strategies have been developed. Nevertheless, no method wins in every setting, and each method has its advantages, intrinsic biases, and application domains. Thus, in order to analyze a dataset, users should be able to test different techniques and choose the most appropriate one. This step can be particularly difficult and time consuming, since most methods' implementations are made available independently, possibly in different programming languages. The implementation of an open-source library containing different inference methods within a common framework is expected to be a valuable toolkit for the systems biology community. Results: In this work, we introduce GReNaDIne (Gene Regulatory Network Data-driven Inference), a Python package that implements 18 machine learning data-driven gene regulatory network inference methods. It also includes eight generalist preprocessing techniques, suitable for both RNA-seq and microarray dataset analysis, as well as four normalization techniques dedicated to RNA-seq. In addition, this package implements the possibility to combine the results of different inference tools to form robust and efficient ensembles. This package has been successfully assessed under the DREAM5 challenge benchmark dataset. The open-source GReNaDIne Python package is made freely available in a dedicated GitLab repository, as well as in the official third-party software repository PyPI Python Package Index. The latest documentation on the GReNaDIne library is also available at Read the Docs, an open-source software documentation hosting platform. Contribution: The GReNaDIne tool represents a technological contribution to the field of systems biology. This package can be used to infer gene regulatory networks from high-throughput gene expression data using different algorithms within the same framework. In order to analyze their datasets, users can apply a battery of preprocessing and postprocessing tools and choose the most suitable inference method from the GReNaDIne library and even combine the output of different methods to obtain more robust results. The results format provided by GReNaDIne is compatible with well-known complementary refinement tools such as PYSCENIC.


Subjects
Computational Biology , Gene Regulatory Networks , Computational Biology/methods , Saint Vincent and the Grenadines , Software , Gene Expression
20.
Methods Mol Biol ; 2395: 13-31, 2022.
Article in English | MEDLINE | ID: mdl-34822147

ABSTRACT

Over the last few decades, many genes have been functionally characterized and shown to be involved in various metabolic, developmental, and signaling pathways. However, it still remains unclear how all these genes and pathways integrate into a unique regulatory network to coordinate development and growth, or the response to the environment. This is why unraveling the topology of gene regulatory networks (GRN) has become central to our understanding of all these processes. The recent advancement of high-throughput methods has provided an enormous amount of -omics data. These data can now be exploited for rapid network reconstruction with statistical inference methods. We recently published a new GRN inference algorithm called TDCor which reconstructs GRN from time-series transcriptomic data. The algorithm has been released in the form of an R package. Here, I describe in detail how to install and use the package.


Subjects
Gene Regulatory Networks , Transcriptome , Algorithms , Computational Biology , Time Factors