1.
Brief Bioinform ; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-38904542

ABSTRACT

The inherent heterogeneity of cancer contributes to highly variable responses to anticancer treatments. This underscores the need to first identify precise biomarkers through complex multi-omics datasets that are now available. Although much research has focused on this aspect, identifying biomarkers associated with distinct drug responders remains a major challenge. Here, we develop MOMLIN, a multi-modal and -omics machine learning integration framework, to enhance drug-response prediction. MOMLIN jointly utilizes sparse correlation algorithms and class-specific feature selection algorithms to identify multi-modal and -omics-associated interpretable components. MOMLIN was applied to 147 patients' breast cancer datasets (clinical, mutation, gene expression, tumor microenvironment cells and molecular pathways) to analyze drug-response class predictions for non-responders and variable responders. Notably, MOMLIN achieves an average AUC of 0.989, which is at least 10% greater than current state-of-the-art methods (data integration analysis for biomarker discovery using latent components, multi-omics factor analysis, sparse canonical correlation analysis). Moreover, MOMLIN not only detects known individual biomarkers, such as genes at the mutation/expression level, but, most importantly, also correlates multi-modal and -omics network biomarkers for each response class. For example, an interaction between ER-negative-HMCN1-COL5A1 mutations-FBXO2-CSF3R expression-CD8 emerges as a multimodal biomarker for responders, potentially affecting antimicrobial peptides and FLT3 signaling pathways. In contrast, for resistance cases, a distinct combination of lymph node-TP53 mutation-PON3-ENSG00000261116 lncRNA expression-HLA-E-T-cell exclusions emerged as multimodal biomarkers, possibly impacting the neurotransmitter release cycle pathway. MOMLIN, therefore, is expected to advance precision medicine, for example by detecting context-specific multi-omics network biomarkers and better predicting drug-response classifications.


Subject(s)
Breast Neoplasms , Machine Learning , Humans , Breast Neoplasms/genetics , Breast Neoplasms/drug therapy , Breast Neoplasms/metabolism , Female , Biomarkers, Tumor/genetics , Biomarkers, Tumor/metabolism , Algorithms , Antineoplastic Agents/therapeutic use , Antineoplastic Agents/pharmacology , Computational Biology/methods , Genomics/methods
2.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38711370

ABSTRACT

Across many scientific disciplines, the development of computational models and algorithms for generating artificial or synthetic data is gaining momentum. In biology, there is a great opportunity to explore this further, as more and more big data at the multi-omics level have been generated recently. In this opinion article, we discuss the latest trends in biological applications based on process-driven and data-driven aspects. Moving ahead, we believe these methodologies can help shape novel multi-omics-scale cellular inferences.


Subject(s)
Algorithms , Computational Biology , Computational Biology/methods , Genomics/methods , Humans , Big Data , Proteomics/methods , Multiomics
3.
Elife ; 12, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38567944

ABSTRACT

Aging and senescence are characterized by pervasive transcriptional dysfunction, including increased expression of transposons and introns. Our aim was to elucidate mechanisms behind this increased expression. Most transposons are found within genes and introns, with a large minority being close to genes. This raises the possibility that transcriptional readthrough and intron retention are responsible for age-related changes in transposon expression rather than expression of autonomous transposons. To test this, we compiled public RNA-seq datasets from aged human fibroblasts, replicative and drug-induced senescence in human cells, and RNA-seq from aging mice and senescent mouse cells. Indeed, our reanalysis revealed a correlation between transposon expression, intron retention, and transcriptional readthrough across samples and within samples. Both intron retention and readthrough increased with aging or cellular senescence, and these transcriptional defects were more pronounced in human samples as compared to those of mice. In support of a causal connection between readthrough and transposon expression, analysis of models showing induced transcriptional readthrough confirmed that they also show elevated transposon expression. Taken together, our data suggest that elevated transposon reads during aging seen in various RNA-seq datasets are concomitant with multiple transcriptional defects. Intron retention and transcriptional readthrough are the most likely explanation for the expression of transposable elements that lack a functional promoter.


Subject(s)
Aging , DNA Transposable Elements , Animals , Mice , Humans , Aged , Introns , RNA-Seq , Aging/genetics , Promoter Regions, Genetic , DNA Transposable Elements/genetics
4.
Curr Opin Biotechnol ; 87: 103115, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38547588

ABSTRACT

With the continuous growth of the global population, compounded by post-pandemic food security challenges due to labor shortages, effects of climate change, political conflicts, limited land for agriculture, and carbon emissions control, addressing food production in a sustainable manner for future generations is critical. Microorganisms are potential alternative food sources that can help close the gap in food production. For the development of more efficient and yield-enhancing products, it is necessary to have a better understanding of the underlying regulatory molecular pathways of microbial growth. Nevertheless, as microbes are regulated at multiomics scales, current research focusing on single omics (genomics, proteomics, or metabolomics) independently is inadequate for optimizing growth and product output. Here, we discuss digital twin (DT) approaches that integrate systems biology and artificial intelligence in analyzing multiomics datasets to yield a microbial replica model for in silico testing before production. DT models can thus provide a holistic understanding of microbial growth and metabolite biosynthesis mechanisms, as well as identify crucial production bottlenecks. Our argument, therefore, is to support the development of novel DT models that can potentially revolutionize microorganism-based alternative food production efficiency.


Subject(s)
Systems Biology , Artificial Intelligence , Metabolomics/methods , Genomics , Bacteria/metabolism , Bacteria/genetics
5.
Methods Mol Biol ; 2745: 3-19, 2024.
Article in English | MEDLINE | ID: mdl-38060176

ABSTRACT

Living cells display dynamic and complex behaviors. To understand their response and to infer novel insights not possible with traditional reductionist approaches, over the last few decades various computational modelling methodologies have been developed. In this chapter, we focus on modelling the dynamic metabolic response, using linear and nonlinear ordinary differential equations, of an engineered Escherichia coli MG1655 strain with plasmid pJBEI-6409 that produces limonene. We show the systems biology steps involved from collecting time-series data of living cells, to dynamic model creation and fitting the model with experimental responses using COPASI software.


Subject(s)
Escherichia coli , Software , Limonene/metabolism , Computer Simulation , Escherichia coli/genetics , Escherichia coli/metabolism , Systems Biology/methods , Models, Biological
6.
Biomolecules ; 13(7), 2023 07 06.
Article in English | MEDLINE | ID: mdl-37509116

ABSTRACT

For many years, there has been general interest in developing virtual cells or digital twin models [...].


Subject(s)
Systems Biology , Humans
7.
NPJ Syst Biol Appl ; 9(1): 28, 2023 Jun 24.
Article in English | MEDLINE | ID: mdl-37355674

ABSTRACT

Cancer is widely considered a genetic disease. Notably, recent works have highlighted that every human gene may possibly be associated with cancer. Thus, the distinction between genes that drive oncogenesis and those that are associated with the disease, but do not play a role, requires attention. Here we investigated single-cell and bulk (cell-population) datasets of several cancer transcriptomes and proteomes in relation to their healthy counterparts. When analyzed by machine learning and statistical approaches in bulk datasets, both general and cancer-specific oncogenes, as defined by the Cancer Genes Census, show invariant behavior to randomly selected gene sets of the same size for all cancers. However, when protein-protein interaction analyses were performed, the oncogene-derived networks showed higher connectivity than those derived from random genes. Moreover, at single-cell scale, we observe variant behavior in a subset of oncogenes for each considered cancer type. Moving forward, we concur that the role of oncogenes needs to be further scrutinized by adopting protein causality and higher-resolution single-cell analyses.


Subject(s)
Genes, Tumor Suppressor , Neoplasms , Humans , Oncogenes/genetics , Neoplasms/genetics , Transcriptome
8.
Adv Nutr ; 14(1): 1-11, 2023 01.
Article in English | MEDLINE | ID: mdl-36811582

ABSTRACT

Food security has become a pressing issue in the modern world. The ever-increasing world population, ongoing COVID-19 pandemic, and political conflicts together with climate change issues make the problem very challenging. Therefore, fundamental changes to the current food system and new sources of alternative food are required. Recently, the exploration of alternative food sources has been supported by numerous governmental and research organizations, as well as by small and large commercial ventures. Microalgae are gaining momentum as an effective source of alternative laboratory-based nutritional proteins as they are easy to grow under variable environmental conditions, with the added advantage of absorbing carbon dioxide. Despite their attractiveness, the utilization of microalgae faces several practical limitations. Here, we discuss both the potential and challenges of microalgae in food sustainability and their possible long-term contribution to the circular economy of converting food waste into feed via modern methods. We also argue that systems biology and artificial intelligence can play a role in overcoming some of the challenges and limitations through data-guided metabolic flux optimization and by systematically increasing the growth of the microalgae strains without negative outcomes, such as toxicity. This requires microalgae databases rich in omics data and further development of methods for mining and analyzing them.


Subject(s)
COVID-19 , Microalgae , Refuse Disposal , Humans , Food , Artificial Intelligence , Multiomics , Pandemics , Machine Learning
9.
Methods Mol Biol ; 2553: 221-263, 2023.
Article in English | MEDLINE | ID: mdl-36227547

ABSTRACT

Research in synthetic biology and metabolic engineering requires a deep understanding of the function and regulation of complex pathway genes. This can be achieved through gene expression profiling, which quantifies the transcriptome-wide expression under any condition, such as a cell development stage, mutant, disease, or treatment with a drug. The expression profiling is usually done using high-throughput techniques such as RNA sequencing (RNA-Seq) or microarray. Although the two methods are based on different technical approaches, both provide quantitative measures of the expression levels of thousands of genes. The expression levels of the genes are compared under different conditions to identify the differentially expressed genes (DEGs), the genes with different expression levels under different conditions. DEGs, usually numbering in the thousands, are then investigated using bioinformatics and data analytic tools to infer and compare their functional roles between conditions. Dealing with such large datasets, therefore, requires intensive data processing and analyses to ensure their quality and produce results that are statistically sound. Thus, there is a need for deep statistical and bioinformatics knowledge to deal with high-throughput gene expression data. This represents a barrier for wet-lab biologists with limited computational, programming, and data analytic skills that prevents them from realizing the full potential of the data. In this chapter, we present a step-by-step protocol to perform transcriptome analysis using GeneCloudOmics, a cloud-based web server that provides an end-to-end platform for high-throughput gene expression analysis.


Subject(s)
Synthetic Biology , Transcriptome , Computational Biology/methods , Data Science , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods
10.
Brief Bioinform ; 23(6), 2022 11 19.
Article in English | MEDLINE | ID: mdl-36184188

ABSTRACT

In recent years, artificial intelligence (AI)/machine learning has emerged as a plausible alternative to systems biology for the elucidation of biological phenomena and for attaining specified design objectives in synthetic biology. Although considered highly disruptive with numerous notable successes so far, we seek to bring attention to both the fundamental and practical pitfalls of their usage, especially in illuminating emergent behaviors from chaotic or stochastic systems in biology. Without deliberating on their suitability and the required data qualities and pre-processing approaches beforehand, the research and development community could experience 'AI winters' similar to those that have plagued other fields. Instead, we anticipate the integration or combination of the two approaches, where appropriate, moving forward.


Subject(s)
Artificial Intelligence , Systems Biology , Machine Learning
11.
Metab Eng Commun ; 15: e00209, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36281261

ABSTRACT

Metabolic engineering involves the manipulation of microbes to produce desirable compounds through genetic engineering or synthetic biology approaches. Metabolomics involves the quantitation of intracellular and extracellular metabolites, where mass spectrometry and nuclear magnetic resonance based analytical instrumentation are often used. Here, the experimental designs, sample preparations, metabolite quenching and extraction are essential to the quantitative metabolomics workflow. The resultant metabolomics data can then be used with computational modelling approaches, such as kinetic and constraint-based modelling, to better understand underlying mechanisms and bottlenecks in the synthesis of desired compounds, thereby accelerating research through systems metabolic engineering. Constraint-based models, such as genome scale models, have been used successfully to enhance the yield of desired compounds from engineered microbes; however, unlike kinetic or dynamic models, constraint-based models do not incorporate regulatory effects. Nevertheless, the lack of time-series metabolomic data generation has hindered the usefulness of dynamic models to date. In this review, we show that improvements in automation, dynamic real-time analysis and high throughput workflows can drive the generation of more high-quality data for dynamic models through time-series metabolomics data generation. Spatial metabolomics also has the potential to be used as a complementary approach to conventional metabolomics, as it provides information on the localization of metabolites. However, more effort must be undertaken to identify metabolites from spatial metabolomics data derived through imaging mass spectrometry, where machine learning approaches could prove useful. On the other hand, single-cell metabolomics has also seen rapid growth, where understanding cell-cell heterogeneity can provide more insights into efficient metabolic engineering of microbes. Moving forward, with potential improvements in automation, dynamic real-time analysis, high throughput workflows, and spatial metabolomics, more data can be produced and studied using machine learning algorithms, in conjunction with dynamic models, to generate qualitative and quantitative predictions to advance metabolic engineering efforts.

12.
Genomics ; 114(1): 215-228, 2022 01.
Article in English | MEDLINE | ID: mdl-34843905

ABSTRACT

The study of gene expression variability, especially for cancer and cell differentiation studies, has become important. Here, we investigate transcriptome-wide scatter of 23 cell types and conditions across different levels of biological complexity. We focused on genes that act like toggle switches between pairwise replicates of the same cell type, i.e. genes expressed in one replicate and not expressed in the other, sometimes also referred to as ON/OFF genes. The proportion of these toggle genes dramatically increases from unicellular to multicellular organization, especially for development and cancer cells. A relevant portion of toggle switches are non-coding genes: in unicellular systems the most represented classes are tRNA and rRNA, while multicellular systems more frequently show lncRNA, sncRNA and pseudogenes. Notably, disease-associated microRNAs (miRNAs), pseudogenes and numerous uncharacterized transcripts are present in both development and cancer cells. On top of the known intrinsic and extrinsic factors, our work indicates toggle genes as a novel collective component creating transcriptome-wide variability. This requires further investigation for elucidating both evolutionary and disease processes.
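
The toggle-gene definition above (expressed in one replicate, not expressed in the other) can be illustrated with a minimal sketch. The function name `toggle_genes`, the zero expression threshold, and the toy expression values are assumptions made for illustration only; the abstract does not specify how the authors call a gene ON or OFF.

```python
def toggle_genes(rep_a, rep_b, threshold=0.0):
    """Return genes that are ON in exactly one of two replicates.

    A gene counts as ON when its expression exceeds `threshold`
    (an assumed cutoff, not taken from the paper).
    """
    toggles = []
    # Only genes measured in both replicates can be compared.
    for gene in sorted(rep_a.keys() & rep_b.keys()):
        on_a = rep_a[gene] > threshold
        on_b = rep_b[gene] > threshold
        if on_a != on_b:
            toggles.append(gene)
    return toggles


# Hypothetical expression values for two replicates of one cell type.
replicate_1 = {"geneA": 12.0, "geneB": 0.0, "geneC": 3.5, "geneD": 0.0}
replicate_2 = {"geneA": 10.5, "geneB": 4.2, "geneC": 0.0, "geneD": 0.0}

toggles = toggle_genes(replicate_1, replicate_2)
proportion = len(toggles) / len(replicate_1)
print(toggles, proportion)
```

Here `geneD` is OFF in both replicates and therefore is not a toggle gene; only genes whose ON/OFF state differs between the pair are counted.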


Subject(s)
MicroRNAs , Neoplasms , RNA, Long Noncoding , Cell Differentiation , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Neoplasms/genetics , Transcriptome
13.
Front Immunol ; 12: 736349, 2021.
Article in English | MEDLINE | ID: mdl-34867957

ABSTRACT

The majority of the human genome is non-coding. Recent research has revealed that about half of these genome sequences are made up of transposable elements (TEs). A branch of these belongs to the endogenous retroviruses (ERVs), which derive from germline viral infections that occurred millions of years ago. They are generally harmless, as evolutionary mutations have made them unable to produce viral agents, and they are mostly epigenetically silenced. Nevertheless, ERVs can be expressed through still unknown mechanisms, and recent evidence has shown links between ERVs and major proinflammatory diseases and cancers. The major challenge is to establish a detailed mechanistic understanding of these links, so that novel therapeutic approaches can be explored. Here, we provide a brief overview of TEs, human ERVs and their links to microbiome, innate immune response, proinflammatory diseases and cancer. Finally, we recommend the employment of systems biology approaches for future HERV research.


Subject(s)
Endogenous Retroviruses/pathogenicity , Inflammation/etiology , Autoimmune Diseases/etiology , Autoimmune Diseases/immunology , Autoimmune Diseases/virology , Biological Evolution , DNA Transposable Elements/genetics , Endogenous Retroviruses/genetics , Endogenous Retroviruses/immunology , Genome, Human , Humans , Immunity, Innate , Inflammation/immunology , Inflammation/virology , Machine Learning , Microbiota/immunology , Models, Biological , Neoplasms/etiology , Neoplasms/immunology , Neoplasms/virology , Neurodegenerative Diseases/etiology , Neurodegenerative Diseases/immunology , Neurodegenerative Diseases/virology , Systems Biology
15.
Front Bioinform ; 1: 693836, 2021.
Article in English | MEDLINE | ID: mdl-36303746

ABSTRACT

Gene expression profiling techniques, such as DNA microarray and RNA-Sequencing, have had a significant impact on our understanding of biological systems. They contribute to almost all aspects of biomedical research, including studying developmental biology, host-parasite relationships, disease progression and drug effects. However, high-throughput data generation presents challenges for many wet experimentalists to analyze and take full advantage of such rich and complex data. Here we present GeneCloudOmics, an easy-to-use web server for high-throughput gene expression analysis that extends the functionality of our previous ABioTrans with several new tools, including protein datasets analysis, and a web interface. GeneCloudOmics allows both microarray and RNA-Seq data analysis with a comprehensive range of data analytics tools in one package, which no other current standalone software or web-based tool offers. In total, GeneCloudOmics provides the user access to 23 different data analytical and bioinformatics tasks including reads normalization, scatter plots, linear/non-linear correlations, PCA, clustering (hierarchical, k-means, t-SNE, SOM), differential expression analyses, pathway enrichments, evolutionary analyses, pathological analyses, and protein-protein interaction (PPI) identifications. Furthermore, GeneCloudOmics allows the direct import of gene expression data from the NCBI Gene Expression Omnibus database. The user can perform all tasks rapidly through an intuitive graphical user interface that overcomes the hassle of coding, installing tools/packages/libraries and dealing with operating systems compatibility and version issues, complications that make data analysis tasks challenging for biologists. Thus, GeneCloudOmics is a one-stop open-source tool for gene expression data analysis and visualization. It is freely available at http://combio-sifbi.org/GeneCloudOmics.

16.
Sci Rep ; 10(1): 17483, 2020 10 15.
Article in English | MEDLINE | ID: mdl-33060728

ABSTRACT

Differentially expressed (DE) gene analysis is valuable for understanding comparative transcriptomics between cells, conditions or time evolution. However, the predominant way of identifying DE genes is to use arbitrary fold-change or expression-change thresholds as cutoffs. Here, we developed a more objective method, Scatter Overlay or ScatLay, to extract and graphically visualize DE genes across any two samples by utilizing their pair-wise scatter or transcriptome-wide noise, while factoring in replicate variability. We tested ScatLay for 3 cell types: between time points for Escherichia coli aerobiosis and Saccharomyces cerevisiae hypoxia, and between untreated and Etomoxir-treated Mus musculus embryonic stem cells. As a result, we obtain 1194, 2061 and 2932 DE genes, respectively. Next, we compared these data with two widely used current approaches (DESeq2 and NOISeq) with the typical twofold expression change threshold, and show that ScatLay reveals a significantly larger number of DE genes. Hence, our method provides a wider coverage of DE genes, and will likely pave the way for finding more novel regulatory genes in future works.
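
For context, the conventional fixed-threshold approach that the abstract contrasts with ScatLay can be sketched as follows. This is not the ScatLay method and not the DESeq2 or NOISeq procedure; it only illustrates a plain twofold cutoff. The function name `fold_change_de`, the pseudocount of 1, and the toy expression values are assumptions.

```python
import math


def fold_change_de(sample_a, sample_b, fold=2.0, pseudocount=1.0):
    """Call genes DE when |log2 fold change| meets a fixed cutoff.

    The pseudocount (an assumed value) avoids division by zero for
    genes with zero expression in one sample.
    """
    cutoff = math.log2(fold)
    de = []
    for gene in sorted(sample_a.keys() & sample_b.keys()):
        ratio = (sample_b[gene] + pseudocount) / (sample_a[gene] + pseudocount)
        if abs(math.log2(ratio)) >= cutoff:
            de.append(gene)
    return de


# Hypothetical expression values for two conditions.
untreated = {"g1": 9.0, "g2": 3.0, "g3": 15.0}
treated = {"g1": 39.0, "g2": 4.0, "g3": 3.0}

de_genes = fold_change_de(untreated, treated)
print(de_genes)
```

The choice of `fold` is exactly the arbitrary parameter the abstract criticizes: changing it from 2.0 to 1.2 would also flag `g2` in this toy example.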


Subject(s)
Computational Biology/methods , Gene Expression Regulation , Transcriptome , Animals , Cell Hypoxia , Computer Graphics , Embryonic Stem Cells/metabolism , Enzyme Inhibitors/pharmacology , Epoxy Compounds/pharmacology , Escherichia coli/metabolism , Gene Expression Profiling , Male , Mice , Mice, Inbred C57BL , Principal Component Analysis , Programming Languages , Saccharomyces cerevisiae/metabolism , Scattering, Radiation , Systems Biology
17.
Metab Eng Commun ; 11: e00149, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33072513

ABSTRACT

Metabolic engineering aims to maximize the production of bio-economically important substances (compounds, enzymes, or other proteins) through the optimization of the genetics, cellular processes and growth conditions of microorganisms. This requires detailed understanding of underlying metabolic pathways involved in the production of the targeted substances, and how the cellular processes or growth conditions are regulated by the engineering. To achieve this goal, a large system of experimental techniques, compound libraries, computational methods and data resources, including multi-omics data, are used. The recent advent of multi-omics systems biology approaches has significantly impacted the field by opening new avenues to perform dynamic and large-scale analyses that deepen our knowledge of the manipulations. However, with the enormous volume of transcriptomics, proteomics and metabolomics data available, it is a daunting task to integrate the data for a more holistic understanding. Novel data mining and analytics approaches, including Artificial Intelligence (AI), can provide breakthroughs that traditional low-throughput, experiment-alone methods cannot easily achieve. Here, we review the latest attempts at combining systems biology and AI in metabolic engineering research, and highlight how this alliance can help overcome the current challenges facing industrial biotechnology, especially for food-related substances and compounds using microorganisms.

18.
Sci Rep ; 10(1): 5878, 2020 04 03.
Article in English | MEDLINE | ID: mdl-32246034

ABSTRACT

For any dynamical system, like living organisms, an attractor state is a set of variables or mechanisms that converge towards a stable system behavior despite a wide variety of initial conditions. Here, using multi-dimensional statistics, we investigate the global gene expression attractor mechanisms shaping anaerobic to aerobic state transition (AAT) of Escherichia coli in a bioreactor at early times. Out of 3,389 RNA-Seq expression changes over time, we identified 100 sharply changing genes that are key for guiding 1,700 genes into the AAT attractor basin. Collectively, these genes were named attractor genes and comprise 6 dynamic clusters. Apart from the expected anaerobic (glycolysis), aerobic (TCA cycle) and fermentation (succinate pathways) processes, sulphur metabolism, ribosome assembly and amino acid transport mechanisms together with 332 uncharacterised genes are also key for AAT. Overall, our work highlights the importance of multi-dimensional statistical analyses for revealing novel processes shaping AAT.


Subject(s)
Aerobiosis/genetics , Escherichia coli/metabolism , Transcriptome , Aerobiosis/physiology , Anaerobiosis/genetics , Anaerobiosis/physiology , Escherichia coli/genetics , Escherichia coli/physiology , Gene Expression Profiling , Gene Expression Regulation, Bacterial/genetics , Gene Expression Regulation, Bacterial/physiology , Genes, Bacterial/physiology , Transcriptome/genetics
19.
BMC Res Notes ; 12(1): 763, 2019 Nov 21.
Article in English | MEDLINE | ID: mdl-31752996

ABSTRACT

OBJECTIVE: Living cells display complex and non-linear behaviors, especially when exposed to environmental threats. Here, to understand the self-organizing cooperative behavior of the microorganism Pseudomonas aeruginosa, we developed a discrete spatiotemporal cellular automata model based on simple physical rules, similar to Conway's game of life. RESULTS: The time evolution model simulations were experimentally verified for P. aeruginosa biofilm for both control and antibiotic azithromycin (AZM) treated conditions. Our model suggests that AZM regulates single-cell motility, thereby resulting in delayed, but not abolished, biofilm formation. In addition, the model highlights that reproduction by cell-to-cell interaction is key for biofilm formation. Overall, this work highlights another example where biological evolutionary complexity may be interpreted using rules taken from theoretical disciplines.
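
As a point of reference for the comparison drawn above, the standard rules of Conway's Game of Life can be sketched in a few lines. This is the classic automaton only, not the authors' biofilm model, whose rules for motility and reproduction are not given in the abstract. The function name `life_step` and the set-of-coordinates representation are choices made for this sketch.

```python
def life_step(alive):
    """Apply one synchronous update of Conway's Game of Life.

    `alive` is a set of (x, y) coordinates of live cells on an
    unbounded grid; the function returns the next generation.
    """
    neighbour_counts = {}
    for (x, y) in alive:
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx or dy:  # skip the cell itself
                    key = (x + dx, y + dy)
                    neighbour_counts[key] = neighbour_counts.get(key, 0) + 1
    # Birth with exactly 3 neighbours; survival with 2 or 3.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in alive)
    }


# A "blinker": three cells in a row oscillate with period 2.
blinker = {(0, 1), (1, 1), (2, 1)}
after_one = life_step(blinker)
after_two = life_step(after_one)
print(sorted(after_one), after_two == blinker)
```

A model of the kind described in the abstract would replace the birth and survival rule with rules for cell movement and neighbour-dependent reproduction, while keeping the same discrete grid and time-step structure.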


Subject(s)
Anti-Bacterial Agents/pharmacology , Azithromycin/pharmacology , Biofilms/growth & development , Computer Simulation , Pseudomonas aeruginosa/physiology , Biofilms/drug effects , Biological Evolution , Cell Movement , Pseudomonas aeruginosa/drug effects , Spatio-Temporal Analysis
20.
Front Genet ; 10: 499, 2019.
Article in English | MEDLINE | ID: mdl-31214245

ABSTRACT

Here we report a bio-statistical/informatics tool, ABioTrans, developed in R for gene expression analysis. The tool allows the user to directly read RNA-Seq data files deposited in the Gene Expression Omnibus or GEO database. Operated using any web browser application, ABioTrans provides easy options for multiple statistical distribution fitting, Pearson and Spearman rank correlations, PCA, k-means and hierarchical clustering, differential expression (DE) analysis, Shannon entropy and noise (square of coefficient of variation) analyses, as well as Gene ontology classifications.
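
Two of the quantities named above can be written out directly. The sketch below computes noise as the squared coefficient of variation, as defined in the abstract, and a Shannon entropy over a histogram of expression values. ABioTrans itself is written in R; this Python version, the function names, the use of population variance, and the equal-width binning with 4 bins are all assumptions made for illustration and may differ from the tool's actual implementation.

```python
import math
from statistics import mean, pvariance


def noise_cv2(values):
    """Noise as squared coefficient of variation: variance / mean^2."""
    m = mean(values)
    return pvariance(values) / (m * m)


def shannon_entropy(values, bins=4):
    """Shannon entropy (in bits) of values binned into equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # top edge goes in last bin
        counts[idx] += 1
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


# Hypothetical expression values for one gene set.
expression = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(noise_cv2(expression), shannon_entropy(expression))
```

Both measures depend on choices the abstract does not state (sample versus population variance, number of bins), so absolute values from this sketch should not be compared with output from the tool.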
