1.
Nature ; 602(7896): 223-228, 2022 02.
Article in English | MEDLINE | ID: mdl-35140384

ABSTRACT

Many potential applications of artificial intelligence involve making real-time decisions in physical systems while interacting with humans. Automobile racing represents an extreme example of these conditions; drivers must execute complex tactical manoeuvres to pass or block opponents while operating their vehicles at their traction limits. Racing simulations, such as the PlayStation game Gran Turismo, faithfully reproduce the non-linear control challenges of real race cars while also encapsulating the complex multi-agent interactions. Here we describe how we trained agents for Gran Turismo that can compete with the world's best e-sports drivers. We combine state-of-the-art, model-free, deep reinforcement learning algorithms with mixed-scenario training to learn an integrated control policy that combines exceptional speed with impressive tactics. In addition, we construct a reward function that enables the agent to be competitive while adhering to racing's important, but under-specified, sportsmanship rules. We demonstrate the capabilities of our agent, Gran Turismo Sophy, by winning a head-to-head competition against four of the world's best Gran Turismo drivers. By describing how we trained championship-level racers, we demonstrate the possibilities and challenges of using these techniques to control complex dynamical systems in domains where agents must respect imprecisely defined human norms.


Subject(s)
Automobile Driving , Deep Learning , Reinforcement, Psychology , Sports , Video Games , Automobile Driving/standards , Competitive Behavior , Humans , Reward , Sports/standards
3.
BMC Bioinformatics ; 24(1): 252, 2023 Jun 15.
Article in English | MEDLINE | ID: mdl-37322439

ABSTRACT

BACKGROUND: Bioinformatics capability to analyze spatio-temporal dynamics of gene expression is essential in understanding animal development. Animal cells are spatially organized as functional tissues where cellular gene expression data contain information that governs morphogenesis during the developmental process. Although several computational tissue reconstruction methods using transcriptomics data have been proposed, those methods have been ineffective in arranging cells in their correct positions in tissues or organs unless spatial information is explicitly provided. RESULTS: This study demonstrates that stochastic self-organizing map clustering, with Markov chain Monte Carlo calculations for optimizing informative genes, effectively reconstructs any spatio-temporal topology of cells from their transcriptome profiles with only a coarse topological guideline. The method, eSPRESSO (enhanced SPatial REconstruction by Stochastic Self-Organizing Map), provides a powerful in silico spatio-temporal tissue reconstruction capability, as confirmed by using human embryonic heart and mouse embryo, brain, embryonic heart, and liver lobule with generally high reproducibility (average max. accuracy = 92.0%), while revealing topologically informative genes, or spatial discriminator genes. Furthermore, eSPRESSO was used for temporal analysis of human pancreatic organoids to infer rational developmental trajectories with several candidate 'temporal' discriminator genes responsible for various cell type differentiations. CONCLUSIONS: eSPRESSO provides a novel strategy for analyzing mechanisms underlying the spatio-temporal formation of cellular organizations.


Subject(s)
Gene Expression Profiling , Transcriptome , Humans , Animals , Mice , Reproducibility of Results , Brain , Cluster Analysis , Spatio-Temporal Analysis
4.
Mol Syst Biol ; 16(8): e9110, 2020 08.
Article in English | MEDLINE | ID: mdl-32845085

ABSTRACT

Systems biology has experienced dramatic growth in the number, size, and complexity of computational models. To reproduce simulation results and reuse models, researchers must exchange unambiguous model descriptions. We review the latest edition of the Systems Biology Markup Language (SBML), a format designed for this purpose. A community of modelers and software authors developed SBML Level 3 over the past decade. Its modular form consists of a core suited to representing reaction-based models and packages that extend the core with features suited to other model types including constraint-based models, reaction-diffusion models, logical network models, and rule-based models. The format leverages two decades of SBML and a rich software ecosystem that transformed how systems biologists build and interact with models. More recently, the rise of multiscale models of whole cells and organs, and new data sources such as single-cell measurements and live imaging, has precipitated new ways of integrating data with models. We provide our perspectives on the challenges presented by these developments and how SBML Level 3 provides the foundation needed to support this evolution.
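To make the idea of an "unambiguous model description" concrete, the following is a minimal sketch of a reaction-based model written in SBML Level 3 core syntax and read back with Python's standard library. The model itself (the ids toy_model, cell, S1, S2 and R1) is invented for illustration and does not come from the review; real tools would normally use a dedicated SBML library rather than a generic XML parser.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 core.
SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"

# Hypothetical toy model: one compartment, two species, one reaction S1 -> S2.
DOC = f"""<sbml xmlns="{SBML_NS}" level="3" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cell" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="S1" compartment="cell" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
      <species id="S2" compartment="cell" hasOnlySubstanceUnits="false"
               boundaryCondition="false" constant="false"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1" reversible="false" fast="false">
        <listOfReactants>
          <speciesReference species="S1" stoichiometry="1" constant="true"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="S2" stoichiometry="1" constant="true"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""

ns = {"sbml": SBML_NS}
root = ET.fromstring(DOC)

# Collect species ids in document order.
species = [s.get("id") for s in root.findall(".//sbml:species", ns)]

# Map each reaction id to its (reactants, products) species ids.
reactions = {}
for rxn in root.findall(".//sbml:reaction", ns):
    reactants = [r.get("species") for r in
                 rxn.findall("sbml:listOfReactants/sbml:speciesReference", ns)]
    products = [p.get("species") for p in
                rxn.findall("sbml:listOfProducts/sbml:speciesReference", ns)]
    reactions[rxn.get("id")] = (reactants, products)

print(species, reactions)
```

Because every element lives in a declared namespace, any consumer can tell core constructs apart from constructs added by a package, which is one way to picture the modular core-plus-packages design the abstract describes.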


Subject(s)
Systems Biology/methods , Animals , Humans , Logistic Models , Models, Biological , Software
5.
BMC Genomics ; 19(1): 715, 2018 Sep 27.
Article in English | MEDLINE | ID: mdl-30261835

ABSTRACT

BACKGROUND: Microarray and DNA-sequencing based technologies continue to produce enormous amounts of data on gene expression. This data has great potential to illuminate our understanding of biology and medicine, but the data alone is of limited value without computational tools to allow human investigators to visualize and interpret it in the context of their problem of interest. RESULTS: We created a web server called SHOE that provides an interactive, visual presentation of the available evidence of transcriptional regulation and gene co-expression to facilitate its exploration and interpretation. SHOE predicts the likely transcription factor binding sites in orthologous promoters of humans, mice, and rats using the combined information of 1) transcription factor binding preferences (position-specific scoring matrix (PSSM) libraries such as Transfac32, Jaspar, HOCOMOCO, ChIP-seq, SELEX, PBM, and iPS-reprogramming factor), 2) evolutionary conservation of putative binding sites in orthologous promoters, and 3) co-expression tendencies of gene pairs based on 1,714 normal human cells selected from the Gene Expression Omnibus Database. CONCLUSION: SHOE enables users to explore potential interactions between transcription factors and target genes via multiple data views, discover transcription factor binding motifs on top of gene co-expression, and visualize genes as a network of gene and transcription factors on its native gadget GeneViz, the CellDesigner pathway analyzer, and the Reactome database to search the pathways involved. As we demonstrate here when using the CREB1 and Nf-κB datasets, SHOE can reliably identify experimentally verified interactions and predict plausible novel ones, yielding new biological insights into the gene regulatory mechanisms involved. SHOE comes with a manual describing how to run it on a local PC or via the Garuda platform ( www.garuda-alliance.org ), where it joins other popular gadgets such as the CellDesigner pathway analyzer and the Reactome database, as part of analysis workflows to meet the growing needs of molecular biologists and medical researchers. SHOE is available from the following URL: http://ec2-54-150-223-65.ap-northeast-1.compute.amazonaws.com. A video demonstration of SHOE can be found here: https://www.youtube.com/watch?v=qARinNb9NtE.
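As a rough illustration of the first evidence type listed in the abstract (binding preferences encoded as a PSSM), the sketch below slides a log-odds matrix along a promoter sequence and reports the best-scoring window. The matrix values, the helper names window_score and scan, and the example sequence are all invented for illustration; they are not taken from SHOE or from any of the named motif libraries.

```python
# Hypothetical 4-position log-odds PSSM (one dict per motif position).
# The numbers are made up; a real matrix would come from a motif library.
PSSM = [
    {"A": 1.2, "C": -0.8, "G": -0.5, "T": -1.0},
    {"A": -1.0, "C": 1.5, "G": -0.7, "T": -0.9},
    {"A": -0.6, "C": -0.9, "G": 1.4, "T": -1.1},
    {"A": -1.2, "C": -0.4, "G": -0.8, "T": 1.3},
]


def window_score(window, pssm):
    """Sum the per-position log-odds scores for one sequence window."""
    return sum(column[base] for column, base in zip(pssm, window))


def scan(sequence, pssm):
    """Return (start, score) of the highest-scoring window in the sequence."""
    width = len(pssm)
    hits = [
        (start, window_score(sequence[start:start + width], pssm))
        for start in range(len(sequence) - width + 1)
    ]
    return max(hits, key=lambda hit: hit[1])


# Invented promoter fragment; the motif ACGT sits at offset 2.
best_pos, best_score = scan("TTACGTAA", PSSM)
print(best_pos, round(best_score, 2))
```

In a pipeline of the kind described, a score like this would be only one of three signals; conservation across orthologous promoters and co-expression would be combined with it, and the abstract does not state how that combination is weighted.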


Subject(s)
Computational Biology/methods , DNA/metabolism , Promoter Regions, Genetic , Transcription Factors/metabolism , Animals , Binding Sites , DNA/chemistry , Evolution, Molecular , Gene Expression Regulation , Humans , Internet , Mice , Position-Specific Scoring Matrices , Rats , Sequence Homology, Nucleic Acid , Software
6.
Nucleic Acids Res ; 44(11): 5010-21, 2016 06 20.
Article in English | MEDLINE | ID: mdl-27131787

ABSTRACT

Predicting responsible transcription regulators on the basis of transcriptome data is one of the most promising computational approaches to understanding cellular processes and characteristics. Here, we present a novel method employing vast amounts of chromatin immunoprecipitation (ChIP) experimental data to address this issue. Global high-throughput ChIP data was collected to construct a comprehensive database, containing 8 578 738 binding interactions of 454 transcription regulators. To incorporate information about heterogeneous frequencies of transcription factor (TF)-binding events, we developed a flexible framework for gene set analysis employing the weighted t-test procedure, namely weighted parametric gene set analysis (wPGSA). Using transcriptome data as an input, wPGSA predicts the activities of transcription regulators responsible for observed gene expression. Validation of wPGSA with published transcriptome data, including that from over-expressed TFs, showed that the method can predict activities of various TFs, regardless of cell type and conditions, with results totally consistent with biological observations. We also applied wPGSA to other published transcriptome data and identified potential key regulators of cell reprogramming and influenza virus pathogenesis, generating compelling hypotheses regarding underlying regulatory mechanisms. This flexible framework will contribute to uncovering the dynamic and robust architectures of biological regulation, by incorporating high-throughput experimental data in the form of weights.
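The abstract names a weighted t-test but does not give its formula, so the following is only a sketch of one common form of a weighted one-sample t statistic, not the wPGSA statistic itself. It assumes each target gene of a regulator has a standardized expression change (values) and a weight reflecting how often binding was observed (weights); both inputs and the function name weighted_t are hypothetical.

```python
import math


def weighted_t(values, weights):
    """Weighted one-sample t statistic against a mean of zero.

    Assumption: uses the weighted mean, a weighted variance with a
    small-sample correction, and Kish's effective sample size.
    """
    w_sum = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / w_sum
    # Effective sample size: equals len(values) when all weights are equal.
    n_eff = w_sum ** 2 / sum(w * w for w in weights)
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, values)) / w_sum
    var *= n_eff / (n_eff - 1)  # unbiased-style correction
    return mean / math.sqrt(var / n_eff)


# With equal weights this reduces to the ordinary one-sample t statistic.
print(weighted_t([1.0, 2.0, 3.0], [1.0, 1.0, 1.0]))
# Genes with more binding evidence pull the estimate toward their values.
print(weighted_t([1.0, 2.0, 3.0], [1.0, 1.0, 4.0]))
```

The design point this illustrates is the one the abstract makes: heterogeneous binding frequencies enter the test as weights rather than as a hard yes-or-no target list.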


Subject(s)
Binding Sites , Chromatin Immunoprecipitation , Computational Biology/methods , High-Throughput Nucleotide Sequencing , Transcription Factors/metabolism , Transcriptome , Algorithms , Animals , Cluster Analysis , Databases, Genetic , Humans , Mice , Protein Binding , Reproducibility of Results
7.
Nucleic Acids Res ; 44(W1): W507-13, 2016 Jul 08.
Article in English | MEDLINE | ID: mdl-27131384

ABSTRACT

We present systemsDock, a web server for network pharmacology-based prediction and analysis, which combines docking simulation with molecular pathway maps for comprehensive characterization of ligand selectivity and interpretation of ligand action on a complex molecular network. It incorporates an elaborately designed scoring function for molecular docking to assess protein-ligand binding potential. For large-scale screening and ease of investigation, systemsDock has a user-friendly graphical interface for molecule preparation, parameter specification and result inspection. Ligand binding potentials against individual proteins can be directly displayed on an uploaded molecular interaction map, allowing users to systemically investigate network-dependent effects of a drug or drug candidate. A case study is given to demonstrate how systemsDock can be used to discover a test compound's multi-target activity. systemsDock is freely accessible at http://systemsdock.unit.oist.jp/.


Subject(s)
Internet , Pharmacology/methods , Software , Acids, Carbocyclic , Cyclopentanes/chemistry , Cyclopentanes/metabolism , Cyclopentanes/pharmacology , Guanidines/chemistry , Guanidines/metabolism , Guanidines/pharmacology , Humans , Influenza, Human/metabolism , Influenza, Human/virology , Ligands , Molecular Docking Simulation , Orthomyxoviridae/drug effects , Orthomyxoviridae/metabolism , Oseltamivir/chemistry , Oseltamivir/metabolism , Oseltamivir/pharmacology , User-Computer Interface
8.
PLoS Pathog ; 11(6): e1004856, 2015 Jun.
Article in English | MEDLINE | ID: mdl-26046528

ABSTRACT

Influenza viruses present major challenges to public health, as evidenced by the 2009 influenza pandemic. Highly pathogenic influenza virus infections generally coincide with early, high levels of inflammatory cytokines that some studies have suggested may be regulated in a strain-dependent manner. However, a comprehensive characterization of the complex dynamics of the inflammatory response induced by virulent influenza strains is lacking. Here, we applied gene co-expression and nonlinear regression analysis to time-course microarray data from influenza-infected mouse lung to create mathematical models of the host inflammatory response. We found that the dynamics of inflammation-associated gene expression are regulated by an ultrasensitive-like mechanism in which low levels of virus induce minimal gene expression but expression is strongly induced once a threshold virus titer is exceeded. Cytokine assays confirmed that the production of several key inflammatory cytokines, such as interleukin 6 and monocyte chemotactic protein 1, exhibits ultrasensitive behavior. A systematic exploration of the pathways regulating the inflammation-associated gene response suggests that the molecular origins of this ultrasensitive response mechanism lie within the branch of the Toll-like receptor pathway that regulates STAT1 phosphorylation. This study provides the first evidence of an ultrasensitive mechanism regulating influenza virus-induced inflammation in whole lungs and provides insight into how different virus strains can induce distinct temporal inflammation response profiles. The approach developed here should facilitate the construction of gene regulatory models of other infectious diseases.
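The threshold behavior described above is often pictured with a Hill function, which is a standard way to describe ultrasensitivity; the abstract does not say which regression model the authors fitted, so the function below is an assumption used only to show the shape. The threshold titer and Hill coefficients are invented values.

```python
def hill(titer, threshold, n):
    """Fraction of maximal expression at a given virus titer.

    threshold: titer giving half-maximal expression (hypothetical value).
    n: Hill coefficient; n > 1 gives a switch-like (ultrasensitive) curve.
    """
    return titer ** n / (threshold ** n + titer ** n)


THRESHOLD = 1e4  # invented threshold titer

for titer in (1e3, 1e4, 1e5):
    graded = hill(titer, THRESHOLD, 1)   # ordinary saturating response
    switch = hill(titer, THRESHOLD, 4)   # ultrasensitive response
    print(f"titer={titer:.0e}  n=1: {graded:.4f}  n=4: {switch:.4f}")
```

With n = 4, a tenfold change in titer on either side of the threshold moves the response from almost off to almost fully on, whereas with n = 1 the same change produces a much more gradual shift.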


Subject(s)
Influenza A Virus, H1N1 Subtype , Orthomyxoviridae Infections/immunology , Animals , Blotting, Western , Female , Flow Cytometry , Inflammation/genetics , Inflammation/immunology , Influenza A Virus, H1N1 Subtype/genetics , Influenza A Virus, H1N1 Subtype/immunology , Influenza A Virus, H1N1 Subtype/pathogenicity , Mice , Mice, Inbred C57BL , Oligonucleotide Array Sequence Analysis , Orthomyxoviridae Infections/genetics , Transcriptome , Virulence
9.
Nat Rev Genet ; 12(12): 821-32, 2011 Nov 03.
Article in English | MEDLINE | ID: mdl-22048662

ABSTRACT

Understanding complex biological systems requires extensive support from software tools. Such tools are needed at each step of a systems biology computational workflow, which typically consists of data handling, network inference, deep curation, dynamical simulation and model analysis. In addition, there are now efforts to develop integrated software platforms, so that tools that are used at different stages of the workflow and by different researchers can easily be used together. This Review describes the types of software tools that are required at different stages of systems biology research and the current options that are available for systems biology researchers. We also discuss the challenges and prospects for modelling the effects of genetic changes on physiology and the concept of an integrated platform.


Subject(s)
Software , Systems Biology , Computational Biology , Computer Simulation , Data Mining , Humans , Models, Biological , Systems Integration
10.
BMC Genomics ; 17(Suppl 13): 1025, 2016 12 22.
Article in English | MEDLINE | ID: mdl-28155657

ABSTRACT

BACKGROUND: The ability to sequence the transcriptomes of single cells using single-cell RNA-seq technologies presents a shift in the scientific paradigm: scientists are now able to concurrently investigate the complex biology of a heterogeneous population of cells, one at a time. However, to date, there has not been a suitable computational methodology for the analysis of such an intricate deluge of data, in particular techniques that aid the identification of the unique transcriptomic profile differences between the different cellular subtypes. In this paper, we describe a novel methodology for the analysis of single-cell RNA-seq data, obtained from neocortical cells and neural progenitor cells, using machine learning algorithms (Support Vector Machine (SVM) and Random Forest (RF)). RESULTS: Thirty-eight key transcripts were identified, using the SVM-based recursive feature elimination (SVM-RFE) method of feature selection, to best differentiate developing neocortical cells from neural progenitor cells in the SVM and RF classifiers built. Also, these genes possessed a higher discriminative power (enhanced prediction accuracy) as compared with commonly used statistical techniques or geneset-based approaches. Further downstream network reconstruction analysis was carried out to unravel hidden general regulatory networks where novel interactions could be further validated in wet-lab experimentation and be useful candidates to be targeted for the treatment of neuronal developmental diseases. CONCLUSION: The novel approach reported here is able to identify transcripts, with reported neuronal involvement, which optimally differentiate neocortical cells and neural progenitor cells. It is believed to be extensible and applicable to other single-cell RNA-seq expression profiles, such as those from the study of cancer progression and treatment within a highly heterogeneous tumour.


Subject(s)
Brain/metabolism , Gene Expression Profiling , Machine Learning , Organogenesis/genetics , Single-Cell Analysis , Transcriptome , Algorithms , Biomarkers , Brain/embryology , Brain/growth & development , Models, Statistical , Neurogenesis/genetics , Organ Specificity , Reproducibility of Results , Single-Cell Analysis/methods , Support Vector Machine
11.
Genome Res ; 23(2): 300-11, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23275495

ABSTRACT

Gene overexpression beyond a permissible limit causes defects in cellular functions. However, the permissible limits of most genes are unclear. Previously, we developed a genetic method designated genetic tug-of-war (gTOW) to measure the copy number limit of overexpression of a target gene. In the current study, we applied gTOW to the analysis of all protein-coding genes in the budding yeast Saccharomyces cerevisiae. We showed that the yeast cellular system was robust against an increase in the copy number by up to 100 copies in >80% of the genes. After frameshift and segmentation analyses, we isolated 115 dosage-sensitive genes (DSGs) with copy number limits of 10 or less. DSGs contained a significant number of genes involved in cytoskeletal organization and intracellular transport. DSGs tended to be highly expressed and to encode protein complex members. We demonstrated that the protein burden caused the dosage sensitivity of highly expressed genes using a gTOW experiment in which the open reading frame was replaced with GFP. Dosage sensitivities of some DSGs were rescued by the simultaneous increase in the copy numbers of partner genes, indicating that stoichiometric imbalances among complexes cause dosage sensitivity. The results obtained in this study will provide basic knowledge about the physiology of chromosomal abnormalities and the evolution of chromosomal composition.


Subject(s)
Gene Dosage , Genes, Fungal , Saccharomyces cerevisiae/genetics , Gene Expression , Gene Regulatory Networks , Genome, Fungal , Molecular Sequence Annotation , Open Reading Frames , Protein Interaction Maps , Saccharomyces cerevisiae/metabolism
12.
BMC Bioinformatics ; 16: 141, 2015 May 01.
Article in English | MEDLINE | ID: mdl-25929466

ABSTRACT

BACKGROUND: Existing de novo software platforms have largely overlooked a valuable resource, the expertise of the intended biologist users. Typical data representations such as long gene lists, or highly dense and overlapping transcription factor networks often hinder biologists from relating these results to their expertise. RESULTS: VISIONET, a streamlined visualisation tool built from experimental needs, enables biologists to transform large and dense overlapping transcription factor networks into sparse human-readable graphs via numerical filtering. The VISIONET interface allows users without a computing background to interactively explore and filter their data, and empowers them to apply their specialist knowledge on far more complex and substantial data sets than is currently possible. Applying VISIONET to the Tbx20-Gata4 transcription factor network led to the discovery and validation of Aldh1a2, an essential developmental gene associated with various important cardiac disorders, as a healthy adult cardiac fibroblast gene co-regulated by cardiogenic transcription factors Gata4 and Tbx20. CONCLUSIONS: We demonstrate with experimental validations the utility of VISIONET for expertise-driven gene discovery that opens new experimental directions that would not otherwise have been identified.


Subject(s)
Computer Graphics , Gene Regulatory Networks , Genetic Association Studies , Heart/physiology , Software , Transcription Factors/genetics , Adult , Cells, Cultured , Fibroblasts/cytology , Fibroblasts/metabolism , GATA4 Transcription Factor/genetics , Gene Expression Regulation, Developmental , Humans , T-Box Domain Proteins/genetics
13.
J Virol ; 88(16): 8981-97, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24899188

ABSTRACT

Occasional transmission of highly pathogenic avian H5N1 influenza viruses to humans causes severe pneumonia with high mortality. To better understand the mechanisms via which H5N1 viruses induce severe disease in humans, we infected cynomolgus macaques with six different H5N1 strains isolated from human patients and compared their pathogenicity and the global host responses to the virus infection. Although all H5N1 viruses replicated in the respiratory tract, there was substantial heterogeneity in their replicative ability and in the disease severity induced, which ranged from asymptomatic to fatal. A comparison of global gene expression between severe and mild disease cases indicated that interferon-induced upregulation of genes related to innate immunity, apoptosis, and antigen processing/presentation in the early phase of infection was limited in severe disease cases, although interferon expression was upregulated in both severe and mild cases. Furthermore, coexpression analysis of microarray data, which reveals the dynamics of host responses during the infection, demonstrated that the limited expression of these genes early in infection led to a failure to suppress virus replication and to the hyperinduction of genes related to immunity, inflammation, coagulation, and homeostasis in the late phase of infection, resulting in a more severe disease. Our data suggest that the attenuated interferon-induced activation of innate immunity, apoptosis, and antigen presentation in the early phase of H5N1 virus infection leads to subsequent severe disease outcome. IMPORTANCE: Highly pathogenic avian H5N1 influenza viruses sometimes transmit to humans and cause severe pneumonia with ca. 60% lethality. The continued circulation of these viruses poses a pandemic threat; however, their pathogenesis in mammals is not fully understood. We, therefore, investigated the pathogenicity of six H5N1 viruses and compared the host responses of cynomolgus macaques to the virus infection. We identified differences in the viral replicative ability of and in disease severity caused by these H5N1 viruses. A comparison of global host responses between severe and mild disease cases identified the limited upregulation of interferon-stimulated genes early in infection in severe cases. The dynamics of the host responses indicated that the limited response early in infection failed to suppress virus replication and led to hyperinduction of pathological condition-related genes late in infection. These findings provide insight into the pathogenesis of H5N1 viruses in mammals.


Subject(s)
Gene Expression Regulation, Viral/genetics , Gene Expression/genetics , Influenza A Virus, H5N1 Subtype/genetics , Orthomyxoviridae Infections/virology , Primates/virology , Animals , Antigen Presentation/immunology , Apoptosis/immunology , Cells, Cultured , Dogs , Gene Expression/immunology , Gene Expression Regulation, Viral/immunology , Humans , Immunity, Innate/immunology , Inflammation/immunology , Inflammation/virology , Influenza A Virus, H5N1 Subtype/immunology , Macaca/immunology , Macaca/virology , Macaca fascicularis/immunology , Macaca fascicularis/virology , Madin Darby Canine Kidney Cells , Orthomyxoviridae Infections/immunology , Primates/immunology , Respiratory System/immunology , Respiratory System/virology , Severity of Illness Index , Virus Replication/genetics , Virus Replication/immunology
14.
PLoS Comput Biol ; 9(11): e1003361, 2013.
Article in English | MEDLINE | ID: mdl-24278007

ABSTRACT

Elucidating gene regulatory networks (GRNs) from large-scale experimental data remains a central challenge in systems biology. Recently, numerous techniques, particularly consensus-driven approaches combining different algorithms, have become a potentially promising strategy to infer accurate GRNs. Here, we develop a novel consensus inference algorithm, TopkNet, that can integrate multiple algorithms to infer GRNs. Comprehensive performance benchmarking on a cloud computing framework demonstrated that (i) a simple strategy to combine many algorithms does not always lead to performance improvement compared to the cost of consensus and (ii) TopkNet integrating only high-performance algorithms provides significant performance improvement compared to the best individual algorithms and community prediction. These results suggest that a priori determination of high-performance algorithms is key to reconstructing an unknown regulatory network. Similarity among gene-expression datasets can be useful to determine potential optimal algorithms for reconstruction of unknown regulatory networks, i.e., if expression data associated with a known regulatory network are similar to those associated with an unknown regulatory network, optimal algorithms determined for the known regulatory network can be repurposed to infer the unknown regulatory network. Based on this observation, we developed a quantitative measure of similarity among gene-expression datasets and demonstrated that, if similarity between the two expression datasets is high, TopkNet integrating algorithms that are optimal for the known dataset performs well on the unknown dataset. The consensus framework, TopkNet, together with the similarity measure proposed in this study provides a powerful strategy towards harnessing the wisdom of the crowds in reconstruction of unknown regulatory networks.
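The abstract does not spell out how TopkNet combines its input algorithms, so the sketch below shows two generic consensus rules as an illustration only: averaging each edge's rank across algorithms (a simple community prediction), and taking the k-th best rank an edge receives from any algorithm. The function names, the three mock algorithms and their rankings are all hypothetical.

```python
def mean_rank(rankings):
    """Community-style consensus: average each edge's rank over algorithms."""
    edges = rankings[0].keys()
    return {e: sum(r[e] for r in rankings) / len(rankings) for e in edges}


def kth_best_rank(rankings, k):
    """Consensus by the k-th best (smallest) rank an edge receives.

    Assumption: this is one plausible reading of a 'top-k' rule, not the
    published TopkNet definition.
    """
    edges = rankings[0].keys()
    return {e: sorted(r[e] for r in rankings)[k - 1] for e in edges}


# Mock output of three inference algorithms: edge -> rank (1 = most confident).
rankings = [
    {"A->B": 1, "A->C": 2, "B->C": 3},
    {"A->B": 2, "A->C": 1, "B->C": 3},
    {"A->B": 3, "A->C": 2, "B->C": 1},
]

print(mean_rank(rankings))
print(kth_best_rank(rankings, 1))
print(kth_best_rank(rankings, 2))
```

With k = 1 a single enthusiastic algorithm is enough to promote an edge, while larger k requires agreement from more algorithms; this is one way to see why the choice of which algorithms to include matters as much as the rule itself.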


Subject(s)
Computational Biology/methods , Gene Expression/genetics , Gene Regulatory Networks/genetics , Algorithms , Databases, Genetic , Gene Expression Profiling
15.
PLoS Comput Biol ; 9(1): e1002860, 2013.
Article in English | MEDLINE | ID: mdl-23300433

ABSTRACT

Interactions of proteins regulate signaling, catalysis, gene expression and many other cellular functions. Therefore, characterizing the entire human interactome is a key effort in current proteomics research. This challenge is complicated by the dynamic nature of protein-protein interactions (PPIs), which are conditional on the cellular context: both interacting proteins must be expressed in the same cell and localized in the same organelle to meet. Additionally, interactions underlie a delicate control of signaling pathways, e.g. by post-translational modifications of the protein partners - hence, many diseases are caused by the perturbation of these mechanisms. Despite the high degree of cell-state specificity of PPIs, many interactions are measured under artificial conditions (e.g. yeast cells are transfected with human genes in yeast two-hybrid assays) or even if detected in a physiological context, this information is missing from the common PPI databases. To overcome these problems, we developed a method that assigns context information to PPIs inferred from various attributes of the interacting proteins: gene expression, functional and disease annotations, and inferred pathways. We demonstrate that context consistency correlates with the experimental reliability of PPIs, which allows us to generate high-confidence tissue- and function-specific subnetworks. We illustrate how these context-filtered networks are enriched in bona fide pathways and disease proteins to prove the ability of context-filters to highlight meaningful interactions with respect to various biological questions. We use this approach to study the lung-specific pathways used by the influenza virus, pointing to IRAK1, BHLHE40 and TOLLIP as potential regulators of influenza virus pathogenicity, and to study the signalling pathways that play a role in Alzheimer's disease, identifying a pathway involving the altered phosphorylation of the Tau protein. Finally, we provide the annotated human PPI network via a web frontend that allows the construction of context-specific networks in several ways.


Subject(s)
Proteins/metabolism , Alzheimer Disease/metabolism , Biocatalysis , Humans , Phosphorylation , Protein Binding , Proteome , Signal Transduction , Viral Proteins/metabolism
16.
J Toxicol Sci ; 49(3): 105-115, 2024.
Article in English | MEDLINE | ID: mdl-38432953

ABSTRACT

With the advancement of large-scale omics technologies, particularly transcriptomics data sets on drug and treatment response repositories available in public domain, toxicogenomics has emerged as a key field in safety pharmacology and chemical risk assessment. Traditional statistics-based bioinformatics analysis poses challenges in its application across multidimensional toxicogenomic data, including administration time, dosage, and gene expression levels. Motivated by the visual inspection workflow of field experts to augment their efficiency of screening significant genes to derive meaningful insights, together with the ability of deep neural architectures to learn the image signals, we developed DTox, a deep neural network-based in visio approach. Using the Percellome toxicogenomics database, instead of utilizing the numerical gene expression values of the transcripts (gene probes of the microarray) for dose-time combinations, DTox learned the image representation of 3D surface plots of distinct time and dosage data points to train the classifier on the experts' labels of gene probe significance. DTox outperformed statistical threshold-based bioinformatics and machine learning approaches based on numerical expression values. This result shows the ability of image-driven neural networks to overcome the limitations of classical numeric value-based approaches. Further, by augmenting the model with explainability modules, our study showed the potential to reveal the visual analysis process of human experts in toxicogenomics through the model weights. While the current work demonstrates the application of the DTox model in toxicogenomic studies, it can be further generalized as an in visio approach for multi-dimensional numeric data with applications in various fields in medical data sciences.


Subject(s)
Computational Biology , Toxicogenetics , Humans , Gene Expression Profiling , Machine Learning , Neural Networks, Computer
17.
Ann Rev Mar Sci ; 16: 443-466, 2024 Jan 17.
Article in English | MEDLINE | ID: mdl-37552896

ABSTRACT

The holobiont concept (i.e., multiple living beings in close symbiosis with one another and functioning as a unit) is revolutionizing our understanding of biology, especially in marine systems. The earliest marine holobiont was likely a syntrophic partnership of at least two prokaryotic members. Since then, symbiosis has enabled marine organisms to conquer all ocean habitats through the formation of holobionts with a wide spectrum of complexities. However, most scientific inquiries have focused on isolated organisms and their adaptations to specific environments. In this review, we attempt to illustrate why a holobiont perspective-specifically, the study of how numerous organisms form a discrete ecological unit through symbiosis-will be a more impactful strategy to advance our understanding of the ecology and evolution of marine life. We argue that this approach is instrumental in addressing the threats to marine biodiversity posed by the current global environmental crisis.


Subject(s)
Biodiversity , Symbiosis
18.
bioRxiv ; 2024 Feb 19.
Article in English | MEDLINE | ID: mdl-38464031

ABSTRACT

Viruses are an abundant and crucial component of the human microbiome, but accurately discovering them via metagenomics is still challenging. Currently, the available viral reference genomes poorly represent the diversity in microbiome samples, and expanding such a set of viral references is difficult. As a result, many viruses are still undetectable through metagenomics even when considering the power of de novo metagenomic assembly and binning, as viruses lack universal markers. Here, we describe a novel approach to catalog new viral members of the human gut microbiome and show how the resulting resource improves metagenomic analyses. We retrieved >3,000 viral-like particles (VLP) enriched metagenomic samples (viromes), evaluated the efficiency of the enrichment in each sample to leverage the viromes of highest purity, and applied multiple analysis steps involving assembly and comparison with hundreds of thousands of metagenome-assembled genomes to discover new viral genomes. We reported over 162,000 viral sequences passing quality control from thousands of gut metagenomes and viromes. The great majority of the retrieved viral sequences (~94.4%) were of unknown origin, most had a CRISPR spacer matching host bacteria, and four of them could be detected in >50% of a set of 18,756 gut metagenomes we surveyed. We included the obtained collection of sequences in a new MetaPhlAn 4.1 release, which can quantify reads within a metagenome matching the known and newly uncovered viral diversity. Additionally, we released the viral database for further virome and metagenomic studies of the human microbiome.

19.
Nat Methods ; 7(3 Suppl): S56-68, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20195258

ABSTRACT

High-throughput studies of biological systems are rapidly accumulating a wealth of 'omics'-scale data. Visualization is a key aspect of both the analysis and understanding of these data, and users now have many visualization methods and tools to choose from. The challenge is to create clear, meaningful and integrated visualizations that give biological insight, without being overwhelmed by the intrinsic complexity of the data. In this review, we discuss how visualization tools are being used to help interpret protein interaction, gene expression and metabolic profile data, and we highlight emerging new directions.


Subject(s)
Genomics , Image Processing, Computer-Assisted , Metabolomics , Proteomics , Systems Biology , Mass Spectrometry , Nuclear Magnetic Resonance, Biomolecular , Protein Binding
20.
Bioinformatics ; 28(15): 2016-21, 2012 Aug 01.
Article in English | MEDLINE | ID: mdl-22581176

ABSTRACT

MOTIVATION: LibSBGN is a software library for reading, writing and manipulating Systems Biology Graphical Notation (SBGN) maps stored using the recently developed SBGN-ML file format. The library (available in C++ and Java) makes it easy for developers to add SBGN support to their tools, whereas the file format facilitates the exchange of maps between compatible software applications. The library also supports validation of maps, which simplifies the task of ensuring compliance with the detailed SBGN specifications. With this effort we hope to increase the adoption of SBGN in bioinformatics tools, ultimately enabling more researchers to visualize biological knowledge in a precise and unambiguous manner. AVAILABILITY AND IMPLEMENTATION: Milestone 2 was released in December 2011. Source code, example files and binaries are freely available under the terms of either the LGPL v2.1+ or Apache v2.0 open source licenses from http://libsbgn.sourceforge.net. CONTACT: sbgn-libsbgn@lists.sourceforge.net.


Subject(s)
Computational Biology/methods , Software , Systems Biology , Programming Languages