ABSTRACT
Paneth cells (PCs), a specialized secretory cell type in the small intestine, are increasingly recognized as having an essential role in host responses to the microbiome and environmental stresses. Whether and how commensal and pathogenic microbes modify PC composition to modulate inflammation remains unclear. Using newly developed PC-reporter mice under conventional and gnotobiotic conditions, we determined PC transcriptomic heterogeneity in response to commensal and invasive microbes at the single-cell level. Infection expands the pool of CD74+ PCs, whose number correlates with the progression of auto or allogeneic inflammatory disease in mice. A similar correlation was found in human inflammatory disease tissues. Infection-stimulated cytokines increase production of reactive oxygen species (ROS) and expression of a PC-specific mucosal pentraxin (Mptx2) in activated PCs. PC-specific ablation of MyD88 reduced the CD74+ PC population, thus ameliorating pathogen-induced systemic disease. A similar phenotype was also observed in mice lacking Mptx2. Thus, infection stimulates expansion of a PC subset that influences disease progression.
Subject(s)
Microbiota, Paneth Cells, Humans, Animals, Mice, Paneth Cells/metabolism, Paneth Cells/pathology, Small Intestine, Inflammation/pathology, Cytokines/metabolism
ABSTRACT
Spatially resolved transcriptomics technologies enable the measurement of transcriptome information while retaining the spatial context at the regional, cellular or sub-cellular level. While previous computational methods have relied on gene expression information alone for clustering single-cell populations, more recent methods have begun to leverage spatial location and histology information to improve cell clustering and cell-type identification. In this study, using seven semi-synthetic datasets with real spatial locations, simulated gene expression and histology images as well as ground truth cell-type labels, we evaluate 15 clustering methods based on clustering accuracy, robustness to data variation and input parameters, computational efficiency, and software usability. Our analysis demonstrates that even though incorporating the additional spatial and histology information leads to increased accuracy in some datasets, it does not consistently improve clustering compared with using only gene expression data. Our results indicate that for the clustering of spatial transcriptomics data, there are still opportunities to enhance the overall accuracy and robustness by improving information extraction and feature selection from spatial and histology data.
Subject(s)
Benchmarking, Transcriptome, Gene Expression Profiling/methods, Software, Cluster Analysis
ABSTRACT
The advent of single-cell RNA sequencing (scRNA-seq) technologies has enabled gene expression profiling at the single-cell resolution, thereby enabling the quantification and comparison of transcriptional variability among individual cells. Although alterations in transcriptional variability have been observed in various biological states, statistical methods for quantifying and testing differential variability between groups of cells are still lacking. To identify the best practices in differential variability analysis of single-cell gene expression data, we propose and compare 12 statistical pipelines using different combinations of methods for normalization, feature selection, dimensionality reduction and variability calculation. Using high-quality synthetic scRNA-seq datasets, we benchmarked the proposed pipelines and found that the most powerful and accurate pipeline performs simple library size normalization, retains all genes in analysis and uses denSNE-based distances to cluster medoids as the variability measure. By applying this pipeline to scRNA-seq datasets of COVID-19 and autism patients, we have identified cellular variability changes between patients with different severity status or between patients and healthy controls.
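The variability measure named above (distance of each cell to its cluster medoid) can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes low-dimensional embedding coordinates have already been computed, and the function names `medoid` and `variability` and the toy coordinates are hypothetical.

```python
from math import dist
from statistics import mean


def medoid(points):
    """Return the point with the smallest total distance to all other points."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))


def variability(points):
    """Mean Euclidean distance from each cell to the medoid of its group."""
    center = medoid(points)
    return mean(dist(p, center) for p in points)


# Hypothetical 2-D embedding coordinates for two groups of cells.
tight_group = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
spread_group = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]

print(variability(tight_group) < variability(spread_group))
```

A group whose cells sit far from their medoid receives a larger score, so comparing scores between two groups gives a simple view of differential variability; any formal test of that difference is outside this sketch.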
Subject(s)
COVID-19, Humans, COVID-19/genetics, Gene Expression Profiling/methods, Gene Expression, RNA Sequence Analysis/methods, Cluster Analysis
ABSTRACT
The obligate intracellular bacterium Chlamydia has a unique developmental cycle that alternates between two contrasting cell types. With a hardy envelope and highly condensed genome, the small elementary body (EB) maintains limited metabolic activities yet survives in extracellular environments and is infectious. After entering host cells, EBs differentiate into larger and proliferating reticulate bodies (RBs). Progeny EBs are derived from RBs in late developmental stages and eventually exit host cells. How expression of the chlamydial genome consisting of nearly 1,000 genes governs the chlamydial developmental cycle is unclear. A previous microarray study identified only 29 Chlamydia trachomatis immediate early genes, defined as genes with increased expression during the first hour postinoculation in cultured cells. In this study, we performed more sensitive RNA sequencing (RNA-Seq) analysis for C. trachomatis cultures with high multiplicities of infection. Remarkably, we observed well over 700 C. trachomatis genes that underwent 2- to 900-fold activation within 1 hour postinoculation. Quantitative reverse transcription real-time PCR analysis was further used to validate the activated expression of a large subset of the genes identified by RNA-Seq. Importantly, our results demonstrate that the immediate early transcriptome is over 20 times more extensive than previously realized. Gene ontology analysis indicates that the activated expression spans all functional categories. We conclude that over 70% of C. trachomatis genes are activated in EBs almost immediately upon entry into host cells, thus implicating their importance in initiating rapid differentiation into RBs and establishing an intracellular niche conducive to chlamydial development and growth.
Subject(s)
Chlamydia Infections, Chlamydia trachomatis, Humans, Cultured Cells, Base Sequence, Transcriptome, Real-Time Polymerase Chain Reaction, Chlamydia Infections/genetics
ABSTRACT
BACKGROUND: RNA sequencing (RNA-Seq) offers profound insights into the complex transcriptomes of diverse biological systems. However, standard differential expression analysis pipelines based on DESeq2 and edgeR encounter challenges when applied to the immediate early transcriptomes of Chlamydia spp., obligate intracellular bacteria. These challenges arise from their reliance on assumptions that do not hold in scenarios characterized by extensive transcriptomic activation and limited repression. RESULTS: Standard analyses using unique chlamydial RNA-Seq reads alone identify nearly 300 upregulated and about 300 downregulated genes, significantly deviating from actual RNA-Seq read trends. By incorporating both chlamydial and host reads or adjusting for total sequencing depth, the revised normalization methods each detected over 700 upregulated genes and 30 or fewer downregulated genes, closely aligned with observed RNA-Seq data. Further validation through qRT-PCR analysis confirmed the effectiveness of these adjusted approaches in capturing the true extent of transcriptomic activation during the immediate early phase of chlamydial infection. CONCLUSIONS: This study highlights the limitations of standard RNA-Seq analysis tools in scenarios with extensive transcriptomic activation, such as in Chlamydia spp. during early infection. Our revised normalization methods, incorporating host reads or total sequencing depth, provide a more accurate representation of gene expression dynamics. These approaches may inform similar adjustments in other systems with unbalanced gene expression dynamics, enhancing the accuracy of transcriptomic analysis.
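The effect of the normalization denominator described above can be illustrated with a toy calculation. This is a hedged sketch, not the authors' code and not the DESeq2 or edgeR procedures: the gene names, counts, and helper functions are hypothetical, and simple counts-per-million scaling stands in for the normalization step. Two of three genes are activated tenfold and the third is unchanged.

```python
from math import log2


def counts_per_million(gene_counts, depth):
    """Scale raw counts by the chosen library depth."""
    return {gene: 1e6 * count / depth for gene, count in gene_counts.items()}


def log2_fold_changes(before, after, pseudocount=1.0):
    """Log2 ratio of normalized expression, with a pseudocount for stability."""
    return {
        gene: log2((after[gene] + pseudocount) / (before[gene] + pseudocount))
        for gene in before
    }


# Hypothetical counts: geneA and geneB are activated, geneC is unchanged.
chlamydial_t0 = {"geneA": 10, "geneB": 10, "geneC": 10}
chlamydial_t1 = {"geneA": 100, "geneB": 100, "geneC": 10}
host_reads_t0 = 1_000_000  # assumed stable host read count
host_reads_t1 = 1_000_000


def fold_changes(include_host_reads):
    """Compare time points using chlamydial-only or total sequencing depth."""
    depth_t0 = sum(chlamydial_t0.values()) + (host_reads_t0 if include_host_reads else 0)
    depth_t1 = sum(chlamydial_t1.values()) + (host_reads_t1 if include_host_reads else 0)
    return log2_fold_changes(
        counts_per_million(chlamydial_t0, depth_t0),
        counts_per_million(chlamydial_t1, depth_t1),
    )


chlamydial_only = fold_changes(include_host_reads=False)
total_depth = fold_changes(include_host_reads=True)
print(chlamydial_only["geneC"], total_depth["geneC"])
```

With chlamydial reads alone as the denominator, the unchanged geneC appears strongly downregulated because the activated genes inflate the library size. With total depth as the denominator, its fold change stays near zero and the activation of geneA and geneB is preserved.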
Subject(s)
Chlamydia, Transcriptome, Chlamydia/genetics, Humans, RNA-Seq/methods, Gene Expression Profiling/methods, RNA Sequence Analysis/methods, Chlamydia Infections/microbiology, Chlamydia Infections/genetics
ABSTRACT
Intestinal microbiota confers susceptibility to diet-induced obesity, yet many probiotic species that synthesize tryptophan (trp) actually attenuate this effect, although the underlying mechanisms are unclear. We monocolonized germ-free mice with a widely consumed probiotic Lacticaseibacillus rhamnosus GG (LGG) under trp-free or -sufficient dietary conditions. We obtained untargeted metabolomics from the mouse feces and serum using liquid chromatography-mass spectrometry and obtained intestinal transcriptomic profiles via bulk RNA sequencing. When comparing LGG-monocolonized mice with germ-free mice, we found a synergy between LGG and dietary trp in markedly promoting the transcriptome of fatty acid metabolism and β-oxidation. Upregulation was specific and was not observed in transcriptomes of trp-fed conventional mice and mice monocolonized with Ruminococcus gnavus. Metabolomics showed that fecal and serum metabolites were also modified by LGG-host-trp interaction. We developed an R-Script-based MEtabolome-TRanscriptome Correlation Analysis algorithm and uncovered LGG- and trp-dependent metabolites that were positively or negatively correlated with fatty acid metabolism and β-oxidation gene networks. This high-throughput metabolome-transcriptome correlation strategy can be used in similar investigations to reveal potential interactions between specific metabolites and functional or disease-related transcriptomic networks.
Subject(s)
Gastrointestinal Microbiome, Lacticaseibacillus rhamnosus, Mice, Animals, Intestines, Gastrointestinal Microbiome/genetics, Gene Expression Profiling, Fatty Acids
ABSTRACT
MOTIVATION: Since the development of single-cell RNA sequencing (scRNA-seq) technologies, clustering analysis of single-cell gene expression data has been an essential tool for distinguishing cell types and identifying novel cell types. Even though many methods are available for scRNA-seq clustering analysis, the majority of them are constrained by the requirement for predetermined cluster numbers or the dependence on a selected initial cluster assignment. RESULTS: In this article, we propose an adaptive embedding and clustering method named scAce, which constructs a variational autoencoder to simultaneously learn cell embeddings and cluster assignments. In the scAce method, we develop an adaptive cluster merging approach which achieves improved clustering results without the need to estimate the number of clusters in advance. In addition, scAce provides an option to perform clustering enhancement, which can update and enhance cluster assignments based on previous clustering results from other methods. Based on computational analysis of both simulated and real datasets, we demonstrate that scAce outperforms state-of-the-art clustering methods for scRNA-seq data, and achieves better clustering accuracy and robustness. AVAILABILITY AND IMPLEMENTATION: The scAce package is implemented in Python 3.8 and is freely available from https://github.com/sldyns/scAce.
Subject(s)
Cluster Analysis, Gene Expression, RNA Sequence Analysis
ABSTRACT
Single-cell RNA sequencing (scRNA-seq) technologies facilitate the characterization of transcriptomic landscapes in diverse species, tissues, and cell types with unprecedented molecular resolution. In order to evaluate various biological hypotheses using high-dimensional single-cell gene expression data, most computational and statistical methods depend on a gene feature selection step to identify genes with high biological variability and reduce computational complexity. Even though many gene selection methods have been developed for scRNA-seq analysis, a systematic comparison of the assumptions, statistical models, and selection criteria used by these methods is lacking. In this article, we summarize and discuss 17 computational methods for selecting gene features in unsupervised analysis of single-cell gene expression data, with unified notations and statistical frameworks. Our discussion provides a useful summary to help practitioners select appropriate methods based on their assumptions and applicability, and to assist method developers in designing new computational tools for unsupervised learning of scRNA-seq data.
Subject(s)
Computational Biology/methods, Gene Expression, RNA Sequence Analysis/methods, Single-Cell Analysis/methods, Humans
ABSTRACT
MOTIVATION: Single-cell RNA sequencing technologies facilitate the characterization of transcriptomic landscapes in diverse species, tissues and cell types with unprecedented molecular resolution. In order to better understand animal development, physiology, and pathology, unsupervised clustering analysis is often used to identify relevant cell populations. Although considerable progress has been made in terms of clustering algorithms in recent years, it remains challenging to evaluate the quality of the inferred single-cell clusters, which can greatly impact downstream analysis and interpretation. RESULTS: We propose a bioinformatics tool named Phitest to analyze the homogeneity of single-cell populations. Phitest is able to distinguish between homogeneous and heterogeneous cell populations, providing an objective and automatic method to optimize the performance of single-cell clustering analysis. AVAILABILITY AND IMPLEMENTATION: The PhitestR package is freely available on both GitHub (https://github.com/Vivianstats/PhitestR) and the Comprehensive R Archive Network (CRAN). There is no new genomic data associated with this article. Published data used in the analysis are described in detail in the Supplementary Data. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Single-Cell Analysis, Software, Animals, Cluster Analysis, Algorithms, Transcriptome
ABSTRACT
Chronic infection with hepatitis B virus (HBV) is the major cause of hepatocellular carcinoma (HCC). Notably, 90% of HBV-positive HCC cases exhibit detectable HBV integrations, hinting at the potential early involvement of these viral integrations in tumorigenesis and their subsequent oncogenic implications. Nevertheless, the precise chronology of integration events during HCC tumorigenesis, alongside their sequential structural patterns, has remained elusive thus far. In this study, we applied whole-genome sequencing to multiple biopsies extracted from six HBV-positive HCC cases. Through this approach, we identified point mutations and viral integrations, offering a blueprint for the intricate tumor phylogeny of these samples. The resulting phylogenies show diverse evolutionary trajectories among the analyzed tumors. We uncovered oncogenic integration events in some samples that appear to happen before and during the initiation stage of tumor development based on their locations in reconstituted trajectories. Furthermore, we conducted additional long-read sequencing of selected samples and unveiled integration-bridged chromosome rearrangements and tandem repeats of the HBV sequence within integrations. In summary, this study revealed premalignant oncogenic and sequential complex integrations and highlighted the contributions of HBV integrations to HCC development and genome instability.
Subject(s)
Hepatocellular Carcinoma, Hepatitis B, Liver Neoplasms, Humans, Hepatitis B virus/genetics, Carcinogenesis, Neoplastic Cell Transformation
ABSTRACT
Genome-wide accurate identification and quantification of full-length mRNA isoforms is crucial for investigating transcriptional and posttranscriptional regulatory mechanisms of biological phenomena. Despite continuing efforts in developing effective computational tools to identify or assemble full-length mRNA isoforms from second-generation RNA-seq data, it remains a challenge to accurately identify mRNA isoforms from short sequence reads owing to the substantial information loss in RNA-seq experiments. Here, we introduce a novel statistical method, annotation-assisted isoform discovery (AIDE), the first approach that directly controls false isoform discoveries by implementing the testing-based model selection principle. Solving the isoform discovery problem in a stepwise and conservative manner, AIDE prioritizes the annotated isoforms and precisely identifies novel isoforms whose addition significantly improves the explanation of observed RNA-seq reads. We evaluate the performance of AIDE based on multiple simulated and real RNA-seq data sets followed by PCR-Sanger sequencing validation. Our results show that AIDE effectively leverages the annotation information to compensate for the information loss owing to short read lengths. AIDE achieves the highest precision in isoform discovery and the lowest error rates in isoform abundance estimation, compared with three state-of-the-art methods: Cufflinks, SLIDE, and StringTie. As a robust bioinformatics tool for transcriptome analysis, AIDE enables researchers to discover novel transcripts with high confidence.
Subject(s)
Gene Expression Profiling, Gene Expression Regulation, High-Throughput Nucleotide Sequencing, Molecular Sequence Annotation, RNA Isoforms, Messenger RNA, RNA Sequence Analysis, Humans, RNA Isoforms/biosynthesis, RNA Isoforms/genetics, Messenger RNA/biosynthesis, Messenger RNA/genetics
ABSTRACT
The availability of genome-wide epigenomic datasets enables in-depth studies of epigenetic modifications and their relationships with chromatin structures and gene expression. Various alignment tools have been developed to align nucleotide or protein sequences in order to identify structurally similar regions. However, there are currently no alignment methods specifically designed for comparing multi-track epigenomic signals and detecting common patterns that may explain functional or evolutionary similarities. We propose a new local alignment algorithm, EpiAlign, designed to compare chromatin state sequences learned from multi-track epigenomic signals and to identify locally aligned chromatin regions. EpiAlign is a dynamic programming algorithm that, as a novel feature, incorporates varying lengths and frequencies of chromatin states. We demonstrate the efficacy of EpiAlign through extensive simulations and studies on real data from the NIH Roadmap Epigenomics project. EpiAlign is able to extract recurrent chromatin state patterns along a single epigenome, and many of these patterns carry cell-type-specific characteristics. EpiAlign can also detect common chromatin state patterns across multiple epigenomes, and it will serve as a useful tool to group and distinguish epigenomic samples based on genome-wide or local chromatin state patterns.
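The dynamic programming idea behind local alignment of chromatin state sequences can be sketched with the classic Smith-Waterman recurrence. This is a generic illustration under stated assumptions, not EpiAlign itself: it uses fixed match, mismatch, and gap scores (hypothetical values) and omits the state length and frequency weighting that the abstract describes. The state labels in the example are also made up.

```python
def local_alignment_score(seq_a, seq_b, match=2, mismatch=-1, gap=-1):
    """Best Smith-Waterman local alignment score between two state sequences."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    # score[i][j] holds the best score of a local alignment ending at
    # seq_a[i-1] and seq_b[j-1]; zero means "start a new alignment here".
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + substitution,  # align the two states
                score[i - 1][j] + gap,               # gap in seq_b
                score[i][j - 1] + gap,               # gap in seq_a
            )
            best = max(best, score[i][j])
    return best


# Hypothetical chromatin state labels along two regions.
region_1 = ["Enh", "Prom", "Tx"]
region_2 = ["Quies", "Enh", "Prom", "Tx", "Quies"]
print(local_alignment_score(region_1, region_2))
```

Resetting negative running scores to zero is what makes the alignment local: the best-scoring shared stretch is found regardless of what flanks it in either sequence.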
Subject(s)
Chromatin/ultrastructure, Computational Biology/methods, Epigenomics/methods, Sequence Alignment, Algorithms, Base Sequence, Brain Chemistry, Chromatin/genetics, DNA Methylation, Genetic Databases, Datasets as Topic, Gene Ontology, Humans, Nerve Tissue Proteins/biosynthesis, Nerve Tissue Proteins/chemistry, Nerve Tissue Proteins/genetics, Software
ABSTRACT
MOTIVATION: Single-cell RNA sequencing (scRNA-seq) has revolutionized biological sciences by revealing genome-wide gene expression levels within individual cells. However, a critical challenge faced by researchers is how to optimize the choices of sequencing platforms, sequencing depths and cell numbers in designing scRNA-seq experiments, so as to balance the exploration of the depth and breadth of transcriptome information. RESULTS: Here we present a flexible and robust simulator, scDesign, the first statistical framework for researchers to quantitatively assess practical scRNA-seq experimental design in the context of differential gene expression analysis. In addition to experimental design, scDesign also assists computational method development by generating high-quality synthetic scRNA-seq datasets under customized experimental settings. In an evaluation based on 17 cell types and 6 different protocols, scDesign outperformed four state-of-the-art scRNA-seq simulation methods and led to rational experimental design. In addition, scDesign demonstrates reproducibility across biological replicates and independent studies. We also discuss the performance of multiple differential expression and dimension reduction methods based on the protocol-dependent scRNA-seq data generated by scDesign. scDesign is expected to be an effective bioinformatic tool that assists rational scRNA-seq experimental design and comparison of scRNA-seq computational methods based on specific research goals. AVAILABILITY AND IMPLEMENTATION: We have implemented our method in the R package scDesign, which is freely available at https://github.com/Vivianstats/scDesign. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Gene Expression Profiling, Small Cytoplasmic RNA, Single-Cell Analysis, Reproducibility of Results, Research Design, RNA Sequence Analysis, Software
ABSTRACT
BACKGROUND: The dynamics of epigenomic marks in their relevant chromatin states regulate distinct gene expression patterns, biological functions and phenotypic variations in biological processes. The availability of high-throughput epigenomic data generated by next-generation sequencing technologies allows a data-driven approach to evaluate the similarities and differences of diverse tissue and cell types in terms of epigenomic features. While ChromImpute has allowed for the imputation of large-scale epigenomic information to yield more robust data to capture meaningful relationships between biological samples, widely used methods such as hierarchical clustering and correlation analysis cannot adequately utilize epigenomic data to accurately reveal the distinction and grouping of different tissue and cell types. METHODS: We utilize a three-step testing procedure (ANOVA, t test and overlap test) to identify tissue/cell-type-associated enhancers and promoters and to calculate a newly defined Epigenomic Overlap Measure (EPOM). EPOM results in a clear correspondence map of biological samples from different tissue and cell types through comparison of epigenomic marks evaluated in their relevant chromatin states. RESULTS: Correspondence maps by EPOM show strong capability in distinguishing and grouping different tissue and cell types and reveal biologically meaningful similarities between Heart and Muscle, Blood & T-cell and HSC & B-cell, Brain and Neurosphere, etc. The gene ontology enrichment analysis both supports and explains the discoveries made by EPOM and suggests that the associated enhancers and promoters demonstrate distinguishable functions across tissue and cell types. Moreover, the tissue/cell-type-associated enhancers and promoters show enrichment in the disease-related SNPs that are also associated with the corresponding tissue or cell types. This agreement suggests the potential of identifying causal genetic variants relevant to cell-type-specific diseases from our identified associated enhancers and promoters. CONCLUSIONS: The proposed EPOM measure demonstrates superior capability in grouping and finding a clear correspondence map of biological samples from different tissue and cell types. The identified associated enhancers and promoters provide a comprehensive catalog to study distinct biological processes and disease variants in different tissue and cell types. Our results also find that the associated promoters exhibit more cell-type-specific functions than the associated enhancers do, suggesting that the non-associated promoters have more housekeeping functions than the non-associated enhancers.
Subject(s)
Chromatin/genetics, Epigenomics, Chromatin/pathology, Human Chromosomes, Cluster Analysis, Genetic Enhancer Elements, Genome-Wide Association Study, Histones/genetics, Histones/metabolism, Humans, Single Nucleotide Polymorphism, Genetic Promoter Regions
ABSTRACT
Spatially resolved transcriptomics technologies have opened new avenues for understanding gene expression heterogeneity in spatial contexts. However, existing methods for identifying spatially variable genes often focus solely on statistical significance, limiting their ability to capture continuous expression patterns and integrate spot-level covariates. To address these challenges, we introduce spVC, a statistical method based on a generalized Poisson model. spVC seamlessly integrates constant and spatially varying effects of covariates, facilitating comprehensive exploration of gene expression variability and enhancing interpretability. Simulation and real data applications confirm spVC's accuracy in these tasks, highlighting its versatility in spatial transcriptomics analysis.
Subject(s)
Gene Expression Profiling, Transcriptome, Computer Simulation, Spatial Analysis, Gene Expression
ABSTRACT
Analyzing single-cell RNA sequencing (scRNA-seq) data remains a challenge due to its high dimensionality, sparsity and technical noise. Recognizing the benefits of dimensionality reduction in simplifying complexity and enhancing the signal-to-noise ratio, we introduce scBiG, a novel graph node embedding method designed for representation learning in scRNA-seq data. scBiG establishes a bipartite graph connecting cells and expressed genes, and then constructs a multilayer graph convolutional network to learn cell and gene embeddings. Through a series of extensive experiments, we demonstrate that scBiG surpasses commonly used dimensionality reduction techniques in various analytical tasks. Downstream tasks encompass unsupervised cell clustering, cell trajectory inference, gene expression reconstruction and gene co-expression analysis. Additionally, scBiG exhibits notable computational efficiency and scalability. In summary, scBiG offers a useful graph neural network framework for representation learning in scRNA-seq data, empowering a diverse array of downstream analyses.
ABSTRACT
Motivation: RNA sequencing (RNA-Seq) offers profound insights into the complex transcriptomes of diverse biological systems. However, standard differential expression analysis pipelines based on DESeq2 and edgeR encounter challenges when applied to the immediate early transcriptomes of Chlamydia spp., obligate intracellular bacteria. These challenges arise from their reliance on assumptions that do not hold in scenarios characterized by extensive transcriptomic activation and limited repression. Standard analyses using unique chlamydial RNA-Seq reads alone identify nearly 300 upregulated and about 300 downregulated genes, significantly deviating from actual RNA-Seq read trends. Results: By incorporating both chlamydial and host reads or adjusting for total sequencing depth, the revised normalization methods each detected over 700 upregulated genes and 30 or fewer downregulated genes, closely aligned with observed RNA-Seq data. Further validation through qRT-PCR analysis confirmed the effectiveness of these adjusted approaches in capturing the true extent of transcriptomic activation during the immediate early phase of chlamydial infection. While the strategies employed are developed in the context of Chlamydia, the principles of flexible and context-aware normalization may inform adjustments in other systems with unbalanced gene expression dynamics, such as bacterial spore germination. Availability and implementation: The code for reproducing the presented bioinformatic analysis is available at https://zenodo.org/records/11201379.
ABSTRACT
This Voices piece will highlight the impact of artificial intelligence on algorithm development among computational biologists. How has worldwide focus on AI changed the path of research in computational biology? What is the impact on the algorithmic biology research community?
Subject(s)
Algorithms, Artificial Intelligence, Computational Biology, Artificial Intelligence/trends, Computational Biology/methods, Humans
ABSTRACT
IMPORTANCE: Hallmarks of the developmental cycle of the obligate intracellular pathogenic bacterium Chlamydia are the primary differentiation of the infectious elementary body (EB) into the proliferative reticulate body (RB) and the secondary differentiation of RBs back into EBs. The mechanisms regulating these transitions remain unclear. In this report, we developed an effective novel strategy termed dependence on plasmid-mediated expression (DOPE) that allows for the knockdown of essential genes in Chlamydia. We demonstrate that GrgA, a Chlamydia-specific transcription factor, is essential for the secondary differentiation and optimal growth of RBs. We also show that GrgA, a chromosome-encoded regulatory protein, controls the maintenance of the chlamydial virulence plasmid. Transcriptomic analysis further indicates that GrgA functions as a critical regulator of all three sigma factors that recognize different promoter sets at developmental stages. The DOPE strategy outlined here should provide a valuable tool for future studies examining chlamydial growth, development, and pathogenicity.
Subject(s)
Chlamydia Infections, Chlamydia trachomatis, Humans, Chlamydia trachomatis/metabolism, Bacterial Gene Expression Regulation, Transcription Factors/metabolism, Sigma Factor/genetics, Bacterial Proteins/genetics, Bacterial Proteins/metabolism
ABSTRACT
BACKGROUND & AIMS: Lacticaseibacillus rhamnosus GG (LGG) is the world's most consumed probiotic but its mechanism of action on intestinal permeability and differentiation along with its interactions with an essential source of signaling metabolites, dietary tryptophan (trp), are unclear. METHODS: Untargeted metabolomic and transcriptomic analyses were performed in LGG-monocolonized germ-free mice fed trp-free or -sufficient diets. LGG-derived metabolites were profiled in vitro under anaerobic and aerobic conditions. Multiomic correlations using a newly developed algorithm discovered novel metabolites tightly linked to tight junction and cell differentiation genes whose abundances were regulated by LGG and dietary trp. Barrier modulation by these metabolites was functionally tested in Caco2 cells, mouse enteroids, and dextran sulfate sodium experimental colitis. The contribution of these metabolites to barrier protection was delineated at specific tight junction proteins and enterocyte-promoting factors with gain and loss of function approaches. RESULTS: LGG, strictly with dietary trp, promotes the enterocyte program and expression of tight junction genes, particularly Ocln. Functional evaluations of fecal and serum metabolites synergistically stimulated by LGG and trp revealed a novel vitamin B3 metabolism pathway, with methylnicotinamide (MNA) unexpectedly being the most robust barrier-protective metabolite in vitro and in vivo. Reduced serum MNA is significantly associated with increased disease activity in patients with inflammatory bowel disease. Exogenous MNA enhances the gut barrier in homeostasis and robustly promotes colonic healing in dextran sulfate sodium colitis. MNA is sufficient to promote intestinal epithelial Ocln and RNF43, a master inhibitor of Wnt. Blocking trp or vitamin B3 absorption abolishes barrier recovery in vivo. CONCLUSIONS: Our study uncovers a novel LGG-regulated dietary trp-dependent production of MNA that protects the gut barrier against colitis.