ABSTRACT
BACKGROUND & AIMS: Endoscopic assessment of ulcerative colitis (UC) typically reports only the maximum severity observed. Computer vision methods may better quantify mucosal injury detail, which varies among patients. METHODS: Endoscopic video from the UNIFI clinical trial (A Study to Evaluate the Safety and Efficacy of Ustekinumab Induction and Maintenance Therapy in Participants With Moderately to Severely Active Ulcerative Colitis) comparing ustekinumab and placebo for UC was processed in a computer vision analysis that spatially mapped the Mayo Endoscopic Score (MES) to generate the Cumulative Disease Score (CDS). CDS was compared with the MES for differentiating ustekinumab vs placebo treatment response and agreement with symptomatic remission at week 44. Statistical power, effect size, and estimated sample sizes for detecting endoscopic differences between treatments were calculated using both CDS and MES measures. Endoscopic video from a separate phase 2 clinical trial replication cohort was analyzed to validate CDS performance. RESULTS: Among 748 induction and 348 maintenance patients, CDS was lower in ustekinumab vs placebo users at week 8 (141.9 vs 184.3; P < .0001) and week 44 (78.2 vs 151.5; P < .0001). CDS was correlated with the MES (P < .0001) and all clinical components of the partial Mayo score (P < .0001). Stratification by pretreatment CDS revealed ustekinumab was more effective than placebo (P < .0001) with increasing effect in severe vs mild disease (-85.0 vs -55.4; P < .0001). Compared with the MES, CDS was more sensitive to change, requiring 50% fewer participants to demonstrate endoscopic differences between ustekinumab and placebo (Hedges' g = 0.743 vs 0.460). CDS performance in the JAK-UC replication cohort was similar to that in UNIFI.
CONCLUSIONS: As an automated and quantitative measure of global endoscopic disease severity, the CDS offers artificial intelligence enhancement of traditional MES capability to better evaluate UC in clinical trials and potentially practice.
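The effect sizes in this abstract are reported as Hedges' g. As a minimal sketch of how that statistic is conventionally computed (the function name and any input values are illustrative assumptions, not trial data or the authors' code), the standardized mean difference is scaled by a small-sample correction factor:

```python
import math

def hedges_g(group_a, group_b):
    """Hedges' g: pooled-SD standardized mean difference with small-sample correction.

    Illustrative helper only; not taken from the study's analysis code.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # Unbiased sample variances of each group
    var_a = sum((x - mean_a) ** 2 for x in group_a) / (n_a - 1)
    var_b = sum((x - mean_b) ** 2 for x in group_b) / (n_b - 1)
    # Pooled standard deviation across both groups
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    cohens_d = (mean_a - mean_b) / pooled_sd
    # Common approximation to the small-sample bias correction factor
    correction = 1.0 - 3.0 / (4.0 * (n_a + n_b) - 9.0)
    return cohens_d * correction
```

A larger absolute g means the two arms are separated by more pooled standard deviations, which is why a more sensitive measure can detect a treatment difference with fewer participants.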
Subject(s)
Colitis, Ulcerative , Humans , Artificial Intelligence , Colitis, Ulcerative/diagnosis , Colitis, Ulcerative/drug therapy , Colonoscopy/methods , Computers , Remission Induction , Severity of Illness Index , Ustekinumab/adverse effects
ABSTRACT
OBJECTIVE: IBD treatment targets are evolving toward deeper levels of remission. Molecular measures of disease may augment current endpoints and offer the potential for less invasive assessments. DESIGN: Transcriptome analyses of 712 endoscopically defined inflamed (Inf) and 1778 non-inflamed (Non-Inf) intestinal biopsies (n=498 Crohn's disease, n=421 UC and 243 controls) in the Mount Sinai Crohn's and Colitis Registry were used to identify genes differentially expressed between Inf and Non-Inf biopsies and to generate a molecular inflammation score (bMIS) via gene set variation analysis. A circulating MIS (cirMIS) score, reflecting intestinal molecular inflammation, was generated using blood transcriptome data. bMIS and cirMIS were validated as indicators of intestinal inflammation in four independent IBD cohorts. RESULTS: bMIS and cirMIS were strongly associated with clinical, endoscopic and histological disease activity indices. Patients with the same histologic score of inflammation had variable bMIS scores, indicating that bMIS describes a deeper range of inflammation. In available clinical trial data sets, both scores were responsive to IBD treatment. Despite similar baseline endoscopic and histologic activity, UC patients with lower baseline bMIS levels were more likely to respond to treatment than those with higher levels. Finally, among patients with UC in endoscopic and histologic remission, those with lower bMIS levels were less likely to have a disease flare over time. CONCLUSION: Transcriptionally based scores provide an alternative objective and deeper quantification of intestinal inflammation, which could augment current clinical assessments used for disease monitoring and have potential for predicting therapeutic response and identifying patients at higher risk of disease flares.
Subject(s)
Colitis, Ulcerative , Crohn Disease , Humans , Colitis, Ulcerative/pathology , Inflammation/genetics , Inflammation/pathology , Crohn Disease/pathology , Biopsy , Biomarkers , Intestinal Mucosa/pathology
ABSTRACT
Epidemiological studies have long recognized risky behaviors as potentially modifiable factors for the onset and flares of inflammatory bowel disease (IBD); yet, the underlying mechanisms are largely unknown. Recently, the genetic susceptibilities to cigarette smoking, alcohol and cannabis use [i.e. substance use (SU)] have been characterized by well-powered genome-wide association studies (GWASs). We aimed to assess the impact of genetic determinants of SU on IBD risk. Using the Mount Sinai Crohn's and Colitis Registry (MSCCR) cohort of 1058 IBD cases and 188 healthy controls, we computed polygenic risk scores (PRSs) for SU and correlated them with the observed IBD diagnoses, while adjusting for genetic ancestry, PRS for IBD and SU behavior at enrollment. The results were validated in a pediatric cohort with no SU exposure. The PRSs for alcohol consumption (DrnkWk), smoking cessation and age of smoking initiation were associated with IBD risk in MSCCR even after adjustment for PRSIBD and actual smoking status. A decrease of one interquartile range in PRSDrnkWk was significantly associated with higher IBD risk (i.e. an inverse association; odds ratio = 1.65, 95% confidence interval: 1.32, 2.06). The association was replicated in a pediatric Crohn's disease cohort. Colocalization analysis identified a locus on chromosome 16 with polymorphisms in IL27, SULT1A2 and SH2B1, which reached genome-wide statistical significance in GWAS (P < 7.7e-9) for both alcohol consumption and IBD risk. This study demonstrated that the genetic predisposition to SU was associated with IBD risk, independent of PRSIBD and in the absence of SU behaviors. Our study may help further stratify individuals at risk of IBD.
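For readers unfamiliar with the quantities in this abstract, the following is a minimal sketch of the standard additive polygenic risk score and of how a logistic-regression coefficient maps to an odds ratio with a Wald 95% confidence interval. The SNP identifiers, function names and inputs are hypothetical; the abstract does not describe the authors' actual PRS software or model settings.

```python
import math

def polygenic_risk_score(effect_sizes, dosages):
    """Additive PRS: sum over SNPs of (GWAS effect size x allele dosage).

    effect_sizes: dict mapping SNP id -> per-allele effect (e.g. log odds)
    dosages: dict mapping SNP id -> effect-allele count for one person (0 to 2)
    SNPs missing from either input are ignored in this sketch.
    """
    shared_snps = effect_sizes.keys() & dosages.keys()
    return sum(effect_sizes[snp] * dosages[snp] for snp in shared_snps)

def odds_ratio_with_ci(beta, standard_error, z=1.96):
    """Convert a log-odds coefficient into (odds ratio, lower, upper) Wald bounds."""
    return (
        math.exp(beta),
        math.exp(beta - z * standard_error),
        math.exp(beta + z * standard_error),
    )

# Hypothetical usage with made-up SNP ids and effect sizes
example_score = polygenic_risk_score(
    {"rs_example_1": 0.5, "rs_example_2": -0.2},
    {"rs_example_1": 2, "rs_example_2": 1},
)
```

In practice the score is usually standardized across the cohort before it enters the regression, so that the odds ratio refers to a fixed change in the score, such as the interquartile range used here.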
Subject(s)
Alcohol Drinking/adverse effects , Biomarkers/metabolism , Genetic Predisposition to Disease , Genome-Wide Association Study , Inflammatory Bowel Diseases/diagnosis , Polymorphism, Single Nucleotide , Adolescent , Case-Control Studies , Child , Child, Preschool , Cohort Studies , Female , Humans , Infant , Inflammatory Bowel Diseases/etiology , Inflammatory Bowel Diseases/metabolism , Male , Risk Factors
ABSTRACT
BACKGROUND & AIMS: Polygenic and environmental factors are underlying causes of inflammatory bowel disease (IBD). We hypothesized that integration of the genetic loci controlling a metabolite's abundance, with known IBD genetic susceptibility loci, may help resolve metabolic drivers of IBD. METHODS: We measured the levels of 1300 metabolites in the serum of 484 patients with ulcerative colitis (UC) and 464 patients with Crohn's disease (CD) and 365 controls. Differential metabolite abundance was determined for disease status, subtype, clinical and endoscopic disease activity, as well as IBD phenotype including disease behavior, location, and extent. To inform on the genetic basis underlying metabolic diversity, we integrated metabolite and genomic data. Genetic colocalization and Mendelian randomization analyses were performed using known IBD risk loci to explore whether any metabolite was causally associated with IBD. RESULTS: We found 173 genetically controlled metabolites (metabolite quantitative trait loci, 9 novel) within 63 non-overlapping loci (7 novel). Furthermore, several metabolites significantly associated with IBD disease status and activity as defined using clinical and endoscopic indexes. This constitutes a resource for biomarker discovery and IBD biology insights. Using this resource, we show that a novel metabolite quantitative trait locus for serum butyrate levels containing ACADS was not supported as causal for IBD; replicate the association of serum omega-6 containing lipids with the fatty acid desaturase 1/2 locus and identify these metabolites as causal for CD through Mendelian randomization; and validate a novel association of serum plasmalogen and TMEM229B, which was predicted as causal for CD. CONCLUSIONS: An exploratory analysis combining genetics and unbiased serum metabolome surveys can reveal novel biomarkers of disease activity and potential mediators of pathology in IBD.
Subject(s)
Acyl-CoA Dehydrogenase/genetics , Colitis, Ulcerative/genetics , Colitis, Ulcerative/metabolism , Crohn Disease/genetics , Crohn Disease/metabolism , Adolescent , Adult , Aged , Aged, 80 and over , Biomarkers/blood , Butyrates/blood , Case-Control Studies , Child , Child, Preschool , Colitis, Ulcerative/blood , Colitis, Ulcerative/drug therapy , Crohn Disease/blood , Crohn Disease/drug therapy , Cross-Sectional Studies , Feces/chemistry , Female , Genome-Wide Association Study , Genotype , HEK293 Cells , Humans , Male , Mendelian Randomization Analysis , Metabolome , Middle Aged , Plasmalogens/blood , Plasmalogens/genetics , Quantitative Trait Loci , Severity of Illness Index , Young Adult
ABSTRACT
BACKGROUND AND AIMS: Disease extent varies in ulcerative colitis (UC) from proctitis to left-sided colitis to pancolitis and is a major prognostic factor. When the extent of UC is limited, there is often a sharp demarcation between macroscopically involved and uninvolved areas, but what defines this demarcation or subsequent extension is unknown. We characterized the demarcation site molecularly and determined genes associated with subsequent disease extension. METHODS: We performed RNA sequence analysis of biopsy specimens from UC patients with endoscopically and histologically confirmed limited disease, of which a subset later extended. Biopsy specimens were obtained from the endoscopically inflamed upper (proximal) limit of disease, immediately adjacent to the uninvolved colon, as well as at more proximal, endoscopically uninflamed colonic segments. RESULTS: Differentially expressed genes were identified in the endoscopically inflamed biopsy specimens taken at each patient's most proximal diseased site relative to healthy controls. Expression of these genes in the more proximal biopsy specimens transitioned back to control levels abruptly or gradually, the latter pattern supporting the concept that disease exists beyond the endoscopic disease demarcation site. The gradually transitioning genes were associated with inflammation, angiogenesis, glucuronidation, and homeodomain pathways. A subset of these genes in inflamed biopsy specimens was found to predict disease extension better than clinical features and was responsive to biologic therapies. Network analysis revealed critical roles for interferon signaling in UC inflammation, and poly(ADP-ribose) polymerase 14 (PARP14) was a predicted key driver gene of extension. Higher PARP14 protein levels were found in inflamed biopsy specimens of patients with limited UC that subsequently extended.
CONCLUSION: Molecular predictors of disease extension reveal novel strategies for disease prognostication and potential therapeutic targeting.
Subject(s)
Colitis, Ulcerative/genetics , Colon/metabolism , Gene Expression Profiling , Poly(ADP-ribose) Polymerases/genetics , Sequence Analysis, RNA , Transcriptome , Bayes Theorem , Biopsy , Case-Control Studies , Colitis, Ulcerative/metabolism , Colitis, Ulcerative/pathology , Colon/pathology , Cross-Sectional Studies , Gene Expression Regulation , Gene Regulatory Networks , Humans , Patient Acuity , Poly(ADP-ribose) Polymerases/metabolism , Predictive Value of Tests , Signal Transduction
ABSTRACT
BACKGROUND AND AIMS: The presence of gastrointestinal symptoms and high levels of viral RNA in the stool suggest active severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) replication within enterocytes. METHODS: Here, in multiple, large cohorts of patients with inflammatory bowel disease (IBD), we have studied the intersections between Coronavirus Disease 2019 (COVID-19), intestinal inflammation, and IBD treatment. RESULTS: A striking expression of ACE2 on the small bowel enterocyte brush border supports intestinal infectivity by SARS-CoV-2. Commonly used IBD medications, both biologic and nonbiologic, do not significantly impact ACE2 and TMPRSS2 receptor expression in the uninflamed intestines. In addition, we have defined molecular responses to COVID-19 infection that are also enriched in IBD, pointing to shared molecular networks between COVID-19 and IBD. CONCLUSIONS: These data generate a novel appreciation of the confluence of COVID-19- and IBD-associated inflammation and provide mechanistic insights supporting further investigation of specific IBD drugs in the treatment of COVID-19. Preprint doi: https://doi.org/10.1101/2020.05.21.109124.
Subject(s)
Angiotensin-Converting Enzyme 2/metabolism , COVID-19/enzymology , Inflammatory Bowel Diseases/enzymology , Intestinal Mucosa/enzymology , SARS-CoV-2/pathogenicity , Serine Endopeptidases/metabolism , Angiotensin-Converting Enzyme 2/genetics , Animals , Anti-Inflammatory Agents/therapeutic use , Antiviral Agents/therapeutic use , COVID-19/genetics , COVID-19/virology , Case-Control Studies , Clinical Trials as Topic , Cross-Sectional Studies , Disease Models, Animal , Female , Gene Regulatory Networks , Host-Pathogen Interactions , Humans , Inflammatory Bowel Diseases/drug therapy , Inflammatory Bowel Diseases/genetics , Intestinal Mucosa/drug effects , Intestinal Mucosa/virology , Longitudinal Studies , Male , Mice , SARS-CoV-2/drug effects , Serine Endopeptidases/genetics , Signal Transduction , COVID-19 Drug Treatment
ABSTRACT
BACKGROUND & AIMS: Crohn disease (CD) presents as chronic and often progressive intestinal inflammation, but the contributing pathogenic mechanisms are unclear. We aimed to identify alterations in intestinal cells that could contribute to the chronic and progressive course of CD. METHODS: We took an unbiased system-wide approach by performing sequence analysis of RNA extracted from formalin-fixed paraffin-embedded ileal tissue sections from patients with CD (n = 36) and without CD (controls; n = 32). We selected relatively uninflamed samples, based on histology, before gene expression profiling; validation studies were performed using adjacent serial tissue sections. A separate set of samples (3 control and 4 CD samples) was analyzed by transmission electron microscopy. We developed methods to visualize an overlapping modular network of genes dysregulated in the CD samples. We validated our findings using biopsy samples (110 CD samples for gene expression analysis and 54 for histologic analysis) from the UNITI-2 phase 3 trial of ustekinumab for patients with CD and healthy individuals (26 samples used in gene expression analysis). RESULTS: We identified gene clusters that were altered in nearly all CD samples. One cluster encoded genes associated with the enterocyte brush border, leading us to investigate microvilli. In ileal tissues from patients with CD, the microvilli were of decreased length and had ultrastructural defects compared with tissues from controls. Microvilli length correlated with expression of genes that regulate microvilli structure and function. Network analysis linked the microvilli cluster to several other down-regulated clusters associated with altered intracellular trafficking and cellular metabolism. 
Enrichment of a core microvilli gene set also was lower in the UNITI-2 trial CD samples compared with controls; expression of microvilli genes was correlated with microvilli length and endoscopy score and was associated with response to treatment. CONCLUSIONS: In a transcriptome analysis of formalin-fixed and paraffin-embedded ileal tissues from patients with CD and controls, we associated transcriptional alterations with histologic alterations, such as differences in microvilli length. Decreased microvilli length and decreased expression of the microvilli gene set might contribute to epithelial malfunction and the chronic and progressive disease course in patients with CD.
Subject(s)
Crohn Disease/pathology , Ileum/pathology , Intestinal Mucosa/pathology , Intestine, Small/pathology , Microvilli/pathology , Chronic Disease , Crohn Disease/genetics , Disease Progression , Gene Expression Profiling , Humans , Microvilli/genetics , Transcriptome
ABSTRACT
MOTIVATION: DNA and protein patterns are usefully represented by sequence logos. However, the methods for logo generation in common use lack a proper statistical basis, and are non-optimal for recognizing functionally relevant alignment columns. RESULTS: We redefine the information at a logo position as a per-observation multiple alignment log-odds score. Such scores are positive or negative, depending on whether a column's observations are better explained as arising from relatedness or chance. Within this framework, we propose distinct normalized maximum likelihood and Bayesian measures of column information. We illustrate these measures on High Mobility Group B (HMGB) box proteins and a dataset of enzyme alignments. Particularly in the context of protein alignments, our measures improve the discrimination of biologically relevant positions. AVAILABILITY AND IMPLEMENTATION: Our new measures are implemented in an open-source Web-based logo generation program, which is available at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/logoddslogo/index.html. A stand-alone version of the program is also available from this site. CONTACT: altschul@ncbi.nlm.nih.gov SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
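For context, the conventional logo measure that this abstract sets out to improve on is the Shannon information content of each alignment column. The sketch below computes only that classical quantity, without small-sample correction or gap handling; it is not the normalized maximum likelihood or Bayesian log-odds measures proposed by the authors, and the function name is an illustrative assumption.

```python
import math
from collections import Counter

def column_information(column, alphabet_size=4):
    """Classical logo column information in bits: log2(alphabet size) - entropy.

    column: string of residues observed in one alignment column (no gaps).
    alphabet_size: 4 for DNA, 20 for protein.
    """
    counts = Counter(column)
    total = len(column)
    # Shannon entropy of the observed residue frequencies
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return math.log2(alphabet_size) - entropy
```

A fully conserved DNA column scores 2 bits and a uniformly mixed one scores 0 bits. Unlike the log-odds scores described in the abstract, this quantity can never be negative, so it cannot indicate that a column is better explained by chance than by relatedness.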
Subject(s)
Bayes Theorem , Position-Specific Scoring Matrices , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Sequence Analysis, Protein/methods , Software , Amino Acid Sequence , Humans , Molecular Sequence Annotation , Molecular Sequence Data , Sequence Homology, Amino Acid
ABSTRACT
MOTIVATION: Proteins underlie the functioning of a cell, and the wiring of proteins in a protein-protein interaction network (PIN) relates to their biological functions. Proteins with similar wiring in the PIN (topology around them) have been shown to have similar functions. This property has been successfully exploited for predicting protein functions. Topological similarity is also used to guide network alignment algorithms that find similarly wired proteins between PINs of different species; these similarities are used to transfer annotation across PINs, e.g. from model organisms to human. To refine these functional predictions and annotation transfers, we need to gain insight into the variability of the topology-function relationships. For example, one function may be significantly associated with specific topologies, while another function may be weakly associated with several different topologies. Also, the topology-function relationships may differ between species. RESULTS: To improve our understanding of topology-function relationships and of their conservation among species, we develop a statistical framework built upon canonical correlation analysis. Using graphlet degrees to represent the wiring around proteins in PINs and gene ontology (GO) annotations to describe their functions, our framework: (i) characterizes statistically significant topology-function relationships in a given species, and (ii) uncovers the functions that have conserved topology in PINs of different species, which we term topologically orthologous functions. We apply our framework to PINs of yeast and human, identifying seven biological process and two cellular component GO terms as topologically orthologous for the two organisms.
Subject(s)
Algorithms , Computational Biology/methods , Gene Regulatory Networks , Protein Interaction Mapping/methods , Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Conserved Sequence , Gene Ontology , Humans , Molecular Sequence Annotation , Saccharomyces cerevisiae/genetics
ABSTRACT
The role of long noncoding RNAs (lncRNAs) in disease is incompletely understood, but their regulation of inflammation is increasingly appreciated. We addressed the extent of lncRNA involvement in inflammatory bowel disease (IBD) using biopsy-derived RNA-sequencing data from a large cohort of deeply phenotyped patients with IBD. Weighted gene correlation network analysis revealed gene modules of lncRNAs coexpressed with protein-coding genes enriched for biological pathways, correlated with epithelial and immune cell signatures, or correlated with distal colon expression. Correlation of modules with clinical features uncovered a module correlated with disease severity, with an enriched interferon response signature containing the hub lncRNA IRF1-AS1. Connecting genes to IBD-associated single nucleotide polymorphisms (SNPs) revealed an enrichment of SNP-adjacent lncRNAs in biologically relevant modules. Ulcerative colitis-specific SNPs were enriched in distal colon-related modules, suggesting that disease-specific mechanisms may result from altered lncRNA expression. The function of the IBD-associated SNP-adjacent lncRNA IRF1-AS1 was explored in human myeloid cells, and our results suggested IRF1-AS1 promoted optimal production of TNF-α, IL-6, and IL-23. A CRISPR/Cas9-mediated activation screen in THP-1 cells revealed several lncRNAs that modulated LPS-induced TNF-α responses. Overall, this study uncovered the expression patterns of lncRNAs in IBD that identify functional, disease-relevant lncRNAs.
Subject(s)
Colitis, Ulcerative , RNA, Long Noncoding , Humans , Gene Regulatory Networks , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Tumor Necrosis Factor-alpha/genetics , Colitis, Ulcerative/genetics , Inflammation
ABSTRACT
CytoSaddleSum provides Cytoscape users with access to the functionality of SaddleSum, a functional enrichment tool based on sum-of-weight scores. It operates by querying SaddleSum locally (using the standalone version) or remotely (through an HTTP request to a web server). The functional enrichment results are shown as a term relationship network, where nodes represent terms and edges show term relationships. Furthermore, query results are written as Cytoscape attributes allowing easy saving, retrieval and integration into network-based data analysis workflows.
Subject(s)
Genes , Software , Gene Deletion , National Library of Medicine (U.S.) , United States
ABSTRACT
BACKGROUND AND AIMS: Histologic disease activity in Inflammatory Bowel Disease (IBD) is associated with clinical outcomes and is an important endpoint in drug development. We developed deep learning models for automating histological assessments in IBD. METHODS: Histology images of intestinal mucosa from phase 2 and phase 3 clinical trials in Crohn's disease (CD) and Ulcerative Colitis (UC) were used to train artificial intelligence (AI) models to predict the Global Histology Activity Score (GHAS) for CD and Geboes histopathology score for UC. Three AI methods were compared. AI models were evaluated on held-back testing sets and model predictions were compared against an expert central reader and five independent pathologists. RESULTS: The model based on multiple instance learning and the attention mechanism (SA-AbMILP) demonstrated the best performance among competing models. AI modeled GHAS and Geboes sub-grades matched central readings with moderate to substantial agreement, with accuracies ranging from 65% to 89%. Furthermore, the model was able to distinguish the presence and absence of pathology across four selected histological features with accuracies for colon, in both CD and UC, ranging from 87% to 94% and, for CD ileum, ranging from 76% to 83%. For both CD and UC, and across anatomical compartments (ileum and colon) in CD, comparable accuracies against central readings were found between the model assigned scores and scores by an independent set of pathologists. CONCLUSIONS: Deep learning models based upon GHAS and Geboes scoring systems were effective at distinguishing between the presence and absence of IBD microscopic disease activity.
ABSTRACT
Previous studies have conducted time course characterization of murine colitis models through transcriptional profiling of differential expression. We characterize the transcriptional landscape of acute and chronic models of dextran sodium sulfate (DSS) and adoptive transfer (AT) colitis to derive temporal gene expression and splicing signatures in blood and colonic tissue in order to capture the dynamics of colitis remission and relapse. We identify subnetworks of patient-derived causal networks that are enriched in these temporal signatures to distinguish acute and chronic disease components within the broader molecular landscape of IBD. The interaction between the DSS phenotype and chronological time-point naturally defines parsimonious temporal gene expression and splicing signatures associated with acute and chronic phases of disease (as opposed to ordinary time-specific differential expression/splicing). We show that these expression and splicing signatures are largely orthogonal, i.e. affect different genetic bodies, and that, using machine learning, the signatures are predictive of histopathological measures from both blood and intestinal data in murine colitis models as well as in an independent cohort of IBD patients. Through access to longitudinal multi-scale profiling from disease tissue in IBD patient cohorts, we can apply this machine learning pipeline to generate direct patient temporal multimodal regulatory signatures for prediction of histopathological outcomes.
Subject(s)
Colitis , Inflammatory Bowel Diseases , Animals , Mice , Inflammatory Bowel Diseases/genetics , Colitis/chemically induced , Colitis/genetics , Phenotype , Dextran Sulfate/toxicity
ABSTRACT
B cells, which are critical for intestinal homeostasis, remain understudied in ulcerative colitis (UC). In this study, we recruited three cohorts of patients with UC (primary cohort, n = 145; validation cohort 1, n = 664; and validation cohort 2, n = 143) to comprehensively define the landscape of B cells during UC-associated intestinal inflammation. Using single-cell RNA sequencing, single-cell IgH gene sequencing and protein-level validation, we mapped the compositional, transcriptional and clonotypic landscape of mucosal and circulating B cells. We found major perturbations within the mucosal B cell compartment, including an expansion of naive B cells and IgG+ plasma cells with curtailed diversity and maturation. Furthermore, we isolated an auto-reactive plasma cell clone targeting integrin αvβ6 from inflamed UC intestines. We also identified a subset of intestinal CXCL13-expressing TFH-like T peripheral helper cells that were associated with the pathogenic B cell response. Finally, across all three cohorts, we confirmed that changes in intestinal humoral immunity are reflected in circulation by the expansion of gut-homing plasmablasts that correlates with disease activity and predicts disease complications. Our data demonstrate a highly dysregulated B cell response in UC and highlight a potential role of B cells in disease pathogenesis.
Subject(s)
Colitis, Ulcerative , Plasma Cells , B-Lymphocytes , Colitis, Ulcerative/genetics , Humans , Intestinal Mucosa/pathology , Lymphocyte Count , T-Lymphocytes, Helper-Inducer
ABSTRACT
MOTIVATION: Term-enrichment analysis facilitates biological interpretation by assigning to experimentally/computationally obtained data annotation associated with terms from controlled vocabularies. This process usually involves obtaining statistical significance for each vocabulary term and using the most significant terms to describe a given set of biological entities, often associated with weights. Many existing enrichment methods require selection of an arbitrary number of the most significant entities and/or do not account for weights of entities. Others either mandate extensive simulations to obtain statistics or assume a normal weight distribution. In addition, most methods have difficulty assigning correct statistical significance to terms with few entities. RESULTS: Implementing the well-known Lugannani-Rice formula, we have developed a novel approach, called SaddleSum, that is free from all the aforementioned constraints and evaluated it against several existing methods. With entity weights properly taken into account, SaddleSum is internally consistent and stable with respect to the choice of number of most significant entities selected. Making few assumptions on the input data, the proposed method is universal and can thus be applied to areas beyond analysis of microarrays. Employing asymptotic approximation, SaddleSum provides a term-size-dependent score distribution function that gives rise to accurate statistical significance even for terms with few entities. As a consequence, SaddleSum enables researchers to place confidence in its significance assignments to small terms that are often biologically most specific. AVAILABILITY: Our implementation, which uses Bonferroni correction to account for multiple hypothesis testing, is available at http://www.ncbi.nlm.nih.gov/CBBresearch/qmbp/mn/enrich/. Source code for the standalone version can be downloaded from ftp://ftp.ncbi.nlm.nih.gov/pub/qmbpmn/SaddleSum/.
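The general idea of a saddlepoint tail approximation for a sum of weights can be sketched as follows. This is a generic illustration of the Lugannani-Rice formula applied to the sum of m independent draws from the empirical weight distribution. It is an assumption-laden sketch, not SaddleSum's actual implementation, which is available from the URLs above. The function name, the bisection solver and the restriction to the upper tail are choices made here for brevity.

```python
import math

def lugannani_rice_tail(weights, m, s):
    """Approximate P(S >= s), where S is the sum of m i.i.d. draws from `weights`.

    Uses the continuous Lugannani-Rice formula; valid only for
    m * mean(weights) < s < m * max(weights).
    """
    n = len(weights)
    mean = sum(weights) / n
    w_max = max(weights)
    if not (m * mean < s < m * w_max):
        raise ValueError("s must lie strictly between m*mean and m*max(weights)")

    def moments(t):
        # Exponentially tilted weights, shifted by w_max to avoid overflow
        tilt = [math.exp(t * (w - w_max)) for w in weights]
        z = sum(tilt)
        k = t * w_max + math.log(z / n)  # cumulant generating function of one draw
        k1 = sum(w * e for w, e in zip(weights, tilt)) / z  # its first derivative
        k2 = sum(w * w * e for w, e in zip(weights, tilt)) / z - k1 * k1  # second derivative
        return k, k1, k2

    # Solve m * k'(t) = s for the saddlepoint t > 0 by bracketing and bisection
    target = s / m
    lo, hi = 0.0, 1.0
    while moments(hi)[1] < target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if moments(mid)[1] < target:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)

    k, _, k2 = moments(t)
    w_hat = math.sqrt(2.0 * (t * s - m * k))
    u_hat = t * math.sqrt(m * k2)
    upper_normal = 0.5 * math.erfc(w_hat / math.sqrt(2.0))  # 1 - Phi(w_hat)
    density = math.exp(-0.5 * w_hat * w_hat) / math.sqrt(2.0 * math.pi)
    return upper_normal + density * (1.0 / u_hat - 1.0 / w_hat)
```

Because the approximation depends on m through the cumulant generating function, the resulting score distribution changes with term size, which is the property the abstract credits for accurate significance on small terms.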
Subject(s)
Computational Biology/methods , Vocabulary, Controlled , Algorithms , Data Interpretation, Statistical , Databases, Factual , Terminology as Topic
ABSTRACT
BACKGROUND: Inflammatory bowel disease (IBD) is a complex disease with variable presentation, progression, and response to therapies. Current disease classification is based on subjective clinical phenotypes. The peripheral blood immunophenome can reflect local inflammation, and thus we measured 39 circulating immune cell types in a large cohort of IBD and control subjects and performed immunotype:phenotype associations. METHODS: We performed fluorescence-activated cell sorting or CyTOF analysis on blood from 728 Crohn's disease, 464 ulcerative colitis, and 334 non-IBD patients, with available demographics, endoscopic and clinical examinations and medication use. RESULTS: We observed few immune cell types commonly affected in IBD (lowered natural killer cells, B cells, and CD45RA- CD8 T cells). Generally, the immunophenome was distinct between ulcerative colitis and Crohn's disease. Within disease subtype, there were further distinctions, with specific immune cell types associating with disease duration, behavior, and location. Thiopurine monotherapy altered abundance of many cell types, often in the same direction as disease association, while anti-tumor necrosis factor (anti-TNF) monotherapy demonstrated an opposing pattern. Concomitant use of an anti-TNF and thiopurine was not synergistic, but rather was additive. For example, thiopurine monotherapy use alone or in combination with anti-TNF was associated with a dramatic reduction in major subclasses of B cells. CONCLUSIONS: We present a peripheral map of immune cell changes in IBD related to disease entity and therapies as a resource for hypothesis generation. We propose the changes in B cell subsets could affect antibody formation and potentially explain the mechanism behind the superiority of combination therapy through the impact of thiopurines on pharmacokinetics of anti-TNFs.
Subject(s)
Immune System/pathology , Inflammatory Bowel Diseases/immunology , Inflammatory Bowel Diseases/therapy , Adult , B-Lymphocytes/immunology , Case-Control Studies , Cohort Studies , Colitis, Ulcerative/blood , Colitis, Ulcerative/diagnosis , Colitis, Ulcerative/immunology , Colitis, Ulcerative/therapy , Combined Modality Therapy , Crohn Disease/blood , Crohn Disease/diagnosis , Crohn Disease/immunology , Crohn Disease/therapy , Female , Humans , Immunophenotyping , Inflammatory Bowel Diseases/blood , Inflammatory Bowel Diseases/diagnosis , Male , Mercaptopurine/therapeutic use , Middle Aged , Severity of Illness Index , Surveys and Questionnaires , T-Lymphocytes/immunology , Tumor Necrosis Factor-alpha/antagonists & inhibitors
ABSTRACT
SUMMARY: Founded upon diffusion with damping, ITM Probe is an application for modeling information flow in protein interaction networks without prior restriction to the sub-network of interest. Given a context consisting of desired origins and destinations of information, ITM Probe returns the set of most relevant proteins with weights and a graphical representation of the corresponding sub-network. With a click, the user may send the resulting protein list for enrichment analysis to facilitate hypothesis formation or confirmation. AVAILABILITY: ITM Probe web service and documentation can be found at www.ncbi.nlm.nih.gov/CBBresearch/qmbp/mn/itm_probe.
Subject(s)
Computational Biology/methods , Protein Interaction Mapping/methods , Proteins/chemistry , Software , Databases, Protein
ABSTRACT
MOTIVATION: The flexibility in gap cost enjoyed by hidden Markov models (HMMs) is expected to afford them better retrieval accuracy than position-specific scoring matrices (PSSMs). We attempt to quantify the effect of more general gap parameters by separately examining the influence of position- and composition-specific gap scores, as well as by comparing the retrieval accuracy of the PSSMs constructed using an iterative procedure to that of the HMMs provided by Pfam and SUPERFAMILY, curated ensembles of multiple alignments. RESULTS: We found that position-specific gap penalties have an advantage over uniform gap costs. We did not explore optimizing distinct uniform gap costs for each query. For Pfam, PSSMs iteratively constructed from seeds based on HMM consensus sequences perform equivalently to HMMs that were adjusted to have constant gap transition probabilities, albeit with much greater variance. We observed no effect of composition-specific gap costs on retrieval performance. These results suggest possible improvements to the PSI-BLAST protein database search program. AVAILABILITY: The scripts for performing evaluations are available upon request from the authors.
Subject(s)
Algorithms , Artificial Intelligence , Pattern Recognition, Automated/methods , Proteins/chemistry , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Amino Acid Sequence , Markov Chains , Molecular Sequence Data , Sensitivity and Specificity
ABSTRACT
There is a major unmet clinical need to identify pathways in inflammatory bowel disease (IBD) to classify patient disease activity, stratify patients who will benefit from targeted therapies such as anti-tumor necrosis factor (TNF), and identify new therapeutic targets. In this study, we conducted global transcriptome analysis to identify IBD-related pathways using colon biopsies, which highlighted the coagulation gene pathway as one of the most enriched gene sets in patients with IBD. Using this gene-network analysis across 14 independent cohorts and 1800 intestinal biopsies, we found that, among the coagulation pathway genes, plasminogen activator inhibitor-1 (PAI-1) expression was highly enriched in active disease and in patients with IBD who did not respond to anti-TNF biologic therapy and that PAI-1 is a key link between the epithelium and inflammation. Functionally, PAI-1 and its direct target, the fibrinolytic protease tissue plasminogen activator (tPA), played an important role in regulating intestinal inflammation. Intestinal epithelial cells produced tPA, which was protective against chemically and mechanically mediated colonic injury in mice. In contrast, PAI-1 exacerbated mucosal damage by blocking tPA-mediated cleavage and activation of anti-inflammatory TGF-β, whereas the inhibition of PAI-1 reduced both mucosal damage and inflammation. This study identifies an immune-coagulation gene axis in IBD where elevated PAI-1 may contribute to more aggressive disease.
Subject(s)
Colitis/metabolism , Colitis/pathology , Intestinal Mucosa/metabolism , Intestinal Mucosa/pathology , Plasminogen Activator Inhibitor 1/metabolism , Animals , Biological Factors/pharmacology , Biological Factors/therapeutic use , Blood Coagulation , Cell Proliferation/drug effects , Citrobacter/drug effects , Colitis/immunology , Colitis/microbiology , Colon/pathology , Cytokines/metabolism , Inflammation/pathology , Inflammatory Bowel Diseases/blood , Inflammatory Bowel Diseases/drug therapy , Inflammatory Bowel Diseases/pathology , Interleukin-17/metabolism , Mice , Severity of Illness Index , Small Molecule Libraries/pharmacology , Th17 Cells/immunology , Tissue Plasminogen Activator/metabolism , Transcription, Genetic , Transforming Growth Factor beta/metabolism
ABSTRACT
Interaction networks, consisting of agents linked by their interactions, are ubiquitous across many disciplines of modern science. Many methods of analysis of interaction networks have been proposed, mainly concentrating on node degree distribution or aiming to discover clusters of agents that are very strongly connected among themselves. These methods are principally based on graph theory or machine learning. We present a mathematically simple formalism for modelling context-specific information propagation in interaction networks based on random walks. The context is provided by selection of sources and destinations of information and by use of potential functions that direct the flow towards the destinations. We also use the concept of dissipation to model the aging of information as it diffuses from its source. Using examples from yeast protein-protein interaction networks and some of the histone acetyltransferases involved in control of transcription, we demonstrate the utility of the concepts and the mathematical constructs introduced in this paper.
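The idea of a random walk with dissipation can be illustrated with a minimal sketch. This is not the formalism of the paper: it assumes an undirected, unweighted toy network, a single source, and a uniform per-step dissipation probability, and it omits destinations and potential functions entirely. The function name `expected_visits` and the parameter `damping` are hypothetical. Under these assumptions, the expected number of visits to each node by a walker emitted at the source is one row of the matrix (I - (1 - damping) P)^-1, where P is the row-normalized adjacency matrix.

```python
import numpy as np


def expected_visits(adjacency, source, damping=0.15):
    """Expected visits to each node by a damped random walk from `source`.

    Assumption (not from the paper): at every step the walker dissipates
    with probability `damping`; otherwise it moves to a neighbor chosen
    in proportion to the edge weight.
    """
    A = np.asarray(adjacency, dtype=float)
    if not 0.0 < damping <= 1.0:
        raise ValueError("damping must lie in (0, 1]")
    degree = A.sum(axis=1)
    if np.any(degree == 0):
        raise ValueError("every node needs at least one neighbor")

    P = A / degree[:, None]          # row-stochastic transition matrix
    alpha = 1.0 - damping            # probability of surviving one step
    n = A.shape[0]

    # Row `source` of (I - alpha * P)^-1, obtained by solving the
    # transposed linear system instead of forming the inverse.
    e_s = np.zeros(n)
    e_s[source] = 1.0
    return np.linalg.solve((np.eye(n) - alpha * P).T, e_s)


# Toy path network 0 - 1 - 2 with information emitted at node 0.
toy = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
visits = expected_visits(toy, source=0, damping=0.5)
```

In this toy case the visit counts fall off with distance from the source, and their total equals 1/damping, the expected lifetime of the walker. Larger damping therefore confines the information to a smaller neighborhood of the source.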