ABSTRACT
BACKGROUND AND AIMS: Nonalcoholic steatohepatitis (NASH) is considered a pivotal stage in nonalcoholic fatty liver disease (NAFLD) progression, given that it paves the way for severe liver injuries such as fibrosis and cirrhosis. The etiology of human NASH is multifactorial, and identifying reliable molecular players and/or biomarkers has proven difficult. Together with the inadequate consideration of risk factors revealed by epidemiological studies (altered glucose homeostasis, obesity, ethnicity, sex, etc.), the limited availability of representative NASH cohorts with associated liver biopsies, the gold standard for NASH diagnosis, probably explains the poor overlap between published "omics"-defined NASH signatures. APPROACH AND RESULTS: Here, we explored liver transcriptomic profiles starting from a cohort of 910 obese patients, which was further stratified based on stringent histological characterization to define "NoNASH" and "NASH" patients. Sex was identified as the main source of data heterogeneity in this cohort. Using bootstrapping and random forest (RF) approaches, we identified reliably differentially expressed genes participating in distinct biological processes in NASH as a function of sex. RF-calculated gene signatures identified NASH patients in independent cohorts with high accuracy. CONCLUSIONS: This large-scale analysis of transcriptomic profiles from human livers emphasizes the sexually dimorphic nature of NASH and its link with fibrosis, calling for the integration of sex as a major determinant of liver responses to NASH progression and to drugs.
Subjects
Non-alcoholic Fatty Liver Disease/metabolism , Female , Humans , Liver/metabolism , Liver/pathology , Male , Middle Aged , Non-alcoholic Fatty Liver Disease/etiology , Non-alcoholic Fatty Liver Disease/pathology , Obesity/complications , Obesity/metabolism , Risk Factors , Sex Factors , Transcriptome
ABSTRACT
BACKGROUND: In eukaryotic cells, transcription factors (TFs) are thought to act in a combinatorial way, by competing and collaborating to regulate common target genes. However, several questions remain regarding the conservation of these combinations among different gene classes, regulatory regions and cell types. RESULTS: We propose a new approach named TFcoop to infer the TF combinations involved in the binding of a target TF in a particular cell type. TFcoop aims to predict the binding sites of the target TF from the nucleotide content of the sequences and the binding affinity of all identified cooperating TFs. The set of cooperating TFs and the model parameters are learned from ChIP-seq data of the target TF. We used TFcoop to investigate the TF combinations involved in the binding of 106 TFs in 41 cell types and in four regulatory regions: promoters of mRNAs, lncRNAs and pri-miRNAs, and enhancers. We first show that TFcoop is accurate and outperforms simple PWM methods for predicting TF binding sites. Next, analysis of the learned models sheds light on important properties of TF combinations in different promoter classes and in enhancers. First, we show that combinations governing TF binding in enhancers are more cell-type specific than those governing binding in promoters. Second, for a given TF and cell type, we observe that TF combinations differ between promoters and enhancers, but are similar for promoters of mRNAs, lncRNAs and pri-miRNAs. Analysis of the TFs cooperating with the different targets shows over-representation of pioneer TFs and a clear preference for TFs with a binding motif composition similar to that of the target. Lastly, our models accurately distinguish promoters associated with specific biological processes. CONCLUSIONS: TFcoop appears to be an accurate approach for studying TF combinations. Its use on ENCODE and FANTOM data allowed us to discover important properties of human TF combinations in different promoter classes and in enhancers. The R code for learning a TFcoop model and for reproducing the main experiments described in the paper is available in an R Markdown file at https://gite.lirmm.fr/brehelin/TFcoop .
Subjects
Computational Biology/methods , Enhancer Elements, Genetic , Gene Expression Regulation , Promoter Regions, Genetic , Transcription Factors/metabolism , Binding Sites , Humans , Transcription Factors/genetics
ABSTRACT
Gene expression is orchestrated by distinct regulatory regions to ensure a wide variety of cell types and functions. A challenge is to identify which regulatory regions are active, what their associated features are, and how they work together in each cell type. Several approaches have tackled this problem by modeling gene expression based on epigenetic marks, with the ultimate goal of identifying driving regions and associated genomic variations that are clinically relevant, in particular in precision medicine. However, these models rely on experimental data, which are limited to specific samples (often only cell lines) and cannot be generated for all regulators and all patients. In addition, we show here that, although these approaches are accurate in predicting gene expression, inference of TF combinations from this type of model is not straightforward. Furthermore, these methods are not designed to capture regulation instructions present at the sequence level, before the binding of regulators or the opening of the chromatin. Here, we probe sequence-level instructions for gene expression and develop a method to explain mRNA levels based solely on nucleotide features. Our method positions nucleotide composition as a critical component of gene expression. Moreover, our approach, which can rank regulatory regions according to their contribution, unveils a strong influence of the gene body sequence, in particular introns. We further provide evidence that the contribution of nucleotide content can be linked to co-regulations associated with genome 3D architecture and to associations of genes within topologically associated domains.
Subjects
Base Composition , Gene Expression Regulation , Regulatory Sequences, Nucleic Acid , Computational Biology , DNA Copy Number Variations , Enhancer Elements, Genetic , Genome, Human , Humans , Models, Genetic , Neoplasms/genetics , Neoplasms/metabolism , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Quantitative Trait Loci , RNA, Messenger/chemistry , RNA, Messenger/genetics , RNA, Messenger/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism
ABSTRACT
Background & Aims: Liver homeostasis is ensured in part by time-of-day-dependent processes, many of them being paced by the molecular circadian clock. Liver functions are compromised in metabolic dysfunction-associated steatotic liver disease (MASLD) and metabolic dysfunction-associated steatohepatitis (MASH), and clock disruption increases susceptibility to MASLD progression in rodent models. We therefore investigated whether the time-of-day-dependent transcriptome and metabolome are significantly altered in human steatotic and MASH livers. Methods: Liver biopsies, collected within an 8 h-window from a carefully phenotyped cohort of 290 patients and histologically diagnosed to be either normal, steatotic or MASH hepatic tissues, were analyzed by RNA sequencing and unbiased metabolomic approaches. Time-of-day-dependent gene expression patterns and metabolomes were identified and compared between histologically normal, steatotic and MASH livers. Results: Herein, we provide a first-of-its-kind report of a daytime-resolved human liver transcriptome-metabolome and associated alterations in MASLD. Transcriptomic analysis showed a robustness of core molecular clock components in steatotic and MASH livers. It also revealed stage-specific, time-of-day-dependent alterations of hundreds of transcripts involved in cell-to-cell communication, intracellular signaling and metabolism. Similarly, rhythmic amino acid and lipid metabolomes were affected in pathological livers. Both TNFα and PPARγ signaling were predicted as important contributors to altered rhythmicity. Conclusion: MASLD progression to MASH perturbs time-of-day-dependent processes in human livers, while the differential expression of core molecular clock components is maintained. Impact and implications: This work characterizes the rhythmic patterns of the transcriptome and metabolome in the human liver. 
Using a cohort of well-phenotyped patients (n = 290) for whom the time-of-day at biopsy collection was known, we show that time-of-day variations observed in histologically normal livers are gradually perturbed in liver steatosis and metabolic dysfunction-associated steatohepatitis. Importantly, these observations, albeit obtained across a restricted time window, provide further support for preclinical studies demonstrating alterations of rhythmic patterns in diseased livers. On a practical note, this study indicates the importance of considering time-of-day as a critical biological variable which may significantly affect data interpretation in animal and human studies of liver diseases.
ABSTRACT
OBJECTIVE: Steatotic liver disease (SLD) is frequent in individuals with obesity. In this study, type 2 diabetes (T2D), sex, and menopausal status were combined to refine the stratification of obesity regarding the risk of advanced SLD and gain further insight into disease physiopathology. METHODS: This study enrolled 1446 participants with obesity from the ABOS cohort (NCT01129297), who underwent extensive phenotyping, including liver histology and transcriptome profiling. Hierarchical clustering was applied to classify participants. The prevalence of metabolic disorders associated with steatohepatitis (NASH) and liver fibrosis (F ≥ 2) was determined within each identified subgroup and aligned to clinical and biological characteristics. RESULTS: The prevalence of NASH and F ≥ 2 was, respectively, 9.5% (138/1446) and 11.7% (159/1365) in the overall population, 20.3% (107/526) and 21.1% (106/502) in T2D patients, and 3.4% (31/920) and 6.1% (53/863) in non-T2D patients. NASH and F ≥ 2 prevalence was 15.4% (33/215) and 15.5% (32/206) among premenopausal women with T2D vs. 29.5% (33/112) and 30.3% (36/119) in postmenopausal women with T2D (p < 0.01), and 21.0% (21/100) and 27.0% (24/89) in men with T2D aged ≥ 50 years vs. 17.9% (17/95) and 18.5% (17/92) in men with T2D aged < 50 years (NS). The distinct contribution of menopause was confirmed by the interaction between sex and age with respect to NASH among T2D patients (p = 0.048). Finally, several NASH-associated biological traits (lower platelet count; higher serum uric acid, gamma-glutamyl transferase, and aspartate aminotransferase) and the liver-expressed genes AKR1B10 and CCL20 were significantly associated with menopause in women with T2D but not with age in men with T2D. CONCLUSIONS: This study unveiled a remarkably high prevalence of advanced SLD after menopause in women with T2D, associated with a dysfunctional biological liver profile.
Subjects
Diabetes Mellitus, Type 2 , Non-alcoholic Fatty Liver Disease , Male , Humans , Female , Middle Aged , Non-alcoholic Fatty Liver Disease/complications , Non-alcoholic Fatty Liver Disease/epidemiology , Non-alcoholic Fatty Liver Disease/pathology , Diabetes Mellitus, Type 2/complications , Diabetes Mellitus, Type 2/epidemiology , Diabetes Mellitus, Type 2/metabolism , Retrospective Studies , Uric Acid/metabolism , Liver Cirrhosis/epidemiology , Liver Cirrhosis/complications , Liver Cirrhosis/pathology , Liver/metabolism , Obesity/complications , Obesity/epidemiology , Obesity/metabolism , Menopause
ABSTRACT
Tissue injury triggers activation of mesenchymal lineage cells into wound-repairing myofibroblasts, whose unrestrained activity leads to fibrosis. Although this process is largely controlled at the transcriptional level, it remains unclear whether the main transcription factors involved have all been identified. Here, we report multi-omics analyses identifying Basonuclin 2 (BNC2) as a myofibroblast identity transcription factor. Using liver fibrosis as a model for in-depth investigations, we first show that BNC2 expression is induced in both mouse and human fibrotic livers of different etiologies and decreases upon human liver fibrosis regression. Importantly, we found that BNC2 transcriptional induction is a specific feature of myofibroblastic activation in fibrotic tissues. Mechanistically, BNC2 expression and activities allow the integration of pro-fibrotic stimuli, including TGFβ and Hippo/YAP1 signaling, towards induction of matrisome genes such as those encoding type I collagen. As a consequence, Bnc2 deficiency blunts collagen deposition in livers of mice fed a fibrogenic diet. Additionally, our work establishes BNC2 as potentially druggable, since we identified the thalidomide derivative CC-885 as a BNC2 inhibitor. Altogether, we propose that BNC2 is a transcription factor involved in canonical pathways driving myofibroblastic activation in fibrosis.
Subjects
Liver Cirrhosis , Myofibroblasts , Animals , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Genomics , Humans , Liver Cirrhosis/genetics , Liver Cirrhosis/metabolism , Mice , Myofibroblasts/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism
ABSTRACT
Transcriptomic analyses are broadly used in biomedical research, calling for tools that allow biologists to be directly involved in data mining and interpretation. We present here GIANT, a Galaxy-based tool for Interactive ANalysis of Transcriptomic data, which consists of biologist-friendly tools dedicated to the analysis of transcriptomic data from microarray or RNA-seq experiments. GIANT is organized into modules, allowing researchers to tailor their analyses by choosing the specific set of tool(s) needed to analyse any type of preprocessed transcriptomic data. It also includes a series of tools dedicated to the handling of raw Affymetrix microarray data. GIANT brings easy-to-use solutions to biologists for transcriptomic data mining and interpretation.
Subjects
Computational Biology/methods , Gene Expression Profiling/methods , Data Interpretation, Statistical , Data Mining , Databases, Genetic , High-Throughput Nucleotide Sequencing , Oligonucleotide Array Sequence Analysis , Sequence Analysis, RNA , Software
ABSTRACT
Modern technologies, especially next-generation sequencing facilities, are giving cheaper access to genotype and genomic data measured on the same sample at once. This creates an ideal situation for multifactorial experiments designed to infer gene regulatory networks. The fifth "Dialogue for Reverse Engineering Assessments and Methods" (DREAM5) challenges are aimed at assessing methods and associated algorithms devoted to the inference of biological networks. Challenge 3 on "Systems Genetics" proposed to infer causal gene regulatory networks from different genetical genomics data sets. We investigated a wide panel of methods, ranging from Bayesian networks to penalised linear regressions, to analyse such data, and proposed a simple yet very powerful meta-analysis that combines these inference methods. We present the results of the Challenge as well as a more in-depth analysis of the predicted networks in terms of structure and reliability. The developed meta-analysis was ranked first among the 16 teams participating in Challenge 3A. It paves the way for future extensions of our inference method and for more accurate gene network estimates in the context of genetical genomics.
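
The abstract above does not state how the individual inference methods are combined. As a purely illustrative sketch, assuming each method returns a confidence score per candidate regulatory edge and that the combination is done by averaging per-method ranks (an assumption, not the authors' stated procedure), a consensus ranking could be computed as follows. The function names `rank_edges` and `aggregate`, the gene labels and the scores are all hypothetical.

```python
from collections import defaultdict


def rank_edges(scores):
    """Map each edge to its rank within one method (1 = most confident)."""
    ordered = sorted(scores, key=lambda edge: scores[edge], reverse=True)
    return {edge: rank for rank, edge in enumerate(ordered, start=1)}


def aggregate(methods):
    """Average per-method ranks; an edge a method did not score
    receives that method's worst rank plus one."""
    all_edges = set().union(*(m.keys() for m in methods))
    total = defaultdict(float)
    for scores in methods:
        ranks = rank_edges(scores)
        missing_rank = len(ranks) + 1
        for edge in all_edges:
            total[edge] += ranks.get(edge, missing_rank)
    mean_rank = {edge: total[edge] / len(methods) for edge in all_edges}
    # Lower mean rank = stronger consensus; ties are broken by edge name.
    return sorted(mean_rank.items(), key=lambda kv: (kv[1], kv[0]))


# Toy scores from two hypothetical inference methods (regulator, target).
method_a = {("G1", "G2"): 0.9, ("G2", "G3"): 0.4, ("G1", "G3"): 0.1}
method_b = {("G1", "G2"): 0.7, ("G1", "G3"): 0.6}

consensus = aggregate([method_a, method_b])
for edge, mean_rank in consensus:
    print(edge, mean_rank)
```

Working on ranks rather than raw scores keeps methods with different score scales (for example, posterior probabilities and regression coefficients) comparable, which is one common reason to prefer rank aggregation in this kind of sketch.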