Results 1 - 20 of 60
1.
bioRxiv ; 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39229212

ABSTRACT

Late-onset Alzheimer's disease (LOAD) research has principally focused on neurons over the years due to their known role in the production of amyloid beta plaques and neurofibrillary tangles. In contrast, recent genomic studies of LOAD have implicated microglia as culprits of the prolonged inflammation exacerbating the neurodegeneration observed in patient brains. Indeed, recent LOAD genome-wide association studies (GWAS) have reported multiple loci near genes related to microglial function, including TREM2, ABI3, and CR1. However, GWAS alone cannot pinpoint underlying causal variants or effector genes at such loci, as most signals reside in non-coding regions of the genome and could presumably confer their influence frequently via long-range regulatory interactions. We elected to carry out a combination of ATAC-seq and high-resolution promoter-focused Capture-C in two human microglial cell models (iPSC-derived microglia and HMC3) in order to physically map interactions between LOAD GWAS-implicated candidate causal variants and their corresponding putative effector genes. Notably, we observed consistent evidence that rs6024870 at the GWAS CASS4 locus contacted the promoter of the nearby gene RTFDC1. We subsequently observed a directionally consistent decrease in RTFDC1 expression with the protective minor A allele of rs6024870 via both luciferase assays in HMC3 cells and expression studies in primary human microglia. Through CRISPR-Cas9-mediated deletion of the putative regulatory region harboring rs6024870 in HMC3 cells, we observed increased pro-inflammatory cytokine secretion and decreased DNA double strand break repair related, at least in part, to RTFDC1 expression levels. Our variant-to-function approach therefore reveals that the rs6024870-harboring regulatory element at the LOAD 'CASS4' GWAS locus influences both microglial inflammatory capacity and DNA damage resolution, along with cumulative evidence implicating RTFDC1 as a novel candidate effector gene.

2.
Diabetologia ; 2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39240351

ABSTRACT

AIMS/HYPOTHESIS: Genome-wide association studies (GWAS) have identified hundreds of type 2 diabetes loci, with the vast majority of signals located in non-coding regions; as a consequence, it remains largely unclear which 'effector' genes these variants influence. Determining these effector genes has been hampered by the relatively challenging cellular settings in which they are hypothesised to confer their effects. METHODS: To implicate such effector genes, we elected to generate and integrate high-resolution promoter-focused Capture-C, assay for transposase-accessible chromatin with sequencing (ATAC-seq) and RNA-seq datasets to characterise chromatin and expression profiles in multiple cell lines relevant to type 2 diabetes for subsequent functional follow-up analyses: EndoC-BH1 (pancreatic beta cell), HepG2 (hepatocyte) and Simpson-Golabi-Behmel syndrome (SGBS; adipocyte). RESULTS: The subsequent variant-to-gene analysis implicated 810 candidate effector genes at 370 type 2 diabetes risk loci. Using partitioned linkage disequilibrium score regression, we observed enrichment for type 2 diabetes and fasting glucose GWAS loci in promoter-connected putative cis-regulatory elements in EndoC-BH1 cells as well as fasting insulin GWAS loci in SGBS cells. Moreover, as a proof of principle, when we knocked down expression of the SMCO4 gene in EndoC-BH1 cells, we observed a statistically significant increase in insulin secretion. CONCLUSIONS/INTERPRETATION: These results provide a resource for comparing tissue-specific data in tractable cellular models as opposed to relatively challenging primary cell settings. DATA AVAILABILITY: Raw and processed next-generation sequencing data for EndoC-BH1, HepG2, SGBS_undiff and SGBS_diff cells are deposited in GEO under the Superseries accession GSE262484. Promoter-focused Capture-C data are deposited under accession GSE262496. Hi-C data are deposited under accession GSE262481. 
Bulk ATAC-seq data are deposited under accession GSE262479. Bulk RNA-seq data are deposited under accession GSE262480.

3.
Diabetes ; 73(10): 1697-1704, 2024 Oct 01.
Article in English | MEDLINE | ID: mdl-39083653

ABSTRACT

Persistent enterovirus B infection has been proposed as an important contributor to the etiology of type 1 diabetes. We leveraged extensive bulk RNA-sequencing (RNA-seq) data from α-, β-, and exocrine cells, as well as islet single-cell RNA-seq data from the Human Pancreas Analysis Program (HPAP), to evaluate the presence of enterovirus B sequences in the pancreas of patients with type 1 diabetes and prediabetes (no diabetes but positive for autoantibodies). We examined all available HPAP data for either assay type, including donors without diabetes and with type 1 and type 2 diabetes. To assess the presence of viral reads, we analyzed all reads not mapping to the human genome with the taxonomic classification system Kraken2 and its full viral database augmented to encompass representatives for all 28 enterovirus B serotypes for which a complete genome is available. As a secondary approach, we input the same sequence reads into the STAR aligner using these 28 enterovirus B genomes as the reference. No enterovirus B sequences were detected by either approach in any of the 243 bulk RNA libraries or in any of the 79 single-cell RNA libraries. While we cannot rule out the possibility of a very-low-grade persistent enterovirus B infection in the donors analyzed, our data do not support the notion of chronic viral infection by these viruses as a major driver of type 1 diabetes.


Subjects
Diabetes Mellitus, Type 1 , Enterovirus B, Human , Enterovirus Infections , Islets of Langerhans , Prediabetic State , Sequence Analysis, RNA , Diabetes Mellitus, Type 1/virology , Diabetes Mellitus, Type 1/genetics , Humans , Islets of Langerhans/virology , Enterovirus Infections/virology , Enterovirus Infections/genetics , Prediabetic State/virology , Prediabetic State/genetics , Enterovirus B, Human/genetics , Sequence Analysis, RNA/methods , Male , Female , Adult
4.
Cancer Res ; 83(20): 3462-3477, 2023 10 13.
Article in English | MEDLINE | ID: mdl-37584517

ABSTRACT

Long noncoding RNAs (lncRNA) play an important role in gene regulation and contribute to tumorigenesis. While pan-cancer studies of lncRNA expression have been performed for adult malignancies, the lncRNA landscape across pediatric cancers remains largely uncharted. Here, we curated RNA sequencing data for 1,044 pediatric leukemia and extracranial solid tumors and integrated paired tumor whole genome sequencing and epigenetic data in relevant cell line models to explore lncRNA expression, regulation, and association with cancer. A total of 2,657 lncRNAs were robustly expressed across six pediatric cancers, including 1,142 exhibiting histotype-elevated expression. DNA copy number alterations contributed to lncRNA dysregulation at a proportion comparable to protein coding genes. Application of a multidimensional framework to identify and prioritize lncRNAs impacting gene networks revealed that lncRNAs dysregulated in pediatric cancer are associated with proliferation, metabolism, and DNA damage hallmarks. Analysis of upstream regulation via cell type-specific transcription factors further implicated distinct histotype-elevated and developmental lncRNAs. Integration of these analyses prioritized lncRNAs for experimental validation, and silencing of TBX2-AS1, the top-prioritized neuroblastoma-specific lncRNA, resulted in significant growth inhibition of neuroblastoma cells, confirming the computational predictions. Taken together, these data provide a comprehensive characterization of lncRNA regulation and function in pediatric cancers and pave the way for future mechanistic studies. SIGNIFICANCE: Comprehensive characterization of lncRNAs in pediatric cancer leads to the identification of highly expressed lncRNAs across childhood cancers, annotation of lncRNAs showing histotype-specific elevated expression, and prediction of lncRNA gene regulatory networks.


Subjects
Leukemia , Neuroblastoma , RNA, Long Noncoding , Adult , Humans , Child , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Gene Expression Profiling , Neuroblastoma/genetics , Leukemia/genetics , Genomics , Gene Regulatory Networks , Gene Expression Regulation, Neoplastic
6.
Transl Psychiatry ; 13(1): 78, 2023 03 03.
Article in English | MEDLINE | ID: mdl-36869037

ABSTRACT

Disrupted sleep is a symptom of many psychiatric disorders, including substance use disorders. Most drugs of abuse, including opioids, disrupt sleep. However, the extent and consequence of opioid-induced sleep disturbance, especially during chronic drug exposure, is understudied. We have previously shown that sleep disturbance alters voluntary morphine intake. Here, we examine the effects of acute and chronic morphine exposure on sleep. Using an oral self-administration paradigm, we show that morphine disrupts sleep, most significantly during the dark cycle in chronic morphine, with a concomitant sustained increase in neural activity in the Paraventricular Nucleus of the Thalamus (PVT). Morphine binds primarily to Mu Opioid Receptors (MORs), which are highly expressed in the PVT. Translating Ribosome Affinity Purification (TRAP)-Sequencing of PVT neurons that express MORs showed significant enrichment of the circadian entrainment pathway. To determine whether MOR + cells in the PVT mediate morphine-induced sleep/wake properties, we inhibited these neurons during the dark cycle while mice were self-administering morphine. This inhibition decreased morphine-induced wakefulness but not general wakefulness, indicating that MORs in the PVT contribute to opioid-specific wake alterations. Overall, our results suggest an important role for PVT neurons that express MORs in mediating morphine-induced sleep disturbance.


Subjects
Morphine , Sleep Wake Disorders , Animals , Mice , Analgesics, Opioid , Receptors, Opioid, mu , Neurons , Thalamus
7.
Cell Mol Gastroenterol Hepatol ; 15(4): 821-839, 2023.
Article in English | MEDLINE | ID: mdl-36503150

ABSTRACT

BACKGROUND & AIMS: Although trimethylation of histone H3 lysine 27 (H3K27me3) by polycomb repressive complex 2 is required for intestinal function, the role of the antagonistic process, H3K27me3 demethylation, in the intestine remains unknown. The aim of this study was to determine the contribution of H3K27me3 demethylases to intestinal homeostasis. METHODS: An inducible mouse model was used to simultaneously ablate the 2 known H3K27me3 demethylases, lysine (K)-specific demethylase 6A (Kdm6a) and lysine (K)-specific demethylase 6B (Kdm6b), from the intestinal epithelium. Mice were analyzed at acute and prolonged time points after Kdm6a/b ablation. Cellular proliferation and differentiation were measured using immunohistochemistry, while RNA sequencing and chromatin immunoprecipitation followed by sequencing for H3K27me3 were used to identify gene expression and chromatin changes after Kdm6a/b loss. Intestinal epithelial renewal was evaluated using a radiation-induced injury model, while Paneth cell homeostasis was measured via immunohistochemistry, immunoblot, and transmission electron microscopy. RESULTS: We did not detect any effect of Kdm6a/b ablation on intestinal cell proliferation or differentiation toward the secretory cell lineages. Acute and prolonged Kdm6a/b loss perturbed expression of gene signatures belonging to multiple cell lineages (adjusted P value < .05), and a set of 72 genes was identified as being down-regulated with an associated increase in H3K27me3 levels after Kdm6a/b ablation (false discovery rate, <0.05). After prolonged Kdm6a/b loss, dysregulation of the Paneth cell gene signature was associated with perturbed matrix metallopeptidase 7 localization (P < .0001) and expression. CONCLUSIONS: Although KDM6A/B does not regulate intestinal cell differentiation, both enzymes are required to support the full transcriptomic and epigenomic landscape of the intestinal epithelium and the expression of key Paneth cell genes.


Subjects
Epigenomics , Histones , Animals , Mice , Histones/metabolism , Lysine/metabolism , Histone Demethylases/genetics , Histone Demethylases/metabolism , Intestinal Mucosa/metabolism
8.
J Mol Med (Berl) ; 100(9): 1341-1353, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35986225

ABSTRACT

Idiopathic pulmonary fibrosis (IPF) is a chronic, progressive, fibrosing interstitial pneumonia of unknown etiology. The role of genetic risk factors has been the focus of numerous studies probing for associations of genetic variants with IPF. We aimed to determine whether single-nucleotide polymorphisms (SNPs) of four candidate genes are associated with IPF susceptibility and survival in a Portuguese population. A retrospective case-control study was performed with 64 IPF patients and 74 healthy controls. Ten single-nucleotide variants residing in the MUC5B, TOLLIP, SERPINB1, and PLAU genes were analyzed. Single- and multi-locus analyses were performed to investigate the predictive potential of specific variants in IPF susceptibility and survival. Multifactor dimensionality reduction (MDR) was employed to uncover predictive multi-locus interactions underlying IPF susceptibility. The MUC5B rs35705950 SNP was significantly associated with IPF: T allele carriers were significantly more frequent among IPF patients (75.0% vs 20.3%, P < 1.0 × 10^-6). Genotypic and allelic distributions of TOLLIP, PLAU, and SERPINB1 SNPs did not differ significantly between groups. However, the MUC5B-TOLLIP T-C-T-C haplotype, defined by the rs35705950-rs111521887-rs5743894-rs5743854 block, emerged as an independent protective factor in IPF survival (HR = 0.37, 95% CI 0.17-0.78, P = 0.009, after adjustment for FVC). No significant multi-locus interactions correlating with disease susceptibility were detected. MUC5B rs35705950 was linked to an increased risk for IPF, as reported for other populations, but not to disease survival. A haplotype incorporating SNPs of the MUC5B-TOLLIP locus at 11p15.5 seems to predict better survival and could prove useful for prognostic purposes and IPF patient stratification. KEY MESSAGES: The MUC5B rs35705950 minor allele is associated with IPF risk in the Portuguese. No predictive multi-locus interactions of IPF susceptibility were identified by MDR. 
A haplotype defined by MUC5B and TOLLIP SNPs is a protective factor in IPF survival. The haplotype may be used as a prognostic tool for IPF patient stratification.
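
The carrier frequencies reported above can be turned into an unadjusted odds ratio as a worked illustration. This is a minimal sketch, not a figure from the study: the carrier counts (48 of 64 patients, 15 of 74 controls) are back-calculated from the reported percentages and are therefore an assumption, and the function name `odds_ratio` is hypothetical.

```python
import math

# Assumption: counts back-calculated from the reported percentages
# (75.0% of 64 patients, 20.3% of 74 controls); the abstract does not
# quote the counts themselves.
case_carriers, case_total = 48, 64
control_carriers, control_total = 15, 74


def odds_ratio(a, n_cases, c, n_controls):
    """Unadjusted odds ratio with a Wald 95% confidence interval.

    a = carriers among cases, c = carriers among controls.
    """
    b = n_cases - a       # non-carriers among cases
    d = n_controls - c    # non-carriers among controls
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(estimate) - 1.96 * se)
    high = math.exp(math.log(estimate) + 1.96 * se)
    return estimate, low, high


estimate, ci_low, ci_high = odds_ratio(
    case_carriers, case_total, control_carriers, control_total
)
print(f"OR = {estimate:.1f} (95% CI {ci_low:.1f}-{ci_high:.1f})")
```

Under these assumed counts the point estimate is (48 × 59) / (16 × 15) = 11.8; the study itself may report a different, covariate-adjusted value.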


Subjects
Idiopathic Pulmonary Fibrosis , Serpins , Humans , Case-Control Studies , Genetic Predisposition to Disease , Idiopathic Pulmonary Fibrosis/genetics , Polymorphism, Single Nucleotide , Retrospective Studies , Serpins/genetics
9.
BioData Min ; 15(1): 15, 2022 Jul 26.
Article in English | MEDLINE | ID: mdl-35883154

ABSTRACT

OBJECTIVES: Ascertain and compare the performances of Automated Machine Learning (AutoML) tools on large, highly imbalanced healthcare datasets. MATERIALS AND METHODS: We generated a large dataset using historical de-identified administrative claims including demographic information and flags for disease codes in four different time windows prior to 2019. We then trained three AutoML tools on this dataset to predict six different disease outcomes in 2019 and evaluated model performances on several metrics. RESULTS: The AutoML tools showed improvement from the baseline random forest model but did not differ significantly from each other. All models recorded low area under the precision-recall curve and failed to predict true positives while keeping the true negative rate high. Model performance was not directly related to prevalence. We provide a specific use-case to illustrate how to select a threshold that gives the best balance between true and false positive rates, as this is an important consideration in medical applications. DISCUSSION: Healthcare datasets present several challenges for AutoML tools, including large sample size, high imbalance, and limitations in the available features. Improvements in scalability, combinations of imbalance-learning resampling and ensemble approaches, and curated feature selection are possible next steps to achieve better performance. CONCLUSION: Among the three explored, no AutoML tool consistently outperforms the rest in terms of predictive performance. The performances of the models in this study suggest that there may be room for improvement in handling medical claims data. Finally, selection of the optimal prediction threshold should be guided by the specific practical application.
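
The threshold-selection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes Youden's J statistic (true positive rate minus false positive rate) as the balance criterion, which the abstract does not specify; the function name `best_threshold` and the toy scores are hypothetical.

```python
def best_threshold(scores, labels):
    """Return (threshold, tpr, fpr) maximizing TPR - FPR (Youden's J).

    Assumption: Youden's J is the balance criterion; the abstract does
    not name the criterion the authors applied.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    best = None
    # Each distinct score is a candidate cut-off: predict positive if score >= t.
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        tpr = tp / positives
        fpr = fp / negatives
        j = tpr - fpr
        if best is None or j > best[0]:
            best = (j, t, tpr, fpr)
    _, t, tpr, fpr = best
    return t, tpr, fpr


# Hypothetical, imbalanced toy data: 2 positives among 8 samples.
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.80]
labels = [0, 0, 0, 0, 1, 0, 0, 1]
threshold, tpr, fpr = best_threshold(scores, labels)
print(threshold, tpr, fpr)
```

In a clinical setting the two error types rarely cost the same, so a weighted criterion may be more appropriate than the unweighted one sketched here.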

10.
J Clin Invest ; 132(11)2022 06 01.
Article in English | MEDLINE | ID: mdl-35642629

ABSTRACT

BACKGROUND: Multiple islet autoantibodies (AAbs) predict the development of type 1 diabetes (T1D) and hyperglycemia within 10 years. By contrast, T1D develops in only approximately 15% of individuals who are positive for single AAbs (generally against glutamic acid decarboxylase [GADA]); hence, the single GADA+ state may represent an early stage of T1D. METHODS: Here, we functionally, histologically, and molecularly phenotyped human islets from nondiabetic GADA+ and T1D donors. RESULTS: Similar to the few remaining β cells in the T1D islets, GADA+ donor islets demonstrated a preserved insulin secretory response. By contrast, α cell glucagon secretion was dysregulated in both GADA+ and T1D islets, with impaired glucose suppression of glucagon secretion. Single-cell RNA-Seq of GADA+ α cells revealed distinct abnormalities in glycolysis and oxidative phosphorylation pathways and a marked downregulation of cAMP-dependent protein kinase inhibitor β (PKIB), providing a molecular basis for the loss of glucose suppression and the increased effect of 3-isobutyl-1-methylxanthine (IBMX) observed in GADA+ donor islets. CONCLUSION: We found that α cell dysfunction was present during the early stages of islet autoimmunity at a time when β cell mass was still normal, raising important questions about the role of early α cell dysfunction in the progression of T1D. FUNDING: This work was supported by grants from the NIH (3UC4DK112217-01S1, U01DK123594-02, UC4DK112217, UC4DK112232, U01DK123716, and P30 DK019525) and the Vanderbilt Diabetes Research and Training Center (DK20593).


Subjects
Diabetes Mellitus, Type 1 , Glutamate Decarboxylase , Autoantibodies , Glucagon , Glucose , Humans
11.
Sleep ; 45(8)2022 08 11.
Article in English | MEDLINE | ID: mdl-35537191

ABSTRACT

We investigated the potential role of sleep-trait associated genetic loci in conferring a degree of their effect via pancreatic α- and β-cells, given that both sleep disturbances and metabolic disorders, including type 2 diabetes and obesity, involve polygenic contributions and complex interactions. We determined genetic commonalities between sleep and metabolic disorders, conducting linkage disequilibrium genetic correlation analyses with publicly available GWAS summary statistics. Then we investigated possible enrichment of sleep-trait associated SNPs in promoter-interacting open chromatin regions within α- and β-cells, intersecting public GWAS reports with our own ATAC-seq and high-resolution promoter-focused Capture C data generated from both sorted human α-cells and an established human beta-cell line (EndoC-βH1). Finally, we identified putative effector genes physically interacting with sleep-trait associated variants in α- and EndoC-βH1 cells by running variant-to-gene mapping, and established the pathways in which these genes are significantly involved. We observed that insomnia, short and long sleep, but not morningness, were significantly correlated with type 2 diabetes, obesity and other metabolic traits. Both the EndoC-βH1 and α-cells were enriched for insomnia loci (p = .01; p = .0076), short sleep loci (p = .017; p = .022) and morningness loci (p = 2.2 × 10^-7; p = .0016), while the α-cells were also enriched for long sleep loci (p = .034). Utilizing our promoter contact data, we identified 63 putative effector genes in EndoC-βH1 and 76 putative effector genes in α-cells, with these genes showing significant enrichment for organonitrogen and organophosphate biosynthesis, phosphatidylinositol and phosphorylation, intracellular transport and signaling, stress responses and cell differentiation. Our data suggest that a subset of sleep-related loci confer their effects via cells in pancreatic islets.


Subjects
Diabetes Mellitus, Type 2 , Islets of Langerhans , Sleep Initiation and Maintenance Disorders , Chromosome Mapping , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Humans , Islets of Langerhans/metabolism , Obesity/metabolism , Sleep , Sleep Initiation and Maintenance Disorders/metabolism
12.
Hum Genet ; 141(9): 1529-1544, 2022 Sep.
Article in English | MEDLINE | ID: mdl-34713318

ABSTRACT

The genetic analysis of complex traits has been dominated by parametric statistical methods due to their theoretical properties, ease of use, computational efficiency, and intuitive interpretation. However, there are likely to be patterns arising from complex genetic architectures which are more easily detected and modeled using machine learning methods. Unfortunately, selecting the right machine learning algorithm and tuning its hyperparameters can be daunting for experts and non-experts alike. The goal of automated machine learning (AutoML) is to let a computer algorithm identify the right algorithms and hyperparameters thus taking the guesswork out of the optimization process. We review the promises and challenges of AutoML for the genetic analysis of complex traits and give an overview of several approaches and some example applications to omics data. It is our hope that this review will motivate studies to develop and evaluate novel AutoML methods and software in the genetics and genomics space. The promise of AutoML is to enable anyone, regardless of training or expertise, to apply machine learning as part of their genetic analysis strategy.


Subjects
Machine Learning , Multifactorial Inheritance , Algorithms , Genomics/methods , Humans , Software
13.
IEEE/ACM Trans Comput Biol Bioinform ; 19(3): 1379-1386, 2022.
Article in English | MEDLINE | ID: mdl-34310318

ABSTRACT

Machine Learning (ML) approaches are increasingly being used in biomedical applications. Important challenges of ML include choosing the right algorithm and tuning the parameters for optimal performance. Automated ML (AutoML) methods, such as Tree-based Pipeline Optimization Tool (TPOT), have been developed to take some of the guesswork out of ML thus making this technology available to users from more diverse backgrounds. The goals of this study were to assess applicability of TPOT to genomics and to identify combinations of single nucleotide polymorphisms (SNPs) associated with coronary artery disease (CAD), with a focus on genes with high likelihood of being good CAD drug targets. We leveraged public functional genomic resources to group SNPs into biologically meaningful sets to be selected by TPOT. We applied this strategy to data from the U.K. Biobank, detecting a strikingly recurrent signal stemming from a group of 28 SNPs. Importance analysis of these SNPs uncovered functional relevance of the top SNPs to genes whose association with CAD is supported in the literature and other resources. Furthermore, we employed game-theory based metrics to study SNP contributions to individual-level TPOT predictions and discover distinct clusters of well-predicted CAD cases. The latter indicates a promising approach towards precision medicine.


Subjects
Coronary Artery Disease , Machine Learning , Algorithms , Coronary Artery Disease/genetics , Humans , Polymorphism, Single Nucleotide
15.
Front Cell Dev Biol ; 9: 648791, 2021.
Article in English | MEDLINE | ID: mdl-34017831

ABSTRACT

Newly differentiated pancreatic β cells lack proper insulin secretion profiles of mature functional β cells. The global gene expression differences between paired immature and mature β cells have been studied, but the dynamics of transcriptional events, correlating with temporal development of glucose-stimulated insulin secretion (GSIS), remain to be fully defined. This aspect is important to identify which genes and pathways are necessary for β-cell development or for maturation, as defective insulin secretion is linked with diseases such as diabetes. In this study, we assayed through RNA sequencing the global gene expression across six β-cell developmental stages in mice, spanning from β-cell progenitor to mature β cells. A computational pipeline then selected genes differentially expressed with respect to progenitors and clustered them into groups with distinct temporal patterns associated with biological functions and pathways. These patterns were finally correlated with experimental GSIS, calcium influx, and insulin granule formation data. Gene expression temporal profiling revealed the timing of important biological processes across β-cell maturation, such as the deregulation of β-cell developmental pathways and the activation of molecular machineries for vesicle biosynthesis and transport, signal transduction of transmembrane receptors, and glucose-induced Ca2+ influx, which were established over a week before β-cell maturation completes. In particular, β cells developed robust insulin secretion at high glucose several days after birth, coincident with the establishment of glucose-induced calcium influx. Yet the neonatal β cells displayed high basal insulin secretion, which decreased to the low levels found in mature β cells only a week later. 
Different genes associated with calcium-mediated processes, whose alterations are linked with insulin resistance and deregulation of glucose homeostasis, showed increased expression across β-cell stages, in accordance with the temporal acquisition of proper GSIS. Our temporal gene expression pattern analysis provided a comprehensive database of the underlying molecular components and biological mechanisms driving β-cell maturation at different temporal stages, which are fundamental for better control of the in vitro production of functional β cells from human embryonic stem/induced pluripotent cell for transplantation-based type 1 diabetes therapy.

16.
Development ; 148(6)2021 03 21.
Article in English | MEDLINE | ID: mdl-33653874

ABSTRACT

To gain a deeper understanding of pancreatic β-cell development, we used iterative weighted gene correlation network analysis to calculate a gene co-expression network (GCN) from 11 temporally and genetically defined murine cell populations. The GCN, which contained 91 distinct modules, was then used to gain three new biological insights. First, we found that the clustered protocadherin genes are differentially expressed during pancreas development. Pcdhγ genes are preferentially expressed in pancreatic endoderm, Pcdhβ genes in nascent islets, and Pcdhα genes in mature β-cells. Second, after extracting sub-networks of transcriptional regulators for each developmental stage, we identified 81 zinc finger protein (ZFP) genes that are preferentially expressed during endocrine specification and β-cell maturation. Third, we used the GCN to select three ZFPs for further analysis by CRISPR mutagenesis of mice. Zfp800 null mice exhibited early postnatal lethality, and at E18.5 their pancreata exhibited a reduced number of pancreatic endocrine cells, alterations in exocrine cell morphology, and marked changes in expression of genes involved in protein translation, hormone secretion and developmental pathways in the pancreas. Together, our results suggest that developmentally oriented GCNs have utility for gaining new insights into gene regulation during organogenesis.


Subjects
Cell Differentiation/genetics , Homeodomain Proteins/genetics , Organogenesis/genetics , Pancreas/growth & development , Animals , Cadherins/genetics , Cell Lineage/genetics , Gene Expression Regulation, Developmental/genetics , Insulin/metabolism , Islets of Langerhans/cytology , Islets of Langerhans/metabolism , Mice , Pancreas/metabolism
17.
BMC Bioinformatics ; 21(1): 430, 2020 Oct 01.
Article in English | MEDLINE | ID: mdl-32998684

ABSTRACT

BACKGROUND: A typical task in bioinformatics consists of identifying which features are associated with a target outcome of interest and building a predictive model. Automated machine learning (AutoML) systems such as the Tree-based Pipeline Optimization Tool (TPOT) constitute an appealing approach to this end. However, in biomedical data, there are often baseline characteristics of the subjects in a study or batch effects that need to be adjusted for in order to better isolate the effects of the features of interest on the target. Thus, the ability to perform covariate adjustments becomes particularly important for applications of AutoML to biomedical big data analysis. RESULTS: We developed an approach to adjust for covariates affecting features and/or target in TPOT. Our approach is based on regressing out the covariates in a manner that avoids 'leakage' during the cross-validation training procedure. We describe applications of this approach to toxicogenomics and schizophrenia gene expression data sets. The TPOT extensions discussed in this work are available at https://github.com/EpistasisLab/tpot/tree/v0.11.1-resAdj . CONCLUSIONS: In this work, we address an important need in the context of AutoML, which is particularly crucial for applications to bioinformatics and medical informatics, namely covariate adjustments. To this end we present a substantial extension of TPOT, a genetic programming based AutoML approach. We show the utility of this extension by applications to large toxicogenomics and differential gene expression data. The method is generally applicable in many other scenarios from the biomedical field.
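
The idea of regressing out covariates without leakage can be sketched in a few lines. This is an illustration under stated assumptions, not the TPOT extension itself: it handles a single covariate with ordinary least squares, and the names `fit_line` and `residualize` are hypothetical. The key point is that the adjustment model is fitted on the training fold only and then applied unchanged to the held-out fold.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x (single covariate)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        return mean_y, 0.0  # constant covariate: only remove the mean
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def residualize(feature, covariate, train_idx, test_idx):
    """Remove the covariate's linear effect from a feature without leakage.

    The regression is fitted on the training fold only, so held-out
    samples never influence the adjustment applied to them.
    """
    a, b = fit_line(
        [covariate[i] for i in train_idx],
        [feature[i] for i in train_idx],
    )

    def adjusted(i):
        return feature[i] - (a + b * covariate[i])

    return [adjusted(i) for i in train_idx], [adjusted(i) for i in test_idx]


# Hypothetical data: the feature is driven by the covariate plus a small signal.
covariate = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
signal = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
feature = [2.0 * c + s for c, s in zip(covariate, signal)]

train_res, test_res = residualize(feature, covariate, [0, 1, 2, 3], [4, 5])
print(train_res, test_res)
```

In a cross-validation loop the same two-step pattern (fit on the training fold, apply to both folds) would be repeated for every split.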


Subjects
Big Data , Data Analysis , Machine Learning , Algorithms , Automation , Humans
18.
Nat Commun ; 11(1): 3294, 2020 07 03.
Article in English | MEDLINE | ID: mdl-32620744

ABSTRACT

Systemic lupus erythematosus (SLE) is mediated by autoreactive antibodies that damage multiple tissues. Genome-wide association studies (GWAS) link >60 loci with SLE risk, but the causal variants and effector genes are largely unknown. We generated high-resolution spatial maps of SLE variant accessibility and gene connectivity in human follicular helper T cells (TFH), a cell type required for anti-nuclear antibodies characteristic of SLE. Of the ~400 potential regulatory variants identified, 90% exhibit spatial proximity to genes distant in the 1D genome sequence, including variants that loop to regulate the canonical TFH genes BCL6 and CXCR5 as confirmed by genome editing. SLE 'variant-to-gene' maps also implicate genes with no known role in TFH/SLE disease biology, including the kinases HIPK1 and MINK1. Targeting these kinases in TFH inhibits production of IL-21, a cytokine crucial for class-switched B cell antibodies. These studies offer mechanistic insight into the SLE-associated regulatory architecture of the human genome.


Subjects
Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/methods , Lupus Erythematosus, Systemic/genetics , Polymorphism, Single Nucleotide , Promoter Regions, Genetic/genetics , T-Lymphocytes, Helper-Inducer/metabolism , Autoantibodies/immunology , Autoantibodies/metabolism , Cells, Cultured , Chromosome Mapping/methods , Gene Expression Profiling/methods , Humans , Jurkat Cells , Lupus Erythematosus, Systemic/immunology , Protein Serine-Threonine Kinases/genetics , Proto-Oncogene Proteins c-bcl-6/genetics , RNA Interference , Receptors, CXCR5/genetics , T-Lymphocytes, Helper-Inducer/immunology
19.
Genes Dev ; 34(15-16): 1039-1050, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32561546

ABSTRACT

The FoxA transcription factors are critical for liver development through their pioneering activity, which initiates a highly complex regulatory network thought to become progressively resistant to the loss of any individual hepatic transcription factor via mutual redundancy. To investigate the dispensability of FoxA factors for maintaining this regulatory network, we ablated all FoxA genes in the adult mouse liver. Remarkably, loss of FoxA caused rapid and massive reduction in the expression of critical liver genes. Activity of these genes was reduced back to the low levels of the fetal prehepatic endoderm stage, leading to necrosis and lethality within days. Mechanistically, we found FoxA proteins to be required for maintaining enhancer activity, chromatin accessibility, nucleosome positioning, and binding of HNF4α. Thus, the FoxA factors act continuously, guarding hepatic enhancer activity throughout adult life.


Subjects
Forkhead Transcription Factors/physiology , Gene Regulatory Networks , Liver/metabolism , Animals , Binding Sites , Chromatin/metabolism , Enhancer Elements, Genetic , Forkhead Transcription Factors/genetics , Forkhead Transcription Factors/metabolism , Gene Expression Regulation , Gene Knockdown Techniques , Hepatocyte Nuclear Factor 3-alpha/genetics , Hepatocyte Nuclear Factor 3-beta/genetics , Hepatocyte Nuclear Factor 3-gamma/genetics , Hepatocyte Nuclear Factor 4/metabolism , Liver/pathology , Liver Failure/etiology , Liver Failure/pathology , Male , Mice , Nucleosomes
20.
Artif Life ; 26(1): 23-37, 2020.
Article in English | MEDLINE | ID: mdl-32027528

ABSTRACT

Susceptibility to common human diseases such as cancer is influenced by many genetic and environmental factors that work together in a complex manner. The state of the art is to perform a genome-wide association study (GWAS) that measures millions of single-nucleotide polymorphisms (SNPs) throughout the genome followed by a one-SNP-at-a-time statistical analysis to detect univariate associations. This approach has identified thousands of genetic risk factors for hundreds of diseases. However, the genetic risk factors detected have very small effect sizes and collectively explain very little of the overall heritability of the disease. Nonetheless, it is assumed that the genetic component of risk is due to many independent risk factors that contribute additively. The fact that many genetic risk factors with small effects can be detected is taken as evidence to support this notion. It is our working hypothesis that the genetic architecture of common diseases is partly driven by non-additive interactions. To test this hypothesis, we developed a heuristic simulation-based method for conducting experiments about the complexity of genetic architecture. We show that a genetic architecture driven by complex interactions is highly consistent with the magnitude and distribution of univariate effects seen in real data. We compare our results with measures of univariate and interaction effects from two large-scale GWASs of sporadic breast cancer and find evidence to support our hypothesis that is consistent with the results of our computational experiment.
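
A toy example helps show why a one-SNP-at-a-time analysis can miss interaction-driven risk. The sketch below is not the authors' simulation method: it assumes a hypothetical two-locus penetrance table in which risk is elevated only when exactly one locus carries the variant, with independent loci and a carrier frequency of 0.5 at each. Under those assumptions each locus has no univariate effect even though the interaction is strong.

```python
# Hypothetical penetrance table: disease risk by (locus A, locus B) carrier
# status. Risk is high only when exactly one locus carries the variant.
penetrance = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.1}
carrier_freq = 0.5  # assumed carrier frequency at each (independent) locus


def marginal_risk(locus, genotype):
    """Average disease risk given the genotype at one locus only."""
    total = 0.0
    for other in (0, 1):
        p_other = carrier_freq if other == 1 else 1 - carrier_freq
        pair = (genotype, other) if locus == 0 else (other, genotype)
        total += p_other * penetrance[pair]
    return total


# Univariate effect: risk difference between carriers and non-carriers.
effect_a = marginal_risk(0, 1) - marginal_risk(0, 0)
effect_b = marginal_risk(1, 1) - marginal_risk(1, 0)

# Interaction contrast: departure from additivity on the risk scale.
interaction = (
    penetrance[(1, 1)] - penetrance[(1, 0)]
    - penetrance[(0, 1)] + penetrance[(0, 0)]
)
print(effect_a, effect_b, interaction)
```

This is a deliberately extreme case; real architectures would sit between purely additive and purely interactive, which is the range the abstract's simulation experiments are meant to explore.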


Subjects
Computational Biology , Disease/genetics , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Computer Simulation , Humans