ABSTRACT
Despite calls to improve reproducibility in research, achieving this goal remains elusive even within computational fields. Currently, >50% of R packages are distributed exclusively through GitHub. While the trend towards sharing open-source software has been revolutionary, GitHub does not have any default built-in checks for minimal coding standards or software usability. This makes it difficult to assess the current quality of R packages, or to use them consistently over time and across platforms. While GitHub-native solutions are technically possible, they require considerable time and expertise for each developer to write, implement, and maintain. To address this, we developed rworkflows, a suite of tools for robust continuous integration and deployment (https://github.com/neurogenomics/rworkflows). rworkflows can be implemented by developers of all skill levels using a one-time R function call which has both sensible defaults and extensive options for customisation. Once implemented, any updates to the GitHub repository automatically trigger parallel workflows that install all software dependencies, run code checks, generate a dedicated documentation website, and deploy a publicly accessible containerised environment. By making the rworkflows suite free, automated, and simple to use, we aim to promote widespread adoption of reproducible practices across a continually growing R community.
ABSTRACT
Mathys et al. conducted the first single-nucleus RNA-seq (snRNA-seq) study of Alzheimer's disease (AD) (Mathys et al., 2019). With bulk RNA-seq, changes in gene expression across cell types can be lost, potentially masking the differentially expressed genes (DEGs) across different cell types. Through the use of single-cell techniques, the authors benefitted from increased resolution with the potential to uncover cell type-specific DEGs in AD for the first time. However, there were limitations in both their data processing and quality control and their differential expression analysis. Here, we correct these issues and use best-practice approaches to snRNA-seq differential expression, resulting in 549 times fewer DEGs at a false discovery rate of 0.05. Thus, this study highlights the impact of quality control and differential analysis methods on the discovery of disease-associated genes and aims to refocus the AD research field away from spuriously identified genes.
Subjects
Alzheimer Disease , Humans , Alzheimer Disease/genetics , Single-Cell Gene Expression Analysis , Quality Control , RNA, Small Nuclear , RNA-Seq
ABSTRACT
Genetics and omics studies of Alzheimer's disease and other dementia subtypes enhance our understanding of underlying mechanisms and pathways that can be targeted. We identified key remaining challenges: First, can we enhance genetic studies to address missing heritability? Can we identify reproducible omics signatures that differentiate between dementia subtypes? Can high-dimensional omics data identify improved biomarkers? How can genetics inform our understanding of causal status of dementia risk factors? And which biological processes are altered by dementia-related genetic variation? Artificial intelligence (AI) and machine learning approaches give us powerful new tools in helping us to tackle these challenges, and we review possible solutions and examples of best practice. However, their limitations also need to be considered, as well as the need for coordinated multidisciplinary research and diverse deeply phenotyped cohorts. Ultimately AI approaches improve our ability to interrogate genetics and omics data for precision dementia medicine. HIGHLIGHTS: We have identified five key challenges in dementia genetics and omics studies. AI can enable detection of undiscovered patterns in dementia genetics and omics data. Enhanced and more diverse genetics and omics datasets are still needed. Multidisciplinary collaborative efforts using AI can boost dementia research.
Subjects
Alzheimer Disease , Artificial Intelligence , Humans , Machine Learning , Alzheimer Disease/genetics , Phenotype , Precision Medicine
ABSTRACT
Summary: EpiCompare combines a variety of downstream analysis tools to compare, quality control and benchmark different epigenomic datasets. The package requires minimal input from users, can be run with just one line of code and provides all results of the analysis in a single interactive HTML report. EpiCompare thus enables downstream analysis of multiple epigenomic datasets in a simple, effective and user-friendly manner. Availability and implementation: EpiCompare is available on Bioconductor (≥ v3.15): https://bioconductor.org/packages/release/bioc/html/EpiCompare.html; all source code is publicly available via GitHub: https://github.com/neurogenomics/EpiCompare; documentation website https://neurogenomics.github.io/EpiCompare; and EpiCompare DockerHub repository: https://hub.docker.com/repository/docker/neurogenomicslab/epicompare.
ABSTRACT
Severe psychological trauma triggers genetic, biochemical and morphological changes in amygdala neurons, which underpin the development of stress-induced behavioural abnormalities, such as high levels of anxiety. miRNAs are small, non-coding RNA fragments that orchestrate complex neuronal responses by simultaneous transcriptional/translational repression of multiple target genes. Here we show that miR-483-5p in the amygdala of male mice counterbalances the structural, functional and behavioural consequences of stress to promote a reduction in anxiety-like behaviour. Upon stress, miR-483-5p is upregulated in the synaptic compartment of amygdala neurons and directly represses three stress-associated genes: Pgap2, Gpx3 and Macf1. Upregulation of miR-483-5p leads to selective contraction of distal parts of the dendritic arbour and conversion of immature filopodia into mature, mushroom-like dendritic spines. Consistent with its role in reducing the stress response, upregulation of miR-483-5p in the basolateral amygdala produces a reduction in anxiety-like behaviour. Stress-induced neuromorphological and behavioural effects of miR-483-5p can be recapitulated by shRNA mediated suppression of Pgap2 and prevented by simultaneous overexpression of miR-483-5p-resistant Pgap2. Our results demonstrate that miR-483-5p is sufficient to confer a reduction in anxiety-like behaviour and point to miR-483-5p-mediated repression of Pgap2 as a critical cellular event offsetting the functional and behavioural consequences of psychological stress.
Subjects
Basolateral Nuclear Complex , MicroRNAs , Animals , Male , Mice , Amygdala/metabolism , Basolateral Nuclear Complex/metabolism , MicroRNAs/genetics , MicroRNAs/metabolism , Neurons/metabolism , Synapses/metabolism
ABSTRACT
Progress in dementia research has been limited, with substantial gaps in our knowledge of targets for prevention, mechanisms for disease progression, and disease-modifying treatments. The growing availability of multimodal data sets opens possibilities for the application of machine learning and artificial intelligence (AI) to help answer key questions in the field. We provide an overview of the state of the science, highlighting current challenges and opportunities for utilisation of AI approaches to move the field forward in the areas of genetics, experimental medicine, drug discovery and trials optimisation, imaging, and prevention. Machine learning methods can enhance results of genetic studies, help determine biological effects and facilitate the identification of drug targets based on genetic and transcriptomic information. The use of unsupervised learning for understanding disease mechanisms for drug discovery is promising, while analysis of multimodal data sets to characterise and quantify disease severity and subtype is also beginning to contribute to optimisation of clinical trial recruitment. Data-driven experimental medicine is needed to analyse data across modalities and develop novel algorithms to translate insights from animal models to human disease biology. AI methods in neuroimaging outperform traditional approaches for diagnostic classification, and although challenges around validation and translation remain, there is optimism for their meaningful integration to clinical practice in the near future. AI-based models can also clarify our understanding of the causality and commonality of dementia risk factors, informing and improving risk prediction models along with the development of preventative interventions. The complexity and heterogeneity of dementia requires an alternative approach beyond traditional design and analytical approaches.
Although not yet widely used in dementia research, machine learning and AI have the potential to unlock current challenges and advance precision dementia medicine.
ABSTRACT
Leigh syndrome is a rare, inherited, complex neurometabolic disorder with genetic and clinical heterogeneity. Features present in affected patients range from classical stepwise developmental regression to ataxia, seizures, tremor, and occasionally psychiatric manifestations. Currently, more than 100 monogenic causes of Leigh syndrome have been identified, yet the pathophysiology remains unknown. Here, we sought to determine the cellular specificity within the brain of all genes currently associated with Leigh syndrome. Further, we aimed to investigate potential genetic commonalities between Leigh syndrome and other disorders with overlapping clinical features. Enrichment of our target genes within the brain was evaluated with co-expression (CoExp) network analyses constructed using existing UK Brain Expression Consortium data. To determine the cellular specificity of the Leigh associated genes, we employed expression weighted cell type enrichment (EWCE) analysis of single-cell RNA-Seq data. Finally, CoExp network modules demonstrating enrichment of Leigh syndrome associated genes were then utilised for synaptic gene ontology analysis and heritability analysis. CoExp network analyses revealed that Leigh syndrome associated genes exhibit the highest levels of expression in brain regions most affected on MRI in affected patients. EWCE revealed significant enrichment of target genes in hippocampal and somatosensory pyramidal neurons and interneurons of the brain. Analysis of CoExp modules enriched with our target genes revealed preferential association with pre-synaptic structures. Heritability studies suggested some common enrichment between Leigh syndrome and Parkinson disease and epilepsy. Our findings suggest a primary mitochondrial dysfunction as the underlying basis of Leigh syndrome, with associated genes primarily expressed in neuronal cells.
Subjects
Leigh Disease , Humans , Leigh Disease/genetics , Transcriptome , Mutation , Brain/metabolism , Magnetic Resonance Imaging
ABSTRACT
The amount of any given protein in the brain is determined by the rates of its synthesis and destruction, which are regulated by different cellular mechanisms. Here, we combine metabolic labeling in live mice with global proteomic profiling to simultaneously quantify both the flux and amount of proteins in mouse models of neurodegeneration. In multiple models, protein turnover increases were associated with increasing pathology. This method distinguishes changes in protein expression mediated by synthesis from those mediated by degradation. In the AppNL-F knockin mouse model of Alzheimer's disease, increased turnover resulted from imbalances in both synthesis and degradation, converging on proteins associated with synaptic vesicle recycling (Dnm1, Cltc, Rims1) and mitochondria (Fis1, Ndufv1). In contrast to disease models, aging in wild-type mice caused a widespread decrease in protein recycling associated with a decrease in autophagic flux. Overall, this simple multidimensional approach enables a comprehensive mapping of proteome dynamics and identifies affected proteins in mouse models of disease and other live animal test settings.
Subjects
Alzheimer Disease , Proteome , Aging , Alzheimer Disease/metabolism , Animals , Brain/metabolism , Disease Models, Animal , Mammals/metabolism , Mice , Mice, Transgenic , Proteome/metabolism , Proteomics/methods
ABSTRACT
MOTIVATION: Genome-wide association studies (GWAS) summary statistics have popularized and accelerated genetic research. However, a lack of standardization of the file formats used has proven problematic when running secondary analysis tools or performing meta-analysis studies. RESULTS: To address this issue, we have developed MungeSumstats, a Bioconductor R package for the standardization and quality control of GWAS summary statistics. MungeSumstats can handle the most common summary statistic formats, including variant call format (VCF), producing a reformatted, standardized, tabular summary statistic file, VCF or R native data object. AVAILABILITY AND IMPLEMENTATION: MungeSumstats is available on Bioconductor (v 3.13) and can also be found on GitHub at: https://neurogenomics.github.io/MungeSumstats. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Genome-Wide Association Study , Software , Quality Control , Reference Standards
ABSTRACT
Apart from well-defined factors in neuronal cells [1], only a few reports consider that the variability of sporadic amyotrophic lateral sclerosis (ALS) progression can depend on less-defined contributions from glia [2,3] and blood vessels [4]. In this study we use an expression-weighted cell-type enrichment method to infer cell activity in spinal cord samples from patients with sporadic ALS and mouse models of this disease. Here we report that patients with sporadic ALS present cell activity patterns consistent with two mouse models in which enrichments of vascular cell genes preceded microglial response. Notably, during the presymptomatic stage, perivascular fibroblast cells showed the strongest gene enrichments, and their marker proteins SPP1 and COL6A1 accumulated in enlarged perivascular spaces in patients with sporadic ALS. Moreover, in plasma of 574 patients with ALS from four independent cohorts, increased levels of SPP1 at disease diagnosis repeatedly predicted shorter survival with stronger effect than the established risk factors of bulbar onset or neurofilament levels in cerebrospinal fluid. We propose that the activity of the recently discovered perivascular fibroblast can predict survival of patients with ALS and provide a new conceptual framework to re-evaluate definitions of ALS etiology.
Subjects
Amyotrophic Lateral Sclerosis/pathology , Blood Vessels/pathology , Fibroblasts/pathology , Amyotrophic Lateral Sclerosis/blood , Amyotrophic Lateral Sclerosis/genetics , Amyotrophic Lateral Sclerosis/physiopathology , Animals , Biomarkers/metabolism , Collagen Type VI/genetics , Collagen Type VI/metabolism , DNA-Binding Proteins/metabolism , Disease Progression , Genetic Markers , Humans , Mice, Transgenic , Osteopontin/blood , Prognosis , RNA, Messenger/genetics , RNA, Messenger/metabolism , Spinal Cord/pathology , Spinal Cord/ultrastructure , Superoxide Dismutase/genetics , Transcription, Genetic , Vascular Remodeling
ABSTRACT
Substantial genetic liability is shared across psychiatric disorders but less is known about risk variants that are specific to a given disorder. We used multi-trait conditional and joint analysis (mtCOJO) to adjust GWAS summary statistics of one disorder for the effects of genetically correlated traits to identify putative disorder-specific SNP associations. We applied mtCOJO to summary statistics for five psychiatric disorders from the Psychiatric Genomics Consortium: schizophrenia (SCZ), bipolar disorder (BIP), major depression (MD), attention-deficit hyperactivity disorder (ADHD) and autism (AUT). Most genome-wide significant variants for these disorders had evidence of pleiotropy (i.e., impact on multiple psychiatric disorders) and hence have reduced mtCOJO conditional effect sizes. However, subsets of genome-wide significant variants had larger conditional effect sizes consistent with disorder-specific effects: 15 of 130 genome-wide significant variants for schizophrenia, 5 of 40 for major depression, 3 of 11 for ADHD and 1 of 2 for autism. We show that decreased expression of VPS29 in the brain may increase risk of SCZ only and increased expression of CSE1L is associated with SCZ and MD, but not with BIP. Likewise, decreased expression of PCDHA7 in the brain is linked to increased risk of MD but decreased risk of SCZ and BIP.
Subjects
Attention Deficit Disorder with Hyperactivity , Bipolar Disorder , Schizophrenia , Attention Deficit Disorder with Hyperactivity/genetics , Bipolar Disorder/genetics , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Humans , Polymorphism, Single Nucleotide/genetics , Schizophrenia/genetics
ABSTRACT
Single-nucleus RNA sequencing (snRNA-seq) is used as an alternative to single-cell RNA-seq, as it allows transcriptomic profiling of frozen tissue. However, it is unclear whether snRNA-seq is able to detect cellular state in human tissue. Indeed, snRNA-seq analyses of human brain samples have failed to detect a consistent microglial activation signature in Alzheimer's disease. Our comparison of microglia from single cells and single nuclei of four human subjects reveals that, although most genes show similar relative abundances in cells and nuclei, a small population of genes (~1%) is depleted in nuclei compared to whole cells. This population is enriched for genes previously implicated in microglial activation, including APOE, CST3, SPP1, and CD74, comprising 18% of previously identified microglial-disease-associated genes. Given the low sensitivity of snRNA-seq to detect many activation genes, we conclude that snRNA-seq is not suited for detecting cellular activation in microglia in human disease.
Subjects
Gene Expression Profiling/methods , Microglia/physiology , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Humans
ABSTRACT
Genome-wide association studies have discovered hundreds of loci associated with complex brain disorders, but it remains unclear in which cell types these loci are active. Here we integrate genome-wide association study results with single-cell transcriptomic data from the entire mouse nervous system to systematically identify cell types underlying brain complex traits. We show that psychiatric disorders are predominantly associated with projecting excitatory and inhibitory neurons. Neurological diseases were associated with different cell types, which is consistent with other lines of evidence. Notably, Parkinson's disease was genetically associated not only with cholinergic and monoaminergic neurons (which include dopaminergic neurons) but also with enteric neurons and oligodendrocytes. Using post-mortem brain transcriptomic data, we confirmed alterations in these cells, even at the earliest stages of disease progression. Our study provides an important framework for understanding the cellular basis of complex brain maladies, and reveals an unexpected role of oligodendrocytes in Parkinson's disease.
Subjects
Brain/pathology , Parkinson Disease/etiology , Parkinson Disease/genetics , Animals , Genome-Wide Association Study/methods , Humans , Mice , Neurons/pathology , Parkinson Disease/pathology , Transcriptome/genetics
ABSTRACT
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
ABSTRACT
Understanding the function of a tissue requires knowing the spatial organization of its constituent cell types. In the cerebral cortex, single-cell RNA sequencing (scRNA-seq) has revealed the genome-wide expression patterns that define its many, closely related neuronal types, but cannot reveal their spatial arrangement. Here we introduce probabilistic cell typing by in situ sequencing (pciSeq), an approach that leverages previous scRNA-seq classification to identify cell types using multiplexed in situ RNA detection. We applied this method by mapping the inhibitory neurons of mouse hippocampal area CA1, for which ground truth is available from extensive previous work identifying their laminar organization. Our method identified these neuronal classes in a spatial arrangement matching ground truth, and further identified multiple classes of isocortical pyramidal cell in a pattern matching their known organization. This method will allow identifying the spatial organization of closely related cell types across the brain and other tissues.
Subjects
CA1 Region, Hippocampal/cytology , Gene Expression Profiling/methods , Neocortex/cytology , Neurons/cytology , Pyramidal Cells/cytology , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Algorithms , Animals , CA1 Region, Hippocampal/metabolism , Male , Mice , Models, Statistical , Neocortex/metabolism , Neurons/metabolism , Pyramidal Cells/metabolism
ABSTRACT
Socioeconomic position (SEP) is a multi-dimensional construct reflecting (and influencing) multiple socio-cultural, physical, and environmental factors. In a sample of 286,301 participants from UK Biobank, we identify 30 (29 previously unreported) independent-loci associated with income. Using a method to meta-analyze data from genetically-correlated traits, we identify an additional 120 income-associated loci. These loci show clear evidence of functionality, with transcriptional differences identified across multiple cortical tissues, and links to GABAergic and serotonergic neurotransmission. By combining our genome wide association study on income with data from eQTL studies and chromatin interactions, 24 genes are prioritized for follow up, 18 of which were previously associated with intelligence. We identify intelligence as one of the likely causal, partly-heritable phenotypes that might bridge the gap between molecular genetic inheritance and phenotypic consequence in terms of income differences. These results indicate that, in modern era Great Britain, genetic effects contribute towards some of the observed socioeconomic inequalities.
Subjects
Genome-Wide Association Study , Income/statistics & numerical data , Intelligence/genetics , Quantitative Trait Loci , Social Class , Adult , Aged , Female , Genotype , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide , United Kingdom
ABSTRACT
Insomnia is the second most prevalent mental disorder, with no sufficient treatment available. Despite substantial heritability, insight into the associated genes and neurobiological pathways remains limited. Here, we use a large genetic association sample (n = 1,331,010) to detect novel loci and gain insight into the pathways, tissue and cell types involved in insomnia complaints. We identify 202 loci implicating 956 genes through positional, expression quantitative trait loci, and chromatin mapping. The meta-analysis explained 2.6% of the variance. We show gene set enrichments for the axonal part of neurons, cortical and subcortical tissues, and specific cell types, including striatal, hypothalamic, and claustrum neurons. We found considerable genetic correlations with psychiatric traits and sleep duration, and modest correlations with other sleep-related traits. Mendelian randomization identified the causal effects of insomnia on depression, diabetes, and cardiovascular disease, and the protective effects of educational attainment and intracranial volume. Our findings highlight key brain areas and cell types implicated in insomnia, and provide new treatment targets.
Subjects
Genetic Predisposition to Disease/genetics , Quantitative Trait Loci/genetics , Sleep Initiation and Maintenance Disorders/genetics , Chromatin/genetics , Female , Genome-Wide Association Study/methods , Humans , Male , Middle Aged , Phenotype , Polymorphism, Single Nucleotide/genetics , Sleep/genetics
ABSTRACT
Alzheimer's disease (AD) is highly heritable and recent studies have identified over 20 disease-associated genomic loci. Yet these only explain a small proportion of the genetic variance, indicating that undiscovered loci remain. Here, we performed a large genome-wide association study of clinically diagnosed AD and AD-by-proxy (71,880 cases, 383,378 controls). AD-by-proxy, based on parental diagnoses, showed strong genetic correlation with AD (rg = 0.81). Meta-analysis identified 29 risk loci, implicating 215 potential causative genes. Associated genes are strongly expressed in immune-related tissues and cell types (spleen, liver, and microglia). Gene-set analyses indicate biological mechanisms involved in lipid-related processes and degradation of amyloid precursor proteins. We show strong genetic correlations with multiple health-related outcomes, and Mendelian randomization results suggest a protective effect of cognitive ability on AD risk. These results are a step forward in identifying the genetic factors that contribute to AD risk and add novel insights into the neurobiology of AD.