Results 1 - 20 of 41
1.
Bioinformatics ; 39(11), 2023 11 01.
Article in English | MEDLINE | ID: mdl-37947320

ABSTRACT

SUMMARY: Preparing functional genomic (FG) data with diverse assay types and file formats for integration into analysis workflows that interpret genome-wide association and other studies is a significant and time-consuming challenge. Here we introduce hipFG (Harmonization and Integration Pipeline for Functional Genomics), an automatically customized pipeline for efficient and scalable normalization of heterogeneous FG data collections into standardized, indexed, rapidly searchable analysis-ready datasets while accounting for FG datatypes (e.g. chromatin interactions, genomic intervals, quantitative trait loci). AVAILABILITY AND IMPLEMENTATION: hipFG is freely available at https://bitbucket.org/wanglab-upenn/hipFG. A Docker container is available at https://hub.docker.com/r/wanglab/hipfg.


Subject(s)
Genome-Wide Association Study , Software , Genomics , Chromatin , Quantitative Trait Loci
2.
Alzheimers Dement ; 20(2): 1123-1136, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37881831

ABSTRACT

INTRODUCTION: The National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site Alzheimer's Genomics Database (GenomicsDB) is a public knowledge base of Alzheimer's disease (AD) genetic datasets and genomic annotations. METHODS: GenomicsDB uses a custom systems architecture to adopt and enforce rigorous standards that facilitate harmonization of AD-relevant genome-wide association study summary statistics datasets with functional annotations, including over 230 million annotated variants from the AD Sequencing Project. RESULTS: GenomicsDB generates interactive reports compiled from the harmonized datasets and annotations. These reports contextualize AD-risk associations in a broader functional genomic setting and summarize them in the context of functionally annotated genes and variants. DISCUSSION: Created to make AD-genetics knowledge more accessible to AD researchers, the GenomicsDB is designed to guide users unfamiliar with genetic data in not only exploring but also interpreting this ever-growing volume of data. Scalable and interoperable with other genomics resources using data technology standards, the GenomicsDB can serve as a central hub for research and data analysis on AD and related dementias. HIGHLIGHTS: The National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (NIAGADS) offers to the public a unique, disease-centric collection of AD-relevant GWAS summary statistics datasets. Interpreting these data is challenging and requires significant bioinformatics expertise to standardize datasets and harmonize them with functional annotations on genome-wide scales. The NIAGADS Alzheimer's GenomicsDB helps overcome these challenges by providing a user-friendly public knowledge base for AD-relevant genetics that shares harmonized, annotated summary statistics datasets from the NIAGADS repository in an interpretable, easily searchable format.


Subject(s)
Alzheimer Disease , United States , Humans , Alzheimer Disease/genetics , Genome-Wide Association Study , National Institute on Aging (U.S.) , Genomics , Databases, Factual , Genetic Predisposition to Disease/genetics
3.
Bioinformatics ; 36(12): 3879-3881, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32330239

ABSTRACT

SUMMARY: We report Spark-based INFERence of the molecular mechanisms of NOn-coding genetic variants (SparkINFERNO), a scalable bioinformatics pipeline characterizing non-coding genome-wide association study (GWAS) association findings. SparkINFERNO prioritizes causal variants underlying GWAS association signals and reports relevant regulatory elements, tissue contexts and plausible target genes they affect. To achieve this, the SparkINFERNO algorithm integrates GWAS summary statistics with a large-scale collection of functional genomics datasets spanning enhancer activity, transcription factor binding, expression quantitative trait loci and other functional datasets across more than 400 tissues and cell types. Scalability is achieved by an underlying API implemented using Apache Spark and Giggle-based genomic indexing. We evaluated SparkINFERNO on large GWASs and show that SparkINFERNO is more than 60 times more efficient and scales with data size and amount of computational resources. AVAILABILITY AND IMPLEMENTATION: SparkINFERNO runs on clusters or a single server with Apache Spark environment, and is available at https://bitbucket.org/wanglab-upenn/SparkINFERNO or https://hub.docker.com/r/wanglab/spark-inferno. CONTACT: lswang@pennmedicine.upenn.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
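
The core operation described above, matching GWAS variants against indexed functional annotations, can be illustrated with a small sketch. This is not SparkINFERNO's API and uses neither Apache Spark nor Giggle; it is a plain-Python illustration that assumes half-open intervals and hypothetical annotation labels, with a sorted per-chromosome index standing in for a real genomic index.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(intervals):
    """Index (chrom, start, end, label) records by chromosome, sorted by start.

    Intervals are assumed half-open: [start, end).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, label in intervals:
        by_chrom[chrom].append((start, end, label))
    index = {}
    for chrom, records in by_chrom.items():
        records.sort()
        # Keep the sorted start coordinates separately for binary search.
        index[chrom] = ([r[0] for r in records], records)
    return index


def overlapping_labels(index, chrom, pos):
    """Return labels of all indexed intervals that contain position pos."""
    starts, records = index.get(chrom, ([], []))
    # Only intervals whose start is <= pos can contain pos.
    upper = bisect_right(starts, pos)
    return [label for start, end, label in records[:upper] if pos < end]


# Hypothetical annotations, for illustration only.
annotations = [
    ("chr1", 100, 200, "enhancer_A"),
    ("chr1", 150, 300, "eQTL_B"),
    ("chr2", 50, 60, "TFBS_C"),
]
index = build_index(annotations)
print(overlapping_labels(index, "chr1", 160))
```

The binary search only bounds the candidate set from above; a production index would also prune intervals that end before the query position.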


Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci , Algorithms , Genomics , Software
4.
Bioinformatics ; 35(6): 1033-1039, 2019 03 15.
Article in English | MEDLINE | ID: mdl-30668832

ABSTRACT

MOTIVATION: Small non-coding RNAs (sncRNAs, <100 nts) are highly abundant RNAs that regulate diverse and often tissue-specific cellular processes by associating with transcription factor complexes or binding to mRNAs. While thousands of sncRNA genes exist in the human genome, no single resource provides searchable, unified annotation, expression and processing information for full sncRNA transcripts and mature RNA products derived from these larger RNAs. RESULTS: Our goal is to establish a complete catalog of annotation, expression, processing, conservation, tissue-specificity and other biological features for all human sncRNA genes and mature products derived from all major RNA classes. The DASHR (Database of small human non-coding RNAs) v2.0 database is the first to integrate human sncRNA gene and mature product profiles obtained from multiple RNA-seq protocols. Altogether, 185 tissues/cell types and sncRNA annotations and >800 curated experiments from ENCODE and GEO/SRA across multiple RNA-seq protocols for both GRCh38/hg38 and GRCh37/hg19 assemblies are integrated in DASHR. Moreover, DASHR is the first to contain both known and novel, previously un-annotated sncRNA loci identified by unsupervised segmentation (13 times more loci with 1 678 800 total). Additionally, DASHR v2.0 adds >3 200 000 annotations for non-small RNA genes and other genomic features (long-noncoding RNAs, mRNAs, promoters, repeats). Furthermore, DASHR v2.0 introduces an enhanced user interface, interactive experiment-by-locus table view, sncRNA locus sorting and filtering by biological features. All annotation and expression information is directly downloadable and accessible as UCSC genome browser tracks. AVAILABILITY AND IMPLEMENTATION: DASHR v2.0 is freely available at https://lisanwanglab.org/DASHRv2. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
RNA, Small Untranslated/supply & distribution , Databases, Nucleic Acid , Genomics , Humans , RNA, Long Noncoding , Sequence Analysis, RNA , Software
5.
Bioinformatics ; 35(10): 1768-1770, 2019 05 15.
Article in English | MEDLINE | ID: mdl-30351394

ABSTRACT

SUMMARY: We report VCPA, our SNP/Indel Variant Calling Pipeline and data management tool used for the analysis of whole genome and exome sequencing (WGS/WES) for the Alzheimer's Disease Sequencing Project. VCPA consists of two independent but linkable components: pipeline and tracking database. The pipeline, implemented using the Workflow Description Language and fully optimized for the Amazon elastic compute cloud environment, includes steps from aligning raw sequence reads to variant calling using GATK. The tracking database allows users to view job running status in real time and visualize >100 quality metrics per genome. VCPA is functionally equivalent to the CCDG/TOPMed pipeline. Users can use the pipeline and the dockerized database to process large WGS/WES datasets on Amazon cloud with minimal configuration. AVAILABILITY AND IMPLEMENTATION: VCPA is released under the MIT license and is available for academic and nonprofit use for free. The pipeline source code and step-by-step instructions are available from the National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (http://www.niagads.org/VCPA). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Alzheimer Disease , Data Management , Genomics , High-Throughput Nucleotide Sequencing , Humans , Software
6.
Nucleic Acids Res ; 46(W1): W36-W42, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29733404

ABSTRACT

The introduction of new high-throughput small RNA sequencing protocols that generate large-scale genomics datasets along with increasing evidence of the significant regulatory roles of small non-coding RNAs (sncRNAs) have highlighted the urgent need for tools to analyze and interpret large amounts of small RNA sequencing data. However, it remains challenging to systematically and comprehensively discover and characterize sncRNA genes and specifically-processed sncRNA products from these datasets. To fill this gap, we present Small RNA-seq Portal for Analysis of sequencing expeRiments (SPAR), a user-friendly web server for interactive processing, analysis, annotation and visualization of small RNA sequencing data. SPAR supports sequencing data generated from various experimental protocols, including smRNA-seq, short total RNA sequencing, microRNA-seq, and single-cell small RNA-seq. Additionally, SPAR includes publicly available reference sncRNA datasets from our DASHR database and from ENCODE across 185 human tissues and cell types to produce highly informative small RNA annotations across all major small RNA types and other features such as co-localization with various genomic features, precursor transcript cleavage patterns, and conservation. SPAR allows the user to compare the input experiment against reference ENCODE/DASHR datasets. SPAR currently supports analyses of human (hg19, hg38) and mouse (mm10) sequencing data. SPAR is freely available at https://www.lisanwanglab.org/SPAR.


Subject(s)
Computational Biology/trends , RNA, Small Untranslated/genetics , RNA/genetics , Software , Animals , Genomics , High-Throughput Nucleotide Sequencing/instrumentation , Humans , Internet , Mice , Molecular Sequence Annotation , Sequence Analysis, RNA/instrumentation , Transcriptome/genetics
7.
Nucleic Acids Res ; 46(17): 8740-8753, 2018 09 28.
Article in English | MEDLINE | ID: mdl-30113658

ABSTRACT

The majority of variants identified by genome-wide association studies (GWAS) reside in the noncoding genome, affecting regulatory elements including transcriptional enhancers. However, characterizing their effects requires the integration of GWAS results with context-specific regulatory activity and linkage disequilibrium annotations to identify causal variants underlying noncoding association signals and the regulatory elements, tissue contexts, and target genes they affect. We propose INFERNO, a novel method which integrates hundreds of functional genomics datasets spanning enhancer activity, transcription factor binding sites, and expression quantitative trait loci with GWAS summary statistics. INFERNO includes novel statistical methods to quantify empirical enrichments of tissue-specific enhancer overlap and to identify co-regulatory networks of dysregulated long noncoding RNAs (lncRNAs). We applied INFERNO to two large GWAS studies. For schizophrenia (36,989 cases, 113,075 controls), INFERNO identified putatively causal variants affecting brain enhancers for known schizophrenia-related genes. For inflammatory bowel disease (IBD) (12,882 cases, 21,770 controls), INFERNO found enrichments of immune and digestive enhancers and lncRNAs involved in regulation of the adaptive immune response. In summary, INFERNO comprehensively infers the molecular mechanisms of causal noncoding variants, providing a sensitive hypothesis generation method for post-GWAS analysis. The software is available as an open source pipeline and a web server.
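
The idea of quantifying an empirical enrichment of enhancer overlap can be sketched with a generic sampling test. The abstract does not spell out INFERNO's statistical procedure, so the following is only an assumed, textbook-style approach: compare the observed number of overlapping variants with counts from randomly sampled background sets. The function names, the 0/1 background flags and the sample sizes are all hypothetical.

```python
import random


def empirical_p_value(observed, null_counts):
    """One-sided empirical p-value with the usual +1 pseudocount.

    Fraction of null samples whose overlap count is at least the observed one.
    """
    exceed = sum(1 for count in null_counts if count >= observed)
    return (1 + exceed) / (1 + len(null_counts))


def sample_null_counts(background_flags, set_size, n_samples, seed=0):
    """Draw random variant sets and count how many members overlap an enhancer.

    background_flags holds 1 for background variants that overlap a
    tissue-specific enhancer and 0 otherwise (hypothetical input).
    """
    rng = random.Random(seed)
    return [sum(rng.sample(background_flags, set_size)) for _ in range(n_samples)]


# Hypothetical background: 10% of variants overlap an enhancer.
background = [1] * 100 + [0] * 900
null_counts = sample_null_counts(background, set_size=20, n_samples=1000)
print(empirical_p_value(observed=8, null_counts=null_counts))
```

A real analysis would typically match the sampled background variants to the tested variants on properties such as allele frequency and linkage disequilibrium structure, which this sketch does not attempt.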


Subject(s)
Enhancer Elements, Genetic , Genome, Human , Inflammatory Bowel Diseases/genetics , RNA, Long Noncoding/genetics , Schizophrenia/genetics , Software , Adaptive Immunity , Case-Control Studies , Female , Genetic Markers , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Inflammatory Bowel Diseases/immunology , Inflammatory Bowel Diseases/physiopathology , Internet , Linkage Disequilibrium , Male , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci , RNA, Long Noncoding/immunology , Schizophrenia/immunology , Schizophrenia/physiopathology
8.
Nature ; 485(7397): 242-5, 2012 Apr 04.
Article in English | MEDLINE | ID: mdl-22495311

ABSTRACT

Autism spectrum disorders (ASD) are believed to have genetic and environmental origins, yet in only a modest fraction of individuals can specific causes be identified. To identify further genetic risk factors, here we assess the role of de novo mutations in ASD by sequencing the exomes of ASD cases and their parents (n = 175 trios). Fewer than half of the cases (46.3%) carry a missense or nonsense de novo variant, and the overall rate of mutation is only modestly higher than the expected rate. In contrast, the proteins encoded by genes that harboured de novo missense or nonsense mutations showed a higher degree of connectivity among themselves and to previous ASD genes as indexed by protein-protein interaction screens. The small increase in the rate of de novo events, when taken together with the protein interaction results, is consistent with an important but limited role for de novo point mutations in ASD, similar to that documented for de novo copy number variants. Genetic models incorporating these data indicate that most of the observed de novo events are unconnected to ASD; those that do confer risk are distributed across many genes and are incompletely penetrant (that is, not necessarily sufficient for disease). Our results support polygenic models in which spontaneous coding mutations in any of a large number of genes increase risk by 5- to 20-fold. Despite the challenge posed by such models, results from de novo events and a large parallel case-control study provide strong evidence in favour of CHD8 and KATNAL2 as genuine autism risk factors.


Subject(s)
Autistic Disorder/genetics , DNA-Binding Proteins/genetics , Exons/genetics , Genetic Predisposition to Disease/genetics , Mutation/genetics , Transcription Factors/genetics , Case-Control Studies , Exome/genetics , Family Health , Humans , Models, Genetic , Multifactorial Inheritance/genetics , Phenotype , Poisson Distribution , Protein Interaction Maps
9.
Nucleic Acids Res ; 44(D1): D216-22, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26553799

ABSTRACT

Small non-coding RNAs (sncRNAs) are highly abundant RNAs, typically <100 nucleotides long, that act as key regulators of diverse cellular processes. Although thousands of sncRNA genes are known to exist in the human genome, no single database provides searchable, unified annotation and expression information for full sncRNA transcripts and mature RNA products derived from these larger RNAs. Here, we present the Database of small human noncoding RNAs (DASHR). DASHR contains the most comprehensive information to date on human sncRNA genes and mature sncRNA products. DASHR provides a simple user interface for researchers to view sequence and secondary structure, compare expression levels, and assess evidence of specific processing across all sncRNA genes and mature sncRNA products in various human tissues. DASHR annotation and expression data cover all major classes of sncRNAs including microRNAs (miRNAs), Piwi-interacting (piRNAs), small nuclear, nucleolar, cytoplasmic (sn-, sno-, scRNAs, respectively), transfer (tRNAs), and ribosomal RNAs (rRNAs). Currently, DASHR (v1.0) integrates 187 smRNA high-throughput sequencing (smRNA-seq) datasets with over 2.5 billion reads and annotation data from multiple public sources. DASHR contains annotations for ~48,000 human sncRNA genes and mature sncRNA products, 82% of which are expressed in one or more of the curated tissues. DASHR is available at http://lisanwanglab.org/DASHR.


Subject(s)
Databases, Nucleic Acid , RNA, Small Untranslated/metabolism , Humans , Molecular Sequence Annotation , RNA Processing, Post-Transcriptional , RNA, Small Untranslated/chemistry , RNA, Small Untranslated/genetics
10.
Bioinformatics ; 31(8): 1290-2, 2015 Apr 15.
Article in English | MEDLINE | ID: mdl-25480377

ABSTRACT

UNLABELLED: We implemented a high-throughput identification pipeline for promoter-interacting enhancer elements to streamline the workflow from mapping raw Hi-C reads, identifying DNA-DNA interacting fragments with high confidence and quality control, detecting histone modifications and DNase hypersensitive enrichments in putative enhancer elements, to ultimately extracting possible intra- and inter-chromosomal enhancer-target gene relationships. AVAILABILITY AND IMPLEMENTATION: This software package is designed to run on high-performance computing clusters with Oracle Grid Engine. The source code is freely available under the MIT license for academic and nonprofit use. The source code and instructions are available at the Wang lab website (http://wanglab.pcbi.upenn.edu/hippie/). It is also provided as an Amazon Machine Image to be used directly on Amazon Cloud with minimal installation. CONTACT: lswang@mail.med.upenn.edu or bdgregor@sas.upenn.edu. SUPPLEMENTARY INFORMATION: Supplementary Material is available at Bioinformatics online.


Subject(s)
DNA/genetics , DNA/metabolism , Enhancer Elements, Genetic/genetics , Promoter Regions, Genetic/genetics , Sequence Analysis, DNA/methods , Humans , Programming Languages
11.
Alzheimers Dement ; 12(3): 233-43, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26092349

ABSTRACT

INTRODUCTION: African-American (AA) individuals have a higher risk for late-onset Alzheimer's disease (LOAD) than Americans of primarily European ancestry (EA). Recently, the largest genome-wide association study in AAs to date confirmed that six of the Alzheimer's disease (AD)-related genetic variants originally discovered in EA cohorts are also risk variants in AA; however, the risk attributable to many of the loci (e.g., APOE, ABCA7) differed substantially from previous studies in EA. There likely are risk variants of higher frequency in AAs that have not been discovered. METHODS: We performed a comprehensive analysis of genetically determined local and global ancestry in AAs with regard to LOAD status. RESULTS: Compared to controls, LOAD cases showed higher levels of African ancestry, both globally and at several LOAD relevant loci, which explained risk for AD beyond global differences. DISCUSSION: Exploratory post hoc analyses highlight regions with greatest differences in ancestry as potential candidate regions for future genetic analyses.


Subject(s)
Alzheimer Disease/ethnology , Alzheimer Disease/genetics , Genetic Predisposition to Disease/genetics , ATP-Binding Cassette Transporters/genetics , Black or African American/genetics , Aged , Aged, 80 and over , Alzheimer Disease/epidemiology , Apolipoproteins E/genetics , Chi-Square Distribution , Chromosome Aberrations , Cohort Studies , Female , Genetic Association Studies , Genotype , Humans , Male , Polymorphism, Single Nucleotide/genetics , Sialic Acid Binding Ig-like Lectin 3/genetics
12.
RNA ; 19(12): 1684-92, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24149843

ABSTRACT

RNA is often altered post-transcriptionally by the covalent modification of particular nucleotides; these modifications are known to modulate the structure and activity of their host RNAs. The recent discovery that an RNA methyl-6 adenosine demethylase (FTO) is a risk gene in obesity has brought to light the significance of RNA modifications to human biology. These noncanonical nucleotides, when converted to cDNA in the course of RNA sequencing, can produce sequence patterns that are distinguishable from simple base-calling errors. To determine whether these modifications can be detected in RNA sequencing data, we developed a method that can not only locate these modifications transcriptome-wide with single nucleotide resolution, but can also differentiate between different classes of modifications. Using small RNA-seq data we were able to detect 92% of all known human tRNA modification sites that are predicted to affect RT activity. We also found that different modifications produce distinct patterns of cDNA sequence, allowing us to differentiate between two classes of adenosine and two classes of guanine modifications with 98% and 79% accuracy, respectively. To show the robustness of this method to sample preparation and sequencing methods, as well as to organismal diversity, we applied it to a publicly available yeast data set and achieved similar levels of accuracy. We also experimentally validated two novel and one known 3-methylcytosine (3mC) sites predicted by HAMR in human tRNAs. Researchers can now use our method to identify and characterize RNA modifications using only RNA-seq data, both retrospectively and when asking questions specifically about modified RNA.
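
The claim that modification-induced sequence patterns are distinguishable from simple base-calling errors can be illustrated with a one-sided binomial test on the mismatch count at a single site. The abstract does not describe the statistical test the authors actually use, so this is a generic sketch; the per-base error rate of 0.01 and the read counts are assumptions chosen only for illustration.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p).

    Here n is the read depth at a site, k the number of reads that
    mismatch the reference, and p the assumed base-calling error rate.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


ASSUMED_ERROR_RATE = 0.01  # hypothetical per-base sequencing error rate

# One mismatch in 100 reads is unremarkable under the error model.
print(binomial_tail(1, 100, ASSUMED_ERROR_RATE))

# Thirty mismatches in 100 reads is very unlikely to be sequencing error alone.
print(binomial_tail(30, 100, ASSUMED_ERROR_RATE))
```

Rejecting the error-only model at a site says nothing yet about the class of modification; telling classes apart would need the pattern of substituted bases, which this sketch does not model.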


Subject(s)
Molecular Sequence Annotation/methods , RNA Processing, Post-Transcriptional , RNA, Transfer/genetics , Software , Female , HEK293 Cells , Humans , Male , RNA/genetics , RNA/metabolism , RNA, Transfer/metabolism , Saccharomyces cerevisiae/genetics , Sequence Alignment , Sequence Analysis, RNA
14.
Alzheimers Dement ; 11(12): 1407-1416, 2015 Dec.
Article in English | MEDLINE | ID: mdl-25936935

ABSTRACT

A rare variant in TREM2 (p.R47H, rs75932628) was recently reported to increase the risk of Alzheimer's disease (AD) and, subsequently, other neurodegenerative diseases, i.e. frontotemporal lobar degeneration (FTLD), amyotrophic lateral sclerosis (ALS), and Parkinson's disease (PD). Here we comprehensively assessed TREM2 rs75932628 for association with these diseases in a total of 19,940 previously untyped subjects of European descent. These data were combined with those from 28 published data sets by meta-analysis. Furthermore, we tested whether rs75932628 shows association with amyloid beta (Aβ42) and total-tau protein levels in the cerebrospinal fluid (CSF) of 828 individuals with AD or mild cognitive impairment. Our data show that rs75932628 is highly significantly associated with the risk of AD across 24,086 AD cases and 148,993 controls of European descent (odds ratio or OR = 2.71, P = 4.67 × 10^-25). No consistent evidence for association was found between this marker and the risk of FTLD (OR = 2.24, P = .0113 across 2673 cases/9283 controls), PD (OR = 1.36, P = .0767 across 8311 cases/79,938 controls) and ALS (OR = 1.41, P = .198 across 5544 cases/7072 controls). Furthermore, carriers of the rs75932628 risk allele showed significantly increased levels of CSF-total-tau (P = .0110) but not Aβ42 suggesting that TREM2's role in AD may involve tau dysfunction.
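
Combining odds ratios across data sets by meta-analysis can be sketched with the standard fixed-effect inverse-variance method on the log odds ratio scale. The abstract does not state which meta-analysis model was used, so the choice of a fixed-effect model is an assumption, and the input values below are invented for illustration and are not taken from the study.

```python
from math import exp, log, sqrt


def fixed_effect_meta(odds_ratios, standard_errors):
    """Pool per-study odds ratios with inverse-variance weights.

    standard_errors are the standard errors of the log odds ratios.
    Returns the pooled odds ratio and the standard error of its log.
    """
    weights = [1.0 / se**2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_log_or = (
        sum(w * log(o) for w, o in zip(weights, odds_ratios)) / total_weight
    )
    pooled_se = sqrt(1.0 / total_weight)
    return exp(pooled_log_or), pooled_se


# Hypothetical per-study estimates, not data from the abstract.
pooled_or, pooled_se = fixed_effect_meta([2.5, 3.1, 2.2], [0.30, 0.45, 0.25])
lower = exp(log(pooled_or) - 1.96 * pooled_se)
upper = exp(log(pooled_or) + 1.96 * pooled_se)
print(f"pooled OR = {pooled_or:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Studies with smaller standard errors dominate the pooled estimate, which is why very large cohorts can drive a result like the AD association reported above.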


Subject(s)
Alzheimer Disease/genetics , Genetic Predisposition to Disease , Membrane Glycoproteins/genetics , Neurodegenerative Diseases/genetics , Receptors, Immunologic/genetics , Aged , Alleles , Amyotrophic Lateral Sclerosis/genetics , Case-Control Studies , Female , Frontotemporal Lobar Degeneration/genetics , Genotype , Humans , Male , Middle Aged , Parkinson Disease/genetics , Quantitative Trait Loci , Risk Factors , White People , tau Proteins/cerebrospinal fluid
15.
Bioinformatics ; 29(19): 2498-500, 2013 Oct 01.
Article in English | MEDLINE | ID: mdl-23943636

ABSTRACT

SUMMARY: We report our new DRAW+SneakPeek software for DNA-seq analysis. DNA resequencing analysis workflow (DRAW) automates the workflow of processing raw sequence reads including quality control, read alignment and variant calling on high-performance computing facilities such as Amazon elastic compute cloud. SneakPeek provides an effective interface for reviewing dozens of quality metrics reported by DRAW, so users can assess the quality of data and diagnose problems in their sequencing procedures. Both DRAW and SneakPeek are freely available under the MIT license, and are available as Amazon machine images to be used directly on Amazon cloud with minimal installation. AVAILABILITY: DRAW+SneakPeek is released under the MIT license and is available for academic and nonprofit use for free. The information about source code, Amazon machine images and instructions on how to install and run DRAW+SneakPeek locally and on Amazon elastic compute cloud is available at the National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (http://www.niagads.org/) and Wang lab Web site (http://wanglab.pcbi.upenn.edu/).


Subject(s)
Biometry/methods , DNA/analysis , Sequence Analysis, DNA/methods , Software Design , Internet , Programming Languages
16.
Acta Neuropathol ; 127(6): 825-43, 2014.
Article in English | MEDLINE | ID: mdl-24770881

ABSTRACT

Hippocampal sclerosis of aging (HS-Aging) is a high-morbidity brain disease in the elderly but risk factors are largely unknown. We report the first genome-wide association study (GWAS) with HS-Aging pathology as an endophenotype. In collaboration with the Alzheimer's Disease Genetics Consortium, data were analyzed from large autopsy cohorts: (#1) National Alzheimer's Coordinating Center (NACC); (#2) Rush University Religious Orders Study and Memory and Aging Project; (#3) Group Health Research Institute Adult Changes in Thought study; (#4) University of California at Irvine 90+ Study; and (#5) University of Kentucky Alzheimer's Disease Center. Altogether, 363 HS-Aging cases and 2,303 controls, all pathologically confirmed, provided statistical power to test for risk alleles with large effect size. A two-tier study design included GWAS from cohorts #1-3 (Stage I) to identify promising SNP candidates, followed by focused evaluation of particular SNPs in cohorts #4-5 (Stage II). Polymorphism in the ATP-binding cassette, sub-family C member 9 (ABCC9) gene, also known as sulfonylurea receptor 2, was associated with HS-Aging pathology. In the meta-analyzed Stage I GWAS, ABCC9 polymorphisms yielded the lowest p values, and factoring in the Stage II results, the meta-analyzed risk SNP (rs704178:G) attained genome-wide statistical significance (p = 1.4 × 10^-9), with odds ratio (OR) of 2.13 (recessive mode of inheritance). For SNPs previously linked to hippocampal sclerosis, meta-analyses of Stage I results show OR = 1.16 for rs5848 (GRN) and OR = 1.22 for rs1990622 (TMEM106B), with the risk alleles as previously described. Sulfonylureas, a widely prescribed drug class used to treat diabetes, also modify human ABCC9 protein function. A subsample of patients from the NACC database (n = 624) were identified who were older than age 85 at death with known drug history. Controlling for important confounders such as diabetes itself, exposure to a sulfonylurea drug was associated with risk for HS-Aging pathology (p = 0.03). Thus, we describe a novel and targetable dementia risk factor.


Subject(s)
Aging/genetics , Aging/pathology , Hippocampus/pathology , Polymorphism, Single Nucleotide , Sulfonylurea Receptors/genetics , Aged, 80 and over , Aging/drug effects , Cohort Studies , Databases as Topic , Endophenotypes , Genome-Wide Association Study , Hippocampus/drug effects , Humans , Sclerosis/genetics , Sclerosis/pathology , Sulfonylurea Compounds/adverse effects , Sulfonylurea Compounds/therapeutic use
17.
Nucleic Acids Res ; 40(Web Server issue): W59-64, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22492627

ABSTRACT

RNA secondary structure is required for the proper regulation of the cellular transcriptome. This is because the functionality, processing, localization and stability of RNAs are all dependent on the folding of these molecules into intricate structures through specific base pairing interactions encoded in their primary nucleotide sequences. Thus, as the number of RNA sequencing (RNA-seq) data sets and the variety of protocols for this technology grow rapidly, it is becoming increasingly pertinent to develop tools that can analyze and visualize this sequence data in the context of RNA secondary structure. Here, we present Sequencing Annotation and Visualization of RNA structures (SAVoR), a web server, which seamlessly links RNA structure predictions with sequencing data and genomic annotations to produce highly informative and annotated models of RNA secondary structure. SAVoR accepts read alignment data from RNA-seq experiments and computes a series of per-base values such as read abundance and sequence variant frequency. These values can then be visualized on a customizable secondary structure model. SAVoR is freely available at http://tesla.pcbi.upenn.edu/savor.
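
The per-base values mentioned above, such as read abundance and sequence variant frequency, can be sketched as follows. This is not SAVoR's implementation; it assumes a simplified input of ungapped alignments given as 0-based start positions plus read sequences, whereas a real tool would parse alignment files and handle gaps and clipping.

```python
def per_base_values(reference, reads):
    """Compute per-base coverage and variant frequency.

    reference: the RNA sequence as a string.
    reads: iterable of (start, sequence) pairs, ungapped, 0-based start
           (a simplified, hypothetical input format).
    """
    coverage = [0] * len(reference)
    mismatches = [0] * len(reference)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                coverage[pos] += 1
                if base != reference[pos]:
                    mismatches[pos] += 1
    # Fraction of covering reads that disagree with the reference base.
    variant_freq = [m / c if c else 0.0 for m, c in zip(mismatches, coverage)]
    return coverage, variant_freq


coverage, variant_freq = per_base_values("ACGU", [(0, "ACG"), (1, "CAU")])
print(coverage)
print(variant_freq)
```

Each resulting list has one entry per nucleotide, which is the shape needed to colour the bases of a secondary structure drawing.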


Subject(s)
RNA/chemistry , Software , Internet , Models, Molecular , Molecular Sequence Annotation , Nucleic Acid Conformation , Sequence Analysis, RNA
18.
Nat Commun ; 15(1): 684, 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38263370

ABSTRACT

The heterogeneity of the whole-exome sequencing (WES) data generation methods presents a challenge to a joint analysis. Here we present a bioinformatics strategy for joint-calling 20,504 WES samples collected across nine studies and sequenced using ten capture kits in fourteen sequencing centers in the Alzheimer's Disease Sequencing Project. The joint-genotype-called variant call format (VCF) file contains only positions within the union of capture kits. The VCF was then processed specifically to account for the batch effects arising from the use of different capture kits from different studies. We identified 8.2 million autosomal variants. 96.82% of the variants are high-quality, and are located in 28,579 Ensembl transcripts. 41% of the variants are intronic and 1.8% of the variants have CADD > 30, indicating they are of high predicted pathogenicity. Here we show our new strategy can generate high-quality data from processing these diversely generated WES samples. The improved ability to combine data sequenced in different batches benefits the whole genomics research community.
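
Restricting calls to the union of capture kits amounts to merging the target intervals of all kits. A minimal sketch of that merge for a single chromosome is shown below; the kit coordinates are hypothetical, intervals are assumed half-open, and the authors' actual tooling is not described in the abstract.

```python
def union_of_intervals(kits):
    """Merge (start, end) target intervals from several capture kits.

    kits: list of interval lists, one list per kit, all on one chromosome.
    Intervals are half-open; touching intervals are merged into one.
    """
    all_intervals = sorted(interval for kit in kits for interval in kit)
    merged = []
    for start, end in all_intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical targets for two kits on the same chromosome.
kit_a = [(100, 200), (500, 600)]
kit_b = [(150, 250), (600, 700)]
print(union_of_intervals([kit_a, kit_b]))
```

A position covered by only some kits still falls inside the union, which is one reason batch effects between kits have to be handled in a later step, as the abstract notes.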


Subject(s)
Alzheimer Disease , Humans , Exome , Computational Biology , Data Accuracy , Genotype
19.
Life Sci Alliance ; 7(5), 2024 May.
Article in English | MEDLINE | ID: mdl-38418088

ABSTRACT

Detecting structural variants (SVs) in whole-genome sequencing poses significant challenges. We present a protocol for variant calling, merging, genotyping, sensitivity analysis, and laboratory validation for generating a high-quality SV call set in whole-genome sequencing from the Alzheimer's Disease Sequencing Project comprising 578 individuals from 111 families. Employing two complementary pipelines, Scalpel and Parliament, for SV/indel calling, we assessed sensitivity through sample replicates (N = 9) with in silico variant spike-ins. We developed a novel metric, D-score, to evaluate caller specificity for deletions. The accuracy of deletions was evaluated by Sanger sequencing. We generated a high-quality call set of 152,301 deletions of diverse sizes. Sanger sequencing validated 114 of 146 detected deletions (78.1%). Scalpel excelled in accuracy for deletions ≤100 bp, whereas Parliament was optimal for deletions >900 bp. Overall, 83.0% and 72.5% of calls by Scalpel and Parliament were validated, respectively, including all 11 deletions called by both Parliament and Scalpel between 101 and 900 bp. Our flexible protocol successfully generated a high-quality deletion call set and a truth set of Sanger sequencing-validated deletions with precise breakpoints spanning 1-17,000 bp.
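
The validation figures above can be checked with simple arithmetic, and the three deletion size ranges named in the abstract (≤100 bp, 101 to 900 bp, >900 bp) can be expressed as a small binning helper. The function names are hypothetical and this sketch is not part of the published protocol.

```python
def size_bin(length_bp):
    """Assign a deletion length to one of the size ranges in the abstract."""
    if length_bp <= 100:
        return "<=100 bp"
    if length_bp <= 900:
        return "101-900 bp"
    return ">900 bp"


def validation_rate(validated, tested):
    """Percentage of tested deletions confirmed by Sanger sequencing."""
    return 100.0 * validated / tested


# 114 of 146 detected deletions were validated, as reported above.
print(round(validation_rate(114, 146), 1))
print(size_bin(100), size_bin(101), size_bin(901))
```

Rounding 114/146 to one decimal place reproduces the 78.1% quoted in the abstract.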


Subject(s)
Alzheimer Disease , Humans , Alzheimer Disease/genetics , Whole Genome Sequencing/methods
20.
Nat Commun ; 15(1): 7880, 2024 Sep 09.
Article in English | MEDLINE | ID: mdl-39251599

ABSTRACT

Progressive supranuclear palsy (PSP), a rare Parkinsonian disorder, is characterized by problems with movement, balance, and cognition. PSP differs from Alzheimer's disease (AD) and other diseases, displaying abnormal microtubule-associated protein tau by both neuronal and glial cell pathologies. Genetic contributors may mediate these differences; however, the genetics of PSP remain underexplored. Here we conduct the largest genome-wide association study (GWAS) of PSP which includes 2779 cases (2595 neuropathologically-confirmed) and 5584 controls and identify six independent PSP susceptibility loci with genome-wide significant (P < 5 × 10^-8) associations, including five known (MAPT, MOBP, STX6, RUNX2, SLCO1A2) and one novel locus (C4A). Integration with cell type-specific epigenomic annotations reveals an oligodendrocytic signature that might distinguish PSP from AD and Parkinson's disease in subsequent studies. Candidate PSP risk gene prioritization using expression quantitative trait loci (eQTLs) identifies oligodendrocyte-specific effects on gene expression in half of the genome-wide significant loci, and an association with C4A expression in brain tissue, which may be driven by increased C4A copy number. Finally, histological studies demonstrate tau aggregates in oligodendrocytes that colocalize with C4 (complement) deposition. Integrating GWAS with functional studies, epigenomic and eQTL analyses, we identify potential causal roles for variation in MOBP, STX6, RUNX2, SLCO1A2, and C4A in PSP pathogenesis.


Subject(s)
Genetic Predisposition to Disease , Genome-Wide Association Study , Quantitative Trait Loci , Supranuclear Palsy, Progressive , tau Proteins , Humans , Supranuclear Palsy, Progressive/genetics , Supranuclear Palsy, Progressive/pathology , Supranuclear Palsy, Progressive/metabolism , Aged , Male , Female , tau Proteins/genetics , tau Proteins/metabolism , Transcriptome , Polymorphism, Single Nucleotide , Neuroglia/metabolism , Neuroglia/pathology , Aged, 80 and over , Oligodendroglia/metabolism , Oligodendroglia/pathology , Middle Aged , Alzheimer Disease/genetics , Alzheimer Disease/pathology , Alzheimer Disease/metabolism , Case-Control Studies , Myelin Proteins