Results 1 - 13 of 13
1.
medRxiv ; 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38496498

ABSTRACT

Less than half of individuals with a suspected Mendelian condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control datasets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project ONT Sequencing Consortium aims to generate LRS data from at least 800 of the 1000 Genomes Project samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37x and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.

2.
Cell ; 185(18): 3426-3440.e19, 2022 09 01.
Article in English | MEDLINE | ID: mdl-36055201

ABSTRACT

The 1000 Genomes Project (1kGP) is the largest fully open resource of whole-genome sequencing (WGS) data consented for public distribution without access or use restrictions. The final, phase 3 release of the 1kGP included 2,504 unrelated samples from 26 populations and was based primarily on low-coverage WGS. Here, we present a high-coverage 3,202-sample WGS 1kGP resource, which now includes 602 complete trios, sequenced to a depth of 30X using Illumina. We performed single-nucleotide variant (SNV) and short insertion and deletion (INDEL) discovery and generated a comprehensive set of structural variants (SVs) by integrating multiple analytic methods through a machine learning model. We show gains in sensitivity and precision of variant calls compared to phase 3, especially among rare SNVs as well as INDELs and SVs spanning the frequency spectrum. We also generated an improved reference imputation panel, making variants discovered here accessible for association studies.


Subjects
Genome, Human; Whole Genome Sequencing; Female; High-Throughput Nucleotide Sequencing/methods; Humans; INDEL Mutation; Male; Polymorphism, Single Nucleotide
3.
Proc Natl Acad Sci U S A ; 119(26): e2118755119, 2022 06 28.
Article in English | MEDLINE | ID: mdl-35749364

ABSTRACT

Retromer is a heteropentameric complex that plays a specialized role in endosomal protein sorting and trafficking. Here, we report a reduction in the retromer proteins (vacuolar protein sorting 35 [VPS35], VPS26A, and VPS29) in patients with amyotrophic lateral sclerosis (ALS) and in the ALS model provided by transgenic (Tg) mice expressing the mutant superoxide dismutase-1 G93A. These changes are accompanied by a reduction of levels of the α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptor subunit GluA1, a proxy of retromer function, in spinal cords from Tg SOD1G93A mice. Correction of the retromer deficit by a viral vector expressing VPS35 exacerbates the paralytic phenotype in Tg SOD1G93A mice. Conversely, lowering Vps35 levels in Tg SOD1G93A mice ameliorates the disease phenotype. In light of these findings, we propose that mild alterations in retromer inversely modulate neurodegeneration propensity in ALS.


Subjects
Amyotrophic Lateral Sclerosis; Vesicular Transport Proteins; Amyotrophic Lateral Sclerosis/metabolism; Animals; Disease Models, Animal; Humans; Mice; Mice, Transgenic; Spinal Cord/metabolism; Superoxide Dismutase-1/genetics; Superoxide Dismutase-1/metabolism; Vesicular Transport Proteins/genetics; Vesicular Transport Proteins/metabolism
4.
Science ; 372(6537)2021 04 02.
Article in English | MEDLINE | ID: mdl-33632895

ABSTRACT

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.


Subjects
Genetic Variation; Genome, Human; Haplotypes; Female; Genotype; High-Throughput Nucleotide Sequencing; Humans; INDEL Mutation; Interspersed Repetitive Sequences; Male; Population Groups/genetics; Quantitative Trait Loci; Retroelements; Sequence Analysis, DNA; Sequence Inversion; Whole Genome Sequencing
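Record 4 describes contiguity as the minimum contig length needed to cover 50% of the genome, and record 1 reports a read N50. As a rough illustration of that statistic, the sketch below computes it from a list of lengths. The function name `n50` and the optional `total` argument are hypothetical, and by default the sketch uses the sum of the supplied lengths rather than a known genome size, which is an assumption and not necessarily how either study computed its figure.

```python
def n50(lengths, total=None):
    """Return the length L such that items of length >= L cover half of `total`.

    `total` defaults to the sum of `lengths` (classic N50). Passing a genome
    size instead gives the genome-based variant of the statistic. Returns None
    if the supplied lengths never reach half of `total`.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = (sum(ordered) if total is None else total) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return None


if __name__ == "__main__":
    # Toy lengths, not data from any of the studies listed here.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Sorting longest-first and stopping at the halfway point keeps the calculation to a single pass after the sort.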
5.
J Clin Endocrinol Metab ; 105(6)2020 06 01.
Article in English | MEDLINE | ID: mdl-31917831

ABSTRACT

CONTEXT: As many as 75% of patients with polycystic ovary syndrome (PCOS) are estimated to be unidentified in clinical practice. OBJECTIVE: Utilizing polygenic risk prediction, we aim to identify the phenome-wide comorbidity patterns characteristic of PCOS to improve accurate diagnosis and preventive treatment. DESIGN, PATIENTS, AND METHODS: Leveraging the electronic health records (EHRs) of 124,852 individuals, we developed a PCOS risk prediction algorithm by combining polygenic risk scores (PRS) with PCOS component phenotypes into a polygenic and phenotypic risk score (PPRS). We evaluated its predictive capability across different ancestries and performed a PRS-based phenome-wide association study (PheWAS) to assess the phenomic expression of the heightened risk of PCOS. RESULTS: The integrated polygenic prediction improved the average performance (pseudo-R²) for PCOS detection by 0.228 (61.5-fold), 0.224 (58.8-fold), and 0.211 (57.0-fold) over the null model across European, African, and multi-ancestry participants, respectively. The subsequent PRS-powered PheWAS identified a high level of shared biology between PCOS and a range of metabolic and endocrine outcomes, especially with obesity and diabetes: "morbid obesity", "type 2 diabetes", "hypercholesterolemia", "disorders of lipid metabolism", "hypertension", and "sleep apnea" reached phenome-wide significance. CONCLUSIONS: Our study has expanded the methodological utility of PRS in patient stratification and risk prediction, especially in a multifactorial condition like PCOS, across different genetic origins. By utilizing the individual genome-phenome data available from the EHR, our approach also demonstrates that polygenic prediction by PRS can provide valuable opportunities to discover the pleiotropic phenomic network associated with PCOS pathogenesis.


Subjects
Algorithms; Genome-Wide Association Study; Multifactorial Inheritance/genetics; Phenomics/methods; Phenotype; Polycystic Ovary Syndrome/diagnosis; Adolescent; Aged; Case-Control Studies; Child; Electronic Health Records; Female; Follow-Up Studies; Genetic Predisposition to Disease; Humans; Middle Aged; Polycystic Ovary Syndrome/epidemiology; Polycystic Ovary Syndrome/genetics; Prognosis; Risk Factors
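Record 5 builds on polygenic risk scores. In its most common textbook form, a PRS is a weighted sum of risk-allele dosages, and the sketch below shows only that generic form. The function name, the dosages, and the effect sizes are hypothetical, and the abstract does not say how the PRS was combined with component phenotypes into the PPRS, so that step is not attempted here.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (0, 1, or 2 per variant).

    `effect_sizes` holds one per-variant weight, for example a log odds
    ratio taken from an association study. Both inputs are illustrative.
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * w for d, w in zip(dosages, effect_sizes))


if __name__ == "__main__":
    # Made-up values for three variants in one individual.
    individual_dosages = [0, 1, 2]
    weights = [0.10, -0.05, 0.20]
    print(polygenic_risk_score(individual_dosages, weights))
```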
6.
Trends Pharmacol Sci ; 40(9): 624-635, 2019 09.
Article in English | MEDLINE | ID: mdl-31383376

ABSTRACT

Interventional pharmacology is one of medicine's most potent weapons against disease. These drugs, however, can result in damaging side effects and must be closely monitored. Pharmacovigilance is the field of science that monitors, detects, and prevents adverse drug reactions (ADRs). Safety efforts begin during the development process, using in vivo and in vitro studies, continue through clinical trials, and extend to postmarketing surveillance of ADRs in real-world populations. Future toxicity and safety challenges, including increased polypharmacy and patient diversity, stress the limits of these traditional tools. Massive amounts of newly available data present an opportunity for using artificial intelligence (AI) and machine learning to improve drug safety science. Here, we explore recent advances as applied to preclinical drug safety and postmarketing surveillance with a specific focus on machine and deep learning (DL) approaches.


Subjects
Adverse Drug Reaction Reporting Systems; Artificial Intelligence; Animals; Drug Evaluation, Preclinical; Drug-Related Side Effects and Adverse Reactions/prevention & control; Humans; Machine Learning; Pharmacovigilance; Product Surveillance, Postmarketing; Quantitative Structure-Activity Relationship; Toxicity Tests
7.
BMC Bioinformatics ; 20(1): 46, 2019 Jan 22.
Article in English | MEDLINE | ID: mdl-30669967

ABSTRACT

BACKGROUND: The development of sequencing techniques and statistical methods provides great opportunities for identifying the impact of rare genetic variation on complex traits. However, there is a lack of knowledge on the impact of sample size, case numbers, and the balance of cases vs controls for both burden- and dispersion-based rare variant association methods. For example, Phenome-Wide Association Studies may have a wide range of case and control sample sizes across hundreds of diagnoses and traits, and with the application of statistical methods to rare variants, it is important to understand the strengths and limitations of the analyses. RESULTS: We conducted a large-scale simulation of randomly selected low-frequency protein-coding regions using twelve different balanced samples with an equal number of cases and controls as well as twenty-one unbalanced sample scenarios. We further explored statistical performance of different minor allele frequency thresholds and a range of genetic effect sizes. Our simulation results demonstrate that using an unbalanced study design has an overall higher type I error rate for both burden and dispersion tests compared with a balanced study design. Regression has an overall higher type I error with balanced cases and controls, while SKAT has higher type I error for unbalanced case-control scenarios. We also found that both type I error and power were driven by the number of cases in addition to the case-to-control ratio under large control group scenarios. Based on our power simulations, we observed that a SKAT analysis with case numbers larger than 200 for unbalanced case-control models yielded over 90% power with relatively well controlled type I error. To achieve similar power in regression, over 500 cases are needed. Moreover, SKAT showed higher power to detect associations in unbalanced case-control scenarios than regression.
CONCLUSIONS: Our results provide important insights into rare variant association study designs by providing a landscape of type I error and statistical power for a wide range of sample sizes. These results can serve as a benchmark for making decisions about study design for rare variant analyses.


Subjects
Computer Simulation/standards; Genetic Association Studies/methods; Sample Size; Humans; Models, Genetic; Research Design
8.
Bioinformatics ; 34(3): 527-529, 2018 02 01.
Article in English | MEDLINE | ID: mdl-28968757

ABSTRACT

Motivation: BioBin is an automated bioinformatics tool for the multi-level biological binning of sequence variants. Herein, we present a significant update to BioBin which expands the software to facilitate a comprehensive rare variant analysis and incorporates novel features and analysis enhancements. Results: In BioBin 2.3, we extend our software tool by implementing statistical association testing, updating the binning algorithm, as well as incorporating novel analysis features providing for a robust, highly customizable, and unified rare variant analysis tool. Availability and implementation: The BioBin software package is open source and freely available to users at http://www.ritchielab.com/software/biobin-download. Contact: mdritchie@geisinger.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Genetic Association Studies/methods; Genetic Variation; Software; Algorithms; Genomics/methods
9.
Nat Commun ; 8(1): 1167, 2017 10 27.
Article in English | MEDLINE | ID: mdl-29079728

ABSTRACT

Genome-wide, imputed, sequence, and structural data are now available for exceedingly large sample sizes. The needs for data management, handling population structure and related samples, and performing associations have largely been met. However, the infrastructure to support analyses involving complexity beyond genome-wide association studies is not standardized or centralized. We provide the PLatform for the Analysis, Translation, and Organization of large-scale data (PLATO), a software tool equipped to handle multi-omic data for hundreds of thousands of samples to explore complexity using genetic interactions, environment-wide association studies and gene-environment interactions, phenome-wide association studies, as well as copy number and rare variant analyses. Using the data from the Marshfield Personalized Medicine Research Project, a site in the electronic Medical Records and Genomics Network, we apply each feature of PLATO to type 2 diabetes and demonstrate how PLATO can be used to uncover the complex etiology of common traits.


Subjects
Computational Biology; Genome, Human; Genome-Wide Association Study; Alcohol Drinking; Alleles; Databases, Genetic; Diabetes Mellitus, Type 2/genetics; Diet; Epistasis, Genetic; Gene Deletion; Gene Dosage; Gene-Environment Interaction; Genomics; Genotype; Glutamate Decarboxylase/genetics; Humans; Models, Genetic; Phenotype; Polymorphism, Single Nucleotide; Programming Languages; Recurrence; Sequence Analysis, DNA; Software; Surveys and Questionnaires
10.
BMC Med Inform Decis Mak ; 17(Suppl 1): 61, 2017 May 18.
Article in English | MEDLINE | ID: mdl-28539126

ABSTRACT

BACKGROUND: Rapid advancement of next generation sequencing technologies such as whole genome sequencing (WGS) has facilitated the search for genetic factors that influence disease risk in the field of human genetics. To identify rare variants associated with human diseases or traits, an efficient genome-wide binning approach is needed. In this study we developed a novel biological knowledge-based binning approach for rare-variant association analysis and then applied the approach to structural neuroimaging endophenotypes related to late-onset Alzheimer's disease (LOAD). METHODS: For rare-variant analysis, we used the knowledge-driven binning approach implemented in Bin-KAT, an automated tool that provides 1) binning/collapsing methods for multi-level variant aggregation with a flexible, biologically informed binning strategy and 2) an option of performing unified collapsing and statistical rare variant analyses in one tool. A total of 750 non-Hispanic Caucasian participants from the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort who had both WGS data and magnetic resonance imaging (MRI) scans were used in this study. Mean bilateral cortical thickness of the entorhinal cortex extracted from MRI scans was used as an AD-related neuroimaging endophenotype. SKAT was used for a genome-wide gene- and region-based association analysis of rare variants (MAF (minor allele frequency) < 0.05) and potential confounding factors (age, gender, years of education, intracranial volume (ICV) and MRI field strength) for entorhinal cortex thickness were used as covariates. Significant associations were determined using FDR adjustment for multiple comparisons. RESULTS: Our knowledge-driven binning approach identified 16 functional exonic rare variants in FANCC significantly associated with entorhinal cortex thickness (FDR-corrected p-value < 0.05). In addition, the approach identified 7 evolutionarily conserved regions, which were mapped to FAF1, RFX7, LYPLAL1 and GOLGA3, significantly associated with entorhinal cortex thickness (FDR-corrected p-value < 0.05). In further analysis, the functional exonic rare variants in FANCC were also significantly associated with hippocampal volume and cerebrospinal fluid (CSF) Aβ1-42 (p-value < 0.05). CONCLUSIONS: Our novel binning approach identified rare variants in FANCC as well as 7 evolutionarily conserved regions significantly associated with a LOAD-related neuroimaging endophenotype. FANCC (Fanconi anemia complementation group C) has been shown to modulate TLR and p38 MAPK-dependent expression of IL-1β in macrophages. Our results warrant further investigation in a larger independent cohort and demonstrate that the biological knowledge-driven binning approach is a powerful strategy to identify rare variants associated with AD and other complex diseases.


Subjects
Alzheimer Disease/diagnostic imaging; Alzheimer Disease/genetics; Data Mining/methods; Aged; Aged, 80 and over; Biomarkers; Exons; Female; Genome-Wide Association Study; Genomics; Humans; Male; Middle Aged; Neuroimaging; Phenotype
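Record 10 reports significance after "FDR adjustment for multiple comparisons" without naming the procedure. A widely used choice is the Benjamini-Hochberg step-up method, and the sketch below computes adjusted p-values under the assumption that this is the intended procedure. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Each sorted p-value p_(k) is scaled by m / k, then a running minimum is
    taken from the largest rank downward so the adjusted values stay monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Illustrative p-values only, not results from the study.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p={p:.3f} adjusted={q:.3f}")
```

A test is then called significant at a chosen FDR level, such as 0.05, when its adjusted value falls below that level.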
11.
Pac Symp Biocomput ; 22: 177-183, 2017.
Article in English | MEDLINE | ID: mdl-27896973

ABSTRACT

Given the exponential growth of biomedical data, researchers are faced with numerous challenges in extracting and interpreting information from these large, high-dimensional, incomplete, and often noisy data. To facilitate addressing this growing concern, the "Patterns in Biomedical Data-How do we find them?" session of the 2017 Pacific Symposium on Biocomputing (PSB) is devoted to exploring pattern recognition using data-driven approaches for biomedical and precision medicine applications. The papers selected for this session focus on novel machine learning techniques as well as applications of established methods to heterogeneous data. We also feature manuscripts aimed at addressing the current challenges associated with the analysis of biomedical data.

12.
PLoS One ; 11(8): e0160573, 2016.
Article in English | MEDLINE | ID: mdl-27508393

ABSTRACT

We performed a Phenome-Wide Association Study (PheWAS) to identify interrelationships between the immune system genetic architecture and a wide array of phenotypes from two de-identified electronic health record (EHR) biorepositories. We selected variants within genes encoding critical factors in the immune system and variants with known associations with autoimmunity. To define case/control status for EHR diagnoses, we used International Classification of Diseases, Ninth Revision (ICD-9) diagnosis codes from 3,024 Geisinger Clinic MyCode® subjects (470 diagnoses) and 2,899 Vanderbilt University Medical Center BioVU biorepository subjects (380 diagnoses). A pooled analysis was also carried out for the results that replicated across the two data sets. We identified new associations with potential biological relevance including SNPs in tumor necrosis factor (TNF) and ankyrin-related genes associated with acute and chronic sinusitis and acute respiratory tract infection. The two most significant associations identified were for the C6orf10 SNP rs6910071 and "rheumatoid arthritis" (ICD-9 code category 714) (pMETAL = 2.58 × 10⁻⁹) and the ATN1 SNP rs2239167 and "diabetes mellitus, type 2" (ICD-9 code category 250) (pMETAL = 6.39 × 10⁻⁹). This study highlights the utility of using PheWAS in conjunction with EHRs to discover new genotypic-phenotypic associations for immune-system related genetic loci.


Subjects
Genetic Association Studies; Immune System/metabolism; Ankyrins/genetics; Diabetes Mellitus, Type 2/genetics; Diabetes Mellitus, Type 2/pathology; Electronic Health Records; Genetic Loci; Genotype; Humans; Linkage Disequilibrium; Nerve Tissue Proteins/genetics; Phenotype; Polymorphism, Single Nucleotide; Respiratory Tract Infections/genetics; Respiratory Tract Infections/pathology; Sinusitis/genetics; Sinusitis/pathology; Tumor Necrosis Factor-alpha/genetics
13.
Pac Symp Biocomput ; 21: 249-60, 2016.
Article in English | MEDLINE | ID: mdl-26776191

ABSTRACT

Next-generation sequencing technology has presented an opportunity for rare variant discovery and association of these variants with disease. To address the challenges of rare variant analysis, multiple statistical methods have been developed for combining rare variants to increase statistical power for detecting associations. BioBin is an automated tool that expands on collapsing/binning methods by performing multi-level variant aggregation with a flexible, biologically informed binning strategy using an internal biorepository, the Library of Knowledge (LOKI). The databases within LOKI provide variant details, regional annotations and pathway interactions which can be used to generate bins of biologically related variants, thereby increasing the power of any subsequent statistical test. In this study, we expand the framework of BioBin to incorporate statistical tests, including a dispersion-based test, SKAT, thereby providing the option of performing a unified collapsing and statistical rare variant analysis in one tool. Extensive simulation studies performed on gene-coding regions showed a Bin-KAT analysis to have greater power than BioBin-regression in all simulated conditions, including variants influencing the phenotype in the same direction, a scenario where burden tests often retain greater power. The use of Madsen-Browning variant weighting increased power in the burden analysis to a level comparable with Bin-KAT, but overall Bin-KAT retained equivalent or higher power under all conditions. Bin-KAT was applied to a study of 82 pharmacogenes sequenced in the Marshfield Personalized Medicine Research Project (PMRP). We looked for association of these genes with 9 different phenotypes extracted from the electronic health record. This study demonstrates that Bin-KAT is a powerful tool for the identification of genes harboring low frequency variants for complex phenotypes.


Subjects
Genome-Wide Association Study/statistics & numerical data; Phenotype; Software; Computational Biology/methods; Computational Biology/statistics & numerical data; Computer Simulation; Databases, Genetic/statistics & numerical data; Genetic Variation; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Knowledge Bases; Models, Genetic; Models, Statistical; Pharmacogenetics/statistics & numerical data; Precision Medicine/statistics & numerical data
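Record 13 mentions Madsen-Browning variant weighting in a burden analysis. As a rough sketch of the weighting as it is usually described, each variant's weight is derived from its allele frequency in controls, and an individual's burden score sums allele counts divided by those weights, so rarer variants count for more. The function names and the toy genotype matrix are hypothetical, and this is not BioBin's implementation, which may differ in detail.

```python
import math


def madsen_browning_weights(genotypes, is_control):
    """Per-variant weights w = sqrt(n * q * (1 - q)).

    `genotypes` is a list of rows (one per individual) of minor-allele counts.
    q = (m_u + 1) / (2 * n_u + 2), where m_u is the minor-allele count among
    the n_u controls and n is the total number of individuals.
    """
    n = len(genotypes)
    n_controls = sum(1 for flag in is_control if flag)
    weights = []
    for v in range(len(genotypes[0])):
        m_u = sum(row[v] for row, ctrl in zip(genotypes, is_control) if ctrl)
        q = (m_u + 1) / (2 * n_controls + 2)
        weights.append(math.sqrt(n * q * (1 - q)))
    return weights


def burden_scores(genotypes, weights):
    """Per-individual score: sum of allele counts divided by variant weight."""
    return [sum(g / w for g, w in zip(row, weights)) for row in genotypes]


if __name__ == "__main__":
    # Toy data: 4 individuals, 2 variants; rows 1 and 3 are controls.
    geno = [[0, 1], [0, 0], [1, 0], [0, 0]]
    controls = [False, True, False, True]
    w = madsen_browning_weights(geno, controls)
    print(w)
    print(burden_scores(geno, w))
```

The scores would then feed a case-control comparison, such as a rank-based test with permutation, which is left out of this sketch.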