Results 1 - 9 of 9
1.
Article in English | MEDLINE | ID: mdl-35224460

ABSTRACT

Inspired by well-established material and pedagogy provided by The Carpentries (Wilson, 2016), we developed a two-day workshop curriculum that teaches introductory R programming for managing, analyzing, plotting and reporting data using packages from the tidyverse (Wickham et al., 2019), the Unix shell, version control with git, and GitHub. While the official Software Carpentry curriculum is comprehensive, we found that it contains too much content for a two-day workshop. We also felt that the independent nature of the lessons left learners confused about how to integrate the newly acquired programming skills in their own work. Thus, we developed a new curriculum that aims to teach novices how to implement reproducible research principles in their own data analysis. The curriculum integrates live coding lessons with individual-level and group-based practice exercises, and also serves as a succinct resource that learners can reference both during and after the workshop. Moreover, it lowers the entry barrier for new instructors as they do not have to develop their own teaching materials or sift through extensive content. We developed this curriculum during a two-day sprint, successfully used it to host a two-day virtual workshop with almost 40 participants, and updated the material based on instructor and learner feedback. We hope that our new curriculum will prove useful to future instructors interested in teaching workshops with similar learning objectives.

2.
Bioinformatics ; 37(18): 3017-3018, 2021 09 29.
Article in English | MEDLINE | ID: mdl-33734315

ABSTRACT

SUMMARY: LocusZoom.js is a JavaScript library for creating interactive web-based visualizations of genetic association study results. It can display one or more traits in the context of relevant biological data (such as gene models and other genomic annotation), and allows interactive refinement of analysis models (by selecting linkage disequilibrium reference panels, identifying sets of likely causal variants, or comparisons to the GWAS catalog). It can be embedded in web pages to enable data sharing and exploration. Views can be customized and extended to display other data types such as phenome-wide association study (PheWAS) results, chromatin co-accessibility, or eQTL measurements. A new web upload service harmonizes datasets, adds annotations, and makes it easy to explore user-provided result sets. AVAILABILITY AND IMPLEMENTATION: LocusZoom.js is open-source software under a permissive MIT license. Code and documentation are available at: https://github.com/statgen/locuszoom/. Installable packages for all versions are also distributed via NPM. Additional features are provided as standalone libraries to promote reuse. Use with your own GWAS results at https://my.locuszoom.org/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genomics, Software, Genome, Genetic Association Studies, Documentation
3.
Genome Res ; 30(2): 185-194, 2020 02.
Article in English | MEDLINE | ID: mdl-31980570

ABSTRACT

Detecting and estimating DNA sample contamination are important steps to ensure high-quality genotype calls and reliable downstream analysis. Existing methods rely on population allele frequency information for accurate estimation of contamination rates. Correctly specifying population allele frequencies for each individual at an early stage of sequence analysis is impractical or even impossible for large-scale sequencing centers that simultaneously process samples from multiple studies across diverse populations. On the other hand, incorrectly specified allele frequencies may result in substantial bias in estimated contamination rates. For example, we observed that existing methods often fail to identify 10% contaminated samples at a typical 3% contamination exclusion threshold when genetic ancestry is misspecified. Such an incomplete screening of contaminated samples substantially inflates the estimated rate of genotyping errors even in deeply sequenced genomes and exomes. We propose a robust statistical method that accurately estimates DNA contamination and is agnostic to the genetic ancestry of the intended or contaminating sample. Our method integrates the estimation of genetic ancestry and DNA contamination in a unified likelihood framework by leveraging individual-specific allele frequencies projected from reference genotypes onto principal component coordinates. Our method can also be used for estimating genetic ancestries, similar to LASER or TRACE, but simultaneously accounting for potential contamination. We demonstrate that our method robustly estimates contamination rates and genetic ancestries across populations and contamination scenarios. We further demonstrate that, in the presence of contamination, genetic ancestry inference can be substantially biased with existing methods that ignore contamination, while our method corrects for such biases.


Subject(s)
DNA Contamination, DNA/genetics, Genotype, Genotyping Techniques/standards, Alleles, Exome/genetics, Gene Frequency/genetics, Population Genetics, Humans, Single Nucleotide Polymorphism/genetics, DNA Sequence Analysis
4.
Nat Commun ; 9(1): 3753, 2018 09 14.
Article in English | MEDLINE | ID: mdl-30218074

ABSTRACT

A detailed understanding of the genome-wide variability of single-nucleotide germline mutation rates is essential to studying human genome evolution. Here, we use ~36 million singleton variants from 3560 whole-genome sequences to infer fine-scale patterns of mutation rate heterogeneity. Mutability is jointly affected by adjacent nucleotide context and diverse genomic features of the surrounding region, including histone modifications, replication timing, and recombination rate, sometimes suggesting specific mutagenic mechanisms. Remarkably, GC content, DNase hypersensitivity, CpG islands, and H3K36 trimethylation are associated with both increased and decreased mutation rates depending on nucleotide context. We validate these estimated effects in an independent dataset of ~46,000 de novo mutations, and confirm our estimates are more accurate than previously published results based on ancestrally older variants without considering genomic features. Our results thus provide the most refined portrait to date of the factors contributing to genome-wide variability of the human germline mutation rate.


Subject(s)
Molecular Evolution, Genetic Variation, Germ-Line Mutation/genetics, Mutation Rate, Base Composition, CpG Islands, Cytosine, DNA Methylation, Deoxyribonucleases, Human Genome, Guanine, Histone Code, Humans, Single Nucleotide Polymorphism
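
As a rough illustration of the nucleotide-context part of the analysis described in this abstract, the following Python sketch tallies singleton variants by their trinucleotide context and divides by how often each context occurs in the reference sequence. The function names, the toy sequence, and the singleton positions are hypothetical, and the sketch ignores the genomic features of the surrounding region that the abstract also describes.

```python
from collections import Counter


def context_counts(sequence, k=3):
    """Count every k-mer in the sequence (k odd, so there is a central base)."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def relative_mutation_rates(sequence, singleton_positions, k=3):
    """Singletons per occurrence of each k-mer context centred on the variant site."""
    flank = k // 2
    background = context_counts(sequence, k)
    observed = Counter()
    for pos in singleton_positions:
        # Skip positions too close to the ends to have a full context.
        if flank <= pos < len(sequence) - flank:
            observed[sequence[pos - flank:pos + flank + 1]] += 1
    return {ctx: observed[ctx] / n for ctx, n in background.items()}


# Toy data: a hypothetical reference and 0-based singleton positions.
reference = "ACGTACGTACGA"
singletons = [1, 5, 6]
rates = relative_mutation_rates(reference, singletons)
```

With real data the same ratio would be computed per chromosome or genome-wide, and contexts with few occurrences would need care because their estimated rates are noisy.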
5.
JAMA Psychiatry ; 73(6): 590-7, 2016 Jun 01.
Article in English | MEDLINE | ID: mdl-27120077

ABSTRACT

IMPORTANCE: Complex disorders, such as bipolar disorder (BD), likely result from the influence of both common and rare susceptibility alleles. While common variation has been widely studied, rare variant discovery has only recently become feasible with next-generation sequencing. OBJECTIVE: To utilize a combined family-based and case-control approach to exome sequencing in BD using multiplex families as an initial discovery strategy, followed by association testing in a large case-control meta-analysis. DESIGN, SETTING, AND PARTICIPANTS: We performed exome sequencing of 36 affected members with BD from 8 multiplex families and tested rare, segregating variants in 3 independent case-control samples consisting of 3541 BD cases and 4774 controls. MAIN OUTCOMES AND MEASURES: We used penalized logistic regression and 1-sided gene-burden analyses to test for association of rare, segregating damaging variants with BD. Permutation-based analyses were performed to test for overall enrichment with previously identified gene sets. RESULTS: We found 84 rare (frequency <1%), segregating variants that were bioinformatically predicted to be damaging. These variants were found in 82 genes that were enriched for gene sets previously identified in de novo studies of autism (19 observed vs. 10.9 expected, P = .0066) and schizophrenia (11 observed vs. 5.1 expected, P = .0062) and for targets of the fragile X mental retardation protein (FMRP) pathway (10 observed vs. 4.4 expected, P = .0076). The case-control meta-analyses yielded 19 genes that were nominally associated with BD based either on individual variants or a gene-burden approach. Although no gene was individually significant after correction for multiple testing, this group of genes continued to show evidence for significant enrichment of de novo autism genes (6 observed vs 2.6 expected, P = .028). 
CONCLUSIONS AND RELEVANCE: Our results are consistent with the presence of prominent locus and allelic heterogeneity in BD and suggest that very large samples will be required to definitively identify individual rare variants or genes conferring risk for this disorder. However, we also identify significant associations with gene sets composed of previously discovered de novo variants in autism and schizophrenia, as well as targets of the FMRP pathway, providing preliminary support for the overlap of potential autism and schizophrenia risk genes with rare, segregating variants in families with BD.


Subject(s)
Bipolar Disorder/genetics, Exome/genetics, DNA Sequence Analysis, Alleles, Autistic Disorder/genetics, Autistic Disorder/psychology, Bipolar Disorder/diagnosis, Bipolar Disorder/psychology, Case-Control Studies, Fragile X Mental Retardation Protein/genetics, Genetic Heterogeneity, Genetic Predisposition to Disease/genetics, Genetic Variation/genetics, Genome-Wide Association Study, Humans, Schizophrenia/genetics, Schizophrenic Psychology
6.
Am J Hum Genet ; 97(2): 284-90, 2015 Aug 06.
Article in English | MEDLINE | ID: mdl-26235984

ABSTRACT

DNA sample contamination is a frequent problem in DNA sequencing studies and can result in genotyping errors and reduced power for association testing. We recently described methods to identify within-species DNA sample contamination based on sequencing read data, showed that our methods can reliably detect and estimate contamination levels as low as 1%, and suggested strategies to identify and remove contaminated samples from sequencing studies. Here we propose methods to model contamination during genotype calling as an alternative to removal of contaminated samples from further analyses. We compare our contamination-adjusted calls to calls that ignore contamination and to calls based on uncontaminated data. We demonstrate that, for moderate contamination levels (5%-20%), contamination-adjusted calls eliminate 48%-77% of the genotyping errors. For lower levels of contamination, our contamination correction methods produce genotypes nearly as accurate as those based on uncontaminated data. Our contamination correction methods are useful generally, but are particularly helpful for sample contamination levels from 2% to 20%.


Subject(s)
DNA Contamination, Genotyping Techniques/methods, Genotyping Techniques/standards, Genetic Models, DNA Sequence Analysis/methods, DNA Sequence Analysis/standards
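
To make the idea of modelling contamination during genotype calling concrete, here is a minimal Python sketch. It is not the authors' implementation: it assumes a biallelic site, a known contamination fraction `alpha`, a known population allele frequency `f`, a single per-base error rate `err`, and Hardy-Weinberg proportions for the unknown contaminating genotype. All function names and the read counts are hypothetical.

```python
def hwe_prior(f):
    """Hardy-Weinberg probabilities for 0, 1, 2 copies of the alternate allele."""
    return [(1 - f) ** 2, 2 * f * (1 - f), f ** 2]


def alt_read_prob(g1, g2, alpha, err):
    """Probability that a single read shows the alternate allele.

    g1 is the intended sample's genotype, g2 the contaminant's genotype,
    and alpha the fraction of reads that come from the contaminant.
    """
    true_alt = (1 - alpha) * g1 / 2 + alpha * g2 / 2
    return true_alt * (1 - err) + (1 - true_alt) * err


def genotype_likelihoods(n_ref, n_alt, f, alpha, err=0.01):
    """P(reads | g1) for g1 = 0, 1, 2, summing over the contaminant genotype."""
    prior = hwe_prior(f)
    likelihoods = []
    for g1 in range(3):
        total = 0.0
        for g2 in range(3):
            p = alt_read_prob(g1, g2, alpha, err)
            total += prior[g2] * p ** n_alt * (1 - p) ** n_ref
        likelihoods.append(total)
    return likelihoods


def call_genotype(n_ref, n_alt, f, alpha, err=0.01):
    """Return the genotype (0, 1, or 2) with the highest likelihood."""
    likelihoods = genotype_likelihoods(n_ref, n_alt, f, alpha, err)
    return max(range(3), key=lambda g: likelihoods[g])


# Hypothetical site: 24 reference reads, 6 alternate reads, allele frequency 0.5.
naive_call = call_genotype(24, 6, f=0.5, alpha=0.0)     # ignores contamination
adjusted_call = call_genotype(24, 6, f=0.5, alpha=0.2)  # models 20% contamination
```

In this toy case the call that ignores contamination is heterozygous, whereas the contamination-adjusted call is homozygous reference, because a 20% alternate-read fraction is well explained by reads from the contaminant.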
7.
Psychophysiology ; 51(12): 1309-20, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25387710

ABSTRACT

Whole genome sequencing was completed on 1,325 individuals from 602 families, identifying 27 million autosomal variants. Genetic association tests were conducted for those individuals who had been assessed for one or more of 17 endophenotypes (N range = 802-1,185). No significant associations were found. These 27 million variants were then imputed into the full sample of individuals with psychophysiological data (N range = 3,088-4,469) and again tested for associations with the 17 endophenotypes. No association was significant. Using a gene-based variable threshold burden test of nonsynonymous variants, we obtained five significant associations. These findings are preliminary and call for additional analysis of this rich sample. We argue that larger samples, alternative study designs, and additional bioinformatics approaches will be necessary to discover associations between these endophenotypes and genomic variation.


Subject(s)
Endophenotypes, Genotype, Single Nucleotide Polymorphism, Twins/genetics, Brain/physiology, Electroencephalography, P300 Event-Related Potentials/genetics, Galvanic Skin Response/genetics, Genetic Association Studies, Humans, Startle Reflex/genetics, Saccades/genetics, Sensory Gating/genetics
8.
Am J Hum Genet ; 91(5): 839-48, 2012 Nov 02.
Article in English | MEDLINE | ID: mdl-23103226

ABSTRACT

DNA sample contamination is a serious problem in DNA sequencing studies and may result in systematic genotype misclassification and false positive associations. Although methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available. In this paper, we describe methods to identify within-species DNA sample contamination based on (1) a combination of sequencing reads and array-based genotype data, (2) sequence reads alone, and (3) array-based genotype data alone. Analysis of sequencing reads allows contamination detection after sequence data is generated but prior to variant calling; analysis of array-based genotype data allows contamination detection prior to generation of costly sequence data. Through a combination of analysis of in silico and experimentally contaminated samples, we show that our methods can reliably detect and estimate levels of contamination as low as 1%. We evaluate the impact of DNA contamination on genotype accuracy and propose effective strategies to screen for and prevent DNA contamination in sequencing studies.


Subject(s)
DNA Contamination, Genotype, DNA Sequence Analysis, Type 2 Diabetes Mellitus/diagnosis, Type 2 Diabetes Mellitus/genetics, Humans
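
The sequence-reads-only approach mentioned in this abstract can be sketched as a mixture likelihood maximized over the contamination fraction. The Python code below is a simplified illustration under stated assumptions, not the published method: sites are biallelic with known allele frequencies, both the intended and the contaminating genotypes follow Hardy-Weinberg proportions, every base has the same error rate, and the estimate comes from a coarse grid search. All names and the simulated data are hypothetical.

```python
import math
import random


def hwe_prior(f):
    """Hardy-Weinberg probabilities for 0, 1, 2 copies of the alternate allele."""
    return [(1 - f) ** 2, 2 * f * (1 - f), f ** 2]


def alt_read_prob(g1, g2, alpha, err):
    """Probability that one read shows the alternate allele in a mixed sample."""
    true_alt = (1 - alpha) * g1 / 2 + alpha * g2 / 2
    return true_alt * (1 - err) + (1 - true_alt) * err


def site_log_likelihood(n_ref, n_alt, f, alpha, err):
    """Log-likelihood of one site's read counts, summing over both genotypes."""
    prior = hwe_prior(f)
    total = 0.0
    for g1 in range(3):
        for g2 in range(3):
            p = alt_read_prob(g1, g2, alpha, err)
            total += prior[g1] * prior[g2] * p ** n_alt * (1 - p) ** n_ref
    return math.log(total)


def estimate_contamination(sites, err=0.01, grid_size=51):
    """Grid-search maximum-likelihood estimate of alpha in [0, 0.5].

    sites is a list of (n_ref, n_alt, allele_frequency) tuples.
    """
    grid = [0.5 * i / (grid_size - 1) for i in range(grid_size)]
    return max(
        grid,
        key=lambda a: sum(site_log_likelihood(r, v, f, a, err) for r, v, f in sites),
    )


def simulate_sites(n_sites, depth, alpha, err, rng):
    """Simulate read counts under the same model, for illustration only."""
    sites = []
    for _ in range(n_sites):
        f = rng.uniform(0.1, 0.9)
        g1, g2 = (rng.choices(range(3), weights=hwe_prior(f))[0] for _ in range(2))
        p = alt_read_prob(g1, g2, alpha, err)
        n_alt = sum(rng.random() < p for _ in range(depth))
        sites.append((depth - n_alt, n_alt, f))
    return sites


rng = random.Random(0)
toy_sites = simulate_sites(n_sites=1000, depth=30, alpha=0.10, err=0.01, rng=rng)
alpha_hat = estimate_contamination(toy_sites)
```

Because the simulation and the likelihood share one model, the estimate should land close to the simulated 10% contamination; real data would additionally call for per-base qualities and a finer optimizer.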
9.
Proc Natl Acad Sci U S A ; 106(18): 7501-6, 2009 May 05.
Article in English | MEDLINE | ID: mdl-19416921

ABSTRACT

Bipolar disorder (BP) is a disabling and often life-threatening disorder that affects approximately 1% of the population worldwide. To identify genetic variants that increase the risk of BP, we genotyped on the Illumina HumanHap550 Beadchip 2,076 bipolar cases and 1,676 controls of European ancestry from the National Institute of Mental Health Human Genetics Initiative Repository, and the Prechter Repository and samples collected in London, Toronto, and Dundee. We imputed SNP genotypes and tested for SNP-BP association in each sample and then performed meta-analysis across samples. The strongest association P value for this 2-study meta-analysis was 2.4 × 10^-6. We next imputed SNP genotypes and tested for SNP-BP association based on the publicly available Affymetrix 500K genotype data from the Wellcome Trust Case Control Consortium for 1,868 BP cases and a reference set of 12,831 individuals. A 3-study meta-analysis of 3,683 nonoverlapping cases and 14,507 extended controls on >2.3 million genotyped and imputed SNPs resulted in 3 chromosomal regions with association P ≈ 10^-7: 1p31.1 (no known genes), 3p21 (>25 known genes), and 5q15 (MCTP1). The most strongly associated nonsynonymous SNP rs1042779 (OR = 1.19, P = 1.8 × 10^-7) is in the ITIH1 gene on chromosome 3, with other strongly associated nonsynonymous SNPs in GNL3, NEK4, and ITIH3. Thus, these chromosomal regions harbor genes implicated in cell cycle, neurogenesis, neuroplasticity, and neurosignaling. In addition, we replicated the reported ANK3 association results for SNP rs10994336 in the nonoverlapping GSK sample (OR = 1.37, P = 0.042). Although these results are promising, analysis of additional samples will be required to confirm that variant(s) in these regions influence BP risk.


Subject(s)
Bipolar Disorder/genetics, Human Chromosome Pair 1/genetics, Human Chromosome Pair 3/genetics, Human Chromosome Pair 5/genetics, Human Genome, Europe, Genome-Wide Association Study, Humans
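
The abstract does not say which meta-analysis method was used. As one common possibility, the Python sketch below combines per-study log odds ratios with inverse-variance (fixed-effect) weights and derives a two-sided P value from the normal distribution. The function name and the per-study numbers are hypothetical and are not taken from the paper.

```python
import math


def fixed_effect_meta(estimates):
    """Inverse-variance fixed-effect meta-analysis.

    estimates is a list of (log_odds_ratio, standard_error) pairs, one per study.
    Returns the pooled log odds ratio, its standard error, the z score,
    and a two-sided P value from the normal approximation.
    """
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled, pooled_se, z, p_value


# Hypothetical log odds ratios and standard errors for one SNP in two studies.
studies = [(0.20, 0.10), (0.10, 0.10)]
pooled, pooled_se, z, p_value = fixed_effect_meta(studies)
pooled_odds_ratio = math.exp(pooled)
```

Equal standard errors give equal weights, so the pooled log odds ratio here is simply the average of the two study estimates, with a smaller standard error than either study alone.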