ABSTRACT
The intensities from genotyping array data can be used to detect copy number variants (CNVs), but a high level of noise in the data and overlap between different copy-number intensity distributions produce unreliable calls, particularly when only a few probes are covered by the CNV. We present a novel pipeline (CamCNV) with a series of steps to reduce noise and more reliably detect CNVs covering as few as three probes. The pipeline aims to detect rare CNVs (below 1% frequency) for association tests in large cohorts. The method uses the information from all samples to convert intensities to z-scores, thus adjusting for variance between probes. We tested the sensitivity of our pipeline by looking for known CNVs from the 1000 Genomes Project in our genotyping of 1000 Genomes samples. We also compared the CNV calls for 1661 pairs of genotyped replicate samples. At the chosen mean z-score cut-off, sensitivity to detect the 1000 Genomes CNVs was approximately 85% for deletions and 65% for duplications. From the replicates, we estimate that the false discovery rate is controlled at ~10% for deletions (falling to below 3% with more than five probes) and ~28% for duplications. The pipeline demonstrates improved sensitivity when compared to calling with PennCNV, particularly for short deletions covering only a few probes. For each called CNV, the mean z-score is a useful metric for controlling the false discovery rate.
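The per-probe z-score conversion and the per-call mean z-score described in this abstract can be illustrated with a minimal sketch. This is not the CamCNV implementation: the abstract does not describe the noise-reduction or segmentation steps, so they are omitted, and the function names, the simulated intensities, and the size of the simulated deletion are all assumptions made for illustration.

```python
import numpy as np


def probe_z_scores(intensities):
    """Convert a samples x probes intensity matrix to per-probe z-scores.

    Each probe (column) is centred and scaled using all samples, which
    adjusts for differences in variance between probes.
    """
    mean = intensities.mean(axis=0)
    sd = intensities.std(axis=0, ddof=1)
    return (intensities - mean) / sd


def mean_z_for_call(z, sample, start, end):
    """Mean z-score of one sample over probes start..end-1 (a candidate CNV)."""
    return float(z[sample, start:end].mean())


# Simulated data (assumption): 200 samples, 10 probes with unequal noise.
rng = np.random.default_rng(0)
probe_sd = np.linspace(0.1, 0.5, 10)
intensities = rng.normal(0.0, probe_sd, size=(200, 10))

# Simulate a deletion in sample 0 covering three probes (indices 3, 4, 5)
# by lowering the intensity by six probe-specific standard deviations.
intensities[0, 3:6] -= 6 * probe_sd[3:6]

z = probe_z_scores(intensities)
mean_z = mean_z_for_call(z, sample=0, start=3, end=6)
print(f"mean z-score over the simulated deletion: {mean_z:.2f}")
```

A strongly negative mean z-score marks the simulated deletion even though the three probes have different raw variances. In a real analysis, the cut-off applied to this value would be chosen from sensitivity and replicate data, as the abstract describes.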
Subjects
DNA Copy Number Variations , High-Throughput Nucleotide Sequencing , Genotype , Humans , Reproducibility of Results
ABSTRACT
Genome-wide association studies (GWAS) in crops require genotyping platforms that are capable of producing accurate, high-density genotyping data on hundreds of plants in a cost-effective manner. Multiple commercial platforms are currently available and are being used effectively across crops. These platforms include genotyping arrays, such as the Illumina Infinium arrays and the Applied Biosystems Axiom arrays, along with a variety of resequencing methods. These methods are being used to genotype from tens of thousands up to millions of markers on GWAS panels, in crops ranging from those with simple genomes to those with very complex, large, polyploid genomes. Depending on the crop and the goal of the GWAS, there are several options and practical considerations to take into account when selecting a genotyping technology to ensure that the right coverage, accuracy, and cost for the study are achieved.
Subjects
Crops, Agricultural , Genome-Wide Association Study , Crops, Agricultural/genetics , Genome , Genotype , High-Throughput Nucleotide Sequencing/methods
ABSTRACT
The role of chromosome Y in chronic kidney disease (CKD) remains unknown, as chromosome Y is typically excluded from genetic analysis in CKD. The complex, sex-specific presentation of CKD could be influenced by chromosome Y genetic variation, but there is limited published research available to confirm or reject this hypothesis. Although chromosome Y has traditionally been thought to be associated with male-specific disease, evidence linking chromosome Y genetic variation to common complex disorders highlights a potential gap in CKD research. Chromosome Y variation has been associated with cardiovascular disease, a condition closely linked to CKD and one with a very similar sexual dimorphism. Relatively few sources of genetic variation in chromosome Y have been examined in CKD. The association between chromosome Y aneuploidy and CKD has never been explored comprehensively, while analyses of microdeletions, copy number variation, and single-nucleotide polymorphisms in CKD have been largely limited to the autosomes or chromosome X. In many studies, it is unclear whether the analyses excluded chromosome Y or simply did not report negative results. Lack of imputation, poor cross-study comparability, and the requirement for separate or additional analyses in comparison with autosomal chromosomes mean that chromosome Y is under-investigated in the context of CKD. Limitations in genotyping arrays could be overcome through whole-chromosome sequencing of chromosome Y, which may allow analysis of many different types of genetic variation across the chromosome to determine whether chromosome Y genetic variation is associated with CKD.
ABSTRACT
AIM: We evaluated the pharmacogenetic content of commercial human genome-wide genotyping arrays, as this content is a critical determinant of pharmacogenomic discovery. METHODS: Using bioinformatics approaches, we assessed 27,811 genetic variants in 3146 genes for their presence in 18 Illumina and 15 Affymetrix genome-wide arrays. RESULTS: The pharmacogenetic content of the arrays varied greatly. The combination of the Affymetrix precision medicine array and PharmacoScan arrays (Affymetrix) had the highest coverage for a set of clinically actionable absorption, distribution, metabolism and excretion (ADME) variants, single nucleotide ADME variants and ADME insertions/deletions, with a physical coverage of 125/130 (96.2%), 9924/24,138 (41.1%) and 2252/3994 (56.4%), respectively. CONCLUSION: The combination of the Affymetrix precision medicine array and PharmacoScan arrays provided both genome-wide and pharmacogene coverage, which is crucial for discovering new variants responsible for adverse drug effects. These results will help in the design of pharmacogenomic studies and will enable a critical review of results from past studies.
Subjects
Genome, Human/genetics , Oligonucleotide Array Sequence Analysis/methods , Pharmacogenetics/methods , Drug-Related Side Effects and Adverse Reactions/genetics , Genome-Wide Association Study/methods , Genotype , Humans , Polymorphism, Single Nucleotide/genetics , Precision Medicine/methods
ABSTRACT
OBJECTIVES: Despite recent advancements in diagnostic tools, the genomic landscape of hereditary hearing loss remains largely uncharacterized. One strategy to understand genome-wide aberrations includes the analysis of copy number variation that can be mapped using SNP-microarray technology. A growing collection of literature has begun to uncover the importance of copy number variation in hereditary hearing loss. This pilot study underpins a larger effort that involves the stage-wise analysis of hearing loss patients, many of whom have advanced to high-throughput sequencing analysis. DATA DESCRIPTION: Our data originate from the Infinium HumanOmni1-Quad v1.0 SNP-microarrays (Illumina) that provide useful markers for genome-wide association studies and copy number variation analysis. This dataset comprises a cohort of 108 individuals (99 with hearing loss, 9 normal hearing family members) for the purpose of understanding the genetic contribution of copy number variations to hereditary hearing loss. These anonymized SNP-microarray data have been uploaded to the NCBI Gene Expression Omnibus and are intended to benefit other investigators interested in aggregating platform-matched array patient datasets or as part of a supporting reference tool for other laboratories to better understand recurring copy number variations in other genetic disorders.
Subjects
Genome-Wide Association Study/methods , Hearing Loss/genetics , High-Throughput Nucleotide Sequencing/methods , Microarray Analysis/methods , DNA Copy Number Variations , Datasets as Topic , Humans , Pilot Projects , Polymorphism, Single Nucleotide
ABSTRACT
Allele-specific (AS) assessment of chromatin has the potential to elucidate specific cis-regulatory mechanisms, which are predicted to underlie the majority of the known genetic associations to complex disease. However, development of chromatin landscapes at allelic resolution has been challenging since sites of variable signal strength require substantial read depths not commonly applied in sequencing based approaches. In this study, we addressed this by performing parallel analyses of input DNA and chromatin immunoprecipitates (ChIP) on high-density Illumina genotyping arrays. Allele-specificity for the histone modifications H3K4me1, H3K4me3, H3K27ac, H3K27me3, and H3K36me3 was assessed using ChIP samples generated from 14 lymphoblast and 6 fibroblast cell lines. AS-ChIP SNPs were combined into domains and validated using high-confidence ChIP-seq sites. We observed characteristic patterns of allelic-imbalance for each histone-modification around allele-specifically expressed transcripts. Notably, we found H3K4me1 to be significantly anti-correlated with allelic expression (AE) at transcription start sites, indicating H3K4me1 allelic imbalance as a marker of AE. We also found that allelic chromatin domains exhibit population and cell-type specificity as well as heritability within trios. Finally, we observed that a subset of allelic chromatin domains is regulated by DNase I-sensitive quantitative trait loci and that these domains are significantly enriched for genome-wide association studies hits, with autoimmune disease associated SNPs specifically enriched in lymphoblasts. This study provides the first genome-wide maps of allelic-imbalance for five histone marks. Our results provide new insights into the role of chromatin in cis-regulation and highlight the need for high-depth sequencing in ChIP-seq studies along with the need to improve allele-specificity of ChIP-enrichment.