SNPQC--an R pipeline for quality control of Illumina SNP genotyping array data.
Anim Genet; 45(5): 758-61, 2014 Oct.
Article in En | MEDLINE | ID: mdl-25040453
ABSTRACT
In genome-wide association studies, quality control (QC) of genotypes is important to avoid spurious results. It is also important to maintain long-term data integrity, particularly in settings with ongoing genotyping (e.g. estimation of genomic breeding values). Here we discuss SNPQC, a fully automated pipeline to perform QC analyses of Illumina SNP array data. It applies a wide range of common quality metrics with user-defined filtering thresholds to generate a comprehensive QC report and a filtered dataset, including a genomic relationship matrix, ready for downstream analyses, which makes it amenable to integration in high-throughput environments. SNPQC also builds a database to store genotypes, phenotypes and quality metrics, both to ensure data integrity and to allow samples from subsequent runs to be integrated. The program is generic across species and array designs, providing a convenient interface between the genotyping laboratory and downstream genome-wide association study or genomic prediction.
Full text: 1
Collections: 01-international
Database: MEDLINE
Main subject: Software / Electronic Data Processing / Oligonucleotide Array Sequence Analysis
Language: En
Publication year: 2014
Document type: Article