Results 1 - 9 of 9
1.
PLoS One ; 12(9): e0182438, 2017.
Article in English | MEDLINE | ID: mdl-28926565

ABSTRACT

In the current precision medicine era, more and more samples are genotyped and sequenced. Both researchers and commercial companies expend significant time and resources to reduce the error rate. However, it has been reported that there is a sample mix-up rate of between 0.1% and 1%, not to mention the possibly higher mix-up rate during the downstream genetic reporting processes. Even at the low end of this estimate, this translates to a significant number of mislabeled samples, especially over the projected one billion people who will be sequenced within the next decade. Here, we first describe a method to identify a small set of single nucleotide polymorphisms (SNPs) that can uniquely identify a personal genome, which utilizes allele frequencies of five major continental populations reported in the 1000 Genomes Project and the ExAC Consortium. To make this panel more informative, we added four SNPs that are commonly used to predict ABO blood type, and another two SNPs that are capable of predicting sex. We then implement a web interface (http://qrcme.tech), nicknamed QRC (for QR code based Concordance check), which is capable of extracting the relevant ID SNPs from raw genetic data, coding their genotypes as a quick response (QR) code, and comparing QR codes to report the concordance of underlying genetic datasets. The resulting 80 fingerprinting SNPs represent a significant decrease in complexity and the number of markers used for genetic data labelling and tracking. Our method and web tool are easily accessible to both researchers and the general public who consider the accuracy of complex genetic data as a prerequisite towards precision medicine.
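As a rough illustration of the concordance check described in this abstract, the sketch below compares two genotype fingerprints over the SNPs they share. It is not the QRC tool's implementation: the rsIDs, the dictionary layout, and the function name are hypothetical, and the QR encoding step is left out.

```python
def concordance(a, b):
    """Fraction of shared SNPs with matching unordered genotypes.

    `a` and `b` map a SNP ID to a two-letter genotype string such as "AG".
    Returns None when the two samples share no genotyped SNP.
    """
    shared = [snp for snp in a if snp in b and a[snp] and b[snp]]
    if not shared:
        return None
    # "AG" and "GA" are the same genotype, so compare sorted alleles.
    matches = sum(sorted(a[snp]) == sorted(b[snp]) for snp in shared)
    return matches / len(shared)


# Hypothetical fingerprints (placeholder rsIDs, not the published panel).
sample_1 = {"rs0001": "AG", "rs0002": "CC", "rs0003": "TT"}
sample_2 = {"rs0001": "GA", "rs0002": "CC", "rs0003": "CT"}
print(concordance(sample_1, sample_2))
```

A concordance near 1 would suggest the two datasets come from the same person; where to place the cut-off is a choice this sketch does not make.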


Subject(s)
Single Nucleotide Polymorphism , User-Computer Interface , Gene Frequency , Human Genome , Genotype , Humans , Internet
2.
Genetics ; 197(1): 91-106, 2014 May.
Article in English | MEDLINE | ID: mdl-24578350

ABSTRACT

Since the publication of the first comprehensive linkage map for the laboratory mouse, the architecture of recombination as a basic biological process has become amenable to investigation in mammalian model organisms. Here we take advantage of high-density genotyping and the unique pedigree structure of the incipient Collaborative Cross to investigate the roles of sex and genetic background in mammalian recombination. Our results confirm the observation that map length is longer when measured through female meiosis than through male meiosis, but we find that this difference is modified by genotype at loci on both the X chromosome and the autosomes. In addition, we report a striking concentration of crossovers in the distal ends of autosomes in male meiosis that is absent in female meiosis. The presence of this pattern in both single- and double-recombinant chromosomes, combined with the absence of a corresponding asymmetry in the distribution of double-strand breaks, indicates a regulated sequence of events specific to male meiosis that is anchored by chromosome ends. This pattern is consistent with the timing of chromosome pairing and evolutionary constraints on male recombination. Finally, we identify large regions of reduced crossover frequency that together encompass 5% of the genome. Many of these "cold regions" are enriched for segmental duplications, suggesting an inverse local correlation between recombination rate and mutation rate for large copy number variants.


Subject(s)
Chromosome Mapping , Genetic Crossing Over/genetics , Sex Characteristics , Spermatozoa/metabolism , Animals , Female , Genomics , Genotyping Techniques , Male , Mice , Pedigree , Siblings , Species Specificity
3.
Bioinformatics ; 29(21): 2744-9, 2013 Nov 01.
Article in English | MEDLINE | ID: mdl-23956302

ABSTRACT

SUMMARY: Although the 1000 Genomes haplotypes are the most commonly used reference panel for imputation, medical sequencing projects are generating large alternate sets of sequenced samples. Imputation in African Americans using 3384 haplotypes from the Exome Sequencing Project, compared with 2184 haplotypes from the 1000 Genomes Project, increased effective sample size by 8.3-11.4% for coding variants with minor allele frequency <1%. No loss of imputation quality was observed using a panel built from phenotypic extremes. We recommend using haplotypes from the Exome Sequencing Project alone or a concatenation of the two panels over quality score-based post-imputation selection or IMPUTE2's two-panel combination. CONTACT: yunli@med.unc.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Black or African American/genetics , Exome , Genetic Variation , DNA Sequence Analysis/methods , Gene Frequency , Human Genome , Genome-Wide Association Study , Haplotypes , Humans , Phenotype , Single Nucleotide Polymorphism
4.
Bioinformatics ; 29(4): 528-31, 2013 Feb 15.
Article in English | MEDLINE | ID: mdl-23292738

ABSTRACT

MOTIVATION: Genotype imputation has become an indispensable step in genome-wide association studies (GWAS). Imputation accuracy, which directly influences downstream analysis, has been shown to improve with re-sequencing-based reference panels; however, this comes at the cost of a high computational burden due to the huge number of potentially imputable markers (tens of millions) discovered through sequencing a large number of individuals. Therefore, there is an increasing need for access to imputation quality information without actually conducting imputation. To facilitate this process, we have established a publicly available SNP and indel imputability database, aiming to provide direct access to imputation accuracy information for markers identified by the 1000 Genomes Project across four major populations and covering multiple GWAS genotyping platforms. RESULTS: SNP and indel imputability information can be retrieved through a user-friendly interface by providing the ID(s) of the desired variant(s) or by specifying the desired genomic region. The query results can be refined by selecting relevant GWAS genotyping platform(s). This is the first database providing variant imputability information specific to each continental group and to each genotyping platform. In Filipino individuals from the Cebu Longitudinal Health and Nutrition Survey, our database can achieve an area under the receiver-operating characteristic curve of 0.97, 0.91, 0.88 and 0.79 for markers with minor allele frequency >5%, 3-5%, 1-3% and 0.5-1%, respectively. Specifically, by filtering out 48.6% of markers (corresponding to a reduction of up to 48.6% in computational costs for actual imputation) based on the imputability information in our database, we can remove 77%, 58%, 51% and 42% of the poorly imputed markers at the cost of only 0.3%, 0.8%, 1.5% and 4.6% of the well-imputed markers with minor allele frequency >5%, 3-5%, 1-3% and 0.5-1%, respectively.
AVAILABILITY: http://www.unc.edu/~yunmli/imputability.html
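The trade-off reported in the RESULTS (share of markers filtered out, share of poorly imputed markers removed, share of well-imputed markers lost) can be tallied as in the sketch below. The data layout, the cut-off value, and the toy markers are assumptions made for illustration; they do not come from the database or its interface.

```python
def filtering_tradeoff(markers, cutoff):
    """Summarise a pre-imputation filter on predicted imputability.

    `markers` is a list of (predicted_score, well_imputed) pairs, where
    `well_imputed` is the outcome observed after actually imputing.
    Markers with predicted_score < cutoff are filtered out.
    """
    removed = [m for m in markers if m[0] < cutoff]
    n_poor = sum(1 for m in markers if not m[1])
    n_well = sum(1 for m in markers if m[1])
    poor_removed = sum(1 for m in removed if not m[1])
    well_removed = sum(1 for m in removed if m[1])
    return {
        "filtered": len(removed) / len(markers),
        "poor_removed": poor_removed / n_poor if n_poor else 0.0,
        "well_lost": well_removed / n_well if n_well else 0.0,
    }


# Toy markers: (predicted imputability, was it well imputed in practice?)
toy = [(0.95, True), (0.90, True), (0.40, False),
       (0.30, False), (0.85, False), (0.45, True)]
print(filtering_tradeoff(toy, cutoff=0.5))
```

Raising the cut-off removes more poorly imputed markers but also discards more well-imputed ones, which is the balance the abstract quantifies per frequency bin.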


Subject(s)
Nucleic Acid Databases , Genome-Wide Association Study , INDEL Mutation , Single Nucleotide Polymorphism , Asian People/genetics , Gene Frequency , Genome , Genotype , Humans , Philippines , Software
5.
Stat Biosci ; 5(1): 3-25, 2013 May.
Article in English | MEDLINE | ID: mdl-24489615

ABSTRACT

Massively parallel sequencing (MPS), since its debut in 2005, has transformed the field of genomic studies. These new sequencing technologies have resulted in the successful identification of causal variants for several rare Mendelian disorders. They have also begun to deliver on their promise to explain some of the missing heritability from genome-wide association studies (GWAS) of complex traits. We anticipate a rapidly growing number of MPS-based studies for a diverse range of applications in the near future. One crucial and nearly inevitable step is to detect SNPs and call genotypes at the detected polymorphic sites from the sequencing data. Here, we review statistical methods that have been proposed in the past five years for this purpose. In addition, we discuss emerging issues and future directions related to SNP detection and genotype calling from MPS data.

6.
Genet Epidemiol ; 37(1): 25-37, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23074066

ABSTRACT

Imputation in admixed populations is an important problem but challenging due to the complex linkage disequilibrium (LD) pattern. The emergence of large reference panels such as that from the 1,000 Genomes Project enables more accurate imputation in general, and in particular for admixed populations and for uncommon variants. To efficiently benefit from these large reference panels, one key issue to consider in a modern genotype imputation framework is the selection of effective reference panels. In this work, we consider a number of methods for effective reference panel construction inside a hidden Markov model and specific to each target individual. These methods fall into two categories: identity-by-state (IBS) based and ancestry-weighted approaches. We evaluated the performance on individuals from recently admixed populations. Our target samples include 8,421 African Americans and 3,587 Hispanic Americans from the Women's Health Initiative, which allow assessment of imputation quality for uncommon variants. Our experiments include both large and small reference panels; large, medium, and small target samples; and genome regions of varying levels of LD. We also include BEAGLE and IMPUTE2 for comparison. Experimental results with a large reference panel suggest that our novel piecewise IBS method yields consistently higher imputation quality than other methods/software. The advantage is particularly noteworthy among uncommon variants, where we observe up to 5.1% information gain with the difference being highly significant (Wilcoxon signed rank test P-value < 0.0001). Our work is the first that considers various sensible approaches for imputation in admixed populations and presents a comprehensive comparison.
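The idea of choosing reference haplotypes by identity-by-state, either over the whole region or piece by piece, can be sketched as follows. This is a simplified reading of the abstract and not the authors' algorithm: the 0/1 string encoding, the fixed window size, the function names, and the toy panel are all assumptions.

```python
def ibs_score(target, ref):
    """Fraction of sites at which two equally long haplotypes carry the same allele."""
    return sum(t == r for t, r in zip(target, ref)) / len(target)


def select_by_ibs(target, panel, k):
    """Names of the k reference haplotypes most similar to the target."""
    ranked = sorted(panel, key=lambda name: ibs_score(target, panel[name]),
                    reverse=True)
    return ranked[:k]


def piecewise_select(target, panel, k, window):
    """Repeat the selection independently inside consecutive windows."""
    picks = []
    for start in range(0, len(target), window):
        piece = target[start:start + window]
        sub_panel = {name: hap[start:start + window]
                     for name, hap in panel.items()}
        picks.append(select_by_ibs(piece, sub_panel, k))
    return picks


# Toy haplotypes coded as 0/1 strings (hypothetical data).
target = "01100110"
panel = {"h1": "01100001", "h2": "10010110", "h3": "01110100"}
print(select_by_ibs(target, panel, k=1))
print(piecewise_select(target, panel, k=1, window=4))
```

In the toy data, the haplotype that is closest overall (h3) is not the closest in either window, which gives some intuition for why a piecewise selection could help in admixed samples whose local ancestry changes along the chromosome.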


Subject(s)
Population Genetics , Genotype , Genetic Models , Software , Black or African American/genetics , Female , Human Genome , HapMap Project , Haplotypes , Hispanic or Latino/genetics , Humans , Linkage Disequilibrium , Markov Chains , Pedigree
7.
Genet Epidemiol ; 36(2): 107-17, 2012 Feb.
Article in English | MEDLINE | ID: mdl-22851474

ABSTRACT

Genetic imputation has become standard practice in modern genetic studies. However, several important issues have not been adequately addressed, including the utility of study-specific reference panels, performance in admixed populations, and quality for less common (minor allele frequency [MAF] 0.005-0.05) and rare (MAF < 0.005) variants. These issues only recently became addressable with genome-wide association study (GWAS) follow-up studies using dense genotyping or sequencing in large samples of non-European individuals. In this work, we constructed a study-specific reference panel of 3,924 haplotypes using African Americans in the Women's Health Initiative (WHI) genotyped on both the Metabochip and the Affymetrix 6.0 GWAS platform. We used this reference panel to impute into 6,459 WHI SNP Health Association Resource (SHARe) study subjects with only GWAS genotypes. Our analysis confirmed the imputation quality metric Rsq (estimated r², specific to each SNP) as an effective post-imputation filter. We recommend different Rsq thresholds for different MAF categories such that the average (across SNPs) Rsq is above the desired dosage r² (squared Pearson correlation between imputed and experimental genotypes). With a desired dosage r² of 80%, 99.9% (97.5%, 83.6%, 52.0%, 20.5%) of SNPs with MAF > 0.05 (0.03-0.05, 0.01-0.03, 0.005-0.01, and 0.001-0.005) passed the post-imputation filter. The average dosage r² for these SNPs is 94.7%, 92.1%, 89.0%, 83.1%, and 79.7%, respectively. These results suggest that for African Americans, imputation of Metabochip SNPs from GWAS data, including low frequency SNPs with MAF 0.005-0.05, is feasible and worthwhile for power increase in downstream association analysis, provided a sizable reference panel is available.
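The recommendation of a separate Rsq threshold per MAF category, chosen so that the average Rsq of the retained SNPs reaches the desired level, can be illustrated with the sketch below. It is one plausible reading of that rule rather than the authors' procedure; the function name and the toy Rsq values are hypothetical.

```python
def rsq_threshold(rsq_values, target):
    """Smallest cut-off t such that SNPs with Rsq >= t average at least `target`.

    Returns None when even the best SNP alone falls short of the target.
    Candidate cut-offs are the observed Rsq values, tried from low to high;
    the mean of the retained SNPs can only rise as the cut-off rises.
    """
    for t in sorted(set(rsq_values)):
        kept = [r for r in rsq_values if r >= t]
        if sum(kept) / len(kept) >= target:
            return t
    return None


# Toy Rsq values for two MAF bins (hypothetical numbers).
common_bin = [0.95, 0.90, 0.85, 0.60, 0.40]
rare_bin = [0.70, 0.50, 0.30]
print(rsq_threshold(common_bin, target=0.80))
print(rsq_threshold(rare_bin, target=0.80))
```

Because rarer bins tend to have lower Rsq overall, the same target leads to a stricter cut-off there, and so to a smaller share of SNPs passing, in line with the pass rates quoted in the abstract.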


Subject(s)
Oligonucleotide Array Sequence Analysis/methods , Single Nucleotide Polymorphism , Black or African American , Alleles , Female , Gene Frequency , Human Genome , Genome-Wide Association Study , Genotype , Haplotypes , Humans , Genetic Models , Phenotype , Reproducibility of Results , Software , United States , Women's Health
8.
Genome Res ; 21(8): 1213-22, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21406540

ABSTRACT

The Collaborative Cross (CC) is a mouse recombinant inbred strain panel that is being developed as a resource for mammalian systems genetics. Here we describe an experiment that uses partially inbred CC lines to evaluate the genetic properties and utility of this emerging resource. Genome-wide analysis of the incipient strains reveals high genetic diversity, balanced allele frequencies, and dense, evenly distributed recombination sites, all ideal qualities for a systems genetics resource. We map discrete, complex, and biomolecular traits and contrast two quantitative trait locus (QTL) mapping approaches. Analysis based on inferred haplotypes improves power, reduces false discovery, and provides information to identify and prioritize candidate genes that is unique to multifounder crosses like the CC. The number of expression QTLs discovered here exceeds all previous efforts at eQTL mapping in mice, and we map local eQTL at 1-Mb resolution. We demonstrate that the genetic diversity of the CC, which derives from random mixing of eight founder strains, results in high phenotypic diversity and enhances our ability to map causative loci underlying complex disease-related traits.


Subject(s)
Genome , Quantitative Trait Loci , Animals , Genetic Crosses , Female , Gene Expression , Genetic Association Studies , Haplotypes , Male , Mice , Phenotype
9.
Bioinformatics ; 26(12): i199-207, 2010 Jun 15.
Article in English | MEDLINE | ID: mdl-20529906

ABSTRACT

MOTIVATION: High-density SNP data of model animal resources provide opportunities for fine-resolution genetic variation studies. These genetic resources are generated through a variety of breeding schemes that involve multiple generations of matings derived from a set of founder animals. In this article, we investigate the problem of inferring the most probable ancestry of resulting genotypes, given a set of founder genotypes. Due to computational difficulty, existing methods either handle only small pedigree data or disregard the pedigree structure. However, large pedigrees of model animal resources often contain repetitive substructures that can be utilized in accelerating computation. RESULTS: We present an accurate and efficient method that can accept complex pedigrees with inbreeding in inferring genome ancestry. Inbreeding is a commonly used process in generating genetically diverse and reproducible animals. It is often carried out for many generations and can account for most of the computational complexity in real-world model animal pedigrees. Our method builds a hidden Markov model that derives the ancestry probabilities through the inbreeding process without explicit modeling in every generation. The ancestry inference is accurate and fast, independent of the number of generations, for model animal resources such as the Collaborative Cross (CC). Experiments on both simulated and real CC data demonstrate that our method offers comparable accuracy to those methods that build an explicit model of the entire pedigree, but much better scalability with respect to the pedigree size.
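For readers unfamiliar with ancestry inference by hidden Markov model, the sketch below runs a generic Viterbi decoder that labels each site of an observed haplotype with its most probable founder. It is a textbook stand-in, not the pedigree-aware model of this paper: the haploid 0/1 encoding, the constant `switch` and `error` probabilities, and the two toy founders are assumptions.

```python
import math


def viterbi_ancestry(observed, founders, switch=0.05, error=0.01):
    """Most probable founder label per site (needs at least two founders).

    Hidden state: the founder the site descends from.
    Emission: the founder's allele, flipped with probability `error`.
    Transition: change founder between adjacent sites with probability `switch`.
    """
    names = list(founders)
    stay = math.log(1 - switch)
    move = math.log(switch / (len(names) - 1))

    def emit(name, i):
        same = founders[name][i] == observed[i]
        return math.log(1 - error) if same else math.log(error)

    score = {f: math.log(1 / len(names)) + emit(f, 0) for f in names}
    back = []
    for i in range(1, len(observed)):
        new_score, pointers = {}, {}
        for f in names:
            # Best predecessor state for being in founder f at site i.
            prev = max(names,
                       key=lambda p: score[p] + (stay if p == f else move))
            new_score[f] = (score[prev] + (stay if prev == f else move)
                            + emit(f, i))
            pointers[f] = prev
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(names, key=lambda f: score[f])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Two toy founders; the observed haplotype switches ancestry halfway.
founders = {"A": "000000", "B": "111111"}
print(viterbi_ancestry("000111", founders))
```

An explicit pedigree model would make the transition probabilities depend on the breeding design and the number of generations, which is where the computational cost discussed in the abstract comes from.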


Subject(s)
Genome , Genomics/methods , Inbreeding , Pedigree , Animals , Genetic Variation , Genotype , Single Nucleotide Polymorphism