Results 1 - 20 of 51
1.
Cell ; 168(5): 830-842.e7, 2017 02 23.
Article in English | MEDLINE | ID: mdl-28235197

ABSTRACT

De novo copy number variants (dnCNVs) arising at multiple loci in a personal genome have usually been considered to reflect cancer somatic genomic instabilities. We describe a multiple dnCNV (MdnCNV) phenomenon in which individuals with genomic disorders carry five to ten constitutional dnCNVs. These CNVs originate from independent formation incidences, are predominantly tandem duplications or complex gains, exhibit breakpoint junction features reminiscent of replicative repair, and show increased de novo point mutations flanking the rearrangement junctions. The active CNV mutation shower appears to be restricted to a transient perizygotic period. We propose that a defect in the CNV formation process is responsible for the "CNV-mutator state," and this state is dampened after early embryogenesis. The constitutional MdnCNV phenomenon resembles chromosomal instability in various cancers. Investigations of this phenomenon may provide unique access to understanding genomic disorders, structural variant mutagenesis, human evolution, and cancer biology.


Subject(s)
Chromosome Aberrations , DNA Copy Number Variations , Genetic Diseases, Inborn/embryology , Genetic Diseases, Inborn/genetics , Genomic Instability , Mutation , Chromosome Breakpoints , Chromosome Duplication , DNA Replication , Embryonic Development , Female , Gametogenesis , Humans , Male
2.
Cell ; 167(5): 1415-1429.e19, 2016 11 17.
Article in English | MEDLINE | ID: mdl-27863252

ABSTRACT

Many common variants have been associated with hematological traits, but identification of causal genes and pathways has proven challenging. We performed a genome-wide association analysis in the UK Biobank and INTERVAL studies, testing 29.5 million genetic variants for association with 36 red cell, white cell, and platelet properties in 173,480 European-ancestry participants. This effort yielded hundreds of low-frequency (<5%) and rare (<1%) variants with a strong impact on blood cell phenotypes. Our data highlight general properties of the allelic architecture of complex traits, including the proportion of the heritable component of each blood trait explained by the polygenic signal across different genome regulatory domains. Finally, through Mendelian randomization, we provide evidence of shared genetic pathways linking blood cell indices with complex pathologies, including autoimmune diseases, schizophrenia, and coronary heart disease, and evidence suggesting that previously reported population associations between blood cell indices and cardiovascular disease may be non-causal.


Subject(s)
Genetic Variation , Genome-Wide Association Study , Hematopoietic Stem Cells/metabolism , Immune System Diseases/genetics , Alleles , Cell Differentiation , Genetic Predisposition to Disease , Hematopoietic Stem Cells/pathology , Humans , Immune System Diseases/pathology , Polymorphism, Single Nucleotide , Quantitative Trait Loci , White People/genetics
3.
Hum Mol Genet ; 32(5): 790-797, 2023 02 19.
Article in English | MEDLINE | ID: mdl-36136759

ABSTRACT

Few genome-wide association studies (GWAS) analyzing genetic regulation of morphological traits of white blood cells have been reported. We carried out a GWAS of 12 morphological traits in 869 individuals from the general population of Sardinia, Italy. These traits included measures of cell volume, conductivity and light scatter in four white-cell populations (eosinophils, lymphocytes, monocytes, neutrophils). This analysis yielded seven statistically significant signals, four of which were novel (PRG2, P2RX3, and two at CDK6). Five signals were replicated in the independent INTERVAL cohort of 11 822 individuals. The most interesting signal, with a large effect size on eosinophil scatter (P-value = 8.33 × 10(-32), beta = -1.651, se = 0.1351), falls within the innate immunity cluster on chromosome 11 and is located in the PRG2 gene. Computational analyses revealed that a rare, Sardinian-specific PRG2:p.Ser148Pro mutation modifies PRG2 amino acid contacts and protein dynamics in a manner that could possibly explain the changes observed in eosinophil morphology. Our discoveries shed light on the genetics of morphological traits. For the first time, we describe such a large effect size on eosinophil morphology that is relatively frequent in the Sardinian population.


Subject(s)
Eosinophils , Genome-Wide Association Study , Humans , Chromosomes, Human, Pair 11 , Polymorphism, Single Nucleotide , Immunity, Innate
4.
Am J Hum Genet ; 109(6): 1038-1054, 2022 06 02.
Article in English | MEDLINE | ID: mdl-35568032

ABSTRACT

Metabolite levels measured in the human population are endophenotypes for biological processes. We combined sequencing data for 3,924 (whole-exome sequencing, WES, discovery) and 2,805 (whole-genome sequencing, WGS, replication) donors from a prospective cohort of blood donors in England. We used multiple approaches to select and aggregate rare genetic variants (minor allele frequency [MAF] < 0.1%) in protein-coding regions and tested their associations with 995 metabolites measured in plasma by using ultra-high-performance liquid chromatography-tandem mass spectrometry. We identified 40 novel associations implicating rare coding variants (27 genes and 38 metabolites), of which 28 (15 genes and 28 metabolites) were replicated. We developed algorithms to prioritize putative driver variants at each locus and used mediation and Mendelian randomization analyses to test the directionality of associations between metabolite and protein levels at the ACY1 locus. Overall, 66% of reported associations implicate gene targets of approved drugs or bioactive drug-like compounds, contributing to drug target validation efforts.


Subject(s)
Exome , Exome/genetics , Gene Frequency/genetics , Humans , Prospective Studies , Exome Sequencing/methods , Whole Genome Sequencing
5.
PLoS Genet ; 18(11): e1010367, 2022 11.
Article in English | MEDLINE | ID: mdl-36327219

ABSTRACT

Host genetics is a key determinant of COVID-19 outcomes. Previously, the COVID-19 Host Genetics Initiative genome-wide association study used common variants to identify multiple loci associated with COVID-19 outcomes. However, variants with the largest impact on COVID-19 outcomes are expected to be rare in the population. Hence, studying rare variants may provide additional insights into disease susceptibility and pathogenesis, thereby informing therapeutics development. Here, we combined whole-exome and whole-genome sequencing from 21 cohorts across 12 countries and performed rare variant exome-wide burden analyses for COVID-19 outcomes. In an analysis of 5,085 severe disease cases and 571,737 controls, we observed that carrying a rare deleterious variant in the SARS-CoV-2 sensor toll-like receptor TLR7 (on chromosome X) was associated with a 5.3-fold increase in severe disease (95% CI: 2.75-10.05, p = 5.41 × 10(-7)). This association was consistent across sexes. These results further support TLR7 as a genetic determinant of severe disease and suggest that larger studies on rare variants influencing COVID-19 outcomes could provide additional insights.


Subject(s)
COVID-19 , Exome , Humans , Exome/genetics , Genome-Wide Association Study , COVID-19/genetics , Genetic Predisposition to Disease , Toll-Like Receptor 7/genetics , SARS-CoV-2/genetics
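
As a rough consistency check on the figures quoted in the abstract of entry 5, the sketch below back-calculates a Wald z-statistic and a two-sided p-value from the reported 95% confidence interval. It assumes the interval is symmetric on the log odds-ratio scale and was built with a normal approximation; the abstract does not say how the interval was computed, and the function name `wald_from_ci` is illustrative only.

```python
import math

def wald_from_ci(lower, upper, z_crit=1.959964):
    """Recover point estimate, standard error, z and two-sided p
    from a confidence interval assumed symmetric on the log scale."""
    log_lo, log_hi = math.log(lower), math.log(upper)
    log_or = (log_lo + log_hi) / 2           # midpoint on the log scale
    se = (log_hi - log_lo) / (2 * z_crit)    # half-width divided by the critical value
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided normal tail probability
    return math.exp(log_or), se, z, p

# Interval quoted in the abstract: 95% CI 2.75-10.05
or_hat, se, z, p = wald_from_ci(2.75, 10.05)
print(f"OR ~ {or_hat:.2f}, SE(log OR) ~ {se:.3f}, z ~ {z:.2f}, p ~ {p:.2e}")
```

Under these assumptions the recovered odds ratio is close to the reported 5.3, and the p-value is of the same order as the reported 5.41 × 10(-7).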
6.
Am J Hum Genet ; 108(10): 1836-1851, 2021 10 07.
Article in English | MEDLINE | ID: mdl-34582791

ABSTRACT

Many common and rare variants associated with hematologic traits have been discovered through imputation on large-scale reference panels. However, the majority of genome-wide association studies (GWASs) have been conducted in Europeans, and determining causal variants has proved challenging. We performed a GWAS of total leukocyte, neutrophil, lymphocyte, monocyte, eosinophil, and basophil counts generated from 109,563,748 variants in the autosomes and the X chromosome in the Trans-Omics for Precision Medicine (TOPMed) program, which included data from 61,802 individuals of diverse ancestry. We discovered and replicated 7 leukocyte trait associations, including (1) the association between a chromosome X, pseudo-autosomal region (PAR), noncoding variant located between cytokine receptor genes (CSF2RA and CRLF2) and lower eosinophil count; and (2) associations between single variants found predominantly among African Americans at the S1PR3 (9q22.1) and HBB (11p15.4) loci and monocyte and lymphocyte counts, respectively. We further provide evidence indicating that the newly discovered eosinophil-lowering chromosome X PAR variant might be associated with reduced susceptibility to common allergic diseases such as atopic dermatitis and asthma. Additionally, we found a burden of very rare FLT3 (13q12.2) variants associated with monocyte counts. Together, these results emphasize the utility of whole-genome sequencing in diverse samples in identifying associations missed by European-ancestry-driven GWASs.


Subject(s)
Asthma/epidemiology , Biomarkers/metabolism , Dermatitis, Atopic/epidemiology , Leukocytes/pathology , Polymorphism, Single Nucleotide , Pulmonary Disease, Chronic Obstructive/epidemiology , Quantitative Trait Loci , Asthma/genetics , Asthma/metabolism , Asthma/pathology , Dermatitis, Atopic/genetics , Dermatitis, Atopic/metabolism , Dermatitis, Atopic/pathology , Genetic Predisposition to Disease , Genome, Human , Genome-Wide Association Study , Humans , National Heart, Lung, and Blood Institute (U.S.) , Phenotype , Prognosis , Proteome/analysis , Proteome/metabolism , Pulmonary Disease, Chronic Obstructive/genetics , Pulmonary Disease, Chronic Obstructive/metabolism , Pulmonary Disease, Chronic Obstructive/pathology , United Kingdom/epidemiology , United States/epidemiology , Whole Genome Sequencing
7.
PLoS Genet ; 16(3): e1008605, 2020 03.
Article in English | MEDLINE | ID: mdl-32150548

ABSTRACT

Circulating metabolite levels are biomarkers for cardiovascular disease (CVD). Here we studied the association of rare variants with 226 serum lipoproteins, lipids and amino acids in 7,142 (discovery plus follow-up) healthy participants. We leveraged the information from multiple metabolite measurements on the same participants to improve discovery in rare variant association analyses for gene-based and gene-set tests by incorporating correlated metabolites as covariates in the validation stage. Gene-based analysis, corrected for the effective number of tests performed, confirmed established associations at APOB, APOC3, PAH, HAL and PCSK (p < 1.32 × 10(-7)) and identified novel gene-trait associations at a lower stringency threshold with ACSL1, MYCN, FBXO36 and B4GALNT3 (p < 2.5 × 10(-6)). Regulation of the pyruvate dehydrogenase (PDH) complex was associated for the first time, in gene-set analyses also corrected for the effective number of tests, with IDL and LDL parameters, as well as circulating cholesterol (pMETASKAT < 2.41 × 10(-6)). In conclusion, using an approach that leverages metabolite measurements obtained in the same participants, we identified novel loci and pathways involved in the regulation of these important metabolic biomarkers. As large-scale biobanks continue to amass sequencing and phenotypic information, analytical approaches such as ours will be useful to fully exploit the copious amounts of biological data generated in these efforts.


Subject(s)
Biomarkers/blood , Cardiovascular Diseases/blood , Cardiovascular Diseases/genetics , Genetic Variation/genetics , Cholesterol/blood , Cholesterol, LDL/blood , Female , Genome-Wide Association Study/methods , Humans , Lipoproteins/blood , Male , Phenotype , Triglycerides/blood
8.
Genet Epidemiol ; 44(1): 79-89, 2020 01.
Article in English | MEDLINE | ID: mdl-31520489

ABSTRACT

Copy number variants (CNVs) play an important role in a number of human diseases, but the accurate calling of CNVs remains challenging. Most current approaches to CNV detection use raw read alignments, which are computationally intensive to process. We use a regression tree-based approach to call germline CNVs from whole-genome sequencing (WGS, >18x) variant call sets in 6,898 samples across four European cohorts, and describe a rich large variation landscape comprising 1,320 CNVs. Eighty-one percent of detected events have been previously reported in the Database of Genomic Variants. Twenty-three percent of high-quality deletions affect entire genes, and we recapitulate known events such as the GSTM1 and RHD gene deletions. We test for association between the detected deletions and 275 protein levels in 1,457 individuals to assess the potential clinical impact of the detected CNVs. We describe complex CNV patterns underlying an association with levels of the CCL3 protein (MAF = 0.15, p = 3.6 × 10(-12)) at the CCL3L3 locus, and a novel cis-association between a low-frequency NOMO1 deletion and NOMO1 protein levels (MAF = 0.02, p = 2.2 × 10(-7)). This study demonstrates that existing population-wide WGS call sets can be mined for germline CNVs with minimal computational overhead, delivering insight into a less well-studied, yet potentially impactful class of genetic variant.


Subject(s)
DNA Copy Number Variations/genetics , Genetics, Population/methods , Genome, Human/genetics , Chemokine CCL3/genetics , Gene Deletion , Genome-Wide Association Study , Genomics , Glutathione Transferase/genetics , Humans , Nodal Protein/genetics , Recombinant Fusion Proteins/genetics , Whole Genome Sequencing
9.
Int J Obes (Lond) ; 45(10): 2221-2229, 2021 10.
Article in English | MEDLINE | ID: mdl-34226637

ABSTRACT

BACKGROUND: Variation in adiposity is associated with cardiometabolic disease outcomes, but mechanisms leading from this exposure to disease are unclear. This study aimed to estimate effects of body mass index (BMI) on an extensive set of circulating proteins. METHODS: We used SomaLogic proteomic data from up to 2737 healthy participants from the INTERVAL study. Associations between self-reported BMI and 3622 unique plasma proteins were explored using linear regression. These were complemented by Mendelian randomisation (MR) analyses using a genetic risk score (GRS) comprised of 654 BMI-associated polymorphisms from a recent genome-wide association study (GWAS) of adult BMI. A disease enrichment analysis was performed using DAVID Bioinformatics 6.8 for proteins which were altered by BMI. RESULTS: Observationally, BMI was associated with 1576 proteins (P < 1.4 × 10(-5)), with particularly strong evidence for a positive association with leptin and fatty acid-binding protein-4 (FABP4), and a negative association with sex hormone-binding globulin (SHBG). Observational estimates were likely confounded, but the GRS for BMI did not associate with measured confounders. MR analyses provided evidence for a causal relationship between BMI and eight proteins including leptin (0.63 standard deviation (SD) per SD BMI, 95% CI 0.48-0.79, P = 1.6 × 10(-15)), FABP4 (0.64 SD per SD BMI, 95% CI 0.46-0.83, P = 6.7 × 10(-12)) and SHBG (-0.45 SD per SD BMI, 95% CI -0.65 to -0.25, P = 1.4 × 10(-5)). There was agreement in the magnitude of observational and MR estimates (R² = 0.33) and evidence that proteins most strongly altered by BMI were enriched for genes involved in cardiovascular disease. CONCLUSIONS: This study provides evidence for a broad impact of adiposity on the human proteome. Proteins strongly altered by BMI include those involved in regulating appetite, sex hormones and inflammation; such proteins are also enriched for cardiovascular disease-related genes. Altogether, results help focus attention onto new proteomic signatures of obesity-related disease.


Subject(s)
Adiposity/physiology , Proteome/analysis , Adult , Body Mass Index , Cohort Studies , Female , Humans , Male , Mendelian Randomization Analysis , Middle Aged , Prospective Studies , Proteome/metabolism , Surveys and Questionnaires
10.
Nature ; 526(7571): 82-90, 2015 Oct 01.
Article in English | MEDLINE | ID: mdl-26367797

ABSTRACT

The contribution of rare and low-frequency variants to human traits is largely unexplored. Here we describe insights from sequencing whole genomes (low read depth, 7×) or exomes (high read depth, 80×) of nearly 10,000 individuals from population-based and disease collections. In extensively phenotyped cohorts we characterize over 24 million novel sequence variants, generate a highly accurate imputation reference panel and identify novel alleles associated with levels of triglycerides (APOB), adiponectin (ADIPOQ) and low-density lipoprotein cholesterol (LDLR and RGAG1) from single-marker and rare variant aggregation tests. We describe population structure and functional annotation of rare and low-frequency variants, use the data to estimate the benefits of sequencing for association studies, and summarize lessons from disease-specific collections. Finally, we make available an extensive resource, including individual-level genetic and phenotypic data and web-based tools to facilitate the exploration of association results.


Subject(s)
Disease/genetics , Genetic Variation/genetics , Genome, Human/genetics , Health , Adiponectin/blood , Alleles , Cohort Studies , Exome/genetics , Female , Genetic Predisposition to Disease/genetics , Genetics, Medical , Genetics, Population , Genome-Wide Association Study , Genomics , Humans , Lipid Metabolism/genetics , Male , Molecular Sequence Annotation , Receptors, LDL/genetics , Reference Standards , Sequence Analysis, DNA , Triglycerides/blood , United Kingdom
11.
Nature ; 526(7571): 112-7, 2015 Oct 01.
Article in English | MEDLINE | ID: mdl-26367794

ABSTRACT

The extent to which low-frequency (minor allele frequency (MAF) between 1-5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is mainly unknown. Bone mineral density (BMD) is highly heritable, a major predictor of osteoporotic fractures, and has been previously associated with common genetic variants, as well as rare, population-specific, coding variants. Here we identify novel non-coding genetic variants with large effects on BMD (n_total = 53,236) and fracture (n_total = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10); a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size fourfold larger than the mean of previously reported common variants for lumbar spine BMD (rs11692564(T), MAF = 1.6%, replication effect size = +0.20 s.d., P_meta = 2 × 10(-14)), which was also associated with a decreased risk of fracture (odds ratio = 0.85; P = 2 × 10(-11); n_cases = 98,742 and n_controls = 409,511). Using an En1(cre/flox) mouse model, we observed that conditional loss of En1 results in low bone mass, probably as a consequence of high bone turnover. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817(T), MAF = 1.2%, replication effect size = +0.41 s.d., P_meta = 1 × 10(-11)). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants. These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of complex traits and disease in the general population.


Subject(s)
Bone Density/genetics , Fractures, Bone/genetics , Genome, Human/genetics , Homeodomain Proteins/genetics , Animals , Bone and Bones/metabolism , Disease Models, Animal , Europe/ethnology , Exome/genetics , Female , Gene Frequency/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Genomics , Genotype , Humans , Mice , Sequence Analysis, DNA , White People/genetics , Wnt Proteins/genetics
12.
Nature ; 526(7571): 75-81, 2015 Oct 01.
Article in English | MEDLINE | ID: mdl-26432246

ABSTRACT

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.


Subject(s)
Genetic Variation/genetics , Genome, Human/genetics , Physical Chromosome Mapping , Amino Acid Sequence , Genetic Predisposition to Disease , Genetics, Medical , Genetics, Population , Genome-Wide Association Study , Genomics , Genotype , Haplotypes/genetics , Homozygote , Humans , Molecular Sequence Data , Mutation Rate , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Sequence Analysis, DNA , Sequence Deletion/genetics
13.
Am J Hum Genet ; 100(6): 865-884, 2017 Jun 01.
Article in English | MEDLINE | ID: mdl-28552196

ABSTRACT

Deep sequence-based imputation can enhance the discovery power of genome-wide association studies by assessing previously unexplored variation across the common- and low-frequency spectra. We applied a hybrid whole-genome sequencing (WGS) and deep imputation approach to examine the broader allelic architecture of 12 anthropometric traits associated with height, body mass, and fat distribution in up to 267,616 individuals. We report 106 genome-wide significant signals that have not been previously identified, including 9 low-frequency variants pointing to functional candidates. Of the 106 signals, 6 are in genomic regions that have not been implicated with related traits before, 28 are independent signals at previously reported regions, and 72 represent previously reported signals for a different anthropometric trait. 71% of signals reside within genes and fine mapping resolves 23 signals to one or two likely causal variants. We confirm genetic overlap between human monogenic and polygenic anthropometric traits and find signal enrichment in cis expression QTLs in relevant tissues. Our results highlight the potential of WGS strategies to enhance biologically relevant discoveries across the frequency spectrum.


Subject(s)
Anthropometry , Genome, Human , Genome-Wide Association Study , Quantitative Trait Loci/genetics , Sequence Analysis, DNA/methods , Body Height/genetics , Cohort Studies , DNA Methylation/genetics , Databases, Genetic , Female , Genetic Variation , Humans , Lipodystrophy/genetics , Male , Meta-Analysis as Topic , Obesity/genetics , Physical Chromosome Mapping , Sex Characteristics , Syndrome , United Kingdom
14.
Am J Hum Genet ; 99(2): 481-8, 2016 08 04.
Article in English | MEDLINE | ID: mdl-27486782

ABSTRACT

Circulating blood cell counts and indices are important indicators of hematopoietic function and a number of clinical parameters, such as blood oxygen-carrying capacity, inflammation, and hemostasis. By performing whole-exome sequence association analyses of hematologic quantitative traits in 15,459 community-dwelling individuals, followed by in silico replication in up to 52,024 independent samples, we identified two previously undescribed coding variants associated with lower platelet count: a common missense variant in CPS1 (rs1047891, MAF = 0.33, discovery + replication p = 6.38 × 10(-10)) and a rare synonymous variant in GFI1B (rs150813342, MAF = 0.009, discovery + replication p = 1.79 × 10(-27)). By performing CRISPR/Cas9 genome editing in hematopoietic cell lines and follow-up targeted knockdown experiments in primary human hematopoietic stem and progenitor cells, we demonstrate an alternative splicing mechanism by which the GFI1B rs150813342 variant suppresses formation of a GFI1B isoform that preferentially promotes megakaryocyte differentiation and platelet production. These results demonstrate how unbiased studies of natural variation in blood cell traits can provide insight into the regulation of human hematopoiesis.


Subject(s)
Alternative Splicing/genetics , DNA Mutational Analysis , Exome/genetics , Genetic Loci/genetics , Hematopoiesis/genetics , Proto-Oncogene Proteins/genetics , Repressor Proteins/genetics , Blood Platelets/cytology , CRISPR-Cas Systems , Gene Editing , Hematopoietic Stem Cells/cytology , Humans , Megakaryocytes/cytology , Platelet Count
15.
Nature ; 470(7332): 59-65, 2011 Feb 03.
Article in English | MEDLINE | ID: mdl-21293372

ABSTRACT

Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.


Subject(s)
DNA Copy Number Variations/genetics , Genetics, Population , Genome, Human/genetics , Genomics , Gene Duplication/genetics , Genetic Predisposition to Disease/genetics , Genotype , Humans , Mutagenesis, Insertional/genetics , Reproducibility of Results , Sequence Analysis, DNA , Sequence Deletion/genetics
16.
Bioinformatics ; 31(24): 4029-31, 2015 Dec 15.
Article in English | MEDLINE | ID: mdl-26315906

ABSTRACT

High-throughput sequencing technologies survey genetic variation at genome scale and are increasingly used to study the contribution of rare and low-frequency genetic variants to human traits. As part of the Cohorts arm of the UK10K project, genetic variants called from low-read depth (average 7×) whole genome sequencing of 3621 cohort individuals were analysed for statistical associations with 64 different phenotypic traits of biomedical importance. Here, we describe a novel genome browser based on the Biodalliance platform developed to provide interactive access to the association results of the project. AVAILABILITY AND IMPLEMENTATION: The browser is available at http://www.uk10k.org/dalliance.html. Source code for the Biodalliance platform is available under a BSD license from http://github.com/dasmoth/dalliance, and for the LD-display plugin and backend from http://github.com/dasmoth/ldserv.


Subject(s)
Genetic Association Studies , Genetic Variation , Genome, Human , Software , High-Throughput Nucleotide Sequencing , Humans , Linkage Disequilibrium
17.
Nature ; 464(7289): 704-12, 2010 Apr 01.
Article in English | MEDLINE | ID: mdl-19812545

ABSTRACT

Structural variations of DNA greater than 1 kilobase in size account for most bases that vary among human genomes, but are still relatively under-ascertained. Here we use tiling oligonucleotide microarrays, comprising 42 million probes, to generate a comprehensive map of 11,700 copy number variations (CNVs) greater than 443 base pairs, of which most (8,599) have been validated independently. For 4,978 of these CNVs, we generated reference genotypes from 450 individuals of European, African or East Asian ancestry. The predominant mutational mechanisms differ among CNV size classes. Retrotransposition has duplicated and inserted some coding and non-coding DNA segments randomly around the genome. Furthermore, by correlation with known trait-associated single nucleotide polymorphisms (SNPs), we identified 30 loci with CNVs that are candidates for influencing disease susceptibility. Despite this, having assessed the completeness of our map and the patterns of linkage disequilibrium between CNVs and SNPs, we conclude that, for complex traits, the heritability void left by genome-wide association studies will not be accounted for by common CNVs.


Subject(s)
DNA Copy Number Variations/genetics , Genetic Predisposition to Disease/genetics , Genome, Human/genetics , Mutagenesis/genetics , Gene Duplication , Genome-Wide Association Study , Genotype , Haplotypes/genetics , Humans , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide/genetics , Racial Groups/genetics , Reproducibility of Results
18.
Genet Epidemiol ; 38(4): 281-90, 2014 May.
Article in English | MEDLINE | ID: mdl-24676807

ABSTRACT

Although a standard genome-wide significance level has been accepted for the testing of association between common genetic variants and disease, the era of whole-genome sequencing (WGS) requires a new threshold. The allele frequency spectrum of sequence-identified variants is very different from common variants, and the identified rare genetic variation is usually jointly analyzed in a series of genomic windows or regions. In nearby or overlapping windows, these test statistics will be correlated, and the degree of correlation is likely to depend on the choice of window size, overlap, and the test statistic. Furthermore, multiple analyses may be performed using different windows or test statistics. Here we propose an empirical approach for estimating genome-wide significance thresholds for data arising from WGS studies, and we demonstrate that the empirical threshold can be efficiently estimated by extrapolating from calculations performed on a small genomic region. Because analysis of WGS may need to be repeated with different choices of test statistics or windows, this prediction approach makes it computationally feasible to estimate genome-wide significance thresholds for different analysis choices. Based on UK10K whole-genome sequence data, we derive genome-wide significance thresholds ranging between 2.5 × 10(-8) and 8 × 10(-8) for our analytic choices in window-based testing, and thresholds of 0.6 × 10(-8) to 1.5 × 10(-8) for a combined analytic strategy of testing common variants using single-SNP tests together with rare variants analyzed with our sliding-window test strategy.


Subject(s)
Genome, Human/genetics , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA , Chromosomes, Human, Pair 3/genetics , Data Interpretation, Statistical , Humans , Polymorphism, Single Nucleotide/genetics
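
One hedged way to read the thresholds reported in the abstract of entry 18 is as a Bonferroni-style correction: dividing an assumed family-wise error rate by a threshold gives the implied number of effectively independent tests. The 0.05 error rate and the helper name `implied_effective_tests` are assumptions for illustration; the abstract states neither, and the authors' empirical estimation procedure is not reproduced here.

```python
def implied_effective_tests(threshold, alpha=0.05):
    """Number of independent tests for which a Bonferroni correction
    at family-wise error rate `alpha` would yield `threshold`."""
    return alpha / threshold

# Thresholds quoted in the abstract (window-based range, then combined-strategy range)
reported = {
    "window-based, lower bound": 2.5e-8,
    "window-based, upper bound": 8e-8,
    "combined strategy, lower bound": 0.6e-8,
    "combined strategy, upper bound": 1.5e-8,
}

for label, threshold in reported.items():
    n_eff = implied_effective_tests(threshold)
    print(f"{label}: threshold {threshold:.1e} -> about {n_eff:,.0f} effective tests")
```

Read this way, stricter thresholds correspond to a larger implied testing burden, which fits the abstract's point that the threshold depends on the analysis choices.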
20.
Nature ; 456(7218): 53-9, 2008 Nov 06.
Article in English | MEDLINE | ID: mdl-18987734

ABSTRACT

DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation. Here we report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high-quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterize four million single-nucleotide polymorphisms and four hundred thousand structural variants, many of which were previously unknown. Our approach is effective for accurate, rapid and economical whole-genome re-sequencing and many other biomedical applications.


Subject(s)
Genome, Human/genetics , Genomics/methods , Sequence Analysis, DNA/methods , Chromosomes, Human, X/genetics , Consensus Sequence/genetics , Genomics/economics , Genotype , Humans , Male , Nigeria , Polymorphism, Single Nucleotide/genetics , Sensitivity and Specificity , Sequence Analysis, DNA/economics