Results 1 - 20 of 47
1.
Cell ; 155(1): 242-56, 2013 Sep 26.
Article in English | MEDLINE | ID: mdl-24074872

ABSTRACT

The complex network of specialized cells and molecules in the immune system has evolved to defend against pathogens, but inadvertent immune system attacks on "self" result in autoimmune disease. Both genetic regulation of immune cell levels and their relationships with autoimmunity are largely undetermined. Here, we report genetic contributions to quantitative levels of 95 cell types encompassing 272 immune traits, in a cohort of 1,629 individuals from four clustered Sardinian villages. We first estimated trait heritability, showing that it can be substantial, accounting for up to 87% of the variance (mean 41%). Next, by assessing ∼8.2 million variants, we identified, and confirmed in an extended set of 2,870 individuals, 23 independent variants at 13 loci associated with at least one trait. Notably, variants at three loci (HLA, IL2RA, and SH2B3/ATXN2) overlap with known autoimmune disease associations. These results connect specific cellular phenotypes to specific genetic variants, helping to explicate their involvement in disease.


Subject(s)
Flow Cytometry/methods , Genetic Predisposition to Disease , Genome-Wide Association Study , Immune System Diseases/genetics , Polymorphism, Single Nucleotide , Humans , Phenotype
2.
Hum Mol Genet ; 32(5): 790-797, 2023 02 19.
Article in English | MEDLINE | ID: mdl-36136759

ABSTRACT

Few genome-wide association studies (GWAS) analyzing genetic regulation of morphological traits of white blood cells have been reported. We carried out a GWAS of 12 morphological traits in 869 individuals from the general population of Sardinia, Italy. These traits included measures of cell volume, conductivity and light scatter in four white-cell populations (eosinophils, lymphocytes, monocytes, neutrophils). This analysis yielded seven statistically significant signals, four of which were novel (PRG2, P2RX3 and two at CDK6). Five signals were replicated in the independent INTERVAL cohort of 11 822 individuals. The most interesting signal, with a large effect size on eosinophil scatter (P-value = 8.33 × 10⁻³², beta = -1.651, se = 0.1351), falls within the innate immunity cluster on chromosome 11 and is located in the PRG2 gene. Computational analyses revealed that a rare, Sardinian-specific PRG2:p.Ser148Pro mutation modifies PRG2 amino acid contacts and protein dynamics in a manner that could possibly explain the changes observed in eosinophil morphology. Our discoveries shed light on the genetics of morphological traits. For the first time, we describe such a large effect size on eosinophil morphology for a variant that is relatively frequent in the Sardinian population.


Subject(s)
Eosinophils , Genome-Wide Association Study , Humans , Chromosomes, Human, Pair 11 , Polymorphism, Single Nucleotide , Immunity, Innate
3.
Genet Epidemiol ; 47(3): 231-248, 2023 04.
Article in English | MEDLINE | ID: mdl-36739617

ABSTRACT

Linkage analysis, a class of methods for detecting co-segregation of genomic segments and traits in families, was used to map disease-causing genes for decades before genotyping arrays and dense SNP genotyping enabled genome-wide association studies in population samples. Population samples often contain related individuals, but the segregation of alleles within families is rarely used because traditional linkage methods are computationally inefficient for larger datasets. Here, we describe Population Linkage, a novel application of Haseman-Elston regression as a method of moments estimator of variance components and their standard errors. We achieve additional computational efficiency by using modern methods for detection of IBD segments and variance component estimation, efficient preprocessing of input data, and minimizing redundant numerical calculations. We also refined variance component models to account for the biases in population-scale methods for IBD segment detection. We ran Population Linkage on four blood lipid traits in over 70,000 individuals from the HUNT and SardiNIA studies, successfully detecting 25 known genetic signals. One notable linkage signal that appeared in both was for low-density lipoprotein (LDL) cholesterol levels in the region near the gene APOE (LOD = 29.3, variance explained = 4.1%). This is the region where the missense variants rs7412 and rs429358, which together make up the ε2, ε3, and ε4 alleles, account for 2.4% and 0.8% of variation in circulating LDL cholesterol, respectively. Our results show the potential for linkage analysis and other large-scale applications of method of moments variance components estimation.


Subject(s)
Genome-Wide Association Study , Models, Genetic , Humans , Phenotype , Cholesterol, LDL/genetics , Genetic Linkage , Apolipoproteins E/genetics
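The Haseman-Elston regression named in the abstract above can be illustrated with a minimal sketch. It regresses the cross-product of standardized phenotypes for every pair of individuals on their IBD sharing at a locus; the slope is a method-of-moments estimate of the share of trait variance attributable to that locus. This is not the Population Linkage implementation: the function name, the dictionary input format, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def haseman_elston_slope(phenotypes, ibd_sharing):
    """Method-of-moments sketch: regress phenotype cross-products on IBD sharing.

    phenotypes  -- list of trait values, one per individual
    ibd_sharing -- dict mapping (i, j), i < j, to the proportion of the locus
                   shared identical by descent (hypothetical input format)
    Returns the ordinary least-squares slope.
    """
    n = len(phenotypes)
    mean = sum(phenotypes) / n
    sd = (sum((y - mean) ** 2 for y in phenotypes) / (n - 1)) ** 0.5
    z = [(y - mean) / sd for y in phenotypes]  # standardized phenotypes

    xs, ys = [], []
    for i, j in combinations(range(n), 2):
        xs.append(ibd_sharing[(i, j)])
        ys.append(z[i] * z[j])  # cross-product for the pair

    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("IBD sharing does not vary across pairs")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy data (hypothetical): individuals 0-1 and 2-3 share the locus IBD.
toy_phenotypes = [1.0, 1.2, -1.0, -1.1]
toy_ibd = {(0, 1): 0.5, (2, 3): 0.5,
           (0, 2): 0.0, (0, 3): 0.0, (1, 2): 0.0, (1, 3): 0.0}
print(haseman_elston_slope(toy_phenotypes, toy_ibd))
```

A positive slope means that pairs sharing more of the locus IBD have more similar phenotypes. At population scale the pairwise loop becomes the bottleneck, which is consistent with the abstract's emphasis on computational efficiency.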
4.
Am J Hum Genet ; 107(1): 60-71, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32533944

ABSTRACT

Adult height is one of the earliest putative examples of polygenic adaptation in humans. However, this conclusion was recently challenged because residual uncorrected stratification from large-scale consortium studies was considered responsible for the previously noted genetic difference. It thus remains an open question whether height loci exhibit signals of polygenic adaptation in any human population. We re-examined this question, focusing on one of the shortest European populations, the Sardinians, in addition to mainland European populations. We utilized height-associated loci from the Biobank Japan (BBJ) dataset to further alleviate concerns of biased ascertainment of GWAS loci and showed that the Sardinians remain significantly shorter than expected under neutrality (∼0.22 standard deviation shorter than Utah residents with ancestry from northern and western Europe [CEU] on the basis of polygenic height scores, p = 3.89 × 10⁻⁴). We also found the trajectory of polygenic height scores between the Sardinian and the British populations diverged over at least the last 10,000 years (p = 0.0082), consistent with a signature of polygenic adaptation driven primarily by the Sardinian population. Although the polygenic score-based analysis showed a much subtler signature in mainland European populations, we found a clear and robust adaptive signature in the UK population by using a haplotype-based statistic, the trait singleton density score (tSDS), driven by the height-increasing alleles (p = 9.1 × 10⁻⁴). In summary, by ascertaining height loci in a distant East Asian population, we further supported the evidence of polygenic adaptation at height-associated loci among the Sardinians. In mainland Europeans, the adaptive signature was detected in haplotype-based analysis but not in polygenic score-based analysis.


Subject(s)
Adaptation, Physiological/genetics , Body Height/genetics , Multifactorial Inheritance/genetics , Alleles , Asian People/genetics , Biological Specimen Banks , Genetics, Population/methods , Genome, Human/genetics , Genome-Wide Association Study/methods , Haplotypes/genetics , Humans , Italy , Japan , Phenotype , Polymorphism, Single Nucleotide/genetics , Selection, Genetic/genetics , White People/genetics
5.
Diabetologia ; 64(6): 1342-1347, 2021 06.
Article in English | MEDLINE | ID: mdl-33830302

ABSTRACT

AIMS/HYPOTHESIS: Given the potential shared aetiology between type 1 and type 2 diabetes, we aimed to identify any genetic regions associated with both diseases. For associations where there is a shared signal and the allele that increases risk to one disease also increases risk to the other, inference about shared aetiology could be made, with the potential to develop therapeutic strategies to treat or prevent both diseases simultaneously. Alternatively, if a genetic signal co-localises with divergent effect directions, it could provide valuable biological insight into how the association affects the two diseases differently. METHODS: Using publicly available type 2 diabetes summary statistics from a genome-wide association study (GWAS) meta-analysis of European ancestry individuals (74,124 cases and 824,006 controls) and type 1 diabetes GWAS summary statistics from a meta-analysis of studies on individuals from the UK and Sardinia (7467 cases and 10,218 controls), we identified all regions of 0.5 Mb that contained variants associated with both diseases (false discovery rate <0.01). In each region, we performed forward stepwise logistic regression to identify independent association signals, then examined co-localisation of each type 1 diabetes signal with each type 2 diabetes signal using coloc. Any association with a co-localisation posterior probability of ≥0.9 was considered a genuine shared association with both diseases. RESULTS: Of the 81 association signals from 42 genetic regions that showed association with both type 1 and type 2 diabetes, four association signals co-localised between both diseases (posterior probability ≥0.9): (1) chromosome 16q23.1, near CTRB1/BCAR1, which has been previously identified; (2) chromosome 11p15.5, near the INS gene; (3) chromosome 4p16.3, near TMEM129 and (4) chromosome 1p31.3, near PGM1. 
In each of these regions, the effect of genetic variants on type 1 diabetes was in the opposite direction to the effect on type 2 diabetes. Use of additional datasets also supported the previously identified co-localisation on chromosome 9p24.2, near the GLIS3 gene, in this case with a concordant direction of effect. CONCLUSIONS/INTERPRETATION: Four of five association signals that co-localise between type 1 diabetes and type 2 diabetes are in opposite directions, suggesting a complex genetic relationship between the two diseases.


Subject(s)
Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , Alleles , Female , Genetic Association Studies , Genotype , Humans , Italy , Male , United Kingdom
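The co-localisation step in the abstract above (posterior probability ≥0.9 for a shared signal) can be sketched with approximate Bayes factors. The code below follows the general single-causal-variant scheme that co-localisation tools of this kind use. It is a simplified illustration, not the study's pipeline: the prior probabilities (1e-4, 1e-4, 1e-5), the prior effect standard deviation of 0.2, and the toy summary statistics are assumptions, and the stepwise conditioning on independent signals described in the abstract is not reproduced.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_diff_exp(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    if b >= a:
        return float("-inf")
    return a + math.log1p(-math.exp(b - a))


def wakefield_log_abf(beta, variance, prior_sd):
    """Log approximate Bayes factor for one variant from its effect and variance."""
    z = beta / math.sqrt(variance)
    r = prior_sd ** 2 / (prior_sd ** 2 + variance)
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def colocalisation_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of hypotheses H0..H4 for one region.

    H3: both traits associated, distinct causal variants.
    H4: both traits associated, one shared causal variant.
    The priors are assumed defaults, not values taken from the abstract.
    """
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = log_sum_exp(labf1), log_sum_exp(labf2), log_sum_exp(both)
    log_h = [
        0.0,                                                      # H0
        math.log(p1) + s1,                                        # H1
        math.log(p2) + s2,                                        # H2
        math.log(p1) + math.log(p2) + log_diff_exp(s1 + s2, s12), # H3
        math.log(p12) + s12,                                      # H4
    ]
    total = log_sum_exp(log_h)
    return [math.exp(h - total) for h in log_h]


# Toy region of 50 variants (hypothetical numbers): one strong signal per trait.
labf_a = [wakefield_log_abf(0.8 if i == 10 else 0.0, 0.01, 0.2) for i in range(50)]
labf_b = [wakefield_log_abf(0.8 if i == 30 else 0.0, 0.01, 0.2) for i in range(50)]
print(colocalisation_posteriors(labf_a, labf_a)[4])  # same variant drives both traits
print(colocalisation_posteriors(labf_a, labf_b)[3])  # different variants
```

Under the abstract's criterion, a region would count as a shared association when the last element of the returned list (H4) is at least 0.9. Direction of effect is a separate check on the signs of the two effect estimates.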
6.
Genet Epidemiol ; 44(6): 537-549, 2020 09.
Article in English | MEDLINE | ID: mdl-32519380

ABSTRACT

A key aim for current genome-wide association studies (GWAS) is to interrogate the full spectrum of genetic variation underlying human traits, including rare variants, across populations. Deep whole-genome sequencing is the gold standard to fully capture genetic variation, but remains prohibitively expensive for large sample sizes. Array genotyping interrogates a sparser set of variants, which can be used as a scaffold for genotype imputation to capture a wider set of variants. However, imputation quality depends crucially on reference panel size and genetic distance from the target population. Here, we consider sequencing a subset of GWAS participants and imputing the rest using a reference panel that includes both sequenced GWAS participants and an external reference panel. We investigate how imputation quality and GWAS power are affected by the number of participants sequenced for admixed populations (African and Latino Americans) and European population isolates (Sardinians and Finns), and identify powerful, cost-effective GWAS designs given current sequencing and array costs. For populations that are well-represented in existing reference panels, we find that array genotyping alone is cost-effective and well-powered to detect common- and rare-variant associations. For poorly represented populations, sequencing a subset of participants is often most cost-effective, and can substantially increase imputation quality and GWAS power.


Subject(s)
Genome, Human , Genome-Wide Association Study , Whole Genome Sequencing , Cost-Benefit Analysis , Gene Frequency/genetics , Genome-Wide Association Study/economics , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics , Whole Genome Sequencing/economics
7.
Am J Hum Genet ; 103(5): 691-706, 2018 11 01.
Article in English | MEDLINE | ID: mdl-30388399

ABSTRACT

C-reactive protein (CRP) is a sensitive biomarker of chronic low-grade inflammation and is associated with multiple complex diseases. The genetic determinants of chronic inflammation remain largely unknown, and the causal role of CRP in several clinical outcomes is debated. We performed two genome-wide association studies (GWASs), on HapMap and 1000 Genomes imputed data, of circulating amounts of CRP by using data from 88 studies comprising 204,402 European individuals. Additionally, we performed in silico functional analyses and Mendelian randomization analyses with several clinical outcomes. The GWAS meta-analyses of CRP revealed 58 distinct genetic loci (p < 5 × 10⁻⁸). After adjustment for body mass index in the regression analysis, the associations at all except three loci remained. The lead variants at the distinct loci explained up to 7.0% of the variance in circulating amounts of CRP. We identified 66 gene sets that were organized in two substantially correlated clusters, one mainly composed of immune pathways and the other characterized by metabolic pathways in the liver. Mendelian randomization analyses revealed a causal protective effect of CRP on schizophrenia and a risk-increasing effect on bipolar disorder. Our findings provide further insights into the biology of inflammation and could lead to interventions for treating inflammation and its clinical consequences.


Subject(s)
Genetic Loci/genetics , Inflammation/genetics , Metabolic Networks and Pathways/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Biomarkers/metabolism , Bipolar Disorder/genetics , Bipolar Disorder/metabolism , Body Mass Index , C-Reactive Protein/genetics , Child , Female , Genome-Wide Association Study/methods , Humans , Inflammation/metabolism , Liver/metabolism , Liver/pathology , Male , Mendelian Randomization Analysis/methods , Middle Aged , Schizophrenia/genetics , Schizophrenia/metabolism , Young Adult
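The abstract above does not say which Mendelian randomization estimator was used. One common choice is the fixed-effect inverse-variance-weighted (IVW) estimate, sketched here as an assumption rather than as the study's method. Each variant contributes the ratio of its outcome effect to its exposure effect, weighted by the precision of the outcome estimate. The function name and the toy effect sizes are hypothetical.

```python
def ivw_estimate(exposure_betas, outcome_betas, outcome_ses):
    """Fixed-effect inverse-variance-weighted causal estimate and its standard error.

    exposure_betas -- per-variant effects on the exposure (e.g. a biomarker level)
    outcome_betas  -- per-variant effects on the outcome (e.g. log odds of disease)
    outcome_ses    -- standard errors of the outcome effects
    """
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(exposure_betas, outcome_betas, outcome_ses))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(exposure_betas, outcome_ses))
    if denominator == 0:
        raise ValueError("no variant has a non-zero exposure effect")
    return numerator / denominator, (1.0 / denominator) ** 0.5


# Toy instruments (hypothetical): every outcome effect is half the exposure effect.
toy_bx = [0.1, 0.2, 0.3]
toy_by = [0.05, 0.10, 0.15]
toy_se = [0.01, 0.02, 0.01]
estimate, standard_error = ivw_estimate(toy_bx, toy_by, toy_se)
print(estimate, standard_error)
```

A negative estimate would correspond to a protective effect of the exposure and a positive one to a risk-increasing effect, matching the two directions reported in the abstract.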
8.
Mult Scler ; 27(9): 1332-1340, 2021 08.
Article in English | MEDLINE | ID: mdl-33566725

ABSTRACT

BACKGROUND: Defective alleles within the PRF1 gene, encoding the pore-forming protein perforin, in combination with environmental factors, cause familial type 2 hemophagocytic lymphohistiocytosis (FHL2), a rare, severe autosomal recessive childhood disorder characterized by massive release of cytokines (cytokine storm). OBJECTIVE: The aim of this study was to determine the function of the hypomorph PRF1:p.A91V g.72360387 G > A in multiple sclerosis (MS) and type 1 diabetes (T1D). METHODS: We cross-compared the association data for the PRF1:p.A91V mutation derived from GWAS on adult MS and pediatric T1D in Sardinians. The novel association with T1D was replicated in a meta-analysis of 12,584 cases and 17,692 controls from Sardinia, the United Kingdom, and Scotland. To dissect the function of this mutation, we searched for coincident associations with immunophenotypes in an additional set of Sardinians from the general population. RESULTS: We report that PRF1:p.A91V is associated with an increase in lymphocyte levels, especially within the cytotoxic memory T-cells, at the general population level, with reduced interleukin 7 receptor expression on these cells. The minor allele increased the risk of MS in 2903 cases and 2880 controls from Sardinia (p = 2.06 × 10⁻⁴, odds ratio OR = 1.29), replicating a previous finding, whereas it protects from T1D (p = 1.04 × 10⁻⁵, OR = 0.82). CONCLUSION: Our results indicate opposing contributions of the cytotoxic T-cell compartment to MS and T1D pathogenesis.


Subject(s)
Autoimmunity , Immune System , Autoimmunity/genetics , Child , Humans , Inflammation , LIM-Homeodomain Proteins , Muscle Proteins , Mutation , Perforin/genetics , Transcription Factors
9.
Genet Epidemiol ; 43(1): 112-117, 2019 02.
Article in English | MEDLINE | ID: mdl-30565766

ABSTRACT

It is unclear whether insertions and deletions (indels) are more likely to influence complex traits than abundant single-nucleotide polymorphisms (SNPs). We sought to understand which category of variation is more likely to impact health. Using the SardiNIA study as an exemplar, we characterized 478,876 common indels and 8,246,244 common SNPs in up to 5,949 well-phenotyped individuals from an isolated valley in Sardinia. We assessed association with 120 traits, resulting in 89 nonoverlapping associated loci. We evaluated whether indels were enriched among credible sets of potential causal variants. These credible sets included 1,319 SNPs and 88 indels. We did not find indels to be significantly enriched. Indels were the most likely causal variant in seven loci, including one locus associated with monocyte count where an indel with causality and mechanism previously demonstrated (rs200748895:TGCTG/T) had a 0.999 posterior probability. Overall, our results show a very modest and nonsignificant enrichment for common indels in associated loci.


Subject(s)
INDEL Mutation/genetics , Polymorphism, Single Nucleotide/genetics , Genetic Loci , Humans , Italy , Molecular Sequence Annotation
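The credible sets mentioned in the abstract above are typically built by ranking the variants at a locus by posterior probability and keeping the top-ranked ones until a target coverage is reached. A minimal sketch follows. The 95% default coverage, the function name, and the neighbouring variant names are assumptions; only rs200748895 and its 0.999 posterior probability come from the abstract.

```python
def credible_set(posteriors, coverage=0.95):
    """Return the smallest top-ranked set of variants whose posteriors reach `coverage`.

    posteriors -- dict mapping variant identifier to posterior probability of
                  being causal at the locus (values should sum to about 1)
    """
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen, cumulative = [], 0.0
    for variant, probability in ranked:
        chosen.append(variant)
        cumulative += probability
        if cumulative >= coverage:
            break
    return chosen


# The indel from the abstract dominates its locus; the other two variants
# and their probabilities are hypothetical placeholders.
monocyte_locus = {"rs200748895": 0.999, "variant_b": 0.0007, "variant_c": 0.0003}
print(credible_set(monocyte_locus))
```

With a posterior probability of 0.999, the indel alone forms the credible set at any coverage up to that value, which is why such a locus counts as one where an indel is the most likely causal variant.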
10.
Genet Epidemiol ; 43(7): 800-814, 2019 10.
Article in English | MEDLINE | ID: mdl-31433078

ABSTRACT

The power of genetic association analyses can be increased by jointly meta-analyzing multiple correlated phenotypes. Here, we develop a meta-analysis framework, Meta-MultiSKAT, that uses summary statistics to test for association between multiple continuous phenotypes and variants in a region of interest. Our approach models the heterogeneity of effects between studies through a kernel matrix and performs a variance component test for association. Using a genotype kernel, our approach can test for rare-variants and the combined effects of both common and rare-variants. To achieve robust power, within Meta-MultiSKAT, we developed fast and accurate omnibus tests combining different models of genetic effects, functional genomic annotations, multiple correlated phenotypes, and heterogeneity across studies. In addition, Meta-MultiSKAT accommodates situations where studies do not share exactly the same set of phenotypes or have differing correlation patterns among the phenotypes. Simulation studies confirm that Meta-MultiSKAT can maintain the type-I error rate at the exome-wide level of 2.5 × 10⁻⁶. Further simulations under different models of association show that Meta-MultiSKAT can improve the power of detection from 23% to 38% on average over single phenotype-based meta-analysis approaches. We demonstrate the utility and improved power of Meta-MultiSKAT in the meta-analyses of four white blood cell subtype traits from the Michigan Genomics Initiative (MGI) and SardiNIA studies.


Subject(s)
Genetic Association Studies , Meta-Analysis as Topic , Gene Frequency/genetics , Genotype , Humans , Italy , Leukocytes/metabolism , Models, Genetic , Mutation/genetics , Phenotype
11.
Am J Hum Genet ; 100(6): 865-884, 2017 Jun 01.
Article in English | MEDLINE | ID: mdl-28552196

ABSTRACT

Deep sequence-based imputation can enhance the discovery power of genome-wide association studies by assessing previously unexplored variation across the common- and low-frequency spectra. We applied a hybrid whole-genome sequencing (WGS) and deep imputation approach to examine the broader allelic architecture of 12 anthropometric traits associated with height, body mass, and fat distribution in up to 267,616 individuals. We report 106 genome-wide significant signals that have not been previously identified, including 9 low-frequency variants pointing to functional candidates. Of the 106 signals, 6 are in genomic regions that have not been implicated with related traits before, 28 are independent signals at previously reported regions, and 72 represent previously reported signals for a different anthropometric trait. 71% of signals reside within genes and fine mapping resolves 23 signals to one or two likely causal variants. We confirm genetic overlap between human monogenic and polygenic anthropometric traits and find signal enrichment in cis expression QTLs in relevant tissues. Our results highlight the potential of WGS strategies to enhance biologically relevant discoveries across the frequency spectrum.


Subject(s)
Anthropometry , Genome, Human , Genome-Wide Association Study , Quantitative Trait Loci/genetics , Sequence Analysis, DNA/methods , Body Height/genetics , Cohort Studies , DNA Methylation/genetics , Databases, Genetic , Female , Genetic Variation , Humans , Lipodystrophy/genetics , Male , Meta-Analysis as Topic , Obesity/genetics , Physical Chromosome Mapping , Sex Characteristics , Syndrome , United Kingdom
12.
N Engl J Med ; 376(17): 1615-1626, 2017 04 27.
Article in English | MEDLINE | ID: mdl-28445677

ABSTRACT

BACKGROUND: Genomewide association studies of autoimmune diseases have mapped hundreds of susceptibility regions in the genome. However, only for a few association signals has the causal gene been identified, and for even fewer have the causal variant and underlying mechanism been defined. Coincident associations of DNA variants affecting both the risk of autoimmune disease and quantitative immune variables provide an informative route to explore disease mechanisms and drug-targetable pathways. METHODS: Using case-control samples from Sardinia, Italy, we performed a genomewide association study in multiple sclerosis followed by TNFSF13B locus-specific association testing in systemic lupus erythematosus (SLE). Extensive phenotyping of quantitative immune variables, sequence-based fine mapping, cross-population and cross-phenotype analyses, and gene-expression studies were used to identify the causal variant and elucidate its mechanism of action. Signatures of positive selection were also investigated. RESULTS: A variant in TNFSF13B, encoding the cytokine and drug target B-cell activating factor (BAFF), was associated with multiple sclerosis as well as SLE. The disease-risk allele was also associated with up-regulated humoral immunity through increased levels of soluble BAFF, B lymphocytes, and immunoglobulins. The causal variant was identified: an insertion-deletion variant, GCTGT→A (in which A is the risk allele), yielded a shorter transcript that escaped microRNA inhibition and increased production of soluble BAFF, which in turn up-regulated humoral immunity. Population genetic signatures indicated that this autoimmunity variant has been evolutionarily advantageous, most likely by augmenting resistance to malaria. CONCLUSIONS: A TNFSF13B variant was associated with multiple sclerosis and SLE, and its effects were clarified at the population, cellular, and molecular levels. (Funded by the Italian Foundation for Multiple Sclerosis and others.).


Subject(s)
B-Cell Activating Factor/genetics , INDEL Mutation , Lupus Erythematosus, Systemic/genetics , Multiple Sclerosis/genetics , Autoimmunity , B-Cell Activating Factor/metabolism , Case-Control Studies , Gene Expression , Genome-Wide Association Study , Humans , Italy , Lupus Erythematosus, Systemic/immunology , MicroRNAs , Multiple Sclerosis/immunology , Phenotype , Polymorphism, Single Nucleotide , Risk , Sequence Analysis, RNA , Transcription, Genetic
13.
Mol Biol Evol ; 34(5): 1230-1239, 2017 05 01.
Article in English | MEDLINE | ID: mdl-28177087

ABSTRACT

Sardinians are "outliers" in the European genetic landscape and, according to paleogenomic nuclear data, the closest to early European Neolithic farmers. To learn more about their genetic ancestry, we analyzed 3,491 modern and 21 ancient mitogenomes from Sardinia. We observed that 78.4% of modern mitogenomes cluster into 89 haplogroups that most likely arose in situ. For each Sardinian-specific haplogroup (SSH), we also identified the upstream node in the phylogeny, from which non-Sardinian mitogenomes radiate. This provided minimum and maximum time estimates for the presence of each SSH on the island. In agreement with demographic evidence, almost all SSHs coalesce in the post-Nuragic, Nuragic and Neolithic-Copper Age periods. For some rare SSHs, however, we could not dismiss the possibility that they might have been on the island prior to the Neolithic, a scenario that would be in agreement with archeological evidence of a Mesolithic occupation of Sardinia.


Subject(s)
DNA, Mitochondrial/genetics , Genome, Mitochondrial/genetics , DNA, Ancient/analysis , Demography , Ethnicity/genetics , Evolution, Molecular , Genetic Variation/genetics , Genetics, Population/methods , Haplotypes/genetics , Humans , Islands , Italy/ethnology , Phylogeny , Sequence Analysis, DNA/methods , White People/genetics
14.
Bioinformatics ; 33(9): 1399-1401, 2017 05 01.
Article in English | MEDLINE | ID: mdl-28453676

ABSTRACT

Availability and Implementation: fastMitoCalc is available at https://lgsun.irp.nia.nih.gov/hsgu/software/mitoAnalyzer/index.html. Contact: jun.ding@nih.gov. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Gene Dosage , Genome, Mitochondrial , Genomics/methods , Sequence Analysis, DNA/methods , Software , Genome, Human , Humans
15.
Nature ; 490(7419): 267-72, 2012 Oct 11.
Article in English | MEDLINE | ID: mdl-22982992

ABSTRACT

There is evidence across several species for genetic control of phenotypic variation of complex traits, such that the variance among phenotypes is genotype dependent. Understanding genetic control of variability is important in evolutionary biology, agricultural selection programmes and human medicine, yet for complex traits, no individual genetic variants associated with variance, as opposed to the mean, have been identified. Here we perform a meta-analysis of genome-wide association studies of phenotypic variation using ∼170,000 samples on height and body mass index (BMI) in human populations. We report evidence that the single nucleotide polymorphism (SNP) rs7202116 at the FTO gene locus, which is known to be associated with obesity (as measured by mean BMI for each rs7202116 genotype), is also associated with phenotypic variability. We show that the results are not due to scale effects or other artefacts, and find no other experiment-wise significant evidence for effects on variability, either at loci other than FTO for BMI or at any locus for height. The difference in variance for BMI among individuals with opposite homozygous genotypes at the FTO locus is approximately 7%, corresponding to a difference of ∼0.5 kilograms in the standard deviation of weight. Our results indicate that genetic variants can be discovered that are associated with variability, and that between-person variability in obesity can partly be explained by the genotype at the FTO locus. The results are consistent with reported FTO by environment interactions for BMI, possibly mediated by DNA methylation. Our BMI results for other SNPs and our height results for all SNPs suggest that most genetic variants, including those that influence mean height or mean BMI, are not associated with phenotypic variance, or that their effects on variability are too small to detect even with sample sizes greater than 100,000.


Subject(s)
Body Mass Index , Genetic Variation , Phenotype , Proteins/genetics , Alpha-Ketoglutarate-Dependent Dioxygenase FTO , Body Height/genetics , Co-Repressor Proteins , Female , Genome-Wide Association Study , Humans , Male , Nerve Tissue Proteins/genetics , Polymorphism, Single Nucleotide , Repressor Proteins/genetics
16.
PLoS Genet ; 11(7): e1005306, 2015 Jul.
Article in English | MEDLINE | ID: mdl-26172475

ABSTRACT

DNA sequencing identifies common and rare genetic variants for association studies, but studies typically focus on variants in nuclear DNA and ignore the mitochondrial genome. In fact, analyzing variants in mitochondrial DNA (mtDNA) sequences presents special problems, which we resolve here with a general solution for the analysis of mtDNA in next-generation sequencing studies. The new program package comprises 1) an algorithm designed to identify mtDNA variants (i.e., homoplasmies and heteroplasmies), incorporating sequencing error rates at each base in a likelihood calculation and allowing allele fractions at a variant site to differ across individuals; and 2) an estimation of mtDNA copy number in a cell directly from whole-genome sequencing data. We also apply the methods to DNA sequence from lymphocytes of ~2,000 SardiNIA Project participants. As expected, mothers and offspring share all homoplasmies but a lesser proportion of heteroplasmies. Both homoplasmies and heteroplasmies show 5-fold higher transition/transversion ratios than variants in nuclear DNA. Also, heteroplasmy increases with age, though on average only ~1 heteroplasmy reaches the 4% level between ages 20 and 90. In addition, we find that mtDNA copy number averages ~110 copies/lymphocyte and is ~54% heritable, implying substantial genetic regulation of the level of mtDNA. Copy numbers also decrease modestly but significantly with age, and females on average have significantly more copies than males. The mtDNA copy numbers are significantly associated with waist circumference (p-value = 0.0031) and waist-hip ratio (p-value = 2.4 × 10⁻⁵), but not with body mass index, indicating an association with central fat distribution. To our knowledge, this is the largest population analysis to date of mtDNA dynamics, revealing the age-imposed increase in heteroplasmy, the relatively high heritability of copy number, and the association of copy number with metabolic traits.


Subject(s)
DNA Copy Number Variations/genetics , DNA, Mitochondrial/genetics , Gene Dosage/genetics , Lymphocytes/cytology , Obesity/genetics , Aging , Algorithms , Base Sequence , Body Fat Distribution , Body Mass Index , Female , High-Throughput Nucleotide Sequencing , Humans , Male , Mitochondria/genetics , Mitochondria/metabolism , Sequence Analysis, DNA , Sex Factors , Waist Circumference/genetics , Waist-Hip Ratio
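The abstract above describes estimating mtDNA copy number per cell directly from whole-genome sequencing data. A widely used way to do this is a coverage ratio: because each cell carries two copies of every autosome, the mtDNA copy number is about twice the ratio of mean mitochondrial coverage to mean autosomal coverage. The sketch below assumes that estimator; the abstract does not spell out the package's exact calculation, and the function name and coverage values are hypothetical.

```python
def mtdna_copy_number(mean_mt_coverage, mean_autosomal_coverage, ploidy=2):
    """Coverage-ratio estimate of mtDNA copies per cell.

    Assumes a diploid nuclear genome and that reads are sampled from nuclear
    and mitochondrial DNA in proportion to their abundance.
    """
    if mean_autosomal_coverage <= 0:
        raise ValueError("autosomal coverage must be positive")
    return ploidy * mean_mt_coverage / mean_autosomal_coverage


# Hypothetical sample: 30x autosomal coverage and 1650x mitochondrial coverage.
print(mtdna_copy_number(1650.0, 30.0))
```

The example values were chosen so that the estimate equals 110 copies, the lymphocyte average reported in the abstract.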
17.
BMC Genomics ; 18(1): 747, 2017 Sep 21.
Article in English | MEDLINE | ID: mdl-28934930

ABSTRACT

BACKGROUND: We developed a novel software package, XCAVATOR, for the identification of genomic regions involved in copy number variants/alterations (CNVs/CNAs) from short- and long-read whole-genome sequencing experiments. RESULTS: By using simulated and real datasets, we showed that our tool, based on a read count approach, is capable of predicting the boundaries and the absolute number of DNA copies of CNVs/CNAs at high resolution. To demonstrate the power of our software, we applied it to the analysis of Illumina and Pacific Biosciences data and compared its performance to ten other state-of-the-art tools. CONCLUSION: All the analyses we performed demonstrate that XCAVATOR is capable of detecting germline and somatic CNVs/CNAs, outperforming all the other tools we compared. XCAVATOR is freely available at http://sourceforge.net/projects/xcavator/ .


Subject(s)
DNA Copy Number Variations/genetics , Genotyping Techniques/methods , Whole Genome Sequencing , Polymorphism, Single Nucleotide , Software
18.
PLoS Genet ; 10(5): e1004353, 2014 May.
Article in English | MEDLINE | ID: mdl-24809476

ABSTRACT

Genome sequencing of the 5,300-year-old mummy of the Tyrolean Iceman, found in 1991 on a glacier near the border of Italy and Austria, has yielded new insights into his origin and relationship to modern European populations. A key finding of that study was an apparent recent common ancestry with individuals from Sardinia, based largely on the Y chromosome haplogroup and common autosomal SNP variation. Here, we compiled and analyzed genomic datasets from both modern and ancient Europeans, including genome sequence data from over 400 Sardinians and two ancient Thracians from Bulgaria, to investigate this result in greater detail and determine its implications for the genetic structure of Neolithic Europe. Using whole-genome sequencing data, we confirm that the Iceman is, indeed, most closely related to Sardinians. Furthermore, we show that this relationship extends to other individuals from cultural contexts associated with the spread of agriculture during the Neolithic transition, in contrast to individuals from a hunter-gatherer context. We hypothesize that this genetic affinity of ancient samples from different parts of Europe with Sardinians represents a common genetic component that was geographically widespread across Europe during the Neolithic, likely related to migrations and population expansions associated with the spread of agriculture.


Subject(s)
Fossils , Genetics, Population , Genome, Human , Europe , Female , Humans , Polymorphism, Single Nucleotide
19.
Genome Res ; 23(1): 142-51, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23064751

ABSTRACT

Emerging sequencing technologies allow common and rare variants to be systematically assayed across the human genome in many individuals. In order to improve variant detection and genotype calling, raw sequence data are typically examined across many individuals. Here, we describe a method for genotype calling in settings where sequence data are available for unrelated individuals and parent-offspring trios and show that modeling trio information can greatly increase the accuracy of inferred genotypes and haplotypes, especially on low- to modest-depth sequencing data. Our method considers both linkage disequilibrium (LD) patterns and the constraints imposed by family structure when assigning individual genotypes and haplotypes. Using simulations, we show that trios provide higher genotype calling accuracy across the frequency spectrum, both overall and at hard-to-call heterozygous sites. In addition, trios provide greatly improved phasing accuracy, improving the accuracy of downstream analyses (such as genotype imputation) that rely on phased haplotypes. To further evaluate our approach, we analyzed data on the first 508 individuals sequenced by the SardiNIA sequencing project. Our results show that our method reduces the genotyping error rate by 50% compared with analysis using existing methods that ignore family structure. We anticipate our method will facilitate genotype calling and haplotype inference for many ongoing sequencing projects.


Subject(s)
Genotyping Techniques , Haplotypes , Models, Genetic , Genome, Human , Humans , Linkage Disequilibrium , Pedigree , Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods
20.
Hum Mol Genet ; 22(17): 3597-607, 2013 Sep 01.
Article in English | MEDLINE | ID: mdl-23669352

ABSTRACT

Genetic loci for body mass index (BMI) in adolescence and young adulthood, a period of high risk for weight gain, are understudied, yet may yield important insight into the etiology of obesity and early intervention. To identify novel genetic loci and examine the influence of known loci on BMI during this critical time period in late adolescence and early adulthood, we performed a two-stage meta-analysis using 14 genome-wide association studies in populations of European ancestry with data on BMI between ages 16 and 25 in up to 29 880 individuals. We identified seven independent loci (P < 5.0 × 10⁻⁸) near FTO (P = 3.72 × 10⁻²³), TMEM18 (P = 3.24 × 10⁻¹⁷), MC4R (P = 4.41 × 10⁻¹⁷), TNNI3K (P = 4.32 × 10⁻¹¹), SEC16B (P = 6.24 × 10⁻⁹), GNPDA2 (P = 1.11 × 10⁻⁸) and POMC (P = 4.94 × 10⁻⁸) as well as a potential secondary signal at the POMC locus (rs2118404, P = 2.4 × 10⁻⁵ after conditioning on the established single-nucleotide polymorphism at this locus) in adolescents and young adults. To evaluate the impact of the established genetic loci on BMI at these young ages, we examined differences between the effect sizes of 32 published BMI loci in European adult populations (aged 18-90) and those observed in our adolescent and young adult meta-analysis. Four loci (near PRKD1, TNNI3K, SEC16B and CADM2) had larger effects and one locus (near SH2B1) had a smaller effect on BMI during adolescence and young adulthood compared with older adults (P < 0.05). These results suggest that genetic loci for BMI can vary in their effects across the life course, underscoring the importance of evaluating BMI at different ages.


Subject(s)
Body Mass Index , Genetic Loci , Weight Gain/genetics , Adolescent , Adult , Age Factors , Aged , Aged, 80 and over , Cohort Studies , Genome-Wide Association Study , Humans , Middle Aged , Polymorphism, Single Nucleotide , White People/genetics , Young Adult