Results 1 - 20 of 142
1.
Am J Hum Genet ; 108(8): 1488-1501, 2021 Aug 05.
Article in English | MEDLINE | ID: mdl-34214457

ABSTRACT

Across species, offspring of related individuals often exhibit significant reduction in fitness-related traits, known as inbreeding depression (ID), yet the genetic and molecular basis for ID remains elusive. Here, we develop a method to quantify enrichment of ID within specific genomic annotations and apply it to human data. We analyzed the phenomes and genomes of ∼350,000 unrelated participants of the UK Biobank and found, on average over 11 traits, significant enrichment of ID within genomic regions with high recombination rates (>21-fold; $p < 10^{-5}$), with conserved function across species (>19-fold; $p < 10^{-4}$), and within regulatory elements such as DNase I hypersensitive sites (∼5-fold; $p = 8.9 \times 10^{-7}$). We also quantified enrichment of ID within trait-associated regions and found suggestive evidence that genomic regions contributing to additive genetic variance in the population are enriched for ID signal. We find strong correlations between functional enrichment of SNP-based heritability and that of ID (r = 0.8, standard error: 0.1). These findings provide empirical evidence that ID is most likely due to many partially recessive deleterious alleles in low linkage disequilibrium regions of the genome. Our study suggests that functional characterization of ID may further elucidate the genetic architectures and biological mechanisms underlying complex traits and diseases.


Subject(s)
Genome-Wide Association Study , Genomics/methods , Inbreeding Depression/genetics , Linkage Disequilibrium , Multifactorial Inheritance/genetics , Phenotype , Polymorphism, Single Nucleotide , Female , Humans , Male
2.
Am J Hum Genet ; 108(5): 786-798, 2021 May 06.
Article in English | MEDLINE | ID: mdl-33811805

ABSTRACT

Non-additive genetic variance for complex traits is traditionally estimated from data on relatives. It is notoriously difficult to estimate without bias in non-laboratory species, including humans, because of possible confounding with environmental covariance among relatives. In principle, non-additive variance attributable to common DNA variants can be estimated from a random sample of unrelated individuals with genome-wide SNP data. Here, we jointly estimate the proportion of variance explained by additive ($h^2_{\text{SNP}}$), dominance ($\delta^2_{\text{SNP}}$) and additive-by-additive ($\eta^2_{\text{SNP}}$) genetic variance in a single analysis model. We first show by simulations that our model leads to unbiased estimates and provide a new theory to predict standard errors estimated using either least-squares or maximum likelihood. We then apply the model to 70 complex traits using 254,679 unrelated individuals from the UK Biobank and 1.1 M genotyped and imputed SNPs. We found strong evidence for additive variance (average across traits $\bar{h}^2_{\text{SNP}} = 0.208$). In contrast, the average estimate of $\bar{\delta}^2_{\text{SNP}}$ across traits was 0.001, implying negligible dominance variance at causal variants tagged by common SNPs. The average epistatic variance $\bar{\eta}^2_{\text{SNP}}$ across the traits was 0.055, not significantly different from zero because of the large sampling variance. Our results provide new evidence that genetic variance for complex traits is predominantly additive and that sample sizes of many millions of unrelated individuals are needed to estimate epistatic variance with sufficient precision.
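
Illustrative sketch (not taken from the article): the abstract mentions least-squares estimation of variance components from unrelated individuals. One simple least-squares approach of this kind is Haseman-Elston regression, in which cross-products of standardized phenotypes are regressed on the off-diagonal entries of a genomic relationship matrix. The version below fits a single additive component through the origin; the function name, the single-component simplification, and the no-intercept choice are assumptions made for illustration, and the article's joint additive, dominance and epistatic model is not reproduced here.

```python
from itertools import combinations


def haseman_elston_h2(phenotypes, grm):
    """Single-component Haseman-Elston estimate of SNP heritability.

    phenotypes: list of floats, one per individual.
    grm: square list-of-lists relationship matrix (hypothetical input).
    Assumption: regression through the origin on off-diagonal pairs only.
    """
    n = len(phenotypes)
    if n < 2 or len(grm) != n:
        raise ValueError("need at least two individuals and a matching GRM")

    # Standardize phenotypes to mean 0 and sample variance 1.
    mean = sum(phenotypes) / n
    variance = sum((y - mean) ** 2 for y in phenotypes) / (n - 1)
    if variance == 0:
        raise ValueError("phenotype has zero variance")
    sd = variance ** 0.5
    z = [(y - mean) / sd for y in phenotypes]

    # Least-squares slope of z_i * z_j on A_ij over all pairs i < j.
    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(range(n), 2):
        numerator += grm[i][j] * z[i] * z[j]
        denominator += grm[i][j] ** 2
    if denominator == 0:
        raise ValueError("all off-diagonal GRM entries are zero")
    return numerator / denominator
```

With real data the estimate is only meaningful for large samples, because off-diagonal relationships among unrelated individuals are close to zero and the sampling variance is correspondingly large, which is consistent with the abstract's remark about the sample sizes needed for epistatic variance.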


Subject(s)
Datasets as Topic , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics , Biological Specimen Banks , Epistasis, Genetic , Female , Genotype , Humans , Male , Models, Genetic , Phenotype , Reproducibility of Results , United Kingdom
3.
Genet Sel Evol ; 56(1): 50, 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38937662

ABSTRACT

BACKGROUND: Genome sequence variants affecting complex traits (quantitative trait loci, QTL) are enriched in functional regions of the genome, such as those marked by certain histone modifications. These variants are believed to influence gene expression. However, due to the linkage disequilibrium among nearby variants, pinpointing the precise location of QTL is challenging. We aimed to identify allele-specific binding (ASB) QTL (asbQTL) that cause variation in the level of histone modification, as measured by the height of peaks assayed by ChIP-seq (chromatin immunoprecipitation sequencing). We identified DNA sequences that predict the difference between alleles in ChIP-seq peak height in H3K4me3 and H3K27ac histone modifications in the mammary glands of cows. RESULTS: We used a gapped k-mer support vector machine, a novel best linear unbiased prediction model, and a multiple linear regression model that combines the other two approaches to predict variant impacts on peak height. For each method, a subset of 1000 sites with the highest magnitude of predicted ASB was considered as candidate asbQTL. The accuracy of this prediction was measured by the proportion where the predicted direction matched the observed direction. Prediction accuracy ranged between 0.59 and 0.74, suggesting that these 1000 sites are enriched for asbQTL. Using independent data, we investigated functional enrichment in the candidate asbQTL set and three control groups, including non-causal ASB sites, non-ASB variants under a peak, and SNPs (single nucleotide polymorphisms) not under a peak. For H3K4me3, a higher proportion of the candidate asbQTL were confirmed as ASB when compared to the non-causal ASB sites (P < 0.01). However, these candidate asbQTL did not enrich for the other annotations, including expression QTL (eQTL), allele-specific expression QTL (aseQTL) and sites conserved across mammals (P > 0.05). 
CONCLUSIONS: We identified putatively causal sites for asbQTL using the DNA sequence surrounding these sites. Our results suggest that many sites influencing histone modifications may not directly affect gene expression. However, it is important to acknowledge that distinguishing between putative causal ASB sites and other non-causal ASB sites in high linkage disequilibrium with the causal sites regarding their impact on gene expression may be challenging due to limitations in statistical power.
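
Illustrative sketch (not taken from the article): the abstract measures prediction accuracy as the proportion of sites where the predicted direction of allele-specific binding matches the observed direction. A minimal version of that metric is shown below; the function name and the choice to skip sites with a zero predicted or observed effect are assumptions made for illustration.

```python
def directional_accuracy(predicted, observed):
    """Proportion of sites whose predicted effect sign matches the observed sign.

    predicted, observed: equal-length lists of signed effect sizes.
    Assumption: sites with a zero effect on either side carry no direction
    and are excluded from the denominator.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")

    pairs = [(p, o) for p, o in zip(predicted, observed) if p != 0 and o != 0]
    if not pairs:
        raise ValueError("no sites with a defined direction")

    matches = sum(1 for p, o in pairs if (p > 0) == (o > 0))
    return matches / len(pairs)


# Three of four toy sites agree in sign.
accuracy = directional_accuracy([0.5, -0.2, 0.1, -0.4], [1.0, 0.3, 0.2, -0.9])
```

Under this definition a value of 0.5 corresponds to chance agreement, which gives a baseline for reading the reported range of 0.59 to 0.74.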


Subject(s)
Alleles , Chromatin Immunoprecipitation Sequencing , Histones , Quantitative Trait Loci , Animals , Cattle/genetics , Histones/genetics , Histones/metabolism , Chromatin Immunoprecipitation Sequencing/methods , Polymorphism, Single Nucleotide , Histone Code , Linkage Disequilibrium , Molecular Sequence Annotation , Female
4.
Genet Sel Evol ; 56(1): 22, 2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38549172

ABSTRACT

BACKGROUND: Bovine lactoferrin (Lf) is an iron absorbing whey protein with antibacterial, antiviral, and antifungal activity. Lactoferrin is economically valuable and has an extremely variable concentration in milk, partly driven by environmental influences such as milking frequency, involution, or mastitis. A significant genetic influence has also been previously observed to regulate lactoferrin content in milk. Here, we conducted genetic mapping of lactoferrin protein concentration in conjunction with RNA-seq, ChIP-seq, and ATAC-seq data to pinpoint candidate causative variants that regulate lactoferrin concentrations in milk. RESULTS: We identified a highly significant lactoferrin protein quantitative trait locus (pQTL), as well as a cis lactotransferrin (LTF) expression QTL (cis-eQTL) mapping to the LTF locus. Using ChIP-seq and ATAC-seq datasets representing lactating mammary tissue samples, we also report a number of regions where the openness of chromatin is under genetic influence. Several of these also show highly significant QTL with genetic signatures similar to those highlighted through pQTL and eQTL analysis. By performing correlation analysis between these QTL, we revealed an ATAC-seq peak in the putative promoter region of LTF that highlights a set of 115 high-frequency variants that are potentially responsible for these effects. One of the 115 variants (rs110000337), which maps within the ATAC-seq peak, was predicted to alter binding sites of transcription factors known to be involved in lactation-related pathways. CONCLUSIONS: Here, we report a regulatory haplotype of 115 variants with conspicuously large impacts on milk lactoferrin concentration. These findings could enable the selection of animals for high-producing specialist herds.


Subject(s)
Lactation , Lactoferrin , Milk , Animals , Female , Haplotypes , Lactation/genetics , Lactoferrin/genetics , Lactoferrin/analysis , Lactoferrin/metabolism , Milk/chemistry , Milk/metabolism , Cattle
6.
Genet Sel Evol ; 55(1): 9, 2023 Jan 31.
Article in English | MEDLINE | ID: mdl-36721111

ABSTRACT

Studies have demonstrated that structural variants (SV) play a substantial role in the evolution of species and have an impact on Mendelian traits in the genome. However, unlike small variants (< 50 bp), it has been challenging to accurately identify and genotype SV at the population scale using short-read sequencing. Long-read sequencing technologies are becoming competitively priced and can address several of the disadvantages of short-read sequencing for the discovery and genotyping of SV. In livestock species, analysis of SV at the population scale still faces challenges due to the lack of resources, high costs, technological barriers, and computational limitations. In this review, we summarize recent progress in the characterization of SV in the major livestock species, the obstacles that still need to be overcome, as well as the future directions in this growing field. It seems timely that research communities pool resources to build global population-scale long-read sequencing consortiums for the major livestock species for which the application of genomic tools has become cost-effective.


Subject(s)
Genomics , Livestock , Animals , Livestock/genetics , Genotype , Phenotype
7.
Eur Heart J ; 43(19): 1864-1877, 2022 May 14.
Article in English | MEDLINE | ID: mdl-35567557

ABSTRACT

AIMS: Inflammation is a key factor in atherosclerosis. The transcription factor interferon regulatory factor-5 (IRF5) drives macrophages towards a pro-inflammatory state. We investigated the role of IRF5 in human atherosclerosis and plaque stability. METHODS AND RESULTS: Bulk RNA sequencing data from the Carotid Plaque Imaging Project biobank were used to mine associations between major macrophage-associated genes and transcription factors and human symptomatic carotid disease. Immunohistochemistry, proximity extension assays, and Helios cytometry by time of flight (CyTOF) were used for validation. The effect of IRF5 deficiency on carotid plaque phenotype and rupture in ApoE-/- mice was studied in an inducible model of plaque rupture. Interferon regulatory factor-5 and ITGAX/CD11c were identified as the macrophage-associated genes with the strongest associations with symptomatic carotid disease. Expression of IRF5 and ITGAX/CD11c correlated with the vulnerability index, pro-inflammatory plaque cytokine levels, necrotic core area, and with each other. Macrophages were the predominant CD11c-expressing immune cells in the plaque by CyTOF and immunohistochemistry. Interferon regulatory factor-5 immunopositive areas were predominantly found within CD11c+ areas with a predilection for the shoulder region, the area of the human plaque most prone to rupture. Accordingly, an inducible plaque rupture model of ApoE-/-Irf5-/- mice had significantly lower frequencies of carotid plaque ruptures, smaller necrotic cores, and fewer CD11c+ macrophages than their IRF5-competent counterparts. CONCLUSION: Using complementary evidence from human carotid endarterectomies and a murine model of inducible rupture of carotid artery plaque in IRF5-deficient mice, we demonstrate a mechanistic link between the pro-inflammatory transcription factor IRF5, macrophage phenotype, plaque inflammation, and plaque vulnerability to rupture.


Subject(s)
Atherosclerosis , Interferon Regulatory Factors , Macrophages , Plaque, Atherosclerotic , Animals , Apolipoproteins E/genetics , Atherosclerosis/metabolism , Atherosclerosis/pathology , Humans , Inflammation/metabolism , Interferon Regulatory Factors/metabolism , Macrophages/immunology , Mice , Necrosis , Plaque, Atherosclerotic/metabolism , Plaque, Atherosclerotic/pathology
8.
BMC Genomics ; 23(1): 815, 2022 Dec 08.
Article in English | MEDLINE | ID: mdl-36482302

ABSTRACT

BACKGROUND: Causal variants for complex traits, such as eQTL, are often found in non-coding regions of the genome, where they are hypothesised to influence phenotypes by regulating gene expression. Many regulatory regions are marked by histone modifications, which can be assayed by chromatin immunoprecipitation followed by sequencing (ChIP-seq). Sequence reads from ChIP-seq form peaks at putative regulatory regions, which may reflect the amount of regulatory activity at this region. Therefore, eQTL which are also associated with differences in histone modifications are excellent candidate causal variants. RESULTS: We assayed the histone modifications H3K4Me3, H3K4Me1 and H3K27ac and mRNA in the mammary gland of up to 400 animals. We identified QTL for peak height (histone QTL), exon expression (eeQTL), allele specific expression (aseQTL) and allele specific binding (asbQTL). By intersecting these results, we identify variants which may influence gene expression by altering regulatory regions of the genome, and may be causal variants for other traits. Lastly, we find that these variants are found in putative transcription factor binding sites, identifying a mechanism for the effect of many eQTL. CONCLUSIONS: We find that allele specific and traditional QTL analysis often identify the same genetic variants and provide evidence that many eQTL are regulatory variants which alter activity at regulatory regions of the bovine genome. Our work provides methodological and biological updates on how regulatory mechanisms interplay at multi-omics levels.


Subject(s)
Histone Code , Multiomics , Cattle/genetics , Animals , Genetic Variation , Gene Expression
9.
Mol Psychiatry ; 26(6): 2070-2081, 2021 Jun.
Article in English | MEDLINE | ID: mdl-32398722

ABSTRACT

Substantial genetic liability is shared across psychiatric disorders but less is known about risk variants that are specific to a given disorder. We used multi-trait conditional and joint analysis (mtCOJO) to adjust GWAS summary statistics of one disorder for the effects of genetically correlated traits to identify putative disorder-specific SNP associations. We applied mtCOJO to summary statistics for five psychiatric disorders from the Psychiatric Genomics Consortium: schizophrenia (SCZ), bipolar disorder (BIP), major depression (MD), attention-deficit hyperactivity disorder (ADHD) and autism (AUT). Most genome-wide significant variants for these disorders had evidence of pleiotropy (i.e., impact on multiple psychiatric disorders) and hence have reduced mtCOJO conditional effect sizes. However, subsets of genome-wide significant variants had larger conditional effect sizes consistent with disorder-specific effects: 15 of 130 genome-wide significant variants for schizophrenia, 5 of 40 for major depression, 3 of 11 for ADHD and 1 of 2 for autism. We show that decreased expression of VPS29 in the brain may increase risk to SCZ only and increased expression of CSE1L is associated with SCZ and MD, but not with BIP. Likewise, decreased expression of PCDHA7 in the brain is linked to increased risk of MD but decreased risk of SCZ and BIP.


Subject(s)
Attention Deficit Disorder with Hyperactivity , Bipolar Disorder , Schizophrenia , Attention Deficit Disorder with Hyperactivity/genetics , Bipolar Disorder/genetics , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Humans , Polymorphism, Single Nucleotide/genetics , Schizophrenia/genetics
10.
Genet Sel Evol ; 54(1): 60, 2022 Sep 06.
Article in English | MEDLINE | ID: mdl-36068488

ABSTRACT

BACKGROUND: Sharing individual phenotype and genotype data between countries is complex and fraught with potential errors, while sharing summary statistics of genome-wide association studies (GWAS) is relatively straightforward, and thus would be especially useful for traits that are expensive or difficult-to-measure, such as feed efficiency. Here we examined: (1) the sharing of individual cow data from international partners; and (2) the use of sequence variants selected from GWAS of international cow data to evaluate the accuracy of genomic estimated breeding values (GEBV) for residual feed intake (RFI) in Australian cows. RESULTS: GEBV for RFI were estimated using genomic best linear unbiased prediction (GBLUP) with 50k or high-density single nucleotide polymorphisms (SNPs), from a training population of 3797 individuals in univariate to trivariate analyses where the three traits were RFI phenotypes calculated using 584 Australian lactating cows (AUSc), 824 growing heifers (AUSh), and 2526 international lactating cows (OVE). Accuracies of GEBV in AUSc were evaluated by either cohort-by-birth-year or fourfold random cross-validations. GEBV of AUSc were also predicted using only the AUS training population with a weighted genomic relationship matrix constructed with SNPs from the 50k array and sequence variants selected from a meta-GWAS that included only international datasets. The genomic heritabilities estimated using the AUSc, OVE and AUSh datasets were moderate, ranging from 0.20 to 0.36. The genetic correlations (rg) of traits between heifers and cows ranged from 0.30 to 0.95 but were associated with large standard errors. The mean accuracies of GEBV in Australian cows were up to 0.32 and almost doubled when either overseas cows, or both overseas cows and AUS heifers were included in the training population. They also increased when selected sequence variants were combined with 50k SNPs, but with a smaller relative increase. 
CONCLUSIONS: The accuracy of RFI GEBV increased when international data were used or when selected sequence variants were combined with 50k SNP array data. This suggests that if direct sharing of data is not feasible, a meta-analysis of summary GWAS statistics could provide selected SNPs for custom panels to use in genomic selection programs. However, since this finding is based on a small cross-validation study, confirmation through a larger study is recommended.


Subject(s)
Cattle , Lactation , Animals , Australia , Cattle/genetics , Female , Genome-Wide Association Study , Genomics , Genotype , Phenotype , Polymorphism, Single Nucleotide
12.
Proc Natl Acad Sci U S A ; 116(39): 19398-19408, 2019 Sep 24.
Article in English | MEDLINE | ID: mdl-31501319

ABSTRACT

Many genome variants shaping mammalian phenotype are hypothesized to regulate gene transcription and/or to be under selection. However, most of the evidence to support this hypothesis comes from human studies. Systematic evidence for regulatory and evolutionary signals contributing to complex traits in a different mammalian model is needed. Sequence variants associated with gene expression (expression quantitative trait loci [eQTLs]) and concentration of metabolites (metabolic quantitative trait loci [mQTLs]) and under histone-modification marks in several tissues were discovered from multiomics data of over 400 cattle. Variants under selection and evolutionary constraint were identified using genome databases of multiple species. These analyses defined 30 sets of variants, and for each set, we estimated the genetic variance the set explained across 34 complex traits in 11,923 bulls and 32,347 cows with 17,669,372 imputed variants. The per-variant trait heritability of these sets across traits was highly consistent (r > 0.94) between bulls and cows. Based on the per-variant heritability, conserved sites across 100 vertebrate species and mQTLs ranked the highest, followed by eQTLs, young variants, those under histone-modification marks, and selection signatures. From these results, we defined a Functional-And-Evolutionary Trait Heritability (FAETH) score indicating the functionality and predicted heritability of each variant. In an additional 7,551 cattle, the high FAETH-ranking variants had significantly increased genetic variances and genomic prediction accuracies in 3 production traits compared to the low FAETH-ranking variants. The FAETH framework combines the information of gene regulation, evolution, and trait heritability to rank variants, and the publicly available FAETH data provide a set of biological priors for cattle genomic selection worldwide.


Subject(s)
Biological Evolution , Cattle/genetics , Gene Expression Regulation/genetics , Multifactorial Inheritance/genetics , Animals , Breeding , Databases, Genetic , Female , Genetic Variation , Genome/genetics , Genome-Wide Association Study , Male , Phenotype , Quantitative Trait Loci/genetics , Selection, Genetic
13.
Annu Rev Genet ; 47: 75-95, 2013.
Article in English | MEDLINE | ID: mdl-23988118

ABSTRACT

Understanding genetic variation of complex traits in human populations has moved from the quantification of the resemblance between close relatives to the dissection of genetic variation into the contributions of individual genomic loci. However, major questions remain unanswered: How much phenotypic variation is genetic; how much of the genetic variation is additive and can be explained by fitting all genetic variants simultaneously in one model, and what is the joint distribution of effect size and allele frequency at causal variants? We review and compare three whole-genome analysis methods that use mixed linear models (MLMs) to estimate genetic variation. In all methods, genetic variation is estimated from the relationship between close or distant relatives on the basis of pedigree information and/or single nucleotide polymorphisms (SNPs). We discuss theory, estimation procedures, bias, and precision of each method and review recent advances in the dissection of genetic variation of complex traits in human populations. By using genome-wide data, it is now established that SNPs in total account for far more of the genetic variation than the statistically highly significant SNPs that have been detected in genome-wide association studies. All SNPs together, however, do not account for all of the genetic variance estimated by pedigree-based methods. We explain possible reasons for this remaining "missing heritability."


Subject(s)
Genetic Variation , Genome, Human , Genome-Wide Association Study , Models, Genetic , Multifactorial Inheritance , Body Height/genetics , Family , Genome-Wide Association Study/methods , Genome-Wide Association Study/statistics & numerical data , Humans , Linear Models , Pedigree , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait, Heritable , Research Design , Sample Size , Sequence Analysis, DNA , Siblings , Twins, Dizygotic/genetics , Twins, Monozygotic/genetics
15.
Nat Rev Genet ; 14(7): 507-15, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23774735

ABSTRACT

The success of genome-wide association studies (GWASs) has led to increasing interest in making predictions of complex trait phenotypes, including disease, from genotype data. Rigorous assessment of the value of predictors is crucial before implementation. Here we discuss some of the limitations and pitfalls of prediction analysis and show how naive implementations can lead to severe bias and misinterpretation of results.


Subject(s)
Genome-Wide Association Study , Phenotype , Polymorphism, Single Nucleotide , Genetic Markers/genetics , Genetic Variation , Genomics , Genotype , Humans , Models, Genetic , Models, Statistical , Reproducibility of Results , Risk
16.
Circulation ; 136(12): 1140-1154, 2017 Sep 19.
Article in English | MEDLINE | ID: mdl-28698173

ABSTRACT

BACKGROUND: Myeloid cells are central to atherosclerotic lesion development and vulnerable plaque formation. Impaired ability of arterial phagocytes to take up apoptotic cells (efferocytosis) promotes lesion growth and establishment of a necrotic core. The transcription factor interferon regulatory factor (IRF)-5 is an important modulator of myeloid function and programming. We sought to investigate whether IRF5 affects the formation and phenotype of atherosclerotic lesions. METHODS: We investigated the role of IRF5 in atherosclerosis in 2 complementary models. First, atherosclerotic lesion development in hyperlipidemic apolipoprotein E-deficient (ApoE-/-) mice and ApoE-/- mice with a genetic deletion of IRF5 (ApoE-/-Irf5-/-) was compared and then lesion development was assessed in a model of shear stress-modulated vulnerable plaque formation. RESULTS: Both lesion and necrotic core size were significantly reduced in ApoE-/-Irf5-/- mice compared with IRF5-competent ApoE-/- mice. Necrotic core size was also reduced in the model of shear stress-modulated vulnerable plaque formation. A significant loss of CD11c+ macrophages was evident in ApoE-/-Irf5-/- mice in the aorta, draining lymph nodes, and bone marrow cell cultures, indicating that IRF5 maintains CD11c+ macrophages in atherosclerosis. Moreover, we revealed that the CD11c gene is a direct target of IRF5 in macrophages. In the absence of IRF5, CD11c- macrophages displayed a significant increase in expression of the efferocytosis-regulating integrin-β3 and its ligand milk fat globule-epidermal growth factor 8 protein and enhanced efferocytosis in vitro and in situ. CONCLUSIONS: IRF5 is detrimental in atherosclerosis by promoting the maintenance of proinflammatory CD11c+ macrophages within lesions and controlling the expansion of the necrotic core by impairing efferocytosis.


Subject(s)
Atherosclerosis/pathology , Interferon Regulatory Factors/metabolism , Animals , Aorta/metabolism , Aorta/pathology , Apolipoproteins E/deficiency , Apolipoproteins E/genetics , Atherosclerosis/metabolism , Bone Marrow Cells/cytology , Bone Marrow Cells/metabolism , CD11c Antigen/genetics , CD11c Antigen/metabolism , Cells, Cultured , Immunohistochemistry , Integrin beta3/metabolism , Interferon Regulatory Factors/deficiency , Interferon Regulatory Factors/genetics , Lymph Nodes/cytology , Macrophages/cytology , Macrophages/metabolism , Mice , Mice, Inbred C57BL , Mice, Knockout , Necrosis , Phagocytosis , Shear Strength
17.
BMC Genomics ; 19(1): 793, 2018 Nov 03.
Article in English | MEDLINE | ID: mdl-30390624

ABSTRACT

BACKGROUND: The mutations changing the expression level of a gene, or expression quantitative trait loci (eQTL), can be identified by testing the association between genetic variants and gene expression in multiple individuals (eQTL mapping), or by comparing the expression of the alleles in a heterozygous individual (allele specific expression or ASE analysis). The aims of the study were to find and compare ASE and local eQTL in 4 bovine RNA-sequencing (RNA-Seq) datasets, validate them in an independent ASE study and investigate if they are associated with complex trait variation. RESULTS: We present a novel method for distinguishing between ASE driven by polymorphisms in cis and parent of origin effects. We found that single nucleotide polymorphisms (SNPs) driving ASE are also often local eQTL and therefore presumably cis eQTL. These SNPs often, but not always, affect gene expression in multiple tissues and, when they do, the allele increasing expression is usually the same. However, there were systematic differences between ASE and local eQTL and between tissues and breeds. We also found that SNPs significantly associated with gene expression (p < 0.001) were likely to influence some complex traits (p < 0.001), which means that some mutations influence variation in complex traits by changing the expression level of genes. CONCLUSION: We conclude that ASE detects phenomena that overlap with local eQTL, but there are also systematic differences between the SNPs discovered by the two methods. Some mutations influencing complex traits are actually eQTL and can be discovered using RNA-Seq, including eQTL in the genes CAST, CAPN1, LCORL and LEPROTL1.


Subject(s)
Alleles , Gene Expression , Genetic Variation , Multifactorial Inheritance , Quantitative Trait Loci , Quantitative Trait, Heritable , Animals , Cattle , Chromosome Mapping , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Reproducibility of Results , Sequence Analysis, RNA
18.
BMC Genomics ; 19(1): 521, 2018 Jul 04.
Article in English | MEDLINE | ID: mdl-29973141

ABSTRACT

BACKGROUND: Mammalian phenotypes are shaped by numerous genome variants, many of which may regulate gene transcription or RNA splicing. To identify variants with regulatory functions in cattle, an important economic and model species, we used sequence variants to map a type of expression quantitative trait loci (expression QTLs) that are associated with variations in the RNA splicing, i.e., sQTLs. To further the understanding of regulatory variants, sQTLs were compared with two other types of expression QTLs, 1) variants associated with variations in gene expression, i.e., geQTLs and 2) variants associated with variations in exon expression, i.e., eeQTLs, in different tissues. RESULTS: Using whole genome and RNA sequence data from four tissues of over 200 cattle, sQTLs identified using exon inclusion ratios were verified by matching their effects on adjacent intron excision ratios. sQTLs contained the highest percentage of variants that are within the intronic region of genes and contained the lowest percentage of variants that are within intergenic regions, compared to eeQTLs and geQTLs. Many geQTLs and sQTLs are also detected as eeQTLs. Many expression QTLs, including sQTLs, were significant in all four tissues and had a similar effect in each tissue. To verify such expression QTL sharing between tissues, variants surrounding (±1 Mb) the exon or gene were used to build local genomic relationship matrices (LGRM) and estimated genetic correlations between tissues. For many exons, the splicing and expression level was determined by the same cis additive genetic variance in different tissues. Thus, an effective but simple-to-implement meta-analysis combining information from three tissues is introduced to increase power to detect and validate sQTLs. sQTLs and eeQTLs together were more enriched for variants associated with cattle complex traits, compared to geQTLs.
Several putative causal mutations were identified, including an sQTL at Chr6:87392580 within the 5th exon of kappa casein (CSN3) associated with milk production traits. CONCLUSIONS: Using novel analytical approaches, we report the first identification of numerous bovine sQTLs which are extensively shared between multiple tissue types. The significant overlaps between bovine sQTLs and complex traits QTL highlight the contribution of regulatory mutations to phenotypic variations.


Subject(s)
Genetic Variation , RNA Splicing , Animals , Blood Cells/metabolism , Caseins/genetics , Cattle , Exons , Female , Liver/metabolism , Mammary Glands, Animal/metabolism , Muscles/metabolism , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Transcriptome
19.
Am J Hum Genet ; 97(5): 677-90, 2015 Nov 05.
Article in English | MEDLINE | ID: mdl-26544803

ABSTRACT

Genetic prediction based on either identity by state (IBS) sharing or pedigree information has been investigated extensively with best linear unbiased prediction (BLUP) methods. Such methods were pioneered in plant and animal-breeding literature and have since been applied to predict human traits, with the aim of eventual clinical utility. However, methods to combine IBS sharing and pedigree information for genetic prediction in humans have not been explored. We introduce a two-variance-component model for genetic prediction: one component for IBS sharing and one for approximate pedigree structure, both estimated with genetic markers. In simulations using real genotypes from the Candidate-gene Association Resource (CARe) and Framingham Heart Study (FHS) family cohorts, we demonstrate that the two-variance-component model achieves gains in prediction $r^2$ over standard BLUP at current sample sizes, and we project, based on simulations, that these gains will continue to hold at larger sample sizes. Accordingly, in analyses of four quantitative phenotypes from CARe and two quantitative phenotypes from FHS, the two-variance-component model significantly improves prediction $r^2$ in each case, with up to a 20% relative improvement. We also find that standard mixed-model association tests can produce inflated test statistics in datasets with related individuals, whereas the two-variance-component model corrects for inflation.


Subject(s)
Cardiovascular Diseases/diagnosis , Genetic Markers , Genome-Wide Association Study , Models, Genetic , Models, Statistical , Quantitative Trait Loci , Cardiovascular Diseases/genetics , Computer Simulation , Datasets as Topic , Family , Genetic Association Studies , Genomics/methods , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics , Principal Component Analysis , Selection, Genetic/genetics
20.
Am J Hum Genet ; 96(5): 720-30, 2015 May 07.
Article in English | MEDLINE | ID: mdl-25892111

ABSTRACT

We introduce a liability-threshold mixed linear model (LTMLM) association statistic for case-control studies and show that it has a well-controlled false-positive rate and more power than existing mixed-model methods for diseases with low prevalence. Existing mixed-model methods suffer a loss in power under case-control ascertainment, but no solution has been proposed. Here, we solve this problem by using a $\chi^2$ score statistic computed from posterior mean liabilities (PMLs) under the liability-threshold model. Each individual's PML is conditional not only on that individual's case-control status but also on every individual's case-control status and the genetic relationship matrix (GRM) obtained from the data. The PMLs are estimated with a multivariate Gibbs sampler; the liability-scale phenotypic covariance matrix is based on the GRM, and a heritability parameter is estimated via Haseman-Elston regression on case-control phenotypes and then transformed to the liability scale. In simulations of unrelated individuals, the LTMLM statistic was correctly calibrated and achieved higher power than existing mixed-model methods for diseases with low prevalence, and the magnitude of the improvement depended on sample size and severity of case-control ascertainment. In a Wellcome Trust Case Control Consortium 2 multiple sclerosis dataset with >10,000 samples, LTMLM was correctly calibrated and attained a 4.3% improvement (p = 0.005) in $\chi^2$ statistics over existing mixed-model methods at 75 known associated SNPs, consistent with simulations. Larger increases in power are expected at larger sample sizes. In conclusion, case-control studies of diseases with low prevalence can achieve power higher than that in existing mixed-model methods.
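
Illustrative sketch (not taken from the article): the abstract states that a heritability estimated on case-control phenotypes is transformed to the liability scale. A commonly used transformation for ascertained case-control samples multiplies the observed-scale estimate by $K^2(1-K)^2 / (z^2 P(1-P))$, where $K$ is the population prevalence, $P$ is the proportion of cases in the sample, and $z$ is the standard normal density at the liability threshold. Whether LTMLM applies exactly this formula is an assumption; the function and parameter names below are hypothetical.

```python
from statistics import NormalDist


def observed_to_liability_h2(h2_observed, prevalence, case_fraction):
    """Convert an observed-scale heritability to the liability scale.

    h2_observed: estimate from 0/1 case-control phenotypes.
    prevalence: population prevalence K of the disease.
    case_fraction: proportion P of cases in the (ascertained) sample.
    Assumption: the standard ascertainment-corrected scaling factor is used.
    """
    if not (0.0 < prevalence < 1.0 and 0.0 < case_fraction < 1.0):
        raise ValueError("prevalence and case_fraction must lie in (0, 1)")

    std_normal = NormalDist()
    # Liability threshold above which an individual is affected.
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    # Height of the standard normal density at that threshold.
    z = std_normal.pdf(threshold)

    scale = (prevalence * (1.0 - prevalence)) ** 2 / (
        z ** 2 * case_fraction * (1.0 - case_fraction)
    )
    return h2_observed * scale


# Hypothetical inputs: 1% prevalence, balanced case-control sample.
h2_liability = observed_to_liability_h2(0.30, 0.01, 0.5)
```

When the sample is not ascertained (case fraction equal to prevalence), the factor reduces to $K(1-K)/z^2$, the usual conversion between the observed and liability scales.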


Subject(s)
Genetic Association Studies , Models, Genetic , Models, Theoretical , Case-Control Studies , Chromosome Mapping , Computer Simulation , Humans , Multiple Sclerosis/genetics , Multiple Sclerosis/pathology , Phenotype , Polymorphism, Single Nucleotide , Sample Size