Results 1 - 20 of 23
1.
Cell ; 167(3): 643-656.e17, 2016 Oct 20.
Article in English | MEDLINE | ID: mdl-27768888

ABSTRACT

Humans differ in the outcome that follows exposure to life-threatening pathogens, yet the extent of population differences in immune responses and their genetic and evolutionary determinants remain undefined. Here, we characterized, using RNA sequencing, the transcriptional response of primary monocytes from Africans and Europeans to bacterial and viral stimuli (ligands activating Toll-like receptor pathways [TLR1/2, TLR4, and TLR7/8] and influenza virus) and mapped expression quantitative trait loci (eQTLs). We identify numerous cis-eQTLs that contribute to the marked differences in immune responses detected within and between populations and a strong trans-eQTL hotspot at TLR1 that decreases expression of pro-inflammatory genes in Europeans only. We find that immune-responsive regulatory variants are enriched in population-specific signals of natural selection and show that admixture with Neandertals introduced regulatory variants into European genomes, affecting preferentially responses to viral challenges. Together, our study uncovers evolutionarily important determinants of differences in host immune responsiveness between human populations.


Subject(s)
Adaptation, Physiological/genetics , Adaptation, Physiological/immunology , Adaptive Immunity , Neanderthals/genetics , Neanderthals/immunology , Adaptive Immunity/genetics , Alleles , Animals , Bacterial Infections/genetics , Bacterial Infections/immunology , Base Sequence , Biological Evolution , Black People/genetics , Gene Expression Regulation , Genetic Variation , Humans , Immune System , Quantitative Trait Loci , RNA/genetics , Selection, Genetic , Sequence Analysis, RNA , Toll-Like Receptors/genetics , Transcription, Genetic , Virus Diseases/genetics , Virus Diseases/immunology , White People/genetics
2.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38546325

ABSTRACT

Expression quantitative trait loci (eQTLs) are used to inform the mechanisms of transcriptional regulation in eukaryotic cells. However, the specificity of genome-wide eQTL identification is limited by stringent control for false discoveries. Here, we described a method based on the non-homogeneous Poisson process to identify 125 489 regions with highly frequent, multiple eQTL associations, or 'eQTL-hotspots', from the public database of 59 human tissues or cell types. We stratified the eQTL-hotspots into two classes with their distinct sequence and epigenomic characteristics. Based on these classifications, we developed a machine-learning model, E-SpotFinder, for augmented discovery of tissue- or cell-type-specific eQTL-hotspots. We applied this model to 36 tissues or cell types. Using augmented eQTL-hotspots, we recovered 655 402 eSNPs and reconstructed a comprehensive regulatory network of 2 725 380 cis-interactions among eQTL-hotspots. We further identified 52 012 modules representing transcriptional programs with unique functional backgrounds. In summary, our study provided a framework of epigenome-augmented eQTL analysis and thereby constructed comprehensive genome-wide networks of cis-regulations across diverse human tissues or cell types.


Subject(s)
Epigenome , Epigenomics , Humans , Databases, Factual , Eukaryotic Cells , Machine Learning
3.
Am J Hum Genet ; 109(5): 783-801, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35334221

ABSTRACT

Integrative analysis of genome-wide association studies (GWASs) and gene expression studies in the form of a transcriptome-wide association study (TWAS) has the potential to better elucidate the molecular mechanisms underlying disease etiology. Here we present a method, METRO, that can leverage gene expression data collected from multiple genetic ancestries to enhance TWASs. METRO incorporates expression prediction models constructed in different genetic ancestries through a likelihood-based inference framework, producing calibrated p values with substantially improved TWAS power. We illustrate the benefits of METRO in both simulations and applications to seven complex traits and diseases obtained from four GWASs. These GWASs include two of primarily European ancestry (n = 188,577 and 339,226) and two of primarily African ancestry (n = 42,752 and 23,827). In the real data applications, we leverage gene expression data measured on 1,032 African Americans and 801 European Americans from the Genetic Epidemiology Network of Arteriopathy (GENOA) study to identify a substantially larger number of gene-trait associations as compared to existing TWAS approaches. The benefits of METRO are most prominent in applications to GWASs of African ancestry where the sample size is much smaller than GWASs of European ancestry and where a more powerful TWAS method is crucial. Among the identified associations are high-density lipoprotein-associated genes including PLTP and PPARG that are critical for maintaining lipid homeostasis and the type II diabetes-associated gene MAPT that supports microtubule-associated protein tau as a key component underlying impaired insulin secretion.


Subject(s)
Diabetes Mellitus, Type 2 , Genome-Wide Association Study , Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Likelihood Functions , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Transcriptome/genetics
4.
Genet Epidemiol ; 44(8): 798-810, 2020 11.
Article in English | MEDLINE | ID: mdl-32700329

ABSTRACT

Many expression quantitative trait loci (eQTL) studies have been conducted to investigate the biological effects of variants in gene regulation. However, these eQTL studies may suffer from low or moderate statistical power and an overly conservative false-discovery rate. In practice, most algorithms for eQTL identification do not model the joint effects of multiple genetic variants with weak or moderate influence. Here we present a novel machine-learning algorithm, lasso least-squares kernel machine (LSKM-LASSO), that models the association between multiple genetic variants and phenotypic traits simultaneously in the presence of nongenetic and genetic confounding. With a more general and flexible framework for the estimation of genetic confounding, LSKM-LASSO is able to provide a more accurate evaluation of the joint effects of multiple genetic variants. Our simulations demonstrate that our approach outperforms three state-of-the-art alternatives in terms of eQTL identification and phenotype prediction. We then apply our method to genotype and gene expression data of 11 tissues obtained from the Genotype-Tissue Expression project. Our algorithm was able to identify more genes with eQTL than other algorithms. By incorporating a regularization term and combining it with a least-squares kernel machine, LSKM-LASSO provides a powerful tool for eQTL mapping and phenotype prediction.


Subject(s)
Machine Learning , Quantitative Trait Loci/genetics , Algorithms , Confounding Factors, Epidemiologic , Gene Expression Profiling , Gene Expression Regulation , Genotype , Humans , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide/genetics
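
The idea of modelling several variants jointly with a sparsity penalty can be illustrated with an ordinary lasso fitted by coordinate descent. This is only a sketch of the regularization term: it is not LSKM-LASSO, it omits the kernel-machine component that the abstract uses for confounding, and the function names and toy data are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the lasso proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(columns, y, penalty, n_iter=200):
    """Minimize (1/2n) * ||y - X b||^2 + penalty * ||b||_1.

    columns: list of predictor columns (one per variant), each centered.
    y: centered response (e.g. expression of one gene).
    """
    n = len(y)
    beta = [0.0] * len(columns)
    residual = list(y)
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            norm = sum(x * x for x in col) / n
            if norm == 0:
                continue  # monomorphic variant carries no information
            # Covariance of column j with the residual that excludes its own fit.
            rho = sum(x * (r + beta[j] * x) for x, r in zip(col, residual)) / n
            new_beta = soft_threshold(rho, penalty) / norm
            delta = new_beta - beta[j]
            if delta != 0.0:
                residual = [r - delta * x for r, x in zip(residual, col)]
                beta[j] = new_beta
    return beta


# Hypothetical toy data: two centered, uncorrelated variants; only the first
# one drives expression.
snp1 = [-1, -1, 0, 0, 1, 1]
snp2 = [1, -1, 1, -1, 1, -1]
expr = [2.0 * g for g in snp1]
beta = lasso_coordinate_descent([snp1, snp2], expr, penalty=0.1)
```

In this toy case the penalty shrinks the first coefficient slightly below its true value of 2 and sets the second exactly to zero, which is the variable-selection behaviour that motivates lasso-type eQTL models.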
5.
BMC Genomics ; 19(1): 793, 2018 Nov 03.
Article in English | MEDLINE | ID: mdl-30390624

ABSTRACT

BACKGROUND: The mutations changing the expression level of a gene, or expression quantitative trait loci (eQTL), can be identified by testing the association between genetic variants and gene expression in multiple individuals (eQTL mapping), or by comparing the expression of the alleles in a heterozygous individual (allele specific expression or ASE analysis). The aims of the study were to find and compare ASE and local eQTL in 4 bovine RNA-sequencing (RNA-Seq) datasets, validate them in an independent ASE study and investigate if they are associated with complex trait variation. RESULTS: We present a novel method for distinguishing between ASE driven by polymorphisms in cis and parent of origin effects. We found that single nucleotide polymorphisms (SNPs) driving ASE are also often local eQTL and therefore presumably cis eQTL. These SNPs often, but not always, affect gene expression in multiple tissues and, when they do, the allele increasing expression is usually the same. However, there were systematic differences between ASE and local eQTL and between tissues and breeds. We also found that SNPs significantly associated with gene expression (p < 0.001) were likely to influence some complex traits (p < 0.001), which means that some mutations influence variation in complex traits by changing the expression level of genes. CONCLUSION: We conclude that ASE detects phenomena that overlap with local eQTL, but there are also systematic differences between the SNPs discovered by the two methods. Some mutations influencing complex traits are actually eQTL and can be discovered using RNA-Seq including eQTL in the genes CAST, CAPN1, LCORL and LEPROTL1.


Subject(s)
Alleles , Gene Expression , Genetic Variation , Multifactorial Inheritance , Quantitative Trait Loci , Quantitative Trait, Heritable , Animals , Cattle , Chromosome Mapping , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Reproducibility of Results , Sequence Analysis, RNA
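
The ASE comparison described in this abstract (expression of the two alleles within one heterozygous individual) is commonly introduced with an exact binomial test against a 50:50 allelic ratio. The sketch below shows that generic test only. It is not the authors' novel method, it does not separate cis effects from parent-of-origin effects, and the function name and read counts are hypothetical.

```python
from math import comb


def ase_binomial_pvalue(ref_reads, alt_reads):
    """Two-sided exact binomial test of allelic balance (null: p = 0.5).

    Under the null each count k has probability comb(n, k) / 2**n, so the
    outcomes "at least as extreme" as the observation are those whose
    binomial coefficient does not exceed the observed one.
    """
    n = ref_reads + alt_reads
    observed = comb(n, ref_reads)
    extreme = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return extreme / 2 ** n


# Hypothetical read counts at one heterozygous SNP.
balanced = ase_binomial_pvalue(5, 5)      # no evidence of imbalance
imbalanced = ase_binomial_pvalue(30, 10)  # reference allele over-represented
```

Real pipelines usually also correct for reference-mapping bias and overdispersion, which this sketch ignores.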
6.
BMC Bioinformatics ; 17: 136, 2016 Mar 22.
Article in English | MEDLINE | ID: mdl-27000043

ABSTRACT

BACKGROUND: As a promising tool for dissecting the genetic basis of common diseases, expression quantitative trait loci (eQTL) studies have attracted increasing research interest. Traditional eQTL methods focus on testing the associations between individual single-nucleotide polymorphisms (SNPs) and gene expression traits. A major drawback of this approach is that it cannot model the joint effect of a set of SNPs on a set of genes, which may correspond to biological pathways. RESULTS: To alleviate this limitation, in this paper, we propose geQTL, a sparse regression method that can detect both group-wise and individual associations between SNPs and expression traits. geQTL can also correct for the effects of potential confounders. Our method employs a computationally efficient technique and can therefore handle large-scale studies. Moreover, our method can automatically infer the proper number of group-wise associations. We perform extensive experiments on both simulated datasets and yeast datasets to demonstrate the effectiveness and efficiency of the proposed method. The results show that geQTL can effectively detect both individual and group-wise signals and outperforms state-of-the-art methods by a large margin. CONCLUSIONS: This paper illustrates that decoupling individual and group-wise associations in association mapping improves both eQTL mapping accuracy and the inference of individual and group-wise associations.


Subject(s)
Quantitative Trait Loci , Saccharomyces cerevisiae/genetics , Algorithms , Chromosome Mapping , Phenotype , Polymorphism, Single Nucleotide , Regression Analysis
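
The "traditional" approach that this abstract contrasts with geQTL, testing one SNP against one expression trait at a time, can be sketched as a simple linear regression of expression on allele dosage. This illustrates the baseline only, not geQTL itself; the function name and toy data are hypothetical.

```python
import math


def single_snp_eqtl(genotypes, expression):
    """Regress expression on allele dosage (0, 1, 2) for one SNP-gene pair.

    Returns (slope, t_statistic); the t statistic has n - 2 degrees of freedom.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    # Residual sum of squares around the fitted line.
    rss = sum((e - (intercept + slope * g)) ** 2
              for g, e in zip(genotypes, expression))
    standard_error = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / standard_error


# Hypothetical toy data: expression rises with the number of alternate alleles.
dosage = [0, 0, 1, 1, 2, 2, 1, 0]
expr = [1.0, 1.2, 2.1, 1.9, 3.2, 2.8, 2.0, 0.9]
slope, t_stat = single_snp_eqtl(dosage, expr)
```

A genome-wide scan repeats this test for every SNP-gene pair and then corrects for multiple testing, which is why joint models such as the one proposed here are attractive.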
7.
BMC Bioinformatics ; 17(1): 531, 2016 Dec 13.
Article in English | MEDLINE | ID: mdl-27964730

ABSTRACT

BACKGROUND: Expression quantitative trait loci (eQTL) mapping is often used to identify genetic loci and candidate genes correlated with traits. Although usually a group of genes affect complex traits, genes in most eQTL mapping methods are considered as independent. Recently, some eQTL mapping methods have accounted for correlated genes, used biological prior knowledge and applied these in model species such as yeast or mouse. However, biological prior knowledge might be very limited for most species. RESULTS: We proposed a data-driven prior knowledge guided eQTL mapping for identifying candidate genes. At first, quantitative trait loci (QTL) analysis was used to identify single nucleotide polymorphism (SNP) markers that are associated with traits. Then co-expressed gene modules were generated and gene modules significantly associated with traits were selected. Prior knowledge from QTL mapping was used for eQTL mapping on the selected modules. We tested and compared prior knowledge guided eQTL mapping to the eQTL mapping with no prior knowledge in a simulation study and two barley stem rust resistance case studies. The results of the simulation study and the real barley case studies show that models using prior knowledge outperform models without prior knowledge. In the first case study, three gene modules were selected and one of the gene modules was enriched with defense response Gene Ontology (GO) terms. Also, one probe in the gene module was mapped to Rpg1, previously identified as a resistance gene to stem rust. In the second case study, four gene modules were identified, and one gene module was significantly enriched with defense response to fungus and bacterium. CONCLUSIONS: Prior knowledge guided eQTL mapping is an effective method for identifying candidate genes. The case studies in stem rust show that this approach is robust, and outperforms methods with no prior knowledge in identifying candidate genes.


Subject(s)
Chromosome Mapping/methods , Hordeum/genetics , Mice/genetics , Quantitative Trait Loci , Saccharomyces cerevisiae/genetics , Animals , Gene Regulatory Networks , Phenotype , Polymorphism, Single Nucleotide
8.
BMC Bioinformatics ; 17: 257, 2016 Jun 24.
Article in English | MEDLINE | ID: mdl-27341818

ABSTRACT

BACKGROUND: In order to better understand complex diseases, it is important to understand how genetic variation in the regulatory regions affects gene expression. Genetic variants found in these regulatory regions have been shown to activate transcription in a tissue-specific manner. Therefore, it is important to map the aforementioned expression quantitative trait loci (eQTL) using a statistically disciplined approach that jointly models all the tissues and makes use of all the information available to maximize the power of eQTL mapping. In this context, we are proposing a score test-based approach where we model tissue-specificity as a random effect and investigate an overall shift in the gene expression combined with tissue-specific effects due to genetic variants. RESULTS: Our approach has 1) a distinct computational edge, and 2) comparable performance in terms of statistical power over other currently existing joint modeling approaches such as MetaTissue eQTL and eQTL-BMA. Using simulations, we show that our method increases the power to detect eQTLs when compared to a tissue-by-tissue approach and can exceed the performance, in terms of computational speed, of MetaTissue eQTL and eQTL-BMA. We apply our method to two publicly available expression datasets from normal human brains, one comprised of four brain regions from 150 neuropathologically normal samples and another comprised of ten brain regions from 134 neuropathologically normal samples, and show that by using our method and jointly analyzing multiple brain regions, we identify eQTLs within more genes when compared to three often used existing methods. CONCLUSIONS: Since we employ a score test-based approach, there is no need for parameter estimation under the alternative hypothesis. As a result, model parameters only have to be estimated once per genome, significantly decreasing computation time. Our method also accommodates the analysis of next-generation sequencing data. As an example, by modeling gene transcripts in an analogous fashion to tissues in our current formulation one would be able to test for both a variant's overall effect across all isoforms of a gene as well as transcript-specific effects. We implement our approach within the R package JAGUAR, which is now available at the Comprehensive R Archive Network repository.


Subject(s)
Brain/physiology , Gene Expression Profiling , Quantitative Trait Loci , Software , Gene Expression Regulation , Genetic Variation , Genome-Wide Association Study , Humans , Organ Specificity , Regression Analysis , Regulatory Sequences, Nucleic Acid
9.
G3 (Bethesda) ; 14(2)2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38015660

ABSTRACT

Regulation of gene expression plays a crucial role in developmental processes and adaptation to changing environments. Expression quantitative trait locus (eQTL) mapping is a technique used to study the genetic regulation of gene expression using the transcriptomes of recombinant inbred lines (RILs). Typically, the age of the inbred lines at the time of RNA sampling is carefully controlled. This is necessary because the developmental process causes changes in gene expression, complicating the interpretation of eQTL mapping experiments. However, due to genetics and variation in ambient micro-environments, organisms can differ in their "developmental age," even if they are of the same chronological age. As a result, eQTL patterns are affected by developmental variation in gene expression. The model organism Caenorhabditis elegans is particularly suited for studying the effect of developmental variation on eQTL mapping patterns. In a span of days, C. elegans transitions from embryo through 4 larval stages to adult while undergoing massive changes to its transcriptome. Here, we use C. elegans to investigate the effect of developmental age variation on eQTL patterns and present a normalization procedure. We used dynamical eQTL mapping, which includes the developmental age as a cofactor, to separate the variation in development from genotypic variation and explain variation in gene expression levels. We compare classical single marker eQTL mapping and dynamical eQTL mapping using RNA-seq data of ∼200 multi-parental RILs of C. elegans. The results show that (1) many eQTLs are caused by developmental variation, (2) most trans-bands are developmental QTLs, and (3) dynamical eQTL mapping detects additional eQTLs not found with classical eQTL mapping. We recommend that correction for variation in developmental age should be strongly considered in eQTL mapping studies given the large impact of processes like development on the transcriptome.


Subject(s)
Caenorhabditis elegans , Quantitative Trait Loci , Animals , Caenorhabditis elegans/genetics , Chromosome Mapping/methods , Gene Expression Regulation , Genotype
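
Including developmental age as a cofactor can be illustrated with a generic covariate-adjusted regression. The abstract does not specify the authors' dynamical eQTL model, so the sketch below is an assumption: it treats developmental age as a known numeric covariate and removes its linear effect from both genotype and expression before estimating the genotype effect, which gives the same coefficient as a regression containing both terms. All names and data are hypothetical.

```python
def center(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def slope(x, y):
    """Least-squares slope of y on x; both inputs must already be centered."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def residualize(values, covariate):
    """Remove the linear effect of the covariate from values."""
    vc, cc = center(values), center(covariate)
    b = slope(cc, vc)
    return [v - b * c for v, c in zip(vc, cc)]


def adjusted_genotype_effect(genotype, expression, age):
    """Genotype effect on expression after adjusting both for age."""
    return slope(residualize(genotype, age), residualize(expression, age))


# Hypothetical toy RILs: genotype coded 0/1, and the lines carrying allele 1
# happen to be developmentally older, so age confounds the naive estimate.
genotype = [0, 0, 1, 1, 0, 1]
age = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
expression = [a + 0.5 * g for a, g in zip(age, genotype)]

naive = slope(center(genotype), center(expression))
adjusted = adjusted_genotype_effect(genotype, expression, age)
```

In this toy example the naive estimate is inflated by the age difference between genotype groups, while the adjusted estimate recovers the simulated genotype effect of 0.5.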
10.
G3 (Bethesda) ; 14(3)2024 03 06.
Article in English | MEDLINE | ID: mdl-38262701

ABSTRACT

Copper is one of a handful of biologically necessary heavy metals that is also a common environmental pollutant. Under normal conditions, copper ions are required for many key physiological processes. However, in excess, copper results in cell and tissue damage ranging in severity from temporary injury to permanent neurological damage. Because of its biological relevance, and because many conserved copper-responsive genes respond to nonessential heavy metal pollutants, copper resistance in Drosophila melanogaster is a useful model system with which to investigate the genetic control of the heavy metal stress response. Because heavy metal toxicity has the potential to differently impact specific tissues, we genetically characterized the control of the gene expression response to copper stress in a tissue-specific manner in this study. We assessed the copper stress response in head and gut tissue of 96 inbred strains from the Drosophila Synthetic Population Resource using a combination of differential expression analysis and expression quantitative trait locus mapping. Differential expression analysis revealed clear patterns of tissue-specific expression. Tissue- and treatment-specific responses to copper stress were also detected using expression quantitative trait locus mapping. Expression quantitative trait loci associated with MtnA, Mdr49, Mdr50, and Sod3 exhibited both genotype-by-tissue and genotype-by-treatment effects on gene expression under copper stress, illuminating tissue- and treatment-specific patterns of gene expression control. Together, our data build a nuanced description of the roles and interactions between allelic and expression variation in copper-responsive genes, provide valuable insight into the genomic architecture of susceptibility to metal toxicity, and highlight candidate genes for future functional characterization.


Subject(s)
Drosophila melanogaster , Metals, Heavy , Animals , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Copper/toxicity , Metals, Heavy/metabolism , Metals, Heavy/toxicity , Gene Expression Regulation , Drosophila/genetics , Gene Expression
11.
J Anim Sci Biotechnol ; 14(1): 78, 2023 May 11.
Article in English | MEDLINE | ID: mdl-37165455

ABSTRACT

BACKGROUND: A detailed understanding of genetic variants that affect beef merit helps maximize the efficiency of breeding for improved production merit in beef cattle. To prioritize the putative variants and genes, we ran a comprehensive genome-wide association studies (GWAS) analysis for 21 agronomic traits using imputed whole-genome variants in Simmental beef cattle. Then, we applied expression quantitative trait loci (eQTL) mapping between the genotype variants and transcriptome of three tissues (longissimus dorsi muscle, backfat, and liver) in 120 cattle. RESULTS: We identified 1,580 association signals for 21 beef agronomic traits using GWAS. We then illuminated 854,498 cis-eQTLs for 6,017 genes and 46,970 trans-eQTLs for 1,903 genes in three tissues and built a synergistic network by integrating transcriptomics with agronomic traits. These cis-eQTLs were preferentially close to the transcription start site and enriched in functional regulatory regions. We observed an average of 43.5% improvement in cis-eQTL discovery using multi-tissue eQTL mapping. Fine-mapping analysis revealed that 111, 192, and 194 variants were most likely to be causative to regulate gene expression in backfat, liver, and muscle, respectively. The transcriptome-wide association studies identified 722 genes significantly associated with 11 agronomic traits. Via the colocalization and Mendelian randomization analyses, we found that eQTLs of several genes were associated with the GWAS signals of agronomic traits in three tissues, which included genes, such as NADSYN1, NDUFS3, LTF and KIFC2 in liver, GRAMD1C, TMTC2 and ZNF613 in backfat, as well as TIGAR, NDUFS3 and L3HYPDH in muscle that could serve as the candidate genes for economic traits. 
CONCLUSIONS: The extensive atlas of GWAS, eQTL, fine-mapping, and transcriptome-wide association studies aids in suggesting potentially functional variants and genes for cattle agronomic traits and will be an invaluable source for genomics and breeding in beef cattle.

12.
bioRxiv ; 2023 Jul 18.
Article in English | MEDLINE | ID: mdl-37503205

ABSTRACT

Copper is one of a handful of biologically necessary heavy metals that is also a common environmental pollutant. Under normal conditions, copper ions are required for many key physiological processes. However, in excess, copper quickly results in cell and tissue damage that can range in severity from temporary injury to permanent neurological damage. Because of its biological relevance, and because many conserved copper-responsive genes also respond to other non-essential heavy metal pollutants, copper resistance in Drosophila melanogaster is a useful model system with which to investigate the genetic control of the response to heavy metal stress. Because heavy metal toxicity has the potential to differently impact specific tissues, we genetically characterized the control of the gene expression response to copper stress in a tissue-specific manner in this study. We assessed the copper stress response in head and gut tissue of 96 inbred strains from the Drosophila Synthetic Population Resource (DSPR) using a combination of differential expression analysis and expression quantitative trait locus (eQTL) mapping. Differential expression analysis revealed clear patterns of tissue-specific expression, primarily driven by a more pronounced gene expression response in gut tissue. eQTL mapping of gene expression under control and copper conditions as well as for the change in gene expression following copper exposure (copper response eQTL) revealed hundreds of genes with tissue-specific local cis-eQTL and many distant trans-eQTL. eQTL associated with MtnA, Mdr49, Mdr50, and Sod3 exhibited genotype by environment effects on gene expression under copper stress, illuminating several tissue- and treatment-specific patterns of gene expression control. 
Together, our data build a nuanced description of the roles and interactions between allelic and expression variation in copper-responsive genes, provide valuable insight into the genomic architecture of susceptibility to metal toxicity, and highlight many candidate genes for future functional characterization.

13.
bioRxiv ; 2023 Sep 16.
Article in English | MEDLINE | ID: mdl-37745421

ABSTRACT

Genetic factors play a significant role in the risk for development of alcohol use disorder (AUD). Using 3-bottle choice intermittent access ethanol (IEA), we have employed the Diversity Outbred (DO) mouse panel as a model of alcohol use disorder in a genetically diverse population. Through use of gene expression network analysis techniques, in combination with expression quantitative trait loci (eQTL) mapping, we have completed an extensive analysis of the influence of genetic background on gene expression changes in the prefrontal cortex (PFC). This approach revealed that, in DO mice, genes whose expression was significantly disrupted by intermittent ethanol in the PFC also tended to be those whose expression correlated to intake. This finding is in contrast to previous studies of both mice and nonhuman primates. Importantly, these analyses identified genes involved in myelination in the PFC as significantly disrupted by IEA, correlated to ethanol intake, and having significant eQTLs. Genes that code for canonical components of the myelin sheath, such as Mbp, also emerged as key drivers of the gene expression response to intermittent ethanol drinking. Several regulators of myelination were also key drivers of gene expression, and had significant QTLs, indicating that genetic background may play an important role in regulation of brain myelination. These findings underscore the importance of disruption of normal myelination in the PFC in response to prolonged ethanol exposure, that genetic variation plays an important role in this response, and that this interaction between genetics and myelin disruption in the presence of ethanol may underlie previously observed behavioral changes under intermittent access ethanol drinking such as escalation of consumption.

14.
Genome Biol ; 24(1): 33, 2023 02 23.
Article in English | MEDLINE | ID: mdl-36823676

ABSTRACT

Using latent variables in gene expression data can help correct unobserved confounders and increase statistical power for expression quantitative trait loci (eQTL) detection. The probabilistic estimation of expression residuals (PEER) and principal component analysis (PCA) are widely used methods that can remove unwanted variation and improve eQTL discovery power in bulk RNA-seq analysis. However, their performance has not been evaluated extensively in single-cell eQTL analysis, especially for different cell types. Potential challenges arise due to the structure of single-cell RNA-seq data, including sparsity, skewness, and mean-variance relationship. Here, we show by a series of analyses that PEER and PCA require additional quality control and data transformation steps on the pseudo-bulk matrix to obtain valid latent variables; otherwise, it can result in highly correlated factors (Pearson's correlation r = 0.63 ~ 0.99). Incorporating valid PFs/PCs in the eQTL association model would identify 1.7 ~ 13.3% more eGenes. Sensitivity analysis showed that the pattern of change between the number of eGenes detected and fitted PFs/PCs varied significantly in different cell types. In addition, using highly variable genes to generate latent variables could achieve similar eGenes discovery power as using all genes but save considerable computational resources (~ 6.2-fold faster).


Subject(s)
Genome-Wide Association Study , Quantitative Trait Loci , Genome-Wide Association Study/methods , RNA-Seq , Polymorphism, Single Nucleotide
15.
HGG Adv ; 3(3): 100103, 2022 Jul 14.
Article in English | MEDLINE | ID: mdl-35519825

ABSTRACT

Mapping genetic variants that regulate gene expression (eQTL mapping) in large-scale RNA sequencing (RNA-seq) studies is often employed to understand functional consequences of regulatory variants. However, the high cost of RNA-seq limits sample size, sequencing depth, and, therefore, discovery power in eQTL studies. In this work, we demonstrate that, given a fixed budget, eQTL discovery power can be increased by lowering the sequencing depth per sample and increasing the number of individuals sequenced in the assay. We perform RNA-seq of whole-blood tissue across 1,490 individuals at low coverage (5.9 million reads/sample) and show that the effective power is higher than that of an RNA-seq study of 570 individuals at moderate coverage (13.9 million reads/sample). Next, we leverage synthetic datasets derived from real RNA-seq data (50 million reads/sample) to explore the interplay of coverage and number of individuals in eQTL studies, and show that a 10-fold reduction in coverage leads to only a 2.5-fold reduction in statistical power to identify eQTLs. Our work suggests that lowering coverage while increasing the number of individuals in RNA-seq is an effective approach to increase discovery power in eQTL studies.
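
The fixed-budget trade-off in this abstract can be made concrete with a little arithmetic on the figures it reports. The sketch below assumes, for illustration only, that cost scales with the total number of reads and ignores per-sample costs such as library preparation; the function names are hypothetical.

```python
def total_reads(n_individuals, reads_per_sample):
    """Total sequencing output of a study design."""
    return n_individuals * reads_per_sample


def individuals_for_budget(read_budget, reads_per_sample):
    """Whole number of individuals that fit into a fixed read budget."""
    return read_budget // reads_per_sample


# Designs reported in the abstract.
low_coverage = total_reads(1490, 5_900_000)
moderate_coverage = total_reads(570, 13_900_000)

# Under the reads-only cost assumption, how many individuals the moderate
# design's read budget would cover at the low coverage.
affordable = individuals_for_budget(moderate_coverage, 5_900_000)
ratio = low_coverage / moderate_coverage
```

On these numbers the two designs differ by only about 11% in total reads, while the low-coverage design includes roughly 2.6 times as many individuals, which is the lever the authors argue increases discovery power.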

16.
Front Genet ; 12: 690926, 2021.
Article in English | MEDLINE | ID: mdl-34868194

ABSTRACT

Characterization of genetic variations that are associated with gene expression levels is essential to understand cellular mechanisms that underlie human complex traits. Expression quantitative trait loci (eQTL) mapping attempts to identify genetic variants, such as single nucleotide polymorphisms (SNPs), that affect the expression of one or more genes. With the availability of a large volume of gene expression data, it is necessary and important to develop fast and efficient statistical and computational methods to perform eQTL mapping for such large scale data. In this paper, we proposed a new method, the low rank penalized regression method (LORSEN), for eQTL mapping. We evaluated and compared the performance of LORSEN with two existing methods for eQTL mapping using extensive simulations as well as real data from the HapMap3 project. Simulation studies showed that our method outperformed two commonly used methods for eQTL mapping, LORS and FastLORS, in many scenarios in terms of area under the curve (AUC). We illustrated the usefulness of our method by applying it to SNP variants data and gene expression levels on four chromosomes from the HapMap3 Project.

17.
Genes (Basel) ; 12(7)2021 07 05.
Article in English | MEDLINE | ID: mdl-34356056

ABSTRACT

Many marine ectotherms, especially those inhabiting highly variable intertidal zones, develop high phenotypic plasticity in response to rapid climate change by modulating gene expression levels. Herein, we examined the regulatory architecture of heat-responsive gene expression plasticity in oysters using expression quantitative trait loci (eQTL) analysis. Using a backcross family of Crassostrea gigas and its sister species Crassostrea angulata under acute stress, 56 distant regulatory regions accounting for 6-26.6% of the gene expression variation were identified for 19 heat-responsive genes. In total, 831 genes and 164 single nucleotide polymorphisms (SNPs) that could potentially regulate expression of the target genes were screened in the eQTL region. The association between three SNPs and the corresponding target genes was verified in an independent family. Specifically, Marker13973 was identified for heat shock protein (HSP) family A member 9 (HspA9). Ribosomal protein L10a (RPL10A) was detected approximately 2 kb downstream of the distant regulatory SNP. Further, Marker14346-48 and Marker14346-85 were in complete linkage disequilibrium and identified for autophagy-related gene 7 (ATG7). Nuclear respiratory factor 1 (NRF1) was detected approximately 3 kb upstream of the two SNPs. These results suggested regulatory relationships between RPL10A and HSPA9 and between NRF1 and ATG7. Our findings indicate that distant regulatory mutations play an important role in the regulation of gene expression plasticity by altering upstream regulatory factors in response to heat stress. The identified eQTLs provide candidate biomarkers for predicting the persistence of oysters under future climate change scenarios.


Subject(s)
Ostreidae/genetics , Quantitative Trait Loci/genetics , Regulatory Sequences, Nucleic Acid/genetics , Adaptation, Physiological , Animals , Crassostrea/genetics , Female , Gene Expression , Heat-Shock Proteins/genetics , Heat-Shock Response/genetics , Linkage Disequilibrium , Male , Polymorphism, Single Nucleotide , Stress, Physiological/genetics
18.
Methods Mol Biol ; 2082: 105-121, 2020.
Article in English | MEDLINE | ID: mdl-31849011

ABSTRACT

As a promising tool for dissecting the genetic basis of common diseases, expression quantitative trait loci (eQTL) studies have attracted increasing research interest. Traditional eQTL methods focus on testing the associations between individual single-nucleotide polymorphisms (SNPs) and gene expression traits. A major drawback of this approach is that it cannot model the joint effect of a set of SNPs on a set of genes, which may correspond to biological pathways. To alleviate this limitation, in this chapter, we propose geQTL, a sparse regression method that can detect both group-wise and individual associations between SNPs and expression traits. geQTL can also correct for the effects of potential confounders. Our method employs a computationally efficient technique and can therefore scale to large studies. Moreover, our method can automatically infer the proper number of group-wise associations. We perform extensive experiments on both simulated datasets and yeast datasets to demonstrate the effectiveness and efficiency of the proposed method. The results show that geQTL can effectively detect both individual and group-wise signals and outperforms state-of-the-art methods by a large margin. This chapter illustrates that decoupling individual and group-wise associations improves both the accuracy of eQTL mapping and the inference of individual and group-wise associations.
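
The traditional single-SNP test that the abstract above contrasts with geQTL can be sketched as an ordinary least-squares regression of one expression trait on one SNP. This is a minimal illustration, not the geQTL method; the function name `single_snp_test` and the 0/1/2 allele-dosage coding are assumptions.

```python
import numpy as np

def single_snp_test(genotype, expression):
    """Regress one expression trait on one SNP and return (slope, t_statistic).

    Assumes genotype is coded as allele dosage (0, 1, or 2) per individual
    and expression is a matching vector of expression levels.
    """
    g = np.asarray(genotype, dtype=float)
    y = np.asarray(expression, dtype=float)
    n = g.size
    if n != y.size or n < 3:
        raise ValueError("need matching vectors with at least 3 individuals")
    gc = g - g.mean()
    sxx = float(gc @ gc)
    if sxx == 0.0:
        raise ValueError("SNP is monomorphic in this sample")
    # Least-squares slope and intercept.
    slope = float(gc @ (y - y.mean())) / sxx
    intercept = float(y.mean() - slope * g.mean())
    # Residual variance with n - 2 degrees of freedom.
    resid = y - (intercept + slope * g)
    sigma2 = float(resid @ resid) / (n - 2)
    se = float(np.sqrt(sigma2 / sxx))
    t_stat = slope / se if se > 0 else float("inf")
    return slope, t_stat
```

Running this test once per SNP-gene pair treats every pair independently, which is the limitation the abstract describes: joint effects of SNP sets on gene sets are not modeled.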


Subject(s)
Chromosome Mapping , Gene Expression , Genome-Wide Association Study/methods , Quantitative Trait Loci , Regression Analysis , Algorithms , Computational Biology/methods , Humans , ROC Curve , Reproducibility of Results , Yeasts/genetics
19.
Methods Mol Biol ; 2082: 123-146, 2020.
Article in English | MEDLINE | ID: mdl-31849012

ABSTRACT

The discovery of genomic polymorphisms influencing gene expression (also known as expression quantitative trait loci or eQTLs) can be formulated as a sparse Bayesian multivariate/multiple regression problem. An important aspect in the development of such models is the implementation of bespoke inference methodologies, a process that can become quite laborious when multiple candidate models are being considered. We describe automatic, black-box inference in such models using Stan, a popular probabilistic programming language. The use of systems like Stan can facilitate model prototyping and testing, thus accelerating the data modeling process. The code described in this chapter can be found at https://github.com/dvav/eQTLBookChapter .


Subject(s)
Bayes Theorem , Chromosome Mapping , Computational Biology/methods , Gene Expression , Quantitative Trait Loci , Software , Algorithms , Gene Expression Profiling/methods , Polymorphism, Single Nucleotide , Programming Languages
20.
Methods Mol Biol ; 2082: 157-171, 2020.
Article in English | MEDLINE | ID: mdl-31849014

ABSTRACT

Expression quantitative trait loci (eQTL) mapping studies identify genetic loci that regulate gene expression. eQTL mapping studies can capture gene regulatory interactions and provide insight into the genetic mechanism of biological systems. Recently, the integration of multi-omics data, such as single-nucleotide polymorphisms (SNPs), copy number variations (CNVs), DNA methylation, and gene expression, has come to play an important role in elucidating complex biological systems, since biological systems involve a sequence of complex interactions between various biological processes. This chapter introduces multi-omics data that have been used in many eQTL studies and integrative methodologies that incorporate multi-omics data for eQTL studies. Furthermore, we describe a statistical approach that can detect nonlinear causal relationships between eQTLs, called eQTL epistasis, and its importance.
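
One common way to probe for an interaction between two loci is to compare an additive regression model with one that adds a product term. The sketch below shows that generic comparison only; it is an assumption made for illustration and is not the statistical approach described in the chapter. The function name `interaction_f_stat` and the dosage coding are hypothetical.

```python
import numpy as np

def interaction_f_stat(g1, g2, expression):
    """F-statistic (1 numerator df) for adding a g1*g2 term to an additive model.

    Assumes g1 and g2 are allele dosages (0, 1, or 2) at two loci and
    expression is the matching vector of expression levels.
    """
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    y = np.asarray(expression, dtype=float)
    n = y.size
    if g1.size != n or g2.size != n:
        raise ValueError("all inputs must have the same length")
    ones = np.ones(n)
    x_add = np.column_stack([ones, g1, g2])            # additive model
    x_int = np.column_stack([ones, g1, g2, g1 * g2])   # adds interaction term

    def rss(x):
        # Residual sum of squares of the least-squares fit.
        beta, *_ = np.linalg.lstsq(x, y, rcond=None)
        r = y - x @ beta
        return float(r @ r)

    df = n - x_int.shape[1]
    if df <= 0:
        raise ValueError("too few individuals for the interaction model")
    rss_add, rss_int = rss(x_add), rss(x_int)
    if rss_int == 0.0:
        return float("inf")
    return (rss_add - rss_int) / (rss_int / df)
```

A large statistic suggests that the two loci do not act purely additively on expression in the sample; it does not by itself establish a causal relationship.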


Subject(s)
Chromosome Mapping , Epistasis, Genetic , Gene Expression , Genome-Wide Association Study , Genomics , Quantitative Trait Loci , Algorithms , Computational Biology/methods , Genome-Wide Association Study/methods , Genomics/methods , Humans , Polymorphism, Genetic , Polymorphism, Single Nucleotide