1.
Mol Syst Biol ; 18(8): e10663, 2022 08.
Article in English | MEDLINE | ID: mdl-35972065

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) enables the characterization of cellular heterogeneity in human tissues. Recent technological advances have enabled the first population-scale scRNA-seq studies in hundreds of individuals, making it possible to assay genetic effects with single-cell resolution. However, existing strategies to analyze these data remain based on principles established for the genetic analysis of bulk RNA-seq. In particular, current methods depend on a priori definitions of discrete cell types, and hence cannot assess allelic effects across subtle cell types and cell states. To address this, we propose the Cell Regulatory Map (CellRegMap), a statistical framework to test for and quantify genetic effects on gene expression in individual cells. CellRegMap provides a principled approach to identify and characterize genotype-context interactions of known eQTL variants using scRNA-seq data. This model-based approach resolves allelic effects across cellular contexts of different granularity, including genetic effects specific to cell subtypes and continuous cell transitions. We validate CellRegMap using simulated data and apply it to previously identified eQTL from two recent studies of differentiating iPSCs, where we uncover hundreds of eQTL displaying heterogeneity of genetic effects across cellular contexts. Finally, we identify fine-grained genetic regulation in neuronal subtypes for eQTL that are colocalized with human disease variants.


Subject(s)
Gene Expression Regulation , Single-Cell Analysis , Gene Expression Profiling/methods , Humans , RNA-Seq , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods
2.
PLoS Genet ; 13(4): e1006693, 2017 04.
Article in English | MEDLINE | ID: mdl-28426829

ABSTRACT

Joint genetic models for multiple traits have helped to enhance association analyses. Most existing multi-trait models have been designed to increase power for detecting associations, whereas the analysis of interactions has received considerably less attention. Here, we propose iSet, a method based on linear mixed models to test for interactions between sets of variants and environmental states or other contexts. Our model generalizes previous interaction tests and in particular provides a test for local differences in the genetic architecture between contexts. We first use simulations to validate iSet before applying the model to the analysis of genotype-environment interactions in an eQTL study. Our model retrieves a larger number of interactions than alternative methods and reveals that up to 20% of cases show context-specific configurations of causal variants. Finally, we apply iSet to test for sub-group specific genetic effects in human lipid levels in a large human cohort, where we identify a gene-sex interaction for C-reactive protein that is missed by alternative methods.


Subject(s)
Epistasis, Genetic , Gene-Environment Interaction , Genome-Wide Association Study , Quantitative Trait Loci/genetics , C-Reactive Protein/genetics , Genotype , Humans , Models, Genetic , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide
3.
Bioinformatics ; 30(22): 3206-14, 2014 Nov 15.
Article in English | MEDLINE | ID: mdl-25075117

ABSTRACT

MOTIVATION: Set-based variance component tests have been identified as a way to increase power in association studies by aggregating weak individual effects. However, the choice of test statistic has been largely ignored even though it may play an important role in obtaining optimal power. We compared a standard statistical test, a score test, with a recently developed likelihood ratio (LR) test. Further, when correction for hidden structure is needed, or gene-gene interactions are sought, state-of-the-art algorithms for both the score and LR tests can be computationally impractical. Thus, we develop new computationally efficient methods. RESULTS: After reviewing theoretical differences in performance between the score and LR tests, we find empirically on real data that the LR test generally has more power. In particular, on 15 of 17 real datasets, the LR test yielded at least as many associations as the score test (up to 23 more associations), whereas the score test yielded at most one more association than the LR test in the two remaining datasets. On synthetic data, we find that the LR test yielded up to 12% more associations, consistent with our results on real data, but also observe a regime of extremely small signal where the score test yielded up to 25% more associations than the LR test, consistent with theory. Finally, our computational speedups now enable (i) efficient LR testing when the background kernel is full rank, and (ii) efficient score testing when the background kernel changes with each test, as for gene-gene interaction tests. The latter yielded a factor of 2000 speedup on a cohort of size 13 500. AVAILABILITY: Software available at http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/Fastlmm/. CONTACT: heckerma@microsoft.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
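
For orientation, the two statistics compared in this abstract can be written in their generic textbook form for a single parameter \theta (for example, a variance component) tested against a null value \theta_0. This is a sketch under that simplifying assumption, not the specific estimators or null distributions used by the authors; the symbols \ell, U, and I are introduced here for illustration only.

```latex
\begin{align}
\mathrm{LR} &= 2\,\bigl[\ell(\hat{\theta}) - \ell(\theta_0)\bigr], \\
S &= \frac{U(\theta_0)^2}{I(\theta_0)}, \qquad
U(\theta) = \frac{\partial \ell(\theta)}{\partial \theta},
\end{align}
```

Here \ell is the log-likelihood, \hat{\theta} is the maximum-likelihood estimate, and I(\theta_0) is the Fisher information at the null. The score statistic S needs only the null model to be fitted, whereas the LR statistic also requires fitting the alternative model, which is one reason the two tests differ in computational cost.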


Subject(s)
Genetic Association Studies/methods , Genetic Variation , Algorithms , Data Interpretation, Statistical , Humans , Likelihood Functions , Phenotype , Polymorphism, Single Nucleotide
4.
Nat Genet ; 53(3): 313-321, 2021 03.
Article in English | MEDLINE | ID: mdl-33664507

ABSTRACT

Induced pluripotent stem cells (iPSCs) are an established cellular system to study the impact of genetic variants in derived cell types and developmental contexts. However, in their pluripotent state, the disease impact of genetic variants is less well known. Here, we integrate data from 1,367 human iPSC lines to comprehensively map common and rare regulatory variants in human pluripotent cells. Using this population-scale resource, we report hundreds of new colocalization events for human traits specific to iPSCs, and find increased power to identify rare regulatory variants compared with somatic tissues. Finally, we demonstrate how iPSCs enable the identification of causal genes for rare diseases.


Subject(s)
Genetic Variation , Induced Pluripotent Stem Cells/physiology , Quantitative Trait Loci , Bardet-Biedl Syndrome/genetics , Calcium Channels/genetics , Cell Line , Cerebellar Ataxia/genetics , DNA Methylation , Gene Expression , Humans , Induced Pluripotent Stem Cells/cytology , Polymorphism, Single Nucleotide , Proteins/genetics , Rare Diseases/genetics , Regulatory Sequences, Nucleic Acid , Sequence Analysis, RNA , Whole Genome Sequencing
5.
Nat Genet ; 51(1): 180-186, 2019 01.
Article in English | MEDLINE | ID: mdl-30478441

ABSTRACT

Different exposures, including diet, physical activity, or external conditions, can contribute to genotype-environment interactions (G×E). Although high-dimensional environmental data are increasingly available and multiple exposures have been implicated in G×E at the same loci, multi-environment tests for G×E are not established. Here, we propose the structured linear mixed model (StructLMM), a computationally efficient method to identify and characterize loci that interact with one or more environments. After validating our model using simulations, we applied StructLMM to body mass index in the UK Biobank, where our model yields previously known and novel G×E signals. Finally, in an application to a large blood eQTL dataset, we demonstrate that StructLMM can be used to study interactions with hundreds of environmental variables.
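
To make the notion of a G×E test concrete, the sketch below implements a much simpler stand-in than StructLMM: a stratified test for a single binary environment, which compares the genotype slope between the two environment groups. It is not the authors' method (StructLMM uses a mixed model over many environments). The function names, the simulated effect sizes, and the sample size are all assumptions chosen for illustration.

```python
import math
import random


def slope_and_se(g, y):
    """Simple linear regression of y on g; returns (slope, standard error)."""
    n = len(g)
    mean_g = sum(g) / n
    mean_y = sum(y) / n
    sxx = sum((gi - mean_g) ** 2 for gi in g)
    sxy = sum((gi - mean_g) * (yi - mean_y) for gi, yi in zip(g, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_g
    rss = sum((yi - intercept - slope * gi) ** 2 for gi, yi in zip(g, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se


def gxe_stratified_z(g, e, y):
    """Difference in genotype slope between e == 1 and e == 0, with a z-score."""
    groups = {0: ([], []), 1: ([], [])}
    for gi, ei, yi in zip(g, e, y):
        groups[ei][0].append(gi)
        groups[ei][1].append(yi)
    b0, se0 = slope_and_se(*groups[0])
    b1, se1 = slope_and_se(*groups[1])
    diff = b1 - b0
    return diff, diff / math.sqrt(se0 ** 2 + se1 ** 2)


# Simulated data (hypothetical effect sizes): the genotype effect is 0.2
# when e == 0 and 0.7 when e == 1, so the true interaction effect is 0.5.
random.seed(0)
n = 2000
g = [sum(random.random() < 0.3 for _ in range(2)) for _ in range(n)]  # 0/1/2 allele counts
e = [random.randint(0, 1) for _ in range(n)]                          # binary environment
y = [0.2 * gi + 0.5 * gi * ei + random.gauss(0.0, 1.0) for gi, ei in zip(g, e)]

diff, z = gxe_stratified_z(g, e, y)
print(f"slope difference = {diff:.3f}, z = {z:.2f}")
```

A stratified comparison like this handles one environment at a time, which is exactly the limitation the abstract describes: it does not extend naturally to hundreds of correlated environmental variables.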


Subject(s)
Gene-Environment Interaction , Algorithms , Computer Simulation , Environment , Genotype , Humans , Linear Models , Models, Genetic , Quantitative Trait Loci/genetics
6.
Article in English | MEDLINE | ID: mdl-26356865

ABSTRACT

The comparison of ordinary partitions of a set of objects is well established in the clustering literature, which includes several studies on the properties of similarity measures for comparing partitions. However, similarity measures for clusterings are not readily applicable to biclusterings, since each bicluster is a tuple of two sets (of rows and columns), whereas a cluster is only a single set (of rows). Some biclustering similarity measures have been defined as minor contributions in papers which primarily report on proposals and evaluation of biclustering algorithms or comparative analyses of biclustering algorithms. The consequence is that some desirable properties of such measures have been overlooked in the literature. We review 14 biclustering similarity measures. We define eight desirable properties of a biclustering measure, discuss their importance, and prove which properties each of the reviewed measures has. We show examples drawn from and inspired by important studies in which several biclustering measures convey misleading evaluations due to the absence of one or more of the discussed properties. We also advocate the use of a more general comparison approach that is based on the idea of transforming the original problem of comparing biclusterings into an equivalent problem of comparing clustering partitions with overlapping clusters.
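
The point that a bicluster is a pair of sets, rather than a single set, can be illustrated with a small sketch. It compares two biclusters by the Jaccard index of the matrix cells they cover, alongside a rows-only Jaccard index that ignores columns. This is one simple illustrative measure and is not claimed to be among the 14 measures the authors review; the function names and the toy biclusters are assumptions.

```python
from itertools import product


def bicluster_cells(rows, cols):
    """All (row, column) cells covered by a bicluster."""
    return set(product(rows, cols))


def cell_jaccard(b1, b2):
    """Jaccard index over covered cells; b1 and b2 are (rows, cols) pairs."""
    c1 = bicluster_cells(*b1)
    c2 = bicluster_cells(*b2)
    union = c1 | c2
    if not union:
        return 1.0  # two empty biclusters are treated as identical
    return len(c1 & c2) / len(union)


def row_jaccard(b1, b2):
    """Jaccard index over rows only, as a clustering measure would compute it."""
    r1, r2 = set(b1[0]), set(b2[0])
    union = r1 | r2
    if not union:
        return 1.0
    return len(r1 & r2) / len(union)


a = ({0, 1, 2}, {0, 1})  # rows 0-2, columns 0-1
b = ({1, 2, 3}, {1, 2})  # rows 1-3, columns 1-2

print(row_jaccard(a, b))   # 0.5: the row sets overlap substantially
print(cell_jaccard(a, b))  # 0.2: only 2 of 10 covered cells are shared
```

The gap between the two values shows why a measure designed for row clusters can overstate the agreement between biclusters whose column sets differ.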


Subject(s)
Cluster Analysis , Computational Biology/methods , Gene Expression Profiling/methods , Algorithms , Models, Genetic , Reproducibility of Results