Results 1 - 20 of 12,713

1.
Cell ; 185(22): 4216-4232.e16, 2022 10 27.
Article in English | MEDLINE | ID: mdl-36240780

ABSTRACT

Genotype-phenotype associations for common diseases are often compounded by pleiotropy and metabolic state. Here, we devised a pooled human organoid-panel of steatohepatitis to investigate the impact of metabolic status on genotype-phenotype association. En masse population-based phenotypic analysis under insulin insensitive conditions predicted key non-alcoholic steatohepatitis (NASH)-genetic factors including the glucokinase regulatory protein (GCKR)-rs1260326:C>T. Analysis of NASH clinical cohorts revealed that GCKR-rs1260326-T allele elevates disease severity only under diabetic state but protects from fibrosis under non-diabetic states. Transcriptomic, metabolomic, and pharmacological analyses indicate significant mitochondrial dysfunction incurred by GCKR-rs1260326, which was not reversed with metformin. Uncoupling oxidative mechanisms mitigated mitochondrial dysfunction and permitted adaptation to increased fatty acid supply while protecting against oxidant stress, forming a basis for future therapeutic approaches for diabetic NASH. Thus, "in-a-dish" genotype-phenotype association strategies disentangle the opposing roles of metabolic-associated gene variant functions and offer a rich mechanistic, diagnostic, and therapeutic inference toolbox toward precision hepatology. VIDEO ABSTRACT.


Subject(s)
Genetic Predisposition to Disease , Non-alcoholic Fatty Liver Disease , Humans , Non-alcoholic Fatty Liver Disease/genetics , Organoids , Genetic Association Studies , Alleles , Liver
2.
Cell ; 185(14): 2559-2575.e28, 2022 07 07.
Article in English | MEDLINE | ID: mdl-35688146

ABSTRACT

A central goal of genetics is to define the relationships between genotypes and phenotypes. High-content phenotypic screens such as Perturb-seq (CRISPR-based screens with single-cell RNA-sequencing readouts) enable massively parallel functional genomic mapping but, to date, have been used at limited scales. Here, we perform genome-scale Perturb-seq targeting all expressed genes with CRISPR interference (CRISPRi) across >2.5 million human cells. We use transcriptional phenotypes to predict the function of poorly characterized genes, uncovering new regulators of ribosome biogenesis (including CCDC86, ZNF236, and SPATA5L1), transcription (C7orf26), and mitochondrial respiration (TMEM242). In addition to assigning gene function, single-cell transcriptional phenotypes allow for in-depth dissection of complex cellular phenomena, from RNA processing to differentiation. We leverage this ability to systematically identify genetic drivers and consequences of aneuploidy and to discover an unanticipated layer of stress-specific regulation of the mitochondrial genome. Our information-rich genotype-phenotype map reveals a multidimensional portrait of gene and cellular function.


Subject(s)
Genomics , Single-Cell Analysis , CRISPR-Cas Systems/genetics , Chromosome Mapping , Genotype , Phenotype , Single-Cell Analysis/methods
3.
Cell ; 176(3): 549-563.e23, 2019 01 24.
Article in English | MEDLINE | ID: mdl-30661752

ABSTRACT

Despite a wealth of molecular knowledge, quantitative laws for accurate prediction of biological phenomena remain rare. Alternative pre-mRNA splicing is an important regulated step in gene expression frequently perturbed in human disease. To understand the combined effects of mutations during evolution, we quantified the effects of all possible combinations of exonic mutations accumulated during the emergence of an alternatively spliced human exon. This revealed that mutation effects scale non-monotonically with the inclusion level of an exon, with each mutation having maximum effect at a predictable intermediate inclusion level. This scaling is observed genome-wide for cis and trans perturbations of splicing, including for natural and disease-associated variants. Mathematical modeling suggests that competition between alternative splice sites is sufficient to cause this non-linearity in the genotype-phenotype map. Combining the global scaling law with specific pairwise interactions between neighboring mutations allows accurate prediction of the effects of complex genotype changes involving >10 mutations.


Subject(s)
Alternative Splicing/genetics , RNA Splicing/genetics , fas Receptor/genetics , Animals , Exons/genetics , Genetic Techniques , Genetics , Genotype , Humans , Introns/genetics , Mice , Models, Theoretical , Mutation/genetics , Phenotype , RNA Precursors/metabolism , RNA Splice Sites/genetics , RNA, Messenger/metabolism
4.
Annu Rev Cell Dev Biol ; 32: 103-126, 2016 10 06.
Article in English | MEDLINE | ID: mdl-27501448

ABSTRACT

One of the central goals in biology is to understand how and how much of the phenotype of an organism is encoded in its genome. Although many genes that are crucial for organismal processes have been identified, much less is known about the genetic bases underlying quantitative phenotypic differences in natural populations. We discuss the fundamental gap between the large body of knowledge generated over the past decades by experimental genetics in the laboratory and what is needed to understand the genotype-to-phenotype problem on a broader scale. We argue that systems genetics, a combination of systems biology and the study of natural variation using quantitative genetics, will help to address this problem. We present major advances in these two mostly disconnected areas that have increased our understanding of the developmental processes of flowering time control and root growth. We conclude by illustrating and discussing the efforts that have been made toward systems genetics specifically in plants.


Subject(s)
Gene Regulatory Networks , Plants/genetics , Genetic Variation , Genotype , Phenotype , Systems Biology
5.
Annu Rev Genet ; 55: 71-91, 2021 11 23.
Article in English | MEDLINE | ID: mdl-34314597

ABSTRACT

Genetic manipulations with a robust and predictable outcome are critical to investigate gene function, as well as for therapeutic genome engineering. For many years, knockdown approaches and reagents including RNA interference and antisense oligonucleotides dominated functional studies; however, with the advent of precise genome editing technologies, CRISPR-based knockout systems have become the state-of-the-art tools for such studies. These technologies have helped decipher the role of thousands of genes in development and disease. Their use has also revealed how limited our understanding of genotype-phenotype relationships is. The recent discovery that certain mutations can trigger the transcriptional modulation of other genes, a phenomenon called transcriptional adaptation, has provided an additional explanation for the contradicting phenotypes observed in knockdown versus knockout models and increased awareness about the use of each of these approaches. In this review, we first cover the strengths and limitations of different gene perturbation strategies. Then we highlight the diverse ways in which the genotype-phenotype relationship can be discordant between these different strategies. Finally, we review the genetic robustness mechanisms that can lead to such discrepancies, paying special attention to the recently discovered phenomenon of transcriptional adaptation.


Subject(s)
CRISPR-Cas Systems , Gene Editing , CRISPR-Cas Systems/genetics , Genome , Genotype , Phenotype
6.
Annu Rev Genet ; 54: 439-464, 2020 11 23.
Article in English | MEDLINE | ID: mdl-32897739

ABSTRACT

The complexity of heredity has been appreciated for decades: Many traits are controlled not by a single genetic locus but instead by polymorphisms throughout the genome. The importance of complex traits in biology and medicine has motivated diverse approaches to understanding their detailed genetic bases. Here, we focus on recent systematic studies, many in budding yeast, which have revealed that large numbers of all kinds of molecular variation, from noncoding to synonymous variants, can make significant contributions to phenotype. Variants can affect different traits in opposing directions, and their contributions can be modified by both the environment and the epigenetic state of the cell. The integration of prospective (synthesizing and analyzing variants) and retrospective (examining standing variation) approaches promises to reveal how natural selection shapes quantitative traits. Only by comprehensively understanding nature's genetic tool kit can we predict how phenotypes arise from the complex ensembles of genetic variants in living organisms.


Subject(s)
Quantitative Trait Loci/genetics , Selection, Genetic/genetics , Genetic Variation/genetics , Genotype , Humans , Phenotype , Prospective Studies , Retrospective Studies , Saccharomycetales/genetics
7.
Am J Hum Genet ; 111(2): 280-294, 2024 02 01.
Article in English | MEDLINE | ID: mdl-38183988

ABSTRACT

Eosinophilic esophagitis (EoE) is a rare atopic disorder associated with esophageal dysfunction, including difficulty swallowing, food impaction, and inflammation, that develops in a small subset of people with food allergies. Genome-wide association studies (GWASs) have identified 9 independent EoE risk loci reaching genome-wide significance (p < 5 × 10⁻⁸) and 27 additional loci of suggestive significance (5 × 10⁻⁸ < p < 1 × 10⁻⁵). In the current study, we perform linkage disequilibrium (LD) expansion of these loci to nominate a set of 531 variants that are potentially causal. To systematically interrogate the gene regulatory activity of these variants, we designed a massively parallel reporter assay (MPRA) containing the alleles of each variant within their genomic sequence context cloned into a GFP reporter library. Analysis of reporter gene expression in TE-7, HaCaT, and Jurkat cells revealed cell-type-specific gene regulation. We identify 32 allelic enhancer variants, representing 6 genome-wide significant EoE loci and 7 suggestive EoE loci, that regulate reporter gene expression in a genotype-dependent manner in at least one cellular context. By annotating these variants with expression quantitative trait loci (eQTL) and chromatin looping data in related tissues and cell types, we identify putative target genes affected by genetic variation in individuals with EoE. Transcription factor enrichment analyses reveal possible roles for cell-type-specific regulators, including GATA3. Our approach reduces the large set of EoE-associated variants to a set of 32 with allelic regulatory activity, providing functional insights into the effects of genetic variation in this disease.


Subject(s)
Enteritis , Eosinophilia , Eosinophilic Esophagitis , Gastritis , Humans , Eosinophilic Esophagitis/genetics , Eosinophilic Esophagitis/complications , Genome-Wide Association Study , Genotype , Quantitative Trait Loci/genetics
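
The two significance tiers quoted in the abstract above can be written as a small classifier. This is only an illustrative sketch: the function name `classify_locus` and the treatment of a p-value that falls exactly on a boundary are assumptions, not something the study specifies.

```python
# Thresholds as quoted in the abstract above.
GENOME_WIDE = 5e-8   # p below this: genome-wide significance
SUGGESTIVE = 1e-5    # p below this (but not genome-wide): suggestive significance


def classify_locus(p_value: float) -> str:
    """Assign a GWAS p-value to one of the tiers named in the abstract.

    Assumption: a p-value exactly equal to 5e-8 is treated as suggestive.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p_value < GENOME_WIDE:
        return "genome-wide"
    if p_value < SUGGESTIVE:
        return "suggestive"
    return "not significant"
```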
8.
Am J Hum Genet ; 111(5): 990-995, 2024 05 02.
Article in English | MEDLINE | ID: mdl-38636510

ABSTRACT

Since genotype imputation was introduced, researchers have been relying on the estimated imputation quality from imputation software to perform post-imputation quality control (QC). However, this quality estimate (denoted as Rsq) performs less well for lower-frequency variants. We recently published MagicalRsq, a machine-learning-based imputation quality calibration, which leverages additional typed markers from the same cohort and outperforms Rsq as a QC metric. In this work, we extended the original MagicalRsq to allow cross-cohort model training and named the new model MagicalRsq-X. We removed the cohort-specific estimated minor allele frequency and included linkage disequilibrium scores and recombination rates as additional features. Leveraging whole-genome sequencing data from TOPMed, specifically participants in the BioMe, JHS, WHI, and MESA studies, we performed comprehensive cross-cohort evaluations for predominantly European and African ancestral individuals based on their inferred global ancestry with the 1000 Genomes and Human Genome Diversity Project data as reference. Our results suggest MagicalRsq-X outperforms Rsq in almost every setting, with 7.3%-14.4% improvement in squared Pearson correlation with true R², corresponding to 85-218 K variant gains. We further developed a metric to quantify the genetic distances of a target cohort relative to a reference cohort and showed that such metric largely explained the performance of MagicalRsq-X models. Finally, we found MagicalRsq-X saved up to 53 known genome-wide significant variants in one of the largest blood cell trait GWASs that would be missed using the original Rsq for QC. In conclusion, MagicalRsq-X shows superiority for post-imputation QC and benefits genetic studies by distinguishing well and poorly imputed lower-frequency variants.


Subject(s)
Gene Frequency , Genotype , Polymorphism, Single Nucleotide , Software , Humans , Cohort Studies , Linkage Disequilibrium , Genome-Wide Association Study/methods , Genome, Human , Quality Control , Machine Learning , Whole Genome Sequencing/standards , Whole Genome Sequencing/methods
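
The abstract above evaluates quality metrics by their squared Pearson correlation with the true R². As a rough sketch of that statistic only (not of MagicalRsq-X itself), the hypothetical helper `squared_pearson` below computes the squared correlation between two equal-length lists, for example imputed dosages and sequenced genotypes at one variant.

```python
def squared_pearson(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centred cross-products and squares.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (sxy * sxy) / (sxx * syy)
```

A monomorphic variant (all genotypes identical) has no defined correlation, which is why the sketch raises an error in that case rather than returning a number.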
9.
Trends Genet ; 39(8): 602-608, 2023 08.
Article in English | MEDLINE | ID: mdl-36878820

ABSTRACT

Behaviors are components of fitness and contribute to adaptive evolution. Behaviors represent the interactions of an organism with its environment, yet innate behaviors display robustness in the face of environmental change, which we refer to as 'behavioral canalization'. We hypothesize that positive selection of hub genes of genetic networks stabilizes the genetic architecture for innate behaviors by reducing variation in the expression of interconnected network genes. Robustness of these stabilized networks would be protected from deleterious mutations by purifying selection or suppressing epistasis. We propose that, together with newly emerging favorable mutations, epistatically suppressed mutations can generate a reservoir of cryptic genetic variation that could give rise to decanalization when genetic backgrounds or environmental conditions change to allow behavioral adaptation.


Subject(s)
Adaptation, Physiological , Gene Regulatory Networks , Phenotype , Mutation/genetics , Gene Regulatory Networks/genetics , Adaptation, Physiological/genetics , Epistasis, Genetic , Selection, Genetic , Models, Genetic , Genetic Fitness , Genetic Variation/genetics
10.
Annu Rev Genomics Hum Genet ; 24: 151-176, 2023 08 25.
Article in English | MEDLINE | ID: mdl-37285546

ABSTRACT

DECIPHER (Database of Genomic Variation and Phenotype in Humans Using Ensembl Resources) shares candidate diagnostic variants and phenotypic data from patients with genetic disorders to facilitate research and improve the diagnosis, management, and therapy of rare diseases. The platform sits at the boundary between genomic research and the clinical community. DECIPHER aims to ensure that the most up-to-date data are made rapidly available within its interpretation interfaces to improve clinical care. Newly integrated cardiac case-control data that provide evidence of gene-disease associations and inform variant interpretation exemplify this mission. New research resources are presented in a format optimized for use by a broad range of professionals supporting the delivery of genomic medicine. The interfaces within DECIPHER integrate and contextualize variant and phenotypic data, helping to determine a robust clinico-molecular diagnosis for rare-disease patients, which combines both variant classification and clinical fit. DECIPHER supports discovery research, connecting individuals within the rare-disease community to pursue hypothesis-driven research.


Subject(s)
Genomics , Genomics/methods , Humans , Rare Diseases/genetics , Alleles , Practice Guidelines as Topic , DNA Copy Number Variations , Databases, Genetic
11.
Am J Hum Genet ; 110(1): 161-165, 2023 01 05.
Article in English | MEDLINE | ID: mdl-36450278

ABSTRACT

The first release of UK Biobank whole-genome sequence data contains 150,119 genomes. We present an open-source pipeline for filtering, phasing, and indexing these genomes on the cloud-based UK Biobank Research Analysis Platform. This pipeline makes it possible to apply haplotype-based methods to UK Biobank whole-genome sequence data. The pipeline uses BCFtools for marker filtering, Beagle for genotype phasing, and Tabix for VCF indexing. We used the pipeline to phase 406 million single-nucleotide variants on chromosomes 1-22 and X at a cost of £2,309. The maximum time required to process a chromosome was 2.6 days. In order to assess phase accuracy, we modified the pipeline to exclude trio parents. We observed a switch error rate of 0.0016 on chromosome 20 in the White British trio offspring. If we exclude markers with nonmajor allele frequency < 0.1% after phasing, this switch error rate decreases by 80% to 0.00032.


Subject(s)
Biological Specimen Banks , Genome , Humans , Dogs , Animals , Genotype , Haplotypes/genetics , Polymorphism, Single Nucleotide/genetics , United Kingdom , Algorithms , Sequence Analysis, DNA/methods
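
The switch error rate reported in the abstract above counts how often the inferred phase flips relative to the true phase between consecutive heterozygous sites. The sketch below illustrates that definition under simplifying assumptions (one individual, heterozygous sites only, alleles coded 0/1 on the first haplotype); `switch_error_rate` is a hypothetical helper, not part of the published pipeline. The final lines check the quoted 80% reduction arithmetically.

```python
def switch_error_rate(true_hap, inferred_hap):
    """Fraction of consecutive heterozygous-site pairs where the phase flips.

    Both arguments list the allele (0 or 1) carried on the first haplotype
    at each heterozygous site, in genomic order.
    """
    if len(true_hap) != len(inferred_hap) or len(true_hap) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    # True where the inferred haplotype agrees with the true one at a site.
    agrees = [t == i for t, i in zip(true_hap, inferred_hap)]
    # A switch occurs whenever agreement changes between neighbouring sites.
    switches = sum(1 for prev, cur in zip(agrees, agrees[1:]) if prev != cur)
    return switches / (len(agrees) - 1)


# Arithmetic check of the figures quoted in the abstract:
# an 80% decrease from 0.0016 gives 0.00032.
reduced_rate = 0.0016 * (1 - 0.80)
```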
12.
Am J Hum Genet ; 110(8): 1319-1329, 2023 08 03.
Article in English | MEDLINE | ID: mdl-37490908

ABSTRACT

Polygenic scores (PGSs) have emerged as a standard approach to predict phenotypes from genotype data in a wide array of applications from socio-genomics to personalized medicine. Traditional PGSs assume genotype data to be error-free, ignoring possible errors and uncertainties introduced from genotyping, sequencing, and/or imputation. In this work, we investigate the effects of genotyping error due to low coverage sequencing on PGS estimation. We leverage SNP array and low-coverage whole-genome sequencing data (lcWGS, median coverage 0.04×) of 802 individuals from the Dana-Farber PROFILE cohort to show that PGS error correlates with sequencing depth (p = 1.2 × 10⁻⁷). We develop a probabilistic approach that incorporates genotype error in PGS estimation to produce well-calibrated PGS credible intervals and show that the probabilistic approach increases classification accuracy by up to 6% as compared to traditional PGSs that ignore genotyping error. Finally, we use simulations to explore the combined effect of genotyping and effect size errors and their implication on PGS-based risk-stratification. Our results illustrate the importance of considering genotyping error as a source of PGS error especially for cohorts with varying genotyping technologies and/or low-coverage sequencing.


Subject(s)
Genomics , Polymorphism, Single Nucleotide , Uncertainty , Genotype , Genomics/methods , Whole Genome Sequencing , Polymorphism, Single Nucleotide/genetics
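
To make concrete how genotype uncertainty can carry through to a polygenic score, here is a minimal sketch that is not the authors' probabilistic method. It assumes fixed effect sizes, independent variants, and per-variant genotype probabilities (for 0, 1, and 2 copies of the effect allele); the function name `pgs_mean_and_variance` is hypothetical.

```python
def pgs_mean_and_variance(genotype_probs, effect_sizes):
    """Expected PGS and its variance under per-variant genotype uncertainty.

    genotype_probs: list of (p0, p1, p2) tuples, the probabilities of carrying
                    0, 1, or 2 copies of the effect allele at each variant.
    effect_sizes:   list of per-variant weights (betas), same length.
    Assumption: variants are independent, so variances simply add.
    """
    if len(genotype_probs) != len(effect_sizes):
        raise ValueError("one effect size is needed per variant")
    mean = 0.0
    variance = 0.0
    for (p0, p1, p2), beta in zip(genotype_probs, effect_sizes):
        if abs(p0 + p1 + p2 - 1.0) > 1e-6:
            raise ValueError("genotype probabilities must sum to 1")
        dosage = p1 + 2.0 * p2            # E[g]
        second_moment = p1 + 4.0 * p2     # E[g^2]
        mean += beta * dosage
        variance += beta ** 2 * (second_moment - dosage ** 2)
    return mean, variance
```

A confidently called genotype such as (0, 0, 1) contributes no variance, whereas an uncertain call such as (0.25, 0.5, 0.25) widens the spread of the score in proportion to the squared effect size.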
13.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38770718

ABSTRACT

Polygenic Risk Scores are used to evaluate an individual's vulnerability to developing specific diseases or conditions based on their genetic composition, by taking into account numerous genetic variations. This article provides an overview of the concept of Polygenic Risk Scores (PRS). We elucidate the historical advancements of PRS, their advantages and shortcomings in comparison with other predictive methods, and discuss their conceptual limitations in light of the complexity of biological systems. Furthermore, we provide a survey of published tools for computing PRS and associated resources. The various tools and software packages are categorized based on their technical utility for users or prospective developers. Understanding the array of available tools and their limitations is crucial for accurately assessing and predicting disease risks, facilitating early interventions, and guiding personalized healthcare decisions. We also identify potential new avenues for future bioinformatic analyses and advancements related to PRS.


Subject(s)
Genetic Predisposition to Disease , Multifactorial Inheritance , Software , Humans , Computational Biology/methods , Genome-Wide Association Study/methods , Risk Factors , Risk Assessment/methods , Genetic Risk Score
14.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38348747

ABSTRACT

Integrating and analyzing multiple omics data sets, including genomics, proteomics and radiomics, can significantly advance researchers' comprehensive understanding of Alzheimer's disease (AD). However, current methodologies primarily focus on the main effects of genetic variation and protein, overlooking non-additive effects such as genotype-protein interaction (GPI) and correlation patterns in brain imaging genetics studies. Importantly, these non-additive effects could contribute to intermediate imaging phenotypes, finally leading to disease occurrence. In general, the interaction between genetic variations and proteins, and their correlations are two distinct biological effects, and thus disentangling the two effects for heritable imaging phenotypes is of great interest and need. Unfortunately, this issue has been largely unexplored. In this paper, to fill this gap, we propose the Multi-Task Genotype-Protein Interaction and Correlation disentangling method (MT-GPIC) to identify GPI and extract correlation patterns between them. To ensure stability and interpretability, we use novel and off-the-shelf penalties to identify meaningful genetic risk factors, as well as exploit the interconnectedness of different brain regions. Additionally, since computing GPI poses a high computational burden, we develop a fast optimization strategy for solving MT-GPIC, which is guaranteed to converge. Experimental results on the Alzheimer's Disease Neuroimaging Initiative data set show that MT-GPIC achieves higher correlation coefficients and classification accuracy than state-of-the-art methods. Moreover, our approach could effectively identify interpretable phenotype-related GPI and correlation patterns in high-dimensional omics data sets. These findings not only enhance the diagnostic accuracy but also contribute valuable insights into the underlying pathogenic mechanisms of AD.


Subject(s)
Alzheimer Disease , Humans , Alzheimer Disease/diagnostic imaging , Alzheimer Disease/genetics , Alzheimer Disease/pathology , Multiomics , Genotype , Neuroimaging/methods , Phenotype , Brain/diagnostic imaging , Brain/pathology
15.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38701420

ABSTRACT

The relationship between genotype and fitness is fundamental to evolution, but quantitatively mapping genotypes to fitness has remained challenging. We propose the Phenotypic-Embedding theorem (P-E theorem), which bridges genotype and phenotype through an encoder-decoder deep learning framework. Inspired by this, we propose a more general first principle for correlating genotype and phenotype, and the P-E theorem provides a computable basis for applying this first principle. As an application example of the P-E theorem, we developed the Co-attention based Transformer model to bridge Genotype and Fitness model, a Transformer-based pre-trained foundation model with downstream supervised fine-tuning that can accurately simulate the neutral evolution of viruses and predict immune escape mutations. Accordingly, following the calculation path of the P-E theorem, we accurately obtained the basic reproduction number ($R_0$) of SARS-CoV-2 from first principles, quantitatively linked immune escape to viral fitness and plotted the genotype-fitness landscape. The theoretical system we established provides a general and interpretable method to construct genotype-phenotype landscapes, providing a new paradigm for studying theoretical and computational biology.


Subject(s)
COVID-19 , Deep Learning , Genotype , Phenotype , SARS-CoV-2 , SARS-CoV-2/genetics , SARS-CoV-2/immunology , Humans , COVID-19/virology , COVID-19/genetics , COVID-19/immunology , Computational Biology/methods , Algorithms , Genetic Fitness
16.
Proc Natl Acad Sci U S A ; 120(1): e2207544120, 2023 01 03.
Article in English | MEDLINE | ID: mdl-36574663

ABSTRACT

A growing body of work has addressed human adaptations to diverse environments using genomic data, but few studies have connected putatively selected alleles to phenotypes, much less among underrepresented populations such as Amerindians. Studies of natural selection and genotype-phenotype relationships in underrepresented populations hold potential to uncover previously undescribed loci underlying evolutionarily and biomedically relevant traits. Here, we worked with the Tsimane and the Moseten, two Amerindian populations inhabiting the Bolivian lowlands. We focused most intensively on the Tsimane, because long-term anthropological work with this group has shown that they have a high burden of both macro and microparasites, as well as minimal cardiometabolic disease or dementia. We therefore generated genome-wide genotype data for Tsimane individuals to study natural selection, and paired this with blood mRNA-seq as well as cardiometabolic and immune biomarker data generated from a larger sample that included both populations. In the Tsimane, we identified 21 regions that are candidates for selective sweeps, as well as 5 immune traits that show evidence for polygenic selection (e.g., C-reactive protein levels and the response to coronaviruses). Genes overlapping candidate regions were strongly enriched for known involvement in immune-related traits, such as abundance of lymphocytes and eosinophils. Importantly, we were also able to draw on extensive phenotype information for the Tsimane and Moseten and link five regions (containing PSD4, MUC21 and MUC22, TOX2, ANXA6, and ABCA1) with biomarkers of immune and metabolic function. Together, our work highlights the utility of pairing evolutionary analyses with anthropological and biomedical data to gain insight into the genetic basis of health-related traits.


Subject(s)
Genetics, Population , Health Status , Humans , Biomarkers , Bolivia , Genomics , Genotype , Phenotype , Polymorphism, Single Nucleotide , Selection, Genetic , Genome, Human
17.
Proc Natl Acad Sci U S A ; 120(14): e2205771120, 2023 04 04.
Article in English | MEDLINE | ID: mdl-36972430

ABSTRACT

This perspective describes the opportunities and challenges of data-driven approaches for crop diversity management (genebanks and breeding) in the context of agricultural research for sustainable development in the Global South. Data-driven approaches build on larger volumes of data and flexible analyses that link different datasets across domains and disciplines. This can lead to more information-rich management of crop diversity, which can address the complex interactions between crop diversity, production environments, and socioeconomic heterogeneity and help to deliver more suitable portfolios of crop diversity to users with highly diverse demands. We describe recent efforts that illustrate the potential of data-driven approaches for crop diversity management. A continued investment in this area should fill remaining gaps and seize opportunities, including i) supporting genebanks to play a more active role in linking with farmers using data-driven approaches; ii) designing low-cost, appropriate technologies for phenotyping; iii) generating more and better gender and socioeconomic data; iv) designing information products to facilitate decision-making; and v) building more capacity in data science. Broad, well-coordinated policies and investments are needed to avoid fragmentation of such capacities and achieve coherence between domains and disciplines so that crop diversity management systems can become more effective in delivering benefits to farmers, consumers, and other users of crop diversity.


Subject(s)
Crops, Agricultural , Plant Breeding , Crops, Agricultural/genetics , Agriculture
18.
Genet Epidemiol ; 48(2): 85-100, 2024 03.
Article in English | MEDLINE | ID: mdl-38303123

ABSTRACT

The use of polygenic risk score (PRS) models has transformed the field of genetics by enabling the prediction of complex traits and diseases based on an individual's genetic profile. However, the impact of genotype-environment interaction (GxE) on the performance and applicability of PRS models remains a crucial aspect to be explored. Currently, existing genotype-environment interaction polygenic risk score (GxE PRS) models are often inappropriately used, which can result in inflated type 1 error rates and compromised results. In this study, we propose novel GxE PRS models that jointly incorporate additive and interaction genetic effects while also including an additional quadratic term for nongenetic covariates, enhancing their robustness against model misspecification. Through extensive simulations, we demonstrate that our proposed models outperform existing models in terms of controlling type 1 error rates and enhancing statistical power. Furthermore, we apply the proposed models to real data, and report significant GxE effects. Specifically, we highlight the impact of our models on both quantitative and binary traits. For quantitative traits, we uncover the GxE modulation of genetic effects on body mass index by alcohol intake frequency. In the case of binary traits, we identify the GxE modulation of genetic effects on hypertension by waist-to-hip ratio. These findings underscore the importance of employing a robust model that effectively controls type 1 error rates, thus preventing the occurrence of spurious GxE signals. To facilitate the implementation of our approach, we have developed an innovative R software package called GxEprs, specifically designed to detect and estimate GxE effects. Overall, our study highlights the importance of accurate GxE modeling and its implications for genetic risk prediction, while providing a practical tool to support further research in this area.


Subject(s)
Gene-Environment Interaction , Genetic Risk Score , Humans , Models, Genetic , Phenotype , Risk Factors
19.
Mol Biol Evol ; 41(6)2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38693911

ABSTRACT

Modeling the rate at which adaptive phenotypes appear in a population is a key to predicting evolutionary processes. Given random mutations, should this rate be modeled by a simple Poisson process, or are more complex dynamics needed? Here we use analytic calculations and simulations of evolving populations on explicit genotype-phenotype maps to show that the introduction of novel phenotypes can be "bursty" or overdispersed. In other words, a novel phenotype either appears multiple times in quick succession or not at all for many generations. These bursts are fundamentally caused by statistical fluctuations and other structure in the map from genotypes to phenotypes. Their strength depends on population parameters, being highest for "monomorphic" populations with low mutation rates. They can also be enhanced by additional inhomogeneities in the mapping from genotypes to phenotypes. We mainly investigate the effect of bursts using the well-studied genotype-phenotype map for RNA secondary structure, but find similar behavior in a lattice protein model and in Richard Dawkins's biomorphs model of morphological development. Bursts can profoundly affect adaptive dynamics. Most notably, they imply that fitness differences play a smaller role in determining which phenotype fixes than would be the case for a Poisson process without bursts.


Subject(s)
Models, Genetic , Phenotype , Genotype , Computer Simulation , Adaptation, Physiological/genetics , Evolution, Molecular , Mutation , Biological Evolution , Poisson Distribution , RNA/genetics , Adaptation, Biological/genetics
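
One common way to quantify the "bursty" or overdispersed behavior described in the abstract above is the index of dispersion (variance divided by mean) of per-interval counts: a Poisson process gives a value near 1, while bursts push it above 1. The sketch below is a generic illustration with a hypothetical helper name, `dispersion_index`; whether the authors used this particular statistic is not stated in the abstract.

```python
from statistics import mean, pvariance


def dispersion_index(counts):
    """Variance-to-mean ratio of event counts per time window.

    Values near 1 are consistent with a Poisson process; values well
    above 1 indicate overdispersion (events arriving in bursts).
    """
    average = mean(counts)
    if average == 0:
        raise ValueError("index is undefined when no events were observed")
    return pvariance(counts) / average
```

For example, counts of novel-phenotype appearances per generation window such as `[0, 0, 0, 4, 0, 0, 0, 4]` have mean 1 but variance 3, giving an index of 3.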
20.
Mol Biol Evol ; 41(5)2024 May 03.
Article in English | MEDLINE | ID: mdl-38709811

ABSTRACT

The evolution of antimicrobial resistance (AMR) in bacteria is a major public health concern, and antibiotic restriction is often implemented to reduce the spread of resistance. These measures rely on the existence of deleterious fitness effects (i.e. costs) imposed by AMR mutations during growth in the absence of antibiotics. According to this assumption, resistant strains will be outcompeted by susceptible strains that do not pay the cost during the period of restriction. The fitness effects of AMR mutations are generally studied in laboratory reference strains grown in standard growth environments; however, the genetic and environmental context can influence the magnitude and direction of a mutation's fitness effects. In this study, we measure how three sources of variation impact the fitness effects of Escherichia coli AMR mutations: the type of resistance mutation, the genetic background of the host, and the growth environment. We demonstrate that while AMR mutations are generally costly in antibiotic-free environments, their fitness effects vary widely and depend on complex interactions between the mutation, genetic background, and environment. We test the ability of the Rough Mount Fuji fitness landscape model to reproduce the empirical data in simulation. We identify model parameters that reasonably capture the variation in fitness effects due to genetic variation. However, the model fails to accommodate the observed variation when considering multiple growth environments. Overall, this study reveals a wealth of variation in the fitness effects of resistance mutations owing to genetic background and environmental conditions, which will ultimately impact their persistence in natural populations.


Subject(s)
Drug Resistance, Bacterial , Escherichia coli , Genetic Fitness , Mutation , Escherichia coli/genetics , Escherichia coli/drug effects , Drug Resistance, Bacterial/genetics , Anti-Bacterial Agents/pharmacology , Models, Genetic , Environment