ABSTRACT
Plant noncoding RNA transcripts have gained increasing attention in recent years due to growing evidence that they can regulate developmental plasticity. In this review article, we comprehensively analyze the relationship between noncoding RNA transcripts in plants and their response to environmental cues. We first provide an overview of the various noncoding transcript types, including long and small RNAs, and how the environment modulates their performance. We then highlight the importance of noncoding RNA secondary structure for their molecular and biological functions. Finally, we discuss recent studies that have unveiled the functional significance of specific long noncoding transcripts and their molecular partners within ribonucleoprotein complexes during development and in response to biotic and abiotic stress. Overall, this review sheds light on the fascinating and complex relationship between dynamic noncoding transcription and plant environmental responses, and highlights the need for further research to uncover the underlying molecular mechanisms and exploit the potential of noncoding transcripts for crop resilience in the context of global warming.
Subjects
Long Noncoding RNA, Transcriptome, Long Noncoding RNA/genetics, Plant Gene Expression Regulation, Untranslated RNA/genetics, Physiological Stress/genetics, Plant RNA/genetics
ABSTRACT
Variants in cis-regulatory elements link the noncoding genome to human pathology; however, detailed analytic tools for understanding the association between cell-level brain pathology and noncoding variants are lacking. CWAS-Plus, adapted from a Python package for category-wide association testing (CWAS), enhances noncoding variant analysis by integrating both whole-genome sequencing (WGS) and user-provided functional data. With simplified parameter settings and an efficient multiple testing correction method, CWAS-Plus conducts the CWAS workflow 50 times faster than CWAS, making it more accessible and user-friendly for researchers. Here, we used a single-nuclei assay for transposase-accessible chromatin with sequencing to facilitate CWAS-guided noncoding variant analysis at cell-type-specific enhancers and promoters. Examining autism spectrum disorder WGS data (n = 7280), CWAS-Plus identified noncoding de novo variant associations in transcription factor binding sites within conserved loci. Independently, in Alzheimer's disease WGS data (n = 1087), CWAS-Plus detected rare noncoding variant associations in microglia-specific regulatory elements. These findings highlight CWAS-Plus's utility in genomic disorders and scalability for processing large-scale WGS data and in multiple-testing corrections. CWAS-Plus and its user manual are available at https://github.com/joonan-lab/cwas/ and https://cwas-plus.readthedocs.io/en/latest/, respectively.
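As a rough illustration of the category-wise idea described above, the following minimal sketch compares variant counts between cases and controls in each annotation category with an exact two-sided binomial test, then applies a plain Bonferroni threshold. It is not the CWAS-Plus API and not its multiple-testing correction method; the function names, category labels, and counts are all hypothetical.

```python
from math import comb


def binom_two_sided(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: total probability of outcomes no more likely than k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= cutoff))


def category_tests(counts: dict[str, tuple[int, int]], alpha: float = 0.05):
    """counts maps category -> (case_variants, control_variants); hypothetical layout."""
    threshold = alpha / len(counts)  # Bonferroni, chosen here only for simplicity
    results = {}
    for name, (case, control) in counts.items():
        p_value = binom_two_sided(case, case + control)
        results[name] = (p_value, p_value < threshold)
    return results


# Made-up counts for two hypothetical categories (equal numbers of cases and controls assumed).
example = {"enhancer_microglia": (30, 10), "promoter_neuron": (20, 20)}
results = category_tests(example)
```

Each entry of `results` holds the p-value and whether it passes the corrected threshold. With unequal numbers of cases and controls, the null proportion `p` would need to be adjusted accordingly.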
Subjects
Whole Genome Sequencing, Humans, Whole Genome Sequencing/methods, Alzheimer Disease/genetics, Genome-Wide Association Study/methods, Autism Spectrum Disorder/genetics, Genetic Variation, Software, Chromatin/genetics, Chromatin/metabolism, Human Genome
ABSTRACT
Gene regulation by transcriptional enhancers is the dominant mechanism driving cell type- and signal-specific transcriptional diversity in metazoans. However, over four decades since the original discovery, how enhancers operate in the nuclear space remains largely enigmatic. Recent multidisciplinary efforts combining real-time imaging, genome sequencing, and biophysical strategies provide insightful but conflicting models of enhancer-mediated gene control. Here, we review the discovery and progress in enhancer biology, emphasizing the recent findings that acutely activated enhancers assemble regulatory machinery as mesoscale architectural structures with distinct physical properties. These findings help formulate novel models that explain several mysterious features of the assembly of transcriptional enhancers and the mechanisms of spatial control of gene expression.
Subjects
Viral DNA, Genetic Enhancer Elements, Base Sequence, Cell Nucleus/genetics, Gene Expression Regulation/genetics
ABSTRACT
The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas nuclease system is a powerful tool for genome editing, and its simple programmability has enabled high-throughput genetic and epigenetic studies. These high-throughput approaches offer investigators a toolkit for functional interrogation of not only protein-coding genes but also noncoding DNA. Historically, noncoding DNA has lacked the detailed characterization that has been applied to protein-coding genes in large part because there has not been a robust set of methodologies for perturbing these regions. Although the majority of high-throughput CRISPR screens have focused on the coding genome to date, an increasing number of CRISPR screens targeting noncoding genomic regions continue to emerge. Here, we review high-throughput CRISPR-based approaches to uncover and understand functional elements within the noncoding genome and discuss practical aspects of noncoding library design and screen analysis.
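To make the library-design step concrete, here is a minimal sketch of one common starting point: enumerating candidate guide sequences that tile a noncoding region. It assumes SpCas9, which targets a 20-nt protospacer followed by an NGG PAM. The function names are hypothetical, and a real design would also score off-targets and on-target efficiency, which this sketch does not attempt.

```python
import re


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_guides(region: str):
    """Return (strand, start, protospacer) for every 20-mer followed by NGG.

    For the "-" strand, start is the offset on the reverse-complemented sequence.
    """
    region = region.upper()
    guides = []
    for strand, seq in (("+", region), ("-", revcomp(region))):
        # The lookahead lets overlapping candidates be reported.
        for m in re.finditer(r"(?=([ACGT]{20})[ACGT]GG)", seq):
            guides.append((strand, m.start(), m.group(1)))
    return guides


# Toy region: 20 A's followed by a TGG PAM.
guides = find_guides("A" * 20 + "TGG")
```

Dense tiling of this kind is one way to perturb a candidate regulatory region without knowing in advance which bases are functional.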
Subjects
CRISPR-Cas Systems, Clustered Regularly Interspaced Short Palindromic Repeats, Intergenic DNA/genetics, Endonucleases/genetics, Gene Editing/methods, Genome, Animals, Intergenic DNA/metabolism, Endonucleases/metabolism, Eukaryotic Cells/cytology, Eukaryotic Cells/metabolism, Genetic Engineering, Genomic Library, High-Throughput Screening Assays, Humans, Kinetoplastida Guide RNA/genetics, Kinetoplastida Guide RNA/metabolism
ABSTRACT
Since their discovery a decade ago, it has become evident that innate lymphoid cells (ILCs) play critical roles in protective immune responses against intracellular and extracellular pathogens but are also central regulators of epithelial barrier integrity and tissue homeostasis. ILCs populate almost every tissue in mammalian organisms; therefore, not surprisingly, dysregulation of their functions contributes to the development and progression of multiple inflammatory and metabolic diseases. Our knowledge of the transcriptional programs governing the development, differentiation, and functions of the different groups of ILCs has increased dramatically in the last ten years. However, with the advent of new technologies, an unprecedented level of heterogeneity, plasticity, and developmental complexity has started to be revealed. In this review, we highlight recent advances in our understanding of ILC development and their biological functions. In particular, we aim to emphasize how our increasing knowledge of the chromatin landscape and the noncoding genome of these innate lymphocytes is allowing us to better understand their development and functions in different contexts during homeostasis and inflammation. Moreover, we propose that the design of more refined genetic tools to study tissue-specific ILCs and their functions can be accomplished by leveraging our understanding of how specific noncoding elements of the genome regulate gene expression in ILCs.
Subjects
Innate Immunity, Lymphocytes, Animals, Cell Differentiation, Homeostasis, Innate Immunity/genetics, Inflammation
ABSTRACT
Functionally annotating genetic variations is an essential yet challenging topic in human genetics research. As large consortia including ENCODE and Roadmap Epigenomics Project continue to generate high-throughput transcriptomic and epigenomic data, many computational frameworks have been developed to integrate these experimental data to predict functionality of genetic variations in both protein-coding and noncoding regions. Here, we compare a number of recently developed annotation frameworks for noncoding regions through enrichment analysis on genome-wide association studies (GWASs). We also compare several different strategies to quantify enrichment using GWAS summary statistics. Our analyses highlight the importance of jointly modeling context-specific annotations with genome-wide data in providing statistically powerful and biologically interpretable enrichment for complex disease associations. Our findings provide insights into when and how computational genome annotations may benefit future complex disease studies on the genome-wide scale.
Subjects
Genome-Wide Association Study, Molecular Sequence Annotation, Humans
ABSTRACT
Pseudogenes, the debilitated remnants of ancient genes, were previously written off as junk or discarded genes with no functional significance. They have since come under scrutiny, as recent studies have unveiled their importance in regulating their parent genes and various biological mechanisms. Despite the abundance of pseudogenes in plants, a lack of experimental validation has left their roles in gene regulation unresolved. In contrast, most studies of pseudogene-mediated gene regulation have been reported for human, mouse, and other mammalian genomes. To present a cumulative report on plant pseudogene research, we assemble multiple studies covering pseudogene classification, prediction, the comparative accuracy of various computational pipelines, and recent trends in analyzing their biological functions and regulatory mechanisms. This review covers both classical and recent advances in pseudogene identification and their potential roles in transcriptional regulation, which could improve genome annotation, evolutionary analysis, and understanding of the complexity of regulatory pathways in plants. Thus, once the ambiguity surrounding pseudogenes recedes as their regulatory roles become explicit, plant research should no longer lag behind animal research.
Subjects
Genome, Pseudogenes, Animals, Biological Evolution, Gene Expression Regulation, Mice, Pseudogenes/genetics
ABSTRACT
Salmonella enterica serovar Typhimurium ST313 is a relatively newly emerged sequence type that is causing a devastating epidemic of bloodstream infections across sub-Saharan Africa. Analysis of hundreds of Salmonella genomes has revealed that ST313 is closely related to the ST19 group of S. Typhimurium that causes gastroenteritis across the world. The core genomes of ST313 and ST19 vary by only ~1,000 SNPs. We hypothesized that the phenotypic differences that distinguish African Salmonella from ST19 are caused by certain SNPs that directly modulate the transcription of virulence genes. Here we identified 3,597 transcriptional start sites of the ST313 strain D23580, and searched for a gene-expression signature linked to pathogenesis of Salmonella. We identified a SNP in the promoter of the pgtE gene that caused high expression of the PgtE virulence factor in African S. Typhimurium, increased the degradation of the factor B component of human complement, contributed to serum resistance, and modulated virulence in the chicken infection model. We propose that high levels of PgtE expression by African S. Typhimurium ST313 promote bacterial survival and dissemination during human infection. Our finding of a functional role for an extragenic SNP shows that approaches used to deduce the evolution of virulence in bacterial pathogens should include a focus on noncoding regions of the genome.
Subjects
Molecular Evolution, Bacterial Genome/genetics, Salmonella Infections/microbiology, Salmonella typhimurium/genetics, Salmonella typhimurium/pathogenicity, Bacterial DNA/genetics, Epidemics, Humans, Phylogeny, Single Nucleotide Polymorphism/genetics, Virulence/genetics, Virulence Factors/genetics
ABSTRACT
The relationship between DNA sequence, biochemical function, and molecular evolution is relatively well-described for protein-coding regions of genomes, but far less clear in noncoding regions, particularly, in eukaryote genomes. In part, this is because we lack a complete description of the essential noncoding elements in a eukaryote genome. To contribute to this challenge, we used saturating transposon mutagenesis to interrogate the Schizosaccharomyces pombe genome. We generated 31 million transposon insertions, a theoretical coverage of 2.4 insertions per genomic site. We applied a five-state hidden Markov model (HMM) to distinguish insertion-depleted regions from insertion biases. Both raw insertion-density and HMM-defined fitness estimates showed significant quantitative relationships to gene knockout fitness, genetic diversity, divergence, and expected functional regions based on transcription and gene annotations. Through several analyses, we conclude that transposon insertions produced fitness effects in 66-90% of the genome, including substantial portions of the noncoding regions. Based on the HMM, we estimate that 10% of the insertion depleted sites in the genome showed no signal of conservation between species and were weakly transcribed, demonstrating limitations of comparative genomics and transcriptomics to detect functional units. In this species, 3'- and 5'-untranslated regions were the most prominent insertion-depleted regions that were not represented in measures of constraint from comparative genomics. We conclude that the combination of transposon mutagenesis, evolutionary, and biochemical data can provide new insights into the relationship between genome function and molecular evolution.
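The coverage figure above can be unpacked with a little arithmetic. The sketch below uses only the two numbers stated in the abstract (31 million insertions, 2.4 insertions per site) to back out the implied number of genomic sites. It then adds an idealized assumption, not taken from the abstract, that insertions land uniformly at random (Poisson), to show how rarely a site would be missed by chance alone.

```python
from math import exp

n_insertions = 31_000_000   # stated in the abstract
reported_coverage = 2.4     # insertions per genomic site, stated in the abstract

# Number of genomic sites implied by the two reported figures.
implied_sites = n_insertions / reported_coverage

# Assumption for illustration only: uniform random insertion (Poisson model).
# Under it, the chance that a given site receives no insertion is exp(-coverage).
p_zero_by_chance = exp(-reported_coverage)

print(f"implied sites: {implied_sites:,.0f}")
print(f"P(no insertion at a site by chance): {p_zero_by_chance:.3f}")
```

Real insertion patterns are biased rather than uniform, which is consistent with the abstract's use of a hidden Markov model to separate insertion-depleted regions from insertion biases.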
Subjects
Genetic Fitness, Fungal Genome, Schizosaccharomyces/genetics, Genetic Models, Insertional Mutagenesis
ABSTRACT
PURPOSE OF REVIEW: Common genetic variants that associate with type 2 diabetes risk are markedly enriched in pancreatic islet transcriptional enhancers. This review discusses current advances in the annotation of islet enhancer variants and their target genes. RECENT FINDINGS: Recent methodological advances now allow genetic and functional mapping of diabetes causal variants at unprecedented resolution. Mapping of enhancer-promoter interactions in human islets has provided a unique appreciation of the complexity of islet gene regulatory processes and enabled direct association of noncoding diabetes risk variants to their target genes. The recently improved human islet enhancer annotations constitute a framework for the interpretation of diabetes genetic signals in the context of pancreatic islet gene regulation. In the future, integration of existing and yet to come regulatory maps with genetic fine-mapping efforts and in-depth functional characterization will foster the discovery of novel diabetes molecular risk mechanisms.
Subjects
Type 2 Diabetes Mellitus/genetics, Islets of Langerhans/physiopathology, Gene Expression Regulation, Genetic Predisposition to Disease, Genetic Techniques, Genome-Wide Association Study, Humans, Islets of Langerhans/metabolism, Genetic Promoter Regions
ABSTRACT
We report on the sequencing of 10,545 human genomes at 30×-40× coverage with an emphasis on quality metrics and novel variant and sequence discovery. We find that 84% of an individual human genome can be sequenced confidently. This high-confidence region includes 91.5% of exon sequence and 95.2% of known pathogenic variant positions. We present the distribution of over 150 million single-nucleotide variants in the coding and noncoding genome. Each newly sequenced genome contributes an average of 8,579 novel variants. In addition, each genome carries on average 0.7 Mb of sequence that is not found in the main build of the hg38 reference genome. The density of this catalog of variation allowed us to construct high-resolution profiles that define genomic sites that are highly intolerant of genetic variation. These results indicate that the data generated by deep genome sequencing is of the quality necessary for clinical use.
Subjects
Human Genome, Genomics, Whole Genome Sequencing, Chromosome Mapping, Computational Biology/methods, Nucleic Acid Databases, Genetic Predisposition to Disease, Genetic Variation, Genomics/methods, Humans, Open Reading Frames, Single Nucleotide Polymorphism, Reproducibility of Results, Untranslated Regions
ABSTRACT
The loci arginine vasopressin receptor 1a (avpr1a) and oxytocin receptor (oxtr) have evolutionarily conserved roles in vertebrate social and sexual behaviour. Allelic variation at a microsatellite locus in the 5' regulatory region of these genes is associated with fitness in the bank vole Myodes glareolus. Given the low frequency of long and short alleles at these microsatellite loci in wild bank voles, we used breeding trials to determine whether selection acts against long and short alleles. Female bank voles with intermediate length avpr1a alleles had the highest probability of breeding, while male voles whose avpr1a alleles were very different in length had reduced probability of breeding. Moreover, there was a significant interaction between male and female oxtr genotypes, where potential breeding pairs with dissimilar length alleles had reduced probability of breeding. These data show how genetic variation at microsatellite loci associated with avpr1a and oxtr is associated with fitness, and highlight complex patterns of selection at these loci. More widely, these data show how stabilizing selection might act on allele length frequency distributions at gene-associated microsatellite loci.
Subjects
Arvicolinae/genetics, Gene Frequency, Microsatellite Repeats/genetics, Oxytocin Receptors/genetics, Vasopressin Receptors/genetics, Genetic Selection, Alleles, Animals, Arvicolinae/metabolism, Female, Genetic Variation, Male, Oxytocin Receptors/metabolism, Vasopressin Receptors/metabolism
ABSTRACT
The genetics of Plasmodium as an intracellular, mostly haploid, sexually reproducing, eukaryotic organism with a complex life cycle presents unprecedented challenges in studying drug resistance. This article summarizes current knowledge on the genetic basis of resistance to artemisinin, a main component of current drug therapies for falciparum malaria (artemisinin resistance, AR). Although centered on nonsynonymous single-nucleotide polymorphisms (nsSNPs), we describe multifaceted resistance mechanisms as part of a complex, cumulative genetic trait that involves regulation of expression by a wide array of polymorphisms in noncoding regions. These genetic variations alter transcriptome profiles linked to Plasmodium's development and population dynamics, ultimately influencing the emergence and spread of resistance.
ABSTRACT
Variants in cis-regulatory elements link the noncoding genome to human brain pathology; however, detailed analytic tools for understanding the association between cell-level brain pathology and noncoding variants are lacking. CWAS-Plus, adapted from a Python package for category-wide association testing (CWAS), employs both whole-genome sequencing and user-provided functional data to enhance noncoding variant analysis, with a faster and more efficient execution of the CWAS workflow. Here, we used a single-nuclei assay for transposase-accessible chromatin with sequencing to facilitate CWAS-guided noncoding variant analysis at cell-type-specific enhancers and promoters. Examining autism spectrum disorder whole-genome sequencing data (n = 7,280), CWAS-Plus identified noncoding de novo variant associations in transcription factor binding sites within conserved loci. Independently, in Alzheimer's disease whole-genome sequencing data (n = 1,087), CWAS-Plus detected rare noncoding variant associations in microglia-specific regulatory elements. These findings highlight CWAS-Plus's utility in genomic disorders and scalability for processing large-scale whole-genome sequencing data and in multiple-testing corrections. CWAS-Plus and its user manual are available at https://github.com/joonan-lab/cwas/ and https://cwas-plus.readthedocs.io/en/latest/, respectively.
ABSTRACT
BACKGROUND: Pervasive translation is a widespread phenomenon that plays a critical role in the emergence of novel microproteins, but the diversity of translation patterns contributing to their generation remains unclear. Based on 54 ribosome profiling (Ribo-Seq) datasets, we investigated the yeast Ribo-Seq landscape using a representation framework that allows the comprehensive inventory and classification of the entire diversity of Ribo-Seq signals, including non-canonical ones. RESULTS: We show that while coding regions occupy specific areas of the Ribo-Seq landscape, noncoding regions encompass a wide diversity of Ribo-Seq signals and, conversely, populate the entire landscape. Our results show that pervasive translation can, nevertheless, be associated with high specificity, with 1055 noncoding ORFs exhibiting canonical Ribo-Seq signals. Using mass spectrometry under standard conditions or proteasome inhibition with an in-house analysis protocol, we report 239 microproteins originating from noncoding ORFs that display canonical but also non-canonical Ribo-Seq signals. Each condition yields dozens of additional microprotein candidates with comparable translation properties, suggesting a larger population of volatile microproteins that are challenging to detect. Our findings suggest that non-canonical translation signals may harbor valuable information and underscore the significance of considering them in proteogenomic studies. Finally, we show that the translation outcome of a noncoding ORF is primarily determined by the initiating codon and the codon distribution in its two alternative frames, rather than features indicative of functionality. CONCLUSION: Our results enable us to propose a topology of a species' Ribo-Seq landscape, opening the way to comparative analyses of this translation landscape under different conditions.
Subjects
Open Reading Frames, Protein Biosynthesis, Ribosomes, Saccharomyces cerevisiae, Ribosomes/metabolism, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism, Ribosome Profiling
ABSTRACT
While it is well known that 98-99% of the human genome does not encode proteins but is nevertheless transcriptionally active and gives rise to a broad spectrum of noncoding RNAs (ncRNAs) with complex regulatory and structural functions, specific functions have so far been assigned to only a tiny fraction of all known transcripts. On the other hand, the striking observation of an overwhelmingly growing fraction of ncRNAs, in contrast to an only modest increase in the number of protein-coding genes, during evolution from simple organisms to humans, strongly suggests critical but so far essentially unexplored roles of the noncoding genome for human health and disease pathogenesis. Research into the vast realm of the noncoding genome during the past decades has thus led to a profoundly enhanced appreciation of the multi-level complexity of the human genome. Here, we address a few of the many huge remaining knowledge gaps and consider some newly emerging questions and concepts of research. We attempt to provide an up-to-date assessment of recent insights obtained by molecular and cell biological methods, and by the application of systems biology approaches. Specifically, we discuss current data regarding two topics of high current interest: (1) By which mechanisms could evolutionarily recent ncRNAs with critical regulatory functions in a broad spectrum of cell types (neural, immune, cardiovascular) constitute novel therapeutic targets in human diseases? (2) Since noncoding genome evolution is causally linked to brain evolution, and given the profound interactions between brain and immune system, could human-specific brain-expressed ncRNAs play a direct or indirect (immune-mediated) role in human diseases?
Together with remarkable recent progress in the delivery, efficacy, and safety of nucleic acid-based therapies, the ongoing large-scale exploration of the noncoding genome for human-specific therapeutic targets encourages the development and clinical evaluation of novel therapeutic pathways suggested by these research fields.
Subjects
Genome, Untranslated RNA, Humans, Untranslated RNA/genetics, Brain
ABSTRACT
During the past few years, unexpected developments have driven studies in the field of clinical immunology. One driver of immense impact was the outbreak of a pandemic caused by the novel virus SARS-CoV-2. Excellent recent reviews address diverse aspects of immunological research into cardiovascular diseases. Here, we specifically focus on selected studies taking advantage of advanced state-of-the-art molecular genetic methods ranging from genome-wide epi/transcriptome mapping and variant scanning to optogenetics and chemogenetics. First, we discuss the emerging clinical relevance of advanced diagnostics for cardiovascular diseases, including those associated with COVID-19, with a focus on the role of inflammation in cardiomyopathies and arrhythmias. Second, we consider newly identified immunological interactions at organ and system levels which affect cardiovascular pathogenesis. Thus, studies into immune influences arising from the intestinal system are moving towards therapeutic exploitation. Further, powerful new research tools have enabled novel insight into brain-immune system interactions at unprecedented resolution. This latter line of investigation emphasizes the strength of influence of emotional stress, acting through defined brain regions, upon viral and cardiovascular disorders. Several challenges need to be overcome before the full impact of these far-reaching new findings will hit the clinical arena.
ABSTRACT
The spatiotemporal control of tissue-specific gene expression is coordinated by cis-regulatory elements (CREs) and associated trans-acting factors. Despite major advances in genome-wide annotation of candidate CREs, the in situ regulatory composition of the vast majority of CREs remains unknown. To address this challenge, we developed the CRISPR affinity purification in situ of regulatory elements (CAPTURE) toolbox that employs an in vivo biotinylated nuclease-deficient Cas9 (dCas9) protein and programmable single-guide RNAs (sgRNAs) to identify CRE-associated macromolecular complexes and chromatin looping. In this chapter, we provide a detailed protocol for implementing the latest iteration of the CRISPR-based CAPTURE methods to interrogate the molecular composition of locus-specific chromatin complexes and configuration in a mammalian genome.
Subjects
Chromatin, Coleoptera, Animals, Chromatin/genetics, Affinity Chromatography, CRISPR-Associated Protein 9, Endonucleases, Mammals
ABSTRACT
The evolutionarily conserved NEAT1-MALAT1 gene cluster generates large noncoding transcripts that remain nuclear, while tRNA-like transcripts (mascRNA, menRNA) enzymatically generated from these precursors translocate to the cytosol. Whereas functions have been assigned to the nuclear transcripts, data on biological functions of the small cytosolic transcripts are sparse. We previously found NEAT1-/- and MALAT1-/- mice to display massive atherosclerosis and vascular inflammation. Here, employing selective targeted disruption of menRNA or mascRNA, we investigate the tRNA-like molecules as critical components of innate immunity. CRISPR-generated human ΔmascRNA and ΔmenRNA monocytes/macrophages display defective innate immune sensing, loss of cytokine control, imbalance of growth/angiogenic factor expression impacting upon angiogenesis, and altered cell-cell interaction systems. Antiviral response, foam cell formation/oxLDL uptake, and M1/M2 polarization are defective in ΔmascRNA/ΔmenRNA macrophages, defining the first biological functions of menRNA and describing new functions of mascRNA. menRNA and mascRNA represent novel components of innate immunity arising from the noncoding genome. They appear as prototypes of a new class of noncoding RNAs distinct from others (miRNAs, siRNAs) by biosynthetic pathway and intracellular kinetics. Their NEAT1-MALAT1 region of origin appears as an archetype of a functionally highly integrated RNA processing system.
Subjects
Innate Immunity, Macrophages, Long Noncoding RNA, Transfer RNA, Humans, Genomics, Innate Immunity/genetics, Innate Immunity/immunology, Macrophages/immunology, Long Noncoding RNA/genetics, Long Noncoding RNA/immunology, Transfer RNA/genetics, Transfer RNA/immunology
ABSTRACT
A growing number of variants associated with risk for neurodevelopmental disorders have been identified by genome-wide association and whole genome sequencing studies. As common risk variants often fall within large haplotype blocks covering long stretches of the noncoding genome, the causal variants within an associated locus are often unknown. Similarly, the effect of rare noncoding risk variants identified by whole genome sequencing on molecular traits is seldom known without functional assays. A massively parallel reporter assay (MPRA) can functionally validate thousands of regulatory elements simultaneously using high-throughput sequencing and barcode technology. MPRA has been adapted to various experimental designs that measure gene regulatory effects of genetic variants within cis- and trans-regulatory elements as well as posttranscriptional processes. This review discusses different MPRA designs that have been or could be used in the future to experimentally validate genetic variants associated with neurodevelopmental disorders. Although MPRA has limitations, such as not modeling genomic context, this assay can help narrow down the underlying genetic causes of neurodevelopmental disorders by screening thousands of sequences in one experiment. We conclude by describing future directions of this technique, such as applications of MPRA for gene-by-environment interactions and pharmacogenetics.