ABSTRACT
Plant height can be an indicator of plant health across environments and can be used to identify superior genotypes. Typically, plant height is measured at a single timepoint when plants reach terminal height. Evaluating plant height using unoccupied aerial vehicles allows for measurements throughout the growing season, facilitating a better understanding of plant-environment interactions and the genetic basis of this complex trait. To assess variation throughout development, plant height data were collected from planting until terminal height at anthesis (14 flights in 2018, 27 in 2019, 12 in 2020, and 11 in 2021) for a panel of ~500 diverse maize inbred lines. Variance in plant height was significantly explained by genotype (9-48%), year (4-52%), and genotype-by-year interactions (14-36%), to varying extents throughout development. Genome-wide association studies revealed 717 significant single nucleotide polymorphisms associated with plant height and growth rate at different parts of the growing season, specific to certain phases of vegetative growth. When plant height growth curves were compared to growth curves estimated from canopy cover, greater Fréchet distance stability was observed in plant height growth curves than in canopy cover growth curves. This indicated that canopy cover may be more useful for understanding environmental modulation of overall plant growth, and plant height more useful for understanding genotypic modulation of overall plant growth. This study demonstrated that substantial information can be gained from high temporal resolution data to understand how plants differentially interact with the environment, and that such data can enhance our understanding of the genetic basis of complex polygenic traits.
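The abstract above compares growth curves by Fréchet distance but does not say how that distance was computed. As an illustration only, the following is a minimal sketch of the discrete Fréchet distance between two curves sampled as (day, height) points; the function name and the toy values are hypothetical and do not come from the study.

```python
from math import hypot


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two polylines.

    p and q are non-empty lists of (x, y) points, e.g. (day, height).
    Illustrative helper only; not the method used in the study.
    """
    n, m = len(p), len(q)
    if n == 0 or m == 0:
        raise ValueError("both curves need at least one point")

    def dist(a, b):
        return hypot(a[0] - b[0], a[1] - b[1])

    # ca[i][j] = coupling distance for the prefixes p[:i+1] and q[:j+1]
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[-1][-1]


# Hypothetical toy curves: (days after planting, height in cm)
curve_a = [(10, 5.0), (20, 30.0), (30, 90.0), (40, 150.0)]
curve_b = [(10, 6.0), (20, 28.0), (30, 95.0), (40, 155.0)]
print(discrete_frechet(curve_a, curve_b))
```

Because both axes enter the Euclidean distance, real curves would usually be rescaled to comparable units first; that choice is left open here.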
ABSTRACT
Variation in gene expression levels is pervasive among individuals and races or varieties, and has substantial agronomic consequences, for example, by contributing to hybrid vigor. Gene expression level variation results from mutations in regulatory sequences (cis) and/or transcription factor (TF) activity (trans), but the mechanisms underlying cis- and/or trans-regulatory variation of complex phenotypes remain largely unknown. Here, we investigated gene expression variation mechanisms underlying the differential accumulation of the insecticidal compounds maysin and chlorogenic acid in silks of widely used maize (Zea mays) inbreds, B73 and A632. By combining transcriptomics and cistromics, we identified 1,338 silk direct targets of the maize R2R3-MYB TF Pericarp color1 (P1), consistent with it being a regulator of maysin and chlorogenic acid biosynthesis. Among these P1 targets, 464 showed allele-specific expression (ASE) between B73 and A632 silks. Allelic DNA-affinity purification sequencing identified 34 examples in which P1 allele-specific binding (ASB) correlated with cis-expression variation. From previous yeast one-hybrid studies, we identified 9 TFs potentially implicated in the control of P1 targets, with ASB to 83 out of 464 ASE genes (cis) and differential expression of 4 out of 9 TFs between B73 and A632 silks (trans). These results provide a molecular framework for understanding universal mechanisms underlying natural variation of gene expression levels, and how the regulation of metabolic diversity is established.
Subjects
Gene Expression Regulation, Plant; Plant Proteins; Zea mays; Zea mays/genetics; Zea mays/metabolism; Plant Proteins/genetics; Plant Proteins/metabolism; Transcription Factors/metabolism; Transcription Factors/genetics; Alleles; Chlorogenic Acid/metabolism; Phenotype; Genetic Variation
ABSTRACT
Elucidating gene regulatory networks is a major area of study within plant systems biology. Phenotypic traits are intricately linked to specific gene expression profiles. These expression patterns arise primarily from regulatory connections between sets of transcription factors (TFs) and their target genes. Here, we integrated 46 co-expression networks, 283 protein-DNA interaction (PDI) assays, and 16 million SNPs used to identify expression quantitative trait loci (eQTL) to construct TF-target networks. In total, we analyzed ~4.6M interactions to generate four distinct types of TF-target networks: co-expression, PDI, trans-eQTL, and cis-eQTL combined with PDIs. To functionally annotate TFs based on their target genes, we implemented three different network integration strategies. We evaluated the effectiveness of each strategy through TF loss-of-function mutant inspection and random network analyses. The multi-network integration allowed us to identify transcriptional regulators of several biological processes. Using the topological properties of the fully integrated network, we identified potential functionally redundant TF paralogs. Our findings retrieved functions previously documented for numerous TFs and revealed novel functions that are crucial for informing the design of future experiments. The approach described here lays the foundation for the integration of multi-omic datasets in maize and other plant systems.
ABSTRACT
Structural differences between genomes are a major source of genetic variation that contributes to phenotypic differences. Transposable elements, mobile genetic sequences capable of increasing their copy number and propagating themselves within genomes, can generate structural variation. However, their repetitive nature makes it difficult to characterize fine-scale differences in their presence at specific positions, limiting our understanding of their impact on genome variation. Domesticated maize is a particularly good system for exploring the impact of transposable element proliferation as over 70% of the genome is annotated as transposable elements. High-quality transposable element annotations were recently generated for de novo genome assemblies of 26 diverse inbred maize lines. We generated base-pair resolved pairwise alignments between the B73 maize reference genome and the remaining 25 inbred maize line assemblies. From this data, we classified transposable elements as either shared or polymorphic in a given pairwise comparison. Our analysis uncovered substantial structural variation between lines, representing both simple and complex connections between TEs and structural variants. Putative insertions in SNP depleted regions, which represent recently diverged identity by state blocks, suggest some TE families may still be active. However, our analysis reveals that within these recently diverged genomic regions, deletions of transposable elements likely account for more structural variation events and base pairs than insertions. These deletions are often large structural variants containing multiple transposable elements. Combined, our results highlight how transposable elements contribute to structural variation and demonstrate that deletion events are a major contributor to genomic differences.
Subjects
DNA Transposable Elements; Zea mays; Humans; DNA Transposable Elements/genetics; Zea mays/genetics; Genomics
ABSTRACT
The highly active family of Mutator (Mu) DNA transposons has been widely used for forward and reverse genetics in maize. There are examples of Mu-suppressible alleles that result in conditional phenotypic effects based on the activity of Mu. Phenotypes from these Mu-suppressible mutations are observed in Mu-active genetic backgrounds, but absent when Mu activity is lost. For some Mu-suppressible alleles, phenotypic suppression likely results from an outward-reading promoter within Mu that is only active when the autonomous Mu element is silenced or lost. We isolated 35 Mu alleles from the UniformMu population that represent insertions in 24 different genes. Most of these mutant alleles are due to insertions within gene coding sequences, but several 5' UTR and intron insertions were included. RNA-seq and de novo transcript assembly were utilized to document the transcripts produced from 33 of these Mu insertion alleles. For 20 of the 33 alleles, there was evidence of transcripts initiating within the Mu sequence reading through the gene. This outward-reading promoter activity was detected in multiple types of Mu elements and does not depend on the orientation of Mu. Expression analyses of Mu-initiated transcripts revealed the Mu promoter often provides gene expression levels and patterns that are similar to the wild-type gene. These results suggest the Mu promoter may represent a minimal promoter that can respond to gene cis-regulatory elements. Findings from this study have implications for maize researchers using the UniformMu population, and more broadly highlight a strategy for transposons to co-exist with their host.
Subjects
Zea mays; Base Sequence; DNA Transposable Elements; Mutation; Zea mays/genetics
ABSTRACT
It is unclear how mobile DNA sequences (transposable elements, hereafter TEs) invade eukaryotic genomes and reach stable copy numbers, as transposition can decrease host fitness. This challenge is particularly stark early in the invasion of a TE family at which point hosts may lack the specialized machinery to repress the spread of these TEs. One possibility (in addition to the evolution of host regulation of TEs) is that TE families may evolve to preferentially insert into chromosomal regions that are less likely to impact host fitness. This may allow the mean TE copy number to grow while minimizing the risk for host population extinction. To test this, we constructed simulations to explore how the transposition probability and insertion preference of a TE family influence the evolution of mean TE copy number and host population size, allowing for extinction. We find that the effect of a TE family's insertion preference depends on a host's ability to regulate this TE family. Without host repression, a neutral insertion preference increases the frequency of and decreases the time to population extinction. With host repression, a preference for neutral insertions minimizes the cumulative deleterious load, increases population fitness, and, ultimately, avoids triggering an extinction vortex.
Subjects
DNA Transposable Elements; Evolution, Molecular; Humans
ABSTRACT
Protein translation is tightly and precisely controlled by multiple mechanisms including upstream open reading frames (uORFs), but the origins of uORFs and their role in maize are largely unexplored. In this study, an active transposition event was identified during the propagation of maize inbred line B73. The transposon, which was named BTA for 'B73 active transposable element hAT', creates a novel dosage-dependent hypomorphic allele of the hexose transporter gene ZmSWEET4c through insertion within the coding sequence in the first exon, and results in reduced kernel size. The BTA insertion does not affect transcript abundance but reduces protein abundance of ZmSWEET4c, probably through the introduction of a uORF. Furthermore, the introduction of BTA sequence in the exon of other genes can regulate translation efficiency without affecting their mRNA levels. A transposon capture assay revealed 79 novel insertions for BTA and BTA-like elements. These insertion sites have typical euchromatin features, including low levels of DNA methylation and high levels of H3K27ac. A putative autonomous element that mobilizes BTA and BTA-like elements was identified. Together, our results suggest a transposon-based origin of uORFs and document a new role for transposable elements to influence protein abundance and phenotypic diversity by affecting the translation rate.
Subjects
Protein Biosynthesis; Alleles; Base Sequence; RNA, Messenger/genetics; Open Reading Frames/genetics
ABSTRACT
cis-Regulatory elements encode the genomic blueprints that ensure the proper spatiotemporal patterning of gene expression necessary for appropriate development and responses to the environment. Accumulating evidence implicates changes to gene expression as a major source of phenotypic novelty in eukaryotes, including acute phenotypes such as disease and cancer in mammals. Moreover, genetic and epigenetic variation affecting cis-regulatory sequences over longer evolutionary timescales has become a recurring theme in studies of morphological divergence and local adaptation. Here, we discuss the functions of and methods used to identify various classes of cis-regulatory elements, as well as their role in plant development and response to the environment. We highlight opportunities to exploit cis-regulatory variants underlying plant development and environmental responses for crop improvement efforts. Although a comprehensive understanding of cis-regulatory mechanisms in plants has lagged behind that in animals, we showcase several breakthrough findings that have profoundly influenced plant biology and shaped the overall understanding of transcriptional regulation in eukaryotes.
Subjects
Gene Expression Regulation; Regulatory Sequences, Nucleic Acid; Animals; Regulatory Sequences, Nucleic Acid/genetics; Genomics; Genome; Plant Development/genetics; Plants/genetics; Plants/metabolism; Evolution, Molecular; Mammals/genetics
ABSTRACT
BACKGROUND: Many plant species exhibit genetic variation for coping with environmental stress. However, there are still limited approaches to effectively uncover the genomic region that regulates distinct responsive patterns of the gene across multiple varieties within the same species under abiotic stress. RESULTS: By analyzing the transcriptomes of more than 100 maize inbreds, we reveal many cis- and trans-acting eQTLs that influence the expression response to heat stress. The cis-acting eQTLs in response to heat stress are identified in genes with differential responses to heat stress between genotypes as well as genes that are only expressed under heat stress. The cis-acting variants for heat stress-responsive expression likely result from distinct promoter activities, and the differential heat responses of the alleles are confirmed for selected genes using transient expression assays. Global footprinting of transcription factor binding is performed in control and heat stress conditions to document regions with heat-enriched transcription factor binding occupancies. CONCLUSIONS: Footprints enriched near proximal regions of characterized heat-responsive genes in a large association panel can be utilized for prioritizing functional genomic regions that regulate genotype-specific responses under heat stress.
Subjects
Gene Expression Regulation, Plant; Zea mays; Zea mays/genetics; Heat-Shock Response/genetics; Stress, Physiological/genetics; Genomics; Transcription Factors/genetics
ABSTRACT
Accessible chromatin regions are critical components of gene regulation but modeling them directly from sequence remains challenging, especially within plants, whose mechanisms of chromatin remodeling are less understood than in animals. We trained an existing deep-learning architecture, DanQ, on data from 12 angiosperm species to predict the chromatin accessibility in leaf of sequence windows within and across species. We also trained DanQ on DNA methylation data from 10 angiosperms because unmethylated regions have been shown to overlap significantly with ACRs in some plants. The across-species models have comparable or even superior performance to a model trained within species, suggesting strong conservation of chromatin mechanisms across angiosperms. Testing a maize (Zea mays L.) held-out model on a multi-tissue chromatin accessibility panel revealed our models are best at predicting constitutively accessible chromatin regions, with diminishing performance as cell-type specificity increases. Using a combination of interpretation methods, we ranked JASPAR motifs by their importance to each model and saw that the TCP and AP2/ERF transcription factor (TF) families consistently ranked highly. We embedded the top three JASPAR motifs for each model at all possible positions on both strands in our sequence window and observed position- and strand-specific patterns in their importance to the model. With our publicly available across-species 'a2z' model it is now feasible to predict the chromatin accessibility and methylation landscape of any angiosperm genome.
Subjects
Chromatin; Magnoliopsida; Animals; Genome; Magnoliopsida/genetics; Neural Networks, Computer; Transcription Factors/genetics; Zea mays/genetics
ABSTRACT
Demethylation of transposons can activate the expression of nearby genes and cause imprinted gene expression in the endosperm; this demethylation is hypothesized to lead to expression of transposon small interfering RNAs (siRNAs) that reinforce silencing in the next generation through transfer either into egg or embryo. Here we describe maize (Zea mays) maternal derepression of r1 (mdr1), which encodes a DNA glycosylase with homology to Arabidopsis thaliana DEMETER and which is partially responsible for demethylation of thousands of regions in endosperm. Instead of promoting siRNA expression in endosperm, MDR1 activity inhibits it. Methylation of most repetitive DNA elements in endosperm is not significantly affected by MDR1, with an exception of Helitrons. While maternally-expressed imprinted genes preferentially overlap with MDR1 demethylated regions, the majority of genes that overlap demethylated regions are not imprinted. Double mutant megagametophytes lacking both MDR1 and its close homolog DNG102 result in early seed failure, and double mutant microgametophytes fail pre-fertilization. These data establish DNA demethylation by glycosylases as essential in maize endosperm and pollen and suggest that neither transposon repression nor genomic imprinting is its main function in endosperm.
Subjects
Arabidopsis; DNA Glycosylases; Arabidopsis/genetics; DNA/metabolism; DNA Glycosylases/genetics; DNA Glycosylases/metabolism; DNA Methylation/genetics; Endosperm/genetics; Endosperm/metabolism; Gene Expression Regulation, Plant/genetics; Genomic Imprinting/genetics; RNA, Small Interfering/genetics; Zea mays/genetics; Zea mays/metabolism
ABSTRACT
CRISPR-Cas9-mediated genome editing has been widely adopted for basic and applied biological research in eukaryotic systems. While many studies consider DNA sequences of CRISPR target sites as the primary determinant for CRISPR mutagenesis efficiency and mutation profiles, increasing evidence reveals the substantial role of chromatin context. Nonetheless, most prior studies are limited by the lack of sufficient epigenetic resources and/or by only transiently expressing CRISPR-Cas9 in a short time window. In this study, we leveraged the wealth of high-resolution epigenomic resources in Arabidopsis (Arabidopsis thaliana) to address the impact of chromatin features on CRISPR-Cas9 mutagenesis using stable transgenic plants. Our results indicated that DNA methylation and chromatin features could lead to substantial variations in mutagenesis efficiency by up to 250-fold. Low mutagenesis efficiencies were mostly associated with repressive heterochromatic features. This repressive effect appeared to persist through cell divisions but could be alleviated through substantial reduction of DNA methylation at CRISPR target sites. Moreover, specific chromatin features, such as H3K4me1, H3.3, and H3.1, appear to be associated with significant variation in CRISPR-Cas9 mutation profiles mediated by the non-homologous end joining repair pathway. Our findings provide strong evidence that specific chromatin features could have substantial and lasting impacts on both CRISPR-Cas9 mutagenesis efficiency and DNA double-strand break repair outcomes.
Subjects
Arabidopsis; CRISPR-Cas Systems; Arabidopsis/genetics; CRISPR-Cas Systems/genetics; Chromatin/genetics; Epigenomics; Gene Editing/methods
ABSTRACT
The DOMAINS REARRANGED METHYLTRANSFERASEs (DRMs) are crucial for RNA-directed DNA methylation (RdDM) in plant species. Setaria viridis is a model monocot species with a relatively compact genome that has limited transposable element (TE) content. CRISPR-based genome editing approaches were used to create loss-of-function alleles for the two putative functional DRM genes in S. viridis to probe the role of RdDM. Double mutant (drm1ab) plants exhibit some morphological abnormalities but are fully viable. Whole-genome methylation profiling provided evidence for the widespread loss of methylation in CHH sequence contexts, particularly in regions with high CHH methylation in wild-type plants. Evidence was also found for the locus-specific loss of CG and CHG methylation, even in some regions that lack CHH methylation. Transcriptome profiling identified genes with altered expression in the drm1ab mutants. However, the majority of genes with high levels of CHH methylation directly surrounding the transcription start site or in nearby promoter regions in wild-type plants do not have altered expression in the drm1ab mutant, even when this methylation is lost, suggesting limited regulation of gene expression by RdDM. Detailed analysis of the expression of TEs identified several transposons that are transcriptionally activated in drm1ab mutants. These transposons are likely to require active RdDM for the maintenance of transcriptional repression.
Subjects
Setaria Plant; DNA Methylation/genetics; Gene Expression Regulation, Plant/genetics; Methyltransferases/genetics; Setaria Plant/genetics; Transcriptome
ABSTRACT
BACKGROUND: DNA demethylation occurs in many species and is involved in diverse biological processes. However, the occurrence and role of DNA demethylation in maize remain unknown. RESULTS: We analyze loss-of-function mutants of two major genes encoding DNA demethylases. No significant change in DNA methylation has been detected in these mutants. However, we detect increased DNA methylation levels in the mutants around genes and some transposons. The increase in DNA methylation is accompanied by alteration in gene expression, with a tendency to show downregulation, especially for the genes that are preferentially expressed in endosperm. Imprinted expression of both maternally and paternally expressed genes changes in F1 hybrid with the mutant as female and the wild-type as male parental line, but not in the reciprocal hybrid. This alteration in gene expression is accompanied by allele-specific DNA methylation differences, suggesting that removal of DNA methylation of the maternal allele is required for the proper expression of these imprinted genes. Finally, we demonstrate that hypermethylation in the double mutant is associated with reduced binding of transcription factor to its target, and altered gene expression. CONCLUSIONS: Our results suggest that active removal of DNA methylation is important for transcription factor binding and proper gene expression in maize endosperm.
Subjects
Endosperm; Zea mays; Alleles; DNA Demethylation; DNA Methylation; Endosperm/genetics; Endosperm/metabolism; Gene Expression; Gene Expression Regulation, Plant; Genomic Imprinting; Transcription Factors/genetics; Transcription Factors/metabolism; Zea mays/genetics; Zea mays/metabolism
ABSTRACT
Changes in gene expression are important for responses to abiotic stress. Transcriptome profiling of heat- or cold-stressed maize genotypes identifies many changes in transcript abundance. We used comparisons of expression responses in multiple genotypes to identify alleles with variable responses to heat or cold stress and to distinguish examples of cis- or trans-regulatory variation for stress-responsive expression changes. We used motifs enriched near the transcription start sites (TSSs) for thermal stress-responsive genes to develop predictive models of gene expression responses. Prediction accuracies can be improved by focusing only on motifs within unmethylated regions near the TSS and vary for genes with different dynamic responses to stress. Models trained on expression responses in a single genotype and promoter sequences provided lower performance when applied to other genotypes but this could be improved by using models trained on data from all three genotypes tested. The analysis of genes with cis-regulatory variation provides evidence for structural variants that result in presence/absence of transcription factor binding sites in creating variable responses. This study provides insights into cis-regulatory motifs for heat- and cold-responsive gene expression and defines a framework for developing models to predict expression responses across multiple genotypes.
Subjects
Cold-Shock Response/genetics; Gene Expression Regulation, Plant/physiology; Genes, Plant; Heat-Shock Response/genetics; Transcriptome; Zea mays/physiology; Gene Expression Profiling; Zea mays/genetics
ABSTRACT
Accessible chromatin and unmethylated DNA are associated with many genes and cis-regulatory elements. Attempts to understand natural variation for accessible chromatin regions (ACRs) and unmethylated regions (UMRs) often rely upon alignments to a single reference genome. This limits the ability to assess regions that are absent in the reference genome assembly and monitor how nearby structural variants influence variation in chromatin state. In this study, de novo genome assemblies for four maize inbreds (B73, Mo17, Oh43, and W22) are utilized to assess chromatin accessibility and DNA methylation patterns in a pan-genome context. A more complete set of UMRs and ACRs can be identified when chromatin data are aligned to the matched genome rather than a single reference genome. While there are UMRs and ACRs present within genomic regions that are not shared between genotypes, these features are 6- to 12-fold enriched within regions shared between genomes. Characterization of UMRs present within shared genomic regions reveals that most UMRs maintain the unmethylated state in other genotypes, with only ~5% being polymorphic between genotypes. However, the majority (71%) of UMRs that are shared between genotypes only exhibit partial overlaps, suggesting that the boundaries between methylated and unmethylated DNA are dynamic. This instability is not solely due to sequence variation, as these partially overlapping UMRs are frequently found within genomic regions that lack sequence variation. The ability to compare chromatin properties among individuals with structural variation enables pan-epigenome analyses to study the sources of variation for accessible chromatin and unmethylated DNA.
Subjects
DNA Methylation; Zea mays; Chromatin/genetics; Gene Expression Regulation, Plant; Genome, Plant; Humans; Zea mays/genetics
ABSTRACT
The use of hybrids is widespread in agriculture, yet the molecular basis for hybrid vigor (heterosis) remains obscure. To identify molecular components that may contribute to trait heterosis, we analyzed paired proteomic and transcriptomic data from seedling leaf and mature leaf blade tissues of maize hybrids and their inbred parents. Nuclear- and plastid-encoded subunits of complexes required for protein synthesis in the chloroplast and for the light reactions of photosynthesis were expressed above midparent and high-parent levels, respectively. Consistent with previous reports in Arabidopsis, ethylene biosynthetic enzymes were expressed below midparent levels in the hybrids, suggesting a conserved mechanism for heterosis between monocots and dicots. The ethylene biosynthesis mutant, acs2/acs6, largely phenocopied the hybrid proteome, indicating that a reduction in ethylene biosynthesis may mediate the differences between inbreds and their hybrids. To rank the relevance of expression differences to trait heterosis, we compared seedling leaf protein levels to the adult plant height of 15 hybrids. Hybrid/midparent expression ratios were most positively correlated with hybrid/midparent plant height ratios for the chloroplast ribosomal proteins. Our results show that increased expression of chloroplast ribosomal proteins in hybrid seedling leaves is mediated by reduced expression of ethylene biosynthetic enzymes and that the degree of their overexpression in seedlings can quantitatively predict adult trait heterosis.
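The abstract above correlates hybrid/midparent expression ratios with hybrid/midparent plant height ratios. As a rough illustration of that arithmetic, the sketch below computes a hybrid/midparent ratio and a Pearson correlation. The function names and all numbers are hypothetical, and the abstract does not state which correlation statistic the authors used.

```python
from math import sqrt


def midparent_ratio(hybrid, parent_a, parent_b):
    """Ratio of the hybrid value to the mean of its two inbred parents."""
    midparent = (parent_a + parent_b) / 2
    if midparent == 0:
        raise ValueError("midparent value is zero; ratio undefined")
    return hybrid / midparent


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical toy data: (hybrid, parent A, parent B) per hybrid
protein = [(12.0, 8.0, 12.0), (15.0, 10.0, 10.0), (9.0, 9.0, 9.0)]
height = [(250.0, 200.0, 220.0), (280.0, 190.0, 210.0), (205.0, 200.0, 200.0)]

expr_ratios = [midparent_ratio(*row) for row in protein]
height_ratios = [midparent_ratio(*row) for row in height]
print(pearson(expr_ratios, height_ratios))
```

A ratio above 1 means the hybrid exceeds the midparent value; exceeding the better parent (high-parent heterosis) would need a separate comparison against the larger of the two parental values.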
Subjects
Chloroplast Proteins/metabolism; Hybrid Vigor/genetics; Hybrid Vigor/physiology; Plastids/metabolism; Ribosomal Proteins/genetics; Ribosomal Proteins/metabolism; Arabidopsis/genetics; Chloroplast Proteins/genetics; Ethylenes/metabolism; Gene Expression Profiling; Gene Expression Regulation, Plant; Photosynthesis; Plant Leaves/metabolism; Plastids/genetics; Proteome; Proteomics; Seedlings/metabolism; Transcriptome; Zea mays/genetics
ABSTRACT
Transposable elements (TEs) constitute the majority of flowering plant DNA, reflecting their tremendous success in subverting, avoiding, and surviving the defenses of their host genomes to ensure their selfish replication. More than 85% of the sequence of the maize genome can be ascribed to past transposition, providing a major contribution to the structure of the genome. Evidence from individual loci has informed our understanding of how transposition has shaped the genome, and a number of individual TE insertions have been causally linked to dramatic phenotypic changes. Genome-wide analyses in maize and other taxa have frequently represented TEs as a relatively homogeneous class of fragmentary relics of past transposition, obscuring their evolutionary history and interaction with their host genome. Using an updated annotation of structurally intact TEs in the maize reference genome, we investigate the family-level dynamics of TEs in maize. Integrating a variety of data, from descriptors of individual TEs like coding capacity, expression, and methylation, as well as similar features of the sequence they inserted into, we model the relationship between attributes of the genomic environment and the survival of TE copies and families. In contrast to the wholesale relegation of all TEs to a single category of junk DNA, these differences reveal a diversity of survival strategies of TE families. Together these generate a rich ecology of the genome, with each TE family representing the evolution of a distinct ecological niche. We conclude that while the impact of transposition is highly family- and context-dependent, a family-level understanding of the ecology of TEs in the genome can refine our ability to predict the role of TEs in generating genetic and phenotypic diversity.