ABSTRACT
Mechanical properties of DNA have been implicated in many of its biological functions. Recently, a new high-throughput method called loop-seq, which measures the intrinsic bendability of DNA fragments, has been developed. Using loop-seq data, we created a deep learning model to explore the biological significance of local DNA flexibility in a range of species from different kingdoms. Consistently, we observed a characteristic and largely dinucleotide-composition-driven change of local flexibility near transcription start sites. In the presence of a TATA-box, a pronounced peak of high flexibility can be observed. Furthermore, depending on the transcription factor investigated, flanking-sequence-dependent DNA flexibility was identified as a potential factor influencing DNA binding. Depending on species and taxon, actual genomic sequences showed either increased or decreased flexibility relative to randomized genomic sequences. Furthermore, in Arabidopsis thaliana, mutation rates, both de novo and fixed, were found to be associated with relatively rigid sequence regions. Our study presents a range of significant correlations between characteristic DNA mechanical properties and genomic features, whose detailed molecular relevance awaits further theoretical and experimental exploration.
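As a rough illustration of the two sequence-level ideas mentioned in this abstract (dinucleotide composition and comparison against randomized sequences), the following minimal Python sketch computes overlapping dinucleotide frequencies and produces a simple shuffled control. It is not the authors' deep learning model or their randomization scheme; the function names `dinucleotide_frequencies` and `shuffled` are hypothetical, and the shuffle shown preserves only mononucleotide composition.

```python
from collections import Counter
import random


def dinucleotide_frequencies(seq):
    """Return relative frequencies of overlapping dinucleotides in seq.

    Hypothetical helper for illustration; pairs containing characters
    other than A, C, G, T (e.g. N) are skipped.
    """
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(p for p in pairs if set(p) <= set("ACGT"))
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()} if total else {}


def shuffled(seq, seed=0):
    """Return a randomized control with the same base composition.

    Assumption: a plain mononucleotide shuffle. It does NOT preserve
    dinucleotide composition, which a stricter control would require.
    """
    rng = random.Random(seed)
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


if __name__ == "__main__":
    example = "TATAAAGGCGCGTATATA"  # made-up sequence, not real data
    print(dinucleotide_frequencies(example))
    print(dinucleotide_frequencies(shuffled(example)))
```

Comparing the two printed dictionaries shows how the dinucleotide profile of a sequence can differ from that of a composition-matched random control, which is the kind of contrast the abstract refers to.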
ABSTRACT
Acclimation and adaptation of metabolism to a changing environment are key processes for plant survival and reproductive success. In the present study, 241 natural accessions of Arabidopsis (Arabidopsis thaliana) were grown under two different temperature regimes, 16 °C and 6 °C, and growth parameters were recorded, together with metabolite profiles, to investigate the natural genome × environment effects on metabolome variation. The plasticity of metabolism, which was captured by metabolic distance measures, varied considerably between accessions. Both relative growth rates and metabolic distances were predictable from the underlying natural genetic variation of accessions. Applying machine learning methods, we tested climatic variables of the original growth habitats for their power to predict natural metabolic variation among accessions. Specifically, we found habitat temperature during the first quarter of the year to be the best predictor of the plasticity of primary metabolism, indicating habitat temperature as the causal driver of evolutionary cold adaptation processes. Analyses of epigenome- and genome-wide associations revealed accession-specific differential DNA-methylation levels as potentially linked to the metabolome and identified FUMARASE2 as strongly associated with cold adaptation in Arabidopsis accessions. These findings were supported by calculations of the biochemical Jacobian matrix based on variance and covariance of metabolomics data, which revealed that growth under low temperatures most substantially affects the accession-specific plasticity of fumarate and sugar metabolism. Our findings indicate that the plasticity of metabolic regulation is predictable from the genome and epigenome and driven evolutionarily by Arabidopsis growth habitats.
Subject(s)
Arabidopsis Proteins, Arabidopsis, Arabidopsis/physiology, Cold Temperature, Temperature, Climate, Metabolome/genetics, Arabidopsis Proteins/genetics
ABSTRACT
BACKGROUND: Intron-mediated enhancement (IME) is the potential of introns to enhance the expression of their respective genes. This essential function of introns has been observed in a wide range of species, including fungi, plants, and animals. However, the mechanisms underlying the enhancement are as yet poorly understood. The goal of this study was to identify potential IME-related sequence motifs and genomic features in first introns of genes in Arabidopsis thaliana. RESULTS: Based on the rationale that functional sequence motifs are evolutionarily conserved, we exploited the deep sequencing information available for Arabidopsis thaliana, covering more than one thousand Arabidopsis accessions, and identified 81 candidate hexamer motifs with increased conservation across all accessions that also exhibit positional occurrence preferences. Of those, 71 were found to be associated with increased correlation of gene expression among the genes harboring them, suggesting a cis-regulatory role. Filtering further for effect on gene expression correlation yielded a set of 16 hexamer motifs, corresponding to five consensus motifs. While all five motifs represent new motif definitions, two are similar to the two previously reported IME motifs, whereas three are altogether novel. Both consensus and hexamer motifs were found to be associated with higher expression of alleles harboring them compared to alleles containing mutated motif variants found in naturally occurring Arabidopsis accessions. To identify additional IME-related genomic features, Random Forest models were trained to classify gene expression level based on an array of sequence-related features. The results indicate that introns contain information about gene expression level and suggest that sequence-compositional features are the most informative, while position-related features, previously thought to be of central importance, were less relevant than expected.
CONCLUSIONS: Exploiting deep sequencing and broad gene expression information on a genome-wide scale, this study confirmed the regulatory role of first introns, characterized their intra-species conservation, and identified a set of novel sequence motifs located in first introns of genes in the genome of the plant Arabidopsis thaliana that may play a role in inducing high and correlated gene expression of the genes harboring them.
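To make the notion of hexamer motifs with positional occurrence preferences more concrete, here is a minimal Python sketch that records, for every hexamer in a set of intron sequences, the relative position (0 = intron start, 1 = intron end) of each occurrence. It is an illustration under stated assumptions only, not the study's pipeline: the function name `kmer_positions` is hypothetical, the input sequences are made up, and the conservation and expression-correlation filtering described in the abstract is not reproduced.

```python
from collections import defaultdict


def kmer_positions(seqs, k=6):
    """Map each k-mer to the relative positions of its occurrences.

    Hypothetical helper for illustration. Positions are scaled to
    [0, 1] within each sequence; k-mers containing characters other
    than A, C, G, T are skipped.
    """
    positions = defaultdict(list)
    for seq in seqs:
        seq = seq.upper()
        n = len(seq) - k + 1  # number of k-mer start positions
        for i in range(n):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                rel = i / (n - 1) if n > 1 else 0.0
                positions[kmer].append(rel)
    return positions


if __name__ == "__main__":
    introns = ["GTAAGTTTCGATCTGCAG", "GTATGTCGATCTTTGCAG"]  # made-up data
    for kmer, rel in sorted(kmer_positions(introns).items()):
        if len(rel) > 1:  # hexamers seen more than once
            print(kmer, [round(r, 2) for r in rel])
```

A hexamer whose relative positions cluster near one end of the intron across many genes would be a candidate for a positional preference; assessing that properly would need a statistical test against a background, which is omitted here.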