ABSTRACT
SUMMARY: Understanding the effects of genetic variants is crucial for accurately predicting traits and functional outcomes. Recent approaches have utilized artificial intelligence and protein language models to score all possible missense variant effects at the proteome level for a single genome, but a reliable tool is needed to explore these effects at the pan-genome level. To address this gap, we introduce a new tool called PanEffect. We implemented PanEffect at MaizeGDB to enable a comprehensive examination of the potential effects of coding variants across 50 maize genomes. The tool allows users to visualize over 550 million possible amino acid substitutions in the B73 maize reference genome and to observe the effects of the 2.3 million natural variations in the maize pan-genome. Each variant effect score, calculated from the Evolutionary Scale Modeling (ESM) protein language model, shows the log-likelihood ratio difference between B73 and all variants in the pan-genome. These scores are shown using heatmaps spanning benign outcomes to potential functional consequences. In addition, PanEffect displays secondary structures and functional domains along with the variant effects, offering additional functional and structural context. Using PanEffect, researchers now have a platform to explore protein variants and identify genetic targets for crop enhancement. AVAILABILITY AND IMPLEMENTATION: The PanEffect code is freely available on GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/PanEffect). A maize implementation of PanEffect and underlying datasets are available at MaizeGDB (https://www.maizegdb.org/effect/maize/).
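The log-likelihood ratio scoring described above can be illustrated with a minimal sketch. It assumes a protein language model has already produced a probability for each amino acid at one position. The probabilities and the function name `llr_score` are hypothetical and are not taken from the PanEffect code or the ESM API.

```python
import math


def llr_score(probs, wild_type, variant):
    """Log-likelihood ratio of a variant residue versus the wild-type residue.

    probs maps amino acid letters to model probabilities at one position.
    Negative scores mean the model finds the variant less likely than wild type.
    """
    return math.log(probs[variant]) - math.log(probs[wild_type])


# Hypothetical model probabilities at a single position (illustrative only).
position_probs = {"A": 0.60, "G": 0.25, "V": 0.10, "P": 0.05}

# Score every listed substitution against the wild-type residue "A".
scores = {aa: llr_score(position_probs, "A", aa) for aa in position_probs}

for aa, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"A -> {aa}: {score:.3f}")
```

In this sketch the wild-type residue scores zero by construction, and the least probable substitution receives the most negative score, which corresponds to the end of the heatmap scale flagged as a potential functional consequence.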
Subjects
Genetic Databases, Zea mays, Zea mays/genetics, Artificial Intelligence, Plant Genome, Phenotype, Software
ABSTRACT
BACKGROUND: Environmental stress factors, such as biotic and abiotic stress, are becoming more common due to climate variability, significantly affecting global maize yield. Transcriptome profiling studies provide insights into the molecular mechanisms underlying stress response in maize, though the functions of many genes are still unknown. To enhance the functional annotation of maize-specific genes, MaizeGDB has outlined a data-driven approach with an emphasis on identifying genes and traits related to biotic and abiotic stress. RESULTS: We mapped high-quality RNA-Seq expression reads from 24 different publicly available datasets (17 abiotic and seven biotic studies) generated from the B73 cultivar to the recent version of the reference genome B73 (B73v5) and deduced stress-related functional annotation of maize gene models. We conducted a robust meta-analysis of the transcriptome profiles from the datasets to identify maize loci responsive to stress, identifying 3,230 differentially expressed genes (DEGs): 2,555 DEGs regulated in response to abiotic stress, 408 DEGs regulated during biotic stress, and 267 common DEGs (co-DEGs) that overlap between abiotic and biotic stress. We discovered hub genes from network analyses, and among the hub genes of the co-DEGs we identified a putative NAC domain transcription factor superfamily protein (Zm00001eb369060) IDP275, which previously responded to herbivory and drought stress. IDP275 was up-regulated in our analysis in response to eight different abiotic and four different biotic stresses. A gene set enrichment and pathway analysis of hub genes of the co-DEGs revealed hormone-mediated signaling processes and phenylpropanoid biosynthesis pathways, respectively. Using phylostratigraphic analysis, we also demonstrated how abiotic and biotic stress genes differentially evolve to adapt to changing environments. 
CONCLUSIONS: These results will help facilitate the functional annotation of multiple stress response gene models and annotation in maize. Data can be accessed and downloaded at the Maize Genetics and Genomics Database (MaizeGDB).
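The reported DEG counts (2,555 + 408 + 267 = 3,230) are consistent with reading the three groups as disjoint: abiotic-only, biotic-only, and shared. A minimal sketch of that partition follows. The gene identifiers and the function name `partition_degs` are hypothetical and do not come from the study's pipeline.

```python
def partition_degs(abiotic, biotic):
    """Split two DEG lists into abiotic-only, biotic-only, and shared (co-DEG) sets."""
    abiotic, biotic = set(abiotic), set(biotic)
    common = abiotic & biotic
    return {
        "abiotic_only": abiotic - common,
        "biotic_only": biotic - common,
        "co_degs": common,
    }


# Hypothetical gene identifiers, for illustration only.
groups = partition_degs(
    abiotic=["geneA", "geneB", "geneC"],
    biotic=["geneC", "geneD"],
)

# The counts reported in the abstract add up to the stated total.
reported_total = 2555 + 408 + 267
print(groups, reported_total)
```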
Subjects
Molecular Sequence Annotation, Physiological Stress, Transcriptome, Zea mays, Zea mays/genetics, Physiological Stress/genetics, Plant Gene Expression Regulation, Gene Expression Profiling, Plant Genes
ABSTRACT
BACKGROUND: The genus Fusarium poses significant threats to food security and safety worldwide because numerous species of the fungus cause destructive diseases and/or mycotoxin contamination in crops. The adverse effects of climate change are exacerbating some existing threats and causing new problems. These challenges highlight the need for innovative solutions, including the development of advanced tools to identify targets for control strategies. DESCRIPTION: In response to these challenges, we developed the Fusarium Protein Toolkit (FPT), a web-based tool that allows users to interrogate the structural and variant landscape within the Fusarium pan-genome. The tool displays both AlphaFold and ESMFold-generated protein structure models from six Fusarium species. The structures are accessible through a user-friendly web portal and facilitate comparative analysis, functional annotation inference, and identification of related protein structures. Using a protein language model, FPT predicts the impact of over 270 million coding variants in two of the most agriculturally important species, Fusarium graminearum and F. verticillioides. To facilitate the assessment of naturally occurring genetic variation, FPT provides variant effect scores for proteins in a Fusarium pan-genome based on 22 diverse species. The scores indicate potential functional consequences of amino acid substitutions and are displayed as intuitive heatmaps using the PanEffect framework. CONCLUSION: FPT fills a knowledge gap by providing previously unavailable tools to assess structural and missense variation in proteins produced by Fusarium. FPT has the potential to deepen our understanding of pathogenic mechanisms in Fusarium, and aid the identification of genetic targets for control strategies that reduce crop diseases and mycotoxin contamination. Such targets are vital to solving the agricultural problems incited by Fusarium, particularly evolving threats resulting from climate change.
Thus, FPT has the potential to contribute to improving food security and safety worldwide.
Subjects
Fungal Proteins, Fusarium, Internet, Fusarium/genetics, Fusarium/metabolism, Fusarium/classification, Fungal Proteins/genetics, Fungal Proteins/chemistry, Fungal Proteins/metabolism, Fungal Genome/genetics, Genetic Variation, Molecular Models, Software, Protein Conformation
ABSTRACT
Leaf morphogenesis involves cell division, expansion, and differentiation in the developing leaf, which take place at different rates and at different positions along the medio-lateral and proximal-distal leaf axes. The gene expression changes that control cell fate along these axes remain elusive due to difficulties in precisely isolating tissues. Here, we combined rigorous early leaf characterization, laser capture microdissection, and transcriptomic sequencing to ask how gene expression patterns regulate early leaf morphogenesis in wild-type tomato (Solanum lycopersicum) and the leaf morphogenesis mutant trifoliate. We observed transcriptional regulation of cell differentiation along the proximal-distal axis and identified molecular signatures delineating the classically defined marginal meristem/blastozone region during early leaf development. We describe the role of endoreduplication during leaf development, when and where leaf cells first achieve photosynthetic competency, and the regulation of auxin transport and signaling along the leaf axes. Knockout mutants of BLADE-ON-PETIOLE2 exhibited ectopic shoot apical meristem formation on leaves, highlighting the role of this gene in regulating margin tissue identity. We mapped gene expression signatures in specific leaf domains and evaluated the role of each domain in conferring indeterminacy and permitting blade outgrowth. Finally, we generated a global gene expression atlas of the early developing compound leaf.
Subjects
Plant Leaves/metabolism, Plant Proteins/metabolism, Genetically Modified Plants/metabolism, Solanum lycopersicum/metabolism, Cell Differentiation/genetics, Cell Differentiation/physiology, Plant Gene Expression Regulation, Solanum lycopersicum/genetics, Plant Leaves/genetics, Plant Proteins/genetics, Genetically Modified Plants/genetics
ABSTRACT
MOTIVATION: Over the last decade, RNA-Seq whole-genome sequencing has become a widely used method for measuring and understanding transcriptome-level changes in gene expression. Since RNA-Seq is relatively inexpensive, it can be used on multiple genomes to evaluate gene expression across many different conditions, tissues and cell types. Although many tools exist to map and compare RNA-Seq at the genomics level, few web-based tools are dedicated to making data generated for individual genomic analysis accessible and reusable at a gene-level scale for comparative analysis between genes, across different genomes and meta-analyses. RESULTS: To address this challenge, we revamped the comparative gene expression tool qTeller to take advantage of the growing number of public RNA-Seq datasets. qTeller allows users to evaluate gene expression data in a defined genomic interval and also perform two-gene comparisons across multiple user-chosen tissues. Though previously unpublished, qTeller has been cited extensively in the scientific literature, demonstrating its importance to researchers. Our new version of qTeller now supports multiple genomes for intergenomic comparisons, and includes capabilities for both mRNA and protein abundance datasets. Other new features include support for additional data formats, modernized interface and back-end database and an optimized framework for adoption by other organisms' databases. AVAILABILITY AND IMPLEMENTATION: The source code for qTeller is open-source and available through GitHub (https://github.com/Maize-Genetics-and-Genomics-Database/qTeller). A maize instance of qTeller is available at the Maize Genetics and Genomics database (MaizeGDB) (https://qteller.maizegdb.org/), where we have mapped over 200 unique datasets from GenBank across 27 maize genomes. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Genome, Genomics, Software, Nucleic Acid Databases, Zea mays/genetics, Gene Expression Profiling
ABSTRACT
BACKGROUND: Gene annotation in eukaryotes is a non-trivial task that requires meticulous analysis of accumulated transcript data. Challenges include transcriptionally active regions of the genome that contain overlapping genes, genes that produce numerous transcripts, transposable elements and numerous diverse sequence repeats. Currently available gene annotation software applications depend on pre-constructed full-length gene sequence assemblies which are not guaranteed to be error-free. The origins of these sequences are often uncertain, making it difficult to identify and rectify errors in them. This hinders the creation of an accurate and holistic representation of the transcriptomic landscape across multiple tissue types and experimental conditions. Therefore, to gauge the extent of diversity in gene structures, a comprehensive analysis of genome-wide expression data is imperative. RESULTS: We present FINDER, a fully automated computational tool that optimizes the entire process of annotating genes and transcript structures. Unlike current state-of-the-art pipelines, FINDER automates the RNA-Seq pre-processing step by working directly with raw sequence reads and optimizes gene prediction from BRAKER2 by supplementing these reads with associated proteins. The FINDER pipeline (1) reports transcripts and recognizes genes that are expressed under specific conditions, (2) generates all possible alternatively spliced transcripts from expressed RNA-Seq data, (3) analyzes read coverage patterns to modify existing transcript models and create new ones, and (4) scores genes as high- or low-confidence based on the available evidence across multiple datasets. We demonstrate the ability of FINDER to automatically annotate a diverse pool of genomes from eight species. CONCLUSIONS: FINDER takes a completely automated approach to annotate genes directly from raw expression data. 
It is capable of processing eukaryotic genomes of all sizes and requires no manual supervision, making it ideal for bench researchers with limited experience in handling computational tools.
Subjects
Eukaryota, Software, Eukaryota/genetics, Genome, Molecular Sequence Annotation, RNA-Seq, RNA Sequence Analysis
ABSTRACT
Research in the past decade has demonstrated that a single reference genome is not representative of a species' diversity. MaizeGDB introduces a pan-genomic approach to hosting genomic data, leveraging the large number of diverse maize genomes and their associated datasets to quickly and efficiently connect genomes, gene models, expression, epigenome, sequence variation, structural variation, transposable elements, and diversity data across genomes so that researchers can easily track the structural and functional differences of a locus and its orthologs across maize. We believe our framework is unique and provides a template for any genomic database poised to host large-scale pan-genomic data.
Subjects
Data Accuracy, Data Collection/methods, Databases as Topic, Plant Genome, Genomics, Zea mays/genetics, Genetic Variation
ABSTRACT
The transcriptional regulatory structure of plant genomes remains poorly defined relative to animals. It is unclear how many cis-regulatory elements exist, where these elements lie relative to promoters, and how these features are conserved across plant species. We employed the assay for transposase-accessible chromatin (ATAC-seq) in four plant species (Arabidopsis thaliana, Medicago truncatula, Solanum lycopersicum, and Oryza sativa) to delineate open chromatin regions and transcription factor (TF) binding sites across each genome. Despite 10-fold variation in intergenic space among species, the majority of open chromatin regions lie within 3 kb upstream of a transcription start site in all species. We find a common set of four TFs that appear to regulate conserved gene sets in the root tips of all four species, suggesting that TF-gene networks are generally conserved. Comparative ATAC-seq profiling of Arabidopsis root hair and non-hair cell types revealed extensive similarity as well as many cell-type-specific differences. Analyzing TF binding sites in differentially accessible regions identified a MYB-driven regulatory module unique to the hair cell, which appears to control both cell fate regulators and abiotic stress responses. Our analyses revealed common regulatory principles among species and shed light on the mechanisms producing cell-type-specific transcriptomes during development.
Subjects
Chromatin/metabolism, Plant Gene Expression Regulation, Gene Regulatory Networks, Plant Cells/metabolism, Plants/genetics, Arabidopsis/genetics, Conserved Sequence/genetics, Solanum lycopersicum/genetics, Medicago/genetics, Meristem/genetics, Oryza/genetics, Plant Epidermis/cytology, DNA Sequence Analysis, Species Specificity, Transcription Factors/metabolism, Transposases/metabolism
ABSTRACT
Since its 2015 update, MaizeGDB, the Maize Genetics and Genomics database, has expanded to support the sequenced genomes of many maize inbred lines in addition to the B73 reference genome assembly. Curation and development efforts have targeted high quality datasets and tools to support maize trait analysis, germplasm analysis, genetic studies, and breeding. MaizeGDB hosts a wide range of data including recent support of new data types including genome metadata, RNA-seq, proteomics, synteny, and large-scale diversity. To improve access and visualization of data types several new tools have been implemented to: access large-scale maize diversity data (SNPversity), download and compare gene expression data (qTeller), visualize pedigree data (Pedigree Viewer), link genes with phenotype images (MaizeDIG), and enable flexible user-specified queries to the MaizeGDB database (MaizeMine). MaizeGDB also continues to be the community hub for maize research, coordinating activities and providing technical support to the maize research community. Here we report the changes MaizeGDB has made within the last three years to keep pace with recent software and research advances, as well as the pan-genomic landscape that cheaper and better sequencing technologies have made possible. MaizeGDB is accessible online at https://www.maizegdb.org.
Subjects
Computational Biology/methods, Genetic Databases, Plant Genome/genetics, Genomics/methods, Zea mays/genetics, Plant Gene Expression Regulation, Genetic Variation, Information Storage and Retrieval/methods, Internet, Single Nucleotide Polymorphism, Proteomics/methods, User-Computer Interface, Zea mays/metabolism
ABSTRACT
BACKGROUND: Genome assemblies are foundational for understanding the biology of a species. They provide a physical framework for mapping additional sequences, thereby enabling characterization of, for example, genomic diversity and differences in gene expression across individuals and tissue types. Quality metrics for genome assemblies gauge both the completeness and contiguity of an assembly and help provide confidence in downstream biological insights. To compare quality across multiple assemblies, a set of common metrics is typically calculated and then compared to one or more gold standard reference genomes. While several tools exist for calculating individual metrics, applications providing comprehensive evaluations of multiple assembly features are, perhaps surprisingly, lacking. Here, we describe a new toolkit that integrates multiple metrics to characterize both assembly and gene annotation quality in a way that enables comparison across multiple assemblies and assembly types. RESULTS: Our application, named GenomeQC, is an easy-to-use and interactive web framework that integrates various quantitative measures to characterize genome assemblies and annotations. GenomeQC provides researchers with a comprehensive summary of these statistics and allows for benchmarking against gold standard reference assemblies. CONCLUSIONS: The GenomeQC web application is implemented in R/Shiny version 1.5.9 and Python 3.6 and is freely available at https://genomeqc.maizegdb.org/ under the GPL license. All source code and a containerized version of the GenomeQC pipeline are available in the GitHub repository https://github.com/HuffordLab/GenomeQC.
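As an illustration of a contiguity metric of the kind mentioned above, the following minimal sketch computes N50, a widely used assembly statistic. The abstract does not name specific metrics, so the choice of N50 is an assumption, and the function name and contig lengths are hypothetical rather than taken from the GenomeQC code.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length of the shortest contig in the smallest set of longest
    contigs that together cover at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical contig lengths (in kb) for two assemblies of equal total size.
fragmented = [80, 70, 50, 40, 30, 20]
contiguous = [200, 60, 30]

print(n50(fragmented), n50(contiguous))
```

Comparing the value for a new assembly with the value for a gold standard reference is one simple form of the benchmarking the abstract describes.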
Subjects
Genomics/methods, Chromosome Mapping, Computational Biology/methods, High-Throughput Nucleotide Sequencing, Humans, Molecular Sequence Annotation, DNA Sequence Analysis, Software
ABSTRACT
BACKGROUND: Maize experienced a whole-genome duplication event approximately 5 to 12 million years ago. Because this event occurred after speciation from sorghum, the pre-duplication subgenomes can be partially reconstructed by mapping syntenic regions to the sorghum chromosomes. During evolution, maize has had uneven gene loss between each ancient subgenome. Fractionation and divergence between these genomes continue today, constantly changing genetic make-up and phenotypes and influencing agronomic traits. RESULTS: Here we regenerate the subgenome reconstructions for the most recent maize reference genome assembly. Based on both expression and abundance data for homeologous gene pairs across multiple tissues, we observed functional divergence of genes across subgenomes. Although the genes in the larger maize subgenome are often expressing more highly than their homeologs in the smaller subgenome, we observed cases where homeolog expression dominance switches in different tissues. We demonstrate for the first time that protein abundances are higher in the larger subgenome, but they also show tissue-specific dominance, a pattern similar to RNA expression dominance. We also find that pollen expression is uniquely decoupled from protein abundance. CONCLUSION: Our study shows that the larger subgenome has a greater range of functional assignments and that there is a relative lack of overlap between the subgenomes in terms of gene functions than would be suggested by similar patterns of gene expression and protein abundance. Our study also revealed that some reactions are catalyzed uniquely by the larger and smaller subgenomes. The tissue-specific, nonequivalent expression-level dominance pattern observed here implies a change in regulatory control which favors differentiated selective pressure on the retained duplicates leading to eventual change in gene functions.
Subjects
Plant Gene Expression Regulation/genetics, Gene Expression/genetics, Zea mays/genetics, Chromosome Mapping/methods, Molecular Evolution, Gene Duplication, Gene Ontology, Plant Genes, Plant Genome, Phylogeny, Plant Proteins/biosynthesis, Plant Proteins/genetics, Pollen/genetics, Polyploidy
ABSTRACT
Maize has for many decades been both one of the most important crops worldwide and one of the primary genetic model organisms. More recently, maize breeding has been impacted by rapid technological advances in sequencing and genotyping technology, transformation including genome editing, doubled haploid technology, parallelled by progress in data sciences and the development of novel breeding approaches utilizing genomic information. Herein, we report on past, current and future developments relevant for maize breeding with regard to (1) genome analysis, (2) germplasm diversity characterization and utilization, (3) manipulation of genetic diversity by transformation and genome editing, (4) inbred line development and hybrid seed production, (5) understanding and prediction of hybrid performance, (6) breeding methodology and (7) synthesis of opportunities and challenges for future maize breeding.
Subjects
Plant Breeding/methods, Zea mays/genetics, Chromosome Mapping, Genetic Variation, Plant Genome, Genomics
ABSTRACT
Whole-genome duplications happen repeatedly in a typical flowering plant lineage. Following most ancient tetraploidies, the two subgenomes are distinguishable because one subgenome, the dominant subgenome, tends to have more genes than the other subgenome. Additionally, among retained pairs, the gene on the dominant subgenome tends to be expressed more than its recessive homeolog. Using comparative genomics, we show that genome dominance is heritable. The dominant subgenome of one postpolyploidy event remains dominant through a subsequent polyploidy event. We show that transposon-derived 24-nt RNAs target and cover the upstream region of retained genes preferentially when located on the recessive subgenome, and with little regard for a gene's level of expression. We hypothesize that small RNA (smRNA)-mediated silencing of transposons near genes causes position-effect down-regulation. Unlike 24-nt smRNA coverage, transposon coverage tracks gene expression, so not all transposons behave identically. We propose that successful ancient tetraploids begin as wide crosses between two lines, each evolved for different tradeoffs between transposon silencing and negative position effects on gene expression. We hypothesize that following a chaotic wide-cross/new tetraploid period, genes acquire their new expression balances based on differences in transposon coverage in the parents. We envision patches of silenceable transposon as quantitative cis-regulators of baseline transcription rate. Attractive solutions to heterosis and the C-value paradox are mentioned.
Subjects
Gene Regulatory Networks, Plant Genome, Polyploidy, DNA Transposable Elements, Plant RNA/genetics
ABSTRACT
Subgenome dominance is an important phenomenon observed in allopolyploids after whole genome duplication, in which one subgenome retains more genes as well as contributes more to the higher expressing gene copy of paralogous genes. To dissect the mechanism of subgenome dominance, we systematically investigated the relationships of gene expression, transposable element (TE) distribution and small RNA targeting, relating to the multicopy paralogous genes generated from whole genome triplication in Brassica rapa. The subgenome dominance was found to be regulated by a relatively stable factor established previously, then inherited by and shared among B. rapa varieties. In addition, we found a biased distribution of TEs between flanking regions of paralogous genes. Furthermore, the 24-nt small RNAs target TEs and are negatively correlated to the dominant expression of individual paralogous gene pairs. The biased distribution of TEs among subgenomes and the targeting of 24-nt small RNAs together produce the dominant expression phenomenon at a subgenome scale. Based on these findings, we propose a bucket hypothesis to illustrate subgenome dominance and hybrid vigor. Our findings and hypothesis are valuable for the evolutionary study of polyploids, and may shed light on studies of hybrid vigor, which is common to most species.
Subjects
Brassica rapa/genetics, DNA Transposable Elements, Genetic Epigenesis, Plant Gene Expression Regulation, Plant Genome
ABSTRACT
Certain types of gene families, such as those encoding most families of transcription factors, maintain their chromosomal syntenic positions throughout angiosperm evolutionary time. Other nonsyntenic gene families are prone to deletion, tandem duplication, and transposition. Here, we describe the chromosomal positional history of all genes in Arabidopsis thaliana throughout the rosid superorder. We introduce a public database where researchers can look up the positional history of their favorite A. thaliana gene or gene family. Finally, we show that specific gene families transposed at specific points in evolutionary time, particularly after whole-genome duplication events in the Brassicales, and suggest that genes in mobile gene families are under different selection pressure than syntenic genes.
Subjects
Arabidopsis/genetics, DNA Transposable Elements, Gene Frequency, Multigene Family, Phylogeny, Algorithms, Arabidopsis Proteins/genetics, Plant Chromosomes/genetics, Genetic Databases, Molecular Evolution, Gene Duplication, Plant Genes, Ploidies, Genetic Selection, Synteny, Time Factors, Transcription Factors/genetics, Untranslated Regions
ABSTRACT
Pan-genomes, encompassing the entirety of genetic sequences found in a collection of genomes within a clade, are more useful than single reference genomes for studying species diversity. This is especially true for a species like Zea mays, which has a particularly diverse and complex genome. Presenting pan-genome data, analyses, and visualization is challenging, especially for a diverse species, but more so when pan-genomic data is linked to extensive gene model and gene data, including classical gene information, markers, insertions, expression and proteomic data, and protein structures as is the case at MaizeGDB. Here, we describe MaizeGDB's expansion to include the genic subset of the Zea pan-genome in a pan-gene data center featuring the maize genomes hosted at MaizeGDB, and the outgroup teosinte Zea genomes from the Pan-Andropogoneae project. The new data center offers a variety of browsing and visualization tools, including sequence alignment visualization, gene trees and other tools, to explore pan-genes in Zea that were calculated by the pipeline Pandagma. Combined, these data will help maize researchers study the complexity and diversity of Zea and use the comparative functions to validate pan-gene relationships for a selected gene model.
Subjects
Genetic Databases, Plant Genome, Genomics, Zea mays, Zea mays/genetics, Genomics/methods, Phylogeny
ABSTRACT
The Maize Genetics and Genomics Database (MaizeGDB) is the community resource for maize researchers, offering a suite of tools, informatics resources, and curated data sets to support maize genetics, genomics, and breeding research. Here, we provide an overview of the key resources available at MaizeGDB, including maize genomes, comparative genomics, and pan-genomics tools. This review aims to familiarize users with the range of options available for maize research and highlights the importance of MaizeGDB as a central hub for the maize research community. By providing a detailed snapshot of the database's capabilities, we hope to enable researchers to make use of MaizeGDB's resources, ultimately assisting them to better study the evolution and diversity of maize.
ABSTRACT
Previous work in Arabidopsis showed that after an ancient tetraploidy event, genes were preferentially removed from one of the two homologs, a process known as fractionation. The mechanism of fractionation is unknown. We sought to determine whether such preferential, or biased, fractionation exists in maize and, if so, whether a specific mechanism could be implicated in this process. We studied the process of fractionation using two recently sequenced grass species: sorghum and maize. The maize lineage has experienced a tetraploidy since its divergence from sorghum approximately 12 million years ago, and fragments of many knocked-out genes retain enough sequence similarity to be easily identifiable. Using sorghum exons as the query sequence, we studied the fate of both orthologous genes in maize following the maize tetraploidy. We show that genes are predominantly lost, not relocated, and that single-gene loss by deletion is the rule. Based on comparisons with orthologous sorghum and rice genes, we also infer that the sequences present before the deletion events were flanked by short direct repeats, a signature of intra-chromosomal recombination. Evidence of this deletion mechanism is found 2.3 times more frequently on one of the maize homologs, consistent with earlier observations of biased fractionation. The over-fractionated homolog is also a greater than 3-fold better target for transposon removal, but does not have an observably higher synonymous base substitution rate, nor could we find differentially placed methylation domains. We conclude that fractionation is indeed biased in maize and that intra-chromosomal or possibly a similar illegitimate recombination is the primary mechanism by which fractionation occurs. The mechanism of intra-chromosomal recombination explains the observed bias in both gene and transposon loss in the maize lineage. The existence of fractionation bias demonstrates that the frequency of deletion is modulated. 
Among the evolutionary benefits of this deletion/fractionation mechanism is bulk DNA removal and the generation of novel combinations of regulatory sequences and coding regions.
Subjects
Gene Deletion, Plant Genes, Polyploidy, Zea mays/genetics, Genetic Translocation
ABSTRACT
Much of the eukaryotic genome is known to be mobile, largely due to the movement of transposons and other parasitic elements. Recent work in plants and Drosophila suggests that mobility is also a feature of many nontransposon genes and gene families. Indeed, analysis of the Arabidopsis genome suggested that as many as half of all genes had moved to unlinked positions since Arabidopsis diverged from papaya roughly 72 million years ago, and that these mobile genes tend to fall into distinct gene families. However, the mechanism by which single gene transposition occurred was not deduced. By comparing two closely related species, Arabidopsis thaliana and Arabidopsis lyrata, we sought to determine the nature of gene transposition in Arabidopsis. We found that certain categories of genes are much more likely to have transposed than others, and that many of these transposed genes are flanked by direct repeat sequence that was homologous to sequence within the orthologous target site in A. lyrata and which was predominantly genic in identity. We suggest that intrachromosomal recombination between tandemly duplicated sequences, and subsequent insertion of the circular product, is the predominant mechanism of gene transposition.
Subjects
Arabidopsis/genetics, DNA Transposable Elements, Plant Genes, Nucleic Acid Repetitive Sequences
ABSTRACT
Attentional problems are commonly reported as a feature of the behavioural profile in both Williams syndrome (WS) and Down's syndrome (DS). Recent studies have begun to investigate these impairments empirically, acknowledging the need for an approach that considers cross-syndrome comparisons and developmental changes across the different component functions of attention. The present study assessed children with WS and DS using a new preschool attention battery (ECAB: early childhood attention battery), designed to be suitable for mental age 3-6 years including groups with developmental disorders. The ECAB has the advantage of giving an individual profile of attentional abilities for each child, covering different components of attention. In relation to test norms for their mental age, both groups showed a profile of strengths and weaknesses in the attention domain. Both syndrome groups performed relatively well on tests of sustained attention and poorly on aspects of selective attention and attentional control (executive function). The DS group showed a specific strength in auditory sustained attention, whilst the WS group showed a particular deficit in visuo-spatial response control. There was also evidence for considerable differences in the developmental trajectory of these abilities across the two groups. The results provide evidence for syndrome-specific patterns of impairment, and distinct profiles of strengths and weaknesses that may be useful in understanding the nature of everyday attention difficulties in these groups and tailoring interventions to meet these needs.