Results 1 - 20 of 48
1.
Mob DNA ; 15(1): 24, 2024 Oct 19.
Article in English | MEDLINE | ID: mdl-39427206

ABSTRACT

Genome annotation is an important but challenging task. Accurate identification of short interspersed nuclear elements (SINEs) is particularly difficult due to their lack of highly conserved sequences. AnnoSINE is state-of-the-art software for annotating SINEs in plant genomes, but it is computationally inefficient for large genomes. Moreover, its applicability to animals is limited due to the absence of animal pHMMs in its HMM library. Therefore, we propose AnnoSINE_v2, which extends accurate SINE annotation for animal genomes with greatly optimized computational efficiency. Our results show that AnnoSINE_v2's annotation of SINEs has over 20% higher F1-score compared to the existing tools on animal genomes and enables the processing of complicated genomes, like human and zebrafish, which were beyond the capabilities of AnnoSINE_v1. AnnoSINE_v2 is freely available on Conda and GitHub: https://github.com/liaoherui/AnnoSINE_v2.

2.
Article in English | MEDLINE | ID: mdl-39237452

ABSTRACT

Transposable elements (TEs) are abundant and ubiquitous components of eukaryotic genomes. Since TEs were first discovered in maize (Zea mays) by Barbara McClintock in the late 1940s, these elements have been shown to be important agents in shaping genome structure and evolution. Today, maize continues to be an important model organism for molecular and quantitative genetics, and represents a particularly useful system for the study of the interplay between TEs and host genomes. While TEs constitute a significant part of the maize genome and are important drivers of genome evolution, their annotation remains a complex and challenging task. Here, we discuss genome annotation of TEs and other repetitive sequences in maize genomes. We briefly review current knowledge on the overall landscape of TE and non-TE repeats in maize, and discuss how these sequences may impact genome structure, and the genotype and phenotype within species. We also provide a summary of the main tools used to find TE polymorphisms, and briefly introduce four different bioinformatic approaches for TE and tandem repeat annotation, explaining how they can be best used by maize researchers.

3.
Article in English | MEDLINE | ID: mdl-39237454

ABSTRACT

Transposable elements (TEs) and tandem repeat arrays are ubiquitous components of genomes across all domains of life. Many types of repetitive DNA do not appear to encode functional proteins, and those that do typically code only for enzymes involved in their own replication. Nevertheless, repetitive DNA sequences can significantly alter genome structure, and can have a profound impact on an organism's biology at both the molecular and organismal levels. Advances in long-read sequencing technology have enabled the resolution of previously collapsed contigs and scaffolds that are rich in repeats, which has made the accurate annotation of TEs and other repetitive sequences a crucial early step in genome analysis. Here, we provide a detailed tutorial for streamlined annotation of TEs and repeats in the genome of the model plant Zea mays (maize). Maize is ideally suited to illustrate these procedures due to its repeat-rich genome and the volume of publicly available and high-quality genomic resources. We outline four possible approaches for TE and repeat annotation, each aimed at accommodating a different set of scientific interests. Additionally, we demonstrate how to evaluate annotation quality, and provide scripts to help graphically depict TE and repeat landscapes. Although the protocol is tailored for maize, we also offer pointers for researchers working on other systems throughout and expect that these procedures will be broadly applicable to any eukaryotic genome.

4.
Genome Res ; 34(8): 1140-1153, 2024 Sep 20.
Article in English | MEDLINE | ID: mdl-39251347

ABSTRACT

Much of the profound interspecific variation in genome content has been attributed to transposable elements (TEs). To explore the extent of TE variation within species, we developed an optimized open-source algorithm, panEDTA, to de novo annotate TEs in a pangenome context. We then generated a unified TE annotation for a maize pangenome derived from 26 reference-quality genomes, which reveals an excess of 35.1 Mb of TE sequences per genome in tropical maize relative to temperate maize. A small number (n = 216) of TE families, mainly LTR retrotransposons, drive these differences. Evidence from the methylome, transcriptome, LTR age distribution, and LTR insertional polymorphisms reveals that 64.7% of the variability is contributed by LTR families that are young, less methylated, and more expressed in tropical maize, whereas 18.5% is driven by LTR families with removal or loss in temperate maize. Additionally, we find enrichment for Young LTR families adjacent to nucleotide-binding and leucine-rich repeat (NLR) clusters of varying copy number across lines, suggesting TE activity may be associated with disease resistance in maize.


Subjects
DNA Transposable Elements , Genome, Plant , Retroelements , Terminal Repeat Sequences , Zea mays , Zea mays/genetics , Retroelements/genetics , Genetic Variation , Molecular Sequence Annotation , Tropical Climate , DNA Methylation
5.
Mol Biol Evol ; 41(5)2024 May 03.
Article in English | MEDLINE | ID: mdl-38758089

ABSTRACT

Polyploidy is a prominent mechanism of plant speciation and adaptation, yet the mechanistic understandings of duplicated gene regulation remain elusive. Chromatin structure dynamics are suggested to govern gene regulatory control. Here, we characterized genome-wide nucleosome organization and chromatin accessibility in allotetraploid cotton, Gossypium hirsutum (AADD, 2n = 4X = 52), relative to its two diploid parents (AA or DD genome) and their synthetic diploid hybrid (AD), using DNS-seq. The larger A-genome exhibited wider average nucleosome spacing in diploids, and this intergenomic difference diminished in the allopolyploid but not hybrid. Allopolyploidization also exhibited increased accessibility at promoters genome-wide and synchronized cis-regulatory motifs between subgenomes. A prominent cis-acting control was inferred for chromatin dynamics and demonstrated by transposable element removal from promoters. Linking accessibility to gene expression patterns, we found distinct regulatory effects for hybridization and later allopolyploid stages, including nuanced establishment of homoeolog expression bias and expression level dominance. Histone gene expression and nucleosome organization are coordinated through chromatin accessibility. Our study demonstrates the capability to track high-resolution chromatin structure dynamics and reveals their role in the evolution of cis-regulatory landscapes and duplicate gene expression in polyploids, illuminating regulatory ties to subgenomic asymmetry and dominance.


Subjects
Chromatin , Diploidy , Evolution, Molecular , Gossypium , Polyploidy , Gossypium/genetics , Chromatin/genetics , Gene Expression Regulation, Plant , Genome, Plant , Nucleosomes/genetics , Genes, Duplicate , Promoter Regions, Genetic
6.
bioRxiv ; 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38529488

ABSTRACT

The combination of ultra-long Oxford Nanopore (ONT) sequencing reads with long, accurate PacBio HiFi reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, "telomere-to-telomere" genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT "Duplex" sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely-studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used "Pore-C" chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the ultra-long reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and has the potential to provide a single-instrument solution for the reconstruction of complete genomes.

8.
bioRxiv ; 2023 Aug 24.
Article in English | MEDLINE | ID: mdl-37662366

ABSTRACT

We present the genome of the living fossil, Wollemia nobilis, a southern hemisphere conifer morphologically unchanged since the Cretaceous. Presumed extinct until rediscovery in 1994, the Wollemi pine is critically endangered with fewer than 60 wild adults threatened by intensifying bushfires in the Blue Mountains of Australia. The 12 Gb genome is among the most contiguous large plant genomes assembled, with extremely low heterozygosity and unusual abundance of DNA transposons. Reduced representation and genome re-sequencing of individuals confirm a relictual population since the last major glacial/drying period in Australia, 120 ky BP. Small RNA and methylome sequencing reveal conservation of ancient silencing mechanisms despite the presence of thousands of active and abundant transposons, including some transferred horizontally to conifers from arthropods in the Jurassic. A retrotransposon burst 8-6 my BP coincided with population decline, possibly as an adaptation enhancing epigenetic diversity. Wollemia, like other conifers, is susceptible to Phytophthora, and a suite of defense genes, similar to those in loblolly pine, are targeted for silencing by sRNAs in leaves. The genome provides insight into the earliest seed plants, while enabling conservation efforts.

11.
DNA Res ; 30(1)2023 Feb 01.
Article in English | MEDLINE | ID: mdl-36208288

ABSTRACT

A contiguous assembly of the inbred 'EL10' sugar beet (Beta vulgaris ssp. vulgaris) genome was constructed using PacBio long-read sequencing, BioNano optical mapping, Hi-C scaffolding, and Illumina short-read error correction. The EL10.1 assembly was 540 Mb, of which 96.2% was contained in nine chromosome-sized pseudomolecules with lengths from 52 to 65 Mb, and 31 contigs with a median size of 282 kb that remained unassembled. Gene annotation incorporating RNA-seq data and curated sequences via the MAKER annotation pipeline generated 24,255 gene models. Results indicated that the EL10.1 genome assembly is a contiguous genome assembly highly congruent with the published sugar beet reference genome. Gross duplicate gene analyses of EL10.1 revealed little large-scale intra-genome duplication. Reduced gene copy number for well-annotated gene families relative to other core eudicots was observed, especially for transcription factors. Variation in genome size in B. vulgaris was investigated by flow cytometry among 50 individuals producing estimates from 633 to 875 Mb/1C. Read-depth mapping with short-read whole-genome sequences from other sugar beet germplasm suggested that relatively few regions of the sugar beet genome appeared associated with high-copy number variation.


Subjects
Beta vulgaris , Humans , Beta vulgaris/genetics , DNA Copy Number Variations , Chromosomes , Molecular Sequence Annotation , Sugars
12.
Genome Biol ; 23(1): 258, 2022 Dec 15.
Article in English | MEDLINE | ID: mdl-36522651

ABSTRACT

Advancing crop genomics requires efficient genetic systems enabled by high-quality personalized genome assemblies. Here, we introduce RagTag, a toolset for automating assembly scaffolding and patching, and we establish chromosome-scale reference genomes for the widely used tomato genotype M82 along with Sweet-100, a new rapid-cycling genotype that we developed to accelerate functional genomics and genome editing in tomato. This work outlines strategies to rapidly expand genetic systems and genomic resources in other plant species.


Subjects
Solanum lycopersicum , Solanum lycopersicum/genetics , Gene Editing , Genomics , Genome , Genotype
13.
Nat Genet ; 54(12): 1972-1982, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36471073

ABSTRACT

Preharvest sprouting (PHS) due to lack of seed dormancy seriously threatens crop production worldwide. As a complex quantitative trait, breeding of crop cultivars with suitable seed dormancy is hindered by limited useful regulatory genes. Here by repeatable phenotypic characterization of fixed recombinant individuals, we report a quantitative genetic locus, Seed Dormancy 6 (SD6), from aus-type rice, encoding a basic helix-loop-helix (bHLH) transcription factor, which underlies the natural variation of seed dormancy. SD6 and another bHLH factor inducer of C-repeat binding factors expression 2 (ICE2) function antagonistically in controlling seed dormancy by directly regulating the ABA catabolism gene ABA8OX3, and indirectly regulating the ABA biosynthesis gene NCED2 via OsbHLH048, in a temperature-dependent manner. The weak-dormancy allele of SD6 is common in cultivated rice but undergoes negative selection in wild rice. Notably, by genome editing SD6 and its wheat homologs, we demonstrated that SD6 is a useful breeding target for alleviating PHS in cereals under field conditions.


Subjects
Oryza , Plant Dormancy , Basic Helix-Loop-Helix Transcription Factors/genetics , Oryza/genetics , Plant Dormancy/genetics
15.
Plant J ; 112(1): 172-192, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35959634

ABSTRACT

Sacred lotus (Nelumbo nucifera Gaertn.) is a basal eudicot plant with a unique lifestyle, physiological features, and evolutionary characteristics. Here we report the unique profile of transposable elements (TEs) in the genome, using a manually curated repeat library. TEs account for 59% of the genome, and hAT (Ac/Ds) elements alone represent 8%, more than in any other known plant genome. About 18% of the lotus genome is comprised of Copia LTR retrotransposons, and over 25% of them are associated with non-canonical termini (non-TGCA). Such high abundance of non-canonical LTR retrotransposons has not been reported for any other organism. TEs are very abundant in genic regions, with retrotransposons enriched in introns and DNA transposons primarily in flanking regions of genes. The recent insertion of TEs in introns has led to significant intron size expansion, with a total of 200 Mb in the 28 455 genes. This is accompanied by declining TE activity in intergenic regions, suggesting distinct control efficacy of TE amplification in different genomic compartments. Despite the prevalence of TEs in genic regions, some genes are associated with fewer TEs, such as those involved in fruit ripening and stress responses. Other genes are enriched with TEs, and genes in epigenetic pathways are the most associated with TEs in introns, indicating a dynamic interaction between TEs and the host surveillance machinery. The dramatic differential abundance of TEs with genes involved in different biological processes as well as the variation of target preference of different TEs suggests the composition and activity of TEs influence the path of evolution.


Subjects
Nelumbo , Retroelements , DNA Transposable Elements/genetics , DNA, Intergenic , Evolution, Molecular , Genome, Plant/genetics , Nelumbo/genetics , Retroelements/genetics
18.
Plant J ; 108(6): 1830-1848, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34661327

ABSTRACT

Cassava (Manihot esculenta Crantz, 2n = 36) is a global food security crop. It has a highly heterozygous genome, high genetic load, and genotype-dependent asynchronous flowering. It is typically propagated by stem cuttings and any genetic variation between haplotypes, including large structural variations, is preserved by such clonal propagation. Traditional genome assembly approaches generate a collapsed haplotype representation of the genome. In highly heterozygous plants, this results in artifacts and an oversimplification of heterozygous regions. We used a combination of Pacific Biosciences (PacBio), Illumina, and Hi-C to resolve each haplotype of the genome of a farmer-preferred cassava line, TME7 (Oko-iyawo). PacBio reads were assembled using the FALCON suite. Phase switch errors were corrected using FALCON-Phase and Hi-C read data. The ultralong-range information from Hi-C sequencing was also used for scaffolding. Comparison of the two phases revealed >5000 large haplotype-specific structural variants affecting over 8 Mb, including insertions and deletions spanning thousands of base pairs. The potential of these variants to affect allele-specific expression was further explored. RNA-sequencing data from 11 different tissue types were mapped against the scaffolded haploid assembly and gene expression data are incorporated into our existing easy-to-use web-based interface to facilitate use by the broader plant science community. These two assemblies provide an excellent means to study the effects of heterozygosity, haplotype-specific structural variation, gene hemizygosity, and allele-specific gene expression contributing to important agricultural traits and further our understanding of the genetics and domestication of cassava.


Subjects
Genome, Plant , Haplotypes , Manihot/genetics , Africa , DNA Transposable Elements , Diploidy , Gene Expression Regulation, Plant , Genome Size , Heterozygote , Molecular Sequence Annotation , Synteny
19.
Science ; 373(6555): 655-662, 2021 Aug 06.
Article in English | MEDLINE | ID: mdl-34353948

ABSTRACT

We report de novo genome assemblies, transcriptomes, annotations, and methylomes for the 26 inbreds that serve as the founders for the maize nested association mapping population. The number of pan-genes in these diverse genomes exceeds 103,000, with approximately a third found across all genotypes. The results demonstrate that the ancient tetraploid character of maize continues to degrade by fractionation to the present day. Excellent contiguity over repeat arrays and complete annotation of centromeres revealed additional variation in major cytological landmarks. We show that combining structural variation with single-nucleotide polymorphisms can improve the power of quantitative mapping studies. We also document variation at the level of DNA methylation and demonstrate that unmethylated regions are enriched for cis-regulatory elements that contribute to phenotypic variation.


Subjects
Genome, Plant , Molecular Sequence Annotation , Zea mays/genetics , Centromere/genetics , Chromosome Mapping , Chromosomes, Plant , DNA Methylation , Disease Resistance/genetics , Genes, Plant , Genetic Variation , Genotype , High-Throughput Nucleotide Sequencing , Multifactorial Inheritance/genetics , Phenotype , Plant Diseases , Polymorphism, Single Nucleotide , Regulatory Sequences, Nucleic Acid , Sequence Analysis, DNA , Tetraploidy , Transcriptome , Whole Genome Sequencing