Results 1 - 20 of 248
1.
Mol Cell ; 83(21): 3801-3817.e8, 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-37922872

ABSTRACT

Histones shape chromatin structure and the epigenetic landscape. H1, the most diverse histone in the human genome, has 11 variants. Due to the high structural similarity between the H1s, their unique functions in transferring information from the chromatin to mRNA-processing machineries have remained elusive. Here, we generated human cell lines lacking up to five H1 subtypes, allowing us to characterize the genomic binding profiles of six H1 variants. Most H1s bind to specific sites, and binding depends on multiple factors, including GC content. The highly expressed H1.2 has a high affinity for exons, whereas H1.3 binds intronic sequences. H1s are major splicing regulators, especially of exon skipping and intron retention events, through their effects on the elongation of RNA polymerase II (RNAPII). Thus, H1 variants determine splicing fate by modulating RNAPII elongation.


Subject(s)
Histones , RNA Polymerase II , Humans , Histones/genetics , Histones/metabolism , RNA Polymerase II/genetics , RNA Polymerase II/metabolism , RNA Splicing , Transcription, Genetic , Chromatin/genetics , Alternative Splicing
2.
Mol Cell ; 82(24): 4681-4699.e8, 2022 12 15.
Article in English | MEDLINE | ID: mdl-36435176

ABSTRACT

Long introns with short exons in vertebrate genes are thought to require spliceosome assembly across exons (exon definition), rather than introns, thereby requiring transcription of an exon to splice an upstream intron. Here, we developed CoLa-seq (co-transcriptional lariat sequencing) to investigate the timing and determinants of co-transcriptional splicing genome wide. Unexpectedly, 90% of all introns, including long introns, can splice before transcription of a downstream exon, indicating that exon definition is not obligatory for most human introns. Still, splicing timing varies dramatically across introns, and various genetic elements determine this variation. Strong U2AF2 binding to the polypyrimidine tract predicts early splicing, explaining exon definition-independent splicing. Together, our findings question the essentiality of exon definition and reveal features beyond intron and exon length that are determinative for splicing timing.


Subject(s)
Alternative Splicing , RNA Splicing , Humans , Base Sequence , Introns/genetics , Exons/genetics
3.
Mol Cell ; 82(5): 1021-1034.e8, 2022 03 03.
Article in English | MEDLINE | ID: mdl-35182478

ABSTRACT

How the splicing machinery defines exons or introns as the spliced unit has remained a puzzle for 30 years. Here, we demonstrate that peripheral and central regions of the nucleus harbor genes with two distinct exon-intron GC content architectures that differ in the splicing outcome. Genes with low GC content exons, flanked by long introns with lower GC content, are localized in the periphery, and the exons are defined as the spliced unit. Alternative splicing of these genes results in exon skipping. In contrast, the nuclear center contains genes with a high GC content in the exons and short flanking introns. Most splicing of these genes occurs via intron definition, and aberrant splicing leads to intron retention. We demonstrate that the nuclear periphery and center generate different environments for the regulation of alternative splicing and that two sets of splicing factors form discrete regulatory subnetworks for the two gene architectures. Our study connects 3D genome organization and splicing, thus demonstrating that exon and intron definition modes of splicing occur in different nuclear regions.


Subject(s)
Alternative Splicing , RNA Splicing , Base Composition , Exons/genetics , Introns/genetics
4.
Genes Dev ; 36(9-10): 550-565, 2022 05 01.
Article in English | MEDLINE | ID: mdl-35589130

ABSTRACT

Although splicing is a major driver of RNA nuclear export, many intronless RNAs are efficiently exported to the cytoplasm through poorly characterized mechanisms. For example, GC-rich sequences promote nuclear export in a splicing-independent manner, but how GC content is recognized and coupled to nuclear export is unknown. Here, we developed a genome-wide screening strategy to investigate the mechanism of export of NORAD, an intronless cytoplasmic long noncoding RNA (lncRNA). This screen revealed an RNA binding protein, RBM33, that directs the nuclear export of NORAD and numerous other transcripts. RBM33 directly binds substrate transcripts and recruits components of the TREX-NXF1/NXT1 RNA export pathway. Interestingly, high GC content emerged as the feature that specifies RBM33-dependent nuclear export. Accordingly, RBM33 directly binds GC-rich elements in target transcripts. These results provide a broadly applicable strategy for the genetic dissection of nuclear export mechanisms and reveal a long-sought nuclear export pathway for transcripts with GC-rich sequences.


Subject(s)
Nucleocytoplasmic Transport Proteins , RNA, Viral , Active Transport, Cell Nucleus , Cell Nucleus/metabolism , Nucleocytoplasmic Transport Proteins/metabolism , RNA Transport , RNA, Viral/metabolism
5.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36410731

ABSTRACT

Deoxyribonucleic acid (DNA) is an attractive medium for long-term digital data storage due to its extremely high storage density, low maintenance cost and longevity. However, during the synthesis, amplification and sequencing of DNA sequences with homopolymers of large run-length, three different types of errors, namely insertion, deletion and substitution errors, frequently occur. Meanwhile, DNA sequences with large imbalances between GC and AT content exhibit high dropout rates and are prone to errors. These limitations severely hinder the widespread use of DNA-based data storage. In order to reduce and correct these errors in DNA storage, this paper proposes a novel coding scheme called DNA-LC, which converts binary sequences into DNA base sequences that satisfy both the GC balance and run-length constraints. Furthermore, our coding scheme is able to detect and correct multiple errors, with a higher error correction capability than other methods that target single error correction within a single strand. The decoding algorithm has been implemented in practice. Simulation results indicate that our proposed coding scheme can offer outstanding error protection to DNA sequences. The source code is freely accessible at https://github.com/XiayangLi2301/DNA.


Subject(s)
DNA , Software , DNA/genetics , Base Sequence , Sequence Analysis, DNA/methods , Algorithms , Information Storage and Retrieval
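
The DNA-LC abstract above names two constraints on encoded strands: GC balance and bounded homopolymer run length. The following is a minimal sketch of how a candidate strand could be checked against both. It is not the DNA-LC encoder or decoder; the function names and the thresholds (40-60% GC, runs of at most 3) are illustrative assumptions that do not come from the abstract.

```python
from itertools import groupby


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA string (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def max_run_length(seq: str) -> int:
    """Length of the longest homopolymer run, e.g. 3 for 'ACGGGT'."""
    return max((len(list(group)) for _, group in groupby(seq.upper())), default=0)


def satisfies_constraints(seq: str, gc_low: float = 0.4, gc_high: float = 0.6,
                          max_run: int = 3) -> bool:
    """Check GC balance and run length; all thresholds are assumed values."""
    balanced = gc_low <= gc_fraction(seq) <= gc_high
    return balanced and max_run_length(seq) <= max_run
```

For example, `satisfies_constraints("ACGTACGT")` holds under these assumed thresholds, whereas `"AAAAACGT"` fails both checks (25% GC and a run of five A's).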
6.
Immunol Rev ; 304(1): 10-29, 2021 11.
Article in English | MEDLINE | ID: mdl-34486113

ABSTRACT

T cell homeostasis, T cell differentiation, and T cell effector function rely on the constant fine-tuning of gene expression. To alter the T cell state, substantial remodeling of the proteome is required. This remodeling depends on the intricate interplay of regulatory mechanisms, including post-transcriptional gene regulation. In this review, we discuss how the sequence of a transcript influences these post-transcriptional events. In particular, we review how sequence determinants such as sequence conservation, GC content, and chemical modifications define the levels of the mRNA and the protein in a T cell. We describe the effect of different forms of alternative splicing on mRNA expression and protein production, and their effect on subcellular localization. In addition, we discuss the role of sequences and structures as binding hubs for miRNAs and RNA-binding proteins in T cells. The review thus highlights how the intimate interplay of post-transcriptional mechanisms dictates cellular fate decisions in T cells.


Subject(s)
MicroRNAs , RNA Processing, Post-Transcriptional , Gene Expression , Gene Expression Regulation , RNA, Messenger/metabolism , RNA-Binding Proteins/metabolism , T-Lymphocytes/metabolism
7.
Plant Mol Biol ; 114(1): 18, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38353826

ABSTRACT

Microalgae represent a promising but yet underexplored production platform for biotechnology. The vast majority of studies on recombinant protein expression in algae have been conducted in a single species, the green alga Chlamydomonas reinhardtii. However, due to epigenetic silencing, transgene expression in Chlamydomonas is often inefficient. Here we have investigated parameters that govern efficient transgene expression in the red microalga Porphyridium purpureum. Porphyridium is unique in that the introduced transformation vectors are episomally maintained as autonomously replicating plasmids in the nucleus. We show that full codon optimization to the preferred codon usage in the Porphyridium genome confers superior transgene expression, not only at the level of protein accumulation, but also at the level of mRNA accumulation, indicating that high translation rates increase mRNA stability. Our optimized expression constructs resulted in YFP accumulation to unprecedented levels of up to 5% of the total soluble protein. We also designed expression cassettes that target foreign proteins to the secretory pathway and lead to efficient protein secretion into the culture medium, thus simplifying recombinant protein harvest and purification. Our study paves the way to the exploration of red microalgae as expression hosts in molecular farming for recombinant proteins and metabolites.


Subject(s)
Chlamydomonas reinhardtii , Microalgae , Porphyridium , Porphyridium/genetics , Biotechnology , RNA Stability , Chlamydomonas reinhardtii/genetics , Microalgae/genetics , Recombinant Proteins/genetics
8.
Mol Ecol ; 33(6): e17287, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38263702

ABSTRACT

The genomes of cellular organisms display CpG and TpA dinucleotide composition biases. Such biases have been poorly investigated in dsDNA viruses. Here, we show that in dsDNA virus, bacterial, and eukaryotic genomes, the representation of TpA and CpG dinucleotides is strongly dependent on genomic G + C content. Thus, the classical observed/expected ratios do not fully capture dinucleotide biases across genomes. Because a larger portion of the variance in TpA frequency was explained by G + C content, we explored which additional factors drive the distribution of CpG dinucleotides. Using the residuals of the linear regressions as a measure of dinucleotide abundance and ancestral state reconstruction across eukaryotic and prokaryotic virus trees, we identified an important role for phylogeny in driving CpG representation. Nonetheless, phylogenetic ANOVA analyses showed that a few host associations also account for significant variations. Among eukaryotic viruses, the most significant differences were observed between arthropod-infecting viruses and viruses that infect vertebrates or unicellular organisms. However, an effect of viral DNA methylation status (either driven by the host or by viral-encoded methyltransferases) is also likely. Among prokaryotic viruses, cyanobacteria-infecting phages were significantly CpG-depleted, whereas phages that infect bacteria in the genera Burkholderia and Staphylococcus were CpG-rich. Comparison with bacterial genomes indicated that this effect is largely driven by the general tendency for phages to resemble the host's genomic CpG content. Notably, this tendency is stronger for temperate than for lytic phages. Our data shed light on the processes that shape virus genome composition and inform manipulation strategies for biotechnological applications.


Subject(s)
Genome, Viral , Viruses , Animals , Bias , DNA Methylation/genetics , Genome, Viral/genetics , Phylogeny , Viruses/genetics , Prokaryotic Cells/chemistry , Eukaryotic Cells/chemistry
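
The abstract above refers to the classical observed/expected ratio for dinucleotides such as CpG and TpA. As a minimal sketch, that ratio can be computed for a single strand as the dinucleotide frequency divided by the product of the two mononucleotide frequencies. The function name is hypothetical, and this covers only the classical ratio, which the authors argue does not fully capture the bias. Their regression-residual measure is not reproduced here.

```python
def dinucleotide_oe(seq: str, dinuc: str = "CG") -> float:
    """Classical observed/expected ratio of one dinucleotide on one strand.

    observed = count(XY) / (N - 1); expected = freq(X) * freq(Y).
    Returns NaN when either base is absent from the sequence.
    """
    seq = seq.upper()
    dinuc = dinuc.upper()
    n = len(seq)
    if n < 2 or len(dinuc) != 2:
        raise ValueError("need a sequence of length >= 2 and a 2-letter dinucleotide")
    # Count overlapping occurrences position by position.
    hits = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    observed = hits / (n - 1)
    expected = (seq.count(dinuc[0]) / n) * (seq.count(dinuc[1]) / n)
    if expected == 0:
        return float("nan")
    return observed / expected
```

A value below 1 indicates depletion relative to base composition and a value above 1 indicates enrichment. Short sequences give noisy ratios, so genome-scale input is the intended use.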
9.
RNA Biol ; 21(1): 1-12, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38091265

ABSTRACT

The division of the cellular space into nucleoplasm and cytoplasm promotes quality control mechanisms that prevent misprocessed mRNAs and junk RNAs from gaining access to the translational machinery. Here, we explore how properly processed mRNAs are distinguished from both misprocessed mRNAs and junk RNAs by the presence or absence of various 'identity features'.


Subject(s)
Cell Nucleus , RNA Splicing , Active Transport, Cell Nucleus , RNA, Messenger/genetics , RNA, Messenger/metabolism , Cell Nucleus/genetics , Cell Nucleus/metabolism , RNA Transport , RNA, Untranslated/metabolism
10.
Plant J ; 111(3): 768-784, 2022 08.
Article in English | MEDLINE | ID: mdl-35648423

ABSTRACT

Two factors are proposed to account for the unusual features of organellar genomes: the disruptions of organelle-targeted DNA replication, repair, and recombination (DNA-RRR) systems in the nuclear genome and repetitive elements in organellar genomes. Little is known about how these factors affect organellar genome evolution. The deep-branching vascular plant family Selaginellaceae is known to have a deficient DNA-RRR system and convergently evolved organellar genomes. However, we found that the plastid genome (plastome) of Selaginella sinensis has extremely accelerated substitution rates, a low GC content, pervasive repeat elements, a dynamic network structure, and it lacks direct or inverted repeats. Unexpectedly, its organelle DNA-RRR system is short of a plastid-targeted Recombinase A1 (RecA1) and a mitochondrion-targeted RecA3, in line with other explored Selaginella species. The plastome contains a large collection of short- and medium-sized repeats. Given the absence of RecA1 surveillance, we propose that these repeats trigger illegitimate recombination, accelerated mutation rates, and structural instability. The correlations between repeat quantity and architectural complexity in the Selaginella plastomes support these conclusions. We, therefore, hypothesize that the interplay of the deficient DNA-RRR system and the high repeat content has led to the extraordinary divergence of the S. sinensis plastome. Our study not only sheds new light on the mechanism of plastome divergence by emphasizing the power of cytonuclear integration, but it also reconciles the longstanding contradiction on the effects of DNA-RRR system disruption on genome structure evolution.


Subject(s)
Genome, Plastid , Selaginellaceae , DNA , Evolution, Molecular , Genome, Plastid/genetics , Phylogeny , Selaginellaceae/genetics
11.
J Gen Virol ; 104(10)2023 10.
Article in English | MEDLINE | ID: mdl-37792576

ABSTRACT

Poxviruses (family Poxviridae) have long dsDNA genomes and infect a wide range of hosts, including insects, birds, reptiles and mammals. These viruses have substantial incidence, prevalence and disease burden in humans and in other animals. Nucleotide and dinucleotide composition, mostly CpG and TpA, have been widely studied in viral genomes because of their evolutionary and functional implications. We analysed here the nucleotide and dinucleotide composition, as well as codon usage bias, of a set of representative poxvirus genomes, with a very diverse host spectrum. After correcting for overall nucleotide composition, entomopoxviruses displayed low overall GC content, no enrichment in TpA and large variation in CpG enrichment, while chordopoxviruses showed large variation in nucleotide composition, no obvious depletion in CpG and a weak trend for TpA depletion in GC-rich genomes. Overall, intergenome variation in dinucleotide composition in poxviruses is largely accounted for by variation in overall genomic GC levels. Nonetheless, using vaccinia virus as a model, we found that genes expressed at the earliest times in infection are more CpG-depleted than genes expressed at later stages. This observation has parallels in betaherpesviruses (also large dsDNA viruses) and suggests an antiviral role for the innate immune system (e.g. via the zinc-finger antiviral protein ZAP) in the early phases of poxvirus infection. We also analysed codon usage bias in poxviruses and we observed that it is mostly determined by genomic GC content, and that stratification after host taxonomy does not contribute to explaining codon usage bias diversity. By analysis of within-species diversity, we show that genomic GC content is the result of mutational biases. Poxvirus genomes that encode a DNA ligase are significantly AT-richer than those that do not, suggesting that DNA repair systems shape mutation biases. Our data shed light on the evolution of poxviruses and inform strategies for their genetic manipulation for therapeutic purposes.


Subject(s)
Poxviridae , Animals , Humans , Poxviridae/genetics , Nucleotides , Codon/genetics , Evolution, Molecular , Mammals/genetics , Dinucleoside Phosphates , Antiviral Agents
12.
Am J Hum Genet ; 107(3): 487-498, 2020 09 03.
Article in English | MEDLINE | ID: mdl-32800095

ABSTRACT

The aggregation and joint analysis of large numbers of exome sequences has recently made it possible to derive estimates of intolerance to loss-of-function (LoF) variation for human genes. Here, we demonstrate strong and widespread coupling between genic LoF intolerance and promoter CpG density across the human genome. Genes downstream of the most CpG-rich promoters (top 10% CpG density) have a 67.2% probability of being highly LoF intolerant, using the LOEUF metric from gnomAD. This is in contrast to 7.4% of genes downstream of the most CpG-poor (bottom 10% CpG density) promoters. Combining promoter CpG density with exonic and promoter conservation explains 33.4% of the variation in LOEUF, and the contribution of CpG density exceeds the individual contributions of exonic and promoter conservation. We leverage this to train a simple and easily interpretable predictive model that outperforms other existing predictors and allows us to classify 1,760 genes (which are currently unascertained in gnomAD) as highly LoF intolerant or not. These predictions have the potential to aid in the interpretation of novel variants in the clinical setting. Moreover, our results reveal that high CpG density is not merely a generic feature of human promoters but is preferentially encountered at the promoters of the most selectively constrained genes, calling into question the prevailing view that CpG islands are not subject to selection.


Subject(s)
CpG Islands/genetics , Genome, Human/genetics , Loss of Function Mutation/genetics , Promoter Regions, Genetic/genetics , DNA Methylation/genetics , Exons/genetics , Humans , RNA Polymerase II/genetics , Transcription Initiation Site
13.
J Mol Evol ; 91(1): 24-32, 2023 02.
Article in English | MEDLINE | ID: mdl-36484794

ABSTRACT

The study of spontaneous mutation rates has revealed a wide range of heritable point mutation rates across species, but there are comparatively few estimates for large-scale deletion and duplication rates. The handful of studies that have directly calculated spontaneous rates of deletion and duplication using mutation accumulation lines have estimated that genes are duplicated and deleted at orders of magnitude greater rates than the spontaneous point mutation rate. In our study, we tested whether spontaneous gene deletion and gene duplication rates are also high in Dictyostelium discoideum, a eukaryote with among the lowest point mutation rates (2.5 × 10⁻¹¹ per site per generation) and an AT-rich genome (GC content of 22%). We calculated mutation rates of gene deletions and duplications using whole-genome sequencing data originating from a mutation accumulation experiment and determined the association between the copy number mutations and GC content. Overall, we estimated an average of 3.93 × 10⁻⁸ gene deletions and 1.18 × 10⁻⁸ gene duplications per gene per generation. While orders of magnitude greater than their point mutation rate, these rates are much lower compared to gene deletion and duplication rates estimated from mutation accumulation lines in other organisms (that are on the order of ~10⁻⁶ per gene/generation). The deletions and duplications were enriched in regions that were AT-rich even compared to the genomic background, in contrast to our expectations if low GC content was contributing to low mutation rates. The low deletion and duplication mutation rates in D. discoideum compared to other eukaryotes mirror their low point mutation rates, supporting previous work suggesting that this organism has high replication fidelity and effective molecular machinery to avoid the accumulation of mutations in their genome.


Subject(s)
Dictyostelium , Gene Duplication , Dictyostelium/genetics , Gene Deletion , Mutation , Genome , Eukaryota/genetics
14.
J Mol Evol ; 91(6): 963-975, 2023 12.
Article in English | MEDLINE | ID: mdl-38006429

ABSTRACT

For several decades, it has been known that a substantial number of genes within human DNA exhibit overlap; however, the biological and evolutionary significance of these overlaps remain poorly understood. This study focused on investigating specific instances of overlap where the overlapping DNA region encompasses the coding DNA sequences (CDSs) of protein-coding genes. The results revealed that proteins encoded by overlapping CDSs exhibit greater disorder than those from nonoverlapping CDSs. Additionally, these DNA regions were identified as GC-rich. This could be partially attributed to the absence of stop codons from two distinct reading frames rather than one. Furthermore, these regions were found to harbour fewer single-nucleotide polymorphism (SNP) sites, possibly due to constraints arising from the overlapping state where mutations could affect two genes simultaneously. While elucidating these properties, the NR1D1-THRA gene pair emerged as an exceptional case with highly structured proteins and a distinctly conserved sequence across eutherian mammals. Both NR1D1 and THRA are nuclear receptors lacking a ligand-binding domain at their C-terminus, which is the region where these gene pairs overlap. The NR1D1 gene is involved in the regulation of circadian rhythm, while the THRA gene encodes a thyroid hormone receptor, and both play crucial roles in various physiological processes. This study suggests that, in addition to their well-established functions, the specifically overlapping CDS regions of these genes may encode protein segments with additional, yet undiscovered, biological roles.


Subject(s)
Genes, erbA , Genome, Human , Animals , Humans , Genome, Human/genetics , Receptors, Thyroid Hormone/genetics , Mutation , Proteins/genetics , Open Reading Frames/genetics , DNA , Mammals/genetics , Nuclear Receptor Subfamily 1, Group D, Member 1/genetics
15.
J Mol Evol ; 91(4): 382-390, 2023 08.
Article in English | MEDLINE | ID: mdl-37264211

ABSTRACT

The standard genetic code determines that in most species, including viruses, there are 20 amino acids that are coded by 61 codons, while the other three codons are stop triplets. Considering the whole proteome, each species features its own amino acid frequencies; given the slow rate of change, closely related species display similar GC content and amino acid usage. In contrast, distantly related species display different amino acid frequencies. Furthermore, within certain multicellular species, such as mammals, intragenomic differences in the usage of amino acids are evident. In this communication, we summarize some of the most prominent and well-established factors that determine the differences found in amino acid usage, both across evolution and intragenomically.


Subject(s)
Amino Acids , Genetic Code , Animals , Amino Acids/genetics , Codon/genetics , Base Composition , Proteome/genetics , Evolution, Molecular , Mammals/genetics
16.
Biochem Biophys Res Commun ; 657: 92-99, 2023 05 21.
Article in English | MEDLINE | ID: mdl-37001285

ABSTRACT

Ipomoea plants possess important commercial, medicinal, and ornamental value. Molecular and morphological studies have confirmed that most species of this genus exhibit similar phenotypes but complex phylogenetic relationships. To date, limited information is available on these evolutionary relationships. In this study, systematic analysis of diverse species from Ipomoea was used to elucidate the relationships in this genus. To this end, we used codon usage bias (CUB) metrics, such as the effective number of codons (ENC) and the GC content at the third synonymous codon position (GC3s), to analyze five Ipomoea species. Three types of plots, namely ENC-GC3s, parity rule 2 (PR2) and neutrality plots, were employed to identify the factors determining CUB, and the frequencies of hydrogen bonds and nucleotides were calculated to dissect changes in GC content at the 5'-end of the coding sequence. Our results showed little difference in CUB among the five species, with a reduction in hydrogen bond content at the 5'-end (with similar changes in cytosines). In addition, optimal codons of Ipomoea aquatica ended with G or C, different from those of the other four species, which ended in A or T. These results may be useful for exploring the evolutionary relationships among this group, and for understanding the reasons for the variation among Ipomoea species.


Subject(s)
Biological Evolution , Codon Usage , Phylogeny , Base Composition , Codon/genetics , Evolution, Molecular
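
One of the metrics named in the abstract above is GC3s, the GC content at the third position of synonymous codons. As a rough sketch, assuming the standard genetic code and an in-frame coding sequence, it can be computed by skipping the codons that have no synonymous alternative (ATG, TGG) and the three stop codons. The function name is hypothetical, and this is not the pipeline used in the study.

```python
# Codons excluded under the standard genetic code: Met and Trp have a single
# codon each, and stop codons do not encode an amino acid.
EXCLUDED_CODONS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(cds: str) -> float:
    """GC fraction at third positions of synonymous codons in an in-frame CDS."""
    cds = cds.upper()
    usable_length = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable_length, 3)]
    informative = [
        codon for codon in codons
        if codon not in EXCLUDED_CODONS and set(codon) <= set("ACGT")
    ]
    if not informative:
        raise ValueError("no synonymous codons found in the sequence")
    return sum(codon[2] in "GC" for codon in informative) / len(informative)
```

For the toy sequence `ATGGCCAAATAA` only `GCC` and `AAA` are counted, so the result is 0.5. Codons containing ambiguous bases such as N are skipped rather than guessed.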
17.
Mol Phylogenet Evol ; 179: 107673, 2023 02.
Article in English | MEDLINE | ID: mdl-36528332

ABSTRACT

Spikemoss (Selaginellaceae) is one of the basal lineages of vascular plants. This family has a single genus, Selaginella, which consists of about 750 extant species. The phylogeny of Selaginellaceae has been extensively studied, mainly based on plastid DNA and a few nuclear sequences. However, the placement of the enigmatic sinensis group is a long-term controversy because of the long branch in the plastid DNA phylogeny. The sanguinolenta group is also a phylogenetically problematic clade owing to two alternative positions resulting from different datasets. Here, we newly sequenced 34 mitochondrial genomes (mitogenomes) of individuals representing all seven subgenera and major clades in Selaginellaceae. We assembled the draft mitogenomes, annotated the genes and performed phylogenetic analyses based on the shared 17 mitochondrial genes. Our major results include: (1) all the assembled mitogenomes have complicated structures, unparalleled high GC content and a small gene content set, and the positive correlations among GC content, substitution rates and the number of RNA editing sites hold; (2) the sinensis group was well supported as a member of subg. Stachygynandrum; (3) the sanguinolenta group was strongly resolved as sister to all other Selaginella species except for subg. Selaginella. This study demonstrates the potential of mitogenome data in providing novel insights into phylogenetically recalcitrant problems.


Subject(s)
Genome, Mitochondrial , Selaginellaceae , Humans , Phylogeny , Selaginellaceae/genetics , Base Sequence , Plastids/genetics
18.
BMC Biol ; 20(1): 66, 2022 03 17.
Article in English | MEDLINE | ID: mdl-35296310

ABSTRACT

BACKGROUND: The plastid genomes of the green algal order Chlamydomonadales tend to expand their non-coding regions, but this phenomenon is poorly understood. Here we shed new light on organellar genome evolution in Chlamydomonadales by studying a previously unknown non-photosynthetic lineage. We established cultures of two new Polytoma-like flagellates, defined their basic characteristics and phylogenetic position, and obtained complete organellar genome sequences and a transcriptome assembly for one of them. RESULTS: We discovered a novel deeply diverged chlamydomonadalean lineage that has no close photosynthetic relatives and represents an independent case of photosynthesis loss. To accommodate these organisms, we establish the new genus Leontynka, with two species (L. pallida and L. elongata) distinguishable through both their morphological and molecular characteristics. Notable features of the colourless plastid of L. pallida deduced from the plastid genome (plastome) sequence and transcriptome assembly include the retention of ATP synthase, thylakoid-associated proteins, the carotenoid biosynthesis pathway, and a plastoquinone-based electron transport chain, the latter two modules having an obvious functional link to the eyespot present in Leontynka. Most strikingly, the ~362 kbp plastome of L. pallida is by far the largest among the non-photosynthetic eukaryotes investigated to date due to an extreme proliferation of sequence repeats. These repeats are also present in coding sequences, with one repeat type found in the exons of 11 out of 34 protein-coding genes, with up to 36 copies per gene, thus affecting the encoded proteins. The mitochondrial genome of L. pallida is likewise exceptionally large, with its >104 kbp surpassed only by the mitogenome of Haematococcus lacustris among all members of Chlamydomonadales hitherto studied. It is also bloated with repeats, though entirely different from those in the L. pallida plastome, which contrasts with the situation in H. lacustris where both the organellar genomes have accumulated related repeats. Furthermore, the L. pallida mitogenome exhibits an extremely high GC content in both coding and non-coding regions and, strikingly, a high number of predicted G-quadruplexes. CONCLUSIONS: With its unprecedented combination of plastid and mitochondrial genome characteristics, Leontynka pushes the frontiers of organellar genome diversity and is an interesting model for studying organellar genome evolution.


Subject(s)
Chlorophyceae , Chlorophyta , Genome, Plastid , Chlorophyta/genetics , Evolution, Molecular , Photosynthesis/genetics , Phylogeny , Plastids
19.
Int J Mol Sci ; 24(17)2023 Aug 24.
Article in English | MEDLINE | ID: mdl-37685974

ABSTRACT

The organization of the genome nucleotide (AT/GC) composition in vertebrates remains poorly understood despite the numerous genome assemblies available. Particularly, the origin of the AT/GC heterogeneity in amniotes, in comparison to the homogeneity in anamniotes, is controversial. Recently, several exceptions to this dichotomy were confirmed in an ancient fish lineage with mammalian AT/GC heterogeneity. Hence, our current knowledge necessitates a reevaluation considering this fact and utilizing newly available data and tools. We analyzed fish genomes in silico with as little user input as possible to compare previous approaches to assessing genome composition. Our results revealed a disparity between previously used plots of GC% and histograms representing the authentic distribution of GC% values in genomes. Previous plots heavily reduced the range of GC% values in fish to comply with the alleged AT/GC homogeneity and AT-richness of their genomes. We illustrate how the selected sequence size influences the clustering of GC% values. Previous approaches that disregarded chromosome and genome sizes, which are about three times smaller in fish than in mammals, distorted their results and contributed to the persisting confusion about fish genome composition. Chromosome size and transposon content may drive the AT/GC heterogeneity apparent on mammalian chromosomes, whereas they do so far less in fishes.


Subject(s)
Fishes , Isochores , Animals , Isochores/genetics , Fishes/genetics , Genome Size , Chromosomes, Mammalian , Cluster Analysis , Mammals
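
The abstract above notes that the selected sequence size influences how GC% values cluster. The sketch below illustrates that effect with non-overlapping windows. It is an assumption-laden toy, not the authors' tool: the function name, the window sizes, and the choice to drop a trailing partial window are all illustrative.

```python
def windowed_gc_percent(seq: str, window: int) -> list[float]:
    """GC% of consecutive non-overlapping windows; a trailing partial
    window is dropped so that every value covers the same length."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        values.append(100.0 * sum(base in "GC" for base in chunk) / window)
    return values
```

On the toy sequence `GGGGAAAAGGGGAAAA`, a window of 4 yields `[100.0, 0.0, 100.0, 0.0]`, whereas a window of 8 yields `[50.0, 50.0]`: the same sequence looks heterogeneous or homogeneous depending only on the window size.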
20.
Int J Mol Sci ; 24(20)2023 Oct 18.
Article in English | MEDLINE | ID: mdl-37894996

ABSTRACT

CRISPR/Cas9 is an efficient genome-editing tool, but the identification of editing sites and their potential influences in the Camellia sinensis genome has not been investigated. In this study, bioinformatics methods were used to characterise the Camellia sinensis genome, including editing sites, simple sequence repeats (SSRs), G-quadruplexes (GQ), gene density, and their relationships. A total of 248,134,838 potential editing sites were identified in the genome, and five PAM types, AGG, TGG, CGG, GGG, and NGG, were observed, of which 66,665,912 were found to be specific, and they were present in all structural elements of the genes. A characteristic region of high GC content, GQ density, and PAM density, in contrast to low gene density and SSR density, was identified in the chromosomes in the joint analysis, and it was associated with secondary metabolite and amino acid biosynthesis pathways. CRISPR/Cas9, as a technology to drive crop improvement, with the identified editing sites and effector elements, provides valuable tools for functional studies and molecular breeding in Camellia sinensis.


Subject(s)
CRISPR-Cas Systems , Camellia sinensis , CRISPR-Cas Systems/genetics , Camellia sinensis/genetics , Genome, Plant , Gene Editing/methods
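
The abstract above counts potential Cas9 editing sites by their PAM. As a minimal sketch of that kind of scan, the code below reports every 20-nt protospacer followed by an NGG PAM on both strands. It is not the study's pipeline: the function names and the 20-nt spacer length are assumptions, no specificity (off-target) filtering is attempted, and minus-strand coordinates are given relative to the reverse-complemented sequence.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_ngg_sites(seq: str, spacer_len: int = 20) -> list[tuple[str, int, str, str]]:
    """Return (strand, start, protospacer, pam) for each NGG site.

    A lookahead is used so that overlapping sites are all reported.
    'start' is 0-based on the scanned strand: the input for '+', and the
    reverse complement of the input for '-'.
    """
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})([ACGT]GG))")
    sites = []
    for strand, strand_seq in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites
```

Deciding which of these sites are specific, as the abstract reports for a subset, would require a genome-wide comparison of protospacers that this sketch does not perform.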