ABSTRACT
LTR-retrotransposons are the most abundant repeat sequences in plant genomes and play an important role in evolution and biodiversity. Their characterization is of great importance to understand their dynamics. However, the identification and classification of these elements remains a challenge today. Moreover, current software can be relatively slow (from hours to days), sometimes involve a lot of manual work and do not reach satisfactory levels in terms of precision and sensitivity. Here we present Inpactor2, an accurate and fast application that creates LTR-retrotransposon reference libraries in a very short time. Inpactor2 takes an assembled genome as input and follows a hybrid approach (deep learning and structure-based) to detect elements, filter partial sequences and finally classify intact sequences into superfamilies and, as very few tools do, into lineages. This tool takes advantage of multi-core and GPU architectures to decrease execution times. Using the rice genome, Inpactor2 showed a run time of 5 minutes (faster than other tools) and has the best accuracy and F1-Score of the tools tested here, also having the second best accuracy and specificity only surpassed by EDTA, but achieving 28% higher sensitivity. For large genomes, Inpactor2 is up to seven times faster than other available bioinformatics tools.
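The metrics this abstract compares across tools (accuracy, precision, sensitivity, specificity and F1-Score) can all be derived from confusion-matrix counts. The following is a minimal sketch using their standard definitions; the function name and the counts are hypothetical illustrations, not values or code from the Inpactor2 study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts.

    tp/fp/tn/fn = true positives, false positives,
    true negatives, false negatives.
    """
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # also called recall
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and sensitivity
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic
metrics = classification_metrics(tp=80, fp=20, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 combines precision and sensitivity, a tool can rank first on F1 while ranking second on specificity, which is the kind of trade-off the abstract describes.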
Subjects
Deep Learning , Retroelements , Retroelements/genetics , Terminal Repeat Sequences/genetics , Plant Genome , Software , Molecular Evolution , Phylogeny
ABSTRACT
Dominant optic atrophy (DOA) is one of the most prevalent forms of hereditary optic neuropathies and is mainly caused by heterozygous variants in OPA1, encoding a mitochondrial dynamin-related large GTPase. The clinical spectrum of DOA has been extended to a wide variety of syndromic presentations, called DOAplus, including deafness as the main secondary symptom associated with vision impairment. To date, the pathophysiological mechanisms underlying the deafness in DOA remain unknown. To gain insights into the process leading to hearing impairment, we have analyzed the Opa1delTTAG mouse model that recapitulates the DOAplus syndrome through complementary approaches combining morpho-physiology, biochemistry, and cellular and molecular biology. We found that the Opa1delTTAG mutation leads to an adult-onset progressive auditory neuropathy in mice, as attested by the auditory brainstem response threshold shift over time. However, the mutant mice harbored larger otoacoustic emissions in comparison to wild-type littermates, whereas the endocochlear potential, which is a proxy for the functional state of the stria vascularis, was comparable between both genotypes. Ultrastructural examination of the mutant mice revealed a selective loss of sensory inner hair cells, together with a progressive degeneration of the axons and myelin sheaths of the afferent terminals of the spiral ganglion neurons, supporting an auditory neuropathy spectrum disorder (ANSD). Molecular assessment of the cochlea demonstrated a reduction of Opa1 mRNA level by greater than 40%, supporting haploinsufficiency as the disease mechanism. In addition, we observed an early increase in Sirtuin 3 level and in Beclin1 activity, and subsequently an age-related mtDNA depletion, increased oxidative stress, mitophagy as well as an impaired autophagic flux. Together, these results support a novel role for OPA1 in the maintenance of inner hair cells and auditory neural structures, addressing new challenges for the exploration and treatment of OPA1-linked ANSD in patients.
Subjects
Deafness , Central Hearing Loss , Autosomal Dominant Optic Atrophy , Animals , Humans , Mice , GTP Phosphohydrolases/genetics , Central Hearing Loss/genetics , Mutation , Autosomal Dominant Optic Atrophy/genetics
ABSTRACT
Transposable elements (TEs) are mobile elements found in the majority of eukaryotic genomes. TEs deeply impact the structure and evolution of chromosomes and can induce mutations affecting coding genes. In plants, the major group of TEs is long terminal repeat retrotransposons (LTR-RTs). They are classified into superfamilies (Gypsy, Copia) and subclassified into lineages. Horizontal transfer (HT), defined as the nonsexual transmission of genetic material between species, is a process allowing LTR-RTs to invade a new genome. Although this phenomenon was considered rare, recent studies demonstrate numerous transfers of LTR-RTs. This study aims to determine which LTR-RT lineages are shared with high similarity among 69 plant genomes. We identified and classified 88 450 LTR-RTs and determined 143 cases of high similarities between pairs of genomes. Most of them involved three Copia lineages (Oryco/Ivana, Retrofit/Ale, and Tork/Tar/Ikeros). A detailed analysis of three cases of high similarities involving the Tork/Tar/Ikeros group shows an uneven distribution in the phylogeny of the elements and incongruence between phylogenetic tree topologies, indicating that they could have originated from HTs. Overall, our results suggest that LTR-RT Copia lineages share outstanding similarity between distant species and may be involved in HT mechanisms more frequently than initially estimated.
Subjects
Nucleotides , Retroelements , Phylogeny , Plant Genome , Terminal Repeat Sequences/genetics , Molecular Evolution
ABSTRACT
Chili peppers (Solanaceae family) have great commercial value. They are commercialized in natura and used as spices and for ornamental and medicinal purposes. Although three whole genomes have been published, limited information about satellite DNA sequences, their composition, and genomic distribution has been provided. Here, we exploited the noncoding repetitive fraction, represented by satellite sequences, that tends to accumulate in blocks along chromosomes, especially near the chromosome ends of peppers. Two satellite DNA sequences were identified (CDR-1 and CDR-2), characterized and mapped in silico in three Capsicum genomes (C. annuum, C. chinense, and C. baccatum) using data from the published high-coverage sequencing and repeat-finding bioinformatic tools. Localization using FISH in the chromosomes of these species and in two others (C. frutescens and C. chacoense), totaling five species, showed signals adjacent to the rDNA sites. A sequence comparison with existing Solanaceae repeats showed that CDR-1 and CDR-2 have different origins and no homology to rDNA sequences. Satellites occupied subterminal chromosomal regions, sometimes co-located with or adjacent to 35S rDNA sequences. Our results expand knowledge about the diversity of subterminal regions of Capsicum chromosomes, showing different amounts and distributions within and between karyotypes. In addition, these sequences may be useful for future phylogenetic studies.
Subjects
Capsicum , Solanaceae , Capsicum/genetics , Solanaceae/genetics , Base Sequence , Satellite DNA/genetics , Phylogeny , Chromosomes , Repetitive Nucleic Acid Sequences , Karyotype , Ribosomal DNA
ABSTRACT
Human endogenous retroviruses (HERVs) are LTR retrotransposons that are present in the human genome. Among them, members of the HERV-K (HML-2) group are suspected to play a role in the development of different types of cancer, including lung, ovarian, and prostate cancer, as well as leukemia. Acute myeloid leukemia (AML) is an important disease that causes 1% of cancer deaths in the United States and has a survival rate of 28.7%. Here, we describe a method for assessing the statistical association between HERV-K (HML-2) transposable element insertion polymorphisms (or TIPs) and AML, using whole-genome sequencing and read mapping using TIP_finder software. Our results suggest that 101 polymorphisms involving HERV-K (HML-2) elements were correlated with AML, with a percentage between 44.4 and 56.6%, most of which (70) were located in the region from 8q24.13 to 8q24.21. Moreover, it was found that the TRIB1, LRATD2, POU5F1B, MYC, PCAT1, PVT1, and CCDC26 genes could be displaced or fragmented by TIPs. Furthermore, a general method was devised to facilitate analysis of the correlation between transposable element insertions and specific diseases. Finally, although the relationship between HERV-K (HML-2) TIPs and AML remains unclear, the data reported in this study indicate a statistical correlation, as supported by the χ² test with p-values < 0.05.
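A χ² test of association such as the one this abstract cites can be sketched for a single insertion site with a 2x2 contingency table (disease status by insertion presence). The example below is a minimal standard-library illustration of the Pearson χ² statistic without continuity correction; the function name and counts are hypothetical, and it is not the study's actual pipeline.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, p_value). With 1 degree of freedom the upper-tail
    probability is erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (not from the study):
# rows = AML / control, columns = insertion present / absent
chi2, p = chi_square_2x2(30, 20, 15, 35)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

When many candidate insertions are tested at once, a multiple-testing correction would normally be applied on top of the per-site p-values; the abstract does not say whether one was used.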
Subjects
Endogenous Retroviruses , Acute Myeloid Leukemia , Male , Humans , Endogenous Retroviruses/genetics , DNA Transposable Elements , Genetic Polymorphism , Human Genome , Acute Myeloid Leukemia/genetics , Protein Serine-Threonine Kinases , Intracellular Signaling Peptides and Proteins/genetics
ABSTRACT
In rats, hypothyroidism during fetal and neonatal development can disrupt neuronal migration and induce the formation of periventricular heterotopia in the brain. However, it remains uncertain whether heterotopia also manifest in mice after developmental hypothyroidism and whether they could be used as a toxicological endpoint to detect TH-mediated effects caused by TH system disrupting chemicals. Here, we performed a mouse study where we induced severe hypothyroidism by exposing pregnant mice (n = 3) to a very high dose of propylthiouracil (PTU) (1500 ppm) in the diet, to maximize the chances of detecting heterotopia. We found what appear to be very small heterotopia in 4 out of the 8 PTU-exposed pups. Although the incidence rate could suggest some utility for this endpoint, the small size of the ectopic neuronal clusters at maximum hypothyroidism excludes the utility of heterotopia in mouse toxicity studies aimed at detecting TH system disrupting chemicals. On the other hand, parvalbumin expression was manifestly lower in the cortex of hypothyroid mouse offspring, demonstrating that offspring TH deficiency affected the developing brain. Based on the overall results, we conclude that heterotopia formation in mice is not a useful toxicological endpoint for examining TH-mediated developmental neurotoxicity.
Subjects
Hypothyroidism , Periventricular Nodular Heterotopia , Prenatal Exposure Delayed Effects , Pregnancy , Female , Humans , Animals , Rats , Mice , Prenatal Exposure Delayed Effects/chemically induced , Maternal Exposure , Thyroid Hormones/metabolism , Hypothyroidism/chemically induced , Hypothyroidism/metabolism , Propylthiouracil/toxicity
ABSTRACT
Coffea spp. chromosomes are very small and accumulate a variety of repetitive DNA families around the centromeres. However, the proximal regions of Coffea chromosomes remain poorly understood, especially regarding the nature and organisation of the sequences. Taking advantage of the genome sequences of C. arabica (2n = 44), C. canephora, and C. eugenioides (C. arabica progenitors with 2n = 22) and good coverage genome sequencing of dozens of other wild Coffea spp., repetitive DNA sequences were identified, and the genomes were compared to decipher particularities of pericentromeric structures. The searches revealed a short tandem repeat (82 bp length) typical of Gypsy/TAT LTR retrotransposons, named Coffea_sat11. This repeat organises clusters with fragments of other transposable elements, comprising regions of non-coding RNA production. Cytogenomic analyses showed that Coffea_sat11 extends from the pericentromeres towards the middle of the chromosomal arms. This arrangement was observed in the allotetraploid C. arabica chromosomes, as well as in its progenitors. This study improves our understanding of the role of the Gypsy/TAT LTR retrotransposon lineage in the organisation of Coffea pericentromeres, as well as the conservation of Coffea_sat11 within the genus. The relationships between fragments of other transposable elements and the functional aspects of these sequences on the pericentromere chromatin were also evaluated. Highlights: A scattered short tandem repeat, typical of Gypsy/TAT LTR retrotransposons, associated with several fragments of other transposable elements, accumulates in the pericentromeres of Coffea chromosomes. This arrangement is preserved in all clades of the genus and appears to have a strong regulatory role in the organisation of chromatin around centromeres.
Subjects
Coffea , Retroelements , Base Sequence , Coffea/genetics , Molecular Evolution , Plant Genome , Humans , Phylogeny , Tandem Repeat Sequences , Terminal Repeat Sequences
ABSTRACT
We gathered available RNA-seq and ChIP-seq data in a single database to better characterize the target genes of thyroid hormone receptors in several cell types. This database can serve as a resource to analyze the mode of action of thyroid hormone (T3). Additionally, it is an easy-to-use and convenient tool to obtain information on specific genes regarding T3 regulation or to extract large gene lists of interest according to the users' criteria. Overall, this atlas is a unique compilation of recent sequencing data focusing on T3, its receptors, modes of action, targets and roles, which may benefit researchers within the field. A preliminary analysis indicates extensive variations in the repertoire of target genes where transcription is upregulated by chromatin-bound nuclear receptors. Although it has a major influence, chromatin accessibility is not the only parameter that determines the cellular selectivity of the hormonal response.
Subjects
Thyroid Hormone Receptors , Thyroid Hormones , Animals , Chromatin/genetics , Mice , Thyroid Hormone Receptors/genetics , Thyroid Hormone Receptors/metabolism , Thyroid Hormones/genetics , Thyroid Hormones/metabolism , Triiodothyronine/metabolism
ABSTRACT
BACKGROUND: Pathogens of the genus Phytophthora are the etiological agents of many devastating diseases in several high-value crops and forestry species such as potato, tomato, cocoa, and oak, among many others. Phytophthora betacei is a recently described species that causes late blight almost exclusively in tree tomatoes, and it is closely related to Phytophthora infestans, which causes the disease in potato crops and other Solanaceae. This study reports the assembly and annotation of the genomes of P. betacei P8084, the first of its species, and P. infestans RC1-10, a Colombian strain from the EC-1 lineage, using long-read SMRT sequencing technology. RESULTS: Our results show that P. betacei has the largest sequenced genome of the Phytophthora genus so far, at 270 Mb. A moderate transposable element invasion and a whole genome duplication likely explain its genome size expansion when compared to P. infestans, whereas P. infestans RC1-10 has expanded its genome under the activity of transposable elements. The high diversity and abundance (in terms of copy number) of classified and unclassified transposable elements in P. infestans RC1-10 relative to P. betacei bears testimony to the power of long-read technologies to discover novel repetitive elements in the genomes of organisms. Our data also provide support for the phylogenetic placement of P. betacei as a standalone species and as a sister group of P. infestans. Finally, we found no evidence to support the idea that the genome of P. betacei P8084 follows the same gene-dense/gene-sparse architecture proposed for P. infestans and other filamentous plant pathogens. CONCLUSIONS: This study provides the first genome-wide picture of P. betacei and expands the genomic resources available for P. infestans. This is a contribution towards the understanding of the genome biology and evolutionary history of Phytophthora species belonging to the subclade 1c.
Subjects
Phytophthora infestans , Solanum tuberosum , DNA Transposable Elements , Molecular Evolution , Gene Duplication , Phylogeny , Phytophthora infestans/genetics , Plant Diseases , Solanum tuberosum/genetics
ABSTRACT
For decades coffees were associated with the genus Coffea. In 2011, the closely related genus Psilanthus was subsumed into Coffea. However, results obtained in 2017, based on 28,800 nuclear SNPs, indicated that there is no substantial phylogenetic support for this incorporation. In addition, a recent study of 16 plastid full-genome sequences highlighted an incongruous placement of Coffea canephora (Robusta coffee) between maternal and nuclear trees. In this study, similar global features of the plastid genomes of Psilanthus and Coffea are observed. In agreement with morphological and physiological traits, the nuclear phylogenetic tree clearly separates Psilanthus from Coffea (with the exception of C. rhamnifolia, which is closer to Psilanthus than to Coffea). In contrast, the maternal molecular tree was incongruent with both morphological and nuclear differentiation, with four main clades observed, two of which include both Psilanthus and Coffea species, and two with either Psilanthus or Coffea species. Interestingly, Coffea and Psilanthus taxa sampled in West and Central Africa are members of the same group. Several mechanisms such as the retention of ancestral polymorphisms due to incomplete lineage sorting, hybridization leading to homoploidy (without chromosome doubling) and alloploidy (for C. arabica) are involved in the evolutionary history of the coffee species. While sharing similar morphological characteristics, the genetic relationships within C. canephora have shown that some populations are well differentiated and genetically isolated. Given the position of its closely related species, we may also consider C. canephora to be undergoing a long process of speciation with an intermediate step of (sub-)speciation.
Subjects
Cell Nucleus/genetics , Coffea/genetics , Molecular Evolution , Plastid Genomes , Single Nucleotide Polymorphism/genetics , Cluster Analysis , Phylogeny , Species Specificity
ABSTRACT
BACKGROUND AND AIMS: Like other clades, the Coffea genus is highly diversified on the island of Madagascar. The 66 endemic species have colonized various environments and consequently exhibit a wide diversity of morphological, functional and phenological features and reproductive strategies. The trends of interspecific trait variation, which stems from interactions between genetically defined species and their environment, still needed to be addressed for Malagasy coffee trees. METHODS: Data acquisition was done in the most comprehensive ex situ collection of Madagascan wild Coffea. The structure of endemic wild coffees maintained in an ex situ collection was explored in terms of morphological, phenological and functional traits. The environmental (natural habitat) effect was assessed on traits in species from distinct natural habitats. Phylogenetic signal (Pagel's λ, Blomberg's K) was used to quantify trait proximities among species according to their phylogenetic relatedness. KEY RESULTS: Despite the lack of environmental difference in the ex situ collection, widely diverging phenotypes were observed. Phylogenetic signal was found to vary greatly across and even within trait categories. The highest values were exhibited by the ratio of internode mass to leaf mass, the length of the maturation phase and leaf dry matter content (ratio of dry leaf mass to fresh leaf mass). By contrast, traits weakly linked to phylogeny were either constrained by the original natural environment (leaf size) or under selective pressures (phenological traits). CONCLUSIONS: This study gives insight into complex patterns of trait variability found in an ex situ collection, and underlines the opportunities offered by living ex situ collections for research characterizing phenotypic variation.
Subjects
Coffee , Plant Leaves , Islands , Madagascar , Phenotype , Phylogeny
ABSTRACT
Bottle gourd (Lagenaria siceraria) is an important food, medicinal, and utilitarian crop with a large pan-tropical distribution. Two morphologically different types in the siceraria subspecies are sufficiently different to be considered as varieties, but they are assigned into different taxonomic ranks. Genotyping-by-sequencing (GBS) of 95 different accessions from the Nangui Abrogoua University collection was used to confirm the varietal status in bottle gourd. This analysis produced 22 575 single-nucleotide polymorphisms (SNPs). Cluster analyses conducted with 2250 (9.96%) SNPs distinctly separated hard-shelled from soft-shelled types. Analysis of 23 SNPs located in 11 genes coding for traits that differentiate the two types of gourds revealed that genes in the soft-shelled types had about 21% fewer SNPs than genes within hard-shelled gourds, but the latter had more non-synonymous SNPs. Cluster analyses conducted with the 23 SNPs fitted well with the structure defined by the 2250 SNPs, suggesting the implication of these SNPs in the varietal differentiation of bottle gourd. These nucleotide changes along with the genetic relationships between the accessions provide molecular proof supporting the status of two varieties. To prevent the confusion inherent in the use of synonyms and homonyms in bottle gourd, we suggest the terms hard-shelled and soft-shelled to designate, respectively, the varieties used as utensils and those grown for their edible seeds.
Subjects
Cucurbita/genetics , Plant Genome , Genotype , DNA Sequence Analysis , Cluster Analysis , Plant DNA , Plant Genes , Phylogeny , Single Nucleotide Polymorphism
ABSTRACT
Thyroid hormone receptors (TRs) are members of the nuclear hormone receptor superfamily that act as ligand-dependent transcription factors. Here we identified the ten-eleven translocation protein 3 (TET3) as a TR-interacting protein that increases cell sensitivity to T3. The interaction between TET3 and TRs is independent of TET3 catalytic activity and specifically allows the stabilization of TRs on chromatin. We provide evidence that TET3 is required for TR stability, efficient binding of target genes, and transcriptional activation. Interestingly, the differential ability of different TRα1 mutants to interact with TET3 might explain their differential dominant activity in patients carrying TR germline mutations. Thus, this study provides evidence for a mode of action for TET3 as a nonclassical coregulator of TRs, modulating their stability and access to chromatin rather than their intrinsic transcriptional activity. This regulatory function might be more general toward nuclear receptors. Indeed, TET3 interacts with different members of the superfamily and also enhances their association with chromatin.
Subjects
Chromatin/metabolism , Dioxygenases/metabolism , Thyroid Hormone Receptors alpha/metabolism , Catalytic Domain , Chromatin/genetics , Dioxygenases/genetics , Gene Expression Regulation , HEK293 Cells , Humans , Immunoprecipitation , Mutation , Nitriles/pharmacology , Protein Interaction Domains and Motifs , Androgen Receptors/genetics , Androgen Receptors/metabolism , Thiazoles/pharmacology , Thyroid Hormone Receptors alpha/genetics , Thyroid Hormone Receptors beta/genetics , Thyroid Hormone Receptors beta/metabolism , Genetic Transcription , Ubiquitination
ABSTRACT
The natural rubber biosynthetic pathway is well described in Hevea, although the final stages of rubber elongation are still poorly understood. Small Rubber Particle Proteins and Rubber Elongation Factors (SRPPs and REFs) are proteins with a major function in rubber particle formation and stabilization. Their corresponding genes are clustered on scaffold1222 of the reference genomic sequence of the Hevea brasiliensis genome. Apart from gene expression studies by transcriptomic analyses, no in-depth analyses have been carried out to date on the genomic environment of SRPP and REF loci. By integrative analyses of transposable element annotation, small RNA production and gene expression, we analysed their role in the control of the transcription of rubber biosynthetic genes. The first in-depth annotation of TEs (Transposable Elements) and their capacity to produce TE-derived siRNAs (small interfering RNAs) is presented, which was only possible in the Hevea brasiliensis clone PB 260, for which all data are available. We observed that 11% of genes are located near TEs, whose presence may interfere with their transcription at both the genetic and epigenetic levels. We hypothesized that the genomic environment of rubber biosynthesis genes has been shaped by TEs and TE-derived siRNAs, with possible transcriptional interference on their gene expression. We discussed possible functionalization of TEs as enhancers and as donors of alternative transcription start sites in promoter sequences, possibly through the modelling of genetic and epigenetic landscapes.
Subjects
Biosynthetic Pathways , Gene Expression Profiling/methods , Hevea/metabolism , Rubber/metabolism , DNA Transposable Elements , Plant Gene Expression Regulation , Hevea/genetics , Molecular Sequence Annotation , Phylogeny , Plant Proteins/genetics , Genetic Promoter Regions , Small Interfering RNA/genetics , RNA Sequence Analysis
ABSTRACT
Pineapple occupies an important phylogenetic position and its reference genome expedites genomic research within the family Bromeliaceae and more widely among monocots. One such research focus is the evolution of crassulacean acid metabolism (CAM) photosynthesis. Acquiring circadian clock cis-regulatory elements in CAM-related genes might be a critical step in the evolution of this form of photosynthesis. Follow-up studies will clarify the processes and evolutionary forces leading to the multiple independent origins of CAM photosynthesis within the family Bromeliaceae and in over 400 genera across 36 families.
Subjects
Ananas/genetics , Molecular Evolution , Plant Genome/genetics , Photosynthesis/genetics , Genomics , Phylogeny
ABSTRACT
Coffea arabica L. is an important agricultural commodity, accounting for 60% of traded coffee worldwide. Nitrogen (N) is a macronutrient that is usually limiting to plant yield; however, molecular mechanisms of plant acclimation to N limitation remain largely unknown in tropical woody crops. In this study, we investigated the transcriptome of coffee roots under N starvation, analyzing poly-A+ libraries and small RNAs. We also evaluated the concentration of selected amino acids and N-source preferences in roots. Ammonium was preferentially taken up over nitrate, and asparagine and glutamate were the most abundant amino acids observed in coffee roots. We obtained 34,654 assembled contigs by mRNA sequencing, and validated the transcriptional profile of 12 genes by RT-qPCR. Illumina small RNA sequencing yielded 8,524,332 non-redundant reads, resulting in the identification of 86 microRNA families targeting 253 genes. The transcriptional pattern of eight miRNA families was also validated. To our knowledge, this is the first catalog of differentially regulated amino acids, N sources, mRNAs, and sRNAs in Arabica coffee roots.
Subjects
Coffea/genetics , MicroRNAs/genetics , Nitrogen/deficiency , Messenger RNA/genetics , Plant RNA/genetics , Small Untranslated RNA/genetics , Amino Acids/isolation & purification , Amino Acids/metabolism , Ammonium Compounds/metabolism , Coffea/metabolism , Plant Gene Expression Regulation , Gene Ontology , High-Throughput Nucleotide Sequencing , MicroRNAs/classification , MicroRNAs/metabolism , Molecular Sequence Annotation , Nitrates/metabolism , Plant Leaves/genetics , Plant Leaves/metabolism , Plant Roots/genetics , Plant Roots/metabolism , Poly A/genetics , Poly A/metabolism , Messenger RNA/classification , Messenger RNA/metabolism , Plant RNA/classification , Plant RNA/metabolism , Small Untranslated RNA/classification , Small Untranslated RNA/metabolism , Seeds/genetics , Seeds/metabolism , Physiological Stress , Transcriptome
ABSTRACT
Transposable elements (TEs) are genomic units able to move within the genome of virtually all organisms. Due to their naturally high copy numbers and their high structural diversity, the identification and classification of TEs remain a challenge in sequenced genomes. Although TEs were initially regarded as "junk DNA", it has been demonstrated that they play key roles in chromosome structures, gene expression, and regulation, as well as adaptation and evolution. A highly reliable annotation of these elements is, therefore, crucial to better understand genome functions and their evolution. To date, much bioinformatics software has been developed to address TE detection and classification processes, but many problematic aspects remain, such as the reliability, precision, and speed of the analyses. Machine learning and deep learning are families of algorithms that can make automatic predictions and decisions in a wide variety of scientific applications. They have been tested in bioinformatics and, more specifically, for TE classification, with encouraging results. In this review, we will discuss important aspects of TEs, such as their structure, importance in the evolution and architecture of the host, and their current classifications and nomenclatures. We will also address current methods and their limitations in identifying and classifying TEs.
Subjects
Plant Genome , Genomics , Plants/genetics , Retroelements , Plant Chromosomes , Computational Biology , Deep Learning , Genetic Variation , Genomics/methods , Machine Learning
ABSTRACT
Pomegranate (Punica granatum L.) is a perennial fruit crop grown since ancient times that has been planted worldwide and is known for its functional metabolites, particularly punicalagins. We have sequenced and assembled the pomegranate genome with 328 Mb anchored into nine pseudo-chromosomes and annotated 29 229 gene models. A Myrtales lineage-specific whole-genome duplication event was detected that occurred in the common ancestor before the divergence of pomegranate and Eucalyptus. Repetitive sequences accounted for 46.1% of the assembled genome. We found that the integument development gene INNER NO OUTER (INO) was under positive selection and potentially contributed to the development of the fleshy outer layer of the seed coat, an edible part of pomegranate fruit. The genes encoding the enzymes for synthesis and degradation of lignin, hemicelluloses and cellulose were also differentially expressed between soft- and hard-seeded varieties, reflecting differences in their accumulation in cultivars differing in seed hardness. Candidate genes for punicalagin biosynthesis were identified and their expression patterns indicated that gallic acid synthesis in tissues could follow different biochemical pathways. The genome sequence of pomegranate provides a valuable resource for the dissection of many biological and biochemical traits and also provides important insights for the acceleration of breeding. Elucidation of the biochemical pathway(s) involved in punicalagin biosynthesis could assist breeding efforts to increase production of this bioactive compound.
Subjects
Plant Genome/genetics , Genomics , Hydrolyzable Tannins/metabolism , Lythraceae/genetics , Amino Acid Sequence , Biosynthetic Pathways , Fruit/genetics , Fruit/metabolism , Lignin/metabolism , Lythraceae/metabolism , Molecular Sequence Annotation , Phenotype , Sequence Alignment
ABSTRACT
Sex in papaya is controlled by a pair of nascent sex chromosomes. Females are XX, and two slightly different Y chromosomes distinguish males (XY) and hermaphrodites (XY(h)). The hermaphrodite-specific region of the Y(h) chromosome (HSY) and its X chromosome counterpart were sequenced and analyzed previously. We now report the sequence of the entire male-specific region of the Y (MSY). We used a BAC-by-BAC approach to sequence the MSY and resequence the Y regions of 24 wild males and the Y(h) regions of 12 cultivated hermaphrodites. The MSY and HSY regions have highly similar gene content and structure, and only 0.4% sequence divergence. The MSY sequences from wild males include three distinct haplotypes, associated with the populations' geographic locations, but gene flow is detected for other genomic regions. The Y(h) sequence is highly similar to one Y haplotype (MSY3) found only in wild dioecious populations from the north Pacific region of Costa Rica. The low MSY3-Y(h) divergence supports the hypothesis that hermaphrodite papaya is a product of human domestication. We estimate that Y(h) arose only ~4000 yr ago, well after crop plant domestication in Mesoamerica >6200 yr ago but coinciding with the rise of the Maya civilization. The Y(h) chromosome has lower nucleotide diversity than the Y, or the genome regions that are not fully sex-linked, consistent with a domestication bottleneck. The identification of the ancestral MSY3 haplotype will expedite investigation of the mutation leading to the domestication of the hermaphrodite Y(h) chromosome. In turn, this mutation should identify the gene that was affected by the carpel-suppressing mutation that was involved in the evolution of males.
Subjects
Carica/genetics , Plant Chromosomes/genetics , Sex Chromosomes/genetics , Sex Determination Processes/genetics , Base Sequence , Gene Flow/genetics , Haplotypes/genetics , Hermaphroditic Organisms/genetics , Molecular Sequence Data , Plant Breeding , Single Nucleotide Polymorphism , DNA Sequence Analysis , Sex
ABSTRACT
Coffea arabica (the Arabica coffee) is an allotetraploid species originating from a recent hybridization between two diploid species: C. canephora and C. eugenioides. Transposable elements can drive structural and functional variation during the process of hybridization and allopolyploid formation in plants. To learn more about the evolution of the C. arabica genome, we characterized and studied a new Copia LTR-Retrotransposon (LTR-RT) family in diploid and allotetraploid Coffea genomes called Divo. It is a complete and relatively compact LTR-RT element (~5 kb), carrying typical Gag and Pol Copia-type domains. Reverse Transcriptase (RT) domain-based phylogeny demonstrated that Divo is a new and well-supported family in the Bianca lineage, but strictly restricted to dicotyledonous species. In C. canephora, Divo is expressed and is distributed across both gene-rich and gene-poor regions of the genome. The copy number, the molecular estimation of insertion time and the analysis of insertions at orthologous locations in diploid and allotetraploid coffee genomes suggest that Divo underwent a different and recent transposition activity in C. arabica and C. canephora when compared to C. eugenioides. The analysis of this novel LTR-RT family represents an important step toward uncovering the structure and evolution of the allotetraploid C. arabica genome.
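The abstract does not say how insertion times were estimated. A widely used approach for LTR retrotransposons assumes the two LTRs of an element were identical at insertion, so that T = K / (2r), where K is the divergence between the 5' and 3' LTRs and r is the substitution rate per site per year. The sketch below illustrates only that formula; the function name is hypothetical, and the default rate of 1.3e-8 is an assumed value often used for plant genomes, not one taken from this study.

```python
def ltr_insertion_time(divergence, substitution_rate=1.3e-8):
    """Estimate insertion age in years as T = K / (2 * r).

    divergence: substitutions per site between the 5' and 3' LTRs
        (ideally a corrected distance, e.g. Kimura two-parameter).
    substitution_rate: substitutions per site per year (ASSUMED default).
    """
    if divergence < 0:
        raise ValueError("divergence must be non-negative")
    if substitution_rate <= 0:
        raise ValueError("substitution_rate must be positive")
    # Both LTRs accumulate mutations independently, hence the factor 2
    return divergence / (2 * substitution_rate)


# Hypothetical element whose LTRs differ at 2.6% of sites
age_years = ltr_insertion_time(0.026)
print(f"Estimated insertion age: {age_years / 1e6:.2f} million years")
```

Elements with identical LTRs yield an age of zero under this model, which is usually read as a very recent insertion.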