Results 1 - 20 of 25,501
1.
Methods Mol Biol ; 2850: 219-227, 2025.
Article in English | MEDLINE | ID: mdl-39363074

ABSTRACT

Gene synthesis efficiency has greatly improved in recent years but remains limited for repetitive sequences, which can lead to synthesis failure or delays at DNA synthesis vendors. Here, we describe a method for the assembly of small synthetic genes with repetitive elements. First, a gene of interest is split in silico into small synthons of up to 80 base pairs flanked by Golden Gate-compatible overhangs. The synthons are then made by oligo extension and finally assembled into a synthetic gene by Golden Gate assembly.
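The in silico splitting step can be illustrated with a minimal Python sketch. It is not the authors' protocol: the function names, the 4-nt junction length, and the rule of rejecting palindromic or already-used junctions are assumptions chosen for illustration, and the Type IIS recognition sites that would flank each synthon are left out.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def split_into_synthons(gene, max_len=80, overhang=4):
    """Split a gene into fragments of at most max_len bases.

    Adjacent fragments share an `overhang`-nt junction. A junction is
    accepted only if it is non-palindromic and neither it nor its reverse
    complement has been used before (hypothetical design rule).
    """
    gene = gene.upper()
    synthons, used, start = [], set(), 0
    while len(gene) - start > max_len:
        end = start + max_len
        # Shrink the fragment until its last `overhang` bases are usable.
        while end - start > 2 * overhang:
            junction = gene[end - overhang:end]
            rc = reverse_complement(junction)
            if junction != rc and junction not in used and rc not in used:
                break
            end -= 1
        else:
            raise ValueError("no usable junction found in this fragment")
        used.update({junction, rc})
        synthons.append(gene[start:end])
        start = end - overhang  # next fragment starts at the shared junction
    synthons.append(gene[start:])
    return synthons
```

Concatenating the first fragment with every later fragment minus its leading junction reproduces the input, which is a quick way to check a split before ordering oligos.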


Subject(s)
Repetitive Sequences, Nucleic Acid; Repetitive Sequences, Nucleic Acid/genetics; Genes, Synthetic/genetics; DNA/genetics; Synthetic Biology/methods
2.
BMC Plant Biol ; 24(1): 966, 2024 Oct 16.
Article in English | MEDLINE | ID: mdl-39407117

ABSTRACT

BACKGROUND: Trachelospermum jasminoides has medicinal and ornamental value and is widely distributed in China. Although the chloroplast genome has been documented, the mitochondrial genome has not yet been studied. RESULTS: The mitochondrial genome of T. jasminoides was assembled and functionally annotated using Illumina and nanopore reads. The mitochondrial genome comprises a master circular molecular structure of 605,764 bp and encodes 65 genes: 39 protein-coding genes, 23 transfer RNA (tRNA) genes and 3 ribosomal RNA genes. In addition to the single circular conformation, we found many alternative conformations of the T. jasminoides mitochondrial genome mediated by 42 repetitive sequences. Six repetitive sequences (DRS01-DRS06) were supported by nanopore long reads, polymerase chain reaction (PCR) amplifications, and Sanger sequencing of the PCR products. Eleven homologous fragments were identified by comparing the mitochondrial and chloroplast genome sequences, including three complete tRNA genes. Moreover, 531 edited RNA sites were identified in the protein-coding sequences based on RNA sequencing data, with nad4 having the highest number of sites (54). CONCLUSION: To our knowledge, this is the first description of the mitochondrial genome of T. jasminoides. Our results demonstrate the existence of multiple conformations. These findings lay a foundation for understanding the genetics and evolutionary dynamics of Apocynaceae.


Subject(s)
Genome, Mitochondrial; Recombination, Genetic; Repetitive Sequences, Nucleic Acid/genetics; RNA, Transfer/genetics; Genome, Chloroplast; Genome, Plant
3.
Genome Biol ; 25(1): 244, 2024 Sep 16.
Article in English | MEDLINE | ID: mdl-39285474

ABSTRACT

BACKGROUND: Telomeric repeat arrays at the ends of chromosomes are highly dynamic in composition, but their repetitive nature and technological limitations have made it difficult to assess their true variation in genome diversity surveys. RESULTS: We have comprehensively characterized the sequence variation immediately adjacent to the canonical telomeric repeat arrays at the very ends of chromosomes in 74 genetically diverse Arabidopsis thaliana accessions. We first describe several types of distinct telomeric repeat units and then identify evolutionary processes such as local homogenization and higher-order repeat formation that shape diversity of chromosome ends. By comparing largely isogenic samples, we also determine repeat number variation of the degenerate and variant telomeric repeat array at both the germline and somatic levels. Finally, our analysis of haplotype structure uncovers chromosome end-specific patterns in the distribution of variant telomeric repeats, and their linkage to the more proximal non-coding region. CONCLUSIONS: Our findings illustrate the spectrum of telomeric repeat variation at multiple levels in A. thaliana (in germline and soma, across all chromosome ends, and across genetic groups), thereby expanding our knowledge of the evolution of chromosome ends.


Subject(s)
Arabidopsis; Chromosomes, Plant; Genetic Variation; Telomere; Arabidopsis/genetics; Telomere/genetics; Repetitive Sequences, Nucleic Acid; Haplotypes; Evolution, Molecular; Genome, Plant
4.
Int J Mol Sci ; 25(18), 2024 Sep 17.
Article in English | MEDLINE | ID: mdl-39337484

ABSTRACT

This study describes the first genome sequence and analysis of Coniella granati, a fungal pathogen with a broad host range, which is responsible for postharvest crown rot, shoot blight, and canker diseases in pomegranates. C. granati is a geographically widespread pathogen which has been reported across Europe, Asia, the Americas, and Africa. Our analysis revealed a 46.8 Mb genome with features characteristic of hemibiotrophic fungi. Approximately one third of its genome was compartmentalised within 'AT-rich' regions exhibiting a low GC content (30 to 45%). These regions primarily comprised transposable elements that are repeated at a high frequency and interspersed throughout the genome. Transcriptome-supported gene annotation of the C. granati genome revealed a streamlined proteome, mirroring similar observations in other pathogens with a latent phase. The genome encoded a relatively compact set of 9568 protein-coding genes with a remarkable 95% having assigned functional annotations. Despite this streamlined nature, a set of 40 cysteine-rich candidate secreted effector-like proteins (CSEPs) was predicted as well as a gene cluster involved in the synthesis of a pomegranate-associated toxin. These potential virulence factors were predominantly located near repeat-rich and AT-rich regions, suggesting that the pathogen evades host defences through Repeat-Induced Point mutation (RIP)-mediated pseudogenisation. Furthermore, 23 of these CSEPs exhibited homology to known effector and pathogenicity genes found in other hemibiotrophic pathogens. The study establishes a foundational resource for the study of the genetic makeup of C. granati, paving the way for future research on its pathogenicity mechanisms and the development of targeted control strategies to safeguard pomegranate production.
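A low-GC compartment such as the one described above is commonly located by scanning the assembly in windows. The sketch below is only an illustration of that idea, not the pipeline used in the study: the window and step sizes are arbitrary assumptions, and the 45% cut-off simply reuses the upper bound quoted in the abstract.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def at_rich_windows(seq, window=1000, step=500, max_gc=0.45):
    """Return (start, end, gc) for each window whose GC fraction is <= max_gc.

    window, step and max_gc are illustrative defaults, not values taken
    from the paper's methods.
    """
    hits = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        gc = gc_fraction(chunk)
        if gc <= max_gc:
            hits.append((start, start + len(chunk), gc))
    return hits
```

Overlapping hits would normally be merged into contiguous regions before comparing them with gene or effector coordinates.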


Subject(s)
Fungal Proteins; Genome, Fungal; Plant Diseases; Pomegranate; Proteome; Plant Diseases/microbiology; Plant Diseases/genetics; Fungal Proteins/genetics; Fungal Proteins/metabolism; Pomegranate/genetics; Pomegranate/microbiology; Ascomycota/genetics; Ascomycota/pathogenicity; Molecular Sequence Annotation; Fruit/microbiology; Fruit/genetics; Repetitive Sequences, Nucleic Acid/genetics
5.
Science ; 385(6714): eadn1629, 2024 Sep 13.
Article in English | MEDLINE | ID: mdl-39264994

ABSTRACT

Macrophages maintain hematopoietic stem cell (HSC) quality by assessing cell surface Calreticulin (Calr), an "eat-me" signal induced by reactive oxygen species (ROS). Using zebrafish genetics, we identified Beta-2-microglobulin (B2m) as a crucial "don't eat-me" signal on blood stem cells. A chemical screen revealed inducers of surface Calr that promoted HSC proliferation without triggering ROS or macrophage clearance. Whole-genome CRISPR-Cas9 screening showed that Toll-like receptor 3 (Tlr3) signaling regulated b2m expression. Targeting b2m or tlr3 reduced the HSC clonality. Elevated B2m levels correlated with high expression of repetitive element (RE) transcripts. Overall, our data suggest that RE-associated double-stranded RNA could interact with TLR3 to stimulate surface expression of B2m on hematopoietic stem and progenitor cells. These findings suggest that the balance of Calr and B2m regulates macrophage-HSC interactions and defines hematopoietic clonality.


Subject(s)
Calreticulin; Hematopoietic Stem Cells; Macrophages; Phagocytosis; Toll-Like Receptor 3; beta 2-Microglobulin; Animals; beta 2-Microglobulin/genetics; beta 2-Microglobulin/metabolism; Calreticulin/metabolism; Calreticulin/genetics; Cell Proliferation; CRISPR-Cas Systems; Hematopoietic Stem Cells/metabolism; Hematopoietic Stem Cells/cytology; Macrophages/metabolism; Reactive Oxygen Species/metabolism; Repetitive Sequences, Nucleic Acid; Signal Transduction; Toll-Like Receptor 3/metabolism; Toll-Like Receptor 3/genetics; Zebrafish; Zebrafish Proteins/metabolism; Zebrafish Proteins/genetics
6.
An Acad Bras Cienc ; 96(suppl 1): e20240172, 2024.
Article in English | MEDLINE | ID: mdl-39319837

ABSTRACT

Repetitive sequences can lead to variation in DNA quantity and composition among species. The Orchidaceae, the largest angiosperm family, is divided into five subfamilies, with Apostasioideae as the basal group and Orchidoideae and Epidendroideae showing high diversification rates. Despite their different evolutionary paths, some species in these groups have similar nuclear DNA content. This study focuses on one example to understand the dynamics of major repetitive DNAs in the nucleus. We used Next-Generation Sequencing (NGS) data from Apostasia wallichii (Apostasioideae) and Ludisia discolor (Orchidoideae) to identify and quantify the most abundant repeats. The repetitive fraction varied in abundance (27.5% in L. discolor and 60.6% in A. wallichii) and composition, with LTR retrotransposons of different lineages being the most abundant repeats in each species. Satellite DNAs showed varying organization and abundance. Despite the unbalanced ratio between single-copy and repetitive DNA sequences, the two species had the same genome size, possibly due to the elimination of non-essential genes. This phenomenon has been observed in other Apostasia and likely led to the proliferation of transposable elements in A. wallichii. Deep genome information in the future will aid in understanding the contraction/expansion of gene families and the evolution of sequences in these genomes.


Subject(s)
Genome Size; Genome, Plant; Orchidaceae; Repetitive Sequences, Nucleic Acid; Orchidaceae/genetics; Orchidaceae/classification; Genome, Plant/genetics; Repetitive Sequences, Nucleic Acid/genetics; Computer Simulation; DNA, Plant/genetics; High-Throughput Nucleotide Sequencing
7.
J Vis Exp ; (211), 2024 Sep 13.
Article in English | MEDLINE | ID: mdl-39345159

ABSTRACT

Two-dimensional neutral/neutral gel electrophoresis (2DGE) has emerged as a benchmark technique to analyze DNA replication through natural impediments. This protocol describes how to analyze replication fork progression through structure-prone, expandable DNA repeats within the simian virus 40 (SV40)-based episome in human cells. In brief, upon plasmid transfection into human cells, replication intermediates are isolated by the modified Hirt protocol and treated with the DpnI restriction enzyme to remove non-replicated DNA. Intermediates are then digested by appropriate restriction enzymes to place the repeat of interest within the origin-distal half of a 3-5 kb-long DNA fragment. The replication intermediates are separated in two perpendicular dimensions, first by size and then by shape. Following Southern blot hybridization, this approach allows researchers to observe fork stalling at various structure-forming repeats on the descending half of the replication Y-arc. Furthermore, this positioning of the stall site allows the visualization of various outcomes of repeat-mediated fork stalling, such as fork reversal, the advent of a converging fork, and recombinational fork restart.


Subject(s)
DNA Replication; Plasmids; Simian virus 40; Simian virus 40/genetics; Simian virus 40/chemistry; Humans; Plasmids/genetics; Repetitive Sequences, Nucleic Acid/genetics; Transfection/methods; DNA, Viral/genetics; DNA, Viral/chemistry; Blotting, Southern/methods
8.
Article in English | MEDLINE | ID: mdl-39167800

ABSTRACT

Enhancers are DNA sequences that can strengthen transcription initiation. However, the global identification of plant enhancers is complicated due to uncertainty in the distance and orientation of enhancers, especially in species with large genomes. In this study, we performed self-transcribing active regulatory region sequencing (STARR-seq) for the first time to identify enhancers across the barley genome. A total of 7323 enhancers were successfully identified, and among 45 randomly selected enhancers, over 75% were effective as validated by a dual-luciferase reporter assay system in the lower epidermis of tobacco leaves. Interestingly, up to 53.5% of the barley enhancers were repetitive sequences, especially transposable elements (TEs), thus reinforcing the vital role of repetitive enhancers in gene expression. Both the common active mark H3K4me3 and repressive mark H3K27me3 were abundant among the barley STARR-seq enhancers. In addition, the functional range of barley STARR-seq enhancers seemed much broader than that of rice or maize and extended to ±100 kb of the gene body, and this finding was consistent with the high expression levels of genes in the genome. This study specifically depicts the unique features of barley enhancers and provides available barley enhancers for further utilization.


Subject(s)
Enhancer Elements, Genetic; Gene Expression Regulation, Plant; Hordeum; Hordeum/genetics; Hordeum/metabolism; Enhancer Elements, Genetic/genetics; Gene Expression Regulation, Plant/genetics; Histones/metabolism; Histones/genetics; DNA Transposable Elements/genetics; Genome, Plant/genetics; Repetitive Sequences, Nucleic Acid/genetics; Sequence Analysis, DNA/methods
9.
Int J Mol Sci ; 25(16), 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39201503

ABSTRACT

Repetitive sequences play an indispensable role in gene expression, transcriptional regulation, and chromosome arrangements through trans and cis regulation. In this review, focusing on recent advances, we summarize the epigenetic regulatory mechanisms of repetitive sequences in embryonic stem cells. We aim to bridge the knowledge gap by discussing DNA damage repair pathway choices on repetitive sequences and summarizing the significance of chromatin organization on repetitive sequences in response to DNA damage. By consolidating these insights, we underscore the critical relationship between the stability of repetitive sequences and early embryonic development, seeking to provide a deeper understanding of repetitive sequence stability and setting the stage for further research and potential therapeutic strategies in developmental biology and regenerative medicine.


Subject(s)
Embryonic Stem Cells; Repetitive Sequences, Nucleic Acid; Humans; Animals; Embryonic Stem Cells/metabolism; Embryonic Stem Cells/cytology; Repetitive Sequences, Nucleic Acid/genetics; Epigenesis, Genetic; Chromatin/metabolism; Chromatin/genetics; DNA Repair; DNA Damage; Genomic Instability
10.
Genome Biol Evol ; 16(8), 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39190481

ABSTRACT

Repeats can mediate rearrangements and recombination in plant mitochondrial genomes and plastid genomes. While repeat accumulations are linked to heightened evolutionary rates and complex structures in specific lineages, debates persist regarding the extent of their influence on sequence and structural evolution. In this study, 75 Plantago plastomes were analyzed to investigate the relationships between repeats, nucleotide substitution rates, and structural variations. Extensive repeat accumulations were associated with significant rearrangements and inversions in the large inverted repeats (IRs), suggesting that repeats contribute to rearrangement hotspots. Repeats caused infrequent recombination that potentially led to substoichiometric shifting, supported by long-read sequencing. Repeats were implicated in elevating evolutionary rates by facilitating localized hypermutation, likely through DNA damage and repair processes. This study also observed a decrease in nucleotide substitution rates for loci translocating into IRs, supporting the role of biased gene conversion in maintaining lower substitution rates. Combined with known parallel changes in mitogenomes, it is proposed that potential dysfunction in nuclear-encoded genes associated with DNA replication, recombination, and repair may drive the evolution of Plantago organellar genomes. These findings contribute to understanding how repeats impact organellar evolution and stability, particularly in rapidly evolving plant lineages.


Subject(s)
Evolution, Molecular; Genome, Plastid; Plantago; Plantago/genetics; Gene Rearrangement; Repetitive Sequences, Nucleic Acid; Genome, Plant; Genome, Mitochondrial; Recombination, Genetic; Inverted Repeat Sequences
11.
Sci Data ; 11(1): 891, 2024 Aug 16.
Article in English | MEDLINE | ID: mdl-39152143

ABSTRACT

Paspalum notatum Flüggé is an economically important subtropical fodder grass that is widely used in the Americas. Here, we report a new chromosome-scale genome assembly and annotation of a diploid biotype collected in the center of origin of the species. Using Oxford Nanopore long reads, we generated a 557.81 Mb genome assembly (N50 = 56.1 Mb) with high gene completeness (BUSCO = 98.73%). Genome annotation identified 320 Mb (57.86%) of repetitive elements and 45,074 gene models, of which 36,079 have a high level of confidence. Further characterisation included the identification of 59 miRNA precursors together with their putative targets. The present work provides a comprehensive genomic resource for P. notatum improvement and a reference frame for functional and evolutionary research within the genus.


Subject(s)
Genome, Plant; Molecular Sequence Annotation; Paspalum; Paspalum/genetics; Chromosomes, Plant/genetics; MicroRNAs/genetics; Repetitive Sequences, Nucleic Acid
12.
Genome Biol Evol ; 16(7), 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38957923

ABSTRACT

We present the first long-read de novo assembly and annotation of the luna moth (Actias luna) and provide the full characterization of heavy chain fibroin (h-fibroin), a long and highly repetitive gene (>20 kb) essential in silk fiber production. There are >160,000 described species of moths and butterflies (Lepidoptera), but only within the last 5 years have we begun to recover high-quality annotated whole genomes across the order that capture h-fibroin. Using PacBio HiFi reads, we produce the first high-quality long-read reference genome for this species. The assembled genome has a length of 532 Mb, a contig N50 of 16.8 Mb, an L50 of 14 contigs, and 99.4% completeness (BUSCO). Our annotation using Bombyx mori protein and A. luna RNAseq evidence captured a total of 20,866 genes at 98.9% completeness with 10,267 functionally annotated proteins and a full-length h-fibroin annotation of 2,679 amino acid residues.
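The contig N50 and L50 quoted above follow the standard definitions: with contigs sorted from longest to shortest, N50 is the length of the contig at which the running total first reaches half of the assembly size, and L50 is the number of contigs needed to get there. A minimal sketch of that calculation (the function name is illustrative):

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no contig lengths given")
```

For contigs of 10, 8, 6, 4 and 2 units the total is 30, the two longest contigs already cover 18, so N50 is 8 and L50 is 2.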


Subject(s)
Fibroins; Genome, Insect; Molecular Sequence Annotation; Moths; Animals; Moths/genetics; Fibroins/genetics; Silk/genetics; Insect Proteins/genetics; Bombyx/genetics; Repetitive Sequences, Nucleic Acid
13.
Int J Mol Sci ; 25(14), 2024 Jul 11.
Article in English | MEDLINE | ID: mdl-39062839

ABSTRACT

From the recent genome assembly NHGRI_mPonAbe1-v2.0_NCBI (GCF_028885655.2) of orangutan chromosome 13, we computed the precise alpha satellite higher-order repeat (HOR) structure using the novel high-precision GRM2023 algorithm with Global Repeat Map (GRM) and Monomer Distance (MD) diagrams. This study rigorously identified alpha satellite HORs in the centromere of orangutan chromosome 13, discovering a novel 59mer HOR, the longest HOR unit identified in any primate to date. Additionally, it revealed the first intertwined sequence of three HORs, 18mer/27mer/45mer HORs, with a common aligned "backbone" across all HOR copies. The major 7mer HOR exhibits a Willard's-type canonical copy, although some segments of the array display significant irregularities. In contrast, the 14mer HOR forms a regular Willard's-type HOR array. Surprisingly, the GRM2023 high-precision analysis of chromosome 13 of human genome assembly T2T-CHM13v2.0 reveals the presence of only a 7mer HOR, despite both the orangutan and human genome assemblies being derived from whole genome shotgun sequences.


Subject(s)
DNA, Satellite; Pongo; Animals; Humans; DNA, Satellite/genetics; Pongo/genetics; Centromere/genetics; Repetitive Sequences, Nucleic Acid/genetics; Primates/genetics; Chromosomes, Mammalian/genetics
14.
Sci Data ; 11(1): 823, 2024 Jul 26.
Article in English | MEDLINE | ID: mdl-39060306

ABSTRACT

Elymus species, which belong to the tribe Triticeae, are a tertiary gene pool for the improvement of major cereal crops. Elymus sibiricus, a tetraploid with the StH genome, is a typical species of the genus Elymus and is widely used as a high-quality perennial forage grass in temperate regions. In this study, we report the construction of a chromosome-scale reference assembly of E. sibiricus line Gaomu No. 1 based on PacBio HiFi reads and chromosome conformation capture. Subgenomes St and H were well phased with the aid of k-mers and subgenome-specific repetitive sequences. The total assembly size was 6.929 Gb with a contig N50 of 49.518 Mb. In total, 89,800 protein-coding genes were predicted. Repetitive sequences accounted for 82.49% of the E. sibiricus genome. Comparative genome analysis confirmed a major species-specific 4H/6H reciprocal translocation in E. sibiricus. The E. sibiricus assembly will help exploit the genetic resources of StH species in the genus Elymus and provides an important tool for E. sibiricus domestication.


Subject(s)
Chromosomes, Plant; Elymus; Genome, Plant; Elymus/genetics; Chromosomes, Plant/genetics; Edible Grain/genetics; Repetitive Sequences, Nucleic Acid
15.
J Bioinform Comput Biol ; 22(3): 2450009, 2024 Jun.
Article in English | MEDLINE | ID: mdl-39030667

ABSTRACT

A turning point in cancer research was the introduction of massively parallel sequencing technology, which greatly reduced the cost and time of genome sequencing. This widened the scope for detecting and analyzing the role of structural alterations in cancer. However, certain biases exist in NGS-based approaches, which adversely affect the CNV identification process. Moreover, DNA repeats in CNV regions need special attention, as they degrade the performance of the majority of existing CNV detection tools, even after a generalized bias correction method is applied. This motivated the present work, in which a novel method has been designed to address DNA repeats, and thereby the mappability bias, in CNV regions. The method consists of three phases. The first phase computes the alignment information of uniquely mapped DNA reads, considering base quality and base mismatch parameters at nucleotide-level precision. The second and third phases use a novel approach to allocate the non-uniquely mapped reads to an optimal region of the DNA repeats based on a probabilistic membership model. The proposed method can identify CNVs in coding as well as non-coding regions of the DNA, and can also detect CNVs in DNA repeat regions. The methodology achieves a sensitivity greater than [Formula: see text] in the simulations performed. On real data, the detected variants were validated against the Database of Genomic Variants, where the percentage overlap is also greater than 95%, and the method achieved much better breakpoint prediction than other popular bias-correction CNV detection methods.


Subject(s)
DNA Copy Number Variations; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA; High-Throughput Nucleotide Sequencing/methods; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Sequence Analysis, DNA/methods; Algorithms; Neoplasms/genetics; DNA/genetics; Repetitive Sequences, Nucleic Acid
16.
Nat Commun ; 15(1): 5727, 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38977669

ABSTRACT

DNA replication and transcription generate DNA supercoiling, which can cause topological stress and intertwining of daughter chromatin fibers, posing challenges to the completion of DNA replication and chromosome segregation. Type II topoisomerases (Top2s) are enzymes that relieve DNA supercoiling and decatenate braided sister chromatids. How Top2 complexes deal with the topological challenges in different chromatin contexts, and whether all chromosomal contexts are subjected equally to torsional stress and require Top2 activity, remain unknown. Here we show that catalytic inhibition of the Top2 complex in interphase has a profound effect on the stability of heterochromatin and repetitive DNA elements. Mechanistically, we find that catalytically inactive Top2 is trapped around heterochromatin, leading to DNA breaks and unresolved catenates, which necessitate the recruitment of the structure-specific endonuclease Ercc1-XPF in an SLX4- and SUMO-dependent manner. Our data are consistent with a model in which the Top2 complex resolves not only catenates between sister chromatids but also inter-chromosomal catenates between clustered repetitive elements.


Subject(s)
DNA Topoisomerases, Type II; Heterochromatin; DNA Topoisomerases, Type II/metabolism; DNA Topoisomerases, Type II/genetics; Heterochromatin/metabolism; Animals; Topoisomerase II Inhibitors/pharmacology; Repetitive Sequences, Nucleic Acid/genetics; Poly-ADP-Ribose Binding Proteins/metabolism; Poly-ADP-Ribose Binding Proteins/genetics; DNA Replication; DNA, Superhelical/metabolism; DNA, Superhelical/chemistry; Humans; Mice; DNA-Binding Proteins/metabolism; DNA-Binding Proteins/genetics; DNA/metabolism; DNA/chemistry; Interphase
17.
Genome Res ; 34(6): 937-951, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-38986578

ABSTRACT

Transposable elements (TEs) and other repetitive regions have been shown to contain gene regulatory elements, including transcription factor binding sites. However, regulatory elements harbored by repeats have proven difficult to characterize using short-read sequencing assays such as ChIP-seq or ATAC-seq. Most regulatory genomics analysis pipelines discard "multimapped" reads that align equally well to multiple genomic locations. Because multimapped reads arise predominantly from repeats, current analysis pipelines fail to detect a substantial portion of regulatory events that occur in repetitive regions. To address this shortcoming, we developed Allo, a new approach to allocate multimapped reads in an efficient, accurate, and user-friendly manner. Allo combines probabilistic mapping of multimapped reads with a convolutional neural network that recognizes the read distribution features of potential peaks, offering enhanced accuracy in multimapping read assignment. Allo also provides read-level output in the form of a corrected alignment file, making it compatible with existing regulatory genomics analysis pipelines and downstream peak-finders. In a demonstration application on CTCF ChIP-seq data, we show that Allo results in the discovery of thousands of new CTCF peaks. Many of these peaks contain the expected cognate motif and/or serve as TAD boundaries. We additionally apply Allo to a diverse collection of ENCODE ChIP-seq data sets, resulting in multiple previously unidentified interactions between transcription factors and repetitive element families. Finally, we show that Allo may be particularly beneficial in identifying ChIP-seq peaks at centromeres, near segmentally duplicated genes, and in younger TEs, enabling new regulatory analyses in these regions.
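The general idea of allocating a multimapped read probabilistically, rather than discarding it, can be sketched as follows. This is not Allo's algorithm (which, per the abstract, also uses a convolutional neural network): the function names, the pseudocount, and the rule of weighting each candidate locus by its count of uniquely mapped reads are all assumptions made for illustration.

```python
import random


def allocation_probabilities(candidates, unique_counts, pseudocount=1.0):
    """Probability of assigning a multimapped read to each candidate locus.

    Each locus is weighted by the number of uniquely mapped reads already
    observed there, plus a pseudocount so empty loci stay possible.
    """
    weights = [unique_counts.get(locus, 0) + pseudocount for locus in candidates]
    total = sum(weights)
    return [w / total for w in weights]


def allocate_read(candidates, unique_counts, rng=random):
    """Draw one locus for a multimapped read according to those probabilities."""
    probs = allocation_probabilities(candidates, unique_counts)
    return rng.choices(candidates, weights=probs, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes the assignment reproducible, which matters when the corrected alignments feed a downstream peak-finder.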


Subject(s)
Chromatin Immunoprecipitation Sequencing; Humans; Chromatin Immunoprecipitation Sequencing/methods; Regulatory Sequences, Nucleic Acid; Repetitive Sequences, Nucleic Acid; Genomics/methods; Binding Sites; CCCTC-Binding Factor/metabolism; CCCTC-Binding Factor/genetics; Regulatory Elements, Transcriptional; DNA Transposable Elements; Sequence Analysis, DNA/methods; Neural Networks, Computer
18.
Nat Commun ; 15(1): 6213, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39043652

ABSTRACT

Obesity is associated with increased cancer risk, yet the underlying mechanisms remain elusive. Obesity-associated cancers involve disruptions in metabolic and cellular pathways, which can lead to genomic instability. Repetitive DNA sequences capable of adopting alternative DNA structures (e.g., H-DNA) stimulate mutations and are enriched at mutation hotspots in human cancer genomes. However, it is not known if obesity impacts DNA repeat-mediated endogenous mutation hotspots. We address this gap by measuring mutation frequencies in obese and normal-weight transgenic reporter mice carrying either a control human B-DNA- or an H-DNA-forming sequence (from a translocation hotspot in c-MYC in Burkitt lymphoma). Here, we discover that H-DNA-induced DNA damage and mutations are elevated in a tissue-specific manner, and DNA repair efficiency is reduced in obese mice compared to those on the control diet. These findings elucidate the impact of obesity on cancer-associated endogenous mutation hotspots, providing mechanistic insight into the link between obesity and cancer.


Subject(s)
DNA Damage; DNA Repair; Genomic Instability; Mice, Transgenic; Mutation; Obesity; Animals; Obesity/genetics; Humans; Mice; DNA Repair/genetics; DNA Damage/genetics; Repetitive Sequences, Nucleic Acid/genetics; Male; Mice, Inbred C57BL; Female; Burkitt Lymphoma/genetics; DNA/genetics; DNA/metabolism
19.
Genomics ; 116(5): 110896, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39025318

ABSTRACT

Pamphagidae is a family of Acridoidea that inhabits the desert steppes of Eurasia and Africa. This study employed flow cytometry to estimate the genome size of eight species in the Pamphagidae. The results indicate that the genome size of the eight species ranged from 13.88 pg to 14.66 pg, with an average of 14.26 pg. This is the largest average genome size recorded for the Orthoptera families, as well as for the entire Insecta. Furthermore, the study explored the role of repetitive sequences in the genome, including their evolutionary dynamics and activity, using low-coverage next-generation sequencing data. The genome is composed of 14 different types of repetitive sequences, which collectively make up between 59.9% and 68.17% of the total genome. The Pamphagidae family displays high levels of transposable element (TE) activity, with the number of TEs increasing and accumulating since the family's emergence. The study found that the types of repetitive sequences contributing to the TE outburst events are similar across species. Additionally, the study identified unique repetitive elements for each species. The differences in repetitive sequences among the eight Pamphagidae species correspond to their phylogenetic relationships. The study sheds new light on genome gigantism in the Pamphagidae and provides insight into the correlation between genome size and repetitive sequences within the family.
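Flow cytometry reports genome size as DNA mass in picograms; to compare it with sequence-based sizes, the widely used conversion of 1 pg to about 978 Mb can be applied. The helper below is a small illustration (the function name is made up, and the conversion factor is the commonly cited one, not a value taken from this abstract).

```python
MBP_PER_PG = 978.0  # commonly cited conversion: 1 pg of DNA is about 978 Mb


def pg_to_gbp(picograms):
    """Convert a genome size in picograms to gigabase pairs."""
    return picograms * MBP_PER_PG / 1000.0
```

Under this conversion, the reported average of 14.26 pg corresponds to roughly 13.9 Gb.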


Subject(s)
Genome Size; Genome, Insect; Animals; DNA Transposable Elements; Orthoptera/genetics; Orthoptera/classification; Repetitive Sequences, Nucleic Acid; Grasshoppers/genetics; Grasshoppers/classification; Evolution, Molecular
20.
Genomics ; 116(5): 110900, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39067796

ABSTRACT

Taxus plants are the exclusive source of paclitaxel, an anticancer drug with significant medicinal and economic value. Interspecies hybridization and gene introgression during evolution have obscured distinctions among Taxus species, complicating their phylogenetic classification. While the chloroplast genome of Taxus wallichiana, a widely distributed species in China, has been sequenced, its mitochondrial genome (mitogenome) remains uncharacterized. We sequenced and assembled the T. wallichiana mitogenome using BGI short reads and Nanopore long reads, facilitating comparisons with other gymnosperm mitogenomes. The T. wallichiana mitogenome, spanning 469,949 bp, predominantly forms a circular configuration with a GC content of 50.51%, supplemented by 3 minor configurations mediated by one pair of LRs and two pairs of IntRs. It includes 32 protein-coding genes, 7 tRNA genes, and 3 rRNA genes, several of which exist in multiple copies. We detailed the mitogenome's structure, codon usage, RNA editing, and sequence migration between organelles, constructing a phylogenetic tree to elucidate evolutionary relationships. Unlike typical gymnosperm mitochondria, T. wallichiana shows no evidence of mitochondrial-plastid DNA transfer (MTPT), highlighting its unique genomic architecture. Synteny analysis indicated extensive genomic rearrangements in T. wallichiana, likely driven by recombination among abundant repetitive sequences. This study offers a high-quality T. wallichiana mitogenome, enhancing our understanding of gymnosperm mitochondrial evolution and supporting further cultivation and utilization of Taxus species.


Subject(s)
Genome, Mitochondrial; Phylogeny; Taxus; Taxus/genetics; Taxus/classification; Recombination, Genetic; RNA, Transfer/genetics; RNA Editing; Repetitive Sequences, Nucleic Acid