ABSTRACT
Replication-transcription collisions shape genomes, influence evolution, and promote genetic diseases. Although unclear why, head-on transcription (lagging strand genes) is especially disruptive to replication and promotes genomic instability. Here, we find that head-on collisions promote R-loop formation in Bacillus subtilis. We show that pervasive R-loop formation at head-on collision regions completely blocks replication, elevates mutagenesis, and inhibits gene expression. Accordingly, the activity of the R-loop processing enzyme RNase HIII at collision regions is crucial for stress survival in B. subtilis, as many stress response genes are head-on to replication. Remarkably, without RNase HIII, the ability of the intracellular pathogen Listeria monocytogenes to infect and replicate in hosts is weakened significantly, most likely because many virulence genes are head-on to replication. We conclude that the detrimental effects of head-on collisions stem primarily from excessive R-loop formation and that the resolution of these structures is critical for bacterial stress survival and pathogenesis.
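As a minimal sketch of the head-on versus co-directional distinction, a gene on a circular bacterial chromosome can be classified from its coordinate and strand once the origin and terminus are known. The function name, the coordinate convention, and the assumption of strictly bidirectional replication from a single origin are illustrative and do not come from the abstract.

```python
def collision_type(gene_pos, strand, ori, ter, genome_len):
    """Classify a gene as 'co-directional' or 'head-on' to replication.

    Assumes a circular chromosome replicated bidirectionally from `ori`
    to `ter` (hypothetical coordinates, 0-based, increasing clockwise).
    """
    # Clockwise distances from the origin, wrapped around the circle.
    d_gene = (gene_pos - ori) % genome_len
    d_ter = (ter - ori) % genome_len
    # On the clockwise replichore the fork moves toward higher coordinates.
    fork_moves_forward = d_gene < d_ter
    gene_moves_forward = strand == "+"
    if gene_moves_forward == fork_moves_forward:
        return "co-directional"
    return "head-on"
```

Under these assumptions, a minus-strand gene on the clockwise replichore is head-on (a lagging-strand gene in the abstract's terms), and the rule mirrors on the other replichore.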
Subjects
Bacillus subtilis/physiology , DNA Replication , Listeria monocytogenes/physiology , Genetic Transcription , Animals , DNA Replication Timing , Female , Gene Expression , Gene Knockout Techniques , Listeria monocytogenes/genetics , Listeria monocytogenes/pathogenicity , Listeriosis/microbiology , Mice , Physiological Stress , Virulence
ABSTRACT
Protection of euchromatin from invasion by gene-repressive heterochromatin is critical for cellular health and viability. In addition to constitutive loci such as pericentromeres and subtelomeres, heterochromatin can be found interspersed in gene-rich euchromatin, where it regulates gene expression pertinent to cell fate. While heterochromatin and euchromatin are globally poised for mutual antagonism, the mechanisms underlying precise spatial encoding of heterochromatin containment within euchromatic sites remain opaque. We investigated ectopic heterochromatin invasion by manipulating the fission yeast mating type locus boundary using a single-cell spreading reporter system. We found that heterochromatin repulsion is locally encoded by Set1/COMPASS on certain actively transcribed genes and that this protective role is most prominent at heterochromatin islands, small domains interspersed in euchromatin that regulate cell fate specifiers. Sensitivity to invasion by heterochromatin, surprisingly, is not dependent on Set1 altering overall gene expression levels. Rather, the gene-protective effect is strictly dependent on Set1's catalytic activity. H3K4 methylation, the Set1 product, antagonizes spreading in two ways: directly inhibiting catalysis by Suv39/Clr4 and locally disrupting nucleosome stability. Taken together, these results describe a mechanism for spatial encoding of euchromatic signals that repel heterochromatin invasion.
Subjects
Cell Cycle Proteins/metabolism , Heterochromatin/metabolism , Histone-Lysine N-Methyltransferase/metabolism , Nucleosomes/metabolism , Schizosaccharomyces pombe Proteins/metabolism , Schizosaccharomyces/enzymology , Schizosaccharomyces/genetics , Transcription Factors/metabolism , Acetylation , Catalysis , Enzyme Activation , Fungal Gene Expression Regulation , Gene Silencing , Histones/metabolism
ABSTRACT
The spatial distribution of genes in chromosomes seems not to be random. For instance, only 10% of genes are transcribed from bidirectional promoters in humans, and many more are organized into larger clusters. This raises intriguing questions previously asked by different authors. We would like to add a few more questions in this context, related to gene orientation inversions. Does gene orientation (inversion) follow a random pattern? Is it relevant to biological activity somehow? We define a new kind of network, which we call the gene orientation inversion network (GOIN). A GOIN is a complex network that encodes short- and long-range patterns of inversion of the orientation of pairs of genes in the chromosome. We selected Plasmodium falciparum as a case study due to the high relevance of this parasite to public health (causal agent of malaria). We constructed here for the first time all of the GOINs for the genome of this parasite. These networks have an average of 383 nodes (genes in one chromosome) and 1314 links (pairs of genes with inverse orientation). We calculated node centralities and other parameters of these networks. These numerical parameters were used to study different properties of gene inversion patterns, for example, distribution, local communities, similarity to Erdős-Rényi random networks, randomness, and so on. We find clues that seem to indicate that gene orientation inversion does not follow a random pattern. We noted that some gene communities in the GOINs tend to group genes encoding RIFIN-related proteins in the proteome of the parasite. RIFIN-like proteins are a second family of clonally variant proteins expressed on the surface of red cells infected with Plasmodium falciparum. Consequently, we used these centralities as inputs to machine learning (ML) models to predict the RIFIN-like activity of 5365 proteins in the proteome of Plasmodium sp. The best linear ML model found discriminates RIFIN-like from other proteins with sensitivity and specificity of 70-80% in training and external validation series. All of these results may point to a possible biological relevance of gene orientation inversion not directly dependent on genetic sequence information. This work opens the door to the use of GOINs as a tool for the study of the structure of chromosomes and the study of protein function in proteome research.
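The network idea can be illustrated with a toy construction: nodes are genes on one chromosome, and a link joins two genes of opposite orientation. The abstract does not say how link range is defined, so the fixed positional window, the function names, and the example input below are assumptions made purely for illustration.

```python
from collections import defaultdict

def build_goin(strands, window=2):
    """Build a toy orientation-inversion network for one chromosome.

    strands: list of '+'/'-' in chromosomal order (hypothetical input).
    window:  maximum index distance for a link (an assumed rule,
             not taken from the abstract).
    Returns a dict mapping gene index -> set of linked gene indices.
    """
    adj = defaultdict(set)
    n = len(strands)
    for i in range(n):
        adj[i]  # touch the key so every gene appears as a node
        for j in range(i + 1, min(i + window, n - 1) + 1):
            if strands[i] != strands[j]:  # inverse orientation
                adj[i].add(j)
                adj[j].add(i)
    return dict(adj)

def degree_centrality(adj):
    """Fraction of the other nodes each node is linked to."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(neighbors) / (n - 1) for node, neighbors in adj.items()}
```

Centralities computed this way are the kind of per-gene numerical descriptor that could feed a classifier; which centralities the authors actually used is not stated in the abstract.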
Subjects
Chromosomes/chemistry , Gene Regulatory Networks , Protozoan Genes , Membrane Proteins/genetics , Plasmodium falciparum/genetics , Proteome/genetics , Protozoan Proteins/genetics , Sequence Inversion , Erythrocytes/parasitology , Gene Expression Regulation , Humans , Machine Learning , Membrane Proteins/metabolism , Multigene Family , Plasmodium falciparum/metabolism , Protein Isoforms/genetics , Protein Isoforms/metabolism , Proteome/metabolism , Protozoan Proteins/metabolism , Software
ABSTRACT
Genomic analyses have proliferated without being tied to tangible phenotypes. For example, although coordination of both gene expression and genetic linkage have been offered as genetic mechanisms for the frequently observed clustering of genes participating in fungal metabolic pathways, elucidation of the phenotype(s) favored by selection, resulting in cluster formation and maintenance, has not been forthcoming. We noted that the cause of certain well-studied human metabolic disorders is the accumulation of toxic intermediate compounds (ICs), which occurs when the product of an enzyme is not used as a substrate by a downstream neighbor in the metabolic network. This raises the hypothesis that the phenotype favored by selection to drive gene clustering is the mitigation of IC toxicity. To test this, we examined 100 diverse fungal genomes for the simplest type of cluster, gene pairs that are both metabolic neighbors and chromosomal neighbors immediately adjacent to each other, which we refer to as "double neighbor gene pairs" (DNGPs). Examination of the toxicity of their corresponding ICs shows that, compared with chromosomally nonadjacent metabolic neighbors, DNGPs are enriched for ICs that have acutely toxic LD50 doses or reactive functional groups. Furthermore, DNGPs are significantly more likely to be divergently oriented on the chromosome; remarkably, ~40% of these DNGPs have ICs known to be toxic. We submit that the structure of synteny in metabolic pathways of fungi is a signature of selection for protection against the accumulation of toxic metabolic intermediates.
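The DNGP definition lends itself to a short sketch: scan immediately adjacent genes on a chromosome, keep the pairs that are also metabolic neighbors, and label their relative orientation. The input layout, the function name, and the orientation labels are assumptions for illustration, not the authors' pipeline.

```python
def find_dngps(chromosome, metabolic_pairs):
    """Find adjacent gene pairs that are also metabolic neighbors.

    chromosome:      list of (gene_id, strand) in positional order
                     (hypothetical input layout).
    metabolic_pairs: set of frozensets of gene ids whose enzymes share
                     an intermediate compound.
    Returns a list of (gene1, gene2, orientation) tuples.
    """
    results = []
    for (g1, s1), (g2, s2) in zip(chromosome, chromosome[1:]):
        if frozenset((g1, g2)) not in metabolic_pairs:
            continue  # chromosomal neighbors but not metabolic neighbors
        if s1 == s2:
            orientation = "tandem"      # -> ->  or  <- <-
        elif s1 == "-" and s2 == "+":
            orientation = "divergent"   # <- ->  (transcribed away from each other)
        else:
            orientation = "convergent"  # -> <-
        results.append((g1, g2, orientation))
    return results
```

Counting the "divergent" label among DNGPs versus nonadjacent metabolic neighbors would be one way to examine the orientation enrichment the abstract reports.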
Subjects
Physiological Adaptation/genetics , Fungi/genetics , Genetic Linkage , Hazardous Substances/toxicity , Fungi/classification , Fungi/drug effects , Fungi/physiology , Hazardous Substances/metabolism
ABSTRACT
Kenaf (Hibiscus cannabinus) is one of the fastest-growing bast fiber crops in the world and belongs to the family Malvaceae. However, the systematic classification and chloroplast (cp) genome of kenaf have not been reported to date. In this study, we sequenced the cp genome of kenaf and conducted phylogenetic and comparative analyses within the family Malvaceae. The H. cannabinus cp genome is 162,903 bp in length and contains 113 unique genes (79 protein-coding genes, four rRNA genes, and 30 tRNA genes). Phylogenetic analysis indicated that the cp genome sequence of H. cannabinus is more closely related to those of Talipariti hamabo and Abelmoschus esculentus than to that of Hibiscus syriacus, which disagrees with the taxonomical relationship. Further analysis produced a new version of the cp genome annotation of H. syriacus and found that orientation variation of the small single copy (SSC) region exists widely in the family Malvaceae. The highly variable ycf1 and the highly conserved gene rrn32 were identified within the family Malvaceae. In particular, the explanation for two different SSC orientations in the cp genomes associated with phylogenetic analysis is discussed. These results provide insights into the systematic classification of the Hibiscus genus in the Malvaceae family.
ABSTRACT
Base composition asymmetry and gene orientation bias are two common genomic structures in bacterial genomes. Here, correlation coefficients between nucleotide disparities and coding sequence (CDS) skew have been calculated, which provides insights into their relationship from an individual genome perspective. Consequently, we find GC and RY disparities correlate significantly with CDS skew, since around 60% of the bacterial genomes under study have correlation coefficients > 0.9. Then, we present a model for quantitative assessment of nucleotide disparity and CDS skew in which a numerical index R² is used for evaluation. We find that skew curves with higher R² perform better on the prediction of replication origins in bacteria.
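One way to picture the correlation described here is to build two cumulative curves along the genome, one for the excess of G over C and one for the excess of plus-strand over minus-strand coding sequences, and compute their Pearson correlation. The windowing scheme, the function names, and the gene representation below are assumptions for illustration; the abstract does not give the exact formulas.

```python
from math import sqrt

def cumulative_gc_disparity(seq, window):
    """Running excess of G over C, sampled at the end of each window."""
    total, curve = 0, []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window].upper()
        total += chunk.count("G") - chunk.count("C")
        curve.append(total)
    return curve

def cumulative_cds_skew(genes, seq_len, window):
    """Running (plus-strand minus minus-strand) CDS count per window.

    genes: list of (start_position, strand) tuples (hypothetical layout).
    """
    n_windows = (seq_len + window - 1) // window
    per_window = [0] * n_windows
    for start, strand in genes:
        per_window[start // window] += 1 if strand == "+" else -1
    total, curve = 0, []
    for value in per_window:
        total += value
        curve.append(total)
    return curve

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant curve")
    return cov / (sx * sy)
```

In curves of this kind, the extremum of the cumulative disparity is commonly taken as a candidate replication origin or terminus; how the authors score curve quality with their index is not detailed in the abstract.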
Subjects
Bacterial Genome/genetics , Base Composition , Genomics , Genetic Models , Nucleotides/genetics
ABSTRACT
Inferring transcriptional direction (orientation) of the CRISPR array is essential for many applications, including systematically investigating non-canonical CRISPR/Cas functions. The standard method, CRISPRDirection (embedded within CRISPRCasFinder), fails to predict the orientation (ND predictions) for ~37% of the classified CRISPR arrays (>2200 loci); this goes up to >70% for the II-B subtype where non-canonical functions were first experimentally discovered. Alternatively, Potential Orientation (also embedded within CRISPRCasFinder), has a much smaller frequency of ND predictions but might have significantly lower accuracy. We propose a novel simple criterion, where the CRISPR array direction is assigned according to the direction of its associated cas genes (Cas Orientation). We systematically assess the performance of the three methods (Cas Orientation, CRISPRDirection, and Potential Orientation) across all CRISPR/Cas subtypes, by a mutual crosscheck of their predictions, and by comparing them to the experimental dataset. Interestingly, CRISPRDirection agrees much better with Cas Orientation than with Potential Orientation, despite CRISPRDirection and Potential Orientation being mutually related - Potential Orientation corresponding to one of six (heterogeneous) predictors employed by CRISPRDirection - and being unrelated to Cas Orientation. We find that Cas Orientation has much higher accuracy compared to Potential Orientation and comparable accuracy to CRISPRDirection - while accurately assigning an orientation to ~95% of the CRISPR arrays that are non-determined by CRISPRDirection. Cas Orientation is, at the same time, simple to employ, requiring only (routine for prokaryotes) the prediction of the associated protein coding gene direction.
ABSTRACT
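The Cas Orientation criterion can be sketched in a few lines: take the strands of the cas genes associated with an array and assign the array the same direction. The abstract does not say how mixed-strand cas clusters are handled, so the majority vote and the "ND" fallback below are assumptions, as is the function name.

```python
def cas_orientation(cas_strands):
    """Assign a CRISPR array direction from its associated cas genes.

    cas_strands: list of '+'/'-' strands of the associated cas genes.
    Returns '+', '-', or 'ND' (not determined) when there is no
    majority; the majority rule is an assumed tie-handling choice.
    """
    plus = cas_strands.count("+")
    minus = cas_strands.count("-")
    if plus > minus:
        return "+"
    if minus > plus:
        return "-"
    return "ND"
```

For example, an array whose associated cas operon lies entirely on the minus strand would be assigned "-", and an array with no associated cas genes would remain "ND".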
Several plant biotechnology applications are based on the expression of multiple genes located on a single transformation vector. The principles of stable expression of foreign genes in plant cells include integration of full-length gene fragments consisting of promoter and transcription terminator sequences, and avoiding converging orientation of the gene transcriptional direction. Therefore, investigators usually generate constructs in which genes are assembled in the same orientation. However, no specific information is available on the effect of the order in which genes should be assembled in the construct to support optimum expression of each gene upon integration in the genome. While many factors, including genomic position and the integration structure, could affect gene expression, investigators judiciously design DNA constructs to avoid glitches. However, the gene order in a multigene assembly remains an open question. This study addressed the effect of gene order in the DNA construct on gene expression in rice using a simple design of two genes placed in two possible orders with respect to the genomic context. Transgenic rice lines containing green fluorescent protein (GFP) and β-glucuronidase (GUS) genes in two distinct orders were developed by Cre-lox-mediated site-specific integration. Gene expression analysis of transgenic lines showed that both genes were expressed at similar levels in either order, and different transgenic lines expressed each gene within a 1-2× range. Thus, no significant effect of the gene order on gene expression was found in the transformed rice lines containing precise site-specific integrations, and stable gene expression in plant cells could be obtained with altered gene orders. Therefore, gene orientation and integration structures are more important factors governing gene expression than gene orders in the genomic context.