Results 1 - 20 of 21
1.
Int J Mol Sci ; 25(7)2024 Mar 31.
Article in English | MEDLINE | ID: mdl-38612733

ABSTRACT

In the human genome, two short open reading frames (ORFs) separated by a transcriptional silencer and a small intervening sequence stem from the gene SMIM45. The two ORFs show different translational characteristics, and they also show divergent patterns of evolutionary development. The studies presented here describe the evolution of the components of SMIM45. One ORF consists of an ultra-conserved 68 amino acid (aa) sequence, whose origins can be traced beyond the evolutionary age of divergence of the elephant shark, ~462 MYA. The silencer also has ancient origins, but it has a complex and divergent pattern of evolutionary formation, as it overlaps both at the 68 aa ORF and the intervening sequence. The other ORF consists of 107 aa. It develops during primate evolution but is found to originate de novo from an ancestral non-coding genomic region with root origins within the Afrothere clade of placental mammals, whose evolutionary age of divergence is ~99 MYA. The formation of the complete 107 aa ORF during primate evolution is outlined, whereby sequence development is found to occur through biased mutations, with disruptive random mutations that also occur but lead to a dead-end. The 107 aa ORF is of particular significance, as there is evidence to suggest it is a protein that may function in human brain development. Its evolutionary formation presents a view of a human-specific ORF and its linked silencer that were predetermined in non-primate ancestral species. The genomic position of the silencer offers interesting possibilities for the regulation of transcription of the 107 aa ORF. A hypothesis is presented with respect to possible spatiotemporal expression of the 107 aa ORF in embryonic tissues.


Subjects
Human Genome , Placenta , Female , Pregnancy , Animals , Humans , Open Reading Frames/genetics , Amino Acid Sequence , Primates , Mammals
2.
Wiley Interdiscip Rev RNA ; 15(2): e1845, 2024.
Article in English | MEDLINE | ID: mdl-38605485

ABSTRACT

For a long time, it was believed that new genes arise only from modifications of preexisting genes, but the discovery of de novo protein-coding genes that originated from noncoding DNA regions demonstrates the existence of a "motherless" origination process for new genes. However, the features, distributions, expression profiles, and origin modes of these genes in humans seem to support the notion that their origin is not a purely "motherless" process; rather, these genes arise preferentially from genomic regions encoding preexisting precursors with gene-like features. In such a case, the gene loci are typically not brand new. In this short review, we will summarize the definition and features of human de novo genes and clarify their process of origination from ancestral non-coding genomic regions. In addition, we define the favored precursors, or "hopeful monsters," for the origin of de novo genes and present a discussion of the functional significance of these young genes in brain development and tumorigenesis in humans. This article is categorized under: RNA Evolution and Genomics > RNA and Ribonucleoprotein Evolution.


Subjects
Molecular Evolution , RNA , Humans
3.
PLoS One ; 17(5): e0267864, 2022.
Article in English | MEDLINE | ID: mdl-35552551

ABSTRACT

The process of gene birth is of major interest with current excitement concerning de novo gene formation. We report a new and different mechanism of de novo gene birth based on the finding and the characteristics of a short non-coding sequence situated between two protein genes, termed a spacer sequence. This non-coding sequence is present in genomes of Mus musculus, the house mouse and Philippine tarsier, a primitive ancestral primate. The ancestral sequence is highly conserved during primate evolution with certain base pairs totally invariant from mouse to humans. By following the birth of the sequence of human lincRNA BCRP3 (BCR activator of RhoGEF and GTPase 3 pseudogene) during primate evolution, we find diverse genes, long non-coding RNA and protein genes (and sequences that do not appear to encode a gene) that all stem from the 3' end of the spacer, and all begin with a similar sequence. During primate evolution, part of the BCRP3 sequence initially formed in the Old World Monkeys and developed into different primate genes before evolving into the BCRP3 gene in humans. The gene developmental process consists of the initiation of DNA synthesis at spacer 3' ends, addition of a complex of tandem transposable elements and the addition of a segment of another gene. The findings support the concept of the spacer sequence as a starting site for DNA synthesis that leads to formation of different genes with the addition of other sequences. These data suggest a new process of de novo gene birth.


Subjects
Molecular Evolution , Hominidae , Animals , DNA Transposable Elements , Genomics , Hominidae/genetics , Mice , Phylogeny , Primates/genetics , Proteins/genetics , Pseudogenes
4.
Noncoding RNA ; 8(1)2022 Jan 10.
Article in English | MEDLINE | ID: mdl-35076559

ABSTRACT

We are delighted to share with you our seventh Journal Club and highlight some of the most interesting papers published recently [...].

5.
Front Genet ; 12: 661425, 2021.
Article in English | MEDLINE | ID: mdl-33995491

ABSTRACT

The origin of genes has been a major topic of research for many years, although in some cases it has been a difficult process to elucidate. A recent publication offers insight by showing experimentally how one gene, linc-UR-UB, was born. This gene is regulated in a complex manner in male germ cells during spermatogenesis and is believed to participate in the regulation of levels of the ubiquitin specific peptidase 18 (USP18) mRNA. The process of formation of linc-UR-UB appears relatively simple. It involves a transcription read-through from an upstream gene to a downstream functional element, the USP18 3' UTR sequence. This small element also shares the same sequence as the 3' ends of the lincRNA FAM247 family genes. In addition to linc-UR-UB, it is possible that other genes formed in a similar fashion, involving a genomic sequence read-through to a functional element.

6.
Noncoding RNA ; 6(3)2020 Sep 03.
Article in English | MEDLINE | ID: mdl-32899105

ABSTRACT

A small phylogenetically conserved sequence of 11,231 bp, termed FAM247, is repeated in human chromosome 22 by segmental duplications. This sequence forms part of diverse genes that span evolutionary time, the protein genes being the earliest as they are present in zebrafish and/or mice genomes, and the long noncoding RNA genes and pseudogenes the most recent as they appear to be present only in the human genome. We propose that the conserved sequence provides a nucleation site for new gene development at evolutionarily conserved chromosomal loci where the FAM247 sequences reside. The FAM247 sequence also carries information in its open reading frames that provides protein exon amino acid sequences; one exon plays an integral role in immune system regulation, specifically, the function of ubiquitin-specific protease (USP18) in the regulation of interferon. An analysis of this multifaceted sequence and the genesis of genes that contain it is presented.

7.
PLoS One ; 15(3): e0230236, 2020.
Article in English | MEDLINE | ID: mdl-32214344

ABSTRACT

Pathways leading to formation of non-coding RNA and protein genes are varied and complex. We report finding a conserved repeat sequence present in human and chimpanzee genomes that appears to have originated from a common primate ancestor. This sequence is repeatedly copied in human chromosome 22 (chr22) low copy repeats (LCR22) or segmental duplications and forms twenty-one different genes, which include the human long intergenic non-coding RNA (lincRNA) family FAM230, a newly discovered lincRNA gene family termed conserved long intergenic non-coding RNAs (clincRNA), pseudogene families, as well as the gamma-glutamyltransferase (GGT) protein gene family and the RNA pseudogenes that originate from GGT sequences. Of particular interest are the GGT5 and USP18 protein genes that appear to have formed from an homologous repeat sequence that also forms the clincRNA gene family. The data point to ancestral DNA sequences, conserved through evolution and duplicated in humans by chromosomal repeat sequences that may serve as functional genomic elements in the development of diverse genes.


Subjects
Proteins/genetics , Pseudogenes/genetics , Long Noncoding RNA/genetics , Animals , Carrier Proteins/genetics , Chromosome Mapping/methods , Conserved Sequence/genetics , DNA Transposable Elements/genetics , Molecular Evolution , Humans , Pan troglodytes/genetics , gamma-Glutamyltransferase/genetics
8.
Noncoding RNA ; 4(3)2018 Jul 20.
Article in English | MEDLINE | ID: mdl-30036931

ABSTRACT

A family of long intergenic noncoding RNA (lincRNA) genes, FAM230 is formed via gene sequence duplication, specifically in human chromosomal low copy repeats (LCR) or segmental duplications. This is the first group of lincRNA genes known to be formed by segmental duplications and is consistent with current views of evolution and the creation of new genes via DNA low copy repeats. It appears to be an efficient way to form multiple lincRNA genes. But as these genes are in a critical chromosomal region with respect to the incidence of abnormal translocations and resulting genetic abnormalities, the 22q11.2 region, and also carry a translocation breakpoint motif, several intriguing questions arise concerning the presence and function of the translocation breakpoint sequence in RNA genes situated in LCR22s.

9.
PLoS One ; 13(4): e0195702, 2018.
Article in English | MEDLINE | ID: mdl-29668722

ABSTRACT

FAM230C, a long intergenic non-coding RNA (lincRNA) gene in human chromosome 13 (chr13), is a member of lincRNA genes termed family with sequence similarity 230. An analysis using bioinformatics search tools and alignment programs was undertaken to determine properties of FAM230C and its related genes. Results reveal that the DNA translocation element, the Translocation Breakpoint Type A (TBTA) sequence, which consists of satellite DNA, Alu elements, and AT-rich sequences, is embedded in the FAM230C gene. Eight lincRNA genes related to FAM230C also carry the TBTA sequences. These genes were formed from a large segment of the 3' half of the FAM230C sequence duplicated in chr22, and are specifically in regions of low copy repeats (LCR22s), in or close to the 22q11.2 region. 22q11.2 is a chromosomal segment that undergoes a high rate of DNA translocation and is prone to genetic deletions. FAM230C-related genes present in other chromosomes do not carry the TBTA motif and were formed from the 5' half region of the FAM230C sequence. These findings identify a high specificity in lincRNA gene formation by gene sequence duplication in different chromosomes.


Subjects
Human Chromosome Pair 22 , Long Noncoding RNA/genetics , Genetic Translocation/genetics , AT Rich Sequence , Humans , Multigene Family
10.
World J Biol Chem ; 6(4): 272-80, 2015 Nov 26.
Article in English | MEDLINE | ID: mdl-26629310

ABSTRACT

The first evidence that RNA can function as a regulator of gene expression came from experiments with prokaryotes in the 1980s. It was shown that Escherichia coli micF is an independent gene, has its own promoter, and encodes a small non-coding RNA that base pairs with and inhibits translation of a target messenger RNA in response to environmental stress conditions. The micF RNA was isolated, sequenced and shown to be a primary transcript. In vitro experiments showed binding to the target ompF mRNA. Secondary structure probing revealed an imperfect micF RNA/ompF RNA duplex interaction and the presence of a non-canonical base pair. Several transcription factors, including OmpR, regulate micF transcription in response to environmental factors. micF has also been found in other bacterial species. Recently, however, Gerhart Wagner and Jörg Vogel showed pleiotropic effects and found that micF inhibits expression of multiple target mRNAs; importantly, one is the global regulatory gene lrp. In addition, micF RNA was found to interact with its targets in different ways; it either inhibits ribosome binding or induces degradation of the message. Thus the concept and initial experimental evidence that RNA can regulate gene expression was born with prokaryotes.

11.
BMC Genomics ; 16: 785, 2015 Oct 14.
Article in English | MEDLINE | ID: mdl-26467088

ABSTRACT

BACKGROUND: DiGeorge Syndrome is a genetic abnormality involving ~3 Mb deletion in human chromosome 22, termed 22q11.2. To better understand the non-coding regions of 22q11.2, a small 10,000 bp non-protein-coding sequence close to the DiGeorge Critical Region 6 gene (DGCR6) was chosen for analysis and functional entities as the homologous sequence in the chimpanzee genome could be aligned and used for comparisons. METHODS: The GenBank database provided genomic sequences. In silico computer programs were used to find homologous DNA sequences in human and chimpanzee genomes, generate random sequences, determine DNA sequence alignments, sequence comparisons and nucleotide repeat copies, and to predict DNA secondary structures. RESULTS: At its 5' half, the 10,000 bp sequence has three distinct sections that represent phylogenetically variable sequences. These Variable Regions contain biased mutations with a very high A + T content, multiple copies of the motif TATAATATA and sequences that fold into long A:T-base-paired stem loops. The 3' half of the 10,000 bp unit, highly conserved between human and chimpanzee, has sequences representing exons of lncRNA genes and segments of introns of protein genes. Central to the 10,000 bp unit are the multiple copies of a sequence that originates from the flanking 5' end of the translocation breakpoint Type A sequence. This breakpoint flanking sequence carries the exon and intron motifs. The breakpoint Type A sequence seems to be a major player in the proliferation of these RNA motifs, as well as the proliferation of Variable Regions in the 10,000 bp segment and other regions within 22q11.2. CONCLUSIONS: The data indicate that a non-coding region of the chromosome may be reserved for highly biased mutations that lead to formation of specialized sequences and DNA secondary structures. On the other hand, the highly conserved nucleotide sequence of the non-coding region may form storage sites for RNA motifs.


Subjects
Human Chromosome Pair 22/genetics , DNA/genetics , Human Genome , Nucleotide Motifs/genetics , Amino Acid Sequence/genetics , Animals , Base Sequence , Conserved Sequence/genetics , Humans , Mutation , Pan troglodytes/genetics , Proteins/genetics , Small Untranslated RNA/genetics
12.
Int J Mol Sci ; 14(11): 21960-4, 2013 Nov 06.
Article in English | MEDLINE | ID: mdl-24201126

ABSTRACT

This Special Issue of IJMS is devoted to regulation by non-coding RNAs and contains both original research and review articles. An attempt is made to provide an up-to-date analysis of this very fast moving field and cover regulatory roles of both microRNAs and long non-coding RNAs. Multifaceted functions of these RNAs in normal cellular processes, as well as in disease progression, are highlighted.


Subjects
MicroRNAs/genetics , Untranslated RNA/genetics , RNA/genetics , Humans
13.
Int J Mol Sci ; 14(7): 13307-28, 2013 Jun 26.
Article in English | MEDLINE | ID: mdl-23803660

ABSTRACT

Growing evidence shows a close association of transposable elements (TE) with non-coding RNAs (ncRNA), and a significant number of small ncRNAs originate from TEs. Further, ncRNAs linked with TE sequences participate in a wide range of regulatory functions. Alu elements in particular are critical players in gene regulation and molecular pathways. Alu sequences embedded in both long non-coding RNAs (lncRNA) and mRNAs form the basis of targeted mRNA decay via short imperfect base-pairing. Imperfect pairing is prominent in most ncRNA/target RNA interactions and found throughout all biological kingdoms. The piRNA-Piwi complex is multifunctional, but plays a major role in protection against invasion by transposons. This is an RNA-based genetic immune system similar to the one found in prokaryotes, the CRISPR system. Thousands of long intergenic non-coding RNAs (lincRNAs) are associated with endogenous retrovirus LTR transposable elements in human cells. These TEs can provide regulatory signals for lincRNA genes. A surprisingly large number of long circular ncRNAs have been discovered in human fibroblasts. These serve as "sponges" for miRNAs. Alu sequences, encoded in introns that flank exons, are proposed to participate in RNA circularization via Alu/Alu base-pairing. Diseases are increasingly found to have a TE/ncRNA etiology. A single point mutation in a SINE/Alu sequence in a human long non-coding RNA leads to brainstem atrophy and death. On the other hand, genomic clusters of repeat sequences as well as lncRNAs function in epigenetic regulation. Some clusters are unstable, which can lead to formation of diseases such as facioscapulohumeral muscular dystrophy. The future may hold more surprises regarding diseases associated with ncRNAs and TEs.


Subjects
DNA Transposable Elements/physiology , Endogenous Retroviruses/physiology , Long Noncoding RNA/physiology , Alu Elements/physiology , Animals , Humans , Messenger RNA/genetics , Messenger RNA/metabolism , Small Interfering RNA/physiology
14.
Mol Microbiol ; 84(3): 401-4, 2012 May.
Article in English | MEDLINE | ID: mdl-22380658

ABSTRACT

Studies on the regulatory RNA MicF in Enterobacteriaceae reveal a pivotal role in gene regulation. Multiple target gene mRNAs were identified and, importantly, MicF RNA regulates the expression of the global regulatory gene lrp (Holmqvist et al., 2012; Corcoran et al., 2012). Thus MicF RNA is a central factor in a regulatory network that regulates bacterial cell physiology.


Subjects
Bacterial Proteins/genetics , Escherichia coli/genetics , Bacterial Gene Expression Regulation , Regulator Genes , Leucine-Responsive Regulatory Protein/genetics , Bacterial RNA/metabolism , Salmonella/genetics , Bacterial Proteins/metabolism , Escherichia coli/metabolism , Leucine-Responsive Regulatory Protein/metabolism , Bacterial RNA/genetics , Salmonella/metabolism
15.
Genome Biol Evol ; 3: 959-73, 2011.
Article in English | MEDLINE | ID: mdl-21803768

ABSTRACT

Intergenic regions of prokaryotic genomes carry multiple copies of terminal inverted repeat (TIR) sequences, the nonautonomous miniature inverted-repeat transposable element (MITE). In addition, there are the repetitive extragenic palindromic (REP) sequences that fold into a small stem loop rich in G-C bonding. And the clustered regularly interspaced short palindromic repeats (CRISPRs) display similar small stem loops but are an integral part of a complex genetic element. Other classes of repeats such as the REP2 element do not have TIRs but show other signatures. With the current availability of a large number of whole-genome sequences, many new repeat elements have been discovered. These sequences display diverse properties. Some show an intimate linkage to integrons, and at least one encodes a small RNA. Many repeats are found fused with chromosomal open reading frames, and some are located within protein coding sequences. Small repeat units appear to work hand in hand with the transcriptional and/or post-transcriptional apparatus of the cell. Functionally, they are multifaceted, and this can range from the control of gene expression, the facilitation of host/pathogen interactions, or stimulation of the mammalian immune system. The CRISPR complex displays dramatic functions such as an acquired immune system that defends against invading viruses and plasmids. Evolutionarily, mobile repeat elements may have influenced a cycle of active versus inactive genes in ancestral organisms, and some repeats are concentrated in regions of the chromosome where there is significant genomic plasticity. Changes in the abundance of genomic repeats during the evolution of an organism may have resulted in a benefit to the cell or posed a disadvantage, and some present day species may reflect a purification process. The diverse structure, eclectic functions, and evolutionary aspects of repeat elements are described.


Subjects
Bacteria/genetics , DNA Transposable Elements/genetics , Molecular Evolution , Bacterial Genome/genetics , Terminal Repeat Sequences/genetics , Base Sequence , Inverted Repeat Sequences/genetics , Molecular Sequence Data , Molecular Structure , Open Reading Frames/genetics , RNA/genetics , Untranslated RNA/genetics
16.
PLoS One ; 4(11): e7941, 2009 Nov 20.
Article in English | MEDLINE | ID: mdl-19936201

ABSTRACT

BACKGROUND: Plasmids of Borrelia species are dynamic structures that contain a large number of repetitive genes, gene fragments, and gene fusions. In addition, the transposable element IS605/200 family, as well as degenerate forms of this IS element, are prevalent. In Helicobacter pylori, flanking regions of the IS605 transposase gene contain sequences that fold into identical small stem loops. These function in transposition at the single-stranded DNA level. METHODOLOGY/PRINCIPAL FINDINGS: In work reported here, bioinformatics techniques were used to scan Borrelia plasmid genomes for IS605 transposable element specific stem loop sequences. Two variant stem loop motifs are found in the left and right flanking regions of the transposase gene. Both motifs appear to have dispersed in plasmid genomes and are found "free-standing" and phylogenetically conserved without the associated IS605 transposase gene or the adjacent flanking sequence. Importantly, IS605 specific stem loop sequences are also found at the 3' ends of lipoprotein genes (PFam12 and PFam60); however, the left and right sequences appear to develop their own evolutionary patterns. The lipoprotein gene-linked left stem loop sequences maintain the IS605 stem loop motif in orthologs but only at the RNA level. These show mutations whereby variants fold into phylogenetically conserved RNA-type stem loops that contain the wobble non-Watson-Crick G-U base-pairing. The right flanking sequence is associated with the family lipoprotein-1 genes. A comparison of homologs shows that the IS605 stem loop motif rapidly dissipates, but a more elaborate secondary structure appears to develop in its place. CONCLUSIONS/SIGNIFICANCE: Stem loop sequences specific to the transposable element IS605 are present in plasmid regions devoid of a transposase gene and, significantly, are found linked to lipoprotein genes in Borrelia plasmids. These sequences are evolutionarily conserved and/or structurally developed in an RNA format. The findings show that IS605 stem loop sequences are multifaceted and are selectively conserved during evolution when the transposable element dissipates.


Subjects
Borrelia/genetics , DNA Transposable Elements/genetics , Lipoproteins/genetics , Plasmids/metabolism , Amino Acid Sequence , Base Sequence , Computational Biology/methods , Single-Stranded DNA/genetics , Molecular Evolution , Helicobacter pylori/metabolism , Genetic Models , Molecular Sequence Data , Nucleic Acid Conformation , Amino Acid Sequence Homology , Nucleic Acid Sequence Homology , Transposases/genetics
17.
BMC Genomics ; 10: 101, 2009 Mar 06.
Article in English | MEDLINE | ID: mdl-19267927

ABSTRACT

BACKGROUND: Borrelia species are unusual in that they contain a large number of linear and circular plasmids. Many of these plasmids have long intergenic regions. These regions have many fragmented genes and repeated sequences and appear to be in a state of flux, but they may serve as reservoirs for evolutionary change and/or maintain stable motifs such as small RNA genes. RESULTS: In an in silico study, intergenic regions of Borrelia plasmids were scanned for phylogenetically conserved stem loop structures that may represent functional units at the RNA level. Five repeat sequences were found that could fold into stable RNA-type stem loop structures, three of which are closely linked to protein genes, one of which is a member of the Borrelia lipoprotein_1 super family genes and another is the complement regulator-acquiring surface protein_1 (CRASP-1) family. Modeled secondary structures of repeat sequences display numerous base-pair compensatory changes in stem regions, including C-G-->A-U transversions when orthologous sequences are compared. Base-pair compensatory changes constitute strong evidence for phylogenetic conservation of secondary structure. CONCLUSION: Intergenic regions of Borrelia species carry evolutionarily stable RNA secondary structure motifs. Of major interest is that some motifs are associated with protein genes that show large sequence variability. The cell may conserve these RNA motifs while allowing a large flux in amino acid sequence, possibly to create new virulence factors but with associated RNA motifs intact.


Subjects
Borrelia/genetics , Conserved Sequence , Intergenic DNA/genetics , Nucleic Acid Conformation , Bacterial RNA/genetics , Amino Acid Sequence , Base Sequence , Computational Biology , Molecular Evolution , Molecular Models , Molecular Sequence Data , Phylogeny , Plasmids/genetics , Sequence Alignment , DNA Sequence Analysis
18.
Mol Microbiol ; 67(3): 475-81, 2008 Feb.
Article in English | MEDLINE | ID: mdl-18086200

ABSTRACT

Small repeat sequences in bacterial genomes, which represent non-autonomous mobile elements, have close similarities to archaeon and eukaryotic miniature inverted repeat transposable elements. These repeat elements are found in both intergenic and intragenic chromosomal regions, and contain an array of diverse motifs. These can include DNA sequences containing an integration host factor binding site and a proposed DNA methyltransferase recognition site, transcribed RNA secondary structural motifs, which are involved in mRNA regulation, and translated open reading frames found fused to other open reading frames. Some bacterial mobile element fusions are in evolutionarily conserved protein and RNA genes. Others might represent or lead to creation of new protein genes. Here we review the remarkable properties of these small bacterial mobile elements in the context of possible beneficial roles resulting from random insertions into the genome.


Subjects
Bacteria/genetics , Interspersed Repetitive Sequences , Bacterial DNA/genetics , Insertional Mutagenesis , Genetic Recombination , Nucleic Acid Repetitive Sequences
19.
Gene Regul Syst Bio ; 1: 191-205, 2007 Oct 16.
Article in English | MEDLINE | ID: mdl-19936088

ABSTRACT

Intergenic repeat units of 127-bp (RU-1) and 168-bp (RU-2), as well as a newly-found class of 103-bp (RU-3), represent small mobile sequences in enterobacterial genomes present in multiple intergenic regions. These repeat sequences display similarities to eukaryotic miniature inverted-repeat transposable elements (MITE). The RU mobile elements have not been reported to encode amino acid sequences. An in silico approach was used to scan genomes for location of repeat units. RU sequences are found to have open reading frames, which are present in annotated gene loci whereby the RU amino acid sequence is maintained. Gene loci that display repeat units include those that encode large proteins which are part of super families that carry conserved domains and those that carry predicted motifs such as signal peptide sequences and transmembrane domains. A putative exported protein in Y. pestis and a phylogenetically conserved putative inner membrane protein in Salmonella species represent some of the more interesting constructs. We hypothesize that a major outcome of RU open reading frame fusions is the evolutionary emergence of new proteins.

20.
Biol Direct ; 1: 12, 2006 May 22.
Article in English | MEDLINE | ID: mdl-16716220

ABSTRACT

INTRODUCTION: Three major outer membrane protein genes of Escherichia coli, ompF, ompC, and ompA, respond to stress factors. Transcripts from these genes are regulated by the small non-coding RNAs micF, micC, and micA, respectively. Here we examine Photorhabdus luminescens, an organism that has a different habitat from E. coli, for outer membrane protein genes and their regulatory RNA genes. RESULTS: By bioinformatics analysis of conserved genetic loci, mRNA 5'UTR sequences, RNA secondary structure motifs, upstream promoter regions and protein sequence homologies, an ompF-like porin gene in P. luminescens as well as a duplication of this gene have been predicted. Gene loci for micF RNA, as well as OmpC protein and its associated regulatory micC RNA, were not found. Significantly, a sequence bearing the appropriate signatures of the E. coli micA RNA was located. The ompA homolog was previously annotated in P. luminescens. CONCLUSION: Presence of an ompF-like porin in P. luminescens is in keeping with the necessity to allow for passage of small molecules into the cell. The apparent lack of ompC, micC and micF suggests that these genes are not essential to P. luminescens, and ompC and micF in particular may have been lost when the organism entered its defined life cycle and partially protected habitat. Control of porin gene expression by RNA may be more prevalent in free-living cells where survival is dependent on the ability to make rapid adjustments in response to environmental stress. Regulation of ompA by micA may have been retained due to a necessity for ompA control during one or both stages of the P. luminescens life cycle.
