Results 1 - 19 of 19
1.
Bioinformatics ; 38(9): 2626-2627, 2022 04 28.
Article in English | MEDLINE | ID: mdl-35244144

ABSTRACT

SUMMARY: Transgene-design is a web application to help design transgenes for use in mammalian studies. It is predicated on the recent discovery that human intronless transgenes and native retrogenes can be expressed very effectively if the GC content at exonic synonymous sites is high. In addition, as exonic splice enhancers resident in intron-containing genes may have different utility in intronless genes, their density can be reduced or increased. Input can be a native gene or a commercially 'optimised' gene. The user can also leave in the first intron and protect or avoid other motifs. AVAILABILITY AND IMPLEMENTATION: Transgene-design is built on the Ruby on Rails platform. The application is available at https://transgene-design.bath.ac.uk. The code is available under the GNU General Public License from GitHub (https://github.com/smuehlh/transgenes). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Mammals, Software, Animals, Humans, Introns, Exons, Transgenes, Mutation, Mammals/genetics
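
The quantity this abstract turns on, GC content at synonymous sites, can be approximated with a short calculation. The sketch below is not the Transgene-design code; it assumes that the third position of every codon (GC3) is an acceptable stand-in for "exonic synonymous sites", and the function name `gc3` is hypothetical.

```python
def gc3(cds):
    """Fraction of G or C at third codon positions of a coding sequence.

    Assumption: GC3 is used here as a rough proxy for GC content at
    synonymous sites; the tool described above may define it differently.
    """
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    third_positions = cds[2::3]  # every third base, starting at index 2
    if not third_positions:
        raise ValueError("empty sequence")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)
```

For example, `gc3("ATGGCCAAATTT")` looks at the third bases G, C, A and T and returns 0.5.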
2.
BMC Biol ; 19(1): 258, 2021 12 04.
Article in English | MEDLINE | ID: mdl-34863173

ABSTRACT

BACKGROUND: Yeasts of the CTG-clade lineage, which includes the human-infecting Candida albicans, Candida parapsilosis and Candida tropicalis species, are characterized by an altered genetic code. Instead of translating CUG codons as leucine, as happens in most eukaryotes, these yeasts, whose ancestors are thought to have lost the relevant leucine-tRNA gene, translate CUG codons as serine using a serine-tRNA with a mutated anticodon, [Formula: see text]. Previously reported experiments have suggested that 3-5% of the CTG-clade CUG codons are mistranslated as leucine due to mischarging of the [Formula: see text]. The mistranslation was suggested to result in variable surface proteins explaining fast host adaptation and pathogenicity. RESULTS: In this study, we reassess this potential mistranslation by high-resolution mass spectrometry-based proteogenomics of multiple CTG-clade yeasts, including various C. albicans strains, isolated from colonized and from infected human body sites, and C. albicans grown in yeast and hyphal forms. Our data do not support a bias towards CUG codon mistranslation as leucine. Instead, our data suggest that (i) CUG codons are mistranslated at a frequency corresponding to the normal extent of ribosomal mistranslation with no preference for specific amino acids, (ii) CUG codons are as unambiguous (or ambiguous) as the related CUU leucine and UCC serine codons, (iii) tRNA anticodon loop variation across the CTG-clade yeasts does not result in any difference of the mistranslation level, and (iv) CUG codon unambiguity is independent of C. albicans' strain pathogenicity or growth form. CONCLUSIONS: Our findings imply that C. albicans does not decode CUG ambiguously. This suggests that the proposed misleucylation of the [Formula: see text] might be as prevalent as every other misacylation or mistranslation event and, if at all, be just one of many reasons causing phenotypic diversity.


Subjects
Candida albicans, Genetic Code, Proteogenomics, Base Sequence, Candida albicans/genetics, Candida albicans/metabolism, Codon/genetics
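
The CUG reassignment discussed in this and several later abstracts amounts to a one-codon change in the translation table. As a minimal sketch, assuming DNA or RNA input restricted to A, C, G and T/U, the hypothetical helper `translate` below builds the standard code and lets the caller override the meaning of CUG (leucine in the standard code, serine in the CTG clade, alanine in Pachysolen).

```python
BASES = "TCAG"
# Amino acids of the standard genetic code in TCAG codon order ("*" = stop).
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    a + b + c: STANDARD_AA[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, cug="L"):
    """Translate a coding sequence, decoding CUG as the given amino acid.

    cug="L" is the standard code, cug="S" the CTG-clade code and
    cug="A" the Pachysolen code. Ambiguous bases raise KeyError.
    """
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    table = dict(STANDARD_CODE, CTG=cug)  # override only the CUG codon
    return "".join(table[cds[i:i + 3]] for i in range(0, len(cds), 3))
```

With this helper, `translate("AUGCUGUCU")` gives `"MLS"`, while `translate("AUGCUGUCU", cug="S")` gives `"MSS"`.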
4.
Genome Biol Evol ; 13(10)2021 10 01.
Article in English | MEDLINE | ID: mdl-34427640

ABSTRACT

Owing to a lag between a deleterious mutation's appearance and its selective removal, gold-standard methods for mutation rate estimation assume no meaningful loss of mutations between parents and offspring. Indeed, from analysis of closely related lineages, in SARS-CoV-2, the Ka/Ks ratio was previously estimated as 1.008, suggesting no within-host selection. By contrast, we find a higher number of observed SNPs at 4-fold degenerate sites than elsewhere and, allowing for the virus's complex mutational and compositional biases, estimate that the mutation rate is at least 49-67% higher than would be estimated based on the rate of appearance of variants in sampled genomes. Given the high Ka/Ks, one might assume that the majority of such intrahost selection is the purging of nonsense mutations. However, we estimate that selection against nonsense mutations accounts for only ∼10% of all the "missing" mutations. Instead, classical protein-level selective filters (against chemically disparate amino acids and those predicted to disrupt protein functionality) account for many missing mutations. It is less obvious why, for an intracellular parasite, amino acid cost parameters, notably amino acid decay rate, are also significant. Perhaps most surprisingly, we also find evidence for real-time selection against synonymous mutations that move codon usage away from that of humans. We conclude that there is common intrahost selection on SARS-CoV-2 that acts on nonsense, missense, and possibly synonymous mutations. This has implications for methods of mutation rate estimation, for determining times to common ancestry and the potential for intrahost evolution including vaccine escape.


Subjects
COVID-19/virology, Mutation, SARS-CoV-2/genetics, Codon Usage, Nonsense Codon, Molecular Evolution, Humans, Genetic Models, Mutation Rate, Missense Mutation, Single Nucleotide Polymorphism, Genetic Selection, Silent Mutation
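
The abstract above compares SNP counts at 4-fold degenerate sites with counts elsewhere. As a rough illustration of how such sites can be located, the sketch below assumes the standard genetic code, in which eight codon families (identified by their first two bases) accept any third base without changing the amino acid. The names `FOURFOLD_PREFIXES` and `fourfold_site_positions` are hypothetical, and this is not the authors' pipeline.

```python
# First two bases of the eight 4-fold degenerate codon families in the
# standard genetic code: Leu (CTN), Val (GTN), Ser (TCN), Pro (CCN),
# Thr (ACN), Ala (GCN), Arg (CGN) and Gly (GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_site_positions(cds):
    """Return 0-based positions of 4-fold degenerate sites in a coding sequence."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [
        start + 2  # the third base of the codon is the degenerate site
        for start in range(0, len(cds), 3)
        if cds[start:start + 2] in FOURFOLD_PREFIXES
    ]
```

For the codons ATG, CTG, GCT and TAA, `fourfold_site_positions("ATGCTGGCTTAA")` returns `[5, 8]`.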
5.
Mol Biol Evol ; 38(1): 67-83, 2021 01 04.
Article in English | MEDLINE | ID: mdl-32687176

ABSTRACT

Large-scale re-engineering of synonymous sites is a promising strategy to generate vaccines either through synthesis of attenuated viruses or via codon-optimized genes in DNA vaccines. Attenuation typically relies on deoptimization of codon pairs and maximization of CpG dinucleotide frequencies. So as to formulate evolutionarily informed attenuation strategies that aim to force nucleotide usage against the direction favored by selection, here, we examine available whole-genome sequences of SARS-CoV-2 to infer patterns of mutation and selection on synonymous sites. Analysis of mutational profiles indicates a strong mutation bias toward U. In turn, analysis of observed synonymous site composition implicates selection against U. Accounting for dinucleotide effects reinforces this conclusion, observed UU content being a quarter of that expected under neutrality. Possible mechanisms of selection against U mutations include selection for higher expression, for high mRNA stability or lower immunogenicity of viral genes. Consistent with gene-specific selection against CpG dinucleotides, we observe systematic differences of CpG content between SARS-CoV-2 genes. We propose an evolutionarily informed approach to attenuation that, unusually, seeks to increase usage of the already most common synonymous codons. Comparable analysis of H1N1 and Ebola finds that GC3 deviated from neutral equilibrium is not a universal feature, cautioning against generalization of results.


Subjects
COVID-19 Vaccines/genetics, COVID-19/genetics, Viral Genome, Mutation, SARS-CoV-2/genetics, Genetic Selection, COVID-19/prevention & control, Humans, RNA Stability/genetics, Messenger RNA/genetics, Viral RNA/genetics, Uracil
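
Statements such as "observed UU content being a quarter of that expected" rest on an observed-to-expected ratio for a dinucleotide. The abstract derives its expectation from a model of neutral mutational equilibrium; the sketch below substitutes a much simpler null, the product of the two single-nucleotide frequencies, so it illustrates the form of the calculation rather than the authors' method. The function name `dinucleotide_obs_exp` is hypothetical.

```python
def dinucleotide_obs_exp(seq, dinuc):
    """Observed/expected ratio of a dinucleotide in a sequence.

    Assumption: the expected count is (n - 1) * f(x) * f(y), where f is the
    single-nucleotide frequency. This is a simpler null model than the
    neutral-equilibrium expectation described in the abstract.
    """
    seq = seq.upper()
    dinuc = dinuc.upper()
    if len(dinuc) != 2:
        raise ValueError("dinuc must be exactly two bases")
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    # Count overlapping occurrences, e.g. "UUU" contains "UU" twice.
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    expected = (seq.count(dinuc[0]) / n) * (seq.count(dinuc[1]) / n) * (n - 1)
    if expected == 0:
        return float("nan")
    return observed / expected
```

A ratio near 1 means the dinucleotide occurs about as often as base composition alone predicts; values well below 1 indicate depletion.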
6.
Bioessays ; 41(11): e1900066, 2019 11.
Article in English | MEDLINE | ID: mdl-31544971

ABSTRACT

The major transcript variants of human protein-coding genes are annotated to a certain degree of accuracy combining manual curation, transcript data, and proteomics evidence. However, there is considerable disagreement on the annotation of about 2000 genes (they can be protein-coding, noncoding, or pseudogenes) and on the annotation of most of the predicted alternative transcripts. Pure transcriptome mapping approaches seem to be limited in discriminating functional expression from noise. These limitations have partially been overcome by dedicated algorithms to detect alternatively spliced micro-exons and wobble splice variants. Recently, knowledge about splice mechanisms and protein structure has been incorporated into an algorithm to predict neighboring homologous exons, often spliced in a mutually exclusive manner. Predicted exons are evaluated by transcript data, structural compatibility, and evolutionary conservation, revealing hundreds of novel coding exons and splice mechanism re-assignments. The emerging human pan-genome is necessitating distinctive annotations incorporating differences between individuals and between populations.


Subjects
Human Genome/genetics, Proteins/genetics, Algorithms, Alternative Splicing/genetics, Animals, Exons/genetics, Genomics/methods, Humans, RNA Splicing/genetics, Transcriptome/genetics
7.
Curr Biol ; 28(13): 2046-2057.e5, 2018 07 09.
Article in English | MEDLINE | ID: mdl-29910077

ABSTRACT

Although the "universal" genetic code is now known not to be universal, and stop codons can have multiple meanings, one regularity remains, namely that for a given sense codon there is a unique translation. Examining CUG usage in yeasts that have transferred CUG away from leucine, we here report the first example of dual coding: Ascoidea asiatica stochastically encodes CUG as both serine and leucine in approximately equal proportions. This is deleterious, as evidenced by CUG codons being rare, never at conserved serine or leucine residues, and predominantly in lowly expressed genes. Related yeasts solve the problem by loss of function of one of the two tRNAs. This dual coding is consistent with the tRNA-loss-driven codon reassignment hypothesis, and provides a unique example of a proteome that cannot be deterministically predicted. VIDEO ABSTRACT.


Subjects
Terminator Codon/metabolism, Leucine Transfer RNA/genetics, Serine Transfer RNA/genetics, Saccharomycetales/genetics, Leucine Transfer RNA/metabolism, Serine Transfer RNA/metabolism, Saccharomycetales/metabolism
8.
BMC Evol Biol ; 17(1): 211, 2017 09 04.
Article in English | MEDLINE | ID: mdl-28870165

ABSTRACT

BACKGROUND: The last eukaryotic common ancestor already had an amazingly complex cell possessing genomic and cellular features such as spliceosomal introns, mitochondria, cilia-dependent motility, and a cytoskeleton together with several intracellular transport systems. In contrast to the microtubule-based dyneins and kinesins, the actin-filament associated myosins are considerably divergent in extant eukaryotes and a unifying picture of their evolution has not yet emerged. RESULTS: Here, we manually assembled and annotated 7852 myosins from 929 eukaryotes, providing an unprecedentedly dense sequence and taxonomic sampling. For classification we complemented phylogenetic analyses with gene structure comparisons, resulting in 79 distinct myosin classes. The intron pattern analysis and the taxonomic distribution of the classes suggest two myosins in the last eukaryotic common ancestor, a class-1 prototype and another myosin, which is most likely the ancestor of all other myosin classes. The sparse distribution of class-2 and class-4 myosins outside their major lineages contradicts their presence in the last eukaryotic common ancestor but instead strongly suggests early eukaryote-eukaryote horizontal gene transfer. CONCLUSIONS: By correlating the evolution of myosin diversity with the history of Earth we found that myosin innovation occurred in independent major "burst" events in the major eukaryotic lineages. Most myosin inventions happened in the Mesoproterozoic era. In the late Neoproterozoic era, a process of extensive independent myosin loss began simultaneously with further eukaryotic diversification. Since the Cambrian explosion, myosin repertoire expansion has been driven by lineage- and species-specific gene and genome duplications leading to subfunctionalization and fine-tuning of myosin functions.


Subjects
Eukaryota/classification, Eukaryota/genetics, Molecular Evolution, Myosins/genetics, Eukaryotic Cells, Horizontal Gene Transfer, Genetic Speciation, Genome, Introns, Myosins/chemistry, Phylogeny, Spliceosomes
9.
Bioessays ; 39(5)2017 05.
Article in English | MEDLINE | ID: mdl-28318058

ABSTRACT

The canonical genetic code ubiquitously translates nucleotide into peptide sequence, with several alterations known in viruses, bacteria, mitochondria, plastids, and single-celled eukaryotes. A new hypothesis to explain genetic code changes, termed tRNA loss driven codon reassignment, was proposed recently when the polyphyly of the yeast codon reassignment events was uncovered. According to this hypothesis, the driving forces for genetic code changes are tRNA or translation termination factor loss-of-function mutations or loss-of-gene events. The free codon can subsequently be captured by any tRNA that has an appropriately mutated anticodon and is efficiently charged. Thus, codon capture most likely happens by near-cognate tRNAs and tRNAs whose anticodons are not part of the recognition sites of the respective aminoacyl-tRNA-synthetases. This hypothesis comprehensively explains the CTG codon translation as alanine in Pachysolen yeast together with the long known translation of the same codon as serine in Candida albicans and related species, and can also be applied to most other known reassignments.


Subjects
Codon/genetics, Molecular Evolution, Genetic Code, Amino Acid Sequence, Ascomycota/classification, Ascomycota/genetics, Cell Nucleus/genetics, Ciliophora/cytology, Ciliophora/genetics, Genomics, Genetic Models, Phylogeny, Protein Biosynthesis, Transfer RNA/genetics, Species Specificity
10.
RNA Biol ; 14(3): 293-299, 2017 03 04.
Article in English | MEDLINE | ID: mdl-28095181

ABSTRACT

mRNA decoding by tRNAs and tRNA charging by aminoacyl-tRNA synthetases are biochemically separated processes that nevertheless in general involve the same nucleotides. The combination of charging and decoding determines the genetic code. Codon reassignment happens when a differently charged tRNA replaces a former cognate tRNA. The recent discovery of the polyphyly of the yeast CUG sense codon reassignment challenged previous mechanistic considerations and led to the proposal of the so-called tRNA loss driven codon reassignment hypothesis. Accordingly, codon capture is caused by loss of a tRNA or by mutations in the translation termination factor, subsequent reduction of the codon frequency through reduced translation fidelity and final appearance of a new cognate tRNA. Critical for codon capture are sequence and structure of the new tRNA, which must be compatible with recognition regions of aminoacyl-tRNA synthetases. The proposed hypothesis applies to all reported nuclear and organellar codon reassignments.


Subjects
Codon/genetics, Protein Biosynthesis, Transfer RNA/genetics, Animals, Anticodon, Terminator Codon, Genetic Code, Humans, Yeasts/genetics, Yeasts/metabolism
11.
Genome Res ; 26(7): 945-55, 2016 07.
Article in English | MEDLINE | ID: mdl-27197221

ABSTRACT

The genetic code is the cellular translation table for the conversion of nucleotide sequences into amino acid sequences. Changes to the meaning of sense codons would introduce errors into almost every translated message and are expected to be highly detrimental. However, reassignment of single or multiple codons in mitochondria and nuclear genomes, although extremely rare, demonstrates that the code can evolve. Several models for the mechanism of alteration of nuclear genetic codes have been proposed (including "codon capture," "genome streamlining," and "ambiguous intermediate" theories), but with little resolution. Here, we report a novel sense codon reassignment in Pachysolen tannophilus, a yeast related to the Pichiaceae. By generating proteomics data and using tRNA sequence comparisons, we show that Pachysolen translates CUG codons as alanine and not as the more usual leucine. The Pachysolen tRNACAG is an anticodon-mutated tRNA(Ala) containing all major alanine tRNA recognition sites. The polyphyly of the CUG-decoding tRNAs in yeasts is best explained by a tRNA loss driven codon reassignment mechanism. Loss of the CUG-tRNA in the ancient yeast is followed by gradual decrease of respective codons and subsequent codon capture by tRNAs whose anticodon is not part of the aminoacyl-tRNA synthetase recognition region. Our hypothesis applies to all nuclear genetic code alterations and provides several testable predictions. We anticipate more codon reassignments to be uncovered in existing and upcoming genome projects.


Subjects
Codon, Molecular Evolution, Saccharomycetales/genetics, Base Sequence, Cell Nucleus/genetics, Genetic Code, Molecular Sequence Annotation, Transfer RNA/genetics, RNA Sequence Analysis
12.
Bioinformatics ; 31(8): 1302-4, 2015 Apr 15.
Article in English | MEDLINE | ID: mdl-25434742

ABSTRACT

Conserved intron positions in eukaryotic genes can be used to reconstruct phylogenetic trees, to resolve ambiguous subfamily relationships in protein families and to infer the history of gene families. This version of GenePainter facilitates working with large datasets through options to select specific subsets for analysis and visualization, and through providing exhaustive statistics. GenePainter's application in phylogenetic analyses is considerably extended by the newly implemented integration of the exon-intron pattern conservation with phylogenetic trees. AVAILABILITY AND IMPLEMENTATION: The software along with detailed documentation is available at http://www.motorprotein.de/genepainter and as Supplementary Material. CONTACT: mako@nmr.mpibpc.mpg.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Chromosome Mapping, Computer Graphics, Exons/genetics, Introns/genetics, Microfilament Proteins/genetics, Software, Humans, Phylogeny
13.
Genome Biol Evol ; 6(9): 2274-88, 2014 Aug 27.
Article in English | MEDLINE | ID: mdl-25169981

ABSTRACT

Tubulins are among the most abundant proteins in eukaryotes, providing the backbone for many cellular substructures like the mitotic and meiotic spindles, the intracellular cytoskeletal network, and the axonemes of cilia and flagella. Homologs have even been reported for archaea and bacteria. However, a taxonomically broad and whole-genome-based analysis of the tubulin protein family has never been performed, and thus, the number of subfamilies, their taxonomic distribution, and the exact grouping of the supposed archaeal and bacterial homologs are unknown. Here, we present the analysis of 3,524 tubulins from 504 species. The tubulins formed six major subfamilies, α to ζ. Species of all major kingdoms of the eukaryotes encode members of these subfamilies, implying that they must have already been present in the last common eukaryotic ancestor. The proposed archaeal homologs grouped together with the bacterial TubZ proteins as sister clade to the FtsZ proteins, indicating that tubulins are unique to eukaryotes. Most species contained α- and/or β-tubulin gene duplicates resulting from recent branch- and species-specific duplication events. This shows that tubulins cannot be used for constructing species phylogenies without resolving their ortholog-paralog relationships. The many gene duplicates and also the independent loss of the δ-, ε-, or ζ-tubulins, which have been shown to be part of the triplet microtubules in basal bodies, suggest that tubulins can functionally substitute each other.


Subjects
Eukaryota/genetics, Molecular Evolution, Gene Duplication, Multigene Family, Tubulin/genetics, Amino Acid Sequence, Animals, Archaea/genetics, Bacteria/genetics, Eukaryota/chemistry, Eukaryota/classification, Humans, Molecular Models, Molecular Sequence Data, Phylogeny, Tubulin/chemistry
14.
Genome Biol Evol ; 6(5)2014 07 22.
Article in English | MEDLINE | ID: mdl-25053656

ABSTRACT

The universal genetic code defines the translation of nucleotide triplets, called codons, into amino acids. In many Saccharomycetes a unique alteration of this code affects the translation of the CUG codon, which is normally translated as leucine. Most of the species encoding CUG alternatively as serine belong to the Candida genus and were grouped into a so-called CTG clade. However, the "Candida genus" is not a monophyletic group and several Candida species are known to use the standard CUG translation. The codon identity could have been changed in a single branch, the ancestor of the Candida, or in several branches independently, leading to a polyphyletic alternative yeast codon usage (AYCU). In order to resolve the monophyly or polyphyly of the AYCU, we performed a phylogenomics analysis of 26 motor and cytoskeletal proteins from 60 sequenced yeast species. By investigating the CUG codon positions with respect to sequence conservation at the respective alignment positions, we were able to unambiguously assign the standard code or AYCU. Quantitative analysis of the highly conserved leucine and serine alignment positions showed that 61.1% and 17% of the CUG codons coding for leucine and serine, respectively, are at highly conserved positions, while only 0.6% and 2.3% of the CUG codons, respectively, are at positions conserved in the respective other amino acid. Plotting the codon usage onto the phylogenetic tree revealed the polyphyly of the AYCU with Pachysolen tannophilus and the CTG clade branching independently within a time span of 30 to 100 Mya.

15.
BMC Genomics ; 15: 411, 2014 May 29.
Article in English | MEDLINE | ID: mdl-24885275

ABSTRACT

BACKGROUND: Many eukaryotes have been shown to use alternative schemes to the universal genetic code. While most Saccharomycetes, including Saccharomyces cerevisiae, use the standard genetic code translating the CUG codon as leucine, some yeasts, including many but not all of the "Candida", translate the same codon as serine. It has been proposed that the change in codon identity was accomplished by an almost complete loss of the original CUG codons, making the CUG positions within the extant species highly discriminative for the one or other translation scheme. RESULTS: In order to improve the prediction of genes in yeast species by providing the correct CUG decoding scheme we implemented a web server, called Bagheera, that allows determining the most probable CUG codon translation for a given transcriptome or genome assembly based on extensive reference data. As reference data we use 2071 manually assembled and annotated sequences from 38 cytoskeletal and motor proteins belonging to 79 yeast species. The web service includes a pipeline, which starts with predicting and aligning homologous genes to the reference data. CUG codon positions within the predicted genes are analysed with respect to amino acid similarity and CUG codon conservation in related species. In addition, the tRNACAG gene is predicted in genomic data and compared to known leu-tRNACAG and ser-tRNACAG genes. Bagheera can also be used to evaluate any mRNA and protein sequence data with the codon usage of the respective species. The usage of the system has been demonstrated by analysing six genomes not included in the reference data. CONCLUSIONS: Gene prediction and consecutive comparison with reference data from other Saccharomycetes are sufficient to predict the most probable decoding scheme for CUG codons. This approach has been implemented into Bagheera (http://www.motorprotein.de/bagheera).


Subjects
Codon, Saccharomyces cerevisiae/genetics, User-Computer Interface, Base Sequence, Candida/genetics, Internet, Leucine/metabolism, Molecular Sequence Annotation, Protein Biosynthesis, Fungal RNA/genetics, Serine/metabolism
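
The idea behind the prediction described above, that the amino acid conserved in reference species at a query's CUG positions indicates how the query decodes CUG, can be sketched as a simple vote. This is not Bagheera's implementation: the function names, the 0.8 conservation threshold and the restriction to leucine, serine and alanine are all assumptions made for illustration.

```python
from collections import Counter


def column_consensus(column, threshold=0.8):
    """Return the dominant residue of an alignment column, or None.

    `column` holds the reference residues aligned to one CUG position of
    the query ("-" marks a gap). The threshold is an arbitrary assumption.
    """
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return None
    residue, count = Counter(residues).most_common(1)[0]
    return residue if count / len(residues) >= threshold else None


def predict_cug_decoding(columns, threshold=0.8):
    """Vote over CUG positions; returns (predicted amino acid, vote counts)."""
    votes = Counter()
    for column in columns:
        consensus = column_consensus(column, threshold)
        if consensus in ("L", "S", "A"):  # known CUG meanings in yeasts
            votes[consensus] += 1
    if not votes:
        return None, votes
    return votes.most_common(1)[0][0], votes
```

Given the columns `["SSSSS", "SSSTS", "LLLLL", "KRQEN", "SS-SS"]`, three positions are conserved as serine and one as leucine, so the sketch predicts `"S"`.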
16.
Genome Biol Evol ; 6(12): 3222-37, 2014 07 22.
Article in English | MEDLINE | ID: mdl-25646540

ABSTRACT

The universal genetic code defines the translation of nucleotide triplets, called codons, into amino acids. In many Saccharomycetes a unique alteration of this code affects the translation of the CUG codon, which is normally translated as leucine. Most of the species encoding CUG alternatively as serine belong to the Candida genus and were grouped into a so-called CTG clade. However, the "Candida genus" is not a monophyletic group and several Candida species are known to use the standard CUG translation. The codon identity could have been changed in a single branch, the ancestor of the Candida, or in several branches independently, leading to a polyphyletic alternative yeast codon usage (AYCU). In order to resolve the monophyly or polyphyly of the AYCU, we performed a phylogenomics analysis of 26 motor and cytoskeletal proteins from 60 sequenced yeast species. By investigating the CUG codon positions with respect to sequence conservation at the respective alignment positions, we were able to unambiguously assign the standard code or AYCU. Quantitative analysis of the highly conserved leucine and serine alignment positions showed that 61.1% and 17% of the CUG codons coding for leucine and serine, respectively, are at highly conserved positions, whereas only 0.6% and 2.3% of the CUG codons, respectively, are at positions conserved in the respective other amino acid. Plotting the codon usage onto the phylogenetic tree revealed the polyphyly of the AYCU with Pachysolen tannophilus and the CTG clade branching independently within a time span of 30-100 Ma.


Subjects
Codon/genetics, Phylogeny, Saccharomyces cerevisiae/genetics, Amino Acid Sequence, Candida/genetics, Conserved Sequence, Cytoskeletal Proteins/chemistry, Cytoskeletal Proteins/genetics, Molecular Evolution, Fungal Proteins/chemistry, Fungal Proteins/genetics, Leucine/genetics, Molecular Sequence Data, Serine/genetics
17.
BMC Evol Biol ; 13: 202, 2013 Sep 22.
Article in English | MEDLINE | ID: mdl-24053117

ABSTRACT

BACKGROUND: The evolution of land plants is characterized by whole genome duplications (WGD), which drove species diversification and evolutionary novelties. Detecting these events is especially difficult if they date back to the origin of the plant kingdom. Established methods for reconstructing WGDs include intra- and inter-genome comparisons, Ks age distribution analyses, and phylogenetic tree constructions. RESULTS: By analysing 67 completely sequenced plant genomes, 775 myosins were identified and manually assembled. Phylogenetic trees of the myosin motor domains revealed orthologous and paralogous relationships and were consistent with recent species trees. Based on the myosin inventories and the phylogenetic trees, we have identified duplications of the entire myosin motor protein family at timings consistent with 23 WGDs that had been reported before. We also predict 6 WGDs based on further protein family duplications. Notably, the myosin data support the two recently reported WGDs in the common ancestor of all extant angiosperms. We predict single WGDs in the Manihot esculenta and Nicotiana benthamiana lineages, two WGDs for Linum usitatissimum and Phoenix dactylifera, and a triplication or two WGDs for Gossypium raimondii. Our data show another myosin duplication in the ancestor of the angiosperms that could be either the result of a single gene duplication or a remnant of a WGD. CONCLUSIONS: We have shown that the myosin inventories in angiosperms retain evidence of numerous WGDs that happened throughout plant evolution. In contrast to other protein families, many myosins are still present in extant species. They are closely related and have similar domain architectures, and their phylogenetic grouping follows the genome duplications. Because of its broad taxonomic sampling, the dataset provides the basis for reliable future identification of further whole genome duplications.


Subjects
Biological Evolution, Gene Duplication, Plant Genome, Myosins/genetics, Plant Proteins/genetics, Plants/genetics, Amino Acid Sequence, Molecular Sequence Data, Myosins/chemistry, Myosins/classification, Myosins/metabolism, Phylogeny, Plant Proteins/chemistry, Plant Proteins/classification, Plant Proteins/metabolism, Plants/classification, Plants/metabolism
18.
BMC Bioinformatics ; 14: 77, 2013 Mar 04.
Article in English | MEDLINE | ID: mdl-23496949

ABSTRACT

BACKGROUND: All sequenced eukaryotic genomes have been shown to possess at least a few introns. This includes those unicellular organisms that were previously suspected to be intron-less. Therefore, gene splicing must have been present at least in the last common ancestor of the eukaryotes. To explain the evolution of introns, two mutually exclusive concepts have been developed. The introns-early hypothesis says that the very first protein-coding genes already contained introns, while the introns-late concept asserts that eukaryotic genes gained introns only after the emergence of the eukaryotic lineage. A very important aspect in this respect is the conservation of intron positions within homologous genes of different taxa. RESULTS: GenePainter is a standalone application for mapping gene structure information onto protein multiple sequence alignments. Based on the multiple sequence alignments the gene structures are aligned down to single nucleotides. GenePainter accounts for variable lengths in exons and introns, respects split codons at intron junctions and is able to handle sequencing and assembly errors, which are possible reasons for frame-shifts in exons and gaps in genome assemblies. Thus, even gene structures of considerably divergent proteins can properly be compared, as is needed in phylogenetic analyses. Conserved intron positions can also be mapped to user-provided protein structures. For their visualization GenePainter provides scripts for the molecular graphics system PyMOL. CONCLUSIONS: GenePainter is a tool to analyse gene structure conservation providing various visualization options. A stable version of GenePainter for all operating systems as well as documentation and example data are available at http://www.motorprotein.de/genepainter.html.


Subjects
Exons, Introns, Proteins/genetics, Sequence Alignment/methods, Software, Chromosome Mapping/methods, Computer Graphics, Eukaryota/genetics, Molecular Evolution, Molecular Models, Protein Sequence Analysis
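
Mapping gene structures onto a protein alignment, as the abstract above describes, comes down to converting each intron's position in the ungapped protein into an alignment column and then comparing columns across genes. The sketch below shows only that coordinate conversion; it is not GenePainter's code, the function names are hypothetical, and it assumes each intron is given as a pair of 0-based residue index and phase.

```python
def map_introns_to_columns(aligned_seq, introns):
    """Map introns from ungapped residue coordinates to alignment columns.

    `aligned_seq` is one row of a protein alignment ("-" marks a gap) and
    `introns` is an iterable of (residue_index, phase) pairs, both of which
    are assumptions about the input format.
    """
    # columns[i] is the alignment column of the i-th ungapped residue.
    columns = [col for col, ch in enumerate(aligned_seq) if ch != "-"]
    return {(columns[residue], phase) for residue, phase in introns}


def conserved_introns(genes):
    """Return (column, phase) pairs shared by every gene in the alignment."""
    mapped = [map_introns_to_columns(seq, introns) for seq, introns in genes]
    return set.intersection(*mapped) if mapped else set()
```

For example, an intron before residue 2 of `"MK-LV"` and one before residue 3 of `"MKALV"` both land in column 3, so with equal phase they count as one conserved position.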
19.
BMC Bioinformatics ; 11: 481, 2010 Sep 24.
Article in English | MEDLINE | ID: mdl-20868492

ABSTRACT

BACKGROUND: Establishing the relationship between an organism's genome sequence and its phenotype is a fundamental challenge that remains largely unsolved. Accurately predicting microbial phenotypes solely based on genomic features will allow us to infer relevant phenotypic characteristics when the availability of a genome sequence precedes experimental characterization, a scenario that is favored by the advent of novel high-throughput and single cell sequencing techniques. RESULTS: We present a novel approach to predict the phenotype of prokaryotes directly from their protein domain frequencies. Our discriminative machine learning approach provides high prediction accuracy of relevant phenotypes such as motility, oxygen requirement or spore formation. Moreover, the set of discriminative domains provides biological insight into the underlying phenotype-genotype relationship and enables deriving hypotheses on the possible functions of uncharacterized domains. CONCLUSIONS: Fast and accurate prediction of microbial phenotypes based on genomic protein domain content is feasible and has the potential to provide novel biological insights. First results of a systematic check for annotation errors indicate that our approach may also be applied to semi-automatic correction and completion of the existing phenotype annotation.


Subjects
Bacterial Proteins/chemistry, Phenotype, Algorithms, Archaeal Genome, Bacterial Genome, Molecular Sequence Annotation, Protein Tertiary Structure
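
Predicting a phenotype from protein domain frequencies means training a classifier on per-genome vectors of domain counts. The abstract does not name the classifier, so the sketch below uses a plain perceptron as a stand-in; the function names are hypothetical and the approach is only guaranteed to converge when the training data are linearly separable.

```python
def train_perceptron(samples, labels, epochs=100):
    """Fit a linear classifier on domain-count vectors.

    `samples` is a list of equal-length count vectors (one per genome) and
    `labels` holds +1 or -1 for presence or absence of the phenotype.
    A perceptron is an assumption here, not the method of the paper.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * score <= 0:  # misclassified: move the boundary toward x
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:  # every training genome is classified correctly
            break
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 for a domain-count vector."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1
```

After training, the learned weights can be read as a crude ranking of how strongly each domain discriminates the phenotype, which mirrors the abstract's point that discriminative domains carry biological insight.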