ABSTRACT
Antimicrobial peptides (AMPs) are a heterogeneous group of short polypeptides that target not only microorganisms but also viruses and cancer cells. Due to their lower selection for resistance compared with traditional antibiotics, AMPs have been attracting ever-growing attention from researchers, including bioinformaticians. Machine learning represents the most cost-effective method for novel AMP discovery and consequently many computational tools for AMP prediction have been recently developed. In this article, we investigate the impact of negative data sampling on model performance and benchmarking. We generated 660 predictive models using 12 machine learning architectures, a single positive data set and 11 negative data sampling methods; the architectures and methods were defined on the basis of published AMP prediction software. Our results clearly indicate that similar training and benchmark data sets, i.e. produced by the same or a similar negative data sampling method, positively affect model performance. Consequently, all the benchmark analyses that have been performed for AMP prediction models are significantly biased and, moreover, we do not know which model is the most accurate. To provide researchers with reliable information about the performance of AMP predictors, we also created a web server AMPBenchmark for fair model benchmarking. AMPBenchmark is available at http://BioGenies.info/AMPBenchmark.
Subjects
Antimicrobial Peptides; Benchmarking; Anti-Bacterial Agents; Peptides/chemistry
ABSTRACT
Genome-wide technologies open up new possibilities to clarify questions on genetic structure and phylogeographic history of taxa previously studied with microsatellite loci and mitochondrial sequences. Here, we used 736 individual red deer (Cervus elaphus) samples genotyped at 35,701 single nucleotide polymorphism loci (SNPs) to assess the population structure of the species throughout Europe. The results identified 28 populations, with higher degrees of genetic distinction in peripheral compared to mainland populations. Iberian red deer show high genetic differentiation, with lineages in Western and Central Iberia maintaining their distinctiveness, which supports separate refugial ranges within Iberia along with little recent connection between Iberian and the remaining Western European populations. The Norwegian population exhibited the lowest variability and the largest allele frequency differences from mainland European populations, compatible with a history of bottlenecks and drift during post-glacial colonization from southern refugia. Scottish populations showed high genetic distance from the mainland but high levels of diversity. Hybrid zones were found between Eastern and Western European lineages in Central Europe as well as in the Pyrenees, where red deer from France are in close contact with Iberian red deer. Anthropogenic restocking has promoted the Pyrenean contact zone, admixture events in populations on the Isle of Rum and in the Netherlands, and at least partly the admixture of the two main lineages in central-eastern Europe. Our analysis enabled detailed resolution of population structure of a large mammal widely distributed throughout Europe and contributes to resolving the evolutionary history, which can also inform conservation and management policies.
Subjects
Deer; Genetics, Population; Phylogeography; Polymorphism, Single Nucleotide; Animals; Deer/genetics; Deer/classification; Polymorphism, Single Nucleotide/genetics; Europe; Gene Frequency; Genotype; Genetic Variation
ABSTRACT
The extremely rich palaeontological record of the horse family, also known as equids, has provided many examples of macroevolutionary change over the last ~55 million years. This family is also one of the most documented at the palaeogenomic level, with hundreds of ancient genomes sequenced. While these data have advanced understanding of the domestication history of horses and donkeys, the palaeogenomic record of other equids remains limited. In this study, we have generated genome-wide data for 25 ancient equid specimens spanning over 44 Ky and spread across Anatolia, the Caucasus, Central Asia and Mongolia. Our dataset includes the genomes from two extinct species, the European wild ass, Equus hydruntinus, and the sussemione Equus ovodovi. We document, for the first time, the presence of sussemiones in Mongolia and their survival until around 3.9 Kya, a finding that should be considered when discussing the timing of the first arrival of the domestic horse in the region. We also identify strong spatial differentiation within the historical ecological range of Asian wild asses, Equus hemionus, and incomplete reproductive isolation in several groups so far considered different species. Finally, we find common selection signatures at the ANTXR2 gene in European, Asian and African wild asses. This locus, which encodes a receptor for bacterial toxins, shows no selection signal in E. ovodovi, but carries a 5.4-kb deletion within intron 7. Whether such genetic modifications played any role in the sussemione extinction remains unknown.
Subjects
Equidae; Genetics, Population; Animals; Equidae/genetics; Mongolia; Genome/genetics; Phylogeny; Fossils; Horses/genetics; Adaptation, Physiological/genetics
ABSTRACT
Antimicrobial peptides (AMPs) are emerging as a promising alternative to traditional antibiotics due to their ability to disturb bacterial membranes and/or their intracellular processes, offering a potential solution to the growing problem of antimicrobial resistance. AMP effectiveness is governed by factors such as net charge, hydrophobicity, and the ability to form amphipathic secondary structures. When properly balanced, these characteristics enable AMPs to selectively target bacterial membranes while sparing eukaryotic cells. This review focuses on the roles of positive charge, hydrophobicity, and structure in influencing AMP activity and toxicity, and explores strategies to optimize them for enhanced therapeutic potential. We highlight the delicate balance between these properties and how various modifications, including amino acid substitutions, peptide tagging, or lipid conjugation, can either enhance or impair AMP performance. Notably, an increase in these parameters does not always yield the best results; sometimes, a slight reduction in charge, hydrophobicity, or structural stability improves the overall AMP therapeutic potential. Understanding these complex interactions is key to developing AMPs with greater antimicrobial activity and reduced toxicity, making them viable candidates in the fight against antibiotic-resistant bacteria.
Subjects
Hydrophobic and Hydrophilic Interactions; Humans; Antimicrobial Peptides/chemistry; Antimicrobial Peptides/pharmacology; Structure-Activity Relationship; Antimicrobial Cationic Peptides/chemistry; Antimicrobial Cationic Peptides/pharmacology; Animals; Anti-Bacterial Agents/pharmacology; Anti-Bacterial Agents/chemistry; Bacteria/drug effects
ABSTRACT
Synonymous codon usage can be influenced by mutations and/or selection, e.g., for speed of protein translation and correct folding. However, this codon bias can also be affected by a general selection at the amino acid level due to differences in the acceptance of the loss and generation of these codons. To assess the importance of this effect, we constructed a mutation-selection model, in which we generated almost 90,000 stationary nucleotide distributions produced by mutational processes and applied a selection based on differences in physicochemical properties of amino acids. Under these conditions, we calculated the usage of fourfold degenerate (4FD) codons and compared it with the usage characteristic of the pure mutations. We considered both the standard genetic code (SGC) and alternative genetic codes (AGCs). The analyses showed that a majority of AGCs produced a greater 4FD codon bias than the SGC. The mutations producing more thymine or adenine than guanine and cytosine increased the differences in usage. On the other hand, the mutational pressures generating a lot of cytosine or guanine with a low content of adenine and thymine decreased this bias because the nucleotide content of most 4FD codons stayed in the compositional equilibrium with these pressures. The comparison of the theoretical results with those for real protein coding sequences showed that the influence of selection at the amino acid level on the synonymous codon usage cannot be neglected. The analyses indicate that the effect of amino acid selection cannot be disregarded and that it can interfere with other selection factors influencing codon usage, especially in AT-rich genomes, in which AGCs are usually used.
Subjects
Amino Acids; Codon Usage; Amino Acids/genetics; Thymine; Genetic Code; Codon/genetics; Nucleotides/genetics; Cytosine; Guanine; Adenine; Selection, Genetic; Evolution, Molecular
ABSTRACT
Lactoferrin, an iron-binding glycoprotein, plays a significant role in the innate immune system, with antibacterial, antiviral, antifungal, anticancer, antioxidant and immunomodulatory functions reported. It is worth emphasizing that not only the whole protein but also its derived fragments possess antimicrobial peptide (AMP) activity. Using AmpGram, a top-performing AMP classifier, we generated three novel human lactoferrin (hLF) fragments: hLF 397-412, hLF 448-464 and hLF 668-683, predicted with high probability as AMPs. For comparative studies, we included hLF 1-11, previously confirmed to kill some bacteria. With the four peptides, we treated three Gram-negative and three Gram-positive bacterial strains. Our results indicate that none of the three new lactoferrin fragments have antimicrobial properties for the bacteria tested, but hLF 1-11 was lethal against Pseudomonas aeruginosa. The addition of serine protease inhibitors with the hLF fragments did not enhance their activity, except for hLF 1-11 against P. aeruginosa, whose MIC dropped from 128 to 64 µg/mL. Furthermore, we investigated the impact of EDTA with/without serine protease inhibitors and the hLF peptides on selected bacteria. We stress the importance of reporting non-AMP sequences for the development of next-generation AMP prediction models, which suffer from the lack of experimentally validated negative datasets for training and benchmarking.
Subjects
Lactoferrin; Peptides; Humans; Lactoferrin/metabolism; Peptides/pharmacology; Antifungal Agents; Anti-Bacterial Agents/pharmacology
ABSTRACT
Amyloids and antimicrobial peptides (AMPs) have many similarities, e.g., both kill microorganisms by destroying their membranes, form aggregates, and modulate the innate immune system. Given these similarities and the fact that the antimicrobial properties of short amyloids have not yet been investigated, we chose a group of potentially antimicrobial short amyloids to verify their impact on bacterial and eukaryotic cells. We used AmpGram, a best-performing AMP classification model, and selected ten amyloids with the highest AMP probability for our experimental research. Our results indicate that four tested amyloids: VQIVCK, VCIVYK, KCWCFT, and GGYLLG, formed aggregates under the conditions routinely used to evaluate peptide antimicrobial properties, but none of the tested amyloids exhibited antimicrobial or cytotoxic properties. Accordingly, they should be included in the negative datasets to train the next-generation AMP prediction models, based on experimentally confirmed AMP and non-AMP sequences. In the article, we also emphasize the importance of reporting non-AMPs, given that only a handful of such sequences have been officially confirmed.
Subjects
Anti-Infective Agents; Antimicrobial Cationic Peptides; Antimicrobial Cationic Peptides/pharmacology; Antimicrobial Cationic Peptides/chemistry; Anti-Infective Agents/pharmacology; Bacteria
ABSTRACT
The standard genetic code (SGC) is the set of rules by which genetic information is translated into proteins, from codons, i.e. triplets of nucleotides, to amino acids. The questions about the origin and the main factor responsible for the present structure of the code are still hotly debated. Various methodologies have been used to study the features of the code and assess the level of its potential optimality. Here, we introduced a new general approach to evaluate the quality of the genetic code structure. This methodology comes from graph theory and allows us to describe new properties of the genetic code in terms of conductance. This parameter measures the robustness of codon groups against the potential changes in translation of the protein-coding sequences generated by single nucleotide substitutions. We described the genetic code as a partition of an undirected and unweighted graph, which makes the model general and universal. Using this approach, we showed that the structure of the genetic code is a solution to the graph clustering problem. We presented and discussed the structure of the codes that are optimal according to the conductance. Despite the fact that the standard genetic code is far from being optimal according to the conductance, its structure is characterised by many codon groups reaching the minimum conductance for their size. The SGC represents most likely a local minimum in terms of errors occurring in protein-coding sequences and their translation.
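As an illustration of the conductance idea in this abstract, the following is a minimal sketch and not the authors' implementation. It assumes the usual graph-theoretic definition (edges leaving a codon group divided by the sum of vertex degrees inside the group) on a graph in which two codons are connected when they differ by exactly one nucleotide; the paper's exact normalization may differ. The function names and the example codon groups are hypothetical.

```python
from itertools import product

BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]  # all 64 codons


def neighbors(codon):
    """Yield the 9 codons that differ from `codon` by one nucleotide."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def conductance(group):
    """Edges leaving the group divided by the sum of degrees inside it.

    Assumed definition: cut(S, complement) / vol(S), where every codon
    has degree 9 in the single-substitution graph.
    """
    group = set(group)
    cut = sum(1 for c in group for n in neighbors(c) if n not in group)
    volume = 9 * len(group)
    return cut / volume


# A block of four codons sharing the first two positions (GCN).
block = ["GCT", "GCC", "GCA", "GCG"]
# Four codons with no single-substitution links between them.
scattered = ["GCT", "TAC", "AGG", "CTA"]

print(conductance(block))      # lower value: some substitutions stay inside
print(conductance(scattered))  # every substitution leaves the group
```

In this sketch a lower conductance means that a larger share of single nucleotide substitutions keeps a codon within its own group, which is the sense in which the abstract speaks of robustness of codon groups.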
Subjects
Evolution, Molecular; Genetic Code; Amino Acids/genetics; Cluster Analysis; Codon/genetics; Models, Genetic
ABSTRACT
The standard genetic code (SGC) is a set of rules according to which 64 codons are assigned to 20 canonical amino acids and a stop coding signal. As a consequence, the SGC is redundant because there is a greater number of codons than the number of encoded labels. This redundancy implies the existence of codons that encode the same genetic information. The size and organization of such synonymous codon blocks are important characteristics of the SGC structure whose evolution is still unclear. Therefore, we studied possible evolutionary mechanisms of the codon block structure. We conducted computer simulations assuming that coding systems at early stages of the SGC evolution were sets of ambiguous codon assignments with high entropy. We included three types of reading systems characterized by different inaccuracy and pattern of codon recognition. In contrast to the previous study, we allowed for evolution of the reading systems and their competition. The simulations performed under minimization of translational errors and reduction of coding ambiguity produced the coding system resistant to these errors. The reading system similar to that present in the SGC dominated the others very quickly. The surviving system was also characterized by low entropy and possessed properties similar to those in the SGC. Our simulations show that the unambiguous SGC could have emerged from a code with a lower level of ambiguity and the number of tRNAs increased during the evolution.
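The notion of ambiguous codon assignments with high or low entropy can be made concrete with a small sketch. The abstract does not state which entropy measure the simulations used, so Shannon entropy over the 21 encoded labels (20 amino acids plus the stop signal) is an assumption here, and the function name is hypothetical.

```python
import math

N_LABELS = 21  # 20 canonical amino acids plus the stop signal


def assignment_entropy(probabilities):
    """Shannon entropy in bits of one codon's distribution over labels."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# An unambiguous codon: all probability mass on a single label.
unambiguous = [1.0] + [0.0] * (N_LABELS - 1)
# A maximally ambiguous codon: every label equally likely.
fully_ambiguous = [1.0 / N_LABELS] * N_LABELS

print(assignment_entropy(unambiguous))
print(assignment_entropy(fully_ambiguous))
```

Under this assumed measure, an unambiguous assignment has zero entropy and a uniformly ambiguous one reaches the maximum of log2(21) bits, so a decrease in entropy corresponds to a reduction of coding ambiguity.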
Subjects
Computer Simulation; Evolution, Molecular; Genetic Code; Models, Genetic; Entropy
ABSTRACT
As part of the infective process, Porphyromonas gingivalis must acquire heme which is indispensable for life and enables the microorganism to survive and multiply at the infection site. This oral pathogenic bacterium uses a newly discovered hmu heme uptake system with a leading role played by the HmuY hemophore-like protein, responsible for acquiring heme and increasing virulence of this periodontopathogen. We demonstrated that Prevotella intermedia produces two HmuY homologs, termed PinO and PinA. Both proteins were produced at higher mRNA and protein levels when the bacterium grew under low-iron/heme conditions. PinO and PinA bound heme, but preferentially under reducing conditions, and in a manner different from that of the P. gingivalis HmuY. The analysis of the three-dimensional structures confirmed differences between apo-PinO and apo-HmuY, mainly in the fold forming the heme-binding pocket. Instead of two histidine residues coordinating heme iron in P. gingivalis HmuY, PinO and PinA could use one methionine residue to fulfill this function, with potential support of additional methionine residue(s). The P. intermedia proteins sequestered heme only from the host albumin-heme complex under reducing conditions. Our findings suggest that the HmuY-like family might comprise proteins subjected during evolution to significant diversification, resulting in different heme coordination modes. The new data presented in this manuscript on HmuY homologs produced by P. intermedia shed more light on the novel mechanism of heme uptake and could be helpful in discovering their biological function and in developing novel therapeutic approaches.
Subjects
Heme/genetics; Hemeproteins/genetics; Periodontitis/genetics; Prevotella intermedia/genetics; Gene Expression Regulation, Bacterial/genetics; Heme/chemistry; Hemeproteins/chemistry; Humans; Iron/metabolism; Periodontitis/microbiology; Periodontitis/pathology; Porphyromonas gingivalis/genetics; Porphyromonas gingivalis/pathogenicity; Prevotella intermedia/pathogenicity; RNA, Messenger/genetics; Sequence Homology, Amino Acid
ABSTRACT
The ribosome is not only a protein-making machine, but also a regulatory element in protein synthesis. This view is supported by our earlier data showing that Arabidopsis mitoribosomes altered due to the silencing of the nuclear RPS10 gene encoding mitochondrial ribosomal protein S10 differentially translate mitochondrial transcripts compared with the wild-type. Here, we used ribosome profiling to determine the contribution of transcriptional and translational control in the regulation of protein synthesis in rps10 mitochondria compared with the wild-type ones. Oxidative phosphorylation system proteins are preferentially synthesized in wild-type mitochondria but this feature is lost in the mutant. The rps10 mitoribosomes show slightly reduced translation efficiency of most respiration-related proteins and at the same time markedly more efficiently synthesize ribosomal proteins and MatR and TatC proteins. The mitoribosomes deficient in S10 protein protect shorter transcript fragments which exhibit a weaker 3-nt periodicity compared with the wild-type. The decrease in the triplet periodicity is particularly drastic for genes containing introns. Notably, splicing is considerably less effective in the mutant, indicating an unexpected link between the deficiency of S10 and mitochondrial splicing. Thus, a shortage of the mitoribosomal S10 protein has wide-ranging consequences on mitochondrial gene expression.
Subjects
Arabidopsis Proteins/genetics; Mitochondria/genetics; Mitochondria/metabolism; Mitochondrial Proteins/metabolism; Protein Biosynthesis/genetics; RNA Splicing/genetics; Ribosomal Proteins/genetics; Arabidopsis/genetics; Arabidopsis/metabolism; Arabidopsis/ultrastructure; Gene Deletion; Gene Expression Regulation, Plant; Mitochondrial Proteins/genetics; Plants, Genetically Modified; Ribosomal Proteins/deficiency
ABSTRACT
The regulation of infection and inflammation by a variety of host peptides may represent an evolutionary failsafe in terms of functional degeneracy and it emphasizes the significance of host defense in survival. Neuropeptides have been demonstrated to have similar antimicrobial activities to conventional antimicrobial peptides with broad-spectrum action against a variety of microorganisms. Neuropeptides display indirect anti-infective capacity via enhancement of the host's innate and adaptive immune defense mechanisms. However, more recently concerns have been raised that some neuropeptides may have the potential to augment microbial virulence. In this review we discuss the dual role of neuropeptides, perceived as a double-edged sword, with antimicrobial activity against bacteria, fungi, and protozoa but also capable of enhancing virulence and pathogenicity. We review the different ways by which neuropeptides modulate crucial stages of microbial pathogenesis such as adhesion, biofilm formation, invasion, intracellular lifestyle, dissemination, etc., including their anti-infective properties but also detrimental effects. Finally, we provide an overview of the efficacy and therapeutic potential of neuropeptides in murine models of infectious diseases and outline the intrinsic host factors as well as factors related to pathogen adaptation that may influence efficacy.
Subjects
Infections/immunology; Neuropeptides/immunology; Animals; Humans; Infections/microbiology; Infections/therapy; Molecular Targeted Therapy; Virulence
ABSTRACT
BACKGROUND: Bird mitogenomes differ from other vertebrates in gene rearrangement. The most common avian gene order, identified first in Gallus gallus, is considered ancestral for all Aves. However, other rearrangements including a duplicated control region and neighboring genes have been reported in many representatives of avian orders. The repeated regions can be easily overlooked due to inappropriate DNA amplification or genome sequencing. This raises a question about the actual prevalence of mitogenomic duplications and the validity of the current view on the avian mitogenome evolution. In this context, Palaeognathae is especially interesting because it is sister to all other living birds, i.e. Neognathae. So far, a unique duplicated region has been found in one palaeognath mitogenome, that of Eudromia elegans. RESULTS: Therefore, we applied an appropriate PCR strategy to look for omitted duplications in other palaeognaths. The analyses revealed the duplicated control regions with adjacent genes in Crypturellus, Rhea and Struthio as well as ND6 pseudogene in three moas. The copies are very similar and were subjected to concerted evolution. Mapping the presence and absence of duplication onto the Palaeognathae phylogeny indicates that the duplication was an ancestral state for this avian group. This feature was inherited by early diverged lineages and lost two times in others. Comparison of incongruent phylogenetic trees based on mitochondrial and nuclear sequences showed that two variants of mitogenomes could exist in the evolution of palaeognaths. Data collected for other avian mitogenomes revealed that the last common ancestor of all birds and early diverging lineages of Neoaves could also possess the mitogenomic duplication. CONCLUSIONS: The duplicated control regions with adjacent genes are more common in avian mitochondrial genomes than was previously thought. These two regions could increase effectiveness of replication and transcription as well as the number of replicating mitogenomes per organelle. In consequence, energy production by mitochondria may also be more efficient. However, further physiological and molecular analyses are necessary to assess the potential selective advantages of the mitogenome duplications.
Subjects
Genome, Mitochondrial; Palaeognathae; Animals; Birds/genetics; Evolution, Molecular; Gene Rearrangement; Phylogeny
ABSTRACT
Although previous phylogenetic analyses suggested that the araphid diatom family Plagiogrammaceae is monophyletic, there is still not a clear understanding of relationships among the genera, and the taxonomy of several genera, namely Dimeregramma and Plagiogramma, remains questionable in light of paraphyly for both genera using molecular and morphological data. We have expanded the available DNA for molecular work for dozens of plagiogrammacean clones and analyzed 29 morphological characters from plagiogrammacean taxa and closely related genera, to increase understanding of the evolutionary history and systematics of the family and re-evaluate the current taxonomical classification of plagiogrammacean genera. The addition of more taxa and more data confirms the results from previous molecular phylogenies: most plagiogrammacean genera are monophyletic, except for Dimeregramma and Plagiogramma. Interestingly, the morphological analysis resolves only Talaroneis and Glyphodesmis as monophyletic. Given these results, we feel there is limited support for retaining Dimeregramma and Plagiogramma as distinct genera, and formally propose amending Plagiogramma and transferring six Dimeregramma species. As the Plagiogrammaceae is also one of the first-diverging clades of pennate diatoms, we also used these molecular data to estimate the age of the family, based on multiple calibration points derived from fossil taxa within or close to the Plagiogrammaceae. The results indicated that the Plagiogrammaceae evolved more than 114 million years ago and its diversification appears to correspond to a time of climate cooling. Additionally, we described a new monotypic genus (Coccinelloidea) with one new species C. gracilis, and five new species within established genera, namely Plagiogramma marginalis, Plagiogramma harenae, Plagiogramma porcipellis, Neofragilaria montgomeryii and Psammogramma anacarae.
Subjects
Diatoms/classification; Diatoms/genetics; Phylogeny; Animals; Bayes Theorem; Climate Change; Diatoms/cytology; Diatoms/ultrastructure; Fossils; Sequence Analysis, DNA
ABSTRACT
BACKGROUND: The standard genetic code is a recipe for assigning unambiguously 21 labels, i.e. amino acids and stop translation signal, to 64 codons. However, at early stages of the translational machinery development, the codons did not have to be read unambiguously and the early genetic codes could have contained some ambiguous assignments of codons to amino acids. Therefore, the goal of this work was to obtain the genetic code structures which could have evolved assuming different types of inaccuracy of the translational machinery starting from unambiguous assignments of codons to amino acids. RESULTS: We developed a theoretical model assuming that the level of uncertainty of codon assignments can gradually decrease during the simulations. Since it is postulated that the standard code has evolved to be robust against point mutations and mistranslations, we developed three simulation scenarios assuming that such errors can influence one, two or three codon positions. The simulated codes were selected using the evolutionary algorithm methodology to decrease coding ambiguity and increase their robustness against mistranslation. CONCLUSIONS: The results indicate that the typical codon block structure of the genetic code could have evolved to decrease the ambiguity of amino acid to codon assignments and to increase the fidelity of reading the genetic information. However, the robustness to errors was not the decisive factor that influenced the genetic code evolution because it is possible to find theoretical codes that minimize the reading errors better than the standard genetic code.
Subjects
Genetic Code; Protein Biosynthesis/genetics; Algorithms; Codon/genetics; Computer Simulation; Entropy; Models, Genetic; Uncertainty
ABSTRACT
Mitochondrial genomes of vertebrates are generally thought to evolve under strong selection for size reduction and gene order conservation. Therefore, a growing number of mitogenomes with duplicated regions changes our view on the genome evolution. Among Aves, order Psittaciformes (parrots) is especially noteworthy because of its large morphological, ecological, and taxonomical diversity, which offers an opportunity to study genome evolution in various aspects. Former analyses showed that tandem duplications comprising the control region with adjacent genes are restricted to several lineages in which the duplication occurred independently. However, using an appropriate polymerase chain reaction strategy, we demonstrate that early diverged parrot groups contain mitogenomes with the duplicated region. These findings together with mapping duplication data from other mitogenomes onto parrot phylogeny indicate that the duplication was an ancestral state for Psittaciformes. The state was inherited by main parrot groups and was lost several times in some lineages. The duplicated regions were subjected to concerted evolution with a frequency higher than the rate of speciation. The duplicated control regions may provide a selective advantage due to a more efficient initiation of replication or transcription and a larger number of replicating genomes per organelle, which may lead to a more effective energy production by mitochondria. The mitogenomic duplications were associated with phenotypic features and parrots with the duplicated region can live longer, show larger body mass as well as predispositions to a more active flight. The results have wider implications for the presence of duplications and their evolution in mitogenomes of other avian groups.
Subjects
Gene Duplication; Genome, Mitochondrial; Parrots/genetics; Animals; Gene Order; Longevity/genetics; Parrots/anatomy & histology; Phylogeny
ABSTRACT
We evaluated the differences between the standard genetic code (SGC) and its known alternative variants in terms of the consequences of amino acid replacements. Furthermore, the properties of all the possible theoretical genetic codes, which differ from the SGC by one, two or three changes in codon assignments, were also tested. Although the SGC is closer to the best theoretical codes than to the worst ones due to the minimization of amino acid replacements, from 10% to 27% of all possible theoretical codes minimize the effect of these replacements better than the SGC. Interestingly, many types of codon reassignments observed in the alternative codes are also responsible for the substantial robustness to amino acid replacements. As many as 18 out of 21 alternatives perform better than the SGC under the assumed optimization criteria. These findings suggest that not all reassignments in the alternative codes are neutral and some of them could be selected to reduce harmful effects of mutations or translation of protein-coding sequences. The results also imply that the standard genetic code can be improved in this respect by a quite small number of changes, which are in fact realized in its variants. It would mean that the tendency to minimize mutational errors was not the main force that drove the evolution of the SGC.
Subjects
Codon; Evolution, Molecular; Genetic Code; Models, Genetic; Open Reading Frames
ABSTRACT
BACKGROUND: The standard genetic code (SGC) is a unique set of rules that assign amino acids to codons. Similar amino acids tend to have similar codons, which suggests that the code evolved to minimize the costs of amino acid replacements in proteins caused by mutations or translational errors. However, if such optimization did occur, many different properties of amino acids must have been taken into account during the evolution of the code. Therefore, this problem can be reformulated as a multi-objective optimization task in which the selection constraints are represented by measures based on various amino acid properties. RESULTS: To study the optimality of the SGC, we applied a multi-objective evolutionary algorithm using the representatives of eight clusters that grouped over 500 indices describing various physicochemical properties of amino acids. This allowed us to avoid an arbitrary choice of amino acid features as optimization criteria and, consequently, to conduct a more general study of the properties of the SGC than those presented so far in other papers on this topic. We considered two models of the genetic code, one preserving the characteristic codon block structure of the SGC and the other without this restriction. The results revealed that the SGC could be significantly improved in terms of error minimization; hence, it is not fully optimized. Its structure differs significantly from that of the codes optimized to minimize the costs of amino acid replacements. On the other hand, using newly defined quality measures that place the SGC in the global space of theoretical genetic codes, we showed that the SGC is definitely closer to the codes that minimize the costs of amino acid replacements than to those that maximize them. CONCLUSIONS: The standard genetic code most likely represents only a partially optimized system that emerged under the influence of many different factors. Our findings can be useful to researchers involved in modifying the genetic code of living organisms and designing artificial ones.
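The error-minimization idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it scores a code by the mean squared change of one amino acid property over all single-nucleotide substitutions, builds alternative codes that keep the SGC codon blocks by permuting amino acids among them, and compares cost vectors by Pareto dominance. The Kyte-Doolittle hydropathy scale is an assumed stand-in for the eight cluster representatives, and the names `replacement_cost`, `shuffled_code`, `cost_vector` and `dominates` are hypothetical.

```python
import itertools
import random

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
SGC = {"".join(c): aa
       for c, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)}

# Kyte-Doolittle hydropathy: an assumed stand-in property, not one of the
# cluster representatives used in the study.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
PROPERTIES = [HYDROPATHY]  # add further property scales for more objectives


def replacement_cost(code, prop):
    """Mean squared property change over single-nucleotide substitutions
    between sense codons (substitutions to or from stop codons are skipped)."""
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = code[codon[:pos] + base + codon[pos + 1:]]
                if mutant == "*":
                    continue
                total += (prop[aa] - prop[mutant]) ** 2
                count += 1
    return total / count


def shuffled_code(code, rng):
    """Permute amino acids among the existing codon blocks; stops stay fixed."""
    aas = sorted(set(code.values()) - {"*"})
    perm = aas[:]
    rng.shuffle(perm)
    mapping = dict(zip(aas, perm))
    mapping["*"] = "*"
    return {codon: mapping[aa] for codon, aa in code.items()}


def cost_vector(code, props):
    """One cost per property; each entry is an objective to minimize."""
    return tuple(replacement_cost(code, p) for p in props)


def dominates(a, b):
    """Pareto dominance for cost vectors (lower is better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


if __name__ == "__main__":
    rng = random.Random(0)
    sgc = cost_vector(SGC, PROPERTIES)
    n_dom = sum(dominates(cost_vector(shuffled_code(SGC, rng), PROPERTIES), sgc)
                for _ in range(1000))
    print(f"SGC cost vector: {sgc}; dominated by {n_dom} of 1000 shuffled codes")
```

With a single property, dominance reduces to a strictly lower cost. Appending further scales to `PROPERTIES` turns the comparison into a genuinely multi-objective one, which is the setting the abstract describes.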
Subjects
Algorithms , Molecular Evolution , Genetic Code , Amino Acids/genetics , Codon/genetics , Discriminant Analysis , Genetic Models , Genetic Operator Regions/genetics
ABSTRACT
Signal peptides are N-terminal presequences that target proteins to the endomembrane system and, subsequently, to subcellular or extracellular compartments; they therefore condition the proper function of these proteins. The significance of signal peptides stimulates the development of new computational methods for their detection. These methods employ learning systems trained on datasets comprising signal peptides from different types of proteins and taxonomic groups. As a result, prediction accuracy is high for signal peptides that are well represented in databases but may be low in other, atypical cases. Such atypical signal peptides are present in proteins of apicomplexan parasites, the causative agents of malaria and toxoplasmosis. Apicomplexan proteins have a unique amino acid composition due to their AT-biased genomes. Therefore, we designed a new, more flexible and universal probabilistic model for the recognition of atypical eukaryotic signal peptides. Our approach, called signalHsmm, includes knowledge about the structure of signal peptides and the physicochemical properties of amino acids. It recognizes signal peptides from malaria parasites and related species more accurately than popular programs. Moreover, it is still universal enough to predict other signal peptides on par with the best-performing predictors.
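One common way to bring physicochemical properties into a sequence model is to collapse the 20 amino acids into a few property groups, so that proteins with unusual composition still map onto familiar patterns. The sketch below illustrates only that encoding step. The grouping, the function names and the toy sequence are assumptions for illustration and are not taken from signalHsmm.

```python
# Illustrative grouping of amino acids by broad physicochemical class.
# This is an assumed grouping, not the one used by signalHsmm.
GROUPS = {
    "h": "AVLIMFWC",  # hydrophobic
    "+": "KRH",       # positively charged
    "-": "DE",        # negatively charged
    "p": "STNQYGP",   # polar or small
}
AA_TO_GROUP = {aa: symbol for symbol, aas in GROUPS.items() for aa in aas}


def encode(sequence):
    """Rewrite a protein sequence in the reduced alphabet ('x' = unknown)."""
    return "".join(AA_TO_GROUP.get(aa, "x") for aa in sequence.upper())


def longest_run(encoded, symbol, window=30):
    """Length of the longest run of `symbol` within the first `window` positions."""
    best = current = 0
    for s in encoded[:window]:
        current = current + 1 if s == symbol else 0
        best = max(best, current)
    return best


if __name__ == "__main__":
    toy = "MKKLLLLAVALLAASGAQADEQ"  # made-up N-terminus, not a real protein
    reduced = encode(toy)
    print(reduced, longest_run(reduced, "h"))
```

A long hydrophobic run near the N-terminus is a classic hallmark of signal peptides, which is why the run length is used as the example statistic. A full predictor would feed the encoded sequence into a probabilistic model rather than apply a threshold.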
Subjects
Plasmodium/chemistry , Protein Sorting Signals , Protozoan Proteins/chemistry , Protein Sequence Analysis/methods , Amino Acids/chemistry , Markov Chains , Protein Sequence Analysis/standards
ABSTRACT
BACKGROUND: Conures are a morphologically diverse group of Neotropical parrots classified as members of the tribe Arini, which has recently been subjected to a taxonomic revision. The previously broadly defined genus Aratinga of this tribe has been split into the 'true' Aratinga and three additional genera: Eupsittula, Psittacara and Thectocercus. Popular markers used in the reconstruction of parrot phylogenies derive from mitochondrial DNA. However, current phylogenetic analyses indicate conflicting relationships between Aratinga and other conures, and also among other Arini members. Therefore, it is not clear whether the mtDNA phylogenies can reliably define the species tree. The inconsistencies may result from the variable evolutionary rate of the markers used or from their weak phylogenetic signal. To resolve these controversies and to assess to what extent the phylogenetic relationships in the tribe Arini can be inferred from mitochondrial genomes, we compared representative Arini mitogenomes and examined the usefulness of individual mitochondrial markers and the efficiency of various phylogenetic methods. RESULTS: Single molecular markers produced inconsistent tree topologies, and different methods yielded different topologies even for the same marker. Significant disagreement among tree topologies occurred for the cytb, nd2 and nd6 genes, which are commonly used in parrot phylogenies. The strongest phylogenetic signal was found in the control region and RNA genes. However, these markers cannot be used alone to infer Arini phylogenies because they do not provide fully resolved trees. The most reliable phylogeny of the parrots under study was obtained only from the concatenated set of all mitochondrial markers. The analyses established well-resolved relationships among the former Aratinga representatives and the main genera of the tribe Arini. This mtDNA phylogeny may agree with the species tree, because it matches synapomorphic features in plumage colouration. CONCLUSIONS: Phylogenetic relationships inferred from single mitochondrial markers can be incorrect and contradictory. Therefore, such phylogenies should be treated with caution. Reliable results can be produced by concatenated sets of all, or at least the majority of, mitochondrial genes and the control region. The results advance a new view of the relationships among the main genera of Arini and resolve the inconsistencies between the taxa that were previously classified in the broadly defined genus Aratinga. Although gene and species trees do not always have to be consistent, the mtDNA phylogenies for Arini can reflect the species tree.
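The concatenation step recommended in the conclusions can be sketched as follows. This is a minimal illustration and not the authors' pipeline: per-marker alignments are joined into one supermatrix, taxa missing from a marker are padded with gaps, and partition boundaries are recorded so that each marker can still be given its own substitution model. The function name, the input layout and the toy sequences are assumptions.

```python
def concatenate_alignments(alignments, gap="-"):
    """Join per-marker alignments into a supermatrix.

    alignments: dict mapping marker name -> dict of taxon -> aligned sequence.
    Returns (supermatrix, partitions), where partitions holds
    (marker, first_column, last_column) with 1-based inclusive columns.
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for marker, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences for {marker} are not aligned to one length")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa without this marker get an all-gap block of matching length.
            pieces[taxon].append(aln.get(taxon, gap * length))
        partitions.append((marker, start, start + length - 1))
        start += length
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


if __name__ == "__main__":
    # Made-up toy alignments; real markers are hundreds of columns long.
    toy = {
        "cytb": {"taxon_A": "ATGC", "taxon_B": "ATGA"},
        "nd2": {"taxon_A": "CC", "taxon_C": "CT"},
    }
    matrix, parts = concatenate_alignments(toy)
    for taxon, seq in matrix.items():
        print(taxon, seq)
    print(parts)
```

Keeping the partition table alongside the supermatrix is a deliberate choice: markers such as protein-coding genes, RNA genes and the control region evolve at different rates, so downstream tree inference usually benefits from modelling them separately.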