Search | BVS Aleitamento Materno

Highly significant improvement of protein sequence alignments with AlphaFold2.

Baltzis, Athanasios; Mansouri, Leila; Jin, Suzanne; Langer, Björn E; Erb, Ionas; Notredame, Cedric.

Bioinformatics ; 38(22): 5007-5011, 2022 11 15.

Article in English | MEDLINE | ID: mdl-36130276

ABSTRACT

MOTIVATION: Protein sequence alignments are essential to structural, evolutionary and functional analysis, but their accuracy is often limited by sequence similarity unless molecular structures are available. Protein structures predicted at experimental-grade accuracy, as achieved by AlphaFold2, could therefore have a major impact on sequence analysis. RESULTS: Here, we find that multiple sequence alignments estimated on AlphaFold2 predictions are almost as accurate as alignments estimated on experimental structures and significantly closer to the structural reference than sequence-based alignments. We also show that AlphaFold2 structural models of relatively low quality can be used to obtain highly accurate alignments. These results suggest that, besides structure modeling, AlphaFold2 encodes higher-order dependencies that can be exploited for sequence analysis. AVAILABILITY AND IMPLEMENTATION: All data, analyses and results are available on Zenodo (https://doi.org/10.5281/zenodo.7031286). The code and scripts have been deposited in GitHub (https://github.com/cbcrg/msa-af2-nf) and the containers in the Sylabs library and Docker Hub (https://cloud.sylabs.io/library/athbaltzis/af2/alphafold, https://hub.docker.com/r/athbaltzis/pred). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Proteins; Software; Sequence Alignment; Biological Evolution

Phylogenetic and functional characterization of water bears (Tardigrada) tubulins.

Novotná Floriancicová, Kamila; Baltzis, Athanasios; Smejkal, Jirí; Czerneková, Michaela; Kaczmarek, Lukasz; Malý, Jan; Notredame, Cedric; Vinopal, Stanislav.

Sci Rep ; 13(1): 5194, 2023 03 30.

Article in English | MEDLINE | ID: mdl-36997657

ABSTRACT

Tardigrades are microscopic ecdysozoans that can withstand extreme environmental conditions. Several tardigrade species undergo reversible morphological transformations and enter into cryptobiosis, which helps them to survive periods of unfavorable environmental conditions. However, the underlying molecular mechanisms of cryptobiosis are mostly unknown. Tubulins are evolutionarily conserved components of the microtubule cytoskeleton that are crucial in many cellular processes. We hypothesize that microtubules are necessary for the morphological changes associated with successful cryptobiosis. The molecular composition of the microtubule cytoskeleton in tardigrades is unknown. Therefore, we analyzed and characterized tardigrade tubulins and identified 79 tardigrade tubulin sequences in eight taxa. We found three α-, seven β-, one γ-, and one ε-tubulin isoform. To verify in silico identified tardigrade tubulins, we also isolated and sequenced nine out of ten predicted Hypsibius exemplaris tubulins. All tardigrade tubulins were localized as expected when overexpressed in mammalian cultured cells: to the microtubules or to the centrosomes. The presence of a functional ε-tubulin, clearly localized to centrioles, is attractive from a phylogenetic point of view. Although the phylogenetically close Nematoda lost their δ- and ε-tubulins, some groups of Arthropoda still possess them. Thus, our data support the current placement of tardigrades into the Panarthropoda clade.

Subjects

Phylogeny; Tardigrada; Animals; Tardigrada/classification; Tubulin/genetics

Multiple Sequence Alignment Computation Using the T-Coffee Regressive Algorithm Implementation.

Garriga, Edgar; Di Tommaso, Paolo; Magis, Cedrik; Erb, Ionas; Mansouri, Leila; Baltzis, Athanasios; Floden, Evan; Notredame, Cedric.

Methods Mol Biol ; 2231: 89-97, 2021.

Article in English | MEDLINE | ID: mdl-33289888

ABSTRACT

Many fields of biology rely on the inference of accurate multiple sequence alignments (MSAs) of biological sequences. Unfortunately, the problem of assembling an MSA is NP-complete, which limits computation to approximate solutions obtained with heuristics. The progressive algorithm is one of the most popular frameworks for the computation of MSAs. It involves pre-clustering the sequences and aligning them starting with the most similar ones. The scalability of this framework is limited, especially with respect to accuracy. We present here an alternative approach named the regressive algorithm. In this framework, sequences are first clustered and then aligned starting with the most distantly related ones. This approach has been shown to greatly improve accuracy during scale-up, especially on datasets featuring 10,000 sequences or more. Another benefit is the possibility to integrate third-party clustering methods and third-party MSA aligners. The regressive algorithm has been tested on up to 1.5 million sequences, and its implementation is available in the T-Coffee package.

Subjects

Computational Biology/methods; Sequence Alignment/methods; Software; Algorithms; Cluster Analysis; Computational Biology/instrumentation; Sequence Alignment/instrumentation

Large multiple sequence alignments with a root-to-leaf regressive method.

Garriga, Edgar; Di Tommaso, Paolo; Magis, Cedrik; Erb, Ionas; Mansouri, Leila; Baltzis, Athanasios; Laayouni, Hafid; Kondrashov, Fyodor; Floden, Evan; Notredame, Cedric.

Nat Biotechnol ; 37(12): 1466-1470, 2019 12.

Article in English | MEDLINE | ID: mdl-31792410

ABSTRACT

Multiple sequence alignments (MSAs) are used for structural [1,2] and evolutionary predictions [1,2], but the complexity of aligning large datasets requires the use of approximate solutions [3], including the progressive algorithm [4]. Progressive MSA methods start by aligning the most similar sequences and subsequently incorporate the remaining sequences, from leaf to root, based on a guide tree. Their accuracy declines substantially as the number of sequences is scaled up [5]. We introduce a regressive algorithm that enables MSA of up to 1.4 million sequences on a standard workstation and substantially improves accuracy on datasets larger than 10,000 sequences. Our regressive algorithm works the other way around from the progressive algorithm and begins by aligning the most dissimilar sequences. It uses an efficient divide-and-conquer strategy to run third-party alignment methods in linear time, regardless of their original complexity. Our approach will enable analyses of extremely large genomic datasets such as the recently announced Earth BioGenome Project, which comprises 1.5 million eukaryotic genomes [6].
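The difference in direction between the two strategies can be illustrated with a toy guide tree. The sketch below is not the T-Coffee implementation and performs no alignment; it only lists the order in which the internal nodes of a guide tree would be visited, assuming a binary tree encoded as nested tuples. The tree, the sequence names and the function names are all hypothetical.

```python
from collections import deque

# Hypothetical toy guide tree: leaves are sequence names,
# internal nodes are 2-tuples (left subtree, right subtree).
GUIDE_TREE = ((("seqA", "seqB"), "seqC"), ("seqD", "seqE"))


def leaves(node):
    """Return the sequence names under a node, left to right."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)


def progressive_order(node):
    """Leaf-to-root: visit internal nodes in post-order (children before parents)."""
    if isinstance(node, str):
        return []
    left, right = node
    return (
        progressive_order(left)
        + progressive_order(right)
        + [(leaves(left), leaves(right))]
    )


def regressive_order(root):
    """Root-to-leaf: visit internal nodes breadth-first, starting at the root."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        if isinstance(node, str):
            continue  # a single sequence has nothing left to split
        left, right = node
        order.append((leaves(left), leaves(right)))
        queue.extend([left, right])
    return order


if __name__ == "__main__":
    print("progressive:", progressive_order(GUIDE_TREE))
    print("regressive: ", regressive_order(GUIDE_TREE))
```

In this toy setting the progressive order starts with the closest pair (seqA, seqB) and ends at the root, whereas the regressive order starts at the root split, where the two groups are the most dissimilar.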

Subjects

Algorithms; Sequence Alignment/methods; Databases, Genetic; Eukaryota/genetics; Genomics/methods; Regression Analysis

Characterizing a partially ordered miniprotein through folding molecular dynamics simulations: Comparison with the experimental data.

Baltzis, Athanasios S; Glykos, Nicholas M.

Protein Sci ; 25(3): 587-96, 2016 Mar.

Article in English | MEDLINE | ID: mdl-26609791

ABSTRACT

The villin headpiece helical subdomain (HP36) is one of the best known model systems for computational studies of fast-folding all-α miniproteins. HP21 is a peptide fragment, derived from HP36, comprising only the first and second helices of the full domain. Experimental studies showed that although HP21 is mostly unfolded in solution, it does maintain some persistent native-like structure as indicated by the analysis of NMR-derived chemical shifts. Here we compare the experimental data for HP21 with the results obtained from a 15-µs-long folding molecular dynamics simulation performed in explicit water and with full electrostatics. We find that the simulation is in good agreement with the experiment and faithfully reproduces the major experimental findings, namely that (a) HP21 is disordered in solution with <10% of the trajectory corresponding to transiently stable structures, (b) the most highly populated conformer is a native-like structure with an RMSD from the corresponding portion of the HP36 crystal structure of <1 Å, (c) the simulation-derived chemical shifts, over the whole length of the trajectory, are in reasonable agreement with the experiment giving reduced χ² values of 1.6, 1.4, and 0.8 for the Δδ¹³Cα, Δδ¹³CO, and Δδ¹³Cβ secondary shifts, respectively (becoming 0.8, 0.7, and 0.3 when only the major peptide conformer is considered), and finally, (d) the secondary structure propensity scores are in very good agreement with the experiment and clearly indicate the higher stability of the first helix. We conclude that folding molecular dynamics simulations can be a useful tool for the structural characterization of even marginally stable peptides.
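A reduced χ² of the kind quoted above compares simulated and experimental values relative to an expected uncertainty. The sketch below shows one common form of the statistic. The abstract does not state the exact definition used, so the normalization by the number of data points (rather than by degrees of freedom after fitting), the single uncertainty `sigma`, and the function name are assumptions, and the numbers in the usage lines are invented for illustration only.

```python
def reduced_chi_squared(simulated, experimental, sigma):
    """Mean squared deviation in units of sigma (assumed form of reduced chi-squared).

    simulated, experimental: equal-length sequences of secondary shifts (ppm).
    sigma: assumed uncertainty of the comparison (ppm), same for all residues.
    """
    if len(simulated) != len(experimental) or len(simulated) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    total = sum(((s - e) / sigma) ** 2 for s, e in zip(simulated, experimental))
    return total / len(simulated)


if __name__ == "__main__":
    # Invented values, not data from the study.
    sim = [1.0, 2.0]
    exp = [0.0, 0.0]
    print(reduced_chi_squared(sim, exp, sigma=1.0))
```

Under this form, a value near 1 means the deviations are about as large as the assumed uncertainty, and values well below 1 mean they are smaller.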

Subjects

Microfilament Proteins/chemistry; Molecular Dynamics Simulation; Neurofilament Proteins/chemistry; Peptide Fragments/chemistry; Protein Folding; Amino Acid Sequence; Protein Stability; Protein Structure, Secondary; Static Electricity
