Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 20 de 23
Filtrar
Mais filtros








Base de dados
Intervalo de ano de publicação
1.
Nat Protoc ; 2024 Oct 14.
Artigo em Inglês | MEDLINE | ID: mdl-39402428

RESUMO

Since its public release in 2021, AlphaFold2 (AF2) has made investigating biological questions, by using predicted protein structures of single monomers or full complexes, a common practice. ColabFold-AF2 is an open-source Jupyter Notebook inside Google Colaboratory and a command-line tool that makes it easy to use AF2 while exposing its advanced options. ColabFold-AF2 shortens turnaround times of experiments because of its optimized usage of AF2's models. In this protocol, we guide the reader through ColabFold best practices by using three scenarios: (i) monomer prediction, (ii) complex prediction and (iii) conformation sampling. The first two scenarios cover classic static structure prediction and are demonstrated on the human glycosylphosphatidylinositol transamidase protein. The third scenario demonstrates an alternative use case of the AF2 models by predicting two conformations of the human alanine serine transporter 2. Users can run the protocol without computational expertise via Google Colaboratory or in a command-line environment for advanced users. Using Google Colaboratory, it takes <2 h to run each procedure. The data and code for this protocol are available at https://protocol.colabfold.com .

2.
Artigo em Inglês | MEDLINE | ID: mdl-38316555

RESUMO

The recent CASP15 competition highlighted the critical role of multiple sequence alignments (MSAs) in protein structure prediction, as demonstrated by the success of the top AlphaFold2-based prediction methods. To push the boundaries of MSA utilization, we conducted a petabase-scale search of the Sequence Read Archive (SRA), resulting in gigabytes of aligned homologs for CASP15 targets. These were merged with default MSAs produced by ColabFold-search and provided to ColabFold-predict. By using SRA data, we achieved highly accurate predictions (GDT_TS > 70) for 66% of the non-easy targets, whereas using ColabFold-search default MSAs scored highly in only 52%. Next, we tested the effect of deep homology search and ColabFold's advanced features, such as more recycles, on prediction accuracy. While SRA homologs were most significant for improving ColabFold's CASP15 ranking from 11th to 3rd place, other strategies contributed too. We analyze these in the context of existing strategies to improve prediction.


Assuntos
Biologia Computacional , Proteínas , Biologia Computacional/métodos , Proteínas/química , Alinhamento de Sequência , Conformação Proteica , Software , Algoritmos , Análise de Sequência de Proteína/métodos
4.
Nucleic Acids Res ; 52(D1): D426-D433, 2024 Jan 05.
Artigo em Inglês | MEDLINE | ID: mdl-37933852

RESUMO

The DescribePROT database of amino acid-level descriptors of protein structures and functions was substantially expanded since its release in 2020. This expansion includes substantial increase in the size, scope, and quality of the underlying data, the addition of experimental structural information, the inclusion of new data download options, and an upgraded graphical interface. DescribePROT currently covers 19 structural and functional descriptors for proteins in 273 reference proteomes generated by 11 accurate and complementary predictive tools. Users can search our resource in multiple ways, interact with the data using the graphical interface, and download data at various scales including individual proteins, entire proteomes, and whole database. The annotations in DescribePROT are useful for a broad spectrum of studies that include investigations of protein structure and function, development and validation of predictive tools, and to support efforts in understanding molecular underpinnings of diseases and development of therapeutics. DescribePROT can be freely accessed at http://biomine.cs.vcu.edu/servers/DESCRIBEPROT/.


Assuntos
Aminoácidos , Proteoma , Proteoma/química , Bases de Dados Factuais
5.
Nucleic Acids Res ; 52(D1): D368-D375, 2024 Jan 05.
Artigo em Inglês | MEDLINE | ID: mdl-37933859

RESUMO

The AlphaFold Database Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) has significantly impacted structural biology by amassing over 214 million predicted protein structures, expanding from the initial 300k structures released in 2021. Enabled by the groundbreaking AlphaFold2 artificial intelligence (AI) system, the predictions archived in AlphaFold DB have been integrated into primary data resources such as PDB, UniProt, Ensembl, InterPro and MobiDB. Our manuscript details subsequent enhancements in data archiving, covering successive releases encompassing model organisms, global health proteomes, Swiss-Prot integration, and a host of curated protein datasets. We detail the data access mechanisms of AlphaFold DB, from direct file access via FTP to advanced queries using Google Cloud Public Datasets and the programmatic access endpoints of the database. We also discuss the improvements and services added since its initial release, including enhancements to the Predicted Aligned Error viewer, customisation options for the 3D viewer, and improvements in the search engine of AlphaFold DB.


The AlphaFold Protein Structure Database (AlphaFold DB) is a massive digital library of predicted protein structures, with over 214 million entries, marking a 500-times expansion in size since its initial release in 2021. The structures are predicted using Google DeepMind's AlphaFold 2 artificial intelligence (AI) system. Our new report highlights the latest updates we have made to this database. We have added more data on specific organisms and proteins related to global health and expanded to cover almost the complete UniProt database, a primary data resource of protein sequences. We also made it easier for our users to access the data by directly downloading files or using advanced cloud-based tools. Finally, we have also improved how users view and search through these protein structures, making the user experience smoother and more informative. In short, AlphaFold DB has been growing rapidly and has become more user-friendly and robust to support the broader scientific community.


Assuntos
Inteligência Artificial , Estrutura Secundária de Proteína , Proteoma , Sequência de Aminoácidos , Bases de Dados de Proteínas , Ferramenta de Busca , Proteínas/química
6.
Nat Biotechnol ; 42(2): 243-246, 2024 Feb.
Artigo em Inglês | MEDLINE | ID: mdl-37156916

RESUMO

As structure prediction methods are generating millions of publicly available protein structures, searching these databases is becoming a bottleneck. Foldseek aligns the structure of a query protein against a database by describing tertiary amino acid interactions within proteins as sequences over a structural alphabet. Foldseek decreases computation times by four to five orders of magnitude with 86%, 88% and 133% of the sensitivities of Dali, TM-align and CE, respectively.


Assuntos
Algoritmos , Proteínas , Bases de Dados de Proteínas , Proteínas/química , Aminoácidos , Software
7.
Nature ; 622(7983): 637-645, 2023 Oct.
Artigo em Inglês | MEDLINE | ID: mdl-37704730

RESUMO

Proteins are key to all cellular processes and their structure is important in understanding their function and evolution. Sequence-based predictions of protein structures have increased in accuracy1, and over 214 million predicted structures are available in the AlphaFold database2. However, studying protein structures at this scale requires highly efficient methods. Here, we developed a structural-alignment-based clustering algorithm-Foldseek cluster-that can cluster hundreds of millions of structures. Using this method, we have clustered all of the structures in the AlphaFold database, identifying 2.30 million non-singleton structural clusters, of which 31% lack annotations representing probable previously undescribed structures. Clusters without annotation tend to have few representatives covering only 4% of all proteins in the AlphaFold database. Evolutionary analysis suggests that most clusters are ancient in origin but 4% seem to be species specific, representing lower-quality predictions or examples of de novo gene birth. We also show how structural comparisons can be used to predict domain families and their relationships, identifying examples of remote structural similarity. On the basis of these analyses, we identify several examples of human immune-related proteins with putative remote homology in prokaryotic species, illustrating the value of this resource for studying protein function and evolution across the tree of life.


Assuntos
Algoritmos , Análise por Conglomerados , Proteínas , Homologia Estrutural de Proteína , Humanos , Bases de Dados de Proteínas , Proteínas/química , Proteínas/classificação , Proteínas/metabolismo , Alinhamento de Sequência , Anotação de Sequência Molecular , Células Procarióticas/química , Filogenia , Especificidade da Espécie , Evolução Molecular
8.
bioRxiv ; 2023 Jul 11.
Artigo em Inglês | MEDLINE | ID: mdl-37503235

RESUMO

The recent CASP15 competition highlighted the critical role of multiple sequence alignments (MSAs) in protein structure prediction, as demonstrated by the success of the top AlphaFold2-based prediction methods. To push the boundaries of MSA utilization, we conducted a petabase-scale search of the Sequence Read Archive (SRA), resulting in gigabytes of aligned homologs for CASP15 targets. These were merged with default MSAs produced by ColabFold-search and provided to ColabFold-predict. By using SRA data, we achieved highly accurate predictions (GDT_TS > 70) for 66% of the non-easy targets, whereas using ColabFold-search default MSAs scored highly in only 52%. Next, we tested the effect of deep homology search and ColabFold's advanced features, such as more recycles, on prediction accuracy. While SRA homologs were most significant for improving ColabFold's CASP15 ranking from 11th to 3rd place, other strategies contributed too. We analyze these in the context of existing strategies to improve prediction.

9.
Genome Biol ; 24(1): 113, 2023 05 12.
Artigo em Inglês | MEDLINE | ID: mdl-37173746

RESUMO

BACKGROUND: Protein annotation is a major goal in molecular biology, yet experimentally determined knowledge is typically limited to a few model organisms. In non-model species, the sequence-based prediction of gene orthology can be used to infer protein identity; however, this approach loses predictive power at longer evolutionary distances. Here we propose a workflow for protein annotation using structural similarity, exploiting the fact that similar protein structures often reflect homology and are more conserved than protein sequences. RESULTS: We propose a workflow of openly available tools for the functional annotation of proteins via structural similarity (MorF: MorphologFinder) and use it to annotate the complete proteome of a sponge. Sponges are highly relevant for inferring the early history of animals, yet their proteomes remain sparsely annotated. MorF accurately predicts the functions of proteins with known homology in [Formula: see text] cases and annotates an additional [Formula: see text] of the proteome beyond standard sequence-based methods. We uncover new functions for sponge cell types, including extensive FGF, TGF, and Ephrin signaling in sponge epithelia, and redox metabolism and control in myopeptidocytes. Notably, we also annotate genes specific to the enigmatic sponge mesocytes, proposing they function to digest cell walls. CONCLUSIONS: Our work demonstrates that structural similarity is a powerful approach that complements and extends sequence similarity searches to identify homologous proteins over long evolutionary distances. We anticipate this will be a powerful approach that boosts discovery in numerous -omics datasets, especially for non-model organisms.


Assuntos
Proteoma , Animais , Anotação de Sequência Molecular , Sequência de Aminoácidos
10.
Bioinformatics ; 39(4)2023 04 03.
Artigo em Inglês | MEDLINE | ID: mdl-36961332

RESUMO

SUMMARY: Highly accurate protein structure predictors have generated hundreds of millions of protein structures; these pose a challenge in terms of storage and processing. Here, we present Foldcomp, a novel lossy structure compression algorithm, and indexing system to address this challenge. By using a combination of internal and Cartesian coordinates and a bi-directional NeRF-based strategy, Foldcomp improves the compression ratio by a factor of three compared to the next best method. Its reconstruction error of 0.08 Å is comparable to the best lossy compressor. It is five times faster than the next fastest compressor and competes with the fastest decompressors. With its multi-threading implementation and a Python interface that allows for easy database downloads and efficient querying of protein structures by accession, Foldcomp is a powerful tool for managing and analysing large collections of protein structures. AVAILABILITY AND IMPLEMENTATION: Foldcomp is a free open-source software (GPLv3) and available for Linux, macOS, and Windows at https://foldcomp.foldseek.com. Foldcomp provides the AlphaFold Swiss-Prot (2.9GB), TrEMBL (1.1TB), and ESMatlas HQ (114GB) database ready-for-download.


Assuntos
Compressão de Dados , Software , Algoritmos , Compressão de Dados/métodos , Proteínas , Biblioteca Gênica
11.
Protein Sci ; 32(1): e4524, 2023 01.
Artigo em Inglês | MEDLINE | ID: mdl-36454227

RESUMO

The availability of accurate and fast artificial intelligence (AI) solutions predicting aspects of proteins are revolutionizing experimental and computational molecular biology. The webserver LambdaPP aspires to supersede PredictProtein, the first internet server making AI protein predictions available in 1992. Given a protein sequence as input, LambdaPP provides easily accessible visualizations of protein 3D structure, along with predictions at the protein level (GeneOntology, subcellular location), and the residue level (binding to metal ions, small molecules, and nucleotides; conservation; intrinsic disorder; secondary structure; alpha-helical and beta-barrel transmembrane segments; signal-peptides; variant effect) in seconds. The structure prediction provided by LambdaPP-leveraging ColabFold and computed in minutes-is based on MMseqs2 multiple sequence alignments. All other feature prediction methods are based on the pLM ProtT5. Queried by a protein sequence, LambdaPP computes protein and residue predictions almost instantly for various phenotypes, including 3D structure and aspects of protein function. LambdaPP is freely available for everyone to use under embed.predictprotein.org, the interactive results for the case study can be found under https://embed.predictprotein.org/o/Q9NZC2. The frontend of LambdaPP can be found on GitHub (github.com/sacdallago/embed.predictprotein.org), and can be freely used and distributed under the academic free use license (AFL-2). For high-throughput applications, all methods can be executed locally via the bio-embeddings (bioembeddings.com) python package, or docker image at ghcr.io/bioembeddings/bio_embeddings, which also includes the backend of LambdaPP.


Assuntos
Inteligência Artificial , Proteínas , Proteínas/química , Sequência de Aminoácidos , Estrutura Secundária de Proteína , Alinhamento de Sequência , Software
12.
Nat Methods ; 19(6): 679-682, 2022 06.
Artigo em Inglês | MEDLINE | ID: mdl-35637307

RESUMO

ColabFold offers accelerated prediction of protein structures and complexes by combining the fast homology search of MMseqs2 with AlphaFold2 or RoseTTAFold. ColabFold's 40-60-fold faster search and optimized model utilization enables prediction of close to 1,000 structures per day on a server with one graphics processing unit. Coupled with Google Colaboratory, ColabFold becomes a free and accessible platform for protein folding. ColabFold is open-source software available at https://github.com/sokrypton/ColabFold and its novel environmental databases are available at https://colabfold.mmseqs.com .


Assuntos
Dobramento de Proteína , Software , Computadores , Bases de Dados Factuais , Proteínas
13.
Nucleic Acids Res ; 49(W1): W535-W540, 2021 07 02.
Artigo em Inglês | MEDLINE | ID: mdl-33999203

RESUMO

Since 1992 PredictProtein (https://predictprotein.org) is a one-stop online resource for protein sequence analysis with its main site hosted at the Luxembourg Centre for Systems Biomedicine (LCSB) and queried monthly by over 3,000 users in 2020. PredictProtein was the first Internet server for protein predictions. It pioneered combining evolutionary information and machine learning. Given a protein sequence as input, the server outputs multiple sequence alignments, predictions of protein structure in 1D and 2D (secondary structure, solvent accessibility, transmembrane segments, disordered regions, protein flexibility, and disulfide bridges) and predictions of protein function (functional effects of sequence variation or point mutations, Gene Ontology (GO) terms, subcellular localization, and protein-, RNA-, and DNA binding). PredictProtein's infrastructure has moved to the LCSB increasing throughput; the use of MMseqs2 sequence search reduced runtime five-fold (apparently without lowering performance of prediction methods); user interface elements improved usability, and new prediction methods were added. PredictProtein recently included predictions from deep learning embeddings (GO and secondary structure) and a method for the prediction of proteins and residues binding DNA, RNA, or other proteins. PredictProtein.org aspires to provide reliable predictions to computational and experimental biologists alike. All scripts and methods are freely available for offline execution in high-throughput settings.


Assuntos
Conformação Proteica , Software , Sítios de Ligação , Proteínas do Nucleocapsídeo de Coronavírus/química , Proteínas de Ligação a DNA/química , Fosfoproteínas/química , Estrutura Secundária de Proteína , Proteínas/química , Proteínas/fisiologia , Proteínas de Ligação a RNA/química , Alinhamento de Sequência , Análise de Sequência de Proteína
14.
Bioinformatics ; 37(19): 3364-3366, 2021 Oct 11.
Artigo em Inglês | MEDLINE | ID: mdl-33792634

RESUMO

SUMMARY: SpacePHARER (CRISPR Spacer Phage-Host Pair Finder) is a sensitive and fast tool for de novo prediction of phage-host relationships via identifying phage genomes that match CRISPR spacers in genomic or metagenomic data. SpacePHARER gains sensitivity by comparing spacers and phages at the protein level, optimizing its scores for matching very short sequences, and combining evidence from multiple matches, while controlling for false positives. We demonstrate SpacePHARER by searching a comprehensive spacer list against all complete phage genomes. AVAILABILITY AND IMPLEMENTATION: SpacePHARER is available as an open-source (GPLv3), user-friendly command-line software for Linux and macOS: https://github.com/soedinglab/spacepharer. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

15.
Nucleic Acids Res ; 49(D1): D298-D308, 2021 01 08.
Artigo em Inglês | MEDLINE | ID: mdl-33119734

RESUMO

We present DescribePROT, the database of predicted amino acid-level descriptors of structure and function of proteins. DescribePROT delivers a comprehensive collection of 13 complementary descriptors predicted using 10 popular and accurate algorithms for 83 complete proteomes that cover key model organisms. The current version includes 7.8 billion predictions for close to 600 million amino acids in 1.4 million proteins. The descriptors encompass sequence conservation, position specific scoring matrix, secondary structure, solvent accessibility, intrinsic disorder, disordered linkers, signal peptides, MoRFs and interactions with proteins, DNA and RNAs. Users can search DescribePROT by the amino acid sequence and the UniProt accession number and entry name. The pre-computed results are made available instantaneously. The predictions can be accesses via an interactive graphical interface that allows simultaneous analysis of multiple descriptors and can be also downloaded in structured formats at the protein, proteome and whole database scale. The putative annotations included by DescriPROT are useful for a broad range of studies, including: investigations of protein function, applied projects focusing on therapeutics and diseases, and in the development of predictors for other protein sequence descriptors. Future releases will expand the coverage of DescribePROT. DescribePROT can be accessed at http://biomine.cs.vcu.edu/servers/DESCRIBEPROT/.


Assuntos
Aminoácidos/química , Bases de Dados de Proteínas , Genoma , Proteínas/genética , Proteoma/genética , Software , Sequência de Aminoácidos , Aminoácidos/metabolismo , Animais , Archaea/genética , Archaea/metabolismo , Bactérias/genética , Bactérias/metabolismo , Sítios de Ligação , Sequência Conservada , Fungos/genética , Fungos/metabolismo , Humanos , Internet , Plantas/genética , Plantas/metabolismo , Células Procarióticas/metabolismo , Ligação Proteica , Estrutura Secundária de Proteína , Proteínas/química , Proteínas/classificação , Proteínas/metabolismo , Proteoma/química , Proteoma/metabolismo , Análise de Sequência de Proteína , Vírus/genética , Vírus/metabolismo
16.
Curr Protoc Bioinformatics ; 72(1): e108, 2020 12.
Artigo em Inglês | MEDLINE | ID: mdl-33315308

RESUMO

The MPI Bioinformatics Toolkit (https://toolkit.tuebingen.mpg.de) provides interactive access to a wide range of the best-performing bioinformatics tools and databases, including the state-of-the-art protein sequence comparison methods HHblits and HHpred. The Toolkit currently includes 35 external and in-house tools, covering functionalities such as sequence similarity searching, prediction of sequence features, and sequence classification. Due to this breadth of functionality, the tight interconnection of its constituent tools, and its ease of use, the Toolkit has become an important resource for biomedical research and for teaching protein sequence analysis to students in the life sciences. In this article, we provide detailed information on utilizing the three most widely accessed tools within the Toolkit: HHpred for the detection of homologs, HHpred in conjunction with MODELLER for structure prediction and homology modeling, and CLANS for the visualization of relationships in large sequence datasets. © 2020 The Authors. Basic Protocol 1: Sequence similarity searching using HHpred Alternate Protocol: Pairwise sequence comparison using HHpred Support Protocol: Building a custom multiple sequence alignment using PSI-BLAST and forwarding it as input to HHpred Basic Protocol 2: Calculation of homology models using HHpred and MODELLER Basic Protocol 3: Cluster analysis using CLANS.


Assuntos
Biologia Computacional , Análise de Sequência de Proteína , Software , Conformação Proteica , Alinhamento de Sequência , Análise de Sequência de Proteína/métodos , Interface Usuário-Computador
17.
Microbiome ; 8(1): 48, 2020 04 03.
Artigo em Inglês | MEDLINE | ID: mdl-32245390

RESUMO

BACKGROUND: Metagenomics is revolutionizing the study of microorganisms and their involvement in biological, biomedical, and geochemical processes, allowing us to investigate by direct sequencing a tremendous diversity of organisms without the need for prior cultivation. Unicellular eukaryotes play essential roles in most microbial communities as chief predators, decomposers, phototrophs, bacterial hosts, symbionts, and parasites to plants and animals. Investigating their roles is therefore of great interest to ecology, biotechnology, human health, and evolution. However, the generally lower sequencing coverage, their more complex gene and genome architectures, and a lack of eukaryote-specific experimental and computational procedures have kept them on the sidelines of metagenomics. RESULTS: MetaEuk is a toolkit for high-throughput, reference-based discovery, and annotation of protein-coding genes in eukaryotic metagenomic contigs. It performs fast searches with 6-frame-translated fragments covering all possible exons and optimally combines matches into multi-exon proteins. We used a benchmark of seven diverse, annotated genomes to show that MetaEuk is highly sensitive even under conditions of low sequence similarity to the reference database. To demonstrate MetaEuk's power to discover novel eukaryotic proteins in large-scale metagenomic data, we assembled contigs from 912 samples of the Tara Oceans project. MetaEuk predicted >12,000,000 protein-coding genes in 8 days on ten 16-core servers. Most of the discovered proteins are highly diverged from known proteins and originate from very sparsely sampled eukaryotic supergroups. CONCLUSION: The open-source (GPLv3) MetaEuk software (https://github.com/soedinglab/metaeuk) enables large-scale eukaryotic metagenomics through reference-based, sensitive taxonomic and functional annotation. Video abstract.


Assuntos
Algoritmos , Eucariotos/genética , Metagenômica/métodos , Microbiota , Anotação de Sequência Molecular/métodos , Biologia Computacional/métodos , Bases de Dados Genéticas , Ensaios de Triagem em Larga Escala , Metagenoma , Metagenômica/instrumentação , Análise de Sequência de DNA/métodos
18.
BMC Bioinformatics ; 20(1): 473, 2019 Sep 14.
Artigo em Inglês | MEDLINE | ID: mdl-31521110

RESUMO

BACKGROUND: HH-suite is a widely used open source software suite for sensitive sequence similarity searches and protein fold recognition. It is based on pairwise alignment of profile Hidden Markov models (HMMs), which represent multiple sequence alignments of homologous proteins. RESULTS: We developed a single-instruction multiple-data (SIMD) vectorized implementation of the Viterbi algorithm for profile HMM alignment and introduced various other speed-ups. These accelerated the search methods HHsearch by a factor 4 and HHblits by a factor 2 over the previous version 2.0.16. HHblits3 is ∼10× faster than PSI-BLAST and ∼20× faster than HMMER3. Jobs to perform HHsearch and HHblits searches with many query profile HMMs can be parallelized over cores and over cluster servers using OpenMP and message passing interface (MPI). The free, open-source, GPLv3-licensed software is available at https://github.com/soedinglab/hh-suite . CONCLUSION: The added functionalities and increased speed of HHsearch and HHblits should facilitate their use in large-scale protein structure and function prediction, e.g. in metagenomics and genomics projects.


Assuntos
Anotação de Sequência Molecular/métodos , Alinhamento de Sequência/métodos , Análise de Sequência de Proteína/métodos , Software , Algoritmos , Cadeias de Markov
19.
Nat Methods ; 16(7): 603-606, 2019 07.
Artigo em Inglês | MEDLINE | ID: mdl-31235882

RESUMO

The open-source de novo protein-level assembler, Plass ( https://plass.mmseqs.com ), assembles six-frame-translated sequencing reads into protein sequences. It recovers 2-10 times more protein sequences from complex metagenomes and can assemble huge datasets. We assembled two redundancy-filtered reference protein catalogs, 2 billion sequences from 640 soil samples (soil reference protein catalog) and 292 million sequences from 775 marine eukaryotic metatranscriptomes (marine eukaryotic reference catalog), the largest free collections of protein sequences.


Assuntos
Metagenômica , Proteínas/química , Sequência de Aminoácidos , Códon , Fases de Leitura Aberta
20.
Bioinformatics ; 35(16): 2856-2858, 2019 08 15.
Artigo em Inglês | MEDLINE | ID: mdl-30615063

RESUMO

SUMMARY: The MMseqs2 desktop and web server app facilitates interactive sequence searches through custom protein sequence and profile databases on personal workstations. By eliminating MMseqs2's runtime overhead, we reduced response times to a few seconds at sensitivities close to BLAST. AVAILABILITY AND IMPLEMENTATION: The app is easy to install for non-experts. GPLv3-licensed code, pre-built desktop app packages for Windows, MacOS and Linux, Docker images for the web server application and a demo web server are available at https://search.mmseqs.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Assuntos
Computadores , Software , Sequência de Aminoácidos , Bases de Dados Factuais
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA