1.
Nucleic Acids Res ; 49(W1): W624-W632, 2021 07 02.
Article in English | MEDLINE | ID: mdl-33978761

ABSTRACT

Dockstore (https://dockstore.org/) is an open-source platform for publishing, sharing, and finding bioinformatics tools and workflows. The platform has facilitated large-scale biomedical research collaborations by using cloud technologies to increase the Findability, Accessibility, Interoperability and Reusability (FAIR) of computational resources, thereby promoting the reproducibility of complex bioinformatics analyses. Dockstore supports a variety of source repositories, analysis frameworks, and language technologies to provide a seamless publishing platform for authors to create a centralized catalogue of scientific software. The ready-to-use packaging of hundreds of tools and workflows, combined with the implementation of interoperability standards, enables users to launch analyses across multiple environments. Dockstore is widely used: more than twenty-five high-profile organizations share analysis collections through the platform in a variety of workflow languages, including the Broad Institute's GATK best practice and COVID-19 workflows (WDL), nf-core workflows (Nextflow), the Intergalactic Workflow Commission tools (Galaxy), and workflows from Seven Bridges (CWL), to highlight just a few. Here we describe the improvements made over the last four years, including the expansion of system integrations supporting authors, the addition of collaboration features and analysis platform integrations supporting users, and other enhancements that improve the overall scientific reproducibility of Dockstore content.


Subjects
Computational Biology/methods , Information Dissemination , Internet , Software , Workflow , Cloud Computing , Computational Biology/education , Data Visualization , Humans , National Heart, Lung, and Blood Institute (U.S.) , National Human Genome Research Institute (U.S.) , Reproducibility of Results , United States
2.
F1000Res ; 6: 52, 2017.
Article in English | MEDLINE | ID: mdl-28344774

ABSTRACT

As genomic datasets continue to grow, the feasibility of downloading data to a local organization and running analysis on a traditional compute environment is becoming increasingly problematic. Current large-scale projects, such as the ICGC PanCancer Analysis of Whole Genomes (PCAWG), the Data Platform for the U.S. Precision Medicine Initiative, and the NIH Big Data to Knowledge Center for Translational Genomics, are using cloud-based infrastructure to both host and perform analysis across large data sets. In PCAWG, over 5,800 whole human genomes were aligned and variant called across 14 cloud and HPC environments; the processed data was then made available on the cloud for further analysis and sharing. If run locally, an operation at this scale would have monopolized a typical academic data centre for many months, and would have presented major challenges for data storage and distribution. However, this scale is increasingly typical for genomics projects and necessitates a rethink of how analytical tools are packaged and moved to the data. For PCAWG, we embraced the use of highly portable Docker images for encapsulating and sharing complex alignment and variant calling workflows across highly variable environments. While successful, this endeavor revealed a limitation in Docker containers, namely the lack of a standardized way to describe and execute the tools encapsulated inside the container. As a result, we created the Dockstore (https://dockstore.org), a project that brings together Docker images with standardized, machine-readable ways of describing and running the tools contained within. This service greatly improves the sharing and reuse of genomics tools and promotes interoperability with similar projects through emerging web service standards developed by the Global Alliance for Genomics and Health (GA4GH).

3.
Mol Cell Proteomics ; 11(4): M111.010587, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22186715

ABSTRACT

Many software tools have been developed for the automated identification of peptides from tandem mass spectra. The accuracy and sensitivity of the identification software via database search are critical for successful proteomics experiments. A new database search tool, PEAKS DB, has been developed by incorporating the de novo sequencing results into the database search. PEAKS DB achieves significantly improved accuracy and sensitivity over two other commonly used software packages. Additionally, a new result validation method, decoy fusion, has been introduced to solve the issue of overconfidence that exists in the conventional target decoy method for certain types of peptide identification software.


Subjects
Databases, Protein , Peptides/analysis , Peptides/chemistry , Protein Processing, Post-Translational , Sequence Analysis, Protein , Software , Tandem Mass Spectrometry
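
For context on the "conventional target decoy method" named in the abstract above, the following is a minimal sketch of how a false discovery rate is commonly estimated from a search against target and decoy sequences. It assumes each peptide-spectrum match carries a score and a decoy flag, and that the estimate is the ratio of decoy hits to target hits at or above a score threshold. The function name, the data layout, and the example scores are hypothetical; this is not the decoy fusion method of PEAKS DB, whose details the abstract does not give.

```python
from typing import List, Tuple

# Each match is (score, is_decoy). This layout is an assumption for illustration.
Match = Tuple[float, bool]


def target_decoy_fdr(matches: List[Match], threshold: float) -> float:
    """Estimate FDR as (#decoy hits) / (#target hits) at or above a threshold.

    This is the simple ratio often used with the target-decoy approach,
    assuming target and decoy databases of equal size.
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    if targets == 0:
        # No accepted target hits: nothing is reported, so return 0.0.
        return 0.0
    return decoys / targets


# Hypothetical scores, not taken from any real experiment.
example_matches: List[Match] = [
    (9.1, False), (8.7, False), (8.2, True),
    (7.9, False), (7.5, False), (6.0, True),
]

if __name__ == "__main__":
    print(target_decoy_fdr(example_matches, 7.0))
```

Raising the threshold trades sensitivity for a lower estimated error rate; the abstract's point is that this estimate can be overconfident for some identification software.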
4.
Bioinformatics ; 25(17): 2174-80, 2009 Sep 01.
Article in English | MEDLINE | ID: mdl-19535534

ABSTRACT

MOTIVATION: Bottom-up tandem mass spectrometry (MS/MS) is now routinely used in proteomics to identify proteins from a sequence database. De novo sequencing software is also available for sequencing novel peptides with relatively short sequence lengths. However, automated sequencing of novel proteins from MS/MS remains a challenging problem. RESULTS: Very often, although the target protein is novel, it has a homologous protein included in a known database. For this case, we propose a novel algorithm and automated software tool, named Champs, for sequencing the complete protein from MS/MS data of a few enzymatic digestions of the purified protein. Validation with two standard proteins showed that our automated method yields >99% sequence coverage and 100% sequence accuracy on these two proteins. Our method is useful for sequencing novel proteins or for 're-sequencing' a protein that has mutations compared with the database protein sequence.


Subjects
Automation/methods , Databases, Protein , Mass Spectrometry/methods , Sequence Analysis, Protein/methods , Sequence Homology, Amino Acid , Amino Acid Sequence , Animals , Cattle , Chickens , Molecular Sequence Data , Muramidase/chemistry , Sequence Alignment , Serum Albumin, Bovine/chemistry