Results 1 - 11 of 11
1.
BMC Bioinformatics ; 20(1): 561, 2019 Nov 08.
Article in English | MEDLINE | ID: mdl-31703549

ABSTRACT

BACKGROUND: The MG-RAST API provides search capabilities and delivers organism and function data as well as raw or annotated sequence data via the web interface and its RESTful API. For casual users, however, RESTful APIs are hard to learn and work with. RESULTS: We created the graphical MG-RAST API explorer to help researchers more easily build and export API queries; understand the data abstractions and indices available in MG-RAST; and use the results presented in-browser for exploration, development, and debugging. CONCLUSIONS: The API explorer lowers the barrier to entry for occasional or first-time MG-RAST API users.


Subjects
Search Engine , Software , User-Computer Interface , Archaea/genetics , Base Sequence , Genetic Databases , Internet
2.
Brief Bioinform ; 20(4): 1151-1159, 2019 07 19.
Article in English | MEDLINE | ID: mdl-29028869

ABSTRACT

As technologies change, MG-RAST is adapting. Newly available software is being included to improve accuracy and performance. As a computational service constantly running large-volume scientific workflows, MG-RAST is the right location to perform benchmarking and implement algorithmic or platform improvements, in many cases involving trade-offs between specificity, sensitivity and run-time cost. The work in [Glass EM, Dribinsky Y, Yilmaz P, et al. ISME J 2014;8:1-3] is an example; we use existing well-studied data sets as gold standards representing different environments and different technologies to evaluate any changes to the pipeline. Currently, we use well-understood data sets in MG-RAST as a platform for benchmarking. The use of artificial data sets for pipeline performance optimization has not added value, as these data sets do not present the same challenges as real-world data sets. In addition, the MG-RAST team welcomes suggestions for improvements to the workflow. We are currently working on versions 4.02 and 4.1, both of which contain significant input from the community and our partners that will enable double barcoding and stronger inferences supported by longer-read technologies, and will increase throughput while maintaining sensitivity by using Diamond and SortMeRNA. On the technical platform side, the MG-RAST team intends to support the Common Workflow Language as a standard to specify bioinformatics workflows, to facilitate both the development and the efficient high-performance implementation of the community's data analysis tasks.


Subjects
High-Throughput Nucleotide Sequencing/methods , Metagenome , Metagenomics/methods , Software , Algorithms , Budgets , Computational Biology/methods , High-Throughput Nucleotide Sequencing/economics , High-Throughput Nucleotide Sequencing/statistics & numerical data , Internet , Metagenomics/economics , Metagenomics/statistics & numerical data , DNA Sequence Analysis/economics , DNA Sequence Analysis/methods , DNA Sequence Analysis/statistics & numerical data , User-Computer Interface , Workflow
4.
Nucleic Acids Res ; 44(D1): D590-4, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26656948

ABSTRACT

MG-RAST (http://metagenomics.anl.gov) is an open-submission data portal for processing, analyzing, sharing and disseminating metagenomic datasets. The system currently hosts over 200,000 datasets and is continuously updated. The volume of submissions has increased 4-fold over the past 24 months, now averaging 4 terabasepairs per month. In addition to several new features, we report changes to the analysis workflow and the technologies used to scale the pipeline up to the required throughput levels. To show possible uses for the data from MG-RAST, we present several examples integrating data and analyses from MG-RAST into popular third-party analysis tools or sequence alignment tools.


Subjects
Nucleic Acid Databases , Metagenomics , Internet , Sequence Alignment
5.
PLoS Comput Biol ; 11(1): e1004008, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25569221

ABSTRACT

Metagenomic sequencing has produced significant amounts of data in recent years. For example, as of summer 2013, MG-RAST has been used to annotate over 110,000 data sets totaling over 43 Terabases. With metagenomic sequencing finding even wider adoption in the scientific community, the existing web-based analysis tools and infrastructure in MG-RAST provide limited capability for data retrieval and analysis, such as comparative analysis between multiple data sets. Moreover, although the system provides many analysis tools, it is not comprehensive. By opening MG-RAST up via a web services API (application programmers interface) we have greatly expanded access to MG-RAST data, as well as provided a mechanism for the use of third-party analysis tools with MG-RAST data. This RESTful API makes all data and data objects created by the MG-RAST pipeline accessible as JSON objects. As part of the DOE Systems Biology Knowledgebase project (KBase, http://kbase.us) we have implemented a web services API for MG-RAST. This API complements the existing MG-RAST web interface and constitutes the basis of KBase's microbial community capabilities. In addition, the API exposes a comprehensive collection of data to programmers. This API, which uses a RESTful (Representational State Transfer) implementation, is compatible with most programming environments and should be easy to use for end users and third parties. It provides comprehensive access to sequence data, quality control results, annotations, and many other data types. Where feasible, we have used standards to expose data and metadata. Code examples are provided in a number of languages both to show the versatility of the API and to provide a starting point for users. We present an API that exposes the data in MG-RAST for consumption by our users, greatly enhancing the utility of the MG-RAST service.
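
The request-and-parse pattern this abstract describes (a RESTful call that returns JSON objects) could be sketched roughly as follows. This is a minimal illustration, not the API's documented usage: the base URL `https://api.mg-rast.org`, the `metagenome` resource name, the `limit` and `verbosity` parameters, the `{"data": [...]}` response envelope, and the sample IDs are all assumptions made for the example and should be checked against the MG-RAST API documentation. No network call is made; the JSON is parsed from a hand-written sample string.

```python
import json
from urllib.parse import urlencode

# Assumed base URL; not stated in the abstract above.
API_BASE = "https://api.mg-rast.org"


def build_query(resource, **params):
    """Return a GET URL for `resource` with its query parameters sorted by name."""
    query = urlencode(sorted(params.items()))
    return f"{API_BASE}/{resource}" + (f"?{query}" if query else "")


def extract_ids(payload):
    """Collect the `id` field of each item in a JSON list response.

    The {"data": [...]} envelope is an assumed response shape.
    """
    doc = json.loads(payload)
    return [item["id"] for item in doc.get("data", [])]


# Hypothetical resource and parameters, for illustration only.
url = build_query("metagenome", limit=5, verbosity="minimal")

# Hand-written sample response with placeholder IDs (not real records).
sample = '{"data": [{"id": "mgm0000001.3"}, {"id": "mgm0000002.3"}]}'

print(url)
print(extract_ids(sample))
```

Building the URL separately from sending it keeps the query easy to inspect or export before any request is issued, which is the kind of workflow a graphical API explorer supports.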


Subjects
Database Management Systems , Genetic Databases , Bacterial Genome/genetics , Metagenomics/methods , User-Computer Interface , Internet , Molecular Sequence Annotation/methods , Software
6.
PLoS One ; 9(3): e92297, 2014.
Article in English | MEDLINE | ID: mdl-24642836

ABSTRACT

Mycoplasma salivarium belongs to the class of the smallest self-replicating Tenericutes and is predominantly found in the oral cavity of humans. In general it is considered as a non-pathogenic commensal. However, some reports point to an association with human diseases. M. salivarium was found e.g. as causative agent of a submasseteric abscess, in necrotic dental pulp, in brain abscess and clogged biliary stent. Here we describe the detection of M. salivarium on the surface of a squamous cell carcinoma of the tongue of a patient with Fanconi anaemia (FA). FA is an inherited bone marrow failure syndrome based on defective DNA-repair that increases the risk of carcinomas especially oral squamous cell carcinoma. Employing high coverage, massive parallel Roche/454-next-generation-sequencing of 16S rRNA gene amplicons we analysed the oral microbiome of this FA patient in comparison to that of an FA patient with a benign leukoplakia and five healthy individuals. The microbiota of the FA patient with leukoplakia correlated well with that of the healthy controls. A dominance of Streptococcus, Veillonella and Neisseria species was typically observed. In contrast, the microbiome of the cancer bearing FA patient was dominated by Pseudomonas aeruginosa at the healthy sites, which changed to a predominance of 98% M. salivarium on the tumour surface. Quantification of the mycoplasma load in five healthy, two tumour- and two leukoplakia-FA patients by TaqMan-PCR confirmed the prevalence of M. salivarium at the tumour sites. These new findings suggest that this mycoplasma species with its reduced coding capacity found ideal breeding grounds at the tumour sites. Interestingly, the oral cavity of all FA patients and especially samples at the tumour sites were in addition positive for Candida albicans. It remains to be elucidated in further studies whether M. salivarium can be used as a predictive biomarker for tumour development in these patients.


Subjects
Squamous Cell Carcinoma/microbiology , Fanconi Anemia/complications , Mycoplasma Infections/microbiology , Mycoplasma salivarium/genetics , Tongue Neoplasms/microbiology , Adult , Case-Control Studies , Bacterial Genes , Humans , Male , Microbiota/genetics , Molecular Typing , Mouth/microbiology , 16S Ribosomal RNA/genetics , Retrospective Studies
7.
Methods Enzymol ; 531: 487-523, 2013.
Article in English | MEDLINE | ID: mdl-24060134

ABSTRACT

The democratized world of sequencing is leading to numerous data analysis challenges; MG-RAST addresses many of these challenges for diverse datasets, including amplicon datasets, shotgun metagenomes, and metatranscriptomes. The changes from version 2 to version 3 include the addition of a dedicated gene calling stage using FragGenescan, clustering of predicted proteins at 90% identity, and the use of BLAT for the computation of similarities. Together with changes in the underlying software infrastructure, this has enabled the dramatic scaling up of pipeline throughput while remaining on a limited hardware budget. The Web-based service allows upload, fully automated analysis, and visualization of results. As a result of the plummeting cost of sequencing and the readily available analytical power of MG-RAST, over 78,000 metagenomic datasets have been analyzed, with over 12,000 of them publicly available in MG-RAST.


Subjects
Computational Biology/methods , Metagenomics , Software , Bacteria/classification , Bacteria/genetics , Bacterial Genome , High-Throughput Nucleotide Sequencing , Internet
8.
Nucleic Acids Res ; 39(14): e91, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21586583

ABSTRACT

The vast majority of microbes are unculturable and thus cannot be sequenced by means of traditional methods. High-throughput sequencing techniques like 454 or Solexa-Illumina make it possible to explore those microbes by studying whole natural microbial communities and analysing their biological diversity as well as the underlying metabolic pathways. Over the past few years, different methods have been developed for the taxonomic and functional characterization of metagenomic shotgun sequences. However, the taxonomic classification of metagenomic sequences from novel species without close homologue in the biological sequence databases poses a challenge due to the high number of wrong taxonomic predictions on lower taxonomic ranks. Here we present CARMA3, a new method for the taxonomic classification of assembled and unassembled metagenomic sequences that has been adapted to work with both BLAST and HMMER3 homology searches. We show that our method makes fewer wrong taxonomic predictions (at the same sensitivity) than other BLAST-based methods. CARMA3 is freely accessible via the web application WebCARMA from http://webcarma.cebitec.uni-bielefeld.de.


Subjects
Algorithms , Metagenomics/methods , Classification/methods , Protein Databases , Phylogeny , Sequence Alignment/methods
9.
BMC Bioinformatics ; 10: 430, 2009 Dec 18.
Article in English | MEDLINE | ID: mdl-20021646

ABSTRACT

BACKGROUND: Metagenomics is a new field of research on natural microbial communities. High-throughput sequencing techniques like 454 or Solexa-Illumina promise new possibilities as they are able to produce huge amounts of data in much shorter time and with less effort and cost than the traditional Sanger technique. But the data produced comes in even shorter reads (35-100 basepairs with Illumina, 100-500 basepairs with 454-sequencing). CARMA is a new software pipeline for the characterisation of species composition and the genetic potential of microbial samples using short, unassembled reads. RESULTS: In this paper, we introduce WebCARMA, a refined version of CARMA available as a web application for the taxonomic and functional classification of unassembled (ultra-)short reads from metagenomic communities. In addition, we have analysed the applicability of ultra-short reads in metagenomics. CONCLUSIONS: We show that unassembled reads as short as 35 bp can be used for the taxonomic classification of a metagenome. The web application is freely available at http://webcarma.cebitec.uni-bielefeld.de.


Subjects
Computational Biology/methods , Genome , Genomics/methods , Internet , Metagenomics/methods , Software
10.
Bioinformatics ; 23(5): 629-30, 2007 Mar 01.
Article in English | MEDLINE | ID: mdl-17237063

ABSTRACT

UNLABELLED: The suffix tree is one of the most fundamental data structures in string algorithms and biological sequence analysis. Unfortunately, when it comes to implementing those algorithms and applying them to real genomic sequences, often the main memory size becomes the bottleneck. This is easily explained by the fact that while a DNA sequence of length n from alphabet Σ = {A, C, G, T} can be stored in n log |Σ| = 2n bits, its suffix tree occupies O(n log n) bits. In practice, the size difference easily reaches a factor of 50. We provide an implementation of the compressed suffix tree very recently proposed by Sadakane (Theory of Computing Systems, in press). The compressed suffix tree occupies space proportional to the text size, i.e. O(n log |Σ|) bits, and supports all typical suffix tree operations with at most a log n factor slowdown. Our experiments show that, e.g. on a 10 MB DNA sequence, the compressed suffix tree takes 10% of the space of a normal suffix tree. Typical operations are slowed down by a factor of 60. AVAILABILITY: The C++ implementation under GNU license is available at http://www.cs.helsinki.fi/group/suds/cst/. An example program implementing a typical pattern discovery task is included. Experimental results in this note correspond to version 0.95.
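
The space argument in this abstract can be made concrete with a small sketch. It is only an illustration of the arithmetic, not the compressed suffix tree itself: the helper names (`pack`, `unpack`, `packed_bits`, `plain_pointer_bits`) are made up for the example, the logarithms are taken base 2, and the n log n term is modelled as a single array of n pointers of ceil(log2 n) bits each, which is an assumption made to keep the calculation simple.

```python
# 2-bit code for each base: log2(|{A, C, G, T}|) = 2 bits per symbol.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq):
    """Pack a DNA string into one integer, 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def unpack(value, n):
    """Recover the n-base string from its packed integer."""
    out = []
    for _ in range(n):
        out.append(BASES[value & 3])
        value >>= 2
    return "".join(reversed(out))


def packed_bits(n):
    """Bits needed for the text itself: n * log2(4) = 2n."""
    return 2 * n


def plain_pointer_bits(n):
    """Bits for n pointers of ceil(log2 n) bits each (the n log n term)."""
    return n * max(1, (n - 1).bit_length())


n = 10_000_000  # illustrative sequence length
print(packed_bits(n), plain_pointer_bits(n))
```

Even this single pointer array is an order of magnitude larger than the packed text at this length. A pointer-based suffix tree stores several such fields per node, which is consistent with the larger factor the abstract reports in practice.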


Subjects
Algorithms , Genomics/methods , Computational Biology , DNA/chemistry , Programming Languages , Software
11.
Bioinformatics ; 22(6): 762-4, 2006 Mar 15.
Article in English | MEDLINE | ID: mdl-16403789

ABSTRACT

MOTIVATION: RNA secondary structure analysis often requires searching for potential helices in large sequence data. RESULTS: We present a utility program GUUGle that efficiently locates potential helical regions under RNA base pairing rules, which include Watson-Crick as well as G-U pairs. It accepts a positive and a negative set of sequences, and determines all exact matches under RNA rules between positive and negative sequences that exceed a specified length. The GUUGle algorithm can also be adapted to use a precomputed suffix array of the positive sequence set. We show how this program can be effectively used as a filter preceding a more computationally expensive task such as miRNA target prediction. AVAILABILITY: GUUGle is available via the Bielefeld Bioinformatics Server at http://bibiserv.techfak.uni-bielefeld.de/guugle
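
The matching rule described here (Watson-Crick pairs plus G-U pairs, reported only above a length threshold) can be illustrated with a naive sketch. This is not the GUUGle algorithm: it is a plain quadratic scan written for clarity, whereas the abstract mentions a suffix-array variant. The function name `helices`, the antiparallel reading of a "match", and the toy sequences are assumptions made for the example.

```python
# Allowed pairs under RNA rules: Watson-Crick plus the G-U wobble pair.
PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def helices(x, y, min_len):
    """Return (i, j, length) for maximal antiparallel pairing runs.

    A run pairs x[i + t] with y[j - t] for t = 0 .. length - 1.
    Only runs of at least min_len pairs are reported.
    """
    hits = []
    n, m = len(x), len(y)
    for i in range(n):
        for j in range(m):
            if (x[i], y[j]) not in PAIRS:
                continue
            # Skip positions that continue a run started one step earlier.
            if i > 0 and j + 1 < m and (x[i - 1], y[j + 1]) in PAIRS:
                continue
            length = 0
            while (i + length < n and j - length >= 0
                   and (x[i + length], y[j - length]) in PAIRS):
                length += 1
            if length >= min_len:
                hits.append((i, j, length))
    return hits


# Toy sequences: the second G of x pairs with a U of y (a wobble pair).
print(helices("GGAC", "GUUC", 4))
```

Used as a filter, a scan like this would discard sequence pairs with no sufficiently long pairing run before a more expensive step is applied to the survivors.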


Subjects
Algorithms , Base Pairing/genetics , Dinucleoside Phosphates/genetics , RNA/genetics , Sequence Alignment/methods , RNA Sequence Analysis/methods , Software , Base Sequence , Molecular Sequence Data