1.
Bioinformatics ; 35(1): 119-121, 2019 01 01.
Article in English | MEDLINE | ID: mdl-29931085

ABSTRACT

Summary: Standardized interfaces for efficiently accessing high-throughput sequencing data are a fundamental requirement for large-scale genomic data sharing. We have developed htsget, a protocol for secure, efficient and reliable access to sequencing read and variation data. We demonstrate four independent client and server implementations, and the results of a comprehensive interoperability demonstration. Availability and implementation: http://samtools.github.io/hts-specs/htsget.html. Supplementary information: Supplementary data are available at Bioinformatics online.
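To make the protocol summarized above more concrete, here is a minimal client-side sketch in Python. It assumes the general request and response shape described in the htsget specification linked in the abstract: a GET on a `/reads/<id>` endpoint with `format`, `referenceName`, `start` and `end` query parameters, answered by a JSON "ticket" whose `htsget.urls` entries are fetched in order and concatenated. The base URL, the read ID and the helper names (`build_reads_url`, `decode_data_uri`, `assemble_blocks`) are hypothetical and chosen only for illustration; this is not one of the four implementations the abstract mentions.

```python
import base64
from urllib.parse import urlencode


def build_reads_url(base_url, read_id, reference_name=None,
                    start=None, end=None, fmt="BAM"):
    """Build a ticket request URL for a genomic region (hypothetical helper)."""
    params = {"format": fmt}
    if reference_name is not None:
        params["referenceName"] = reference_name
    if start is not None:
        params["start"] = start
    if end is not None:
        params["end"] = end
    return f"{base_url.rstrip('/')}/reads/{read_id}?{urlencode(params)}"


def decode_data_uri(uri):
    """Decode a base64 'data:' URI, which a ticket may use for inline blocks."""
    header, payload = uri.split(",", 1)
    if not header.endswith(";base64"):
        raise ValueError("only base64-encoded data URIs are handled here")
    return base64.b64decode(payload)


def assemble_blocks(ticket, fetch=None):
    """Concatenate the data blocks listed in a ticket, in the order given.

    `fetch` is an assumed callable (url, headers) -> bytes for remote blocks;
    it is left to the caller so the sketch itself does no network I/O.
    """
    blocks = []
    for entry in ticket["htsget"]["urls"]:
        url = entry["url"]
        if url.startswith("data:"):
            blocks.append(decode_data_uri(url))
        elif fetch is not None:
            blocks.append(fetch(url, entry.get("headers", {})))
        else:
            raise ValueError("remote block found but no fetch callable given")
    return b"".join(blocks)
```

A caller would request the URL returned by `build_reads_url`, parse the JSON response and pass it to `assemble_blocks`; the concatenated bytes then form the requested file slice. Keeping the fetch step as a parameter leaves authentication and retries to the caller.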


Subjects
Computational Biology , Genomics , High-Throughput Nucleotide Sequencing , Software , Genome
2.
J Bioinform Comput Biol ; 10(2): 1241005, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22809341

ABSTRACT

The new generation of short-read sequencing technologies requires reliable measures of data quality. Such measures are especially important for variant calling. However, in the particular case of SNP calling, a great number of false-positive SNPs may be obtained. One needs to distinguish putative SNPs from sequencing or other errors. We found that not only the probability of sequencing errors (i.e. the quality value) is important to distinguish an FP-SNP but also the conditional probability of "correcting" this error (the "second best call" probability, conditional on that of the first call). Surprisingly, around 80% of mismatches can be "corrected" with this second call. Another way to reduce the rate of FP-SNPs is to retrieve DNA motifs that seem to be prone to sequencing errors, and to attach a corresponding conditional quality value to these motifs. We have developed several measures to distinguish between sequence errors and candidate SNPs, based on a base call's nucleotide context and its mismatch type. In addition, we suggested a simple method to correct the majority of mismatches, based on conditional probability of their "second" best intensity call. We attach a corresponding second call confidence (quality value) of being corrected to each mismatch.
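The two quantities the abstract contrasts can be illustrated with a short Python sketch. The first function uses the standard Phred relation between a quality value Q and the error probability, p = 10^(-Q/10). The second estimates, from a set of observed calls, how often the second-best call matches the reference when the first call does not. The input layout (tuples of reference base, first call, second call) and the function names are assumptions made for this example; the abstract does not describe the authors' actual procedure in this detail.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality value Q to an error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)


def second_call_correction_rate(observations):
    """Estimate P(second call == reference | first call != reference).

    `observations` is an assumed iterable of (reference, first_call,
    second_call) tuples, one per aligned base. Returns None when there
    are no mismatches to condition on.
    """
    mismatches = [(ref, second)
                  for ref, first, second in observations
                  if first != ref]
    if not mismatches:
        return None
    corrected = sum(1 for ref, second in mismatches if second == ref)
    return corrected / len(mismatches)
```

Under these assumptions, a rate near 0.8 on real data would correspond to the abstract's statement that around 80% of mismatches can be "corrected" with the second call.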


Subjects
Sequence Analysis, DNA/methods , Algorithms , Nucleotide Motifs , Polymorphism, Single Nucleotide , Research Design