Your browser doesn't support javascript.
loading
Repeat- and error-aware comparison of deletions.
Wittler, Roland; Marschall, Tobias; Schönhuth, Alexander; Mäkinen, Veli.
Afiliação
  • Wittler R; Genome Informatics, Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Germany, Center for Bioinformatics, Saarland University and Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbrücken, Germany, Centrum Wiskun
  • Marschall T; Genome Informatics, Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Germany, Center for Bioinformatics, Saarland University and Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbrücken, Germany, Centrum Wiskun
  • Schönhuth A; Genome Informatics, Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Germany, Center for Bioinformatics, Saarland University and Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbrücken, Germany, Centrum Wiskun
  • Mäkinen V; Genome Informatics, Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Germany, Center for Bioinformatics, Saarland University and Department of Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbrücken, Germany, Centrum Wiskun
Bioinformatics ; 31(18): 2947-54, 2015 Sep 15.
Article em En | MEDLINE | ID: mdl-25979471
ABSTRACT
MOTIVATION The number of reported genetic variants is rapidly growing, empowered by ever faster accumulation of next-generation sequencing data. A major issue is comparability. Standards that address the combined problem of inaccurately predicted breakpoints and repeat-induced ambiguities are missing. This decisively lowers the quality of 'consensus' callsets and hampers the removal of duplicate entries in variant databases, which can have deleterious effects in downstream analyses.

RESULTS:

We introduce a sound framework for comparison of deletions that captures both tool-induced inaccuracies and repeat-induced ambiguities. We present a maximum matching algorithm that outputs virtual duplicates among two sets of predictions/annotations. We demonstrate that our approach is clearly superior over ad hoc criteria, like overlap, and that it can reduce the redundancy among callsets substantially. We also identify large amounts of duplicate entries in the Database of Genomic Variants, which points out the immediate relevance of our approach. AVAILABILITY AND IMPLEMENTATION Implementation is open source and available from https//bitbucket.org/readdi/readdi CONTACT roland.wittler@uni-bielefeld.de or t.marschall@mpi-inf.mpg.de SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Assuntos

Texto completo: 1 Coleções: 01-internacional Base de dados: MEDLINE Assunto principal: Variação Genética / Algoritmos / Software / Sequências Repetitivas de Ácido Nucleico / Deleção de Sequência / Análise de Sequência de DNA / Biologia Computacional Tipo de estudo: Prognostic_studies Limite: Humans Idioma: En Revista: Bioinformatics Assunto da revista: INFORMATICA MEDICA Ano de publicação: 2015 Tipo de documento: Article

Texto completo: 1 Coleções: 01-internacional Base de dados: MEDLINE Assunto principal: Variação Genética / Algoritmos / Software / Sequências Repetitivas de Ácido Nucleico / Deleção de Sequência / Análise de Sequência de DNA / Biologia Computacional Tipo de estudo: Prognostic_studies Limite: Humans Idioma: En Revista: Bioinformatics Assunto da revista: INFORMATICA MEDICA Ano de publicação: 2015 Tipo de documento: Article