MAC: identifying and correcting annotation for multi-nucleotide variations.
Wei, Lei; Liu, Lu T; Conroy, Jacob R; Hu, Qiang; Conroy, Jeffrey M; Morrison, Carl D; Johnson, Candace S; Wang, Jianmin; Liu, Song.
Affiliation
  • Wei L; Department of Biostatistics & Bioinformatics, Roswell Park Cancer Institute, Buffalo, NY, USA. Lei.Wei@RoswellPark.org.
  • Liu LT; Center for Personalized Medicine, Roswell Park Cancer Institute, Buffalo, NY, USA. Lu.Liu@RoswellPark.org.
  • Conroy JR; Center for Personalized Medicine, Roswell Park Cancer Institute, Buffalo, NY, USA. jrc63@duke.edu.
  • Hu Q; Department of Biostatistics & Bioinformatics, Roswell Park Cancer Institute, Buffalo, NY, USA. Qiang.Hu@RoswellPark.org.
  • Conroy JM; Center for Personalized Medicine, Roswell Park Cancer Institute, Buffalo, NY, USA. Jeffrey.Conroy@RoswellPark.org.
  • Morrison CD; Center for Personalized Medicine, Roswell Park Cancer Institute, Buffalo, NY, USA. Carl.Morrison@RoswellPark.org.
  • Johnson CS; Department of Pharmacology and Therapeutics, Roswell Park Cancer Institute, Buffalo, NY, USA. Candace.Johnson@RoswellPark.org.
  • Wang J; Department of Biostatistics & Bioinformatics, Roswell Park Cancer Institute, Buffalo, NY, USA. Jianmin.Wang@RoswellPark.org.
  • Liu S; Department of Biostatistics & Bioinformatics, Roswell Park Cancer Institute, Buffalo, NY, USA. Song.Liu@RoswellPark.org.
BMC Genomics ; 16: 569, 2015 Aug 01.
Article in En | MEDLINE | ID: mdl-26231518
BACKGROUND: Next-Generation Sequencing (NGS) technologies have rapidly advanced our understanding of human variation in cancer. To accurately translate raw sequencing data into practical knowledge, annotation tools, algorithms and pipelines must keep pace with the rapidly evolving technology. One current challenge is the accurate annotation of multi-nucleotide variants (MNVs). When these tandem substitutions affect multiple nucleotides within a single protein codon of a gene, the translated amino acid depends on all nucleotides in that codon. Most existing variant callers report an MNV as individual single-nucleotide variants (SNVs), often resulting in multiple triplet codon sequences and incorrect amino acid predictions. To correct potentially mis-annotated MNVs among reported SNVs, the primary challenge is haplotype phasing, that is, determining whether the neighboring SNVs are located on the same chromosome.

RESULTS: Here we describe MAC (Multi-Nucleotide Variant Annotation Corrector), an integrative pipeline developed to correct potentially mis-annotated MNVs. MAC was designed as an application that requires only an SNV file and the matching BAM file as data inputs. Using an example data set containing 3024 SNVs and the corresponding whole-genome sequencing BAM files, we show that MAC identified eight potentially mis-annotated SNVs and accurately updated the amino acid predictions for seven of the variant calls.

CONCLUSIONS: MAC can identify and correct amino acid predictions that result from MNVs affecting multiple nucleotides within a single protein codon, a case that most existing SNV-based variant pipelines cannot handle. The MAC software is freely available and represents a useful tool for the accurate translation of genomic sequence to protein function.
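To make the annotation problem concrete, the following is a minimal Python sketch of why two SNVs in one codon must be annotated together. It is not MAC's implementation: the function names, the toy read representation (a dict mapping genomic position to observed base), the example coordinates and the `MIN_SUPPORTING_READS` threshold are all hypothetical, chosen only for illustration. A real pipeline would extract the per-read bases from the BAM file.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical threshold: reads required to call two SNVs "in phase".
MIN_SUPPORTING_READS = 2


def translate(codon):
    """Return the one-letter amino acid for a DNA codon."""
    return CODON_TABLE[codon]


def apply_snvs(codon, codon_start, snvs):
    """Apply (genomic_position, alt_base) substitutions to a codon."""
    bases = list(codon)
    for pos, alt in snvs:
        bases[pos - codon_start] = alt
    return "".join(bases)


def count_phased_reads(reads, snv_a, snv_b):
    """Count reads that carry the alternate allele at both sites."""
    (pos_a, alt_a), (pos_b, alt_b) = snv_a, snv_b
    return sum(
        1 for read in reads
        if read.get(pos_a) == alt_a and read.get(pos_b) == alt_b
    )


# Toy example: reference codon GGA starts at (made-up) position 100.
codon_start, ref_codon = 100, "GGA"
snvs = [(100, "A"), (101, "A")]  # G>A at both the first and second base

# Toy reads: each dict maps a covered position to the base observed there.
reads = [
    {100: "A", 101: "A"},  # both alternate alleles on one read
    {100: "A", 101: "A"},
    {100: "G", 101: "G"},  # reference haplotype
    {100: "A"},            # covers only the first site
]

# SNV-by-SNV annotation: each variant is translated in isolation.
separate = [translate(apply_snvs(ref_codon, codon_start, [s])) for s in snvs]

# MNV-aware annotation: merge the SNVs only if reads show them in phase.
phased = count_phased_reads(reads, snvs[0], snvs[1])
in_phase = phased >= MIN_SUPPORTING_READS
combined = translate(apply_snvs(ref_codon, codon_start, snvs)) if in_phase else None
```

In this toy case the reference codon GGA encodes glycine. Annotated separately, the two SNVs give AGA (arginine) and GAA (glutamate), whereas the phased pair gives AAA (lysine), an amino acid that neither single-SNV annotation predicts.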
Subjects

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Genome, Human / Polymorphism, Single Nucleotide / Molecular Sequence Annotation / Neoplasms Limits: Humans Language: En Publication year: 2015 Document type: Article