Statistical analysis of variability in TnSeq data across conditions using zero-inflated negative binomial regression.
Subramaniyam, Siddharth; DeJesus, Michael A; Zaveri, Anisha; Smith, Clare M; Baker, Richard E; Ehrt, Sabine; Schnappinger, Dirk; Sassetti, Christopher M; Ioerger, Thomas R.
Affiliation
  • Subramaniyam S; Department of Computer Science & Engineering, Texas A&M University, College Station, TX, USA.
  • DeJesus MA; Rockefeller University, New York, NY, USA.
  • Zaveri A; Department of Microbiology & Immunology, Weill Cornell Medical College, New York, NY, USA.
  • Smith CM; Department of Microbiology & Physiological Systems, University of Massachusetts Medical School, Worcester, MA, USA.
  • Baker RE; Department of Microbiology & Physiological Systems, University of Massachusetts Medical School, Worcester, MA, USA.
  • Ehrt S; Department of Microbiology & Immunology, Weill Cornell Medical College, New York, NY, USA.
  • Schnappinger D; Department of Microbiology & Immunology, Weill Cornell Medical College, New York, NY, USA.
  • Sassetti CM; Department of Microbiology & Physiological Systems, University of Massachusetts Medical School, Worcester, MA, USA.
  • Ioerger TR; Department of Computer Science & Engineering, Texas A&M University, College Station, TX, USA. ioerger@cs.tamu.edu.
BMC Bioinformatics; 20(1): 603, 2019 Nov 21.
Article in English | MEDLINE | ID: mdl-31752678
BACKGROUND: Deep sequencing of transposon mutant libraries (or TnSeq) is a powerful method for probing essentiality of genomic loci under different environmental conditions. Various analytical methods have been described for identifying conditionally essential genes whose tolerance for insertions varies between two conditions. However, for large-scale experiments involving many conditions, a method is needed for identifying genes that exhibit significant variability in insertions across multiple conditions.

RESULTS: In this paper, we introduce a novel statistical method for identifying genes with significant variability of insertion counts across multiple conditions based on Zero-Inflated Negative Binomial (ZINB) regression. Using likelihood ratio tests, we show that the ZINB distribution fits TnSeq data better than either ANOVA or a Negative Binomial (in a generalized linear model). We use ZINB regression to identify genes required for infection of M. tuberculosis H37Rv in C57BL/6 mice. We also use ZINB to perform an analysis of genes conditionally essential in H37Rv cultures exposed to multiple antibiotics.

CONCLUSIONS: Our results show that not only does ZINB generally identify most of the genes found by pairwise resampling (and vastly outperform ANOVA), but it also identifies additional genes where variability is detectable only when the magnitudes of insertion counts are treated separately from local differences in saturation, as in the ZINB model.
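
The separation of saturation from count magnitude mentioned in the conclusions can be illustrated with a minimal sketch of a ZINB log-likelihood. This is not the authors' implementation: the function names, the mean/dispersion parameterization (`mu`, `r`), the zero-inflation weight `pi`, and the example counts are all assumptions chosen for illustration. Here `pi` plays the role of the fraction of sites with no insertions beyond what the count distribution predicts, while `mu` captures the typical magnitude of the counts.

```python
import math

def nb_logpmf(x, mu, r):
    """Log-probability of count x under a Negative Binomial with mean mu > 0
    and dispersion r > 0 (variance = mu + mu**2 / r). Assumed parameterization."""
    p = r / (r + mu)  # success probability in the standard NB form
    return (math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
            + r * math.log(p) + x * math.log1p(-p))

def zinb_logpmf(x, pi, mu, r):
    """Log-probability of count x under a zero-inflated NB.
    With probability pi the count is a structural zero; otherwise it is NB(mu, r)."""
    if x == 0:
        # A zero can come from either mixture component.
        return math.log(pi + (1.0 - pi) * math.exp(nb_logpmf(0, mu, r)))
    # A non-zero count can only come from the NB component.
    return math.log1p(-pi) + nb_logpmf(x, mu, r)

def zinb_loglik(counts, pi, mu, r):
    """Log-likelihood of a list of counts for fixed ZINB parameters."""
    return sum(zinb_logpmf(x, pi, mu, r) for x in counts)

# Hypothetical insertion counts at the sites of one gene in one condition.
example_counts = [0, 0, 0, 12, 45, 0, 88, 30, 0, 61]
example_ll = zinb_loglik(example_counts, pi=0.5, mu=50.0, r=2.0)
```

In a likelihood ratio test, log-likelihoods of this kind would be maximized once with parameters shared across conditions and once with condition-dependent parameters, and twice the difference would be compared against a chi-square distribution. The sketch evaluates the likelihood only at fixed parameters and does no fitting.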

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: DNA Transposable Elements / Models, Statistical / Databases, Genetic / High-Throughput Nucleotide Sequencing Type of study: Risk_factors_studies Limits: Animals Language: En Journal: BMC Bioinformatics Journal subject: MEDICAL INFORMATICS Year of publication: 2019 Document type: Article Country of affiliation: United States
