ABSTRACT
Tuberculosis (TB) remains one of the leading causes of morbidity and mortality worldwide. Extrapulmonary tuberculosis (EPTB) constitutes around 15-20% of TB cases in immunocompetent individuals. Extrapulmonary sites affected by TB include the bones, lymph nodes, meninges, pleura, and genitourinary tract. Whole genome sequencing has emerged as a powerful tool to map genetic diversity among Mycobacterium tuberculosis (MTB) isolates and to identify the genomic signatures associated with drug resistance, pathogenesis, and disease transmission. Several pulmonary isolates of MTB have been sequenced over the years. However, whole genome sequences of MTB isolates from extrapulmonary sites remain scarce. Some studies suggest that genetic variations in MTB might contribute to disease presentation in extrapulmonary sites. This question can be addressed once whole genome sequence data from a large number of extrapulmonary isolates become available. In this study, we performed whole genome sequencing of five MTB clinical isolates derived from EPTB sites using a next-generation sequencing platform. We identified 1434 nonsynonymous single nucleotide variations (SNVs), 143 insertions, and 105 deletions. These include 279 SNVs not previously reported in publicly available datasets. We found several mutations known to confer drug resistance. All five isolates belonged to the East-African-Indian lineage (lineage 3). We identified 9 putative prophage DNA integrations and 14 predicted clustered regularly interspaced short palindromic repeats (CRISPR) in the MTB genome. Our analysis indicates that more work is needed to map the genetic diversity of MTB. Whole genome sequencing in conjunction with comprehensive drug susceptibility testing can reveal clinically relevant mutations associated with drug resistance.