ABSTRACT
BACKGROUND: Low-abundance mutations in mitochondrial populations (mutations with minor allele frequency ≤ 1%) are associated with cancer, aging, and neurodegenerative disorders. While recent progress in high-throughput sequencing technology has significantly improved the heteroplasmy identification process, the ability of this technology to detect low-abundance mutations can be affected by the presence of similar sequences originating from nuclear DNA (nDNA). To determine to what extent nDNA can cause false positive low-abundance heteroplasmy calls, we have identified mitochondrial locations of all subsequences that are common or similar (one mismatch allowed) between nDNA and mitochondrial DNA (mtDNA). RESULTS: The analysis revealed up to a 25-fold variation in the lengths of the longest common and longest similar (one mismatch allowed) subsequences across the mitochondrial genome. The size of the longest subsequences shared between nDNA and mtDNA in several regions of the mitochondrial genome was found to be as low as 11 bases, which not only allows these regions to be used to design new, very specific PCR primers, but also supports the hypothesis of the non-random introduction of mtDNA into human nuclear DNA. CONCLUSION: Analysis of the mitochondrial locations of the subsequences shared between nDNA and mtDNA suggested that even very short (36 bases) single-end sequencing reads can be used to identify low-abundance variation in 20.4% of the mitochondrial genome. For longer (76 and 150 bases) reads, the proportion of the mitochondrial genome where nDNA presence will not interfere was found to be 44.5% and 67.9%, respectively, whereas low-abundance mutations at 100% of locations can be identified using single reads 417 bases long. This observation suggests that the analysis of low-abundance variation in mitochondrial populations can be extended to a variety of large data collections such as the NCBI Sequence Read Archive, European Nucleotide Archive, The Cancer Genome Atlas, and International Cancer Genome Consortium.
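The idea of a genome fraction where nDNA "will not interfere" for a given read length can be illustrated with a small sketch. This is not the authors' algorithm: it assumes exact matches only (the one-mismatch case is ignored), treats the mitochondrial genome as linear rather than circular, and calls a position interference-free when no read-length window overlapping it also occurs in the nuclear sequence. The function name and the toy sequences are hypothetical.

```python
def interference_free_fraction(mt: str, nuc: str, read_len: int) -> float:
    """Fraction of mt positions not covered by any read_len window
    that also occurs verbatim in nuc (exact-match simplification)."""
    # All windows of the nuclear sequence at this read length.
    nuc_windows = {nuc[i:i + read_len] for i in range(len(nuc) - read_len + 1)}

    flagged = [False] * len(mt)
    for start in range(len(mt) - read_len + 1):
        if mt[start:start + read_len] in nuc_windows:
            # Every position inside a shared window is ambiguous.
            for pos in range(start, start + read_len):
                flagged[pos] = True

    return flagged.count(False) / len(mt)


# Toy sequences, for illustration only.
mt_toy = "ACGTACGTTTGGCCAA"
nuc_toy = "GGGACGTACCC"
print(interference_free_fraction(mt_toy, nuc_toy, 5))
print(interference_free_fraction(mt_toy, nuc_toy, 8))
```

As in the abstract, the interference-free fraction grows with read length, because longer windows are less likely to be shared between the two sequences.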
Subject(s)
DNA Contamination , Human Genome , Mitochondrial Genome , Mitochondrial Genes , High-Throughput Nucleotide Sequencing/standards , Humans , Polymerase Chain Reaction/methods , Polymerase Chain Reaction/standards , Reproducibility of Results
ABSTRACT
BACKGROUND: Rickettsia species are obligate intracellular Gram-negative pathogenic bacteria and the etiologic agents of diseases such as Rocky Mountain spotted fever (RMSF), Mediterranean spotted fever, epidemic typhus, and murine typhus. Genome sequencing revealed that R. prowazekii has ~25% non-coding DNA, the majority of which is thought to be either "junk DNA" or pseudogenes resulting from genomic reduction. These characteristics also define other Rickettsia genomes. Bacterial small RNAs, whose biogenesis is predominantly attributed to either the intergenic regions (trans-acting) or to the antisense strand of an open reading frame (cis-acting), are now appreciated to be among the most important post-transcriptional regulators of bacterial virulence and growth. We hypothesize that intergenic regions in rickettsial species encode small, non-coding RNAs (sRNAs) involved in the regulation of their transcriptomes, leading to altered virulence and adaptation depending on the host niche. RESULTS: We employed a combination of bioinformatics and in vitro approaches to explore the presence of sRNAs in a number of species within the genus Rickettsia. Using the sRNA Identification Protocol using High-throughput Technology (SIPHT) web interface, we predicted over 1,700 small RNAs present in the intergenic regions of 16 different strains representing 13 rickettsial species. We further characterized novel sRNAs from typhus (R. prowazekii and R. typhi) and spotted fever (R. rickettsii and R. conorii) groups for their promoters and Rho-independent terminators using Bacterial Promoter Prediction Program (BPROM) and TransTermHP prediction algorithms, respectively. Strong σ70 promoters were predicted upstream of all novel small RNAs, indicating the potential for transcriptional activity. Next, we infected human microvascular endothelial cells (HMECs) with R. prowazekii for 3 h and 24 h and performed Next Generation Sequencing to experimentally validate the expression of 26 sRNA candidates predicted in R. prowazekii. Reverse transcriptase PCR was also used to further verify the expression of six putative novel sRNA candidates in R. prowazekii. CONCLUSIONS: Our results yield clear evidence for the expression of novel R. prowazekii sRNA candidates during infection of HMECs. This is the first description of novel small RNAs for a highly pathogenic species of Rickettsia, which should lead to new insights into rickettsial virulence and adaptation mechanisms.
Subject(s)
Bacterial Gene Expression Regulation , Bacterial RNA , Small Untranslated RNA , Rickettsia/genetics , Base Sequence , Chromosome Mapping , Computational Biology/methods , Consensus Sequence , Bacterial Genome , Genomics/methods , High-Throughput Nucleotide Sequencing , Molecular Sequence Data , Nucleotide Motifs , Position-Specific Scoring Matrices , Genetic Promoter Regions , RNA Interference , Reproducibility of Results
ABSTRACT
The organisms in aerosol microenvironments, especially densely populated urban areas, are relevant to maintenance of public health and detection of potential epidemic or biothreat agents. To examine aerosolized microorganisms in this environment, we performed sequencing on the material from an urban aerosol surveillance program. Whole metagenome sequencing was applied to DNA extracted from air filters obtained during periods from each of the four seasons. The composition of bacteria, plants, fungi, invertebrates, and viruses demonstrated distinct temporal shifts. Bacillus thuringiensis serovar kurstaki was detected in samples known to be exposed to aerosolized spores, illustrating the potential utility of this approach for identification of intentionally introduced microbial agents. Together, these data demonstrate the temporally dependent metagenomic complexity of urban aerosols and the potential of genomic analytical techniques for biosurveillance and monitoring of threats to public health.
Subject(s)
Air Microbiology , Bacterial DNA/isolation & purification , Metagenomics/methods , Bacillus thuringiensis/isolation & purification , Bacteria/classification , Bacteria/isolation & purification , Biomass , Cities , DNA Copy Number Variations , Bacterial DNA/genetics , District of Columbia , Environmental Monitoring , Fungi/classification , Fungi/isolation & purification , Metagenome , Seasons , Sequence Alignment , DNA Sequence Analysis
ABSTRACT
BACKGROUND: Perchlorate contamination has been detected in both ground water and drinking water. An attractive treatment option is the use of ion exchange to remove and concentrate perchlorate in brine. Biological treatment can subsequently remove the perchlorate from the brine. When nitrate is present, it will also be concentrated in the brine and must also be removed by biological treatment. The primary objective was to obtain an in-depth characterization of the microbial populations of two salt-tolerant cultures, each of which is capable of metabolizing perchlorate. The cultures were derived from a single ancestral culture and have been maintained in the laboratory for more than 10 years. One culture was fed perchlorate only, while the other was fed both perchlorate and nitrate. RESULTS: A metagenomic characterization was performed using Illumina DNA sequencing technology, and the 16S rDNA of several pure strains isolated from the mixed cultures was sequenced. In the absence of nitrate, members of the Rhodobacteraceae constituted the prevailing taxonomic group. Second in abundance were the Rhodocyclaceae. In the nitrate-fed culture, the Rhodobacteraceae are essentially absent. They are replaced by a major expansion of the Rhodocyclaceae and the emergence of the Alteromonadaceae as a significant community member. Gene sequences exhibiting significant homology to known perchlorate and nitrate reduction enzymes were found in both cultures. CONCLUSIONS: The structure of the two microbial ecosystems of interest has been established and some representative strains obtained in pure culture. The results illustrate that under favorable conditions a group of organisms can readily dominate an ecosystem and yet be effectively eliminated when their advantage is lost. Almost all known perchlorate-reducing organisms can also effectively reduce nitrate. This is certainly not the case for the Rhodobacteraceae that were found to dominate in the absence of nitrate, but effectively disappeared in its presence. This study is significant in that it reveals the existence of a novel group of organisms that play a role in the reduction of perchlorate under saline conditions. These Rhodobacteraceae especially, as well as other organisms present in these communities, may be a promising source of unique salt-tolerant enzymes for perchlorate reduction.
Subject(s)
Bioreactors/microbiology , Nitrates/metabolism , Perchlorates/metabolism , Rhodobacteraceae/metabolism , Rhodocyclaceae/metabolism , Sodium Chloride/metabolism , Base Sequence , Environmental Biodegradation , Ion Exchange , Metagenome/genetics , Molecular Sequence Data , 16S Ribosomal RNA/genetics , Rhodobacteraceae/genetics , Rhodocyclaceae/genetics , Salts/metabolism
ABSTRACT
BACKGROUND: The emergence of Next Generation Sequencing technologies has made it possible for individual investigators to generate gigabases of sequencing data per week. Effective analysis and manipulation of these data are limited by large file sizes, so even simple tasks such as data filtration and quality assessment have to be performed in several steps. This requires (potentially problematic) interaction between the investigator and a bioinformatics/computational service provider. Furthermore, such services are often performed using specialized computational facilities. RESULTS: We present a Windows-based application, Slim-Filter, designed to interactively examine the statistical properties of sequencing reads produced by the Illumina Genome Analyzer and to perform a broad spectrum of data manipulation tasks, including: filtration of low-quality and low-complexity reads; filtration of reads containing undesired subsequences (such as parts of adapters and PCR primers used during the sample and sequencing library preparation steps); exclusion of duplicated reads (while keeping each read's copy number information in a specialized data format); and sorting of reads by copy number, allowing for easy access and manual editing of the resulting files. Slim-Filter is organized as a sequence of windows summarizing the statistical properties of the reads. Each data manipulation step has roll-back abilities, allowing for return to previous steps of the data analysis process. Slim-Filter is written in C++ and is compatible with fasta, fastq, and the specialized AS file format presented in this manuscript. Setup files and a user's manual are available for download at the supplementary web site (https://www.bioinfo.uh.edu/Slim_Filter/). CONCLUSION: The presented Windows-based application has been developed with the goal of providing individual investigators with integrated sequencing read analysis, curation, and manipulation capabilities.
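Three of the listed tasks (dropping low-complexity reads, dropping reads that contain unwanted subsequences, and collapsing duplicates while keeping copy numbers) can be sketched in a few lines. This is an illustrative sketch and not Slim-Filter's implementation, which is a C++ Windows application. The function names, the 0.8 single-base threshold used as a low-complexity criterion, and the toy reads are all assumptions, and the AS file format is not reproduced here.

```python
from collections import Counter


def is_low_complexity(read: str, max_base_fraction: float = 0.8) -> bool:
    """Assumed criterion: one base makes up more than max_base_fraction of the read."""
    if not read:
        return True
    most_common_count = Counter(read).most_common(1)[0][1]
    return most_common_count / len(read) > max_base_fraction


def collapse_reads(reads, unwanted=(), max_base_fraction: float = 0.8):
    """Filter reads, merge duplicates, and sort by copy number (highest first)."""
    kept = (
        r for r in reads
        if not is_low_complexity(r, max_base_fraction)
        and not any(sub in r for sub in unwanted)
    )
    counts = Counter(kept)
    # Ties are broken alphabetically so the output order is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Toy reads; "AGATCGG" stands in for a hypothetical adapter fragment.
toy_reads = ["ACGTACGT", "ACGTACGT", "AAAAAAAA", "TTGACCAGT",
             "GGAGATCGGAAG", "TTGACCAGT", "ACGTACGT"]
for sequence, copies in collapse_reads(toy_reads, unwanted=("AGATCGG",)):
    print(copies, sequence)
```

Keeping the copy number next to each unique sequence preserves the information lost by deduplication while shrinking the data, which is the motivation the abstract gives for the specialized format.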
Subject(s)
Computational Biology/methods , DNA Sequence Analysis/methods , Software , DNA Primers , Genome/genetics
ABSTRACT
The emergence and rapid spread of the 2009 H1N1 pandemic influenza virus showed that many diagnostic tests were unsuitable for detecting the novel virus isolates. In most countries the probe-based TaqMan assay developed by the U.S. Centers for Disease Control and Prevention was used for diagnostic purposes. The substantial sequence data that became available during the course of the pandemic created the opportunity to utilize bioinformatics tools to evaluate the unique sequence properties of this virus for the development of diagnostic tests. We used a comprehensive computational approach to examine conserved 2009 H1N1 sequence signatures that are at least 20 nucleotides long and contain at least two mismatches compared to any other known H1N1 genome. We found that the hemagglutinin (HA) and neuraminidase (NA) genes contained sequence signatures that are highly conserved among 2009 H1N1 isolates. Based on the NA gene signatures, we used Visual-OMP to design primers with optimal hybridization affinity and we used ThermoBLAST to minimize amplification artifacts. This procedure resulted in a highly sensitive and discriminatory 2009 H1N1 detection assay. Importantly, we found that the primer set can be used reliably in both a conventional TaqMan and a SYBR green reverse transcriptase (RT)-PCR assay with no loss of specificity or sensitivity. We validated the diagnostic accuracy of the NA SYBR green assay with 125 clinical specimens obtained between May and August 2009 in Chile, and we showed diagnostic efficacy comparable to the CDC assay. Our approach highlights the use of systematic computational approaches to develop robust diagnostic tests during a viral pandemic.
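The signature criterion described above (conserved in the target isolates, at least 20 nucleotides long, and at least two mismatches from any other known genome) can be sketched as a brute-force k-mer search. The abstract does not specify the algorithm, so this is only one possible reading: it assumes ungapped, equal-length comparisons and requires a candidate to be present in every target sequence. The function names and toy sequences are hypothetical, and the all-against-all comparison would not scale to real genome collections.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def kmers(seq: str, k: int) -> set:
    """All distinct substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def find_signatures(targets, backgrounds, k: int = 20, min_mismatches: int = 2):
    """k-mers shared by all targets that differ from every background
    k-mer in at least min_mismatches positions."""
    if not targets:
        return []
    conserved = set.intersection(*(kmers(t, k) for t in targets))
    background_kmers = set().union(*(kmers(b, k) for b in backgrounds))
    return sorted(
        s for s in conserved
        if all(hamming(s, b) >= min_mismatches for b in background_kmers)
    )


# Toy example with k=4 so the result can be checked by hand.
toy_targets = ["AACCGGTT", "TTAACCGG"]
toy_backgrounds = ["AACTTTTT"]
print(find_signatures(toy_targets, toy_backgrounds, k=4))
```

In the toy example, the shared 4-mer AACC is rejected because it is only one mismatch away from the background 4-mer AACT.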
Subject(s)
Influenza A Virus H1N1 Subtype/isolation & purification , Human Influenza/diagnosis , Organic Compounds/metabolism , Reverse Transcriptase Polymerase Chain Reaction/methods , Virology/methods , Benzothiazoles , Chile , DNA Primers/genetics , Diamines , Influenza Virus Hemagglutinin Glycoproteins/genetics , Humans , Influenza A Virus H1N1 Subtype/genetics , Neuraminidase/genetics , Quinolines , Sensitivity and Specificity , Staining and Labeling/methods , United States , Viral Proteins/genetics
ABSTRACT
Viral diversity is theorized to play a significant role during virus infections, particularly for arthropod-borne viruses (arboviruses) that must infect both vertebrate and invertebrate hosts. To determine how viral diversity influences mosquito infection and dissemination, Culex taeniopus mosquitoes were infected with the Venezuelan equine encephalitis virus endemic strain 68U201. Bodies and legs/wings of the mosquitoes were collected individually and subjected to multi-parallel sequencing. Virus sequence diversity was calculated for each tissue. Greater diversity was seen in mosquitoes with successful dissemination versus those with no dissemination. Diversity across time revealed that bottlenecks influence diversity following dissemination to the legs/wings, but levels of diversity are restored by Day 12 post-dissemination. Specific minority variants were repeatedly identified across the mosquito cohort, some in nearly every tissue and time point, suggesting that certain variants are important in mosquito infection and dissemination. This study demonstrates that the interaction between the mosquito and the virus results in changes in diversity and the mutational spectrum and may be essential for successful transition of the bottlenecks associated with arbovirus infection.
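The abstract does not say which diversity measure was used. As one common choice, per-site Shannon entropy of the base counts in a sequencing pileup, averaged over sites, gives a single diversity value per tissue sample. The sketch below assumes that measure; the function names and the toy counts are hypothetical.

```python
import math


def site_entropy(base_counts: dict) -> float:
    """Shannon entropy (bits) of the base frequencies at one genome position."""
    total = sum(base_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in base_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def mean_entropy(sites: list) -> float:
    """Average per-site entropy across all positions of a sample."""
    if not sites:
        return 0.0
    return sum(site_entropy(s) for s in sites) / len(sites)


# Toy pileup: one fully mixed site and one invariant site.
toy_sample = [{"A": 50, "G": 50}, {"A": 100}]
print(mean_entropy(toy_sample))
```

A bottleneck would show up under this measure as a drop in mean entropy between tissues or time points, followed by a rise as diversity is restored.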
ABSTRACT
Microbial interactions are an underappreciated force in shaping insect microbiome communities. Although pairwise patterns of symbiont interactions have been identified, we have a poor understanding regarding the scale and the nature of co-occurrence and co-exclusion interactions within the microbiome. To characterize these patterns in mosquitoes, we sequenced the bacterial microbiome of Aedes aegypti, Ae. albopictus, and Culex quinquefasciatus caught in the field or reared in the laboratory and used these data to generate interaction networks. For collections, we used traps that attracted host-seeking or ovipositing female mosquitoes to determine how physiological state affects the microbiome under field conditions. Interestingly, we saw few differences in species richness or microbiome community structure in mosquitoes caught in either trap. Co-occurrence and co-exclusion analysis identified 116 pairwise interactions, substantially increasing the list of bacterial interactions observed in mosquitoes. Networks generated from the microbiome of Ae. aegypti often included highly interconnected hub bacteria. There were several instances where co-occurring bacteria co-excluded a third taxon, suggesting the existence of tripartite relationships. Several associations were observed in multiple species or in field and laboratory-reared mosquitoes, indicating these associations are robust and not influenced by environmental or host factors. To demonstrate that microbial interactions can influence colonization of the host, we administered symbionts to Ae. aegypti larvae that either possessed or lacked their resident microbiota. We found that the presence of resident microbiota can inhibit colonization of particular bacterial taxa. Our results highlight that microbial interactions in mosquitoes are complex and influence microbiome composition.
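A pairwise co-occurrence and co-exclusion analysis can be sketched from presence/absence data. The abstract does not state the statistical method, so this sketch makes its own assumptions: it correlates 0/1 presence vectors across mosquito samples (Pearson correlation on binary data), and labels a pair co-occurring or co-excluding when the correlation passes an arbitrary threshold of 0.6. Taxon names and data are placeholders.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y) -> float:
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def pairwise_interactions(presence: dict, threshold: float = 0.6):
    """Return (taxon_a, taxon_b, kind, r) for pairs whose |r| >= threshold."""
    edges = []
    for a, b in combinations(sorted(presence), 2):
        r = pearson(presence[a], presence[b])
        if r >= threshold:
            edges.append((a, b, "co-occurrence", r))
        elif r <= -threshold:
            edges.append((a, b, "co-exclusion", r))
    return edges


# Placeholder presence (1) / absence (0) of four taxa in six samples.
toy_presence = {
    "taxon_A": [1, 1, 0, 0, 1, 0],
    "taxon_B": [1, 1, 0, 0, 1, 0],
    "taxon_C": [0, 0, 1, 1, 0, 1],
    "taxon_D": [1, 0, 1, 0, 1, 0],
}
for edge in pairwise_interactions(toy_presence):
    print(edge)
```

In the toy data, taxon_A and taxon_B co-occur and both co-exclude taxon_C, which is the shape of the tripartite relationship the abstract describes.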
ABSTRACT
Emerging evidence points to a critically important role for bacterial small RNAs (sRNAs) as post-transcriptional regulators of physiology, metabolism, stress/adaptive responses, and virulence, but the roles of sRNAs in pathogenic Rickettsia species remain poorly understood. Here, we report on the identification of both novel and well-known bacterial sRNAs in Rickettsia prowazekii, known to cause epidemic typhus in humans. RNA sequencing of human microvascular endothelial cells (HMECs), the preferred targets during human rickettsioses, infected with R. prowazekii revealed the presence of 35 trans-acting and 23 cis-acting sRNAs. Of these, expression of two trans-acting (Rp_sR17 and Rp_sR60) and one cis-acting (Rp_sR47) novel sRNAs and four well-characterized bacterial sRNAs (RNaseP_bact_a, α-tmRNA, 4.5S RNA, 6S RNA) was further confirmed by Northern blot or RT-PCR analyses. The transcriptional start sites of five novel rickettsial sRNAs and 6S RNA were next determined using 5' RLM-RACE, yielding evidence for their independent biogenesis in R. prowazekii. Finally, computational approaches were employed to determine the secondary structures and potential mRNA targets of novel sRNAs. Together, these results establish the presence and expression of sRNAs in R. prowazekii during host cell infection and suggest potential functional roles for these important post-transcriptional regulators in rickettsial biology and pathogenesis.
ABSTRACT
Small regulatory RNAs comprise critically important modulators of gene expression in bacteria, yet very little is known about their prevalence and functions in Rickettsia species. R. conorii, the causative agent of Mediterranean spotted fever, is a tick-borne pathogen that primarily infects microvascular endothelium in humans. We have determined the transcriptional landscape of R. conorii during infection of Human Microvascular Endothelial Cells (HMECs) by strand-specific RNA sequencing to identify 4 riboswitches, 13 trans-acting (intergenic), and 22 cis-acting (antisense) small RNAs (termed 'Rc_sR's). Independent expression of four novel trans-acting sRNAs (Rc_sR31, Rc_sR33, Rc_sR35, and Rc_sR42) and known bacterial sRNAs (6S, RNaseP_bact_a, ffs, and α-tmRNA) was next confirmed by Northern hybridization. Comparative analysis during infection of HMECs vis-à-vis tick AAE2 cells revealed significantly higher expression of Rc_sR35 and Rc_sR42 in HMECs, whereas Rc_sR31 and Rc_sR33 were expressed at similar levels in both cell types. We further predicted a total of 502 genes involved in all important biological processes as potential targets of Rc_sRs and validated the interaction of Rc_sR42 with cydA (cytochrome d ubiquinol oxidase subunit I). Our findings constitute the first evidence of the existence of post-transcriptional riboregulatory mechanisms in R. conorii and interactions between a novel Rc_sR and its target mRNA.
Subject(s)
Bacterial RNA/genetics , Small Untranslated RNA/genetics , Rickettsia conorii/genetics , Animals , Base Sequence , Boutonneuse Fever/microbiology , Cultured Cells , Consensus Sequence , Disease Vectors , Endothelial Cells/microbiology , Gene Expression , Bacterial Gene Expression Regulation , Humans , Ixodidae/cytology , Ixodidae/microbiology , Bacterial RNA/metabolism , Riboswitch , RNA Sequence Analysis
ABSTRACT
The NCI-60 human tumor cell line panel has been used in a broad range of cancer research over the last two decades. A landmark 2013 whole exome sequencing study of this panel added an exceptional new resource for cancer biologists. A complementary analysis of the sequencing data produced by this study suggests the presence of Propionibacterium acnes genomic sequences in almost half of the datasets, with the highest abundance in the leukemia (RPMI-8226) and central nervous system (SF-295, SF-539, and SNB-19) cell lines. While the origin of these contaminating bacterial sequences remains to be determined, the observed results suggest that computational control for the presence of microbial genomic material is a necessary step in the analysis of high-throughput sequencing (HTS) data.