ABSTRACT
Advances in genomics increasingly depend on the ability to analyze large and diverse genomic data collections, which are often difficult to amass due to privacy concerns. Recent work has shown that it is possible to jointly analyze datasets held by multiple parties while provably preserving the privacy of each party's dataset using cryptographic techniques. However, these tools have been challenging to use in practice due to the complexities of the required setup and coordination among the parties. We present sfkit, a secure and federated toolkit for collaborative genomic studies, to allow groups of collaborators to easily perform joint analyses of their datasets without compromising privacy. sfkit consists of a web server and a command-line interface, which together support a range of use cases including both auto-configured and user-supplied computational environments. sfkit provides collaborative workflows for the essential tasks of genome-wide association study (GWAS) and principal component analysis (PCA). We envision sfkit becoming a one-stop server for secure collaborative tools for a broad range of genomic analyses. sfkit is open-source and available at: https://sfkit.org.
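The abstract does not name the cryptographic techniques involved, so the following is only a generic illustration of how several parties can compute a joint statistic without revealing their inputs. It uses additive secret sharing, a standard textbook building block chosen here as an assumption, not as a description of sfkit's protocol. The names `share`, `reconstruct`, `secure_sum`, and the modulus `PRIME` are hypothetical.

```python
import secrets
from typing import List

# Modulus for the toy field; an arbitrary choice for this sketch.
PRIME = 2**61 - 1


def share(value: int, n_parties: int) -> List[int]:
    """Split value into n_parties random shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares: List[int]) -> int:
    """Recombine shares into the value they encode."""
    return sum(shares) % PRIME


def secure_sum(private_values: List[int]) -> int:
    """Simulate parties summing their private values via secret shares."""
    n = len(private_values)
    # Row i holds the shares that party i hands out, one per party.
    distributed = [share(v, n) for v in private_values]
    # Party j adds the j-th share from every row; a single share looks random.
    partial_sums = [sum(row[j] for row in distributed) % PRIME for j in range(n)]
    # Only the combined partial sums reveal the total.
    return reconstruct(partial_sums)


if __name__ == "__main__":
    # Three sites each hold a private allele count.
    print(secure_sum([12, 30, 7]))
```

In this toy setting each party sees only random-looking shares from the others, yet the total is recovered exactly. A real deployment would also need secure channels and protection against misbehaving parties, which the sketch does not model.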
Subjects
Genome-Wide Association Study , Genomics , Software , Genome-Wide Association Study/methods , Genomics/methods , Internet , Privacy , Workflow
ABSTRACT
Archaeal homologs of eukaryotic C/D box small nucleolar RNAs (C/D box sRNAs) guide precise 2'-O-methyl modification of ribosomal and transfer RNAs. Although C/D box sRNA genes constitute one of the largest RNA gene families in archaeal thermophiles, most genomes have incomplete sRNA gene annotation because reliable, fully automated detection methods are not available. We expanded and curated a comprehensive gene set across six species of the crenarchaeal genus Pyrobaculum, particularly rich in C/D box sRNA genes. Using high-throughput small RNA sequencing, specialized computational searches and comparative genomics, we analyzed 526 Pyrobaculum C/D box sRNAs, organizing them into 110 families based on synteny and conservation of guide sequences which determine methylation targets. We examined gene duplications and rearrangements, including one family that has expanded in a pattern similar to retrotransposed repetitive elements in eukaryotes. New training data and inclusion of kink-turn secondary structural features enabled creation of an improved search model. Our analyses provide the most comprehensive, dynamic view of C/D box sRNA evolutionary history within a genus, in terms of modification function, feature plasticity, and gene mobility.
Subjects
Molecular Evolution , Pyrobaculum/genetics , Archaeal RNA/genetics , Small Nucleolar RNA/genetics , Archaeal Proteins/genetics , Base Pair Mismatch , Duplicate Genes , Genomics , Methylation , Multigene Family , Archaeal RNA/chemistry , Archaeal RNA/classification , Archaeal RNA/metabolism , Ribosomal RNA/metabolism , Small Nucleolar RNA/chemistry , Small Nucleolar RNA/classification , Small Nucleolar RNA/metabolism , Transfer RNA/metabolism , Untranslated RNA/genetics , Sequence Alignment
ABSTRACT
Type III secretion systems (T3SS) are essential for virulence in dozens of pathogens, but are not required for growth outside the host. Therefore, the T3SS of many bacterial species are under tight regulatory control. To increase our understanding of the molecular mechanisms behind T3SS regulation, we performed a transposon screen to identify genes important for T3SS function in the food-borne pathogen Yersinia pseudotuberculosis. We identified two unique transposon insertions in YPTB2860, a gene that displays 79% identity with the E. coli iron-sulfur cluster regulator, IscR. A Y. pseudotuberculosis iscR in-frame deletion mutant (ΔiscR) was deficient in secretion of Ysc T3SS effector proteins and in targeting macrophages through the T3SS. To determine the mechanism behind IscR control of the Ysc T3SS, we carried out transcriptome and bioinformatic analysis to identify Y. pseudotuberculosis genes regulated by IscR. We discovered a putative IscR binding motif upstream of the Y. pseudotuberculosis yscW-lcrF operon. As LcrF controls transcription of a number of critical T3SS genes in Yersinia, we hypothesized that Yersinia IscR may control the Ysc T3SS through LcrF. Indeed, purified IscR bound to the identified yscW-lcrF promoter motif and mRNA levels of lcrF and 24 other T3SS genes were reduced in Y. pseudotuberculosis in the absence of IscR. Importantly, mice orally infected with the Y. pseudotuberculosis ΔiscR mutant displayed decreased bacterial burden in Peyer's patches, mesenteric lymph nodes, spleens, and livers, indicating an essential role for IscR in Y. pseudotuberculosis virulence. This study presents the first characterization of Yersinia IscR and provides evidence that IscR is critical for virulence and type III secretion through direct regulation of the T3SS master regulator, LcrF.
Subjects
Bacterial Secretion Systems/genetics , Escherichia coli Proteins/genetics , Transcription Factors/genetics , Virulence Factors/genetics , Yersinia pseudotuberculosis/genetics , Yersinia pseudotuberculosis/pathogenicity , Amino Acid Sequence , Animals , Binding Sites/genetics , DNA Transposable Elements/genetics , Escherichia coli/genetics , Gene Expression Profiling , Bacterial Gene Expression Regulation , Liver/immunology , Liver/microbiology , Lymph Nodes/immunology , Lymph Nodes/microbiology , Mice , Molecular Sequence Data , Peyer's Patches/immunology , Peyer's Patches/microbiology , Genetic Promoter Regions/genetics , Protein Binding , Sequence Alignment , Spleen/immunology , Spleen/microbiology , Genetic Transcription , Transcriptome/genetics , Yersinia pseudotuberculosis Infections/immunology , Yersinia pseudotuberculosis Infections/pathology
ABSTRACT
In the Eukarya and Archaea, small RNA-guided pseudouridine modification is believed to be an essential step in ribosomal RNA maturation. While readily modeled and identified by computational methods in eukaryotic species, these guide RNAs have not been found in most archaeal genomes. Using high-throughput transcriptome sequencing and comparative genomics, we have identified ten novel small RNA families that appear to function as H/ACA pseudouridylation guide sRNAs, yet surprisingly lack several expected canonical features. The new RNA genes are transcribed and highly conserved across at least six species in the archaeal hyperthermophilic genus Pyrobaculum. The sRNAs exhibit a single hairpin structure interrupted by a conserved kink-turn motif, yet only two of ten families contain the complete canonical structure found in all other H/ACA sRNAs. Half of the sRNAs lack the conserved 3'-terminal ACA sequence, and many contain only a single 3' guide region rather than the canonical 5' and 3' bipartite guides. The predicted sRNA structures contain guide sequences that exhibit strong complementarity to ribosomal RNA or transfer RNA. Most of the predicted targets of pseudouridine modification are structurally equivalent to those known in other species. One sRNA appears capable of guiding pseudouridine modification at positions U54 and U55 in most or all Pyrobaculum tRNAs. We experimentally tested seven predicted pseudouridine modifications in ribosomal RNA, and all but one were confirmed. The structural insights provided by this new set of Pyrobaculum sRNAs will augment existing models and may facilitate the identification and characterization of new guide sRNAs in other archaeal species.
Subjects
Pseudouridine/metabolism , Pyrobaculum/genetics , Base Sequence , Conserved Sequence , Gene Expression , Genetic Variation , Molecular Sequence Data , Nucleic Acid Conformation , RNA Folding , Ribosomal RNA/chemistry , Ribosomal RNA/genetics , 16S Ribosomal RNA/chemistry , 16S Ribosomal RNA/genetics , Transfer RNA/genetics , Sequence Alignment , Small Untranslated RNA
ABSTRACT
RNase P RNA is an ancient, nearly universal feature of life. As part of the ribonucleoprotein RNase P complex, the RNA component catalyzes essential removal of 5' leaders in pre-tRNAs. In 2004, Li and Altman computationally identified the RNase P RNA gene in all but three sequenced microbes: Nanoarchaeum equitans, Pyrobaculum aerophilum, and Aquifex aeolicus (all hyperthermophiles) [Li Y, Altman S (2004) RNA 10:1533-1540]. A recent study concluded that N. equitans does not have or require RNase P activity because it lacks 5' tRNA leaders. The "missing" RNase P RNAs in the other two species are perplexing given evidence or predictions that tRNAs are trimmed in both, prompting speculation that they may have developed novel alternatives to 5' pre-tRNA processing. Using comparative genomics and improved computational methods, we have now identified a radically minimized form of the RNase P RNA in five Pyrobaculum species and the related crenarchaea Caldivirga maquilingensis and Vulcanisaeta distributa, all retaining a conventional catalytic domain, but lacking a recognizable specificity domain. We confirmed 5' tRNA processing activity by high-throughput RNA sequencing and in vitro biochemical assays. The Pyrobaculum and Caldivirga RNase P RNAs are the smallest naturally occurring form yet discovered to function as trans-acting precursor tRNA-processing ribozymes. Loss of the specificity domain in these RNAs suggests altered substrate specificity and could be a useful model for finding other potential roles of RNase P. This study illustrates an effective combination of next-generation RNA sequencing, computational genomics, and biochemistry to identify a divergent, formerly undetectable variant of an essential noncoding RNA gene.
Subjects
Archaeal Proteins/genetics , Pyrobaculum/genetics , Archaeal RNA/genetics , Ribonuclease P/genetics , Archaeal Proteins/isolation & purification , Archaeal Proteins/metabolism , Base Sequence , Biocatalysis , Computational Biology/methods , Polyacrylamide Gel Electrophoresis , Archaeal Genome/genetics , Molecular Sequence Data , Nucleic Acid Conformation , Pyrobaculum/classification , Pyrobaculum/enzymology , RNA Precursors/chemistry , RNA Precursors/genetics , RNA Precursors/metabolism , Archaeal RNA/metabolism , Catalytic RNA/genetics , Catalytic RNA/metabolism , Transfer RNA/chemistry , Transfer RNA/genetics , Transfer RNA/metabolism , Ribonuclease P/isolation & purification , Ribonuclease P/metabolism , Nucleic Acid Sequence Homology , Species Specificity , Substrate Specificity
ABSTRACT
A cornerstone of bacterial molecular biology is the ability to genetically manipulate the microbe under study. Many bacteria are difficult to manipulate genetically, a phenotype due in part to robust removal of newly acquired DNA, for example, by restriction-modification (R-M) systems. Here, we report approaches that dramatically improve bacterial transformation efficiency, piloted using a microbe that is challenging to transform due to expression of many R-M systems, Helicobacter pylori. Initially, we identified conditions that dampened expression of several R-M systems and concomitantly enhanced transformation efficiency. We then identified an approach that would broadly protect newly acquired DNA. We computationally predicted under-represented short DNA sequences in the H. pylori genome, with the idea that these sequences reflect targets of sequence-based surveillance such as R-M systems. We then used this information to modify and eliminate such sites in antibiotic resistance cassettes, creating a "stealth" version. Modifying antibiotic resistance cassettes in this way resulted in significantly higher transformation efficiency compared to non-modified cassettes, a response that was independent of genomic locus. Our results suggest that avoiding R-M systems, via modification of under-represented DNA sequences or transformation conditions, is a powerful method to enhance DNA transformation. Our approach to identify under-represented sequences is applicable to any microbe with a sequenced genome.
IMPORTANCE
Manipulating the genomes of bacteria is critical to many fields. Such manipulations are made by genetic engineering, which often requires new pieces of DNA to be added to the genome. Bacteria have robust systems for identifying and degrading new DNA, some of which rely on restriction enzymes. These enzymes cut DNA at specific sequences. We identified a set of DNA sequences that are missing from a bacterium's genome more often than would be expected by chance. Eliminating these sequences from a new piece of DNA allowed it to be incorporated into the bacterial genome at a higher frequency than new DNA containing the sequences. Removing such sequences appears to allow the new DNA to fly under the bacterial radar in "stealth" mode. This transformation improvement approach is straightforward to apply and likely broadly applicable.
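The abstract does not say how under-representation was scored. As a minimal sketch of the general idea, the code below compares each k-mer's observed count with the count expected if bases occurred independently at the genome's own frequencies. That null model, the function names, the default `k`, and the `max_ratio` threshold are all assumptions made for illustration. Only the forward strand is counted.

```python
from collections import Counter
from itertools import product
from typing import List, Tuple


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer on the forward strand of seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def underrepresented_kmers(
    seq: str, k: int = 4, max_ratio: float = 0.5
) -> List[Tuple[str, int, float, float]]:
    """Return (kmer, observed, expected, ratio) where observed/expected <= max_ratio.

    Expected counts assume independent bases drawn at the sequence's
    own A/C/G/T frequencies (a simplifying assumption of this sketch).
    """
    seq = seq.upper()
    n = len(seq)
    base_freq = {b: seq.count(b) / n for b in "ACGT"}
    observed = kmer_counts(seq, k)
    positions = n - k + 1  # number of k-mer start sites
    hits = []
    for kmer in map("".join, product("ACGT", repeat=k)):
        expected = float(positions)
        for base in kmer:
            expected *= base_freq[base]
        if expected == 0:
            continue  # k-mer uses a base that never occurs
        ratio = observed[kmer] / expected
        if ratio <= max_ratio:
            hits.append((kmer, observed[kmer], expected, ratio))
    # Most depleted k-mers first.
    return sorted(hits, key=lambda hit: hit[3])


if __name__ == "__main__":
    toy_genome = "ACGT" * 100  # toy input, not real data
    for kmer, obs, exp, ratio in underrepresented_kmers(toy_genome)[:5]:
        print(kmer, obs, round(exp, 2), round(ratio, 2))
```

A production analysis would likely count both strands and use a higher-order background model. The sketch only shows the observed-versus-expected comparison that flags candidate sites to remove from a cassette.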
ABSTRACT
The All of Us Research Program's Data and Research Center (DRC) was established to help acquire, curate, and provide access to one of the world's largest and most diverse datasets for precision medicine research. Already, over 500,000 participants are enrolled in All of Us, 80% of whom are underrepresented in biomedical research, and data are being analyzed by a community of over 2,300 researchers. The DRC created this thriving data ecosystem by collaborating with engaged participants, innovative program partners, and empowered researchers. In this review, we first describe how the DRC is organized to meet the needs of this broad group of stakeholders. We then outline guiding principles, common challenges, and innovative approaches used to build the All of Us data ecosystem. Finally, we share lessons learned to help others navigate important decisions and trade-offs in building a modern biomedical data platform.
Subjects
Biomedical Research , Population Health , Humans , Ecosystem , Precision Medicine
ABSTRACT
Arsenotrophy, growth coupled to autotrophic arsenite oxidation or arsenate respiratory reduction, occurs only in the prokaryotic domain of life. The enzymes responsible for arsenotrophy belong to distinct clades within the DMSO reductase family of molybdenum-containing oxidoreductases: specifically arsenate respiratory reductase, ArrA, and arsenite oxidase, AioA (formerly referred to as AroA and AoxB). A new arsenite oxidase clade, ArxA, represented by the haloalkaliphilic bacterium Alkalilimnicola ehrlichii strain MLHE-1, was also identified in the photosynthetic purple sulfur bacterium Ectothiorhodospira sp. strain PHS-1. A draft genome sequence of PHS-1 was completed and an arx operon similar to that of MLHE-1 was identified. Gene expression studies showed that arxA was strongly induced with arsenite. Microbial ecology investigation led to the identification of additional arxA-like sequences in Mono Lake and Hot Creek sediments, both arsenic-rich environments in California. Phylogenetic analyses placed these sequences as distinct members of the ArxA clade of arsenite oxidases. ArxA-like sequences were also identified in metagenome sequences of several alkaline microbial mat environments of Yellowstone National Park hot springs. These results suggest that ArxA-type arsenite oxidases are widely distributed in the environment, presenting an opportunity for further investigations of the contribution of Arx-dependent arsenotrophy to the arsenic biogeochemical cycle.
Subjects
Arsenic/metabolism , Ectothiorhodospira/enzymology , Oxidoreductases/genetics , Arsenate Reductases/genetics , Autotrophic Processes , California , Ectothiorhodospira/genetics , Bacterial Genes , Hot Springs/microbiology , Iron-Sulfur Proteins , Metagenome , Operon , Oxidation-Reduction , Phylogeny , DNA Sequence Analysis
ABSTRACT
The Limnospira genus is a recently established clade that is economically important due to its worldwide use in biotechnology and agriculture. This genus includes organisms that were reclassified from Arthrospira, which are commercially marketed as "Spirulina." Limnospira are photoautotrophic organisms that are widely used for research in nutrition, medicine, bioremediation, and biomanufacturing. Despite its widespread use, there is no closed genome for the Limnospira genus, and no reference genome for the type strain, Limnospira fusiformis. In this work, the L. fusiformis genome was sequenced using Oxford Nanopore Technologies MinION and assembled using only ultra-long reads (>35 kb). This assembly was polished with Illumina MiSeq reads sourced from an axenic L. fusiformis culture; axenicity was verified via microscopy and rDNA analysis. Ultra-long read sequencing resulted in a 6.42 Mb closed genome assembled as a single contig with no plasmid. Phylogenetic analysis placed L. fusiformis in the Limnospira clade; some Arthrospira were also placed in this clade, suggesting a misclassification of these strains. This work provides a fully closed and accurate reference genome for the economically important type strain, L. fusiformis. We also present a rapid axenicity method to isolate L. fusiformis. These contributions enable future biotechnological development of L. fusiformis by way of genetic engineering.
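The abstract states that only ultra-long reads (>35 kb) were used for assembly but does not describe the tooling. The sketch below shows one way such a length filter could look for plain four-line FASTQ records. The function names and the in-memory input are hypothetical, and multi-line FASTQ is not handled.

```python
from itertools import islice
from typing import Iterable, Iterator, Tuple

FastqRecord = Tuple[str, str, str]  # (header, sequence, quality)


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from FASTQ text with exactly four lines per read."""
    stripped = (line.rstrip("\n") for line in lines)
    while True:
        record = list(islice(stripped, 4))
        if len(record) < 4:
            return  # end of input (an incomplete trailing record is dropped)
        header, seq, _separator, qual = record
        yield header, seq, qual


def filter_long_reads(
    records: Iterable[FastqRecord], min_length: int = 35_000
) -> Iterator[FastqRecord]:
    """Keep reads strictly longer than min_length bases."""
    for header, seq, qual in records:
        if len(seq) > min_length:
            yield header, seq, qual


if __name__ == "__main__":
    # Toy reads with a small threshold so the example stays readable.
    toy = ["@short", "A" * 10, "+", "I" * 10,
           "@long", "A" * 40, "+", "I" * 40]
    for header, seq, _qual in filter_long_reads(parse_fastq(toy), min_length=35):
        print(header, len(seq))
```

Both functions are generators, so a large file can be streamed by passing an open file handle to `parse_fastq` instead of a list.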
ABSTRACT
The Global Alliance for Genomics and Health (GA4GH) supports international standards that enable a federated data sharing model for the research community while respecting data security, ethical and regulatory frameworks, and data authorization and access processes for sensitive data. The GA4GH Passport standard (Passport) defines a machine-readable digital identity that conveys roles and data access permissions (called "visas") for individual users. Visas are issued by data stewards, including data access committees (DACs) working with public databases, the entities responsible for the quality, integrity, and access arrangements for the datasets in the management of human biomedical data. Passports streamline management of data access rights across data systems by using visas that present a data user's digital identity and permissions across organizations, tools, environments, and services. We describe real-world implementations of the GA4GH Passport standard in use cases from ELIXIR Europe, National Institutes of Health, and the Autism Sharing Initiative. These implementations demonstrate that the Passport standard has provided transparent mechanisms for establishing permissions and authorizing data access across platforms.
ABSTRACT
The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution. We describe the GA4GH organization, which is fueled by the development efforts of eight Work Streams and informed by the needs of 24 Driver Projects and other key stakeholders. We present the GA4GH suite of secure, interoperable technical standards and policy frameworks and review the current status of standards, their relevance to key domains of research and clinical care, and future plans of GA4GH. Broad international participation in building, adopting, and deploying GA4GH standards and frameworks will catalyze an unprecedented effort in data sharing that will be critical to advancing genomic medicine and ensuring that all populations can access its benefits.
ABSTRACT
Hyperthermophilic crenarchaea in the genus Pyrobaculum are notable for respiratory versatility, but relatively little is known about the genetics or regulation of crenarchaeal respiratory pathways. We measured global gene expression in Pyrobaculum aerophilum cultured with oxygen, nitrate, arsenate and ferric iron as terminal electron acceptors to identify transcriptional patterns that differentiate these pathways. We also compared genome sequences for four closely related species with diverse respiratory characteristics (Pyrobaculum arsenaticum, Pyrobaculum calidifontis, Pyrobaculum islandicum, and Thermoproteus neutrophilus) to identify genes associated with different respiratory capabilities. Specific patterns of gene expression in P. aerophilum were associated with aerobic respiration, nitrate respiration, arsenate respiration, and anoxia. Functional predictions based on these patterns include separate cytochrome oxidases for aerobic growth and oxygen scavenging, a nitric oxide-responsive transcriptional regulator, a multicopper oxidase involved in denitrification, and an archaeal arsenate respiratory reductase. We were unable to identify specific genes for iron respiration, but P. aerophilum exhibited repressive transcriptional responses to iron remarkably similar to those controlled by the ferric uptake regulator in bacteria. Together, these analyses present a genome-scale view of crenarchaeal respiratory flexibility and support a large number of functional and regulatory predictions for further investigation. The complete gene expression data set can be viewed in genomic context with the Archaeal Genome Browser at archaea.ucsc.edu.
Subjects
Gene Expression Profiling , Pyrobaculum/genetics , Pyrobaculum/metabolism , Northern Blotting , Archaeal Genome/genetics , Oligonucleotide Array Sequence Analysis , Oxygen Consumption
ABSTRACT
Many bacterial genomes are highly variable but nonetheless are typically published as a single assembled genome. Experiments tracking bacterial genome evolution have not looked at the variation present at a given point in time. Here, we analyzed the mouse-passaged Helicobacter pylori strain SS1 and its parent PMSS1 to assess intra- and intergenomic variability. Using high sequence coverage depth and experimental validation, we detected extensive genome plasticity within these H. pylori isolates, including movement of the transposable element IS607, large and small inversions, multiple single nucleotide polymorphisms, and variation in cagA copy number. The cagA gene was found as 1 to 4 tandem copies located off the cag island in both SS1 and PMSS1; this copy number variation correlated with protein expression. To gain insight into the changes that occurred during mouse adaptation, we also compared SS1 and PMSS1 and observed 46 differences that were distinct from the within-genome variation. The most substantial was an insertion in cagY, which encodes a protein required for a type IV secretion system function. We detected modifications in genes coding for two proteins known to affect mouse colonization, the HpaA neuraminyllactose-binding protein and the FutB α-1,3 lipopolysaccharide (LPS) fucosyltransferase, as well as genes predicted to modulate diverse properties. In sum, our work suggests that data from consensus genome assemblies from single colonies may be misleading by failing to represent the variability present. Furthermore, we show that high-depth genomic sequencing data of a population can be analyzed to gain insight into the normal variation within bacterial strains.
IMPORTANCE
Although it is well known that many bacterial genomes are highly variable, it is nonetheless traditional to refer to, analyze, and publish "the genome" of a bacterial strain. Variability is usually reduced ("only sequence from a single colony"), ignored ("just publish the consensus"), or placed in the "too-hard" basket ("analysis of raw read data is more robust"). Now that whole-genome sequences are regularly used to assess virulence and track outbreaks, a better understanding of the baseline genomic variation present within single strains is needed. Here, we describe the variability seen in typical working stocks and colonies of pathogen Helicobacter pylori model strains SS1 and PMSS1 as revealed by use of high-coverage mate pair next-generation sequencing (NGS) and confirmed by traditional laboratory techniques. This work demonstrates that reliance on a consensus assembly as "the genome" of a bacterial strain may be misleading.
Subjects
Genetic Variation , Bacterial Genome , Helicobacter pylori/genetics , Animals , High-Throughput Nucleotide Sequencing , Mice , Mutation
ABSTRACT
Transfer RNAs (tRNA) are the most common RNA molecules in cells and have critical roles as both translators of the genetic code and regulators of protein synthesis. As such, numerous methods have focused on studying tRNA abundance and regulation, with the most widely used methods being RNA-seq and microarrays. Though revolutionary to transcriptomics, these assays are limited by an inability to encode tRNA modifications in the requisite cDNA. These modifications are abundant in tRNA and critical to their function. Here, we describe proof-of-concept experiments where individual tRNA molecules are examined as linear strands using a biological nanopore. This method utilizes an enzymatically ligated synthetic DNA adapter to concentrate tRNA at the lipid bilayer of the nanopore device and efficiently denature individual tRNA molecules as they are pulled through the α-hemolysin (α-HL) nanopore. Additionally, the DNA adapter provides a loading site for φ29 DNA polymerase (φ29 DNAP), which acts as a brake on the translocating tRNA. This increases the dwell time of adapted tRNA in the nanopore, allowing us to identify the region of the nanopore signal that is produced by the translocating tRNA itself. Using adapter-modified Escherichia coli tRNA(fMet) and tRNA(Lys), we show that the nanopore signal during controlled translocation is dependent on the identity of the tRNA. This confirms that adapter-modified tRNA can translocate end-to-end through nanopores and provides the foundation for future work in direct sequencing of individual transfer RNA with a nanopore-based device.
ABSTRACT
Within the domain Archaea, the CRISPR immune system appears to be nearly ubiquitous based on computational genome analyses. Initial studies in bacteria demonstrated that the CRISPR system targets invading plasmid and viral DNA. Recent experiments in the model archaeon Pyrococcus furiosus have uncovered a novel RNA-targeting variant of the CRISPR system. Because our understanding of CRISPR system evolution in other archaea is limited, we have taken a comparative genomic and transcriptomic view of the CRISPR arrays across six diverse species within the crenarchaeal genus Pyrobaculum. We present transcriptional data from each of four species in the genus (P. aerophilum, P. islandicum, P. calidifontis, P. arsenaticum), analyzing mature CRISPR-associated small RNA abundance from over 20 arrays. Within the genus, there is remarkable conservation of CRISPR array structure, as well as unique features that have not been studied in other archaeal systems. These unique features include: a nearly invariant CRISPR promoter, conservation of direct repeat families, the 5' polarity of CRISPR-associated small RNA abundance, and a novel CRISPR-specific association with homologues of nurA and herA. These analyses provide a genus-level evolutionary perspective on archaeal CRISPR systems, broadening our understanding beyond existing non-comparative model systems.
ABSTRACT
A great diversity of small, non-coding RNA (ncRNA) molecules with roles in gene regulation and RNA processing have been intensely studied in eukaryotic and bacterial model organisms, yet our knowledge of possible parallel roles for small RNAs (sRNA) in archaea is limited. We employed RNA-seq to identify novel sRNA across multiple species of the hyperthermophilic genus Pyrobaculum, known for unusual RNA gene characteristics. By comparing transcriptional data collected in parallel among four species, we were able to identify conserved RNA genes fitting into known and novel families. Among our findings, we highlight three novel cis-antisense sRNAs encoded opposite to key regulatory (ferric uptake regulator), metabolic (triose-phosphate isomerase), and core transcriptional apparatus genes (transcription factor B). We also found a large increase in the number of conserved C/D box sRNA genes over what had been previously recognized; many of these genes are encoded antisense to protein coding genes. The conserved opposition to orthologous genes across the Pyrobaculum genus suggests similarities to other cis-antisense regulatory systems. Furthermore, the genus-specific nature of these sRNAs indicates they are relatively recent, stable adaptations.
ABSTRACT
Pyrobaculum oguniense TE7 is an aerobic hyperthermophilic crenarchaeon isolated from a hot spring in Japan. Here we describe its main chromosome of 2,436,033 bp, with three large-scale inversions and an extra-chromosomal element of 16,887 bp. We have annotated 2,800 protein-coding genes and 145 RNA genes in this genome, including nine H/ACA-like small RNA, 83 predicted C/D box small RNA, and 47 transfer RNA genes. Comparative analyses with the closest known relative, the anaerobe Pyrobaculum arsenaticum from Italy, reveal unexpectedly high synteny and nucleotide identity between these two geographically distant species. Deep sequencing of a mixture of genomic DNA from multiple cells has illuminated some of the genome dynamics potentially shared with other species in this genus.