1.
Nat Commun ; 9(1): 3676, 2018 09 10.
Article in English | MEDLINE | ID: mdl-30201986

ABSTRACT

Current methods for genome-wide analysis of gene expression require fragmentation of original transcripts into small fragments for short-read sequencing. In bacteria, the resulting fragmented information hides operon complexity. Additionally, in vivo processing of transcripts confounds the accurate identification of the 5' and 3' ends of operons. Here we develop a methodology called SMRT-Cappable-seq that combines the isolation of un-fragmented primary transcripts with single-molecule long read sequencing. Applied to E. coli, this technology results in an accurate definition of the transcriptome with 34% of known operons from RegulonDB being extended by at least one gene. Furthermore, 40% of transcription termination sites have read-through that alters the gene content of the operons. As a result, most of the bacterial genes are present in multiple operon variants reminiscent of eukaryotic splicing. By providing such granularity in the operon structure, this study represents an important resource for the study of prokaryotic gene network and regulation.


Subject(s)
Escherichia coli/genetics , Genome, Bacterial , Operon , Sequence Analysis, RNA/methods , Transcriptome , Amino Acid Motifs , Chromosome Mapping , DNA, Complementary/genetics , Gene Expression Profiling , Gene Regulatory Networks , Genomics , Promoter Regions, Genetic , Transcription, Genetic
3.
Sci Rep ; 7(1): 16140, 2017 11 23.
Article in English | MEDLINE | ID: mdl-29170397

ABSTRACT

The Helicobacter pylori phase variable gene modH, typified by gene HP1522 in strain 26695, encodes an N6-adenosine type III DNA methyltransferase. Our previous studies identified multiple strain-specific modH variants (modH1 to modH19) and showed that phase variation of modH5 in H. pylori P12 influenced expression of motility-associated genes and the outer membrane protein gene hopG. However, the ModH5 DNA recognition motif and the mechanism by which ModH5 controls gene expression were unknown. Here, using comparative single molecule real-time sequencing, we identify the DNA site methylated by ModH5 as 5'-Gm6ACC-3'. This motif is vastly underrepresented in H. pylori genomes, but overrepresented in a number of virulence genes, including motility-associated genes and outer membrane protein genes. Motility and the number of flagella of H. pylori P12 wild-type were significantly higher than those of isogenic modH5 OFF or ΔmodH5 mutants, indicating that phase variable switching of modH5 expression plays a role in regulating H. pylori motility phenotypes. Using the flagellin A (flaA) gene as a model, we show that ModH5 modulates flaA promoter activity in a GACC methylation-dependent manner. These findings provide novel insights into the role of ModH5 in gene regulation and how it mediates epigenetic regulation of H. pylori motility.


Subject(s)
Bacterial Proteins/metabolism , Helicobacter pylori/metabolism , Bacterial Proteins/genetics , Epigenesis, Genetic/genetics , Gene Expression Regulation, Bacterial/genetics , Gene Expression Regulation, Bacterial/physiology , Helicobacter pylori/genetics
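The abstract above describes the GACC motif as underrepresented genome-wide. As a rough illustration of how such a claim starts from a simple count, the sketch below tallies occurrences of a motif on both strands of a sequence. It is not the authors' analysis pipeline; the function names and the toy sequence are hypothetical, and it assumes an uppercase or lowercase A/C/G/T string with no ambiguity codes.

```python
# Hypothetical helper names; a minimal sketch, not the method used in the paper.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def count_motif_both_strands(seq, motif):
    """Count motif sites on the forward and reverse strands.

    A palindromic motif reads the same on both strands, so it is
    counted once per site rather than twice.
    """
    seq = seq.upper()
    motif = motif.upper()
    forward = count_overlapping(seq, motif)
    rc_motif = reverse_complement(motif)
    if rc_motif == motif:
        return forward
    return forward + count_overlapping(seq, rc_motif)


# Toy sequence (made up for illustration): two GACC sites on the forward
# strand and one on the reverse strand (seen as GGTC).
toy = "GACCTTGGTCGACC"
print(count_motif_both_strands(toy, "GACC"))
```

Comparing such a count with the number expected from the genome's base composition is one common way to judge under- or overrepresentation; that comparison is left out here.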
4.
Genome Announc ; 5(21)2017 May 25.
Article in English | MEDLINE | ID: mdl-28546484

ABSTRACT

Moraxella catarrhalis is an important bacterial pathogen that causes otitis media and exacerbations of chronic obstructive pulmonary disease. Here, we report the complete genome sequence of M. catarrhalis strain CCRI-195ME, which contains the phase-variable epigenetic regulator ModM3.

5.
Genome Res ; 27(5): 849-864, 2017 05.
Article in English | MEDLINE | ID: mdl-28396521

ABSTRACT

The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions, and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that although the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health.


Subject(s)
Contig Mapping/methods , Genome, Human , Genomics/methods , Sequence Analysis, DNA/methods , Software , Contig Mapping/standards , Genomics/standards , Haploidy , Haplotypes , Humans , Polymorphism, Genetic , Reference Standards , Sequence Analysis, DNA/standards
6.
Genome Res ; 27(5): 677-685, 2017 05.
Article in English | MEDLINE | ID: mdl-27895111

ABSTRACT

To better understand the full spectrum of human genetic variation, we generated deep single-molecule, real-time (SMRT) sequencing data from two haploid human genomes. By using an assembly-based approach (SMRT-SV), we systematically assessed each genome independently for structural variants (SVs) and indels, resolving the sequence structure of 461,553 genetic variants from 2 bp to 28 kbp in length. We find that >89% of these variants have been missed as part of analysis of the 1000 Genomes Project even after adjusting for more common variants (MAF > 1%). We estimate that this theoretical human diploid differs by as much as ∼16 Mbp with respect to the human reference, with long-read sequencing data providing a fivefold increase in sensitivity for genetic variants ranging in size from 7 bp to 1 kbp compared with short-read sequence data. Although a large fraction of genetic variants were not detected by short-read approaches, once the alternate allele is sequence-resolved, we show that 61% of SVs can be genotyped in short-read sequence data sets with high accuracy. Uncoupling discovery from genotyping thus allows for the majority of this missed common variation to be genotyped in the human population. Interestingly, when we repeat SV detection on a pseudodiploid genome constructed in silico by merging the two haploids, we find that ∼59% of the heterozygous SVs are no longer detected by SMRT-SV. These results indicate that haploid resolution of long-read sequencing data will significantly increase sensitivity of SV detection.


Subject(s)
Contig Mapping/methods , Genome, Human , Genomic Structural Variation , Haploidy , Sequence Analysis, DNA/methods , Contig Mapping/standards , Human Genome Project , Humans , Sequence Analysis, DNA/standards
8.
PLoS Genet ; 12(4): e1005954, 2016 Apr.
Article in English | MEDLINE | ID: mdl-27082250

ABSTRACT

We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics.


Subject(s)
Bass/genetics , Chromosome Mapping , Animals , Bass/classification , Genome , In Situ Hybridization, Fluorescence , Phylogeny
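The abstract above reports contig and scaffold N50 values. For readers unfamiliar with the statistic, the sketch below computes N50 under its usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name and toy lengths are hypothetical and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that brings the running sum to at least
        # half of the total length defines the N50.
        if running >= half_total:
            return length


# Toy contig lengths (made up): total 300, so half is 150.
# 80 + 70 = 150 reaches the halfway point, giving N50 = 70.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70
```

A higher N50 means that more of the assembly sits in long sequences, which is why the statistic is commonly quoted as a measure of contiguity.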
9.
Nat Commun ; 6: 7828, 2015 Jul 28.
Article in English | MEDLINE | ID: mdl-26215614

ABSTRACT

Non-typeable Haemophilus influenzae contains an N(6)-adenine DNA-methyltransferase (ModA) that is subject to phase-variable expression (random ON/OFF switching). Five modA alleles, modA2, modA4, modA5, modA9 and modA10, account for over two-thirds of clinical otitis media isolates surveyed. Here, we use single molecule, real-time (SMRT) methylome analysis to identify the DNA-recognition motifs for all five of these modA alleles. Phase variation of these alleles regulates multiple proteins including vaccine candidates, and key virulence phenotypes such as antibiotic resistance (modA2, modA5, modA10), biofilm formation (modA2) and immunoevasion (modA4). Analyses of a modA2 strain in the chinchilla model of otitis media show a clear selection for ON switching of modA2 in the middle ear. Our results indicate that a biphasic epigenetic switch can control bacterial virulence, immunoevasion and niche adaptation in an animal model system.


Subject(s)
Adaptation, Physiological/genetics , DNA Methylation/genetics , DNA, Bacterial/genetics , Epigenesis, Genetic , Haemophilus influenzae/genetics , Immune Evasion/genetics , Site-Specific DNA-Methyltransferase (Adenine-Specific)/genetics , Alleles , Animals , Base Sequence , Biofilms , Chinchilla , Disease Models, Animal , Ear, Middle , Haemophilus influenzae/immunology , Haemophilus influenzae/pathogenicity , Molecular Sequence Data , Otitis Media/microbiology , Virulence/genetics
10.
Genome Announc ; 3(3)2015 May 07.
Article in English | MEDLINE | ID: mdl-25953183

ABSTRACT

The complete genome sequence of Bacillus subtilis T30 was determined by SMRT sequencing. The entire genome contains 4,138 predicted genes. The genome carries one intact prophage sequence (37.4 kb) similar to Bacillus phage SPBc2 and one incomplete prophage genome of 39.9 kb similar to Bacillus phage phi105.

11.
PLoS One ; 10(4): e0123639, 2015.
Article in English | MEDLINE | ID: mdl-25860355

ABSTRACT

The methylation of DNA bases plays an important role in numerous biological processes including development, gene expression, and DNA replication. Salmonella is an important foodborne pathogen, and methylation in Salmonella is implicated in virulence. Using single molecule real-time (SMRT) DNA-sequencing, we sequenced and assembled the complete genomes of eleven Salmonella enterica isolates from nine different serovars, and analysed the whole-genome methylation patterns of each genome. We describe 16 distinct N6-methyladenine (m6A) methylated motifs, one N4-methylcytosine (m4C) motif, and one combined m6A-m4C motif. Eight of these motifs are novel, i.e., they have not been previously described. We also identified the methyltransferases (MTases) associated with 13 of the motifs. Some motifs are conserved across all Salmonella serovars tested, while others were found only in a subset of serovars. Eight of the nine serovars contained a unique methylated motif that was not found in any other serovar (most of these motifs were part of Type I restriction modification systems), indicating the high diversity of methylation patterns present in Salmonella.


Subject(s)
DNA Methylation , Epigenomics , Genome, Bacterial , Salmonella enterica/genetics , Base Sequence , Gene Expression Profiling , Methyltransferases/genetics , Nucleotide Motifs
12.
Nucleic Acids Res ; 43(8): 4150-62, 2015 Apr 30.
Article in English | MEDLINE | ID: mdl-25845594

ABSTRACT

Phase variation (random ON/OFF switching) of gene expression is a common feature of host-adapted pathogenic bacteria. Phase variably expressed N(6)-adenine DNA methyltransferases (Mod) alter global methylation patterns resulting in changes in gene expression. These systems constitute phase variable regulons called phasevarions. Neisseria meningitidis phasevarions regulate genes including virulence factors and vaccine candidates, and alter phenotypes including antibiotic resistance. The target site recognized by these Type III N(6)-adenine DNA methyltransferases is not known. Single molecule, real-time (SMRT) methylome analysis was used to identify the recognition site for three key N. meningitidis methyltransferases: ModA11 (exemplified by M.NmeMC58I) (5'-CGYm6AG-3'), ModA12 (exemplified by M.Nme77I, M.Nme18I and M.Nme579II) (5'-ACm6ACC-3') and ModD1 (exemplified by M.Nme579I) (5'-CCm6AGC-3'). Restriction inhibition assays and mutagenesis confirmed the SMRT methylome analysis. The ModA11 site is complex and atypical and is dependent on the type of pyrimidine at the central position, in combination with the bases flanking the core recognition sequence 5'-CGYm6AG-3'. The observed efficiency of methylation in the modA11 strain (MC58) genome ranged from 4.6% at 5'-GCGCm6AGG-3' sites, to 100% at 5'-ACGTm6AGG-3' sites. Analysis of the distribution of modified sites in the respective genomes shows many cases of association with intergenic regions of genes with altered expression due to phasevarion switching.


Subject(s)
Bacterial Proteins/metabolism , Neisseria meningitidis/enzymology , Site-Specific DNA-Methyltransferase (Adenine-Specific)/metabolism , DNA, Bacterial/chemistry , DNA, Bacterial/metabolism , Epigenesis, Genetic , Gene Expression Regulation, Bacterial , Genome, Bacterial , Methylation , Molecular Sequence Data , Neisseria meningitidis/genetics
13.
Nucleic Acids Res ; 43(4): 2102-15, 2015 Feb 27.
Article in English | MEDLINE | ID: mdl-25662217

ABSTRACT

Base J (β-D-glucosyl-hydroxymethyluracil) replaces 1% of T in the Leishmania genome and is only found in telomeric repeats (99%) and in regions where transcription starts and stops. This highly restricted distribution must be co-determined by the thymidine hydroxylases (JBP1 and JBP2) that catalyze the initial step in J synthesis. To determine the DNA sequences recognized by JBP1/2, we used SMRT sequencing of DNA segments inserted into plasmids grown in Leishmania tarentolae. We show that SMRT sequencing recognizes base J in DNA. Leishmania DNA segments that normally contain J also picked up J when present in the plasmid, whereas control sequences did not. Even a segment of only 10 telomeric (GGGTTA) repeats was modified in the plasmid. We show that J modification usually occurs at pairs of Ts on opposite DNA strands, separated by 12 nucleotides. Modifications occur near G-rich sequences capable of forming G-quadruplexes and JBP2 is needed, as it does not occur in JBP2-null cells. We propose a model whereby de novo J insertion is mediated by JBP2. JBP1 then binds to J and hydroxylates another T 13 bp downstream (but not upstream) on the complementary strand, allowing JBP1 to maintain existing J following DNA replication.


Subject(s)
Glucosides/analysis , Uracil/analogs & derivatives , DNA-Binding Proteins/metabolism , Glucosides/metabolism , Leishmania/genetics , Plasmids/genetics , Protozoan Proteins/metabolism , Sequence Analysis, DNA , Uracil/analysis , Uracil/metabolism
14.
Genome Announc ; 3(6)2015 Dec 31.
Article in English | MEDLINE | ID: mdl-26722012

ABSTRACT

Here, we present the complete genome sequence of Streptomyces sp. strain CCM_MD2014 (phylum Actinobacteria), isolated from surface soil in Woods Hole, MA. Its single linear chromosome of 8,274,043 bp in length has a 72.13% G+C content and contains 6,948 coding sequences.

15.
Genome Announc ; 3(6)2015 Dec 31.
Article in English | MEDLINE | ID: mdl-26722011

ABSTRACT

Here, we present the 3,443,800-bp complete genome sequence of Curtobacterium sp. strain MR_MD2014 (phylum Actinobacteria). This strain was isolated from soil in Woods Hole, MA, as part of the 2014 Microbial Diversity Summer Program at the Marine Biological Laboratory in Woods Hole, MA.

16.
Nature ; 517(7536): 608-11, 2015 Jan 29.
Article in English | MEDLINE | ID: mdl-25383537

ABSTRACT

The human genome is arguably the most complete mammalian reference assembly, yet more than 160 euchromatic gaps remain and aspects of its structural variation remain poorly understood ten years after its completion. To identify missing sequence and genetic variation, here we sequence and analyse a haploid human genome (CHM1) using single-molecule, real-time DNA sequencing. We close or extend 55% of the remaining interstitial gaps in the human GRCh37 reference genome--78% of which carried long runs of degenerate short tandem repeats, often several kilobases in length, embedded within (G+C)-rich genomic regions. We resolve the complete sequence of 26,079 euchromatic structural variants at the base-pair level, including inversions, complex insertions and long tracts of tandem repeats. Most have not been previously reported, with the greatest increases in sensitivity occurring for events less than 5 kilobases in size. Compared to the human reference, we find a significant insertional bias (3:1) in regions corresponding to complex insertions and long short tandem repeats. Our results suggest a greater complexity of the human genome in the form of variation of longer and more complex repetitive DNA that can now be largely resolved with the application of this longer-read sequencing technology.


Subject(s)
Genetic Variation/genetics , Genome, Human/genetics , Genomics , Sequence Analysis, DNA/methods , Chromosome Inversion/genetics , Chromosomes, Human, Pair 10/genetics , Cloning, Molecular , GC Rich Sequence/genetics , Haploidy , Humans , Mutagenesis, Insertional/genetics , Reference Standards , Tandem Repeat Sequences/genetics
17.
Nat Commun ; 5: 5055, 2014 Sep 30.
Article in English | MEDLINE | ID: mdl-25268848

ABSTRACT

Streptococcus pneumoniae (the pneumococcus) is the world's foremost bacterial pathogen in both morbidity and mortality. Switching between phenotypic forms (or 'phases') that favour asymptomatic carriage or invasive disease was first reported in 1933. Here, we show that the underlying mechanism for such phase variation consists of genetic rearrangements in a Type I restriction-modification system (SpnD39III). The rearrangements generate six alternative specificities with distinct methylation patterns, as defined by single-molecule, real-time (SMRT) methylomics. The SpnD39III variants have distinct gene expression profiles. We demonstrate distinct virulence in experimental infection and in vivo selection for switching between SpnD39III variants. SpnD39III is ubiquitous in pneumococci, indicating an essential role in its biology. Future studies must recognize the potential for switching between these heretofore undetectable, differentiated pneumococcal subpopulations in vitro and in vivo. Similar systems exist in other bacterial genera, indicating the potential for broad exploitation of epigenetic gene regulation.


Subject(s)
Bacterial Proteins/genetics , Epigenesis, Genetic , Pneumococcal Infections/microbiology , Streptococcus pneumoniae/enzymology , Streptococcus pneumoniae/pathogenicity , Animals , Bacterial Proteins/metabolism , DNA Restriction-Modification Enzymes/genetics , DNA Restriction-Modification Enzymes/metabolism , Female , Gene Expression Regulation, Bacterial , Humans , Mice , Mice, Inbred BALB C , Molecular Sequence Data , Streptococcus pneumoniae/genetics , Virulence
18.
Sci Transl Med ; 6(254): 254ra126, 2014 Sep 17.
Article in English | MEDLINE | ID: mdl-25232178

ABSTRACT

Public health officials have raised concerns that plasmid transfer between Enterobacteriaceae species may spread resistance to carbapenems, an antibiotic class of last resort, thereby rendering common health care-associated infections nearly impossible to treat. To determine the diversity of carbapenemase-encoding plasmids and assess their mobility among bacterial species, we performed comprehensive surveillance and genomic sequencing of carbapenem-resistant Enterobacteriaceae in the National Institutes of Health (NIH) Clinical Center patient population and hospital environment. We isolated a repertoire of carbapenemase-encoding Enterobacteriaceae, including multiple strains of Klebsiella pneumoniae, Klebsiella oxytoca, Escherichia coli, Enterobacter cloacae, Citrobacter freundii, and Pantoea species. Long-read genome sequencing with full end-to-end assembly revealed that these organisms carry the carbapenem resistance genes on a wide array of plasmids. K. pneumoniae and E. cloacae isolated simultaneously from a single patient harbored two different carbapenemase-encoding plasmids, indicating that plasmid transfer between organisms was unlikely within this patient. We did, however, find evidence of horizontal transfer of carbapenemase-encoding plasmids between K. pneumoniae, E. cloacae, and C. freundii in the hospital environment. Our data, including full plasmid identification, challenge assumptions about horizontal gene transfer events within patients and identify possible connections between patients and the hospital environment. In addition, we identified a new carbapenemase-encoding plasmid of potentially high clinical impact carried by K. pneumoniae, E. coli, E. cloacae, and Pantoea species, in unrelated patients and in the hospital environment.


Subject(s)
Bacterial Proteins/biosynthesis , Cross Infection , Enterobacteriaceae/enzymology , Plasmids , beta-Lactamases/biosynthesis , Enterobacteriaceae/classification , Enterobacteriaceae/genetics , Hospitals, Public , Humans , National Institutes of Health (U.S.) , Population Surveillance , Real-Time Polymerase Chain Reaction , United States
19.
FASEB J ; 28(12): 5197-207, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25183669

ABSTRACT

Moraxella catarrhalis is a significant cause of otitis media and exacerbations of chronic obstructive pulmonary disease. Here, we characterize a phase-variable DNA methyltransferase (ModM), which contains 5'-CAAC-3' repeats in its open reading frame that mediate high-frequency mutation resulting in reversible ON/OFF switching of ModM expression. Three modM alleles have been identified (modM1-3), with modM2 being the most commonly found allele. Using single-molecule, real-time (SMRT) genome sequencing and methylome analysis, we have determined that the ModM2 methylation target is 5'-GAR(m6)AC-3', and 100% of these sites are methylated in the genome of the M. catarrhalis 25239 ModM2 ON strain. Proteomic analysis of ModM2 ON and OFF variants revealed that ModM2 regulates expression of multiple genes that have potential roles in colonization, infection, and protection against host defenses. Investigation of the distribution of modM alleles in a panel of M. catarrhalis strains, isolated from the nasopharynx of healthy children or middle ear effusions from patients with otitis media, revealed a statistically significant association of modM3 with otitis media isolates. The modulation of gene expression via the ModM phase-variable regulon (phasevarion), and the significant association of the modM3 allele with otitis media, suggest a key role for ModM phasevarions in the pathogenesis of this organism.


Subject(s)
DNA Modification Methylases/metabolism , Moraxella catarrhalis/pathogenicity , Moraxellaceae Infections/microbiology , Otitis Media/microbiology , Amino Acid Sequence , DNA Modification Methylases/chemistry , DNA Primers , Humans , Mass Spectrometry , Molecular Sequence Data , Moraxellaceae Infections/enzymology , Otitis Media/enzymology , Polymerase Chain Reaction , Sequence Homology, Amino Acid
20.
Antimicrob Agents Chemother ; 58(10): 5947-53, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25070096

ABSTRACT

The whole-genome sequence of a carbapenem-resistant Klebsiella pneumoniae strain, PittNDM01, which coproduces NDM-1 and OXA-232 carbapenemases, was determined in this study. The use of single-molecule, real-time (SMRT) sequencing provided a closed genome in a single sequencing run. K. pneumoniae PittNDM01 has a single chromosome of 5,348,284 bp and four plasmids: pPKPN1 (283,371 bp), pPKPN2 (103,694 bp), pPKPN3 (70,814 bp), and pPKPN4 (6,141 bp). The contents of the chromosome were similar to those of the K. pneumoniae reference genome strain MGH 78578, with the exception of a large inversion spanning 23.3% of the chromosome. In contrast, three of the four plasmids are unique. The plasmid pPKPN1, an IncHI1B-like plasmid, carries the blaNDM-1, armA, and qnrB1 genes, along with tellurium and mercury resistance operons. blaNDM-1 is carried on a unique structure in which Tn125 is further bracketed by IS26 downstream of a class 1 integron. The IncFIA-like plasmid pPKPN3 also carries an array of resistance elements, including blaCTX-M-15 and a mercury resistance operon. The ColE-type plasmid pPKPN4 carrying blaOXA-232 is identical to a plasmid previously reported from France. SMRT sequencing was useful in resolving the complex bacterial genomic structures in the de novo assemblies.


Subject(s)
Bacterial Proteins/metabolism , Klebsiella pneumoniae/enzymology , Klebsiella pneumoniae/genetics , beta-Lactamases/metabolism , Anti-Bacterial Agents/pharmacology , Bacterial Proteins/genetics , Genome, Bacterial/genetics , Klebsiella pneumoniae/drug effects , Microbial Sensitivity Tests , Operon/genetics , Plasmids/genetics , beta-Lactamases/genetics