1.
Bioinformatics ; 38(18): 4255-4263, 2022 09 15.
Article in English | MEDLINE | ID: mdl-35866989

ABSTRACT

MOTIVATION: Genome sequencing experiments have revolutionized molecular biology by allowing researchers to identify important DNA-encoded elements genome wide. Regions where these elements are found appear as peaks in the analog signal of an assay's coverage track, and despite the ease with which humans can visually categorize these patterns, the size of many genomes necessitates algorithmic implementations. Commonly used methods focus on statistical tests to classify peaks, discounting that the background signal does not completely follow any known probability distribution and reducing the information-dense peak shapes to simply maximum height. Deep learning has been shown to be highly accurate for many pattern recognition tasks, on par or even exceeding human capabilities, providing an opportunity to reimagine and improve peak calling. RESULTS: We present the peak calling framework LanceOtron, which combines deep learning for recognizing peak shape with multifaceted enrichment calculations for assessing significance. In benchmarking ATAC-seq, ChIP-seq and DNase-seq, LanceOtron outperforms long-standing, gold-standard peak callers through its improved selectivity and near-perfect sensitivity. AVAILABILITY AND IMPLEMENTATION: A fully featured web application is freely available from LanceOtron.molbiol.ox.ac.uk, command line interface via python is pip installable from PyPI at https://pypi.org/project/lanceotron/, and source code and benchmarking tests are available at https://github.com/LHentges/LanceOtron. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Deep Learning , Humans , Sequence Analysis, DNA/methods , Software , Chromatin Immunoprecipitation Sequencing , Base Sequence , High-Throughput Nucleotide Sequencing/methods
2.
Genome Res ; 28(9): 1395-1404, 2018 09.
Article in English | MEDLINE | ID: mdl-30049790

ABSTRACT

Current methods struggle to reconstruct and visualize the genomic relationships of large numbers of bacterial genomes. GrapeTree facilitates the analyses of large numbers of allelic profiles by a static "GrapeTree Layout" algorithm that supports interactive visualizations of large trees within a web browser window. GrapeTree also implements a novel minimum spanning tree algorithm (MSTree V2) to reconstruct genetic relationships despite high levels of missing data. GrapeTree is a stand-alone package for investigating phylogenetic trees plus associated metadata and is also integrated into EnteroBase to facilitate cutting edge navigation of genomic relationships among bacterial pathogens.
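The idea of building a minimum spanning tree from allelic profiles while tolerating missing data can be sketched as follows. This is a plain Prim's algorithm over a Hamming-style distance that skips missing loci; it is not MSTree V2, whose details the abstract does not give. The profile names, allele numbers and the helper functions `allelic_distance` and `minimum_spanning_tree` are hypothetical.

```python
def allelic_distance(a, b):
    """Count loci whose alleles differ, skipping loci missing (None) in either profile."""
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def minimum_spanning_tree(profiles):
    """Prim's algorithm; returns (parent, child, distance) edges in the order added."""
    names = sorted(profiles)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge joining a tree node to a node outside the tree.
        # Ties are broken by node name so the result is deterministic.
        d, u, v = min(
            (allelic_distance(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, d))
        in_tree.add(v)
    return edges


# Hypothetical four-locus allelic profiles; None marks a missing allele call.
profiles = {
    "ST1": [1, 1, 1, 1],
    "ST2": [1, 1, 2, 1],
    "ST3": [1, 1, 2, 3],
    "ST4": [1, None, 1, 1],
}
tree = minimum_spanning_tree(profiles)
```

Skipping missing loci is the simplest possible treatment and makes incomplete profiles look artificially close to their neighbours, which is presumably one of the problems a dedicated algorithm has to address.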


Subject(s)
Bacteria/genetics , DNA Barcoding, Taxonomic/methods , Genome, Bacterial , Phylogeny , Software , Alleles , Bacteria/classification , Bacteria/pathogenicity
3.
PLoS Genet ; 14(4): e1007261, 2018 04.
Article in English | MEDLINE | ID: mdl-29621240

ABSTRACT

For many decades, Salmonella enterica has been subdivided by serological properties into serovars or further subdivided for epidemiological tracing by a variety of diagnostic tests with higher resolution. Recently, it has been proposed that so-called eBurst groups (eBGs) based on the alleles of seven housekeeping genes (legacy multilocus sequence typing [MLST]) corresponded to natural populations and could replace serotyping. However, this approach lacks the resolution needed for epidemiological tracing and the existence of natural populations had not been validated by independent criteria. Here, we describe EnteroBase, a web-based platform that assembles draft genomes from Illumina short reads in the public domain or that are uploaded by users. EnteroBase implements legacy MLST as well as ribosomal gene MLST (rMLST), core genome MLST (cgMLST), and whole genome MLST (wgMLST) and currently contains over 100,000 assembled genomes from Salmonella. It also provides graphical tools for visual interrogation of these genotypes and those based on core single nucleotide polymorphisms (SNPs). eBGs based on legacy MLST are largely consistent with eBGs based on rMLST, thus demonstrating that these correspond to natural populations. rMLST also facilitated the selection of representative genotypes for SNP analyses of the entire breadth of diversity within Salmonella. In contrast, cgMLST provides the resolution needed for epidemiological investigations. These observations show that genomic genotyping, with the assistance of EnteroBase, can be applied at all levels of diversity within the Salmonella genus.


Subject(s)
Databases, Genetic , Genome, Bacterial , Salmonella/classification , Salmonella/genetics , Multilocus Sequence Typing , Phylogeny , Polymorphism, Single Nucleotide
5.
Commun Biol ; 4(1): 623, 2021 05 25.
Article in English | MEDLINE | ID: mdl-34035422

ABSTRACT

Tracking and understanding data quality, analysis and reproducibility are critical concerns in the biological sciences. This is especially true in genomics where next generation sequencing (NGS) based technologies such as ChIP-seq, RNA-seq and ATAC-seq are generating a flood of genome-scale data. However, such data are usually processed with automated tools and pipelines, generating tabular outputs and static visualisations. Interpretation is normally made at a high level without the ability to visualise the underlying data in detail. Conventional genome browsers are limited to browsing single locations and do not allow for interactions with the dataset as a whole. Multi Locus View (MLV), a web-based tool, has been developed to allow users to fluidly interact with genomics datasets at multiple scales. The user is able to browse the raw data, cluster it, combine it with other analyses and annotate it. User datasets can then be shared with other users or made public for quick assessment by the academic community. MLV is publicly available at https://mlv.molbiol.ox.ac.uk .


Subject(s)
Sequence Analysis, DNA/methods , Chromatin Immunoprecipitation Sequencing/methods , Computational Biology/methods , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Internet , Numerical Analysis, Computer-Assisted , RNA-Seq/methods , Reproducibility of Results , Sequence Analysis, RNA/methods , Software
6.
Microbiology (Reading) ; 156(Pt 5): 1439-1447, 2010 May.
Article in English | MEDLINE | ID: mdl-20110303

ABSTRACT

In plant-pathogenic fungi, the pmk1 mitogen-activated protein kinase (MAPK) signalling pathway plays an essential role in regulating the development of penetration structures and the sensing of host-derived cues, but its role in other pathosystems such as fungal-fungal interactions is less clear. We report the use of a gene disruption strategy to investigate the pmk1-like MAPK, Lf pmk1 in the development of Lecanicillium fungicola (formerly Verticillium fungicola) infection on the cultivated mushroom Agaricus bisporus. Lf pmk1 was isolated using a degenerate PCR-based approach and was shown to be present in a single copy by Southern blot analysis. Quantitative RT-PCR showed the transcript to be fivefold upregulated in cap lesions compared with pure culture. Agrobacterium-mediated targeted disruption was used to delete a central portion of the Lf pmk1 gene. The resulting mutants showed normal symptom development as assessed by A. bisporus mushroom cap assays, sporulation patterns were normal and there were no apparent changes in overall growth rates. Our results indicate that, unlike the situation in fungal-plant pathogens, the pmk1-like MAPK pathway is not required for virulence in the fungal-fungal interaction between the L. fungicola pathogen and A. bisporus host. This observation may be of wider significance in other fungal-fungal and/or fungal-invertebrate interactions.


Subject(s)
Agaricus/physiology , Fungal Proteins/physiology , Mitogen-Activated Protein Kinases/physiology , Verticillium/enzymology , Verticillium/pathogenicity , Blotting, Southern , Fungal Proteins/genetics , Fungal Proteins/isolation & purification , Genes, Fungal , Mitogen-Activated Protein Kinases/genetics , Mitogen-Activated Protein Kinases/isolation & purification , Phenotype , Polymerase Chain Reaction , Reverse Transcriptase Polymerase Chain Reaction , Transformation, Genetic , Verticillium/genetics , Virulence
7.
New Phytol ; 187(2): 343-354, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20487312

ABSTRACT

SUMMARY: Strigolactones are considered a novel class of plant hormones that, in addition to their endogenous signalling function, are exuded into the rhizosphere acting as a signal to stimulate hyphal branching of arbuscular mycorrhizal (AM) fungi and germination of root parasitic plant seeds. Considering the importance of the strigolactones and their biosynthetic origin (from carotenoids), we investigated the relationship with the plant hormone abscisic acid (ABA). Strigolactone production and ABA content in the presence of specific inhibitors of oxidative carotenoid cleavage enzymes and in several tomato ABA-deficient mutants were analysed by LC-MS/MS. In addition, the expression of two genes involved in strigolactone biosynthesis was studied. The carotenoid cleavage dioxygenase (CCD) inhibitor D2 reduced strigolactone but not ABA content of roots. However, in plants treated with abamineSG, an inhibitor of 9-cis-epoxycarotenoid dioxygenase (NCED), and in the ABA mutants notabilis, sitiens and flacca, ABA and strigolactones were greatly reduced. The reduction in strigolactone production correlated with the downregulation of LeCCD7 and LeCCD8 genes in all three mutants. The results show a correlation between ABA levels and strigolactone production, and suggest a role for ABA in the regulation of strigolactone biosynthesis.


Subject(s)
Abscisic Acid/metabolism , Lactones/metabolism , Abscisic Acid/biosynthesis , Biosynthetic Pathways/drug effects , Carotenoids/metabolism , Chromatography, Liquid , Enzyme Inhibitors/pharmacology , Gene Expression Regulation, Plant/drug effects , Genes, Plant/genetics , Germination/drug effects , Solanum lycopersicum/drug effects , Solanum lycopersicum/genetics , Solanum lycopersicum/metabolism , Mass Spectrometry , Mutation/genetics , Orobanche/drug effects , Orobanche/growth & development , Orobanche/metabolism , Phosphates/deficiency , Phosphates/metabolism , Plant Exudates/metabolism , Plant Roots/drug effects , Plant Roots/metabolism , Plant Shoots/metabolism , Reverse Transcriptase Polymerase Chain Reaction
8.
BMC Mol Biol ; 9: 66, 2008 Jul 23.
Article in English | MEDLINE | ID: mdl-18651954

ABSTRACT

BACKGROUND: The vast quantities of gene expression profiling data produced in microarray studies, and the more precise quantitative PCR, are often not statistically analysed to their full potential. Previous studies have summarised gene expression profiles using simple descriptive statistics, basic analysis of variance (ANOVA) and the clustering of genes based on simple models fitted to their expression profiles over time. We report the novel application of statistical non-linear regression modelling techniques to describe the shapes of expression profiles for the fungus Agaricus bisporus, quantified by PCR, and for E. coli and Rattus norvegicus, using microarray technology. The use of parametric non-linear regression models provides a more precise description of expression profiles, reducing the "noise" of the raw data to produce a clear "signal" given by the fitted curve, and describing each profile with a small number of biologically interpretable parameters. This approach then allows the direct comparison and clustering of the shapes of response patterns between genes and potentially enables a greater exploration and interpretation of the biological processes driving gene expression. RESULTS: Quantitative reverse transcriptase PCR-derived time-course data of genes were modelled. "Split-line" or "broken-stick" regression identified the initial time of gene up-regulation, enabling the classification of genes into those with primary and secondary responses. Five-day profiles were modelled using the biologically-oriented, critical exponential curve, y(t) = A + (B + Ct)R^t + ε. This non-linear regression approach allowed the expression patterns for different genes to be compared in terms of curve shape, time of maximal transcript level and the decline and asymptotic response levels. Three distinct regulatory patterns were identified for the five genes studied.
Applying the regression modelling approach to microarray-derived time course data allowed 11% of the Escherichia coli features to be fitted by an exponential function, and 25% of the Rattus norvegicus features could be described by the critical exponential model, all with statistical significance of p < 0.05. CONCLUSION: The statistical non-linear regression approaches presented in this study provide detailed biologically oriented descriptions of individual gene expression profiles, using biologically variable data to generate a set of defining parameters. These approaches have application to the modelling and greater interpretation of profiles obtained across a wide range of platforms, such as microarrays. Through careful choice of appropriate model forms, such statistical regression approaches allow an improved comparison of gene expression profiles, and may provide an approach for the greater understanding of common regulatory mechanisms between genes.
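As a worked illustration of the critical exponential curve named in the abstract, the sketch below evaluates y(t) = A + (B + Ct)R^t and derives the time of maximal response analytically. It assumes 0 < R < 1 and C > 0, in which case setting the derivative to zero gives t* = -1/ln(R) - B/C and the curve decays to the asymptote A. The parameter values and the function names `critical_exponential` and `time_of_peak` are hypothetical; the noise term ε is omitted and no model fitting is attempted.

```python
import math


def critical_exponential(t, A, B, C, R):
    """Noise-free critical exponential curve: y(t) = A + (B + C*t) * R**t."""
    return A + (B + C * t) * R ** t


def time_of_peak(B, C, R):
    """Time of the maximum, from d/dt[(B + C*t) * R**t] = 0.

    The stationary point t* = -1/ln(R) - B/C is a maximum only when
    0 < R < 1 and C > 0, so other parameter values are rejected here.
    """
    if not (0 < R < 1) or C <= 0:
        raise ValueError("a peak requires 0 < R < 1 and C > 0")
    return -1.0 / math.log(R) - B / C


# Hypothetical parameters: asymptote A = 1, no initial offset, halving each day.
A, B, C, R = 1.0, 0.0, 2.0, 0.5
t_star = time_of_peak(B, C, R)
peak_level = critical_exponential(t_star, A, B, C, R)
```

Reading the parameters this way is what makes the model "biologically interpretable": A is the level the transcript settles to, while B, C and R together fix when and how sharply it peaks.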


Subject(s)
Gene Expression Profiling/statistics & numerical data , Gene Expression Regulation/genetics , Models, Statistical , Agaricus/genetics , Animals , Blotting, Northern , Escherichia coli/genetics , Genes/genetics , Kinetics , Oligonucleotide Array Sequence Analysis , RNA, Messenger/analysis , Rats/genetics , Regression Analysis
9.
Sci Rep ; 8(1): 4678, 2018 03 16.
Article in English | MEDLINE | ID: mdl-29549276

ABSTRACT

There is growing concern about the spreading of human microorganisms in relatively untouched ecosystems such as the Antarctic region. For this reason, three pinniped species (Leptonychotes weddellii, Mirounga leonina and Arctocephalus gazella) from the west coast of the Antarctic Peninsula were analysed for the presence of Escherichia spp. with the recovery of 158 E. coli and three E. albertii isolates. From those, 23 harboured different eae variants (α1, β1, β2, ε1, θ1, κ, ο), including a bfpA-positive isolate (O49:H10-A-ST206, eae-k) classified as typical enteropathogenic E. coli. Notably, 62 of the 158 E. coli isolates (39.2%) exhibited the ExPEC status and 27 (17.1%) belonged to sequence types (ST) frequently occurring among urinary/bacteremia ExPEC clones: ST12, ST73, ST95, ST131 and ST141. We found similarities >85% within the PFGE-macrorestriction profiles of pinniped and human clinical O2:H6-B2-ST141 and O16:H5/O25b:H4-B2-ST131 isolates. The in silico analysis of ST131 Cplx genomes from the three pinnipeds (five O25:H4-ST131/PST43-fimH22-virotype D; one O16:H5-ST131/PST506-fimH41; one O25:H4-ST6252/PST9-fimH22-virotype D1) identified IncF and IncI1 plasmids and revealed high core-genome similarities between pinniped and human isolates (H22 and H41 subclones). This is the first study to demonstrate the worrisome presence of human-associated E. coli clonal groups, including ST131, in Antarctic pinnipeds.


Subject(s)
Bacterial Typing Techniques/methods , Caniformia/microbiology , DNA, Bacterial/genetics , Escherichia coli Infections/veterinary , Escherichia coli/classification , Animals , Antarctic Regions , Ecosystem , Escherichia coli/genetics , Escherichia coli Infections/microbiology , Humans , Molecular Epidemiology , Molecular Typing , Phylogeny
10.
Curr Biol ; 28(15): 2420-2428.e10, 2018 08 06.
Article in English | MEDLINE | ID: mdl-30033331

ABSTRACT

Salmonella enterica serovar Paratyphi C causes enteric (paratyphoid) fever in humans. Its presentation can range from asymptomatic infections of the blood stream to gastrointestinal or urinary tract infection or even a fatal septicemia [1]. Paratyphi C is very rare in Europe and North America except for occasional travelers from South and East Asia or Africa, where the disease is more common [2, 3]. However, early 20th-century observations in Eastern Europe [3, 4] suggest that Paratyphi C enteric fever may once have had a wide-ranging impact on human societies. Here, we describe a draft Paratyphi C genome (Ragna) recovered from the 800-year-old skeleton (SK152) of a young woman in Trondheim, Norway. Paratyphi C sequences were recovered from her teeth and bones, suggesting that she died of enteric fever and demonstrating that these bacteria have long caused invasive salmonellosis in Europeans. Comparative analyses against modern Salmonella genome sequences revealed that Paratyphi C is a clade within the Para C lineage, which also includes serovars Choleraesuis, Typhisuis, and Lomita. Although Paratyphi C only infects humans, Choleraesuis causes septicemia in pigs and boar [5] (and occasionally humans), and Typhisuis causes epidemic swine salmonellosis (chronic paratyphoid) in domestic pigs [2, 3]. These different host specificities likely evolved in Europe over the last ∼4,000 years since the time of their most recent common ancestor (tMRCA) and are possibly associated with the differential acquisitions of two genomic islands, SPI-6 and SPI-7. The tMRCAs of these bacterial clades coincide with the timing of pig domestication in Europe [6].


Subject(s)
DNA, Ancient/analysis , DNA, Bacterial/analysis , Genomic Instability , Salmonella enterica/genetics , Typhoid Fever/microbiology , Female , Genomic Islands , Humans , Norway
11.
Chem Commun (Camb) ; (27): 2808-10, 2007 Jul 19.
Article in English | MEDLINE | ID: mdl-17609783

ABSTRACT

We show that the use of multiple photochemistries is necessary to ensure diverse immobilisation of small molecules for binding of polypeptides using phage display and antibody libraries.


Subject(s)
Drug Design , Peptide Library , Peptides/metabolism , Photochemistry
12.
Front Plant Sci ; 8: 357, 2017.
Article in English | MEDLINE | ID: mdl-28373878

ABSTRACT

Abscisic acid (ABA) inhibits seed germination and the regulation of ABA biosynthesis has a role in maintenance of seed dormancy. The key rate-limiting step in ABA biosynthesis is catalyzed by 9-cis-epoxycarotenoid dioxygenase (NCED). Two hydroxamic acid inhibitors of carotenoid cleavage dioxygenase (CCD), D4 and D7, previously found to inhibit CCD and NCED in vitro, are shown to have the novel property of decreasing mean germination time of tomato (Solanum lycopersicum L.) seeds constitutively overexpressing LeNCED1. Post-germination, D4 exhibited no negative effects on tomato seedling growth in terms of height, dry weight, and fresh weight. Tobacco (Nicotiana tabacum L.) seeds containing a tetracycline-inducible LeNCED1 transgene were used to show that germination could be negatively and positively controlled through the chemical induction of gene expression and the chemical inhibition of the NCED protein: application of tetracycline increased mean germination time and delayed hypocotyl emergence in a similar manner to that observed when exogenous ABA was applied and this was reversed by D4 when NCED expression was induced at intermediate levels. D4 also improved germination in lettuce (Lactuca sativa L.) seeds under thermoinhibitory temperatures and in tomato seeds imbibed in high osmolarity solutions of polyethylene glycol. D4 reduced ABA and dihydrophaseic acid accumulation in tomato seeds overexpressing LeNCED1 and reduced ABA accumulation in wild type tomato seeds imbibed on polyethylene glycol. The evidence supports a mode of action of D4 through NCED inhibition, and this molecule provides a lead compound for the design of NCED inhibitors with greater specificity and potency.

13.
Genome Announc ; 4(3)2016 May 26.
Article in English | MEDLINE | ID: mdl-27231374

ABSTRACT

The chicken is the most common domesticated animal and the most abundant bird in the world. However, the chicken gut is home to many previously uncharacterized bacterial taxa. Here, we report draft genome sequences from six bacterial isolates from chicken ceca, all of which fall outside any named species.

14.
G3 (Bethesda) ; 5(5): 971-81, 2015 Mar 24.
Article in English | MEDLINE | ID: mdl-25809074

ABSTRACT

A recombinant in-bred line population derived from a cross between Solanum lycopersicum var. cerasiforme (E9) and S. pimpinellifolium (L5) has been used extensively to discover quantitative trait loci (QTL), including those that act via rootstock genotype; however, high-resolution single-nucleotide polymorphism genotyping data for this population are not yet publicly available. Next-generation resequencing of parental lines allows the vast majority of polymorphisms to be characterized and used to progress from QTL to causative gene. We sequenced E9 and L5 genomes to 40- and 44-fold depth, respectively, and reads were mapped to the reference Heinz 1706 genome. In L5 there were three clear regions on chromosome 1, chromosome 4, and chromosome 8 with increased rates of polymorphism. Two other regions were highly polymorphic when we compared Heinz 1706 with both E9 and L5 on chromosome 1 and chromosome 10, suggesting that the reference sequence contains a divergent introgression in these locations. We also identified a region on chromosome 4 consistent with an introgression from S. pimpinellifolium into Heinz 1706. A large dataset of polymorphisms for use in fine-mapping QTL in a specific tomato recombinant in-bred line population was created, including a high density of InDels validated as simple size-based polymerase chain reaction markers. By careful filtering and interpretation of the output of the SnpEff prediction tool, we have created a list of genes that are predicted to have highly perturbed protein functions in the E9 and L5 parental lines.


Subject(s)
Genome, Plant , INDEL Mutation , Plant Proteins/genetics , Plant Proteins/metabolism , Recombination, Genetic , Solanum/genetics , Chromosome Mapping , Crosses, Genetic , Frameshift Mutation , Gene Frequency , Genetics, Population , Genomics/methods , High-Throughput Nucleotide Sequencing , Inbreeding , Open Reading Frames , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Solanum/metabolism
15.
PLoS Negl Trop Dis ; 9(6): e0003861, 2015.
Article in English | MEDLINE | ID: mdl-26114287

ABSTRACT

BACKGROUND: Several infectious diseases and therapeutic interventions cause gut microbe dysbiosis and associated pathology. We characterised the gut microbiome of children exposed to the helminth Schistosoma haematobium pre- and post-treatment with the drug praziquantel (PZQ), with the aim to compare the gut microbiome structure (abundance and diversity) in schistosome infected vs. uninfected children. METHODS: Stool DNA from 139 children aged six months to 13 years old, with S. haematobium infection prevalence of 27.34%, was extracted at baseline. 12 weeks following antihelminthic treatment with praziquantel, stool DNA was collected from 62 of the 139 children. The 16S rRNA genes were sequenced from the baseline and post-treatment samples and the sequence data were clustered into operational taxonomic units (OTUs). The OTU data were analysed using multivariate analyses and paired T-test. RESULTS: Pre-treatment, the most abundant phyla were Bacteroidetes, followed by Firmicutes and Proteobacteria respectively. The relative abundance of taxa among bacterial classes showed limited variation by age group or sex and the bacterial communities had similar overall compositions. Although there were no overall differences in the microbiome structure across the whole age range, the abundance of 21 OTUs varied significantly with age (FDR<0.05). Some OTUs including Veillonella, Streptococcus, Bacteroides and Helicobacter were more abundant in children ≤ 1 year old compared to older children. Furthermore, the gut microbiome differed in schistosome infected vs. uninfected children, with 27 OTUs occurring in infected but not uninfected children; for 5 of these, all Prevotella, the difference was statistically significant (p <0.05) with FDR <0.05. PZQ treatment did not alter the microbiome structure in infected or uninfected children from that observed at baseline. CONCLUSIONS: There are significant differences in the gut microbiome structure of infected vs. uninfected children and the differences were refractory to PZQ treatment.
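The FDR thresholds quoted in the abstract imply a multiple-testing correction across OTUs. A minimal sketch of one common choice, the Benjamini-Hochberg procedure, is given below; the abstract does not say which FDR method the authors used, so this is an assumption, and the p-values and the function name `benjamini_hochberg` are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-OTU p-values from some abundance comparison.
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
q_values = benjamini_hochberg(p_values)
significant = [q < 0.05 for q in q_values]
```

In this toy input, two OTUs with raw p < 0.05 no longer pass once adjusted, which is why the abstract reports the p-value and FDR thresholds separately.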


Subject(s)
Dysbiosis/etiology , Dysbiosis/pathology , Feces/microbiology , Microbiota/genetics , Praziquantel/therapeutic use , Schistosomiasis haematobia/complications , Schistosomiasis haematobia/drug therapy , Schistosomiasis haematobia/microbiology , Animals , Child , High-Throughput Nucleotide Sequencing , Humans , Multivariate Analysis , Phylogeny , RNA, Ribosomal, 16S/genetics
16.
Nat Commun ; 6: 6717, 2015 Apr 07.
Article in English | MEDLINE | ID: mdl-25848958

ABSTRACT

Tuberculosis (TB) was once a major killer in Europe, but it is unclear how the strains and patterns of infection at 'peak TB' relate to what we see today. Here we describe 14 genome sequences of M. tuberculosis, representing 12 distinct genotypes, obtained from human remains from eighteenth-century Hungary using metagenomics. All our historic genotypes belong to M. tuberculosis Lineage 4. Bayesian phylogenetic dating, based on samples with well-documented dates, places the most recent common ancestor of this lineage in the late Roman period. We find that most bodies yielded more than one M. tuberculosis genotype and we document an intimate epidemiological link between infections in two long-dead individuals. Our results suggest that metagenomic approaches usefully inform detection and characterization of historical and contemporary infections.


Subject(s)
Coinfection/microbiology , DNA, Bacterial/analysis , Genome, Bacterial/genetics , Mycobacterium tuberculosis/genetics , Tuberculosis/microbiology , Adult , Bayes Theorem , Europe/epidemiology , Female , Genotype , History, 18th Century , Humans , Hungary/epidemiology , Male , Metagenomics , Middle Aged , Molecular Epidemiology , Phylogeny , Tuberculosis/epidemiology , Tuberculosis/history , Young Adult
17.
PeerJ ; 2: e585, 2014.
Article in English | MEDLINE | ID: mdl-25279265

ABSTRACT

Tuberculosis remains a major global health problem. Laboratory diagnostic methods that allow effective, early detection of cases are central to management of tuberculosis in the individual patient and in the community. Since the 1880s, laboratory diagnosis of tuberculosis has relied primarily on microscopy and culture. However, microscopy fails to provide species- or lineage-level identification and culture-based workflows for diagnosis of tuberculosis remain complex, expensive, slow, technically demanding and poorly able to handle mixed infections. We therefore explored the potential of shotgun metagenomics, sequencing of DNA from samples without culture or target-specific amplification or capture, to detect and characterise strains from the Mycobacterium tuberculosis complex in smear-positive sputum samples obtained from The Gambia in West Africa. Eight smear- and culture-positive sputum samples were investigated using a differential-lysis protocol followed by a kit-based DNA extraction method, with sequencing performed on a benchtop sequencing instrument, the Illumina MiSeq. The number of sequence reads in each sputum-derived metagenome ranged from 989,442 to 2,818,238. The proportion of reads in each metagenome mapping against the human genome ranged from 20% to 99%. We were able to detect sequences from the M. tuberculosis complex in all eight samples, with coverage of the H37Rv reference genome ranging from 0.002X to 0.7X. By analysing the distribution of large sequence polymorphisms (deletions and the locations of the insertion element IS6110) and single nucleotide polymorphisms (SNPs), we were able to assign seven of eight metagenome-derived genomes to a species and lineage within the M. tuberculosis complex. Two metagenome-derived mycobacterial genomes were assigned to M. africanum, a species largely confined to West Africa; the others that could be assigned belonged to lineages T, H or LAM within the clade of "modern" M. tuberculosis strains. 
We have provided proof of principle that shotgun metagenomics can be used to detect and characterise M. tuberculosis sequences from sputum samples without culture or target-specific amplification or capture, using an accessible benchtop-sequencing platform, the Illumina MiSeq, and relatively simple DNA extraction, sequencing and bioinformatics protocols. In our hands, sputum metagenomics does not yet deliver sufficient depth of coverage to allow sequence-based sensitivity testing; it remains to be determined whether improvements in DNA extraction protocols alone can deliver this or whether culture, capture or amplification steps will be required. Nonetheless, we can foresee a tipping point when a unified automated metagenomics-based workflow might start to compete with the plethora of methods currently in use in the diagnostic microbiology laboratory.
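The coverage figures in the abstract (0.002X to 0.7X) follow from the usual depth-of-coverage arithmetic: mapped bases divided by reference length. The sketch below assumes an H37Rv reference length of 4,411,532 bp and a read length of 150 bp; neither number is stated in the abstract, and the function names `mean_depth` and `reads_for_depth` are hypothetical.

```python
import math

# Assumed reference length for M. tuberculosis H37Rv, in base pairs.
H37RV_LENGTH_BP = 4_411_532


def mean_depth(mapped_reads, read_length_bp, genome_length_bp=H37RV_LENGTH_BP):
    """Mean depth of coverage: total mapped bases divided by genome length."""
    return mapped_reads * read_length_bp / genome_length_bp


def reads_for_depth(target_depth, read_length_bp, genome_length_bp=H37RV_LENGTH_BP):
    """Smallest number of mapped reads that reaches the target mean depth."""
    return math.ceil(target_depth * genome_length_bp / read_length_bp)


# Hypothetical example: 10,000 reads of 150 bp mapping to the reference.
depth = mean_depth(10_000, 150)

# Mapped reads needed to reach the highest coverage reported in the abstract.
needed = reads_for_depth(0.7, 150)
```

Under these assumptions only a small fraction of the roughly one to three million reads per sample needs to map to M. tuberculosis to reach the reported coverage, which is consistent with the high proportion of human reads the abstract describes.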

18.
PLoS One ; 9(3): e91941, 2014.
Article in English | MEDLINE | ID: mdl-24657972

ABSTRACT

Chickens are a major source of food and protein worldwide. Feed conversion and the health of chickens rely on the largely unexplored complex microbial community that inhabits the chicken gut, including the ceca. We have carried out deep microbial community profiling of the microbiota in twenty cecal samples via 16S rRNA gene sequences and an in-depth metagenomics analysis of a single cecal microbiota. We recovered 699 phylotypes, over half of which appear to represent previously unknown species. We obtained 648,251 environmental gene tags (EGTs), the majority of which represent new species. These were binned into over two-dozen draft genomes, which included Campylobacter jejuni and Helicobacter pullorum. We found numerous polysaccharide- and oligosaccharide-degrading enzymes encoded within the metagenome, some of which appeared to be part of polysaccharide utilization systems with genetic evidence for the co-ordination of polysaccharide degradation with sugar transport and utilization. The cecal metagenome encodes several fermentation pathways leading to the production of short-chain fatty acids, including some with novel features. We found a dozen uptake hydrogenases encoded in the metagenome and speculate that these provide major hydrogen sinks within this microbial community and might explain the high abundance of several genera within this microbiome, including Campylobacter, Helicobacter and Megamonas.


Subject(s)
Cecum/microbiology , Chickens/microbiology , Microbiota , Animals , Biodiversity , Hydrogen/metabolism
19.
mBio ; 5(4): e01337-14, 2014 Jul 15.
Article in English | MEDLINE | ID: mdl-25028426

ABSTRACT

Shotgun metagenomics provides a powerful assumption-free approach to the recovery of pathogen genomes from contemporary and historical material. We sequenced the metagenome of a calcified nodule from the skeleton of a 14th-century middle-aged male excavated from the medieval Sardinian settlement of Geridu. We obtained 6.5-fold coverage of a Brucella melitensis genome. Sequence reads from this genome showed signatures typical of ancient or aged DNA. Despite the relatively low coverage, we were able to use information from single-nucleotide polymorphisms to place the medieval pathogen genome within a clade of B. melitensis strains that included the well-studied Ether strain and two other recent Italian isolates. We confirmed this placement using information from deletions and IS711 insertions. We conclude that metagenomics stands ready to document past and present infections, shedding light on the emergence, evolution, and spread of microbial pathogens. Importance: Infectious diseases have shaped human populations and societies throughout history. The recovery of pathogen DNA sequences from human remains provides an opportunity to identify and characterize the causes of individual and epidemic infections. By sequencing DNA extracted from medieval human remains through shotgun metagenomics, without target-specific capture or amplification, we have obtained a draft genome sequence of an ~700-year-old Brucella melitensis strain. Using a variety of bioinformatic approaches, we have shown that this historical strain is most closely related to recent strains isolated from Italy, confirming the continuity of this zoonotic infection, and even a specific lineage, in the Mediterranean region over the centuries.


Subject(s)
Brucella melitensis/genetics , Genome, Bacterial/genetics , Metagenomics/methods
20.
PLoS One ; 7(5): e38094, 2012.
Article in English | MEDLINE | ID: mdl-22666455

ABSTRACT

The analysis of 16S-rDNA sequences to assess the bacterial community composition of a sample is a widely used technique that has increased with the advent of high throughput sequencing. Although considerable effort has been devoted to identifying the most informative region of the 16S gene and the optimal informatics procedures to process the data, little attention has been paid to the PCR step, in particular annealing temperature and primer length. To address this, amplicons derived from 16S-rDNA were generated from chicken caecal content DNA using different annealing temperatures, primers and different DNA extraction procedures. The amplicons were pyrosequenced to determine the optimal protocols for capture of maximum bacterial diversity from a chicken caecal sample. Even at very low annealing temperatures there was little effect on the community structure, although the abundance of some OTUs such as Bifidobacterium increased. Using shorter primers did not reveal any novel OTUs but did change the community profile obtained. Mechanical disruption of the sample by bead beating had a significant effect on the results obtained, as did repeated freezing and thawing. In conclusion, existing primers and standard annealing temperatures captured as much diversity as lower annealing temperatures and shorter primers.


Subject(s)
DNA Primers/genetics , DNA/genetics , DNA/isolation & purification , High-Throughput Nucleotide Sequencing/methods , RNA, Ribosomal, 16S/genetics , Sequence Analysis, RNA/methods , Temperature , Animals , Cecum , Chickens