ABSTRACT
The low costs of array-synthesized oligonucleotide libraries are empowering rapid advances in quantitative and synthetic biology. However, high synthesis error rates, uneven representation, and lack of access to individual oligonucleotides limit the true potential of these libraries. We have developed a cost-effective method called Recombinase Directed Indexing (REDI), which involves integration of a complex library into yeast, site-specific recombination to index library DNA, and next-generation sequencing to identify desired clones. We used REDI to generate a library of ~3,300 DNA probes that exhibited > 96% purity and remarkable uniformity (> 95% of probes within twofold of the median abundance). Additionally, we created a collection of ~9,000 individually accessible CRISPR interference yeast strains for > 99% of genes required for either fermentative or respiratory growth, demonstrating the utility of REDI for rapid and cost-effective creation of strain collections from oligonucleotide pools. Our approach is adaptable to any complex DNA library, and fundamentally changes how these libraries can be parsed, maintained, propagated, and characterized.
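The uniformity figure quoted above (share of probes within twofold of the median abundance) can be computed along the following lines. This is a minimal sketch, not the authors' pipeline: the function name and the read counts are hypothetical, and "within twofold" is assumed to mean between half and twice the median, inclusive.

```python
from statistics import median


def fraction_within_twofold(counts):
    """Return the fraction of probes whose count lies within twofold of the median.

    Assumption: "within twofold" means median / 2 <= count <= median * 2.
    """
    med = median(counts)
    within = [c for c in counts if med / 2 <= c <= med * 2]
    return len(within) / len(counts)


# Hypothetical read counts for eight probes (illustrative values only).
example_counts = [100, 120, 90, 210, 45, 105, 98, 400]
uniformity = fraction_within_twofold(example_counts)
print(f"{uniformity:.1%} of probes are within twofold of the median")
```

For a real library, `counts` would hold one sequencing read count per probe.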
Subject(s)
Sequence Analysis, DNA/methods , Yeasts/genetics , CRISPR-Cas Systems , Computational Biology/methods , DNA, Fungal/genetics , Gene Library
ABSTRACT
The vast majority of microscopic life on earth consists of microbes that do not grow in laboratory culture. To profile the microbial diversity in environmental and clinical samples, we have devised and employed molecular probe technology, which detects and identifies bacteria that do and do not grow in culture. The only requirement is a short sequence of contiguous bases (currently 60 bases) unique to the genome of the organism of interest. The procedure is relatively fast, inexpensive, customizable, robust, and culture independent and uses commercially available reagents and instruments. In this communication, we report improving the specificity of the molecular probes substantially and increasing the complexity of the molecular probe set by over an order of magnitude (>1,200 probes) and introduce a new final readout method based upon Illumina sequencing. In addition, we employed molecular probes to identify the bacteria from vaginal swabs and demonstrate how a deliberate selection of molecular probes can identify less abundant bacteria even in the presence of much more abundant species.
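The stated requirement, a short run of contiguous bases unique to the target genome, can be illustrated with a simple set difference over k-mers. This is only a sketch under stated assumptions: the function names are hypothetical, the sequences are toy strings, and it ignores reverse complements, near matches, and hybridization chemistry, all of which a real probe design would have to consider.

```python
def kmers(seq, k):
    """Return the set of all contiguous substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_kmers(target, others, k):
    """Return the k-mers found in target but in none of the other genomes, sorted."""
    background = set()
    for genome in others:
        background |= kmers(genome, k)
    return sorted(kmers(target, k) - background)


# Toy example with k = 3; the abstract describes 60-base sequences.
candidates = unique_kmers("ACGTAC", ["CGTA"], 3)
print(candidates)
```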
Subject(s)
Bacteria/isolation & purification , Molecular Probes/chemistry , Bacteria/classification , Bacteria/genetics , Bacteriological Techniques/methods , DNA, Bacterial/genetics , Oligonucleotide Array Sequence Analysis/methods , Oligonucleotides/chemical synthesis , Polymerase Chain Reaction/methods , Sensitivity and Specificity , Sequence Analysis, DNA
ABSTRACT
BACKGROUND: Our ultimate goal is to detect the entire human microbiome, in health and in disease, in a single reaction tube, and employing only commercially available reagents. To that end, we adapted molecular inversion probes to detect bacteria using solely a massively multiplex molecular technology. This molecular probe technology does not require growth of the bacteria in culture. Rather, the molecular probe technology requires only a sequence of forty sequential bases unique to the genome of the bacterium of interest. In this communication, we report the first results of employing our molecular probes to detect bacteria in clinical samples. RESULTS: While the assay on Affymetrix GenFlex Tag16K arrays allows the multiplexing of the detection of the bacteria in each clinical sample, one Affymetrix GenFlex Tag16K array must be used for each clinical sample. To multiplex the clinical samples, we introduce a second, independent assay for the molecular probes employing Sequencing by Oligonucleotide Ligation and Detection. By adding one unique oligonucleotide barcode for each clinical sample, we combine the samples after processing, but before sequencing, and sequence them together. CONCLUSIONS: Overall, we have employed 192 molecular probes representing 40 bacteria to detect the bacteria in twenty-one vaginal swabs as assessed by the Affymetrix GenFlex Tag16K assay and fourteen of those by the Sequencing by Oligonucleotide Ligation and Detection assay. The correlations among the assays were excellent.
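The sample-barcoding step (one unique oligonucleotide barcode per clinical sample, samples pooled before sequencing) implies that reads are later assigned back to their samples. A minimal sketch of that assignment follows. It assumes, purely for illustration, that the barcode is a fixed-length prefix of each read and must match exactly; the barcode sequences, sample names, and function name are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_len):
    """Group reads by sample using an exact-match barcode prefix.

    Reads whose prefix matches no known barcode are discarded.
    """
    by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode)
        if sample is not None:
            by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical barcodes and pooled reads (illustrative only).
barcodes = {"ACGT": "swab_01", "TGCA": "swab_02"}
pooled = ["ACGTAAAA", "TGCACCCC", "ACGTGGGG", "NNNNTTTT"]
per_sample = demultiplex(pooled, barcodes, barcode_len=4)
print(per_sample)
```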
Subject(s)
Bacteria , Microbiological Techniques/methods , Molecular Probe Techniques , Bacteria/genetics , Bacteria/isolation & purification , Computer Simulation , Female , Humans , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis , Reproducibility of Results , Vagina/microbiology
ABSTRACT
PURPOSE: To determine the vaginal microbiome in women undergoing IVF-ET and investigate correlations with clinical outcomes. METHODS: Thirty patients had blood drawn for estradiol (E(2)) and progesterone (P(4)) at four time points during the IVF-ET cycle and at 4-6 weeks of gestation, if pregnant. Vaginal swabs were obtained in different hormonal milieu, and the vaginal microbiome determined by deep sequencing of the 16S ribosomal RNA gene. RESULTS: The vaginal microbiome underwent a transition during therapy in some but not all patients. Novel bacteria were found in 33% of women tested during the treatment cycle, but not at 6-8 weeks of gestation. Diversity of species varied across different hormonal milieu, and on the day of embryo transfer correlated with outcome (live birth/no live birth). The species diversity index distinguished women who had a live birth from those who did not. CONCLUSIONS: This metagenomics approach has enabled discovery of novel, previously unidentified bacterial species in the human vagina in different hormonal milieu and supports a shift in the vaginal microbiome during IVF-ET therapy using standard protocols. Furthermore, the data suggest that the vaginal microbiome on the day of embryo transfer affects pregnancy outcome.
Subject(s)
Bacteria/classification , Fertilization in Vitro , Metagenome , RNA, Ribosomal, 16S/genetics , Vagina/microbiology , Adult , Embryo Transfer , Estradiol/blood , Female , Humans , Middle Aged , Pregnancy , Pregnancy Outcome , Progesterone/blood
ABSTRACT
We have adapted molecular inversion probe technology to identify microbes in a highly multiplexed procedure. This procedure does not require growth of the microbes. Rather, the technology employs DNA homology twice: once for the molecular probe to hybridize to its homologous DNA and again for the 20-mer oligonucleotide barcode on the molecular probe to hybridize to a commercially available molecular barcode array. As proof of concept, we have designed, tested, and employed 192 molecular probes for 40 microbes. While these particular molecular probes are aimed at our interest in the microbes in the human vagina, this molecular probe method could be employed to identify the microbes in any ecological niche.
Subject(s)
Bacteriological Techniques/methods , Communicable Diseases/diagnosis , Molecular Probe Techniques , Nucleic Acid Hybridization/methods , Polymerase Chain Reaction/methods , Female , Humans , Sensitivity and Specificity , Vagina/microbiology
ABSTRACT
The use of molecular probe technology is demonstrated for routine identification and tracking of cultured and uncultured microorganisms in an activated sludge bioreactor treating domestic wastewater. A key advantage of molecular probe technology is that it can interrogate hundreds of microbial species of interest in a single measurement. In environmental niches where a single family (such as Competibacteraceae) dominates, it can be difficult and expensive to identify microorganisms that are present at low relative abundance. With molecular probe technology, it is straightforward. Members of the Competibacteraceae family, none of which have been grown in pure culture, are abundant in an activated sludge system in the San Francisco Bay Area, California, USA. Molecular probe ensembles with and without Competibacteraceae probes were constructed. Whereas the probe ensemble with Competibacteraceae probes identified a total of ten bacteria, the molecular probe ensemble without Competibacteraceae probes identified 29 bacteria, including many at low relative abundance and including some species of public health significance.
Subject(s)
Molecular Probes , Sewage , Bioreactors , RNA, Ribosomal, 16S , San Francisco , Wastewater
ABSTRACT
BACKGROUND: Proteins of the metalloprotease-disintegrin (ADAM) family are implicated in cell-cell interactions, cell fusion, and cell signaling, and are widely distributed among metazoan phyla. Orthologous relationships have been defined for a few ADAM proteins, including ADAM10 (Kuzbanian) and ADAM17 (TACE), but evolutionary relationships are not clear for the majority of family members. Human ADAM33 refers to a testis cDNA clone that does not contain a complete open reading frame, but portions of the predicted protein are similar to Xenopus laevis ADAM13. RESULTS: In a 48 kb region of mouse DNA adjacent to the Attractin gene on mouse chromosome 2, we identified sequences very similar to human ADAM33. A full-length mouse cDNA was identified by a combination of gene prediction programs and RT-PCR, and the probable full-length human cDNA was identified by comparison to human genomic sequence in the homologous region on chromosome 20p13. Mouse ADAM33 is 44% identical to Xenopus laevis ADAM13; however, a phylogenetic alignment and consideration of functional domains suggest that the two genes are not orthologous. Mouse Adam33 is widely expressed, most highly in the adult brain, heart, kidney, lung and testis. CONCLUSIONS: While mouse ADAM33 is similar to Xenopus ADAM13 in sequence, further examination of its embryonic expression pattern, catalytic activity and protein interactions will be required to assess the functional relationship between these two proteins. Adam33 is expressed in the mouse adult brain and could play a role in complex processes that require cell-cell communication.
Subject(s)
Metalloendopeptidases/genetics , Xenopus Proteins , ADAM Proteins , Amino Acid Sequence , Animals , Brain/metabolism , Humans , Membrane Proteins/genetics , Metalloendopeptidases/biosynthesis , Mice , Molecular Sequence Data , Phylogeny , RNA, Messenger/biosynthesis , Sequence Homology, Amino Acid , Tissue Distribution
ABSTRACT
Reproductive tract infection is a major initiator of preterm birth (PTB). The objective of this prospective cohort study of 88 participants was to determine whether PTB correlates with the vaginal microbiome during pregnancy. Total DNA was purified from posterior vaginal fornix swabs during gestation. The 16S ribosomal RNA gene was amplified using polymerase chain reaction primers, followed by chain-termination sequencing. Bacteria were identified by comparing contig consensus sequences with the Ribosomal Database Project. Dichotomous responses were summarized via proportions and continuous variables via means ± standard deviation. Mean Shannon Diversity index differed by Welch t test (P = .00016) between Caucasians with PTB and term gestation. Species diversity was greatest among African Americans (P = .0045). Change in microbiome/Lactobacillus content and presence of putative novel/noxious bacteria did not correlate with PTB. We conclude that uncultured vaginal bacteria play an important role in PTB and race/ethnicity and sampling location are important determinants of the vaginal microbiome.
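The two statistics named in this abstract, the Shannon diversity index and the Welch t test, can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the study's analysis code: the natural logarithm is assumed for the Shannon index (the abstract does not state the base), the function names and input values are hypothetical, and only the t statistic and Welch-Satterthwaite degrees of freedom are computed, not the P value.

```python
import math
from statistics import mean, variance


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t test on two samples."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical per-taxon read counts for one swab (illustrative only).
h = shannon_index([10, 10, 10, 10])

# Hypothetical diversity indices for two small groups (illustrative only).
t_stat, dof = welch_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(h, t_stat, dof)
```

In practice the P value would come from the t distribution with `dof` degrees of freedom, for which a statistics library is the usual choice.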
Subject(s)
Bacteria/classification , Infant, Premature , Microbiota , Premature Birth/microbiology , Vagina/microbiology , Adult , Black or African American , Asian , Bacteria/genetics , Bacteria/isolation & purification , Case-Control Studies , DNA, Bacterial/isolation & purification , Female , Hispanic or Latino , Humans , Lactobacillus/classification , Lactobacillus/genetics , Lactobacillus/isolation & purification , Metagenome , Metagenomics , Pregnancy , Premature Birth/ethnology , Prospective Studies , RNA, Ribosomal, 16S/genetics , Ribotyping , Risk Factors , San Francisco/epidemiology , White People
ABSTRACT
BACKGROUND: Relatively recently, the software KB™ Basecaller has replaced phred for identifying the bases from raw sequence data in DNA sequencing employing dideoxy chemistry. We have measured quantitatively the consequences of that change. RESULTS: The high-quality sequence segments of reads derived from the KB™ Basecaller were, on average, 30 to 50 bases longer than those of reads derived from phred. However, microbe identification appeared to have been unaffected by the change in software. CONCLUSIONS: We have demonstrated a modest, but statistically significant, superiority in high-quality read length of the KB™ Basecaller compared to phred. We found no statistically significant difference between the numbers of microbial species identified from the sequence data.
ABSTRACT
We sequenced the genome of Saccharomyces cerevisiae strain YJM789, which was derived from a yeast isolated from the lung of an AIDS patient with pneumonia. The strain is used for studies of fungal infections and quantitative genetics because of its extensive phenotypic differences from the laboratory reference strain, including growth at high temperature and deadly virulence in mouse models. Here we show that the approximately 12-Mb genome of YJM789 contains approximately 60,000 SNPs and approximately 6,000 indels with respect to the reference S288c genome, leading to protein polymorphisms with a few known cases of phenotypic changes. Several ORFs are found to be unique to YJM789, some of which might have been acquired through horizontal transfer. Localized regions of high polymorphism density are scattered over the genome, in some cases spanning multiple ORFs and in others concentrated within single genes. The sequence of YJM789 contains clues to pathogenicity and spurs the development of more powerful approaches to dissecting the genetic basis of complex hereditary traits.
Subject(s)
Genome, Fungal/genetics , Saccharomyces cerevisiae/genetics , Base Sequence , Chromosome Inversion/genetics , Gene Transfer, Horizontal/genetics , Mitochondria/genetics , Molecular Sequence Data , Open Reading Frames/genetics , Phenotype , Phylogeny , Polymorphism, Genetic/genetics , Translocation, Genetic/genetics
ABSTRACT
Using solely a gene-based procedure, PCR amplification of the 16S ribosomal RNA gene coupled with very deep sequencing of the amplified products, the microbes on 20 human vaginal epithelia of healthy women have been identified and quantitated. The Lactobacillus content on these 20 healthy vaginal epithelia was highly variable, ranging from 0% to 100%. For four subjects, Lactobacillus was (virtually) the only bacterium detected. However, that Lactobacillus was far from clonal and was a mixture of species and strains. Eight subjects presented complex mixtures of Lactobacillus and other microbes. The remaining eight subjects had no Lactobacillus. Instead, Bifidobacterium, Gardnerella, Prevotella, Pseudomonas, or Streptococcus predominated.
Subject(s)
Lactobacillus/genetics , Vagina/microbiology , Adult , Base Sequence , Cloning, Molecular , Cluster Analysis , Computational Biology , DNA Primers , Databases, Nucleic Acid , Epithelium/microbiology , Escherichia coli , Female , Humans , Molecular Sequence Data , Polymerase Chain Reaction , RNA, Ribosomal, 16S/genetics , Sequence Analysis, DNA , Species Specificity
ABSTRACT
The human malaria parasite Plasmodium falciparum is responsible for the death of more than a million people every year. To stimulate basic research on the disease, and to promote the development of effective drugs and vaccines against the parasite, the complete genome of P. falciparum clone 3D7 has been sequenced, using a chromosome-by-chromosome shotgun strategy. Here we report the nucleotide sequence of the third largest of the parasite's 14 chromosomes, chromosome 12, which comprises about 10% of the 23-megabase genome. As the most (A + T)-rich (80.6%) genome sequenced to date, the P. falciparum genome presented severe problems during the assembly of primary sequence reads. We discuss the methodology that yielded a finished and fully contiguous sequence for chromosome 12. The biological implications of the sequence data are more thoroughly discussed in an accompanying Article (ref. 3).
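The (A + T) richness quoted above is a base-composition fraction. A minimal sketch of how such a figure is computed from a sequence string follows; the function name and the toy sequence are hypothetical, and ambiguous bases such as N are assumed to be excluded from the denominator.

```python
def at_fraction(seq):
    """Return the fraction of unambiguous bases (A, C, G, T) that are A or T."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (seq.count("A") + seq.count("T")) / called


# Toy sequence (illustrative only): 8 of 10 bases are A or T.
toy = "ATATATTAGC"
print(f"(A + T) content: {at_fraction(toy):.1%}")
```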
Subject(s)
DNA, Protozoan , Plasmodium falciparum/genetics , Animals , Chromosomes , Chromosomes, Artificial, Yeast , Genome, Protozoan , Humans , Proteome , Protozoan Proteins/genetics , Sequence Analysis, DNA
ABSTRACT
The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date. Genes involved in antigenic variation are concentrated in the subtelomeric regions of the chromosomes. Compared to the genomes of free-living eukaryotic microbes, the genome of this intracellular parasite encodes fewer enzymes and transporters, but a large proportion of genes are devoted to immune evasion and host-parasite interactions. Many nuclear-encoded proteins are targeted to the apicoplast, an organelle involved in fatty-acid and isoprenoid metabolism. The genome sequence provides the foundation for future studies of this organism, and is being exploited in the search for new drugs and vaccines to fight malaria.