ABSTRACT
Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas: 70% of the European ancestry in today's African Americans dates back to European gene flow happening only 7-8 generations ago.
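The 1.75-fold range quoted above follows directly from the two heterozygosity figures in the abstract. The short sketch below only reproduces that arithmetic; the function and constant names are illustrative and do not come from the study.

```python
# Heterozygous sites per kilobase, as quoted in the abstract.
HET_PER_KB_WEST_AFRICAN = 1.0       # highest value reported
HET_PER_KB_NATIVE_AMERICAN = 0.57   # lowest value reported


def fold_range(high: float, low: float) -> float:
    """Return the ratio of the highest to the lowest heterozygosity."""
    if low <= 0:
        raise ValueError("low must be positive")
    return high / low


ratio = fold_range(HET_PER_KB_WEST_AFRICAN, HET_PER_KB_NATIVE_AMERICAN)
print(f"{ratio:.2f}-fold range")  # prints "1.75-fold range"
```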
Subject(s)
Genome, Human, Haplotypes/genetics, Population/genetics, Racial Groups/genetics, Genetics, Population/methods, Heterozygote, Humans, Polymorphism, Single Nucleotide
ABSTRACT
Methods for silencing genes in Phytophthora transformants have been demonstrated previously, but their reported effectiveness varied widely among studies. To optimize this important tool for functional genomics, we compared the abilities of sense, antisense, and hairpin transgenes introduced by protoplast, electroporation, and bombardment methods to silence the inf1 elicitin gene in Phytophthora infestans. A hairpin construct induced silencing three times more often than sense or antisense vectors, and protoplast transformation induced silencing twice as often as electroporation. With hairpins introduced into protoplasts, 61% of strains were silenced, and transgene copy number was positively correlated with silencing. The utility of bombardment was reduced by the occurrence of heterokaryons containing silenced and non-silenced nuclei, but silenced strains were obtainable from about 20% of primary transformants by single-nuclear purification. Most inf1-deficient strains were fully silenced; however, some exhibited partial suppression. These produced inf1-derived RNAs of about 21 nt that correspond to both the sense and antisense strands of inf1, implicating an RNAi-like mechanism in silencing.
Subject(s)
Gene Silencing, Phytophthora/genetics, RNA, Small Interfering/genetics, Transgenes/genetics, Genetic Vectors, Genomics/methods, Models, Genetic, Phytophthora/metabolism, Transformation, Genetic
ABSTRACT
Current efforts to understand antibiotic resistance on the whole genome scale tend to focus on known genes even as high throughput sequencing strategies uncover novel mechanisms. To identify genomic variations associated with antibiotic resistance, we employed a modified genome-wide association study; we sequenced genomic DNA from pools of E. coli clinical isolates with similar antibiotic resistance phenotypes using SOLiD technology to uncover single nucleotide polymorphisms (SNPs) unanimously conserved in each pool. The multidrug-resistant pools were genotypically similar to SMS-3-5, a previously sequenced multidrug-resistant isolate from a polluted environment. The similarity was evenly spread across the entire genome and not limited to plasmid or pathogenicity island loci. Among the pools of clinical isolates, genomic variation was concentrated adjacent to previously reported inversion and duplication differences between the SMS-3-5 isolate and the drug-susceptible laboratory strain, DH10B. SNPs that result in non-synonymous changes in gyrA (encoding the well-known S83L allele associated with fluoroquinolone resistance), mutM, ligB, and recG were unanimously conserved in every fluoroquinolone-resistant pool. Alleles of the latter three genes are tightly linked among most sequenced E. coli genomes, and had not been implicated in antibiotic resistance previously. The changes in these genes map to amino acid positions in alpha helices that are involved in DNA binding. Plasmid-encoded complementation of null strains with either allelic variant of mutM or ligB resulted in variable responses to ultraviolet light or hydrogen peroxide treatment as markers of induced DNA damage, indicating their importance in DNA metabolism and revealing a potential mechanism for fluoroquinolone resistance. 
Our approach uncovered evidence that additional DNA binding enzymes may contribute to fluoroquinolone resistance and further implicates environmental bacteria as a reservoir for antibiotic resistance.
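The pooled-sequencing logic described above (keep SNPs that are unanimous within a pool, then look for those shared by every resistant pool) can be sketched as a filter followed by a set intersection. This is a minimal illustration under stated assumptions, not the authors' pipeline: the read counts, the site names other than the gyrA S83L allele, and the `min_fraction` and `min_depth` thresholds are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical layout: site name -> (variant-supporting reads, total reads).
PoolCounts = Dict[str, Tuple[int, int]]


def unanimous_sites(counts: PoolCounts,
                    min_fraction: float = 1.0,
                    min_depth: int = 10) -> Set[str]:
    """Sites where (nearly) every read in a pool carries the variant allele."""
    return {
        site
        for site, (variant, total) in counts.items()
        if total >= min_depth and variant / total >= min_fraction
    }


def shared_across_pools(pools: List[PoolCounts]) -> Set[str]:
    """Sites that are unanimous in every pool (set intersection)."""
    if not pools:
        return set()
    per_pool = [unanimous_sites(counts) for counts in pools]
    return set.intersection(*per_pool)


# Made-up read counts for two fluoroquinolone-resistant pools.
pool_1 = {"gyrA_S83L": (40, 40), "site_x": (12, 40), "site_y": (35, 35)}
pool_2 = {"gyrA_S83L": (52, 52), "site_y": (20, 50)}

shared = shared_across_pools([pool_1, pool_2])
print(shared)  # prints {'gyrA_S83L'}
```

In practice a threshold slightly below 1.0 might be chosen to tolerate sequencing error; the value used here is only a placeholder.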
Subject(s)
Anti-Bacterial Agents/pharmacology, Drug Resistance, Bacterial/genetics, Escherichia coli/drug effects, Fluoroquinolones/pharmacology, Genotype, DNA, Bacterial/genetics, Escherichia coli/genetics, Microbial Sensitivity Tests, Polymorphism, Single Nucleotide
ABSTRACT
Methylation, the addition of methyl groups to cytosine (C), plays an important role in the regulation of gene expression in both normal and dysfunctional cells. During bisulfite conversion and subsequent PCR amplification, unmethylated Cs are converted into thymine (T), whereas methylated Cs are not converted. Sequencing of this bisulfite-treated DNA permits the detection of methylation at specific sites. With the introduction of next-generation sequencing (NGS) technologies, simultaneous analysis of methylation motifs in multiple regions provides the opportunity for hypothesis-free study of the entire methylome. Here we present a whole-methylome sequencing study that compares two different bisulfite conversion methods (in solution versus in gel), utilizing the high throughput of the SOLiD System. Advantages and disadvantages of the two bisulfite conversion methods for constructing sequencing libraries are discussed. Furthermore, the application of SOLiD bisulfite sequencing to larger and more complex genomes is shown with preliminary bisulfite-converted reads created in silico.
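The conversion rule described above (unmethylated C read as T, methylated C left unchanged) is simple enough to simulate in silico. The sketch below is a toy illustration, not the software used in the study: it handles only the top strand of a short made-up sequence, and the function names and example positions are assumptions.

```python
from typing import Dict, Set


def bisulfite_convert(seq: str, methylated: Set[int]) -> str:
    """Simulate bisulfite treatment plus PCR on one strand.

    Unmethylated C becomes T; C at a methylated index is left unchanged.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference: str, converted: str) -> Dict[int, bool]:
    """For each reference C, report whether it still reads as C (methylated)."""
    if len(reference) != len(converted):
        raise ValueError("sequences must have the same length")
    return {
        i: converted[i] == "C"
        for i, base in enumerate(reference)
        if base == "C"
    }


reference = "ACGTCCGA"           # made-up sequence with Cs at indices 1, 4, 5
methylated_positions = {1, 5}    # hypothetical methylated cytosines

converted = bisulfite_convert(reference, methylated_positions)
calls = call_methylation(reference, converted)
print(converted)  # prints "ACGTTCGA"
print(calls)      # prints {1: True, 4: False, 5: True}
```

A real analysis would also have to treat the reverse strand (where conversion appears as G to A) and align reads against a reduced-complexity reference, which this sketch leaves out.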
Subject(s)
DNA Methylation, Genome, Human/genetics, Sequence Analysis, DNA/methods, Base Sequence, Binding Sites/genetics, DNA/chemistry, DNA/genetics, Electrophoresis, Polyacrylamide Gel/methods, Genomic Library, Humans, Molecular Sequence Data, Polymerase Chain Reaction, Sequence Homology, Nucleic Acid, Sulfites/chemistry
ABSTRACT
BACKGROUND: In the event of biocrimes or infectious disease outbreaks, high-resolution genetic characterization for identifying the agent and attributing it to a specific source can be crucial for an effective response. Until recently, in-depth genetic characterization required expensive and time-consuming Sanger sequencing of a few strains, followed by genotyping of a small number of marker loci in a panel of isolates or by gel-based approaches such as pulsed field gel electrophoresis, which by necessity ignores most of the genome. Next-generation, massively parallel sequencing (MPS) technology (specifically the Applied Biosystems sequencing by oligonucleotide ligation and detection (SOLiD™) system) is a powerful investigative tool for rapid, cost-effective and parallel microbial whole-genome characterization. RESULTS: To demonstrate the utility of MPS for whole-genome typing of monomorphic pathogens, four Bacillus anthracis and four Yersinia pestis strains were sequenced in parallel. Reads were aligned to complete reference genomes, and genomic variations were identified. Resequencing of the B. anthracis Ames ancestor strain detected no false-positive single-nucleotide polymorphisms (SNPs), and mapping of reads to the Sterne strain correctly identified 98% of the 133 SNPs that are not clustered or associated with repeats. Three geographically distinct B. anthracis strains from the A branch lineage were found to have between 352 and 471 SNPs each, relative to the Ames genome, and one strain harbored a genomic amplification. Sequencing of four Y. pestis strains from the Orientalis lineage identified between 20 and 54 SNPs per strain relative to the CO92 genome, with the single Bolivian isolate having approximately twice as many SNPs as the three more closely related North American strains.
Coverage plotting also revealed a common deletion in two strains and an amplification in the Bolivian strain that appear to be due to insertion element-mediated recombination events. Most private SNPs (that is, variants found in only one strain in this set) selected for validation by Sanger sequencing were confirmed, although rare false-positive SNPs were associated with variable nucleotide tandem repeats. CONCLUSIONS: The high throughput, multiplexing capability, and accuracy of this system make it suitable for rapid whole-genome typing of microbial pathogens during a forensic or epidemiological investigation. By interrogating nearly every base of the genome, rare polymorphisms can be reliably discovered, thus facilitating high-resolution strain tracking and strengthening forensic attribution.