ABSTRACT
Bisulfite sequencing detects 5mC and 5hmC at single-base resolution. However, bisulfite treatment damages DNA, which results in fragmentation, DNA loss, and biased sequencing data. To overcome these problems, enzymatic methyl-seq (EM-seq) was developed. This method detects 5mC and 5hmC using two sets of enzymatic reactions. In the first reaction, TET2 and T4-BGT convert 5mC and 5hmC into products that cannot be deaminated by APOBEC3A. In the second reaction, APOBEC3A deaminates unmodified cytosines by converting them to uracils. Therefore, these three enzymes enable the identification of 5mC and 5hmC. EM-seq libraries were compared with bisulfite-converted DNA, and each library type was ligated to Illumina adaptors before conversion. Libraries were made using NA12878 genomic DNA, cell-free DNA, and FFPE DNA over a range of DNA inputs. The 5mC and 5hmC detected in EM-seq libraries were similar to those of bisulfite libraries. However, libraries made using EM-seq outperformed bisulfite-converted libraries in all specific measures examined (coverage, duplication, sensitivity, etc.). EM-seq libraries displayed even GC distribution, better correlations across DNA inputs, increased numbers of CpGs within genomic features, and accuracy of cytosine methylation calls. EM-seq was effective using as little as 100 pg of DNA, and these libraries maintained the described advantages over bisulfite sequencing. EM-seq library construction, using challenging samples and lower DNA inputs, opens new avenues for research and clinical applications.
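The conversion logic described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' analysis pipeline: it assumes a single read already aligned to the reference on the same strand with no gaps, and that deaminated cytosines (uracils) appear as T in the sequenced read. The function name `call_cytosines` and the labels it returns are hypothetical.

```python
def call_cytosines(reference: str, read: str) -> dict[int, str]:
    """Classify each reference cytosine from one converted read.

    Assumes (for illustration) a gap-free, same-strand alignment of
    equal length. A C that survives conversion was protected (5mC or
    5hmC); a C that reads as T was unmodified and deaminated.
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    calls = {}
    for pos, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base != "C":
            continue  # only reference cytosines are informative
        if read_base == "C":
            calls[pos] = "modified"    # protected from deamination
        elif read_base == "T":
            calls[pos] = "unmodified"  # deaminated to U, sequenced as T
        else:
            calls[pos] = "ambiguous"   # mismatch or sequencing error
    return calls


example = call_cytosines("ACGTCCGA", "ACGTTTGA")
```

Note that this readout cannot distinguish 5mC from 5hmC, since both are protected from deamination.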
ABSTRACT
The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.
Subjects
Genome, Bacterial/genetics; Genome, Human/genetics; Genomics/instrumentation; Genomics/methods; Semiconductors; Sequence Analysis, DNA/instrumentation; Sequence Analysis, DNA/methods; Escherichia coli/genetics; Humans; Light; Male; Rhodopseudomonas/genetics; Vibrio/genetics
ABSTRACT
We describe the genome sequencing of an anonymous individual of African origin using a novel ligation-based sequencing assay that enables a unique form of error correction that improves the raw accuracy of the aligned reads to >99.9%, allowing us to accurately call SNPs with as few as two reads per allele. We collected several billion mate-paired reads yielding approximately 18x haploid coverage of aligned sequence and close to 300x clone coverage. Over 98% of the reference genome is covered with at least one uniquely placed read, and 99.65% is spanned by at least one uniquely placed mate-paired clone. We identify over 3.8 million SNPs, 19% of which are novel. Mate-paired data are used to physically resolve haplotype phases of nearly two-thirds of the genotypes obtained and produce phased segments of up to 215 kb. We detect 226,529 intra-read indels, 5590 indels between mate-paired reads, 91 inversions, and four gene fusions. We use a novel approach for detecting indels between mate-paired reads that are smaller than the standard deviation of the insert size of the library and discover deletions in common with those detected with our intra-read approach. Dozens of mutations previously described in OMIM and hundreds of nonsynonymous single-nucleotide and structural variants in genes previously implicated in disease are identified in this individual. There is more genetic variation in the human genome still to be uncovered, and we provide guidance for future surveys in populations and cancer biopsies.
Subjects
Base Pairing; Computational Biology/methods; Genetic Variation; Genome, Human; Ligases; Sequence Analysis, DNA/methods; Africa; Base Sequence; Genomics; Genotype; Heterozygote; Homozygote; Humans; Polymorphism, Single Nucleotide; Reference Standards
ABSTRACT
We have identified and validated a spaceflight-associated microRNA (miRNA) signature that is shared by rodents and humans in response to simulated, short-duration and long-duration spaceflight. Previous studies have identified miRNAs that regulate rodent responses to spaceflight in low-Earth orbit, and we have confirmed the expression of these proposed spaceflight-associated miRNAs in rodents reacting to simulated spaceflight conditions. Moreover, astronaut samples from the NASA Twins Study confirmed these expression signatures in miRNA sequencing, single-cell RNA sequencing (scRNA-seq), and single-cell assay for transposase accessible chromatin (scATAC-seq) data. Additionally, a subset of these miRNAs (miR-125, miR-16, and let-7a) was found to regulate vascular damage caused by simulated deep space radiation. To demonstrate the physiological relevance of key spaceflight-associated miRNAs, we utilized antagomirs to inhibit their expression and successfully rescue simulated deep-space-radiation-mediated damage in human 3D vascular constructs.
Subjects
Circulating MicroRNA/genetics; MicroRNAs/genetics; Weightlessness/adverse effects; Animals; Female; Gene Expression; Gene Expression Profiling/methods; Humans; Male; Mice; Mice, Inbred BALB C; Middle Aged; Rats; Sequence Analysis, RNA/methods; Space Flight; Transcriptome/genetics; Weightlessness Simulation/methods
ABSTRACT
Reconstructions of vascular plant mitochondrial genomes (mt-genomes) are notoriously complicated by rampant recombination, which has resulted in comparatively few plant mt-genomes being available. The dearth of plant mitochondrial resources has limited our understanding of mt-genome structural diversity, complex patterns of RNA editing, and the origins of novel mt-genome elements. Here, we use an efficient long-read (PacBio) iterative assembly pipeline to generate mt-genome assemblies for Leucaena trichandra (Leguminosae: Caesalpinioideae: mimosoid clade), providing the first assessment of non-papilionoid legume mt-genome content and structure to date. The efficiency of the assembly approach facilitated the exploration of alternative structures that are commonplace among plant mitochondrial genomes. A compact version (729 kbp) of the recovered assemblies was used to investigate sources of mt-genome size variation among legumes and mt-genome sequence similarity to the legume-associated root holoparasite Lophophytum. The genome and an associated suite of transcriptome data from select species of Leucaena permitted an in-depth exploration of RNA editing in a diverse clade of closely related species that includes hybrid lineages. RNA editing in the allotetraploid Leucaena leucocephala is consistent with co-option of nearly equal maternal and paternal C-to-U edit components, generating novel combinations of RNA-edited sites. A preliminary investigation of L. leucocephala C-to-U edit frequencies identified the potential for a hybrid to generate unique pools of alleles from parental variation through edit frequencies shared with one parental lineage, those intermediate between parents, and transgressive patterns.
Subjects
Fabaceae/genetics; Genome, Mitochondrial; RNA Editing; RNA, Mitochondrial/genetics; RNA, Plant/genetics; Gene Transfer, Horizontal; Repetitive Sequences, Nucleic Acid; Tandem Repeat Sequences; Tetraploidy
ABSTRACT
Altered chromatin structure is a hallmark of cancer, and inappropriate regulation of chromatin structure may represent the origin of transformation. Important studies have mapped human nucleosome distributions genome wide, but the role of chromatin structure in cancer progression has not been addressed. We developed a MNase-Transcription Start Site Sequence Capture method (mTSS-seq) to map the nucleosome distribution at human transcription start sites genome-wide in primary human lung and colon adenocarcinoma tissue. Here, we confirm that nucleosome redistribution is an early, widespread event in lung (LAC) and colon (CRC) adenocarcinoma. These altered nucleosome architectures are consistent between LAC and CRC patient samples indicating that they may serve as important early adenocarcinoma markers. We demonstrate that the nucleosome alterations are driven by the underlying DNA sequence and potentiate transcription factor binding. We conclude that DNA-directed nucleosome redistributions are widespread early in cancer progression. We have proposed an entirely new hierarchical model for chromatin-mediated genome regulation.
Subjects
Adenocarcinoma/genetics; Chromatin/genetics; Chromosome Mapping; Colonic Neoplasms/genetics; Genome, Human; Lung Neoplasms/genetics; Nucleosomes/genetics; Adenocarcinoma/pathology; Colonic Neoplasms/pathology; Disease Progression; Gene Expression Regulation, Neoplastic; High-Throughput Nucleotide Sequencing; Humans; Lung Neoplasms/pathology
ABSTRACT
"Microbiome" is used to describe the communities of microorganisms and their genes in a particular environment, including communities in association with a eukaryotic host or part of a host. One challenge in microbiome analysis concerns the presence of host DNA in samples. Removal of host DNA before sequencing results in greater sequence depth of the intended microbiome target population. This unit describes a novel method of microbial DNA enrichment in which methylated host DNA such as human genomic DNA is selectively bound and separated from microbial DNA before next-generation sequencing (NGS) library construction. This microbiome enrichment technique yields a higher fraction of microbial sequencing reads and improved read quality resulting in a reduced cost of downstream data generation and analysis. © 2016 by John Wiley & Sons, Inc.
Subjects
DNA/isolation & purification; High-Throughput Nucleotide Sequencing/methods; Microbiota; Sequence Analysis, DNA/methods; Chemical Precipitation; DNA/genetics; DNA Methylation; Humans
ABSTRACT
Ribosomal RNAs (rRNAs) are extremely abundant, often constituting 80% to 90% of total RNA. Since rRNA sequences are often not of interest in genomic RNA sequencing experiments, rRNAs can be removed from the sample before the library preparation step to prevent most of the library, and most of the sequencing reads, from being rRNA. Removal of rRNA can be especially challenging for low-quality and formalin-fixed paraffin-embedded (FFPE) RNA samples because these RNA molecules are fragmented. The NEBNext rRNA Depletion Kit (Human/Mouse/Rat) depletes both cytoplasmic rRNA (5S, 5.8S, 18S, and 28S rRNA) and mitochondrial rRNA (12S and 16S rRNA) from total RNA preparations from human, mouse, and rat samples. Given the high similarity among mammalian rRNA sequences, rRNA depletion can likely also be achieved for other mammals, although this has not been empirically tested. This product is compatible with both intact and degraded RNA (e.g., FFPE RNA). The resulting rRNA-depleted RNA is suitable for RNA-seq, random-primed cDNA synthesis, or other downstream RNA analysis applications. Regardless of the quality or amount of input RNA, this method efficiently removes rRNA while retaining non-coding and other non-poly(A) RNAs. The NEBNext rRNA Depletion Kit thus provides a more complete picture of the transcript repertoire than oligo d(T) poly(A) mRNA enrichment methods. © 2016 by John Wiley & Sons, Inc.
ABSTRACT
PREMISE OF THE STUDY: Variation in the distribution of methylated CpG (methyl-CpG) in genomic DNA (gDNA) across the tree of life is biologically interesting and useful in genomic studies. We illustrate the use of human methyl-CpG-binding domain (MBD2) to fractionate angiosperm DNA into eukaryotic nuclear (methyl-CpG-rich) vs. organellar and prokaryotic (methyl-CpG-poor) elements for genomic and metagenomic sequencing projects. METHODS: MBD2 has been used to enrich prokaryotic DNA in animal systems. Using gDNA from five model angiosperm species, we apply a similar approach to identify whether MBD2 can fractionate plant gDNA into methyl-CpG-depleted vs. enriched methyl-CpG elements. For each sample, three gDNA libraries were sequenced: (1) untreated gDNA, (2) a methyl-CpG-depleted fraction, and (3) a methyl-CpG-enriched fraction. RESULTS: Relative to untreated gDNA, the methyl-depleted libraries showed a 3.2-11.2-fold and 3.4-11.3-fold increase in chloroplast DNA (cpDNA) and mitochondrial DNA (mtDNA), respectively. Methyl-enriched fractions showed a 1.8-31.3-fold and 1.3-29.0-fold decrease in cpDNA and mtDNA, respectively. DISCUSSION: The application of MBD2 enabled fractionation of plant gDNA. The effectiveness was particularly striking for monocot gDNA (Poaceae). When sufficiently effective on a sample, this approach can increase the cost efficiency of sequencing plant genomes as well as prokaryotes living in or on plant tissues.
ABSTRACT
DNA samples derived from vertebrate skin, bodily cavities and body fluids contain both host and microbial DNA, with the latter often present as a minor component. Consequently, DNA sequencing of a microbiome sample frequently yields reads originating from the microbe(s) of interest, but with a vast excess of host genome-derived reads. In this study, we used a methyl-CpG binding domain (MBD) to separate methylated host DNA from microbial DNA based on differences in CpG methylation density. MBD fused to the Fc region of a human antibody (MBD-Fc) binds strongly to protein A paramagnetic beads, forming an effective one-step enrichment complex that was used to remove human or fish host DNA from bacterial and protistan DNA for subsequent sequencing and analysis. We report enrichment of DNA samples from human saliva, human blood, a mock malaria-infected blood sample and a black molly fish. When reads were mapped to reference genomes, sequence reads aligning to host genomes decreased 50-fold, while bacterial and Plasmodium DNA sequence reads increased 8-11.5-fold. The Shannon-Wiener diversity index was calculated for 149 bacterial species in saliva before and after enrichment. Unenriched saliva had an index of 4.72, while the enriched sample had an index of 4.80. The similarity of these indices demonstrates that bacterial species diversity and relative phylotype abundance remain conserved in enriched samples. Enrichment using the MBD-Fc method holds promise for targeted microbiome sequence analysis across a broad range of sample types.
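For readers unfamiliar with the diversity index mentioned above, the following minimal sketch computes a Shannon-Wiener index from per-species read counts. The abstract does not state the logarithm base, so the natural logarithm used here is an assumption, and the function name `shannon_index` and the toy counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon-Wiener index H = -sum(p_i * ln(p_i)) over observed species.

    `counts` holds per-species read counts; zero counts are skipped
    because they contribute nothing to the sum. The natural log is an
    assumption made for this sketch.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    proportions = (c / total for c in counts if c > 0)
    return -sum(p * math.log(p) for p in proportions)


# Toy data (not from the study): an even community scores higher
# than a community dominated by one species.
even = shannon_index([25, 25, 25, 25])
skewed = shannon_index([97, 1, 1, 1])
```

Comparing the index before and after enrichment, as the abstract does, tests whether the relative abundances of species shifted, not merely whether the same species are still present.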
Subjects
DNA Contamination; DNA, Bacterial/isolation & purification; DNA/isolation & purification; Animals; CpG Islands; DNA/blood; DNA/metabolism; DNA Methylation; DNA, Bacterial/metabolism; DNA, Protozoan/isolation & purification; DNA, Protozoan/metabolism; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Humans; Immunoglobulin Fc Fragments/genetics; Immunoglobulin Fc Fragments/metabolism; Protein Binding; Recombinant Fusion Proteins; Saliva/chemistry; Saliva/microbiology; Vertebrates
ABSTRACT
Identifying genetic variants and mutations that underlie human diseases requires the development of robust, cost-effective tools for routine resequencing of regions of interest in the human genome. Here, we demonstrate that coupling the Applied Biosystems SOLiD System sequencing platform with microarray capture of targeted regions provides an efficient and robust method for high-coverage resequencing and polymorphism discovery in human protein-coding exons.
Subjects
Polymorphism, Genetic; Sequence Analysis, DNA/methods; Base Sequence; Biomedical Technology/methods; Exons; Genetic Variation; Genome, Human; Heterozygote; Homozygote; Humans; Molecular Sequence Data; Mutation; Oligonucleotide Array Sequence Analysis
ABSTRACT
An extended Brownian dynamics simulation method is used to characterize the dynamics of long DNA molecules flowing in microchannels. The relaxation time increases due to confinement in agreement with scaling predictions. During flow the molecules migrate toward the channel center line, and thereby segregate according to molecular weight. Capturing these effects requires the detailed incorporation of solvent flow in the simulation method, demonstrating the importance of hydrodynamic effects in the dynamics of confined macromolecules.
Subjects
DNA/chemistry; Computer Simulation; Kinetics; Models, Chemical; Stochastic Processes; Viscosity
ABSTRACT
Modern comparative genomics has been established, in part, by the sequencing and annotation of a broad range of microbial species. To gain further insights, new sequencing efforts are now dealing with the variety of strains or isolates that gives a species definition and range; however, this number vastly outstrips our ability to sequence them. Given the availability of a large number of microbial species, new whole-genome approaches must be developed to fully leverage this information at the level of strain diversity in a way that maximizes discovery. Here, we describe how optical mapping, a single-molecule system, was used to identify and annotate chromosomal alterations between bacterial strains represented by several species. Since whole-genome optical maps are ordered restriction maps, sequenced strains of Shigella flexneri serotype 2a (2457T and 301), Yersinia pestis (CO 92 and KIM), and Escherichia coli were aligned as maps to identify regions of homology and to further characterize them as possible insertions, deletions, inversions, or translocations. Importantly, an unsequenced Shigella flexneri strain (serotype Y strain AMC[328Y]) was optically mapped and aligned with two sequenced ones to reveal one novel locus implicated in serotype conversion and several other loci containing insertion sequence elements or phage-related gene insertions. Our results suggest that genomic rearrangements and chromosomal breakpoints are readily identified and annotated against a prototypic sequenced strain by using the tools of optical mapping.
Subjects
Escherichia coli K12/genetics; Genome, Bacterial; Genomics; Restriction Mapping/methods; Shigella flexneri/genetics; Yersinia pestis/genetics; Image Processing, Computer-Assisted
ABSTRACT
Single-molecule approaches offer the promise of large, exquisitely miniature ensembles for the generation of equally large data sets. Although microfluidic devices have previously been designed to manipulate single DNA molecules, many of the functionalities they embody are not applicable to very large DNA molecules, normally extracted from cells. Importantly, such microfluidic devices must work within an integrated system to enable high-throughput biological or biochemical analysis, a key measure of any device aimed at the chemical/biological interface and a requirement if large data sets are to be created for subsequent analysis. The challenge here was to design an integrated microfluidic device to control the deposition or elongation of large DNA molecules (up to millimeters in length), which would serve as a general platform for biological/biochemical analysis to function within an integrated system that included massively parallel data collection and analysis. The approach we took was to use replica molding to construct silastic devices to consistently deposit oriented, elongated DNA molecules onto charged surfaces, creating massive single-molecule arrays, which we analyzed for both physical and biochemical insights within an integrated environment that created large data sets. The overall efficacy of this approach was demonstrated by the restriction enzyme mapping and identification of single human genomic DNA molecules.
Subjects
DNA/chemistry; Microfluidic Analytical Techniques/instrumentation; Oligonucleotide Array Sequence Analysis/instrumentation; Humans; Image Processing, Computer-Assisted; Microfluidic Analytical Techniques/methods; Molecular Weight; Oligonucleotide Array Sequence Analysis/methods
ABSTRACT
Yersinia pestis is the causative agent of the bubonic, septicemic, and pneumonic plagues (also known as black death) and has been responsible for recurrent devastating pandemics throughout history. To further understand this virulent bacterium and to accelerate an ongoing sequencing project, two whole-genome restriction maps (XhoI and PvuII) of Y. pestis strain KIM were constructed using shotgun optical mapping. This approach constructs ordered restriction maps from randomly sheared individual DNA molecules directly extracted from cells. The two maps served different purposes; the XhoI map facilitated sequence assembly by providing a scaffold for high-resolution alignment, while the PvuII map verified genome sequence assembly. Our results show that such maps facilitated the closure of sequence gaps and, most importantly, provided a purely independent means for sequence validation. Given the recent advancements to the optical mapping system, increased resolution and throughput are enabling such maps to guide sequence assembly at a very early stage of a microbial sequencing project.
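To illustrate what an ordered restriction map is, the sketch below performs an in silico digest of a known sequence, which is the kind of sequence-derived map an optical map can be compared against. It is not the optical mapping software described above. It assumes a linear sequence (a bacterial chromosome is typically circular) and, as a simplification, places each cut at the start of the recognition site instead of at the enzyme's true cleavage position within it. The names `SITES` and `restriction_map` are hypothetical; the recognition sequences for XhoI (CTCGAG) and PvuII (CAGCTG) are the standard ones, and both are palindromic, so scanning one strand suffices.

```python
# Recognition sequences for the two enzymes named in the abstract.
SITES = {"XhoI": "CTCGAG", "PvuII": "CAGCTG"}


def restriction_map(sequence: str, site: str) -> list[int]:
    """Return ordered fragment lengths for a linear sequence.

    Simplification: each cut is placed at the first base of the
    recognition site, not at the enzyme's true cleavage position.
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length fragments (for example, a site at position 0).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Toy sequence (not Y. pestis data) with two XhoI sites.
toy = "AAAA" + "CTCGAG" + "TTTTT" + "CTCGAG" + "GG"
xho_fragments = restriction_map(toy, SITES["XhoI"])
pvu_fragments = restriction_map(toy, SITES["PvuII"])
```

Because the two enzymes cut at unrelated sites, their maps give independent fragment patterns for the same genome, which is why one map can scaffold an assembly while the other checks it.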