Results 1 - 20 of 40
1.
Nat Commun ; 15(1): 5414, 2024 Jun 26.
Article in English | MEDLINE | ID: mdl-38926353

ABSTRACT

Borgs are huge extrachromosomal elements (ECE) of anaerobic methane-consuming "Candidatus Methanoperedens" archaea. Here, we used nanopore sequencing to validate published complete genomes curated from short reads and to reconstruct new genomes. 13 complete and four near-complete linear genomes share 40 genes that define a largely syntenous genome backbone. We use these conserved genes to identify new Borgs from peatland soil and to delineate Borg phylogeny, revealing two major clades. Remarkably, Borg genes encoding nanowire-like electron-transferring cytochromes and cell surface proteins are more highly expressed than those of host Methanoperedens, indicating that Borgs augment the Methanoperedens activity in situ. We reconstructed the first complete 4.00 Mbp genome for a Methanoperedens that is inferred to be a Borg host and predicted its methylation motifs, which differ from pervasive TC and CC methylation motifs of the Borgs. Thus, methylation may enable Methanoperedens to distinguish their genomes from those of Borgs. Very high Borg to Methanoperedens ratios and structural predictions suggest that Borgs may be capable of encapsulation. The findings clearly define Borgs as a distinct class of ECE with shared genomic signatures, establish their diversification from a common ancestor with genetic inheritance, and raise the possibility of periodic existence outside of host cells.


Subject(s)
Genome, Archaeal , Methane , Phylogeny , Methane/metabolism , Oxidation-Reduction , Archaea/genetics , Archaea/metabolism , Nanopore Sequencing/methods , DNA Methylation , Soil Microbiology
2.
Nat Commun ; 15(1): 5327, 2024 Jun 22.
Article in English | MEDLINE | ID: mdl-38909018

ABSTRACT

The assignment of variants across haplotypes, phasing, is crucial for predicting the consequences, interaction, and inheritance of mutations and is a key step in improving our understanding of phenotype and disease. However, phasing is limited by read length and stretches of homozygosity along the genome. To overcome this limitation, we designed MethPhaser, a method that utilizes methylation signals from Oxford Nanopore Technologies to extend Single Nucleotide Variation (SNV)-based phasing. We demonstrate that haplotype-specific methylations extensively exist in human genomes and that the advent of long-read technologies has enabled direct reporting of methylation signals. For ONT R9 and R10 cell line data, we increase the phase length N50 by 78%-151% at a phasing accuracy of 83.4-98.7%. To assess the impact of tissue purity and random methylation signals due to inactivation, we also applied MethPhaser on blood samples from 4 patients, still showing improvements over SNV-only phasing. MethPhaser further improves phasing across HLA and multiple other medically relevant genes, improving our understanding of how mutations interact across multiple phenotypes. The concept of MethPhaser can also be extended to non-human diploid genomes. MethPhaser is available at https://github.com/treangenlab/methphaser.


Subject(s)
DNA Methylation , Genome, Human , Haplotypes , Polymorphism, Single Nucleotide , Humans , Cell Line , Mutation
3.
Cell Genom ; : 100590, 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38908378

ABSTRACT

The duplication-triplication/inverted-duplication (DUP-TRP/INV-DUP) structure is a complex genomic rearrangement (CGR). Although it has been identified as an important pathogenic DNA mutation signature in genomic disorders and cancer genomes, its architecture remains unresolved. Here, we studied the genomic architecture of DUP-TRP/INV-DUP by investigating the DNA of 24 patients identified by array comparative genomic hybridization (aCGH) on whom we found evidence for the existence of 4 out of 4 predicted structural variant (SV) haplotypes. Using a combination of short-read genome sequencing (GS), long-read GS, optical genome mapping, and single-cell DNA template strand sequencing (strand-seq), the haplotype structure was resolved in 18 samples. The point of template switching in 4 samples was shown to be a segment of ∼2.2-5.5 kb of 100% nucleotide similarity within inverted repeat pairs. These data provide experimental evidence that inverted low-copy repeats act as recombinant substrates. This type of CGR can result in multiple conformers generating diverse SV haplotypes in susceptible dosage-sensitive loci.

4.
Nat Commun ; 15(1): 5149, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38890299

ABSTRACT

Telomeres are the protective nucleoprotein structures at the end of linear eukaryotic chromosomes. Telomeres' repetitive nature and length have traditionally challenged the precise assessment of the composition and length of individual human telomeres. Here, we present Telo-seq to resolve bulk, chromosome arm-specific and allele-specific human telomere lengths using Oxford Nanopore Technologies' native long-read sequencing. Telo-seq resolves telomere shortening in five population doubling increments and reveals intrasample, chromosome arm-specific, allele-specific telomere length heterogeneity. Telo-seq can reliably discriminate between telomerase- and ALT-positive cancer cell lines. Thus, Telo-seq is a tool to study telomere biology during development, aging, and cancer at unprecedented resolution.


Subject(s)
Aging , Neoplasms , Telomere , Humans , Telomere/genetics , Telomere/metabolism , Neoplasms/genetics , Neoplasms/metabolism , Aging/genetics , Telomerase/genetics , Telomerase/metabolism , Cell Line, Tumor , Telomere Shortening/genetics , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods , Alleles
5.
bioRxiv ; 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38585716

ABSTRACT

Immunoglobulin (IGH, IGK, IGL) loci in the human genome are highly polymorphic regions that encode the building blocks of the light and heavy chain IG proteins that dimerize to form antibodies. The processes of V(D)J recombination and somatic hypermutation in B cells are responsible for creating an enormous reservoir of highly specific antibodies capable of binding a vast array of possible antigens. However, the antibody repertoire is fundamentally limited by the set of variable (V), diversity (D), and joining (J) alleles present in the germline IG loci. To better understand how the germline IG haplotypes contribute to the expressed antibody repertoire, we combined genome sequencing of the germline IG loci with single-cell transcriptome sequencing of B cells from the same donor. Sequencing and assembly of the germline IG loci captured the IGH locus in a single fully-phased contig where the maternal and paternal contributions to the germline V, D, and J repertoire can be fully resolved. The B cells were collected following a measles, mumps, and rubella (MMR) vaccination, resulting in a population of cells that were activated in response to this specific immune challenge. Single-cell, full-length transcriptome sequencing of these B cells resulted in whole transcriptome characterization of each cell, as well as highly-accurate consensus sequences for the somatically rearranged and hypermutated light and heavy chain IG transcripts. A subset of antibodies synthesized based on their consensus heavy and light chain transcript sequences demonstrated binding to measles antigens and neutralization of measles live virus.

6.
Genome Res ; 34(3): 454-468, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38627094

ABSTRACT

Reference-free genome phasing is vital for understanding allele inheritance and the impact of single-molecule DNA variation on phenotypes. To achieve thorough phasing across homozygous or repetitive regions of the genome, long-read sequencing technologies are often used to perform phased de novo assembly. As a step toward reducing the cost and complexity of this type of analysis, we describe new methods for accurately phasing Oxford Nanopore Technologies (ONT) sequence data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of ONT PromethION sequencing, including those using proximity ligation, and show that newer, higher accuracy ONT reads substantially improve assembly quality.


Subject(s)
Nanopores , Humans , Sequence Analysis, DNA/methods , Nanopore Sequencing/methods , High-Throughput Nucleotide Sequencing/methods , Software , Genomics/methods
7.
STAR Protoc ; 5(2): 102966, 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38512867

ABSTRACT

Studying RNA splicing factor mutations is challenging due to difficulties in distinguishing wild-type and mutant cells within complex human tissues and inaccuracies associated with reconstructing splicing signals from short-read sequencing data. Here, we present Genotyping of Transcriptomes (GoT)-Splice, a protocol that overcomes these limitations by combining GoT with enhanced long-read single-cell transcriptome and cell-surface proteomics profiling. We describe steps for long-read library preparation and analysis, followed by cDNA re-amplification, enrichment of mutation of interest, sample indexing, and GoT library preparation. For complete details on the use and execution of this protocol, please refer to Cortés-López et al. (reference 1).


Subject(s)
Membrane Proteins , Mutation , RNA Splicing , Humans , RNA Splicing/genetics , Mutation/genetics , Membrane Proteins/genetics , Membrane Proteins/metabolism , Gene Expression Profiling/methods , Transcriptome/genetics , Proteomics/methods , Gene Library , Single-Cell Analysis/methods , Multiomics
8.
bioRxiv ; 2023 Oct 03.
Article in English | MEDLINE | ID: mdl-37873367

ABSTRACT

Background: The duplication-triplication/inverted-duplication (DUP-TRP/INV-DUP) structure is a type of complex genomic rearrangement (CGR) hypothesized to result from replicative repair of DNA due to replication fork collapse. It is often mediated by a pair of inverted low-copy repeats (LCR) followed by iterative template switches resulting in at least two breakpoint junctions in cis. Although it has been identified as an important mutation signature of pathogenicity for genomic disorders and cancer genomes, its architecture remains unresolved and is predicted to display at least four structural variation (SV) haplotypes. Results: Here we studied the genomic architecture of DUP-TRP/INV-DUP by investigating the genomic DNA of 24 patients with neurodevelopmental disorders identified by array comparative genomic hybridization (aCGH) on whom we found evidence for the existence of 4 out of 4 predicted SV haplotypes. Using a combination of short-read genome sequencing (GS), long-read GS, optical genome mapping and StrandSeq, the haplotype structure was resolved in 18 samples. This approach refined the point of template switching between inverted LCRs in 4 samples, revealing a DNA segment of ∼2.2-5.5 kb of 100% nucleotide similarity. A prediction model was developed to infer the LCR used to mediate the non-allelic homology repair. Conclusions: These data provide experimental evidence supporting the hypothesis that inverted LCRs act as a recombinant substrate in replication-based repair mechanisms. Such inverted repeats are particularly relevant for formation of copy-number associated inversions, including the DUP-TRP/INV-DUP structures. Moreover, this type of CGR can result in multiple conformers, which contribute to generating diverse SV haplotypes in susceptible loci.

9.
Cell Stem Cell ; 30(9): 1262-1281.e8, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37582363

ABSTRACT

RNA splicing factors are recurrently mutated in clonal blood disorders, but the impact of dysregulated splicing in hematopoiesis remains unclear. To overcome technical limitations, we integrated genotyping of transcriptomes (GoT) with long-read single-cell transcriptomics and proteogenomics for single-cell profiling of transcriptomes, surface proteins, somatic mutations, and RNA splicing (GoT-Splice). We applied GoT-Splice to hematopoietic progenitors from myelodysplastic syndrome (MDS) patients with mutations in the core splicing factor SF3B1. SF3B1mut cells were enriched in the megakaryocytic-erythroid lineage, with expansion of SF3B1mut erythroid progenitor cells. We uncovered distinct cryptic 3' splice site usage in different progenitor populations and stage-specific aberrant splicing during erythroid differentiation. Profiling SF3B1-mutated clonal hematopoiesis samples revealed that erythroid bias and cell-type-specific cryptic 3' splice site usage in SF3B1mut cells precede overt MDS. Collectively, GoT-Splice defines the cell-type-specific impact of somatic mutations on RNA splicing, from early clonal outgrowths to overt neoplasia, directly in human samples.


Subject(s)
Myelodysplastic Syndromes , RNA Splice Sites , Humans , Multiomics , RNA Splicing/genetics , Myelodysplastic Syndromes/genetics , Myelodysplastic Syndromes/metabolism , RNA Splicing Factors/genetics , RNA Splicing Factors/metabolism , Mutation/genetics , Phosphoproteins/genetics , Phosphoproteins/metabolism
10.
BMC Biol ; 21(1): 110, 2023 05 16.
Article in English | MEDLINE | ID: mdl-37194054

ABSTRACT

BACKGROUND: DNA-protein cross-links (DPCs) are one of the most deleterious DNA lesions, originating from various sources, including enzymatic activity. For instance, topoisomerases, which play a fundamental role in DNA metabolic processes such as replication and transcription, can be trapped and remain covalently bound to DNA in the presence of poisons or nearby DNA damage. Given the complexity of individual DPCs, numerous repair pathways have been described. The protein tyrosyl-DNA phosphodiesterase 1 (Tdp1) has been demonstrated to be responsible for removing topoisomerase 1 (Top1). Nevertheless, studies in budding yeast have indicated that alternative pathways involving Mus81, a structure-specific DNA endonuclease, could also remove Top1 and other DPCs. RESULTS: This study shows that MUS81 can efficiently cleave various DNA substrates modified by fluorescein, streptavidin or proteolytically processed topoisomerase. Furthermore, the inability of MUS81 to cleave substrates bearing native TOP1 suggests that TOP1 must be either dislodged or partially degraded prior to MUS81 cleavage. We demonstrated that MUS81 could cleave a model DPC in nuclear extracts and that depletion of TDP1 in MUS81-KO cells induces sensitivity to the TOP1 poison camptothecin (CPT) and affects cell proliferation. This sensitivity is only partially suppressed by TOP1 depletion, indicating that other DPCs might require the MUS81 activity for cell proliferation. CONCLUSIONS: Our data indicate that MUS81 and TDP1 play independent roles in the repair of CPT-induced lesions, thus representing new therapeutic targets for cancer cell sensitisation in combination with TOP1 inhibitors.


Subject(s)
DNA-Binding Proteins , Endonucleases , Phosphoric Diester Hydrolases , Saccharomyces cerevisiae Proteins , DNA Damage , DNA Repair , Phosphoric Diester Hydrolases/genetics , Phosphoric Diester Hydrolases/metabolism , Saccharomyces cerevisiae , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , DNA Topoisomerases, Type I/genetics , DNA Topoisomerases, Type I/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Endonucleases/genetics , Endonucleases/metabolism
11.
bioRxiv ; 2023 Feb 22.
Article in English | MEDLINE | ID: mdl-36865218

ABSTRACT

As a step towards simplifying and reducing the cost of haplotype resolved de novo assembly, we describe new methods for accurately phasing nanopore data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of Oxford Nanopore Technologies' (ONT) PromethION sequencing, including those using proximity ligation and show that newer, higher accuracy ONT reads substantially improve assembly quality.

12.
Genome Med ; 14(1): 122, 2022 10 27.
Article in English | MEDLINE | ID: mdl-36303224

ABSTRACT

BACKGROUND: The multiple de novo copy number variant (MdnCNV) phenotype is described by having four or more constitutional de novo CNVs (dnCNVs) arising independently throughout the human genome within one generation. It is a rare peri-zygotic mutational event, previously reported to be seen once in every 12,000 individuals referred for genome-wide chromosomal microarray analysis due to congenital abnormalities. These rare families provide a unique opportunity to understand the genetic factors of peri-zygotic genome instability and the impact of dnCNV on human diseases. METHODS: Chromosomal microarray analysis (CMA), array-based comparative genomic hybridization, and short- and long-read genome sequencing (GS) were performed on the newly identified MdnCNV family to identify de novo mutations including dnCNVs, de novo single-nucleotide variants (dnSNVs), and indels. Short-read GS was performed on four previously published MdnCNV families for dnSNV analysis. Trio-based rare variant analysis was performed on the newly identified individual and four previously published MdnCNV families to identify potential genetic etiologies contributing to the peri-zygotic genomic instability. Lin semantic similarity scores informed quantitative human phenotype ontology analysis on three MdnCNV families to identify gene(s) driving or contributing to the clinical phenotype. RESULTS: In the newly identified MdnCNV case, we revealed eight de novo tandem duplications, each ~1 Mb, with microhomology at 6/8 breakpoint junctions. Enrichment of de novo single-nucleotide variants (SNV; 6/79) and de novo indels (1/12) was found within 4 Mb of the dnCNV genomic regions. An elevated post-zygotic SNV mutation rate was observed in MdnCNV families. Maternal rare variant analyses identified three genes in distinct families that may contribute to the MdnCNV phenomenon. Phenotype analysis suggests that gene(s) within dnCNV regions contribute to the observed proband phenotype in 3/3 cases.
CNVs in two cases, a contiguous gene duplication encompassing PMP22 and RAI1 and another duplication affecting NSD1 and SMARCC2, contribute to the clinically observed phenotypic manifestations. CONCLUSIONS: Characteristic features of dnCNVs reported here are consistent with a microhomology-mediated break-induced replication (MMBIR)-driven mechanism during the peri-zygotic period. Maternal genetic variants in DNA repair genes potentially contribute to peri-zygotic genomic instability. Variable phenotypic features were observed across a cohort of three MdnCNV probands, and computational quantitative phenotyping revealed that two out of three had evidence for the contribution of more than one genetic locus to the proband's phenotype supporting the hypothesis of de novo multilocus pathogenic variation (MPV) in those families.


Subject(s)
DNA Copy Number Variations , Genomic Instability , Humans , Comparative Genomic Hybridization , Mutation , DNA , Nucleotides , DNA-Binding Proteins/genetics , Transcription Factors/genetics
13.
Nature ; 608(7922): 353-359, 2022 08.
Article in English | MEDLINE | ID: mdl-35922509

ABSTRACT

Regulation of transcript structure generates transcript diversity and plays an important role in human disease (refs. 1-7). The advent of long-read sequencing technologies offers the opportunity to study the role of genetic variation in transcript structure (refs. 8-16). In this Article, we present a large human long-read RNA-seq dataset using the Oxford Nanopore Technologies platform from 88 samples from Genotype-Tissue Expression (GTEx) tissues and cell lines, complementing the GTEx resource. We identified just over 70,000 novel transcripts for annotated genes, and validated the protein expression of 10% of novel transcripts. We developed a new computational package, LORALS, to analyse the genetic effects of rare and common variants on the transcriptome by allele-specific analysis of long reads. We characterized allele-specific expression and transcript structure events, providing new insights into the specific transcript alterations caused by common and rare genetic variants and highlighting the resolution gained from long-read data. We were able to perturb the transcript structure upon knockdown of PTBP1, an RNA binding protein that mediates splicing, thereby finding genetic regulatory effects that are modified by the cellular environment. Finally, we used this dataset to enhance variant interpretation and study rare variants leading to aberrant splicing patterns.


Subject(s)
Alleles , Gene Expression Profiling , Organ Specificity , RNA-Seq , Transcriptome , Alternative Splicing/genetics , Cell Line , Datasets as Topic , Genotype , Heterogeneous-Nuclear Ribonucleoproteins/deficiency , Heterogeneous-Nuclear Ribonucleoproteins/genetics , Humans , Organ Specificity/genetics , Polypyrimidine Tract-Binding Protein/deficiency , Polypyrimidine Tract-Binding Protein/genetics , Reproducibility of Results , Transcriptome/genetics
14.
Nat Biotechnol ; 40(10): 1488-1499, 2022 10.
Article in English | MEDLINE | ID: mdl-35637420

ABSTRACT

High-order three-dimensional (3D) interactions between more than two genomic loci are common in human chromatin, but their role in gene regulation is unclear. Previous high-order 3D chromatin assays either measure distant interactions across the genome or proximal interactions at selected targets. To address this gap, we developed Pore-C, which combines chromatin conformation capture with nanopore sequencing of concatemers to profile proximal high-order chromatin contacts at the genome scale. We also developed the statistical method Chromunity to identify sets of genomic loci with frequencies of high-order contacts significantly higher than background ('synergies'). Applying these methods to human cell lines, we found that synergies were enriched in enhancers and promoters in active chromatin and in highly transcribed and lineage-defining genes. In prostate cancer cells, these included binding sites of androgen-driven transcription factors and the promoters of androgen-regulated genes. Concatemers of high-order contacts in highly expressed genes were demethylated relative to pairwise contacts at the same loci. Synergies in breast cancer cells were associated with tyfonas, a class of complex DNA amplicons. These results rigorously link genome-wide high-order 3D interactions to lineage-defining transcriptional programs and establish Pore-C and Chromunity as scalable approaches to assess high-order genome structure.


Subject(s)
Nanopore Sequencing , Nanopores , Androgens , Chromatin/genetics , Humans , Transcription Factors/genetics
15.
Proc Natl Acad Sci U S A ; 118(37), 2021 09 14.
Article in English | MEDLINE | ID: mdl-34497122

ABSTRACT

Some of the most spectacular adaptive radiations begin with founder populations on remote islands. How genetically limited founder populations give rise to the striking phenotypic and ecological diversity characteristic of adaptive radiations is a paradox of evolutionary biology. We conducted an evolutionary genomics analysis of genus Metrosideros, a landscape-dominant, incipient adaptive radiation of woody plants that spans a striking range of phenotypes and environments across the Hawaiian Islands. Using nanopore-sequencing, we created a chromosome-level genome assembly for Metrosideros polymorpha var. incana and analyzed whole-genome sequences of 131 individuals from 11 taxa sampled across the islands. Demographic modeling and population genomics analyses suggested that Hawaiian Metrosideros originated from a single colonization event and subsequently spread across the archipelago following the formation of new islands. The evolutionary history of Hawaiian Metrosideros shows evidence of extensive reticulation associated with significant sharing of ancestral variation between taxa and secondarily with admixture. Taking advantage of the highly contiguous genome assembly, we investigated the genomic architecture underlying the adaptive radiation and discovered that divergent selection drove the formation of differentiation outliers in paired taxa representing early stages of speciation/divergence. Analysis of the evolutionary origins of the outlier single nucleotide polymorphisms (SNPs) showed enrichment for ancestral variations under divergent selection. Our findings suggest that Hawaiian Metrosideros possesses an unexpectedly rich pool of ancestral genetic variation, and the reassortment of these variations has fueled the island adaptive radiation.


Subject(s)
Adaptation, Physiological , Evolution, Molecular , Genetic Speciation , Myrtaceae/physiology , Polymorphism, Genetic , Radiation Tolerance , Radiation, Ionizing , Genetics, Population , Myrtaceae/radiation effects , Phenotype
16.
Genes (Basel) ; 12(1), 2021 01 16.
Article in English | MEDLINE | ID: mdl-33467183

ABSTRACT

For the past two decades, microbial monitoring of the International Space Station (ISS) has relied on culture-dependent methods that require return to Earth for analysis. This has a number of limitations, with the most significant being bias towards the detection of culturable organisms and the inherent delay between sample collection and ground-based analysis. In recent years, portable and easy-to-use molecular-based tools, such as Oxford Nanopore Technologies' MinION™ sequencer and miniPCR bio's miniPCR™ thermal cycler, have been validated onboard the ISS. Here, we report on the development, validation, and implementation of a swab-to-sequencer method that provides a culture-independent solution to real-time microbial profiling onboard the ISS. Method development focused on analysis of swabs collected in a low-biomass environment with limited facility resources and stringent controls on allowed processes and reagents. ISS-optimized procedures included enzymatic DNA extraction from a swab tip, bead-based purifications, altered buffers, and the use of miniPCR and the MinION. Validation was conducted through extensive ground-based assessments comparing current standard culture-dependent and newly developed culture-independent methods. Similar microbial distributions were observed between the two methods; however, as expected, the culture-independent data revealed microbial profiles with greater diversity. Protocol optimization and verification was established during NASA Extreme Environment Mission Operations (NEEMO) analog missions 21 and 22, respectively. Unique microbial profiles obtained from analog testing validated the swab-to-sequencer method in an extreme environment. Finally, four independent swab-to-sequencer experiments were conducted onboard the ISS by two crewmembers. Microorganisms identified from ISS swabs were consistent with historical culture-based data, and primarily consisted of commonly observed human-associated microbes. 
This simplified method has been streamlined for high ease-of-use for a non-trained crew to complete in an extreme environment, thereby enabling environmental and human health diagnostics in real-time as future missions take us beyond low-Earth orbit.


Subject(s)
Bacteria/genetics , DNA, Bacterial/genetics , Nanopore Sequencing , Sequence Analysis, DNA , Spacecraft , Specimen Handling , Humans
17.
Genome Res ; 30(3): 437-446, 2020 03.
Article in English | MEDLINE | ID: mdl-32075851

ABSTRACT

Viruses are the most abundant biological entities on Earth and play key roles in host ecology, evolution, and horizontal gene transfer. Despite recent progress in viral metagenomics, the inherent genetic complexity of virus populations still poses technical difficulties for recovering complete virus genomes from natural assemblages. To address these challenges, we developed an assembly-free, single-molecule nanopore sequencing approach, enabling direct recovery of complete virus genome sequences from environmental samples. Our method yielded thousands of full-length, high-quality draft virus genome sequences that were not recovered using standard short-read assembly approaches. Additionally, our analyses discriminated between populations whose genomes had identical direct terminal repeats versus those with circularly permuted repeats at their termini, thus providing new insight into native virus reproduction and genome packaging. Novel DNA sequences were discovered, whose repeat structures, gene contents, and concatemer lengths suggest they are phage-inducible chromosomal islands, which are packaged as concatemers in phage particles, with lengths that match the size ranges of co-occurring phage genomes. Our new virus sequencing strategy can provide previously unavailable information about the genome structures, population biology, and ecology of naturally occurring viruses and viral parasites.


Subject(s)
Genome, Viral , Nanopore Sequencing/methods , Bacteriophages/genetics , DNA Packaging , Metagenomics , Seawater/virology
18.
Genome Biol ; 21(1): 21, 2020 02 05.
Article in English | MEDLINE | ID: mdl-32019604

ABSTRACT

BACKGROUND: The circum-basmati group of cultivated Asian rice (Oryza sativa) contains many iconic varieties and is widespread in the Indian subcontinent. Despite its economic and cultural importance, a high-quality reference genome is currently lacking, and the group's evolutionary history is not fully resolved. To address these gaps, we use long-read nanopore sequencing and assemble the genomes of two circum-basmati rice varieties. RESULTS: We generate two high-quality, chromosome-level reference genomes that represent the 12 chromosomes of Oryza. The assemblies show a contig N50 of 6.32 Mb and 10.53 Mb for Basmati 334 and Dom Sufid, respectively. Using our highly contiguous assemblies, we characterize structural variations segregating across circum-basmati genomes. We discover repeat expansions not observed in japonica (the rice group most closely related to circum-basmati) as well as the presence and absence variants of over 20 Mb, one of which is a circum-basmati-specific deletion of a gene regulating awn length. We further detect strong evidence of admixture between the circum-basmati and circum-aus groups. This gene flow has its greatest effect on chromosome 10, causing both structural variation and single-nucleotide polymorphism to deviate from genome-wide history. Lastly, population genomic analysis of 78 circum-basmati varieties shows three major geographically structured genetic groups: Bhutan/Nepal, India/Bangladesh/Myanmar, and Iran/Pakistan. CONCLUSION: The availability of high-quality reference genomes allows functional and evolutionary genomic analyses providing genome-wide evidence for gene flow between circum-aus and circum-basmati, describes the nature of circum-basmati structural variation, and reveals the presence/absence variation in this important and iconic rice variety group.
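
The contig N50 quoted in this abstract is a standard contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly. The following is a minimal sketch of that definition only; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the article or its software.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running sum first reaches half of the total.
    """
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (arbitrary units), not data from the article.
toy_contigs = [10, 8, 6, 4, 2]
print(n50(toy_contigs))
```

For the toy input the total is 30, and the two longest contigs (10 + 8 = 18) already cover half of it, so the function returns 8.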


Subject(s)
Nanopore Sequencing/methods , Oryza/genetics , Whole Genome Sequencing/methods , Chromosomes, Plant/genetics , Contig Mapping/methods , Evolution, Molecular , Genome, Plant , Oryza/classification , Phylogeny
19.
Genes (Basel) ; 11(1), 2020 01 09.
Article in English | MEDLINE | ID: mdl-31936690

ABSTRACT

The MinION sequencer has made in situ sequencing feasible in remote locations. Following our initial demonstration of its high performance off planet with Earth-prepared samples, we developed and tested an end-to-end, sample-to-sequencer process that could be conducted entirely aboard the International Space Station (ISS). Initial experiments demonstrated the process with a microbial mock community standard. The DNA was successfully amplified, primers were degraded, and libraries prepared and sequenced. The median percent identities for both datasets were 84%, as assessed from alignment of the mock community. The ability to correctly identify the organisms in the mock community standard was comparable for the sequencing data obtained in flight and on the ground. To validate the process on microbes collected from and cultured aboard the ISS, bacterial cells were selected from a NASA Environmental Health Systems Surface Sample Kit contact slide. The locations of bacterial colonies chosen for identification were labeled, and a small number of cells were directly added as input into the sequencing workflow. Prepared DNA was sequenced, and the data were downlinked to Earth. Return of the contact slide to the ground allowed for standard laboratory processing for bacterial identification. The identifications obtained aboard the ISS, Staphylococcus hominis and Staphylococcus capitis, matched those determined on the ground down to the species level. This marks the first ever identification of microbes entirely off Earth, and this validated process could be used for in-flight microbial identification, diagnosis of infectious disease in a crewmember, and as a research platform for investigators around the world.
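
The median percent identity reported in this abstract summarizes how closely individual reads match their reference after alignment. The sketch below shows one common way such a figure can be computed, assuming identity is defined as matching bases divided by all alignment columns (matches, mismatches, insertions, and deletions). The abstract does not state which definition the authors used, and the function names and per-read counts here are hypothetical.

```python
from statistics import median


def percent_identity(matches, mismatches, insertions, deletions):
    """Percent identity of one alignment, counting gaps as alignment columns."""
    columns = matches + mismatches + insertions + deletions
    if columns == 0:
        raise ValueError("alignment has no columns")
    return 100.0 * matches / columns


def median_percent_identity(reads):
    """Median of per-read percent identities.

    `reads` is an iterable of (matches, mismatches, insertions, deletions).
    """
    return median(percent_identity(*read) for read in reads)


# Hypothetical per-read alignment counts, not data from the article.
toy_reads = [(84, 10, 3, 3), (90, 5, 3, 2), (80, 12, 4, 4)]
print(median_percent_identity(toy_reads))
```

Other definitions (for example, excluding gap columns) give different values for the same alignments, so the definition should be stated alongside any reported figure.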


Subject(s)
Nanopore Sequencing/methods , RNA, Ribosomal, 16S/genetics , Specimen Handling/methods , Bacteria/genetics , DNA, Bacterial/genetics , DNA, Ribosomal/genetics , Exobiology/methods , Extraterrestrial Environment , Genome, Bacterial/genetics , Microbiota/genetics , Nanopores , Sequence Analysis, DNA/methods , Spacecraft/instrumentation
20.
ISME J ; 14(3): 727-739, 2020 03.
Article in English | MEDLINE | ID: mdl-31822788

ABSTRACT

Acanthamoeba-infecting Mimiviridae are giant viruses with dsDNA genome up to 1.5 Mb. They build viral factories in the host cytoplasm in which the nuclear-like virus-encoded functions take place. They are themselves the target of infections by 20-kb-dsDNA virophages, replicating in the giant virus factories and can also be found associated with 7-kb-DNA episomes, dubbed transpovirons. Here we isolated a virophage (Zamilon vitis) and two transpovirons respectively associated to B- and C-clade mimiviruses. We found that the virophage could transfer each transpoviron provided the host viruses were devoid of a resident transpoviron (permissive effect). If not, only the resident transpoviron originally isolated from the corresponding virus was replicated and propagated within the virophage progeny (dominance effect). Although B- and C-clade viruses devoid of transpoviron could replicate each transpoviron, they did it with a lower efficiency across clades, suggesting an ongoing process of adaptive co-evolution. We analysed the proteomes of host viruses and virophage particles in search of proteins involved in this adaptation process. This study also highlights a unique example of intricate commensalism in the viral world, where the transpoviron uses the virophage to propagate and where the Zamilon virophage and the transpoviron depend on the giant virus to replicate, without affecting its infectious cycle.


Subject(s)
Acanthamoeba/virology , Mimiviridae/physiology , Giant Viruses/genetics , Giant Viruses/physiology , Mimiviridae/genetics , Mimiviridae/growth & development , Mimiviridae/isolation & purification , Symbiosis , Virophages/genetics , Virophages/physiology