1.
Gigascience ; 11, 2022 03 24.
Article in English | MEDLINE | ID: mdl-35333302

ABSTRACT

BACKGROUND: Cassava (Manihot esculenta) is an important clonally propagated food crop in tropical and subtropical regions worldwide. Genetic gain by molecular breeding has been limited, partially because cassava is a highly heterozygous crop with a repetitive and difficult-to-assemble genome. FINDINGS: Here we demonstrate that Pacific Biosciences high-fidelity (HiFi) sequencing reads, in combination with the assembler hifiasm, produced genome assemblies at near complete haplotype resolution with higher continuity and accuracy compared to conventional long sequencing reads. We present 2 chromosome-scale haploid genomes phased with Hi-C technology for the diploid African cassava variety TME204. With consensus accuracy >QV46, contig N50 >18 Mb, BUSCO completeness of 99%, and 35k phased gene loci, it is the most accurate, continuous, complete, and haplotype-resolved cassava genome assembly so far. Ab initio gene prediction with RNA-seq data and Iso-Seq transcripts identified abundant novel gene loci, with enriched functionality related to chromatin organization, meristem development, and cell responses. During tissue development, differentially expressed transcripts of different haplotype origins were enriched for different functionality. In each tissue, 20-30% of transcripts showed allele-specific expression (ASE) differences. ASE bias was often tissue specific and inconsistent across different tissues. Direction-shifting was observed in <2% of the ASE transcripts. Despite high gene synteny, the HiFi genome assembly revealed extensive chromosome rearrangements and abundant intra-genomic and inter-genomic divergent sequences, with large structural variations mostly related to LTR retrotransposons. We use the reference-quality assemblies to build a cassava pan-genome and demonstrate its importance in representing the genetic diversity of cassava for downstream reference-guided omics analysis and breeding. 
CONCLUSIONS: The phased and annotated chromosome pairs allow a systematic view of the heterozygous diploid genome organization in cassava with improved accuracy, completeness, and haplotype resolution. They will be a valuable resource for cassava breeding and research. Our study may also provide insights into developing cost-effective and efficient strategies for resolving complex genomes with high resolution, accuracy, and continuity.
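
The assembly metrics quoted in this abstract (contig N50 and consensus QV) can be made concrete with a minimal sketch. It assumes the standard definitions: N50 is the contig length at which the running sum of length-sorted contigs reaches half the assembly, and QV is a Phred-scaled consensus error rate. The function names and the toy contig lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


def qv_to_error_rate(qv):
    """Convert a Phred-scaled consensus quality value to a per-base error rate."""
    return 10 ** (-qv / 10)


# Hypothetical contig lengths, for illustration only (not data from the paper).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
print(qv_to_error_rate(46))
```

Under these definitions, a consensus accuracy of QV46 corresponds to roughly 2.5 expected errors per 100,000 bases.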


Subject(s)
Manihot , Alleles , Chromosomes , Diploidy , Haplotypes , Manihot/genetics , Plant Breeding , Sequence Analysis, DNA , Transcriptome
2.
Epidemics ; 37: 100480, 2021 12.
Article in English | MEDLINE | ID: mdl-34488035

ABSTRACT

BACKGROUND: In December 2020, the United Kingdom (UK) reported a SARS-CoV-2 Variant of Concern (VoC) which is now named B.1.1.7. Based on initial data from the UK and later data from other countries, this variant was estimated to have a transmission fitness advantage of around 40-80 % (Volz et al., 2021; Leung et al., 2021; Davies et al., 2021). AIM: This study aims to estimate the transmission fitness advantage and the effective reproductive number of B.1.1.7 through time based on data from Switzerland. METHODS: We generated whole genome sequences from 11.8 % of all confirmed SARS-CoV-2 cases in Switzerland between 14 December 2020 and 11 March 2021. Based on these data, we determine the daily frequency of the B.1.1.7 variant and quantify the variant's transmission fitness advantage on a national and a regional scale. RESULTS: We estimate B.1.1.7 had a transmission fitness advantage of 43-52 % compared to the other variants circulating in Switzerland during the study period. Further, we estimate B.1.1.7 had a reproductive number above 1 from 01 January 2021 until the end of the study period, compared to below 1 for the other variants. Specifically, we estimate the reproductive number for B.1.1.7 was 1.24 [1.07-1.41] from 01 January until 17 January 2021 and 1.18 [1.06-1.30] from 18 January until 01 March 2021 based on the whole genome sequencing data. From 10 March to 16 March 2021, once B.1.1.7 was dominant, we estimate the reproductive number was 1.14 [1.00-1.26] based on all confirmed cases. For reference, Switzerland applied more non-pharmaceutical interventions to combat SARS-CoV-2 on 18 January 2021 and lifted some measures again on 01 March 2021. CONCLUSION: The observed increase in B.1.1.7 frequency in Switzerland during the study period is as expected based on observations in the UK. In absolute numbers, B.1.1.7 increased exponentially with an estimated doubling time of around 2-3.5 weeks. 
To monitor the ongoing spread of B.1.1.7, our plots are available online.
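
The doubling time reported above can be related to an exponential growth rate with a short sketch. It assumes simple exponential growth, N(t) = N0 * exp(r * t), so that the doubling time is ln(2) / r. The function names and the starting count of 100 cases are hypothetical, and this is not the estimation procedure used in the study.

```python
import math


def growth_rate(doubling_time_days):
    """Exponential growth rate per day implied by a given doubling time."""
    return math.log(2) / doubling_time_days


def doubling_time(growth_rate_per_day):
    """Doubling time in days implied by an exponential growth rate per day."""
    return math.log(2) / growth_rate_per_day


def projected_count(initial, growth_rate_per_day, days):
    """Case count after `days` under pure exponential growth."""
    return initial * math.exp(growth_rate_per_day * days)


# The abstract reports a doubling time of around 2-3.5 weeks (14 to 24.5 days).
for days in (14, 24.5):
    r = growth_rate(days)
    # Hypothetical starting count of 100 cases, projected four weeks ahead.
    print(days, round(r, 4), round(projected_count(100, r, 28), 1))
```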


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , Switzerland/epidemiology , United Kingdom
3.
Gene ; 793: 145748, 2021 Aug 15.
Article in English | MEDLINE | ID: mdl-34077775

ABSTRACT

The rice root-knot nematode Meloidogyne graminicola is a major biotic stress for the rice crop under upland, rain-fed lowland and irrigated cultivation conditions. Here, we present an improved draft genome assembly of M. graminicola IARI strain using the long-read sequencing approach (PacBio Sequel platform). The assembled genome size was 36.86 Mb with 514 contigs and N50 value of 105 kb. BUSCO estimated the genome to be 88.6% complete. Meloidogyne graminicola genome contained 17.83% repeat elements and showed 14,062 protein-coding gene models, 4,974 conserved orthologous genes, 561 putative secreted proteins, 49 RNAi pathway genes, 1,853 proteins involved in pathogen-host interactions, 1,575 carbohydrate-active enzymes, and 32,138 microsatellites. Five of the carbohydrate-active enzymes were found only in M. graminicola genome and were not present in any other analysed root-knot nematode genome. Together with the previous two genome assemblies, this improved genome assembly would facilitate comparative and functional genomics for M. graminicola.


Subject(s)
Genes, Helminth , Genome, Helminth , Helminth Proteins/genetics , Oryza/parasitology , Tylenchoidea/genetics , Animals , Gene Ontology , Genome Size , Helminth Proteins/classification , High-Throughput Nucleotide Sequencing/methods , Microsatellite Repeats , Molecular Sequence Annotation , Open Reading Frames , Phylogeny , Plant Diseases/parasitology , Tylenchoidea/classification
4.
Sci Rep ; 9(1): 16444, 2019 11 11.
Article in English | MEDLINE | ID: mdl-31712730

ABSTRACT

Pseudoalteromonas haloplanktis TAC125 is among the most commonly studied bacteria adapted to cold environments. Aside from its ecological relevance, P. haloplanktis has a potential use for biotechnological applications. Due to its importance, we decided to take advantage of next generation sequencing (Illumina) and third generation sequencing (PacBio and Oxford Nanopore) technologies to resequence its genome. The availability of a reference genome, obtained using whole genome shotgun sequencing, allowed us to study and compare the results obtained by the different technologies and draw useful conclusions for future de novo genome assembly projects. We found that assembly polishing using Illumina reads is needed to achieve a consensus accuracy over 99.9% when using Oxford Nanopore sequencing, but not in PacBio sequencing. However, the dependency of consensus accuracy on coverage is lower in Oxford Nanopore than in PacBio, suggesting that a cost-effective solution might be the use of low coverage Oxford Nanopore sequencing together with Illumina reads. Despite the differences in consensus accuracy, all sequencing technologies revealed the presence of a large plasmid, pMEGA, which was undiscovered until now. Among the most interesting features of pMEGA is the presence of a putative error-prone polymerase regulated through the SOS response. Aside from the characterization of the newly discovered plasmid, we confirmed the sequence of the small plasmid pMtBL and uncovered the presence of a potential partitioning system. Crucially, this study shows that the combination of next and third generation sequencing technologies gives us an unprecedented opportunity to characterize our bacterial model organisms at a very detailed level.


Subject(s)
Genome, Bacterial , Genomics , Gram-Negative Bacterial Infections/microbiology , High-Throughput Nucleotide Sequencing , Pseudoalteromonas/genetics , Aquatic Organisms , Computational Biology/methods , Genomics/methods , Molecular Sequence Annotation , Water Microbiology
5.
BMC Biol ; 17(1): 75, 2019 09 18.
Article in English | MEDLINE | ID: mdl-31533702

ABSTRACT

BACKGROUND: Cassava is an important food crop in tropical and sub-tropical regions worldwide. In Africa, cassava production is widely affected by cassava mosaic disease (CMD), which is caused by the African cassava mosaic geminivirus that is transmitted by whiteflies. Cassava breeders often use a single locus, CMD2, for introducing CMD resistance into susceptible cultivars. The CMD2 locus has been genetically mapped to a 10-Mbp region, but its organization and genes as well as their functions are unknown. RESULTS: We report haplotype-resolved de novo assemblies and annotations of the genomes for the African cassava cultivar TME (tropical Manihot esculenta), which is the origin of CMD2, and the CMD-susceptible cultivar 60444. The assemblies provide phased haplotype information for over 80% of the genomes. Haplotype comparison identified novel features previously hidden in collapsed and fragmented cassava genomes, including thousands of allelic variants, inter-haplotype diversity in coding regions, and patterns of diversification through allele-specific expression. Reconstruction of the CMD2 locus revealed a highly complex region with nearly identical gene sets but limited microsynteny between the two cultivars. CONCLUSIONS: The genome maps of the CMD2 locus in both 60444 and TME3, together with the newly annotated genes, will help the identification of the causal genetic basis of CMD2 resistance to geminiviruses. Our de novo cassava genome assemblies will also facilitate genetic mapping approaches to narrow the large CMD2 region to a few candidate genes for better informed strategies to develop robust geminivirus resistance in susceptible cassava cultivars.


Subject(s)
Disease Resistance/genetics , Haplotypes/genetics , Manihot/genetics , Plant Diseases/genetics , Chromosome Mapping/methods , Disease Susceptibility , Geminiviridae , Genetic Predisposition to Disease , Molecular Sequence Annotation
6.
Mol Cell Biol ; 39(23), 2019 12 01.
Article in English | MEDLINE | ID: mdl-31548262

ABSTRACT

The enhancer/promoter of the vitellogenin II gene (VTG) has been extensively studied as a model system of vertebrate transcriptional control. While deletion mutagenesis and in vivo footprinting identified the transcription factor (TF) binding sites governing its tissue specificity, DNase hypersensitivity and DNA methylation studies revealed the epigenetic changes accompanying its hormone-dependent activation. Moreover, upon induction with estrogen (E2), the region flanking the estrogen-responsive element (ERE) was reported to undergo active DNA demethylation. We now show that although the VTG ERE is methylated in embryonic chicken liver and in LMH/2A hepatocytes, its induction by E2 was not accompanied by extensive demethylation. In contrast, E2 failed to activate a VTG enhancer/promoter-controlled luciferase reporter gene methylated by SssI. Surprisingly, this inducibility difference could be traced not to the ERE but rather to a single CpG in an E-box (CACGTG) sequence upstream of the VTG TATA box, which is unmethylated in vivo but methylated by SssI. We demonstrate that this E-box binds the upstream stimulating factor USF1/2. Selective methylation of the CpG within this binding site with an E-box-specific DNA methyltransferase, Eco72IM, was sufficient to attenuate USF1/2 binding in vitro and abolish the hormone-induced transcription of the VTG gene in the reporter system.


Subject(s)
Ectopic Gene Expression/genetics , Estrogen Receptor alpha/genetics , Vitellogenins/genetics , Animals , Binding Sites , Cell Line , Chick Embryo , CpG Islands/genetics , DNA Methylation/genetics , DNA-Binding Proteins/metabolism , DNA-Cytosine Methylases/metabolism , Ectopic Gene Expression/drug effects , Estrogen Receptor alpha/metabolism , Estrogens/metabolism , Gene Expression Regulation/drug effects , Genes, Reporter , Humans , Promoter Regions, Genetic/drug effects , Promoter Regions, Genetic/genetics , Regulatory Sequences, Nucleic Acid , Transcription Factors/metabolism , Vitellogenins/metabolism
7.
Nat Commun ; 10(1): 3359, 2019 07 31.
Article in English | MEDLINE | ID: mdl-31366910

ABSTRACT

A platform for highly parallel direct sequencing of native RNA strands was recently described by Oxford Nanopore Technologies, but despite initial efforts it remains crucial to further investigate the technology for quantification of complex transcriptomes. Here we undertake native RNA sequencing of polyA+ RNA from two human cell lines, analysing ~5.2 million aligned native RNA reads. To enable informative comparisons, we also perform relevant ONT direct cDNA- and Illumina-sequencing. We find that while native RNA sequencing does enable some of the anticipated advantages, key unexpected aspects currently hamper its performance, most notably the quite frequent inability to obtain full-length transcripts from single reads, as well as difficulty in unambiguously inferring their true transcript of origin. While characterising issues that need to be addressed when investigating more complex transcriptomes, our study highlights that with some defined improvements, native RNA sequencing could be an important addition to the mammalian transcriptomics toolbox.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Transcriptome/genetics , Base Sequence/genetics , Cell Line , DNA, Complementary/genetics , HEK293 Cells , Humans , Poly A/genetics
8.
Nucleic Acids Res ; 47(2): e9, 2019 01 25.
Article in English | MEDLINE | ID: mdl-30357413

ABSTRACT

We present a new method, CIDER-Seq (Circular DNA Enrichment sequencing) for the unbiased enrichment and long-read sequencing of viral-sized circular DNA molecules. We used CIDER-Seq to produce single-read full-length virus genomes for the first time. CIDER-Seq combines PCR-free virus enrichment with Single Molecule Real Time sequencing and a new sequence de-concatenation algorithm. We apply our technique to produce >1200 full-length, highly accurate geminivirus genomes from RNAi-transgenic and control plants in a field trial in Kenya. Using CIDER-Seq we can demonstrate for the first time that the expression of antiviral double-stranded RNA (dsRNA) in transgenic plants causes a consistent shift in virus populations towards species sharing low homology to the transgene derived dsRNA. Our method and its application in an economically important crop plant open new possibilities in periodic virus sequence surveillance and accurate profiling of diverse circular DNA elements.


Subject(s)
DNA, Circular/chemistry , DNA, Viral/chemistry , Geminiviridae/genetics , Genome, Viral , High-Throughput Nucleotide Sequencing/methods , Plants, Genetically Modified/virology , Sequence Analysis, DNA/methods , Algorithms , Plants, Genetically Modified/genetics , RNA Interference
9.
Nucleic Acids Res ; 46(17): 8953-8965, 2018 09 28.
Article in English | MEDLINE | ID: mdl-30137508

ABSTRACT

Generating a complete, de novo genome assembly for prokaryotes is often considered a solved problem. However, we here show that Pseudomonas koreensis P19E3 harbors multiple, near identical repeat pairs up to 70 kilobase pairs in length, which contained several genes that may confer fitness advantages to the strain. Its complex genome, which also included a variable shufflon region, could not be de novo assembled with long reads produced by Pacific Biosciences' technology, but required very long reads from Oxford Nanopore Technologies. Importantly, a repeat analysis, whose results we release for over 9600 prokaryotes, indicated that very complex bacterial genomes represent a general phenomenon beyond Pseudomonas. Roughly 10% of 9331 complete bacterial and a handful of 293 complete archaeal genomes represented this 'dark matter' for de novo genome assembly of prokaryotes. Several of these 'dark matter' genome assemblies contained repeats far beyond the resolution of the sequencing technology employed and likely contain errors; other genomes were closed using labor-intensive steps such as cosmid libraries, primer walking or optical mapping. Using very long sequencing reads in combination with assembly algorithms capable of resolving long, near identical repeats will bring most prokaryotic genomes within reach of fast and complete de novo genome assembly.


Subject(s)
Algorithms , Chromosome Mapping/methods , DNA, Bacterial/chemistry , Genome, Bacterial , Microsatellite Repeats , Pseudomonas/genetics , DNA, Bacterial/genetics , DNA, Bacterial/metabolism , Gene Ontology , Genetic Fitness , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Origanum/microbiology , Phylogeny , Plant Leaves/microbiology , Pseudomonas/classification , Pseudomonas/isolation & purification , Pseudomonas/metabolism , Pseudomonas aeruginosa/classification , Pseudomonas aeruginosa/genetics , Pseudomonas aeruginosa/isolation & purification , Pseudomonas aeruginosa/metabolism , Pseudomonas putida/classification , Pseudomonas putida/genetics , Pseudomonas putida/isolation & purification , Pseudomonas putida/metabolism
11.
DNA Res ; 25(1): 39-47, 2018 Feb 01.
Article in English | MEDLINE | ID: mdl-28985356

ABSTRACT

Finger millet (Eleusine coracana (L.) Gaertn) is an important crop for food security because of its tolerance to drought, which is expected to be exacerbated by global climate changes. Nevertheless, it is often classified as an orphan/underutilized crop because of the paucity of scientific attention. Among several small millets, finger millet is considered as an excellent source of essential nutrient elements, such as iron and zinc; hence, it has potential as an alternate coarse cereal. However, high-quality genome sequence data of finger millet are currently not available. One of the major problems encountered in the genome assembly of this species was its polyploidy, which hampers genome assembly compared with a diploid genome. To overcome this problem, we sequenced its genome using diverse technologies with sufficient coverage and assembled it via a novel multiple hybrid assembly workflow that combines next-generation with single-molecule sequencing, followed by whole-genome optical mapping using the Bionano Irys® system. The total number of scaffolds was 1,897 with an N50 length >2.6 Mb and detection of 96% of the universal single-copy orthologs. The majority of the homeologs were assembled separately. This indicates that the proposed workflow is applicable to the assembly of other allotetraploid genomes.

12.
Nat Ecol Evol ; 1(12): 1931-1941, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29085064

ABSTRACT

Armillaria species are both devastating forest pathogens and some of the largest terrestrial organisms on Earth. They forage for hosts and achieve immense colony sizes via rhizomorphs, root-like multicellular structures of clonal dispersal. Here, we sequenced and analysed the genomes of four Armillaria species and performed RNA sequencing and quantitative proteomic analysis on the invasive and reproductive developmental stages of A. ostoyae. Comparison with 22 related fungi revealed a significant genome expansion in Armillaria, affecting several pathogenicity-related genes, lignocellulose-degrading enzymes and lineage-specific genes expressed during rhizomorph development. Rhizomorphs express an evolutionarily young transcriptome that shares features with the transcriptomes of both fruiting bodies and vegetative mycelia. Several genes show concomitant upregulation in rhizomorphs and fruiting bodies and share cis-regulatory signatures in their promoters, providing genetic and regulatory insights into complex multicellularity in fungi. Our results suggest that the evolution of the unique dispersal and pathogenicity mechanisms of Armillaria might have drawn upon ancestral genetic toolkits for wood-decay, morphogenesis and complex multicellularity.


Subject(s)
Armillaria/genetics , Fungal Proteins/genetics , Genome, Fungal , Proteomics , Sequence Analysis, RNA , Species Specificity , Transcriptome
13.
Genome Res ; 27(12): 2083-2095, 2017 12.
Article in English | MEDLINE | ID: mdl-29141959

ABSTRACT

Accurate annotation of all protein-coding sequences (CDSs) is an essential prerequisite to fully exploit the rapidly growing repertoire of completely sequenced prokaryotic genomes. However, large discrepancies among the number of CDSs annotated by different resources, missed functional short open reading frames (sORFs), and overprediction of spurious ORFs represent serious limitations. Our strategy toward accurate and complete genome annotation consolidates CDSs from multiple reference annotation resources, ab initio gene prediction algorithms and in silico ORFs (a modified six-frame translation considering alternative start codons) in an integrated proteogenomics database (iPtgxDB) that covers the entire protein-coding potential of a prokaryotic genome. By extending the PeptideClassifier concept of unambiguous peptides for prokaryotes, close to 95% of the identifiable peptides imply one distinct protein, largely simplifying downstream analysis. Searching a comprehensive Bartonella henselae proteomics data set against such an iPtgxDB allowed us to unambiguously identify novel ORFs uniquely predicted by each resource, including lipoproteins, differentially expressed and membrane-localized proteins, novel start sites and wrongly annotated pseudogenes. Most novelties were confirmed by targeted, parallel reaction monitoring mass spectrometry, including unique ORFs and single amino acid variations (SAAVs) identified in a re-sequenced laboratory strain that are not present in its reference genome. We demonstrate the general applicability of our strategy for genomes with varying GC content and distinct taxonomic origin. We release iPtgxDBs for B. henselae, Bradyrhizobium diazoefficiens and Escherichia coli and the software to generate both proteogenomics search databases and integrated annotation files that can be viewed in a genome browser for any prokaryote.
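
The idea of in silico ORFs from a six-frame scan with alternative start codons can be illustrated with a minimal sketch. This is not the iPtgxDB software; the start-codon set (ATG, GTG, TTG), the minimum-length threshold, the function names, and the toy sequence are all assumptions chosen for illustration. For each stop codon, the sketch reports the longest ORF, i.e. the one beginning at the most upstream in-frame start.

```python
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed set of alternative starts
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame, min_codons):
    """Yield (start, end) for ORFs in one reading frame of one strand.

    An ORF runs from the most upstream start codon after the previous stop
    to the next in-frame stop codon (stop included in the reported span).
    """
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            # Keep the ORF only if it has enough codons before the stop.
            if start is not None and (i - start) // 3 >= min_codons:
                yield start, i + 3
            start = None
        elif start is None and codon in START_CODONS:
            start = i


def six_frame_orfs(sequence, min_codons=2):
    """Collect (strand, frame, orf_sequence) over all six reading frames."""
    seq = sequence.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame, min_codons):
                results.append((strand, frame, s[start:end]))
    return results


# Toy sequence (hypothetical): one ORF with a GTG start on the forward strand.
print(six_frame_orfs("CCGTGAAACCCTAGGG"))
```

Restricting `START_CODONS` to `{"ATG"}` would miss the ORF in this toy sequence, which is the kind of difference a scan with alternative start codons is meant to expose.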


Subject(s)
Bacterial Proteins/genetics , Bartonella henselae/genetics , Bradyrhizobium/genetics , Escherichia coli/genetics , Genome, Bacterial , Proteogenomics , Databases, Protein , Molecular Sequence Annotation , Open Reading Frames , Software
14.
PLoS Genet ; 12(12): e1006499, 2016 Dec.
Article in English | MEDLINE | ID: mdl-27997543

ABSTRACT

Heritable DNA methylation imprints are ubiquitous and underlie genetic variability from bacteria to humans. In microbial genomes, DNA methylation has been implicated in gene transcription, DNA replication and repair, nucleoid segregation, transposition and virulence of pathogenic strains. Despite the importance of local (hypo)methylation at specific loci, how and when these patterns are established during the cell cycle remains poorly characterized. Taking advantage of the small genomes and the synchronizability of α-proteobacteria, we discovered that conserved determinants of the cell cycle transcriptional circuitry establish specific hypomethylation patterns in the cell cycle model system Caulobacter crescentus. We used genome-wide methyl-N6-adenine (m6A-) analyses by restriction-enzyme-cleavage sequencing (REC-Seq) and single-molecule real-time (SMRT) sequencing to show that MucR, a transcriptional regulator that represses virulence and cell cycle genes in S-phase but no longer in G1-phase, occludes 5'-GANTC-3' sequence motifs that are methylated by the DNA adenine methyltransferase CcrM. Constitutive expression of CcrM or heterologous methylases in at least two different α-proteobacteria homogenizes m6A patterns even when MucR is present and affects promoter activity. Environmental stress (phosphate limitation) can override and reconfigure local hypomethylation patterns imposed by the cell cycle circuitry that dictate when and where local hypomethylation is instated.
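
Locating candidate 5'-GANTC-3' sites in a sequence can be sketched as a simple pattern scan. This is only an illustration of motif matching, not the REC-Seq or SMRT analysis used in the study; the function name and the toy sequence are hypothetical. Because GANTC is its own reverse complement, scanning one strand is enough to find sites on both.

```python
import re

# N in GANTC stands for any of the four bases.
GANTC = re.compile(r"GA[ACGT]TC")


def find_gantc_sites(sequence):
    """Return (0-based start, matched motif) for every GANTC site."""
    seq = sequence.upper()
    return [(m.start(), m.group(0)) for m in GANTC.finditer(seq)]


# Toy sequence, for illustration only.
toy = "TTGAATCCGGACTCAAGATTC"
for start, motif in find_gantc_sites(toy):
    # The adenine in the motif sits one base after the motif start.
    print(start, motif, "adenine at", start + 1)
```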


Subject(s)
Caulobacter crescentus/genetics , Cell Cycle/genetics , DNA Methylation/genetics , Transcription, Genetic , Cell Division/genetics , DNA Replication/drug effects , DNA Replication/genetics , Gene Expression Regulation, Bacterial , Genome, Microbial , Methyltransferases/genetics , Phosphates/metabolism , Promoter Regions, Genetic , Site-Specific DNA-Methyltransferase (Adenine-Specific)/genetics , Starvation/genetics , Starvation/metabolism
15.
New Phytol ; 212(3): 780-791, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27381250

ABSTRACT

Community analyses of arbuscular mycorrhizal fungi (AMF) using ribosomal small subunit (SSU) or internal transcribed spacer (ITS) DNA sequences often suffer from low resolution or coverage. We developed a novel sequencing based approach for a highly resolving and specific profiling of AMF communities. We took advantage of previously established AMF-specific PCR primers that amplify a c. 1.5-kb long fragment covering parts of SSU, ITS and parts of the large ribosomal subunit (LSU), and we sequenced the resulting amplicons with single molecule real-time (SMRT) sequencing. The method was applicable to soil and root samples, detected all major AMF families and successfully discriminated closely related AMF species, which would not be discernible using SSU sequences. In inoculation tests we could trace the introduced AMF inoculum at the molecular level. One of the introduced strains almost replaced the local strain(s), revealing that AMF inoculation can have a profound impact on the native community. The methodology presented offers researchers a powerful new tool for AMF community analysis because it unifies improved specificity and enhanced resolution, whereas the drawback of medium sequencing throughput appears of lesser importance for low-diversity groups such as AMF.


Subject(s)
Glomeromycota/physiology , Mycorrhizae/physiology , DNA, Fungal/genetics , Operon/genetics , RNA, Ribosomal/genetics , Sequence Analysis, DNA , Soil Microbiology
16.
Nucleic Acids Res ; 43(11): e76, 2015 Jun 23.
Article in English | MEDLINE | ID: mdl-25820422

ABSTRACT

Whole exome sequencing (WES) is increasingly used in research and diagnostics. WES users expect coverage of the entire coding region of known genes as well as sufficient read depth for the covered regions. It is, however, unknown which recent WES platform is most suitable to meet these expectations. We present insights into the performance of the most recent standard exome enrichment platforms from Agilent, NimbleGen and Illumina applied to six different DNA samples by two sequencing vendors per platform. Our results suggest that both Agilent and NimbleGen overall perform better than Illumina and that the high enrichment performance of Agilent is stable among samples and between vendors, whereas NimbleGen is only able to achieve vendor- and sample-specific best exome coverage. Moreover, the recent Agilent platform overall captures more coding exons with sufficient read depth than NimbleGen and Illumina. Due to considerable gaps in effective exome coverage, however, the three platforms cannot capture all known coding exons alone or in combination, requiring improvement. Our data emphasize the importance of evaluation of updated platform versions and suggest that enrichment-free whole genome sequencing can overcome the limitations of WES in sufficiently covering coding exons, especially GC-rich regions, and in characterizing structural variants.


Subject(s)
Exome , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Alleles , Base Composition , DNA/chemistry , Humans
17.
Appl Transl Genom ; 7: 32-9, 2015 Dec.
Article in English | MEDLINE | ID: mdl-27054083

ABSTRACT

Colorectal cancer (CRC) represents one of the most prevalent and lethal malignant neoplasms and every individual of age 50 and above should undergo regular CRC screening. Currently, the most effective preventive screening procedure to detect adenomatous polyps, the precursors to CRC, is colonoscopy. Since every colorectal cancer starts as a polyp, detecting all polyps and removing them is crucial. By doing exactly that, colonoscopy reduces CRC incidence by 80%; however, it is an invasive procedure that might have unpleasant and, on rare occasions, dangerous side effects. Despite numerous efforts over the past two decades, a non-invasive screening method for the general population with detection rates for adenomas and CRC similar to those of colonoscopy has not yet been established. Recent advances in next generation sequencing technologies have yet to be successfully applied to this problem, because the detection of rare mutations has been hindered by the systematic biases due to sequencing context and the base calling quality of NGS. We present the first study that applies the high read accuracy and depth of single molecule, real time, circular consensus sequencing (SMRT-CCS) to the detection of mutations in stool DNA in order to provide a non-invasive, sensitive and accurate test for CRC. In stool DNA isolated from patients diagnosed with adenocarcinoma, we are able to detect mutations at frequencies below 0.5% with no false positives. This approach establishes a foundation for a non-invasive, highly sensitive assay to screen the population for CRC and the early stage adenomas that lead to CRC.

18.
Neuron ; 84(2): 386-98, 2014 Oct 22.
Article in English | MEDLINE | ID: mdl-25284007

ABSTRACT

Molecular diversity of surface receptors has been hypothesized to provide a mechanism for selective synaptic connectivity. Neurexins are highly diversified receptors that drive the morphological and functional differentiation of synapses. Using a single cDNA sequencing approach, we detected 1,364 unique neurexin-α and 37 neurexin-β mRNAs produced by alternative splicing of neurexin pre-mRNAs. This molecular diversity results from near-exhaustive combinatorial use of alternative splice insertions in Nrxn1α and Nrxn2α. By contrast, Nrxn3α exhibits several highly stereotyped exon selections that incorporate novel elements for posttranscriptional regulation of a subset of transcripts. Complexity of Nrxn1α repertoires correlates with the cellular complexity of neuronal tissues, and a specific subset of isoforms is enriched in a purified cell type. Our analysis defines the molecular diversity of a critical synaptic receptor and provides evidence that neurexin diversity is linked to cellular diversity in the nervous system.


Subject(s)
Alternative Splicing , Brain/metabolism , Exons/genetics , Nerve Tissue Proteins/genetics , RNA, Messenger/metabolism , Animals , Mice , Nerve Tissue Proteins/metabolism , Neurons/metabolism , Protein Isoforms/genetics , Protein Isoforms/metabolism , Synapses/metabolism
19.
Nucleic Acids Res ; 42(14): e115, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24972832

ABSTRACT

Next-generation sequencing (NGS) technologies enable new insights into the diversity of virus populations within their hosts. Diversity estimation is currently restricted to single-nucleotide variants or to local fragments of no more than a few hundred nucleotides defined by the length of sequence reads. To study complex heterogeneous virus populations comprehensively, novel methods are required that allow for complete reconstruction of the individual viral haplotypes. Here, we show that assembly of whole viral genomes of ∼8600 nucleotides length is feasible from mixtures of heterogeneous HIV-1 strains derived from defined combinations of cloned virus strains and from clinical samples of an HIV-1 superinfected individual. Haplotype reconstruction was achieved using optimized experimental protocols and computational methods for amplification, sequencing and assembly. We comparatively assessed the performance of the three NGS platforms 454 Life Sciences/Roche, Illumina and Pacific Biosciences for this task. Our results prove and delineate the feasibility of NGS-based full-length viral haplotype reconstruction and provide new tools for studying evolution and pathogenesis of viruses.


Subject(s)
Genetic Variation , HIV-1/genetics , Haplotypes , High-Throughput Nucleotide Sequencing/methods , Genome, Viral , HIV Infections/virology , Humans
20.
Nat Neurosci ; 17(3): 377-82, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24531307

ABSTRACT

The timing of daily circadian behavior can be highly variable among different individuals, and twin studies have suggested that about half of this variability is environmentally controlled. Similar plasticity can be seen in mice exposed to an altered lighting environment, for example, 22-h instead of 24-h, which stably alters the genetically determined period of circadian behavior for months. The mechanisms mediating these environmental influences are unknown. We found that transient exposure of mice to such lighting stably altered global transcription in the suprachiasmatic nucleus (SCN) of the hypothalamus (the master clock tissue regulating circadian behavior in mammals). In parallel, genome-wide methylation profiling revealed global alterations in promoter DNA methylation in the SCN that correlated with these changes. Behavioral, transcriptional and DNA methylation changes were reversible after prolonged re-entrainment to 24-h days. Notably, infusion of a methyltransferase inhibitor to the SCN suppressed period changes. We conclude that the SCN utilizes DNA methylation as a mechanism to drive circadian clock plasticity.


Subject(s)
Circadian Rhythm Signaling Peptides and Proteins/genetics , Circadian Rhythm Signaling Peptides and Proteins/metabolism , Circadian Rhythm/genetics , DNA Methylation/genetics , Neuronal Plasticity/physiology , Photoperiod , Actigraphy , Animals , Behavior, Animal/physiology , Mice , Mice, Inbred C57BL , Suprachiasmatic Nucleus/metabolism , Transcriptome/genetics