ABSTRACT
Fewer than half of individuals with a suspected Mendelian or monogenic condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control data sets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project (1KGP) Oxford Nanopore Technologies Sequencing Consortium aims to generate LRS data from at least 800 of the 1KGP samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37× and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.
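The read N50 quoted above is a length-weighted summary statistic: it is the read length L such that reads of length L or longer contain at least half of all sequenced bases. A minimal sketch of that calculation follows; the function name `read_n50` and the example lengths are illustrative assumptions, not part of the consortium's pipeline.

```python
def read_n50(lengths):
    """Return the N50 of a collection of read lengths.

    The N50 is the length L such that reads of length >= L
    together contain at least half of all sequenced bases.
    """
    if not lengths:
        raise ValueError("no read lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest read downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical read lengths in bases (illustration only).
example_lengths = [2000, 3000, 4000, 5000, 6000]
print(read_n50(example_lengths))
```

Because the statistic is weighted by bases rather than by read count, a small number of very long reads can raise the N50 well above the median read length.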
ABSTRACT
Fewer than half of individuals with a suspected Mendelian condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control datasets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project ONT Sequencing Consortium aims to generate LRS data from at least 800 of the 1000 Genomes Project samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37× and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.
ABSTRACT
Monitoring the spread of viral pathogens in the population during epidemics is crucial for mounting an effective public health response. Understanding the viral lineages that constitute the infections in a population can uncover the origins and transmission patterns of outbreaks and detect the emergence of novel variants that may impact the course of an epidemic. Population-level surveillance of viruses through genomic sequencing of wastewater captures unbiased lineage data, including cryptic asymptomatic and undiagnosed infections, and has been shown to detect infection outbreaks and novel variant emergence before detection in clinical samples. Here, we present an optimised protocol for quantification and sequencing of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in influent wastewater, used for high-throughput genomic surveillance in England during the COVID-19 pandemic. This protocol utilises reverse complement PCR for library preparation, enabling tiled amplification across the whole viral genome and sequencing adapter addition in a single step to enhance efficiency. Sequencing of synthetic SARS-CoV-2 RNA provided evidence validating the efficacy of this protocol, while data from high-throughput sequencing of wastewater samples demonstrated the sensitivity of this method. We also provide guidance on the quality control steps required during library preparation and data analysis. Overall, this represents an effective method for high-throughput sequencing of SARS-CoV-2 in wastewater which can be applied to other viruses and pathogens of humans and animals.
Subjects
COVID-19, SARS-CoV-2, Animals, Humans, SARS-CoV-2/genetics, Wastewater, Pandemics, Viral RNA/genetics, COVID-19/diagnosis, COVID-19/epidemiology, Polymerase Chain Reaction, Complement System Proteins, COVID-19 Testing
ABSTRACT
The ongoing SARS-CoV-2 pandemic demonstrates the utility of real-time sequence analysis in monitoring and surveillance of pathogens. However, cost-effective sequencing requires that samples be PCR amplified and multiplexed via barcoding onto a single flow cell, resulting in challenges with maximising and balancing coverage for each sample. To address this, we developed a real-time analysis pipeline to maximise flow cell performance and optimise sequencing time and costs for any amplicon-based sequencing. We extended our nanopore analysis platform MinoTour to incorporate ARTIC network bioinformatics analysis pipelines. MinoTour predicts which samples will reach sufficient coverage for downstream analysis and runs the ARTIC network's Medaka pipeline once sufficient coverage has been reached. We show that stopping a viral sequencing run earlier, at the point that sufficient data has become available, has no negative effect on subsequent downstream analysis. A separate tool, SwordFish, is used to automate adaptive sampling on Nanopore sequencers during the sequencing run. This enables normalisation of coverage both within samples (amplicons) and between samples (barcodes) on barcoded sequencing runs. We show that this process enriches under-represented samples and amplicons in a library as well as reducing the time taken to obtain complete genomes without affecting the consensus sequence.
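The idea of triggering downstream analysis once a barcoded sample has sufficient coverage can be illustrated with a simple threshold check. The sketch below is not MinoTour's actual logic: the function names, the 20× per-amplicon depth, and the 90% completeness cut-off are all assumptions chosen for illustration.

```python
def amplicon_completeness(depths, min_depth=20):
    """Fraction of amplicons whose read depth is at least min_depth."""
    if not depths:
        return 0.0
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


def barcodes_ready(coverage_by_barcode, min_depth=20, min_fraction=0.9):
    """Return the barcodes with enough well-covered amplicons.

    coverage_by_barcode maps a barcode name to a list of
    per-amplicon read depths. Both thresholds are hypothetical.
    """
    return sorted(
        barcode
        for barcode, depths in coverage_by_barcode.items()
        if amplicon_completeness(depths, min_depth) >= min_fraction
    )


# Hypothetical per-amplicon depths for two barcoded samples.
run_snapshot = {
    "barcode01": [25, 30, 40, 22],  # every amplicon at or above 20x
    "barcode02": [25, 5, 40, 3],    # two amplicons still under-covered
}
print(barcodes_ready(run_snapshot))
```

In a real-time setting a check of this kind would be re-evaluated as reads arrive, and the under-covered amplicons and barcodes it identifies are the ones that coverage normalisation by adaptive sampling would aim to enrich.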
ABSTRACT
The world has moved into a new stage of managing the SARS-CoV-2 pandemic with minimal restrictions and reduced testing in the population, leading to reduced genomic surveillance of virus variants in individuals. Wastewater-based epidemiology (WBE) can provide an alternative means of tracking virus variants in the population, but decision-makers require confidence that it can be applied at a national scale and is comparable to individual testing data. We analysed 19,911 samples from 524 wastewater sites across England, sampled at least twice a week between November 2021 and February 2022, capturing sewage from >70% of the English population. We used amplicon-based sequencing and the phylogeny-based de-mixing tool Freyja to estimate SARS-CoV-2 variant frequencies and compared these to the variant dynamics observed in individual testing data from clinical and community settings. We show that wastewater data can reconstruct the spread of the Omicron variant across England since November 2021 in close detail and that this reconstruction aligns closely with epidemiological estimates from individual testing data. We also show the temporal and spatial spread of Omicron within London. Our wastewater data further reliably track the transition between Omicron subvariants BA.1 and BA.2 in February 2022 at regional and national levels. Our demonstration that WBE can track the fast-paced dynamics of SARS-CoV-2 variant frequencies at a national scale and closely match individual testing data in time shows that WBE can reliably fill the monitoring gap left by reduced individual testing in a more affordable way.
Subjects
COVID-19, SARS-CoV-2, Humans, SARS-CoV-2/genetics, Wastewater, Wastewater-Based Epidemiological Monitoring, COVID-19/epidemiology, Genomics, England/epidemiology
ABSTRACT
Grass pea (Lathyrus sativus L.) is a rich source of protein cultivated as an insurance crop in Ethiopia, Eritrea, India, Bangladesh, and Nepal. Its resilience to both drought and flooding makes it a promising crop for ensuring food security in a changing climate. The lack of genetic resources and the crop's association with the disease neurolathyrism have limited the cultivation of grass pea. Here, we present an annotated, long-read-based assembly of the 6.5 Gbp L. sativus genome. Using this genome sequence, we have elucidated the biosynthetic pathway leading to the formation of the neurotoxin, β-L-oxalyl-2,3-diaminopropionic acid (β-L-ODAP). The final reaction of the pathway depends on an interaction between L. sativus acyl-activating enzyme 3 (LsAAE3) and a BAHD-acyltransferase (LsBOS) that form a metabolon activated by CoA to produce β-L-ODAP. This provides valuable insight into the best approaches for developing varieties which produce substantially less toxin.
Subjects
Diamino Amino Acids, Lathyrus, Lathyrus/genetics, Lathyrus/metabolism, Diamino Amino Acids/metabolism, Neurotoxins/metabolism, Genomics
ABSTRACT
Investigations of the human germline and programming are challenging because of limited access to embryonic material. However, the pig as a model may provide insights into transcriptional network and epigenetic reprogramming applicable to both species. Here we show that, during the pre- and early migratory stages, pig primordial germ cells (PGCs) initiate large-scale epigenomic reprogramming, including DNA demethylation involving TET-mediated hydroxylation and, potentially, base excision repair (BER). There is also macroH2A1 depletion and increased H3K27me3 as well as X chromosome reactivation (XCR) in females. Concomitantly, there is dampening of glycolytic metabolism genes and re-expression of some pluripotency genes like those in preimplantation embryos. We identified evolutionarily young transposable elements and gene coding regions resistant to DNA demethylation in acutely hypomethylated gonadal PGCs, with potential for transgenerational epigenetic inheritance. Detailed insights into the pig germline will likely contribute significantly to advances in human germline biology, including in vitro gametogenesis.
Subjects
DNA Methylation, DNA Transposable Elements, Genetic Epigenesis, Epigenomics, Germ Cells/metabolism, X Chromosome/metabolism, Animals, Female, Humans, Swine, X Chromosome/genetics
ABSTRACT
We present a genome assembly from an individual female Aquila chrysaetos chrysaetos (the European golden eagle; Chordata; Aves; Accipitridae). The genome sequence is 1.23 gigabases in span. The majority of the assembly is scaffolded into 28 chromosomal pseudomolecules, including the W and Z sex chromosomes.
ABSTRACT
High-resolution molecular programmes delineating the cellular foundations of mammalian embryogenesis have emerged recently. Similar analysis of human embryos is limited to pre-implantation stages, since early post-implantation embryos are largely inaccessible. Notwithstanding, we previously suggested conserved principles of pig and human early development. For further insight on pluripotent states and lineage delineation, we analysed pig embryos at single cell resolution. Here we show progressive segregation of inner cell mass and trophectoderm in early blastocysts, and of epiblast and hypoblast in late blastocysts. We show that following an emergent short naive pluripotent signature in early embryos, there is a protracted appearance of a primed signature in advanced embryonic stages. Dosage compensation with respect to the X-chromosome in females is attained via X-inactivation in late epiblasts. Detailed human-pig comparison is a basis towards comprehending early human development and a foundation for further studies of human pluripotent stem cell differentiation in pig interspecies chimeras.
Subjects
Single-Cell Analysis/methods, X Chromosome/metabolism, Animals, Cell Differentiation/physiology, Female, Gastrulation/physiology, Developmental Gene Expression Regulation, Germ Layers/metabolism, Humans, Swine, X Chromosome Inactivation/physiology
ABSTRACT
Referral hospitals in sub-Saharan Africa concentrate large numbers of tuberculosis (TB) and multidrug-resistant TB (MDR-TB) patients who have been failed by community TB services. We have previously shown, from enhanced screening and through autopsy studies, a significant burden of missed TB infections at the University Teaching Hospital, Lusaka, Zambia, with many patients dying or being discharged without treatment. With minimal TB isolation facilities and minimal political will to invest in broader screening and isolation, the risk of nosocomial transmission is likely to be extremely high. Studies from other hospitals in low-burden settings and in South Africa have shown that next-generation sequencing (NGS) is a very powerful tool for rapidly sequencing whole TB genomes and comparing them to confirm or rule out nosocomial transmission. The established platforms for NGS analysis, such as Illumina, are very expensive, immobile, and require regular maintenance, making them a costly inclusion on a research proposal or programmatic intervention grant in Africa. MinION nanopore sequencing has changed the NGS landscape with cheap portable sequencers, rapid simple library preparation (15 min), and automated real-time analysis tools. The application of highly portable MinION nanopore sequencing technology for the monitoring of nosocomial TB infection will be discussed. Preliminary data from our pediatric pneumonia study will demonstrate the detection of TB in induced sputum from children admitted to the University Teaching Hospital.
ABSTRACT
Nanopore sequencing was recently made available to users in the form of the Oxford Nanopore MinION. Released to users through an early access programme, the MinION is made unique by its tiny form factor and ability to generate very long sequences from single DNA molecules. The platform is undergoing rapid evolution with three distinct nanopore types and five updates to library preparation chemistry in the last 18 months. To keep pace with the rapid evolution of this sequencing platform, and to provide a space where new analysis methods can be openly discussed, we present a new F1000Research channel devoted to updates to and analysis of nanopore sequence data.
ABSTRACT
PURPOSE OF REVIEW: We provide a summary of the temporal cascade of transcriptional networks giving rise to the hematopoietic stem cell (HSC) and controlling differentiation of the erythroid lineage from it. We focus on the mechanisms by which cell fate decisions are made and comment on recent developments and additions to the networks. RECENT FINDINGS: A role for an SCL/LMO2 complex in HSC emergence, as well as in subsequent erythroid differentiation, has received support. Connections between the transcriptional networks and signaling molecules are being made but more work is needed in this area. Evidence that transcriptional cross-antagonistic switches underlie the choice between lineage pathways is increasing, and we highlight how the dynamics of earlier lineage decisions can influence later ones. Mathematical models are being built and reveal a surprising degree of power in these simple motifs to explain lineage choices. SUMMARY: New links in the transcriptional networks underlying cell-fate decisions are constantly emerging, and their incorporation into the evolving networks will make mathematical modeling more precise in its predictions of cell behavior, which can be tested experimentally.
Subjects
Gene Regulatory Networks/physiology, Hematopoiesis/genetics, Hematopoietic Stem Cells/cytology, Transcription Factors/physiology, Animals, Cell Differentiation, Cell Lineage, Humans, Signal Transduction
ABSTRACT
Controlled differentiation of pluripotential cells takes place routinely and with great success in developing vertebrate embryos. It therefore makes sense to take note of how this is achieved and use this knowledge to control the differentiation of embryonic stem cells (ESCs). An added advantage is that the differentiated cells resulting from this process in embryos have proven functionality and longevity. This unit reviews what is known about the embryonic signals that drive differentiation in one of the most informative of the vertebrate animal models of development, the amphibian Xenopus laevis. It summarizes their identities and the extent to which their activities are dose-dependent. The unit details what is known about the transcription factor responses to these signals, describing the networks of interactions that they generate. It then discusses the target genes of these transcription factors, the effectors of the differentiated state. Finally, how these same developmental programs operate during germ layer formation in the context of ESC differentiation is summarized.