Results 1 - 20 of 59
1.
Nat Commun ; 15(1): 2480, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38509097

ABSTRACT

The expression of genes encompasses their transcription into mRNA followed by translation into protein. In recent years, next-generation sequencing and mass spectrometry methods have profiled DNA, RNA and protein abundance in cells. However, there are currently no reference standards that are compatible across these genomic, transcriptomic and proteomic methods, and provide an integrated measure of gene expression. Here, we use synthetic biology principles to engineer a multi-omics control, termed pREF, that can act as a universal molecular standard for next-generation sequencing and mass spectrometry methods. The pREF sequence encodes 21 synthetic genes that can be in vitro transcribed into spike-in mRNA controls, and in vitro translated to generate matched protein controls. The synthetic genes provide qualitative controls that can measure sensitivity and quantitative accuracy of DNA, RNA and peptide detection. We demonstrate the use of pREF in metagenome DNA sequencing and RNA sequencing experiments and evaluate the quantification of proteins using mass spectrometry. Unlike previous spike-in controls, pREF can be independently propagated and the synthetic mRNA and protein controls can be sustainably prepared by recipient laboratories using common molecular biology techniques. Together, this provides a universal synthetic standard able to integrate genomic, transcriptomic and proteomic methods.


Subject(s)
DNA , Proteomics , RNA, Messenger/genetics , RNA, Messenger/metabolism , DNA/genetics , Genomics , RNA
2.
Nat Commun ; 14(1): 5663, 2023 09 21.
Article in English | MEDLINE | ID: mdl-37735471

ABSTRACT

The success of mRNA vaccines has been realised, in part, by advances in manufacturing that enabled billions of doses to be produced at sufficient quality and safety. However, mRNA vaccines must be rigorously analysed to measure their integrity and detect contaminants that reduce their effectiveness and induce side-effects. Currently, mRNA vaccines and therapies are analysed using a range of time-consuming and costly methods. Here we describe VAX-seq, a streamlined method to analyse mRNA vaccines and therapies using long-read nanopore sequencing. Compared to other industry-standard techniques, VAX-seq can comprehensively measure key mRNA vaccine quality attributes, including sequence, length, integrity, and purity. We also show how direct RNA sequencing can analyse mRNA chemistry, including the detection of nucleoside modifications. To support this approach, we provide software to automatically report on mRNA and plasmid template quality and integrity. Given these advantages, we anticipate that RNA sequencing methods, such as VAX-seq, will become central to the development and manufacture of mRNA drugs.


Subject(s)
Commerce , mRNA Vaccines , RNA, Messenger/genetics , Sequence Analysis, RNA
4.
Nat Commun ; 13(1): 6437, 2022 10 28.
Article in English | MEDLINE | ID: mdl-36307482

ABSTRACT

Library adaptors are short oligonucleotides that are attached to RNA and DNA samples in preparation for next-generation sequencing (NGS). Adaptors can also include additional functional elements, such as sample indexes and unique molecular identifiers, to improve library analysis. Here, we describe Control Library Adaptors, termed CAPTORs, that measure the accuracy and reliability of NGS. CAPTORs can be integrated within the library preparation of RNA and DNA samples, and their encoded information is retrieved during sequencing. We show how CAPTORs can measure the accuracy of nanopore sequencing, evaluate the quantitative performance of metagenomic and RNA sequencing, and improve normalisation between samples. CAPTORs can also be customised for clinical diagnoses, correcting systematic sequencing errors and improving the diagnosis of pathogenic BRCA1/2 variants in breast cancer. CAPTORs are a simple and effective method to increase the accuracy and reliability of NGS, enabling comparisons between samples, reagents and laboratories, and supporting the use of nanopore sequencing for clinical diagnosis.


Subject(s)
Nanopore Sequencing , Reproducibility of Results , Gene Library , High-Throughput Nucleotide Sequencing/methods , RNA
5.
Trends Pharmacol Sci ; 43(4): 269-280, 2022 04.
Article in English | MEDLINE | ID: mdl-35153075

ABSTRACT

The human genome expresses vast numbers of long noncoding RNAs (lncRNA) that fulfil diverse roles in gene regulation, cell biology, development, and human disease. These roles are often mediated by sequence motifs and secondary structures bound by proteins and can regulate epigenetic, transcriptional, and translational pathways. These functional domains can be further optimised and engineered into RNA devices that are widely used in synthetic biology. We propose that natural lncRNA structures can be explored and exploited for the rational design and assembly of synthetic RNA therapies. This potential has been enabled by advances in the stability, immunogenicity, manufacture, and delivery of other RNA-based therapies, from which we can anticipate the pharmacological properties of lncRNA therapies that have not yet otherwise entered clinical trials.


Subject(s)
RNA, Long Noncoding , Gene Expression Regulation , Genome, Human , Humans , RNA, Long Noncoding/genetics
6.
Genome Biol ; 23(1): 19, 2022 01 12.
Article in English | MEDLINE | ID: mdl-35022065

ABSTRACT

BACKGROUND: Next-generation sequencing (NGS) can identify mutations in the human genome that cause disease and has been widely adopted in clinical diagnosis. However, the human genome contains many polymorphic, low-complexity, and repetitive regions that are difficult to sequence and analyze. Despite their difficulty, these regions include many clinically important sequences that can inform the treatment of human diseases and improve the diagnostic yield of NGS. RESULTS: To evaluate the accuracy by which these difficult regions are analyzed with NGS, we built an in silico decoy chromosome, along with corresponding synthetic DNA reference controls, that encode difficult and clinically important human genome regions, including repeats, microsatellites, HLA genes, and immune receptors. These controls provide a known ground-truth reference against which to measure the performance of diverse sequencing technologies, reagents, and bioinformatic tools. Using this approach, we provide a comprehensive evaluation of short- and long-read sequencing instruments, library preparation methods, and software tools and identify the errors and systematic bias that confound our resolution of these remaining difficult regions. CONCLUSIONS: This study provides an analytical validation of diagnosis using NGS in difficult regions of the human genome and highlights the challenges that remain to resolve these difficult regions.


Subject(s)
Genome, Human , High-Throughput Nucleotide Sequencing , Chromosomes , High-Throughput Nucleotide Sequencing/methods , Humans , Microsatellite Repeats , Sequence Analysis, DNA/methods , Software
7.
Genet Med ; 24(2): 398-409, 2022 02.
Article in English | MEDLINE | ID: mdl-34906448

ABSTRACT

PURPOSE: Branchpoint elements are required for intron removal, and variants at these elements can result in aberrant splicing. We aimed to assess the value of branchpoint annotations generated from recent large-scale studies to select branchpoint-abrogating variants, using hereditary cancer genes as a model. METHODS: We identified branchpoint elements in 119 genes associated with hereditary cancer from 3 genome-wide experimentally-inferred and 2 predicted branchpoint data sets. We then identified variants that occur within branchpoint elements from public databases. We compared conservation, unique variant observations, and population frequencies at different nucleotides within branchpoint motifs. Finally, selected minigene assays were performed to assess the splicing effect of variants at branchpoint elements within mismatch repair genes. RESULTS: There was poor overlap between predicted and experimentally-inferred branchpoints. Our analysis of cancer genes suggested that variants at -2 nucleotide, -1 nucleotide, and branchpoint positions in experimentally-inferred canonical motifs are more likely to be clinically relevant. Minigene assay data showed the -2 nucleotide to be more important to branchpoint motif integrity but also showed fluidity in branchpoint usage. CONCLUSION: Data from cancer gene analysis suggest that there are few high-risk alleles that severely impact function via branchpoint abrogation. Results of this study inform a general scheme to prioritize branchpoint motif variants for further study.


Subject(s)
Neoplasms , RNA Splicing , Genes, Neoplasm , Humans , Introns/genetics , Neoplasms/genetics , RNA Splicing/genetics
9.
Nat Rev Genet ; 22(7): 415-426, 2021 07.
Article in English | MEDLINE | ID: mdl-33948037

ABSTRACT

Assembly and publication of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genome in January 2020 enabled the immediate development of tests to detect the new virus. This began the largest global testing programme in history, in which hundreds of millions of individuals have been tested to date. The unprecedented scale of testing has driven innovation in the strategies, technologies and concepts that govern testing in public health. This Review describes the changing role of testing during the COVID-19 pandemic, including the use of genomic surveillance to track SARS-CoV-2 transmission around the world, the use of contact tracing to contain disease outbreaks and testing for the presence of the virus circulating in the environment. Despite these efforts, widespread community transmission has become entrenched in many countries and has required the testing of populations to identify and isolate infected individuals, many of whom are asymptomatic. The diagnostic and epidemiological principles that underpin such population-scale testing are also considered, as are the high-throughput and point-of-care technologies that make testing feasible on a massive scale.


Subject(s)
COVID-19 , Pandemics , Public Health , SARS-CoV-2 , COVID-19/diagnosis , COVID-19/epidemiology , COVID-19/genetics , COVID-19/transmission , Humans , SARS-CoV-2/genetics , SARS-CoV-2/pathogenicity
10.
Genome Biol ; 22(1): 146, 2021 05 10.
Article in English | MEDLINE | ID: mdl-33971925

ABSTRACT

Pseudogenes are gene copies presumed to mainly be functionless relics of evolution due to acquired deleterious mutations or transcriptional silencing. Using deep full-length PacBio cDNA sequencing of normal human tissues and cancer cell lines, we identify here hundreds of novel transcribed pseudogenes expressed in tissue-specific patterns. Some pseudogene transcripts have intact open reading frames and are translated in cultured cells, representing unannotated protein-coding genes. To assess the biological impact of noncoding pseudogenes, we CRISPR-Cas9 delete the nucleus-enriched pseudogene PDCL3P4 and observe hundreds of perturbed genes. This study highlights pseudogenes as a complex and dynamic component of the human transcriptional landscape.


Subject(s)
DNA, Complementary/genetics , Pseudogenes , Sequence Analysis, DNA , Transcriptome/genetics , Cell Line , Gene Deletion , Haploidy , Humans , Promoter Regions, Genetic/genetics
11.
Sci Rep ; 11(1): 2636, 2021 01 29.
Article in English | MEDLINE | ID: mdl-33514761

ABSTRACT

DNA synthesis in vitro has enabled the rapid production of reference standards. These are used as controls, and allow measurement and improvement of the accuracy and quality of diagnostic tests. Current reference standards typically represent target genetic material, and act only as positive controls to assess test sensitivity. However, negative controls are also required to evaluate test specificity. Here, a pair of chimeric A/B RNA standards allowed incorporation of positive and negative controls into diagnostic testing for the Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2). The chimeric standards constituted target regions for RT-PCR primer/probe sets that are joined in tandem across two separate synthetic molecules. Accordingly, a target region that is present in standard A provides a positive control, whilst being absent in standard B, thereby providing a negative control. This design enables cross-validation of positive and negative controls between the paired standards in the same reaction, with identical conditions. This enables control and test failures to be distinguished, increasing confidence in the accuracy of results. The chimeric A/B standards were assessed using the US Centres for Disease Control real-time RT-PCR protocol, and showed results congruent with other commercial controls in detecting SARS-CoV-2 in patient samples. This chimeric reference standard design approach offers extensive flexibility, allowing representation of diverse genetic features and distantly related sequences, even from different organisms.


Subject(s)
Chimera , Amino Acid Sequence , COVID-19/diagnosis , COVID-19/virology , Humans , RNA, Viral/standards , Reference Standards , Reproducibility of Results , SARS-CoV-2/chemistry , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification , Sensitivity and Specificity
12.
Nat Commun ; 11(1): 3609, 2020 07 17.
Article in English | MEDLINE | ID: mdl-32681090

ABSTRACT

Standard units of measurement are required for the quantitative description of nature; however, few standard units have been established for genomics to date. Here, we have developed a synthetic DNA ladder that defines a quantitative standard unit that can measure DNA sequence abundance within a next-generation sequencing library. The ladder can be spiked into a DNA sample, and act as an internal scale that measures quantitative genetic features. Unlike previous spike-ins, the ladder is encoded within a single molecule, and can be equivalently and independently synthesized by different laboratories. We show how the ladder can measure diverse quantitative features, including human genetic variation and microbial abundance, and also estimate uncertainty due to technical variation and improve normalization between libraries. This ladder provides an independent quantitative unit that can be used with any organism, application or technology, thereby providing a common metric by which genomes can be measured.


Subject(s)
DNA/analysis , DNA/chemical synthesis , Base Sequence , DNA/genetics , Gene Dosage , Gene Library , Genomics , Humans
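The idea of a spiked-in ladder acting as an internal scale, as described in the abstract above, can be illustrated with a small calibration sketch in Python. Everything concrete here is an assumption made for illustration: the rung copy numbers (1, 2, 4, 8), the read counts, and the helper names `fit_line` and `counts_to_copies` do not come from the paper. The sketch fits a straight line between known ladder copy number and observed read count on a log2 scale, then uses that line to express a sample read count in ladder units.

```python
from math import log2

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical ladder: known copy number per rung and made-up read counts.
ladder_copies = [1, 2, 4, 8]
ladder_counts = [50, 100, 200, 400]

# Calibrate on a log2 scale so that each doubling is one unit on both axes.
slope, intercept = fit_line(
    [log2(c) for c in ladder_copies],
    [log2(r) for r in ladder_counts],
)

def counts_to_copies(read_count):
    """Convert an observed read count into ladder copy-number units."""
    return 2 ** ((log2(read_count) - intercept) / slope)

# A sample feature with 150 reads sits between the 2-copy and 4-copy rungs.
estimated_copies = counts_to_copies(150)
```

Because the calibration is fitted separately in each library, two libraries sequenced to different depths could in principle be compared in the same ladder units, which is one way to picture the normalization mentioned in the abstract.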
13.
Nat Commun ; 11(1): 1810, 2020 Apr 08.
Article in English | MEDLINE | ID: mdl-32269228

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

14.
Cell ; 180(5): 878-894.e19, 2020 03 05.
Article in English | MEDLINE | ID: mdl-32059783

ABSTRACT

Pathogenic autoantibodies arise in many autoimmune diseases, but it is not understood how the cells making them evade immune checkpoints. Here, single-cell multi-omics analysis demonstrates a shared mechanism with lymphoid malignancy in the formation of public rheumatoid factor autoantibodies responsible for mixed cryoglobulinemic vasculitis. By combining single-cell DNA and RNA sequencing with serum antibody peptide sequencing and antibody synthesis, rare circulating B lymphocytes making pathogenic autoantibodies were found to comprise clonal trees accumulating mutations. Lymphoma driver mutations in genes regulating B cell proliferation and V(D)J mutation (CARD11, TNFAIP3, CCND3, ID3, BTG2, and KLHL6) were present in rogue B cells producing the pathogenic autoantibody. Antibody V(D)J mutations conferred pathogenicity by causing the antigen-bound autoantibodies to undergo phase transition to insoluble aggregates at lower temperatures. These results reveal a pre-neoplastic stage in human lymphomagenesis and a cascade of somatic mutations leading to an iconic pathogenic autoantibody.


Subject(s)
Autoantibodies/genetics , Autoimmune Diseases/genetics , B-Lymphocytes/immunology , Lymphoma/genetics , Animals , Autoantibodies/immunology , Autoimmune Diseases/immunology , Autoimmune Diseases/pathology , B-Lymphocytes/pathology , CARD Signaling Adaptor Proteins/genetics , Carrier Proteins/genetics , Clonal Evolution/genetics , Clonal Evolution/immunology , Cyclin D3/genetics , Guanylate Cyclase/genetics , Humans , Immediate-Early Proteins/genetics , Immunoglobulin Variable Region/genetics , Immunoglobulin Variable Region/immunology , Inhibitor of Differentiation Proteins/genetics , Lymphoma/immunology , Lymphoma/pathology , Mice , Mutation/genetics , Mutation/immunology , Neoplasm Proteins/genetics , Sequence Analysis, DNA/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Tumor Necrosis Factor alpha-Induced Protein 3/genetics , Tumor Suppressor Proteins/genetics , V(D)J Recombination/genetics
15.
Nat Protoc ; 14(7): 2119-2151, 2019 07.
Article in English | MEDLINE | ID: mdl-31217595

ABSTRACT

Next-generation sequencing (NGS) has been widely adopted to identify genetic variants and investigate their association with disease. However, the analysis of sequencing data remains challenging because of the complexity of human genetic variation and confounding errors introduced during library preparation, sequencing and analysis. We have developed a set of synthetic DNA spike-ins, termed 'sequins' (sequencing spike-ins), that are directly added to DNA samples before library preparation. Sequins can be used to measure technical biases and to act as internal quantitative and qualitative controls throughout the sequencing workflow. This step-by-step protocol explains the use of sequins for both whole-genome and targeted sequencing of the human genome. This includes instructions regarding the dilution and addition of sequins to human DNA samples, followed by the bioinformatic steps required to separate sequin- and sample-derived sequencing reads and to evaluate the diagnostic performance of the assay. These practical guidelines are accompanied by a broader discussion of the conceptual and statistical principles that underpin the design of sequin standards. This protocol is suitable for users with standard laboratory and bioinformatic experience. The laboratory steps require ~1-4 d and the bioinformatic steps (which can be performed with the provided example data files) take an additional day.


Subject(s)
DNA/chemical synthesis , Genome, Human , High-Throughput Nucleotide Sequencing/methods , Calibration , Computational Biology/methods , DNA/genetics , High-Throughput Nucleotide Sequencing/standards , Humans , K562 Cells , MCF-7 Cells , Neoplasms/genetics , Proto-Oncogene Proteins B-raf/genetics
16.
Prostate ; 79(10): 1191-1196, 2019 07.
Article in English | MEDLINE | ID: mdl-31090091

ABSTRACT

BACKGROUND: Fusion of the androgen-regulated gene TMPRSS2 to the ETS transcription factor gene ERG is the most common genomic alteration acquired during prostate tumorigenesis and is biased toward men of European ancestry. In contrast, African American men present with more advanced disease, yet their tumors are less likely to acquire TMPRSS2-ERG. Data for Africa are scarce. METHODS: RNA was made available for genomic analyses from 181 prostate tissue biopsy cores from Black South African men, 94 with and 87 without pathological evidence for prostate cancer. Reverse transcription polymerase chain reaction was used to screen for the TMPRSS2-ERG fusion, while transcript junction coordinates and isoform frequencies, including novel gene fusions, were determined using targeted RNA sequencing. RESULTS: Here we report a frequency of 13% for TMPRSS2-ERG in tumors from Black South Africans. Present in 12/94 cancer-positive versus 1/87 cancer-negative prostate tissue cores, this suggests a 92.62% predictivity for a positive cancer diagnosis (P = 0.0031). At a frequency of almost half that reported for African Americans and roughly a quarter of that reported for men of European ancestry, acquisition of TMPRSS2-ERG appears to be inversely associated with aggressive prostate cancer. Further support was provided by linking the presence of TMPRSS2-ERG to low-grade disease in younger patients (P = 0.0466), with higher expressing distal ERG fusion junction coordinates. CONCLUSIONS: In only the second study of its kind for the African continent, we support a link between TMPRSS2-ERG status and prostate cancer racial health disparity beyond the borders of the United States. We call for urgent evaluation of androgen deprivation therapy within Africa.


Subject(s)
Oncogene Fusion/genetics , Prostatic Neoplasms/genetics , Serine Endopeptidases/genetics , Adult , Aged , Aged, 80 and over , Black People , Genomic Instability , Health Status Disparities , Humans , Male , Middle Aged , Prostate/pathology , Prostatic Neoplasms/pathology , South Africa , Transcriptional Regulator ERG/genetics , White People
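The counts reported in the abstract above (the fusion present in 12 of 94 cancer-positive cores and 1 of 87 cancer-negative cores) can be checked with a short calculation. The Python sketch below recomputes the fusion frequency and runs a two-sided Fisher's exact test on the resulting 2x2 table. The abstract does not say which test produced its P value, so the choice of Fisher's exact test is an assumption, and the helper name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that k of the col1 positives fall in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(col1, row1)
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )

# Counts taken from the abstract: rows are cancer-positive and
# cancer-negative cores, columns are fusion-positive and fusion-negative.
fusion_pos_cancer, cancer_cores = 12, 94
fusion_pos_benign, benign_cores = 1, 87

frequency = fusion_pos_cancer / cancer_cores  # about 0.128, i.e. roughly 13%

p_value = fisher_exact_two_sided(
    fusion_pos_cancer, cancer_cores - fusion_pos_cancer,
    fusion_pos_benign, benign_cores - fusion_pos_benign,
)
```

The frequency rounds to the 13% quoted in the abstract. The P value from this sketch need not match the published figure exactly, since the authors may have used a different test or software.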
17.
Front Genet ; 10: 309, 2019.
Article in English | MEDLINE | ID: mdl-31031799

ABSTRACT

The human brain is one of the last frontiers of biomedical research. Genome-wide association studies (GWAS) have succeeded in identifying thousands of haplotype blocks associated with a range of neuropsychiatric traits, including disorders such as schizophrenia, Alzheimer's and Parkinson's disease. However, the majority of single nucleotide polymorphisms (SNPs) that mark these haplotype blocks fall within non-coding regions of the genome, hindering their functional validation. While some of these GWAS loci may contain cis-acting regulatory DNA elements such as enhancers, we hypothesized that many are also transcribed into non-coding RNAs that are missing from publicly available transcriptome annotations. Here, we use targeted RNA capture ('RNA CaptureSeq') in combination with nanopore long-read cDNA sequencing to transcriptionally profile 1,023 haplotype blocks across the genome containing non-coding GWAS SNPs associated with neuropsychiatric traits, using post-mortem human brain tissue from three neurologically healthy donors. We find that the majority (62%) of targeted haplotype blocks, including 13% of intergenic blocks, are transcribed into novel, multi-exonic RNAs, most of which are not yet recorded in GENCODE annotations. We validated our findings with short-read RNA-seq, providing orthogonal confirmation of novel splice junctions and enabling a quantitative assessment of the long-read assemblies. Many novel transcripts are supported by independent evidence of transcription including cap analysis of gene expression (CAGE) data and epigenetic marks, and some show signs of potential functional roles. We present these transcriptomes as a preliminary atlas of non-coding transcription in human brain that can be used to connect neurological phenotypes with gene expression.

18.
Nat Commun ; 10(1): 1388, 2019 03 27.
Article in English | MEDLINE | ID: mdl-30918253

ABSTRACT

Fusion genes are a major cause of cancer. Their rapid and accurate diagnosis can inform clinical action, but current molecular diagnostic assays are restricted in resolution and throughput. Here, we show that targeted RNA sequencing (RNAseq) can overcome these limitations. First, we establish that fusion gene detection with targeted RNAseq is both sensitive and quantitative by optimising laboratory and bioinformatic variables using spike-in standards and cell lines. Next, we analyse a clinical patient cohort and improve the overall fusion gene diagnostic rate from 63% with conventional approaches to 76% with targeted RNAseq while demonstrating high concordance for patient samples with previous diagnoses. Finally, we show that targeted RNAseq offers additional advantages by simultaneously measuring gene expression levels and profiling the immune-receptor repertoire. We anticipate that targeted RNAseq will improve clinical fusion gene detection, and its increasing use will provide a deeper understanding of fusion gene biology.


Subject(s)
Gene Fusion/genetics , Molecular Diagnostic Techniques/methods , Neoplasms/genetics , Sequence Analysis, RNA/methods , Cell Line, Tumor , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Humans , Neoplasms/diagnosis , Oncogene Fusion/genetics
19.
Nat Commun ; 10(1): 1342, 2019 03 22.
Article in English | MEDLINE | ID: mdl-30902988

ABSTRACT

Chirality is a property describing any object that is inequivalent to its mirror image. Due to its 5'-3' directionality, a DNA sequence is distinct from a mirrored sequence arranged in reverse nucleotide-order, and is therefore chiral. A given sequence and its opposing chiral partner sequence share many properties, such as nucleotide composition and sequence entropy. Here we demonstrate that chiral DNA sequence pairs also perform equivalently during molecular and bioinformatic techniques that underpin genetic analysis, including PCR amplification, hybridization, whole-genome, target-enriched and nanopore sequencing, sequence alignment and variant detection. Given these shared properties, synthetic DNA sequences mirroring clinically relevant or analytically challenging regions of the human genome are ideal controls for clinical genomics. The addition of synthetic chiral sequences (sequins) to patient tumor samples can prevent false-positive and false-negative mutation detection to improve diagnosis. Accordingly, we propose that sequins can fulfill the need for commutable internal controls in precision medicine.


Subject(s)
DNA/genetics , Genomics , Base Sequence , High-Throughput Nucleotide Sequencing , Humans , Microsatellite Repeats/genetics , Mutation/genetics , Nanopores , Neoplasms/genetics , Polymerase Chain Reaction , Sequence Alignment
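The mirroring described in the abstract above, where a sequence is rearranged in reverse nucleotide order, is easy to confuse with taking the reverse complement. The following Python sketch shows the difference on a made-up nine-base sequence. The function names `mirror` and `reverse_complement` and the example sequence are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def mirror(seq):
    """Return the sequence in reverse nucleotide order, without complementing."""
    return seq[::-1]

def reverse_complement(seq):
    """Return the opposite strand read 5'-3' (reverse order and complemented)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

seq = "ATGGCGTAC"          # hypothetical example sequence
mirrored = mirror(seq)     # "CATGCGGTA"

# The mirrored sequence keeps the same nucleotide composition ...
same_composition = Counter(seq) == Counter(mirrored)

# ... but it is a different sequence, and it is not the reverse complement.
is_distinct = mirrored != seq
differs_from_revcomp = mirrored != reverse_complement(seq)
```

Mirroring twice returns the original sequence, so each sequence has exactly one mirrored partner. This shared composition is one of the properties the abstract cites as making mirrored sequences suitable as controls.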
20.
Article in English | MEDLINE | ID: mdl-32914017

ABSTRACT

PURPOSE: Before anaplastic lymphoma kinase (ALK) inhibitors, treatment options for ALK-positive inflammatory myofibroblastic tumors (AP-IMTs) were unsatisfactory. We retrospectively analyzed the outcome of patients with AP-IMT treated with crizotinib to document response, toxicity, survival, and features associated with relapse. METHODS: The cohort comprised eight patients with AP-IMT treated with crizotinib and surgery. Outcome measures were progression-free and overall survival after commencing crizotinib, treatment-related toxicities, features associated with relapse, outcome after relapse, and outcome after ceasing crizotinib. RESULTS: The median follow-up after commencing crizotinib was 3 years (range, 0.9 to 5.5 years). The major toxicity was neutropenia. All patients responded to crizotinib. Five were able to discontinue therapy without recurrence (median treatment duration, 1 year; range, 0.2 to 3.0 years); one continues on crizotinib. Two critically ill patients with initial complete response experienced relapse while on therapy. Both harbored RANBP2-ALK fusions and responded to alternative ALK inhibitors; one ultimately died as a result of progressive disease, whereas the other remains alive on treatment. Progression-free and overall survival since commencement of crizotinib is 0.75 ± 0.15% and 0.83 ± 0.15%, respectively. CONCLUSION: We confirm acceptable toxicity and excellent disease control in patients with AP-IMT treated with crizotinib, which may be ceased without recurrence in most. Relapses occurred in two of three patients with RANBP2-ALK translocated IMT, which suggests that such patients require additional therapy.
