ABSTRACT
Metagenomic sequencing has emerged as a transformative tool in infectious disease diagnosis, offering a comprehensive and unbiased approach to pathogen detection. Leveraging international standards and guidelines is essential for ensuring the quality and reliability of metagenomic sequencing in clinical practice. This review explores the implications of international standards and guidelines for the application of metagenomic sequencing in infectious disease diagnosis. By adhering to established standards, such as those outlined by regulatory bodies and expert consensus, healthcare providers can enhance the accuracy and clinical utility of metagenomic sequencing. The integration of international standards and guidelines into metagenomic sequencing workflows can streamline diagnostic processes, improve pathogen identification, and optimize patient care. Strategies for implementing these standards in infectious disease diagnosis using metagenomic sequencing are discussed, highlighting the importance of standardized approaches in advancing precision infectious disease diagnosis initiatives.
Subjects
Communicable Diseases; High-Throughput Nucleotide Sequencing; Humans; Reproducibility of Results; Metagenome; Reference Standards; Metagenomics; Communicable Diseases/diagnosis
ABSTRACT
Circulating tumor RNA (ctRNA) has recently emerged as a novel and attractive liquid biomarker. CtRNA is capable of providing important information about the expression of a variety of target genes noninvasively, without the need for biopsies, through the use of circulating RNA sequencing. The overexpression of cancer-specific transcripts increases the tumor-derived RNA signal, which overcomes limitations due to low quantities of circulating tumor DNA (ctDNA). The purpose of this work is to present an up-to-date review of current knowledge regarding ctRNAs and their status as biomarkers to address the diagnosis, prognosis, prediction, and drug resistance of colorectal cancer. The final section of the article discusses the practical aspects involved in analyzing plasma ctRNA, including storage and isolation, detection technologies, and their limitations in clinical applications.
Subjects
Cell-Free Nucleic Acids; Circulating Tumor DNA; Colorectal Neoplasms; Humans; Liquid Biopsy; Cell-Free Nucleic Acids/genetics; Biomarkers, Tumor/genetics; RNA/genetics; Colorectal Neoplasms/pathology
ABSTRACT
BACKGROUND & AIMS: Patients with colorectal cancer (CRC) have a different gut microbiome signature than individuals without CRC. Little is known about the viral component of the CRC-associated microbiome. We aimed to identify and validate viral taxonomic markers of CRC that might be used in detecting the disease or predicting outcome. METHODS: We performed shotgun metagenomic analyses of viromes of fecal samples from 74 patients with CRC (cases) and 92 individuals without CRC (controls) in Hong Kong (discovery cohort). Viral sequences were classified by taxonomic alignment against an integrated microbial reference genome database. Viral markers associated with CRC were validated using fecal samples from 3 separate cohorts: 111 patients with CRC and 112 controls in Hong Kong, 46 patients with CRC and 63 controls in Austria, and 91 patients with CRC and 66 controls in France and Germany. Using abundance profiles of CRC-associated virome genera, we constructed random survival forest models to identify those associated with patient survival times. RESULTS: The diversity of the gut bacteriophage community was significantly increased in patients with CRC compared with controls. Twenty-two viral taxa discriminated cases from controls with an area under the receiver operating characteristic curve of 0.802 in the discovery cohort. The viral markers were validated in 3 cohorts, with areas under the receiver operating characteristic curve of 0.763, 0.736, and 0.715, respectively. Clinical subgroup analysis showed that dysbiosis of the gut virome was associated with early- and late-stage CRC. A combination of 4 taxonomic markers was associated with reduced survival of patients with CRC (log-rank test, P = 8.1 × 10⁻⁶) independently of tumor stage, lymph node metastases, or clinical parameters. We found altered interactions between bacteriophages and oral bacterial commensals in fecal samples from patients with CRC compared with controls. CONCLUSIONS: In a metagenomic analysis of fecal samples from patients and controls, we identified virome signatures associated with CRC. These data might be used to develop tools to identify individuals with CRC or predict outcomes.
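The discrimination figures in this abstract are areas under the receiver operating characteristic curve (AUC). As a minimal sketch of what that statistic measures, and not a reconstruction of the study's pipeline, the AUC can be read as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The function name `auc_from_scores` and the toy scores below are hypothetical and are not data from the study.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical classifier scores for illustration only (not study data).
toy_cases = [0.9, 0.8, 0.4]
toy_controls = [0.7, 0.3, 0.2]
toy_auc = auc_from_scores(toy_cases, toy_controls)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the reported values of 0.715 to 0.802.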
Subjects
Biomarkers, Tumor/analysis; Colorectal Neoplasms/virology; Dysbiosis/virology; Gastrointestinal Microbiome/genetics; Viruses/genetics; Austria/epidemiology; Case-Control Studies; Cohort Studies; Colonoscopy; Colorectal Neoplasms/diagnostic imaging; Colorectal Neoplasms/mortality; Colorectal Neoplasms/pathology; Cross-Sectional Studies; Dysbiosis/diagnostic imaging; Feces/virology; Female; France/epidemiology; Germany/epidemiology; Hong Kong/epidemiology; Humans; Male; Metagenomics; Middle Aged; Sensitivity and Specificity; Survival Analysis
ABSTRACT
The outbreak of COVID-19 has recently driven growth in the next-generation sequencing (NGS) market. Targeted sequencing (TS) has become an important routine technique in both clinical and research settings, with advantages including high confidence and accuracy, a reasonable turnaround time, relatively low cost, and a lighter data burden with correspondingly lower bioinformatics or computational demand. Since there are no clear consensus guidelines covering the wide range of NGS platforms and techniques, there is a vital need for researchers and clinicians to develop efficient approaches, especially for the molecular diagnosis of diseases in emergencies such as the global COVID-19 pandemic. In this review, we aim to summarize different methods of TS, demonstrate parameters for TS assay designs, illustrate different TS panels, discuss their limitations, and present the challenges of TS concerning their clinical application for the molecular diagnosis of human diseases.
Subjects
COVID-19; Humans; COVID-19/diagnosis; Genetic Testing/methods; Computational Biology; High-Throughput Nucleotide Sequencing/methods; Consensus; COVID-19 Testing
ABSTRACT
Colorectal cancer (CRC) seriously threatens human health. Early diagnosis of CRC is critical to improving patient survival. Meanwhile, non-invasive detection through tumor-circulating markers can be an important auxiliary diagnostic tool. In this study, we performed targeted RNA sequencing in paired tumor and adjacent normal fresh frozen tissues from 68 patients, and we also measured circulating mRNA levels in plasma samples collected at 4 time points before and after operation or chemotherapy. Our results showed that SOX9 (6.73-fold, adjusted p value < 1 × 10⁻⁴⁵), MYC (20.59-fold, adjusted p value < 1 × 10⁻⁵⁷), and MMP7 (131.94-fold, adjusted p value < 1 × 10⁻⁷⁸) were highly expressed in tumor compared with adjacent normal tissues. In addition, circulating SOX9 mRNA (41.14-fold, adjusted p value < 1 × 10⁻¹³) was significantly higher in CRC than in normal controls. Moreover, a SOX9-based 9-gene panel (SOX9, GSK3A, FZD4, LEF1, DVL1, FZD7, NFATC1, KRT19, and RUVBL1) showed non-invasive diagnostic value for CRC (AUC: 0.863 (0.766-0.960), TPR: 0.92, TNR: 0.87). In summary, SOX9 expression is consistently increased in tumor and plasma samples from CRC patients, which indicates the important role of SOX9 in CRC progression and its potential in non-invasive diagnosis of CRC.
Subjects
Colorectal Neoplasms; Humans; Colorectal Neoplasms/diagnosis; Colorectal Neoplasms/genetics; Colorectal Neoplasms/pathology; Biomarkers, Tumor; Early Detection of Cancer/methods; RNA, Messenger; Gene Expression Regulation, Neoplastic; ATPases Associated with Diverse Cellular Activities/genetics; ATPases Associated with Diverse Cellular Activities/metabolism; Carrier Proteins/genetics; Carrier Proteins/metabolism; DNA Helicases/genetics; DNA Helicases/metabolism; Frizzled Receptors/genetics; Frizzled Receptors/metabolism; SOX9 Transcription Factor/genetics; SOX9 Transcription Factor/metabolism
ABSTRACT
BACKGROUND: Colorectal cancer (CRC) is the second leading cause of cancer deaths in Hong Kong. We tested the hypothesis that circulating tumor cell (CTC) analysis with the ARB101 antibody could be used as a tool for CRC detection and for monitoring progression and therapy response. RESEARCH METHODS: The ARB101 antibody was used to investigate CDH17 expression in formalin-fixed, paraffin-embedded (FFPE) tissue sections and CTCs of CRC patients. RESULTS: With ARB101, the highest sensitivity was observed in colorectal cancer tissue (98/100, 98%), compared with gastric cancer (72/100, 72%) and pancreatic cancer (27/32, 84%). Immunoreactivity of CDH17 was significantly higher in distant metastatic (tumor-node-metastasis [TNM] stage IV) than in non-distant metastatic (TNM stage I to III) CRC. The ARB101 antibody also showed significantly higher sensitivity than c-erbB2-targeting (8%) and epidermal growth factor receptor (EGFR)-targeting (37%) antibodies (p < 0.0001). ARB101-positive CTCs were detected in 64/83 (77%) TNM stage I to IV CRC patients. Furthermore, ARB101-positive CTC detection in TNM stage I to III CRC patients differed significantly before and after surgical operation (p < 0.0001). CONCLUSIONS: CTC detection by ARB101 antibody could serve as a potential non-invasive approach for CRC detection, progression, and monitoring of treatment response.
Subjects
Colorectal Neoplasms; Neoplastic Cells, Circulating; Pancreatic Neoplasms; Stomach Neoplasms; Humans; Neoplastic Cells, Circulating/pathology; Colorectal Neoplasms/metabolism; Hong Kong; Biomarkers, Tumor/metabolism; Cadherins
ABSTRACT
The pediatric population has generally been less affected clinically by SARS-CoV-2 infection, and fewer pediatric cases of COVID-19 have been reported than adult cases. However, a rapid increase in the hospitalization rate of SARS-CoV-2-infected pediatric patients was observed during the Omicron-dominated COVID-19 outbreak. In this study, we analyzed B.1.1.529 (Omicron) genome sequences collected from pediatric patients by whole viral genome amplicon sequencing on the Illumina next-generation sequencing platform, followed by phylogenetic analysis. The demographic, epidemiologic and clinical data of these pediatric patients are also reported. Fever, cough, runny nose, sore throat and vomiting were the more commonly reported symptoms in children infected by the Omicron variant. A novel frameshift mutation was found in the ORF1b region (NSP12) of the Omicron genome. Seven mutations were identified in the target regions of the WHO-listed SARS-CoV-2 primers and probes. At the protein level, eighty-three amino acid substitutions and fifteen amino acid deletions were identified. Our results indicate that asymptomatic infection and transmission among children infected by Omicron subvariants BA.2.2 and BA.2.10.1 are not common. Omicron may have a different pathogenesis in the pediatric population.
Subjects
COVID-19; Adult; Humans; Child; Phylogeny; SARS-CoV-2; Genome, Viral
ABSTRACT
Background: Cell-free RNA (cfRNA) contains transcript fragments from multiple cell types, making it useful for cancer detection in clinical settings. However, the pathophysiological origins of cfRNAs in plasma from colorectal cancer (CRC) patients remain unclear. Methods: To identify tissue-specific contributions to the cfRNA transcriptomic profile, we used a published single-cell transcriptomics profile to deconvolute cell type abundance among paired plasma samples from CRC patients who underwent tumor-ablative surgery. We further validated the differentially expressed cfRNAs in 5 pairs of CRC tumor samples and adjacent tissue samples as well as 3 additional CRC tumor samples using RNA-sequencing. Results: The transcriptomic component from intestinal secretory cells was significantly decreased in the in-house post-surgical cfRNA. The expression of HPGD, PACS1, and TDP2 was consistent across cfRNA and tissue samples. Using the Cancer Genome Atlas (TCGA) CRC datasets, we were able to classify the patients into two groups with significantly different survival outcomes. Conclusions: The three-gene signature holds promise for minimal residual disease (MRD) testing, which involves profiling remnants of cancer cells after or during treatment. Biomarkers identified in the present study need to be validated in a larger cohort of samples in order to ascertain their possible use in early diagnosis of CRC.
ABSTRACT
INTRODUCTION: Clinical metagenomic next-generation sequencing (mNGS) allows a comprehensive genetic analysis of microbial materials. Unlike traditional target-driven molecular diagnostic tests such as PCR, mNGS is a hypothesis-free diagnostic approach that allows a comprehensive genetic analysis of clinical specimens, covering nearly any common, rare, or new pathogen, ranging broadly from viruses, bacteria, and fungi to parasites. AREAS COVERED: In this article, we discuss the clinical application of mNGS using two clinical cases as examples and describe the use of mNGS to assist the diagnosis of parasitic pulmonary infection. The advantages and challenges of implementing mNGS in clinical microbiology are also discussed. EXPERT OPINION: mNGS is a promising technology that allows quick diagnosis of infectious diseases. Currently, a plethora of sequencing and analysis methods exists for mNGS, each with individual merits and pitfalls. While standards and best practices have been proposed by various metagenomics working groups, they are yet to be widely adopted in the community. The development of a consensus set of guidelines is necessary to guide the usage of this new technology and the interpretation of NGS results before clinical adoption of mNGS testing.
Subjects
Communicable Diseases; Metagenomics; Bronchoalveolar Lavage Fluid/microbiology; Communicable Diseases/diagnosis; High-Throughput Nucleotide Sequencing/methods; Humans; Metagenome; Metagenomics/methods; Sensitivity and Specificity
ABSTRACT
BACKGROUND: The import of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) lineage B.1.36.27 sparked the fourth wave of the COVID-19 outbreak in Hong Kong. This strain has been circulating in Hong Kong since September 2020 but is rarely found in other countries (<1%). RESEARCH DESIGN AND METHODS: A total of 14 SARS-CoV-2 genome sequences collected from patients in Hong Kong between July 2020 and March 2021 were determined by whole viral genome sequencing on the Illumina next-generation sequencing platform, followed by phylogenetic analysis. RESULTS: Of the 14 SARS-CoV-2 genome sequences analyzed, 9 strains belonged to the PANGO lineage B.1.36.27, GISAID clade GH, and Nextclade clade 20A. Compared to the reference genome, 31 nucleotide differences and 11 amino acid differences were identified in the genome of the SARS-CoV-2 from PANGO lineage B.1.36.27. CONCLUSIONS: We report the nucleotide and amino acid mutations identified in SARS-CoV-2 from PANGO lineage B.1.36.27. Our viral genome sequences enrich the understanding of the SARS-CoV-2 mutational landscape and expand the repertoire of known SARS-CoV-2 variants for tracking and tracing. From this study, we found no evidence that SARS-CoV-2 from lineage B.1.36.27 can compromise existing vaccines and antibody therapies.
Subjects
Genome, Viral; Phylogeny; SARS-CoV-2; COVID-19/virology; Hong Kong/epidemiology; Humans; SARS-CoV-2/genetics
ABSTRACT
INTRODUCTION: To date, the transmission of coronavirus disease 2019 (COVID-19) remains uncontrolled, and the numbers of confirmed cases and deaths are still increasing. Up to 1st October 2020, 33,842,281 confirmed cases and 1,010,634 confirmed deaths had been reported to the World Health Organization from 216 different countries, areas and territories. Despite the urgent demand for effective treatment strategies, there is still no specific antiviral treatment for COVID-19, and treatment guidelines vary between countries. AREAS COVERED: In this article, we summarize the current knowledge on COVID-19 and the pandemic worldwide. Moreover, the epidemiology, pathogenesis, prevention and different treatment options are discussed so that we can better prepare ourselves to fight severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). EXPERT OPINION: The situation of the COVID-19 pandemic is still unpredictable. There is no effective vaccine or specific antiviral drug to treat severe COVID-19 patients. Combination therapies have shown promising clinical improvement. Repurposing FDA-approved drugs might be one possible treatment option. Without specific treatment and vaccines for COVID-19, the most effective way to prevent infection is to create an ecosystem with effective protection, precautions and preventive measures.
Subjects
COVID-19 Drug Treatment; COVID-19/epidemiology; Animals; Antiviral Agents/administration & dosage; COVID-19/prevention & control; COVID-19 Vaccines/administration & dosage; Drug Repositioning; Humans
ABSTRACT
INTRODUCTION: Exosomal RNAs (exoRNA) have great potential as biomarkers in cancers. The isolation of exoRNA requires ultracentrifugation to isolate cell-free RNA, followed by detection using real-time PCR, microarray, next-generation sequencing, or the Nanostring nCounter system. The use of exoRNA enrichment panels has greatly increased detection sensitivity and specificity compared with traditional diagnostic tests. Moreover, using exoRNA as biomarkers can assist the early detection of chemo- and radioresistant cancer, which in turn opens up the possibility of personalized treatment for patients. Finally, exoRNA can be detected at an early stage of cancer recurrence to improve the survival rate. AREAS COVERED: In this review, the authors summarize the detection methods for exoRNA as well as its potential as a biomarker in cancer diagnosis and in chemo- and radioresistance. EXPERT OPINION: The application of exoRNAs in clinical diagnosis is still in its infancy. Further research on extracellular vesicle isolation, detection protocols, exoRNA classes and subclasses, and the regulatory biological pathways has to be performed before exoRNA can be applied translationally.
Subjects
Biomarkers, Tumor/blood; Cell-Free Nucleic Acids/blood; Exosomes/chemistry; Neoplasms/blood; RNA, Neoplasm/blood; Base Sequence; Biomarkers, Tumor/isolation & purification; Carcinoma/blood; Carcinoma/diagnosis; Carcinoma/pathology; Carcinoma/therapy; Cell-Free Nucleic Acids/isolation & purification; Drug Resistance, Neoplasm; Early Detection of Cancer/methods; Female; Flow Cytometry/methods; Humans; Male; MicroRNAs/blood; MicroRNAs/isolation & purification; Microarray Analysis; Nanotechnology/instrumentation; Nanotechnology/methods; Neoplasm Staging/methods; Neoplasms/genetics; Prognosis; RNA, Neoplasm/isolation & purification; Real-Time Polymerase Chain Reaction; Sensitivity and Specificity; Ultracentrifugation/methods
ABSTRACT
Genetic testing for neurodegenerative diseases (NDs) is highly challenging because of genetic heterogeneity and overlapping manifestations. Targeted-gene panels (TGPs), coupled with next-generation sequencing (NGS), can facilitate the profiling of a large repertoire of ND-related genes. Due to the technical limitations inherent in NGS and TGPs, short tandem repeat (STR) variations are often ignored. However, STR expansions are known to cause NDs such as Huntington's disease and spinocerebellar ataxia type 3 (SCA3). Here, we studied the clinical utility of a custom-made TGP that targets 199 NDs and 311 ND-associated genes in 118 undiagnosed patients. At least one known or likely pathogenic variation was found in 54 patients; 27 patients demonstrated clinical profiles that matched the variants; and the original diagnoses of 16 patients were refined. A high concordance of variant calling was observed when comparing the results from TGP and whole-exome sequencing of four patients. Our in-house STR detection algorithm reached a specificity of 0.88 and a sensitivity of 0.82 in our SCA3 cohort. This study also uncovered a trove of novel and recurrent variants that may enrich the repertoire of ND-related genetic markers. We propose that a combined comprehensive TGP-bioinformatics pipeline can improve the clinical diagnosis of NDs.
ABSTRACT
In our previous study, we detected the effects of centrifugal forces on plasma RNA quantification by quantitative reverse transcription PCR. The aims of this study were to perform targeted mRNA sequencing and data analysis in healthy donors' plasma prepared by two centrifugation protocols and to investigate the effects of centrifugal forces on plasma mRNA quality and quantity. Targeted mRNA sequencing was performed using a custom panel of 108 colorectal cancer-related genes in plasma from 18 healthy donors that was prepared by (1) 3,500 g for 10 min at 4°C and (2) 1,600 g for 10 min at 4°C followed by 16,000 g for 10 min at 4°C. Results showed that plasma ribosomal RNA was detected in 16/18 (88.9%) samples centrifuged at 3,500 g and 6/18 (33.3%) samples centrifuged at 1,600 g followed by 16,000 g. For targeted sequencing, 75/108 (69.4%) and 86/108 (79.6%) genes were detected with 3,500 g and with 1,600 g followed by 16,000 g, respectively, while 16/108 (14.8%) genes were not detected with either protocol. Detailed analysis showed that 2 of 108 (1.85%) genes showed lower expression in 3,500 g than in 1,600 g followed by 16,000 g. The median expression of genes in 3,500 g was positively correlated with the expression in 1,600 g followed by 16,000 g (R² = 0.9471, P < 0.0001, Spearman rank correlation). Meanwhile, plasma samples were not distinctively clustered by centrifugal force according to hierarchical clustering. Targeted mRNA sequencing and subsequent data analysis were performed in this study to investigate the effects of two different centrifugal forces that are commonly used in plasma collection. Our targeted sequencing results help to understand the effects of centrifugal force on plasma mRNA, and these findings show that the centrifugation protocol for plasma mRNA research using targeted sequencing can be standardized, which will facilitate multicenter studies for comparison and quality assurance in the future.
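The percentages in this abstract follow directly from the stated counts. The sketch below is a quick arithmetic check, not part of the study's analysis: it recomputes each percentage and derives how many genes were detected by both protocols. It assumes that the 16 undetected genes were missed by both protocols, which is one reading of the abstract, and the helper names `percent` and `genes_detected_by_both` are hypothetical.

```python
from fractions import Fraction

# Counts as stated in the abstract: (numerator, denominator, reported percent).
reported = {
    "rRNA detected, 3,500 g": (16, 18, 88.9),
    "rRNA detected, 1,600 g then 16,000 g": (6, 18, 33.3),
    "genes detected, 3,500 g": (75, 108, 69.4),
    "genes detected, 1,600 g then 16,000 g": (86, 108, 79.6),
    "genes detected by neither protocol (assumed reading)": (16, 108, 14.8),
    "genes with lower expression in 3,500 g": (2, 108, 1.85),
}


def percent(numerator, denominator):
    """Exact ratio converted to a percentage."""
    return float(Fraction(numerator, denominator) * 100)


def genes_detected_by_both(n_a, n_b, n_neither, n_total):
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return n_a + n_b - (n_total - n_neither)


for label, (num, den, stated) in reported.items():
    print(f"{label}: {num}/{den} = {percent(num, den):.2f}% (reported {stated}%)")

both = genes_detected_by_both(75, 86, 16, 108)
print(f"genes detected by both protocols (derived): {both}")
```

Under that assumption, the overlap between the two protocols is a derived figure rather than one the abstract reports.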
ABSTRACT
This year marks the 48th anniversary of Francis Crick's seminal work on the origin of the genetic code, in which he first proposed the "frozen accident" hypothesis to describe evolutionary selection against changes to the genetic code that cause devastating global proteome modification. However, numerous efforts have demonstrated the viability of both natural and artificial genetic code variations. Recent advances in genetic engineering allow the creation of synthetic organisms that incorporate noncanonical, or even unnatural, amino acids into the proteome. Currently, successful genetic code engineering is mainly achieved by creating orthogonal aminoacyl-tRNA/synthetase pairs to repurpose stop and rare codons or to induce quadruplet codons. In this review, we summarize the current progress in genetic code engineering and discuss the challenges, current understanding, and future perspectives regarding genetic code modification.
ABSTRACT
INTRODUCTION: The genetic architecture of diabetes has been extensively studied. Numerous genetic markers for diabetes have been reported. However, the translation of such knowledge into clinical interventions has been inadequate. AREAS COVERED: We performed a literature search on various frontiers in diabetes treatment that could be improved using genetic information: (1) understanding the mechanisms of existing antidiabetic drugs, (2) repurposing existing drugs for the treatment of diabetes, (3) complementing clinical trial findings, (4) finding novel treatment approaches, and (5) better estimating the efficacy of metabolic surgery. EXPERT COMMENTARY: The translation of genetic information to clinical intervention requires further study, including the development of an appropriate genetic risk score algorithm for type 2 diabetes. Genomic studies provide empirical explanations for clinical trial findings. Moreover, the mechanisms of antidiabetic drugs should be thoroughly investigated to enable clinical trials and pharmacogenomics studies of these drugs. As metabolic surgery becomes more prevalent for the treatment of diabetes, genetic approaches may improve patient prioritization.
ABSTRACT
Hereditary spastic paraplegias (HSPs) are a group of heterogeneous neurodegenerative disorders that often present with overlapping phenotypes such as progressive paraparesis and spasticity. To assist the diagnosis of HSP subtypes, next-generation sequencing is often used to provide supporting evidence. In this study, we report the case of two probands from the same family with HSP symptoms, including bilateral lower limb weakness, unsteady gait, cognitive decline, dysarthria, and slurring of speech since the age of 14. Subsequent whole-genome sequencing revealed that the patients are compound heterozygous for variants in the SPG11 gene, including the paternally inherited c.6856C>T (p.Arg2286*) variant and the novel maternally inherited c.2316+5G>A splice-donor region variant. Variants in SPG11 are the common cause of autosomal recessive spastic paraplegia type 11. According to the ClinVar database, there are already 101 reported pathogenic variants in SPG11 that are associated with HSPs. To our knowledge, this is the first report of SPG11 variants in our local population. The novel splice variant identified in this study enriches the catalog of SPG11 variants, potentially leading to better genetic diagnosis of HSPs.
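The nucleotide and protein coordinates of the nonsense variant can be cross-checked with simple arithmetic. The sketch below assumes the standard genetic code and that coding (c.) positions are 1-based from the first base of the start codon. It does not consult the SPG11 reference sequence, and the helper names are hypothetical.

```python
# Stop and arginine codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}


def codon_of_cds_position(cds_pos):
    """Return (codon number, 1-based offset within the codon) for a 1-based CDS position."""
    if cds_pos < 1:
        raise ValueError("CDS positions are 1-based")
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1


def arg_codons_made_stop(offset, ref_base, alt_base):
    """List Arg codons that become a stop codon when ref_base at `offset` changes to alt_base."""
    hits = []
    for codon in sorted(ARG_CODONS):
        if codon[offset - 1] != ref_base:
            continue
        mutated = codon[:offset - 1] + alt_base + codon[offset:]
        if mutated in STOP_CODONS:
            hits.append((codon, mutated))
    return hits


codon_number, offset = codon_of_cds_position(6856)
candidates = arg_codons_made_stop(offset, "C", "T")
print(codon_number, offset, candidates)
```

Position 6856 falls on the first base of codon 2286, which agrees with p.Arg2286*. Under the standard code, the only arginine codon that a first-position C>T change turns into a stop is CGA (giving TGA), so the reference codon can be inferred without the sequence itself.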
Subjects
Proteins/genetics; Spastic Paraplegia, Hereditary/genetics; Adult; Base Sequence/genetics; Female; Heterozygote; Humans; Male; Mutation; Paraplegia/genetics; Pedigree; Phenotype; Proteins/metabolism; RNA Splice Sites/genetics; RNA Splicing/genetics; Spastic Paraplegia, Hereditary/metabolism; Whole Genome Sequencing/methods
ABSTRACT
We report the draft genome sequence of an extensively drug-resistant strain of Acinetobacter baumannii, CUAB1, isolated from a patient in a local Hong Kong hospital. MIC testing was performed, and genes previously associated with drug resistance were located.
ABSTRACT
The size of digital data is ever increasing and is expected to grow to 40,000 EB by 2020, yet the estimated global information storage capacity in 2011 was <300 EB, indicating that most of the data are transient. DNA, as a very stable nano-molecule, is an ideal massive storage device for long-term data archiving. The two most notable illustrations are from Church et al. and Goldman et al., whose approaches are well optimized for most sequencing platforms: short synthesized DNA fragments without homopolymers. Here, we suggest improvements to the error handling methodology that could enable the integration of DNA-based computational processes, e.g., algorithms based on self-assembly of DNA. As a proof of concept, a picture of size 438 bytes was encoded to DNA with a low-density parity-check error-correction code. We salvaged a significant portion of sequencing reads with mutations generated during DNA synthesis and sequencing and successfully reconstructed the entire picture. A modular programming framework, DNAcodec, with an eXtensible Markup Language-based data format was also introduced. Our experiments demonstrated the practicability of long DNA message recovery with high error tolerance, which opens the field to biocomputing and synthetic biology.
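To make the idea of writing bytes into DNA concrete, the sketch below uses a naive mapping of two bits per nucleotide. This is an illustrative assumption only: it is not DNAcodec, it includes no low-density parity-check coding, and it is not the scheme of Church et al. or Goldman et al. The base assignment and the function names are hypothetical.

```python
BASES = "ACGT"  # hypothetical assignment: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def encode(data: bytes) -> str:
    """Map each byte to four nucleotides, two bits per base, high bits first."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def decode(seq: str) -> bytes:
    """Invert encode(); raises ValueError on bad length or unknown bases."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of one repeated base."""
    longest = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        prev = base
        longest = max(longest, run)
    return longest


message = b"DNA"
strand = encode(message)
print(strand, decode(strand) == message, longest_homopolymer(strand))
```

At two bits per base, a 438-byte payload like the one in the abstract would need 1,752 nucleotides before any error-correction overhead. The mapping also shows why homopolymers matter: a run of zero bytes becomes one long poly-A stretch, the kind of sequence that homopolymer-free designs avoid.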
ABSTRACT
The 20 canonical amino acids of the genetic code have been invariant over 3 billion years of biological evolution. Although various aminoacyl-tRNA synthetases can charge their cognate tRNAs with amino acid analogs, there has been no known displacement of any canonical amino acid from the code. Experimental departure from this universal protein alphabet comprising the canonical amino acids was first achieved in the mutants of the Bacillus subtilis QB928 strain, which after serial selection and mutagenesis led to the HR23 strain that could use 4-fluorotryptophan (4FTrp) but not canonical tryptophan (Trp) for propagation. To gain insight into this displacement of Trp from the genetic code by 4FTrp, genome sequencing was performed on LC33 (a precursor strain of HR23), HR23, and TR7 (a revertant of HR23 that regained the capacity to propagate on Trp). Compared with QB928, the negative regulator mtrB of Trp transport was found to be knocked out in LC33, HR23, and TR7, and sigma factor sigB was mutated in HR23 and TR7. Moreover, rpoBC encoding RNA polymerase subunits were mutated in three independent isolates of TR7 relative to HR23. Increased expression of sigB was also observed in HR23 and in TR7 growing under 4FTrp. These findings indicated that stabilization of the genetic code can be provided by just a small number of analog-sensitive proteins, forming an oligogenic barrier that safeguards the canonical amino acids throughout biological evolution.