ABSTRACT
BACKGROUND: Uniform random sampling of mass-balanced flux solutions offers an unbiased appraisal of the capabilities of metabolic networks. Unfortunately, it is impossible to avoid thermodynamically infeasible loops in flux samples when using convex samplers on large metabolic models. Current strategies for randomly sampling the non-convex loopless flux space display limited efficiency and lack theoretical guarantees. RESULTS: Here, we present LooplessFluxSampler, an efficient algorithm for exploring the loopless mass-balanced flux solution space of metabolic models, based on an Adaptive Directions Sampling on a Box (ADSB) algorithm. ADSB is rooted in the general Adaptive Direction Sampling (ADS) framework, specifically the Parallel ADS, for which theoretical convergence and irreducibility results are available for sampling from arbitrary distributions. By sampling directions that adapt to the target distribution, ADSB traverses the sample space more efficiently, achieving faster mixing than other methods. Importantly, the presented algorithm is guaranteed to target the uniform distribution over convex regions, and it provably converges to this distribution over more general (non-convex) regions provided the sample can have full support. CONCLUSIONS: LooplessFluxSampler enables scalable statistical inference of the loopless mass-balanced solution space of large metabolic models. Grounded in a theoretically sound framework, this toolbox provides not only efficient but also reliable results for exploring the properties of the almost surely non-convex loopless flux space. Finally, LooplessFluxSampler includes a Markov chain diagnostics suite for assessing the quality of the final sample and the performance of the algorithm.
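The idea of drawing directions from the current population of points and moving along chords of a bounding box can be illustrated with a toy sketch. This is not the LooplessFluxSampler implementation: the function names, the membership test `inside`, and the simple accept-or-stay rule are assumptions made for illustration, and no mass-balance or loop-law constraints are modelled.

```python
import random

def chord_in_box(x, d, lo, hi):
    """Return (t_min, t_max) such that x + t*d stays inside the box [lo, hi]."""
    t_min, t_max = float("-inf"), float("inf")
    for xi, di, l, h in zip(x, d, lo, hi):
        if di == 0.0:
            continue  # this coordinate does not change along d
        t1, t2 = (l - xi) / di, (h - xi) / di
        t_min = max(t_min, min(t1, t2))
        t_max = min(t_max, max(t1, t2))
    return t_min, t_max

def sample_box_region(inside, lo, hi, population, n_steps, rng):
    """Toy population-based sampler on a region contained in a box.

    inside     -- hypothetical membership test for the (possibly non-convex) region
    population -- at least three starting points, all satisfying inside()
    """
    pop = [list(p) for p in population]
    samples = []
    for _ in range(n_steps):
        # Pick the point to move and two other points that define the direction.
        i, j, k = rng.sample(range(len(pop)), 3)
        d = [a - b for a, b in zip(pop[j], pop[k])]
        if all(c == 0.0 for c in d):
            continue  # degenerate direction, skip this step
        t_min, t_max = chord_in_box(pop[i], d, lo, hi)
        t = rng.uniform(t_min, t_max)
        candidate = [xi + t * di for xi, di in zip(pop[i], d)]
        # Accept the move only if the candidate lies in the region; otherwise stay.
        if inside(candidate):
            pop[i] = candidate
        samples.append(list(pop[i]))
    return samples
```

A caller would supply its own `inside` predicate, for example an L-shaped subset of the unit square, together with a seeded `random.Random` instance for reproducibility.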
Subject(s)
Algorithms , Biological Models , Metabolic Networks and Pathways , Research Design , Physiological Adaptation
ABSTRACT
Chinese hamster ovary (CHO) cells have rapidly become a cornerstone in biopharmaceutical production. Recently, a reinvigoration of perfusion culture mode in CHO cell cultivation has been observed. However, most cell lines currently in use have been engineered and adapted for fed-batch culture methods, and may not perform optimally under perfusion conditions. To improve cell resilience and viability during perfusion culture, we cultured a triple knockout CHO cell line, deficient in the three apoptosis-related genes BAX, BAK, and BOK, in a perfusion system. After 20 days of culture, the cells exhibited a halt in cell proliferation. Interestingly, following this phase of growth arrest, the cells entered a second growth phase. During this phase, the cell numbers nearly doubled, but cell-specific productivity decreased. We performed a proteomics investigation, elucidating a distinct correlation between growth arrest and cell cycle arrest and showing an upregulation of the central carbon metabolism and oxidative phosphorylation. The upregulation was partially reverted during the second growth phase, likely caused by intragenerational adaptations to stresses encountered. A phase-dependent response to oxidative stress was noted, indicating glutathione has only a secondary role during cell cycle arrest. Our data provide evidence of metabolic regulation under high cell density culturing conditions and demonstrate that cell growth arrest can be overcome. The acquired insights have the potential to not only enhance our understanding of cellular metabolism but also contribute to the development of superior cell lines for perfusion cultivation.
Subject(s)
Batch Cell Culture Techniques , Bioreactors , Cricetinae , Animals , Cricetulus , CHO Cells , Batch Cell Culture Techniques/methods , Perfusion
ABSTRACT
Patient blood samples are invaluable in clinical omics databases, yet current methodologies often fail to fully uncover the molecular mechanisms driving patient pathology. While genome-scale metabolic models (GEMs) show promise in systems medicine by integrating various omics data, having only exometabolomic data remains a limiting factor. To address this gap, we introduce a comprehensive pipeline integrating GEMs with patient plasma metabolome. This pipeline constructs case-specific GEMs using literature-based and patient-specific metabolomic data. Novel computational methods, including adaptive sampling and an in-house developed algorithm for the rational exploration of the sampled space of solutions, enhance integration accuracy while improving computational performance. Model characterization involves task analysis in combination with clustering methods to identify critical cellular functions. The new pipeline was applied to a cohort of trauma patients to investigate shock-induced endotheliopathy using patient plasma metabolome data. By analyzing endothelial cell metabolism comprehensively, the pipeline identified critical therapeutic targets and biomarkers that can potentially contribute to the development of therapeutic strategies. Our study demonstrates the efficacy of integrating patient plasma metabolome data into computational models to analyze endothelial cell metabolism in disease contexts. This approach offers a deeper understanding of metabolic dysregulations and provides insights into diseases with metabolic components and potential treatments.
Subject(s)
Endothelial Cells , Metabolome , Metabolomics , Humans , Endothelial Cells/metabolism , Metabolomics/methods , Biological Models , Algorithms , Biomarkers/blood , Computational Biology/methods
ABSTRACT
The dominant method for generating Chinese hamster ovary (CHO) cell lines that produce high titers of biotherapeutic proteins utilizes selectable markers such as dihydrofolate reductase (Dhfr) or glutamine synthetase (Gs), alongside inhibitory compounds like methotrexate or methionine sulfoximine, respectively. Recent work has shown the importance of asparaginase (Aspg) for growth in media lacking glutamine, the selection medium for Gs-based selection systems. We generated a Gs/Aspg double knockout CHO cell line and evaluated its utility as a novel dual selectable system via co-transfection of Gs-Enbrel and Aspg-Enbrel plasmids. Using the same selection conditions as the standard Gs system, the resulting cells from the Gs/Aspg dual selection showed substantially improved specific productivity and titer compared to the standard Gs selection method, although with reduced growth rate and viability. Following adaptation in the selection medium, the cells improved viability and growth while still achieving ~5-fold higher specific productivity and ~3-fold higher titer than Gs selection alone. We anticipate that with further optimization of culture medium and selection conditions, this approach would serve as an effective addition to workflows for the industrial production of recombinant biotherapeutics.
Subject(s)
Asparaginase , Glutamate-Ammonia Ligase , Cricetinae , Animals , Cricetulus , CHO Cells , Glutamate-Ammonia Ligase/genetics , Glutamate-Ammonia Ligase/metabolism , Glutamine/metabolism , Glutamine/pharmacology , Etanercept , Recombinant Proteins/genetics
ABSTRACT
The topology of metabolic networks is recognisably modular, with modules weakly connected apart from sharing a pool of currency metabolites. Here, we defined modules as sets of reversible reactions isolated from the rest of metabolism by irreversible reactions except for the exchange of currency metabolites. Our approach identifies topologically independent modules under specific conditions associated with different metabolic functions. As case studies, the E. coli iJO1366 and Human Recon 2.2 genome-scale metabolic models were split into 103 and 321 modules, respectively, displaying significant correlation patterns in expression data. Finally, we addressed a fundamental question about the metabolic flexibility conferred by reversible reactions: "Of all Directed Topologies (DTs) defined by fixing directions to all reversible reactions, how many are capable of carrying flux through all reactions?". Enumeration of the DTs for the iJO1366 model was performed using an efficient depth-first search algorithm, rejecting infeasible DTs based on mass-imbalanced and loopy flux patterns. We found that the direction of 79% of reversible reactions must be defined before all directions in the network can be fixed, granting a high degree of flexibility.
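The depth-first enumeration with early rejection mentioned above can be sketched generically. The sketch below assumes a caller-supplied predicate `is_feasible` that can reject a partial direction assignment; the abstract's actual mass-balance and loop checks are not reproduced here, and all names are hypothetical.

```python
def enumerate_directions(n_reversible, is_feasible):
    """Enumerate direction assignments (+1 forward, -1 backward) by depth-first search.

    is_feasible -- hypothetical predicate taking a partial assignment (list of +1/-1)
                   and returning False as soon as it can be ruled out.
    """
    results = []

    def dfs(partial):
        # Prune: no extension of an infeasible partial assignment is explored.
        if not is_feasible(partial):
            return
        if len(partial) == n_reversible:
            results.append(tuple(partial))
            return
        for direction in (+1, -1):
            partial.append(direction)
            dfs(partial)
            partial.pop()

    dfs([])
    return results
```

Because infeasible prefixes are discarded before they are extended, whole subtrees of the search are skipped, which is what makes this kind of enumeration tractable when many assignments are infeasible.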
Subject(s)
Metabolic Networks and Pathways , Biological Models , Algorithms , Escherichia coli/genetics , Escherichia coli/metabolism , Genome , Humans , Metabolic Networks and Pathways/genetics
ABSTRACT
In trauma patients, shock-induced endotheliopathy (SHINE) is associated with a poor prognosis. We have previously identified four metabolic phenotypes in a small cohort of trauma patients (N = 20) and displayed the intracellular metabolic profile of the endothelial cell by integrating quantified plasma metabolomic profiles into a genome-scale metabolic model (iEC-GEM). Here, we conducted a retrospective observational study of 99 trauma patients admitted to a Level 1 Trauma Center. Plasma metabolites in admission samples were measured by mass spectrometry. Quantified metabolites were analyzed by computational network analysis of the iEC-GEM. Four plasma metabolic phenotypes (A-D) were identified, of which phenotype D was associated with an increased injury severity score (p < 0.001); 90% (91.6%) of the patients who died within 72 h possessed this phenotype. The inferred EC metabolic patterns were found to be different between phenotypes A and D. Phenotype D was unable to maintain adequate redox homeostasis. We confirm that trauma patients presented four metabolic phenotypes at admission. Phenotype D was associated with increased mortality. Different EC metabolic patterns were identified between phenotypes A and D, and the inability to maintain adequate redox balance may be linked to the high mortality.
Subject(s)
Shock , Humans , Prospective Studies , Phenotype , Metabolomics , Endothelial Cells
ABSTRACT
The human microbiome has been linked to several diseases. Gastrointestinal diseases remain one of the most prominent areas of study in host-microbiome interactions; however, the underlying microbial mechanisms in these disorders are not fully established. Irritable bowel syndrome (IBS) remains one of the prominent disorders with significant changes in gut microbiome composition and without definitive treatment. IBS has a severe socioeconomic impact and affects patients' lifestyles. Association studies between IBS and the microbiome have shed light on the relevance of microbial composition, and hence microbiome-based trials were designed. However, there is no clear evidence of a potential treatment for IBS. This review summarizes the epidemiology and socioeconomic impact of IBS and then focuses on microbiome observational studies and clinical trials. Finally, we propose a new perspective on using a data-driven approach and applying computational modelling and machine learning to design microbiome-aware personalized treatments for IBS.
Subject(s)
Gastrointestinal Microbiome , Irritable Bowel Syndrome , Microbiota , Humans , Irritable Bowel Syndrome/diagnosis , Irritable Bowel Syndrome/therapy
ABSTRACT
The ever-increasing demand for biopharmaceuticals has created the need for improving the overall productivity of culture processes. One operational concept under consideration is fed-batch operation as opposed to batch operation. However, optimal fed-batch operations require complete knowledge of the cell culture to optimize the culture conditions and the nutrient feeding. For example, when using high-throughput small-scale bioreactors to test multiple clones that do not behave the same, depletion or overfeeding of some key components can occur if the feeding strategy is not individually optimized. In recent years, various solutions for real-time measurement of the main cell culture metabolites have been proposed. Still, the complexity of implementing these techniques has limited their use. Soft-sensors present an opportunity to overcome these limitations by indirectly estimating these variables in real time. This manuscript details the development of a new soft-sensor-based fed-batch strategy to maintain substrate concentrations (glucose and glutamine) at optimal levels in small-scale multiparallel Chinese hamster ovary cell cultures. Two alternatives to the standard feeding strategy were tested: an oxygen uptake rate (OUR) soft-sensor-based strategy for glucose and glutamine (Strategy 1) and a dual soft-sensor strategy using OUR for glutamine and CO2/alkali addition for glucose (Strategy 2). The results demonstrated the applicability of the OUR soft-sensor-based strategy to optimize glucose and glutamine feedings, which yielded a 21% increase in final viable cell density (VCD) and a 31% increase in erythropoietin titer compared with the reference strategy. However, the CO2/alkali addition soft-sensor suffered from insufficient data to relate alkali addition to glucose consumption. As a result, the culture was overfed with glucose, resulting in a 4% increase in final VCD but a 9% decrease in final titer compared with the reference strategy.
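As a rough illustration of how an oxygen uptake rate (OUR) signal could drive a feed decision, the sketch below assumes a fixed stoichiometric ratio between oxygen and substrate consumption. The ratio, the function name, the units, and the neglect of dilution are all assumptions for illustration; the abstract does not describe the actual soft-sensor equations.

```python
def feed_volume_from_our(our_mmol_per_l_h, o2_per_substrate, dt_h,
                         culture_volume_l, feed_conc_mmol_per_l):
    """Estimate the feed volume (L) that replaces substrate consumed over dt_h.

    our_mmol_per_l_h     -- measured oxygen uptake rate, mmol O2 / (L h)
    o2_per_substrate     -- assumed mmol O2 consumed per mmol substrate (hypothetical constant)
    feed_conc_mmol_per_l -- substrate concentration in the feed solution
    """
    if o2_per_substrate <= 0 or feed_conc_mmol_per_l <= 0:
        raise ValueError("ratio and feed concentration must be positive")
    # Indirect estimate of the volumetric substrate uptake rate, mmol / (L h).
    substrate_rate = our_mmol_per_l_h / o2_per_substrate
    # Total substrate consumed in the interval, mmol (dilution by the feed is ignored).
    consumed_mmol = substrate_rate * dt_h * culture_volume_l
    return consumed_mmol / feed_conc_mmol_per_l
```

In practice the ratio would have to be calibrated per clone and may drift over the culture, which is one reason a soft-sensor of this kind needs sufficient data behind it.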
Subject(s)
Carbon Dioxide , Glutamine , Alkalies , Animals , Batch Cell Culture Techniques/methods , Bioreactors , CHO Cells , Cell Culture Techniques/methods , Cricetinae , Cricetulus , Glucose/metabolism , Glutamine/metabolism
ABSTRACT
Chinese hamster ovary (CHO) cells are the primary platform for the production of biopharmaceuticals. To increase yields, many CHO cell lines have been genetically engineered to resist cell death. However, the kinetics that govern cell fate in bioreactors are confounded by many variables associated with batch processes. Here, we used CRISPR-Cas9 to create combinatorial knockouts of the three known BCL-2 family effector proteins: Bak1, Bax, and Bok. To assess the response to apoptotic stimuli, cell lines were cultured in the presence of four cytotoxic compounds with different mechanisms of action. A population-based model was developed to describe the behavior of the resulting viable cell dynamics as a function of genotype and treatment. Our results validated the synergistic antiapoptotic nature of Bak1 and Bax, while the deletion of Bok had no significant impact. Importantly, the uniform application of apoptotic stresses permitted direct observation and quantification of a delay in the onset of cell death through Bayesian inference of meaningful model parameters. In addition to the classical death rate, a delay function was found to be essential in the accurate modeling of the cell death response. These findings represent an important bridge between cell line engineering strategies and biological modeling in a bioprocess context.
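To make the role of a delay in the onset of cell death concrete, the following sketch assumes the simplest possible form: exponential growth throughout, with a constant death rate that switches on only after a fixed delay. The abstract does not state the functional form of its delay function, so the step delay and all parameter names here are assumptions.

```python
import math

def viable_cells(t, x0, growth_rate, death_rate, delay):
    """Viable cell count at time t for dX/dt = (growth_rate - death_rate * H(t - delay)) * X.

    H is a step function, so death only acts after `delay` (assumed form).
    """
    # Time during which the death term has been active.
    exposure = max(0.0, t - delay)
    return x0 * math.exp(growth_rate * t - death_rate * exposure)
```

Before the delay the curve is indistinguishable from unperturbed growth; afterwards the net rate drops to `growth_rate - death_rate`, so a later onset of death shows up directly as a higher viable cell count at any given time.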
Subject(s)
Apoptosis , Proto-Oncogene Proteins c-bcl-2 , Animals , Apoptosis/genetics , Bayes Theorem , CHO Cells , Cricetinae , Cricetulus , Proto-Oncogene Proteins c-bcl-2/genetics , Proto-Oncogene Proteins c-bcl-2/metabolism , bcl-2-Associated X Protein/genetics , bcl-2-Associated X Protein/metabolism
ABSTRACT
Pseudomonas putida is evolutionarily endowed with features relevant for bioproduction, especially under harsh operating conditions. The rich metabolic versatility of this species, however, comes at the price of limited formation of acetyl-coenzyme A (CoA) from sugar substrates. Since acetyl-CoA is a key metabolic precursor for a number of added-value products, in this work we deployed an in silico-guided rewiring program of central carbon metabolism for upgrading P. putida as a host for acetyl-CoA-dependent bioproduction. An updated kinetic model, integrating fluxomics and metabolomics datasets in addition to manually curated information on enzyme mechanisms, identified targets that would lead to increased acetyl-CoA levels. Based on these predictions, a set of plasmids based on clustered regularly interspaced short palindromic repeats (CRISPR) and dead CRISPR-associated protein 9 (dCas9) was constructed to silence genes by CRISPR interference (CRISPRi). Dynamic reduction of gene expression of two key targets (gltA, encoding citrate synthase, and the essential accA gene, encoding subunit A of the acetyl-CoA carboxylase complex) mediated an 8-fold increase in the acetyl-CoA content of rewired P. putida. Poly(3-hydroxybutyrate) (PHB) was adopted as a proxy of acetyl-CoA availability, and two synthetic pathways were engineered for biopolymer accumulation. By including cell morphology as an extra target for the CRISPRi approach, fully rewired P. putida strains programmed for PHB accumulation had a 5-fold increase in PHB titers in bioreactor cultures using glucose. Thus, the strategy described herein allowed for rationally redirecting metabolic fluxes in P. putida from central metabolism towards product biosynthesis, which is especially relevant when deletion of essential pathways is not an option.
Subject(s)
Pseudomonas putida , Acetyl Coenzyme A/genetics , Acetyl Coenzyme A/metabolism , Citrate (si)-Synthase/genetics , Clustered Regularly Interspaced Short Palindromic Repeats , Metabolic Engineering , Plasmids , Pseudomonas putida/genetics , Pseudomonas putida/metabolism
ABSTRACT
Chinese hamster ovary (CHO) cells are widely used in biopharmaceutical production. Improvements to cell lines and bioprocesses are constantly being explored. One of the major limitations of CHO cell culture is that the cells undergo apoptosis, leading to rapid cell death, which impedes reaching high recombinant protein titres. While several genetic engineering strategies have been successfully employed to reduce apoptosis, there is still room to further enhance CHO cell line performance. 'Omics analysis is a powerful tool for better understanding different phenotypes and for identifying gene targets for engineering. Here, we present a comprehensive review of previous CHO 'omics studies that revealed changes in the expression of apoptosis-related genes. We highlight targets for genetic engineering that have reduced, or have the potential to reduce, apoptosis or to increase cell proliferation in CHO cells, with the final aim of increasing productivity.
Subject(s)
Apoptosis , Cell Proliferation , Proteomics , Animals , CHO Cells , Cricetulus , Recombinant Proteins/biosynthesis , Recombinant Proteins/genetics
ABSTRACT
BACKGROUND: Cell line-specific, genome-scale metabolic models enable rigorous and systematic in silico investigation of cellular metabolism. Such models have recently become available for Chinese hamster ovary (CHO) cells. However, a key ingredient, namely an experimentally validated biomass function that summarizes the cellular composition, was so far missing. Here, we close this gap by providing extensive experimental data on the biomass composition of 13 parental and producer CHO cell lines under various conditions. RESULTS: We report total protein, lipid, DNA, RNA and carbohydrate content, cell dry mass, and detailed protein and lipid composition. Furthermore, we present meticulous data on exchange rates between cells and environment and provide detailed experimental protocols on how to determine all of the above. The biomass composition is converted into cell line- and condition-specific biomass functions for use in cell line-specific, genome-scale metabolic models of CHO. Finally, flux balance analysis (FBA) is used to demonstrate consistency between in silico predictions and experimental analysis. CONCLUSIONS: Our study reveals a strong variability of the total protein content and cell dry mass across cell lines. However, the relative amino acid composition is independent of the cell line and condition and thus need not be explicitly measured for each new cell line. In contrast, the lipid composition is strongly influenced by the growth media and thus will have to be determined in each case. These cell line-specific variations in biomass composition have a small impact on growth rate predictions with FBA, as inaccuracies in the predictions are instead dominated by inaccuracies in the exchange rate spectra. Cell-specific biomass variations only become important if the experimental errors in the exchange rate spectra drop below twenty percent.
Subject(s)
Biomass , Computer Simulation , Biological Models , Animals , CHO Cells , Cricetulus , Culture Media/analysis , Culture Media/chemistry
ABSTRACT
Native to propionibacteria, the Wood-Werkman cycle enables propionate production via succinate decarboxylation. Current limitations in engineering propionibacteria strains have redirected attention toward the heterologous production in model organisms. Here, we report the functional expression of the Wood-Werkman cycle in Escherichia coli to enable propionate and 1-propanol production. The initial proof-of-concept attempt showed that the cycle can be used for production. However, production levels were low (0.17 mM). In silico optimization of the expression system by operon rearrangement and ribosomal-binding site tuning improved performance by fivefold. Adaptive laboratory evolution further improved performance redirecting almost 30% of total carbon through the Wood-Werkman cycle, achieving propionate and propanol titers of 9 and 5 mM, respectively. Rational engineering to reduce the generation of byproducts showed that lactate (∆ldhA) and formate (∆pflB) knockout strains exhibit an improved propionate and 1-propanol production, while the ethanol (∆adhE) knockout strain only showed improved propionate production.
Subject(s)
Escherichia coli/genetics , Escherichia coli/metabolism , Metabolic Engineering/methods , Propionates/metabolism , Computer Simulation , Metabolic Networks and Pathways/genetics , Succinic Acid/metabolism
ABSTRACT
Chinese hamster ovary (CHO) cells are the predominant host cell line for the production of biopharmaceuticals, a growing industry currently worth more than $188 billion USD in global sales. CHO cells undergo programmed cell death (apoptosis) following different stresses encountered in cell culture, such as substrate limitation, accumulation of toxic by-products, and mechanical shear, hindering production. Genetic engineering strategies to reduce apoptosis in CHO cells have been investigated with mixed results. In this review, a contemporary understanding of the real complexity of apoptotic mechanisms and signaling pathways is described, followed by an overview of antiapoptotic cell line engineering strategies tested so far in CHO cells.
Subject(s)
Apoptosis , Biological Products/metabolism , CHO Cells , Cell Engineering , Animals , Cell Culture Techniques , Cricetinae , Cricetulus
ABSTRACT
The published online version of this paper contains a mistake: the authors' first and last names were interchanged. The correct version is given above.
ABSTRACT
Tetanus is a fatal disease caused by Clostridium tetani infections. To prevent infections, a toxoid vaccine, developed almost a century ago, is routinely used in humans and animals. The vaccine is listed in the World Health Organisation list of Essential Medicines and can be produced and administered very cheaply in the developing world for less than one US Dollar per dose. Recent developments in both analytical tools and frameworks for systems biology provide industry with an opportunity to gain a deeper understanding of the parameters that determine C. tetani virulence and physiological behaviour in bioreactors. Here, we compared a traditional fermentation process with a fermentation medium supplemented with five heavily consumed amino acids. The experiment demonstrated that amino acid catabolism plays a key role in the virulence of C. tetani. The addition of the five amino acids favoured growth, decreased toxin production and changed C. tetani morphology. Using time-course transcriptomics, we created a "fermentation map", which shows that the tetanus toxin transcriptional regulator BotR, P21 and the tetanus toxin gene were downregulated. Moreover, this in-depth analysis revealed potential genes that might be involved in C. tetani virulence regulation. We observed differential expression of genes related to cell separation, surface/cell adhesion, pyrimidine biosynthesis and salvage, flagellar motility, and prophage genes. Overall, the fermentation map shows that, mediated by free amino acid concentrations, virulence in C. tetani is regulated at the transcriptional level and affects a plethora of metabolic functions.
Subject(s)
Amino Acids , Clostridium tetani , Amino Acids/metabolism , Animals , Clostridium tetani/genetics , Clostridium tetani/metabolism , Clostridium tetani/pathogenicity , Humans , Tetanus Toxin/biosynthesis , Tetanus Toxin/genetics , Transcriptome
ABSTRACT
The mammalian heart undergoes maturation during postnatal life to meet the increased functional requirements of an adult. However, the key drivers of this process remain poorly defined. We are currently unable to recapitulate postnatal maturation in human pluripotent stem cell-derived cardiomyocytes (hPSC-CMs), limiting their potential as a model system to discover regenerative therapeutics. Here, we provide a summary of our studies, where we developed a 96-well device for functional screening in human pluripotent stem cell-derived cardiac organoids (hCOs). Through interrogation of >10,000 organoids, we systematically optimize parameters, including extracellular matrix (ECM), metabolic substrate, and growth factor conditions, that enhance cardiac tissue viability, function, and maturation. Under optimized maturation conditions, functional and molecular characterization revealed that a switch to fatty acid metabolism was a central driver of cardiac maturation. Under these conditions, hPSC-CMs were refractory to mitogenic stimuli, and we found that key proliferation pathways including β-catenin and Yes-associated protein 1 (YAP1) were repressed. This proliferative barrier imposed by fatty acid metabolism in hCOs could be rescued by simultaneous activation of both β-catenin and YAP1 using genetic approaches or a small molecule activating both pathways. These studies highlight that human organoids coupled with higher-throughput screening platforms have the potential to rapidly expand our knowledge of human biology and potentially unlock therapeutic strategies.
Subject(s)
Biological Factors/metabolism , Cell Cycle Checkpoints , Cardiac Myocytes/metabolism , Organoids/metabolism , Pluripotent Stem Cells/metabolism , Regeneration/physiology , Adult , Animals , Cell Differentiation , DNA Damage , Humans , Male , Cardiac Myocytes/cytology , Organoids/cytology , Pluripotent Stem Cells/cytology , Sprague-Dawley Rats
ABSTRACT
BACKGROUND: Genome-scale metabolic models (GSMMs) integrating transcriptomics have been widely used to study cancer metabolism. This integration is achieved through logical rules that describe the association between genes, proteins, and reactions (GPRs). However, the current gene-to-reaction formulation lacks the stoichiometry describing the transcript copies necessary to generate an active catalytic unit, which limits our understanding of how genes modulate metabolism. The present work introduces a new GPR formulation that considers the stoichiometry of the transcripts (S-GPR). As a proof of concept, this novel gene-to-reaction formulation was applied to investigate the metabolic effects of chronic exposure to Aldrin, an endocrine disruptor, on DU145 prostate cancer cells. To this end, we integrated the transcriptomic data from Aldrin-exposed and non-exposed DU145 cells through S-GPR or GPR into a human GSMM by applying different constraint-based methods. RESULTS: Our study revealed a significant improvement in metabolite consumption/production predictions when S-GPRs are implemented. Furthermore, our computational analysis unveiled important alterations in the carnitine shuttle and prostaglandin biosynthesis in Aldrin-exposed DU145 cells, which are supported by literature evidence of an enhanced malignant phenotype. CONCLUSIONS: The method developed in this work enables a more accurate integration of gene expression data into model-driven methods. Thus, the presented approach is conceptually new and paves the way for more in-depth studies of aberrant cancer metabolism and of other diseases with a strong metabolic component and important environmental and clinical implications.
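One way to picture what transcript stoichiometry adds to a gene-to-reaction rule is the sketch below. It assumes a common convention that is not taken from the abstract: an enzyme complex is limited by its scarcest subunit after dividing each transcript level by the number of copies required, and alternative isozymes add up. Function names and the data layout are hypothetical.

```python
def complex_capacity(expression, stoichiometry):
    """Capacity of one enzyme complex (AND rule with copy numbers).

    expression    -- dict mapping gene id to transcript level
    stoichiometry -- dict mapping gene id to copies needed per active catalytic unit
    """
    # The subunit that supports the fewest complete units limits the complex.
    return min(expression[gene] / copies for gene, copies in stoichiometry.items())

def reaction_capacity(expression, isozymes):
    """Capacity of a reaction catalysed by alternative complexes (OR rule)."""
    # Each isozyme contributes independently, so capacities are summed.
    return sum(complex_capacity(expression, complex_) for complex_ in isozymes)
```

For instance, with transcript levels of 10 and 6 for two subunits needed in a 2:1 ratio, the complex is limited to 5 units by the first subunit, whereas a rule that ignores copy numbers would report 6.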
Subject(s)
Aldrin/toxicity , Endocrine Disruptors/toxicity , Prostatic Neoplasms/metabolism , Carnitine/metabolism , Tumor Cell Line , Computational Biology , Humans , Lipidomics , Male , Metabolic Networks and Pathways/genetics , Biological Models , Prostaglandins/biosynthesis , Prostatic Neoplasms/chemistry , Prostatic Neoplasms/genetics , Transcriptome
ABSTRACT
The identification of genetic variation with next-generation sequencing is confounded by the complexity of the human genome sequence and by biases that arise during library preparation, sequencing and analysis. We have developed a set of synthetic DNA standards, termed 'sequins', that emulate human genetic features and constitute qualitative and quantitative spike-in controls for genome sequencing. Sequencing reads derived from sequins align exclusively to an artificial in silico reference chromosome, rather than the human reference genome, which allows them to be partitioned for parallel analysis. Here we use this approach to represent common and clinically relevant genetic variation, ranging from single nucleotide variants to large structural rearrangements and copy-number variation. We validate the design and performance of sequin standards by comparison to examples in the NA12878 reference genome, and we demonstrate their utility during the detection and quantification of variants. We provide sequins as a standardized, quantitative resource against which human genetic variation can be measured and diagnostic performance assessed.
Subject(s)
DNA Copy Number Variations , DNA/genetics , Human Genome , Genomics/methods , Single Nucleotide Polymorphism , DNA Sequence Analysis/methods , Artificial Chromosomes/chemistry , Artificial Chromosomes/genetics , DNA/chemical synthesis , DNA/chemistry , Humans , Reference Standards , DNA Sequence Analysis/standards
ABSTRACT
RNA sequencing (RNA-seq) can be used to assemble spliced isoforms, quantify expressed genes and provide a global profile of the transcriptome. However, the size and diversity of the transcriptome, the wide dynamic range in gene expression and inherent technical biases confound RNA-seq analysis. We have developed a set of spike-in RNA standards, termed 'sequins' (sequencing spike-ins), that represent full-length spliced mRNA isoforms. Sequins have an entirely artificial sequence with no homology to natural reference genomes, but they align to gene loci encoded on an artificial in silico chromosome. The combination of multiple sequins across a range of concentrations emulates alternative splicing and differential gene expression, and it provides scaling factors for normalization between samples. We demonstrate the use of sequins in RNA-seq experiments to measure sample-specific biases and determine the limits of reliable transcript assembly and quantification in accompanying human RNA samples. In addition, we have designed a complementary set of sequins that represent fusion genes arising from rearrangements of the in silico chromosome to aid in cancer diagnosis. RNA sequins provide a qualitative and quantitative reference with which to navigate the complexity of the human transcriptome.