ABSTRACT
Adaptive laboratory evolution (ALE) can be used to make bacteria less susceptible to oxidative stress. An alternative to large batch-scale ALE cultures is to use microfluidic platforms, which are often more economical and more efficient. Microfluidic ALE platforms have shown promise, but many have suffered from subpar cell passaging mechanisms and poor spatial definition. A new approach is presented using a microfluidic Evolution on a Chip (EVoc) design that progressively drives microbial cells from areas of lower H2O2 concentration to areas of higher concentration. Prolonged exposure, up to 72 h, revealed the survival of adaptive strains of Lacticaseibacillus rhamnosus GG, a beneficial probiotic often included in food products. After ALE on this microfluidic platform, the bacteria persisted under high H2O2 concentrations in repeated trials. After two progressive exposures, the ability of L. rhamnosus to grow in the presence of H2O2 increased from growth at 1 mM H2O2 after a lag time of 31 h to growth at 1 mM after 21 h, 2 mM after 28 h, and 3 mM after 42 h. The adaptive strains differ from the wild type in morphology and gene expression, and genome sequencing revealed a potentially meaningful single-nucleotide mutation in the protein omega-amidase.
Subject(s)
Hydrogen Peroxide , Lacticaseibacillus rhamnosus , Microfluidics , Oxidative Stress , Probiotics , Oxidative Stress/drug effects , Hydrogen Peroxide/pharmacology , Hydrogen Peroxide/chemistry , Hydrogen Peroxide/metabolism , Lacticaseibacillus rhamnosus/metabolism , Microfluidics/methods , Directed Molecular Evolution/methods
ABSTRACT
Phycocyanin (PC), an algae-extracted colorant, is widely used for its water solubility and fresh blue shade. When PC is added to acidified media, the dispersions are prone to aggregate and decolorize into cloudy systems. To mitigate this problem, chitosans of high, medium, and low molecular weight (HMC, MMC, and LMC) were added to PC dispersions, and their protective effects were compared in terms of physicochemical stability. The optimal chitosan-to-PC mass ratio was identified as 1:5 in preliminary evaluations and was supported by the higher ζ-potential (31.0-32.1 mV), lower turbidity (39.6-43.6 NTU), and polyacrylamide gel electrophoresis results. Interfacial and antioxidant capacity analyses showed that LMC had a higher affinity for PC, which was confirmed by SEM images and by the maximum increase in the transition temperature of their complex (155.70 °C) in DSC measurements. FT-IR and Raman spectroscopy indicated a mechanism of electrostatic interaction reinforced by hydrophobic effects and hydrogen bonding. Further comprehensive stability evaluations revealed that, without light exposure, LMC best preserved PC, from its internal secondary structure to its external blue color. Under light exposure, however, LMC was not as flexible as HMC in protecting chromophores from attack by free radicals.
Subject(s)
Chitosan , Phycocyanin , Phycocyanin/chemistry , Molecular Weight , Fourier Transform Infrared Spectroscopy , Antioxidants/chemistry
ABSTRACT
Hydrostatic pressure can reversibly modulate protein-protein and protein-chromophore interactions of C-phycocyanin (C-PC) from Spirulina platensis. Small-angle X-ray scattering combined with UV-Vis spectrophotometry and protein modeling was used to explore the color and structural changes of C-PC under high-pressure conditions at different pH levels. Pressures up to 350 MPa were enough to fully disassemble C-PC from trimers to monomers at pH 7.0, or from monomers to detached subunits at pH 9.0. These disassemblies were accompanied by protein unfolding, which made the high-pressure-induced structures more extended. The changes were reversible following depressurization. The trimer-to-monomer transition proceeded through a collection of previously unrecognized, L-shaped intermediates resembling C-PC dimers. Additionally, pressurized C-PC showed decayed Q-band absorption and fortified Soret-band absorption. This indicated that the tetrapyrroles, which are folded at ambient pressure, adopted semicyclic unfolded conformations at high pressure. Upon depressurization, both the peak intensity and the peak shift recovered stepwise, showing that pressure can precisely manipulate C-PC's structure as well as its color. Overall, a protein-chromophore regulatory theory of C-PC was unveiled. The pressure tunability could be harnessed to modify and stabilize C-PC's structure and photochemical properties for designing new delivery and optical materials.
Subject(s)
Phycocyanin , Hydrostatic Pressure , Phycocyanin/chemistry , Spectrophotometry
ABSTRACT
This study designed an in-plane resonant micro-accelerometer based on electrostatic stiffness. The accelerometer adopts a one-piece proof mass structure; two double-folded beam resonators are symmetrically distributed inside the proof mass, and only one displacement is introduced under the action of acceleration, which reduces the influence of processing errors on the performance of the accelerometer. The two resonators form a differential structure that can diminish the impact of common-mode errors. The accelerometer separates the introduction of electrostatic stiffness from the detection of resonant frequency, which is conducive to decoupling the accelerometer signals. An improved differential evolution algorithm was developed to optimize the scale factor of the accelerometer. Through a final elimination principle, excellent individuals are preserved, and the most suitable parameters are allocated to the surviving individuals to strengthen the ability of the offspring to find the global optimum. The algorithm maintains global optimality while reducing computational complexity, and it deterministically optimizes the accelerometer scale factor. The electrostatic stiffness resonant micro-accelerometer was fabricated by deep dry silicon-on-glass (DDSOG) technology. The unloaded resonant frequency of the accelerometer resonant beam was between 24 and 26 kHz, and the scale factor of the packaged accelerometer was between 54 and 59 Hz/g. The average error between the optimization result and the actual scale factor was 7.03%. The experimental results verified the rationality of the structural design.
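For readers unfamiliar with differential evolution, the following is a minimal sketch of the classic DE/rand/1/bin scheme. It is not the improved variant described in the abstract, whose elimination principle and parameter allocation are not specified here. The objective `toy_objective`, the bounds, and the values of `F`, `CR`, `pop_size`, and `generations` are all illustrative assumptions; a real use would substitute a model of the accelerometer scale factor.

```python
import random

def differential_evolution(f, bounds, pop_size=20, F=0.5, CR=0.9,
                           generations=200, seed=0):
    """Minimize f over box bounds with classic DE/rand/1/bin (illustrative)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [f(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for d in range(dim):
                if rng.random() < CR or d == j_rand:
                    v = pop[a][d] + F * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    v = min(max(v, lo), hi)  # clip to the bounds
                else:
                    v = pop[i][d]
                trial.append(v)
            trial_cost = f(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]

def toy_objective(x):
    # Hypothetical stand-in for a design objective: the sphere function.
    return sum(t * t for t in x)

best_x, best_cost = differential_evolution(toy_objective, [(-5.0, 5.0)] * 2)
```

Because selection is greedy, the best cost never increases from one generation to the next; to maximize a quantity such as a scale factor, one would minimize its negative.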
ABSTRACT
Repurposing existing drugs for new therapeutic indications can improve success rates and streamline development. Large-scale biomedical data repositories, including eQTL regulatory relationships and genome-wide disease risk associations, offer opportunities to propose novel indications for drugs targeting common or convergent molecular candidates associated with two or more diseases. The proposed computational approach scales across 262 complex diseases, building a multi-partite hierarchical network that integrates (i) GWAS-derived SNP-to-disease associations, (ii) eQTL-derived SNP-to-eGene associations incorporating both cis- and trans-relationships from 19 tissues, (iii) protein target-to-drug associations, and (iv) drug-to-disease indications, together with (v) Gene Ontology-based information-theoretic semantic (ITS) similarity calculated between protein target functions. Our hypothesis is that if two diseases are associated with a common or functionally similar eGene, and a drug targeting that eGene/protein in one disease exists, the second disease becomes a potential repurposing indication. To explore this, all possible pairs of independently segregating GWAS-derived SNPs were generated, and a statistical network of similarity within each SNP-SNP pair was calculated according to scale-free overrepresentation of convergent biological process activity in regulated eGenes (ITS_eGENE-eGENE) and scale-free overrepresentation of common eGene targets between the two SNPs (ITS_SNP-SNP). Significance of ITS_SNP-SNP was conservatively estimated using empirical scale-free permutation resampling, keeping the node degree constant for each molecule in each permutation. We identified 26 new drug repurposing indication candidates spanning 89 GWAS diseases, including a potential repurposing of the calcium-channel blocker verapamil from coronary disease to gout. Predictions from our approach were compared to known drug indications using DrugBank as a gold standard (odds ratio = 13.1, p-value = 2.49 × 10^-8). Because of specific disease-SNP associations to candidate drug targets, the proposed method provides evidence for future precision drug repositioning to a patient's specific polymorphisms.
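The repurposing hypothesis stated above can be sketched as a simple set join. This is only an illustration of the shared-eGene case: it omits the functional-similarity (ITS) scoring and the permutation-based significance step, and all identifiers (`drugX`, `diseaseA`, `GENE1`, and so on) are hypothetical placeholders rather than data from the study.

```python
def repurposing_candidates(disease_egenes, drug_targets, drug_indications):
    """Return sorted (drug, known_disease, candidate_disease, shared_egene) tuples.

    A candidate arises when a drug indicated for one disease targets an eGene
    that is also associated with a second disease the drug is not indicated for.
    """
    found = set()
    for drug, targets in drug_targets.items():
        known = set(drug_indications.get(drug, ()))
        for known_disease in known:
            # eGenes of the known indication that the drug actually targets.
            hits = targets & disease_egenes.get(known_disease, set())
            for other, egenes in disease_egenes.items():
                if other in known:
                    continue  # already an indication, not a repurposing
                for gene in hits & egenes:
                    found.add((drug, known_disease, other, gene))
    return sorted(found)

# Hypothetical toy inputs.
disease_egenes = {
    "diseaseA": {"GENE1", "GENE2"},
    "diseaseB": {"GENE1"},
    "diseaseC": {"GENE3"},
}
drug_targets = {"drugX": {"GENE1"}}
drug_indications = {"drugX": {"diseaseA"}}

candidates = repurposing_candidates(disease_egenes, drug_targets, drug_indications)
```

In this toy case the only candidate is `drugX` for `diseaseB`, because `GENE1` is both a target of the drug and an eGene of both diseases.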
Subject(s)
Drug Repositioning/methods , Single Nucleotide Polymorphism , Quantitative Trait Loci , Computational Biology , Genetic Databases , Drug Repositioning/statistics & numerical data , Gene Ontology , Genetic Predisposition to Disease , Genome-Wide Association Study/statistics & numerical data , Humans , Precision Medicine/methods , Precision Medicine/statistics & numerical data
ABSTRACT
KEY MESSAGE: The whole promoter regions of SUTs in Vitis were isolated for the first time. SUTs are involved in the adaptation to biotic and abiotic stresses. The vulnerability of Vitis vinifera to abiotic and biotic stresses limits its yields. In contrast, Vitis amurensis displays resistance to environmental stresses, such as microbial pathogens, low temperatures, and drought. Sucrose transporters (SUTs) are important regulators of plant growth and stress tolerance; however, the role that SUTs play in stress resistance in V. amurensis is not known. Using V. amurensis Rupr. 'Zuoshan-1' and V. vinifera 'Chardonnay', we found that SUC27 was highly expressed in several vegetative organs of Zuoshan-1, SUC12 was weakly expressed or absent in most organs in both species, and the distribution of SUC11 in source and sink organs was highest in Zuoshan-1. A search for cis-regulatory elements in the promoter sequences of SUTs revealed that they were regulated by light, environmental stresses, physiological correlation, and hormones. The SUTs in Zuoshan-1 mostly showed a stronger and more rapid response than those in Chardonnay upon induction by Plasmopara viticola infection, cold, water deficit, and dark conditions. The induction of SUTs was associated with the upregulation of key genes involved in sucrose metabolism and the biosynthesis of plant hormones. These results indicate that stress resistance in Zuoshan-1 is governed by the differential distribution and induction of SUTs by various stimuli, and the subsequent promotion of sucrose metabolism and hormone synthesis.
Subject(s)
Disease Resistance , Membrane Transport Proteins/metabolism , Plant Proteins/metabolism , Physiological Stress , Vitis/physiology , Base Sequence , Biological Transport , Cold Temperature , Darkness , Plant Gene Expression Regulation , Membrane Transport Proteins/genetics , Oomycetes/physiology , Organ Specificity , Osmosis , Plant Diseases/genetics , Plant Diseases/microbiology , Plant Growth Regulators/metabolism , Plant Leaves/metabolism , Plant Proteins/genetics , Genetic Promoter Regions/genetics , Sucrose/metabolism , Sugars/metabolism , Vitis/genetics , Vitis/microbiology
ABSTRACT
The development of computational methods capable of analyzing -omics data at the individual level is critical for the success of precision medicine. Although unprecedented opportunities now exist to gather data on an individual's -omics profile ('personalome'), methods for interpreting and extracting meaningful information from single-subject -omics remain underdeveloped, particularly for quantitative non-sequence measurements, including complete transcriptome or proteome expression and metabolite abundance. Conventional bioinformatics approaches have largely been designed for making population-level inferences about 'average' disease processes; thus, they may not adequately capture and describe individual variability. Novel approaches intended to exploit a variety of -omics data are required for identifying individualized signals for meaningful interpretation. In this review, intended for biomedical researchers, computational biologists and bioinformaticians, we survey emerging computational and translational informatics methods capable of constructing a single subject's 'personalome' for predicting clinical outcomes or therapeutic responses, with an emphasis on methods that provide interpretable readouts. Key points: (i) single-subject analytics of the transcriptome shows the greatest development to date, and (ii) the methods were all validated in simulations, cross-validations or independent retrospective data sets. This survey uncovers a growing field that offers numerous opportunities for the development of novel validation methods and opens the door for future studies focusing on the interpretation of comprehensive 'personalomes' through the integration of multiple -omics, providing valuable insights into individual patient outcomes and treatments.
Subject(s)
Precision Medicine , Transcriptome , Humans
ABSTRACT
Calculating Differentially Expressed Genes (DEGs) from RNA-sequencing requires replicates to estimate gene-wise variability, a requirement that is at times financially or physiologically infeasible in clinics. By imposing restrictive transcriptome-wide assumptions limiting inferential opportunities of conventional methods (edgeR, NOISeq-sim, DESeq, DEGseq), comparing two conditions without replicates (TCWR) has been proposed, but not evaluated. Under TCWR conditions (e.g., unaffected tissue vs. tumor), differences of transformed expression of the proposed individualized DEG (iDEG) method follow a distribution calculated across a local partition of related transcripts at baseline expression; thereafter the probability of each DEG is estimated by empirical Bayes with local false discovery rate control using a two-group mixture model. In extensive simulation studies of TCWR methods, iDEG and NOISeq are more accurate at 5%
Subject(s)
Algorithms , Gene Expression Profiling , RNA Sequence Analysis/methods , Transcriptome , Bayes Theorem , Genomics , Humans , Mathematical Concepts , Theoretical Models , Precision Medicine
ABSTRACT
Analysis of single-subject transcriptome response data is an unmet need of precision medicine, made challenging by the high dimension, dynamic nature and difficulty in extracting meaningful signals from biological or stochastic noise. We have proposed a method for single subject analysis that uses a mixture model for transcript fold-change clustering from isogenically paired samples, followed by integration of these distributions with Gene Ontology Biological Processes (GO-BP) to reduce dimension and identify functional attributes. We then extended these methods to develop functional signing metrics for gene set process regulation by incorporating biological repressor relationships encoded in GO-BP as negatively_regulates edges. Results revealed reproducible and biologically meaningful signals from analysis of a single subject's response, opening the door to future transcriptomic studies where subject and resource availability are currently limiting. We used inbred mouse strains fed different diets to provide isogenic biological replicates, permitting rigorous validation of our method. We compared significant genotype-specific GO-BP term results for overlap and rank order across three replicate pairs per genotype, and cross-methods to reference standards (limma+FET, SAM+FET, and GSEA). All single-subject analytics findings were robust and highly reproducible (median area under the ROC curve=0.96, n=24 genotypes × 3 replicates), providing confidence and validation of this approach for analyses in single subjects. R code is available online at http://www.lussiergroup.org/publications/PathwayActivity.
Subject(s)
Gene Expression Profiling/statistics & numerical data , Gene Ontology/statistics & numerical data , Animals , Computational Biology/methods , Genetic Databases/statistics & numerical data , High-Fat Diet/adverse effects , Female , Gene Regulatory Networks , Humans , Male , Mice , 129 Strain Mice , Inbred C57BL Mice , Inbred DBA Mice , Inbred NZB Mice , Inbred Strain Mice , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Precision Medicine
ABSTRACT
Recent precision medicine initiatives have led to the expectation of improved clinical decision-making anchored in genomic data science. However, over the last decade, only a handful of new single-gene product biomarkers have been translated to clinical practice (FDA approved) in spite of considerable discovery efforts and a plethora of transcriptomes available in the Gene Expression Omnibus. With this modest outcome of current approaches in mind, we developed a pilot simulation study to demonstrate the untapped benefits of developing disease detection methods for cases where the true signal lies at the pathway level, even if the pathway's gene expression alterations may be heterogeneous across patients. In other words, we relaxed the cross-patient homogeneity assumption from the transcript level (cohort assumptions of deregulated gene expression) to the pathway level (assumptions of deregulated pathway expression). Furthermore, we have expanded previous single-subject (SS) methods into cohort analyses to illustrate the benefit of accounting for an individual's variability in cohort scenarios. We compare SS and cohort-based (CB) techniques under 54 distinct scenarios, each with 1,000 simulations, to demonstrate that a pathway-level signal emerges through the summative effect of its altered gene expression, which is heterogeneous across patients. Studied variables include pathway gene set size, fraction of expressed genes responsive within the gene set, fraction of responsive expressed genes up- vs. down-regulated, and cohort size. We demonstrated that our SS approach was uniquely suited to detect signals in heterogeneous populations in which individuals have varying levels of baseline risk that are simultaneously confounded by patient-specific "genome-by-environment" interactions (G×E). Area under the precision-recall curve of the SS approach far surpassed that of the CB approach (1st quartile, median, 3rd quartile: SS = 0.94, 0.96, 0.99; CB = 0.50, 0.52, 0.65). We conclude that single-subject pathway detection methods are uniquely suited for consistently detecting pathway dysregulation by the inclusion of a patient's individual variability. http://www.lussiergroup.org/publications/PathwayMarker/.
Subject(s)
Gene Expression Profiling/statistics & numerical data , Genetic Markers , Transcriptome , Cohort Studies , Computational Biology/methods , Computer Simulation , Nucleic Acid Databases/statistics & numerical data , Gene Regulatory Networks , Gene-Environment Interaction , Humans , Genetic Models , Pilot Projects , Precision Medicine , Systems Biology , Translational Biomedical Research
ABSTRACT
OBJECTIVE: To introduce a disease prognosis framework enabled by a robust classification scheme derived from patient-specific transcriptomic response to stimulation. MATERIALS AND METHODS: Within an illustrative case study to predict asthma exacerbation, we designed a stimulation assay that reveals individualized transcriptomic response to human rhinovirus. Gene expression was quantified in peripheral blood mononuclear cells obtained from 23 pediatric asthmatic patients and stimulated in vitro with human rhinovirus. Responses were obtained via the single-subject gene set testing methodology "N-of-1-pathways." The classifier was trained on a related independent training dataset (n = 19). Novel visualizations of personal transcriptomic responses are provided. RESULTS: Of the 23 pediatric asthmatic patients, 12 experienced recurrent exacerbations. Our classifier, using individualized responses and trained on an independent dataset, obtained 74% accuracy (area under the receiver operating characteristic curve of 71%; 2-sided P = .039). Conventional classifiers using messenger RNA (mRNA) expression within the viral-exposed samples were unsuccessful (all patients predicted to have recurrent exacerbations; accuracy of 52%). DISCUSSION: Prognosis based on single time point, static mRNA expression alone neglects the importance of dynamic genome-by-environment interplay in phenotypic presentation. Individualized transcriptomic response quantified at the pathway (gene set) level reveals interpretable signals related to clinical outcomes. CONCLUSION: The proposed framework provides an innovative approach to precision medicine. We show that quantifying personal pathway-level transcriptomic response to a disease-relevant environmental challenge predicts disease progression. This genome-by-environment interaction assay offers a noninvasive opportunity to translate omics data to clinical practice by improving the ability to predict disease exacerbation and increasing the potential to produce more effective treatment decisions.
Subject(s)
Asthma/genetics , Gene-Environment Interaction , Precision Medicine , Transcriptome , Asthma/classification , Bayes Theorem , Child , Datasets as Topic , Decision Trees , Disease Progression , Female , Humans , Mononuclear Leukocytes/metabolism , Male , Statistical Models , Patient-Specific Modeling , Prognosis , Messenger RNA/metabolism , ROC Curve , Rhinovirus , Support Vector Machine , Transcriptome/immunology , Transcriptome/physiology
ABSTRACT
BACKGROUND: Transcriptome analytic tools are commonly used across patient cohorts to develop drugs and predict clinical outcomes. However, as precision medicine pursues more accurate and individualized treatment decisions, these methods are not designed to address single-patient transcriptome analyses. We previously developed and validated the N-of-1-pathways framework using two methods, Wilcoxon and Mahalanobis Distance (MD), for personal transcriptome analysis derived from a pair of samples of a single patient. Although both methods uncover concordantly dysregulated pathways, they are not designed to detect dysregulated pathways with up- and down-regulated genes (bidirectional dysregulation), which are ubiquitous in biological systems. RESULTS: We developed N-of-1-pathways MixEnrich, a mixture model followed by a gene set enrichment test, to uncover bidirectional and concordantly dysregulated pathways one patient at a time. We assess its accuracy in a comprehensive simulation study and in an RNA-Seq data analysis of head and neck squamous cell carcinomas (HNSCCs). In the presence of bidirectionally dysregulated genes in the pathway or of high background noise, MixEnrich substantially outperforms previous single-subject transcriptome analysis methods, both in the simulation study and in the HNSCC data analysis (ROC curves; higher true positive rates; lower false positive rates). Bidirectional and concordant dysregulated pathways uncovered by MixEnrich in each patient largely overlapped with the quasi-gold standard compared to other single-subject and cohort-based transcriptome analyses. CONCLUSION: The greater performance of MixEnrich presents an advantage over previous methods to meet the promise of providing accurate personal transcriptome analysis to support precision medicine at point of care.
Subject(s)
Gene Expression Profiling/methods , Head and Neck Neoplasms/genetics , Humans , Squamous Cell Neoplasms/genetics , Precision Medicine , ROC Curve
ABSTRACT
MOTIVATION: Understanding dynamic, patient-level transcriptomic response to therapy is an important step forward for precision medicine. However, conventional transcriptome analysis aims to discover cohort-level change, lacking the capacity to unveil patient-specific response to therapy. To address this gap, we previously developed two N-of-1-pathways methods, Wilcoxon and Mahalanobis distance, to detect unidirectionally responsive transcripts within a pathway using a pair of samples from a single subject. Yet, these methods cannot recognize bidirectionally (up and down) responsive pathways. Further, our previous approaches have not been assessed in the presence of background noise and are not designed to identify differentially expressed mRNAs between two samples of a patient taken in different contexts (e.g. cancer vs. non-cancer), which we termed responsive transcripts (RTs). METHODS: We propose a new N-of-1-pathways method, k-Means Enrichment (kMEn), that detects bidirectionally responsive pathways, despite background noise, using a pair of transcriptomes from a single patient. kMEn identifies transcripts responsive to the stimulus through k-means clustering and then tests for an over-representation of the responsive genes within each pathway. The pathways identified by kMEn are mechanistically interpretable pathways significantly responding to a stimulus. RESULTS: In ~9000 simulations varying six parameters, superior performance of kMEn over previous single-subject methods is evident by: (i) improved precision-recall at various levels of bidirectional response and (ii) lower rates of false positives (1-specificity) when more than 10% of genes in the genome are differentially expressed (background noise). In a clinical proof-of-concept, personal treatment-specific pathways identified by kMEn correlate with therapeutic response (p-value<0.01). CONCLUSION: Through improved single-subject transcriptome dynamics of bidirectionally-regulated signals, kMEn provides a novel approach to identify mechanism-level biomarkers.
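The two steps named in the METHODS sentence (k-means clustering of transcripts, then an over-representation test per pathway) can be sketched as follows. This is an illustration under stated assumptions, not the published kMEn implementation: clustering on the absolute log fold change with k = 2, taking the cluster with the larger center as "responsive", and using a one-sided hypergeometric test are all choices made here for the example, and the gene and pathway names are hypothetical.

```python
from math import comb

def kmeans_1d(values, k=2, iters=100):
    """Plain Lloyd's k-means on scalars (k >= 2); centers start at spread quantiles."""
    srt = sorted(values)
    centers = [srt[int(i * (len(srt) - 1) / (k - 1))] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        new = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new.append(sum(members) / len(members) if members else centers[j])
        if new == centers:
            break
        centers = new
    return labels, centers

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for X ~ Hypergeometric(population N, K marked, n drawn)."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)

def responsive_pathways(abs_lfc, pathways):
    """Cluster genes by |log fold change|, then test each pathway for enrichment."""
    genes = list(abs_lfc)
    labels, centers = kmeans_1d([abs_lfc[g] for g in genes], k=2)
    high = max(range(2), key=lambda j: centers[j])
    responsive = {g for g, lab in zip(genes, labels) if lab == high}
    pvals = {}
    for name, members in pathways.items():
        members = set(members) & set(genes)
        overlap = len(members & responsive)
        pvals[name] = hypergeom_upper_tail(
            overlap, len(responsive), len(members), len(genes))
    return responsive, pvals

# Hypothetical paired-sample |log2 fold changes| for ten genes.
abs_lfc = {"g1": 3.0, "g2": 2.8, "g3": 3.2, "g4": 0.1, "g5": 0.2,
           "g6": 0.3, "g7": 0.1, "g8": 0.2, "g9": 0.3, "g10": 0.1}
pathways = {"pathway_A": ["g1", "g2", "g3"], "pathway_B": ["g4", "g5", "g6"]}
responsive, pvals = responsive_pathways(abs_lfc, pathways)
```

Clustering on the absolute fold change is what lets up- and down-regulated genes count together toward a pathway, which is the bidirectional case the abstract emphasizes. In practice the resulting p-values would also need multiple-testing correction across pathways.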
Subject(s)
Gene Expression Profiling , Precision Medicine , Transcriptome , Cluster Analysis , Statistical Data Interpretation , Humans , Messenger RNA
ABSTRACT
MOTIVATION: As 'omics' biotechnologies accelerate the capability to contrast a myriad of molecular measurements from a single cell, they also exacerbate current analytical limitations for detecting meaningful single-cell dysregulations. Moreover, mRNA expression alone lacks functional interpretation, limiting opportunities for translation of single-cell transcriptomic insights to precision medicine. Lastly, most single-cell RNA-sequencing analytic approaches are not designed to investigate small populations of cells such as circulating tumor cells shed from solid tumors and isolated from patient blood samples. RESULTS: In response to these characteristics and limitations in current single-cell RNA-sequencing methodology, we introduce an analytic framework that models transcriptome dynamics through the analysis of aggregated cell-cell statistical distances within biomolecular pathways. Cell-cell statistical distances are calculated from pathway mRNA fold changes between two cells. Within an elaborate case study of circulating tumor cells derived from prostate cancer patients, we develop analytic methods of aggregated distances to identify five differentially expressed pathways associated with therapeutic resistance. Our aggregation analyses perform comparably with Gene Set Enrichment Analysis and better than differentially expressed genes followed by gene set enrichment. However, these methods were not designed to inform on differential pathway expression for a single cell. As such, our framework culminates with the novel aggregation method, cell-centric statistics (CCS). CCS quantifies the effect size and significance of differentially expressed pathways for a single cell of interest. Improved rose plots of differentially expressed pathways in each cell highlight the utility of CCS for therapeutic decision-making.
AVAILABILITY AND IMPLEMENTATION: http://www.lussierlab.org/publications/CCS/ CONTACT: yves@email.arizona.edu or piegorsch@math.arizona.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Antineoplastic Drug Resistance , Circulating Neoplastic Cells/drug effects , RNA Sequence Analysis , Transcriptome , Gene Expression Profiling , Humans , Male , Prostatic Neoplasms/drug therapy , RNA
ABSTRACT
The causal and interplay mechanisms of Single Nucleotide Polymorphisms (SNPs) associated with complex diseases (complex disease SNPs) investigated in genome-wide association studies (GWAS) at the transcriptional level (mRNA) are poorly understood despite recent advancements such as discoveries reported in the Encyclopedia of DNA Elements (ENCODE) and Genotype-Tissue Expression (GTEx). Protein interaction network analyses have successfully improved our understanding of both single gene diseases (Mendelian diseases) and complex diseases. Whether the mRNAs downstream of complex disease genes are central or peripheral in the genetic information flow relating DNA to mRNA remains unclear and may be disease-specific. Using expression Quantitative Trait Loci (eQTL) that provide DNA to mRNA associations and network centrality metrics, we hypothesize that we can unveil the systems properties of information flow between SNPs and the transcriptomes of complex diseases. We compare different conditions such as naïve SNP assignments and stringent linkage disequilibrium (LD) free assignments for transcripts to remove confounders from LD. Additionally, we compare the results from eQTL networks between lymphoblastoid cell lines and liver tissue. Empirical permutation resampling (p < 0.001) and theoretical Mann-Whitney U test (p < 10^-30) statistics indicate that mRNAs corresponding to complex disease SNPs via eQTL associations are likely to be regulated by a larger number of SNPs than expected. We name this novel property mRNA hubness in eQTL networks, and further term mRNAs with high hubness master integrators. mRNA master integrators receive and coordinate the perturbation signals from large numbers of polymorphisms and respond to the personal genetic architecture integratively. This genetic signal integration contrasts with the mechanism underlying some Mendelian diseases, where a genetic polymorphism affecting a single protein hub produces a divergent signal that affects a large number of downstream proteins. Indeed, we verify that this property is independent of the hubness in protein networks for which these mRNAs are transcribed. Our findings provide novel insights into the pleiotropy of mRNAs targeted by complex disease polymorphisms and the architecture of the information flow between the genetic polymorphisms and transcriptomes of complex diseases.
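As a rough illustration of the hubness property described above, the number of distinct SNPs associated with each mRNA can be read off an eQTL edge list as an in-degree. The sketch below assumes hubness is simply this count; the edge list, the identifiers (`snp1`, `mrnaA`, and so on), and the helper `mean_hubness` are hypothetical, and the permutation and Mann-Whitney tests reported in the abstract are not reproduced.

```python
from collections import defaultdict

def mrna_hubness(eqtl_edges):
    """Map each mRNA to the number of distinct SNPs linked to it by an eQTL."""
    regulators = defaultdict(set)
    for snp, mrna in eqtl_edges:
        regulators[mrna].add(snp)  # sets ignore duplicate SNP-mRNA pairs
    return {mrna: len(snps) for mrna, snps in regulators.items()}

def mean_hubness(hubness, mrnas):
    """Average hubness over a group of mRNAs (mRNAs with no eQTL count as 0)."""
    mrnas = list(mrnas)
    return sum(hubness.get(m, 0) for m in mrnas) / len(mrnas)

# Hypothetical eQTL edges as (SNP, mRNA) pairs.
edges = [("snp1", "mrnaA"), ("snp2", "mrnaA"), ("snp3", "mrnaA"),
         ("snp1", "mrnaB"), ("snp1", "mrnaB")]
hub = mrna_hubness(edges)
disease_mean = mean_hubness(hub, ["mrnaA"])
other_mean = mean_hubness(hub, ["mrnaB", "mrnaC"])
```

Comparing the mean hubness of mRNAs linked to disease SNPs against that of other mRNAs is the kind of contrast the reported statistics assess, though a significance claim would require the resampling described in the abstract.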
Subject(s)
Single Nucleotide Polymorphism , Quantitative Trait Loci , Messenger RNA/genetics , Humans
ABSTRACT
MOTIVATION: The conventional approach to personalized medicine relies on molecular data analytics across multiple patients. The path to precision medicine lies with molecular data analytics that can discover interpretable single-subject signals (N-of-1). We developed a global framework, N-of-1-pathways, for a mechanistic-anchored approach to single-subject gene expression data analysis. We previously employed a metric that could prioritize the statistical significance of a deregulated pathway in single subjects; however, it lacked quantitative interpretability (e.g. the equivalent to a gene expression fold-change). RESULTS: In this study, we extend our previous approach with the application of statistical Mahalanobis distance (MD) to quantify personal pathway-level deregulation. We demonstrate that this approach, N-of-1-pathways Paired Samples MD (N-OF-1-PATHWAYS-MD), detects deregulated pathways (empirical simulations), while not inflating the false-positive rate, using a study with biological replicates. Finally, we establish that N-OF-1-PATHWAYS-MD scores are biologically significant, clinically relevant, and predictive of breast cancer survival (P < 0.05, n = 80 invasive carcinoma; TCGA RNA-sequences). CONCLUSION: N-of-1-pathways MD provides a practical approach towards precision medicine. The method generates the magnitude and the biological significance of personal deregulated pathway results derived solely from the patient's transcriptome. These pathways offer opportunities for deriving clinically actionable decisions that have the potential to complement the clinical interpretability of personal polymorphisms and mutations, acquired or inherited, obtained from DNA. In addition, the method can be applied to diseases in which DNA changes may not be relevant, and thus expands the 'interpretable omics' of single subjects (e.g. personalome).
AVAILABILITY AND IMPLEMENTATION: http://www.lussierlab.net/publications/N-of-1-pathways.
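The abstract names the Mahalanobis distance as the quantity behind its pathway scores but does not spell out the computation. The sketch below only illustrates the general definition of the distance for two-dimensional points, such as a (baseline, case) expression pair per gene. It is not the published N-OF-1-PATHWAYS-MD procedure. The function names, the choice of the pathway centroid as reference point, and the expression values are all hypothetical and chosen for illustration.

```python
import math
from statistics import mean


def centroid_and_covariance(points):
    """Return the centroid and 2x2 sample covariance of 2-D points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def mahalanobis_2d(point, center, cov):
    """Mahalanobis distance sqrt(d^T S^-1 d) for a 2x2 covariance S."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    # S^-1 = (1/det) * [[d, -b], [-c, a]], expanded into the quadratic form.
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(max(q, 0.0))


# Hypothetical (baseline, case) log-expression pairs for genes in one pathway.
pathway_pairs = [(5.1, 5.3), (7.2, 8.9), (3.4, 3.1), (6.0, 7.5), (4.8, 4.7)]
center, cov = centroid_and_covariance(pathway_pairs)
for pair in pathway_pairs:
    print(pair, round(mahalanobis_2d(pair, center, cov), 3))
```

Unlike a Euclidean distance, the result is scaled by the spread and correlation of the point cloud, which is what makes it comparable across pathways with different expression ranges.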
Subject(s)
Breast Neoplasms/mortality , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Statistical Data Interpretation , Female , Neoplastic Gene Expression Regulation , Humans , Kaplan-Meier Estimate , Precision Medicine
ABSTRACT
MOTIVATION: With the advance of new sequencing technologies producing massive short-read data, metagenomics is rapidly growing, especially in the fields of environmental biology and medical science. Metagenomic data are not only high dimensional, with a large number of features and a limited number of samples, but also complex, with a large number of zeros and a skewed distribution. Efficient computational and statistical tools are needed to deal with these unique characteristics of metagenomic sequencing data. In metagenomic studies, one main objective is to assess whether and how multiple microbial communities differ under various environmental conditions. RESULTS: We propose a two-stage statistical procedure for selecting informative features and identifying differentially abundant features between two or more groups of microbial communities. In the functional analysis of metagenomes, the features may refer to pathways, subsystems, functional roles and so on. In the first stage of the proposed procedure, the informative features are selected using the elastic net, which reduces the dimension of the metagenomic data. In the second stage, the differentially abundant features are detected using generalized linear models with a negative binomial distribution. Compared with other available methods, the proposed approach demonstrates better performance in most of the comprehensive simulation studies. The new method is also applied to two real metagenomic datasets related to human health. Our findings are consistent with those in previous reports. AVAILABILITY: R code and two example datasets are available at http://cals.arizona.edu/~anling/software.htm. SUPPLEMENTARY INFORMATION: Supplementary file is available at Bioinformatics online.
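The two stages named in the abstract can be written in their standard textbook forms, given here as a sketch. The abstract does not state the loss function, the tuning parameters, or the covariates the authors used, so every symbol below is an assumption: $L(\beta)$ is a generic loss over the feature coefficients, $\lambda$ and $\alpha$ are the usual elastic net tuning parameters, $Y_{ij}$ is the count of feature $j$ in sample $i$, $N_i$ is a library-size offset, $g_i$ is a group indicator, and $\phi_j$ is a feature-specific dispersion.

```latex
% Stage 1 (feature selection): elastic net penalty on a generic loss L(beta).
% Features with a nonzero coefficient in \hat{\beta} are kept.
\hat{\beta} = \arg\min_{\beta}\; L(\beta)
  + \lambda \left( \alpha \lVert \beta \rVert_1
  + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right),
  \qquad \lambda \ge 0,\; 0 \le \alpha \le 1

% Stage 2 (differential abundance): negative binomial GLM with log link,
% fitted per selected feature j; overdispersion enters through \phi_j.
Y_{ij} \sim \mathrm{NB}(\mu_{ij}, \phi_j), \qquad
\mathrm{Var}(Y_{ij}) = \mu_{ij} + \phi_j \mu_{ij}^2

\log \mu_{ij} = \log N_i + \beta_{0j} + \beta_{1j}\, g_i, \qquad
H_0 : \beta_{1j} = 0
```

Under this formulation a feature would be called differentially abundant when the test of $H_0$ is rejected after multiple-testing correction. The correction method is likewise not specified in the abstract.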
Subject(s)
Statistical Data Interpretation , Bacterial Genes/genetics , Inflammatory Bowel Diseases/etiology , Metagenomics/methods , Obesity/genetics , Case-Control Studies , Gastrointestinal Tract/microbiology , Bacterial Gene Expression Regulation , Humans , Mucus/microbiology , Obesity/complications , ROC Curve , Saliva/microbiology
ABSTRACT
A report on the 22nd Annual International Conference on Intelligent Systems for Molecular Biology, held in Boston, Massachusetts, USA, July 11-15, 2014.
Subject(s)
Computational Biology , Animals , Genetic Databases , High-Throughput Nucleotide Sequencing , Humans , DNA Sequence Analysis , Single-Cell Analysis
ABSTRACT
BACKGROUND: Metagenomics has great potential to discover previously unattainable information about microbial communities. An important prerequisite for such discoveries is to accurately estimate the composition of microbial communities. Most prevalent homology-based approaches rely solely on the results of an alignment tool such as BLAST, limiting their estimation accuracy to high ranks of the taxonomy tree. RESULTS: We developed a new homology-based approach called Taxonomic Analysis by Elimination and Correction (TAEC), which utilizes the similarity in the genomic sequence in addition to the result of an alignment tool. The proposed method is comprehensively tested on various simulated benchmark datasets of diverse complexity of microbial structure. Compared with other available methods designed for estimating taxonomic composition at a relatively low taxonomic rank, TAEC demonstrates greater accuracy in quantification of genomes in a given microbial sample. We also applied TAEC to two real metagenomic datasets, an oral cavity dataset and a Crohn's disease dataset. Our results, while agreeing with previous findings at higher ranks of the taxonomy tree, provide accurate estimation of taxonomic compositions at the species/strain level, narrowing down which species/strains need more attention in the study of the oral cavity and Crohn's disease. CONCLUSIONS: By taking account of the similarity in the genomic sequence, TAEC outperforms other available tools in estimating taxonomic composition at a very low rank, especially when closely related species/strains exist in a metagenomic sample.