Results 1 - 20 of 41
1.
Rheumatology (Oxford) ; 61(4): 1680-1689, 2022 04 11.
Article in English | MEDLINE | ID: mdl-34175943

ABSTRACT

OBJECTIVES: Advances in immunotherapy by blocking TNF have remarkably improved treatment outcomes for rheumatoid arthritis (RA) patients. Although treatment specifically targets TNF, the downstream mechanisms of immune suppression are not completely understood. The aim of this study was to detect biomarkers and expression signatures of treatment response to TNF inhibition. METHODS: Peripheral blood mononuclear cells (PBMCs) from 39 female patients were collected before anti-TNF treatment initiation (day 0) and after 3 months. The study cohort included patients previously treated with MTX who failed to respond adequately. Response to treatment was defined based on the EULAR criteria and classified 23 patients as responders and 16 as non-responders. We investigated differences in gene expression in PBMCs, the proportion of cell types and cell phenotypes in peripheral blood using flow cytometry and the level of proteins in plasma. Finally, we used machine learning models to predict non-response to anti-TNF treatment. RESULTS: The gene expression analysis in baseline samples revealed notably higher expression of the gene EPPK1 in future responders. We detected the suppression of genes and proteins following treatment, including suppressed expression of the T cell inhibitor gene CHI3L1 and its protein YKL-40. The gene expression results were replicated in an independent cohort. Finally, machine learning models mainly based on transcriptomic data showed high predictive utility in classifying non-response to anti-TNF treatment in RA. CONCLUSIONS: Our integrative multi-omics analyses identified new biomarkers for the prediction of response, found pathways influenced by treatment and suggested new predictive models of anti-TNF treatment in RA patients.


Subject(s)
Antirheumatic Agents , Arthritis, Rheumatoid , Antirheumatic Agents/metabolism , Antirheumatic Agents/therapeutic use , Arthritis, Rheumatoid/diagnosis , Arthritis, Rheumatoid/drug therapy , Arthritis, Rheumatoid/genetics , Biomarkers , Female , Humans , Leukocytes, Mononuclear/metabolism , Machine Learning , Methotrexate/metabolism , Methotrexate/therapeutic use , Treatment Outcome , Tumor Necrosis Factor Inhibitors/therapeutic use , Tumor Necrosis Factor-alpha/metabolism
2.
BMC Med ; 19(1): 232, 2021 09 10.
Article in English | MEDLINE | ID: mdl-34503513

ABSTRACT

BACKGROUND: Genetic, lifestyle, and environmental factors can lead to perturbations in circulating lipid levels and increase the risk of cardiovascular and metabolic diseases. However, how changes in individual lipid species contribute to disease risk is often unclear. Moreover, little is known about the role of lipids on cardiovascular disease in Pakistan, a population historically underrepresented in cardiovascular studies. METHODS: We characterised the genetic architecture of the human blood lipidome in 5662 hospital controls from the Pakistan Risk of Myocardial Infarction Study (PROMIS) and 13,814 healthy British blood donors from the INTERVAL study. We applied a candidate causal gene prioritisation tool to link the genetic variants associated with each lipid to the most likely causal genes, and Gaussian Graphical Modelling network analysis to identify and illustrate relationships between lipids and genetic loci. RESULTS: We identified 253 genetic associations with 181 lipids measured using direct infusion high-resolution mass spectrometry in PROMIS, and 502 genetic associations with 244 lipids in INTERVAL. Our analyses revealed new biological insights at genetic loci associated with cardiometabolic diseases, including novel lipid associations at the LPL, MBOAT7, LIPC, APOE-C1-C2-C4, SGPP1, and SPTLC3 loci. CONCLUSIONS: Our findings, generated using a distinctive lipidomics platform in an understudied South Asian population, strengthen and expand the knowledge base of the genetic determinants of lipids and their association with cardiometabolic disease-related loci.
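
A Gaussian graphical model links two variables when their partial correlation, conditional on all other variables, is non-zero. The following is a minimal sketch of that idea, assuming simulated data and plain inversion of the covariance matrix; the function name and the toy variables are hypothetical, and the abstract does not describe the estimation procedure, thresholds, or software the study used.

```python
import numpy as np


def partial_correlations(data):
    """Return pairwise partial correlations for data of shape (n_samples, n_features).

    Each entry is the correlation between two features after conditioning
    on all remaining features, read off the inverse covariance matrix.
    """
    covariance = np.cov(data, rowvar=False)
    precision = np.linalg.inv(covariance)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Toy chain a -> b -> c: a and c are correlated only through b.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = a + 0.5 * rng.normal(size=500)
c = b + 0.5 * rng.normal(size=500)

pcor = partial_correlations(np.column_stack([a, b, c]))
print(np.round(pcor, 2))
```

In the toy chain, the a-c entry should be close to zero while the a-b and b-c entries stay large, which is the pattern a network edge list would be built from.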


Subject(s)
Genome-Wide Association Study , Myocardial Infarction , Asian People/genetics , Genetic Predisposition to Disease , Humans , Lipids , Polymorphism, Single Nucleotide , White People
3.
Nucleic Acids Res ; 47(1): e3, 2019 01 10.
Article in English | MEDLINE | ID: mdl-30239796

ABSTRACT

Quantitative trait locus (QTL) mapping of molecular phenotypes such as metabolites, lipids and proteins through genome-wide association studies represents a powerful means of highlighting molecular mechanisms relevant to human diseases. However, a major challenge of this approach is to identify the causal gene(s) at the observed QTLs. Here, we present a framework for the 'Prioritization of candidate causal Genes at Molecular QTLs' (ProGeM), which incorporates biological domain-specific annotation data alongside genome annotation data from multiple repositories. We assessed the performance of ProGeM using a reference set of 227 previously reported and extensively curated metabolite QTLs. For 98% of these loci, the expert-curated gene was one of the candidate causal genes prioritized by ProGeM. Benchmarking analyses revealed that 69% of the causal candidates were nearest to the sentinel variant at the investigated molecular QTLs, indicating that genomic proximity is the most reliable indicator of 'true positive' causal genes. In contrast, cis-gene expression QTL data led to three false positive candidate causal gene assignments for every one true positive assignment. We provide evidence that these conclusions also apply to other molecular phenotypes, suggesting that ProGeM is a powerful and versatile tool for annotating molecular QTLs. ProGeM is freely available via GitHub.
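
The benchmarking result above, that the gene nearest to the sentinel variant is often the causal candidate, can be illustrated with a small sketch. This is not ProGeM's code: the function name, the gene names, and the coordinates below are hypothetical, and all genes are assumed to lie on the same chromosome as the variant.

```python
def nearest_gene(variant_position, genes):
    """Return the name of the gene closest to a variant.

    genes is a list of (name, start, end) tuples on one chromosome.
    A variant inside a gene body has distance zero to that gene.
    """
    def distance(gene):
        _, start, end = gene
        if start <= variant_position <= end:
            return 0
        return min(abs(variant_position - start), abs(variant_position - end))

    return min(genes, key=distance)[0]


# Hypothetical gene coordinates, for illustration only.
example_genes = [("GENE_A", 100, 200), ("GENE_B", 500, 900)]
print(nearest_gene(450, example_genes))  # GENE_B is 50 bp away, GENE_A is 250 bp away
```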


Subject(s)
Genetic Association Studies , Genome-Wide Association Study/methods , Molecular Sequence Annotation/methods , Quantitative Trait Loci/genetics , Chromosome Mapping/methods , Humans , Lipids/genetics , Phenotype , Proteins/genetics
4.
BMC Bioinformatics ; 21(1): 119, 2020 Mar 20.
Article in English | MEDLINE | ID: mdl-32197580

ABSTRACT

BACKGROUND: The ability to confidently predict health outcomes from gene expression would catalyze a revolution in molecular diagnostics. Yet, the goal of developing actionable, robust, and reproducible predictive signatures of phenotypes such as clinical outcome has not been attained in almost any disease area. Here, we report a comprehensive analysis spanning prediction tasks from ulcerative colitis, atopic dermatitis, diabetes, to many cancer subtypes for a total of 24 binary and multiclass prediction problems and 26 survival analysis tasks. We systematically investigate the influence of gene subsets, normalization methods and prediction algorithms. Crucially, we also explore the novel use of deep representation learning methods on large transcriptomics compendia, such as GTEx and TCGA, to boost the performance of state-of-the-art methods. The resources and findings in this work should serve as both an up-to-date reference on attainable performance, and as a benchmarking resource for further research. RESULTS: Approaches that combine large numbers of genes outperformed single gene methods consistently and with a significant margin, but neither unsupervised nor semi-supervised representation learning techniques yielded consistent improvements in out-of-sample performance across datasets. Our findings suggest that using l2-regularized regression methods applied to centered log-ratio transformed transcript abundances provide the best predictive analyses overall. CONCLUSIONS: Transcriptomics-based phenotype prediction benefits from proper normalization techniques and state-of-the-art regularized regression approaches. In our view, breakthrough performance is likely contingent on factors which are independent of normalization and general modeling techniques; these factors might include reduction of systematic errors in sequencing data, incorporation of other data types such as single-cell sequencing and proteomics, and improved use of prior knowledge.
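
The combination highlighted in the results, a centered log-ratio (CLR) transform followed by l2-regularized regression, can be sketched as follows. This is an illustration under stated assumptions: the pseudocount, the closed-form ridge solver for a continuous outcome, and the simulated counts are all hypothetical choices, not the benchmark's actual pipeline.

```python
import numpy as np


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform applied per sample (row)."""
    logged = np.log(counts + pseudocount)
    return logged - logged.mean(axis=1, keepdims=True)


def ridge_fit(features, outcome, alpha=1.0):
    """Closed-form l2-regularized least squares with an unpenalized intercept."""
    feature_means = features.mean(axis=0)
    outcome_mean = outcome.mean()
    centered = features - feature_means
    gram = centered.T @ centered + alpha * np.eye(features.shape[1])
    weights = np.linalg.solve(gram, centered.T @ (outcome - outcome_mean))
    intercept = outcome_mean - feature_means @ weights
    return weights, intercept


# Simulated transcript counts: 30 samples, 5 transcripts.
rng = np.random.default_rng(1)
counts = rng.poisson(20, size=(30, 5))
outcome = rng.normal(size=30)

features = clr(counts)
weights, intercept = ridge_fit(features, outcome, alpha=1.0)
print(np.round(weights, 3), round(float(intercept), 3))
```

CLR-transformed rows sum to zero, so the features are exactly collinear; the l2 penalty is what keeps the linear system solvable here.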


Subject(s)
Deep Learning , Gene Expression Profiling , Machine Learning , Phenotype , Disease/genetics , Humans , Supervised Machine Learning
5.
Bioinformatics ; 35(18): 3263-3272, 2019 09 15.
Article in English | MEDLINE | ID: mdl-30768166

ABSTRACT

MOTIVATION: Patient stratification methods are key to the vision of precision medicine. Here, we consider transcriptional data to segment the patient population into subsets relevant to a given phenotype. Whereas most existing patient stratification methods focus either on predictive performance or interpretable features, we developed a method striking a balance between these two important goals. RESULTS: We introduce a Bayesian method called SUBSTRA that uses regularized biclustering to identify patient subtypes and interpretable subtype-specific transcript clusters. The method iteratively re-weights feature importance to optimize phenotype prediction performance by producing more phenotype-relevant patient subtypes. We investigate the performance of SUBSTRA in finding relevant features using simulated data and successfully benchmark it against state-of-the-art unsupervised stratification methods and supervised alternatives. Moreover, SUBSTRA achieves predictive performance competitive with the supervised benchmark methods and provides interpretable transcriptional features in diverse biological settings, such as drug response prediction, cancer diagnosis, or kidney transplant rejection. AVAILABILITY AND IMPLEMENTATION: The R code of SUBSTRA is available at https://github.com/sahandk/SUBSTRA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Bayes Theorem , Phenotype , Precision Medicine
6.
PLoS Genet ; 13(4): e1006706, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28369058

ABSTRACT

Recent advances in highly multiplexed immunoassays have allowed systematic large-scale measurement of hundreds of plasma proteins in large cohort studies. In combination with genotyping, such studies offer the prospect to 1) identify mechanisms involved with regulation of protein expression in plasma, and 2) determine whether the plasma proteins are likely to be causally implicated in disease. We report here the results of genome-wide association (GWA) studies of 83 proteins considered relevant to cardiovascular disease (CVD), measured in 3,394 individuals with multiple CVD risk factors. We identified 79 genome-wide significant (p<5e-8) association signals, 55 of which replicated at P<0.0007 in separate validation studies (n = 2,639 individuals). Using automated text mining, manual curation, and network-based methods incorporating information on expression quantitative trait loci (eQTL), we propose plausible causal mechanisms for 25 trans-acting loci, including a potential post-translational regulation of stem cell factor by matrix metalloproteinase 9 and receptor-ligand pairs such as RANK-RANK ligand. Using public GWA study data, we further evaluate all 79 loci for their causal effect on coronary artery disease, and highlight several potentially causal associations. Overall, a majority of the plasma proteins studied showed evidence of regulation at the genetic level. Our results enable future studies of the causal architecture of human disease, which in turn should aid discovery of new drug targets.


Subject(s)
Biomarkers/blood , Blood Proteins/genetics , Cardiovascular Diseases/blood , Cardiovascular Diseases/genetics , Quantitative Trait Loci , Coronary Artery Disease/blood , Coronary Artery Disease/genetics , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Male
7.
J Proteome Res ; 18(6): 2397-2410, 2019 06 07.
Article in English | MEDLINE | ID: mdl-30887811

ABSTRACT

Direct infusion high-resolution mass spectrometry (DIHRMS) is a novel, high-throughput approach to rapidly and accurately profile hundreds of lipids in human serum without prior chromatography, facilitating in-depth lipid phenotyping for large epidemiological studies to reveal the detailed associations of individual lipids with coronary heart disease (CHD) risk factors. Intact lipid profiling by DIHRMS was performed on 5662 serum samples from healthy participants in the Pakistan Risk of Myocardial Infarction Study (PROMIS). We developed a novel semi-targeted peak-picking algorithm to detect mass-to-charge ratios in positive and negative ionization modes. We analyzed lipid partial correlations, assessed the association of lipid principal components with established CHD risk factors and genetic variants, and examined differences between lipids for a common genetic polymorphism. The DIHRMS method provided information on 360 lipids (including fatty acyls, glycerolipids, glycerophospholipids, sphingolipids, and sterol lipids), with a median coefficient of variation of 11.6% (range: 5.4-51.9). The lipids were highly correlated and exhibited a range of associations with clinical chemistry biomarkers and lifestyle factors. This platform can provide many novel insights into the effects of physiology and lifestyle on lipid metabolism, genetic determinants of lipids, and the relationship between individual lipids and CHD risk factors.
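
The coefficient of variation quoted above is the standard deviation expressed as a percentage of the mean. A minimal sketch, assuming repeated intensity measurements per lipid; the lipid labels and values are hypothetical, and the abstract does not say which samples the reported figures were computed on.

```python
import numpy as np


def coefficient_of_variation(values):
    """Return the sample standard deviation as a percentage of the mean."""
    values = np.asarray(values, dtype=float)
    return 100.0 * values.std(ddof=1) / values.mean()


# Hypothetical repeated measurements for three lipids.
replicates = {
    "lipid_1": [90.0, 100.0, 110.0],
    "lipid_2": [48.0, 50.0, 52.0],
    "lipid_3": [10.0, 10.0, 10.0],
}
per_lipid_cv = {name: coefficient_of_variation(v) for name, v in replicates.items()}
median_cv = float(np.median(list(per_lipid_cv.values())))
print(per_lipid_cv, median_cv)
```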


Subject(s)
Biomarkers/blood , Coronary Disease/genetics , Lipids/genetics , Coronary Disease/blood , Coronary Disease/pathology , Female , Genetic Variation , Glycerophospholipids/blood , Humans , Lipid Metabolism/genetics , Lipids/blood , Male , Middle Aged , Risk Factors , Sphingolipids/blood , Sphingolipids/genetics , Sterols/blood
8.
Hum Mol Genet ; 26(7): 1391-1406, 2017 04 01.
Article in English | MEDLINE | ID: mdl-28199695

ABSTRACT

Understanding the interaction between humans and mosquitoes is a critical area of study due to the phenomenal burdens on public health from mosquito-transmitted diseases. In this study, we conducted the first genome-wide association studies (GWAS) of self-reported mosquito bite reaction size (n = 84,724), itchiness caused by bites (n = 69,057), and perceived attractiveness to mosquitoes (n = 16,576). In total, 15 independent significant (P < 5×10⁻⁸) associations were identified. These loci were enriched for immunity-related genes that are involved in multiple cytokine signalling pathways. We also detected suggestive enrichment of these loci in enhancer regions that are active in stimulated T-cells, as well as within loci previously identified as controlling central memory T-cell levels. Egger regression analysis between the traits suggests that perception of itchiness and attractiveness to mosquitoes is driven, at least in part, by the genetic determinants of bite reaction size. Our findings illustrate the complex genetic and immunological landscapes underpinning human interactions with mosquitoes.


Subject(s)
Genetic Predisposition to Disease , Genome-Wide Association Study , Insect Bites and Stings/genetics , Pruritus/genetics , Animals , Culicidae/genetics , Culicidae/pathogenicity , Genotype , Humans , Insect Bites and Stings/pathology , Phenotype , Polymorphism, Single Nucleotide/genetics , Pruritus/pathology , Self Report , T-Lymphocytes/immunology , T-Lymphocytes/metabolism
9.
Hum Mol Genet ; 25(9): 1867-74, 2016 05 01.
Article in English | MEDLINE | ID: mdl-26908601

ABSTRACT

Thrombotic diseases are among the leading causes of morbidity and mortality in the world. To add insights into the genetic regulation of thrombotic disease, we conducted a genome-wide association study (GWAS) of 6135 self-reported blood clot events and 252 827 controls of European ancestry belonging to the 23andMe cohort of research participants. Eight loci exceeded genome-wide significance. Among the genome-wide significant results, our study replicated previously known venous thromboembolism (VTE) loci near the F5, FGA-FGG, F11, F2, PROCR and ABO genes, and the more recently discovered locus near SLC44A2. In addition, our study reports for the first time a genome-wide significant association between rs114209171, located upstream of the F8 structural gene, and thrombosis risk. Analyses of expression profiles and expression quantitative trait loci across different tissues suggested SLC44A2, ILF3 and AP1M2 as the three most plausible candidate genes for the chromosome 19 locus, our only genome-wide significant thrombosis-related locus that does not harbor likely coagulation-related genes. In addition, we present data showing that this locus also acts as a novel risk factor for stroke and coronary artery disease (CAD). In conclusion, our study reveals novel common genetic risk factors for VTE, stroke and CAD and provides evidence that self-reported data on blood clots used in a GWAS yield results that are comparable with those obtained using clinically diagnosed VTE. This observation opens up the potential for larger meta-analyses, which will enable elucidation of the genetics of thrombotic diseases, and serves as an example for the genetic study of other diseases.


Subject(s)
Genetic Loci/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Polymorphism, Single Nucleotide/genetics , Thrombosis/genetics , Adaptor Protein Complex 1/genetics , Adaptor Protein Complex mu Subunits/genetics , Adolescent , Adult , Biomarkers/metabolism , Case-Control Studies , Child , Child, Preschool , Female , Humans , Infant , Infant, Newborn , Male , Membrane Glycoproteins/genetics , Membrane Transport Proteins/genetics , Middle Aged , Nuclear Factor 90 Proteins/genetics , Risk Factors , Self Report , Young Adult
10.
BMC Med ; 16(1): 150, 2018 08 27.
Article in English | MEDLINE | ID: mdl-30145981

ABSTRACT

BACKGROUND: Personalized, precision, P4, or stratified medicine is understood as a medical approach in which patients are stratified based on their disease subtype, risk, prognosis, or treatment response using specialized diagnostic tests. The key idea is to base medical decisions on individual patient characteristics, including molecular and behavioral biomarkers, rather than on population averages. Personalized medicine is deeply connected to and dependent on data science, specifically machine learning (often named Artificial Intelligence in the mainstream media). While during recent years there has been a lot of enthusiasm about the potential of 'big data' and machine learning-based solutions, there exist only few examples that impact current clinical practice. The lack of impact on clinical practice can largely be attributed to insufficient performance of predictive models, difficulties to interpret complex model predictions, and lack of validation via prospective clinical trials that demonstrate a clear benefit compared to the standard of care. In this paper, we review the potential of state-of-the-art data science approaches for personalized medicine, discuss open challenges, and highlight directions that may help to overcome them in the future. CONCLUSIONS: There is a need for an interdisciplinary effort, including data scientists, physicians, patient advocates, regulatory agencies, and health insurance organizations. Partially unrealistic expectations and concerns about data science-based solutions need to be better managed. In parallel, computational methods must advance more to provide direct benefit to clinical practice.


Subject(s)
Precision Medicine/methods , Humans , Prospective Studies
11.
J Am Soc Nephrol ; 28(2): 557-574, 2017 02.
Article in English | MEDLINE | ID: mdl-27647854

ABSTRACT

Diabetes is the leading cause of ESRD. Despite evidence for a substantial heritability of diabetic kidney disease, efforts to identify genetic susceptibility variants have had limited success. We extended previous efforts in three dimensions, examining a more comprehensive set of genetic variants in larger numbers of subjects with type 1 diabetes characterized for a wider range of cross-sectional diabetic kidney disease phenotypes. In 2843 subjects, we estimated that the heritability of diabetic kidney disease was 35% (P=6.4×10⁻³). Genome-wide association analysis and replication in 12,540 individuals identified no single variants reaching stringent levels of significance and, despite excellent power, provided little independent confirmation of previously published associated variants. Whole-exome sequencing in 997 subjects failed to identify any large-effect coding alleles of lower frequency influencing the risk of diabetic kidney disease. However, sets of alleles increasing body mass index (P=2.2×10⁻⁵) and the risk of type 2 diabetes (P=6.1×10⁻⁴) associated with the risk of diabetic kidney disease. We also found genome-wide genetic correlation between diabetic kidney disease and failure at smoking cessation (P=1.1×10⁻⁴). Pathway analysis implicated ascorbate and aldarate metabolism (P=9.0×10⁻⁶), and pentose and glucuronate interconversions (P=3.0×10⁻⁶) in pathogenesis of diabetic kidney disease. These data provide further evidence for the role of genetic factors influencing diabetic kidney disease in those with type 1 diabetes and highlight some key pathways that may be responsible. Altogether these results reveal important biology behind the major cause of kidney disease.


Subject(s)
Diabetes Mellitus, Type 1/complications , Diabetes Mellitus, Type 1/genetics , Diabetic Nephropathies/genetics , Adolescent , Adult , Female , Genome-Wide Association Study , Humans , Male , Middle Aged , Young Adult
12.
BMC Bioinformatics ; 18(1): 565, 2017 Dec 19.
Article in English | MEDLINE | ID: mdl-29258445

ABSTRACT

BACKGROUND: Stratification of patient subpopulations that respond favorably to treatment or experience an adverse reaction is an essential step toward development of new personalized therapies and diagnostics. It is currently feasible to generate omic-scale biological measurements for all patients in a study, providing an opportunity for machine learning models to identify molecular markers for disease diagnosis and progression. However, the high variability of genetic background in human populations hampers the reproducibility of omic-scale markers. In this paper, we develop a biological network-based regularized artificial neural network model for prediction of phenotype from transcriptomic measurements in clinical trials. To improve model sparsity and the overall reproducibility of the model, we incorporate regularization for simultaneous shrinkage of gene sets based on active upstream regulatory mechanisms into the model. RESULTS: We benchmark our method against various regression, support vector machine and artificial neural network models and demonstrate the ability of our method to predict clinical outcomes using clinical trial data on acute rejection in kidney transplantation and response to Infliximab in ulcerative colitis. We show that integration of prior biological knowledge into the classification, as developed in this paper, significantly improves the robustness and generalizability of predictions to independent datasets. We provide Java code for our algorithm along with a parsed version of the STRING DB database. CONCLUSION: In summary, we present a method for prediction of clinical phenotypes using baseline genome-wide expression data that makes use of prior biological knowledge on gene-regulatory interactions in order to increase robustness and reproducibility of omic-scale markers. The integrated group-wise regularization method increases the interpretability of biological signatures and gives stable performance estimates across independent test sets.


Subject(s)
Gene Expression Regulation , Gene Regulatory Networks , Models, Theoretical , Neural Networks, Computer , Humans , Phenotype , Reproducibility of Results , Support Vector Machine
13.
BMC Bioinformatics ; 17(1): 318, 2016 Aug 24.
Article in English | MEDLINE | ID: mdl-27553489

ABSTRACT

BACKGROUND: Inference of active regulatory cascades under specific molecular and environmental perturbations is a recurring task in transcriptional data analysis. Commercial tools based on large, manually curated networks of causal relationships offering such functionality have been used in thousands of articles in the biomedical literature. The adoption and extension of such methods in the academic community has been hampered by the lack of freely available, efficient algorithms and an accompanying demonstration of their applicability using current public networks. RESULTS: In this article, we propose a new statistical method that will infer likely upstream regulators based on observed patterns of up- and down-regulated transcripts. The method is suitable for use with public interaction networks with a mix of signed and unsigned causal edges. It subsumes and extends two previously published approaches and we provide a novel algorithmic method for efficient statistical inference. Notably, we demonstrate the feasibility of using the approach to generate biological insights given current public networks in the context of controlled in-vitro overexpression experiments, stem-cell differentiation data and animal disease models. We also provide an efficient implementation of our method in the R package QuaternaryProd available to download from Bioconductor. CONCLUSIONS: In this work, we have closed an important gap in utilizing causal networks to analyze differentially expressed genes. Our proposed Quaternary test statistic incorporates all available evidence on the potential relevance of an upstream regulator. The new approach broadens the use of these types of statistics for highly curated signed networks in which ambiguities arise but also enables the use of networks with unsigned edges. We design and implement a novel computational method that can efficiently estimate p-values for upstream regulators in current biological settings. We demonstrate the ready applicability of the implemented method to analyze differentially expressed genes using the publicly available networks.


Subject(s)
Algorithms , Gene Regulatory Networks , Animals , Cell Differentiation/genetics , Data Interpretation, Statistical , Gene Expression Regulation , Humans , Stem Cells/cytology , Stem Cells/metabolism , Transcription, Genetic
14.
Bioinformatics ; 30(12): i69-77, 2014 Jun 15.
Article in English | MEDLINE | ID: mdl-24932007

ABSTRACT

MOTIVATION: Understanding and predicting an individual's response in a clinical trial is the key to better treatments and cost-effective medicine. Over the coming years, more and more large-scale omics datasets will become available to characterize patients with complex and heterogeneous diseases at a molecular level. Unfortunately, genetic, phenotypical and environmental variation is much higher in a human trial population than currently modeled or measured in most animal studies. In our experience, this high variability can lead to failure of trained predictors in independent studies and undermines the credibility and utility of promising high-dimensional datasets. METHODS: We propose a method that utilizes patient-level genome-wide expression data in conjunction with causal networks based on prior knowledge. Our approach determines a differential expression profile for each patient and uses a Bayesian approach to infer corresponding upstream regulators. These regulators and their corresponding posterior probabilities of activity are used in a regularized regression framework to predict response. RESULTS: We validated our approach using two clinically relevant phenotypes, namely acute rejection in kidney transplantation and response to Infliximab in ulcerative colitis. To demonstrate pitfalls in translating trained predictors across independent trials, we analyze performance characteristics of our approach as well as alternative feature sets in the regression on two independent datasets for each phenotype. We show that the proposed approach is able to successfully incorporate causal prior knowledge to give robust performance estimates.


Subject(s)
Gene Expression Profiling/methods , Gene Regulatory Networks , Algorithms , Antibodies, Monoclonal/therapeutic use , Bayes Theorem , Colitis, Ulcerative/drug therapy , Colitis, Ulcerative/genetics , Gene Ontology , Graft Rejection/genetics , Humans , Infliximab , Kidney Transplantation , Phenotype , Regression Analysis , Treatment Outcome
15.
PLoS Genet ; 8(12): e1003095, 2012.
Article in English | MEDLINE | ID: mdl-23284290

ABSTRACT

Sensitivity to pain varies considerably between individuals and is known to be heritable. Increased sensitivity to experimental pain is a risk factor for developing chronic pain, a common and debilitating but poorly understood symptom. To understand mechanisms underlying pain sensitivity and to search for rare gene variants (MAF<5%) influencing pain sensitivity, we explored the genetic variation in individuals' responses to experimental pain. Quantitative sensory testing to heat pain was performed in 2,500 volunteers from TwinsUK (TUK): exome sequencing to a depth of 70× was carried out on DNA from singletons at the high and low ends of the heat pain sensitivity distribution in two separate subsamples. Thus in TUK1, 101 pain-sensitive and 102 pain-insensitive individuals were examined, while in TUK2 there were 114 and 96 individuals respectively. A combination of methods was used to test the association between rare variants and pain sensitivity, and the function of the genes identified was explored using network analysis. Using causal reasoning analysis on the genes with different patterns of SNVs by pain sensitivity status, we observed a significant enrichment of variants in genes of the angiotensin pathway (Bonferroni corrected p = 3.8×10⁻⁴). This pathway is already implicated in animal models and human studies of pain, supporting the notion that it may provide fruitful new targets in pain management. The approach of sequencing extreme exome variation in normal individuals has provided important insights into gene networks mediating pain sensitivity in humans and will be applicable to other common complex traits.
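
A Bonferroni-corrected p-value, as reported above, multiplies each raw p-value by the number of tests performed and caps the result at 1. A minimal sketch with hypothetical p-values; the number of pathways tested in the study is not given in the abstract, so none is assumed here.

```python
def bonferroni_correct(p_values):
    """Return Bonferroni-adjusted p-values: p multiplied by the number of tests, capped at 1."""
    number_of_tests = len(p_values)
    return [min(1.0, p * number_of_tests) for p in p_values]


# Hypothetical raw p-values from three tests.
raw_p_values = [0.01, 0.04, 0.5]
adjusted_p_values = bonferroni_correct(raw_p_values)
print(adjusted_p_values)
```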


Subject(s)
Angiotensins , Exome/genetics , Gene Regulatory Networks , Pain , Adult , Angiotensins/genetics , Angiotensins/metabolism , Base Sequence , Gene Expression Regulation , Genetic Predisposition to Disease , Hot Temperature , Humans , Male , Pain/genetics , Pain/physiopathology , Pain Threshold , Sensitivity and Specificity , Sequence Analysis, DNA , Signal Transduction
16.
Bioinformatics ; 29(24): 3167-73, 2013 Dec 15.
Article in English | MEDLINE | ID: mdl-24078682

ABSTRACT

MOTIVATION: The abundance of many transcripts changes significantly in response to a variety of molecular and environmental perturbations. A key question in this setting is as follows: what intermediate molecular perturbations gave rise to the observed transcriptional changes? Regulatory programs are not exclusively governed by transcriptional changes but also by protein abundance and post-translational modifications making direct causal inference from data difficult. However, biomedical research over the last decades has uncovered a plethora of causal signaling cascades that can be used to identify good candidates explaining a specific set of transcriptional changes. METHODS: We take a Bayesian approach to integrate gene expression profiling with a causal graph of molecular interactions constructed from prior biological knowledge. In addition, we define the biological context of a specific interaction by the corresponding Medical Subject Headings terms. The Bayesian network can be queried to suggest upstream regulators that can be causally linked to the altered expression profile. RESULTS: Our approach will treat candidate regulators in the right biological context preferentially, enables hierarchical exploration of resulting hypotheses and takes the complete network of causal relationships into account to arrive at the best set of upstream regulators. We demonstrate the power of our method on distinct biological datasets, namely response to dexamethasone treatment, stem cell differentiation and a neuropathic pain model. In all cases relevant biological insights could be validated. AVAILABILITY AND IMPLEMENTATION: Source code for the method is available upon request.


Subject(s)
Bayes Theorem , Gene Expression Profiling , Gene Expression Regulation , Models, Biological , Regulatory Elements, Transcriptional , Animals , Cell Differentiation , Cells, Cultured , Computer Simulation , Dexamethasone/pharmacology , Humans , Insulin-Secreting Cells/cytology , Insulin-Secreting Cells/metabolism , Keratinocytes/cytology , Keratinocytes/drug effects , Keratinocytes/metabolism , Markov Chains , Mice , Pain/genetics , Pain/metabolism , Pain/pathology , Protein Processing, Post-Translational , Rats , Signal Transduction , Stem Cells/cytology , Stem Cells/metabolism
17.
Bioinformatics ; 28(8): 1114-21, 2012 Apr 15.
Article in English | MEDLINE | ID: mdl-22355083

ABSTRACT

MOTIVATION: The interpretation of high-throughput datasets has remained one of the central challenges of computational biology over the past decade. Furthermore, as the amount of biological knowledge increases, it becomes more and more difficult to integrate this large body of knowledge in a meaningful manner. In this article, we propose a particular solution to both of these challenges. METHODS: We integrate available biological knowledge by constructing a network of molecular interactions of a specific kind: causal interactions. The resulting causal graph can be queried to suggest molecular hypotheses that explain the variations observed in a high-throughput gene expression experiment. We show that a simple scoring function can discriminate between a large number of competing molecular hypotheses about the upstream cause of the changes observed in a gene expression profile. We then develop an analytical method for computing the statistical significance of each score. This analytical method also helps assess the effects of random or adversarial noise on the predictive power of our model. RESULTS: Our results show that the causal graph we constructed from known biological literature is extremely robust to random noise and to missing or spurious information. We demonstrate the power of our causal reasoning model on two specific examples, one from a cancer dataset and the other from a cardiac hypertrophy experiment. We conclude that causal reasoning models provide a valuable addition to the biologist's toolkit for the interpretation of gene expression data. AVAILABILITY AND IMPLEMENTATION: R source code for the method is available upon request.
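The abstract above does not spell out its "simple scoring function". As a minimal sketch only, assuming the score is the number of correct predictions minus the number of incorrect ones (an assumption, not stated in the abstract), a hypothesis about an upstream regulator could be scored against observed expression changes as follows. The toy network, gene names and function names are hypothetical.

```python
from typing import Dict

# Hypothetical toy causal graph: regulator -> {target: sign},
# where sign is +1 (regulator increases target) or -1 (regulator decreases target).
CausalGraph = Dict[str, Dict[str, int]]


def predict(graph: CausalGraph, regulator: str, direction: int) -> Dict[str, int]:
    """Predicted change of each target if `regulator` moves in `direction` (+1 up, -1 down)."""
    return {target: sign * direction for target, sign in graph.get(regulator, {}).items()}


def score(predicted: Dict[str, int], observed: Dict[str, int]) -> int:
    """Assumed score: correct minus incorrect predictions.

    Each gene contributes +1 if predicted and observed directions agree,
    -1 if they disagree, and 0 if the gene was not observed to change.
    """
    return sum(pred * observed.get(gene, 0) for gene, pred in predicted.items())


toy_graph: CausalGraph = {"TF1": {"geneA": 1, "geneB": -1, "geneC": 1, "geneD": 1}}
observed = {"geneA": 1, "geneB": 1, "geneC": 1}  # geneD showed no change

up = score(predict(toy_graph, "TF1", +1), observed)    # hypothesis "TF1 increased"
down = score(predict(toy_graph, "TF1", -1), observed)  # hypothesis "TF1 decreased"
print(up, down)
```

Under this assumed score, competing hypotheses (here "TF1 increased" versus "TF1 decreased") can be ranked by how well their predictions match the data.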


Subject(s)
Breast Neoplasms/genetics , Cardiomegaly/genetics , Computational Biology/methods , Gene Expression Profiling , Algorithms , Humans , Models, Biological
18.
Front Med (Lausanne) ; 10: 1146353, 2023.
Article in English | MEDLINE | ID: mdl-37051216

ABSTRACT

Background: Methotrexate (MTX) is the first-line treatment for rheumatoid arthritis (RA), but a significant proportion of patients fail to achieve a satisfactory treatment response. Here we present a longitudinal multi-omics study aimed at detecting molecular and cellular processes in peripheral blood associated with successful methotrexate treatment of rheumatoid arthritis. Methods: Eighty newly diagnosed patients with RA underwent clinical assessment and donated blood before initiation of MTX and 3 months into treatment. Flow cytometry was used to describe cell types and the presence of activation markers in peripheral blood, the expression of 51 proteins was measured in serum or plasma, and RNA sequencing was performed in peripheral blood mononuclear cells (PBMC). Response to treatment after 3 months was determined using the EULAR response criteria. We assessed the changes in biological phenotypes during treatment and used regression analysis to test whether these changes differed between responders and non-responders. By using measurements from baseline, we also tried to find biomarkers of future MTX response or, alternatively, to predict MTX response. Results: Among the MTX responders (Good or Moderate according to the EULAR treatment response classification, n = 60, 75%), we observed changes during treatment in the proportions of 29 partly overlapping cell types, the levels of 13 proteins and the expression of 38 genes. These changes were in most cases suppressions that were stronger among responders than among non-responders. Within responders to treatment, we observed a suppression of FOXP3 gene expression, a reduction of immunoglobulin gene expression and suppression of genes involved in cell proliferation. The proportions of many HLA-DR-expressing T-cell populations were suppressed in all patients irrespective of clinical response, and the proportions of many IL21R+ T-cell populations were reduced exclusively in non-responders. Using only the baseline measurements, we could not identify any biomarkers or models that predicted response to MTX. Conclusion: We conclude that deep molecular and cellular phenotyping of peripheral blood cells in RA patients treated with methotrexate can reveal previously unrecognized differences between responders and non-responders during 3 months of treatment with MTX. This may contribute to the understanding of the MTX mode of action and explain non-responsiveness to MTX therapy.

19.
Sci Rep ; 13(1): 10058, 2023 06 21.
Article in English | MEDLINE | ID: mdl-37344505

ABSTRACT

Rheumatoid arthritis (RA) is an autoimmune disease characterized by systemic inflammation and is mediated by multiple immune cell types. In this work, we aimed to determine the relevance of changes in cell proportions in peripheral blood mononuclear cells (PBMCs) during the development of disease and following treatment. Samples from healthy blood donors, newly diagnosed RA patients, and established RA patients that had an inadequate response to MTX and were about to start tumor necrosis factor inhibitors (TNFi) treatment were collected before and after 3 months of treatment. We used in parallel a computational deconvolution approach based on RNA expression and flow cytometry to determine the relative cell-type frequencies. Cell-type frequencies from deconvolution of gene expression indicate that monocytes (both classical and non-classical) and CD4+ cells (Th1 and Th2) were increased in RA patients compared to controls, while NK cells and B cells (naïve and mature) were significantly decreased in RA patients. Treatment with MTX caused a decrease in B cells (memory and plasma cell), and a decrease in CD4 Th cells (Th1 and Th17), while treatment with TNFi resulted in a significant increase in the population of B cells. Characterization of the RNA expression patterns found that most of the differentially expressed genes in RA subjects after treatment can be explained by changes in cell frequencies (98% and 74% respectively for MTX and TNFi).


Subject(s)
Antirheumatic Agents , Arthritis, Rheumatoid , Humans , Antirheumatic Agents/therapeutic use , Leukocytes, Mononuclear/metabolism , Arthritis, Rheumatoid/drug therapy , Arthritis, Rheumatoid/genetics , Arthritis, Rheumatoid/diagnosis , CD4-Positive T-Lymphocytes/metabolism , RNA
20.
BMC Bioinformatics ; 13: 35, 2012 Feb 20.
Article in English | MEDLINE | ID: mdl-22348444

ABSTRACT

BACKGROUND: Causal graphs are an increasingly popular tool for the analysis of biological datasets. In particular, signed causal graphs--directed graphs whose edges additionally have a sign denoting upregulation or downregulation--can be used to model regulatory networks within a cell. Such models allow prediction of downstream effects of regulation of biological entities; conversely, they also enable inference of causative agents behind observed expression changes. However, due to their complex nature, signed causal graph models present special challenges with respect to assessing statistical significance. In this paper we frame and solve two fundamental computational problems that arise in practice when computing appropriate null distributions for hypothesis testing. RESULTS: First, we show how to compute a p-value for agreement between observed and model-predicted classifications of gene transcripts as upregulated, downregulated, or neither. Specifically, how likely are the classifications to agree to the same extent under the null distribution of the observed classification being randomized? This problem, which we call "Ternary Dot Product Distribution" owing to its mathematical form, can be viewed as a generalization of Fisher's exact test to ternary variables. We present two computationally efficient algorithms for computing the Ternary Dot Product Distribution and investigate its combinatorial structure analytically and numerically to establish computational complexity bounds. Second, we develop an algorithm for efficiently performing random sampling of causal graphs. This enables p-value computation under a different, equally important null distribution obtained by randomizing the graph topology but keeping fixed its basic structure: connectedness and the positive and negative in- and out-degrees of each vertex. We provide an algorithm for sampling a graph from this distribution uniformly at random. We also highlight theoretical challenges unique to signed causal graphs; previous work on graph randomization has studied undirected graphs and directed but unsigned graphs. CONCLUSION: We present algorithmic solutions to two statistical significance questions necessary to apply the causal graph methodology, a powerful tool for biological network analysis. The algorithms we present are both fast and provably correct. Our work may be of independent interest in non-biological contexts as well, as it generalizes mathematical results that have been studied extensively in other fields.
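The first null distribution described above (the observed classification being randomized) can be illustrated with a small sketch. Note that this is not the paper's exact algorithm: it swaps in a plain Monte Carlo permutation estimate of the p-value, and the function names, permutation count and seed are illustrative assumptions.

```python
import random
from typing import List


def ternary_dot(pred: List[int], obs: List[int]) -> int:
    """Dot product of two vectors with entries in {-1, 0, +1}.

    Agreements contribute +1, disagreements -1, and any zero contributes 0.
    """
    return sum(p * o for p, o in zip(pred, obs))


def permutation_p_value(pred: List[int], obs: List[int],
                        n_perm: int = 10000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(score >= observed score) when the observed
    labels are randomly permuted across genes (label counts kept fixed)."""
    rng = random.Random(seed)
    observed_score = ternary_dot(pred, obs)
    shuffled = list(obs)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if ternary_dot(pred, shuffled) >= observed_score:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data: predictions and observations agree on every gene.
pred = [1, 1, 1, -1, -1, -1, 0, 0, 0, 0]
obs = [1, 1, 1, -1, -1, -1, 0, 0, 0, 0]
p = permutation_p_value(pred, obs)
print(ternary_dot(pred, obs), p)
```

A sampling estimate like this is limited in resolution by the number of permutations, which is one motivation for computing the distribution exactly as the abstract describes.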


Subject(s)
Algorithms , Models, Biological , Animals , Chondrocytes/cytology , Chondrocytes/metabolism , Dexamethasone , Gene Expression Profiling , Hypoxia/drug therapy , Hypoxia/genetics , Hypoxia/metabolism , Mice , Oligonucleotide Array Sequence Analysis , Receptors, Glucocorticoid/metabolism , Statistical Distributions