1.
Nat Commun ; 15(1): 2966, 2024 Apr 05.
Article En | MEDLINE | ID: mdl-38580683

Between 30% and 70% of patients with breast cancer have pre-existing chronic conditions, and more than half are on long-term non-cancer medication at the time of diagnosis. Preliminary epidemiological evidence suggests that some non-cancer medications may affect breast cancer risk, recurrence, and survival. In this nationwide cohort study, we assessed the association between medication use at breast cancer diagnosis and survival. We included 235,368 French women with newly diagnosed non-metastatic breast cancer. In analyses of 288 medications, we identified eight medications positively associated with either overall survival or disease-free survival: rabeprazole, alverine, atenolol, simvastatin, rosuvastatin, estriol (vaginal or transmucosal), nomegestrol, and hypromellose; and eight medications negatively associated with overall survival or disease-free survival: ferrous fumarate, prednisolone, carbimazole, pristinamycin, oxazepam, alprazolam, hydroxyzine, and mianserin. Full results are available online from an interactive platform (https://adrenaline.curie.fr). This resource provides hypotheses for drugs that may naturally influence breast cancer evolution.


Breast Neoplasms , Humans , Female , Breast Neoplasms/drug therapy , Breast Neoplasms/epidemiology , Breast Neoplasms/pathology , Cohort Studies , Comorbidity , Simvastatin
2.
BMC Bioinformatics ; 24(1): 459, 2023 Dec 07.
Article En | MEDLINE | ID: mdl-38057718

BACKGROUND: Variability in datasets is not only the product of biological processes: it is also the product of technical biases. ComBat and ComBat-Seq are among the most widely used tools for correcting those technical biases, called batch effects, in microarray and RNA-Seq expression data, respectively. RESULTS: In this technical note, we present a new Python implementation of ComBat and ComBat-Seq. While the mathematical framework is strictly the same, we show here that our implementations: (i) have similar results in terms of batch effects correction; (ii) are as fast as or faster than the original implementations in R; and (iii) offer new tools for the bioinformatics community to participate in its development. pyComBat is implemented in the Python language and is distributed under the GPL-3.0 license (https://www.gnu.org/licenses/gpl-3.0.en.html) as a module of the inmoose package. Source code is available at https://github.com/epigenelabs/inmoose and the Python package at https://pypi.org/project/inmoose . CONCLUSIONS: We present a new Python implementation of state-of-the-art tools ComBat and ComBat-Seq for the correction of batch effects in microarray and RNA-Seq data. This new implementation, based on the same mathematical frameworks as ComBat and ComBat-Seq, offers similar power for batch effect correction, at reduced computational cost.


Computational Biology , Software , Bayes Theorem , Computational Biology/methods , RNA-Seq
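To make the notion of a batch effect concrete, here is a minimal sketch of a location-scale adjustment for a single gene. It is an assumption-laden illustration, not the pyComBat interface: ComBat additionally shrinks the per-batch estimates with empirical Bayes and can account for covariates, neither of which is shown. The function name `adjust_batches` is hypothetical.

```python
import math
from statistics import mean, pstdev


def adjust_batches(values, batches):
    """Centre and rescale each batch of one gene (hypothetical helper).

    Simplified location-scale adjustment: no empirical Bayes shrinkage,
    no covariates, so this is NOT the full ComBat model.
    """
    grand_mean = mean(values)
    labels = sorted(set(batches))
    # Per-batch location (mean) and scale (standard deviation).
    members = {b: [v for v, lab in zip(values, batches) if lab == b] for b in labels}
    batch_mean = {b: mean(vs) for b, vs in members.items()}
    batch_sd = {b: pstdev(vs) for b, vs in members.items()}
    # Pooled within-batch standard deviation, used as the common target scale.
    pooled_sd = math.sqrt(
        sum((v - batch_mean[lab]) ** 2 for v, lab in zip(values, batches)) / len(values)
    )
    adjusted = []
    for v, lab in zip(values, batches):
        scale = pooled_sd / batch_sd[lab] if batch_sd[lab] > 0 else 1.0
        adjusted.append(grand_mean + (v - batch_mean[lab]) * scale)
    return adjusted


# Toy data: the second batch is shifted by +10 relative to the first.
expression = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
batch_labels = ["a", "a", "a", "b", "b", "b"]
print(adjust_batches(expression, batch_labels))
```

After adjustment both batches share the same mean, so the shift between them no longer dominates the variability. For real analyses, the inmoose documentation describes the package's actual functions.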
3.
PLoS Comput Biol ; 19(3): e1010342, 2023 03.
Article En | MEDLINE | ID: mdl-36893104

The majority of gene expression studies focus on the search for genes whose mean expression is different between two or more populations of samples in the so-called "differential expression analysis" approach. However, a difference in variance in gene expression may also be biologically and physiologically relevant. In the classical statistical model used to analyze RNA-sequencing (RNA-seq) data, the dispersion, which defines the variance, is only considered as a parameter to be estimated prior to identifying a difference in mean expression between conditions of interest. Here, we propose to evaluate four recently published methods, which detect differences in both the mean and dispersion in RNA-seq data. We thoroughly investigated the performance of these methods on simulated datasets and characterized parameter settings to reliably detect genes with a differential expression dispersion. We applied these methods to The Cancer Genome Atlas datasets. Interestingly, among the genes with an increased expression dispersion in tumors and without a change in mean expression, we identified some key cellular functions, most of which were related to catabolism and were overrepresented in most of the analyzed cancers. In particular, our results highlight autophagy, whose role in cancerogenesis is context-dependent, illustrating the potential of the differential dispersion approach to gain new insights into biological processes and to discover new biomarkers.


Models, Statistical , Neoplasms , Humans , Sequence Analysis, RNA/methods , RNA/genetics , Autophagy/genetics , Neoplasms/genetics , Gene Expression Profiling/methods
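The role of dispersion can be illustrated with the negative binomial relation commonly used for RNA-seq counts, variance = mean + dispersion × mean². The sketch below is a simple method-of-moments estimate under that assumption; it is not one of the four methods evaluated in the study, which the abstract does not name, and `nb_dispersion` is a hypothetical helper.

```python
from statistics import mean, variance


def nb_dispersion(counts):
    """Method-of-moments dispersion under var = mu + phi * mu**2.

    Hypothetical helper for illustration; returns 0.0 when the sample
    variance does not exceed the mean (no overdispersion detected).
    """
    mu = mean(counts)
    if mu == 0:
        return 0.0
    return max((variance(counts) - mu) / mu ** 2, 0.0)


# Toy counts for one gene: same mean (100) in both groups, different spread.
normal = [95, 100, 105, 98, 102]
tumor = [20, 180, 60, 140, 100]
print(mean(normal), mean(tumor))
print(nb_dispersion(normal), nb_dispersion(tumor))
```

A test on means alone would see no difference between these two groups, whereas the dispersion estimate separates them clearly.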
4.
STAR Protoc ; 4(1): 101998, 2023 03 17.
Article En | MEDLINE | ID: mdl-36609152

We present a network-based protocol to discover susceptibility genes in case-control genome-wide association studies (GWASs). In short, this protocol looks for biomarkers that are informative of disease status and interconnected in an underlying biological network. This boosts discovery and interpretability. Moreover, the protocol tackles the instability of network methods, producing a stable set of genes most likely to replicate in external cohorts. To apply the procedure to a provided GWAS dataset, install the required software and execute our command-line tool. For complete details on the use and execution of this protocol, please refer to Climente-González et al. (1).


Genome-Wide Association Study , Software , Genome-Wide Association Study/methods
5.
JCO Clin Cancer Inform ; 6: e2200054, 2022 11.
Article En | MEDLINE | ID: mdl-36379004

PURPOSE: Administering systemic anticancer treatment (SACT) to patients near death can negatively affect their health-related quality of life. Late SACT administrations should be avoided in these cases. Machine learning techniques could be used to build decision support tools leveraging registry data for clinicians to limit late SACT administration. MATERIALS AND METHODS: Patients with advanced lung cancer who were treated at the Department of Oncology, Aalborg University Hospital and died between 2010 and 2019 were included (N = 2,368). Diagnoses, treatments, biochemical data, and histopathologic results were used to train predictive models of 30-day mortality using logistic regression with elastic net penalty, random forest, gradient tree boosting, multilayer perceptron, and long short-term memory network. The importance of the variables and the clinical utility of the models were evaluated. RESULTS: The random forest and gradient tree boosting models outperformed other models, whereas the artificial neural network-based models underperformed. Adding summary variables had a modest effect on performance with an increase in average precision from 0.500 to 0.505 and from 0.498 to 0.509 for the gradient tree boosting and random forest models, respectively. Biochemical results alone contained most of the information with a limited degradation of the performances when fitting models with only these variables. The utility analysis showed that by applying a simple threshold to the predicted risk of 30-day mortality, 40% of late SACT administrations could have been prevented at the cost of 2% of patients stopping their treatment 90 days before death. CONCLUSION: This study demonstrates the potential of a decision support tool to limit late SACT administration in patients with cancer. Further work is warranted to refine the model, build an easy-to-use prototype, and conduct a prospective validation study.


Lung Neoplasms , Quality of Life , Humans , Machine Learning , Logistic Models , Lung Neoplasms/diagnosis , Lung Neoplasms/drug therapy , Neural Networks, Computer
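The utility analysis in the abstract above applies a threshold to the predicted risk of 30-day mortality. A minimal sketch of that kind of trade-off follows, with one simplifying assumption stated loudly: the cost here is counted over all administrations not followed by death within 30 days, whereas the study defines it relative to 90 days before death. The function name and the toy numbers are hypothetical.

```python
def threshold_utility(risks, died_within_30d, threshold):
    """Return (share of late administrations flagged, share of others flagged).

    Hypothetical helper. `risks` are predicted 30-day mortality risks and
    `died_within_30d` marks administrations that turned out to be late.
    """
    flagged = [r >= threshold for r in risks]
    late = sum(died_within_30d)
    not_late = len(risks) - late
    prevented = sum(1 for f, d in zip(flagged, died_within_30d) if f and d)
    stopped = sum(1 for f, d in zip(flagged, died_within_30d) if f and not d)
    prevented_share = prevented / late if late else 0.0
    stopped_share = stopped / not_late if not_late else 0.0
    return prevented_share, stopped_share


# Toy predictions, not data from the study.
risks = [0.9, 0.8, 0.3, 0.2, 0.1, 0.7]
late = [True, True, True, False, False, False]
print(threshold_utility(risks, late, threshold=0.75))
```

Raising the threshold lowers the share of patients whose treatment would be stopped unnecessarily, at the price of preventing fewer late administrations.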
6.
Cancers (Basel) ; 14(11), 2022 May 27.
Article En | MEDLINE | ID: mdl-35681651

BACKGROUND: Breast cancer (BC) is the most frequent cancer and the leading cause of cancer-related death in women. The French National Cancer Institute has created a national cancer cohort to promote cancer research and improve our understanding of cancer using the National Health Data System (SNDS) and amalgamating all cancer sites. So far, no detailed separate data are available for early BC. OBJECTIVES: To describe the creation of the French Early Breast Cancer Cohort (FRESH). METHODS: All French women aged 18 years or over, with early-stage BC newly diagnosed between 1 January 2011 and 31 December 2017, treated by surgery, and registered in the general health insurance coverage plan were included in the cohort. Patients with suspected locoregional or distant metastases at diagnosis were excluded. BC treatments (surgery, chemotherapy, targeted therapy, radiotherapy, and endocrine therapy) and diagnostic procedures (biopsy, cytology, and imaging) were extracted from hospital discharge reports, outpatient care notes, or pharmacy drug delivery data. The BC subtype was inferred from the treatments received. RESULTS: We included 235,368 patients with early BC in the cohort (median age: 60 years). The BC subtype distribution was as follows: luminal (80.2%), triple-negative (TNBC, 9.5%), HER2+ (10.3%), or unidentifiable (n = 44,388, 18.9% of the cohort). Most patients underwent radiotherapy (n = 200,685, 85.3%) and endocrine therapy (n = 165,655, 70.4%), and 38.3% (n = 90,252) received chemotherapy. Treatments and care pathways are described. CONCLUSIONS: The FRESH Cohort is an unprecedented population-based resource facilitating future large-scale real-life studies aiming to improve care pathways and quality of care for BC patients.

7.
BMC Med Genomics ; 15(1): 100, 2022 04 30.
Article En | MEDLINE | ID: mdl-35501860

BACKGROUND: For the most part, genome-wide association studies (GWAS) have only partially explained the heritability of complex diseases. One of their limitations is to assume independent contributions of individual variants to the phenotype. Many tools have therefore been developed to investigate the interactions between distant loci, or epistasis. Among them, the recently proposed EpiGWAS models the interactions between a target variant and the rest of the genome. However, applying this approach to studying interactions along all genes of a disease map is not straightforward. Here, we propose a pipeline to that effect, which we illustrate by investigating a multiple sclerosis GWAS dataset from the Wellcome Trust Case Control Consortium 2 through 19 disease maps from the MetaCore pathway database. RESULTS: For each disease map, we build an epistatic network by connecting the genes that are deemed to interact. These networks tend to be connected, complementary to the disease maps and contain hubs. In addition, we report 4 epistatic gene pairs involving missense variants, and 25 gene pairs with a deleterious epistatic effect mediated by eQTLs. Among these, we highlight the interaction of GLI-1 and SUFU, and of IP10 and NF-κB, as they both match known biological interactions. The latter pair is particularly promising for therapeutic development, as both genes have known inhibitors. CONCLUSIONS: Our study showcases the ability of EpiGWAS to uncover biologically interpretable epistatic interactions that are potentially actionable for the development of combination therapy.


Epistasis, Genetic , Multiple Sclerosis , Case-Control Studies , Genome-Wide Association Study , Humans , Multiple Sclerosis/genetics , Phenotype
8.
Gigascience ; 11, 2022 02 04.
Article En | MEDLINE | ID: mdl-35134928

BACKGROUND: Detecting epistatic interactions at the gene level is essential to understanding the biological mechanisms of complex diseases. Unfortunately, genome-wide interaction association studies involve many statistical challenges that make such detection hard. We propose a multi-step protocol for epistasis detection along the edges of a gene-gene co-function network. Such an approach reduces the number of tests performed and provides interpretable interactions while keeping type I error controlled. Yet, mapping gene interactions into testable single-nucleotide polymorphism (SNP)-interaction hypotheses, as well as computing gene pair association scores from SNP pair ones, is not trivial. RESULTS: Here we compare 3 SNP-gene mappings (positional overlap, expression quantitative trait loci, and proximity in 3D structure) and use the adaptive truncated product method to compute gene pair scores. This method is non-parametric, does not require a known null distribution, and is fast to compute. We apply multiple variants of this protocol to a genome-wide association study dataset on inflammatory bowel disease. Different configurations produced different results, highlighting that various mechanisms are implicated in inflammatory bowel disease, while at the same time, results overlapped with known disease characteristics. Importantly, the proposed pipeline also differs from a conventional approach where no network is used, showing the potential for additional discoveries when prior biological knowledge is incorporated into epistasis detection.


Epistasis, Genetic , Genome-Wide Association Study , Genome-Wide Association Study/methods , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci
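The gene pair scores in the abstract above come from the adaptive truncated product method. The sketch below shows only the basic ingredient, a truncated product of SNP-pair p-values at a fixed cut-off, plus an empirical p-value against null statistics that are assumed to come from permutations. The adaptive step, which searches over several cut-offs and corrects for that search, is not shown. Function names and numbers are hypothetical.

```python
import math


def log_truncated_product(pvalues, tau):
    """Log of the product of the p-values at or below tau.

    Hypothetical helper; returns 0.0 when no p-value passes the cut-off.
    P-values are assumed to be strictly positive.
    """
    return sum(math.log(p) for p in pvalues if p <= tau)


def empirical_pvalue(observed, null_stats):
    """Share of null statistics at least as small as the observed one."""
    hits = sum(1 for s in null_stats if s <= observed)
    return (hits + 1) / (len(null_stats) + 1)


# Toy SNP-pair p-values mapped to one gene pair (not data from the study).
snp_pair_pvalues = [0.001, 0.04, 0.2, 0.6]
observed = log_truncated_product(snp_pair_pvalues, tau=0.05)
# Null statistics would normally come from permuted phenotypes.
null_stats = [-1.0, -2.0, -12.0]
print(observed, empirical_pvalue(observed, null_stats))
```

Working in log space avoids numerical underflow when many small p-values are multiplied.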
9.
J Comput Biol ; 29(3): 213-232, 2022 03.
Article En | MEDLINE | ID: mdl-33926217

More and more biologists and bioinformaticians turn to machine learning to analyze large amounts of data. In this context, it is crucial to understand which is the most suitable data analysis pipeline for achieving reliable results. This process may be challenging, due to a variety of factors, the most crucial ones being the data type and the general goal of the analysis (e.g., explorative or predictive). Life science data sets require further consideration as they often contain measures with a low signal-to-noise ratio, high-dimensional observations, and relatively few samples. In this complex setting, regularization, which can be defined as the introduction of additional information to solve an ill-posed problem, is the tool of choice to obtain robust models. Different regularization practices may be used depending both on characteristics of the data and of the question asked, and different choices may lead to different results. In this article, we provide a comprehensive description of the impact and importance of regularization techniques in life science studies. In particular, we provide an intuition of what regularization is and of the different ways it can be implemented and exploited. We propose four general life sciences problems in which regularization is fundamental and should be exploited for robustness. For each of these large families of problems, we enumerate different techniques as well as examples and case studies. Lastly, we provide a unified view of how to approach each data type with various regularization techniques.


Algorithms , Biological Science Disciplines , Machine Learning
10.
Pac Symp Biocomput ; 27: 163-174, 2022.
Article En | MEDLINE | ID: mdl-34890146

Genome-Wide Association Studies, or GWAS, aim at finding Single Nucleotide Polymorphisms (SNPs) that are associated with a phenotype of interest. GWAS are known to suffer from the large dimensionality of the data with respect to the number of available samples. Other limiting factors include the dependency between SNPs, due to linkage disequilibrium (LD), and the need to account for population structure, that is to say, confounding due to genetic ancestry. We propose an efficient approach for the multivariate analysis of multi-population GWAS data based on a multitask group Lasso formulation. Each task corresponds to a subpopulation of the data, and each group to an LD-block. This formulation alleviates the curse of dimensionality, and makes it possible to identify disease LD-blocks shared across populations/tasks, as well as some that are specific to one population/task. In addition, we use stability selection to increase the robustness of our approach. Finally, gap safe screening rules speed up computations enough that our method can run at a genome-wide scale. To our knowledge, this is the first framework for GWAS on diverse populations combining feature selection at the LD-groups level, a multitask approach to address population structure, stability selection, and safe screening rules. We show that our approach outperforms state-of-the-art methods on both simulated and real-world cancer datasets.


Computational Biology , Genome-Wide Association Study , Genetics, Population , Humans , Linkage Disequilibrium , Phenotype , Polymorphism, Single Nucleotide
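Group Lasso formulations such as the one in the abstract above select or discard whole groups of coefficients at once. A minimal sketch of the mechanism is the group soft-thresholding operator, the proximal step of the group Lasso penalty. It assumes that one group holds the coefficients of a single LD block; how the authors' solver arranges groups and tasks is not described in the abstract, and the function name is hypothetical.

```python
import math


def group_soft_threshold(weights, threshold):
    """Shrink a group of coefficients towards zero as a block.

    Hypothetical helper: the whole group is set to zero when its
    Euclidean norm is at most `threshold`, otherwise it is rescaled.
    """
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= threshold:
        return [0.0] * len(weights)
    scale = 1.0 - threshold / norm
    return [scale * w for w in weights]


# A strong LD block survives (shrunk), a weak one is removed entirely.
print(group_soft_threshold([3.0, 4.0], threshold=1.0))
print(group_soft_threshold([0.3, 0.4], threshold=1.0))
```

Because an entire block is either kept or zeroed, selection happens at the level of LD blocks rather than individual SNPs.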
11.
Pac Symp Biocomput ; 27: 349-360, 2022.
Article En | MEDLINE | ID: mdl-34890162

To address the lack of statistical power and interpretability of genome-wide association studies (GWAS), gene-level analyses combine the p-values of individual single nucleotide polymorphisms (SNPs) into gene statistics. However, using all SNPs mapped to a gene, including those with low association scores, can mask the association signal of a gene. We therefore propose a new two-step strategy, consisting of first selecting the SNPs most associated with the phenotype within a given gene, before testing their joint effect on the phenotype. The recently proposed kernelPSI framework for kernel-based post-selection inference makes it possible to model non-linear relationships between features, as well as to obtain valid p-values that account for the selection step. In this paper, we show how we adapted kernelPSI to the setting of quantitative GWAS, using kernels to model epistatic interactions between neighboring SNPs, and post-selection inference to determine the joint effect of selected blocks of SNPs on a phenotype. We illustrate this tool on the study of two continuous phenotypes from the UKBiobank. We show that kernelPSI can be successfully used to study GWAS data and detect genes associated with a phenotype through the signal carried by the most strongly associated regions of these genes. In particular, we show that kernelPSI enjoys more statistical power than other gene-based GWAS tools, such as SKAT or MAGMA. kernelPSI is an effective tool to combine SNP-based and gene-based analyses of GWAS data, and can be used successfully to improve both statistical performance and interpretability of GWAS.


Computational Biology , Genome-Wide Association Study , Humans , Phenotype , Polymorphism, Single Nucleotide
12.
BMC Med Res Methodol ; 21(1): 155, 2021 07 29.
Article En | MEDLINE | ID: mdl-34325649

BACKGROUND: Linking independent sources of data describing the same individuals enables innovative epidemiological and health studies but requires a robust record linkage approach. We describe a hybrid record linkage process to link databases from two independent ongoing French national studies, GEMO (Genetic Modifiers of BRCA1 and BRCA2), which focuses on the identification of genetic factors modifying cancer risk of BRCA1 and BRCA2 mutation carriers, and GENEPSO (prospective cohort of BRCAx mutation carriers), which focuses on environmental and lifestyle risk factors. METHODS: To identify as many as possible of the individuals participating in the two studies but not registered by a shared identifier, we combined probabilistic record linkage (PRL) and supervised machine learning (ML). This approach (named "PRL + ML") combined the candidate matches identified by both approaches. We built the ML model using the gold standard on a first version of the two databases as a training dataset. This gold standard was obtained from PRL-derived matches verified by an exhaustive manual review. RESULTS: The Random Forest (RF) algorithm showed the highest recall (0.985) among six widely used ML algorithms: RF, Bagged trees, AdaBoost, Support Vector Machine, Neural Network. Therefore, RF was selected to build the ML model since our goal was to identify the maximum number of true matches. Our combined linkage PRL + ML showed a higher recall (range 0.988-0.992) than either PRL (range 0.916-0.991) or ML (0.981) alone. It identified 1995 individuals participating in both GEMO (6375 participants) and GENEPSO (4925 participants). CONCLUSIONS: Our hybrid linkage process represents an efficient tool for linking GEMO and GENEPSO. It may be generalizable to other epidemiological studies involving other databases and registries.


Breast Neoplasms , BRCA1 Protein/genetics , BRCA2 Protein/genetics , Cohort Studies , Databases, Factual , Female , Genetic Predisposition to Disease , Humans , Mutation , Prospective Studies , Risk
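The "PRL + ML" approach in the abstract above pools the candidate matches from both methods. The sketch below shows why pooling cannot lower recall, using made-up record identifiers. It says nothing about how false matches were handled in the study, and the names are hypothetical.

```python
def recall(predicted, truth):
    """Share of true matches that were recovered (hypothetical helper)."""
    return len(predicted & truth) / len(truth)


# Toy (GEMO id, GENEPSO id) pairs; not data from the study.
gold = {("g1", "p1"), ("g2", "p2"), ("g3", "p3"), ("g4", "p4")}
prl_matches = {("g1", "p1"), ("g2", "p2")}
ml_matches = {("g2", "p2"), ("g3", "p3")}

# Pooling the two candidate sets is a set union.
combined = prl_matches | ml_matches

print(recall(prl_matches, gold), recall(ml_matches, gold), recall(combined, gold))
```

The union contains every true match found by either method, so its recall is at least as high as the better of the two.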
13.
Int J Mol Sci ; 22(10), 2021 May 12.
Article En | MEDLINE | ID: mdl-34066072

Identification of the protein targets of hit molecules is essential in the drug discovery process. Target prediction with machine learning algorithms can help accelerate this search, limiting the number of required experiments. However, Drug-Target Interactions databases used for training present high statistical bias, leading to a high number of false positives, thus increasing time and cost of experimental validation campaigns. To minimize the number of false positives among predicted targets, we propose a new scheme for choosing negative examples, so that each protein and each drug appears an equal number of times in positive and negative examples. We artificially reproduce the process of target identification for three specific drugs, and more globally for 200 approved drugs. For the detailed three drug examples, and for the larger set of 200 drugs, training with the proposed scheme for the choice of negative examples improved target prediction results: the average number of false positives among the top ranked predicted targets decreased, and overall, the rank of the true targets was improved. Our method corrects databases' statistical bias and reduces the number of false positive predictions, and therefore the number of useless experiments potentially undertaken.


Computational Biology/methods , Drug Discovery/methods , Machine Learning , Pharmaceutical Preparations/chemistry , Proteins/chemistry , Software , Humans , Pharmaceutical Preparations/metabolism , Protein Interaction Mapping , Proteins/metabolism , Support Vector Machine
14.
PLoS Comput Biol ; 17(3): e1008819, 2021 03.
Article En | MEDLINE | ID: mdl-33735170

Genome-wide association studies (GWAS) explore the genetic causes of complex diseases. However, classical approaches ignore the biological context of the genetic variants and genes under study. To address this shortcoming, one can use biological networks, which model functional relationships, to search for functionally related susceptibility loci. Many such network methods exist, each arising from different mathematical frameworks, pre-processing steps, and assumptions about the network properties of the susceptibility mechanism. Unsurprisingly, this results in disparate solutions. To explore how to exploit these heterogeneous approaches, we selected six network methods and applied them to GENESIS, a nationwide French study on familial breast cancer. First, we verified that network methods recovered more interpretable results than a standard GWAS. We addressed the heterogeneity of their solutions by studying their overlap, computing what we called the consensus. The key gene in this consensus solution was COPS5, a gene related to multiple cancer hallmarks. Another issue we observed was that network methods were unstable, selecting very different genes on different subsamples of GENESIS. Therefore, we proposed a stable consensus solution formed by the 68 genes most consistently selected across multiple subsamples. This solution was also enriched in genes known to be associated with breast cancer susceptibility (BLM, CASP8, CASP10, DNAJC1, FGFR2, MRPS30, and SLC4A7, P-value = 3 × 10⁻⁴). The most connected gene was CUL3, a regulator of several genes linked to cancer progression. Lastly, we evaluated the biases of each method and the impact of their parameters on the outcome. In general, network methods preferred highly connected genes, even after random rewirings that stripped the connections of any biological meaning. In conclusion, we present the advantages of network-guided GWAS, characterize their shortcomings, and provide strategies to address them.
To compute the consensus networks, implementations of all six methods are available at https://github.com/hclimente/gwas-tools.


Breast Neoplasms , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/methods , Algorithms , Breast Neoplasms/epidemiology , Breast Neoplasms/genetics , Databases, Genetic , Female , Humans , Polymorphism, Single Nucleotide/genetics
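The stable consensus in the abstract above keeps the genes selected most consistently across subsamples. A minimal sketch of that idea counts how often each gene is selected and keeps the most frequent ones. The selections below are toy data that only reuse gene names from the abstract as labels, the exact criterion used in the study is not given in the abstract, and `stable_consensus` is a hypothetical helper rather than part of gwas-tools.

```python
from collections import Counter


def stable_consensus(selections, top_k):
    """Return the top_k genes selected in the most subsamples.

    Hypothetical helper; ties are broken alphabetically so the
    result is deterministic.
    """
    counts = Counter(gene for selected in selections for gene in set(selected))
    ranked = sorted(counts, key=lambda gene: (-counts[gene], gene))
    return ranked[:top_k]


# One set of selected genes per subsample (toy data).
selections = [
    {"CUL3", "COPS5", "FGFR2"},
    {"CUL3", "COPS5"},
    {"CUL3", "BLM"},
]
print(stable_consensus(selections, top_k=2))
```

Genes that appear in only one subsample drop out, which is what makes the resulting set less sensitive to the particular sample drawn.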
15.
PLoS One ; 15(11): e0242927, 2020.
Article En | MEDLINE | ID: mdl-33253293

More and more genome-wide association studies are being designed to uncover the full genetic basis of common diseases. Nonetheless, the resulting loci are often insufficient to fully recover the observed heritability. Epistasis, or gene-gene interaction, is one of many hypotheses put forward to explain this missing heritability. In the present work, we propose epiGWAS, a new approach for epistasis detection that identifies interactions between a target SNP and the rest of the genome. This contrasts with the classical strategy of epistasis detection through exhaustive pairwise SNP testing. We draw inspiration from causal inference in randomized clinical trials, which allows us to take into account linkage disequilibrium. EpiGWAS encompasses several methods, which we compare to state-of-the-art techniques for epistasis detection on simulated and real data. The promising results demonstrate empirically the benefits of EpiGWAS to identify pairwise interactions.


Epistasis, Genetic/genetics , Genome-Wide Association Study/statistics & numerical data , Linkage Disequilibrium/genetics , Models, Genetic , Algorithms , Humans , Polymorphism, Single Nucleotide/genetics
16.
Bioinformatics ; 35(14): i427-i435, 2019 07 15.
Article En | MEDLINE | ID: mdl-31510671

MOTIVATION: Finding non-linear relationships between biomolecules and a biological outcome is computationally expensive and statistically challenging. Existing methods have important drawbacks, including among others lack of parsimony, non-convexity and computational overhead. Here we propose block HSIC Lasso, a non-linear feature selector that does not present the previous drawbacks. RESULTS: We compare block HSIC Lasso to other state-of-the-art feature selection techniques in both synthetic and real data, including experiments over three common types of genomic data: gene-expression microarrays, single-cell RNA sequencing and genome-wide association studies. In all cases, we observe that features selected by block HSIC Lasso retain more information about the underlying biology than those selected by other techniques. As a proof of concept, we applied block HSIC Lasso to a single-cell RNA sequencing experiment on mouse hippocampus. We discovered that many genes linked in the past to brain development and function are involved in the biological differences between the types of neurons. AVAILABILITY AND IMPLEMENTATION: Block HSIC Lasso is implemented in the Python 2/3 package pyHSICLasso, available on PyPI. Source code is available on GitHub (https://github.com/riken-aip/pyHSICLasso). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Biomarkers , Genome-Wide Association Study , Software , Animals , Genome , Genomics , Mice
17.
PLoS One ; 13(10): e0204999, 2018.
Article En | MEDLINE | ID: mdl-30286165

Adverse drug reactions, also called side effects, range from mild to fatal clinical events and significantly affect the quality of care. Among other causes, side effects occur when drugs bind to proteins other than their intended target. As experimentally testing drug specificity against the entire proteome is out of reach, we investigate the application of chemogenomics approaches. We formulate the study of drug specificity as a problem of predicting interactions between drugs and proteins at the proteome scale. We build several benchmark datasets, and propose NN-MT, a multi-task Support Vector Machine (SVM) algorithm that is trained on a limited number of data points, in order to solve the computational issues of proteome-wide SVM for chemogenomics. We compare NN-MT to different state-of-the-art methods, and show that its prediction performances are similar or better, at an efficient calculation cost. Compared to its competitors, the proposed method is particularly efficient at predicting (protein, ligand) interactions in the difficult double-orphan case, i.e. when no interactions are previously known for either the protein or the ligand. The NN-MT algorithm appears to be a good default method providing state-of-the-art or better performances in the wide range of prediction scenarios considered in the present study: proteome-wide prediction, protein family prediction, test (protein, ligand) pairs dissimilar to pairs in the train set, and orphan cases.


Genomics , Pharmaceutical Preparations , Drug-Related Side Effects and Adverse Reactions/diagnosis , Pharmaceutical Preparations/metabolism , Prognosis , Support Vector Machine
20.
Nat Commun ; 7: 12460, 2016 08 23.
Article En | MEDLINE | ID: mdl-27549343

Rheumatoid arthritis (RA) affects millions world-wide. While anti-TNF treatment is widely used to reduce disease progression, treatment fails in approximately one-third of patients. No biomarker currently exists that identifies non-responders before treatment. A rigorous community-based assessment of the utility of SNP data for predicting anti-TNF treatment efficacy in RA patients was performed in the context of a DREAM Challenge (http://www.synapse.org/RA_Challenge). An open challenge framework enabled the comparative evaluation of predictions developed by 73 research groups using the most comprehensive available data and covering a wide range of state-of-the-art modelling methodologies. Despite a significant genetic heritability estimate of treatment non-response trait (h² = 0.18, P value = 0.02), no significant genetic contribution to prediction accuracy is observed. Results formally confirm the expectations of the rheumatology community that SNP information does not significantly improve predictive performance relative to standard clinical traits, thereby justifying a refocusing of future efforts on collection of other data.


Antibodies, Monoclonal, Humanized/therapeutic use , Arthritis, Rheumatoid/drug therapy , Genetic Predisposition to Disease/genetics , Polymorphism, Single Nucleotide , Tumor Necrosis Factor-alpha/antagonists & inhibitors , Adult , Aged , Antibodies, Monoclonal/therapeutic use , Antirheumatic Agents/therapeutic use , Arthritis, Rheumatoid/genetics , Arthritis, Rheumatoid/pathology , Certolizumab Pegol/therapeutic use , Cohort Studies , Crowdsourcing , Female , Humans , Male , Middle Aged , Prognosis , Treatment Outcome , Tumor Necrosis Factor-alpha/immunology
...