ABSTRACT
Many diseases recur after recovery, for example, recurrences in cancer and infections. However, research is often focused on analysing only time-to-first recurrence, thereby ignoring any subsequent recurrences that may occur after the first. Statistical models for the analysis of recurrent events are available, of which the extended Cox proportional hazards frailty model is the current state-of-the-art. However, this model is too statistically complex for computationally efficient application in high-dimensional data sets, including genome-wide association studies (GWAS). Here, we develop an application for fast and accurate recurrent event analysis in GWAS, called SPARE (SaddlePoint Approximation for Recurrent Event analysis). In SPARE, every DNA variant is tested for association with recurrence risk using a modified score statistic. A saddlepoint approximation is implemented to achieve statistical accuracy. SPARE controls the Type I error, and its statistical power is similar to existing recurrent event models, yet SPARE is significantly faster. An application of SPARE in a recurrent event GWAS on bladder cancer for 6.2 million DNA variants in 1,443 individuals required less than 15 min, whereas existing recurrent event methods would require several weeks.
Subject(s)
Genome-Wide Association Study; Neoplasm Recurrence, Local; Humans; Models, Genetic; Models, Statistical; Proportional Hazards Models
ABSTRACT
INTRODUCTION: After a stroke, a wide range of deficits can occur with varying onset latencies. As a result, assessing impairment and recovery is an enormous challenge in neurorehabilitation. Although several clinical scales are generally accepted, they are time-consuming, show high inter-rater variability, have low ecological validity, and are vulnerable to biases introduced by compensatory movements and action modifications. Alternative methods need to be developed for efficient and objective assessment. In this study, we explore the potential of computer-based body tracking systems and classification tools to estimate the motor impairment of the more affected arm in stroke patients. METHODS: We present a method for estimating clinical scores from movement parameters that are extracted from kinematic data recorded during unsupervised computer-based rehabilitation sessions. We identify a number of kinematic descriptors that characterise the patients' hemiparesis (e.g., movement smoothness, work area), implement a double-noise model, and perform a multivariate regression using clinical data from 98 stroke patients who completed a total of 191 sessions with RGS. RESULTS: Our results reveal a new digital biomarker of arm function, the Total Goal-Directed Movement (TGDM), which relates to the patients' work area during the execution of goal-oriented reaching movements. In estimating FM-UE scores, the model reaches an accuracy of [Formula: see text]: 0.38 with an error ([Formula: see text]: 12.8). Next, we evaluate its reliability ([Formula: see text] for test-retest), longitudinal external validity ([Formula: see text] true positive rate), sensitivity, and generalisation to other tasks that involve planar reaching movements ([Formula: see text]: 0.39). The model also achieves comparable accuracy for the Chedoke Arm and Hand Activity Inventory ([Formula: see text]: 0.40) and the Barthel Index ([Formula: see text]: 0.35).
CONCLUSIONS: Our results highlight the clinical value of kinematic data collected during unsupervised goal-oriented motor training with the RGS combined with data science techniques, and provide new insight into factors underlying recovery and its biomarkers.
Subject(s)
Stroke Rehabilitation; Stroke; Biomechanical Phenomena; Goals; Humans; Recovery of Function; Reproducibility of Results; Stroke Rehabilitation/methods; Upper Extremity
ABSTRACT
Most previous studies of prostate cancer have not taken into account that men in the studied populations are also at risk of competing events, and that these men may differ in their susceptibility to prostate cancer. The aim of our study was to investigate heterogeneity in risk of prostate cancer, using a recently developed latent class regression method for competing risks. We further aimed to elucidate the association between Type 2 diabetes mellitus (T2DM) and prostate cancer risk, and to compare the results with conventional methods for survival analysis. We analysed the risk of prostate cancer in 126,482 men from the comparison cohort of the Prostate Cancer Data base Sweden (PCBaSe) 3.0. During a mean follow-up of 6 years, 6,036 men were diagnosed with prostate cancer and 22,393 men died. We detected heterogeneity in risk of prostate cancer with two distinct latent classes in the study population. The smaller class included 9% of the study population; men in this class had a higher risk of prostate cancer, and the risk was more strongly associated with class membership than with any of the covariates included in the study. Moreover, we found no association between T2DM and risk of prostate cancer after removal of the effect of informative censoring due to competing risks. The recently developed latent class method for competing risks could be used to provide new insights in precision medicine, with the aim of classifying individuals by their susceptibility to a particular disease, reaction to a risk factor or response to treatment.
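The effect of competing events described above can be illustrated with a small sketch. This is not the latent class regression method used in the study; it is the standard nonparametric cumulative incidence estimator (Aalen-Johansen), applied to made-up data, and all function and variable names are hypothetical. It contrasts the competing-risks estimate with the naive approach of treating deaths from other causes as censoring.

```python
from collections import Counter

def cumulative_incidence(times, events, cause=1):
    """Aalen-Johansen estimate of the cumulative incidence of `cause`.

    times  : follow-up time for each subject
    events : 0 = censored; 1, 2, ... = cause of the observed event
    Returns a list of (time, cumulative incidence) at each event time.
    """
    n_at_risk = len(times)
    by_time = {}
    for t, e in zip(times, events):
        by_time.setdefault(t, Counter())[e] += 1

    surv = 1.0  # probability of being free of ANY event just before t
    cif = 0.0
    out = []
    for t in sorted(by_time):
        counts = by_time[t]
        d_cause = counts.get(cause, 0)
        d_any = sum(c for e, c in counts.items() if e != 0)
        if d_any > 0:
            cif += surv * d_cause / n_at_risk
            surv *= 1.0 - d_any / n_at_risk
            out.append((t, cif))
        n_at_risk -= sum(counts.values())  # events and censorings leave
    return out

def naive_incidence(times, events, cause=1):
    """1 - Kaplan-Meier, with every other cause recoded as censoring."""
    recoded = [1 if e == cause else 0 for e in events]
    return cumulative_incidence(times, recoded, cause=1)

# Illustrative data only: 1 = diagnosis of interest, 2 = competing death.
times = [1, 2, 3, 4, 5, 6]
events = [1, 2, 1, 0, 2, 1]
aj = cumulative_incidence(times, events, cause=1)
naive = naive_incidence(times, events, cause=1)
```

On this toy data the final Aalen-Johansen estimate is 7/12 (about 0.58), whereas the naive estimate reaches 1.0, because subjects who died of the competing cause are treated as if they could still experience the event of interest.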
Subject(s)
Diabetes Mellitus, Type 2/complications; Prostatic Neoplasms/etiology; Aged; Aged, 80 and over; Cohort Studies; Humans; Male; Middle Aged; Research Design; Risk Factors; Survival Analysis; Sweden
ABSTRACT
The analysis of high-dimensional survival data is challenging, primarily owing to the problem of overfitting, which occurs when spurious relationships are inferred from data that subsequently fail to exist in test data. Here, we propose a novel method of extracting a low-dimensional representation of covariates in survival data by combining the popular Gaussian process latent variable model with a Weibull proportional hazards model. The combined model offers a flexible non-linear probabilistic method of detecting and extracting any intrinsic low-dimensional structure from high-dimensional data. By reducing the covariate dimension, we aim to diminish the risk of overfitting and increase the robustness and accuracy with which we infer relationships between covariates and survival outcomes. In addition, we can simultaneously combine information from multiple data sources by expressing multiple datasets in terms of the same low-dimensional space. We present results from several simulation studies that illustrate a reduction in overfitting and an increase in predictive performance, as well as successful detection of intrinsic dimensionality. We provide evidence that it is advantageous to combine dimensionality reduction with survival outcomes rather than performing unsupervised dimensionality reduction on its own. Finally, we use our model to analyse experimental gene expression data and detect and extract a low-dimensional representation that allows us to distinguish high-risk and low-risk groups with superior accuracy compared with doing regression on the original high-dimensional data.
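As a rough illustration of the survival component mentioned above, the following sketch evaluates the log-likelihood of a Weibull proportional hazards model with right-censoring. It covers only that component: the Gaussian process latent variable mapping is omitted, and the low-dimensional coordinates `z` are simply assumed to be given. The parametrisation (hazard `(shape/scale) * (t/scale)**(shape-1) * exp(beta·z)`) is one common choice and is an assumption here, as are all names.

```python
import math

def weibull_ph_loglik(times, events, covariates, beta, shape, scale):
    """Log-likelihood of a Weibull proportional hazards model.

    times      : observed times (event or censoring), all > 0
    events     : 1 if the event was observed, 0 if right-censored
    covariates : one list of (assumed, low-dimensional) coordinates per subject
    beta       : regression coefficients, same length as each covariate list
    """
    total = 0.0
    for t, d, z in zip(times, events, covariates):
        lin = sum(b * zj for b, zj in zip(beta, z))
        # Cumulative hazard H(t | z) = (t / scale)^shape * exp(beta . z)
        cum_hazard = (t / scale) ** shape * math.exp(lin)
        if d:
            # log h(t | z) contributes only for observed events
            total += (math.log(shape) - math.log(scale)
                      + (shape - 1.0) * math.log(t / scale) + lin)
        total -= cum_hazard
    return total

# Illustrative call with made-up values.
ll = weibull_ph_loglik(
    times=[1.0, 2.0, 3.0],
    events=[1, 0, 1],
    covariates=[[1.0], [1.0], [1.0]],
    beta=[math.log(2.0)],
    shape=1.0,
    scale=1.0,
)
```

With `shape = 1` the model reduces to an exponential model, which gives a simple sanity check: here every subject has hazard 2, so the log-likelihood is `2*log(2) - 12`.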
Subject(s)
Models, Statistical; Survival Analysis; Biostatistics; Breast Neoplasms/genetics; Breast Neoplasms/mortality; Computer Simulation; Data Interpretation, Statistical; Female; Gene Expression Profiling/statistics & numerical data; Humans; Machine Learning; Multivariate Analysis; Nonlinear Dynamics; Normal Distribution; Oligonucleotide Array Sequence Analysis/statistics & numerical data; Proportional Hazards Models
ABSTRACT
Breast cancer heterogeneity demands that prognostic models be biologically driven, and recent clinical evidence indicates that future prognostic signatures need evaluation in the context of early compared with late metastatic risk prediction. In pre-clinical studies, we and others have shown that various protein-protein interactions, pertaining to the actin microfilament-associated proteins ezrin and cofilin, mediate breast cancer cell migration, a prerequisite for cancer metastasis. Moreover, as a direct substrate for protein kinase Cα, ezrin has been shown to be a determinant of cancer metastasis for a variety of tumour types besides breast cancer, and has been described as a pivotal regulator of metastasis by linking the plasma membrane to the actin cytoskeleton. In the present article, we demonstrate that our tissue imaging-derived parameters that pertain to or are a consequence of the PKC-ezrin interaction can be used for breast cancer prognostication, with inter-cohort reproducibility. The application of fluorescence lifetime imaging microscopy (FLIM) in formalin-fixed paraffin-embedded patient samples to probe protein proximity within the typically <10 nm range, and thereby address the oncological challenge of tumour heterogeneity, is discussed.
Subject(s)
Breast Neoplasms/pathology; Protein Kinase C-alpha/metabolism; Actin Depolymerizing Factors/metabolism; Breast Neoplasms/enzymology; Breast Neoplasms/metabolism; Cytoskeletal Proteins/metabolism; Female; Fluorescence Resonance Energy Transfer; Humans; Neoplasm Metastasis; Phosphorylation; Subcellular Fractions/metabolism; Substrate Specificity; Treatment Outcome
ABSTRACT
BACKGROUND: Patients with muscle-invasive bladder cancer (MIBC) constitute a heterogeneous group in terms of patient and tumour characteristics ('case-mix') and prognosis. The aim of the current study was to investigate whether differences in survival could be used to separate MIBC patients into separate classes using a recently developed latent class regression method for survival analysis with competing risks. METHODS: We selected all participants diagnosed with MIBC in the Bladder Cancer Data Base Sweden (BladderBase) and analysed inter-patient heterogeneity in risk of death from bladder cancer and other causes. RESULTS: Using data from 9653 MIBC patients, we detected heterogeneity with six distinct latent classes in the studied population. The largest and most frail class included 50% of the study population and was characterised by a somewhat larger proportion of women, higher age at diagnosis, more advanced disease and lower probability of curative treatment. Despite this, patients in this class treated with curative intent by radical cystectomy or radiotherapy had a lower association with risk of death. The second largest class included 23% and was substantially less frail than the largest class. The third and fourth classes each included around 9%-10%, whereas the fifth and sixth classes each included 3%-4% of the population. CONCLUSIONS: Results from the current study are compatible with previous research, and the method can be used to adjust comparisons in prognosis between MIBC populations for influential differences in the distribution of sub-classes.
Subject(s)
Urinary Bladder Neoplasms; Humans; Female; Sweden/epidemiology; Neoplasm Invasiveness; Urinary Bladder Neoplasms/epidemiology; Urinary Bladder Neoplasms/therapy; Urinary Bladder Neoplasms/pathology; Prognosis; Cystectomy; Muscles/pathology
ABSTRACT
There is a growing interest in real world evidence when developing antineoplastic drugs owing to the shorter length of time and low costs compared to randomised controlled trials. External validity of studies in the regulatory phase can be enhanced by complementing randomised controlled trials with real world evidence. Furthermore, the use of real world evidence ensures the inclusion of patients often excluded from randomised controlled trials such as the elderly, certain ethnicities or those from certain geographical areas. This review explores approaches in which real world data may be integrated with randomised controlled trials. One approach is by using big data, especially when investigating drugs in the antineoplastic setting. This can even inform artificial intelligence thus ensuring faster and more precise diagnosis and treatment decisions. Pragmatic trials also offer an approach to examine the effectiveness of novel antineoplastic drugs without evading the benefits of randomised controlled trials. A well-designed pragmatic trial would yield results with high external validity by employing a simple study design with a large sample size and diverse settings. Although randomised controlled trials can determine efficacy of antineoplastic drugs, effectiveness in the real world may differ. The need for pragmatic trials to help guide healthcare decision-making led to the development of trials within cohorts (TWICs). TWICs make use of cohorts to conduct multiple randomised controlled trials while maintaining characteristics of real world data in routine clinical practice. Although real world data is often affected by incomplete data and biases such as selection and unmeasured biases, the use of big data and pragmatic approaches can improve the use of real world data in the development of antineoplastic drugs that can in turn steer decision-making in clinical practice.
ABSTRACT
Background: Advanced head and neck squamous cell carcinoma (HNSCC) is associated with a poor prognosis, and biomarkers that predict response to treatment are highly desirable. The primary aim was to predict progression-free survival (PFS) with a multivariate risk prediction model. Methods: Experimental covariates were derived from blood samples of 56 HNSCC patients which were prospectively obtained within a Phase 2 clinical trial (NCT02633800) at baseline and after the first treatment cycle of combined platinum-based chemotherapy with cetuximab treatment. Clinical and experimental covariates were selected by Bayesian multivariate regression to form risk scores to predict PFS. Results: A 'baseline' and a 'combined' risk prediction model were generated, each featuring clinical and experimental covariates. The baseline risk signature has three covariates and was strongly driven by baseline percentage of CD33+CD14+HLADRhigh monocytes. The combined signature has six covariates, also featuring baseline CD33+CD14+HLADRhigh monocytes, but is strongly driven by on-treatment relative change of CD8+ central memory T cell percentages. The combined model has a higher predictive power than the baseline model and was successfully validated to predict therapeutic response in an independent cohort of nine patients from an additional Phase 2 trial (NCT03494322) assessing the addition of avelumab to cetuximab treatment in HNSCC. We identified tissue counterparts for the immune cells driving the models, using imaging mass cytometry, that specifically colocalized at the tissue level and correlated with outcome. Conclusions: This immune-based combined multimodality signature, obtained through longitudinal peripheral blood monitoring and validated in an independent cohort, presents a novel means of predicting response early on during the treatment course.
Funding: Daiichi Sankyo Inc, Cancer Research UK, EU IMI2 IMMUCAN, UK Medical Research Council, European Research Council (335326), Merck Serono, Cancer Research Institute, National Institute for Health Research, Guy's and St Thomas' NHS Foundation Trust and The Institute of Cancer Research. Clinical trial number: NCT02633800.
Subject(s)
Head and Neck Neoplasms; Humans; Squamous Cell Carcinoma of Head and Neck/drug therapy; Cetuximab/therapeutic use; Progression-Free Survival; Bayes Theorem; Head and Neck Neoplasms/drug therapy
ABSTRACT
Herein we discuss how FRET imaging can contribute at various stages to delineate the function of the proteome. Therefore, we briefly describe FRET imaging techniques, the selection of suitable FRET pairs and potential caveats. Furthermore, we discuss state-of-the-art FRET-based screening approaches (underpinned by protein interaction network analysis using computational biology) and preclinical intravital FRET-imaging techniques that can be used for functional validation of candidate hits (nodes and edges) from the network screen, as well as measurement of the efficacy of perturbing these nodes/edges by short hairpin RNA (shRNA) and/or small molecule-based approaches.
Subject(s)
Fluorescence Resonance Energy Transfer/methods; Neoplasms/metabolism; Protein Interaction Mapping; Proteins/chemistry; Computational Biology; Fluorescent Dyes/chemistry; Humans; Protein Interaction Domains and Motifs; Proteins/metabolism
ABSTRACT
It is clear that conventional statistical inference protocols need to be revised to deal correctly with the high-dimensional data that are now common. Most recent studies aimed at achieving this revision rely on powerful approximation techniques that call for rigorous results against which they can be tested. In this context, the simplest case of high-dimensional linear regression has acquired significant new relevance and attention. In this paper we use the statistical physics perspective on inference to derive several exact results for linear regression in the high-dimensional regime.
ABSTRACT
Given the immune system's importance for cancer surveillance and treatment, we have investigated how it may be affected by SARS-CoV-2 infection of cancer patients. Across some heterogeneity in tumor type, stage, and treatment, virus-exposed solid cancer patients display a dominant impact of SARS-CoV-2, apparent from the resemblance of their immune signatures to those for COVID-19+ non-cancer patients. This is not the case for hematological malignancies, with virus-exposed patients collectively displaying heterogeneous humoral responses, an exhausted T cell phenotype and a high prevalence of prolonged virus shedding. Furthermore, while recovered solid cancer patients' immunophenotypes resemble those of non-virus-exposed cancer patients, recovered hematological cancer patients display distinct, lingering immunological legacies. Thus, while solid cancer patients, including those with advanced disease, seem no more at risk of SARS-CoV-2-associated immune dysregulation than the general population, hematological cancer patients show complex immunological consequences of SARS-CoV-2 exposure that might usefully inform their care.
Subject(s)
COVID-19/immunology; Neoplasms/immunology; Neoplasms/virology; Severe Acute Respiratory Syndrome/immunology; Adult; Aged; Aged, 80 and over; COVID-19/etiology; COVID-19/mortality; Female; Hematologic Neoplasms/immunology; Hematologic Neoplasms/mortality; Hematologic Neoplasms/therapy; Hematologic Neoplasms/virology; Humans; Immunophenotyping; Male; Middle Aged; Nasopharynx/virology; Neoplasms/mortality; Neoplasms/therapy; Severe Acute Respiratory Syndrome/etiology; Severe Acute Respiratory Syndrome/mortality; Severe Acute Respiratory Syndrome/virology; T-Lymphocytes/virology; Virus Shedding; Young Adult
ABSTRACT
BACKGROUND: The phase III MRC COIN trial showed no statistically significant benefit from adding the EGFR-targeted agent cetuximab to oxaliplatin-based chemotherapy in first-line treatment of advanced colorectal cancer. This study exploits additional information on HER2-HER3 dimerization to achieve patient stratification and reveal previously hidden subgroups of patients who had differing disease progression and treatment response. METHODS: HER2-HER3 dimerization was quantified by fluorescence lifetime imaging microscopy in primary tumor samples from 550 COIN trial patients receiving oxaliplatin and fluoropyrimidine chemotherapy with or without cetuximab. Bayesian latent class analysis and covariate reduction were performed to analyze the effects of HER2-HER3 dimer, RAS mutation, and cetuximab on progression-free survival and overall survival (OS). All statistical tests were two-sided. RESULTS: Latent class analysis on a cohort of 398 patients revealed two patient subclasses with differing prognoses (median OS = 1624 days [95% confidence interval [CI] = 1466 to 1816 days] vs 461 days [95% CI = 431 to 504 days]): Class 1 (15.6%) showed a benefit from cetuximab in OS (hazard ratio = 0.43, 95% CI = 0.25 to 0.76, P = .004). Class 2 showed an association of increased HER2-HER3 with better OS (hazard ratio = 0.64, 95% CI = 0.44 to 0.94, P = .02). A class prediction signature was formed and tested on an independent validation cohort (n = 152), validating the prognostic utility of the dimer assay. Similar subclasses were also discovered in the full trial dataset (n = 1630) based on 10 baseline clinicopathological and genetic covariates. CONCLUSIONS: Our work suggests that the combined use of HER dimer imaging and conventional mutation analyses will be able to identify a small subclass of patients (>10%) who will have better prognosis following chemotherapy. A larger prospective cohort will be required to confirm its utility in predicting the outcome of anti-EGFR treatment.
Subject(s)
Adenocarcinoma/diagnosis; Colorectal Neoplasms/diagnosis; Fluorescence Resonance Energy Transfer; Receptor, ErbB-2/metabolism; Receptor, ErbB-3/metabolism; Adenocarcinoma/metabolism; Adenocarcinoma/mortality; Adenocarcinoma/therapy; Aged; Antineoplastic Combined Chemotherapy Protocols/therapeutic use; Bayes Theorem; Capecitabine/therapeutic use; Cohort Studies; Colorectal Neoplasms/metabolism; Colorectal Neoplasms/mortality; Colorectal Neoplasms/therapy; Female; Humans; Latent Class Analysis; Male; Microscopy/methods; Middle Aged; Oxaloacetates/therapeutic use; Prognosis; Protein Multimerization; Randomized Controlled Trials as Topic/statistics & numerical data; Receptor, ErbB-2/analysis; Receptor, ErbB-3/analysis; Tissue Array Analysis; Treatment Outcome
ABSTRACT
The entropy of a hierarchical network topology in an ensemble of sparse random networks, with "hidden variables" associated with its nodes, is the log-likelihood that a given network topology is present in the chosen ensemble. We obtain a general formula for this entropy, which has a clear interpretation in some simple limiting cases. The results provide keys with which to solve the general problem of "fitting" a given network with an appropriate ensemble of random networks.
ABSTRACT
Protein-protein interaction networks (PPINs) have been employed to identify potential novel interconnections between proteins as well as crucial cellular functions. In this study we identify fundamental principles of PPIN topologies by analysing network motifs of short loops, which are small cyclic interactions of between 3 and 6 proteins. We compared 30 PPINs with corresponding randomised null models and examined the occurrence of common biological functions in loops extracted from a cross-validated high-confidence dataset of 622 human protein complexes. We demonstrate that loops are an intrinsic feature of PPINs and that specific cell functions are predominantly performed by loops of different lengths. Topologically, we find that loops are strongly related to the accuracy of PPINs and define a core of interactions with high resilience. The identification of this core and the analysis of loop composition are promising tools to assess PPIN quality and to uncover possible biases from experimental detection methods. More than 96% of loops share at least one biological function, with enrichment of cellular functions related to mRNA metabolic processing and the cell cycle. Our analyses suggest that these motifs can be used in the design of targeted experiments for functional phenotype detection.
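To make the notion of short loops concrete, the sketch below counts simple cycles of 3 to 6 nodes in a small undirected interaction network. It is a minimal brute-force illustration, not the pipeline used in the study; the function name and the toy edge list are hypothetical, and the approach (depth-first search from the smallest node of each cycle) is only practical for small graphs.

```python
def count_short_loops(edges, max_len=6):
    """Count simple cycles of length 3..max_len in an undirected graph.

    edges : iterable of (u, v) pairs; node labels must be mutually
            comparable (e.g. all integers or all strings).
    Returns a dict mapping loop length -> number of distinct loops.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # self-interactions cannot be part of a simple loop
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    counts = {k: 0 for k in range(3, max_len + 1)}

    def extend(start, node, visited, length):
        # `length` is the number of nodes on the current path.
        for nxt in adj[node]:
            if nxt == start and length >= 3:
                counts[length] += 1  # path closes into a loop
            elif nxt > start and nxt not in visited and length < max_len:
                # Only visit nodes larger than `start`, so each loop is
                # explored from its smallest node alone.
                visited.add(nxt)
                extend(start, nxt, visited, length + 1)
                visited.remove(nxt)

    for start in adj:
        extend(start, start, {start}, 1)

    # Each loop is traversed once in each direction.
    return {k: c // 2 for k, c in counts.items()}

# Toy network: a square 1-2-3-4 with one diagonal 1-3.
toy_edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
loops = count_short_loops(toy_edges)
```

For the toy network, `loops` contains two loops of length 3 (1-2-3 and 1-3-4) and one loop of length 4 (1-2-3-4). Comparing such counts with those from degree-preserving randomised networks is one way to judge whether loops are over-represented, which is the kind of null-model comparison the abstract describes.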
Subject(s)
Protein Interaction Mapping; Proteins/metabolism; Algorithms; Humans; Protein Interaction Maps; Proteins/chemistry
ABSTRACT
We apply our recently developed information-theoretic measures for the characterisation and comparison of protein-protein interaction networks. These measures are used to quantify topological network features via macroscopic statistical properties. Network differences are assessed based on these macroscopic properties as opposed to microscopic overlap, homology information or motif occurrences. We present the results of a large-scale analysis of protein-protein interaction networks. Precise null models are used in our analyses, allowing for reliable interpretation of the results. By quantifying the methodological biases of the experimental data, we can define an information threshold above which networks may be deemed to comprise consistent macroscopic topological properties, despite their small microscopic overlaps. Based on this rationale, data from yeast-two-hybrid methods are sufficiently consistent to allow for intra-species comparisons (between different experiments) and inter-species comparisons, while data from affinity-purification mass-spectrometry methods show large differences even within intra-species comparisons.