ABSTRACT
Diagnostic tests play an important role in medical research and clinical practice. The ultimate goal of a diagnostic test is to distinguish between diseased and nondiseased individuals, and before a test is routinely used in practice, its ability to discriminate between these two states must be thoroughly assessed. The overlap coefficient, defined as the proportion of overlap area between two probability density functions, has gained popularity as a summary measure of diagnostic accuracy. We propose two Bayesian nonparametric estimators of the overlap coefficient, both based on Dirichlet process mixtures. We further introduce the covariate-specific overlap coefficient and develop a Bayesian nonparametric approach, based on Dirichlet process mixtures of additive normal models, for estimating it. A simulation study is conducted to assess the empirical performance of the proposed estimators. Two illustrations are provided: one concerned with the search for biomarkers of ovarian cancer and another aimed at assessing the age-specific accuracy of glucose as a biomarker of diabetes.
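To make the definition of the overlap coefficient concrete, the following is a minimal sketch that integrates the pointwise minimum of two densities on a grid. It assumes both densities are known and normal, which is purely illustrative; it is not the Bayesian nonparametric estimator proposed in the abstract, and the function names `normal_pdf` and `overlap_coefficient` are hypothetical.

```python
import math
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of a normal distribution evaluated at x (illustrative choice)."""
    z = (x - mean) / sd
    return np.exp(-0.5 * z ** 2) / (sd * math.sqrt(2.0 * math.pi))


def overlap_coefficient(pdf0, pdf1, lower, upper, n_grid=20001):
    """Approximate the area under min(pdf0, pdf1) on [lower, upper].

    The integration limits are assumed to be wide enough that both
    densities are negligible outside them.
    """
    grid = np.linspace(lower, upper, n_grid)
    pointwise_min = np.minimum(pdf0(grid), pdf1(grid))
    dx = grid[1] - grid[0]
    # Trapezoidal rule written out explicitly.
    area = dx * (pointwise_min.sum() - 0.5 * (pointwise_min[0] + pointwise_min[-1]))
    return float(area)


# Hypothetical example: nondiseased ~ N(0, 1), diseased ~ N(1, 1).
ovl = overlap_coefficient(
    lambda x: normal_pdf(x, 0.0, 1.0),
    lambda x: normal_pdf(x, 1.0, 1.0),
    lower=-10.0,
    upper=11.0,
)
```

A value near 1 means the two densities almost coincide (poor discrimination), while a value near 0 means they barely overlap (good discrimination).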
Subjects
Statistical Models, Ovarian Neoplasms, Bayes Theorem, Biomarkers, Computer Simulation, Female, Humans, Ovarian Neoplasms/diagnosis, Nonparametric Statistics
ABSTRACT
The extraordinary advancements in neuroscientific technology for brain recordings over the last decades have led to increasingly complex spatiotemporal data sets. To avoid oversimplification, new models have been developed to identify meaningful patterns and yield new insights within a highly demanding data environment. To this end, we propose a new model called parameter clustering functional principal component analysis (PCl-fPCA) that merges ideas from functional data analysis and Bayesian nonparametrics to obtain a flexible and computationally feasible signal reconstruction and exploration of spatiotemporal neuroscientific data. In particular, we use a Dirichlet process Gaussian mixture model to cluster functional principal component scores within the standard Bayesian functional PCA framework. This approach captures the spatial dependence structure among smoothed time series (curves) and its interaction with the time domain without imposing a prior spatial structure on the data. Moreover, by moving the mixture from the data to the functional principal component scores, we obtain a more general clustering procedure, thus allowing a more detailed understanding of the data. We present results from a simulation study showing improvements in curve and correlation reconstruction compared with different Bayesian and frequentist fPCA models, and we apply our method to functional magnetic resonance imaging and electroencephalogram data, providing a rich exploration of the spatiotemporal dependence in brain time series.
Subjects
Magnetic Resonance Imaging, Bayes Theorem, Cluster Analysis, Computer Simulation, Humans, Principal Component Analysis
ABSTRACT
Diagnostic tests are of critical importance in health care and medical research. Motivated by the impact that atypical and outlying test outcomes might have on the assessment of the discriminatory ability of a diagnostic test, we develop a robust and flexible model for conducting inference about the covariate-specific receiver operating characteristic (ROC) curve that safeguards against outlying test results while also accommodating possible nonlinear effects of the covariates. Specifically, we postulate a location-scale regression model for the test outcomes in both the diseased and nondiseased populations, combining additive regression B-splines and M-estimation for the regression function, while the distribution of the error term is estimated via a weighted empirical distribution function of the standardized residuals. The results of the simulation study show that our approach successfully recovers the true covariate-specific area under the ROC curve under a variety of conceivable contamination scenarios for the test outcomes. Our method is applied to a dataset derived from a prostate cancer study, where we seek to assess the ability of the Prostate Health Index to discriminate between men with and without Gleason 7 or above prostate cancer, and whether and how this discriminatory capacity changes with age.
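As a rough illustration of how a location-scale model induces a covariate-specific area under the ROC curve, the sketch below rebuilds test outcomes at a fixed covariate value from standardized residuals and counts how often a diseased outcome exceeds a nondiseased one. It assumes the fitted locations, scales, and residuals are already available, and it uses an unweighted empirical comparison rather than the weighted empirical distribution function and robust B-spline fit described in the abstract; the function name and all inputs are hypothetical.

```python
import numpy as np


def covariate_specific_auc(mu_d, sigma_d, resid_d, mu_nd, sigma_nd, resid_nd):
    """Unweighted plug-in estimate of P(Y_D > Y_ND | x) at one covariate value.

    mu_* and sigma_* are the (assumed already fitted) location and scale at
    the covariate value of interest; resid_* are standardized residuals.
    """
    y_d = mu_d + sigma_d * np.asarray(resid_d, dtype=float)
    y_nd = mu_nd + sigma_nd * np.asarray(resid_nd, dtype=float)
    # Compare every diseased outcome with every nondiseased outcome.
    diff = y_d[:, None] - y_nd[None, :]
    # Ties contribute one half, as in the Mann-Whitney statistic.
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))


# Hypothetical residuals and fitted values at a single covariate value.
residuals = [-1.0, 0.0, 1.0]
auc_separated = covariate_specific_auc(10.0, 1.0, residuals, 0.0, 1.0, residuals)
auc_identical = covariate_specific_auc(0.0, 1.0, residuals, 0.0, 1.0, residuals)
```

Repeating the calculation over a grid of covariate values (for example, ages) would trace out how the discriminatory capacity changes with the covariate.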
Subjects
Routine Diagnostic Tests, Prostatic Neoplasms, Area Under Curve, Computer Simulation, Humans, Male, Prostatic Neoplasms/diagnosis, ROC Curve
ABSTRACT
BACKGROUND: Studies of agreement examine the distance between readings made by different devices or observers measuring the same quantity. If the values generated by each device are close together most of the time then we conclude that the devices agree. Several different agreement methods have been described in the literature, in the linear mixed modelling framework, for use when there are time-matched repeated measurements within subjects. METHODS: We provide a tutorial to help guide practitioners when choosing among different methods of assessing agreement based on a linear mixed model assumption. We illustrate the use of five methods in a head-to-head comparison using real data from a study involving Chronic Obstructive Pulmonary Disease (COPD) patients and matched repeated respiratory rate observations. The methods used were the concordance correlation coefficient, limits of agreement, total deviation index, coverage probability, and coefficient of individual agreement. RESULTS: The five methods generated similar conclusions about the agreement between devices in the COPD example; however, some methods emphasized different aspects of the between-device comparison, and the interpretation was clearer for some methods compared to others. CONCLUSIONS: Five different methods used to assess agreement have been compared in the same setting to facilitate understanding and encourage the use of multiple agreement methods in practice. Although there are similarities between the methods, each method has its own strengths and weaknesses which are important for researchers to be aware of. We suggest that researchers consider using the coverage probability method alongside a graphical display of the raw data in method comparison studies. In the case of disagreement between devices, it is important to look beyond the overall summary agreement indices and consider the underlying causes. Summarising the data graphically and examining model parameters can both help with this.
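Two of the agreement indices named above can be sketched in a few lines for the simplest case of one paired reading per subject. This is a deliberate simplification: the tutorial described in the abstract works within a linear mixed model for time-matched repeated measurements, which this sketch does not attempt. The function names and the toy readings are hypothetical.

```python
import numpy as np


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for paired readings."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    covariance = np.mean((x - mean_x) * (y - mean_y))
    # Population (ddof=0) variances, matching the covariance above.
    return float(2.0 * covariance / (x.var() + y.var() + (mean_x - mean_y) ** 2))


def limits_of_agreement(x, y, z=1.96):
    """Bland-Altman style limits: mean difference +/- z * SD of differences."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    mean_diff = d.mean()
    sd_diff = d.std(ddof=1)
    return float(mean_diff - z * sd_diff), float(mean_diff + z * sd_diff)


# Toy readings from two hypothetical devices; device B reads one unit higher.
device_a = [1.0, 2.0, 3.0, 4.0]
device_b = [2.0, 3.0, 4.0, 5.0]
ccc = concordance_correlation(device_a, device_b)
lower, upper = limits_of_agreement(device_a, device_b)
```

In this toy case the two devices are perfectly correlated but systematically offset, so the concordance correlation coefficient falls below 1 while the limits of agreement collapse onto the constant bias; this illustrates how different indices emphasize different aspects of the comparison.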
Subjects
Chronic Obstructive Pulmonary Disease, Humans, Linear Models, Chronic Obstructive Pulmonary Disease/diagnosis, Reproducibility of Results, Research Design
ABSTRACT
The receiver operating characteristic (ROC) curve is the most widely used measure for evaluating the discriminatory performance of a continuous marker. Often, covariate information is also available, and several regression methods have been proposed to incorporate it in the ROC framework. Until now, these methods have been developed only for the case where the covariate is univariate or multivariate. We extend ROC regression methodology to the case where the covariate is functional. To this end, semiparametric- and nonparametric-induced ROC regression estimators are proposed. A simulation study is performed to assess the performance of the proposed estimators. The methods are motivated by, and applied to, a metabolic syndrome study in Galicia (NW Spain).
Subjects
Biomarkers/analysis, Statistical Models, ROC Curve, Computer Simulation, Humans, Hypoxia/blood, Metabolic Syndrome/enzymology, Spain, gamma-Glutamyltransferase/blood
ABSTRACT
We describe a nonparametric Bayesian approach for estimating the three-way ROC surface based on mixtures of finite Polya trees (MFPT) priors. Mixtures of finite Polya trees are robust models that can handle nonstandard features in the data. We address the difficulties of modeling continuous diagnostic data that exhibit skewness, multimodality, or other nonstandard features, and discuss how parametric approaches can lead to misleading results in such cases. Robust, data-driven inference for the ROC surface and for the volume under the ROC surface is obtained. A simulation study is performed to assess the performance of the proposed method. The methods are applied to data from a magnetic resonance spectroscopy study of human immunodeficiency virus patients.
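For readers unfamiliar with the volume under the ROC surface, a minimal sketch of its simplest empirical estimate is given below: the proportion of triples, one outcome from each of three ordered groups, that appear in the expected order. This is the plain frequency-based estimate, not the mixtures of finite Polya trees approach described in the abstract; the function name and the toy values are hypothetical, and ties are simply counted as incorrectly ordered.

```python
import numpy as np


def empirical_vus(group1, group2, group3):
    """Proportion of triples (a, b, c) with a < b < c across three groups."""
    g1 = np.asarray(group1, dtype=float)[:, None, None]
    g2 = np.asarray(group2, dtype=float)[None, :, None]
    g3 = np.asarray(group3, dtype=float)[None, None, :]
    # Broadcasting builds every (a, b, c) combination at once.
    correctly_ordered = (g1 < g2) & (g2 < g3)
    return float(correctly_ordered.mean())


# Toy outcomes for three hypothetical, perfectly separated groups.
vus_separated = empirical_vus([1.0, 2.0], [3.0, 4.0], [5.0, 6.0])
# Toy outcomes where all three groups share the same values.
vus_identical = empirical_vus([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

For continuous outcomes with no discriminatory ability, the expected volume is 1/6, since only one of the six possible orderings of a triple is correct; a value of 1 corresponds to perfect three-way separation.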