Results 1 - 20 of 101
1.
Clin Exp Dermatol ; 47(9): 1658-1665, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35426450

ABSTRACT

BACKGROUND: Previous studies of second opinions in the diagnosis of melanocytic skin lesions have examined blinded second opinions, which do not reflect usual clinical practice. The current study, conducted in the USA, investigated both blinded and nonblinded second opinions for their impact on diagnostic accuracy. METHODS: In total, 100 melanocytic skin biopsy cases, ranging from benign to invasive melanoma, were interpreted by 74 dermatopathologists. Subsequently, 151 dermatopathologists performed nonblinded second and third reviews. We compared the accuracy of single reviewers, second opinions obtained from independent, blinded reviewers and second opinions obtained from sequential, nonblinded reviewers. Accuracy was defined with respect to a consensus reference diagnosis. RESULTS: The mean case-level diagnostic accuracy of single reviewers was 65.3% (95% CI 63.4-67.2%). Second opinions arising from sequential, nonblinded reviewers significantly improved accuracy to 69.9% (95% CI 68.0-71.7%; P < 0.001). Similarly, second opinions arising from blinded reviewers improved upon the accuracy of single reviewers (69.2%; 95% CI 68.0-71.7%). Nonblinded reviewers were more likely than blinded reviewers to give diagnoses in the same diagnostic classes as the first diagnosis. Nonblinded reviewers tended to be more confident when they agreed with previous reviewers, even with inaccurate diagnoses. CONCLUSION: We found that both blinded and nonblinded second reviewers offered a similar modest improvement in diagnostic accuracy compared with single reviewers. Obtaining second opinions with knowledge of previous reviews tends to generate agreement among reviews, and may generate unwarranted confidence in an inaccurate diagnosis. Combining aspects of both blinded and nonblinded review in practice may leverage the advantages while mitigating the disadvantages of each approach. 
Specifically, a second pathologist could give an initial diagnosis blinded to the results of the first pathologist, with subsequent nonblinded discussion between the two pathologists if their diagnoses differ.


Subjects
Melanoma, Skin Neoplasms, Humans, Melanocytes/pathology, Melanoma/diagnosis, Melanoma/pathology, Pathologists, Referral and Consultation, Skin Neoplasms/diagnosis, Skin Neoplasms/pathology
2.
Biom J ; 63(6): 1223-1240, 2021 Aug.
Article in English | MEDLINE | ID: mdl-33871887

ABSTRACT

Biomarkers abound in many areas of clinical research, and often investigators are interested in combining them for diagnosis, prognosis, or screening. In many applications, the true positive rate (TPR) for a biomarker combination at a prespecified, clinically acceptable false positive rate (FPR) is the most relevant measure of predictive capacity. We propose a distribution-free method for constructing biomarker combinations by maximizing the TPR while constraining the FPR. Theoretical results demonstrate desirable properties of biomarker combinations produced by the new method. In simulations, the biomarker combination provided by our method demonstrated improved operating characteristics in a variety of scenarios when compared with alternative methods for constructing biomarker combinations. Thus, use of our method could lead to the development of better biomarker combinations, increasing the likelihood of clinical adoption.
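
The target quantity in this abstract, the TPR at a prespecified FPR, can be illustrated with a minimal sketch. This is not the authors' distribution-free method: it is a brute-force search over directions for two biomarkers, and the function names `tpr_at_fpr` and `best_direction` and all data are hypothetical.

```python
import math

def tpr_at_fpr(case_scores, control_scores, max_fpr):
    """Empirical TPR at the lowest threshold whose empirical FPR is <= max_fpr."""
    controls = sorted(control_scores, reverse=True)
    # Number of controls allowed to score strictly above the threshold.
    allowed = math.floor(max_fpr * len(controls))
    # Threshold is the (allowed + 1)-th largest control score; "positive" means strictly above it.
    threshold = controls[allowed] if allowed < len(controls) else -math.inf
    return sum(s > threshold for s in case_scores) / len(case_scores)

def best_direction(cases, controls, max_fpr, n_angles=180):
    """Grid search over unit vectors w for the linear combination w[0]*x1 + w[1]*x2.

    Assumption: exactly two biomarkers per subject. A grid search is used only
    for illustration and does not scale to many biomarkers.
    """
    best_tpr, best_w = -1.0, None
    for k in range(n_angles):
        theta = 2 * math.pi * k / n_angles
        w = (math.cos(theta), math.sin(theta))
        case_scores = [w[0] * x[0] + w[1] * x[1] for x in cases]
        control_scores = [w[0] * x[0] + w[1] * x[1] for x in controls]
        tpr = tpr_at_fpr(case_scores, control_scores, max_fpr)
        if tpr > best_tpr:
            best_tpr, best_w = tpr, w
    return best_tpr, best_w

# Hypothetical scores: one of five controls may exceed the threshold at FPR <= 0.2.
print(tpr_at_fpr([1, 2, 3, 4], [0, 0.5, 1.5, 2.5, -1], 0.2))  # 0.75
```

The threshold is taken from the control scores alone, so the FPR constraint holds by construction in the sample; whether it holds in new data is a separate question that the sketch does not address.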


Subjects
Mass Screening, Biomarkers, False Positive Reactions, Probability, Prognosis
3.
Biometrics ; 76(3): 843-852, 2020 Sep.
Article in English | MEDLINE | ID: mdl-31732971

ABSTRACT

Referral strategies based on risk scores and medical tests are commonly proposed. Direct assessment of their clinical utility requires implementing the strategy and is not possible in the early phases of biomarker research. Prior to late-phase studies, net benefit measures can be used to assess the potential clinical impact of a proposed strategy. Validation studies, in which the biomarker defines a prespecified referral strategy, are a gold standard approach to evaluating biomarker potential. Uncertainty, quantified by a confidence interval, is important to consider when deciding whether a biomarker warrants an impact study, does not demonstrate clinical potential, or that more data are needed. We establish distribution theory for empirical estimators of net benefit and propose empirical estimators of variance. The primary results are for the most commonly employed estimators of net benefit: from cohort and unmatched case-control samples, and for point estimates and net benefit curves. Novel estimators of net benefit under stratified two-phase and categorically matched case-control sampling are proposed and distribution theory developed. Results for common variants of net benefit and for estimation from right-censored outcomes are also presented. We motivate and demonstrate the methodology with examples from lung cancer research and highlight its application to study design.
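
For orientation, the empirical net benefit point estimates mentioned above can be sketched as follows, using the standard decision-curve definition of net benefit at a risk threshold. The abstract does not state its formulas, so this definition, the function names, and the data are assumptions, and no variance estimator is attempted here.

```python
def net_benefit_cohort(risks, outcomes, threshold):
    """Net benefit from a cohort sample: TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(outcomes)
    # A subject is referred when the predicted risk is at or above the threshold.
    tp = sum(1 for r, d in zip(risks, outcomes) if r >= threshold and d == 1)
    fp = sum(1 for r, d in zip(risks, outcomes) if r >= threshold and d == 0)
    return tp / n - (threshold / (1 - threshold)) * fp / n

def net_benefit_case_control(case_risks, control_risks, threshold, prevalence):
    """Net benefit from an unmatched case-control sample.

    Assumption: prevalence is supplied from an external source, because it
    cannot be estimated from case-control data alone.
    """
    tpr = sum(r >= threshold for r in case_risks) / len(case_risks)
    fpr = sum(r >= threshold for r in control_risks) / len(control_risks)
    return prevalence * tpr - (1 - prevalence) * fpr * threshold / (1 - threshold)

# Hypothetical cohort of five subjects, two of them cases.
risks = [0.9, 0.8, 0.3, 0.6, 0.1]
outcomes = [1, 1, 0, 0, 0]
print(net_benefit_cohort(risks, outcomes, 0.5))  # approximately 0.2
```

With the prevalence set to the cohort's case fraction (0.4 here), the case-control form gives the same value as the cohort form, which is a quick consistency check on both functions.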


Subjects
Research Design, Biomarkers, Case-Control Studies, Humans, Uncertainty
4.
Breast Cancer Res Treat ; 167(1): 195-203, 2018 Jan.
Article in English | MEDLINE | ID: mdl-28879558

ABSTRACT

PURPOSE: To estimate the potential near-term population impact of alternative second opinion breast biopsy pathology interpretation strategies. METHODS: Decision analysis examining 12-month outcomes of breast biopsy for nine breast pathology interpretation strategies in the U.S. health system. Diagnoses of 115 practicing pathologists in the Breast Pathology Study were compared to reference-standard-consensus diagnoses with and without second opinions. Interpretation strategies were defined by whether a second opinion was sought universally or selectively (e.g., 2nd opinion if invasive). Main outcomes were the expected proportion of concordant breast biopsy diagnoses, the proportion involving over- or under-interpretation, and cost of care in U.S. dollars within one year of biopsy. RESULTS: Without a second opinion, 92.2% of biopsies received a concordant diagnosis. Concordance rates increased under all second opinion strategies, and the rate was highest (95.1%) and under-treatment lowest (2.6%) when all biopsies had second opinions. However, over-treatment was lowest when second opinions were sought selectively for initial diagnoses of invasive cancer, DCIS, or atypia (1.8 vs. 4.7% with no 2nd opinions). This strategy also had the lowest projected 12-month care costs ($5.907 billion vs. $6.049 billion with no 2nd opinions). CONCLUSIONS: Second opinion strategies could lower overall care costs while reducing both over- and under-treatment. The most accurate cost-saving strategy required second opinions for initial diagnoses of invasive cancer, DCIS, or atypia.


Subjects
Breast Neoplasms/diagnosis, Breast Neoplasms/epidemiology, Reference Standards, Referral and Consultation/standards, Biopsy/economics, Biopsy/standards, Breast/pathology, Breast Neoplasms/economics, Breast Neoplasms/pathology, Decision Support Techniques, Diagnostic Errors/economics, Female, Humans, Medical Overuse/economics, Pathologists/standards, Referral and Consultation/economics, United States
5.
J Am Acad Dermatol ; 79(1): 52-59.e5, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29524584

ABSTRACT

BACKGROUND: Diagnostic interpretations of melanocytic skin lesions vary widely among pathologists, yet the underlying reasons remain unclear. OBJECTIVE: Identify pathologist characteristics associated with rates of accuracy and reproducibility. METHODS: Pathologists independently interpreted the same set of biopsy specimens from melanocytic lesions on 2 occasions. Diagnoses were categorized into 1 of 5 classes according to the Melanocytic Pathology Assessment Tool and Hierarchy for Diagnosis system. Reproducibility was determined by pathologists' concordance of diagnoses across 2 occasions. Accuracy was defined by concordance with a consensus reference standard. Associations of pathologist characteristics with reproducibility and accuracy were assessed individually and in multivariable logistic regression models. RESULTS: Rates of diagnostic reproducibility and accuracy were highest among pathologists with board certification and/or fellowship training in dermatopathology and in those with 5 or more years of experience. In addition, accuracy was high among pathologists with a higher proportion of melanocytic lesions in their caseload composition and higher volume of melanocytic lesions. LIMITATIONS: Data gathered in a test set situation by using a classification tool not currently in clinical use. CONCLUSION: Diagnoses are more accurate among pathologists with specialty training and those with more experience interpreting melanocytic lesions. These findings support the practice of referring difficult cases to more experienced pathologists to improve diagnostic accuracy, although the impact of these referrals on patient outcomes requires additional research.


Subjects
Melanoma/pathology, Pathologists, Clinical Pathology/standards, Skin Neoplasms/pathology, Needle Biopsy, Clinical Competence, Consensus, Delphi Technique, Female, Humans, Male, Observer Variation, Cutaneous Malignant Melanoma
6.
Ann Surg Oncol ; 24(5): 1234-1241, 2017 May.
Article in English | MEDLINE | ID: mdl-27913946

ABSTRACT

BACKGROUND: Surgeons may receive a different diagnosis when a breast biopsy is interpreted by a second pathologist. The extent to which diagnostic agreement by the same pathologist varies at two time points is unknown. METHODS: Pathologists from eight U.S. states independently interpreted 60 breast specimens, one glass slide per case, on two occasions separated by ≥9 months. Reproducibility was assessed by comparing interpretations between the two time points; associations between reproducibility (intraobserver agreement rates) and characteristics of pathologists and cases were determined and also compared with interobserver agreement of baseline interpretations. RESULTS: Sixty-five percent of invited, responding pathologists were eligible and consented; 49 interpreted glass slides in both study phases, resulting in 2940 interpretations. Intraobserver agreement rates between the two phases were 92% [95% confidence interval (CI) 88-95] for invasive breast cancer, 84% (95% CI 81-87) for ductal carcinoma-in-situ, 53% (95% CI 47-59) for atypia, and 84% (95% CI 81-86) for benign without atypia. When comparing all study participants' case interpretations at baseline, interobserver agreement rates were 89% (95% CI 84-92) for invasive cancer, 79% (95% CI 76-81) for ductal carcinoma-in-situ, 43% (95% CI 41-45) for atypia, and 77% (95% CI 74-79) for benign without atypia. CONCLUSIONS: Interpretive agreement between two time points by the same individual pathologist was low for atypia and was similar to observed rates of agreement for atypia between different pathologists. Physicians and patients should be aware of the diagnostic challenges associated with a breast biopsy diagnosis of atypia when considering treatment and surveillance decisions.


Subjects
Breast Neoplasms/pathology, Breast/pathology, Ductal Breast Carcinoma/pathology, Noninfiltrating Intraductal Carcinoma/pathology, Pathologists, Adult, Biopsy, Breast Density, Clinical Competence, Female, Humans, Middle Aged, Observer Variation, Reproducibility of Results, Time Factors, United States
7.
Ann Intern Med ; 164(10): 649-55, 2016 May 17.
Article in English | MEDLINE | ID: mdl-26999810

ABSTRACT

BACKGROUND: The effect of physician diagnostic variability on accuracy at a population level depends on the prevalence of diagnoses. OBJECTIVE: To estimate how diagnostic variability affects accuracy from the perspective of a U.S. woman aged 50 to 59 years having a breast biopsy. DESIGN: Applied probability using Bayes' theorem. SETTING: B-Path (Breast Pathology) Study comparing pathologists' interpretations of a single biopsy slide versus a reference consensus interpretation from 3 experts. PARTICIPANTS: 115 practicing pathologists (6900 total interpretations from 240 distinct cases). MEASUREMENTS: A single representative slide from each of the 240 cases was used to estimate the proportion of biopsies with a diagnosis that would be verified if the same slide were interpreted by a reference group of 3 expert pathologists. Probabilities of confirmation (predictive values) were estimated using B-Path Study results and prevalence of biopsy diagnoses for women aged 50 to 59 years in the Breast Cancer Surveillance Consortium. RESULTS: Overall, if 1 representative slide were used per case, 92.3% (95% CI, 91.4% to 93.1%) of breast biopsy diagnoses would be verified by reference consensus diagnoses, with 4.6% (CI, 3.9% to 5.3%) overinterpreted and 3.2% (CI, 2.7% to 3.6%) underinterpreted. Verification of invasive breast cancer and benign without atypia diagnoses is highly probable; estimated predictive values were 97.7% (CI, 96.5% to 98.7%) and 97.1% (CI, 96.7% to 97.4%), respectively. Verification is less probable for atypia (53.6% overinterpreted and 8.6% underinterpreted) and ductal carcinoma in situ (DCIS) (18.5% overinterpreted and 11.8% underinterpreted). LIMITATIONS: Estimates are based on a testing situation with 1 slide used per case and without access to second opinions. Population-adjusted estimates may differ for women from other age groups, unscreened women, or women in different practice settings. 
CONCLUSION: This analysis, based on interpretation of a single breast biopsy slide per case, predicts a low likelihood that a diagnosis of atypia or DCIS would be verified by a reference consensus diagnosis. This diagnostic grey zone should be considered in clinical management decisions in patients with these diagnoses. PRIMARY FUNDING SOURCE: National Cancer Institute.
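
The Bayes' theorem step named in the DESIGN section can be sketched as below: interpretation probabilities conditional on the reference diagnosis are combined with population prevalence to give the probability that a given interpretation would be verified. The two-class layout, the function name `predictive_values`, and every number are hypothetical illustrations, not B-Path Study or Breast Cancer Surveillance Consortium data.

```python
def predictive_values(p_interp_given_ref, prevalence):
    """Return P(reference = j | interpretation = i) for all i, j via Bayes' theorem.

    p_interp_given_ref[j][i] is P(interpretation = i | reference = j).
    prevalence[j] is P(reference = j) in the target population.
    Assumption: every interpretation class has nonzero marginal probability.
    """
    classes = list(prevalence)
    posterior = {}
    for i in classes:
        # Marginal probability of interpretation i, summed over reference classes.
        marginal = sum(p_interp_given_ref[k][i] * prevalence[k] for k in classes)
        posterior[i] = {
            j: p_interp_given_ref[j][i] * prevalence[j] / marginal for j in classes
        }
    return posterior

# Hypothetical inputs: atypia is rare, and 10% of benign cases are read as atypia.
prevalence = {"benign": 0.9, "atypia": 0.1}
p_interp_given_ref = {
    "benign": {"benign": 0.9, "atypia": 0.1},
    "atypia": {"benign": 0.4, "atypia": 0.6},
}
post = predictive_values(p_interp_given_ref, prevalence)
print(post["atypia"]["atypia"])  # approximately 0.4
```

In this made-up example a modest overinterpretation rate among the much more common benign cases pulls the probability of verifying an atypia diagnosis down to about 40%, which is the mechanism the BACKGROUND sentence describes.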


Subjects
Biopsy, Breast Neoplasms/diagnosis, Clinical Competence, Pathologists/standards, Bayes Theorem, Breast Carcinoma In Situ/diagnosis, Ductal Breast Carcinoma/diagnosis, Female, Humans, Middle Aged, Reference Standards
8.
Clin Chem ; 62(5): 737-42, 2016 May.
Article in English | MEDLINE | ID: mdl-27001493

ABSTRACT

BACKGROUND: Many cancer biomarker research studies seek to develop markers that can accurately detect or predict future onset of disease. To design and evaluate these studies, one must specify the levels of accuracy sought. However, justified target levels are rarely available. METHODS: We describe a way to calculate target levels of sensitivity and specificity for a biomarker intended to be applied in a defined clinical context. The calculation requires knowledge of the prevalence or incidence of cases in the clinical population and the ratio of benefit associated with the clinical consequences of a positive biomarker test in cases (true positive) to cost associated with a positive biomarker test in controls (false positive). Guidance is offered on soliciting the cost/benefit ratio. The calculations are based on the longstanding decision theory concept of providing a net benefit on average in the population, and they rely on some assumptions about uniformity of costs and benefits to those tested. RESULTS: Calculations are illustrated with 3 applications: predicting colon cancer recurrence in stage 1 patients; predicting interval breast cancer (between mammography screenings); and screening for ovarian cancer. CONCLUSIONS: It is feasible to specify target levels of biomarker performance that enable evaluation of the potential clinical impact of biomarkers in early-phase studies. Nevertheless, biomarkers meeting the criteria should still be tested rigorously in studies that measure the actual impact on patient outcomes of using the biomarker to make clinical decisions.
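
One way to read the calculation described in METHODS is as a positive expected net benefit condition. If ρ is the prevalence, B the benefit of a true positive, and C the cost of a false positive, then ρ·TPR·B − (1 − ρ)·FPR·C ≥ 0 rearranges to TPR/FPR ≥ (1 − ρ)/(ρ·r) with r = B/C. The sketch below encodes that reading; the abstract does not state the formula, so the inequality, the function names, and the inputs are assumptions.

```python
def min_tpr_to_fpr_ratio(prevalence, benefit_cost_ratio):
    """Smallest TPR/FPR ratio that gives non-negative expected net benefit.

    Assumes uniform benefit B per true positive and cost C per false positive,
    with benefit_cost_ratio = B / C.
    """
    return (1 - prevalence) / (prevalence * benefit_cost_ratio)

def max_fpr_for_tpr(tpr, prevalence, benefit_cost_ratio):
    """Largest tolerable FPR for a biomarker with the given TPR, capped at 1."""
    return min(1.0, tpr / min_tpr_to_fpr_ratio(prevalence, benefit_cost_ratio))

# Hypothetical setting: 1% prevalence, and a true positive worth 10 false positives.
ratio = min_tpr_to_fpr_ratio(0.01, 10)
print(ratio)                            # approximately 9.9
print(max_fpr_for_tpr(0.8, 0.01, 10))   # approximately 0.081
```

Under these made-up inputs, a marker with 80% sensitivity would need a false positive rate of roughly 8% or lower, that is, specificity of about 92% or higher, to break even on average.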


Subjects
Tumor Biomarkers/analysis, Breast Neoplasms/diagnosis, Colonic Neoplasms/diagnosis, Ovarian Neoplasms/diagnosis, Aged, Female, Humans, Middle Aged, Sensitivity and Specificity
9.
Stat Med ; 34(27): 3503-15, 2015 Nov 30.
Article in English | MEDLINE | ID: mdl-26112650

ABSTRACT

Biomarkers that predict the efficacy of treatment can potentially improve clinical outcomes and decrease medical costs by allowing treatment to be provided only to those most likely to benefit. We consider the design of a randomized clinical trial in which one objective is to evaluate a treatment selection marker. The marker may be measured prospectively or retrospectively using samples collected at baseline. We describe and contrast criteria around which the trial can be designed. An existing approach focuses on determining if there is a statistical interaction between the marker and treatment. We propose three alternative approaches based on estimating clinically relevant measures of improvement in outcomes with use of the marker. Importantly, our approaches accommodate the common scenario in which the marker-based rule for recommending treatment is developed with data from the trial. Sample sizes are calculated for powering a trial to assess these criteria in the context of adjuvant chemotherapy for the treatment of estrogen-receptor-positive, node-positive breast cancer. In this example, we find that larger sample sizes are generally required for assessing clinical impact than for simply evaluating if there is a statistical interaction between marker and treatment. We also find that retrospectively selecting a case-control subset of subjects for marker evaluation can lead to large efficiency gains, especially if cases and controls are matched on treatment assignment.


Subjects
Biomarkers, Patient Selection, Research Design, Breast Neoplasms, Female, Humans, Statistical Models, Randomized Controlled Trials as Topic, Research Design/statistics & numerical data, Treatment Outcome
10.
JAMA ; 313(11): 1122-32, 2015 Mar 17.
Article in English | MEDLINE | ID: mdl-25781441

ABSTRACT

IMPORTANCE: A breast pathology diagnosis provides the basis for clinical treatment and management decisions; however, its accuracy is inadequately understood. OBJECTIVES: To quantify the magnitude of diagnostic disagreement among pathologists compared with a consensus panel reference diagnosis and to evaluate associated patient and pathologist characteristics. DESIGN, SETTING, AND PARTICIPANTS: Study of pathologists who interpret breast biopsies in clinical practices in 8 US states. EXPOSURES: Participants independently interpreted slides between November 2011 and May 2014 from test sets of 60 breast biopsies (240 total cases, 1 slide per case), including 23 cases of invasive breast cancer, 73 ductal carcinoma in situ (DCIS), 72 with atypical hyperplasia (atypia), and 72 benign cases without atypia. Participants were blinded to the interpretations of other study pathologists and consensus panel members. Among the 3 consensus panel members, unanimous agreement of their independent diagnoses was 75%, and concordance with the consensus-derived reference diagnoses was 90.3%. MAIN OUTCOMES AND MEASURES: The proportions of diagnoses overinterpreted and underinterpreted relative to the consensus-derived reference diagnoses were assessed. RESULTS: Sixty-five percent of invited, responding pathologists were eligible and consented to participate. Of these, 91% (N = 115) completed the study, providing 6900 individual case diagnoses. Compared with the consensus-derived reference diagnosis, the overall concordance rate of diagnostic interpretations of participating pathologists was 75.3% (95% CI, 73.4%-77.0%; 5194 of 6900 interpretations). 
Among invasive carcinoma cases (663 interpretations), 96% (95% CI, 94%-97%) were concordant, and 4% (95% CI, 3%-6%) were underinterpreted; among DCIS cases (2097 interpretations), 84% (95% CI, 82%-86%) were concordant, 3% (95% CI, 2%-4%) were overinterpreted, and 13% (95% CI, 12%-15%) were underinterpreted; among atypia cases (2070 interpretations), 48% (95% CI, 44%-52%) were concordant, 17% (95% CI, 15%-21%) were overinterpreted, and 35% (95% CI, 31%-39%) were underinterpreted; and among benign cases without atypia (2070 interpretations), 87% (95% CI, 85%-89%) were concordant and 13% (95% CI, 11%-15%) were overinterpreted. Disagreement with the reference diagnosis was statistically significantly higher among biopsies from women with higher (n = 122) vs lower (n = 118) breast density on prior mammograms (overall concordance rate, 73% [95% CI, 71%-75%] for higher vs 77% [95% CI, 75%-80%] for lower, P < .001), and among pathologists who interpreted lower weekly case volumes (P < .001) or worked in smaller practices (P = .034) or nonacademic settings (P = .007). CONCLUSIONS AND RELEVANCE: In this study of pathologists, in which diagnostic interpretation was based on a single breast biopsy slide, overall agreement between the individual pathologists' interpretations and the expert consensus-derived reference diagnoses was 75.3%, with the highest level of concordance for invasive carcinoma and lower levels of concordance for DCIS and atypia. Further research is needed to understand the relationship of these findings with patient management.


Subjects
Breast Neoplasms/pathology, Breast/pathology, Diagnostic Errors, Observer Variation, Clinical Pathology, Adult, Biopsy, Ductal Breast Carcinoma/pathology, Noninfiltrating Intraductal Carcinoma/pathology, Female, Humans, Middle Aged, Neoplasm Invasiveness/pathology, Clinical Pathology/standards
11.
Epidemiology ; 25(1): 114-21, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24240655

ABSTRACT

Net reclassification indices have recently become popular statistics for measuring the prediction increment of new biomarkers. We review the various types of net reclassification indices and their correct interpretations. We evaluate the advantages and disadvantages of quantifying the prediction increment with these indices. For predefined risk categories, we relate net reclassification indices to existing measures of the prediction increment. We also consider statistical methodology for constructing confidence intervals for net reclassification indices and evaluate the merits of hypothesis testing based on such indices. We recommend that investigators using net reclassification indices should report them separately for events (cases) and nonevents (controls). When there are two risk categories, the components of net reclassification indices are the same as the changes in the true- and false-positive rates. We advocate the use of true- and false-positive rates and suggest it is more useful for investigators to retain the existing, descriptive terms. When there are three or more risk categories, we recommend against net reclassification indices because they do not adequately account for clinically important differences in shifts among risk categories. The category-free net reclassification index is a new descriptive device designed to avoid predefined risk categories. However, it experiences many of the same problems as other measures such as the area under the receiver operating characteristic curve. In addition, the category-free index can mislead investigators by overstating the incremental value of a biomarker, even in independent validation data. When investigators want to test a null hypothesis of no prediction increment, the well-established tests for coefficients in the regression model are superior to the net reclassification index. 
If investigators want to use net reclassification indices, confidence intervals should be calculated using bootstrap methods rather than published variance formulas. The preferred single-number summary of the prediction increment is the improvement in net benefit.
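
The statement that, with two risk categories, the components of the net reclassification index equal the changes in the true- and false-positive rates can be checked with a small sketch. The function name `nri_components`, the single-threshold setup, and the risks below are hypothetical.

```python
def nri_components(old_risk, new_risk, outcome, threshold):
    """Event and nonevent NRI for two risk categories split at one threshold.

    Event NRI    = P(up | event)      - P(down | event)
    Nonevent NRI = P(down | nonevent) - P(up | nonevent)
    """
    def net_upward(d):
        idx = [i for i, y in enumerate(outcome) if y == d]
        # "Up" means moving from below the threshold to at or above it.
        up = sum(old_risk[i] < threshold <= new_risk[i] for i in idx)
        down = sum(new_risk[i] < threshold <= old_risk[i] for i in idx)
        return (up - down) / len(idx)

    return net_upward(1), -net_upward(0)

def positive_rate(risk, outcome, d, threshold):
    """Fraction of subjects with outcome d classified high risk (TPR if d=1, FPR if d=0)."""
    idx = [i for i, y in enumerate(outcome) if y == d]
    return sum(risk[i] >= threshold for i in idx) / len(idx)

# Hypothetical risks from a baseline model and an expanded model.
old = [0.6, 0.4, 0.3, 0.7, 0.2, 0.6]
new = [0.7, 0.6, 0.2, 0.4, 0.1, 0.3]
y = [1, 1, 1, 0, 0, 0]
event_nri, nonevent_nri = nri_components(old, new, y, 0.5)
print(event_nri, positive_rate(new, y, 1, 0.5) - positive_rate(old, y, 1, 0.5))
print(nonevent_nri, positive_rate(old, y, 0, 0.5) - positive_rate(new, y, 0, 0.5))
```

Each printed pair matches: the event component is the change in TPR and the nonevent component is the decrease in FPR, which is why reporting the two rates directly conveys the same information.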


Subjects
Risk Assessment, Statistics as Topic, Confidence Intervals, Humans, Regression Analysis
12.
Clin Chem ; 59(1): 68-74, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23193062

ABSTRACT

BACKGROUND: The mission of the National Cancer Institute's Early Detection Research Network (EDRN) is to identify and validate cancer biomarkers for clinical use. Since its inception, EDRN investigators have learned a great deal about the process of validating biomarkers for clinical use. Translational research requires a broad spectrum of research expertise, and coordinating collaborative activities can be challenging. The EDRN has developed a robust triage and validation system that serves the roles of both "facilitator" and "brake." CONTENT: The system consists of (a) establishing a reference set of specimens collected under PRoBE (Prospective Specimen Collection Retrospective Blinded Evaluation) design criteria; (b) using the reference set to prevalidate candidate biomarkers before committing to full-scale validation; (c) performing full-scale validation for those markers that pass prevalidation testing; and (d) ensuring that the reference set is sufficiently large in numbers and volumes of sample that it can also be used to study future candidate biomarkers. This system provides rigorous and efficient evaluation of candidate biomarkers and biomarker panels. Reference sets should also be constructed to enable high-quality biomarker-discovery research. SUMMARY: We describe the process of establishing our system in the hope that it will serve as an example of how to validate biomarkers for clinical application. We also hope that this description of the biospecimen reference sets available from the EDRN will encourage the biomarker research community--from academia or industry--to use this resource to advance biomarkers into clinical use.


Subjects
Tumor Biomarkers/analysis, Early Diagnosis, Neoplasms/diagnosis, Humans, Male, Prostatic Neoplasms/diagnosis, Matrix-Assisted Laser Desorption-Ionization Mass Spectrometry
13.
Stat Med ; 32(11): 1877-92, 2013 May 20.
Article in English | MEDLINE | ID: mdl-23348801

ABSTRACT

When an existing standard marker does not have sufficient classification accuracy on its own, new markers are sought with the goal of yielding a combination with better performance. The primary criterion for selecting new markers is that they have good performance on their own and preferably be uncorrelated with the standard. Most often linear combinations are considered. In this paper, we investigate the increment in performance that is possible by combining a novel continuous marker with a moderately performing standard continuous marker under a variety of biologically motivated models for their joint distribution. We find that an uncorrelated continuous marker with moderate performance on its own usually yields only minimally improved performance. We identify other settings that lead to large improvements, including a novel marker that has very poor performance on its own but is highly correlated with the standard and a novel marker with poor to moderate performance that is highly correlated with the standard but only in one class category. These results suggest changing current strategies for identifying markers to be included in panels for possible combination. Using simulated and real datasets, we examine the merits of a broadened strategy that selects panels of markers as candidates on the basis of their joint performance with existing markers, compared with the standard strategy that selects markers on the basis of their marginal performance. We find that a broadened strategy can be fruitful but necessitates using studies with large numbers of subjects.


Subjects
Biomarkers/analysis, Statistical Data Interpretation, Statistical Models, ROC Curve, Humans, Sensitivity and Specificity
14.
Stat Med ; 32(9): 1467-82, 2013 Apr 30.
Article in English | MEDLINE | ID: mdl-23296397

ABSTRACT

Authors have proposed new methodology in recent years for evaluating the improvement in prediction performance gained by adding a new predictor, Y, to a risk model containing a set of baseline predictors, X, for a binary outcome D. We prove theoretically that null hypotheses concerning no improvement in performance are equivalent to the simple null hypothesis that Y is not a risk factor when controlling for X, H0: P(D = 1 | X, Y) = P(D = 1 | X). Therefore, testing for improvement in prediction performance is redundant if Y has already been shown to be a risk factor. We also investigate properties of tests through simulation studies, focusing on the change in the area under the ROC curve (AUC). An unexpected finding is that standard testing procedures that do not adjust for variability in estimated regression coefficients are extremely conservative. This may explain why the AUC is widely considered insensitive to improvements in prediction performance and suggests that the problem of insensitivity has to do with use of invalid procedures for inference rather than with the measure itself. To avoid redundant testing and use of potentially problematic methods for inference, we recommend that hypothesis testing for no improvement be limited to evaluation of Y as a risk factor, for which methods are well developed and widely available. Analyses of measures of prediction performance should focus on estimation rather than on testing for no improvement in performance.


Subjects
Biomarkers/analysis, Statistical Models, Predictive Value of Tests, Risk Assessment/methods, Area Under Curve, Computer Simulation, Humans, ROC Curve, Renal Artery Obstruction/diagnosis, Renal Artery Obstruction/surgery
15.
BMC Womens Health ; 13: 3, 2013 Feb 05.
Article in English | MEDLINE | ID: mdl-23379630

ABSTRACT

BACKGROUND: Diagnostic test sets are a valuable research tool that contributes importantly to the validity and reliability of studies that assess agreement in breast pathology. In order to fully understand the strengths and weaknesses of any agreement and reliability study, however, the methods should be fully reported. In this paper we provide a step-by-step description of the methods used to create four complex test sets for a study of diagnostic agreement among pathologists interpreting breast biopsy specimens. We use the newly developed Guidelines for Reporting Reliability and Agreement Studies (GRRAS) as a basis to report these methods. METHODS: Breast tissue biopsies were selected from the National Cancer Institute-funded Breast Cancer Surveillance Consortium sites. We used a random sampling stratified according to woman's age (40-49 vs. ≥50), parenchymal breast density (low vs. high) and interpretation of the original pathologist. A 3-member panel of expert breast pathologists first independently interpreted each case using five primary diagnostic categories (non-proliferative changes, proliferative changes without atypia, atypical ductal hyperplasia, ductal carcinoma in situ, and invasive carcinoma). When the experts did not unanimously agree on a case diagnosis a modified Delphi method was used to determine the reference standard consensus diagnosis. The final test cases were stratified and randomly assigned into one of four unique test sets. CONCLUSIONS: We found GRRAS recommendations to be very useful in reporting diagnostic test set development and recommend inclusion of two additional criteria: 1) characterizing the study population and 2) describing the methods for reference diagnosis, when applicable.


Subjects
Breast Diseases/pathology, Breast Neoplasms/classification, Breast Neoplasms/pathology, Health Care Quality Assurance/standards, Breast/pathology, Breast Neoplasms/diagnosis, Differential Diagnosis, Female, Humans, Observer Variation, Predictive Value of Tests, Reproducibility of Results, Research Design/standards, Sensitivity and Specificity
16.
Lifetime Data Anal ; 19(2): 170-201, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23358916

ABSTRACT

When an existing risk prediction model is not sufficiently predictive, additional variables are sought for inclusion in the model. This paper addresses study designs to evaluate the improvement in prediction performance that is gained by adding a new predictor to a risk prediction model. We consider studies that measure the new predictor in a case-control subset of the study cohort, a practice that is common in biomarker research. We ask whether matching controls to cases with regard to baseline predictors improves efficiency. A variety of measures of prediction performance are studied. We find through simulation studies that matching improves the efficiency with which most measures are estimated, but can reduce efficiency for some. Efficiency gains are smaller when more controls per case are included in the study. A method that models the distribution of the new predictor in controls appears to improve estimation efficiency considerably.


Subjects
Case-Control Studies , Decision Making , Risk Assessment/standards , Cardiovascular Diseases/etiology , Female , Humans , Male , Models, Statistical , Prospective Studies , ROC Curve , Research Design , Risk Assessment/statistics & numerical data , Risk Factors
17.
Lifetime Data Anal ; 19(4): 568-88, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23807695

ABSTRACT

Two-phase study methods, in which more detailed or more expensive exposure information is only collected on a sample of individuals with events and a small proportion of other individuals, are expected to play a critical role in biomarker validation research. One major limitation of standard two-phase designs is that they are most conveniently employed with study cohorts in which information on longitudinal follow-up and other potential matching variables is electronically recorded. In many practical situations, however, such information is not readily available for every potential candidate at the sampling stage, and study eligibility must be verified by reviewing medical charts one by one. In this manuscript, we study in depth a novel study design commonly undertaken in practice that involves sampling until quotas of eligible cases and controls are identified. We propose semiparametric methods to calculate risk distributions and a wide variety of prediction indices when outcomes are censored failure times and data are collected under the quota sampling design. Consistency and asymptotic normality of our estimators are established using empirical process theory. Simulation results indicate that the proposed procedures perform well in finite samples. Application is made to the evaluation of a new risk model for predicting the onset of cardiovascular disease.
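
The quota sampling idea described above (review candidates one at a time until enough eligible cases and controls have been found) can be illustrated with a short sketch. This is a simplification under assumed conventions, not the authors' design or estimator: candidates are reviewed in a single random order, and `is_eligible` and `is_case` are hypothetical callbacks standing in for chart review.

```python
import random


def quota_sample(candidates, is_eligible, is_case,
                 case_quota, control_quota, seed=0):
    """Review candidates in random order until both quotas are filled.

    Returns the selected cases, the selected controls and the number of
    records reviewed. `is_eligible` and `is_case` are hypothetical stand-ins
    for manual chart review.
    """
    rng = random.Random(seed)
    order = list(candidates)
    rng.shuffle(order)

    cases, controls, reviewed = [], [], 0
    for person in order:
        if len(cases) >= case_quota and len(controls) >= control_quota:
            break  # both quotas met, stop reviewing charts
        reviewed += 1
        if not is_eligible(person):
            continue
        if is_case(person):
            if len(cases) < case_quota:
                cases.append(person)
        elif len(controls) < control_quota:
            controls.append(person)
    return cases, controls, reviewed
```

The number of records reviewed is random rather than fixed in advance, which is one reason such data call for methods other than those for standard two-phase designs.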


Subjects
Risk , Biomarkers/blood , Biostatistics , C-Reactive Protein/analysis , Cardiovascular Diseases/blood , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/etiology , Case-Control Studies , Humans , Massachusetts/epidemiology , Models, Statistical , Predictive Value of Tests , Proportional Hazards Models , Risk Factors , Sampling Studies
18.
Am J Epidemiol ; 176(6): 482-7, 2012 Sep 15.
Article in English | MEDLINE | ID: mdl-22875756

ABSTRACT

In this issue of the Journal, Pencina et al. (Am J Epidemiol. 2012;176(6):492-494) examine the operating characteristics of measures of incremental value. Their goal is to provide benchmarks for the measures that can help identify the most promising markers among multiple candidates. They consider a setting in which new predictors are conditionally independent of established predictors. In the present article, the authors consider more general settings. Their results indicate that some of the conclusions made by Pencina et al. are limited to the specific scenarios the authors considered. For example, Pencina et al. observed that continuous net reclassification improvement was invariant to the strength of the baseline model, but the authors of the present study show this invariance does not hold generally. Further, they disagree with the suggestion that such invariance would be desirable for a measure of incremental value. They also do not see evidence to support the claim that the measures provide complementary information. In addition, they show that correlation with baseline predictors can lead to much bigger gains in performance than the conditional independence scenario studied by Pencina et al. Finally, the authors note that the motivation of providing benchmarks actually reinforces previous observations that the problem with these measures is that they do not have useful clinical interpretations. If they did, researchers could use the measures directly and benchmarks would not be needed.
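
For readers unfamiliar with the continuous (category-free) net reclassification improvement discussed above, the sketch below computes it from its usual definition: the net proportion of events whose predicted risk rises under the new model, plus the net proportion of nonevents whose predicted risk falls. The abstract itself gives no formula, and the function and argument names here are illustrative.

```python
def continuous_nri(old_risk, new_risk, event):
    """Continuous NRI from paired risk predictions.

    old_risk, new_risk: predicted risks from the baseline and expanded models.
    event: truthy for subjects who had the event, falsy otherwise.
    """
    up_e = down_e = n_e = 0  # events
    up_n = down_n = n_n = 0  # nonevents
    for old, new, d in zip(old_risk, new_risk, event):
        if d:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    # Upward moves are "correct" for events, downward moves for nonevents.
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n
```

The statistic ranges from -2 to 2 and counts only the direction of each change in predicted risk, not its size.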


Subjects
Biomarkers/metabolism , Data Interpretation, Statistical , Logistic Models , ROC Curve , Risk Assessment/methods , Humans
19.
Biostatistics ; 12(1): 87-101, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20639522

ABSTRACT

The diagnostic likelihood ratio function, DLR, is a statistical measure used to evaluate risk prediction markers. The goal of this paper is to develop new methods to estimate the DLR function. Furthermore, we show how risk prediction markers can be compared using rank-invariant DLR functions. Various estimators are proposed that accommodate cohort or case-control study designs. Performances of the estimators are compared using simulation studies. The methods are illustrated by comparing a lung function measure and a nutritional status measure for predicting subsequent onset of major pulmonary infection in children suffering from cystic fibrosis. For continuous markers, the DLR function is mathematically related to the slope of the receiver operating characteristic (ROC) curve, an entity used to evaluate diagnostic markers. We show that our methodology can be used to estimate the slope of the ROC curve and illustrate use of the estimated ROC derivative in variance and sample size calculations for a diagnostic biomarker study.
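
The relationship between the DLR function and the ROC slope mentioned above can be written out as follows. The notation is assumed here rather than taken from the abstract: $D=1$ denotes cases and $D=0$ controls, $f_D$ and $f_{\bar D}$ are the marker densities in cases and controls, $S_D$ and $S_{\bar D}$ are the corresponding survivor functions, and a larger marker value $Y$ is taken to indicate higher risk.

```latex
% DLR at marker value y, and its role in updating pre-test odds
\mathrm{DLR}(y) = \frac{f_{D}(y)}{f_{\bar D}(y)},
\qquad
\frac{P(D=1 \mid Y=y)}{P(D=0 \mid Y=y)}
  = \mathrm{DLR}(y)\,\frac{P(D=1)}{P(D=0)}

% ROC curve at false-positive rate t, with threshold c = S_{\bar D}^{-1}(t)
\mathrm{ROC}(t) = S_{D}\bigl(S_{\bar D}^{-1}(t)\bigr)

% Chain rule: dc/dt = -1 / f_{\bar D}(c) and S_D'(c) = -f_D(c)
\mathrm{ROC}'(t) = \frac{f_{D}(c)}{f_{\bar D}(c)} = \mathrm{DLR}(c)
```

So, under these assumptions, the slope of the ROC curve at false-positive rate $t$ equals the DLR evaluated at the threshold that yields that false-positive rate.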


Subjects
Biomarkers/analysis , Models, Statistical , ROC Curve , Risk Assessment/methods , Body Weight/physiology , Child , Computer Simulation , Cystic Fibrosis/complications , Female , Forced Expiratory Volume/physiology , Humans , Male , Predictive Value of Tests
20.
Clin Chem ; 58(8): 1242-51, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22730452

ABSTRACT

BACKGROUND: Selecting controls that match cases on risk factors for the outcome is a pervasive practice in biomarker research studies. Such matching, however, biases estimates of biomarker prediction performance. The magnitudes of these biases are unknown. METHODS: We examined the prediction performance of biomarkers and improvements in prediction gained by adding biomarkers to risk factor information. Data simulated from bivariate normal statistical models and data from a study to identify critically ill patients were used. We compared true performance with that estimated from case control studies that do or do not use matching. ROC curves were used to quantify performance. We propose a new statistical method to estimate prediction performance from matched studies for which data on the matching factors are available for subjects in the population. RESULTS: Performance estimated with standard analyses can be grossly biased by matching, especially when biomarkers are highly correlated with matching risk factors. In our studies, the performance of the biomarker alone was underestimated whereas the improvement in performance gained by adding the marker to risk factors was overestimated by 2-10-fold. We found examples for which the relative ranking of 2 biomarkers for prediction was inappropriately reversed by use of a matched design. The new approach to estimation corrected for bias in matched studies. CONCLUSIONS: To properly gauge prediction performance in the population or the improvement gained by adding a biomarker to known risk factors, matched case control studies must be supplemented with risk factor information from the population and must be analyzed with nonstandard statistical methods.
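
The direction of the bias for a biomarker evaluated alone can be reproduced with a small simulation. The sketch below is illustrative only: the correlation, the logistic coefficients, the bin width used for matching and all function names are assumptions, and it applies a naive analysis rather than the correction method the authors propose. A risk factor `x` and a biomarker `y` are drawn from a bivariate normal distribution, and the AUC of `y` is compared between the full cohort and a sample in which each case is paired with a control whose `x` is similar.

```python
import math
import random


def auc(case_vals, control_vals):
    """Mann-Whitney AUC estimate; assumes no case/control ties."""
    labelled = sorted([(v, 1) for v in case_vals] +
                      [(v, 0) for v in control_vals])
    rank_sum = sum(rank for rank, (_, is_case)
                   in enumerate(labelled, start=1) if is_case)
    n1, n0 = len(case_vals), len(control_vals)
    return (rank_sum - n1 * (n1 + 1) / 2) / (n1 * n0)


def simulate(n=20000, rho=0.7, seed=1):
    """Return (AUC of y in the cohort, AUC of y after matching on x)."""
    rng = random.Random(seed)
    cases, controls = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)                                   # risk factor
        y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)  # biomarker
        p = 1 / (1 + math.exp(-(-3 + 1.5 * x + 0.5 * y)))     # assumed risk model
        (cases if rng.random() < p else controls).append((x, y))

    auc_cohort = auc([y for _, y in cases], [y for _, y in controls])

    # Match on x: one control per case from the same 0.5-wide bin of x,
    # drawn with replacement; cases whose bin has no control are skipped.
    pool_by_bin = {}
    for x, y in controls:
        pool_by_bin.setdefault(round(x * 2), []).append(y)
    m_cases, m_controls = [], []
    for x, y in cases:
        pool = pool_by_bin.get(round(x * 2))
        if pool:
            m_cases.append(y)
            m_controls.append(rng.choice(pool))

    return auc_cohort, auc(m_cases, m_controls)
```

Calling `simulate()` returns the two AUC estimates. Under these assumed parameters the matched estimate is the lower of the two, because matching removes the part of the marker's discrimination that it shares with the matching factor.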


Subjects
Biomarkers , Biomedical Research/statistics & numerical data , Case-Control Studies , Models, Statistical , Bias , Biomarkers/analysis , Critical Illness , Humans , Mass Screening/statistics & numerical data , ROC Curve , Risk Factors