ABSTRACT
A major application of tumor biomarkers is in serial monitoring of cancer patients, but there are no published guidelines on how to evaluate biomarkers for this purpose. The European Group on Tumor Markers has convened a multidisciplinary panel of scientists to develop guidance on the design of such monitoring trials. The panel proposes a 4-phase model for biomarker-monitoring trials analogous to that in use for the investigation of new drugs. In phase I, biomarker kinetics and correlation with tumor burden are assessed. Phase II evaluates the ability of the biomarker to identify, exclude, and/or predict a change in disease status. In phase III, the effectiveness of tumor biomarker-guided intervention is assessed by measuring patient outcome in randomized trials. Phase IV consists of an audit of the long-term effects after biomarker monitoring has been included into standard patient care. Systematic well-designed evaluations of biomarkers for monitoring may provide a stronger evidence base that might enable their earlier use in evaluating responses to cancer therapy.
Subjects
Biomarkers, Tumor/analysis; Monitoring, Physiologic; Neoplasms/diagnosis; Clinical Trials as Topic; Europe; Humans; Neoplasms/pathology
ABSTRACT
BACKGROUND: Reference change values are used to assess the significance of a difference in two consecutive results from an individual. Reference change value calculations provide the limits for significant differences between two results due to analytical and inherent biological variations. Often more than two serial results are available. Using the reference change value concept on more than two measurements results in an increased number of false-positive results. This problem has been solved for both uni- and bidirectional differences through use of wider limits when additional results are included. METHODS: Based on normally (Gaussianly) distributed simulated data, a dynamic reference change value model was developed using more than two results and total coefficients of variation. The dynamic reference change value model includes validation of a set-point as the mean of the four first serial results and additional results are assessed for compliance to the steady state with the same set-point. Furthermore, the dynamic reference change value model compensates for increasing false-positive results with subsequent results. The dynamic reference change value model was designed to calculate significant limits for bidirectional differences. RESULTS: Reference change factors were calculated for multiplication of the mean of previous results to create the limits for significant differences. The reference change factors are provided as a function of number of results and total coefficients of variation both in tables and in figures. CONCLUSIONS: The dynamic reference change value model is appropriate for ongoing assessment of the steady state of a biomarker using more than two serial results.
Subjects
Biomarkers/analysis; Clinical Laboratory Techniques/standards; False Positive Reactions; Homeostasis; Models, Statistical; Normal Distribution; Reference Values; Reproducibility of Results
ABSTRACT
BACKGROUND: Many clinical decisions are based on comparison of patient results with reference intervals. Therefore, an estimation of the analytical performance specifications for the quality that would be required to allow sharing common reference intervals is needed. The International Federation of Clinical Chemistry (IFCC) recommended a minimum of 120 reference individuals to establish reference intervals. This number implies a certain level of quality, which could then be used for defining analytical performance specifications as the maximum combination of analytical bias and imprecision required for sharing common reference intervals, the aim of this investigation. METHODS: Two methods were investigated for defining the maximum combination of analytical bias and imprecision that would give the same quality of common reference intervals as the IFCC recommendation. Method 1 is based on a formula for the combination of analytical bias and imprecision and Method 2 is based on the Microsoft Excel formula NORMINV including the fractional probability of reference individuals outside each limit and the Gaussian variables of mean and standard deviation. The combinations of normalized bias and imprecision are illustrated for both methods. The formulae are identical for Gaussian and log-Gaussian distributions. RESULTS: Method 2 gives the correct results with a constant percentage of 4.4% for all combinations of bias and imprecision. CONCLUSION: The Microsoft Excel formula NORMINV is useful for the estimation of analytical performance specifications for both Gaussian and log-Gaussian distributions of reference intervals.
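The inverse-normal approach described above can be sketched in Python, where `NormalDist.inv_cdf` plays the role of Excel's NORMINV. This is only an illustration under stated assumptions, not the authors' exact procedure: bias and imprecision are expressed in units of the biological standard deviation, only a positive bias at the upper reference limit is considered, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()
# Upper limit of a central 95% reference interval, in biological SD units.
UPPER_LIMIT_Z = STD_NORMAL.inv_cdf(0.975)


def fraction_outside_upper(bias, imprecision):
    """Fraction of reference individuals above the upper reference limit.

    Assumption: bias and imprecision are normalized to the biological SD,
    so the observed distribution has mean `bias` and SD sqrt(1 + imprecision^2).
    """
    total_sd = sqrt(1.0 + imprecision ** 2)
    return 1.0 - STD_NORMAL.cdf((UPPER_LIMIT_Z - bias) / total_sd)


def max_bias(imprecision, allowed_fraction=0.044):
    """Largest positive bias that keeps `allowed_fraction` above the limit.

    A negative return value means the imprecision alone already exceeds
    the allowed fraction.
    """
    total_sd = sqrt(1.0 + imprecision ** 2)
    return UPPER_LIMIT_Z - STD_NORMAL.inv_cdf(1.0 - allowed_fraction) * total_sd


if __name__ == "__main__":
    for s in (0.0, 0.2, 0.4):
        b = max_bias(s)
        print(f"imprecision={s:.1f}  max bias={b:.3f}  "
              f"fraction outside={fraction_outside_upper(b, s):.3f}")
```

By construction, every bias and imprecision pair returned by `max_bias` gives the same 4.4% outside the limit, which mirrors the constant percentage reported for Method 2.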
Subjects
Chemistry, Clinical; International Agencies/standards; Bias; Humans; Normal Distribution; Reference Values
ABSTRACT
BACKGROUND: Diagnostic decisions based on decision limits according to medical guidelines are different from the majority of clinical decisions due to the strict dichotomization of patients into diseased and non-diseased. Consequently, the influence of analytical performance is more critical than for other diagnostic decisions where much other information is included. The aim of this opinion paper is to investigate consequences of analytical quality and other circumstances for the outcome of "Guideline-Driven Medical Decision Limits". TERMS: Effects of analytical bias and imprecision should be investigated separately and analytical quality specifications should be estimated accordingly. BIOLOGICAL VARIATION AND ANALYTICAL PERFORMANCE: Use of sharp decision limits doesn't consider biological variation and effects of this variation are closely connected with the effects of analytical performance. Such relationships are investigated for the guidelines for HbA1c in diagnosis of diabetes and in risk of coronary heart disease based on serum cholesterol. The effects of a second sampling in diagnosis give dramatic reduction in the effects of analytical quality showing minimal influence of imprecision up to 3 to 5% for two independent samplings, whereas the reduction in bias is more moderate and a 2% increase in concentration doubles the percentage of false positive diagnoses, both for HbA1c and cholesterol. FREQUENCY OF FOLLOW-UP LABORATORY TESTS: An alternative approach comes from the current application of guidelines for follow-up laboratory tests according to clinical procedure orders, e.g. frequency of parathyroid hormone requests as a function of serum calcium concentrations. Here, the specifications for bias can be evaluated from the functional increase in requests for increasing serum calcium concentrations. 
PROBABILITY FUNCTION FOR DIAGNOSES: In consequence of the difficulties with biological variation and the practical utilization of concentration dependence of frequency of follow-up laboratory tests already in use, a kind of probability function for diagnosis as function of the key-analyte is proposed.
Subjects
Cholesterol/blood; Clinical Laboratory Techniques/methods; Clinical Laboratory Techniques/standards; Decision Making, Computer-Assisted; Glycated Hemoglobin/analysis; Practice Guidelines as Topic; Bias; Calcium/blood; Coronary Disease/blood; Coronary Disease/diagnosis; Diabetes Mellitus/diagnosis; False Positive Reactions; Humans; Limit of Detection; Probability; Quality Assurance, Health Care; Reference Values; Reproducibility of Results; Research Design
ABSTRACT
Clinical activity indices are essential instruments in monitoring inflammatory bowel diseases such as Crohn's disease (CD) and ulcerative colitis (UC). To subclassify components of disease indices in CD and UC, investigate technical noise in estimation of the indices, establish a signal-to-noise ratio (SNR), evaluate correlation between indices and calculate the reference change value (RCV) for selected biochemical variables in individual cases, 50 patients with CD and 49 patients with UC were included in the study. Qualitative index variables were assessed for scoring errors. The standard deviation (SD) was estimated according to a rectangular model, while SD in biochemical variable scoring was estimated according to a Gaussian model; a combined SD was also calculated. These values were investigated for their individual contribution to variation. The 95% CI of an index value was based on $\pm 1.96 \times SD_{\text{combined}}$ and a change in separate biochemical variables was calculated as $\mathrm{RCV} = 1.96 \times \sqrt{2} \times SD_{\text{combined}}$. Correlation between different disease activity indices was assessed for unexplained variation. The Crohn's disease activity index (CDAI) had the highest variation compared to the van Hees (Hees) and the Harvey-Bradshaw index (HBI) in CD, but it also had the best SNR, whereas HBI had the lowest. In UC the clinical activity index (CAI) showed the highest variance, but the best SNR compared to Seo's activity index (AI). The 95% CI of the CDAI discriminatory activity sum of 150 in individual cases was 105-195, whereas the 95% interval for a change was ±62.4. Self-reported wellness contributed 40% to total variance in the CDAI. Factors of clinical importance increased errors in estimates and variance of the indices. Poor correlation was obtained between activity indices, with up to 70% unexplained variance. The $SD_{\text{combined}}$ for estimated errors was as high as 23 points, with the best SNR being approximately 20.
Index factors increase the sensitivity of SNRs to errors and lower the disease specificity. Sensitivity optimisation may be achieved by standardisation of the variables and their use.
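The two interval calculations described in this abstract (a 95% CI of ±1.96 × SD and a change limit of 1.96 × √2 × SD) can be illustrated with a short Python sketch. The function names are hypothetical, and the combined SD of 22.5 points used in the example is an assumption back-calculated from the reported ±62.4; the abstract itself quotes a combined SD "as high as 23 points".

```python
from math import sqrt

Z_95 = 1.96  # two-sided 95% z-value, as used in the abstract


def index_confidence_interval(index_value, sd_combined):
    """95% CI for a single index value: value +/- 1.96 * SD(combined)."""
    half_width = Z_95 * sd_combined
    return index_value - half_width, index_value + half_width


def reference_change_value(sd_combined):
    """Smallest difference between two values that is significant at 95%."""
    return Z_95 * sqrt(2.0) * sd_combined


if __name__ == "__main__":
    sd = 22.5  # assumed combined SD in CDAI points (see lead-in)
    low, high = index_confidence_interval(150, sd)
    print(f"95% CI for CDAI 150: {low:.0f} to {high:.0f}")
    print(f"RCV: +/-{reference_change_value(sd):.1f}")
```

The factor √2 appears because a difference between two independent measurements carries the variance of both.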
Subjects
Colitis, Ulcerative/diagnosis; Crohn Disease/diagnosis; Pathology, Clinical/standards; Severity of Illness Index; Adolescent; Adult; Aged; Aged, 80 and over; Disease Progression; Female; Humans; Male; Middle Aged; Sensitivity and Specificity
ABSTRACT
BACKGROUND: The aims of this report were to examine how unequal subgroup prevalences in the source population may affect reference interval partitioning decisions and to develop generally applicable guidelines for partitioning gaussian-distributed data. METHODS: We recently proposed a new model for partitioning reference intervals when the underlying data distribution is gaussian. This model is based on controlling the proportions of the subgroup distributions that fall outside each of the common reference limits, using the distances between the reference limits of the subgroup distributions as functions to these proportions. We examine the significance of the unequal prevalence effect for the partitioning problem and quantify it for distance partitioning criteria by deriving analytical expressions to express these criteria as a function of the ratio of prevalences. An application example, illustrating various aspects of the importance of the prevalence effect, is also presented. RESULTS: Dramatic shrinkage of the critical distances between reference limits of the subgroups needed for partitioning was observed as the ratio of prevalences, the larger one divided by the smaller one, was increased from unity. Because of this shrinkage, the same critical distances are not valid for all ratios of prevalences, but specific critical distances should be used for each particular value of this ratio. Although proportion criteria used in determining the need for reference interval partitioning are not dependent on the prevalence effect, this effect should be accounted for when these criteria are being applied by adjusting the sample sizes of the subgroups to make them correspond to the ratio of prevalences. 
CONCLUSIONS: The prevalences of subgroups in the reference population should be known and observed in the calculations for every reference interval study, irrespective of whether distance or proportion criteria are being used to determine the need for reference interval partitioning. We present detailed methods to account for the prevalences when applying each of these types of criteria. Analytical expressions for the distance criteria, to be used when high precision is needed, and approximate distances, to be used in practical work, are derived. General guidelines for partitioning gaussian distributed data are presented. Following these guidelines and using the new model, we suggest that partitioning can be performed more reliably than with any of the earlier models because the new model not only offers an improved correspondence between the critical distances and the critical proportions, but also accounts for the prevalence effect.
Subjects
Clinical Laboratory Techniques/statistics & numerical data; Biometry; Humans; Models, Biological; Normal Distribution; Reference Values; Sample Size
ABSTRACT
The reference interval is probably the most widely used decision-making tool in clinical practice, with a modern use aiming at identifying wellness during health check and screening. Its use as a diagnostic tool is much less recognised and may be obsolete. The present study investigates the consequences of the new practice for the interpretation of prospective value, negative vs. positive, the probability of confirming wellness, and number of false results based on selected strategy for reference interval establishment. Calculations assumed normalised Gaussian-distributed reference intervals with analytical variation set to zero and absolute accuracy. Also assumed is the independence of tests. Probability for no values outside reference intervals in healthy subjects was calculated from the formula $p(\text{none outside}) = (1 - p_{\text{single}})^{n}$ and, according to the formula for repeated testing, $p(\text{one outside}) = n \times p_{\text{single}} \times (1 - p_{\text{single}})^{n-1}$, etc. Here $n$ is the number of tests performed and $p_{\text{single}}$ is the probability of one result outside reference limits, with the general formula $p(i \text{ outside}) = k \times p_{\text{single}}^{i} \times (1 - p_{\text{single}})^{n-i}$, with $k$ being the binomial coefficient and $i$ the number outside the reference intervals. Use of the 99.9 centile for health checks will increase the probability for no false-positive results from 60% to 99% for 10 tests, and from 46% to 98% for 15 tests. The probability for one false-positive result in 10 tests in a panel can be reduced from 32% to 1% if the 99.9% centile is substituted for the 95% centile. For two in 10 tests, the probability can be reduced from 8% to below 0.1%. In both cases, selection of the 99.9% centile improves the diagnostic accuracy. Reference intervals are needed as a "true" negative reference for absence of disease, and should cover the 99.9% centile of the reference distribution of an analyte to avoid false positives.
For this new use, it is critical that reference persons are absolutely normal without clinical, genetic and biochemical signs of the condition being investigated. However, reference intervals cannot substitute clinical decision limits for diagnosis and medical intervention.
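The repeated-testing probabilities quoted in this abstract follow from the binomial distribution and can be reproduced with a few lines of Python. This is a minimal sketch assuming independent tests, as the abstract does; the function name `prob_outside` is hypothetical.

```python
from math import comb


def prob_outside(n_tests, n_outside, p_single):
    """Probability that exactly `n_outside` of `n_tests` independent results
    from a healthy subject fall outside their reference intervals."""
    return (comb(n_tests, n_outside)
            * p_single ** n_outside
            * (1.0 - p_single) ** (n_tests - n_outside))


if __name__ == "__main__":
    # p_single = 0.05 for a 95% interval, 0.001 for a 99.9% interval.
    for label, p in (("95%", 0.05), ("99.9%", 0.001)):
        for n in (10, 15):
            none = prob_outside(n, 0, p)
            print(f"{label} interval, {n} tests: P(no result outside) = {none:.3f}")
        print(f"{label} interval, 10 tests: "
              f"P(one outside) = {prob_outside(10, 1, p):.4f}, "
              f"P(two outside) = {prob_outside(10, 2, p):.5f}")
```

With a 95% interval and 10 tests this gives roughly 0.60 for no result outside, 0.32 for exactly one and 0.07 to 0.08 for exactly two, in line with the percentages reported above.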
Subjects
Clinical Laboratory Techniques/standards; Reference Values; Confidence Intervals; Diagnostic Errors; Humans; Probability; Sample Size; Statistical Distributions
ABSTRACT
The aim of this study was to investigate similarities and differences in the distribution of serum concentrations of nine proteins in two racial groups (Caucasian and Asian Indian) of adult males living in the same geographical area (Leeds, Bradford, UK) for at least two generations. This is part of a larger study to determine the need for separating reference intervals for racial and ethnic groups worldwide. The distributions of concentrations for all proteins evaluated in the Indians fit ln-Gaussian distributions, indicating probable homogeneity. However, for the Caucasians, the distributions for alpha1-antitrypsin and possibly haptoglobin were not ln-Gaussian. In the former case, this is undoubtedly due to the number of Caucasians with lower-concentration phenotypes (Pi MS and MZ). Although haptoglobin differences may be due to genetic variants as well, this is not a complete explanation. In addition, the Indians have lower serum concentrations of orosomucoid (alpha1-acid glycoprotein), as has been reported by others. It is apparent that for some proteins, including alpha1-antitrypsin, orosomucoid, and possibly haptoglobin, the populations show differences that require the use of separate reference intervals. In addition to genetic influences, environmental differences cannot be ruled out as partial causes for some of the differences noted.
Subjects
Blood Proteins/standards; Reference Values; Adult; Age Distribution; Blood Proteins/analysis; Data Interpretation, Statistical; Humans; India; Male; Middle Aged; United Kingdom/ethnology; White People
ABSTRACT
BACKGROUND: The aim of this study was to develop new and useful criteria for partitioning reference values into subgroups applicable to gaussian distributions and to distributions that can be transformed to gaussian distributions. METHODS: The proposed criteria relate to percentages of the subgroups outside each of the reference limits of the combined distribution. Critical values suggested as partitioning criteria for these percentages were derived from analytical bias quality specifications for using common reference intervals throughout a geographic area. As alternative partitioning criteria to the actual percentages, these were transformed mathematically to critical distances between the reference limits of the subgroup distributions, to be applied to each pair of reference limits, the upper and the lower, at a time. The new criteria were tested using data on various plasma proteins collected from approximately 500 reference individuals, and the outcomes were compared with those given by the currently widely applied and recommended partitioning model of Harris and Boyd, the "Harris-Boyd model". RESULTS: We suggest 4.1% as the critical minimum percentage outside that would justify partitioning into subgroups, and 3.2% as the critical maximum percentage outside that would justify combining them. Percentages between these two values should be classified as marginal, implying that nonstatistical considerations are required to make the final decision on partitioning. The correlation between the critical percentages and the critical distances was mathematically precise in the new model, whereas this correlation is rather approximate in the Harris-Boyd model because focus on the difference between means in this model makes high precision hard to achieve. The application examples suggested that the new model is more radical than the Harris-Boyd model. 
CONCLUSIONS: New percentage and distance criteria, to be used for partitioning gaussian-distributed data, have been developed. The distance criteria, applied separately to both reference limit pairs of the subgroup distributions, seemed more reliable and correlated more accurately with the critical percentages than the distance criteria of the Harris-Boyd model. As opposed to the Harris-Boyd model, the new model is easily adjustable to new critical values of the percentages, should they need to be changed in the future.
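The percentage criteria described in this abstract can be sketched as follows. The 4.1% and 3.2% thresholds come from the abstract; everything else is an assumption made for illustration: the subgroup is taken as Gaussian, the common reference limits are supplied by the caller, the decision is based on the larger of the two percentages, and the function names are hypothetical.

```python
from statistics import NormalDist


def percent_outside(sub_mean, sub_sd, lower_limit, upper_limit):
    """Percentages of a Gaussian subgroup below and above the common limits."""
    subgroup = NormalDist(sub_mean, sub_sd)
    below = 100.0 * subgroup.cdf(lower_limit)
    above = 100.0 * (1.0 - subgroup.cdf(upper_limit))
    return below, above


def partition_decision(sub_mean, sub_sd, lower_limit, upper_limit,
                       partition_at=4.1, combine_at=3.2):
    """Classify a subgroup as 'partition', 'combine' or 'marginal'.

    Assumption: the worse (larger) of the two percentages drives the decision.
    """
    worst = max(percent_outside(sub_mean, sub_sd, lower_limit, upper_limit))
    if worst >= partition_at:
        return "partition"
    if worst <= combine_at:
        return "combine"
    return "marginal"


if __name__ == "__main__":
    # Common limits of a standardized combined distribution (illustrative).
    limits = (-1.96, 1.96)
    for shift in (0.0, 0.15, 0.30):
        print(shift, partition_decision(shift, 1.0, *limits))
```

A marginal result is where, according to the abstract, nonstatistical considerations are needed for the final decision.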
Subjects
Clinical Laboratory Techniques/statistics & numerical data; Female; Humans; Male; Middle Aged; Models, Biological; Normal Distribution; Reference Values
ABSTRACT
Repeated samplings and measurements in the monitoring of patients to look for changes are common clinical problems. The "reference change value", calculated as $z_p \times [2 \times (CV_I^2 + CV_A^2)]^{1/2}$, where $z_p$ is the z-statistic and $CV_I$ and $CV_A$ are within-subject and analytical coefficients of variation, respectively, has been used to detect whether a measured difference between measurements is statistically significant. However, a reference change value only detects the probability of false-positives (type I error), and for this reason, a model to calculate the risk of missing significant changes in serial results from individuals (probability of false-negatives) is investigated in this work by means of power functions. Therefore, when an analyte is being monitored in a patient, power functions estimate the probability of detecting a defined real change by measuring the difference. Thus, when a measured difference is the same as the calculated reference change value, then it will be detected in only 50% of situations.
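The reference change value and the power-function idea can be illustrated with a small Python sketch. It assumes a unidirectional (one-sided) test at 5%, Gaussian variation and coefficients of variation given in percent; the function names and the example CVs are hypothetical, and this is not presented as the authors' own model.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def sd_of_difference(cv_within, cv_analytical):
    """SD (in %) of the difference between two serial results."""
    return sqrt(2.0 * (cv_within ** 2 + cv_analytical ** 2))


def reference_change_value(cv_within, cv_analytical, alpha=0.05):
    """One-sided RCV: z_p * [2 * (CV_I^2 + CV_A^2)]^(1/2)."""
    z_p = STD_NORMAL.inv_cdf(1.0 - alpha)
    return z_p * sd_of_difference(cv_within, cv_analytical)


def power(true_change, cv_within, cv_analytical, alpha=0.05):
    """Probability that a measured difference exceeds the RCV,
    given a real change of `true_change` percent."""
    sd_diff = sd_of_difference(cv_within, cv_analytical)
    rcv = reference_change_value(cv_within, cv_analytical, alpha)
    return 1.0 - STD_NORMAL.cdf((rcv - true_change) / sd_diff)


if __name__ == "__main__":
    cv_i, cv_a = 5.0, 2.0  # illustrative CVs in percent
    rcv = reference_change_value(cv_i, cv_a)
    print(f"RCV = {rcv:.1f}%")
    for change in (0.0, rcv, 2.0 * rcv):
        print(f"true change {change:5.1f}% -> power {power(change, cv_i, cv_a):.2f}")
```

A real change equal to the RCV is detected with probability 0.5, matching the 50% figure stated in the abstract, while no real change gives the false-positive rate alpha.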
Subjects
Calcium/blood; Creatinine/blood; Hemoglobin A/analysis; Reference Values; Analysis of Variance; Data Interpretation, Statistical; False Positive Reactions; Humans; Models, Biological; Predictive Value of Tests; Probability
ABSTRACT
A well-known transformation from the bell-shaped Gaussian (normal) curve to a straight line in the rankit plot is investigated, and a tool for evaluation of the distribution of reference groups is presented. It is based on the confidence intervals for percentiles of the calculated Gaussian distribution and the percentage of cumulative points exceeding these limits. The process is to rank the reference values and plot the cumulative frequency points in a rankit plot with a logarithmic (ln = log_e) transformed abscissa. If the distribution is close to ln-Gaussian the cumulative frequency points will fit to the straight line describing the calculated ln-Gaussian distribution. The quality of the fit is evaluated by adding confidence intervals (CI) to each point on the line and calculating the percentage of points outside the hyperbola-like CI-curves. The assumption was that the 95% confidence curves for percentiles would show 5% of points outside these limits. However, computer simulations disclosed that approximately 10% of the series would have 5% or more points outside the limits. This is a conservative validation, which is more demanding than the Kolmogorov-Smirnov test. The graphical presentation, however, makes it easy to disclose deviations from ln-Gaussianity, and to make other interpretations of the distributions, e.g., comparison to non-Gaussian distributions in the same plot, where the cumulative frequency percentage can be read from the ordinate. A long list of examples of ln-Gaussian distributions of subgroups of reference values from healthy individuals is presented. In addition, distributions of values from well-defined diseased individuals may show up as ln-Gaussian. It is evident from the examples that the rankit transformation and simple graphical evaluation for non-Gaussianity is a useful tool for the description of sub-groups.
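The rankit transformation described above can be sketched in Python as the computation of plot coordinates. Several details are assumptions not taken from the abstract: the plotting position (i - 0.5)/n, the function names, and the omission of the confidence curves, whose formula the abstract does not give.

```python
import math
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()


def rankit_points(values):
    """Return (ln(value), rankit) pairs for a rankit plot.

    Values are ranked, each rank is turned into a cumulative frequency
    (assumed plotting position: (i - 0.5) / n) and mapped to a standard
    normal deviate.
    """
    logs = sorted(math.log(v) for v in values)
    n = len(logs)
    return [(x, STD_NORMAL.inv_cdf((i - 0.5) / n))
            for i, x in enumerate(logs, start=1)]


def fitted_line(values):
    """Straight line of the calculated ln-Gaussian distribution:
    maps ln(value) to the expected standard normal deviate."""
    logs = [math.log(v) for v in values]
    centre, spread = mean(logs), stdev(logs)
    return lambda ln_value: (ln_value - centre) / spread


if __name__ == "__main__":
    sample = [3.9, 4.4, 4.6, 4.8, 5.0, 5.1, 5.3, 5.6, 6.0, 6.8]  # made-up data
    line = fitted_line(sample)
    for ln_x, z in rankit_points(sample):
        print(f"ln x = {ln_x:.3f}  rankit = {z:+.2f}  line = {line(ln_x):+.2f}")
```

If the data are close to ln-Gaussian, the rankit of each point stays near the value of the fitted line at the same abscissa.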
Subjects
Reference Values; Statistical Distributions; Blood Chemical Analysis/standards; Confidence Intervals; Data Interpretation, Statistical; Humans; Prediabetic State/diagnosis
ABSTRACT
Reference intervals are recommended for naturally occurring quantities and required in the evaluation of new components in order to provide clinically useful information. The aim of the present study is to present a method for selecting reference individuals for the determination of fasting venous plasma glucose (f-vPG) reference intervals and ways to determine if disease groups can share reference intervals with an ideal reference population. Reference subjects were randomly selected; eligibility was judged according to predetermined inclusion and exclusion criteria. Using the literature we selected risk indicators for diabetes mellitus (DM) and used these indicators to rule out high-risk individuals in order to obtain a reference distribution of f-vPG determined using individuals with low risk of DM. The distribution of f-vPG in the high-risk individuals was compared with that determined for the low-risk group. We then estimated the ability of the high-risk individuals to share the reference interval of the low-risk individuals, and calculated the fraction that was outside this interval. Distributions were also investigated for linearity in the cumulated frequency rankit distribution of ln-values. The allowable difference between two reference limits could not exceed 0.375 times the population biological variation. Most risk indicators were powerful predictors of high f-vPG values. Subgroups with these risk indicators should not be included in the homogeneous ln-normally distributed reference distribution. Distributions of f-vPG concentrations in individuals with risk factors were not homogeneous and varying percentages of individuals were outside the reference distribution, having f-vPG greater than 7.0 mmol/l. We conclude that randomisation is only useful to recruit candidate reference subjects.
To rule out subjects according to clinical risk factors for diabetes, it is necessary to identify a reference population with low risk of exhibiting increased f-vPG concentrations. This method may be used to validate a reference interval for a particular analyte with respect to an investigated disease, and to stratify risk factors of importance.
Subjects
Blood Glucose/analysis; Diabetes Mellitus/blood; Reference Values; Fasting; Humans; Risk; Sampling Studies; Statistical Distributions
ABSTRACT
It has previously been shown that thyroid antibodies affect thyroid stimulating hormone (TSH) concentrations in men and women and that TSH levels are predictive of future thyroid disease. We investigated the validity of the National Academy of Clinical Biochemistry (NACB) guidelines regarding the TSH reference interval by studying 1512 individuals. Two hundred and fifty had at least one thyroid antibody, 121 were taking medications other than estrogens and occasional analgesics, and 105 reported a family history of thyroid disease. Serum TSH, thyroid peroxidase antibodies (TPOab) and thyroglobulin antibodies (Tgab) were determined on AutoDELFIA and TSHRab by a radioreceptor assay (RRA) from Brahms Diagnostica. For individuals without thyroid antibodies and other risk factors, no effect of age and gender was seen for serum TSH. Neither medication nor the presence of Tgab alone had any influence on serum TSH. TPOab alone or in combination with Tgab were associated with an increased serum TSH level. The 'cumulative percentage distributions' of the subgroups, as well as of the combined population, were ln-Gaussian distributed. The central 95% of the population was within the 95% CI in rankit-plots. Consequently, a common reference interval for serum TSH of 0.58-4.07 mIU/l for all adults between 17 and 66 years of age was established. This reference interval is much higher than expected from the NACB-guidelines.