ABSTRACT
ISO 15189 requires laboratories to estimate the uncertainty of their quantitative measurements and to maintain it within relevant performance specifications. Furthermore, it refers to ISO TS 20914 for instructions on how to estimate the uncertainty and what to take into consideration when communicating uncertainty of measurement to requesting clinicians. These instructions include the responsibility of laboratories to verify that bias is not larger than medically significant. If estimated to be larger than acceptable, such bias first needs to be eliminated or (temporarily) corrected for. In the latter case, the uncertainty of such correction becomes part of the estimation of the total measurement uncertainty. If small enough to be acceptable, bias becomes part of the long-term within-laboratory random variation. Possible sources of bias include, but are not limited to, changes of reagent or calibrator lot and calibration itself. In this paper we clarify how the rationale and mathematics from an EFLM WG ISO/A position paper on allowable between-reagent-lot variation can be applied to calculate whether bias can be accepted to become part of long-term imprecision. The central point of this rationale is to prevent the risk that requesting clinicians confuse changes in bias with changes in the steady state of their patients.
Subjects
Bias; Humans; Uncertainty; Calibration; Clinical Laboratory Techniques/standards
ABSTRACT
In this computer simulation study, we examine four statistical approaches to linearity assessment: two variants of deviation from linearity (individual (IDL) and averaged (ADL)) and two variants of residuals of linear regression (individual and averaged). From the results of the simulation, the following broad suggestions are provided to laboratory practitioners performing linearity assessment. High imprecision can challenge linearity investigations by producing a high false positive rate or low power of detection. Therefore, the imprecision of the measurement procedure should be considered when interpreting linearity assessment results, and in the presence of high imprecision the results should be interpreted with caution. The different linearity assessment approaches examined in this study each performed well under different analytical scenarios. For optimal outcomes, a considered and tailored study design should be implemented. With the exception of specific scenarios, both the ADL and IDL methods were comparatively suboptimal for the assessment of linearity. When imprecision is low (3 %), the averaged residual of linear regression with triplicate measurements and a non-linearity acceptance limit of 5 % produces <5 % false positive rates and a high power for detection of non-linearity of >70 % across different types and degrees of non-linearity. Departures from linearity are difficult to detect in practice, and enhanced methods of detection need development.
Subjects
Computer Simulation; Linear Models; Humans
ABSTRACT
OBJECTIVES: We performed an analytical validation of the Mindray high-sensitivity cardiac troponin I (hs-cTnI) assay, addressing limit of blank (LoB), limit of detection (LoD), precision, linearity, analytical specificity and sex-specific 99th percentile upper reference limits (URLs). METHODS: LoB, LoD, precision, linearity and analytical specificity were studied according to Clinical and Laboratory Standards Institute guidelines. We used one reagent lot and one CL1200i analyzer. Skeletal troponin I and T, cardiac troponin T, troponin C, actin, tropomyosin, myosin light chain, myoglobin and creatine kinase (CK-MB) were studied for cross-reactivity. Interference with biotin was examined. Lithium heparin samples (one freeze-thaw cycle) from healthy males and females were measured to determine the 99th percentiles by the non-parametric method. Analyses were performed before and after excluding subjects with clinical conditions and/or increased surrogate biomarkers. RESULTS: The Mindray hs-cTnI assay met the criteria to be considered a hs-cTn assay. LoB and LoD were <0.1 ng/L and 0.1 ng/L, respectively. Repeatability had a coefficient of variation of 1.2-3.8 %, and within-laboratory imprecision was 1.7-5.0 %. The measuring interval ranged from 1.1 to 28,180 ng/L. The analytical specificity was clinically acceptable for the interferents studied. After exclusions, the 99th percentile URLs obtained were 10 ng/L overall, 5 ng/L for females and 12 ng/L for males. CONCLUSIONS: Analytical observations of the Mindray hs-cTnI assay demonstrated excellent LoB, LoD, precision, linearity and analytical specificity, which were in alignment with the manufacturer's claims and regulatory guidelines for hs-cTnI. The assay is suitable for clinical investigation in patient-oriented studies.
Subjects
Limit of Detection; Troponin I; Humans; Troponin I/blood; Troponin I/analysis; Male; Female; Adult; Middle Aged; Blood Chemical Analysis/standards; Blood Chemical Analysis/methods; Reference Values; Reproducibility of Results; Young Adult
ABSTRACT
BACKGROUND: Serial echocardiographic assessments are common in clinical cardiology, e.g., for timing of intervention in mitral and aortic regurgitation. When following patients with serial echocardiograms, each new measurement is a combination of true change and confounding noise. The current investigation compares linear chamber dimensions with volume estimates of chamber size. The aim is to assess which measure is best for serial echocardiograms, where the ideal parameter is sensitive to change in chamber size and has minimal spurious variation (noise). We present a method that disentangles true change from noise. Linear regression of chamber size against elapsed time gives a slope, which represents the ability of the method to detect change. Noise is the scatter of individual points away from the trendline, measured as the standard error of the slope. The higher the signal-to-noise ratio (SNR), the more reliably a parameter will distinguish true change from noise. METHODS: LV and LA parasternal dimensions and apical biplane volumes were obtained from serial clinical echocardiogram reports. Change over time was assessed as the slope of the linear regression line, and noise was assessed as the standard error of the regression slope. Signal-to-noise ratio is the slope divided by its standard error. RESULTS: The median number of LV studies was 5 (4-11) over a mean duration of 5.9 ± 3.0 years in 561 patients (diastole) and 386 patients (systole). The median number of LA studies was 5 (4-11) over a mean duration of 5.3 ± 2.0 years in 137 patients. Linear estimates of LV size had better signal-to-noise than volume estimates (p < 0.001 for diastolic and p = 0.035 for systolic). For the left atrium, the difference was not significant (p = 0.214). This may be due to sample size; the effect size was similar to that for LV systolic size. All three parameters had a numerical value of signal-to-noise that favoured linear dimensions over volumes.
CONCLUSION: Linear measures of LV size have better signal-to-noise than volume measures. There was no difference in signal-to-noise between linear and volume measures of LA size, although this may be a Type II error. The use of regression lines may be better than relying on single measurements. Linear dimensions may clarify whether changes in volumes are real or spurious.
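The signal-to-noise calculation described in this abstract (regression slope divided by its standard error) can be sketched as follows. This is a minimal illustration using an ordinary least-squares fit, not the authors' code; the function name `slope_snr` and the example measurements are hypothetical.

```python
import math

def slope_snr(times, sizes):
    """Fit size = intercept + slope * time by ordinary least squares.

    Returns (slope, standard error of the slope, slope / standard error).
    """
    n = len(times)
    if n != len(sizes) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_t = sum(times) / n
    mean_y = sum(sizes) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all time points are identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, sizes))
    slope = sxy / sxx                      # signal: change per unit time
    intercept = mean_y - slope * mean_t
    # Residual sum of squares: scatter of points around the trendline.
    sse = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(times, sizes))
    se = math.sqrt(sse / (n - 2) / sxx)    # noise: standard error of the slope
    snr = slope / se if se > 0 else math.copysign(math.inf, slope)
    return slope, se, snr

# Hypothetical example: one patient's LV dimension (mm) at yearly studies.
years = [0, 1, 2, 3, 4]
lv_mm = [50.0, 51.2, 51.8, 53.1, 54.0]
slope, se, snr = slope_snr(years, lv_mm)
print(f"slope={slope:.2f} mm/year, SE={se:.3f}, SNR={snr:.1f}")
```

A larger absolute ratio means the trend stands out more clearly from measurement scatter, which is the sense in which the abstract compares linear and volume measures.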
Subjects
Atrial Appendage; Heart Ventricles; Humans; Heart Ventricles/diagnostic imaging; Heart Atria/diagnostic imaging; Echocardiography/methods; Ventricular Function, Left; Stroke Volume
ABSTRACT
OBJECTIVES: In addition to the large bias of some measurement systems for serum cystatin C (CysC), unacceptable imprecision has been observed for heterogeneous systems. This study analyzed external quality assessment (EQA) results from 2018-2021 to provide insight into the imprecision of CysC assays. METHODS: Five EQA samples were sent to participating laboratories every year. Participants were divided into reagent/calibrator-based peer groups, for which the robust mean of each sample and the robust coefficient of variation (CV) were calculated by Algorithm A from ISO 13528. Peers with more than 12 participants per year were selected for further analysis. The limit of CV was determined to be 4.85% based on clinical application requirements. The concentration-related effect on CVs was investigated using logarithmic curve fitting; the difference in medians and robust CVs between instrument-based subgroups was also evaluated. RESULTS: The total number of participating laboratories increased from 845 to 1,695 in four years, and heterogeneous systems remained the mainstream (≥85%). Of 18 peers with ≥12 participants, those using homogeneous systems showed relatively steady and small CVs over four years, with mean four-year CVs ranging from 3.21 to 3.68%. Some peers using heterogeneous systems showed reduced CVs over four years, while 7/15 still had unacceptable CVs in 2021 (5.01-8.34%). Six peers showed larger CVs at low or high concentrations, and some instrument-based subgroups presented greater imprecision than others. CONCLUSIONS: More effort should be made to reduce the imprecision of heterogeneous systems for CysC measurement.
Subjects
Cystatin C; Humans; Kidney Function Tests
ABSTRACT
Lot-to-lot verification is an integral component of monitoring the long-term stability of a measurement procedure. The practice is challenged by its resource requirements as well as by uncertainty about which experimental design and statistical analysis are optimal for individual laboratories, although guidance is becoming increasingly available. Collaborative verification efforts as well as the application of patient-based monitoring are likely to further improve identification of any differences in performance in a relatively timely manner. Appropriate follow-up action after a failed lot-to-lot verification is required and must balance potential disruptions to clinical services provided by the laboratory. Manufacturers need to increase transparency surrounding release criteria and work more closely with laboratory professionals to ensure acceptable reagent lots are released to end users. A tripartite collaboration between regulatory bodies, manufacturers, and laboratory medicine professional bodies is key to developing a balanced system where regulatory, manufacturing, and clinical requirements of laboratory testing are met, to minimize differences between reagent lots and ensure patient safety. Clinical Chemistry and Laboratory Medicine has served as a fertile platform for advancing the discussion and practice of lot-to-lot verification in the past 60 years and will continue to be an advocate of this important topic for many more years to come.
Subjects
Chemistry, Clinical; Reagent Kits, Diagnostic; Humans; Quality Control; Laboratories
ABSTRACT
BACKGROUND: In human and veterinary medicine, calprotectin is most widely used in diagnosing different gastrointestinal diseases. The aim of this study was to assess the stability of canine calprotectin (cCP) in serum after storage at low temperatures and the imprecision of the method. METHODS: Blood samples were collected from dogs with different clinical diagnoses. Twenty-two dogs were included in this study. Calprotectin concentration was measured 4 hours after serum separation (T0), and after being frozen at -80 °C for 8 (T1) and 16 weeks (T2). The maximum permissible difference (MPD) was derived from the equation for calculating total error (TE), TE = %Bias + (1.96 × %CV), where bias and coefficient of variation (CV) were defined by the manufacturer. The dogs enrolled in this study were patients admitted during the morning (9-12 a.m.), on the day the first measurement was performed. All sample analyses for determination of stability were done in duplicate. For determination of within-run precision, the two patients' serum samples were analyzed in 20 replicates. Imprecision was assessed by analyzing 20 replicates on one plate on two samples where high and low concentrations were anticipated. RESULTS: The calculated value of MPD was 32.52%. Median calprotectin concentrations were higher at T1 114.08 µg/L (IQR = 55.05-254.56) and T2 133.6 µg/L (IQR = 100.57-332.98) than at T0 83.60 µg/L (IQR = 50.38-176.07). Relative and absolute bias at T1 (49.3%; 45.98 µg/L) and T2 (109.93%; 94.09 µg/L) have shown that calprotectin concentrations increase after long-term storage at -80 °C. CONCLUSION: The results of the present study indicate that cCP was not stable for 16 weeks at low storage temperature (-80 °C). Considering the observed change in the concentration of cCP at T1, a storage time of 8 weeks should be safely applied. The method imprecision was not satisfactory, especially in the lower concentration range.
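The total error equation quoted in this abstract, TE = %Bias + (1.96 × %CV), can be written as a short function. The sketch below is illustrative only: the abstract does not report the manufacturer's bias and CV, so the input values are hypothetical and are not expected to reproduce the reported MPD of 32.52%.

```python
def total_error(bias_pct, cv_pct, z=1.96):
    """Total error in percent: TE = %Bias + z * %CV (z = 1.96 as in the abstract)."""
    return abs(bias_pct) + z * cv_pct

# Hypothetical manufacturer claims (NOT taken from the study).
example_bias_pct = 10.0
example_cv_pct = 5.0
mpd = total_error(example_bias_pct, example_cv_pct)
print(f"Maximum permissible difference: {mpd:.2f}%")
```

Taking the absolute value of the bias is a choice made here so that a negative bias still widens the limit; the abstract itself does not say how the sign was handled.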
Subjects
Leukocyte L1 Antigen Complex; Serum; Humans; Dogs; Animals; Temperature; Leukocyte L1 Antigen Complex/analysis; Freezing; Serum/chemistry
ABSTRACT
BACKGROUND: It is crucial to improve the accuracy of HbA1c measurement, given its essential role in diabetes diagnosis and treatment. We aimed to establish biological variation (BV) and sigma metrics (SM) models and apply them to evaluate the analytical performance of HbA1c in an external quality assessment (EQA) program. METHODS: Data from the HbA1c EQA (2021) and internal quality control (IQC) (March-August 2021) were collected. The group-specific bias and coefficient of variation (CV) were computed for measuring systems used by more than 9 laboratories in the EQA program. The analytical bias and CV for individual laboratories were estimated from EQA and IQC data. The CV% and bias% were plotted in the BV-SM models for performance evaluation of measuring systems and individual laboratories. RESULTS: In total, 380 laboratories participated in the EQA program. The overall inter-laboratory CV of five EQA samples ranged from 3.02% to 3.63%. Five measuring systems met the minimum performance for 5/5 samples: Arkray, Primus, Roche, Mindray and Tosoh, but none of them achieved the optimum performance. Half of the 196 laboratories that reported IQC and EQA results simultaneously achieved 3σ and minimum performance limits. Further analysis indicated that 88.8% and 31.6% of the laboratories met the minimum performance for bias and CV, respectively. CONCLUSIONS: Biological variation and sigma metrics are appropriate quality management models for evaluating the performance of HbA1c in an EQA program. Intra-laboratory and inter-laboratory imprecision need to be improved in order to achieve the required analytical goals for diabetes diagnosis.
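For readers unfamiliar with sigma metrics, the commonly used form is sigma = (TEa − |bias|) / CV, with all terms in percent. The sketch below assumes that form; the abstract does not state the exact formula or the allowable total error (TEa) the authors used, so the function name and all numbers are hypothetical.

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma metric from allowable total error, bias and CV (all in percent)."""
    if cv_pct <= 0:
        raise ValueError("CV must be positive")
    return (tea_pct - abs(bias_pct)) / cv_pct

# Hypothetical laboratory: TEa 6%, bias 1%, CV 1.5% (illustrative values only).
sigma = sigma_metric(6.0, 1.0, 1.5)
print(f"sigma = {sigma:.2f}, meets 3-sigma: {sigma >= 3}")
```

Because CV is the denominator, a laboratory with small bias can still fall below 3σ when imprecision is high, which matches the abstract's finding that far fewer laboratories met the CV criterion than the bias criterion.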
Subjects
Diabetes Mellitus; Total Quality Management; Diabetes Mellitus/diagnosis; Glycated Hemoglobin/analysis; Humans; Laboratories; Quality Control
ABSTRACT
BACKGROUND: The introduction of new medical technologies such as sensors has accelerated the process of collecting patient data for relevant clinical decisions, which has led to the introduction of a new technology known as digital biomarkers. OBJECTIVE: This study aims to assess the methodological quality and quality of evidence from meta-analyses of digital biomarker-based interventions. METHODS: This study follows the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guideline for reporting systematic reviews, including original English publications of systematic reviews reporting meta-analyses of clinical outcomes (efficacy and safety endpoints) of digital biomarker-based interventions compared with alternative interventions without digital biomarkers. Imaging or other technologies that do not measure objective physiological or behavioral data were excluded from this study. A literature search of PubMed and the Cochrane Library was conducted, limited to 2019-2020. The quality of the methodology and evidence synthesis of the meta-analyses were assessed using AMSTAR-2 (A Measurement Tool to Assess Systematic Reviews 2) and GRADE (Grading of Recommendations, Assessment, Development, and Evaluations), respectively. This study was funded by the National Research, Development and Innovation Fund of Hungary. RESULTS: A total of 25 studies with 91 reported outcomes were included in the final analysis; 1 (4%), 1 (4%), and 23 (92%) studies had high, low, and critically low methodologic quality, respectively. As many as 6 clinical outcomes (7%) had high-quality evidence and 80 outcomes (88%) had moderate-quality evidence; 5 outcomes (5%) were rated with a low level of certainty, mainly due to risk of bias (85/91, 93%), inconsistency (27/91, 30%), and imprecision (27/91, 30%). 
There is high-quality evidence of improvements in mortality, transplant risk, cardiac arrhythmia detection, and stroke incidence with cardiac devices, albeit with low reporting quality. High-quality reviews of pedometers reported moderate-quality evidence, including effects on physical activity and BMI. No reports with high-quality evidence and high methodological quality were found. CONCLUSIONS: Researchers in this field should consider the AMSTAR-2 criteria and GRADE to produce high-quality studies in the future. In addition, patients, clinicians, and policymakers are advised to consider the results of this study before making clinical decisions regarding digital biomarkers to be informed of the degree of certainty of the various interventions investigated in this study. The results of this study should be considered with its limitations, such as the narrow time frame. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR2-10.2196/28204.
Subjects
Biomarkers; Technology; Humans; Bias; Hungary; Systematic Reviews as Topic
ABSTRACT
BACKGROUND: We aimed to reduce the effects of the storage-and-thawing process on commercial control materials, based on an evaluation of their interchangeability. METHODS: Seven assays (anti-streptolysin O, complement 3, carcinoembryonic antigen, urea, ferritin, total bilirubin, and glucose) were selected. Commercial control materials and serum samples with similar concentrations were chosen as specimens. The experiment was carried out in three stages. In the first stage, the assays with statistical differences in imprecision were screened. In the second stage, the two specimens were sealed with parafilm, frozen at -80 °C and thawed in a water bath, and the imprecision differences were compared again. Finally, the effective means of reducing the effects were included in the standard operating procedure for repeat confirmation. RESULTS: In the first stage, a statistical difference (p < 0.05) in imprecision between the two specimens was found only for glucose and total bilirubin, and the imprecision of the control materials was higher than that of the serum samples. In the second stage, glucose imprecision was not statistically different (p > 0.05) and was lower than in the first stage. In the third stage, the methods from the second stage were confirmed to be effective at reducing control material effects. CONCLUSION: Finding the factors that cause variation, then confirming and standardizing the corrective measures, will help lessen commercial control material effects.
Subjects
Biological Assay/methods; Serum/metabolism; Bilirubin/blood; Humans; Quality Control
ABSTRACT
BACKGROUND: We aimed to evaluate the analytical performance of five commercial RT-PCR kits (Genekey, Daan, BioGerm, Liferiver, and Yaneng) commonly used in China, since such comparison data are lacking. METHODS: A total of 20 COVID-19 confirmed patients and 30 negative nasopharyngeal swab specimens were analyzed by five kits. The detection ability of the five RT-PCR kits was evaluated with 5 concentration gradients diluted from a single positive sample. The limit of detection was evaluated with an N gene fragment solid standard. Two positive clinical specimens were used to evaluate the repeatability and imprecision. Finally, we used six human coronavirus plasmids and four respiratory pathogen plasmids to check for cross-reactivity. RESULTS: The positive detection rate was 100% for Genekey, Daan, and BioGerm, and 90% for Liferiver and Yaneng in 20 clinical SARS-CoV-2 infections. The coincidence rate of the five kits in 10 negative samples was 100%. The detection rate of target genes for Daan, BioGerm, Liferiver, and Yaneng was 100% from Level 1 to Level 3. In Level 4, only the Daan detection rate was 100%. In Level 5, all five kits showed a poor positive rate. The limit of detection declared by each manufacturer was verified. The repeatability for target genes was less than 5%, as was the total imprecision. There was no cross-reactivity of the five kits with six human coronaviruses and four respiratory pathogens for the ORF1ab and N genes. CONCLUSIONS: The five RT-PCR kits assessed in this study showed acceptable analytical performance characteristics and are useful tools for the routine diagnosis of SARS-CoV-2.
Subjects
COVID-19 Testing/methods; Reverse Transcriptase Polymerase Chain Reaction/methods; SARS-CoV-2/genetics; Humans; Limit of Detection; Nasopharynx/virology; Polyproteins/genetics; Reproducibility of Results; Viral Proteins/genetics
ABSTRACT
The purpose of this study was to identify aspects of impaired tongue motor performance that limit the ability to produce distinct speech sounds and contribute to reduced speech intelligibility in individuals with dysarthria secondary to amyotrophic lateral sclerosis (ALS). We analyzed simultaneously recorded tongue kinematic and acoustic data from 22 subjects during three target words (cat, dog, and took). The subjects included 11 participants with ALS and 11 healthy controls from the X-ray microbeam dysarthria database (Westbury, 1994). Novel measures were derived based on the range and speed of relative movement between two quasi-independent regions of the tongue - blade and dorsum - to characterize the global pattern of tongue dynamics. These "whole tongue" measures, along with the range and speed of single tongue regions, were compared across words, groups (ALS vs. control), and measure types (whole tongue vs. tongue blade vs. tongue dorsum). Reduced range and speed of both global and regional tongue movements were found in participants with ALS relative to healthy controls, reflecting impaired tongue motor performance in ALS. The extent of impairment, however, varied across words and measure types. Compared with the regional tongue measures, the whole tongue measures showed more consistent disease-related changes across the target words and were more robust predictors of speech intelligibility. Furthermore, these whole tongue measures were correlated with various word-specific acoustic features associated with intelligibility decline in ALS, suggesting that impaired tongue movement likely contributes to reduced phonetic distinctiveness of both vowels and consonants that underlie speech intelligibility decline in ALS.
Subjects
Amyotrophic Lateral Sclerosis; Speech Intelligibility; Acoustics; Amyotrophic Lateral Sclerosis/complications; Dysarthria/etiology; Humans; Movement; Speech Acoustics; Speech Production Measurement; Tongue
ABSTRACT
Plural definite descriptions across many languages display two well-known properties. First, they can give rise to so-called non-maximal readings, in the sense that they 'allow for exceptions' (Mary read the books on the reading list, in some contexts, can be judged true even if Mary didn't read all the books on the reading list). Second, while they tend to have a quasi-universal quantificational force in affirmative sentences ('quasi-universal' rather than simply 'universal' due to the possibility of exceptions we have just mentioned), they tend to be interpreted existentially in the scope of negation (a property often referred to as homogeneity, cf. Löbner in Linguist Philos 23:213-308, 2000). Building on previous works (in particular Krifka in Proceedings of SALT VI, Cornell University, pp 136-153, 1996 and Malamud in Semant Pragmat, 5:1-28, 2012), we offer a theory in which sentences containing plural definite expressions trigger a family of possible interpretations, and where general principles of language use account for their interpretation in various contexts and syntactic environments. Our theory solves a number of problems that these previous works encounter, and has broader empirical coverage in that it offers a precise analysis for sentences that display complex interactions between plural definites, quantifiers and bound variables, as well as for cases involving non-distributive predicates. The resulting proposal is briefly compared with an alternative proposal by Kriz (Aspects of homogeneity in the semantics of natural language, University of Vienna, 2015), which has similar coverage but is based on a very different architecture and sometimes makes subtly different predictions.
ABSTRACT
We question the reliability of the vague symptoms that most commonly define catheter-associated urinary tract infection (CAUTI) and encourage further examination of whether the current CAUTI definition reflects a true infection. While diagnosing CAUTI using the current surveillance definition, physicians may be missing a number of nonurinary etiologies for fever, prematurely diagnosing urinary tract infection, and prescribing unnecessary antibiotics. We believe it is time to reconsider the quality metric of CAUTI. By doing so, we can improve antibiotic use and quality of patient care.
Subjects
Catheter-Related Infections; Cross Infection; Urinary Tract Infections; Catheter-Related Infections/diagnosis; Catheter-Related Infections/drug therapy; Catheters; Diagnostic Tests, Routine; Humans; Reproducibility of Results; Urinary Tract Infections/diagnosis; Urinary Tract Infections/drug therapy
ABSTRACT
BACKGROUND: With increased interest in lipoprotein(a) (Lp[a]) concentration as a target for risk reduction and growing clinical evidence of its impact on cardiovascular disease (CVD) risk, rigorous analytical performance specifications (APS) and accuracy targets for Lp(a) are required. We investigated the biological variation (BV) of Lp(a), and 2 other major biomarkers of CVD, apolipoprotein A-I (apoA-I) and apolipoprotein B-100 (apoB), in the European Biological Variation Study population. METHOD: Serum samples were drawn from 91 healthy individuals for 10 consecutive weeks at 6 European laboratories and analyzed in duplicate on a Roche Cobas 8000 c702. Outlier, homogeneity, and trend analysis were performed, followed by CV-ANOVA to determine BV estimates and their 95% CIs. These estimates were used to calculate APS and reference change values. For Lp(a), BV estimates were determined on normalized concentration quintiles. RESULTS: Within-subject BV estimates were significantly different between sexes for Lp(a) and between women aged <50 and >50 years for apoA-I and apoB. Lp(a) APS was constant across concentration quintiles and, overall, lower than APS based on currently published data, whereas results were similar for apoA-I and apoB. CONCLUSION: Using a fully Biological Variation Data Critical Appraisal Checklist (BIVAC)-compliant protocol, our study data confirm BV estimates of Lp(a) listed in the European Federation of Clinical Chemistry and Laboratory Medicine database and reinforce concerns expressed in recent articles regarding the suitability of older APS recommendations for Lp(a) measurements. Given the heterogeneity of Lp(a), more BIVAC-compliant studies on large numbers of individuals of different ethnic groups would be desirable.
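The reference change values mentioned in this abstract are conventionally derived from the analytical CV and the within-subject biological CV as RCV = √2 × Z × √(CVa² + CVi²). The sketch below assumes that conventional symmetric form with Z = 1.96 (two-sided 95%); the abstract does not state which variant the authors applied, and the input CVs are hypothetical.

```python
import math

def reference_change_value(cv_analytical_pct, cv_within_subject_pct, z=1.96):
    """Symmetric reference change value in percent.

    RCV = sqrt(2) * z * sqrt(CVa^2 + CVi^2)
    """
    combined = math.sqrt(cv_analytical_pct ** 2 + cv_within_subject_pct ** 2)
    return math.sqrt(2) * z * combined

# Hypothetical values: analytical CV 3%, within-subject CV 4% (not study data).
rcv = reference_change_value(3.0, 4.0)
print(f"RCV = {rcv:.1f}%")
```

A difference between two serial results smaller than the RCV cannot be distinguished from combined analytical and biological variation at the chosen confidence level.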
Subjects
Apolipoprotein A-I/blood; Apolipoprotein B-100/blood; Biological Variation, Individual; Lipoprotein(a)/blood; Adult; Aged; Apolipoprotein A-I/standards; Apolipoprotein B-100/standards; Female; Humans; Lipoprotein(a)/standards; Male; Middle Aged; Reference Values; Young Adult
ABSTRACT
Medical laboratories are required to ensure the quality of their diagnostic results. Quality assurance procedures include quality assessments (internal and external), quality controls (negative, positive, or internal controls), equipment monitoring, and audits. Quality control data may be used to evaluate the uncertainty of measurement. All clinical virology laboratories require a standard operating procedure detailing their consideration of uncertainty of measurement, as this parameter may impact on the overall quality of diagnostic results as well as the clinical interpretation thereof. This review aims to provide a simplified approach to the concept of uncertainty of measurement, specific for clinical virology laboratories.
Subjects
Clinical Laboratory Techniques/methods; Diagnostic Tests, Routine/methods; Virus Diseases/diagnosis; Clinical Laboratory Techniques/standards; Diagnostic Tests, Routine/standards; Humans; Quality Assurance, Health Care; Quality Control; Reproducibility of Results
ABSTRACT
Choice-based stated preference methods, such as time trade-offs (TTOs), are used to establish health state utilities informing healthcare allocation. However, little is known about the presence of (position-dependent and precedent-dependent) sequence effects in the valuation of health states, despite techniques requiring respondents to evaluate several health states in a sequence. This paper is the first to explicitly test for the presence of sequence effects in the health domain using a new explanation based on contrast effects and preference imprecision; the implication being that randomisation cannot avoid sequence effects. Six TTO questions were designed using the EQ-5D-3L descriptive system. These were grouped into two blocks of three and within each block four sequences were used. In an online survey, 1,197 Spanish respondents answered one grouping of three TTO questions. Results indicate that sequence effects can affect preferences as utilities of health states are biased downwards if preceded by a better health state and biased upwards if preceded by a worse health state. This study informs our understanding of how context effects interact with preference elicitation methods, which is essential for interpreting survey results used to inform policy.