Search | Secretaria de Estado da Saúde

1.

Reliability of single-lead electrocardiogram interpretation to detect atrial fibrillation: insights from the SAFER feasibility study.

Hibbitt, Katie; Brimicombe, James; Cowie, Martin R; Dymond, Andrew; Freedman, Ben; Griffin, Simon J; Hobbs, F D Richard; Lindén, Hannah Clair; Lip, Gregory Y H; Mant, Jonathan; McManus, Richard J; Pandiaraja, Madhumitha; Williams, Kate; Charlton, Peter H.

Europace ; 26(7)2024 Jul 02.

Article in English | MEDLINE | ID: mdl-38941497

ABSTRACT

AIMS: Single-lead electrocardiograms (ECGs) can be recorded using widely available devices such as smartwatches and handheld ECG recorders. Such devices have been approved for atrial fibrillation (AF) detection. However, little evidence exists on the reliability of single-lead ECG interpretation. We aimed to assess the level of agreement on detection of AF by independent cardiologists interpreting single-lead ECGs and to identify factors influencing agreement. METHODS AND RESULTS: In a population-based AF screening study, adults aged ≥65 years old recorded four single-lead ECGs per day for 1-4 weeks using a handheld ECG recorder. Electrocardiograms showing signs of possible AF were identified by a nurse, aided by an automated algorithm. These were reviewed by two independent cardiologists who assigned participant- and ECG-level diagnoses. Inter-rater reliability of AF diagnosis was calculated using linear weighted Cohen's kappa (κw). Out of 2141 participants and 162 515 ECGs, only 1843 ECGs from 185 participants were reviewed by both cardiologists. Agreement was moderate: κw = 0.48 (95% confidence interval, 0.37-0.58) at participant level and κw = 0.58 (0.53-0.62) at ECG level. At participant level, agreement was associated with the number of adequate-quality ECGs recorded, with higher agreement in participants who recorded at least 67 adequate-quality ECGs. At ECG level, agreement was associated with ECG quality and whether ECGs exhibited algorithm-identified possible AF. CONCLUSION: Inter-rater reliability of AF diagnosis from single-lead ECGs was found to be moderate in older adults. Strategies to improve reliability might include participant and cardiologist training and designing AF detection programmes to obtain sufficient ECGs for reliable diagnoses.

Subjects

Algorithms , Atrial Fibrillation , Electrocardiography , Feasibility Studies , Observer Variation , Humans , Atrial Fibrillation/diagnosis , Atrial Fibrillation/physiopathology , Aged , Reproducibility of Results , Female , Male , Electrocardiography/instrumentation , Electrocardiography/methods , Predictive Value of Tests , Aged, 80 and over , Signal Processing, Computer-Assisted , Heart Rate
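Note on the statistic in record 1: the abstract reports a linear weighted Cohen's kappa (κw) without showing how it is computed. The Python sketch below is one minimal way to calculate it for two raters, assuming ordinal diagnoses coded as integers 0..k-1. The function name and the toy ratings are illustrative assumptions, not the study's code or data.

```python
from collections import Counter


def linear_weighted_kappa(rater_a, rater_b, n_categories):
    """Linear weighted Cohen's kappa for two raters.

    Ratings are integers in 0..n_categories-1 on an ordinal scale.
    Disagreement between categories i and j is weighted |i - j| / (k - 1).
    """
    if n_categories < 2:
        raise ValueError("need at least two categories")
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("rating lists must be non-empty and of equal length")

    n = len(rater_a)
    k = n_categories
    joint = Counter(zip(rater_a, rater_b))  # observed pair counts
    marg_a = Counter(rater_a)               # marginal counts, rater A
    marg_b = Counter(rater_b)               # marginal counts, rater B

    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed += weight * joint[(i, j)] / n
            expected += weight * (marg_a[i] / n) * (marg_b[j] / n)

    if expected == 0.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - observed / expected


# Hypothetical coding: 0 = no AF, 1 = uncertain, 2 = AF
cardiologist_1 = [0, 0, 1, 1, 2, 2]
cardiologist_2 = [0, 1, 1, 2, 2, 2]
kappa_w = linear_weighted_kappa(cardiologist_1, cardiologist_2, 3)
```

Because the weights grow with the distance between categories, a disagreement between "no AF" and "AF" counts twice as much as one between "uncertain" and either neighbour.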

2.

Inter-Rater and Intra-Rater Agreement in Scoring Severity of Rodent Cardiomyopathy and Relation to Artificial Intelligence-Based Scoring.

Steinbach, Thomas J; Tokarz, Debra A; Co, Caroll A; Harris, Shawn F; McBride, Sandra J; Shockley, Keith R; Lokhande, Avinash; Srivastava, Gargi; Ugalmugle, Rajesh; Kazi, Arshad; Singletary, Emily; Cesta, Mark F; Thomas, Heath C; Chen, Vivian S; Hobbie, Kristen; Crabbs, Torrie A.

Toxicol Pathol ; 52(5): 258-265, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38907685

ABSTRACT

We previously developed a computer-assisted image analysis algorithm to detect and quantify the microscopic features of rodent progressive cardiomyopathy (PCM) in rat heart histologic sections and validated the results with a panel of five veterinary toxicologic pathologists using a multinomial logistic model. In this study, we assessed both the inter-rater and intra-rater agreement of the pathologists and compared pathologists' ratings to the artificial intelligence (AI)-predicted scores. Pathologists and the AI algorithm were presented with 500 slides of rodent heart. They quantified the amount of cardiomyopathy in each slide. A total of 200 of these slides were novel to this study, whereas 100 slides were intentionally selected for repetition from the previous study. After a washout period of more than six months, the repeated slides were examined to assess intra-rater agreement among pathologists. We found the intra-rater agreement to be substantial, with weighted Cohen's kappa values ranging from k = 0.64 to 0.80. Intra-rater variability is not a concern for the deterministic AI. The inter-rater agreement across pathologists was moderate (Cohen's kappa k = 0.56). These results demonstrate the utility of AI algorithms as a tool for pathologists to increase sensitivity and specificity for the histopathologic assessment of the heart in toxicology studies.

Subjects

Artificial Intelligence , Cardiomyopathies , Observer Variation , Animals , Cardiomyopathies/pathology , Rats , Algorithms , Myocardium/pathology , Image Processing, Computer-Assisted/methods , Pathologists , Reproducibility of Results

3.

The value of a multidisciplinary consensus meeting in achieving agreement on eating disorders diagnosis at a specialized referral center.

Charrat, Jean Philippe; Massoubre, Catherine; Gay, Aurelia; Ravey, Baptiste; Germain, Natacha; Galusca, Bogdan.

Int J Eat Disord ; 57(2): 463-469, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38135878

ABSTRACT

OBJECTIVE: This study aimed to evaluate the concordance of eating disorders (EDs) diagnoses within a multidisciplinary team in a specialized hospital unit dedicated to the medical care of ED. METHODS: The study analyzed data from 608 female patients who sought consultation at the Eating Disorders Referral Center between 2017 and 2021. The diagnoses were established according to the DSM-5 criteria by endocrinologists, psychiatrists, and finally confirmed or discussed within a monthly multidisciplinary consensus meeting (MCM). Fleiss' Kappa tests were conducted to assess inter-raters' agreement. RESULTS: Overall, substantial agreement was observed between endocrinologists and psychiatrists and the MCM. A more detailed analysis revealed variations in agreement across different disorders. Certain EDs demonstrated substantial agreement (e.g., anorexia nervosa restrictive subtype), while others approached near-perfect agreement (e.g., binge-eating disorder). In contrast, agreement was fair to poor for anorexia nervosa binge-purge subtype (ANBP) and slight for other specified feeding and ED. A period of temporary disagreement was noted for ANBP, partially attributed to practitioner turnover. An improvement in interdisciplinary agreement was observed for all ED diagnoses by the end of the study period. DISCUSSION: Variations or lower levels of inter-rater agreement may stem from atypical cases that fall on the border between two diagnoses or complex cases, as well as fluctuating symptoms. The progress observed throughout the study can be attributed in part to interdisciplinary learning, particularly facilitated by the MCM. The findings underscore the significance of striving for optimal concordance among different medical specialties to enhance patient care in ED treatment. PUBLIC SIGNIFICANCE: This study scrutinizes the agreement levels of ED diagnoses among endocrinologists and psychiatrists within a multidisciplinary team at an Eating Disorders Referral Center. 
While substantial overall agreement was achieved, disparities or lower agreement levels were evident for certain diagnoses such as anorexia nervosa binge-purge subtype. However, collaborative meetings led to a progressive enhancement in agreement over time. This research underscores the crucial role of a multidisciplinary team working collectively to ensure precise diagnoses and improved care for patients with EDs.

Subjects

Anorexia Nervosa , Binge-Eating Disorder , Bulimia Nervosa , Feeding and Eating Disorders , Humans , Female , Consensus , Feeding and Eating Disorders/diagnosis , Feeding and Eating Disorders/therapy , Anorexia Nervosa/diagnosis , Binge-Eating Disorder/diagnosis , Referral and Consultation , Diagnostic and Statistical Manual of Mental Disorders , Bulimia Nervosa/diagnosis

4.

Comparison of the patient-derived modified Japanese Orthopaedic Association scale and the European myelopathy score.

de Dios, Eddie; Löfgren, Håkan; Laesser, Mats; Lindhagen, Lars; Björkman-Burtscher, Isabella M; MacDowall, Anna.

Eur Spine J ; 33(3): 1205-1212, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38112768

ABSTRACT

PURPOSE: To compare the patient-derived modified Japanese Orthopaedic Association (P-mJOA) scale with the European myelopathy score (EMS) for the assessment of patients with degenerative cervical myelopathy (DCM). METHODS: In this register-based cohort study with prospectively collected data, included patients were surgically treated for DCM and had reported both P-mJOA and EMS scores at baseline, 1-year follow-up, and/or 2-year follow-up to the Swedish Spine Register. P-mJOA and EMS scores were defined as severe (P-mJOA 0-11 and EMS 5-8), moderate (P-mJOA 12-14 and EMS 9-12), or mild (P-mJOA 15-18 and EMS 13-18). P-mJOA and EMS mean scores were compared, and agreement was evaluated with Spearman's rank correlation coefficient (ρ), the intraclass correlation coefficient (ICC), and kappa (κ) statistics. RESULTS: Included patients (n = 714, mean age 63.2 years, 42.2% female) completed 937 pairs of the P-mJOA and the EMS. The mean P-mJOA and EMS scores were 13.9 ± 3.0 and 14.5 ± 2.7, respectively (mean difference -0.61 [95% CI -0.72 to -0.51; p < 0.001]). Spearman's ρ was 0.84 (p < 0.001), and intra-rater agreement measured with ICC was 0.83 (p < 0.001). Agreement of severity level measured with unweighted and weighted κ was fair (κ = 0.22 [p < 0.001]; κ = 0.34 [p < 0.001], respectively). Severity levels were significantly higher using the P-mJOA (p < 0.001). CONCLUSION: The P-mJOA and the EMS had similar mean scores, and intra-rater agreement was high, whereas severity levels only demonstrated fair agreement. The EMS has a lower sensitivity for detecting severe myelopathy but shows an increasing agreement with the P-mJOA for milder disease severity. A larger interval to define severe myelopathy with the EMS is recommended.

Subjects

Orthopedics , Spinal Cord Diseases , Humans , Female , Middle Aged , Male , Cohort Studies , Treatment Outcome , Japan , Prospective Studies , Cervical Vertebrae/surgery , Severity of Illness Index , Spinal Cord Diseases/diagnosis , Spinal Cord Diseases/surgery

5.

German Parents and Educators of Two to Four-Year-Old Children as Informants for the Strengths and Difficulties Questionnaire (SDQ).

Dubiel, Simone; Cohen, Franziska; Anders, Yvonne.

Child Psychiatry Hum Dev ; 2024 Oct 03.

Article in English | MEDLINE | ID: mdl-39361212

ABSTRACT

Screeners are used in early intervention and early childhood education and care programs to identify children's potential need for further evaluation and diagnostics. The Strengths and Difficulties Questionnaire (SDQ) is a brief behavioral screening instrument that can be completed by both parents and educators to assess the social and emotional traits of children. However, multiple informants' reports vary. This study examined the extent to which parents (n = 241) and educators (n = 157) differ and agree in their assessments of children aged 3.5 years on average. T-tests were used to examine differences between informants, and correlations within a multitrait-multimethod (MTMM) matrix were used to examine their agreement. Results showed moderate to high levels of rater agreement, ranging from r = .35 to r = .53 on the five subscales of the SDQ. We found that hyperactivity, peer relationship problems, and prosocial behavior vary due to meaningful reasons, e.g., the home vs. pre-school setting, and the informant's relationship towards the child. Hyperactivity seems to be relatively consistent across settings. Methodological variations might explain differences in emotional symptoms and conduct problems. Considering ratings from multiple informants outlines a more comprehensive view of children's behavior and should be preferred over single-informant research designs.

6.

Interrater Agreement of CT Grading of Blunt Splenic Injuries: Does the AAST Grading Need to Be Reimagined?

Adams-McGavin, R Chris; Tafur, Monica; Vlachou, Paraskevi A; Wu, Matthew; Brassil, Michael; Crivellaro, Priscila; Lin, Hui-Ming; Gomez, David; Colak, Errol.

Can Assoc Radiol J ; 75(1): 171-177, 2024 Feb.

Article in English | MEDLINE | ID: mdl-37405424

ABSTRACT

Introduction: The Revised Organ Injury Scale (OIS) of the American Association for Surgery of Trauma (AAST) is the most widely accepted classification of splenic trauma. The objective of this study was to evaluate inter-rater agreement for CT grading of blunt splenic injuries. Methods: CT scans in adult patients with splenic injuries at a level 1 trauma centre were independently graded by 5 fellowship-trained abdominal radiologists using the AAST OIS for splenic injuries - 2018 revision. The inter-rater agreement for AAST CT injury score, as well as low-grade (I-III) versus high-grade (IV-V) splenic injury, was assessed. Disagreements in two key clinical scenarios (no injury versus injury, and high versus low grade) were qualitatively reviewed to identify possible sources of disagreement. Results: A total of 610 examinations were included. The inter-rater absolute agreement was low (Fleiss kappa statistic 0.38, P < 0.001), but improved when comparing agreement between low and high grade injuries (Fleiss kappa statistic of 0.77, P < 0.001). There were 34 cases (5.6%) of minimum two-rater disagreement about no injury vs injury (AAST grade ≥ I). There were 46 cases (7.5%) of minimum two-rater disagreement of low grade (AAST grade I-III) versus high grade (AAST grade IV-V) injuries. Likely sources of disagreement were interpretation of clefts versus lacerations, peri-splenic fluid versus subcapsular hematoma, application of adding multiple low grade injuries to higher grade injuries, and identification of subtle vascular injuries. Conclusion: There is low absolute agreement in grading of splenic injuries using the existing AAST OIS for splenic injuries.

Subjects

Abdominal Injuries , Vascular System Injuries , Wounds, Nonpenetrating , Adult , Humans , United States , Tomography, X-Ray Computed , Spleen/injuries , Retrospective Studies , Injury Severity Score
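Note on the statistic in record 6: the abstract compares Fleiss' kappa on the full grading scale with Fleiss' kappa after grades are collapsed into low versus high. The Python sketch below shows one way to compute both from a subjects-by-categories count table. The helper names and the column grouping are illustrative assumptions; the abstract does not describe the authors' software.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned subject i to category j. Every subject must be
    rated by the same number of raters (at least two)."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fall in each category
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Proportion of agreeing rater pairs for each subject
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_subj) / n_subjects      # mean observed agreement
    p_exp = sum(p * p for p in p_cat)     # agreement expected by chance
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_exp) / (1.0 - p_exp)


def collapse(counts, groups):
    """Merge category columns, e.g. groups=[[0, 1, 2, 3], [4, 5]]."""
    return [[sum(row[j] for j in group) for group in groups] for row in counts]


# Hypothetical table: 3 scans, 5 raters, columns = no injury, grades I..V
table = [
    [0, 2, 3, 0, 0, 0],
    [0, 0, 1, 2, 2, 0],
    [0, 0, 0, 0, 1, 4],
]
kappa_full = fleiss_kappa(table)
kappa_low_high = fleiss_kappa(collapse(table, [[0, 1, 2, 3], [4, 5]]))
```

Collapsing removes disagreements between neighbouring grades inside the same group, which is why agreement on a dichotomised scale can be higher than on the full scale.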

7.

Evaluation of inter-rater and test-retest reliability for near-infrared spectroscopy reactive hyperemia measures.

McGranahan, Melissa J; Kibildis, Samuel W; McCully, Kevin K; O'Connor, Patrick J.

Microvasc Res ; 148: 104532, 2023 Jul.

Article in English | MEDLINE | ID: mdl-36963482

ABSTRACT

BACKGROUND: Near-infrared spectroscopy (NIRS) is a non-invasive tool used to measure blood flow in peripheral tissues. More information on inter-rater agreement and test-retest reliability of NIRS-based reperfusion assessments is needed. PURPOSE: To assess inter-rater agreement for NIRS based data analysis, and evaluate the measurement's reliability across days. METHODS: On three separate days (average days between visits 1 and 3: 19.4 ± 6.9 days), participants' (N = 15 males, 22 ± 2 yr.) post-occlusion reactive hyperemia (PORH) was measured in the left gastrocnemius muscle using Continuous-Wave NIRS (CW-NIRS). A blood pressure cuff was placed proximal to the knee and inflated to occlude lower leg blood flow for 5 min. The following CW-NIRS parameters were selected: (1) percent saturation in HbO2 (StO2%) at baseline; (2) the O2Hb range used to normalize the NIRS signal; (3) the time for the O2Hb signal to reach 50 % peak post-occlusion hyperemia (T1/2), and (4) the post peak hyperemic O2Hb recovery slope (O2REC-SLP). Absolute agreement between the two analysts was calculated using two-way random effects Intraclass Correlation Coefficients (ICC2,1). Consistency between analysts and across days was calculated using two-way mixed models (ICC3,1). Mean and 95 % confidence intervals (CI) of ICCs are reported. Coefficient of variation (CV) and standard error of the measurement (SEM) are reported. RESULTS: The ICC2,1 data indicated "adequate" to "excellent" absolute agreement between the two NIRS analysts. ICC2,3 data indicated "adequate" to "good" reliability across visits. The CV and SEM for rater 1 and rater 2 across visit were StO2 (CV: 3.79 % ± 2.71 % and 4.50 % ± 2.37 %; SEM: 3.42 and 3.82), O2Hb range (CV: 10.50 ± 5.93 and 12.79 ± 12.41; SEM: 3.26 and 4.71), T1/2 (CV: 11.15 % ± 5.52 % and 10.96 % ± 4.50; SEM: 1.22 and 1.11), and O2REC-SLP (CV: 19.49 % ± 9.99 % and 18.45 % ± 9.48 %; SEM: 0.04 and 0.04). 
CONCLUSION: It is concluded that NIRS parameters assessed show adequate reliability between analysts and across three visits. It is recommended, when feasible and because of the absence of 100 % reliability, that investigators employ more than one rater for scoring at least a portion of the data across each trial in a study's control condition in order to have the ability to estimate the magnitude of error attributable to imperfect reliability.

Subjects

Hyperemia , Vascular Diseases , Male , Humans , Spectroscopy, Near-Infrared , Reproducibility of Results , Hemodynamics
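Note on the statistics in record 7: the abstract distinguishes absolute agreement (ICC2,1, two-way random effects) from consistency (ICC3,1, two-way mixed effects). The Python sketch below computes both single-measure coefficients from the mean squares of a two-way ANOVA, following the standard Shrout and Fleiss formulas. The function name and the example matrix are illustrative assumptions, not the study's data.

```python
def icc_single_measure(scores):
    """Return (ICC(2,1), ICC(3,1)) for scores[i][j] = rating of subject i
    by rater j, with no missing values.

    ICC(2,1): two-way random effects, absolute agreement.
    ICC(3,1): two-way mixed effects, consistency.
    """
    n = len(scores)        # subjects
    k = len(scores[0])     # raters (or visits)
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 rows and >= 2 columns")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                 # between-subjects mean square
    jms = ss_cols / (k - 1)                 # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square

    icc_2_1 = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
    icc_3_1 = (bms - ems) / (bms + (k - 1) * ems)
    return icc_2_1, icc_3_1


# Hypothetical ratings: 6 subjects scored by 4 raters
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
agreement, consistency = icc_single_measure(ratings)
```

In this example the raters rank subjects similarly but use different score levels, so the consistency coefficient is much higher than the absolute-agreement coefficient. The gap comes from the between-raters term, which only ICC(2,1) penalises.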

8.

Comparing self-report and parental report of psychopathologies in adolescents with substance use disorders.

Kuitunen-Paul, Sören; Eichler, Anna; Wiedmann, Melina; Basedow, Lukas A; Roessner, Veit; Golub, Yulia.

Eur Child Adolesc Psychiatry ; 32(2): 331-342, 2023 Feb.

Article in English | MEDLINE | ID: mdl-34480628

ABSTRACT

Both internalizing and externalizing psychopathologies interfere with the treatment of substance use disorders (SUD) in adolescents. Self-reports of psychopathologies are likely biased and may be validated with parental reports. We compared N = 70 standardized self-reports of adolescents entering outpatient SUD treatment (13.2-18.6 years old, 43% female) to parental reports on the same psychopathologies, and explored biases due to gender, age, SUD diagnoses and SUD severity. Bivariate bootstrapped Pearson correlation coefficients revealed several small to moderate correlations between both reporting sources (r = 0.29-0.49, all corrected p ≤ 0.039). A repeated measures MANOVA revealed moderately stronger parental reports of adolescent psychopathologies compared to adolescent self-reports for most externalizing problems (dissocial and aggressive behaviors, p ≤ 0.016, partial η² = 0.09-0.12) and social/attention problems (p ≤ 0.012, partial η² = 0.10), but no differences for most internalizing problems (p ≥ 0.073, partial η² = 0.02-0.05). Differences were not associated with other patient or parental characteristics including age, gender, number of co-occurring diagnoses or presence/absence of a certain SUD (all uncorrected p ≥ 0.088). We concluded that treatment-seeking German adolescents with SUD present with a multitude of extensive psychopathologies. The relevant deviation between self- and parental reports indicates that the combination of both reports might help to counteract dissimulation and other reporting biases. The generalizability of results to inpatients, psychiatry patients in general, or adolescents without SUD, as well as the validity of self- and parental reports in comparison to clinical judgements remain unknown.

Subjects

Substance-Related Disorders , Humans , Adolescent , Female , Male , Self Report , Substance-Related Disorders/diagnosis , Psychopathology , Parents , Multivariate Analysis

9.

Assessing inter-rater agreement of the intellectual disability-frailty index short form: A descriptive pilot study.

Hirst, Heather; Campbell, Jennifer; Chamberlin, Samantha; Olagunju, Ibukun; Bird, Frank; Luiselli, James K.

J Intellect Disabil ; : 17446295231213436, 2023 Nov 03.

Article in English | MEDLINE | ID: mdl-37922940

ABSTRACT

Frailty is a health concern for many adults with intellectual disability and should be measured to detect at-risk conditions, monitor disease, plan treatment, and gauge mortality. This descriptive pilot study evaluated measurement consistency (inter-rater agreement) of the Intellectual Disability-Frailty Index Short Form among multiple assessors with 20 adults (M age = 48.3 years) who had intellectual and multiple disabilities. Agreement percentages were computed for (a) non-frail, pre-frail, and frail categories derived from total index scores, and (b) each of 17 deficits listed on the form. Low average inter-rater agreement (<85%) was obtained on the index frail categories. Several of the assessed deficits had acceptable inter-rater agreement (84.2-100%), while the majority of deficits were associated with moderate-to-low agreement percentages. Though research supports the Intellectual Disability-Frailty Index Short Form as a valid and practical frailty assessment instrument, our findings suggest that full-scale inter-rater agreement must be improved by adding more specificity to the form, clarifying instructions for assessors, and providing competency-based training in assessment implementation.

10.

Robustness of κ-type coefficients for clinical agreement.

Vanacore, Amalia; Pellegrino, Maria Sole.

Stat Med ; 41(11): 1986-2004, 2022 May 20.

Article in English | MEDLINE | ID: mdl-35124830

ABSTRACT

The degree of inter-rater agreement is usually assessed through κ-type coefficients, and the extent of agreement is then characterized by comparing the value of the adopted coefficient against a benchmark scale. Two motivating examples show how some κ-type coefficients behave differently when marginal frequencies are asymmetrically distributed over categories. An extensive Monte Carlo simulation study was conducted to investigate the robustness of four κ-type coefficients for nominal and ordinal classifications, and of an inferential benchmarking procedure that, unlike straightforward benchmarking, does not neglect the influence of the experimental conditions. Robustness was investigated for several scenarios, differing in sample size, rating scale dimension, number of raters, frequency distribution of rater classifications, and pattern of agreement across raters. Simulation results reveal a more paradoxical behavior of Fleiss kappa and Conger kappa with ordinal rather than nominal classifications; the coefficients' robustness improves with increasing sample size and number of raters for both nominal and ordinal classifications, whereas robustness improves with rating scale dimension only for nominal classifications. By identifying the scenarios (ie, minimum sample size, number of raters, rating scale dimension) with acceptable robustness, this study provides guidelines about the design of robust agreement studies.

Subjects

Reproducibility of Results , Computer Simulation , Humans , Monte Carlo Method , Observer Variation , Sample Size

11.

Validation of central serous chorioretinopathy multimodal imaging-based classification system.

Chhablani, Jay; Behar-Cohen, Francine.

Graefes Arch Clin Exp Ophthalmol ; 260(4): 1161-1169, 2022 Apr.

Article in English | MEDLINE | ID: mdl-34669028

ABSTRACT

PURPOSE: Validation of a recently described central serous chorioretinopathy (CSCR) classification system and assessment of levels of agreement among 10 retina physicians. METHODS: This was a cross-sectional (inter-reader agreement) study. Ten retina physicians (assigned a role of masked grader) were provided with a comprehensive dataset of 61 eyes of 34 patients of presumed CSCR. Relevant clinical details and multimodal imaging (fundus autofluorescence, fluorescein and indocyanine green angiography, optical coherence tomography) of both involved and fellow eye were electronically shared. Later, only the fellow eye images were resent to understand the influence of affected eye on the grading of the fellow eye. Multiple inter-grader agreement using Fleiss Kappa was performed to determine the level of agreement among the 10 graders. p value of ≤ 0.05 was considered statistically significant. RESULTS: Sixty-one eyes of 34 patients were evaluated. There was moderate agreement for major criteria with Fleiss Kappa value of 0.50 (p < 0.0001) with a single outlier observer. After excluding that observer, the Fleiss Kappa value increased to 0.57 (p < 0.0001) with statistically significant p values among all categories, i.e., simple CSC (κ = 0.575), complex CSC (κ = 0.621), and no CSC (κ = 0.452). Overall, moderate to substantial agreement was noted among the subtypes (primary, recurrent, and resolved). The influence of the affected eye on fellow eye grading was studied. The global Fleiss Kappa coefficient (κ = 0.642, p < 0.0001) showed substantial agreement when observers were aware of the affected eye grading. However, without prior available information on the affected eye, the inter-grader agreement was significantly lower (global κ = 0.255, p < 0.0001).
CONCLUSION: A fair-moderate inter-grader agreement among the masked graders suggests a need for further refinement of this novel classification system. Disease grading should include both eyes as lack of information on affected eye has a bearing on fellow eye grading and inter-grader agreement as shown by a significant difference in global κ values.

Subjects

Central Serous Chorioretinopathy , Central Serous Chorioretinopathy/diagnosis , Choroid , Cross-Sectional Studies , Fluorescein Angiography/methods , Humans , Multimodal Imaging , Retrospective Studies , Tomography, Optical Coherence/methods

12.

Evaluating diagnostic and management agreement between audiology and ENT: a prospective inter-rater agreement study in a paediatric primary contact clinic.

Eakin, Jennifer; Michael, Simone; Payten, Christopher; Smith, Tamsin; Stewart, Vicky; Noonan, Elle; Weir, Kelly A.

BMC Pediatr ; 22(1): 646, 2022 Nov 08.

Article in English | MEDLINE | ID: mdl-36348376

ABSTRACT

BACKGROUND: Ear, Nose and Throat (ENT) primary contact models of care use audiologists as the first triage point for children referred to ENT for middle ear and hearing concerns; and have shown reduced waiting time, improved ENT surgical conversion rates and increased service capacity. This study aimed to investigate 'safety and quality' of the model by looking at agreement between audiologists' and an ENT's clinical decisions. METHODS: We performed an inter-rater agreement study on diagnosis and management decisions made by audiologists and an ENT for 50 children seen in an Australian hospital's ENT primary contact service, and examined the nature and patterns of disagreements. RESULTS: Professionals agreed on at least one site-of-lesion diagnosis for all children (100%) and on the primary management for 74% (Gwet's AC1 = 0.67). Management disagreements clustered around i) providing 'watchful waiting' versus sooner medical opinion (18%), and ii) providing monitoring versus discharge for children with no current symptoms (8%). There were no cases where the audiologist recommended discharge when the ENT recommended further medical opinion. CONCLUSIONS: Our novel research provides further evidence that Audiologist-led primary contact models for children with middle ear and hearing concerns are safe as well as efficient.

Subjects

Audiology , Child , Humans , Prospective Studies , Australia , Hearing , Referral and Consultation
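Note on the statistic in record 12: the abstract reports Gwet's AC1 alongside raw percentage agreement. The Python sketch below computes AC1 for two raters on a nominal scale using the standard two-rater formula. The function name, the category labels, and the toy decisions are illustrative assumptions and do not reproduce the study's data.

```python
def gwet_ac1(rater_a, rater_b, categories):
    """Gwet's AC1 for two raters on a nominal scale.

    AC1 = (p_a - p_e) / (1 - p_e), where p_a is the observed agreement and
    p_e = sum(pi_c * (1 - pi_c)) / (q - 1), with pi_c the average of the two
    raters' marginal proportions for category c and q the number of categories.
    """
    rater_a, rater_b = list(rater_a), list(rater_b)
    n = len(rater_a)
    q = len(categories)
    if n == 0 or n != len(rater_b):
        raise ValueError("rating lists must be non-empty and of equal length")
    if q < 2:
        raise ValueError("need at least two categories")

    # Observed proportion of cases where the raters agree
    p_a = sum(1 for x, y in zip(rater_a, rater_b) if x == y) / n
    # Average marginal proportion per category across both raters
    pi = [(rater_a.count(c) + rater_b.count(c)) / (2 * n) for c in categories]
    # Chance agreement under Gwet's model
    p_e = sum(p * (1 - p) for p in pi) / (q - 1)
    return (p_a - p_e) / (1 - p_e)


# Hypothetical management decisions for five children
audiologist = ["refer", "refer", "refer", "refer", "wait"]
ent_doctor = ["refer", "refer", "refer", "wait", "refer"]
ac1 = gwet_ac1(audiologist, ent_doctor, ["refer", "wait"])
```

AC1 is often chosen when one category dominates, because Cohen's kappa can be low or even negative in that situation despite high raw agreement. For the toy data above, raw agreement is 60%, AC1 is about 0.41, and Cohen's kappa would be -0.25.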

13.

External validation of the deep learning system "SpineNet" for grading radiological features of degeneration on MRIs of the lumbar spine.

Grob, Alexandra; Loibl, Markus; Jamaludin, Amir; Winklhofer, Sebastian; Fairbank, Jeremy C T; Fekete, Tamás; Porchet, François; Mannion, Anne F.

Eur Spine J ; 31(8): 2137-2148, 2022 Aug.

Article in English | MEDLINE | ID: mdl-35835892

ABSTRACT

BACKGROUND: Magnetic resonance imaging (MRI) is used to detect degenerative changes of the lumbar spine. SpineNet (SN), a computer vision-based system, performs an automated analysis of degenerative features in MRI scans aiming to provide high accuracy, consistency and objectivity. This study evaluated SN's ratings compared with those of an expert radiologist. METHOD: MRIs of 882 patients (mean age, 72 ± 8.8 years) with degenerative spinal disorders from two previous trials carried out in our spine center between 2011 and 2019, were analyzed by an expert radiologist. Lumbar segments (L1/2-L5/S1) were graded for Pfirrmann Grades (PG), Spondylolisthesis (SL) and Central Canal Stenosis (CCS). SN's analysis for the equivalent parameters was generated. Agreement between methods was analyzed using kappa (κ), Spearman correlation (ρ) and Lin's concordance correlation (ρc) coefficients and class average accuracy (CAA). RESULTS: 4410 lumbar segments were analyzed. κ statistics showed moderate to substantial agreement in PG between the radiologist and SN depending on spinal level (range κ 0.63-0.77, all levels together 0.72; range CAA 45-68%, all levels 55%), slight to substantial agreement for SL (range κ 0.07-0.60, all levels 0.63; range CAA 47-57%, all levels 56%) and CCS (range κ 0.17-0.57, all levels 0.60; range CAA 35-41%, all levels 43%). SN tended to record more pathological features in PG than did the radiologist whereas the contrary was the case for CCS. SL showed an even distribution between methods. CONCLUSION: SN is a robust and reliable tool with the ability to grade degenerative features such as PG, SL or CCS in lumbar MRIs with moderate to substantial agreement compared to the current gold-standard, the radiologist. It is a valuable alternative for analyzing MRIs from large cohorts for diagnostic and research purposes.

Subjects

Deep Learning , Intervertebral Disc Degeneration , Spondylolisthesis , Aged , Aged, 80 and over , Constriction, Pathologic , Humans , Intervertebral Disc Degeneration/diagnostic imaging , Intervertebral Disc Degeneration/pathology , Lumbar Vertebrae/diagnostic imaging , Lumbar Vertebrae/pathology , Lumbosacral Region/pathology , Magnetic Resonance Imaging/methods , Middle Aged , Spondylolisthesis/diagnostic imaging , Spondylolisthesis/pathology

14.

Investigating the Measurement Invariance and Method-Trait Effects of Parent and Teacher SNAP-IV Ratings of Preschool Children.

Lúcio, Patrícia Silva; Eid, Michael; Cogo-Moreira, Hugo; Puglisi, Marina Leite; Polanczyk, Guilherme V.

Child Psychiatry Hum Dev ; 53(3): 489-501, 2022 Jun.

Article in English | MEDLINE | ID: mdl-33638743

ABSTRACT

The Swanson, Nolan, and Pelham scale version IV (SNAP-IV) is widely used to assess symptoms of attention deficit hyperactivity disorder (ADHD) and oppositional defiant disorder (ODD) in children and adolescents. Nevertheless, there is insufficient data to support its use in preschool children. The study had three goals: First, to test the factorial validity of the three correlated-factors model of ADHD and ODD items of the SNAP-IV. Second, to investigate the measurement invariance of the items over time (6-month longitudinal interval) and by sex. Third, to investigate the convergent validity and method-specific influences on ADHD/ODD assessments with respect to multiple raters (parents/teachers) of children's symptoms. Participants were 618 preschool children (3.5-6 years) at baseline and 6-month follow-up. For model testing, we used confirmatory factor analysis for categorical observed variables. Method and trait effects were examined using the CT-C(M-1) model. The analyses showed partial measurement invariance over time and according to sex. Moreover, strong rater-specific effects were detected. The implication of the results for construct validation of the instrument and clinical assessment of ADHD and ODD traits are discussed.

Subjects

Attention Deficit Disorder with Hyperactivity , Attention Deficit and Disruptive Behavior Disorders , Adolescent , Attention Deficit Disorder with Hyperactivity/diagnosis , Attention Deficit and Disruptive Behavior Disorders/diagnosis , Child, Preschool , Factor Analysis, Statistical , Humans , Parents

15.

Inter-rater agreement in pPOSSUM scores of geriatric trauma patients: a prospective evaluation.

Kusen, Jip Q; Beeres, Frank J P; van der Vet, Puck C R; Poblete, Beate; Geuss, Steffen; Babst, Reto; Knobe, Matthias; Wijdicks, Franciscus J G; Link, Björn C.

Arch Orthop Trauma Surg ; 142(12): 3869-3876, 2022 Dec.

Article in English | MEDLINE | ID: mdl-35031826

ABSTRACT

PURPOSE: Risk prediction models are widely used in the perioperative setting to identify high-risk patients who may benefit from additional care and to aid clinical decision-making. pPOSSUM is one such prediction model; however, little is known about inter-rater agreement when scoring its subjective parameters. This study assessed the inter-rater agreement between clinicians of different specialties and work-experience levels when scoring 30 clinical case reports of geriatric hip fracture patients with pPOSSUM. METHODS: Eighteen clinicians from the departments of Surgery (three specialists, four residents), Anaesthesiology (four specialists, two residents) and Emergency Medicine (three specialists, two residents) who were familiar with the pPOSSUM scoring system were asked to calculate the scores. The kappa statistic and the statistical method of Fleiss were used to analyse inter-rater agreement. RESULTS: The response rate was 100%. Among surgeons, anaesthesiologists and emergency department (ED) doctors, the overall mean kappa values were 0.42, 0.08 and 0.20, respectively. Among surgery, anaesthesiology and ED residents, the overall mean kappa values were 0.21, 0.33 and 0.37, respectively. Within the departments of Surgery, Anaesthesiology and Emergency Medicine, the overall mean kappa values were 0.23, 0.12 and 0.22, respectively. An overall mean kappa value of 0.19 was seen among all specialists. All residents had an overall mean kappa value of 0.21, and all clinicians had an overall mean kappa value of 0.21. CONCLUSION: The overall inter-rater agreement of clinicians and the interdisciplinary agreement when scoring geriatric hip fracture patients with pPOSSUM were low and prone to subjectivity in our study. A higher work-experience level did not lead to better agreement. When pPOSSUM is calculated without clinical assessment by the same clinician, caution is advised to prevent over-reliance on the pPOSSUM risk prediction model. LEVEL OF EVIDENCE: III.

Subjects

Fractures, Bone , Humans , Aged , Observer Variation , Reproducibility of Results
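
The abstract above reports kappa values computed with "the statistical method of Fleiss" for groups of several raters. As a rough illustration only, the following Python sketch computes Fleiss' kappa from a table of rating counts. The function name `fleiss_kappa` and the toy data are hypothetical and are not taken from the study; the authors' actual computation may differ.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a fixed number of raters per case.

    counts[i][j] is the number of raters who assigned case i to category j.
    Raises ZeroDivisionError if every rating falls into a single category.
    """
    n_cases = len(counts)
    n_categories = len(counts[0])
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every case must be rated by the same number of raters")

    total_ratings = n_cases * n_raters
    # Share of all ratings that fall into each category.
    p_category = [
        sum(row[j] for row in counts) / total_ratings for j in range(n_categories)
    ]
    # Per-case agreement: proportion of rater pairs that chose the same category.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_case) / n_cases
    p_expected = sum(p * p for p in p_category)
    return (p_observed - p_expected) / (1 - p_expected)


# Invented toy data: 4 cases, 3 raters, 2 categories (not study data).
toy_counts = [[3, 0], [2, 1], [0, 3], [1, 2]]
print(round(fleiss_kappa(toy_counts), 3))
```

The counts-table input is a design choice for brevity; raw per-rater labels would first need to be tallied into this form.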

16.

Comparable reliability and acceptability of telepsychiatry and face-to-face psychiatric assessments in the emergency room setting.

Bistre, Moises; Juven-Wetzler, Alzbeta; Argo, Daniel; Barash, Igor; Katz, Gregory; Teplitz, Ronen; Said, Muhamad-Musa; Kohn, Yoav; Linkovski, Omer; Eitan, Renana.

Int J Psychiatry Clin Pract ; 26(3): 228-233, 2022 Sep.

Article in English | MEDLINE | ID: mdl-34565277

ABSTRACT

OBJECTIVE: This study aims to compare the reliability and acceptability of psychiatric interviews using telepsychiatry and face-to-face modalities in the emergency room setting. METHODS: In this prospective observational feasibility study, psychiatric patients (n = 38) who presented in emergency rooms between April and June 2020 went through face-to-face and videoconference telepsychiatry interviews in a non-randomised varying order. Interviewers and a senior psychiatry resident who observed both interviews determined diagnosis, recommended disposition and indication for involuntary admission. Patients and psychiatrists completed acceptability post-assessment surveys. RESULTS: Agreement between raters on recommended disposition and indication for involuntary admission as measured by Cohen's kappa was 'strong' to 'almost perfect' (0.84/0.81, 0.95/0.87 and 0.89/0.94 for face-to-face vs. telepsychiatry, observer vs. face-to-face and observer vs. telepsychiatry, respectively). Partial agreement between the raters on diagnosis was 'strong' (Cohen's kappa of 0.81, 0.85 and 0.85 for face-to-face vs. telepsychiatry, observer vs. face-to-face and observer vs. telepsychiatry, respectively). Psychiatrists' and patients' satisfaction rates, and psychiatrists' perceived certainty rates, were comparably high in both face-to-face and telepsychiatry groups. CONCLUSIONS: Telepsychiatry is a reliable and acceptable alternative to face-to-face psychiatric assessments in the emergency room setting. Implementing telepsychiatry may improve the quality and accessibility of mental health services. KEY POINTS: Telepsychiatry and face-to-face psychiatric assessments in the emergency room setting have comparable reliability. Patients and providers report a comparably high level of satisfaction with telepsychiatry and face-to-face modalities in the emergency room setting. Providers report a comparable level of perceived certainty in their clinical decisions based on telepsychiatry and face-to-face psychiatric assessments in the emergency room setting.

Subjects

Mental Disorders , Psychiatry , Telemedicine , Humans , Mental Disorders/diagnosis , Mental Disorders/therapy , Reproducibility of Results , Emergency Service, Hospital

17.

Inter-rater reliability of assessments regarding the quality of drug treatment, and drug-related hospital admissions.

Lönnbro, Johan; Holmqvist, Lina; Persson, Elisabeth; Thysell, Per; Åberg, N David; Wallerstedt, Susanna M.

Br J Clin Pharmacol ; 87(10): 3825-3834, 2021 10.

Article in English | MEDLINE | ID: mdl-33609324

ABSTRACT

AIMS: To investigate inter-rater agreement on the quality of drug treatment, and the relationship between the drug treatment and hospital admission. METHODS: Three specialist physicians and two resident physicians determined, independently and in consensus, the quality of drug treatment from an overall medical perspective, and its association with admission, in 30 randomly selected patients (50% female, median age 72 years) admitted to Sahlgrenska University Hospital, Sweden, in April 2018. The inter-rater agreement was evaluated with Gwet's agreement coefficient (AC1). RESULTS: In all, 200 (95%) out of 210 drugs at admission and 238 (97%) out of 245 drugs at discharge were assessed as reasonable drug treatment by all assessors. Conversely, none of the drugs at admission, and two at discharge, were assessed as unreasonable drug treatment by all assessors (AC1: 0.88 and 0.94 [all], 0.86 and 0.95 [specialists], 0.92 and 0.92 [residents], respectively). The assessments regarding the association between the drug treatment and the hospital admission (not related or main/contributory reason) were consistent between the assessors for 16 out of 30 patients (AC1: 0.67 [all], 0.74 [specialists], 0.54 [residents]). In none of the three cases where the hospital admission was considered possibly attributable to a prescribing error did the assessors make consistent assessments. CONCLUSIONS: As the inter-rater agreement ranged between weak and almost perfect, the reliability of assessments of drug treatment quality, as well as adverse consequences, appears to be a methodological concern. To yield acceptably reliable results regarding both drug treatment aspects at issue, specialist physicians should be involved.

Subjects

Hospitalization , Pharmaceutical Preparations , Aged , Female , Hospitals , Humans , Male , Observer Variation , Reproducibility of Results , Sweden
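
The abstract above evaluates agreement with Gwet's AC1. The study involved five assessors; the sketch below covers only the simpler two-rater case and is meant as an illustration, not as the authors' computation. The function name `gwet_ac1`, the category labels, and the ratings are invented for the example.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def gwet_ac1(
    rater_a: Sequence[Hashable],
    rater_b: Sequence[Hashable],
    categories: Optional[Sequence[Hashable]] = None,
) -> float:
    """Gwet's AC1 for two raters and nominal categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    cats = list(categories) if categories is not None else list(set(rater_a) | set(rater_b))
    q = len(cats)
    if q < 2:
        raise ValueError("AC1 needs at least two categories")

    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Average prevalence of each category across the two raters.
    prevalence = [(count_a[c] + count_b[c]) / (2 * n) for c in cats]
    # Chance agreement as defined for AC1.
    p_expected = sum(p * (1 - p) for p in prevalence) / (q - 1)
    return (p_observed - p_expected) / (1 - p_expected)


# Invented ratings: almost every drug is judged "reasonable" by both raters.
a = ["reasonable"] * 9 + ["unreasonable"]
b = ["reasonable"] * 10
print(round(gwet_ac1(a, b, categories=["reasonable", "unreasonable"]), 3))
```

With ratings this skewed toward one category, Cohen's kappa for the same data would be 0 because the second rater uses a single label, whereas AC1 stays high; this is one commonly cited reason for choosing AC1 when prevalence is extreme.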

18.

Life-threatening danger assessments of penetrating injuries in Eastern Danish clinical forensic medicine.

Jakobsen, Lykke Schrøder; Christensen, Marie Toftdahl; Lundemose, Sissel Banner; Munkholm, Julie; Bugge, Anne Birgitte Dyhre; Lynnerup, Niels; Banner, Jytte.

Int J Legal Med ; 135(3): 861-870, 2021 May.

Article in English | MEDLINE | ID: mdl-33410922

ABSTRACT

Clinical forensic assessments of whether injuries posed a life-threatening danger may have an impact on the legal aftermath of a violent assault. The pursuit of evidence-based guidelines should ensure a user-independent and reproducible forensic practice. However, does it? The aim of this study was to evaluate the forensic life-threatening danger assessments after a protocol implementation in 2016. The evaluation concerned the usability and reproducibility of the protocol, and its influence on assessment severity. We analyzed the level of inter- and intra-rater agreement using 169 blinded, prior-protocol cases that were reassessed by two forensic specialists. We compared assessments made the year before and the year after protocol implementation (n = 262), and the forensic specialists' reassessments with the prior-protocol cases' original assessments (n = 169). Regarding whether to make an assessment, the levels of agreement varied between weak agreement (inter-rater, κ = 0.43; assessor 1, κ = 0.57) and strong agreement (assessor 2, κ = 0.90). Regarding severity, the levels of agreement varied between strong agreement (inter-rater, κ = 0.87; assessor 1, κ = 0.90) and almost perfect agreement (assessor 2, κ = 0.94). The distribution of assessments changed significantly after the implementation (chi-square test: p < 0.0001). The proportion of cases assessed as having not been in life-threatening danger increased from 9 to 43%, and moderate severity assessments decreased from 55 to 23%. Of the moderate severity assessments, 55% were reassessed as having not been in life-threatening danger. The protocol ensured independent and reproducible assessments when the forensic specialists agreed on making one. The protocol resulted in less severe assessments. Future studies should examine the reliability of the protocol and its consequences for legal aftermaths.

Subjects

Forensic Medicine/standards , Practice Guidelines as Topic/standards , Trauma Severity Indices , Wounds, Penetrating/classification , Adolescent , Adult , Aged , Aged, 80 and over , Denmark , Female , Humans , Male , Middle Aged , Observer Variation , Reproducibility of Results , Young Adult

19.

The impact of grey zones on the accuracy of agreement measures for ordinal tables.

Tran, Quoc Duyet; Dolgun, Anil; Demirhan, Haydar.

BMC Med Res Methodol ; 21(1): 70, 2021 04 14.

Article in English | MEDLINE | ID: mdl-33853549

ABSTRACT

BACKGROUND: In an inter-rater agreement study, if two raters tend to rate considering different aspects of the subject of interest or have different experience levels, a grey zone occurs among the levels of the square contingency table showing the inter-rater agreement. These grey zones distort the degree of agreement between raters and negatively impact decisions based on the inter-rater agreement tables. It is therefore important to know how the existence of a grey zone affects the inter-rater agreement coefficients, so that the coefficient that is most reliable against grey zones can be chosen and more reliable decisions reached. METHODS: In this article, we propose two approaches for creating grey zones in a simulation setting and conduct an extensive Monte Carlo simulation study to determine the impact of grey zones on the weighted inter-rater agreement measures for ordinal tables over a comprehensive simulation space. RESULTS: The weighted inter-rater agreement coefficients are not reliable in the presence of grey zones. Increasing the sample size and the number of categories in the agreement table decreases the accuracy of weighted inter-rater agreement measures when there is a grey zone. When the degree of agreement between the raters is high, the agreement measures are not significantly affected by the existence of grey zones. However, if there is a medium to low degree of inter-rater agreement, all the weighted coefficients are affected to some extent. CONCLUSIONS: This study shows that the existence of grey zones has a significant negative impact on the accuracy of agreement measures, especially for a low degree of true agreement and for large sample and table sizes. In general, Gwet's AC2 and Brennan-Prediger's κ with quadratic or ordinal weights are reliable against grey zones.

Subjects

Reproducibility of Results , Humans , Monte Carlo Method , Observer Variation
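
The abstract above compares weighted agreement coefficients on square ordinal tables. As a minimal sketch only, the Python below computes two such coefficients from a contingency table: weighted Cohen's kappa and a weighted Brennan-Prediger coefficient, each with linear or quadratic weights. The function names and the 3 x 3 table are invented for illustration, the paper's simulation design is not reproduced, and Gwet's AC2 and ordinal weights are omitted.

```python
from typing import Dict, List, Sequence


def agreement_weights(q: int, kind: str = "linear") -> List[List[float]]:
    """Agreement weights: 1 on the diagonal, decreasing with category distance."""
    power = {"linear": 1, "quadratic": 2}[kind]
    return [[1 - (abs(i - j) / (q - 1)) ** power for j in range(q)] for i in range(q)]


def weighted_coefficients(
    table: Sequence[Sequence[int]], kind: str = "linear"
) -> Dict[str, float]:
    """Weighted Cohen's kappa and weighted Brennan-Prediger for a q x q table.

    table[i][j] counts items put in category i by rater 1 and j by rater 2.
    """
    q = len(table)
    n = sum(sum(row) for row in table)
    w = agreement_weights(q, kind)
    row_p = [sum(table[i]) / n for i in range(q)]
    col_p = [sum(table[i][j] for i in range(q)) / n for j in range(q)]
    cells = [(i, j) for i in range(q) for j in range(q)]

    # Weighted observed agreement, shared by both coefficients.
    p_obs = sum(w[i][j] * table[i][j] / n for i, j in cells)
    # Chance agreement from the raters' marginals (Cohen).
    pe_cohen = sum(w[i][j] * row_p[i] * col_p[j] for i, j in cells)
    # Chance agreement under uniform category use (Brennan-Prediger).
    pe_bp = sum(w[i][j] for i, j in cells) / q**2

    return {
        "weighted_kappa": (p_obs - pe_cohen) / (1 - pe_cohen),
        "brennan_prediger": (p_obs - pe_bp) / (1 - pe_bp),
    }


# Invented 3-category ordinal table (not data from the paper).
toy_table = [[20, 5, 0], [5, 20, 5], [0, 5, 20]]
for kind in ("linear", "quadratic"):
    print(kind, weighted_coefficients(toy_table, kind))
```

The two coefficients share the same observed agreement and differ only in how chance agreement is modelled, which is the part that marginal distortions such as grey zones can influence.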

20.

Ultrasound diagnosis of endometrial cancer by subjective pattern recognition in women with postmenopausal bleeding: prospective inter-rater agreement and reliability study.

Wong, M; Thanatsis, N; Amin, T; Bean, E; Madhvani, K; Jurkovic, D.

Ultrasound Obstet Gynecol ; 57(3): 471-477, 2021 03.

Article in English | MEDLINE | ID: mdl-32621381

ABSTRACT

OBJECTIVES: To assess the inter-rater agreement and reliability of using subjective pattern recognition for diagnosing endometrial cancer (EC) on ultrasound in women with postmenopausal bleeding (PMB). METHODS: This was a prospective cross-sectional study conducted at a gynecological rapid-access clinic, between October 2016 and December 2017, in which consecutive women with PMB and endometrial thickness of ≥ 4.5 mm on transvaginal ultrasound examination were included. Women on hormone replacement therapy or tamoxifen and those with a history of primary gynecological malignancy were excluded. Two raters independently performed ultrasound examinations, blinded to each other's findings, and classified women as having uniformly thickened endometrium, benign endometrial polyp or EC, using subjective pattern recognition. Inter-rater reliability of ultrasound diagnosis was assessed using Cohen's kappa (κ) statistic. All women subsequently underwent either outpatient endometrial biopsy, hysteroscopy or hysterectomy. RESULTS: Forty women were included in the study, with a median age of 61 (interquartile range (IQR), 57-69) years and a median endometrial thickness of 11.0 (IQR, 6.2-20.3) mm. Final histological analysis confirmed 16 (40%) women with EC, 16 (40%) with benign endometrial polyp, four (10%) with atrophic endometrium, three (8%) with proliferative endometrium and one (3%) with endometrial hyperplasia. Inter-rater agreement for the ultrasound diagnoses of uniformly thickened endometrium, benign endometrial polyp and EC was 14/16 (87.5%), 22/30 (73.3%) and 28/34 (82.4%), respectively; inter-rater reliability was good (κ = 0.69; 95% CI, 0.49-0.88). When the ultrasound diagnoses were grouped as either cancer or no cancer, inter-rater agreement was 85% and inter-rater reliability was good (κ = 0.78; 95% CI, 0.61-0.95). Rater A correctly identified 14/16 cases of EC and Rater B identified 15/16. 
EC was misdiagnosed as benign polyps on ultrasound in two women by Rater A and in one woman by Rater B. The overall accuracies of Rater A and Rater B in differentiating between benign endometrial pathologies and malignancy were 90% and 90%, respectively. CONCLUSIONS: Our results show good inter-rater reliability of subjective pattern recognition in diagnosing uniformly thickened endometrium, benign endometrial polyp and EC on ultrasound in women with PMB. Our findings should facilitate wider use of subjective pattern recognition in routine clinical practice. © 2020 International Society of Ultrasound in Obstetrics and Gynecology.

Subjects

Early Detection of Cancer/statistics & numerical data , Endometrial Neoplasms/diagnostic imaging , Ultrasonography/statistics & numerical data , Uterine Hemorrhage/diagnostic imaging , Uterine Hemorrhage/pathology , Aged , Cross-Sectional Studies , Early Detection of Cancer/methods , Endometrial Hyperplasia/complications , Endometrial Hyperplasia/diagnostic imaging , Endometrial Neoplasms/complications , Endometrium/diagnostic imaging , Endometrium/pathology , Female , Humans , Middle Aged , Observer Variation , Polyps/complications , Polyps/diagnostic imaging , Postmenopause , Prospective Studies , Reproducibility of Results , Ultrasonography/methods , Uterine Diseases/complications , Uterine Diseases/diagnostic imaging , Uterine Hemorrhage/etiology
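
The abstract above reports both raw percentage agreement and Cohen's kappa for two raters. To show how the two quantities differ, here is a minimal Python sketch of unweighted Cohen's kappa. The function name `cohen_kappa` and the labels are hypothetical; the study's per-woman ratings are not given in the abstract, so the example uses invented data and does not attempt to reproduce the published values.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def cohen_kappa(
    rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]
) -> Tuple[float, float]:
    """Return (raw agreement, Cohen's kappa) for two raters on the same items.

    Raises ZeroDivisionError if both raters use one and the same single label.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Raw agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance from each rater's own label frequencies.
    p_expected = sum(
        (count_a[label] / n) * (count_b[label] / n)
        for label in set(count_a) | set(count_b)
    )
    return p_observed, (p_observed - p_expected) / (1 - p_expected)


# Invented labels for four examinations (not study data).
rater_1 = ["cancer", "cancer", "polyp", "polyp"]
rater_2 = ["cancer", "polyp", "polyp", "polyp"]
agreement, kappa = cohen_kappa(rater_1, rater_2)
print(agreement, kappa)
```

Kappa is lower than raw agreement because it discounts the agreement the two raters would reach by chance given how often each uses every label.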
