Results 1 - 20 of 25
1.
Med Educ ; 51(4): 390-400, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28078685

ABSTRACT

CONTEXT: Peer assessment of professional behaviour within problem-based learning (PBL) groups can support learning and provide opportunities to identify and remediate problem behaviours. OBJECTIVES: We investigated whether a peer assessment of learning behaviours in PBL is sufficiently valid to support decision making about student professional behaviours. METHODS: Data were available for two cohorts of students, in which each student was rated by all of their PBL group peers using a modified version of a previously validated scale. Following the provision of feedback to the students, their behaviours were again peer-assessed. A generalisability study was undertaken to calculate the students' professional behaviour scores, sources of error that impacted the reliability of the assessment, changes in student rating behaviour, and changes in mean scores after the delivery of feedback. RESULTS: Peer assessment of professional learning behaviour was highly reliable for within-group comparisons (G = 0.81-0.87), but poor for across-group comparisons (G = 0.47-0.53). Feedback increased the range of ratings given by assessors and brought their mean ratings into closer alignment. More of the increased variance was attributable to assessee performance than to assessor stringency and hence there was a slight improvement in reliability, especially for comparisons across groups. Mean professional behaviour scores were unchanged. CONCLUSIONS: Peer assessment of professional learning behaviours may be unreliable for decision making outside a PBL group. Faculty members should not draw conclusions from peer assessment about a student's behaviour compared with that of their peers in the cohort, and such a tool may not be appropriate for summative assessment. Health professional educators interested in assessing student professional behaviours in PBL groups might focus on opportunities for the provision of formative peer feedback and its impact on learning.


Subjects
Feedback; Learning; Peer Review; Problem-Based Learning; Professionalism; Education, Medical, Undergraduate/methods; Educational Measurement; Humans; Peer Group; Reproducibility of Results; Teaching
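The generalisability (G) coefficients quoted in the abstract above come from a variance-components model. As a rough sketch (with invented variance components, not the study's estimates), a relative G coefficient for the mean of several peer ratings can be computed like this:

```python
# Illustrative sketch (made-up numbers, not the study's data): a relative
# generalisability coefficient for a person-by-rater design is
#   G = var_p / (var_p + var_residual / n_r)
# where var_p is true assessee variance, var_residual the rater-by-assessee
# interaction plus error, and n_r the number of raters averaged over.

def g_coefficient(var_p: float, var_residual: float, n_r: int) -> float:
    """Relative G coefficient for the mean of n_r rater scores."""
    return var_p / (var_p + var_residual / n_r)

# Hypothetical components: six PBL peers vs a single rater.
g1 = g_coefficient(var_p=0.50, var_residual=0.60, n_r=6)
g2 = g_coefficient(var_p=0.50, var_residual=0.60, n_r=1)
```

Averaging over more raters shrinks the error term, which is why reliability rises with the number of PBL peers rating each student.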
2.
BMC Med Educ ; 17(1): 213, 2017 Nov 15.
Article in English | MEDLINE | ID: mdl-29141622

ABSTRACT

BACKGROUND: Good clinical handover is critical to safe medical care. Little research has investigated handover in rural settings. In a remote setting where nurses and medical students give telephone handover to an aeromedical retrieval service, we developed a tool by which the receiving clinician might assess the handover, and investigated factors affecting the reliability and validity of that assessment. METHODS: Researchers consulted with clinicians to develop an assessment tool, based on the ISBAR handover framework, combining validity evidence and the existing literature. The tool was applied 'live' by receiving clinicians and from recorded handovers by academic assessors. The tool's performance was analysed using generalisability theory. Receiving clinicians and assessors provided feedback. RESULTS: Reliability for assessing a call was good (G = 0.73 with 4 assessments). The scale had a single-factor structure with good internal consistency (Cronbach's alpha = 0.8). The group mean for the global score for nurses and students was 2.30 (SD 0.85) out of a maximum 3.0, with no difference between these sub-groups. CONCLUSIONS: We have developed and evaluated a tool to assess high-stakes handover in a remote setting. It showed good reliability and was easy for working clinicians to use. Further investigation and use are warranted beyond this setting.


Subjects
Clinical Competence/standards; Medical Staff, Hospital/standards; Patient Handoff; Quality of Health Care/standards; Students, Medical/statistics & numerical data; Checklist; Cross-Sectional Studies; Hotlines; Humans; Medically Underserved Area; New South Wales; Patient Handoff/standards; Professional Practice; Reproducibility of Results
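The internal-consistency figure reported above (Cronbach's alpha = 0.8) is computed from item-score variances. A minimal sketch, using made-up ratings rather than the study's data:

```python
# Cronbach's alpha for an item-score matrix (rows = handovers scored,
# columns = scale items). The ratings below are invented for illustration.

def cronbach_alpha(scores):
    """alpha = k/(k-1) * (1 - sum(item variances) / variance of totals)."""
    k = len(scores[0])                      # number of items
    def var(xs):                            # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    item_vars = [var([row[i] for row in scores]) for i in range(k)]
    total_var = var([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

ratings = [  # three handovers scored on four hypothetical ISBAR-style items
    [2, 2, 3, 2],
    [1, 1, 2, 1],
    [3, 3, 3, 3],
]
alpha = cronbach_alpha(ratings)
```

Perfectly correlated items give alpha = 1; items that vary independently of each other drive it toward 0.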
3.
Med Educ ; 47(11): 1080-8, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24117554

ABSTRACT

OBJECTIVES: Free-text comments in multi-source feedback are intended to facilitate change in the assessee's practice. This study was designed to utilise a large dataset of free-text comments obtained in a national pilot study in order to investigate how helpful these free-text comments may be to assessees. METHODS: We investigated: (i) which areas of performance are usually addressed by free-text comments; (ii) to what extent assessors' (doctors, nurses, allied health professionals and clerical or managerial staff) comments correspond to assessees' (career-grade doctors) self-assessments; and (iii) whether the comments contain specific behavioural evidence and suggestions for change. Initially, comments were read through to identify commonly recurring themes. A strong theme was 'respondent-centredness', which refers to the extent to which comments focus on issues that are of value to the assessor rather than to the assessee's personal development. In response to this, the data were re-evaluated against predefined research questions to assess how constructive comments were for the assessee's personal development. RESULTS: Of 11,483 assessor forms, 4777 (42%) included free-text comments. A total of 513 forms contained at least one below-average score and 286 (56%) of these forms contained the assessor's free-text feedback. Free-text comments were mostly rater-centred and addressed the effect of the assessee on the colleague's working life rather than areas of relevance to the assessee's personal development. A total of 1806 assessor/assessee pairs of comments were compared; most demonstrated clear differences of opinion or interpretation. Reliability and supportiveness were over-represented; clinical performance and personal development were under-represented. The comments were unlikely to provide specific behavioural evidence or to address how change might be initiated. CONCLUSIONS: Our data indicate that, in their current form, the overwhelming majority of free-text comments add little to facilitate improvement in assessees' personal development and performance.


Subjects
Education, Medical/methods; Feedback; Learning; Attitude of Health Personnel; Clinical Competence; Health Personnel; Humans; Pilot Projects; Reproducibility of Results
4.
Med Educ ; 46(1): 28-37, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22150194

ABSTRACT

CONTEXT: Historically, assessments have often measured the measurable rather than the important. Over the last 30 years, however, we have witnessed a gradual shift of focus in medical education. We now attempt to teach and assess what matters most. In addition, the component parts of a competence must be marshalled together and integrated to deal with real workplace problems. Workplace-based assessment (WBA) is complex, and has relied on a number of recently developed methods and instruments, of which some involve checklists and others use judgements made on rating scales. Given that judgements are subjective, how can we optimise their validity and reliability? METHODS: This paper gleans psychometric data from a range of evaluations in order to highlight features of judgement-based assessments that are associated with better validity and reliability. It offers some issues for discussion and research around WBA. It refers to literature in a selective way. It does not purport to represent a systematic review, but it does attempt to offer some serious analyses of why some observations occur in studies of WBA and what we need to do about them. RESULTS AND DISCUSSION: Four general principles emerge: the response scale should be aligned to the reality map of the judges; judgements rather than objective observations should be sought; the assessment should focus on competencies that are central to the activity observed, and the assessors who are best-placed to judge performance should be asked to participate.


Subjects
Education, Medical/methods; Education, Medical/standards; Educational Measurement/methods; Clinical Competence/standards; Humans; Psychometrics; Reproducibility of Results; Workplace/psychology
5.
Med Educ ; 45(6): 560-9, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21501218

ABSTRACT

CONTEXT: Assessment in the workplace is important, but many evaluations have shown that assessor agreement and discrimination are poor. Training discussions suggest that assessors find conventional scales invalid. We evaluate scales constructed to reflect developing clinical sophistication and independence in parallel with conventional scales. METHODS: A valid scale should reduce assessor disagreement and increase assessor discrimination. We compare conventional and construct-aligned scales used in parallel to assess approximately 2000 medical trainees by each of three methods of workplace-based assessment (WBA): the mini-clinical evaluation exercise (mini-CEX); the acute care assessment tool (ACAT), and the case-based discussion (CBD). We evaluate how scores reflect assessor disagreement (V(j) and V(j*p) ) and assessor discrimination (V(p) ), and we model reliability using generalisability theory. RESULTS: In all three cases the conventional scale gave a performance similar to that in previous evaluations, but the construct-aligned scales substantially reduced assessor disagreement and substantially increased assessor discrimination. Reliability modelling shows that, using the new scales, the number of assessors required to achieve a generalisability coefficient ≥0.70 fell from six to three for the mini-CEX, from eight to three for the CBD, from 10 to nine for 'on-take' ACAT, and from 30 to 12 for 'post-take' ACAT. CONCLUSIONS: The results indicate that construct-aligned scales have greater utility, both because they are more reliable and because that reliability provides evidence of greater validity. There is also a wider implication: the disappointing reliability of existing WBA methods may reflect not assessors' differing assessments of performance, but, rather, different interpretations of poorly aligned scales. Scales aligned to the expertise of clinician-assessors and the developing independence of trainees may improve confidence in WBA.


Subjects
Clinical Competence/standards; Education, Medical/methods; Analysis of Variance; Education, Medical/standards; Educational Measurement/methods; Educational Measurement/standards; Humans; Models, Educational; Observer Variation; Reference Values; Reproducibility of Results; Workplace
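The D-study step described in the abstract above (how many assessors are needed for a generalisability coefficient ≥ 0.70) can be sketched as a simple search over rater counts. The variance components below are invented for illustration, not taken from the paper:

```python
# D-study sketch: given variance components estimated in a G-study, find
# the smallest number of assessors whose averaged scores reach G >= 0.70.
# Components are hypothetical; the paper's actual estimates are not shown.

def g_coefficient(var_p, var_err, n):
    return var_p / (var_p + var_err / n)

def assessors_needed(var_p, var_err, target=0.70, n_max=50):
    for n in range(1, n_max + 1):
        if g_coefficient(var_p, var_err, n) >= target:
            return n
    return None

# A construct-aligned scale reduces assessor disagreement (var_err),
# so fewer assessors are needed for the same target reliability.
n_conventional = assessors_needed(var_p=0.30, var_err=1.20)  # noisier scale
n_aligned = assessors_needed(var_p=0.30, var_err=0.45)       # aligned scale
```

This is the same mechanism behind the abstract's drop from six assessors to three for the mini-CEX: smaller error variance relative to true trainee variance.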
6.
Med Educ ; 45(8): 843-8, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21752081

ABSTRACT

CONTEXT: Multi-source feedback (MSF) provides a window into complex areas of performance in real workplace settings. However, because MSF elicits subjective judgements, many respondents are needed to achieve a reliable assessment. Optimising the consistency with which questions are interpreted will help reliability. METHODS: We compared two parallel forms of an MSF instrument with identical wording and administration procedures. The original instrument contained 10 compound performance items and was used 12,540 times to assess 977 doctors, including 112 general practitioners (GPs). The modified instrument contained the same wording in 21 non-compound items, each of which asked about a single aspect of performance, and was used 2789 times to assess 205 doctors, all of whom were GPs. Generalisability analysis evaluated questionnaire reliability. The reliability of the original instrument was evaluated for both the whole group and the GP subgroup. RESULTS: The two instruments provided similar numbers of responses per doctor. The modified instrument generated more reliable scores. The whole-group comparison examined precision, measured as standard error of measurement (SEM); seven respondents were sufficient to achieve a 95% confidence interval of 0.25 (on a 4-point scale) with the modified instrument, compared with 10 respondents using the original instrument. The subgroup comparison examined the generalisability coefficient; 15 responses provided a reliability of 0.72 using the modified instrument or 0.58 using the original instrument. CONCLUSIONS: Non-compound questions improved the consistency of scores. We recommend that compound questions be avoided in assessment instrument design.


Subjects
Clinical Competence/standards; Educational Measurement/methods; Employee Performance Appraisal/methods; Models, Educational; Physicians/psychology; Education, Medical, Continuing; Feedback; Humans; Reproducibility of Results; Surveys and Questionnaires; United Kingdom
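The precision argument in the abstract above (how many respondents are needed for a 95% confidence interval of ±0.25 on a 4-point scale) follows from the standard error of a mean. A hedged sketch, with an illustrative error variance rather than the study's estimate:

```python
import math

# If one response has error variance var_e, the mean of n responses has
# SEM = sqrt(var_e / n) and a 95% CI of +/- 1.96 * SEM. Find the smallest
# n whose CI half-width is within the target. var_e below is invented.

def respondents_for_ci(var_e, half_width=0.25, z=1.96, n_max=100):
    for n in range(1, n_max + 1):
        sem = math.sqrt(var_e / n)
        if z * sem <= half_width:
            return n
    return None

n_needed = respondents_for_ci(var_e=0.10)
```

Lower per-response error variance (as the modified, non-compound items produced) directly reduces the number of respondents needed for a given precision.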
7.
Med Educ ; 45(7): 741-7, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21649707

ABSTRACT

CONTEXT: Good examinations have a number of characteristics, including validity, reliable scores, educational impact, practicability and acceptability. Scores from the objective structured clinical examination (OSCE) are more reliable than the single long case examination, but concerns about its validity have led to modifications and the development of other models, such as the mini-clinical evaluation exercise (mini-CEX) and the objective structured long examination record (OSLER). These retain some of the characteristics of the long case, but feature repeated encounters and more structure. Nevertheless, the practical considerations and costs associated with mounting large-scale examinations remain significant. The lack of metrics handicaps progress. This paper reports a system whereby a sequential design concentrates limited resources where they are most needed in order to maintain the reliability of scores and practicability at the pass/fail interface. METHODS: We analysed data pertaining to the final examination administered in 2009. In the complete final examination, candidates see eight real patients (the OSLER) and encounter 12 OSCE stations. Candidates whose performance is judged as entirely satisfactory after the first four patients and six OSCE stations are not examined further. The others - about a third of candidates - see the remaining patients and stations and are judged on the complete examination. Reliability was calculated from the scores of all candidates on the first part of the examination using generalisability theory, and practicability in terms of financial resources. The functioning of the sequential system was assessed by the ability of the first part of the examination to predict the final result for the cohort. RESULTS: Generalisability for the OSLER was 0.63 after four patients and 0.77 after eight patients. The OSCE was less reliable (0.38 after six stations and 0.55 after 12). There was only a weak correlation between the OSLER and the OSCE. The first stage was highly predictive of the results of the second stage. Savings facilitated by the sequential design amounted to approximately GBP 30,000. CONCLUSIONS: The overall utility of examinations involves compromise. The system described provides good perceived validity with reasonably reliable scores; a sequential design can concentrate resources where they are most needed and still allow wide sampling of tasks.


Subjects
Education, Medical, Undergraduate/methods; Educational Measurement/methods; Clinical Competence; Cost-Benefit Analysis; Education, Medical, Undergraduate/economics; Educational Measurement/economics; Feasibility Studies; Humans; Medical History Taking/standards; Patient Simulation; Physical Examination/standards; Reproducibility of Results
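The sequential design described in the abstract above can be expressed as a simple decision rule: clearly satisfactory candidates stop after the screening stage, the rest sit the full examination. The numeric thresholds below are invented for illustration; the paper judges "entirely satisfactory" performance rather than applying a single cut-off:

```python
# Sketch of a sequential examination decision rule. Thresholds are
# hypothetical, chosen only to show the screening-then-full-exam shape.

def sequential_result(stage1_score, stage2_score=None, clear_pass=0.65,
                      overall_pass=0.55):
    """Return (decision, stages_taken) for one candidate."""
    if stage1_score >= clear_pass:
        return ("pass", 1)                        # no further testing needed
    if stage2_score is None:
        return ("needs stage 2", 1)               # must sit the rest
    combined = (stage1_score + stage2_score) / 2  # equal weighting, assumed
    return ("pass" if combined >= overall_pass else "fail", 2)
```

Because most candidates clear the first stage, assessor time is concentrated on the borderline third, which is where score reliability matters most.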
8.
Clin Med (Lond) ; 11(1): 48-53, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21404785

ABSTRACT

This paper outlines the development and evaluation of the utility of workplace-based assessments in higher medical training: case-based discussion (CbD); the acute care assessment tool (ACAT); audit assessment; teaching observation and patient survey (PS). The study population included trainees in higher medical training (ST3+) from physician specialties in the UK. The pilot consisted of a prospective study of the use of the new assessments using local study coordinators (LSCs) and volunteer trainees. In total, 169 LSCs were recruited and 134 trainees returned at least one assessment. The end-of-pilot questionnaire was returned by 44 assessors and 57 trainees. Questionnaire data and qualitative feedback were used to evaluate the validity, impact and feasibility of the new tools. For adequate reliability (coefficient 0.7), a total of 12 CbDs, three ACATs and 16 PS raters are required. There was evidence for the validity and positive educational impact of all the tools. There were difficulties with the feasibility of the PS.


Subjects
Clinical Competence/standards; Educational Measurement/methods; Emergency Medical Services/standards; Teaching/methods; Workplace; Humans; Pilot Projects; Surveys and Questionnaires; United Kingdom
9.
Med Teach ; 33(2): e75-83, 2011.
Article in English | MEDLINE | ID: mdl-21275537

ABSTRACT

BACKGROUND: The UK Department of Health is considering a single, generic multi-source feedback (MSF) questionnaire to inform revalidation. METHOD: Evaluation of an implementation pilot, reporting: response rates, assessor mix, question redundancy and participants' perceptions. Reliability was estimated using generalisability theory. RESULTS: A total of 12,540 responses were received on 977 doctors. The mean time taken to complete an MSF exercise was 68.2 days. The mean number of responses received per doctor was 12.0 (range 1-17), with no significant difference between specialties. Individual question response rates and participants' comments about questions indicate that some questions are less appropriate for some specialties. There was a significant difference in mean scores between specialties. Despite guidance, there were significant differences in the mix of assessors across specialties. More favourable scores were given by progressively more junior doctors. Nurses gave the most reliable scores. CONCLUSIONS: It is feasible to electronically administer a generic questionnaire to a large population of doctors. Generic content is appropriate for most but not all specialties. The differences in mean scores and in the reliability of the MSF between specialties may be due in part to specialty differences in assessor mix. Therefore, the number and mix of assessors should be standardised at specialty level, and scores should not be compared across specialties.


Subjects
Employee Performance Appraisal/methods; Feedback; Licensure/legislation & jurisprudence; Peer Group; Physicians; Clinical Competence; Humans; Medicine; United Kingdom
10.
Med Educ ; 44(7): 690-8, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20636588

ABSTRACT

CONTEXT: There are significant levels of variation in candidate multiple mini-interview (MMI) scores caused by interviewer-related factors. Multi-facet Rasch modelling (MFRM) has the capability to both identify these sources of error and partially adjust for them within a measurement model that may be fairer to the candidate. METHODS: Using facets software, a variance components analysis estimated sources of measurement error that were comparable with those produced by generalisability theory. Fair average scores for the effects of the stringency/leniency of interviewers and question difficulty were calculated and adjusted rankings of candidates were modelled. RESULTS: The decisions of 207 interviewers had an acceptable fit to the MFRM model. For one candidate assessed by one interviewer on one MMI question, 19.1% of the variance reflected candidate ability, 8.9% reflected interviewer stringency/leniency, 5.1% reflected interviewer question-specific stringency/leniency and 2.6% reflected question difficulty. If adjustments were made to candidates' raw scores for interviewer stringency/leniency and question difficulty, 11.5% of candidates would see a significant change in their ranking for selection into the programme. Greater interviewer leniency was associated with the number of candidates interviewed. CONCLUSIONS: Interviewers differ in their degree of stringency/leniency and this appears to be a stable characteristic. The MFRM provides a recommendable way of giving a candidate score which adjusts for the stringency/leniency of whichever interviewers the candidate sees and the difficulty of the questions the candidate is asked.


Subjects
Educational Measurement/methods; Interviews as Topic; School Admission Criteria; Clinical Competence; Communication; Educational Measurement/standards; Faculty, Medical; Humans; Observer Variation; Psychometrics/methods
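The "fair average" idea in the abstract above can be illustrated directionally: subtract the interviewer's estimated leniency and the question's estimated easiness from the raw score. Real multi-facet Rasch modelling estimates these effects jointly on a logit scale, so this linear version is only a toy illustration of the direction of the correction:

```python
# Toy sketch of score adjustment for interviewer stringency/leniency and
# question difficulty. All effect sizes are hypothetical; MFRM itself
# works on a logit scale, not by linear subtraction on raw scores.

def fair_score(raw, interviewer_leniency, question_easiness):
    """Raw score corrected for a lenient interviewer / easy question."""
    return raw - interviewer_leniency - question_easiness

# A candidate rated 8/10 by a lenient interviewer (+0.5) on an easy
# question (+0.3) is adjusted down; the same raw score from a strict
# interviewer (-0.5) on a hard question (-0.3) is adjusted up.
adjusted_easy = fair_score(8.0, 0.5, 0.3)
adjusted_hard = fair_score(8.0, -0.5, -0.3)
```

Two candidates with identical raw scores can thus end up ranked differently once the stringency of the interviewers they happened to see is taken into account, which is the abstract's point about 11.5% of rankings changing.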
11.
Med Educ ; 43(4): 326-34, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19335574

ABSTRACT

OBJECTIVES: The mini-clinical evaluation exercise (mini-CEX) is widely used in the UK to assess clinical competence, but there is little evidence regarding its implementation in the undergraduate setting. This study aimed to estimate the validity and reliability of the undergraduate mini-CEX and discuss the challenges involved in its implementation. METHODS: A total of 3499 mini-CEX forms were completed. Validity was assessed by estimating associations between mini-CEX score and a number of external variables, examining the internal structure of the instrument, checking competency domain response rates and profiles against expectations, and by qualitative evaluation of stakeholder interviews. Reliability was evaluated by the overall reliability coefficient (R), estimation of the standard error of measurement (SEM), and from stakeholders' perceptions. Variance component analysis examined the contribution of relevant factors to students' scores. RESULTS: Validity was threatened by various confounding variables, including: examiner status; case complexity; attachment specialty; patient gender; and case focus. Factor analysis suggested that competency domains reflect a single latent variable. Maximum reliability can be achieved by aggregating scores over 15 encounters (R = 0.73; 95% confidence interval +/- 0.28 based on a 6-point assessment scale). Examiner stringency contributed 29% of score variation and student attachment aptitude 13%. Stakeholder interviews revealed staff development needs, but the majority perceived the mini-CEX as more reliable and valid than the previous long case. CONCLUSIONS: The mini-CEX has good overall utility for assessing aspects of the clinical encounter in an undergraduate setting. Strengths include fidelity, wide sampling, perceived validity, and formative observation and feedback. Reliability is limited by variable examiner stringency, and validity by confounding variables, but these should be viewed within the context of overall assessment strategies.


Subjects
Education, Medical, Undergraduate/methods; Educational Measurement/methods; Clinical Competence/standards; Evaluation Studies as Topic; Statistics as Topic; United Kingdom
12.
Med Teach ; 31(12): e603-7, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19995162

ABSTRACT

BACKGROUND: Professional self-identity is a 'state of mind': identifying oneself as a member of a professional group. Delayed professional self-identity is a barrier to successful transition from student to professional. Current trends in medical education limit student doctors' legitimate peripheral participation and may retard their developing professional self-identity compared with other health and social care students. AIMS: To develop a tool that monitors the development of professional self-identity across the different health and social care professions, and to evaluate the tool with student doctors before wider data collection. METHOD: Content analysis of relevant curricula, mapped to professional standards documents, defined the initial content. Field tests across 10 professional groups refined the questionnaire items. A cross-sectional study of 496 student doctors evaluated validity on the basis of internal structure and relationships with external variables. RESULTS: The 9-item questionnaire indicates a three-factor structure reflecting 'interpersonal tasks', 'generic attributes' and 'profession-specific elements'. Students with greater previous experience of health or social care roles, and students with a more positive attitude to qualification, had significantly more advanced scores than their peers. Scores advanced through the curriculum, showing step changes after the start of clinical attachments. CONCLUSIONS: These data provide sufficient evidence of validity with student doctors to justify wider data collection.


Subjects
Attitude of Health Personnel; Self Concept; Social Identification; Students, Medical/psychology; Surveys and Questionnaires; Benchmarking; Curriculum; Education, Medical; Female; Humans; Male; Professional Competence; United Kingdom
13.
14.
Med Educ ; 42(4): 359-63, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18338988

ABSTRACT

CONTEXT: The Chief Medical Officer's recommendations on medical regulation in the UK suggest that National Health Service (NHS) trusts should assess their doctors and confirm whether they remain fit to practise medicine. OBJECTIVE: We set out to evaluate the utility of hospital trust-based assessment in a 'best-case scenario' within existing resources. METHODS: We carried out a generalisability analysis, and feasibility and validity evaluation, based on an assessment process for 137 career-grade doctors at Chesterfield Royal Hospital, Chesterfield, UK, using validated multi-source feedback (MSF) and patient rating (PR) instruments. RESULTS: Uptake and response rates were good for MSF (91% and 85%, respectively). However, only 6% of non-clinical doctors and anaesthetists, and 48% of clinical doctors, obtained sufficient PR ratings. Aggregate scores were acceptably reliable. Nine combined MSF ratings and 15 PR ratings produce standard errors of measurement of 0.19 on a 6-point scale and 0.15 on a 5-point scale, respectively. Overall aggregate scores did not identify any doctor as unsatisfactory, but 6 doctors were scored as unsatisfactory by 2 or more colleagues or patients. These performance concerns appear to merit further investigation. Patients rated female doctors better than male doctors (4.61 versus 4.46; P < 0.05). Colleagues rated UK graduates better than non-UK graduates (5.31 versus 5.15; P < 0.05). CONCLUSIONS: This study shows that the commissioning of professional services makes the implementation of an assessment process linked to appraisal feasible. However, trust-based assessment requires significant development: developmental appraisal needs protection; new instruments are needed for non-clinical specialties; PR requires specific administrative support, and guidance is required over concern thresholds and demographic effects. Disaggregated assessment data may help identify doctors with potential performance problems.


Subjects
Clinical Competence/standards; Hospitals, District; Licensure; Medical Staff, Hospital/standards; Consultants; Feasibility Studies; United Kingdom
15.
Med Educ ; 42(4): 396-404, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18338992

ABSTRACT

CONTEXT: We wished to determine which factors are important in ensuring interviewers are able to make reliable and valid decisions about the non-cognitive characteristics of candidates when selecting candidates for entry into a graduate-entry medical programme using the multiple mini-interview (MMI). METHODS: Data came from a high-stakes admissions procedure. Content validity was assured by using a framework based on international criteria for sampling the behaviours expected of entry-level students. A variance components analysis was used to estimate the reliability and sources of measurement error. Further modelling was used to estimate the optimal configurations for future MMI iterations. RESULTS: This study refers to 485 candidates, 155 interviewers and 21 questions taken from a pre-prepared bank. For a single MMI question and 1 assessor, 22% of the variance between scores reflected candidate-to-candidate variation. The reliability for an 8-question MMI was 0.7; to achieve 0.8 would require 14 questions. Typical inter-question correlations ranged from 0.08 to 0.38. A disattenuated correlation with the Graduate Australian Medical School Admissions Test (GAMSAT) subsection 'Reasoning in Humanities and Social Sciences' was 0.26. CONCLUSIONS: The MMI is a moderately reliable method of assessment. The largest source of error relates to aspects of interviewer subjectivity, suggesting interviewer training would be beneficial. Candidate performance on 1 question does not correlate strongly with performance on another question, demonstrating the importance of context specificity. The MMI needs to be sufficiently long for precise comparison for ranking purposes. We supported the validity of the MMI by showing a small positive correlation with GAMSAT section scores.


Subjects
Education, Medical, Undergraduate/methods; Interviews as Topic; School Admission Criteria; Schools, Medical; Decision Making; Licensure, Medical; Problem-Based Learning; Sensitivity and Specificity
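The D-study claim in the abstract above (reliability 0.7 with 8 questions; 14 questions needed for 0.8) is consistent with the standard Spearman-Brown prophecy formula, which predicts reliability when a test is lengthened (this is a consistency check, not necessarily the exact model the authors fitted):

```python
# Spearman-Brown prophecy: a test of relative length k (k = n2/n1) with
# base reliability r1 has predicted reliability k*r1 / (1 + (k-1)*r1).

def spearman_brown(r1, k):
    return k * r1 / (1 + (k - 1) * r1)

def length_for(r1, n1, target):
    """Smallest number of questions reaching the target reliability."""
    n2 = n1 + 1
    while spearman_brown(r1, n2 / n1) < target:
        n2 += 1
    return n2

questions_needed = length_for(r1=0.70, n1=8, target=0.80)  # -> 14
```

`length_for(0.70, 8, 0.80)` returns 14, matching the abstract's figure, which suggests the authors' D-study follows the same averaging logic.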
16.
Med Educ ; 42(10): 1014-20, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18823521

ABSTRACT

CONTEXT: The white paper 'Trust, Assurance and Safety: the Regulation of Health Professionals in the 21st Century' proposes a single, generic multi-source feedback (MSF) instrument in the UK. Multi-source feedback was proposed as part of the assessment programme for Year 1 specialty training in histopathology. METHODS: An existing instrument was modified following blueprinting against the histopathology curriculum to establish content validity. Trainees were also assessed using an objective structured practical examination (OSPE). Factor analysis and correlation between trainees' OSPE performance and the MSF were used to explore validity. RESULTS: All 92 trainees participated and the assessor response rate was 93%. Reliability was acceptable with eight assessors (95% confidence interval 0.38). Factor analysis revealed two factors: 'generic' and 'histopathology'. Pearson correlation of MSF scores with OSPE performances was 0.48 (P = 0.001) and the histopathology factor correlated more highly (histopathology r = 0.54, generic r = 0.42; t = -2.76, d.f. = 89, P < 0.01). Trainees scored least highly in relation to ability to use histopathology to solve clinical problems (mean = 4.39) and provision of good reports (mean = 4.39). Three of six doctors whose means were < 4.0 received free text comments about report writing. There were 83 forms with aggregate scores of < 4. Of these, 19.2% included comments about report writing. CONCLUSIONS: Specialty-specific MSF is feasible and achieves satisfactory reliability. The higher correlation of the 'histopathology' factor with the OSPE supports validity. This paper highlights the importance of validating an MSF instrument within the specialty-specific context as, in addition to assuring content validity, the PATH-SPRAT (Histopathology-Sheffield Peer Review Assessment Tool) also demonstrates the potential to inform training as part of a quality improvement model.


Subjects
Clinical Competence/standards; Education, Medical, Graduate/standards; Feedback; Pathology/education; Educational Measurement/methods; Feasibility Studies; Female; Humans; Male; Statistics as Topic; United Kingdom
17.
Clin Teach ; 8(4): 267-71, 2011 Dec.
Article in English | MEDLINE | ID: mdl-22085005

ABSTRACT

BACKGROUND: New guidelines require all undergraduate medical students to undertake at least one period of assistantship where they assume most of the responsibilities of a first-year graduate doctor (FY1 doctor in the UK) under supervision. AIM: To investigate the feasibility of these assistantships. METHOD: All UK schools were sent a questionnaire addressing the supervision required and the main barriers around implementation. RESULTS: Competencies that students already engage in as part of existing clinical placements and a number of 'tacit' competencies (e.g. practice and promote infection control) were regarded by most as suitable. Activities that present a clear clinical risk (e.g. prescribing and writing clinical correspondence) were regarded by most as unsuitable or requiring continuous supervision. Some lower risk but hard to measure activities (e.g. responding in practice to audit) were also regarded as unsuitable by some. A competency was usually considered inappropriate for one of three reasons: (1) current clinical governance and patient safety protocols appeared to bar students undertaking the competency; (2) a competency was not considered to be part of the current FY1 doctors' role; or (3) brief assistantships were considered unlikely to create sufficient opportunity for performing the competency. DISCUSSION: The article presents a number of practical issues in relation to assigning responsibility to student doctors. Respondents indicate that successful assistantships will only be possible if the UK National Health Service trusts review their attitude to balancing short- and long-term risks: assistantships need to be long enough to create genuine responsibility opportunities, and will require investment in supervision beyond the current capacity.


Subjects
Clinical Clerkship/organization & administration, Clinical Competence, Education, Medical, Undergraduate/organization & administration, Attitude of Health Personnel, Humans, Surveys and Questionnaires, United Kingdom
18.
Med Educ ; 41(10): 926-34, 2007 Oct.
Article in English | MEDLINE | ID: mdl-17908111

ABSTRACT

CONTEXT: Investigators applying generalisability theory to educational research and evaluation have sometimes done so poorly. The main difficulties relate to: inadequate or non-random sampling of effects, dealing with naturalistic data, and interpreting and presenting variance components. METHODS: This paper addresses these areas of difficulty, and articulates an informal consensus amongst medical educators from Europe, Australia and the USA who are familiar with generalisability theory. RESULTS: We make the following recommendations. Ensure that all relevant factors are sampled, and that the sampling meets the theory's assumption that the conditions represent a random and representative sample of the factor's 'universe'. Research evaluations will require large samples of each factor if they are to generalise adequately. Where feasible, conduct 2 separate studies (pilot and evaluation, or Generalisability and Decision studies). For unbalanced data, use either urGENOVA or, if the data are too complex, one of the procedures minimum norm quadratic unbiased estimation (MINQUE), maximum likelihood (ML) or restricted maximum likelihood (REML) in SPSS or SAS. State which mathematical procedure was used and the degrees of freedom (d.f.) of the effect estimates. If the procedure does not report d.f., re-analyse with a Type III sums of squares ANOVA and report these d.f. Describe and justify the regression model used. Present the raw variance components. Describe the effects that they represent in plain, non-statistical language. If the standard error of measurement (SEM) or reliability coefficients are presented, give the equations used to calculate them. Make sure that the method of reporting reliability (precision or discrimination) is appropriate to the purpose of the assessment; this will usually demand a precision indicator such as the SEM. Consider a graphical presentation to combine precision and discrimination.
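As a concrete illustration of the recommendation to present raw variance components and describe them in plain language, here is a minimal sketch of a one-facet G study for a fully crossed persons x raters design, with components estimated from the ANOVA expected mean squares. The scores are invented illustrative data, and negative component estimates are truncated at zero, as is conventional.

```python
def g_study(scores):
    """One-facet G study for a crossed persons x raters design.

    scores[p][r] is the score rater r gave person p.  Variance
    components come from the ANOVA expected mean squares; negative
    estimates are truncated at zero.
    """
    n_p, n_r = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    p_means = [sum(row) / n_r for row in scores]
    r_means = [sum(scores[p][r] for p in range(n_p)) / n_p for r in range(n_r)]

    ss_p = n_r * sum((m - grand) ** 2 for m in p_means)
    ss_r = n_p * sum((m - grand) ** 2 for m in r_means)
    ss_tot = sum((x - grand) ** 2 for row in scores for x in row)
    ss_pr = ss_tot - ss_p - ss_r  # residual: person x rater interaction + error

    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_pr = ss_pr / ((n_p - 1) * (n_r - 1))

    var_pr = ms_pr                          # sigma^2(pr,e): noise per rating
    var_p = max(0.0, (ms_p - ms_pr) / n_r)  # sigma^2(p): true person variance
    var_r = max(0.0, (ms_r - ms_pr) / n_p)  # sigma^2(r): rater stringency

    g = var_p / (var_p + var_pr / n_r)      # relative G coefficient
    sem = (var_pr / n_r) ** 0.5             # relative SEM
    return {"var_p": var_p, "var_r": var_r, "var_pr": var_pr,
            "G": g, "SEM": sem}

# Invented data: 4 examinees each scored by the same 3 raters.
result = g_study([[7, 8, 7], [5, 5, 6], [9, 9, 8], [6, 7, 6]])
# result["G"] ≈ 0.945 for this toy data set
```

In a Decision study, `var_pr / n_r` would be recomputed with the planned number of raters rather than the observed one, which is how the paper's pilot-then-evaluation recommendation plays out in practice.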


Assuntos
Pesquisa Biomédica , Educação Médica/normas , Análise de Variância , Coleta de Dados , Projetos de Pesquisa
19.
Med Educ ; 40(4): 363-70, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16573673

ABSTRACT

INTRODUCTION: We wished to determine whether assessors could make reliable and valid judgements about the quality of completed reflective personal development plans (PDPs) for the purpose of accrediting UK general practitioners (GPs) for a postgraduate education allowance using a marking matrix, and secondly, to plan a feasible model of PDP assessment in the context of forthcoming GP appraisal/revalidation that would overcome the main sources of error identified from this study. METHODS: Within generalisability theory, a variance components analysis on PDP scores estimated reliability and the effect on them of varying, for example, the number of assessors. We investigated the construct validity of the matrix through its internal consistency and detection of differences in the quality of PDPs. RESULTS: For a single PDP and one assessor, 37.6% of the variance in scores was due to true differences in the quality of the PDP. Between 5 and 7 PDP assessors are needed to achieve summative reliability of greater than 0.8. While increasing the number of judges is important, reliability could also be improved by addressing assessor subjectivity. Construct validity was demonstrated, as the matrix distinguished between good, satisfactory and poor PDPs, and it had good internal consistency. CONCLUSION: PDP assessment has reasonable summative characteristics for the purpose of assessing GPs' reflective continuing professional development. If doctors could include their PDPs within their revalidation folders as evidence of their reflections on pursuing better clinical performance, we have described a reliable, valid and feasible method of external assessment.
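The link between the single-assessor figure above (37.6% of score variance due to true differences in PDP quality) and the 5-7 assessors needed for reliability above 0.8 follows the standard Decision-study (Spearman-Brown) projection. A minimal sketch, treating 0.376 as the single-assessor reliability:

```python
def projected_reliability(single_assessor_rel, n_assessors):
    """Spearman-Brown / D-study projection: reliability of the mean of
    n_assessors parallel assessments, given the single-assessor value."""
    r, k = single_assessor_rel, n_assessors
    return k * r / (1 + (k - 1) * r)

# 37.6% of variance attributable to true differences in PDP quality.
for k in (1, 5, 6, 7):
    print(k, round(projected_reliability(0.376, k), 3))
# → 1 0.376 / 5 0.751 / 6 0.783 / 7 0.808
```

Only at seven assessors does the projection cross 0.8, which is consistent with the abstract's "between 5 and 7" figure; reducing assessor subjectivity raises the single-assessor value and so lowers the number of judges needed.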


Assuntos
Competência Clínica/normas , Educação de Pós-Graduação em Medicina/normas , Medicina de Família e Comunidade/educação , Desenvolvimento de Pessoal , Acreditação , Documentação , Psicometria , Reprodutibilidade dos Testes , Reino Unido
20.
Med Educ ; 39(8): 807-19, 2005 Aug.
Article in English | MEDLINE | ID: mdl-16048623

ABSTRACT

CONTEXT: The clinical consultation is an important aspect of the doctor's role. However, there is a particular shortage of methods for assessing its quality, and its complexity makes it a considerable assessment challenge. RESEARCH QUESTION: What are the key components of consultations involving children? METHODS: (1) A content analysis of relevant published and unpublished literature. (2) A nominal group consensus exercise with experienced paediatricians. RESULTS: The content analysis and consensus exercise suggested similar lists of doctor's characteristics, tasks and outcomes as important components of the consultation. Doctor's characteristics include: clinical judgement, clinical knowledge, physical examination, information gathering, clinical questioning, information giving, patient-centredness, parent-centredness, interpersonal skills and consultation management. Important tasks include: organisation and efficiency, rapport, information gathering, getting the family perspective, examination and procedures, evaluation, medically appropriate plans, family-appropriate plans, enhancing understanding and recall, achieving consensus, sharing responsibility, ensuring the family knows how to get further help, and liaison with other relevant health care professionals. Important outcomes include: family satisfaction, family perceptions, compliance, health, health-related problems and doctor's satisfaction. The studies reviewed also provided a catalogue of factors shown to influence the doctor-patient interaction, which could potentially confound the assessment of a doctor's performance. These include the doctor's age, gender, training, speciality, income, social class and politics, and the patient's age, gender, health, prognosis, social class, education, health beliefs and preferences about control and risk. The length of the acquaintance between doctor and patient, and the workload and case mix in the clinic, also affect the interaction.
In several studies it is clear that particular combinations of doctor-type and patient-type have especially good or bad interactions. CONCLUSIONS AND FURTHER WORK: These components are synthesised in a single model of the doctor-patient interaction to guide the development and evaluation of assessment instruments aimed at consultations involving children.


Assuntos
Criança , Competência Clínica/normas , Pais , Pediatria/normas , Relações Médico-Paciente , Relações Profissional-Família , Consenso , Humanos , Satisfação do Paciente , Encaminhamento e Consulta