Evaluation of freely available data profiling tools for health data research application: a functional evaluation review.
Gordon, Ben; Fennessy, Clara; Varma, Susheel; Barrett, Jake; McCondochie, Enez; Heritage, Trevor; Duroe, Oenone; Jeffery, Richard; Rajamani, Vishnu; Earlam, Kieran; Banda, Victor; Sebire, Neil.
Affiliation
  • Gordon B; Central Team, Health Data Research UK, London, UK.
  • Fennessy C; Central Team, Health Data Research UK, London, UK.
  • Varma S; Central Team, Health Data Research UK, London, UK.
  • Barrett J; Central Team, Health Data Research UK, London, UK.
  • McCondochie E; Inspirata Ltd, Tampa, Florida, USA.
  • Heritage T; Inspirata Ltd, Tampa, Florida, USA.
  • Duroe O; Inspirata Ltd, Tampa, Florida, USA.
  • Jeffery R; Inspirata Ltd, Tampa, Florida, USA.
  • Rajamani V; Inspirata Ltd, Tampa, Florida, USA.
  • Earlam K; Cystic Fibrosis Trust, London, UK.
  • Banda V; Neonatal Data Analysis Unit, Imperial College London Neonatal Medicine Research Group, London, UK.
  • Sebire N; Central Team, Health Data Research UK, London, UK; neil.sebire@hdruk.ac.uk.
BMJ Open; 12(5): e054186, 2022 05 09.
Article in English | MEDLINE | ID: mdl-35534084
ABSTRACT

OBJECTIVES:

To objectively evaluate freely available data profiling software tools using healthcare data.

DESIGN:

Data profiling tools were first evaluated for their capabilities using publicly available information and data sheets. Following this initial assessment, several tools underwent further detailed evaluation for application to healthcare data, using a synthetic dataset of 1000 patients and associated data structured according to a common health data model. Tools were scored on their functionality with this dataset.

SETTING:

Improving the quality of healthcare data for research use is a priority. Profiling tools can assist by evaluating datasets across a range of quality dimensions. Several freely available software packages offer profiling capabilities, but healthcare organisations often have limited data engineering capability and expertise.

PARTICIPANTS:

28 profiling tools, of which 8 underwent evaluation on a synthetic dataset of 1000 patients.

RESULTS:

Of 28 potential profiling tools initially identified, 8 showed high potential for applicability to healthcare datasets based on available documentation. Of these, two performed consistently well across multiple tasks, including determination of completeness, consistency, uniqueness, validity and accuracy, and provision of distribution metrics.
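
For readers unfamiliar with these quality dimensions, the following is a minimal illustrative sketch of how a few of them (completeness, uniqueness, validity and a simple distribution metric) might be computed. It is not taken from the study or from any of the evaluated tools; the records, field names, validity rules and helper functions are all hypothetical.

```python
from collections import Counter

# Hypothetical toy records; field names and values are illustrative only
# and do not come from the study's synthetic dataset.
records = [
    {"patient_id": "P001", "sex": "F", "birth_year": 1980},
    {"patient_id": "P002", "sex": "M", "birth_year": None},
    {"patient_id": "P002", "sex": "X", "birth_year": 1975},
    {"patient_id": "P004", "sex": None, "birth_year": 2090},
]


def non_missing(rows, field):
    """Return the values of a field that are not missing (None)."""
    return [r[field] for r in rows if r[field] is not None]


def completeness(rows, field):
    """Fraction of rows in which the field is populated."""
    return len(non_missing(rows, field)) / len(rows)


def uniqueness(rows, field):
    """Fraction of populated values that are distinct."""
    values = non_missing(rows, field)
    return len(set(values)) / len(values) if values else 0.0


def validity(rows, field, is_valid):
    """Fraction of populated values that satisfy a rule."""
    values = non_missing(rows, field)
    return sum(is_valid(v) for v in values) / len(values) if values else 0.0


def distribution(rows, field):
    """Frequency count of populated values."""
    return Counter(non_missing(rows, field))


# Assumed validity rules, chosen only for this example.
profile = {
    "birth_year_completeness": completeness(records, "birth_year"),
    "patient_id_uniqueness": uniqueness(records, "patient_id"),
    "sex_validity": validity(records, "sex", lambda v: v in {"F", "M"}),
    "birth_year_validity": validity(
        records, "birth_year", lambda y: 1900 <= y <= 2022
    ),
    "sex_distribution": distribution(records, "sex"),
}

for name, value in profile.items():
    print(name, value)
```

Dedicated profiling tools typically report such metrics per column across a whole dataset; the sketch only shows the kind of calculation each dimension implies.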

CONCLUSIONS:

Numerous freely available profiling tools are serviceable for use with health datasets, of which at least two demonstrated high performance across a range of technical data quality dimensions when tested with a synthetic health dataset and a common data model. The appropriate choice of tool depends on factors including the underlying organisational infrastructure and the level of data engineering and coding expertise, but freely available tools exist that can help profile health datasets for research use and inform curation activity.

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Software / Delivery of Health Care Type of study: Prognostic_studies Aspect: Social determinants of health Limit: Humans Language: En Journal: BMJ Open Publication year: 2022 Document type: Article Affiliation country: United Kingdom
