Customization scenarios for de-identification of clinical notes.
Hartman, Tzvika; Howell, Michael D; Dean, Jeff; Hoory, Shlomo; Slyper, Ronit; Laish, Itay; Gilon, Oren; Vainstein, Danny; Corrado, Greg; Chou, Katherine; Po, Ming Jack; Williams, Jutta; Ellis, Scott; Bee, Gavin; Hassidim, Avinatan; Amira, Rony; Beryozkin, Genady; Szpektor, Idan; Matias, Yossi.
  • Hartman T; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Howell MD; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Dean J; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Hoory S; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA. hoorys@google.com.
  • Slyper R; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Laish I; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Gilon O; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Vainstein D; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Corrado G; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Chou K; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Po MJ; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Williams J; , Palo Alto, CA, USA.
  • Ellis S; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Bee G; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Hassidim A; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Amira R; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Beryozkin G; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Szpektor I; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Matias Y; Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
BMC Med Inform Decis Mak; 20(1): 14, 2020 Jan 30.
Article in English | MEDLINE | ID: mdl-32000770
ABSTRACT

BACKGROUND:

Automated machine-learning systems are able to de-identify electronic medical records, including free-text clinical notes. Use of such systems would greatly boost the amount of data available to researchers, yet their deployment has been limited due to uncertainty about their performance when applied to new datasets.

OBJECTIVE:

We present practical options for clinical note de-identification, assessing performance of machine learning systems ranging from off-the-shelf to fully customized.

METHODS:

We implement a state-of-the-art machine learning de-identification system, training and testing on pairs of datasets that match the deployment scenarios. We use clinical notes from two i2b2 competition corpora, the Physionet Gold Standard corpus, and parts of the MIMIC-III dataset.

RESULTS:

Fully customized systems remove 97-99% of personally identifying information. Performance of off-the-shelf systems varies by dataset but is mostly above 90%. Providing a small labeled dataset or a large unlabeled dataset allows for fine-tuning that improves performance over off-the-shelf systems.
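
As an illustration of what a "percentage of personally identifying information removed" figure can mean, the sketch below computes a character-level recall for a single note. The abstract does not specify the exact metric, so the character-level definition, the function name `pii_recall`, and the sample note and spans are assumptions made for illustration only; they are not the authors' evaluation code.

```python
def pii_recall(gold_spans, predicted_spans):
    """Fraction of gold PII characters covered by at least one predicted span.

    Spans are (start, end) character offsets with an exclusive end.
    This character-level definition is an assumption, not the paper's metric.
    """
    gold_chars = set()
    for start, end in gold_spans:
        gold_chars.update(range(start, end))
    if not gold_chars:
        # Nothing to remove, so nothing was missed.
        return 1.0
    pred_chars = set()
    for start, end in predicted_spans:
        pred_chars.update(range(start, end))
    return len(gold_chars & pred_chars) / len(gold_chars)


# Hypothetical note with three annotated PII spans: a name, a date, a hospital.
note = "Pt John Smith seen on 03/04/2019 at Mercy Hospital."
gold = [(3, 13), (22, 32), (36, 50)]
# Hypothetical system output that finds the name and date but misses the hospital.
pred = [(3, 13), (22, 32)]

recall = pii_recall(gold, pred)
print(f"PII removed: {recall:.1%}")
```

In this toy case the missed hospital name accounts for 14 of the 34 annotated characters, so well under the 90% level is reported; a corpus-level figure would aggregate over all notes in the test set.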

CONCLUSION:

Health organizations should be aware of the levels of customization available when selecting a de-identification deployment solution, in order to choose the one that best matches their resources and target performance level.

Full text: 1 Database: MEDLINE Main subject: Electronic Health Records / Machine Learning / Data Anonymization Type of study: Diagnostic_studies / Prognostic_studies Limits: Humans Language: En Year: 2020 Document type: Article