De-identification of medical records using conditional random fields and long short-term memory networks.
Jiang, Zhipeng; Zhao, Chao; He, Bin; Guan, Yi; Jiang, Jingchi.
Affiliation
  • Jiang Z; School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China. Electronic address: hit.jiang@hotmail.com.
  • Zhao C; School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China. Electronic address: zhaochaocs@gmail.com.
  • He B; School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China. Electronic address: hebin_hit@hotmail.com.
  • Guan Y; School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China. Electronic address: guanyi@hit.edu.cn.
  • Jiang J; School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China. Electronic address: jiangjingchi0118@163.com.
J Biomed Inform ; 75S: S43-S53, 2017 Nov.
Article in English | MEDLINE | ID: mdl-29032162
ABSTRACT
The CEGS N-GRID 2016 Shared Task 1 in Clinical Natural Language Processing focuses on the de-identification of psychiatric evaluation records. This paper describes two participating systems of our team, based on conditional random fields (CRFs) and long short-term memory networks (LSTMs). A pre-processing module was introduced for sentence detection and tokenization before de-identification. For CRFs, manually extracted rich features were utilized to train the model. For LSTMs, a character-level bi-directional LSTM network was applied to represent tokens and classify tags for each token, following which a decoding layer was stacked to decode the most probable protected health information (PHI) terms. The LSTM-based system attained an i2b2 strict micro-F1 measure of 0.8986, which was higher than that of the CRF-based system.
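The strict micro-F1 figure reported above can be illustrated with a minimal sketch. It assumes that "strict" means a predicted PHI span counts as correct only when its document, character offsets, and PHI type all match a gold span exactly, and that "micro" means counts are pooled over all documents and types before computing the score. The `Span` layout and the function name `strict_micro_f1` are hypothetical and are not taken from the paper or from the official evaluation script, which may differ in detail.

```python
from typing import Iterable, Tuple

# Hypothetical span layout: (document id, start offset, end offset, PHI type).
Span = Tuple[str, int, int, str]


def strict_micro_f1(
    gold: Iterable[Span], predicted: Iterable[Span]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) pooled over all documents and PHI types.

    Assumption: a prediction is a true positive only if the whole tuple
    (document, start, end, type) is identical to a gold annotation.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)

    # Guard against empty inputs so the function never divides by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0

    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1
```

Under this definition, a prediction whose boundary is off by a single character counts both as a false positive and as a missed gold span, which is why strict scores are typically lower than token-level or relaxed scores.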

Full text: 1 Database: MEDLINE Main subject: Medical Records / Data Anonymization / Memory, Short-Term Language: En Publication year: 2017 Document type: Article
