A unified framework of medical information annotation and extraction for Chinese clinical text.
Zhu, Enwei; Sheng, Qilin; Yang, Huanwan; Liu, Yiyang; Cai, Ting; Li, Jinpeng.
  • Zhu E; Ningbo No. 2 Hospital, Ningbo 315010, Zhejiang Province, PR China; Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo 315016, Zhejiang Province, PR China. Electronic address: zhuenwei@ucas.ac.cn.
  • Sheng Q; Ningbo No. 2 Hospital, Ningbo 315010, Zhejiang Province, PR China. Electronic address: shengqilin0106@163.com.
  • Yang H; Ningbo No. 2 Hospital, Ningbo 315010, Zhejiang Province, PR China. Electronic address: yanghuanw@163.com.
  • Liu Y; Ningbo No. 2 Hospital, Ningbo 315010, Zhejiang Province, PR China; Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo 315016, Zhejiang Province, PR China. Electronic address: liuyiyang@ucas.ac.cn.
  • Cai T; Ningbo No. 2 Hospital, Ningbo 315010, Zhejiang Province, PR China; Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo 315016, Zhejiang Province, PR China. Electronic address: caiting@ucas.ac.cn.
  • Li J; Ningbo No. 2 Hospital, Ningbo 315010, Zhejiang Province, PR China; Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo 315016, Zhejiang Province, PR China. Electronic address: lijinpeng@ucas.ac.cn.
Artif Intell Med; 142: 102573, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37316096
Medical information extraction comprises a group of natural language processing (NLP) tasks that together convert clinical text into pre-defined structured formats. It is a critical step in exploiting electronic medical records (EMRs). Given recent advances in NLP, model implementation and performance no longer seem to be obstacles; the bottleneck instead lies in obtaining a high-quality annotated corpus and in the overall engineering workflow. This study presents an engineering framework consisting of three tasks: medical entity recognition, relation extraction, and attribute extraction. Within this framework, the whole workflow is demonstrated, from EMR data collection through model performance evaluation. Our annotation scheme is designed to be comprehensive and compatible across the multiple tasks. With EMRs from a general hospital in Ningbo, China, and manual annotation by experienced physicians, our corpus is large in scale and high in quality. Built upon this Chinese clinical corpus, the medical information extraction system shows performance that approaches human annotation. The annotation scheme, (a subset of) the annotated corpus, and the code are all publicly released to facilitate further research.
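To illustrate what "compatible between the multiple tasks" could mean in practice, the following is a minimal sketch of one annotation record in which relations and attributes both refer to the same entity spans. It is not the authors' released scheme or code: the class names, the label strings ("BodyPart", "Finding", "Symptom", "LocatedAt", "Negated"), and the example sentence are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Entity:
    start: int  # inclusive character offset into the text
    end: int    # exclusive character offset into the text
    label: str  # hypothetical entity type


@dataclass(frozen=True)
class Relation:
    head: Entity
    tail: Entity
    label: str  # hypothetical relation type


@dataclass(frozen=True)
class Attribute:
    entity: Entity
    label: str  # hypothetical attribute type, e.g. negation


@dataclass
class AnnotatedText:
    text: str
    entities: List[Entity] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)
    attributes: List[Attribute] = field(default_factory=list)

    def add_entity(self, start: int, end: int, label: str) -> Entity:
        # Reject spans that fall outside the text or are empty.
        if not (0 <= start < end <= len(self.text)):
            raise ValueError("entity span out of range")
        entity = Entity(start, end, label)
        self.entities.append(entity)
        return entity

    def add_relation(self, head: Entity, tail: Entity, label: str) -> Relation:
        # Relations may only link entities already annotated in this text.
        if head not in self.entities or tail not in self.entities:
            raise ValueError("relation refers to an unknown entity")
        relation = Relation(head, tail, label)
        self.relations.append(relation)
        return relation

    def add_attribute(self, entity: Entity, label: str) -> Attribute:
        # Attributes attach to a single annotated entity.
        if entity not in self.entities:
            raise ValueError("attribute refers to an unknown entity")
        attribute = Attribute(entity, label)
        self.attributes.append(attribute)
        return attribute

    def surface(self, entity: Entity) -> str:
        # Recover the text covered by an entity span.
        return self.text[entity.start:entity.end]


def build_example() -> AnnotatedText:
    # Hypothetical sentence: "left lung nodule, no fever".
    doc = AnnotatedText("左肺结节，无发热")
    body_part = doc.add_entity(0, 2, "BodyPart")  # 左肺
    finding = doc.add_entity(2, 4, "Finding")     # 结节
    symptom = doc.add_entity(6, 8, "Symptom")     # 发热
    doc.add_relation(finding, body_part, "LocatedAt")
    doc.add_attribute(symptom, "Negated")
    return doc
```

Because all three layers share one list of entity spans, a single annotated record of this shape could supply training examples for entity recognition, relation extraction, and attribute extraction without re-annotating the text.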

Full text: 1 | Database: MEDLINE | Main subject: Physicians / Electronic Health Records | Study type: Guideline / Prognostic_studies | Limit: Humans | Country as subject: Asia | Language: En | Year: 2023 | Document type: Article