Machine translation of standardised medical terminology using natural language processing: A scoping review.

Noll, Richard; Frischen, Lena S; Boeker, Martin; Storf, Holger; Schaaf, Jannik

Affiliations

Noll R; Goethe University Frankfurt, University Hospital Frankfurt, Institute of Medical Informatics, Frankfurt, Germany. Electronic address: richard.noll@kgu.de.
Frischen LS; University Hospital Frankfurt, Goethe University, Executive Department for medical IT-Systems and digitalization, Frankfurt, Germany.
Boeker M; Institute for Artificial Intelligence and Informatics in Medicine, Chair of Medical Informatics, Medical Center rechts der Isar, Technical University of Munich, Munich, Germany.
Storf H; Goethe University Frankfurt, University Hospital Frankfurt, Institute of Medical Informatics, Frankfurt, Germany.
Schaaf J; Goethe University Frankfurt, University Hospital Frankfurt, Institute of Medical Informatics, Frankfurt, Germany.

N Biotechnol; 77: 120-129, 2023 Nov 25.

Article in English | MEDLINE | ID: mdl-37652265

ABSTRACT

Standardised medical terminologies are used to ensure accurate and consistent communication of information and to facilitate data exchange. Currently, many terminologies are only available in English, which hinders international research and automated processing of medical data. Natural language processing (NLP) and Machine Translation (MT) methods can be used to automatically translate these terms. This scoping review examines the research on automated translation of standardised medical terminology. A search was performed in PubMed and Web of Science and results were screened for eligibility by title and abstract as well as full text screening. In addition to bibliographic data, the following data items were considered: 'terminology considered', 'terms considered', 'source language', 'target language', 'translation type', 'NLP technique', 'NLP system', 'machine translation system', 'data source' and 'translation quality'. The results showed that the most frequently translated terminology is SNOMED CT (39.1%), followed by MeSH (13%), ICD (13%) and UMLS (8.7%). The most common source language is English (55.9%), and the most common target language is German (41.2%). Translation methods are often based on Statistical Machine Translation (SMT) (41.7%) and, more recently, Neural Machine Translation (NMT) (30.6%), but can also be combined with various MT methods. Commercial translators such as Google Translate (36.4%) and automatic validation methods such as BLEU (22.2%) are frequently used tools for translation and subsequent validation.

Subjects

Natural Language Processing; Translating; Language; Systematized Nomenclature of Medicine

Keywords

Controlled vocabulary; Machine translation; NLP

Full text: 1 Database: MEDLINE Main subject: Translating / Natural Language Processing Study type: Systematic_reviews Language: En Publication year: 2023 Document type: Article