Data mining in clinical big data: the frequently used databases, steps, and methodological models.

Wu, Wen-Tao; Li, Yuan-Jie; Feng, Ao-Zi; Li, Li; Huang, Tao; Xu, An-Ding; Lyu, Jun

Affiliation

Wu WT; Department of Clinical Research, The First Affiliated Hospital of Jinan University, Tianhe District, 613 W. Huangpu Avenue, Guangzhou, 510632, Guangdong, China.
Li YJ; School of Public Health, Xi'an Jiaotong University Health Science Center, Xi'an, 710061, Shaanxi, China.
Feng AZ; Department of Human Anatomy, Histology and Embryology, School of Basic Medical Sciences, Xi'an Jiaotong University Health Science Center, Xi'an, 710061, Shaanxi, China.
Li L; Department of Clinical Research, The First Affiliated Hospital of Jinan University, Tianhe District, 613 W. Huangpu Avenue, Guangzhou, 510632, Guangdong, China.
Huang T; Department of Clinical Research, The First Affiliated Hospital of Jinan University, Tianhe District, 613 W. Huangpu Avenue, Guangzhou, 510632, Guangdong, China.
Xu AD; Department of Clinical Research, The First Affiliated Hospital of Jinan University, Tianhe District, 613 W. Huangpu Avenue, Guangzhou, 510632, Guangdong, China.
Lyu J; Department of Neurology, The First Affiliated Hospital of Jinan University, Tianhe District, 613 W. Huangpu Avenue, Guangzhou, 510632, Guangdong, China. tlil@jnu.edu.cn.

Mil Med Res; 8(1): 44, 2021-08-11.

Article in English | MEDLINE | ID: mdl-34380547

ABSTRACT

Many high-quality studies have emerged from public databases such as Surveillance, Epidemiology, and End Results (SEER), National Health and Nutrition Examination Survey (NHANES), The Cancer Genome Atlas (TCGA), and Medical Information Mart for Intensive Care (MIMIC). However, these data are often characterized by a high degree of dimensional heterogeneity, timeliness, scarcity, irregularity, and other characteristics, so their value is not fully utilized. Data-mining technology has become a frontier field in medical research, as it demonstrates excellent performance in evaluating patient risks and assisting clinical decision-making in building disease-prediction models. Data mining therefore has unique advantages in clinical big-data research, especially in large-scale public medical databases. This article introduces the main public medical databases and describes the steps, tasks, and models of data mining in simple language. It also describes data-mining methods along with their practical applications. The goal of this work is to give clinical researchers a clear and intuitive understanding of how data-mining technology is applied to clinical big data, in order to promote research results that benefit doctors and patients.

Subjects

Big Data; Data Mining/methods; Databases, Factual/trends; Data Mining/trends; Humans

Keywords

Clinical big data; Data mining; MIMIC; Machine learning; Medical public database; NHANES; SEER; TCGA

Full text: 1 | Database: MEDLINE | Main subject: Databases, Factual / Data Mining / Big Data | Language: En | Year of publication: 2021 | Document type: Article