Exploratory Data Mining for Subgroup Cohort Discoveries and Prioritization.
IEEE J Biomed Health Inform; 24(5): 1456-1468, 2020 05.
Article | En | MEDLINE | ID: mdl-31494566
Finding small homogeneous subgroup cohorts in large heterogeneous populations is a critical process for hypothesis development in biomedical research. Current computational approaches still lack robust answers to the question "what hypotheses are likely to be novel and to produce clinically relevant results with well thought-out study designs?" We have developed a novel subgroup discovery method that employs a deep exploratory mining process to slice and dice thousands of potential subpopulations and to prioritize potential cohorts based on their explainable contrast patterns, which may provide interventionable insights. We conducted computational experiments on both synthesized data and a clinical autism data set to assess performance quantitatively for coverage of pre-defined cohorts and qualitatively for novel knowledge discovery, respectively. We also conducted a scaling analysis in a distributed computing environment to estimate the computational resources needed as the number of subpopulations increases. This work will provide a robust data-driven framework to automatically tailor potential interventions for precision health.
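The general idea of slicing a population into candidate subgroups and ranking them by contrast can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not specify the scoring function, so weighted relative accuracy (WRAcc), a standard subgroup discovery quality measure, is swapped in as an assumption. The toy records, the attribute names, and the helper names `candidate_subgroups`, `wracc`, and `rank_subgroups` are all hypothetical.

```python
from itertools import combinations, product

# Toy records, entirely made up for illustration (not from the paper).
RECORDS = [
    {"sex": "F", "age": "child", "outcome": 1},
    {"sex": "F", "age": "child", "outcome": 1},
    {"sex": "F", "age": "adult", "outcome": 0},
    {"sex": "M", "age": "child", "outcome": 0},
    {"sex": "M", "age": "adult", "outcome": 0},
    {"sex": "M", "age": "adult", "outcome": 1},
]


def candidate_subgroups(records, attributes, max_depth=2):
    """Yield every conjunction of up to max_depth attribute=value conditions."""
    values = {a: sorted({r[a] for r in records}) for a in attributes}
    for depth in range(1, max_depth + 1):
        for attrs in combinations(attributes, depth):
            for combo in product(*(values[a] for a in attrs)):
                yield tuple(zip(attrs, combo))


def wracc(records, subgroup, target="outcome"):
    """Weighted relative accuracy: coverage * (subgroup rate - overall rate).

    Assumption: WRAcc stands in for the paper's unspecified contrast score.
    """
    members = [r for r in records if all(r[a] == v for a, v in subgroup)]
    if not members:
        return 0.0
    coverage = len(members) / len(records)
    rate_in = sum(r[target] for r in members) / len(members)
    rate_all = sum(r[target] for r in records) / len(records)
    return coverage * (rate_in - rate_all)


def rank_subgroups(records, attributes, max_depth=2):
    """Score every candidate subgroup and sort from highest to lowest contrast."""
    scored = [
        (wracc(records, sg), sg)
        for sg in candidate_subgroups(records, attributes, max_depth)
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    for score, subgroup in rank_subgroups(RECORDS, ["sex", "age"]):
        print(f"{score:+.3f}  {subgroup}")
```

The number of candidates grows combinatorially with the number of attributes and the conjunction depth, which is presumably why a scaling analysis on distributed hardware becomes relevant for real data sets.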
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Cohort Studies / Biomedical Research / Data Mining
Study type: Etiology_studies / Incidence_studies / Observational_studies / Risk_factors_studies
Limits: Female / Humans / Male
Language: En
Journal: IEEE J Biomed Health Inform
Year: 2020
Document type: Article
Country of publication: United States