Discrepancies in Stroke Distribution and Dataset Origin in Machine Learning for Stroke.
Velagapudi, Lohit; Mouchtouris, Nikolaos; Baldassari, Michael P; Nauheim, David; Khanna, Omaditya; Saiegh, Fadi Al; Herial, Nabeel; Gooch, M Reid; Tjoumakaris, Stavropoula; Rosenwasser, Robert H; Jabbour, Pascal.
Affiliation
  • Velagapudi L; Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA.
  • Mouchtouris N; Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA.
  • Baldassari MP; Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA.
  • Nauheim D; Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA.
  • Khanna O; Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA.
  • Saiegh FA; Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA.
  • Herial N; Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA.
  • Gooch MR; Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA.
  • Tjoumakaris S; Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA.
  • Rosenwasser RH; Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA.
  • Jabbour P; Department of Neurosurgery, Thomas Jefferson University, Philadelphia, PA. Electronic address: pascal.jabbour@jefferson.edu.
J Stroke Cerebrovasc Dis ; 30(7): 105832, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33940363
BACKGROUND: Machine learning algorithms depend on accurate and representative training datasets to become clinical tools that generalize to a varied population. We reviewed machine learning applications in the stroke literature to assess the geographic distribution of the datasets and patient cohorts used to train these models, and compared it with the geographic distribution of stroke to evaluate for disparities.
AIMS: An initial search of the PubMed database identified 582 studies. After title and abstract screening, which excluded 489 papers, 106 full texts were assessed. Of these 106 studies, 79 were excluded because they used cohorts from outside the United States or were review articles or editorials. Thus, 27 studies were included in this analysis.
SUMMARY OF REVIEW: Of the 27 studies included, 7 (25.9%) used patient data from California, 6 (22.2%) were multicenter, 3 (11.1%) used data from Massachusetts, 2 (7.4%) each used data from Illinois, Missouri, and New York, and 1 (3.7%) each used data from South Carolina, Washington, West Virginia, and Wisconsin. One study (3.7%) used data from Utah and Texas. These were qualitatively compared with a CDC study showing the highest stroke prevalence in Mississippi (4.3%), followed by Oklahoma (3.4%), Washington D.C. (3.4%), Louisiana (3.3%), and Alabama (3.2%), while the prevalence in California was 2.6%.
CONCLUSIONS: A strong disconnect exists between the datasets and patient cohorts used to train machine learning algorithms in clinical research and the distribution of stroke in the populations where clinical tools built on these algorithms will be implemented. To reduce bias and increase the generalizability and accuracy of future machine learning studies, datasets should draw on a varied patient population that reflects the unequal distribution of stroke risk factors; this would improve the usability of these tools and their accuracy on a nationwide scale.
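The percentages in the summary can be checked with a short tally. The sketch below is illustrative only and is not part of the published analysis: the counts and CDC prevalence figures are copied from the abstract, while the variable names and the `share_table` helper are assumptions introduced here. The abstract does not list the sites of the multicenter studies, so the final check reflects only the single-origin labels it reports.

```python
# Study counts by data origin, copied from the abstract (27 included studies).
counts = {
    "California": 7,
    "Multicenter": 6,
    "Massachusetts": 3,
    "Illinois": 2,
    "Missouri": 2,
    "New York": 2,
    "South Carolina": 1,
    "Washington": 1,
    "West Virginia": 1,
    "Wisconsin": 1,
    "Utah and Texas": 1,  # a single study drawing on both states
}

# Stroke prevalence (%) for the five highest-ranked areas, as quoted in the abstract.
cdc_top5 = {
    "Mississippi": 4.3,
    "Oklahoma": 3.4,
    "Washington D.C.": 3.4,
    "Louisiana": 3.3,
    "Alabama": 3.2,
}


def share_table(counts):
    """Return each origin's share of all studies, in percent, to one decimal."""
    total = sum(counts.values())
    return {origin: round(100 * n / total, 1) for origin, n in counts.items()}


shares = share_table(counts)
for origin, pct in shares.items():
    print(f"{origin}: {counts[origin]} ({pct}%)")

# High-prevalence areas that do not appear as a named data origin.
# Multicenter studies may still include them; the abstract does not say.
unrepresented = [area for area in cdc_top5 if area not in counts]
print("Not named as a data origin:", ", ".join(unrepresented))
```

Under these inputs, none of the five highest-prevalence areas appears as a named data origin, which is the disconnect the conclusions describe.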
Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Stroke / Data Mining / Machine Learning Study type: Clinical_trials / Diagnostic_studies / Prevalence_studies / Prognostic_studies / Risk_factors_studies / Systematic_reviews Limit: Humans Country/Region as subject: North America Language: En Journal: J Stroke Cerebrovasc Dis Journal subject: ANGIOLOGY / BRAIN Year: 2021 Document type: Article