An open natural language processing (NLP) framework for EHR-based clinical research: a case demonstration using the National COVID Cohort Collaborative (N3C).
Liu, Sijia; Wen, Andrew; Wang, Liwei; He, Huan; Fu, Sunyang; Miller, Robert; Williams, Andrew; Harris, Daniel; Kavuluru, Ramakanth; Liu, Mei; Abu-El-Rub, Noor; Schutte, Dalton; Zhang, Rui; Rouhizadeh, Masoud; Osborne, John D; He, Yongqun; Topaloglu, Umit; Hong, Stephanie S; Saltz, Joel H; Schaffter, Thomas; Pfaff, Emily; Chute, Christopher G; Duong, Tim; Haendel, Melissa A; Fuentes, Rafael; Szolovits, Peter; Xu, Hua; Liu, Hongfang.
Affiliations
  • Liu S; Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA.
  • Wen A; Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA.
  • Wang L; Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA.
  • He H; Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA.
  • Fu S; Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA.
  • Miller R; Tufts Clinical and Translational Science Institute, Tufts Medical Center, Boston, Massachusetts, USA.
  • Williams A; Tufts Clinical and Translational Science Institute, Tufts Medical Center, Boston, Massachusetts, USA.
  • Harris D; Department of Internal Medicine, University of Kentucky, Lexington, Kentucky, USA.
  • Kavuluru R; Department of Internal Medicine, University of Kentucky, Lexington, Kentucky, USA.
  • Liu M; Department of Internal Medicine, University of Kansas Medical Center, Kansas City, Kansas, USA.
  • Abu-El-Rub N; Department of Internal Medicine, University of Kansas Medical Center, Kansas City, Kansas, USA.
  • Schutte D; Department of Pharmaceutical Care & Health Systems, University of Minnesota at Twin Cities, Minneapolis, Minnesota, USA.
  • Zhang R; Department of Pharmaceutical Care & Health Systems, University of Minnesota at Twin Cities, Minneapolis, Minnesota, USA.
  • Rouhizadeh M; Department of Pharmaceutical Outcomes & Policy, University of Florida, Gainesville, Florida, USA.
  • Osborne JD; Department of Computer Science, University of Alabama at Birmingham, Birmingham, Alabama, USA.
  • He Y; Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan, USA.
  • Topaloglu U; Department of Cancer Biology, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA.
  • Hong SS; Department of Medicine, Johns Hopkins University, Baltimore, Maryland, USA.
  • Saltz JH; Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York, USA.
  • Schaffter T; Sage Bionetworks, Seattle, Washington, USA.
  • Pfaff E; Department of Medicine, University of North Carolina Chapel Hill, Chapel Hill, North Carolina, USA.
  • Chute CG; Department of Medicine, Johns Hopkins University, Baltimore, Maryland, USA.
  • Duong T; Department of Radiology, Albert Einstein College of Medicine, Bronx, New York, USA.
  • Haendel MA; Center for Health AI, University of Colorado Anschutz Medical Campus, Denver, Colorado, USA.
  • Fuentes R; Alex Informatics, North Bethesda, Maryland, USA.
  • Szolovits P; Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.
  • Xu H; School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA.
  • Liu H; Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA.
J Am Med Inform Assoc; 30(12): 2036-2040, 2023 Nov 17.
Article in English | MEDLINE | ID: mdl-37555837
ABSTRACT
Despite recent methodological advances in clinical natural language processing (NLP), the adoption of clinical NLP models within the translational research community remains hindered by process heterogeneity and human factor variations. These same factors also dramatically increase the difficulty of developing NLP models in multi-site settings, which is necessary for algorithm robustness and generalizability. Here, we report on our experience developing an NLP solution for Coronavirus Disease 2019 (COVID-19) sign and symptom extraction in an open NLP framework, using data from a subset of sites participating in the National COVID Cohort Collaborative (N3C). We then empirically highlight the benefits of multi-site data for both symbolic and statistical methods, as well as the need for federated annotation and evaluation to resolve several pitfalls encountered in the course of these efforts.
Full text: 1 Database: MEDLINE Main subject: Natural Language Processing / COVID-19 Language: En Publication year: 2023 Document type: Article