Search | BVS Regional Portal

Oropharyngeal Cancer Staging Health Record Extraction Using Artificial Intelligence.

Baran, Elif; Lee, Melissa; Aviv, Steven; Weiss, Jessica; Pettengell, Chris; Karam, Irene; Bayley, Andrew; Poon, Ian; Chan, Kelvin K W; Parmar, Ambica; Smoragiewicz, Martin; Klieb, Hagen; Truong, Tra; Maralani, Pejman; Enepekides, Danny J; Higgins, Kevin M; Eskander, Antoine.

JAMA Otolaryngol Head Neck Surg ; 2024 May 16.

Article in English | MEDLINE | ID: mdl-38754135

ABSTRACT

Importance: Accurate, timely, and cost-effective methods for staging oropharyngeal cancers are crucial for patient prognosis and treatment decisions, but staging documentation is often inaccurate or incomplete. With the emergence of artificial intelligence in medicine, data abstraction may be associated with reduced costs and increased efficiency and accuracy of cancer staging. Objective: To evaluate an algorithm using an artificial intelligence engine capable of extracting essential information from medical records of patients with oropharyngeal cancer and assigning tumor, nodal, and metastatic stages according to American Joint Committee on Cancer eighth edition guidelines. Design, Setting, and Participants: This retrospective diagnostic study was conducted among a convenience sample of 806 patients with oropharyngeal squamous cell carcinoma. Medical records of patients with staged oropharyngeal squamous cell carcinomas who presented to a single tertiary care center between January 1, 2010, and August 1, 2020, were reviewed. A ground truth cancer stage dataset and a comprehensive staging rule book consisting of 135 rules encompassing p16 status and tumor, nodal, and metastatic stage were developed. Subsequently, 4 distinct models were trained: model T (entity relationship extraction) for anatomical location and invasion state, model S (numerical extraction) for lesion size, model M (sequential classification) for metastasis detection, and a p16 model for p16 status. For validation, results were compared against ground truth established by expert reviewers, and accuracy was reported. Data were analyzed from March to November 2023. Main Outcomes and Measures: The accuracy of algorithm cancer stages was compared with ground truth. Results: Among 806 patients with oropharyngeal cancer (mean [SD] age, 63.6 [10.6] years; 651 males [80.8%]), 421 patients (52.2%) were positive for human papillomavirus. The artificial intelligence engine achieved accuracies of 55.9% (95% CI, 52.5%-59.3%) for tumor, 56.0% (95% CI, 52.5%-59.4%) for nodal, and 87.6% (95% CI, 85.1%-89.7%) for metastatic stages and 92.1% (95% CI, 88.5%-94.6%) for p16 status. Differentiation between localized (stages 1-2) and advanced (stages 3-4) cancers achieved 80.7% (95% CI, 77.8%-83.2%) accuracy. Conclusion and Relevance: This study found that tumor and nodal staging accuracies were fair to good, while accuracies for metastatic stage and p16 status were excellent, with clinical relevance in assigning optimal treatment and reducing toxic effect exposures. Further model refinement and external validation with electronic health records at different institutions are necessary to improve algorithm accuracy and clinical applicability.
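The accuracies in this abstract are reported with 95% confidence intervals. As a rough illustration of how such an interval can be obtained for a proportion, the sketch below uses the Wilson score method. The abstract does not state which interval method the authors used, and the count of 451 correct cases is a hypothetical value back-calculated from 55.9% of 806 patients, not a figure taken from the paper.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Hypothetical count: roughly 55.9% of 806 cases staged correctly.
low, high = wilson_interval(451, 806)
print(f"accuracy = {451 / 806:.1%}, 95% CI = {low:.1%} to {high:.1%}")
```

The interval narrows as the sample grows, which is why a measure evaluated on fewer records would carry a wider interval at a similar accuracy.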

Developing a Data and Analytics Platform to Enable a Breast Cancer Learning Health System at a Regional Cancer Center.

Petch, Jeremy; Kempainnen, Joel; Pettengell, Christopher; Aviv, Steven; Butler, Bill; Pond, Greg; Saha, Ashirbani; Bogach, Jessica; Allard-Coutu, Alexandria; Sztur, Peter; Ranisau, Jonathan; Levine, Mark.

JCO Clin Cancer Inform ; 7: e2200182, 2023 03.

Article in English | MEDLINE | ID: mdl-37001040

ABSTRACT

PURPOSE: This study documents the creation of an automated, longitudinal, and prospective data and analytics platform for breast cancer at a regional cancer center. This platform combines principles of data warehousing with natural language processing (NLP) to provide the integrated, timely, meaningful, high-quality, and actionable data required to establish a learning health system. METHODS: Data from six hospital information systems and one external data source were integrated on a nightly basis by automated extract/transform/load jobs. Free-text clinical documentation was processed using a commercial NLP engine. RESULTS: The platform contains 141 data elements for 7,019 patients with newly diagnosed breast cancer who received care at our regional cancer center from January 1, 2014, to June 3, 2022. Daily updating of the database takes an average of 56 minutes. Evaluation of the tuning of NLP jobs found overall high performance, with an F1 of 1.0 for 19 variables and an F1 of > 0.95 for a further 16 variables. CONCLUSION: This study describes how data warehousing combined with NLP can be used to create a prospective data and analytics platform to enable a learning health system. Although the upfront time investment required to create the platform was considerable, now that it has been developed, daily data processing is completed automatically in less than an hour.

Subjects

Breast Neoplasms, Learning Health System, Humans, Female, Breast Neoplasms/diagnosis, Breast Neoplasms/epidemiology, Breast Neoplasms/therapy, Prospective Studies, Natural Language Processing, Data Warehousing
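The abstract above reports NLP performance as F1 scores. For readers unfamiliar with the metric, here is a minimal sketch of how F1 is computed from true positives, false positives, and false negatives. The function name and the counts are illustrative assumptions; the abstract does not give per-variable counts.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0  # no correct extractions, so precision and recall are both 0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one extracted variable.
print(f1_score(true_pos=96, false_pos=4, false_neg=4))
```

An F1 of 1.0 requires both zero false positives and zero false negatives on the evaluation set for that variable.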

Automating Access to Real-World Evidence.

Gauthier, Marie-Pier; Law, Jennifer H; Le, Lisa W; Li, Janice J N; Zahir, Sajda; Nirmalakumar, Sharon; Sung, Mike; Pettengell, Christopher; Aviv, Steven; Chu, Ryan; Sacher, Adrian; Liu, Geoffrey; Bradbury, Penelope; Shepherd, Frances A; Leighl, Natasha B.

JTO Clin Res Rep ; 3(6): 100340, 2022 Jun.

Article in English | MEDLINE | ID: mdl-35719866

ABSTRACT

Introduction: Real-world evidence is important in regulatory and funding decisions. Manual data extraction from electronic health records (EHRs) is time-consuming and challenging to maintain. Automated extraction using natural language processing (NLP) and artificial intelligence may facilitate this process. Whereas NLP offers a faster solution than manual methods of extraction, the validity of extracted data remains in question. The current study compared manual and automated data extraction from the EHRs of patients with advanced lung cancer. Methods: Previously, we extracted EHRs from 1209 patients diagnosed with advanced lung cancer (stage IIIB or IV) between January 2015 and December 2017 at Princess Margaret Cancer Centre (Toronto, Canada) using the commercially available artificial intelligence engine, DARWEN (Pentavere, Ontario, Canada). For comparison, 100 of 333 patients who received systemic therapy were randomly selected, and clinical data were manually extracted by two trained abstractors using the same accepted gold standard feature definitions, including patient, disease characteristics, and treatment data. All cases were re-reviewed by an expert adjudicator. Accuracy and concordance between automated and manual methods are reported. Results: Automated extraction required considerably less time (<1 day) than manual extraction (~225 person-hr). The collection of demographic data (age, sex, diagnosis) was highly accurate and concordant with both methods (96%-100%). Accuracy (for either extraction approach) and concordance were lower for unstructured data elements in EHRs, such as performance status, date of diagnosis, and smoking status (NLP accuracy: 88%-94%; manual accuracy: 78%-94%; concordance: 71%-82%). Concurrent medications (86%-100%) and comorbid conditions (96%-100%) were reported with high accuracy and concordance. Treatment details were also accurately captured with both methods (84%-100%) and highly concordant (83%-99%). Detection of whether biomarker testing was performed was highly accurate and concordant (96%-98%), although detection of biomarker test results was more variable (accuracy 84%-100%, concordance 84%-99%). Features with syntactic or semantic variation requiring clinical interpretation were extracted with slightly lower accuracy by both NLP and manual review. For example, metastatic sites were more accurately identified through NLP extraction (NLP: 88%-99%; manual: 71%-100%; concordance: 70%-99%) with the exception of lung and lymph node metastases (NLP: 66%-71%; manual: 87%-92%; concordance: 58%) owing to analogous terms used in radiology reports not being included in the accepted gold standard definition. Conclusions: Automated data abstraction from EHRs is highly accurate and faster than manual abstraction. Key challenges include poorly structured EHRs and the use of analogous terms beyond the accepted gold standard definition. The application of NLP can facilitate real-world evidence studies at a greater scale than could be achieved with manual data extraction.
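This abstract distinguishes accuracy (agreement of one extraction method with the adjudicated gold standard) from concordance (agreement between the automated and manual methods). The sketch below shows one simple way to compute both as proportions of matching values. The function names and the smoking-status values are made up for illustration, and the exact definitions used in the study may differ.

```python
def agreement_rate(first: list[str], second: list[str]) -> float:
    """Fraction of positions where two aligned lists hold the same value."""
    if len(first) != len(second) or not first:
        raise ValueError("lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(first, second))
    return matches / len(first)


# Hypothetical smoking-status values for five patients.
gold = ["never", "former", "current", "former", "never"]
nlp = ["never", "former", "current", "never", "never"]
manual = ["never", "current", "current", "never", "never"]

nlp_accuracy = agreement_rate(nlp, gold)        # NLP vs gold standard
manual_accuracy = agreement_rate(manual, gold)  # manual vs gold standard
concordance = agreement_rate(nlp, manual)       # NLP vs manual
print(nlp_accuracy, manual_accuracy, concordance)
```

In this toy data the two methods agree on the fourth patient while both differ from the gold standard, which shows that concordance measures agreement between methods and not correctness.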
