Search | VHL Regional Portal

1.

A certified de-identification system for all clinical text documents for information extraction at scale.

Radhakrishnan, Lakshmi; Schenk, Gundolf; Muenzen, Kathleen; Oskotsky, Boris; Ashouri Choshali, Habibeh; Plunkett, Thomas; Israni, Sharat; Butte, Atul J.

JAMIA Open ; 6(3): ooad045, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37416449

ABSTRACT

Objectives: Clinical notes are a veritable treasure trove of information on a patient's disease progression, medical history, and treatment plans, yet are locked in secured databases accessible for research only after extensive ethics review. Removing personally identifying and protected health information (PII/PHI) from the records can reduce the need for additional Institutional Review Board (IRB) reviews. In this project, our goals were to: (1) develop a robust and scalable clinical text de-identification pipeline that is compliant with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule for de-identification standards and (2) share routinely updated de-identified clinical notes with researchers. Materials and Methods: Building on our open-source de-identification software called Philter, we added features to: (1) make the algorithm and the de-identified data HIPAA compliant, which also implies type 2 error-free redaction, as certified via external audit; (2) reduce over-redaction errors; and (3) normalize and shift date PHI. We also established a streamlined de-identification pipeline using MongoDB to automatically extract clinical notes and provide truly de-identified notes to researchers with monthly refreshes at our institution. Results: To the best of our knowledge, the Philter V1.0 pipeline is currently the first and only certified de-identification pipeline that makes clinical notes available to researchers for non-human subjects research, without further IRB approval needed. To date, we have made over 130 million certified de-identified clinical notes available to over 600 UCSF researchers. These notes were collected over the past 40 years and represent data from 2,757,016 UCSF patients.
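The "normalize and shift date PHI" step mentioned in the abstract can be illustrated with a minimal sketch. This is not Philter's actual implementation: the function name `shift_dates`, the MM/DD/YYYY regular expression, the fixed offset, and the `[DATE]` placeholder are all assumptions made for illustration. The idea shown is that every date in a note is rewritten to one format and moved by the same offset, so intervals between events are preserved while the true dates are hidden.

```python
import re
from datetime import date, timedelta

# Assumption: dates appear as MM/DD/YYYY; a real pipeline would handle many more formats.
DATE_RE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def shift_dates(note, offset_days):
    """Normalize MM/DD/YYYY dates to ISO format and shift them by a fixed offset."""

    def _shift(match):
        month, day, year = (int(group) for group in match.groups())
        try:
            original = date(year, month, day)
        except ValueError:
            # Not a valid calendar date: redact rather than guess.
            return "[DATE]"
        return (original + timedelta(days=offset_days)).isoformat()

    return DATE_RE.sub(_shift, note)


note = "Seen on 3/14/2019; follow-up on 04/02/2019."
shifted = shift_dates(note, offset_days=-30)
print(shifted)  # Seen on 2019-02-12; follow-up on 2019-03-03.
```

Both dates move by the same 30 days, so the 19-day gap between the visit and the follow-up is unchanged in the shifted note.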

2.

The scalable precision medicine open knowledge engine (SPOKE): a massive knowledge graph of biomedical information.

Morris, John H; Soman, Karthik; Akbas, Rabia E; Zhou, Xiaoyuan; Smith, Brett; Meng, Elaine C; Huang, Conrad C; Cerono, Gabriel; Schenk, Gundolf; Rizk-Jackson, Angela; Harroud, Adil; Sanders, Lauren; Costes, Sylvain V; Bharat, Krish; Chakraborty, Arjun; Pico, Alexander R; Mardirossian, Taline; Keiser, Michael; Tang, Alice; Hardi, Josef; Shi, Yongmei; Musen, Mark; Israni, Sharat; Huang, Sui; Rose, Peter W; Nelson, Charlotte A; Baranzini, Sergio E.

Bioinformatics ; 39(2), 2023 02 03.

Article in English | MEDLINE | ID: mdl-36759942

ABSTRACT

MOTIVATION: Knowledge graphs (KGs) are being adopted in industry, commerce and academia. Biomedical KGs present a challenge due to the complexity, size and heterogeneity of the underlying information. RESULTS: In this work, we present the Scalable Precision Medicine Open Knowledge Engine (SPOKE), a biomedical KG connecting millions of concepts via semantically meaningful relationships. SPOKE contains 27 million nodes of 21 different types and 53 million edges of 55 types downloaded from 41 databases. The graph is built on the framework of 11 ontologies that maintain its structure, enable mappings and facilitate navigation. SPOKE is built weekly by Python scripts which download each resource, check for integrity and completeness, and then create a 'parent table' of nodes and edges. Graph queries are translated by a REST API and users can submit searches directly via an API or a graphical user interface. CONCLUSIONS/SIGNIFICANCE: SPOKE enables the integration of seemingly disparate information to support precision medicine efforts. AVAILABILITY AND IMPLEMENTATION: The SPOKE neighborhood explorer is available at https://spoke.rbvi.ucsf.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Pattern Recognition, Automated , Precision Medicine , Databases, Factual
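The build step described in the abstract above (check each resource for integrity, then create a table of typed nodes and edges) can be sketched as follows. This is not SPOKE's code: the `Node` and `Edge` classes, the `build_tables` function, and the example identifiers and type names are hypothetical and serve only to show a typed property graph with a simple referential-integrity check.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str
    node_type: str  # e.g. "Gene"; type names here are illustrative


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    edge_type: str  # e.g. "BINDS"; type names here are illustrative


def build_tables(nodes, edges):
    """Check that every edge endpoint exists, then return node and edge tables."""
    node_table = {node.node_id: node for node in nodes}
    dangling = [
        edge for edge in edges
        if edge.source not in node_table or edge.target not in node_table
    ]
    if dangling:
        raise ValueError(f"{len(dangling)} edge(s) reference unknown nodes")
    return node_table, list(edges)


nodes = [Node("C1", "Compound"), Node("G1", "Gene"), Node("D1", "Disease")]
edges = [Edge("C1", "G1", "BINDS"), Edge("G1", "D1", "ASSOCIATES")]

node_table, edge_table = build_tables(nodes, edges)
type_counts = Counter(node.node_type for node in node_table.values())
print(len(node_table), len(edge_table), dict(type_counts))
```

Rejecting edges whose endpoints are missing is one simple way to express an integrity check; a production build would likely also validate identifiers against the source ontologies.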

3.

A biomedical open knowledge network harnesses the power of AI to understand deep human biology.

Baranzini, Sergio E; Börner, Katy; Morris, John; Nelson, Charlotte A; Soman, Karthik; Schleimer, Erica; Keiser, Michael; Musen, Mark; Pearce, Roger; Reza, Tahsin; Smith, Brett; Herr, Bruce W; Oskotsky, Boris; Rizk-Jackson, Angela; Rankin, Katherine P; Sanders, Stephan J; Bove, Riley; Rose, Peter W; Israni, Sharat; Huang, Sui.

AI Mag ; 43(1): 46-58, 2022.

Article in English | MEDLINE | ID: mdl-36093122

ABSTRACT

Knowledge representation and reasoning (KR&R) has been successfully implemented in many fields to enable computers to solve complex problems with AI methods. However, its application to biomedicine has been lagging in part due to the daunting complexity of molecular and cellular pathways that govern human physiology and pathology. In this article we describe concrete uses of SPOKE, an open knowledge network that connects curated information from 37 specialized and human-curated databases into a single property graph, with 3 million nodes and 15 million edges to date. Applications discussed in this article include drug discovery, COVID-19 research and chronic disease diagnosis and management.

4.

Emerging role of artificial intelligence in cardiac electrophysiology.

Kabra, Rajesh; Israni, Sharat; Vijay, Bharat; Baru, Chaitanya; Mendu, Raghuveer; Fellman, Mark; Sridhar, Arun; Mason, Pamela; Cheung, Jim W; DiBiase, Luigi; Mahapatra, Srijoy; Kalifa, Jerome; Lubitz, Steven A; Noseworthy, Peter A; Navara, Rachita; McManus, David D; Cohen, Mitchell; Chung, Mina K; Trayanova, Natalia; Gopinathannair, Rakesh; Lakkireddy, Dhanunjaya.

Cardiovasc Digit Health J ; 3(6): 263-275, 2022 Dec.

Article in English | MEDLINE | ID: mdl-36589314

ABSTRACT

Artificial intelligence (AI) and machine learning (ML) have significantly impacted the field of cardiovascular medicine, especially cardiac electrophysiology (EP), on multiple fronts. The goal of this review is to familiarize readers with the field of AI and ML and their emerging role in EP. The current review is divided into 3 sections. In the first section, we discuss the definitions and basics of AI, ML, and big data. In the second section, we discuss their application to EP in the context of detection, prediction, and management of arrhythmias. Finally, we discuss the regulatory issues, challenges, and future directions of AI in EP.

5.

PatientExploreR: an extensible application for dynamic visualization of patient clinical history from electronic health records in the OMOP common data model.

Glicksberg, Benjamin S; Oskotsky, Boris; Thangaraj, Phyllis M; Giangreco, Nicholas; Badgeley, Marcus A; Johnson, Kipp W; Datta, Debajyoti; Rudrapatna, Vivek A; Rappoport, Nadav; Shervey, Mark M; Miotto, Riccardo; Goldstein, Theodore C; Rutenberg, Eugenia; Frazier, Remi; Lee, Nelson; Israni, Sharat; Larsen, Rick; Percha, Bethany; Li, Li; Dudley, Joel T; Tatonetti, Nicholas P; Butte, Atul J.

Bioinformatics ; 35(21): 4515-4518, 2019 11 01.

Article in English | MEDLINE | ID: mdl-31214700

ABSTRACT

MOTIVATION: Electronic health records (EHRs) are quickly becoming omnipresent in healthcare, but interoperability issues and technical demands limit their use for biomedical and clinical research. Interactive and flexible software that interfaces directly with EHR data structured around a common data model (CDM) could accelerate more EHR-based research by making the data more accessible to researchers who lack computational expertise and/or domain knowledge. RESULTS: We present PatientExploreR, an extensible application built on the R/Shiny framework that interfaces with a relational database of EHR data in the Observational Medical Outcomes Partnership CDM format. PatientExploreR produces patient-level interactive and dynamic reports and facilitates visualization of clinical data without any programming required. It allows researchers to easily construct and export patient cohorts from the EHR for analysis with other software. This application could enable easier exploration of patient-level data for physicians and researchers. PatientExploreR can incorporate EHR data from any institution that employs the CDM for users with approved access. The software code is free and open source under the MIT license, enabling institutions to install and users to expand and modify the application for their own purposes. AVAILABILITY AND IMPLEMENTATION: PatientExploreR can be freely obtained from GitHub: https://github.com/BenGlicksberg/PatientExploreR. We provide instructions for how researchers with approved access to their institutional EHR can use this package. We also release an open sandbox server of synthesized patient data for users without EHR access to explore: http://patientexplorer.ucsf.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Electronic Health Records , Software , Computers , Databases, Factual , Humans , Observational Studies as Topic
