Search | BVS (Virtual Health Library) Search Portal

1.

Evaluation of OMOP CDM, i2b2 and ICGC ARGO for supporting data harmonization in a breast cancer use case of a multicentric European AI project.

Frid, Santiago; Bracons Cucó, Guillem; Gil Rojas, Jessyca; López-Rueda, Antonio; Pastor Duran, Xavier; Martínez-Sáez, Olga; Lozano-Rubí, Raimundo.

J Biomed Inform ; 147: 104505, 2023 11.

Article in English | MEDLINE | ID: mdl-37774908

ABSTRACT

OBJECTIVE: Observational research in cancer poses great challenges regarding adequate data sharing and consolidation based on a homogeneous data semantic base. Common Data Models (CDMs) can help consolidate health data repositories from different institutions, minimizing loss of meaning by organizing data into a standard structure. This study aims to evaluate the performance of the Observational Medical Outcomes Partnership (OMOP) CDM, Informatics for Integrating Biology & the Bedside (i2b2) and International Cancer Genome Consortium, Accelerating Research in Genomic Oncology (ICGC ARGO) for representing non-imaging data in a breast cancer use case of EuCanImage. METHODS: We used ontologies to represent metamodels of OMOP, i2b2, and ICGC ARGO and variables used in a cancer use case of a European AI project. We selected four evaluation criteria for the CDMs adapted from previous research: content coverage, simplicity, integration, and implementability. RESULTS: i2b2 and OMOP exhibited higher element completeness (100% each) than ICGC ARGO (58.1%), while all three achieved 100% domain completeness. ICGC ARGO normalizes only one of our variables with a standard terminology, while i2b2 and OMOP use standardized vocabularies for all of them. In terms of simplicity, ICGC ARGO and i2b2 proved to be simpler both in the ontological model (276 and 175 elements, respectively) and in the queries (7 and 20 lines of code, respectively), while OMOP required a much more complex ontological model (615 elements) and queries similar to those of i2b2 (20 lines). Regarding implementability, OMOP had the highest number of mentions in articles in PubMed (130) and Google Scholar (1,810), ICGC ARGO had the highest number of updates to the CDM since 2020 (4), and i2b2 is the model with the most tools specifically developed for the CDM (26). CONCLUSION: ICGC ARGO proved to be rigid and very limited in the representation of oncologic concepts, while i2b2 and OMOP showed very good performance. i2b2's lack of a common dictionary hinders its scalability, requiring sites that will share data to explicitly define a conceptual framework, and suggesting that OMOP and its Oncology extension could be the more suitable choice. Future research employing these CDMs with actual datasets is needed.

Subject(s)

Breast Neoplasms , Humans , Female , Electronic Health Records , Information Dissemination , Databases, Factual , Genomics

2.

A scalable method for supporting multiple patient cohort discovery projects using i2b2.

Sholle, Evan T; Davila, Marcos A; Kabariti, Joseph; Schwartz, Julian Z; Varughese, Vinay I; Cole, Curtis L; Campion, Thomas R.

J Biomed Inform ; 84: 179-183, 2018 08.

Article in English | MEDLINE | ID: mdl-30009991

ABSTRACT

Although i2b2, a popular platform for patient cohort discovery using electronic health record (EHR) data, can support multiple projects specific to individual disease areas or research interests, the standard approach for doing so duplicates data across projects, requiring additional disk space and processing time, which limits scalability. To address this deficiency, we developed a novel approach that stored data in a single i2b2 fact table and used structured query language (SQL) views to access data for specific projects. Compared to the standard approach, the view-based approach reduced required disk space by 59% and extract-transform-load (ETL) time by 46%, without substantially impacting query performance. The view-based approach has enabled scalability of multiple i2b2 projects and generalized to another data model at our institution. Other institutions may benefit from this approach, the code for which is available on GitHub (https://github.com/wcmc-research-informatics/super-i2b2).

Subject(s)

Electronic Health Records , Medical Informatics/methods , Medical Informatics/organization & administration , Academic Medical Centers , Algorithms , Cohort Studies , Humans , Information Storage and Retrieval , Language , New York , Reproducibility of Results , Software , Translational Research, Biomedical/organization & administration
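The view-based idea in the abstract above (one shared fact table, one SQL view per project) might be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not the authors' implementation: the table is cut down to three columns, and the `project_patient` mapping table, the view names, and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory database; the schema below is a heavily simplified assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE observation_fact (
    patient_num INTEGER,
    concept_cd  TEXT,
    start_date  TEXT
);
-- Hypothetical table assigning patients to projects.
CREATE TABLE project_patient (
    project_id  TEXT,
    patient_num INTEGER
);
INSERT INTO observation_fact VALUES
    (1, 'ICD10:C50', '2017-01-05'),
    (2, 'ICD10:E11', '2017-02-10'),
    (3, 'ICD10:C50', '2017-03-15');
INSERT INTO project_patient VALUES
    ('oncology', 1), ('oncology', 3), ('diabetes', 2);

-- One view per project; the facts themselves are stored only once.
CREATE VIEW oncology_observation_fact AS
    SELECT f.* FROM observation_fact AS f
    JOIN project_patient AS p ON p.patient_num = f.patient_num
    WHERE p.project_id = 'oncology';
CREATE VIEW diabetes_observation_fact AS
    SELECT f.* FROM observation_fact AS f
    JOIN project_patient AS p ON p.patient_num = f.patient_num
    WHERE p.project_id = 'diabetes';
""")

# Fixed mapping from project name to view name (avoids building SQL from user input).
PROJECT_VIEWS = {
    "oncology": "oncology_observation_fact",
    "diabetes": "diabetes_observation_fact",
}

def project_patients(project):
    """Return the distinct patients visible through a project's view."""
    view = PROJECT_VIEWS[project]
    rows = conn.execute(
        f"SELECT DISTINCT patient_num FROM {view} ORDER BY patient_num"
    )
    return [row[0] for row in rows]

print(project_patients("oncology"))  # -> [1, 3]
```

Because each project reads through a view, adding a project in this sketch means adding mapping rows and a view rather than copying facts, which is the source of the disk and ETL savings the abstract reports.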

3.

Implementation of informatics for integrating biology and the bedside (i2b2) platform as Docker containers.

Wagholikar, Kavishwar B; Dessai, Pralav; Sanz, Javier; Mendis, Michael E; Bell, Douglas S; Murphy, Shawn N.

BMC Med Inform Decis Mak ; 18(1): 66, 2018 07 16.

Article in English | MEDLINE | ID: mdl-30012140

ABSTRACT

BACKGROUND: Informatics for Integrating Biology and the Bedside (i2b2) is an open source clinical data analytics platform used at over 200 healthcare institutions for querying patient data. The i2b2 platform has several components with numerous dependencies and configuration parameters, which renders the task of installing or upgrading i2b2 a challenging one. Even with the availability of extensive documentation and tutorials, new users often require several weeks to correctly install a functional i2b2 platform. The goal of this work is to simplify the installation and upgrade process for i2b2. Specifically, we have containerized the core components of the platform, and evaluated the containers for ease of installation. RESULTS: We developed three Docker container images: WildFly, database, and web, to encapsulate the three major deployment components of i2b2. These containers isolate the core functionalities of the i2b2 platform, and work in unison to provide its functionalities. Our evaluations indicate that i2b2 containers function successfully on the Linux platform. Our results demonstrate that the containerized components work out-of-the-box, with minimal configuration. CONCLUSIONS: Containerization offers the potential to package the i2b2 platform components into standalone executable packages that are agnostic to the underlying host operating system. By releasing i2b2 as a Docker container, we anticipate that users will be able to create a working i2b2 hive installation without the need to download, compile, and configure individual components that constitute the i2b2 cells, thus making this platform accessible to a greater number of institutions.

Subject(s)

Biomedical Research , Medical Informatics Applications , Medical Informatics Computing , Point-of-Care Systems , Humans

4.

A Fast Healthcare Interoperability Resources (FHIR) layer implemented over i2b2.

Boussadi, Abdelali; Zapletal, Eric.

BMC Med Inform Decis Mak ; 17(1): 120, 2017 Aug 14.

Article in English | MEDLINE | ID: mdl-28806953

ABSTRACT

BACKGROUND: Standards and technical specifications have been developed to define how the information contained in Electronic Health Records (EHRs) should be structured, semantically described, and communicated. Current trends rely on differentiating the representation of data instances from the definition of clinical information models. The dual model approach, which combines a reference model (RM) and a clinical information model (CIM), puts this software design pattern into practice. The most recent initiative, proposed by HL7, is called Fast Healthcare Interoperability Resources (FHIR). The aim of our study was to investigate the feasibility of applying the FHIR standard to modeling and exposing EHR data of the Georges Pompidou European Hospital (HEGP) integrating biology and the bedside (i2b2) clinical data warehouse (CDW). RESULTS: We implemented a FHIR server over i2b2 to expose EHR data in relation with five FHIR resources: DiagnosticReport, MedicationOrder, Patient, Encounter, and Medication. The architecture of the server combines a Data Access Object design pattern and FHIR resource providers, implemented using the Java HAPI FHIR API. Two types of queries were tested. Query type #1 requests the server to display DiagnosticReport resources for which the diagnosis code is equal to a given ICD-10 code. A total of 80 DiagnosticReport resources, corresponding to 36 patients, were displayed. Query type #2 requests the server to display MedicationOrder resources for which the FHIR Medication identification code is equal to a given code expressed in a French coding system. A total of 503 MedicationOrder resources, corresponding to 290 patients, were displayed. Results were validated by manually comparing the results of each request to the results displayed by an ad-hoc SQL query. CONCLUSION: We showed the feasibility of implementing a Java layer over the i2b2 database model to expose data of the CDW as a set of FHIR resources. An important part of this work was the structural and semantic mapping between the i2b2 model and the FHIR RM. To accomplish this, developers must manually browse the specifications of the FHIR standard. Our source code is freely available and can be adapted for use in other i2b2 sites.

Subject(s)

Data Warehousing/standards , Database Management Systems/standards , Electronic Health Records/standards , Health Information Interoperability/standards , Hospitals, Teaching/standards , Electronic Health Records/organization & administration , Health Level Seven , Humans
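The structural mapping described in the abstract above (i2b2 facts exposed as FHIR resources, then searched by ICD-10 code) could look roughly like the sketch below. It uses plain Python dictionaries rather than the Java HAPI FHIR API the authors used. The input row layout, the `ICD10:` code prefix, the resource id scheme, and the choice of the `codedDiagnosis` element (as I recall it from the DSTU2 version of DiagnosticReport) are all assumptions to check against the FHIR specification.

```python
# Assumed system URI for ICD-10 codings; verify against the FHIR terminology pages.
ICD10_SYSTEM = "http://hl7.org/fhir/sid/icd-10"

def fact_to_diagnostic_report(fact):
    """Map one simplified i2b2-style fact row to a FHIR-like DiagnosticReport dict."""
    scheme, _, code = fact["concept_cd"].partition(":")
    if scheme != "ICD10":
        raise ValueError(f"not an ICD-10 fact: {fact['concept_cd']}")
    return {
        "resourceType": "DiagnosticReport",
        # Hypothetical id scheme, chosen only to keep ids unique in this sketch.
        "id": f"{fact['patient_num']}-{fact['encounter_num']}-{code}",
        "subject": {"reference": f"Patient/{fact['patient_num']}"},
        "encounter": {"reference": f"Encounter/{fact['encounter_num']}"},
        "codedDiagnosis": [
            {"coding": [{"system": ICD10_SYSTEM, "code": code}]}
        ],
    }

def search_by_icd10(facts, icd10_code):
    """Return a report for every fact whose concept matches the given ICD-10 code."""
    wanted = f"ICD10:{icd10_code}"
    return [fact_to_diagnostic_report(f) for f in facts if f["concept_cd"] == wanted]

# Illustrative rows only; not data from the study.
facts = [
    {"patient_num": 1, "encounter_num": 10, "concept_cd": "ICD10:C50.9"},
    {"patient_num": 2, "encounter_num": 20, "concept_cd": "ICD10:E11.9"},
    {"patient_num": 1, "encounter_num": 11, "concept_cd": "ICD10:C50.9"},
]
reports = search_by_icd10(facts, "C50.9")
print(len(reports))  # -> 2
```

Counting the distinct `subject` references in the result gives the number of patients behind the returned resources, which is the kind of figure the abstract reports for each query type.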

5.

Automated population of an i2b2 clinical data warehouse from an openEHR-based data repository.

Haarbrandt, Birger; Tute, Erik; Marschollek, Michael.

J Biomed Inform ; 63: 277-294, 2016 10.

Article in English | MEDLINE | ID: mdl-27507090

ABSTRACT

BACKGROUND: Detailed Clinical Model (DCM) approaches have recently seen wider adoption. More specifically, openEHR-based application systems are now used in production in several countries, serving diverse fields of application such as health information exchange, clinical registries and electronic medical record systems. However, approaches to efficiently provide openEHR data to researchers for secondary use have not yet been investigated or established. METHODS: We developed an approach to automatically load openEHR data instances into the open source clinical data warehouse i2b2. We evaluated query capabilities and the performance of this approach in the context of the Hanover Medical School Translational Research Framework (HaMSTR), an openEHR-based data repository. RESULTS: Automated creation of i2b2 ontologies from archetypes and templates and the integration of openEHR data instances from 903 patients of a paediatric intensive care unit were achieved. In total, it took an average of ~2527 s to create 2,311,624 facts from 141,917 XML documents. Using the imported data, we conducted sample queries to compare the performance with two openEHR systems and to investigate whether this representation of data is feasible to support cohort identification and record level data extraction. DISCUSSION: We found the automated population of an i2b2 clinical data warehouse to be a feasible approach to make openEHR data instances available for secondary use. Such an approach can facilitate timely provision of clinical data to researchers. It complements analytics based on the Archetype Query Language by allowing querying on both legacy clinical data sources and openEHR data instances at the same time and by providing an easy-to-use query interface. However, due to different levels of expressiveness in the data models, not all semantics could be preserved during the ETL process.

Subject(s)

Electronic Health Records , Information Storage and Retrieval , Translational Research, Biomedical , Data Collection , Humans , Information Dissemination , Semantics

6.

[Technical improvement of cohort constitution in administrative health databases: Providing a tool for integration and standardization of data applicable in the French National Health Insurance Database (SNIIRAM)]. / Optimisation de la constitution de cohortes issues de bases de données médico-administratives : mise à disposition d'un algorithme pour l'intégration et la normalisation des données adapté au Système national d'information inter-régimes de l'assurance maladie (SNIIRAM).

Ferdynus, C; Huiart, L.

Rev Epidemiol Sante Publique ; 64(4): 263-9, 2016 Sep.

Article in French | MEDLINE | ID: mdl-27592033

ABSTRACT

AIM: Administrative health databases such as the French National Health Insurance Database (SNIIRAM) are a major tool to answer numerous public health research questions. However, the use of such data requires complex and time-consuming data management. Our objective was to develop and make available a tool to optimize cohort constitution within administrative health databases. METHODS: We developed a process to extract, transform and load (ETL) data from various heterogeneous sources into a standardized data warehouse. This data warehouse is architected as a star schema corresponding to an i2b2 star schema model. We then evaluated the performance of this ETL using data from a pharmacoepidemiology research project conducted in the SNIIRAM database. RESULTS: The ETL we developed comprises a set of functionalities for creating SAS scripts. Data can be integrated into a standardized data warehouse. As part of the performance assessment of this ETL, we achieved integration of a dataset from the SNIIRAM comprising more than 900 million lines in less than three hours using a desktop computer. This enables patient selection from the standardized data warehouse within seconds of the request. CONCLUSION: The ETL described in this paper provides a tool which is effective and compatible with all administrative health databases, without requiring complex database servers. This tool should simplify cohort constitution in health databases; the standardization of warehouse data facilitates collaborative work between research teams.

Subject(s)

Algorithms , Databases, Factual/standards , Information Storage and Retrieval/standards , Medical Records/standards , National Health Programs , Cohort Studies , Epidemiologic Research Design , France/epidemiology , Humans , Medical Records/statistics & numerical data , National Health Programs/organization & administration , National Health Programs/standards , Pharmacoepidemiology/organization & administration , Pharmacoepidemiology/standards , Quality Improvement , Reference Standards

7.

Automatic de-identification of electronic medical records using token-level and character-level conditional random fields.

Liu, Zengjian; Chen, Yangxin; Tang, Buzhou; Wang, Xiaolong; Chen, Qingcai; Li, Haodi; Wang, Jingfeng; Deng, Qiwen; Zhu, Suisong.

J Biomed Inform ; 58 Suppl: S47-S52, 2015 Dec.

Article in English | MEDLINE | ID: mdl-26122526

ABSTRACT

De-identification, the identification and removal of all protected health information (PHI) present in clinical data including electronic medical records (EMRs), is a critical step in making clinical data publicly available. The 2014 i2b2 (Center of Informatics for Integrating Biology and Bedside) clinical natural language processing (NLP) challenge set up a track for de-identification (track 1). In this study, we propose a hybrid system based on both machine learning and rule-based approaches for the de-identification track. In our system, PHI instances are first identified by two (token-level and character-level) conditional random fields (CRFs) and a rule-based classifier, and are then merged by a set of rules. Experiments conducted on the i2b2 corpus show that the system we submitted for the challenge achieves its highest micro F-scores of 94.64%, 91.24% and 91.63% under the "token", "strict" and "relaxed" criteria respectively, placing it among the top-ranked systems of the 2014 i2b2 challenge. After integrating some refined localization dictionaries, our system is further improved, with F-scores of 94.83%, 91.57% and 91.95% under the "token", "strict" and "relaxed" criteria respectively.

Subject(s)

Computer Security , Confidentiality , Data Mining/methods , Electronic Health Records/organization & administration , Natural Language Processing , Pattern Recognition, Automated/methods , China , Cohort Studies , Data Interpretation, Statistical , Narration , Vocabulary, Controlled

8.

Neurosurgery clinical registry data collection utilizing Informatics for Integrating Biology and the Bedside and electronic health records at the University of Rochester.

Pittman, Christine A; Miranpuri, Amrendra S.

Neurosurg Focus ; 39(6): E16, 2015 Dec.

Article in English | MEDLINE | ID: mdl-26621414

ABSTRACT

In a population health-driven health care system, data collection through the use of clinical registries is becoming imperative to continue to drive effective and efficient patient care. Clinical registries rely on a department's ability to collect high-quality and accurate data. Currently, however, data are collected manually with a high risk for error. The University of Rochester's Department of Neurosurgery in conjunction with the university's Clinical and Translational Science Institute has implemented the integrated use of the Informatics for Integrating Biology and the Bedside (i2b2) informatics framework with the Research Electronic Data Capture (REDCap) databases.

Subject(s)

Data Collection , Electronic Health Records/statistics & numerical data , Neurosurgical Procedures/methods , Registries , Spinal Cord Diseases/surgery , Academies and Institutes , Adult , Aged , Databases, Factual/statistics & numerical data , Female , Humans , Male , Middle Aged , Young Adult

9.

Federated Aggregate Cohort Estimator (FACE): an easy to deploy, vendor neutral, multi-institutional cohort query architecture.

Wyatt, Matthew C; Hendrickson, R Curtis; Ames, Michael; Bondy, Jessica; Ranauro, Paul; English, Thomas M; Bobitt, Keith; Davidson, Arthur; Houston, Thomas K; Embi, Peter J; Berner, Eta S.

J Biomed Inform ; 52: 65-71, 2014 Dec.

Article in English | MEDLINE | ID: mdl-24316052

ABSTRACT

Cross-institutional data sharing for cohort discovery is critical to enabling future research. While particularly useful in rare diseases, the ability to target enrollment and to determine whether an institution has a sufficient number of patients is valuable in all research, particularly in the initiation of projects and collaborations. An optimal technology solution would work with any source database with minimal resource investment for deployment and would meet all necessary security and confidentiality requirements of participating organizations. We describe a platform-neutral reference implementation to meet these requirements: the Federated Aggregate Cohort Estimator (FACE). FACE was developed and implemented through a collaboration of The University of Alabama at Birmingham (UAB), The Ohio State University (OSU), the University of Massachusetts Medical School (UMMS), and the Denver Health and Hospital Authority (DHHA), a clinical affiliate of the Colorado Clinical and Translational Sciences Institute. The reference implementation of FACE federated diverse SQL data sources and an i2b2 instance to estimate combined research subject availability from three institutions. It used easily deployed virtual machines and addressed privacy and security concerns for data sharing.

Subject(s)

Computer Security , Information Dissemination/methods , Information Storage and Retrieval/methods , Confidentiality , Humans , Medical Informatics , User-Computer Interface
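The federated aggregate idea in the abstract above can be illustrated with a small sketch in which each site evaluates the cohort criteria locally and returns only a count. This is not the FACE implementation: the `Site` class, the record fields, and the criteria function are hypothetical, and a real deployment would query each site's own database over a secured channel.

```python
from typing import Callable, Dict, List, Tuple

# A cohort criterion is any predicate over a (hypothetical) patient record.
Criteria = Callable[[dict], bool]

class Site:
    """One participating institution; patient-level rows never leave this object."""

    def __init__(self, name: str, patients: List[dict]):
        self.name = name
        self._patients = patients  # stays local to the site

    def count(self, criteria: Criteria) -> int:
        # Only the aggregate count is returned to the coordinator.
        return sum(1 for p in self._patients if criteria(p))

def federated_estimate(sites: List[Site], criteria: Criteria) -> Tuple[Dict[str, int], int]:
    """Collect per-site counts and their total."""
    per_site = {site.name: site.count(criteria) for site in sites}
    return per_site, sum(per_site.values())

# Illustrative records only.
sites = [
    Site("site_a", [{"age": 70, "dx": "C50"}, {"age": 45, "dx": "E11"}]),
    Site("site_b", [{"age": 66, "dx": "C50"}]),
    Site("site_c", [{"age": 30, "dx": "C50"}, {"age": 81, "dx": "C50"}]),
]

def older_breast_cancer(patient: dict) -> bool:
    return patient["dx"] == "C50" and patient["age"] >= 65

per_site, total = federated_estimate(sites, older_breast_cancer)
print(per_site, total)  # -> {'site_a': 1, 'site_b': 1, 'site_c': 1} 3
```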

10.

Using patient lists to add value to integrated data repositories.

Wade, Ted D; Zelarney, Pearlanne T; Hum, Richard C; McGee, Sylvia; Batson, Deborah H.

J Biomed Inform ; 52: 72-7, 2014 Dec.

Article in English | MEDLINE | ID: mdl-24534444

ABSTRACT

Patient lists are project-specific sets of patients that can be queried in integrated data repositories (IDRs). When a set of patients is added to the qualifying conditions of a query, the returned results refer to that set of patients and only to that set. We report a variety of use cases for such lists, including: restricting retrospective chart review to a defined set of patients; following a set of patients for practice management purposes; distributing "honest-brokered" (deidentified) data; adding phenotypes to biosamples; and enhancing the content of study or registry data. Among the capabilities needed to implement patient lists in an IDR are: capture of patient identifiers from a query and feedback of these into the IDR; the existence of a permanent internal identifier in the IDR that is mappable to external identifiers; the ability to add queryable attributes to the IDR; the ability to merge data from multiple queries; and suitable control over user access and de-identification of results. We implemented patient lists in a custom IDR of our own design. We reviewed capabilities of other published IDRs for focusing on sets of patients. The widely used i2b2 IDR platform has various ways to address patient sets, and it could be modified to add the low-overhead version of patient lists that we describe.

Subject(s)

Database Management Systems , Electronic Health Records , Biomedical Research , Confidentiality , Humans , Medical Informatics

11.

High prevalence of eosinophilic esophagitis in patients with inherited connective tissue disorders.

Abonia, J Pablo; Wen, Ting; Stucke, Emily M; Grotjan, Tommie; Griffith, Molly S; Kemme, Katherine A; Collins, Margaret H; Putnam, Philip E; Franciosi, James P; von Tiehl, Karl F; Tinkle, Brad T; Marsolo, Keith A; Martin, Lisa J; Ware, Stephanie M; Rothenberg, Marc E.

J Allergy Clin Immunol ; 132(2): 378-86, 2013 Aug.

Article in English | MEDLINE | ID: mdl-23608731

ABSTRACT

BACKGROUND: Eosinophilic esophagitis (EoE) is an emerging chronic inflammatory disease mediated by immune hypersensitization to multiple foods and strongly associated with atopy and esophageal remodeling. OBJECTIVE: We provide clinical and molecular evidence indicating a high prevalence of EoE in patients with inherited connective tissue disorders (CTDs). METHODS: We examined the rate of EoE among patients with CTDs and subsequently analyzed esophageal mRNA transcript profiles in patients with EoE with or without CTD features. RESULTS: We report a cohort of 42 patients with EoE with a CTD-like syndrome, representing 0.8% of patients with CTDs and 1.3% of patients with EoE within our hospital-wide electronic medical record database and our EoE research registry, respectively. An 8-fold risk of EoE in patients with CTDs (relative risk, 8.1; 95% confidence limit, 5.1-12.9; χ²₁ = 112.0; P < 10⁻³) was present compared with the general population. Esophageal transcript profiling identified a distinct subset of genes, including COL8A2, in patients with EoE and CTDs. CONCLUSION: There is a remarkable association of EoE with CTDs and evidence for a differential expression of genes involved in connective tissue repair in this cohort. Thus, we propose stratification of patients with EoE and CTDs into a subset referred to as EoE-CTD.

Subject(s)

Ehlers-Danlos Syndrome/complications , Eosinophilic Esophagitis/complications , Eosinophilic Esophagitis/epidemiology , Marfan Syndrome/complications , Adolescent , Child , Child, Preschool , Collagen Type VIII/genetics , Connective Tissue Diseases/complications , Connective Tissue Diseases/epidemiology , Connective Tissue Diseases/genetics , Ehlers-Danlos Syndrome/epidemiology , Ehlers-Danlos Syndrome/genetics , Eosinophilic Esophagitis/genetics , Esophagus/metabolism , Female , Humans , Male , Marfan Syndrome/epidemiology , Marfan Syndrome/genetics , Prevalence , RNA, Messenger/genetics , RNA, Messenger/metabolism

12.

Temporal relation discovery between events and temporal expressions identified in clinical narrative.

Cheng, Yao; Anick, Peter; Hong, Pengyu; Xue, Nianwen.

J Biomed Inform ; 46 Suppl: S48-S53, 2013 Dec.

Article in English | MEDLINE | ID: mdl-24076508

ABSTRACT

The automatic detection of temporal relations between events in electronic medical records has the potential to greatly augment the value of such records for understanding disease progression and patients' responses to treatments. We present a three-step methodology for labeling temporal relations using machine learning and deterministic rules over an annotated corpus provided by the 2012 i2b2 Shared Challenge. We first create an expanded training network of relations by computing the transitive closure over the annotated data; we then apply hand-written rules and machine learning with a feature set that casts a wide net across potentially relevant lexical and syntactic information; finally, we employ a voting mechanism to resolve global contradictions between the local predictions made by the learned classifier. Results over the testing data illustrate the contributions of initial prediction and conflict resolution.

Subject(s)

Electronic Health Records , Narration , Natural Language Processing , Humans , Medical Informatics , Time Factors
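The first step named in the abstract above, expanding the annotated relations by transitive closure, can be sketched as below. This is a simplified illustration rather than the authors' code: it handles only a single BEFORE relation (the challenge corpus also has other relation types), and the event names are made up.

```python
def transitive_closure(before_pairs):
    """Expand a set of (earlier, later) pairs with every pair implied by transitivity."""
    closure = set(before_pairs)
    changed = True
    while changed:
        changed = False
        # Snapshot the current pairs so the set can grow inside the loops.
        current = list(closure)
        for a, b in current:
            for c, d in current:
                # a BEFORE b and b BEFORE d together imply a BEFORE d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Hypothetical annotated relations from one clinical note.
annotated = {
    ("admission", "surgery"),
    ("surgery", "discharge"),
    ("discharge", "follow-up"),
}
expanded = transitive_closure(annotated)
print(len(expanded))  # -> 6
```

The three annotated pairs yield three additional implied pairs here, which shows how closure can enlarge the set of training relations without extra annotation.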

13.

MedTime: a temporal information extraction system for clinical narratives.

Lin, Yu-Kai; Chen, Hsinchun; Brown, Randall A.

J Biomed Inform ; 46 Suppl: S20-S28, 2013 Dec.

Article in English | MEDLINE | ID: mdl-23911344

ABSTRACT

Temporal information extraction from clinical narratives is of critical importance to many clinical applications. We participated in the EVENT/TIMEX3 track of the 2012 i2b2 clinical temporal relations challenge, and presented our temporal information extraction system, MedTime. MedTime comprises a cascade of rule-based and machine-learning pattern recognition procedures. It achieved a micro-averaged f-measure of 0.88 in both the recognitions of clinical events and temporal expressions. We proposed and evaluated three time normalization strategies to normalize relative time expressions in clinical texts. The accuracy was 0.68 in normalizing temporal expressions of dates, times, durations, and frequencies. This study demonstrates and evaluates the integration of rule-based and machine-learning-based approaches for high performance temporal information extraction from clinical narratives.

Subject(s)

Electronic Health Records , Medical Informatics/methods , Natural Language Processing , Algorithms , Humans , Narration
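For readers unfamiliar with the micro-averaged F-measure reported in the abstract above, the sketch below shows the usual definition: true positives, false positives, and false negatives are pooled across categories before precision and recall are computed. The per-category counts are invented for illustration and are unrelated to the MedTime results.

```python
def micro_f_measure(counts):
    """Micro-averaged F1 from per-category dicts with 'tp', 'fp' and 'fn' counts."""
    tp = sum(c["tp"] for c in counts)
    fp = sum(c["fp"] for c in counts)
    fn = sum(c["fn"] for c in counts)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of the pooled precision and recall.
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for two categories, e.g. events and temporal expressions.
example_counts = [
    {"tp": 8, "fp": 2, "fn": 2},
    {"tp": 2, "fp": 0, "fn": 2},
]
score = micro_f_measure(example_counts)
print(round(score, 4))  # -> 0.7692
```

Pooling gives frequent categories more weight than a macro average, which would instead average the per-category F-scores.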

14.

Integrating row level security in i2b2: segregation of medical records into data marts without data replication and synchronization.

Scheible, Raphael; Thomczyk, Fabian; Blum, Marco; Rautenberg, Micha; Prunotto, Andrea; Yazijy, Suhail; Boeker, Martin.

JAMIA Open ; 6(3): ooad068, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37583654

ABSTRACT

Objective: i2b2 offers the possibility to store biomedical data of different projects in subject oriented data marts of the data warehouse, which potentially requires data replication between different projects and also data synchronization in case of data changes. We present an approach that can save this effort and assess its query performance in a case study that reflects real-world scenarios. Material and Methods: For data segregation, we used PostgreSQL's row level security (RLS) feature, the unit test framework pgTAP for validation and testing as well as the i2b2 application. No change of the i2b2 code was required. Instead, to leverage orchestration and deployment, we additionally implemented a command line interface (CLI). We evaluated performance using 3 different queries generated by i2b2, which we performed on an enlarged Harvard demo dataset. Results: We introduce the open source Python CLI i2b2rls, which orchestrates and manages security roles to implement data marts so that they do not need to be replicated and synchronized as different i2b2 projects. Our evaluation showed that our approach is on average 3.55 and on median 2.71 times slower compared to classic i2b2 data marts, but has more flexibility and easier setup. Conclusion: The RLS-based approach is particularly useful in a scenario with many projects, where data is constantly updated, user and group requirements change frequently or complex user authorization requirements have to be defined. The approach applies to both the i2b2 interface and direct database access.

15.

A broadly applicable approach to enrich electronic-health-record cohorts by identifying patients with complete data: a multisite evaluation.

Klann, Jeffrey G; Henderson, Darren W; Morris, Michele; Estiri, Hossein; Weber, Griffin M; Visweswaran, Shyam; Murphy, Shawn N.

J Am Med Inform Assoc ; 30(12): 1985-1994, 2023 11 17.

Article in English | MEDLINE | ID: mdl-37632234

ABSTRACT

OBJECTIVE: Patients who receive most care within a single healthcare system (colloquially called a "loyalty cohort" since they typically return to the same providers) have mostly complete data within that organization's electronic health record (EHR). Loyalty cohorts have low data missingness, which can unintentionally bias research results. Using proxies of routine care and healthcare utilization metrics, we compute a per-patient score that identifies a loyalty cohort. MATERIALS AND METHODS: We implemented a computable program for the widely adopted i2b2 platform that identifies loyalty cohorts in EHRs based on a machine-learning model, which was previously validated using linked claims data. We developed a novel validation approach, which tests, using only EHR data, whether patients returned to the same healthcare system after the training period. We evaluated these tools at 3 institutions using data from 2017 to 2019. RESULTS: Loyalty cohort calculations to identify patients who returned during a 1-year follow-up yielded a mean area under the receiver operating characteristic curve of 0.77 using the original model and 0.80 after calibrating the model at individual sites. Factors such as multiple medications or visits contributed significantly at all sites. Screening tests' contributions (eg, colonoscopy) varied across sites, likely due to coding and population differences. DISCUSSION: This open-source implementation of a "loyalty score" algorithm had good predictive power. Enriching research cohorts by utilizing these low-missingness patients is a way to obtain the data completeness necessary for accurate causal analysis. CONCLUSION: i2b2 sites can use this approach to select cohorts with mostly complete EHR data.

Subject(s)

Algorithms , Electronic Health Records , Humans , Machine Learning , Delivery of Health Care , Electronics
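The abstract above evaluates its loyalty score by the area under the receiver operating characteristic curve (AUROC). As a reminder of what that number measures, the sketch below computes AUROC as the probability that a randomly chosen patient who returned receives a higher score than one who did not, counting ties as half. The scores and labels are invented, and this pairwise formulation is a textbook definition, not the authors' evaluation code.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison; labels are 1 (returned) or 0 (did not return)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical per-patient loyalty scores and follow-up outcomes.
example_scores = [0.9, 0.8, 0.4, 0.3]
example_labels = [1, 0, 1, 0]
print(auroc(example_scores, example_labels))  # -> 0.75
```

This quadratic version is fine for small examples; a rank-based computation is the usual choice for cohort-sized data.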

16.

An Ontology and Data Converter from RDF to the i2b2 Data Model.

Fasquelle-Lopez, Jules; Louis Raisaro, Jean.

Stud Health Technol Inform ; 294: 372-376, 2022 May 25.

Article in English | MEDLINE | ID: mdl-35612099

ABSTRACT

In a national effort aimed at cross-hospital data interoperability, the Swiss Personalized Health Network chose RDF as its preferred data and metadata representation format. Yet most clinical research software solutions are not designed to interact with RDF databases. We present a modular Python toolkit allowing easy conversion from RDF graphs to i2b2, adaptable to other common data models (CDMs) with reasonable effort. The tool was designed with feedback from clinicians in both oncology and laboratory research.

Subject(s)

Software , Databases, Factual

17.

Clinical Notes De-Identification: Scoping Recent Benchmarks for n2c2 Datasets.

Chomutare, Taridzo.

Stud Health Technol Inform ; 289: 293-296, 2022 Jan 14.

Article in English | MEDLINE | ID: mdl-35062150

ABSTRACT

Publicly shared repositories play an important role in advancing performance benchmarks for some of the most important tasks in natural language processing (NLP) and healthcare in general. This study reviews the most recent benchmarks based on the 2014 n2c2 de-identification dataset. Pre-processing challenges were uncovered, and attention is drawn to the discrepancies in the reported number of Protected Health Information (PHI) entities among the studies. Improved reporting is required for greater transparency and reproducibility.

Subject(s)

Benchmarking , Electronic Health Records , Natural Language Processing , Reproducibility of Results

18.

The Mass General Brigham Biobank Portal: an i2b2-based data repository linking disparate and high-dimensional patient data to support multimodal analytics.

Castro, Victor M; Gainer, Vivian; Wattanasin, Nich; Benoit, Barbara; Cagan, Andrew; Ghosh, Bhaswati; Goryachev, Sergey; Metta, Reeta; Park, Heekyong; Wang, David; Mendis, Michael; Rees, Martin; Herrick, Christopher; Murphy, Shawn N.

J Am Med Inform Assoc ; 29(4): 643-651, 2022 03 15.

Article in English | MEDLINE | ID: mdl-34849976

ABSTRACT

OBJECTIVE: Integrating and harmonizing disparate patient data sources into one consolidated data portal enables researchers to conduct analysis efficiently and effectively. MATERIALS AND METHODS: We describe an implementation of Informatics for Integrating Biology and the Bedside (i2b2) to create the Mass General Brigham (MGB) Biobank Portal data repository. The repository integrates data from primary and curated data sources and is updated weekly. The data are made readily available to investigators in a data portal where they can easily construct and export customized datasets for analysis. RESULTS: As of July 2021, there are 125 645 consented patients enrolled in the MGB Biobank. 88 527 (70.5%) have a biospecimen, 55 121 (43.9%) have completed the health information survey, 43 552 (34.7%) have genomic data and 124 760 (99.3%) have EHR data. Twenty machine learning computed phenotypes are calculated on a weekly basis. There are currently 1220 active investigators who have run 58 793 patient queries and exported 10 257 analysis files. DISCUSSION: The Biobank Portal allows noninformatics researchers to conduct study feasibility by querying across many data sources and then extract data that are most useful to them for clinical studies. While institutions require substantial informatics resources to establish and maintain integrated data repositories, they yield significant research value to a wide range of investigators. CONCLUSION: The Biobank Portal and other patient data portals that integrate complex and simple datasets enable diverse research use cases. i2b2 tools to implement these registries and make the data interoperable are open source and freely available.

Subject(s)

Biological Specimen Banks , Information Storage and Retrieval , Data Collection , Humans , Informatics

19.

Building an i2b2-Based Population Repository for COVID-19 Research.

Pedrera-Jimenez, Miguel; Garcia-Barrio, Noelia; Hernandez-Ibarburu, Gema; Baselga, Blanca; Blanco, Alvar; Calvo-Boyero, Fernando; Gutierrez-Sacristan, Alba; Quiros, Víctor; Cruz-Bermudez, Juan Luis; Bernal, José Luis; Meloni, Laura; Perez-Rey, David; Palchuk, Matvey; Kohane, Isaac; Serrano, Pablo.

Stud Health Technol Inform ; 294: 287-291, 2022 May 25.

Article in English | MEDLINE | ID: mdl-35612078

ABSTRACT

Reuse of Electronic Health Records (EHRs) for specific diseases such as COVID-19 requires data to be recorded and persisted according to international standards. Since the beginning of the COVID-19 pandemic, Hospital Universitario 12 de Octubre (H12O) evolved its EHRs: it identified, modeled and standardized the concepts related to this new disease in an agile, flexible and staged way. Thus, data from more than 200,000 COVID-19 cases were extracted, transformed, and loaded into an i2b2 repository. This effort allowed H12O to share data with worldwide networks such as the TriNetX platform and the 4CE Consortium.

Subject(s)

COVID-19 , COVID-19/epidemiology , Electronic Health Records , Humans , Pandemics

20.

Analytics to monitor local impact of the Protecting Access to Medicare Act's imaging clinical decision support requirements.

Valtchinov, Vladimir I; Murphy, Shawn N; Lacson, Ronilda; Ikonomov, Nikolay; Zhai, Bingxue K; Andriole, Katherine; Rousseau, Justin; Hanson, Dick; Kohane, Isaac S; Khorasani, Ramin.

J Am Med Inform Assoc ; 29(11): 1870-1878, 2022 10 07.

Article in English | MEDLINE | ID: mdl-35932187

ABSTRACT

OBJECTIVE: This study aimed to: (1) extend the Informatics for Integrating Biology and the Bedside (i2b2) data and application models to include medical imaging appropriate use criteria, enabling it to serve as a platform to monitor local impact of the Protecting Access to Medicare Act's (PAMA) imaging clinical decision support (CDS) requirements, and (2) validate the i2b2 extension using data from the Medicare Imaging Demonstration (MID) CDS implementation. MATERIALS AND METHODS: This study provided a reference implementation and assessed its validity and reliability using data from the MID, the federal government's predecessor to PAMA's imaging CDS program. The Star Schema was extended to describe the interactions of imaging ordering providers with the CDS. New ontologies were added to enable mapping medical imaging appropriateness data to the i2b2 schema. A z-ratio was used to test the significance of the difference between 2 independent proportions. RESULTS: The reference implementation used 26 327 orders for imaging examinations, which were persisted to the modified i2b2 schema. As an illustration of the analytical capabilities of the Web Client, we report that 331/1192 or 28.1% of imaging orders were deemed appropriate by the CDS system at the end of the intervention period (September 2013), an increase from 162/1223 or 13.2% for the first month of the baseline period, December 2011 (P = .0212), consistent with previous studies. CONCLUSIONS: The i2b2 platform can be extended to monitor local impact of PAMA's appropriateness of imaging ordering CDS requirements.

Subject(s)

Clinical Decision Support Systems , Aged , Diagnostic Imaging , Humans , Medicare , Physiologic Monitoring , Reproducibility of Results , United States
