Results 1 - 15 of 15
1.
J Biomed Inform ; 46(3): 410-24, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23402960

ABSTRACT

OBJECTIVE: To create an analytics platform for specifying and detecting clinical phenotypes and other derived variables in electronic health record (EHR) data for quality improvement investigations. MATERIALS AND METHODS: We have developed an architecture for an Analytic Information Warehouse (AIW). It supports transforming data represented in different physical schemas into a common data model, specifying derived variables in terms of the common model to enable their reuse, computing derived variables while enforcing invariants and ensuring correctness and consistency of data transformations, long-term curation of derived data, and export of derived data into standard analysis tools. It includes software that implements these features and a computing environment that enables secure high-performance access to and processing of large datasets extracted from EHRs. RESULTS: We have implemented and deployed the architecture in production locally. The software is available as open source. We have used it as part of hospital operations in a project to reduce rates of hospital readmission within 30 days. The project examined the association of over 100 derived variables representing disease and co-morbidity phenotypes with readmissions in 5 years of data from our institution's clinical data warehouse and the UHC Clinical Database (CDB). The CDB contains administrative data from over 200 hospitals that are in academic medical centers or affiliated with such centers. DISCUSSION AND CONCLUSION: A widely available platform for managing and detecting phenotypes in EHR data could accelerate the use of such data in quality improvement and comparative effectiveness studies.


Subjects
Electronic Health Records, Software, Algorithms, Database Management Systems, Patient Readmission
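
To make the idea of a derived variable in entry 1 concrete, here is a minimal Python sketch that flags readmissions within 30 days from records in a common model. It is not the AIW software: the `Encounter` record and the `flag_30_day_readmissions` function are hypothetical names chosen for illustration, and the invariant check is only one example of the kind of consistency rule the abstract mentions.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Encounter:
    """Hypothetical common-model record for one inpatient stay."""
    patient_id: str
    admit: date
    discharge: date


def flag_30_day_readmissions(encounters, window_days=30):
    """Return the set of (patient_id, admit) pairs that count as readmissions.

    An encounter is flagged when the same patient was discharged from the
    immediately preceding stay no more than `window_days` days earlier.
    """
    window = timedelta(days=window_days)
    by_patient = {}
    for enc in encounters:
        # Invariant: a stay cannot end before it starts.
        if enc.discharge < enc.admit:
            raise ValueError(f"discharge precedes admission: {enc}")
        by_patient.setdefault(enc.patient_id, []).append(enc)

    flagged = set()
    for stays in by_patient.values():
        stays.sort(key=lambda e: e.admit)
        # Compare each stay with the one that came just before it.
        for previous, current in zip(stays, stays[1:]):
            gap = current.admit - previous.discharge
            if timedelta(0) <= gap <= window:
                flagged.add((current.patient_id, current.admit))
    return flagged
```

Defining the variable once against the common model, rather than against each source schema, is what would let the same definition be reused across both datasets named in the abstract.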
2.
JAMIA Open ; 6(4): ooad089, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37860604

ABSTRACT

Objectives: Using agile software development practices, develop and evaluate an architecture and implementation for reliable and user-friendly self-service management of bioinformatic data stored in the cloud. Materials and methods: Comprehensive Oncology Research Environment (CORE) Browser is a new open-source web application for cancer researchers to manage sequencing data organized in a flexible format in Amazon Simple Storage Service (S3) buckets. It has a microservices- and hypermedia-based architecture, which we integrated with Test-Driven Development (TDD), the iterative writing of computable specifications for how software should work prior to development. Relying on repeating patterns found in hypermedia-based architectures, we hypothesized that hypermedia would permit developing test "templates" that can be parameterized and executed for each microservice, maximizing code coverage while minimizing effort. Results: After one-and-a-half years of development, the CORE Browser backend had 121 test templates and 875 custom tests that were parameterized and executed 3031 times, providing 78% code coverage. Discussion: Architecting to permit test reuse through a hypermedia approach was a key success factor for our testing efforts. CORE Browser's application of hypermedia and TDD illustrates one way to integrate software engineering methods into data-intensive networked applications. Separating bioinformatic data management from analysis distinguishes this platform from others in bioinformatics and may provide stable data management while permitting analysis methods to advance more rapidly. Conclusion: Software engineering practices are underutilized in informatics. Similar informatics projects will more likely succeed through application of good architecture and automated testing. Our approach is broadly applicable to data management tools involving cloud data storage.
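
The test "templates" described in entry 2 can be pictured with a small Python sketch using the standard `unittest` module. Everything here is an assumption made for illustration: `FakeService` is an in-memory stand-in for a hypermedia microservice, and `make_crud_test_case` is a hypothetical template factory. None of it is taken from the CORE Browser code base.

```python
import unittest


class FakeService:
    """Hypothetical in-memory stand-in for one hypermedia microservice."""

    def __init__(self, resource_name):
        self.resource_name = resource_name
        self._items = {}

    def post(self, body):
        item_id = str(len(self._items) + 1)
        self._items[item_id] = dict(body, id=item_id)
        # Hypermedia style: the response links to the new resource.
        return 201, {"href": f"/{self.resource_name}/{item_id}"}

    def get(self, href):
        item_id = href.rsplit("/", 1)[-1]
        if item_id in self._items:
            return 200, self._items[item_id]
        return 404, None


def make_crud_test_case(resource_name, sample_body):
    """Test template: build a TestCase class for one microservice."""

    class CrudTemplate(unittest.TestCase):
        def setUp(self):
            self.service = FakeService(resource_name)

        def test_post_returns_link_to_new_resource(self):
            status, body = self.service.post(sample_body)
            self.assertEqual(status, 201)
            self.assertTrue(body["href"].startswith(f"/{resource_name}/"))

        def test_get_follows_link_from_post(self):
            _, created = self.service.post(sample_body)
            status, fetched = self.service.get(created["href"])
            self.assertEqual(status, 200)
            self.assertEqual(fetched["name"], sample_body["name"])

        def test_get_unknown_resource_is_404(self):
            status, _ = self.service.get(f"/{resource_name}/missing")
            self.assertEqual(status, 404)

    CrudTemplate.__name__ = f"Crud_{resource_name}"
    return CrudTemplate


# One template, parameterized once per (hypothetical) microservice.
SERVICES = {"folders": {"name": "run-42"}, "buckets": {"name": "seq-data"}}
TEST_CASES = [make_crud_test_case(n, b) for n, b in SERVICES.items()]


def run_all():
    """Run every generated test case and return the unittest result."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite(
        loader.loadTestsFromTestCase(tc) for tc in TEST_CASES
    )
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because every service in this sketch exposes the same link-following pattern, adding a service to `SERVICES` adds three executed tests without any new test code, which is the effort-saving effect the abstract attributes to hypermedia.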

3.
J Registry Manag ; 49(4): 153-160, 2022.
Article in English | MEDLINE | ID: mdl-37260815

ABSTRACT

Cancer surveillance at the population level is a highly labor-intensive process, with certified tumor registrars (CTRs) manually reviewing medical charts of cancer patients and entering information into local databases that are centrally merged and curated at state and national levels. Registries face considerable challenges in terms of constrained budgets, staffing shortages, and keeping pace with the evolving national and international data standards that are essential to cancer registration. Advanced informatics methods are needed to increase automation, reduce manual efforts, and to help address some of these challenges. The Cancer Informatics Advisory Group (CIAG) to the North American Association of Central Cancer Registries (NAACCR) board was established in 2019 to advise of external informatics activities and initiatives for long-term strategic planning. Reviewed here by the CIAG are current informatics initiatives that were either born out of the cancer registry field or have implications for expansion to cancer surveillance programs in the future. Several areas of notable activity are presented, including an overview of informatics initiatives and descriptions of 12 specific informatics projects with implications for cancer registries. Recommendations are also provided to the registry community for the continued tracking and impact of the projects and initiatives.


Subjects
Neoplasms, Humans, Certification, Health Personnel, Information Systems, Neoplasms/epidemiology, Registries
4.
Biomolecules ; 12(11), 2022 Oct 26.
Article in English | MEDLINE | ID: mdl-36358918

ABSTRACT

In the past decade, defective DNA repair has been increasingly linked with cancer progression. Human tumors with markers of defective DNA repair and increased replication stress exhibit genomic instability and poor survival rates across tumor types. Seminal studies have demonstrated that genomic instability develops following inactivation of BRCA1, BRCA2, or BRCA-related genes. However, it is recognized that many tumors exhibit genomic instability but lack BRCA inactivation. We sought to identify a pan-cancer mechanism that underpins genomic instability and cancer progression in BRCA-wildtype tumors. Methods: Using multi-omics data from two independent consortia, we analyzed data from dozens of tumor types to identify patient cohorts characterized by poor outcomes, genomic instability, and wildtype BRCA genes. We developed several novel metrics to identify the genetic underpinnings of genomic instability in tumors with wildtype BRCA. Associated clinical data were mined to analyze patient responses to standard of care therapies and potential differences in metastatic dissemination. Results: Systematic analysis of the DNA repair landscape revealed that defective single-strand break repair, translesion synthesis, and non-homologous end-joining effectors drive genomic instability in tumors with wildtype BRCA and BRCA-related genes. Importantly, we find that loss of these effectors promotes replication stress, therapy resistance, and increased primary carcinoma to brain metastasis. Conclusions: Our results have defined a new pan-cancer class of tumors characterized by replicative instability (RIN). RIN is defined by the accumulation of intra-chromosomal, gene-level gain and loss events at replication stress sensitive (RSS) genome sites. We find that RIN accelerates cancer progression by driving copy number alterations and transcriptional program rewiring that promote tumor evolution. Clinically, we find that RIN drives therapy resistance and distant metastases across multiple tumor types.


Subjects
Genomic Instability, Neoplasms, Humans, DNA Repair/genetics, DNA End-Joining Repair, Neoplasms/genetics, DNA Replication, Chromosome Aberrations
5.
JAMIA Open ; 4(4): ooab103, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34927001

ABSTRACT

OBJECTIVE: The Huntsman Cancer Institute Research Informatics Shared Resource (RISR), a software and database development core facility, sought to address a lack of published operational best practices for research informatics cores. It aimed to use those insights to enhance effectiveness after an increase in team size from 20 to 31 full-time equivalents coincided with a reduction in user satisfaction. MATERIALS AND METHODS: RISR migrated from a water-scrum-fall model of software development to agile software development practices, which emphasize iteration and collaboration. RISR's agile implementation emphasizes the product owner role, which is responsible for user engagement and may be particularly valuable in software development that requires close engagement with users like in science. RESULTS: All RISR's software development teams implemented agile practices in early 2020. All project teams are led by a product owner who serves as the voice of the user on the development team. Annual user survey scores for service quality and turnaround time recorded 9 months after implementation increased by 17% and 11%, respectively. DISCUSSION: RISR is illustrative of the increasing size of research informatics cores and the need to identify best practices for maintaining high effectiveness. Agile practices may address concerns about the fit of software engineering practices in science. The study had one time point after implementing agile practices and one site, limiting its generalizability. CONCLUSIONS: Agile software development may substantially increase a research informatics core facility's effectiveness and should be studied further as a potential best practice for how such cores are operated.

6.
Clin Lab Med ; 28(1): 83-100, vii, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18194720

ABSTRACT

Large-scale clinical databases provide a detailed perspective on patient phenotype in disease and the characteristics of health care processes. Important information is often contained in the relationships between the values and timestamps of sequences of clinical data. The analysis of clinical time sequence data across entire patient populations may reveal data patterns that enable a more precise understanding of disease presentation, progression, and response to therapy, and thus could be of great value for clinical and translational research. Recent work suggests that the combination of temporal data mining methods with techniques from artificial intelligence research on knowledge-based temporal abstraction may enable the mining of clinically relevant temporal features from these previously problematic general clinical data.


Subjects
Databases as Topic, Medical Informatics/methods, Algorithms, Artificial Intelligence, Humans, Automated Pattern Recognition, Software, Time Factors
7.
J Am Med Inform Assoc ; 14(5): 674-83, 2007.
Article in English | MEDLINE | ID: mdl-17600103

ABSTRACT

OBJECTIVE: To specify and identify disease and patient care processes represented by temporal patterns in clinical events and observations, and retrieve patient populations containing those patterns from clinical data repositories, in support of clinical research, outcomes studies, and quality assurance. DESIGN: A data processing method called PROTEMPA (Process-oriented Temporal Analysis) was developed for defining and detecting clinically relevant temporal and mathematical patterns in retrospective data. PROTEMPA provides for portability across data sources, "pluggable" data processing environments, and the creation of libraries of pattern definitions and data processing algorithms. MEASUREMENTS: A proof-of-concept implementation of PROTEMPA in Java was evaluated against standard SQL queries for its ability to identify patients from a large clinical data repository who show the features of HELLP syndrome, and categorize those patients by disease severity and progression based on time sequence characteristics in their clinical laboratory test results. Results were verified by manual case review. RESULTS: The proof-of-concept implementation was more accurate than SQL in identifying patients with HELLP and correctly assigned severity and disease progression categories, which was not possible using SQL only. CONCLUSIONS: PROTEMPA supports the identification and categorization of patients with complex disease based on the characteristics of and relationships between time sequences in multiple data types. Identifying patient populations who share these types of patterns may be useful when patient features of interest do not have standard codes, are poorly expressed in coding schemes, may be inaccurately or incompletely coded, or are not represented explicitly as data values.


Subjects
Electronic Data Processing, Patient Selection, Retrospective Studies, Software, Humans, Time
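
The kind of temporal pattern detection described in entry 7 can be sketched in a few lines of Python. This is not PROTEMPA (which the abstract says was implemented in Java); it is a simplified illustration of state abstraction over a laboratory time series. The function names, the "LOW"/"NORMAL" labels, and any threshold passed in are assumptions for illustration only and carry no clinical meaning.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Interval:
    """A span of time over which one abstract state holds."""
    state: str
    start: datetime
    end: datetime


def abstract_states(observations, low_threshold):
    """Merge consecutive same-state observations into intervals.

    `observations` is a list of (timestamp, value) pairs. Each value is
    labeled "LOW" if below `low_threshold`, otherwise "NORMAL".
    """
    intervals = []
    for ts, value in sorted(observations):
        state = "LOW" if value < low_threshold else "NORMAL"
        if intervals and intervals[-1].state == state:
            # Extend the current interval to cover this observation.
            last = intervals[-1]
            intervals[-1] = Interval(state, last.start, ts)
        else:
            intervals.append(Interval(state, ts, ts))
    return intervals


def has_sustained_state(intervals, state, min_duration):
    """True if any interval in `state` lasts at least `min_duration`."""
    return any(
        i.state == state and i.end - i.start >= min_duration
        for i in intervals
    )
```

A pattern such as "a low state sustained for a day" depends on the relationship between values and timestamps across rows, which is awkward to express in a single SQL predicate and is the gap the abstract describes.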
9.
AMIA Annu Symp Proc ; 2017: 1411-1420, 2017.
Article in English | MEDLINE | ID: mdl-29854210

ABSTRACT

Research data warehouses integrate research and patient data from one or more sources into a single data model that is designed for research. Typically, institutions update their warehouse by fully reloading it periodically. The alternative is to update the warehouse incrementally with new, changed and/or deleted data. Full reloads avoid having to correct and add to a live system, but they can render the data outdated for clinical trial accrual. They place a substantial burden on source systems, involve intermittent work that is challenging to resource, and may involve tight coordination across IT and informatics units. We have implemented daily incremental updating for our i2b2 data warehouse. Incremental updating requires substantial up-front development, and it can expose provisional data to investigators. However, it may support more use cases, it may be a better fit for academic healthcare IT organizational structures, and ongoing support needs appear to be similar or lower.


Subjects
Biomedical Research/organization & administration, Data Warehousing/methods, Databases as Topic/organization & administration, Humans
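
The contrast between full reloads and incremental updating in entry 9 can be illustrated with a toy Python sketch. The warehouse is modeled as a plain dictionary keyed by row id, and the `("upsert" | "delete", row)` change format is a hypothetical convention; the abstract does not describe how the authors' i2b2 pipeline represents changes.

```python
def full_reload(source_rows):
    """Rebuild the warehouse from scratch from every source row."""
    return {row["id"]: row for row in source_rows}


def incremental_update(warehouse, changes):
    """Apply only new, changed, and deleted rows to a live warehouse.

    `changes` is an iterable of (operation, row) pairs where operation is
    "upsert" or "delete". Returns how many rows each operation touched.
    """
    counts = {"upsert": 0, "delete": 0}
    for op, row in changes:
        if op == "upsert":
            warehouse[row["id"]] = row  # Insert new or overwrite changed.
        elif op == "delete":
            warehouse.pop(row["id"], None)  # Ignore already-missing rows.
        else:
            raise ValueError(f"unknown operation: {op}")
        counts[op] += 1
    return counts
```

In this sketch the incremental path reads only the changed rows, while `full_reload` reads every source row each time; that difference is the source-system burden the abstract refers to.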
10.
AMIA Jt Summits Transl Sci Proc ; 2016: 184-93, 2016.
Article in English | MEDLINE | ID: mdl-27570667

ABSTRACT

Clinical and Translational Science Award (CTSA) recipients have a need to create research data marts from their clinical data warehouses, through research data networks and the use of i2b2 and SHRINE technologies. These data marts may have different data requirements and representations, thus necessitating separate extract, transform and load (ETL) processes for populating each mart. Maintaining duplicative procedural logic for each ETL process is onerous. We have created an entirely metadata-driven ETL process that can be customized for different data marts through separate configurations, each stored in an extension of i2b2's ontology database schema. We extended our previously reported and open source Eureka! Clinical Analytics software with this capability. The same software has created i2b2 data marts for several projects, the largest being the nascent Accrual for Clinical Trials (ACT) network, for which it has loaded over 147 million facts about 1.2 million patients.
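
A metadata-driven ETL process of the kind entry 10 describes can be sketched in Python as one generic function plus one configuration per data mart. This is an illustration under stated assumptions, not the Eureka! Clinical Analytics implementation: the configuration is held in Python dictionaries rather than in an extension of the i2b2 ontology schema, and the column names, concept codes, and `TRANSFORMS` registry are all hypothetical.

```python
# Named transforms that a configuration may reference (hypothetical registry).
TRANSFORMS = {
    "identity": lambda v: v,
    "to_float": float,
    "upper": lambda v: str(v).upper(),
}


def run_etl(rows, config):
    """Turn source rows into fact dicts using only the mapping in `config`."""
    facts = []
    for row in rows:
        for mapping in config["mappings"]:
            value = row.get(mapping["source_column"])
            if value is None:
                continue  # No data for this concept in this row.
            transform = TRANSFORMS[mapping.get("transform", "identity")]
            facts.append(
                {
                    "patient_id": row[config["patient_column"]],
                    "concept": mapping["concept"],
                    "value": transform(value),
                }
            )
    return facts


# Two data marts, one code path: only the configuration differs.
MART_A = {
    "patient_column": "mrn",
    "mappings": [
        {"source_column": "hgb", "concept": "LAB:HGB", "transform": "to_float"},
    ],
}
MART_B = {
    "patient_column": "mrn",
    "mappings": [
        {"source_column": "hgb", "concept": "LOCAL:HEMOGLOBIN",
         "transform": "to_float"},
        {"source_column": "sex", "concept": "DEM:SEX", "transform": "upper"},
    ],
}
```

Supporting a further mart in this sketch means writing another configuration, not another procedure, which is the maintenance saving the abstract points to.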

11.
J Am Med Inform Assoc ; 20(1): 109-16, 2013 Jan 01.
Article in English | MEDLINE | ID: mdl-23059729

ABSTRACT

OBJECTIVES: We present SHARE, a new system for statistical health information release with differential privacy. We present two case studies that evaluate the software on real medical datasets and demonstrate the feasibility and utility of applying the differential privacy framework on biomedical data. MATERIALS AND METHODS: SHARE releases statistical information in electronic health records with differential privacy, a strong privacy framework for statistical data release. It includes a number of state-of-the-art methods for releasing multidimensional histograms and longitudinal patterns. We performed a variety of experiments on two real datasets, the surveillance, epidemiology and end results (SEER) breast cancer dataset and the Emory electronic medical record (EeMR) dataset, to demonstrate the feasibility and utility of SHARE. RESULTS: Experimental results indicate that SHARE can deal with heterogeneous data present in medical data, and that the released statistics are useful. The Kullback-Leibler divergence between the released multidimensional histograms and the original data distribution is below 0.5 and 0.01 for seven-dimensional and three-dimensional data cubes generated from the SEER dataset, respectively. The relative error for longitudinal pattern queries on the EeMR dataset varies between 0 and 0.3. While the results are promising, they also suggest that challenges remain in applying statistical data release using the differential privacy framework for higher dimensional data. CONCLUSIONS: SHARE is one of the first systems to provide a mechanism for custodians to release differentially private aggregate statistics for a variety of use cases in the medical domain. This proof-of-concept system is intended to be applied to large-scale medical data warehouses.


Subjects
Confidentiality, Electronic Health Records, Information Dissemination, Information Storage and Retrieval, Software, Breast Neoplasms/epidemiology, Computer Graphics, Feasibility Studies, Female, Humans, Longitudinal Studies, Medical Order Entry Systems/statistics & numerical data, Organizational Case Studies, SEER Program/statistics & numerical data, United States
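
For readers unfamiliar with differentially private histogram release and the Kullback-Leibler divergence used to evaluate it in entry 11, here is a minimal Python sketch. It uses the basic Laplace mechanism, which is a standard textbook technique and not necessarily any of the methods implemented in SHARE. The function names are hypothetical, and the sketch assumes each individual contributes to exactly one histogram bin.

```python
import math
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_histogram(counts, epsilon, rng=None):
    """Add Laplace noise with scale 1/epsilon to each bin count.

    Assumes neighboring datasets differ by adding or removing one record
    that falls in exactly one bin, so the L1 sensitivity is 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    return [c + laplace_noise(1.0 / epsilon, rng) for c in counts]


def kl_divergence(p_counts, q_counts, floor=1e-9):
    """KL(P || Q) after clamping counts to `floor` and normalizing."""
    p = [max(c, floor) for c in p_counts]
    q = [max(c, floor) for c in q_counts]
    p_total, q_total = sum(p), sum(q)
    return sum(
        (pi / p_total) * math.log((pi / p_total) / (qi / q_total))
        for pi, qi in zip(p, q)
    )
```

A typical use would be `kl_divergence(true_counts, private_histogram(true_counts, epsilon=1.0))`; smaller values mean the released histogram stays closer to the original distribution. Noise is added independently to every bin, so total noise grows with the number of bins, which gives some intuition for why higher-dimensional data cubes are harder to release accurately.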
12.
Article in English | MEDLINE | ID: mdl-24303265

ABSTRACT

Clinical phenotyping is an emerging research information systems capability. Research uses of electronic health record (EHR) data may require the ability to identify clinical co-morbidities and complications. Such phenotypes may not be represented directly as discrete data elements, but rather as frequency, sequential and temporal patterns in billing and clinical data. These patterns' complexity suggests the need for a robust yet flexible extract, transform and load (ETL) process that can compute them. This capability should be accessible to investigators with limited ability to engage an IT department in data management. We have developed such a system, Eureka! Clinical Analytics. It extracts data from an Excel spreadsheet, computes a broad set of phenotypes of common interest, and loads both raw and computed data into an i2b2 project. A web-based user interface allows executing and monitoring ETL processes. Eureka! is deployed at our institution and is available for deployment in the cloud.

13.
AMIA Annu Symp Proc ; 2013: 1160-9, 2013.
Article in English | MEDLINE | ID: mdl-24551400

ABSTRACT

Temporal abstraction, a method for specifying and detecting temporal patterns in clinical databases, is very expressive and performs well, but it is difficult for clinical investigators and data analysts to understand. Such patterns are critical in phenotyping patients using their medical records in research and quality improvement. We have previously developed the Analytic Information Warehouse (AIW), which computes such phenotypes using temporal abstraction but requires software engineers to use. We have extended the AIW's web user interface, Eureka! Clinical Analytics, to support specifying phenotypes using an alternative model that we developed with clinical stakeholders. The software converts phenotypes from this model to that of temporal abstraction prior to data processing. The model can represent all phenotypes in a quality improvement project and a growing set of phenotypes in a multi-site research study. Phenotyping that is accessible to investigators and IT personnel may enable its broader adoption.


Subjects
Algorithms, Electronic Health Records, Automated Pattern Recognition, Software, Data Mining/methods, Humans, Knowledge Bases, Time
14.
AMIA Annu Symp Proc ; : 606-10, 2008 Nov 06.
Article in English | MEDLINE | ID: mdl-18999254

ABSTRACT

Innovative science frequently occurs as a result of cross-disciplinary collaboration, the importance of which is reflected by recent NIH funding initiatives that promote communication and collaboration. If shared research interests between collaborators are important for the formation of collaborations, methods for identifying these shared interests across scientific domains could potentially reveal new and useful collaboration opportunities. MEDLINE represents a comprehensive database of collaborations and research interests, as reflected by article co-authors and concept content. We analyzed six years of citations using information retrieval based methods to compute articles' conceptual similarity, and found that articles by basic and clinical scientists who later collaborated had significantly higher average similarity than articles by similar scientists who did not collaborate. Refinement of these methods and characterization of found conceptual overlaps could allow automated discovery of collaboration opportunities that are currently missed.


Subjects
Abstracting and Indexing/methods, Cooperative Behavior, Interdisciplinary Communication, MEDLINE, Natural Language Processing, Automated Pattern Recognition/methods, Periodicals as Topic/statistics & numerical data, Subject Headings, Algorithms, Artificial Intelligence, United States
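
The abstract of entry 14 does not say which similarity measure was used. As one common information retrieval choice, the Python sketch below computes cosine similarity between two articles represented as bags of concept labels. Both the measure and the `cosine_similarity` function are assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def cosine_similarity(concepts_a, concepts_b):
    """Cosine similarity between two bags of concept labels.

    Each argument is an iterable of concept strings (for example, indexing
    terms assigned to an article). Returns a value between 0.0 and 1.0, or
    0.0 when either article has no concepts.
    """
    a, b = Counter(concepts_a), Counter(concepts_b)
    # Only concepts present in both articles contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Averaging such pairwise scores over the articles of two scientists would give one way to compare pairs who later collaborated with pairs who did not.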
15.
AMIA Annu Symp Proc ; : 603-7, 2007 Oct 11.
Article in English | MEDLINE | ID: mdl-18693907

ABSTRACT

Disease and patient care processes often create characteristic states, trends, and temporal patterns in clinical events and observations, called temporal abstractions. Identifying patient populations who share similar abstractions may be useful for clinical research, outcomes studies, and quality assurance. In these settings, abstractions may be specific to a query, and thus allowing the specification of abstractions directly in the query would be desirable. We propose a query language for specifying and retrieving clinical data sets that allows specifying abstractions directly, and automatically selects data for retrieval based on the presence of abstractions inferred from the data. We describe the language and a prototype implementation, demonstrate its features with two queries constructed in response to clinical researcher-initiated data requests submitted to our institution's Clinical Data Repository, and report preliminary results from an evaluation of the implementation's performance.


Subjects
Biomedical Research, Database Management Systems, Information Storage and Retrieval/methods, Databases as Topic, Female, HELLP Syndrome/diagnosis, Humans, Pregnancy, Programming Languages, Time