A late-binding, distributed, NoSQL warehouse for integrating patient data from clinical trials.
Yang, Eric; Scheff, Jeremy D; Shen, Shih C; Farnum, Michael A; Sefton, James; Lobanov, Victor S; Agrafiotis, Dimitris K.
Affiliation
  • Yang E; Covance, the Drug Development Division of LabCorp Carnegie Center, Princeton, NJ, USA.
  • Scheff JD; Covance, the Drug Development Division of LabCorp Carnegie Center, Princeton, NJ, USA.
  • Shen SC; Covance, the Drug Development Division of LabCorp Carnegie Center, Princeton, NJ, USA.
  • Farnum MA; Covance, the Drug Development Division of LabCorp Carnegie Center, Princeton, NJ, USA.
  • Sefton J; Covance, the Drug Development Division of LabCorp Carnegie Center, Princeton, NJ, USA.
  • Lobanov VS; Covance, the Drug Development Division of LabCorp Carnegie Center, Princeton, NJ, USA.
  • Agrafiotis DK; Covance, the Drug Development Division of LabCorp Carnegie Center, Princeton, NJ, USA.
Database (Oxford) ; 2019; 2019 01 01.
Article in English | MEDLINE | ID: mdl-30854563
ABSTRACT
Clinical trial data are typically collected through multiple systems developed by different vendors using different technologies and data standards. These data need to be integrated, standardized and transformed for a variety of monitoring and reporting purposes. The need to process large volumes of often inconsistent data in the presence of ever-changing requirements poses a significant technical challenge. As part of a comprehensive clinical data repository, we have developed a data warehouse that integrates patient data from any source, standardizes them and makes them accessible to study teams in a timely manner to support a wide range of analytic tasks for both in-flight and completed studies. Our solution combines three components: Apache HBase, a NoSQL column store; Apache Phoenix, a massively parallel relational query engine; and a user-friendly interface. Together they facilitate efficient loading of large volumes of data under incomplete or ambiguous specifications, using an extract-load-transform design pattern that defers data mapping until query time. This approach allows us to maintain a single copy of the data and transform it dynamically into any desired format without requiring additional storage. Changes to the mapping specifications can be introduced easily, and multiple representations of the data can be made available concurrently. Further, by versioning the data and the transformations separately, we can apply historical maps to current data or current maps to historical data, which simplifies the maintenance of data cuts and facilitates interim analyses for adaptive trials. The result is a highly scalable, secure and redundant solution that combines the flexibility of a NoSQL store with the robustness of a relational query engine to support a broad range of applications, including clinical data management, medical review, risk-based monitoring, safety signal detection, post hoc analysis of completed studies and many others.
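The late-binding idea described in the abstract (load raw data once, defer mapping until query time, and version data and maps independently) can be illustrated with a small in-memory sketch. This is not the authors' implementation and does not use HBase or Phoenix; the `Warehouse` class, its methods, and the field names (`SUBJ`, `WT_LB`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

RawRow = Dict[str, str]
# A mapping says how to derive each target column from a raw row.
Mapping = Dict[str, Callable[[RawRow], object]]


@dataclass
class Warehouse:
    """Hypothetical toy store: raw data and maps are versioned separately."""
    data_versions: Dict[int, List[RawRow]] = field(default_factory=dict)
    map_versions: Dict[int, Mapping] = field(default_factory=dict)

    def load(self, rows: List[RawRow]) -> int:
        # Extract-load: store the raw rows unchanged as a new data cut.
        version = len(self.data_versions) + 1
        self.data_versions[version] = [dict(r) for r in rows]
        return version

    def register_map(self, mapping: Mapping) -> int:
        # Store a new mapping version; no data is rewritten.
        version = len(self.map_versions) + 1
        self.map_versions[version] = dict(mapping)
        return version

    def query(self, data_version: int, map_version: int) -> List[dict]:
        # Transform at query time: apply the chosen map to the chosen cut.
        mapping = self.map_versions[map_version]
        rows = self.data_versions[data_version]
        return [{col: fn(row) for col, fn in mapping.items()} for row in rows]


wh = Warehouse()
d1 = wh.load([{"SUBJ": "001", "WT_LB": "154"}])
d2 = wh.load([{"SUBJ": "001", "WT_LB": "154"},
              {"SUBJ": "002", "WT_LB": "176"}])

m1 = wh.register_map({
    "subject_id": lambda r: r["SUBJ"],
    "weight_lb": lambda r: float(r["WT_LB"]),
})
m2 = wh.register_map({
    "subject_id": lambda r: r["SUBJ"],
    "weight_kg": lambda r: round(float(r["WT_LB"]) * 0.45359237, 1),
})

# Any (data cut, map) pair can be combined without copying the raw data.
historical_map_on_current_data = wh.query(d2, m1)
current_map_on_historical_data = wh.query(d1, m2)
```

Because the raw rows are never rewritten, adding `m2` costs no additional storage for the data itself, and both representations remain available side by side.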
Subject(s)

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Database Management Systems / Clinical Trials as Topic / Data Warehousing Type of study: Guideline Limit: Humans Language: En Journal: Database (Oxford) Year: 2019 Document type: Article Affiliation country: United States

...