ABSTRACT
We sought proof of concept of a Big Data solution incorporating longitudinal structured and unstructured patient-level data from electronic health records (EHR) to predict graft loss (GL) and mortality. As part of a quality improvement initiative, GL and mortality prediction models were constructed using baseline and follow-up data (0-90 days posttransplant; structured and unstructured for 1-year models; data up to 1 year for 3-year models) on adult solitary kidney transplant recipients transplanted during 2007-2015, as follows: Model 1: United Network for Organ Sharing (UNOS) data; Model 2: UNOS and Transplant Database (Tx Database) data; Model 3: UNOS, Tx Database, and EHR comorbidity data; and Model 4: UNOS, Tx Database, EHR data, posttransplant trajectory data, and unstructured data. A 10% 3-year GL rate was observed among 891 patients (2007-2015). Layering of data sources improved model performance: Model 1: area under the curve (AUC), 0.66 (95% confidence interval [CI]: 0.60-0.72); Model 2: AUC, 0.68 (95% CI: 0.61-0.74); Model 3: AUC, 0.72 (95% CI: 0.66-0.77); Model 4: AUC, 0.84 (95% CI: 0.79-0.89). One-year GL (AUC, 0.87; Model 4) and 3-year mortality (AUC, 0.84; Model 4) models performed similarly. A Big Data approach significantly adds efficacy to GL and mortality prediction models and can be deployed within the EHR to optimize outcomes.
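For readers less familiar with the AUC metric used to compare Models 1 through 4, the following minimal sketch shows how it can be computed: the AUC is the probability that a randomly chosen patient with graft loss receives a higher predicted risk than a randomly chosen patient without it. The function name `auc` and the toy labels and scores are hypothetical illustrations, not the study's data, models, or software.

```python
def auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = graft loss, 0 = graft survived.
labels = [0, 0, 1, 0, 1, 0, 1, 0]
# Assumed predicted risks from a baseline model and from a model with more data layers.
baseline_scores = [0.20, 0.40, 0.30, 0.10, 0.60, 0.50, 0.35, 0.30]
layered_scores = [0.10, 0.30, 0.60, 0.20, 0.80, 0.65, 0.70, 0.20]

auc_baseline = auc(labels, baseline_scores)
auc_layered = auc(labels, layered_scores)
print(f"baseline AUC = {auc_baseline:.2f}, layered AUC = {auc_layered:.2f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which is the scale on which the reported rise from 0.66 (Model 1) to 0.84 (Model 4) should be read.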