Search | VHL Regional Portal

1.

The "Why" behind including "Y" in your imputation model.

D'Agostino McGowan, Lucy; Lotspeich, Sarah C; Hepler, Staci A.

Stat Methods Med Res ; : 9622802241244608, 2024 Apr 16.

Article in English | MEDLINE | ID: mdl-38625810

ABSTRACT

Missing data is a common challenge when analyzing epidemiological data, and imputation is often used to address this issue. Here, we investigate the scenario where a covariate used in an analysis has missingness and will be imputed. There are recommendations to include the outcome from the analysis model in the imputation model for missing covariates, but it is not necessarily clear if this recommendation always holds and why this is sometimes true. We examine deterministic imputation (i.e. single imputation with fixed values) and stochastic imputation (i.e. single or multiple imputation with random values) methods and their implications for estimating the relationship between the imputed covariate and the outcome. We mathematically demonstrate that including the outcome variable in imputation models is not just a recommendation but a requirement to achieve unbiased results when using stochastic imputation methods. Moreover, we dispel common misconceptions about deterministic imputation models and demonstrate why the outcome should not be included in these models. This article aims to bridge the gap between imputation in theory and in practice, providing mathematical derivations to explain common statistical recommendations. We offer a better understanding of the considerations involved in imputing missing covariates and emphasize when it is necessary to include the outcome variable in the imputation model.
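The contrast drawn in this abstract can be illustrated with a small simulation. The sketch below is not the authors' derivation or code. It assumes a single standard-normal covariate X, a linear outcome Y = beta * X + noise with beta = 1, half of X missing completely at random, and single imputation that ignores parameter uncertainty. The function names (`ols`, `simulate`) and all settings are hypothetical choices made for illustration.

```python
import random
from statistics import fmean


def ols(x, y):
    """Return (intercept, slope) from least-squares regression of y on x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def simulate(n=50_000, beta=1.0, p_miss=0.5, seed=1):
    """Estimate the Y-on-X slope after four ways of imputing missing X."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta * xi + rng.gauss(0, 1) for xi in x]
    miss = [rng.random() < p_miss for _ in range(n)]  # missing completely at random

    x_obs = [xi for xi, m in zip(x, miss) if not m]
    y_obs = [yi for yi, m in zip(y, miss) if not m]

    # Imputation model that ignores Y: marginal mean and SD of observed X.
    mu = fmean(x_obs)
    sd = (sum((xi - mu) ** 2 for xi in x_obs) / (len(x_obs) - 1)) ** 0.5

    # Imputation model that includes Y: regress X on Y among complete cases.
    g0, g1 = ols(y_obs, x_obs)
    rss = sum((xi - g0 - g1 * yi) ** 2 for xi, yi in zip(x_obs, y_obs))
    resid_sd = (rss / (len(x_obs) - 2)) ** 0.5

    imputers = {
        "deterministic, without Y": lambda yi: mu,
        "deterministic, with Y": lambda yi: g0 + g1 * yi,
        "stochastic, without Y": lambda yi: rng.gauss(mu, sd),
        "stochastic, with Y": lambda yi: g0 + g1 * yi + rng.gauss(0, resid_sd),
    }

    estimates = {}
    for name, impute in imputers.items():
        x_imp = [impute(yi) if m else xi for xi, yi, m in zip(x, y, miss)]
        estimates[name] = ols(x_imp, y)[1]  # slope of Y on the imputed X
    return estimates


if __name__ == "__main__":
    for name, est in simulate().items():
        print(f"{name}: {est:.3f}")
```

Under these assumptions the large-sample slopes are 1 for deterministic imputation without Y and for stochastic imputation with Y, (1 - 0.5) * 1 = 0.5 for stochastic imputation without Y, and about 1.33 for deterministic imputation with Y. That is the pattern the abstract describes: the outcome belongs in stochastic imputation models but not in deterministic ones.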

2.

The impact of earthquakes in Latin America on the continuity of HIV care: A retrospective observational cohort study.

Gorsline, Chelsea A; Lotspeich, Sarah C; Belaunzarán-Zamudio, Pablo F; Mejia, Fernando; Cortes, Claudia P; Crabtree-Ramírez, Brenda; Severe, Damocles Patrice; Rouzier, Vanessa; McGowan, Catherine C; Rebeiro, Peter F.

Public Health Pract (Oxf) ; 7: 100479, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38405231

ABSTRACT

Objectives: As earthquakes occur frequently in Latin America and can cause significant disruptions in HIV care, we sought to analyze patterns of HIV care for adults at Latin American clinical sites experiencing a significant earthquake within the past two decades. Study design: Retrospective clinical cohort study. Methods: Adults receiving HIV care at sites experiencing at least a "moderate intensity" (Modified Mercalli scale) earthquake in the Caribbean, Central and South America network for HIV epidemiology (CCASAnet) contributed data from 2003 to 2017. Interrupted Time Series models were fit with discontinuities at site-specific earthquake dates (Sept. 16, 2015 in Chile; Apr. 18, 2014 and Sept. 19, 2017 in Mexico; and Aug. 15, 2007 in Peru) to assess clinical visit, CD4 measure, viral load lab, and ART initiation rates 3 and 6 months after versus before earthquakes. Results: Comparing post- to pre-earthquake periods, there was a sharp drop in median visit (incidence rate ratio [IRR] = 0.79, 95% confidence interval [CI]: 0.68-0.91) and viral load lab (IRR = 0.78, 95% CI: 0.62-0.99) rates per week, using a 3-month window. CD4 measurement rates also decreased (IRR = 0.43; 95% CI: 0.37-0.51), though only using a 6-month window. Conclusions: Given that earthquakes occur frequently in Latin America, disaster preparedness plans must be more broadly implemented to avoid disruptions in HIV care and attendant poor outcomes.

3.

Quantifying the HIV reservoir with dilution assays and deep viral sequencing.

Lotspeich, Sarah C; Richardson, Brian D; Baldoni, Pedro L; Enders, Kimberly P; Hudgens, Michael G.

Biometrics ; 80(1), 2024 Jan 29.

Article in English | MEDLINE | ID: mdl-38364812

ABSTRACT

People living with HIV on antiretroviral therapy often have undetectable virus levels by standard assays, but "latent" HIV still persists in viral reservoirs. Eliminating these reservoirs is the goal of HIV cure research. The quantitative viral outgrowth assay (QVOA) is commonly used to estimate the reservoir size, that is, the infectious units per million (IUPM) of HIV-persistent resting CD4+ T cells. A new variation of the QVOA, the ultra deep sequencing assay of the outgrowth virus (UDSA), was recently developed that further quantifies the number of viral lineages within a subset of infected wells. Performing the UDSA on a subset of wells provides additional information that can improve IUPM estimation. This paper considers statistical inference about the IUPM from combined dilution assay (QVOA) and deep viral sequencing (UDSA) data, even when some deep sequencing data are missing. Methods are proposed to accommodate assays with wells sequenced at multiple dilution levels and with imperfect sensitivity and specificity, and a novel bias-corrected estimator is included for small samples. The proposed methods are evaluated in a simulation study, applied to data from the University of North Carolina HIV Cure Center, and implemented in the open-source R package SLDeepAssay.

Subject(s)

HIV Infections , HIV-1 , Humans , Virus Latency , HIV-1/genetics , CD4-Positive T-Lymphocytes , Computer Simulation , Viral Load
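As background for the IUPM estimation problem described in this abstract, the sketch below computes the maximum likelihood estimate from dilution assay (QVOA-type) data alone under the standard single-hit Poisson model, in which a well seeded with u million cells is negative with probability exp(-lambda * u). It is not the combined QVOA and UDSA estimator proposed in the paper and does not reproduce the SLDeepAssay package interface. The function name `iupm_mle`, its input layout, and the well counts are hypothetical.

```python
import math


def iupm_mle(wells, tol=1e-10):
    """MLE of infectious units per million cells (single-hit Poisson model).

    wells: list of (cells_per_well_in_millions, n_wells, n_positive_wells),
    one tuple per dilution level.
    """
    n_pos = sum(y for _, _, y in wells)
    n_neg = sum(n - y for _, n, y in wells)
    if n_pos == 0:
        return 0.0       # no positive wells: the likelihood is maximized at 0
    if n_neg == 0:
        return math.inf  # all wells positive: the likelihood has no finite maximizer

    def score(lam):
        """Derivative of the log-likelihood with respect to lambda."""
        total = 0.0
        for u, n, y in wells:
            total -= (n - y) * u
            if y:
                # y * u * P(negative) / P(positive), written to avoid overflow
                total += y * u * math.exp(-lam * u) / -math.expm1(-lam * u)
        return total

    # The score is decreasing in lambda, so bracket its root and bisect.
    lo, hi = 0.0, 1.0
    while score(hi) > 0:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical assay: three dilution levels, two wells each.
    example = [(2.5, 2, 2), (0.5, 2, 1), (0.1, 2, 0)]
    print(f"Estimated IUPM: {iupm_mle(example):.3f}")
```

With a single dilution level the estimate reduces to the closed form -log(1 - y/n) / u, which gives a quick check on the numerical solver.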

4.

Lessons learned from over a decade of data audits in international observational HIV cohorts in Latin America and East Africa.

Lotspeich, Sarah C; Shepherd, Bryan E; Kariuki, Marion Achieng; Wools-Kaloustian, Kara; McGowan, Catherine C; Musick, Beverly; Semeere, Aggrey; Crabtree Ramírez, Brenda E; Mkwashapi, Denna M; Cesar, Carina; Ssemakadde, Matthew; Machado, Daisy Maria; Ngeresa, Antony; Ferreira, Flávia Faleiro; Lwali, Jerome; Marcelin, Adias; Cardoso, Sandra Wagner; Luque, Marco Tulio; Otero, Larissa; Cortés, Claudia P; Duda, Stephany N.

J Clin Transl Sci ; 7(1): e245, 2023.

Article in English | MEDLINE | ID: mdl-38033704

ABSTRACT

Introduction: Routine patient care data are increasingly used for biomedical research, but such "secondary use" data have known limitations, including their quality. When leveraging routine care data for observational research, developing audit protocols that can maximize informational return and minimize costs is paramount. Methods: For more than a decade, the Latin America and East Africa regions of the International epidemiology Databases to Evaluate AIDS (IeDEA) consortium have been auditing the observational data drawn from participating human immunodeficiency virus clinics. Since our earliest audits, where external auditors used paper forms to record audit findings from paper medical records, we have streamlined our protocols to obtain more efficient and informative audits that keep up with advancing technology while reducing travel obligations and associated costs. Results: We present five key lessons learned from conducting data audits of secondary-use data from resource-limited settings for more than 10 years and share eight recommendations for other consortia looking to implement data quality initiatives. Conclusion: After completing multiple audit cycles in both the Latin America and East Africa regions of the IeDEA consortium, we have established a rich reference for data quality in our cohorts, as well as large, audited analytical datasets that can be used to answer important clinical questions with confidence. By sharing our audit processes and how they have been adapted over time, we hope that others can develop protocols informed by our lessons learned from more than a decade of experience in these large, diverse cohorts.

5.

It takes more than a machine: A pilot feasibility study of point-of-care HIV-1 viral load testing at a lower-level health center in rural western Uganda.

Boyce, Ross M; Ndizeye, Ronnie; Ngelese, Herbert; Baguma, Emmanuel; Shem, Bwambale; Rubinstein, Rebecca J; Rockwell, Emmanuel; Lotspeich, Sarah C; Shook-Sa, Bonnie E; Ntaro, Moses; Nyehangane, Dan; Wohl, David A; Siedner, Mark J; Mulogo, Edgar M.

PLOS Glob Public Health ; 3(3): e0001678, 2023.

Article in English | MEDLINE | ID: mdl-36972208

ABSTRACT

Barriers continue to limit access to viral load (VL) monitoring across sub-Saharan Africa, adversely impacting control of the HIV epidemic. The objective of this study was to determine whether the systems and processes required to realize the potential of rapid molecular technology are available at a prototypical lower-level (i.e., level III) health center in rural Uganda. In this open-label pilot study, participants underwent parallel VL testing at both the central laboratory (i.e., standard of care) and on-site using the GeneXpert HIV-1 assay. The primary outcome was the number of VL tests completed each clinic day. Secondary outcomes included the number of days from sample collection to receipt of result at clinic and the number of days from sample collection to patient receipt of the result. From August 2020 to July 2021, we enrolled a total of 242 participants. The median number of daily tests performed on the Xpert platform was 4 (IQR = 2-7). Time from sample collection to result was 51 days (IQR = 45-62) for samples sent to the central laboratory and 0 days (IQR = 0-0.25) for the Xpert assay conducted at the health center. However, few participants elected to receive results by one of the expedited options, which contributed to similar time-to-patient between testing approaches (89 versus 84 days, p = 0.07). Implementation of a rapid, near point-of-care VL assay at a lower-level health center in rural Uganda appears feasible, but interventions to promote rapid clinical response and influence patient preferences about result receipt require further study. Trial registration: ClinicalTrials.gov Identifier: NCT04517825, Registered 18 August 2020. Available at: https://clinicaltrials.gov/ct2/show/NCT04517825.

6.

Correcting conditional mean imputation for censored covariates and improving usability.

Lotspeich, Sarah C; Grosser, Kyle F; Garcia, Tanya P.

Biom J ; 64(5): 858-862, 2022 06.

Article in English | MEDLINE | ID: mdl-35199878

ABSTRACT

Missing data are often overcome using imputation, which leverages the entire dataset to replace missing values with informed placeholders. This method can be modified for censored data by also incorporating partial information from censored values. One such modification proposed by Atem et al. (2017, 2019a, 2019b) is conditional mean imputation where censored covariates are replaced by their conditional means given other fully observed information. These methods are robust to additional parametric assumptions on the censored covariate and utilize all available data, which is appealing. However, in implementing these methods, we discovered that these three articles provide nonequivalent formulas and, in fact, none is the correct formula for the conditional mean. Herein, we derive the correct form of the conditional mean and discuss the bias incurred when using the incorrect formulas. Furthermore, we note that even the correct formula can perform poorly for log hazard ratios far from 0. We also provide user-friendly R software, the imputeCensoRd package, to enable future researchers to tackle censored covariates correctly.

Subject(s)

Models, Statistical , Bias , Computer Simulation , Proportional Hazards Models

7.

The Role of External Genital Lesions in Human Immunodeficiency Virus Seroconversion Among Men Participating in a Multinational Study.

Sudenga, Staci L; Lotspeich, Sarah C; Nyitray, Alan G; Sirak, Bradley; Shepherd, Bryan E; Messina, Jane; Sereday, Karen A; Silva, Roberto Carvalho; Abrahamsen, Martha; Baggio, Maria Luiza; Quiterio, Manuel; Lazcano-Ponce, Eduardo; Villa, Luisa; Giuliano, Anna R.

Sex Transm Dis ; 49(1): 55-58, 2022 01 01.

Article in English | MEDLINE | ID: mdl-34282740

ABSTRACT

BACKGROUND: Studies in women have shown an increased risk of human immunodeficiency virus (HIV) acquisition with prior human papilloma virus (HPV) infection; however, few studies have been conducted among men. Our objective was to assess whether HPV-related external genital lesions (EGLs) increase risk of HIV seroconversion among men. METHODS: A total of 1379 HIV-negative men aged 18 to 70 years from the United States, Mexico, and Brazil were followed for up to 7 years and underwent clinical examination for EGLs and blood draws every 6 months. Human immunodeficiency virus seroconversion was assessed in archived serum. Cox proportional hazards and marginal structural models assessed the association between EGL status and time to HIV seroconversion. RESULTS: Twenty-nine participants HIV seroconverted during follow-up. Older age was associated with a lower hazard of HIV seroconversion. We found no significant difference in the risk of HIV seroconversion between men with and without EGLs (adjusted hazard ratio, 0.94; 95% confidence interval, 0.32-2.74). Stratified analyses focusing on men who have sex with men found no association between EGLs and HIV seroconversion risk (hazard ratio, 0.63; 95% confidence interval, 0.00-1.86). CONCLUSIONS: External genital lesions were not associated with higher risk for HIV seroconversion in this multinational population, although statistical power was limited as there were few HIV seroconversions. Results may differ in populations at higher risk for HIV.

Subject(s)

HIV Infections , HIV Seropositivity , Adolescent , Adult , Aged , Female , Genitalia , HIV , HIV Infections/epidemiology , HIV Seropositivity/epidemiology , Humans , Male , Middle Aged , Prospective Studies , Risk Factors , Seroconversion , United States/epidemiology , Young Adult

8.

Efficient odds ratio estimation under two-phase sampling using error-prone data from a multi-national HIV research cohort.

Lotspeich, Sarah C; Shepherd, Bryan E; Amorim, Gustavo G C; Shaw, Pamela A; Tao, Ran.

Biometrics ; 78(4): 1674-1685, 2022 12.

Article in English | MEDLINE | ID: mdl-34213008

ABSTRACT

Persons living with HIV engage in routine clinical care, generating large amounts of data in observational HIV cohorts. These data are often error-prone, and directly using them in biomedical research could bias estimation and give misleading results. A cost-effective solution is the two-phase design, under which the error-prone variables are observed for all patients during Phase I, and that information is used to select patients for data auditing during Phase II. For example, the Caribbean, Central, and South America network for HIV epidemiology (CCASAnet) selected a random sample from each site for data auditing. Herein, we consider efficient odds ratio estimation with partially audited, error-prone data. We propose a semiparametric approach that uses all information from both phases and accommodates a number of error mechanisms. We allow both the outcome and covariates to be error-prone and these errors to be correlated, and selection of the Phase II sample can depend on Phase I data in an arbitrary manner. We devise a computationally efficient, numerically stable EM algorithm to obtain estimators that are consistent, asymptotically normal, and asymptotically efficient. We demonstrate the advantages of the proposed methods over existing ones through extensive simulations. Finally, we provide applications to the CCASAnet cohort.

Subject(s)

HIV Infections , Research Design , Humans , Odds Ratio , Bias , Data Interpretation, Statistical , HIV Infections/epidemiology

9.

Ensemble learning to predict opioid-related overdose using statewide prescription drug monitoring program and hospital discharge data in the state of Tennessee.

Ripperger, Michael; Lotspeich, Sarah C; Wilimitis, Drew; Fry, Carrie E; Roberts, Allison; Lenert, Matthew; Cherry, Charlotte; Latham, Sanura; Robinson, Katelyn; Chen, Qingxia; McPheeters, Melissa L; Tyndall, Ben; Walsh, Colin G.

J Am Med Inform Assoc ; 29(1): 22-32, 2021 12 28.

Article in English | MEDLINE | ID: mdl-34665246

ABSTRACT

OBJECTIVE: To develop and validate algorithms for predicting 30-day fatal and nonfatal opioid-related overdose using statewide data sources including prescription drug monitoring program data, Hospital Discharge Data System data, and Tennessee (TN) vital records. Current overdose prevention efforts in TN rely on descriptive and retrospective analyses without prognostication. MATERIALS AND METHODS: Study data included 3 041 668 TN patients with 71 479 191 controlled substance prescriptions from 2012 to 2017. Statewide data and socioeconomic indicators were used to train, ensemble, and calibrate 10 nonparametric "weak learner" models. Validation was performed using area under the receiver operating characteristic curve (AUROC), area under the precision recall curve, risk concentration, and Spiegelhalter z-test statistic. RESULTS: Within 30 days, 2574 fatal overdoses occurred after 4912 prescriptions (0.0069%) and 8455 nonfatal overdoses occurred after 19 460 prescriptions (0.027%). Discrimination and calibration improved after ensembling (AUROC: 0.79-0.83; Spiegelhalter P value: 0-.12). Risk concentration captured 47-52% of cases in the top quantiles of predicted probabilities. DISCUSSION: Partitioning and ensembling enabled all study data to be used given computational limits and helped mitigate case imbalance. Predicting risk at the prescription level can aggregate risk to the patient, provider, pharmacy, county, and regional levels. Implementing these models into Tennessee Department of Health systems might enable more granular risk quantification. Prospective validation with more recent data is needed. CONCLUSION: Predicting opioid-related overdose risk at statewide scales remains difficult and models like these, which required a partnership between an academic institution and state health agency to develop, may complement traditional epidemiological methods of risk identification and inform public health decisions.

Subject(s)

Analgesics, Opioid , Prescription Drug Monitoring Programs , Analgesics, Opioid/therapeutic use , Hospitals , Humans , Machine Learning , Patient Discharge , Retrospective Studies , Tennessee/epidemiology

10.

Efficient semiparametric inference for two-phase studies with outcome and covariate measurement errors.

Tao, Ran; Lotspeich, Sarah C; Amorim, Gustavo; Shaw, Pamela A; Shepherd, Bryan E.

Stat Med ; 40(3): 725-738, 2021 02 10.

Article in English | MEDLINE | ID: mdl-33145800

ABSTRACT

In modern observational studies using electronic health records or other routinely collected data, both the outcome and covariates of interest can be error-prone and their errors often correlated. A cost-effective solution is the two-phase design, under which the error-prone outcome and covariates are observed for all subjects during the first phase and that information is used to select a validation subsample for accurate measurements of these variables in the second phase. Previous research on two-phase measurement error problems largely focused on scenarios where there are errors in covariates only or the validation sample is a simple random sample of study subjects. Herein, we propose a semiparametric approach to general two-phase measurement error problems with a quantitative outcome, allowing for correlated errors in the outcome and covariates and arbitrary second-phase selection. We devise a computationally efficient and numerically stable expectation-maximization algorithm to maximize the nonparametric likelihood function. The resulting estimators possess desired statistical properties. We demonstrate the superiority of the proposed methods over existing approaches through extensive simulation studies, and we illustrate their use in an observational HIV study.

Subject(s)

Models, Statistical , Research Design , Algorithms , Computer Simulation , Humans , Likelihood Functions

11.

Incidence and neighborhood-level determinants of child welfare involvement.

Lotspeich, Sarah C; Jarrett, Ryan T; Epstein, Richard A; Shaffer, April M; Gracey, Kathy; Cull, Michael J; Raman, Rameela.

Child Abuse Negl ; 109: 104767, 2020 11.

Article in English | MEDLINE | ID: mdl-33049663

ABSTRACT

BACKGROUND: Child maltreatment is a global public health issue that has been linked with multiple negative health and life outcomes. OBJECTIVE: This study evaluates the association between children placed in out-of-home care and neighborhood-level factors using eight years of administrative data. PARTICIPANTS AND SETTING: Between 2011 and 2018, 33,890 unique instances of child welfare involvement were captured in a department of child and family services database in a southern state in the United States. METHODS: Removal addresses were geocoded and linked to the U.S. Census Bureau's American Community Survey to obtain census tract socioeconomic factors. Incidence overall and stratified by individual and neighborhood-level factors was computed. Rate ratios, relative indexes of inequality, and concentration curves quantified disparities in incidence of child welfare involvement by neighborhood-level factors. RESULTS: Incidence of children less than 19 years old placed into out-of-home care was 255 per 100,000 person-years (95% CI: 252, 258). At the individual level, incidence was highest among children <5 and 15-17 years old, comparable between male and female children, and higher among Black children. At the neighborhood level, incidence was highest in census tracts with lower median household incomes, higher percentages of households below poverty or of female-headed or single-parent households, higher unemployment rates, and fewer residents with some college education or health insurance. CONCLUSIONS: Incidence of children placed into out-of-home care is disproportionally higher for those who live in disadvantaged communities. Understanding neighborhood-level risk factors that may be linked to child welfare involvement can help inform policy and target prevention efforts.

Subject(s)

Child Welfare/statistics & numerical data , Residence Characteristics/statistics & numerical data , Adolescent , Black or African American , Censuses , Child , Child Abuse/statistics & numerical data , Child Protective Services/statistics & numerical data , Child, Preschool , Female , Humans , Incidence , Male , Poverty/statistics & numerical data , Risk Factors , Socioeconomic Factors , Unemployment/statistics & numerical data , United States/epidemiology , Young Adult
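The overall incidence and its interval in this abstract can be checked with a short calculation. The sketch below assumes that the 33,890 instances are the numerator events and that a normal approximation to a Poisson rate was used; neither is stated in the abstract, and the function name `rate_ci` is hypothetical. For a Poisson count, the standard error of the rate is the rate divided by the square root of the event count, so the person-time denominator is not needed.

```python
import math


def rate_ci(rate, events, z=1.96):
    """Normal-approximation CI for a Poisson rate from the rate and event count.

    Uses SE(rate) = rate / sqrt(events), which follows from Var(events) = events.
    """
    rel_se = 1.0 / math.sqrt(events)
    return rate * (1.0 - z * rel_se), rate * (1.0 + z * rel_se)


if __name__ == "__main__":
    low, high = rate_ci(rate=255, events=33_890)
    print(f"{low:.1f} to {high:.1f} per 100,000 person-years")
```

Under these assumptions the limits round to 252 and 258 per 100,000 person-years, in agreement with the interval reported in the abstract.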

12.

Self-audits as alternatives to travel-audits for improving data quality in the Caribbean, Central and South America network for HIV epidemiology.

Lotspeich, Sarah C; Giganti, Mark J; Maia, Marcelle; Vieira, Renalice; Machado, Daisy Maria; Succi, Regina Célia; Ribeiro, Sayonara; Pereira, Mario Sergio; Rodriguez, Maria Fernanda; Julmiste, Gaetane; Luque, Marco Tulio; Caro-Vega, Yanink; Mejia, Fernando; Shepherd, Bryan E; McGowan, Catherine C; Duda, Stephany N.

J Clin Transl Sci ; 4(2): 125-132, 2020 Apr.

Article in English | MEDLINE | ID: mdl-32313702

ABSTRACT

INTRODUCTION: Audits play a critical role in maintaining the integrity of observational cohort data. While previous work has validated the audit process, sending trained auditors to sites ("travel-audits") can be costly. We investigate the efficacy of training sites to conduct "self-audits." METHODS: In 2017, eight research groups in the Caribbean, Central, and South America network for HIV Epidemiology each audited a subset of their patient records randomly selected by the data coordinating center at Vanderbilt. Designated investigators at each site compared abstracted research data to the original clinical source documents and captured audit findings electronically. Additionally, two Vanderbilt investigators performed on-site travel-audits at three randomly selected sites (one adult and two pediatric) in late summer 2017. RESULTS: Self- and travel-auditors, respectively, reported that 93% and 92% of 8919 data entries, captured across 28 unique clinical variables on 65 patients, were entered correctly. Across all entries, 8409 (94%) received the same assessment from self- and travel-auditors (7988 correct and 421 incorrect). Of 421 entries mutually assessed as "incorrect," 304 (72%) were corrected by both self- and travel-auditors and 250 of these (82%) received the same corrections. Reason for changing antiretroviral therapy (ART) regimen, ART end date, viral load value, CD4%, and HIV diagnosis date had the most mismatched corrections. CONCLUSIONS: With similar overall error rates, findings suggest that data audits conducted by trained local investigators could provide an alternative to on-site audits by external auditors to ensure continued data quality. However, discrepancies observed between corrections illustrate challenges in determining correct values even with audits.
