ABSTRACT
OBJECTIVES: Achieving accurate, timely, and complete HIV surveillance data is complicated in the United States by migration and care seeking across jurisdictional boundaries. To address these issues, public health entities use the ATra Black Box, a secure, electronic, privacy-assuring system developed by Georgetown University, to identify and confirm potential duplicate case records, exchange data, and perform other analytics to improve the quality of data in the Enhanced HIV/AIDS Reporting System (eHARS). We aimed to evaluate the ability of 2 ATra software algorithms to identify potential duplicate case-pairs across 6 jurisdictions for people living with diagnosed HIV. METHODS: We implemented 2 matching algorithms for identifying potential duplicate case-pairs in ATra software. The Single Name Matching Algorithm examines only 1 name for a person, whereas the All Names Matching Algorithm examines all names in eHARS for a person. Six public health jurisdictions used the algorithms. We compared the outputs on the overall number of potential matches and on changes in matching level. RESULTS: The All Names Matching Algorithm found more matches than the Single Name Matching Algorithm and raised match levels. The All Names Matching Algorithm identified 9070 (4.5%) more duplicate matches than the Single Name Matching Algorithm (n = 198 828) and increased the total number of matches at the exact through high levels by 15.4% (from 167 156 to 192 932; n = 25 776). CONCLUSIONS: HIV data quality across multiple jurisdictions can be improved by matching on all first and last names recorded in eHARS for people living with diagnosed HIV rather than on only 1 first and last name.
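To illustrate the difference between the two matching strategies described in the Methods, the following is a minimal Python sketch. It is not the ATra implementation, which the abstract does not describe in detail. The record layout (a `names` list of first/last pairs), the `normalize` helper, and the exact-equality comparison are all assumptions made for illustration; a real system would also use other identifiers and graded match levels.

```python
def normalize(name):
    """Lowercase and keep letters only, so "O'Neil" and "oneil" compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalpha())


def name_key(first, last):
    """Build a comparable key from one (first, last) name pair."""
    return (normalize(first), normalize(last))


def single_name_match(rec_a, rec_b):
    """Hypothetical single-name rule: compare only the first listed name pair."""
    return name_key(*rec_a["names"][0]) == name_key(*rec_b["names"][0])


def all_names_match(rec_a, rec_b):
    """Hypothetical all-names rule: match if any name pair is shared."""
    keys_a = {name_key(f, l) for f, l in rec_a["names"]}
    keys_b = {name_key(f, l) for f, l in rec_b["names"]}
    return not keys_a.isdisjoint(keys_b)


# Made-up records: the same person reported under different names
# in two jurisdictions.
record_a = {"names": [("Maria", "Lopez"), ("Maria", "Garcia")]}
record_b = {"names": [("Maria", "Garcia")]}

print(single_name_match(record_a, record_b))
print(all_names_match(record_a, record_b))
```

In this toy case the single-name rule misses the pair because the first listed surnames differ, while the all-names rule finds the shared alias, which is the kind of additional match the Results report.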
Subject(s)
Acquired Immunodeficiency Syndrome, Humans, United States, Acquired Immunodeficiency Syndrome/epidemiology, Data Accuracy, Algorithms
ABSTRACT
Background: Molecular epidemiological approaches provide opportunities to characterize HIV transmission dynamics. We analyzed HIV sequences and viral load (VL) results obtained during routine clinical care, together with individuals' ZIP code locations, to determine the utility of this approach. Methods: HIV-1 pol sequences were aligned using ClustalW and subtyped using REGA. A maximum likelihood (ML) tree was generated using IQTree. Transmission clusters with ≤3% genetic distance (GD) and ≥90% bootstrap support were identified using ClusterPicker. We conducted Bayesian analysis using BEAST to confirm transmission clusters. A proportion of ambiguous nucleotides ≤0.5% was considered indicative of early infection. Descriptive statistics were used to characterize clusters, and group comparisons were performed using the chi-square test or t-test. Results: Among 2775 adults with data from 2014-2015, 2589 (93%) had subtype B HIV-1, mean age was 44 years (SD 12.7), 66.4% were male, and 25% had nucleotide ambiguity ≤0.5%. There were 456 individuals in 193 clusters: 149 dyads, 32 triads, and 12 groups with ≥4 individuals per cluster. Individuals in clusters were more often male than female, 349 (76.5%) vs. 107 (23.5%), p < 0.0001; were younger, 35.3 years (SD 12.1) vs. 44.7 (SD 12.3), p < 0.0001; and more often had early HIV-1 infection by nucleotide ambiguity, 202/456 (44.3%) vs. 442/2133 (20.7%), p < 0.0001. Of the 193 clusters, 43 (22.3%) included individuals in different jurisdictions. Clusters of ≥4 individuals were similarly found using BEAST. HIV-1 VL ≥3.0 log10 copies/mL was most common among individuals in clusters of ≥4, 18/21 (85.7%), compared with 137/208 (65.8%) in clusters of 2-3 and 927/1169 (79.3%) among those not in a cluster (p < 0.0001). Discussion: HIV sequence data obtained for HIV clinical management provide insights into regional transmission dynamics. 
Our findings demonstrate the additional utility of HIV-1 VL data in combination with phylogenetic inferences as an enhanced contact tracing tool to direct HIV treatment and prevention services. Trans-jurisdictional approaches are needed to optimize efforts to end the HIV epidemic.
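The early-infection criterion in the Methods above (a proportion of ambiguous nucleotides ≤0.5%) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' pipeline: the set of IUPAC codes counted as ambiguous (including `N`), the exclusion of gap characters from the denominator, and the function names are assumptions.

```python
# IUPAC codes that stand for more than one base. Counting "N" as ambiguous
# is an assumption of this sketch; some analyses exclude it.
AMBIGUOUS_CODES = set("RYKMSWBDHVN")
GAP_CHARS = set("-.")


def ambiguity_fraction(sequence):
    """Return the fraction of non-gap positions that carry an ambiguity code."""
    bases = [b for b in sequence.upper() if b not in GAP_CHARS]
    if not bases:
        raise ValueError("sequence contains no nucleotide positions")
    ambiguous = sum(1 for b in bases if b in AMBIGUOUS_CODES)
    return ambiguous / len(bases)


def likely_early_infection(sequence, threshold=0.005):
    """Apply the <=0.5% ambiguity cutoff stated in the abstract."""
    return ambiguity_fraction(sequence) <= threshold


# Made-up sequences of 1000 positions each.
clean_seq = "ACGT" * 250                       # no ambiguous positions
mixed_seq = "ACGT" * 247 + "R" * 10 + "AC"     # 10 ambiguous positions (1%)

print(likely_early_infection(clean_seq))
print(likely_early_infection(mixed_seq))
```

A low ambiguity fraction is used as a proxy for a recently acquired, still homogeneous viral population; the sketch only applies the cutoff and makes no claim about how the sequences were generated or quality-filtered.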