ABSTRACT
The COVID-19 pandemic is marked by the successive emergence of new SARS-CoV-2 variants, lineages, and sublineages that outcompete earlier strains, largely due to factors like increased transmissibility and immune escape. We propose DeepAutoCoV, an unsupervised deep learning anomaly detection system, to predict future dominant lineages (FDLs). We define FDLs as viral (sub)lineages that will constitute >10% of all the viral sequences added to the GISAID, a public database supporting viral genetic sequence sharing, in a given week. DeepAutoCoV is trained and validated by assembling global and country-specific data sets from over 16 million Spike protein sequences sampled over a period of ~4 years. DeepAutoCoV successfully flags FDLs at very low frequencies (0.01%-3%), with median lead times of 4-17 weeks, and predicts FDLs between ~5 and ~25 times better than a baseline approach. For example, the B.1.617.2 vaccine reference strain was flagged as FDL when its frequency was only 0.01%, more than a year before it was considered for an updated COVID-19 vaccine. Furthermore, DeepAutoCoV outputs interpretable results by pinpointing specific mutations potentially linked to increased fitness and may provide significant insights for the optimization of public health 'pre-emptive' intervention strategies.
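The FDL definition above (a lineage exceeding 10% of the sequences added in a given week) can be illustrated with a minimal sketch. This is not the DeepAutoCoV model itself: it only computes weekly lineage shares and the first week each lineage crosses the threshold, which is the reference point against which a lead time would be measured. The function name, the `(week, lineage)` record layout, and the toy data are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def future_dominant_lineages(records, threshold=0.10):
    """Return {lineage: first week in which its weekly share exceeded threshold}.

    records: iterable of (week, lineage) pairs, one pair per submitted sequence
    (hypothetical layout, not the GISAID schema).
    """
    per_week = defaultdict(Counter)
    for week, lineage in records:
        per_week[week][lineage] += 1

    first_dominant = {}
    for week in sorted(per_week):
        counts = per_week[week]
        total = sum(counts.values())
        for lineage, n in counts.items():
            # Strict ">" follows the ">10%" wording of the abstract.
            if n / total > threshold and lineage not in first_dominant:
                first_dominant[lineage] = week
    return first_dominant


# Toy data: lineage "B" is rare in week 1 (5%) and dominant in week 2 (20%).
records = [(1, "A")] * 95 + [(1, "B")] * 5 + [(2, "A")] * 80 + [(2, "B")] * 20
dominance_week = future_dominant_lineages(records)
```

With these toy records, a detector that flagged "B" in week 1 would have a lead time of one week relative to `dominance_week["B"]`.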
Subject(s)
COVID-19 , Deep Learning , SARS-CoV-2 , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification , COVID-19/virology , COVID-19/epidemiology , Humans , Coronavirus Spike Glycoprotein/genetics , Forecasting/methods , Pandemics
ABSTRACT
AIM: To develop an automated computable phenotype (CP) algorithm for identifying diabetes cases in children and adolescents using electronic health records (EHRs) from the UF Health System. MATERIALS AND METHODS: The CP algorithm was iteratively derived based on structured data from EHRs (UF Health System 2012-2020). We randomly selected 536 presumed cases among individuals aged <18 years who had (1) glycated haemoglobin levels ≥ 6.5%; or (2) fasting glucose levels ≥126 mg/dL; or (3) random plasma glucose levels ≥200 mg/dL; or (4) a diabetes-related diagnosis code from an inpatient or outpatient encounter; or (5) prescribed, administered, or dispensed diabetes-related medication. Four reviewers independently reviewed the patient charts to determine diabetes status and type. RESULTS: Presumed cases without type 1 (T1D) or type 2 diabetes (T2D) diagnosis codes were categorized as non-diabetes/other types of diabetes. The rest were categorized as T1D if the most recent diagnosis was T1D, or otherwise categorized as T2D if the most recent diagnosis was T2D. Next, we applied a list of diagnoses and procedures that can determine diabetes type (e.g., steroid use suggests induced diabetes) to correct misclassifications from Step 1. Among the 536 reviewed cases, 159 and 64 had T1D and T2D, respectively. The sensitivity, specificity, and positive predictive values of the CP algorithm were 94%, 98% and 96%, respectively, for T1D and 95%, 95% and 73% for T2D. CONCLUSION: We developed a highly accurate EHR-based CP for diabetes in youth based on EHR data from UF Health. Consistent with prior studies, T2D was more difficult to identify using these methods.
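The first classification step and the reported accuracy metrics can be sketched as follows. This is an illustrative reading of the abstract, not the published CP algorithm: the second step (correcting misclassifications using a list of diagnoses and procedures) is omitted because that list is not given here, and the function names, record layout, and toy patients are assumptions.

```python
def classify_step1(dx_history):
    """Step 1 sketch: label a presumed case from its diagnosis history.

    dx_history: list of (iso_date, dx_type) pairs, where dx_type is "T1D",
    "T2D", or anything else (hypothetical layout). Same-day ties resolve to
    the earlier list entry, which is an arbitrary choice.
    """
    typed = [(date, dx) for date, dx in dx_history if dx in ("T1D", "T2D")]
    if not typed:
        return "non-diabetes/other"
    # ISO dates sort chronologically as strings; take the most recent typed code.
    return max(typed, key=lambda item: item[0])[1]


def binary_metrics(truth, predicted, positive):
    """Sensitivity, specificity, and PPV for one class against chart review.

    Assumes every denominator is non-zero (true for the toy data below).
    """
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp), tp / (tp + fp)


# Toy patients (fabricated for illustration only).
histories = [
    [("2015-01-10", "T2D"), ("2018-06-02", "T1D")],
    [("2016-03-01", "T1D"), ("2019-09-15", "T2D")],
    [("2017-05-20", "other")],
    [("2014-02-11", "T2D")],
]
predicted = [classify_step1(h) for h in histories]
truth = ["T1D", "T1D", "non-diabetes/other", "T2D"]
t1d_metrics = binary_metrics(truth, predicted, "T1D")
```

In this toy set the second patient is a true T1D case whose most recent code is T2D, which is the kind of error the second step is meant to correct.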
ABSTRACT
UNSTRUCTURED: Our article provides a viewpoint on population digital health - the use of digital health information sourced from Health IoT and wearable devices for population health modeling - as an emerging research initiative for offering an integrated approach for continuous monitoring and profiling of diseases and health conditions at multiple spatial resolutions. Global healthcare systems are increasingly challenged by rising costs as life expectancy and the average age of people increases. Population digital health looks at how wearables, IoT, and AI can offer an alternative approach for understanding health issues within the population, significantly reducing cost and improving the completeness of information collection by current practices, such as electronic health records - including integration with mhealth personal health records - or survey instruments. This significantly improves our collective understanding of public health priorities, including factors affecting disease prevalence, occurrence and risk factors, ultimately helping to design targeted programmatic interventions apt at reducing the cost of healthcare provision and leading to better life quality, also reducing disparities. Realizing this vision requires overcoming several unique challenges, including data quality, availability, sparsity, and social and technical barriers in the use of health technologies. Our article highlights these challenges and offers solutions and empirical evidence to demonstrate how these challenges can be addressed. As population digital health addresses the impact large-scale sensor data collection and AI can have on improving healthcare delivery and society, we sincerely believe the topic is well within the journal's scope and would be highly interesting to its readership. 
Our experiments using a combination of real-world health IoT data and electronic health records also highlight the potential cross-disciplinary benefits of population digital health and challenge the research community to address the vision and challenges. Therefore, our article serves the dual purpose of challenging the research community and offering insights into the use of AI and sensor data, and how population digital health can serve as a catalyst for further research by the broader research community.
ABSTRACT
A problem extension of the longest common substring (LCS) between two texts is the enumeration of all LCSs given a minimum length k (ALCS-k), along with their positions in each text. In bioinformatics, an efficient solution to the ALCS-k for very long texts (genomes or metagenomes) can provide useful insights to discover genetic signatures responsible for biological mechanisms. The ALCS-k problem has two additional requirements compared to the LCS problem: one is the minimum length k, and the other is that all common strings longer than k must be reported. We present an efficient, two-stage ALCS-k algorithm exploiting the spectrum of text substrings of length k (k-mers). Our approach yields a worst-case time complexity loglinear in the number of k-mers for the first stage, and an average-case time complexity loglinear in the number of common k-mers for the second stage (several orders of magnitude smaller than the total k-mer spectrum). The space complexity is linear in the first phase (disk-based), and on average linear in the second phase (disk- and memory-based). Tests performed on genomes for different organisms (including viruses, bacteria and animal chromosomes) show that run times are consistent with our theoretical estimates; further, comparisons with MUMmer4 show an asymptotic advantage with divergent genomes.
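The k-mer idea can be illustrated with a small in-memory sketch. It is not the two-stage, disk-based algorithm described above: it uses a hash index instead of sorting, and it interprets the task as reporting every maximal common substring of length at least k with its start position in each text. Both choices, and the function name, are assumptions made for illustration.

```python
from collections import defaultdict


def maximal_common_substrings(a, b, k):
    """Return (pos_in_a, pos_in_b, substring) for maximal common substrings
    of length >= k, found by seeding on shared k-mers and extending."""
    # Index every k-mer of the first text by its start positions.
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)

    results = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], ()):
            # If the match extends to the left, it was already reported
            # from an earlier seed, so skip it here.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend the seed to the right as far as both texts agree.
            length = k
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            results.append((i, j, a[i:i + length]))
    return results


matches = maximal_common_substrings("ACGTACGT", "TTACGTAA", 4)
```

Only k-mers shared by both texts trigger any extension work, which is the intuition behind the second stage scaling with the number of common k-mers rather than the full spectrum.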
ABSTRACT
Molecular data analysis is invaluable in understanding the overall behavior of a rapidly spreading virus population when epidemiological surveillance is problematic. It is also particularly beneficial in describing subgroups within the population, often identified as clades within a phylogenetic tree that represent individuals connected via direct transmission or transmission via differing risk factors in viral spread. However, transmission patterns or viral dynamics within these smaller groups should not be expected to exhibit homogeneous behavior over time. As such, standard phylogenetic approaches that identify clusters based on summary statistics would not be expected to capture dynamic clusters of transmission. We, therefore, sought to evaluate the performance of existing and adapted phylogeny-based cluster identification tools on simulated transmission clusters exhibiting dynamic transmission behavior over time. Despite the complementarity of the tools, we provide strong evidence that novel cluster identification methods are needed for reliable detection of epidemiologically linked individuals, particularly those exhibiting changing transmission dynamics during dynamic outbreak scenarios.
Subject(s)
Disease Outbreaks , Phylogeny , Humans , Cluster Analysis , Computer Simulation
ABSTRACT
Long-acting injectable (LAI) antiretroviral therapy (ART) is available to people with HIV (PWH), but it is unknown which PWH prefer this option. Using the Andersen Behavioral Model, this study identifies characteristics of PWH with greater preference for LAI ART. Cross-sectional data from the Florida Cohort, which enrolled adult PWH from community-based clinics, included information on predisposing (demographics), enabling (transportation, income), and need (ART adherence <90%) factors. ART preference was assessed via a single question (prefer pills, quarterly LAI, or no preference). Confounder-adjusted multinomial logistic regressions compared those who preferred pills to the other preference options, with covariates identified using directed acyclic graphs. Overall, 314 participants responded (40% non-Hispanic Black, 62% assigned male, 63% aged 50+). Most (63%) preferred the hypothetical LAI, 23% preferred pills, and 14% had no preference. PWH with access to a car (aRRR 1.97, 95% CI 1.05-3.71), higher income (aRRR 2.55, 95% CI 1.04-6.25), and suboptimal ART adherence (aRRR 7.41, 95% CI 1.52-36.23) were more likely to prefer the LAI, while those who reported having no social network were less likely to prefer the LAI (aRRR 0.32, 95% CI 0.11-0.88). Overall LAI interest was high, with greater preference associated with enabling and need factors.
Subject(s)
Anti-HIV Agents , HIV Infections , Medication Adherence , Patient Preference , Humans , Male , Female , HIV Infections/drug therapy , Florida , Middle Aged , Cross-Sectional Studies , Adult , Medication Adherence/statistics & numerical data , Anti-HIV Agents/therapeutic use , Anti-HIV Agents/administration & dosage , Injections , Delayed-Action Preparations/therapeutic use
ABSTRACT
Respiratory tract infections are a serious threat to health, especially in the presence of antimicrobial resistance (AMR). Existing AMR detection methods are limited by slow turnaround times and low accuracy due to the presence of false positives and negatives. In this study, we simulate 1,116 clinical metagenomics samples on both Illumina and Nanopore sequencing from curated, real-world sequencing of A. baumannii respiratory infections and build AI models to predict resistance to amikacin. The best performance is achieved by XGBoost on Illumina sequencing (area under the ROC curve = 0.7993 on 5-fold cross-validation).
Subject(s)
Acinetobacter baumannii , Amikacin , Bacterial Drug Resistance , Metagenomics , Amikacin/pharmacology , Amikacin/therapeutic use , Acinetobacter baumannii/drug effects , Acinetobacter baumannii/genetics , Humans , Bacterial Drug Resistance/genetics , Respiratory Tract Infections/drug therapy , Respiratory Tract Infections/microbiology , Anti-Bacterial Agents/pharmacology , Anti-Bacterial Agents/therapeutic use , Acinetobacter Infections/drug therapy , Acinetobacter Infections/microbiology
ABSTRACT
MOTIVATION: World Health Organization estimates that there were over 10 million cases of tuberculosis (TB) worldwide in 2019, resulting in over 1.4 million deaths, with a worrisome increasing trend yearly. The disease is caused by Mycobacterium tuberculosis (MTB) through airborne transmission. Treatment of TB is estimated to be 85% successful, however, this drops to 57% if MTB exhibits multiple antimicrobial resistance (AMR), for which fewer treatment options are available. RESULTS: We develop a robust machine-learning classifier using both linear and nonlinear models (i.e. LASSO logistic regression (LR) and random forests (RF)) to predict the phenotypic resistance of Mycobacterium tuberculosis (MTB) for a broad range of antibiotic drugs. We use data from the CRyPTIC consortium to train our classifier, which consists of whole genome sequencing and antibiotic susceptibility testing (AST) phenotypic data for 13 different antibiotics. To train our model, we assemble the sequence data into genomic contigs, identify all unique 31-mers in the set of contigs, and build a feature matrix M, where M[i, j] is equal to the number of times the ith 31-mer occurs in the jth genome. Due to the size of this feature matrix (over 350 million unique 31-mers), we build and use a sparse matrix representation. Our method, which we refer to as MTB++, leverages compact data structures and iterative methods to allow for the screening of all the 31-mers in the development of both LASSO LR and RF. MTB++ is able to achieve high discrimination (F-1 >80%) for the first-line antibiotics. Moreover, MTB++ had the highest F-1 score in all but three classes and was the most comprehensive since it had an F-1 score >75% in all but four (rare) antibiotic drugs. We use our feature selection to contextualize the 31-mers that are used for the prediction of phenotypic resistance, leading to some insights about sequence similarity to genes in MEGARes. 
Lastly, we give an estimate of the amount of data that is needed in order to provide accurate predictions. AVAILABILITY: The models and source code are publicly available on Github at https://github.com/M-Serajian/MTB-Pipeline.
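The feature matrix M described above, where M[i, j] counts occurrences of the i-th k-mer in the j-th genome, can be sketched with a dictionary that stores only non-zero entries. This is an illustration of the sparse representation idea, not the MTB++ implementation: it treats each genome as one string rather than a set of contigs, ignores reverse complements, and uses k = 3 on toy sequences instead of 31-mers. The function name and data are assumptions.

```python
from collections import Counter


def kmer_count_matrix(genomes, k):
    """Return (kmers, entries) with entries[(i, j)] = count of kmers[i] in genome j.

    Only non-zero counts are stored, so memory grows with the number of
    observed (k-mer, genome) pairs rather than with rows times columns.
    """
    counts = [
        Counter(g[p:p + k] for p in range(len(g) - k + 1))
        for g in genomes
    ]
    # Row order: all distinct k-mers across genomes, sorted for reproducibility.
    kmers = sorted(set().union(*counts))
    row = {kmer: i for i, kmer in enumerate(kmers)}
    entries = {
        (row[kmer], j): n
        for j, genome_counts in enumerate(counts)
        for kmer, n in genome_counts.items()
    }
    return kmers, entries


kmers, M = kmer_count_matrix(["ACGACG", "ACGT"], k=3)
```

Here `M.get((i, j), 0)` plays the role of M[i, j]; a library sparse format would serve the same purpose at scale.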
Subject(s)
Machine Learning , Mycobacterium tuberculosis , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/drug effects , Bacterial Drug Resistance/genetics , Microbial Sensitivity Tests , Anti-Bacterial Agents/pharmacology , Whole Genome Sequencing/methods , Bacterial Genome , Humans
ABSTRACT
In the midst of an outbreak or sustained epidemic, reliable prediction of transmission risks and patterns of spread is critical to inform public health programs. Projections of transmission growth or decline among specific risk groups can aid in optimizing interventions, particularly when resources are limited. Phylogenetic trees have been widely used in the detection of transmission chains and high-risk populations. Moreover, tree topology and the incorporation of population parameters (phylodynamics) can be useful in reconstructing the evolutionary dynamics of an epidemic across space and time among individuals. We now demonstrate the utility of phylodynamic trees for transmission modeling and forecasting, developing a phylogeny-based deep learning system, referred to as DeepDynaForecast. Our approach leverages a primal-dual graph learning structure with shortcut multi-layer aggregation, which is suited for the early identification and prediction of transmission dynamics in emerging high-risk groups. We demonstrate the accuracy of DeepDynaForecast using simulated outbreak data and the utility of the learned model using empirical, large-scale data from the human immunodeficiency virus epidemic in Florida between 2012 and 2020. Our framework is available as open-source software (MIT license) at github.com/lab-smile/DeepDynaForcast.
Subject(s)
Computational Biology , Deep Learning , Epidemics , Phylogeny , Humans , Epidemics/statistics & numerical data , Computational Biology/methods , HIV Infections/transmission , HIV Infections/epidemiology , Software , Florida/epidemiology , Algorithms , Computer Simulation , Disease Outbreaks/statistics & numerical data
ABSTRACT
Portable genomic sequencers such as Oxford Nanopore's MinION enable real-time applications in clinical and environmental health. However, there is a bottleneck in the downstream analytics when bioinformatics pipelines are unavailable, e.g., when cloud processing is unreachable due to absence of Internet connection, or only low-end computing devices can be carried on site. Here we present platform-friendly software for portable metagenomic analysis of Nanopore data, the Oligomer-based Classifier of Taxonomic Operational and Pan-genome Units via Singletons (OCTOPUS). OCTOPUS is written in Java and reimplements several features of the popular Kraken2 and KrakenUniq software, with original components for improving metagenomics classification on incomplete/sampled reference databases, making it ideal for running on smartphones or tablets. OCTOPUS obtains sensitivity and precision comparable to Kraken2, while dramatically decreasing (4- to 16-fold) the false positive rate, and yielding high correlation on real-world data. OCTOPUS is available along with customized databases at https://github.com/DataIntellSystLab/OCTOPUS and https://github.com/Ruiz-HCI-Lab/OctopusMobile.
ABSTRACT
The current study aimed to examine the prevalence of and risk factors for cancer and pre-cancerous conditions, comparing transgender and cisgender individuals, using 2012-2023 electronic health record data from a large healthcare system. We identified 2,745 transgender individuals using a previously validated computable phenotype and 54,900 matched cisgender individuals. We calculated the prevalence of cancer and pre-cancer related to human papillomavirus (HPV), human immunodeficiency virus (HIV), tobacco, alcohol, lung, breast, colorectum, and built multivariable logistic models to examine the association between gender identity and the presence of cancer or pre-cancer. Results indicated similar odds of developing cancer across gender identities, but transgender individuals exhibited significantly higher risks for pre-cancerous conditions, including alcohol-related, breast, and colorectal pre-cancers compared to cisgender women, and HPV-related, tobacco-related, alcohol-related, and colorectal pre-cancers compared to cisgender men. These findings underscore the need for tailored interventions and policies addressing cancer health disparities affecting the transgender population.
ABSTRACT
BACKGROUND: Racial/ethnic disparities in the HIV care continuum have been well documented in the US, with especially striking inequalities in viral suppression rates between White and Black persons with HIV (PWH). The South is considered an epicenter of the HIV epidemic in the US, with the largest population of PWH living in Florida. It is unclear whether any disparities in viral suppression or immune reconstitution-a clinical outcome highly correlated with overall prognosis-have changed over time or are homogenous geographically. In this analysis, we 1) investigate longitudinal trends in viral suppression and immune reconstitution among PWH in Florida, 2) examine the impact of socio-ecological factors on the association between race/ethnicity and clinical outcomes, 3) explore spatial and temporal variations in disparities in clinical outcomes. METHODS: Data were obtained from the Florida Department of Health for 42,369 PWH enrolled in the Ryan White program during 2008-2020. We linked the data to county-level socio-ecological variables available from County Health Rankings. GEE models were fit to assess the effect of race/ethnicity on immune reconstitution and viral suppression longitudinally. Poisson Bayesian hierarchical models were fit to analyze geographic variations in racial/ethnic disparities while adjusting for socio-ecological factors. RESULTS: Proportions of PWH who experienced viral suppression and immune reconstitution rose by 60% and 45%, respectively, from 2008-2020. Odds of immune reconstitution and viral suppression were significantly higher among White [odds ratio =2.34, 95% credible interval=2.14-2.56; 1.95 (1.85-2.05)], and Hispanic [1.70 (1.54-1.87); 2.18(2.07-2.31)] PWH, compared with Black PWH. These findings remained unchanged after accounting for socio-ecological factors. Rural and urban counties in north-central Florida saw the largest racial/ethnic disparities. 
CONCLUSIONS: There is persistent, spatially heterogeneous, racial/ethnic disparity in HIV clinical outcomes in Florida. This disparity could not be explained by socio-ecological factors, suggesting that further research on modifiable factors that can improve HIV outcomes among Black and Hispanic PWH in Florida is needed.
Subject(s)
Ethnicity , HIV Infections , Humans , Bayes Theorem , Florida/epidemiology , Healthcare Disparities , Hispanic or Latino , HIV Infections/epidemiology , White , Black or African American
ABSTRACT
Substance use disorder (SUD), a common comorbidity among people with HIV (PWH), adversely affects HIV clinical outcomes and HIV-related comorbidities. However, less is known about the incidence of different chronic conditions, changes in overall comorbidity burden, and health care utilization by SUD status and patterns among PWH in Florida, an area disproportionately affected by the HIV epidemic. We used electronic health records (EHR) from a large southeastern US consortium, the OneFlorida+ clinical research data network. We identified a cohort of PWH with 3+ years of EHRs after the first visit with HIV diagnosis. International Classification of Diseases (ICD) codes were used to identify SUD and comorbidity conditions listed in the Charlson comorbidity index (CCI). A total of 42,271 PWH were included (mean age 44.5, 52% Black, 45% female). The prevalence of SUD among PWH was 45.1%. Having a SUD diagnosis among PWH was associated with a higher incidence of most of the conditions listed on the CCI and a faster increase in CCI score over time (rate ratio = 1.45, 95% CI 1.42, 1.49). SUD in PWH was associated with a higher mean number of any care visits (21.7 vs. 14.8) and more frequent emergency department (ED, 3.5 vs. 2.0) and inpatient (8.5 vs. 24.5) visits compared to those without SUD. SUD among PWH was associated with a higher comorbidity burden and more frequent ED and inpatient visits than PWH without a diagnosis of SUD. The high SUD prevalence and comorbidity burden call for improved SUD screening, treatment, and integrated care among PWH.
Subject(s)
Comorbidity , HIV Infections , Patient Acceptance of Health Care , Substance-Related Disorders , Humans , Female , Florida/epidemiology , Male , HIV Infections/epidemiology , Adult , Substance-Related Disorders/epidemiology , Middle Aged , Patient Acceptance of Health Care/statistics & numerical data , Prevalence , Incidence , Electronic Health Records , Cost of Illness
ABSTRACT
Florida is one of the HIV epicenters with high incidence and marked sociodemographic disparities. We analyzed a decade of statewide electronic health record/claims data-OneFlorida+-to identify and characterize pre-exposure prophylaxis (PrEP) recipients and newly diagnosed HIV cases in Florida. Refined computable phenotype algorithms were applied and a total of 2186 PrEP recipients and 7305 new HIV diagnoses were identified between January 2013 and April 2021. We examined patients' sociodemographic characteristics, stratified by self-reported sex, along with both frequency-driven and expert-selected descriptions of clinical conditions documented within 12 months before the first PrEP use or HIV diagnosis. PrEP utilization rate increased in both sexes; higher rates were observed among males with sex differences widening in recent years. HIV incidence peaked in 2016 and then decreased with minimal sex differences observed. Clinical characteristics were similar between the PrEP and new HIV diagnosis cohorts, characterized by a low prevalence of sexually transmitted infections (STIs) and a high prevalence of mental health and substance use conditions. Study limitations include the overrepresentation of Medicaid recipients, with over 96% of female PrEP users on Medicaid, and the inclusion of those engaged in regular health care. Although PrEP uptake increased in Florida, and HIV incidence decreased, sex disparity among PrEP recipients remained. Screening efforts beyond individuals with documented prior STI and high-risk behavior, especially for females, including integration of mental health care with HIV counseling and testing, are crucial to further equalize PrEP access and improve HIV prevention programs.
Subject(s)
HIV Infections , Pre-Exposure Prophylaxis , United States , Humans , Female , Male , Florida/epidemiology , Electronic Health Records , HIV Infections/diagnosis , HIV Infections/epidemiology , HIV Infections/prevention & control , Demography
ABSTRACT
The benefits and harms of lung cancer screening (LCS) for patients in the real-world clinical setting have been debated. Recently, discriminative prediction modeling of lung cancer with stratified risk factors has been developed to investigate the real-world effectiveness of LCS from observational data. However, most of these studies were conducted at the population level and only measured the difference in the average outcome between groups. In this study, we built counterfactual prediction models for lung cancer risk and mortality and examined for individual patients whether LCS as a hypothetical intervention reduces lung cancer risk and subsequent mortality. We investigated traditional and deep learning (DL)-based causal methods that provide individualized treatment effect (ITE) at the patient level and evaluated them with a cohort from the OneFlorida+ Clinical Research Consortium. We further discussed and demonstrated that the ITE estimation model can be used to personalize clinical decision support for a broader population.
Subject(s)
Deep Learning , Lung Neoplasms , Humans , Early Detection of Cancer , Lung Neoplasms/diagnosis , Risk Factors
ABSTRACT
BACKGROUND: The rapid evolution of artificial intelligence (AI) in conjunction with recent updates in dual antiplatelet therapy (DAPT) management guidelines emphasizes the necessity for innovative models to predict ischemic or bleeding events after drug-eluting stent implantation. Leveraging AI for dynamic prediction has the potential to revolutionize risk stratification and provide personalized decision support for DAPT management. METHODS AND RESULTS: We developed and validated a new AI-based pipeline using retrospective data of drug-eluting stent-treated patients, sourced from the Cerner Health Facts data set (n=98 236) and Optum's de-identified Clinformatics Data Mart Database (n=9978). The 36 months following drug-eluting stent implantation were designated as our primary forecasting interval, further segmented into 6 sequential prediction windows. We evaluated 5 distinct AI algorithms for their precision in predicting ischemic and bleeding risks. Model discriminative accuracy was assessed using the area under the receiver operating characteristic curve, among other metrics. The weighted light gradient boosting machine stood out as the preeminent model, thus earning its place as our AI-DAPT model. The AI-DAPT demonstrated peak accuracy in the 30 to 36 months window, charting an area under the receiver operating characteristic curve of 90% [95% CI, 88%-92%] for ischemia and 84% [95% CI, 82%-87%] for bleeding predictions. CONCLUSIONS: Our AI-DAPT excels in formulating iterative, refined dynamic predictions by assimilating ongoing updates from patients' clinical profiles, holding value as a novel smart clinical tool to facilitate optimal DAPT duration management with high accuracy and adaptability.
Subject(s)
Coronary Artery Disease , Drug-Eluting Stents , Myocardial Infarction , Percutaneous Coronary Intervention , Humans , Platelet Aggregation Inhibitors/adverse effects , Myocardial Infarction/etiology , Coronary Artery Disease/diagnosis , Coronary Artery Disease/surgery , Drug-Eluting Stents/adverse effects , Artificial Intelligence , Retrospective Studies , Treatment Outcome , Risk Factors , Combination Drug Therapy , Hemorrhage/chemically induced , Prognosis , Percutaneous Coronary Intervention/adverse effects
ABSTRACT
The coronavirus disease of 2019 (COVID-19) pandemic is characterized by sequential emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants and lineages outcompeting previously circulating ones because of, among other factors, increased transmissibility and immune escape [1-3]. We devised an unsupervised deep learning AutoEncoder for viral genome anomaly detection to predict future dominant lineages (FDLs), i.e., lineages or sublineages comprising ≥10% of viral sequences added to the GISAID database in a given week [4]. The algorithm was trained and validated by assembling global and country-specific data sets from 16,187,950 Spike protein sequences sampled between December 24th, 2019, and November 8th, 2023. The AutoEncoder flags low-frequency FDLs (0.01%-3%), with median lead times of 4-16 weeks. Over time, positive predictive values oscillate, decreasing linearly with the number of unique sequences per data set, showing average performance up to 30 times better than baseline approaches. The B.1.617.2 vaccine reference strain was flagged as FDL when its frequency was only 0.01%, more than one year before being considered for an updated COVID-19 vaccine. Our AutoEncoder, applicable in principle to any pathogen, also pinpoints specific mutations potentially linked to increased fitness, and may provide significant insights for the optimization of public health pre-emptive intervention strategies.
ABSTRACT
HIV-related stigma is a key contributor to poor HIV-related health outcomes. The purpose of this study is to explore implementing a stigma measure into routine HIV care focusing on the 10-item Medical Monitoring Project measure as a proposed measure. Healthcare providers engaged in HIV-related care in Florida were recruited. Participants completed an interview about their perceptions of measures to assess stigma during clinical care. The analysis followed a directed content approach. Fifteen participants completed the interviews (87% female, 47% non-Hispanic White, case manager 40%). Most providers thought that talking about stigma would be helpful (89%). Three major themes emerged from the analysis: acceptability, subscales of interest, and utility. In acceptability, participants mentioned that assessing stigma could encourage patient-centered care and serve as a conversation starter, but some mentioned not having enough time. Participants thought that the disclosure concerns and negative self-image subscales were most relevant. Some worried they would not have resources for patients or that some issues were beyond their influence. Participants were generally supportive of routinely addressing HIV-related stigma in clinical care, but were concerned that resources, especially to address concerns about disclosure and negative self-image, were not available.
Subject(s)
HIV Infections , Humans , Female , Male , Florida , Social Stigma , Anxiety , Disclosure
ABSTRACT
This study quantifies health outcome disparities in invasive Methicillin-Resistant Staphylococcus aureus (MRSA) infections by leveraging a novel artificial intelligence (AI) fairness algorithm, the Fairness-Aware Causal paThs (FACTS) decomposition, and applying it to real-world electronic health record (EHR) data. We spatiotemporally linked 9 years of EHRs from a large healthcare provider in Florida, USA, with contextual social determinants of health (SDoH). We first created a causal structure graph connecting SDoH with individual clinical measurements before/upon diagnosis of invasive MRSA infection, treatments, side effects, and outcomes; then, we applied FACTS to quantify outcome potential disparities of different causal pathways including SDoH, clinical and demographic variables. We found moderate disparity with respect to demographics and SDoH, and all the top ranked pathways that led to outcome disparities in age, gender, race, and income, included comorbidity. Prior kidney impairment, vancomycin use, and timing were associated with racial disparity, while income, rurality, and available healthcare facilities contributed to gender disparity. From an intervention standpoint, our results highlight the necessity of devising policies that consider both clinical factors and SDoH. In conclusion, this work demonstrates a practical utility of fairness AI methods in public health settings.
Subject(s)
Community-Acquired Infections , Methicillin-Resistant Staphylococcus aureus , Staphylococcal Infections , Humans , Staphylococcal Infections/drug therapy , Staphylococcal Infections/diagnosis , Artificial Intelligence , Community-Acquired Infections/drug therapy , Computational Biology , Algorithms , Health Care Outcome Assessment , Anti-Bacterial Agents/therapeutic use
ABSTRACT
BACKGROUND: Hospital-induced delirium is one of the most common and costly iatrogenic conditions, and its incidence is predicted to increase as the population of the United States ages. An academic and clinical interdisciplinary systems approach is needed to reduce the frequency and impact of hospital-induced delirium. OBJECTIVE: The long-term goal of our research is to enhance the safety of hospitalized older adults by reducing iatrogenic conditions through an effective learning health system. In this study, we will develop models for predicting hospital-induced delirium. To accomplish this objective, we will create a computable phenotype for our outcome (hospital-induced delirium), design an expert-based traditional logistic regression model, leverage machine learning techniques to generate a model using structured data, and use machine learning and natural language processing to produce an integrated model with components from both structured data and text data. METHODS: This study will explore text-based data, such as nursing notes, to improve the predictive capability of prognostic models for hospital-induced delirium. By using supervised and unsupervised text mining in addition to structured data, we will examine multiple types of information in electronic health record data to predict medical-surgical patient risk of developing delirium. Development and validation will be compliant with the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement. RESULTS: Work on this project will take place through March 2024. For this study, we will use data from approximately 332,230 encounters that occurred between January 2012 and May 2021. Findings from this project will be disseminated at scientific conferences and in peer-reviewed journals. 
CONCLUSIONS: Success in this study will yield a durable, high-performing research-data infrastructure that will process, extract, and analyze clinical text data in near real time. This model has the potential to be integrated into the electronic health record and provide point-of-care decision support to prevent harm and improve quality of care. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/48521.