ABSTRACT
OBJECTIVE: Healthcare continues to grapple with the persistent issue of treatment disparities, sparking concerns regarding the equitable allocation of treatments in clinical practice. While various fairness metrics have emerged to assess fairness in decision-making processes, a growing focus has been on causality-based fairness concepts due to their capacity to mitigate confounding effects and reason about bias. However, the application of causal fairness notions in evaluating the fairness of clinical decision-making with electronic health record (EHR) data remains an understudied domain. This study aims to address the methodological gap in assessing causal fairness of treatment allocation with electronic health records data. In addition, we investigate the impact of social determinants of health on the assessment of causal fairness of treatment allocation. METHODS: We propose a causal fairness algorithm to assess fairness in clinical decision-making. Our algorithm accounts for the heterogeneity of patient populations and identifies potential unfairness in treatment allocation by conditioning on patients who have the same likelihood to benefit from the treatment. We apply this framework to a patient cohort with coronary artery disease derived from an EHR database to evaluate the fairness of treatment decisions. RESULTS: Our analysis reveals notable disparities in coronary artery bypass grafting (CABG) allocation among different patient groups. Women were found to be 4.4%-7.7% less likely to receive CABG than men in two out of four treatment response strata. Similarly, Black or African American patients were 5.4%-8.7% less likely to receive CABG than others in three out of four response strata. These results were similar when social determinants of health (insurance and area deprivation index) were dropped from the algorithm. 
These findings highlight the presence of disparities in treatment allocation among similar patients, suggesting potential unfairness in the clinical decision-making process. CONCLUSION: This study introduces a novel approach for assessing the fairness of treatment allocation in healthcare. By incorporating responses to treatment into a fairness framework, our method explores the potential of quantifying fairness from a causal perspective using EHR data. Our research advances the methodological development of fairness assessment in healthcare and highlights the importance of causality in determining treatment fairness.
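The comparison of treatment rates among patients with a similar likelihood of benefit can be illustrated with a minimal sketch. This is not the authors' algorithm: the stratum labels are assumed to be assigned already (for example, by a treatment-response model that is not shown here), and the function name `rate_gap_by_stratum` and the toy records are hypothetical.

```python
from collections import defaultdict


def rate_gap_by_stratum(records, group_a, group_b):
    """Return {stratum: rate(group_a) - rate(group_b)} of treatment receipt.

    records: iterable of (stratum, group, treated) tuples, where `treated`
    is True or False. Strata that lack either group are skipped.
    """
    # counts[stratum][group] = [number treated, number of patients]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for stratum, group, treated in records:
        cell = counts[stratum][group]
        cell[0] += int(treated)
        cell[1] += 1

    gaps = {}
    for stratum, by_group in counts.items():
        if group_a in by_group and group_b in by_group:
            rate_a = by_group[group_a][0] / by_group[group_a][1]
            rate_b = by_group[group_b][0] / by_group[group_b][1]
            gaps[stratum] = rate_a - rate_b
    return gaps


# Toy records invented for illustration only (not data from the study).
toy = [
    ("high", "women", True), ("high", "women", False),
    ("high", "men", True), ("high", "men", True),
    ("low", "women", False), ("low", "men", False),
]
gaps = rate_gap_by_stratum(toy, "women", "men")
```

A negative gap in a stratum means the first group received the treatment less often than the second among patients placed in that stratum. A real analysis would also need uncertainty estimates, which this sketch omits.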
Subjects
Algorithms , Electronic Health Records , Humans , Male , Female , Clinical Decision-Making , Coronary Artery Disease/therapy , Healthcare Disparities , Middle Aged , Social Determinants of Health , Causality
ABSTRACT
Clinical documentation in electronic health records contains crucial narratives and details about patients and their care. Natural language processing (NLP) can unlock the information conveyed in clinical notes and reports, and thus plays a critical role in real-world studies. The NLP Working Group at the Observational Health Data Sciences and Informatics (OHDSI) consortium was established to develop methods and tools to promote the use of textual data and NLP in real-world observational studies. In this paper, we describe a framework for representing and utilizing textual data in real-world evidence generation, including representations of information from clinical text in the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), the workflow and tools that were developed to extract, transform and load (ETL) data from clinical notes into tables in OMOP CDM, as well as current applications and specific use cases of the proposed OHDSI NLP solution at large consortia and individual institutions with English textual data. Challenges faced and lessons learned during the process are also discussed to provide valuable insights for researchers who are planning to implement NLP solutions in real-world studies.
Subjects
Data Science , Medical Informatics , Humans , Electronic Health Records , Natural Language Processing , Narration
ABSTRACT
BACKGROUND: Work Relative Value Units (wRVUs) are a component of many compensation models, and a proxy for the effort required to care for a patient. Accurate prediction of wRVUs generated per patient at triage could facilitate real-time load balancing between physicians and provide many practical operational and clinical benefits. OBJECTIVE: We examined whether deep-learning approaches could predict the wRVUs generated by a patient's visit using data commonly available at triage. METHODS: Adult patients presenting to an urban, academic emergency department from July 1, 2016-March 1, 2020 were included. Deidentified triage information included structured data (age, sex, vital signs, Emergency Severity Index score, language, race, standardized chief complaint) and unstructured data (free-text chief complaint) with wRVUs as outcome. Five models were examined: average wRVUs per chief complaint, linear regression, neural network and gradient-boosted tree on structured data, and neural network on unstructured textual data. Models were evaluated using mean absolute error. RESULTS: We analyzed 204,064 visits between July 1, 2016 and March 1, 2020. The median wRVUs were 3.80 (interquartile range 2.56-4.21), with significant effects of age, gender, and race. Models demonstrated lower error as complexity increased. Predictions using averages from chief complaints alone demonstrated a mean error of 2.17 predicted wRVUs per visit (95% confidence interval [CI] 2.07-2.27), the linear regression model: 1.00 wRVUs (95% CI 0.97-1.04), gradient-boosted tree: 0.85 wRVUs (95% CI 0.84-0.86), neural network with structured data: 0.86 wRVUs (95% CI 0.85-0.87), and neural network with unstructured data: 0.78 wRVUs (95% CI 0.76-0.80). CONCLUSIONS: Chief complaints are a poor predictor of the effort needed to evaluate a patient; however, deep-learning techniques show promise. 
These algorithms have the potential to provide many practical applications, including balancing workloads and compensation between emergency physicians, quantifying crowding and mobilizing resources, and reducing bias in the triage process.
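The simplest model in the comparison above, the average wRVUs per chief complaint scored by mean absolute error, might be sketched as follows. The function names, the fallback to the overall mean for unseen complaints, and the toy visits are assumptions made for illustration and are not taken from the study.

```python
from collections import defaultdict


def fit_complaint_means(train):
    """Map each chief complaint to the mean wRVUs over its training visits."""
    totals = defaultdict(lambda: [0.0, 0])  # complaint -> [sum, count]
    for complaint, wrvu in train:
        totals[complaint][0] += wrvu
        totals[complaint][1] += 1
    return {c: s / n for c, (s, n) in totals.items()}


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy visits invented for illustration: (chief complaint, wRVUs).
train = [("chest pain", 4.0), ("chest pain", 5.0), ("ankle injury", 2.0)]
test = [("chest pain", 4.0), ("ankle injury", 3.0)]

means = fit_complaint_means(train)
overall = sum(w for _, w in train) / len(train)  # assumed fallback value
preds = [means.get(c, overall) for c, _ in test]
mae = mean_absolute_error([w for _, w in test], preds)
```

The same `mean_absolute_error` function could score any of the other models, since it only needs observed and predicted wRVUs.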
Subjects
Emergency Service, Hospital , Workload , Adult , Humans , Triage/methods , Algorithms , Machine Learning
ABSTRACT
The current study examined the frequency and predictors of older adults' engagement with symptom reporting in COVIDWATCHER, a mobile health (mHealth) citizen science application. Citizen science is a type of participatory research that leverages information provided by community members. There were 1,028 COVIDWATCHER participants who engaged with symptom reporting between April 2020 and January 2021. Approximately 13.5% (n = 139) were adults aged ≥65 years. We used a Wilcoxon test to compare the mean frequency of engagement with symptom reporting by older adults (i.e., aged ≥65 years) to younger adults (i.e., aged ≤64 years) and multivariable linear regression to explore the predictors of engagement with symptom reporting. There was a significant difference in engagement with symptom reporting between adults aged ≥65 years compared to those aged ≤64 years (p < 0.001). In our final model, age (β = 26.0; 95% confidence interval [14.8, 34.2]) was a significant predictor for engagement with symptom reporting. These results help further our understanding of older adult engagement with mHealth-enabled citizen science for symptom reporting. [Journal of Gerontological Nursing, 49(4), 6-11.].
Subjects
COVID-19 , Citizen Science , Telemedicine , Humans , Aged , COVID-19/epidemiology
ABSTRACT
OBJECTIVES: Observed Structured Clinical Exams (OSCEs) allow assessment of, and provide feedback to, medical students. Clinical examiners and standardised patients (SP) typically complete itemised checklists and global scoring scales, which have known shortcomings. In this study, we applied machine learning (ML) to label some communication skills and interview content information in OSCE transcripts and compared several ML methodologies by performance and transferability. METHODS: One-hundred and twenty-one transcripts of two OSCE scenarios were manually annotated per utterance across 19 communication skills and content areas. Utterances were converted to two types of numeric sentence vector representations and were paired with three types of ML algorithms. First, ML models (MLMs) were evaluated using five-fold cross-validation on all transcripts in one scenario to generate precision and recall, and their harmonic mean, F1 scores. Second, ML models were trained on all 101 transcripts from scenario 1 and tested for transferability on 20 scenario 2 transcripts. RESULTS: Performance testing in the five-fold cross-validation demonstrated relatively high mean F1 scores: median 0.87 and range 0.53-0.98 across all 19 labels. Transferability testing demonstrated success: F1 median 0.76 and range 0.46-0.97. The combination of a bi-directional long short-term memory neural network (biLSTM) algorithm with GenSen numeric sentence vector representations was associated with greater F1 scores across both performance and transferability (P < .005). CONCLUSIONS: We report the first application of ML in the context of student-SP OSCEs. We demonstrated that several MLMs automatically labelled OSCE transcripts for a range of interview content and some clinical communications skills. Some MLMs achieved greater performance and transferability. 
Optimised MLMs could provide automated and accurate assessment of OSCEs with potential to track student progress and identify areas for further practice.
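Because the results above are reported as F1 scores, a short sketch of how precision, recall, and their harmonic mean are computed for one label may help. The function name, the label "empathy", and the toy annotations are hypothetical and are not drawn from the study's data.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Compute precision, recall and F1 for one label treated as positive."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy per-utterance annotations invented for illustration.
truth = ["empathy", "other", "empathy", "empathy", "other"]
predicted = ["empathy", "empathy", "other", "empathy", "other"]
precision, recall, f1 = precision_recall_f1(truth, predicted, "empathy")
```

In this toy case there are two true positives, one false positive, and one false negative, so precision, recall, and F1 all equal 2/3.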
Subjects
Educational Measurement , Students, Medical , Clinical Competence , Communication , Humans , Machine Learning
ABSTRACT
To develop and validate a prediction model for delayed cerebral ischemia (DCI) after subarachnoid hemorrhage (SAH) using a temporal unsupervised feature engineering approach, demonstrating improved precision over standard features. 488 consecutive SAH admissions from 2006 to 2014 to a tertiary care hospital were included. Models were trained on 80%, while 20% were set aside for validation testing. Baseline information and standard grading scales were evaluated: age, sex, Hunt Hess grade, modified Fisher Scale (mFS), and Glasgow Coma Scale (GCS). An unsupervised approach applying random kernels was used to extract features from physiological time series (systolic and diastolic blood pressure, heart rate, respiratory rate, and oxygen saturation). Classifiers (Partial Least Squares, linear and kernel Support Vector Machines) were trained on feature subsets of the derivation dataset. Models were applied to the validation dataset. The performances of the best classifiers on the validation dataset are reported by feature subset. Standard grading scale (mFS): AUC 0.58. Combined demographics and grading scales: AUC 0.60. Random kernel derived physiologic features: AUC 0.74. Combined baseline and physiologic features with redundant feature reduction: AUC 0.77. Current DCI prediction tools rely on admission imaging and are advantageously simple to employ. However, using an agnostic and computationally inexpensive learning approach for high-frequency physiologic time series data, we demonstrated that our models achieve higher classification accuracy.
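The AUC values reported above can be read as the probability that a randomly chosen patient who developed DCI receives a higher model score than a randomly chosen patient who did not. A minimal pairwise sketch of that definition follows; the function name and the toy scores are assumptions for illustration and this is not the authors' evaluation code.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class, 0 for the negative class.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels and scores invented for illustration.
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred admissions; a rank-based formula gives the same value more efficiently.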
Subjects
Brain Ischemia/diagnostic imaging , Diagnosis, Computer-Assisted/methods , Subarachnoid Hemorrhage/diagnostic imaging , Aged , Area Under Curve , Critical Care , False Positive Reactions , Female , Glasgow Coma Scale , Humans , Least-Squares Analysis , Male , Middle Aged , Patient Admission , Predictive Value of Tests , Reproducibility of Results , Risk Factors , Severity of Illness Index , Support Vector Machine , Tertiary Care Centers , Time Factors
ABSTRACT
Most laboratory results are valid for only a certain time period (the laboratory test shelf-life), after which they are outdated and the test needs to be re-administered. Currently, laboratory test shelf-lives are not centrally documented anywhere and exist only as the implicit knowledge of doctors. In this work we propose an automated method to learn laboratory test-specific shelf-lives by identifying prevalent laboratory test order patterns in electronic health records. The resulting shelf-lives performed well in the evaluation of internal validity, clinical interpretability, and external validity.
Subjects
Clinical Laboratory Techniques/statistics & numerical data , Blood Glucose/analysis , Clinical Laboratory Techniques/standards , Computational Biology , Electronic Health Records/statistics & numerical data , Humans , Longitudinal Studies , Models, Statistical , Phenotype , Reproducibility of Results , Time Factors
ABSTRACT
Identifying topics of discussions in online health communities (OHC) is critical to various information extraction applications, but can be difficult because topics of OHC content are usually heterogeneous and domain-dependent. In this paper, we provide a multi-class schema, an annotated dataset, and supervised classifiers based on convolutional neural network (CNN) and other models for the task of classifying discussion topics. We apply the CNN classifier to the most popular breast cancer online community, and carry out cross-sectional and longitudinal analyses to show topic distributions and topic dynamics throughout members' participation. Our experimental results suggest that CNN outperforms other classifiers in the task of topic classification and identify several patterns and trajectories. For example, although members discuss mainly disease-related topics, their interest may change through time and vary with their disease severities.
Subjects
Breast Neoplasms , Internet , Neural Networks, Computer , Cross-Sectional Studies , Female , Humans , Patient Participation
ABSTRACT
Speculations represent uncertainty toward certain facts. In clinical texts, identifying speculations is a critical step of natural language processing (NLP). While it is a nontrivial task in many languages, detecting speculations in Chinese clinical notes can be particularly challenging because word segmentation may be necessary as an upstream operation. The objective of this paper is to construct a state-of-the-art speculation detection system for Chinese clinical notes and to investigate whether embedding features and word segmentations are worth exploiting toward this overall task. We propose a sequence labeling based system for speculation detection, which relies on features from bag of characters, bag of words, character embedding, and word embedding. We experiment on a novel dataset of 36,828 clinical notes with 5103 gold-standard speculation annotations on 2000 notes, and compare the systems in which word embeddings are calculated based on word segmentations given by general and by domain specific segmenters respectively. Our systems are able to reach performance as high as 92.2% measured by F score. We demonstrate that word segmentation is critical to produce high quality word embedding to facilitate downstream information extraction applications, and suggest that a domain dependent word segmenter can be vital to such a clinical NLP task in Chinese language.
Subjects
Data Mining/methods , Electronic Health Records/instrumentation , Natural Language Processing , China , Computer Systems , Humans , Language , Medical Informatics/methods , Reproducibility of Results , Workflow
ABSTRACT
We present the Unsupervised Phenome Model (UPhenome), a probabilistic graphical model for large-scale discovery of computational models of disease, or phenotypes. We tackle this challenge through the joint modeling of a large set of diseases and a large set of clinical observations. The observations are drawn directly from heterogeneous patient record data (notes, laboratory tests, medications, and diagnosis codes), and the diseases are modeled in an unsupervised fashion. We apply UPhenome to two qualitatively different mixtures of patients and diseases: records of extremely sick patients in the intensive care unit with constant monitoring, and records of outpatients regularly followed by care providers over multiple years. We demonstrate that the UPhenome model can learn from these different care settings, without any additional adaptation. Our experiments show that (i) the learned phenotypes combine the heterogeneous data types more coherently than baseline LDA-based phenotypes; (ii) they each represent single diseases rather than a mix of diseases more often than the baseline ones; and (iii) when applied to unseen patient records, they are correlated with the patients' ground-truth disorders. Code for training, inference, and quantitative evaluation is made available to the research community.
Subjects
Electronic Health Records , Learning , Probability , Humans , Phenotype
ABSTRACT
Electronic health record (EHR) data show promise for deriving new ways of modeling human disease states. Although EHR researchers often use numerical values of laboratory tests as features in disease models, a great deal of information is contained in the context within which a laboratory test is taken. For example, the same numerical value of a creatinine test has different interpretation for a chronic kidney disease patient and a patient with acute kidney injury. We study whether EHR research studies are subject to biased results and interpretations if laboratory measurements taken in different contexts are not explicitly separated. We show that the context of a laboratory test measurement can often be captured by the way the test is measured through time. We perform three tasks to study the properties of these temporal measurement patterns. In the first task, we confirm that laboratory test measurement patterns provide additional information to the stand-alone numerical value. The second task identifies three measurement pattern motifs across a set of 70 laboratory tests performed for over 14,000 patients. Of these, one motif exhibits properties that can lead to biased research results. In the third task, we demonstrate the potential for biased results on a specific example. We conduct an association study of lipase test values to acute pancreatitis. We observe a diluted signal when using only a lipase value threshold, whereas the full association is recovered when properly accounting for lipase measurements in different contexts (leveraging the lipase measurement patterns to separate the contexts). Aggregating EHR data without separating distinct laboratory test measurement patterns can intermix patients with different diseases, leading to the confounding of signals in large-scale EHR analyses. This paper presents a methodology for leveraging measurement frequency to identify and reduce laboratory test biases.
Subjects
Artifacts , Clinical Laboratory Information Systems/statistics & numerical data , Data Interpretation, Statistical , Data Mining/methods , Electronic Health Records/classification , Electronic Health Records/statistics & numerical data , Pattern Recognition, Automated/methods , Clinical Laboratory Information Systems/classification , Confounding Factors, Epidemiologic , New York
ABSTRACT
Without comprehensive examination of available literature on health disparities and minority health (HDMH), the field is left vulnerable to disproportionately focus on specific populations or conditions, curtailing our ability to fully advance health equity. Using scalable open-source methods, we conducted a computational scoping review of more than 200,000 articles to investigate major populations, conditions, and themes as well as notable gaps. We also compared trends in studied conditions to their relative prevalence using insurance claims (42 million Americans). HDMH publications represent 1% of articles in Medical Literature Analysis and Retrieval System Online (MEDLINE). Most studies are observational in nature, although randomized trial reporting has increased fivefold in the past 20 years. Half of HDMH articles concentrate on only three disease groups (cancer, mental health, and endocrine/metabolic disorders), while hearing, vision, and skin-related conditions are among the least well represented despite substantial prevalence. To support further investigation, we present HDMH Monitor, an interactive dashboard and repository generated from the HDMH bibliome.
Subjects
Hearing , Minority Health , Humans , Mental Health , Health Inequities
ABSTRACT
Precision medicine has the potential to provide more accurate diagnosis, appropriate treatment and timely prevention strategies by considering patients' biological makeup. However, this cannot be realized without integrating clinical and omics data in a data-sharing framework that achieves large sample sizes. Systems that integrate clinical and genetic data from multiple sources are scarce due to their distinct data types, interoperability, security and data ownership issues. Here we present a secure framework that allows immutable storage, querying and analysis of clinical and genetic data using blockchain technology. Our platform allows clinical and genetic data to be harmonized by combining them under a unified framework. It supports combined genotype-phenotype queries and analysis, gives institutions control of their data and provides immutable user access logs, improving transparency into how and when health information is used. We demonstrate the value of our framework for precision medicine by creating genotype-phenotype cohorts and examining relationships within them. We show that combining data across institutions using our secure platform increases statistical power for rare disease analysis. By offering an integrated, secure and decentralized framework, we aim to enhance reproducibility and encourage broader participation from communities and patients in data sharing.
ABSTRACT
Postpartum depression (PPD) is a mood disorder affecting one in seven women after childbirth that is often under-screened and under-detected. If not diagnosed and treated, PPD is associated with long-term developmental challenges in the child and maternal morbidity. Wearable technologies, such as smartwatches and fitness trackers (e.g., Fitbit), offer continuous and longitudinal digital phenotyping for mood disorder diagnosis and monitoring, with device wear time being an important yet understudied aspect. Using the All of Us Research Program (AoURP) dataset, we assessed the percentage of days women with PPD wore Fitbit devices across pre-pregnancy, pregnancy, postpartum, and PPD periods, as determined by electronic health records. Wear time was compared in women with and without PPD using linear regression models. Results showed a strong trend that women in the PPD cohort wore their Fitbits more than those without PPD during the postpartum (PPD: mean=72.9%, SE=13.8%; non-PPD: mean=58.9%, SE=12.2%, P-value=0.09) and PPD time periods (PPD: mean=70.7%, SE=14.5%; non-PPD: mean=55.6%, SE=12.9%, P-value=0.08). We hypothesize this may be attributed to hypervigilance, given the common co-occurrence of anxiety symptoms among women with PPD. Future studies should assess the link between PPD, hypervigilance, and wear time patterns. We envision that device wear patterns, combined with digital biomarkers like sleep and physical activity, could enhance early PPD detection using machine learning by alerting clinicians to potential concerns and facilitating timely screenings, which may have implications for other mental health disorders.
ABSTRACT
We developed machine learning and deep learning models to identify mpox cases from clinical notes as part of a learning health system initiative. Lasso regression outperformed deep learning models, excelled in minimizing false positives, and may prove helpful for flagging missed or delayed diagnoses as part of continuous quality improvement.
ABSTRACT
Sexually transmitted infections (STIs) continue to pose a substantial public health challenge in the United States (US). Surveillance, a cornerstone of disease control and prevention, can be strengthened to promote more timely, efficient, and equitable practices by incorporating health information exchange (HIE) and other large-scale health data sources into reporting. New York City patient-level electronic health record data between January 1, 2018 and June 30, 2023 were obtained from Healthix, the largest US public HIE. Healthix data were linked to neighborhood-level information from the American Community Survey. In this cross-sectional study, we compared patients who received a test or tested positive for chlamydia, gonorrhea, and/or HIV with patients who were untested or tested negative, respectively, using generalized estimating equations with logit function and robust standard errors. Among 1,519,121 tests performed for chlamydia, 1,574,772 for gonorrhea, and 1,200,560 for HIV, 2%, 0.6% and 0.3% were positive for chlamydia, gonorrhea, and HIV, respectively. Chlamydia and gonorrhea co-occurred in 1,854 cases (7% of chlamydia and 21% of gonorrhea total cases). Testing behavior was often incongruent with geographic and sociodemographic patterns of positive cases. For example, people living in areas with the highest levels of poverty were less likely to test for gonorrhea but almost twice as likely to test positive compared to those in low poverty areas. Regional HIE enabled review of testing and cases using granular and complementary data not typically available given existing reporting practices. Enhanced surveillance spotlights potential incongruencies between testing patterns and STI risk in certain populations, signaling potential under- and over-testing. These and future insights derived from HIE data may be used to continuously inform public health practice and drive further improvements in provision and evaluation of services and programs.
ABSTRACT
Studying near-miss errors is essential to preventing errors from reaching patients. When an error is committed, it may be intercepted (near-miss) or it will reach the patient; estimates of the proportion that reach the patient vary widely. To better understand this relationship, we conducted a retrospective cohort study using two objective measures to identify wrong-patient imaging order errors involving radiation, estimating the proportion of errors that are intercepted and those that reach the patient. This study was conducted at a large integrated healthcare system using data from 1 January to 31 December 2019. The study used two outcome measures of wrong-patient orders: (1) wrong-patient orders that led to misadministration of radiation reported to the New York Patient Occurrence Reporting and Tracking System (NYPORTS) (misadministration events); and (2) wrong-patient orders identified by the Wrong-Patient Retract-and-Reorder (RAR) measure, a measure identifying orders placed for a patient, retracted and rapidly reordered by the same clinician on a different patient (near-miss events). All imaging orders that involved radiation were extracted retrospectively from the healthcare system data warehouse. Among 293 039 total eligible orders, 151 were wrong-patient orders (3 misadministration events, 148 near-miss events), for an overall rate of 51.5 per 100 000 imaging orders involving radiation placed on the wrong patient. Of all wrong-patient imaging order errors, 2% reached the patient, translating to 50 near-miss events for every 1 error that reached the patient. This proportion provides a more accurate and reliable estimate and reinforces the utility of systematic measure of near-miss errors as an outcome for preventative interventions.
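The rates quoted above follow directly from the reported counts. The arithmetic can be checked with a short sketch; the variable names are ours, and the counts are the ones stated in the abstract.

```python
# Counts reported in the abstract.
total_orders = 293_039
misadministration_events = 3   # wrong-patient orders that reached the patient
near_miss_events = 148         # wrong-patient orders intercepted (RAR measure)

wrong_patient_orders = misadministration_events + near_miss_events

# Wrong-patient orders per 100,000 imaging orders involving radiation.
rate_per_100k = wrong_patient_orders / total_orders * 100_000

# Share of wrong-patient orders that reached the patient.
share_reaching_patient = misadministration_events / wrong_patient_orders

# Near-miss events for every error that reached the patient.
near_miss_per_reached = near_miss_events / misadministration_events

print(f"{rate_per_100k:.1f} per 100,000 orders")
print(f"{share_reaching_patient:.1%} reached the patient")
print(f"{near_miss_per_reached:.1f} near misses per error that reached the patient")
```

The exact ratio from the counts is about 49.3 near misses per error reaching the patient, which the abstract rounds to 50.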
Subjects
Delivery of Health Care, Integrated , Humans , Retrospective Studies , New York
ABSTRACT
Given the chronic nature of schizophrenia, it is important to examine age-specific prevalence and incidence to understand the scope of the burden of schizophrenia across the lifespan. Estimates of lifetime prevalence of schizophrenia have varied widely and have often relied upon community-based data estimates from over two decades ago, while more recent studies have shown considerable promise by leveraging pooled datasets. However, the validity of measures of schizophrenia, particularly new onset schizophrenia, has not been well studied in these large health databases. The current study examines prevalence and validity of incidence measures of new diagnoses of schizophrenia in 2019 using two U.S. administrative health databases: MarketScan, a national database of individuals receiving employer-sponsored commercial insurance (N = 16,365,997), and NYS Medicaid, a large state public insurance program (N = 4,414,153). Our results indicate that the prevalence of schizophrenia is over 10-fold higher, and the incidence two-fold higher, in the NYS Medicaid population compared to the MarketScan database. In addition, prevalence increased over the lifespan in the Medicaid population, but decreased in the employment based MarketScan database beginning in early adulthood. Incident measures of new diagnoses of schizophrenia had excellent validity, with positive predictive values and specificity exceeding 95%, but required a longer lookback period for Medicaid compared to MarketScan. Further work is needed to leverage these findings to develop robust clinical outcome predictors for new onset of schizophrenia within large administrative health data systems.
ABSTRACT
Background: Endometriosis affects 10% of reproductive-age women, and yet, it goes undiagnosed for 3.6 years on average after symptom onset. Despite large GWAS meta-analyses (N > 750,000), only a few dozen causal loci have been identified. We hypothesized that the challenges in identifying causal genes for endometriosis stem from heterogeneity across clinical and biological factors underlying endometriosis diagnosis. Methods: We extracted known endometriosis risk factors, symptoms, and concomitant conditions from the Penn Medicine Biobank (PMBB) and performed unsupervised spectral clustering on 4,078 women with endometriosis. The 5 clusters were characterized by utilizing additional electronic health record (EHR) variables, such as endometriosis-related comorbidities and confirmed surgical phenotypes. From four EHR-linked genetic datasets, PMBB, eMERGE, AOU, and UKBB, we extracted lead variants and tag variants for 39 known endometriosis loci for association testing. We meta-analyzed ancestry-stratified case/control tests for each locus and cluster in addition to a positive control (Total N endometriosis cases = 10,108). Results: We have designated the five subtype clusters as pain comorbidities, uterine disorders, pregnancy complications, cardiometabolic comorbidities, and EHR-asymptomatic based on enriched features from each group. One locus, RNLS, surpassed the genome-wide significance threshold in the positive control. Thirteen more loci reached a Bonferroni threshold of 1.3 × 10⁻³ (0.05 / 39) in the positive control. The cluster-stratified tests yielded more significant associations than the positive control for anywhere from 5 to 15 loci depending on the cluster. Bonferroni significant loci were identified for four out of five clusters, including WNT4 and GREB1 for the uterine disorders cluster, RNLS for the cardiometabolic cluster, FSHB for the pregnancy complications cluster, and SYNE1 and CDKN2B-AS1 for the EHR-asymptomatic cluster. 
This study enhances our understanding of the clinical presentation patterns of endometriosis subtypes, showcasing the innovative approach employed to investigate this complex disease.
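The Bonferroni threshold quoted above is the family-wise significance level divided by the number of loci tested. A minimal sketch of that calculation follows; the function names and the example p-values are hypothetical, and no p-values from the study are reproduced.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def passes_bonferroni(p_value, alpha, n_tests):
    """True if a single p-value is at or below the corrected threshold."""
    return p_value <= bonferroni_threshold(alpha, n_tests)


# 39 loci tested at a family-wise alpha of 0.05, as stated in the abstract.
threshold = bonferroni_threshold(0.05, 39)
print(f"{threshold:.1e}")

# Hypothetical p-values, invented for illustration only.
example_hit = passes_bonferroni(4.0e-4, 0.05, 39)
example_miss = passes_bonferroni(2.0e-3, 0.05, 39)
```

The unrounded threshold is roughly 0.00128, which rounds to the 1.3 × 10⁻³ given in the abstract.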
ABSTRACT
Endometriosis is a complex and heterogeneous condition affecting 10% of reproductive-age women, and yet, it often goes undiagnosed for several years. Limited observed heritability (7%) of large genetic association studies may be attributable to underlying heterogeneity of disease mechanisms. Therefore, we conducted this study to investigate genetic associations across sub-phenotypes of endometriosis. We performed unsupervised clustering of 4,078 women with endometriosis based on known endometriosis risk factors, symptoms, and concomitant conditions. The clusters were characterized by examining electronic health record (EHR) data and comprehensive chart reviews. We then performed genetic association for each cluster with 39 endometriosis-associated loci (Total N endometriosis cases = 12,350). We identified five sub-phenotype clusters: (1) pain comorbidities, (2) uterine disorders, (3) pregnancy complications, (4) cardiometabolic comorbidities, and (5) EHR-asymptomatic. Bonferroni significant loci included PDLIM5 for cluster 1, GREB1 for cluster 2, WNT4 for cluster 3, RNLS for cluster 4, and ABO for cluster 5. The difference in associations between the groups suggests complex and varied genetic mechanisms of endometriosis and its symptoms. This study enhances our understanding of the clinical patterns of endometriosis sub-phenotypes, showcasing the innovative approach employed to investigate this complex disease.