ABSTRACT
BACKGROUND: Persistent symptoms of SARS-CoV-2 infection are prevalent weeks to months after the infection. To date, it has been difficult to disentangle the direct effects of SARS-CoV-2 from the indirect ones, including lockdown, social, and economic factors. OBJECTIVE: The study aims to characterize the prevalence of symptoms, functional capacity, and quality of life at 12 months in symptomatic outpatients who tested positive for SARS-CoV-2 compared to those who tested negative. METHODS: From 23 April to 27 July 2021, symptomatic outpatients tested for SARS-CoV-2 at the Geneva University Hospitals were followed up 12 months after their test date. RESULTS: At 12 months, out of the 1447 participants (mean age 45.2 years, 61.2% women), 33.4% reported residual mild to moderate symptoms following SARS-CoV-2 infection compared to 6.5% in the control group. Symptoms included fatigue (16% vs. 3.1%), dyspnea (8.9% vs. 1.1%), headache (9.8% vs. 1.7%), insomnia (8.9% vs. 2.7%), and difficulty concentrating (7.4% vs. 2.5%). At 12 months, 30.5% of SARS-CoV-2-positive individuals reported functional impairment versus 6.6% in the control group. SARS-CoV-2 infection was associated with the persistence of symptoms (adjusted odds ratio [aOR] 4.1; 2.60-6.83) and functional impairment (aOR 3.54; 2.16-5.80) overall, and in subgroups of women, men, individuals younger than 40 years, those aged 40-59 years, and individuals with no past medical or psychiatric history. CONCLUSION: SARS-CoV-2 infection leads to persistent symptoms over several months, including in young healthy individuals, in addition to the pandemic effects, and potentially more than other common respiratory infections. Symptoms impact functional capacity up to 12 months post infection.
Subjects
COVID-19, SARS-CoV-2, COVID-19/epidemiology, Communicable Disease Control, Female, Humans, Male, Middle Aged, Pandemics, Quality of Life
ABSTRACT
Representing numeric values such as scalars is essential for accurately depicting clinical data. While the result value itself will always be represented using an integer, decimal, or other scalar format, it needs to be linked to its corresponding data element. In SNOMED CT, as in most other terminology systems, this is done through an attribute relationship. While some scalar values are already included in this way, they represent only a small fraction of the possibilities. Our intention is to expand the scope of scalar representation by validating new attributes using a previously established method. The result is a list of five attributes validated for local representation of scalar values, improving semantic representation and interoperability.
Subjects
Semantics, Systematized Nomenclature of Medicine, Humans, Electronic Health Records, Terminology as Topic
ABSTRACT
Patient-level similarity and clustering tasks based on data extracted from electronic health records suffer from the curse of dimensionality and the lack of inter-patient data comparability. Indeed, in many health institutions, the variables used to represent patients, and the ways of expressing them, far outnumber the patients sharing the same set of data. To lower redundancy and increase interoperability, one strategy is to map data to semantic-driven representations through medical knowledge graphs such as SNOMED-CT. However, patient similarity metrics based on this knowledge-graph information lack quantitative evaluation and comparison with purely data-driven methods. The reasons are twofold. First, it is hard to conceptually assess and formalize a gold-standard similarity between patients, which results in poor inter-annotator agreement in qualitative evaluations. Second, the community has lacked a clear benchmark to compare existing metrics developed by scientific communities from various fields such as ontology, data science, and medical informatics. This study addresses the known challenges of evaluating patient similarities by proposing SIMpat, a synthetic benchmark to quantitatively evaluate available metrics based on controlled cohorts, which could later be used to assess their sensitivity to aspects such as the sparsity of variables or the specificities of patient disease patterns.
Subjects
Benchmarking, Electronic Health Records, Humans, Systematized Nomenclature of Medicine, Semantics
ABSTRACT
Due to the complexity of the biomedical domain, the ability to capture semantically meaningful representations of terms in context is a long-standing challenge. Despite important progress in recent years, no benchmark has been developed to evaluate how well language models represent biomedical concepts according to their context. Inspired by the Word-in-Context (WiC) benchmark, in which word sense disambiguation is reformulated as a binary classification task, we propose a novel dataset, BioWiC, to evaluate the ability of language models to encode biomedical terms in context. BioWiC comprises 20,156 instances, covering over 7,400 unique biomedical terms, making it the largest WiC dataset in the biomedical domain. We evaluate BioWiC both intrinsically and extrinsically and show that it could be used as a reliable benchmark for evaluating context-dependent embeddings in biomedical corpora. In addition, we conduct several experiments using a variety of discriminative and generative large language models to establish robust baselines that can serve as a foundation for future research.
Subjects
Natural Language Processing, Semantics, Language
ABSTRACT
Objectives: This study explores Artificial Intelligence and Natural Language Processing techniques to support the automatic assignment of the four Response Evaluation Criteria in Solid Tumors (RECIST) scales based on radiology reports. We also aim to evaluate how the languages and institutional specificities of Swiss teaching hospitals are likely to affect the quality of the classification in French and German. Methods: In our approach, 7 machine learning methods were evaluated to establish a strong baseline. Then, robust models were built, fine-tuned according to the language (French and German), and compared with the expert annotation. Results: The best strategies yield average F1-scores of 90% and 86%, respectively, for the two-class (Progressive/Non-progressive) and the four-class (Progressive Disease, Stable Disease, Partial Response, Complete Response) RECIST classification tasks. Conclusions: These results are competitive with manual labeling as measured by the Matthews correlation coefficient and Cohen's Kappa (79% and 76%). On this basis, we confirm the capacity of specific models to generalize to new unseen data and we assess the impact of using Pre-trained Language Models (PLMs) on the accuracy of the classifiers.
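For readers unfamiliar with the agreement measures named in this abstract, the following is a minimal sketch of how the F1-score, the Matthews correlation coefficient, and Cohen's Kappa can be derived from a two-class confusion matrix. The function name `binary_agreement` and the counts in the usage lines are hypothetical illustrations; they are not taken from the study, and the authors' actual evaluation code is not described in the abstract.

```python
from math import sqrt


def binary_agreement(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute F1, Matthews correlation coefficient, and Cohen's kappa
    from the four cells of a two-class confusion matrix.

    Hypothetical helper for illustration only; assumes every marginal
    total is non-zero so that no denominator vanishes.
    """
    n = tp + fp + fn + tn

    # F1 is the harmonic mean of precision and recall.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)

    # MCC uses all four cells of the confusion matrix.
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (observed - expected) / (1 - expected)

    return {"f1": f1, "mcc": mcc, "kappa": kappa}


# Made-up counts, e.g. "progressive" as the positive class.
scores = binary_agreement(tp=40, fp=10, fn=10, tn=40)
print(scores)
```

With balanced made-up counts such as these, the Matthews correlation coefficient and Cohen's Kappa coincide; on imbalanced data they generally differ, which is one reason to report both.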
ABSTRACT
In spring 2020, as the first wave of the COVID-19 pandemic hit Europe, the University Hospitals of Geneva (HUG) were tasked with caring for all COVID-19 inpatients in the canton of Geneva. The crisis offered very few tools to support decision-making authorities, and very little was known about the disease. The need to know more, and fast, highlighted numerous challenges across the whole data pipeline. This paper describes the decisions taken and processes developed to build a unified database supporting several secondary uses of clinical data, including governance and research. HUG has had to respond to 5 major waves of COVID-19 patients since the beginning of 2020. In this context, a database of COVID-19-related data was created to support the hospital's governance in its response to the crisis. The principles behind this database were a) a clearly defined cohort; b) a clearly defined dataset; and c) clearly defined semantics. This approach resulted in more than 28 000 variables encoded in SNOMED CT and 1 540 human-readable labels. It covers more than 216 000 patients and 590 000 inpatient stays. The database has been used daily since the beginning of the pandemic to feed the "Predict" dashboards of HUG and prediction reports, as well as several research projects.
Subjects
COVID-19, Systematized Nomenclature of Medicine, Databases, Factual, Humans, Pandemics, Semantics
ABSTRACT
The present study presents first attempts to automatically classify oncology treatment responses according to the RECIST classification, based on the textual conclusion sections of radiology reports. After a robust and extended manual annotation of 543 conclusion sections (5 to 50 words long), and after training several machine learning techniques (from traditional machine learning to deep learning), the best results show an accuracy of 0.90 for a two-class classification (non-progressive vs. progressive disease) and of 0.82 for a four-class classification (complete response, partial response, stable disease, progressive disease), both with a logistic regression approach. Some innovative solutions are further suggested to improve these scores in the future.
Subjects
Radiology, Machine Learning, Natural Language Processing, Radiography, Research Report, Supervised Machine Learning
ABSTRACT
Healthcare workers have potentially been among the most exposed to SARS-CoV-2 infection as well as to the deleterious toll of the pandemic. This study aims to differentiate the pandemic toll from post-acute sequelae of SARS-CoV-2 infection in healthcare workers compared to the general population. The study was conducted between April and July 2021 at the Geneva University Hospitals, Switzerland. Eligible participants were all tested staff and outpatients tested for SARS-CoV-2 at the same hospital. The primary outcome was the prevalence of symptoms in healthcare workers compared to the general population, with measures of COVID-related symptoms and functional impairment, using prevalence estimates and multivariable logistic regression models. Healthcare workers (n = 3083) suffered mostly from fatigue (25.5%), headache (10.0%), difficulty concentrating (7.9%), exhaustion/burnout (7.1%), insomnia (6.2%), myalgia (6.7%) and arthralgia (6.3%). Regardless of SARS-CoV-2 infection, all symptoms were significantly more prevalent in healthcare workers than in the general population (n = 3556). SARS-CoV-2 infection in healthcare workers was associated with loss or change in smell, loss or change in taste, palpitations, dyspnea, difficulty concentrating, fatigue, and headache. Functional impairment was more significant in healthcare workers compared to the general population (aOR 2.28; 1.76-2.96), with a positive association with SARS-CoV-2 infection (aOR 3.81; 2.59-5.60). Symptoms and functional impairment in healthcare workers were increased compared to the general population, and potentially related to the pandemic toll as well as to post-acute sequelae of SARS-CoV-2 infection. These findings are of concern, considering the essential role of healthcare workers in caring for all patients, including and beyond those with COVID-19.