Results 1 - 7 of 7
1.
Ophthalmol Sci ; 3(3): 100293, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37193316

ABSTRACT

Purpose: Diabetic retinopathy (DR) is the most common microvascular complication associated with diabetes mellitus (DM), affecting approximately 40% of this patient population. Early detection of DR is vital to ensure monitoring of disease progression and prompt sight-saving treatments as required. This article describes the data contained within the INSIGHT Birmingham, Solihull, and Black Country Diabetic Retinopathy Dataset. Design: Dataset descriptor for routinely collected eye screening data. Participants: All diabetic patients aged 12 years and older, attending annual digital retinal photography-based screening within the Birmingham, Solihull, and Black Country Eye Screening Programme. Methods: The INSIGHT Health Data Research Hub for Eye Health is a National Health Service (NHS)-led ophthalmic bioresource that provides researchers with safe access to anonymized, routinely collected data from contributing NHS hospitals to advance research for patient benefit. This report describes the INSIGHT Birmingham, Solihull, and Black Country DR Screening Dataset, a dataset of anonymized images and linked screening data derived from the United Kingdom's largest regional DR screening program. Main Outcome Measures: This dataset consists of routinely collected data from the eye screening program. The data primarily include retinal photographs with the associated DR grading data. Additional data such as corresponding demographic details, information regarding patients' diabetic status, and visual acuity data are also available. Further details regarding available data points are available in the supplementary information, in addition to the INSIGHT webpage included below. Results: At the time point of this analysis (December 31, 2019), the dataset comprised 6 202 161 images from 246 180 patients, with a dataset inception date of January 1, 2007. The dataset includes 1 360 547 grading episodes between R0M0 and R3M1. Conclusions: This dataset descriptor article summarizes the content of the dataset, how it has been curated, and what its potential uses are. Data are available through a structured application process for research studies that support discovery, clinical evidence analyses, and innovation in artificial intelligence technologies for patient benefit. Further information regarding the data repository and contact details can be found at https://www.insight.hdrhub.org/. Financial Disclosures: Proprietary or commercial disclosure may be found after the references.

2.
Comput Biol Med ; 153: 106425, 2023 02.
Article in English | MEDLINE | ID: mdl-36638616

ABSTRACT

Annotation of biomedical entities with ontology classes provides for formal semantic analysis and mobilisation of background knowledge in determining their relationships. To date, enrichment analysis has been routinely employed to identify classes that are over-represented in annotations across sets of groups, such as biosample gene expression profiles or patient phenotypes, and is useful for a range of tasks including differential diagnosis and causative variant prioritisation. These approaches, however, usually consider only univariate relationships, make limited use of the semantic features of ontologies, and provide limited information and evaluation of the explanatory power of both singular and grouped candidate classes. Moreover, they are not designed to solve the problem of deriving cohesive, characteristic, and discriminatory sets of classes for entity groups. We have developed a new tool, called Klarigi, which introduces multiple scoring heuristics for identification of classes that are both compositional and discriminatory for groups of entities annotated with ontology classes. The tool includes a novel algorithm for derivation of multivariable semantic explanations for entity groups, makes use of semantic inference through live use of an ontology reasoner, and includes a classification method for identifying the discriminatory power of candidate sets, in addition to significance testing apposite to traditional enrichment approaches. We describe the design and implementation of Klarigi, including its scoring and explanation determination methods, and evaluate its use in application to two test cases with clinical significance, comparing and contrasting methods and results with literature-based and enrichment analysis methods. We demonstrate that Klarigi produces characteristic and discriminatory explanations for groups of biomedical entities in two settings. We also show that these explanations recapitulate and extend the knowledge held in existing biomedical databases and literature for several diseases. We conclude that Klarigi provides a distinct and valuable perspective on biomedical datasets when compared with traditional enrichment methods, and therefore constitutes a new method by which biomedical datasets can be explored, contributing to improved insight into semantic data.


Subject(s)
Biological Ontologies , Semantics , Algorithms , Phenotype , Databases, Factual
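
Editorial illustration (not from the article): the abstract above contrasts Klarigi with traditional enrichment analysis, which tests whether an ontology class is over-represented in a group. A minimal sketch of that baseline, assuming a one-sided hypergeometric test, is given here. It does not reproduce Klarigi's scoring heuristics, which the abstract does not specify. The function name and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, group_size, group_annotated):
    """One-sided hypergeometric p-value for over-representation.

    population      -- total number of entities
    annotated       -- entities in the population annotated with the class
    group_size      -- number of entities in the group of interest
    group_annotated -- entities in the group annotated with the class

    Returns P(X >= group_annotated) when the group is drawn at random,
    without replacement, from the population.
    """
    upper = min(annotated, group_size)
    total = comb(population, group_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, group_size - k)
        for k in range(group_annotated, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 entities, 5 carry the class, and 4 of those 5
# fall inside a group of 5.
p_value = hypergeom_enrichment_p(20, 5, 5, 4)
```

A small p-value indicates that the class appears in the group more often than random sampling would suggest. This is a univariate test, the limitation the abstract attributes to enrichment approaches.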
3.
iScience ; 25(7): 104480, 2022 Jul 15.
Article in English | MEDLINE | ID: mdl-35665240

ABSTRACT

Clinical outcomes for patients with COVID-19 are heterogeneous and there is interest in defining subgroups for prognostic modeling and development of treatment algorithms. We obtained 28 demographic and laboratory variables in patients admitted to hospital with COVID-19. These comprised a training cohort (n = 6099) and two validation cohorts during the first and second waves of the pandemic (n = 996; n = 1011). Uniform manifold approximation and projection (UMAP) dimension reduction and Gaussian mixture model (GMM) analysis were used to define patient clusters. Twenty-nine clusters were defined in the training cohort and associated with markedly different mortality rates, which were predictive within confirmation datasets. Deconvolution of clinical features within clusters identified unexpected relationships between variables. Integration of large datasets using UMAP-assisted clustering can therefore identify patient subgroups with prognostic information and uncover unexpected interactions between clinical variables. This application of machine learning represents a powerful approach for delineating disease pathogenesis and potential therapeutic interventions.
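
Editorial illustration (not from the article): the clustering step named in the abstract above is a Gaussian mixture model. A minimal sketch of a GMM fitted by expectation-maximisation follows, assuming one-dimensional toy data and two components. The UMAP dimension-reduction step is deliberately omitted, and the function names, starting values, and data are hypothetical rather than the authors' pipeline.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(xs, k=2, iters=50):
    """Fit a k-component 1-D Gaussian mixture by expectation-maximisation."""
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic start: means at evenly spaced quantiles, pooled variance.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    grand_mean = sum(xs) / n
    pooled_var = sum((x - grand_mean) ** 2 for x in xs) / n
    variances = [pooled_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2
                         for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, 1e-6)  # floor avoids collapse to zero
    return weights, means, variances


def assign_cluster(x, weights, means, variances):
    """Index of the component with the highest weighted density at x."""
    dens = [w * normal_pdf(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    return max(range(len(dens)), key=dens.__getitem__)


# Hypothetical measurements forming two well-separated groups.
values = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
weights, means, variances = fit_gmm_1d(values)
```

Once fitted on a training set, `assign_cluster` can place unseen observations into the learned clusters, which mirrors in miniature how clusters defined in a training cohort can be applied to validation cohorts.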

4.
BMC Med Inform Decis Mak ; 22(1): 33, 2022 02 05.
Article in English | MEDLINE | ID: mdl-35123470

ABSTRACT

BACKGROUND: Semantic similarity is a valuable tool for analysis in biomedicine. When applied to phenotype profiles derived from clinical text, it has the capacity to enable and enhance 'patient-like me' analyses, automated coding, differential diagnosis, and outcome prediction. While a large body of work exists exploring the use of semantic similarity for multiple tasks, including protein interaction prediction and rare disease differential diagnosis, there is less work exploring comparison of patient phenotype profiles for clinical tasks. Moreover, there are no experimental explorations of optimal parameters or better methods in the area. METHODS: We develop a platform for reproducible benchmarking and comparison of experimental conditions for patient phenotype similarity. Using the platform, we evaluate the task of ranking shared primary diagnosis from uncurated phenotype profiles derived from all text narrative associated with admissions in the Medical Information Mart for Intensive Care (MIMIC-III). RESULTS: A total of 300 semantic similarity configurations were evaluated, as well as one embedding-based approach. On average, measures that did not make use of an external information content measure performed slightly better; however, the best-performing configurations when measured by area under receiver operating characteristic curve and Top Ten Accuracy used term-specificity and annotation-frequency measures. CONCLUSION: We identified and interpreted the performance of a large number of semantic similarity configurations for the task of classifying diagnosis from text-derived phenotype profiles in one setting. We also provided a basis for further research on other settings and related tasks in the area.


Subject(s)
Rare Diseases , Semantics , Humans , Phenotype , ROC Curve
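
Editorial illustration (not from the article): one common semantic similarity configuration combines an annotation-frequency information content measure with Resnik pairwise similarity and a best-match average over two phenotype profiles. A minimal sketch follows, assuming a toy ontology and corpus. The term names, the corpus, and the choice of this particular configuration are hypothetical and are not taken from the 300 configurations the abstract evaluates.

```python
import math

# Hypothetical toy ontology: each term maps to its direct parents.
PARENTS = {
    "Phenotype": set(),
    "Cardiac": {"Phenotype"},
    "Respiratory": {"Phenotype"},
    "Arrhythmia": {"Cardiac"},
    "HeartFailure": {"Cardiac"},
    "Dyspnea": {"Respiratory"},
}


def ancestors(term):
    """The term itself plus every term reachable through parent links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(profiles):
    """Annotation-frequency IC: -log of the share of profiles covering a term."""
    n = len(profiles)
    counts = {term: 0 for term in PARENTS}
    for profile in profiles:
        covered = set().union(*(ancestors(t) for t in profile))
        for term in covered:
            counts[term] += 1
    return {t: -math.log(c / n) for t, c in counts.items() if c > 0}


def resnik(a, b, ic):
    """IC of the most informative common ancestor (unseen terms count as 0)."""
    return max(ic.get(t, 0.0) for t in ancestors(a) & ancestors(b))


def best_match_average(p1, p2, ic):
    """Average, over both directions, of each term's best match in the other profile."""
    def directed(src, dst):
        return sum(max(resnik(s, d, ic) for d in dst) for s in src) / len(src)
    return (directed(p1, p2) + directed(p2, p1)) / 2


# Hypothetical corpus of four patient phenotype profiles.
corpus = [
    {"Arrhythmia"},
    {"HeartFailure", "Dyspnea"},
    {"Dyspnea"},
    {"Arrhythmia", "HeartFailure"},
]
ic = information_content(corpus)
cardiac_pair = best_match_average(corpus[0], corpus[3], ic)
mixed_pair = best_match_average(corpus[0], corpus[2], ic)
```

Ranking candidate patients by such a score is the kind of task the abstract benchmarks. Swapping `resnik` or `best_match_average` for other measures yields a different configuration.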
5.
Comput Biol Med ; 138: 104904, 2021 11.
Article in English | MEDLINE | ID: mdl-34600327

ABSTRACT

Identification of ontology concepts in clinical narrative text enables the creation of phenotype profiles that can be associated with clinical entities, such as patients or drugs. Constructing patient phenotype profiles using formal ontologies enables their analysis via semantic similarity, in turn enabling the use of background knowledge in clustering or classification analyses. However, traditional semantic similarity approaches collapse complex relationships between patient phenotypes into a unitary similarity score for each pair of patients. Moreover, single scores may be based only on matching terms with the greatest information content (IC), ignoring other dimensions of patient similarity. This process necessarily leads to a loss of information in the resulting representation of patient similarity, and is especially apparent when using very large text-derived and highly multi-morbid phenotype profiles. Moreover, it renders finding a biological explanation for similarity very difficult: the black box problem. In this article, we explore the generation of multiple semantic similarity scores for patients based on different facets of their phenotypic manifestation, which we define through different sub-graphs in the Human Phenotype Ontology. We further present a new methodology for deriving sets of qualitative class descriptions for groups of entities described by ontology terms. Leveraging this strategy to obtain meaningful explanations for our semantic clusters alongside other evaluation techniques, we show that semantic clustering with ontology-derived facets enables the representation, and thus identification of, clinically relevant phenotype relationships not easily recoverable using overall clustering alone. In this way, we demonstrate the potential of faceted semantic clustering for gaining a deeper and more nuanced understanding of text-derived patient phenotypes.


Subject(s)
Semantics , Cluster Analysis , Humans , Phenotype
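
Editorial illustration (not from the article): the idea of producing one similarity score per facet, rather than a single unitary score, can be sketched as follows. The sketch assumes a toy ontology, facets defined by hypothetical root terms, and Jaccard similarity over ancestor-closed term sets. The article itself defines facets through sub-graphs of the Human Phenotype Ontology, and its similarity measure is not stated in the abstract.

```python
# Hypothetical toy ontology: each term maps to its direct parents.
PARENTS = {
    "Phenotype": set(),
    "Cardiovascular": {"Phenotype"},
    "Respiratory": {"Phenotype"},
    "Arrhythmia": {"Cardiovascular"},
    "Hypertension": {"Cardiovascular"},
    "Dyspnea": {"Respiratory"},
    "Cough": {"Respiratory"},
}


def ancestors(term):
    """The term itself plus every term reachable through parent links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def facet_terms(profile, facet_root):
    """Ancestor closure of a profile, restricted to the sub-graph under facet_root."""
    closure = set()
    for term in profile:
        closure |= ancestors(term)
    return {t for t in closure if facet_root in ancestors(t)}


def jaccard(a, b):
    """Jaccard similarity, or None when neither set has any terms."""
    if not a and not b:
        return None
    return len(a & b) / len(a | b)


def faceted_similarity(p1, p2, facet_roots):
    """One similarity score per facet instead of a single overall score."""
    return {
        root: jaccard(facet_terms(p1, root), facet_terms(p2, root))
        for root in facet_roots
    }


# Hypothetical patients: same respiratory phenotype, different cardiovascular one.
patient_a = {"Arrhythmia", "Dyspnea"}
patient_b = {"Hypertension", "Dyspnea"}
scores = faceted_similarity(patient_a, patient_b, ["Cardiovascular", "Respiratory"])
```

Here the two patients are identical in the respiratory facet and only partly similar in the cardiovascular facet, a distinction that a single overall score would blur.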
6.
BMJ Health Care Inform ; 28(1)2021 Apr.
Article in English | MEDLINE | ID: mdl-33849921

ABSTRACT

INTRODUCTION: Health Data Research UK designated seven UK-based Hubs to facilitate health data use for research. PIONEER is the Hub in Acute Care. PIONEER delivered workshops where patients/public citizens agreed key principles to guide access to unconsented, anonymised, routinely collected health data. These were used to inform the protocol. METHODS: This paper describes the PIONEER infrastructure and data access processes. PIONEER is a research database and analytical environment that links routinely collected health data across community, ambulance and hospital healthcare providers. PIONEER aims ultimately to improve patient health and care, by making health data discoverable and accessible for research by National Health Service, academic and commercial organisations. The PIONEER protocol incorporates principles identified in the public/patient workshops. This includes all data access requests being reviewed by the Data Trust Committee, a group of public citizens who advise on whether requests should be supported prior to licensed access. ETHICS AND DISSEMINATION: East Midlands-Derby REC (20/EM/0158): Confidentiality Advisory Group (20/CAG/0084). www.PIONEERdatahub.co.uk.


Subject(s)
Critical Care , Databases, Factual , State Medicine , Critical Care/methods , Databases, Factual/standards , Humans , Research Design , State Medicine/organization & administration , State Medicine/statistics & numerical data , United Kingdom