1.
Front Med (Lausanne) ; 11: 1354070, 2024.
Article in English | MEDLINE | ID: mdl-38686369

ABSTRACT

Introduction: The echocardiographic measurement of left ventricular ejection fraction (LVEF) is fundamental to the diagnosis and classification of patients with heart failure (HF). Methods: This paper aimed to quantify LVEF automatically and accurately with the proposed pipeline method based on deep neural networks and ensemble learning. Within the pipeline, an Atrous Convolutional Neural Network (ACNN) was first trained to segment the left ventricle (LV), before employing the area-length formulation based on the ellipsoid single-plane model to calculate LVEF values. This formulation required inputs of LV area, derived from segmentation using an improved Jeffrey's method, as well as LV length, derived from a novel ensemble learning model. To further improve the pipeline's accuracy, an automated peak detection algorithm was used to identify end-diastolic and end-systolic frames, avoiding issues with human error. Subsequently, single-beat LVEF values were averaged across all cardiac cycles to obtain the final LVEF. Results: This method was developed and internally validated in an open-source dataset containing 10,030 echocardiograms. The Pearson's correlation coefficient was 0.83 for LVEF prediction compared to expert human analysis (p < 0.001), with a subsequent area under the receiver operating characteristic curve (AUROC) of 0.98 (95% confidence interval 0.97 to 0.99) for categorisation of HF with reduced ejection fraction (HFrEF; LVEF<40%). In an external dataset with 200 echocardiograms, this method achieved an AUC of 0.90 (95% confidence interval 0.88 to 0.91) for HFrEF assessment. Conclusion: The automated neural network-based calculation of LVEF is comparable to expert clinicians performing time-consuming, frame-by-frame manual evaluations of cardiac systolic function.
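
The area-length step in this abstract can be illustrated with a minimal sketch. It assumes the standard single-plane area-length formula, V = 8A²/(3πL), and takes the end-diastolic and end-systolic areas and lengths of each beat as given. The function names and sample measurements are hypothetical and are not taken from the paper's pipeline.

```python
import math

def area_length_volume(area_cm2, length_cm):
    """Single-plane area-length LV volume (mL): V = 8 * A^2 / (3 * pi * L)."""
    return 8.0 * area_cm2 ** 2 / (3.0 * math.pi * length_cm)

def beat_lvef(ed_area, ed_length, es_area, es_length):
    """LVEF (%) for one beat from end-diastolic and end-systolic measurements."""
    edv = area_length_volume(ed_area, ed_length)
    esv = area_length_volume(es_area, es_length)
    return 100.0 * (edv - esv) / edv

def mean_lvef(beats):
    """Average the single-beat LVEF values across all cardiac cycles."""
    values = [beat_lvef(*beat) for beat in beats]
    return sum(values) / len(values)

# Hypothetical measurements per beat: (ED area, ED length, ES area, ES length)
beats = [(30.0, 8.0, 20.0, 7.0), (31.0, 8.1, 19.5, 7.0)]
print(round(mean_lvef(beats), 1))
```

Averaging per-beat values, rather than pooling areas first, mirrors the abstract's description of averaging single-beat LVEF across cycles.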

2.
Comput Biol Med ; 153: 106425, 2023 02.
Article in English | MEDLINE | ID: mdl-36638616

ABSTRACT

Annotation of biomedical entities with ontology classes provides for formal semantic analysis and mobilisation of background knowledge in determining their relationships. To date, enrichment analysis has been routinely employed to identify classes that are over-represented in annotations across sets of groups, such as biosample gene expression profiles or patient phenotypes, and is useful for a range of tasks including differential diagnosis and causative variant prioritisation. These approaches, however, usually consider only univariate relationships, make limited use of the semantic features of ontologies, and provide limited information and evaluation of the explanatory power of both singular and grouped candidate classes. Moreover, they are not designed to solve the problem of deriving cohesive, characteristic, and discriminatory sets of classes for entity groups. We have developed a new tool, called Klarigi, which introduces multiple scoring heuristics for identification of classes that are both compositional and discriminatory for groups of entities annotated with ontology classes. The tool includes a novel algorithm for derivation of multivariable semantic explanations for entity groups, makes use of semantic inference through live use of an ontology reasoner, and includes a classification method for identifying the discriminatory power of candidate sets, in addition to significance testing apposite to traditional enrichment approaches. We describe the design and implementation of Klarigi, including its scoring and explanation determination methods, and evaluate its use in application to two test cases with clinical significance, comparing and contrasting methods and results with literature-based and enrichment analysis methods. We demonstrate that Klarigi produces characteristic and discriminatory explanations for groups of biomedical entities in two settings. 
We also show that these explanations recapitulate and extend the knowledge held in existing biomedical databases and literature for several diseases. We conclude that Klarigi provides a distinct and valuable perspective on biomedical datasets when compared with traditional enrichment methods, and therefore constitutes a new method by which biomedical datasets can be explored, contributing to improved insight into semantic data.
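
As a rough illustration of what scoring a class as both characteristic of a group and discriminatory against other entities might look like, the sketch below computes two simple proportions. The score names, their definitions, and the toy annotations are assumptions made for illustration; the abstract does not specify Klarigi's actual heuristics, and no ontology reasoner is involved here.

```python
def score_class(cls, group, others):
    """Score one ontology class for a group of annotated entities.

    group, others: lists of sets of class labels, one set per entity.
    inclusion: fraction of group entities annotated with the class.
    exclusion: fraction of non-group entities NOT annotated with it.
    """
    inclusion = sum(cls in entity for entity in group) / len(group)
    exclusion = 1.0 - sum(cls in entity for entity in others) / len(others)
    return inclusion, exclusion

# Toy annotations with hypothetical class labels
group = [{"cardiomegaly", "dyspnoea"}, {"cardiomegaly"}]
others = [{"dyspnoea"}, {"rash"}]

for cls in ("cardiomegaly", "dyspnoea"):
    print(cls, score_class(cls, group, others))
```

A class that scores highly on both proportions covers most of the group while being rare elsewhere, which is the intuition behind preferring it in an explanation.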


Subject(s)
Biological Ontologies; Semantics; Algorithms; Phenotype; Databases, Factual
3.
Eur Heart J ; 44(9): 713-725, 2023 03 01.
Article in English | MEDLINE | ID: mdl-36629285

ABSTRACT

Artificial intelligence (AI) is increasingly being utilized in healthcare. This article provides clinicians and researchers with a step-wise foundation for high-value AI that can be applied to a variety of different data modalities. The aim is to improve the transparency and application of AI methods, with the potential to benefit patients in routine cardiovascular care. Following a clear research hypothesis, an AI-based workflow begins with data selection and pre-processing prior to analysis, with the type of data (structured, semi-structured, or unstructured) determining what type of pre-processing steps and machine-learning algorithms are required. Algorithmic and data validation should be performed to ensure the robustness of the chosen methodology, followed by an objective evaluation of performance. Seven case studies are provided to highlight the wide variety of data modalities and clinical questions that can benefit from modern AI techniques, with a focus on applying them to cardiovascular disease management. Despite the growing use of AI, further education for healthcare workers, researchers, and the public is needed to aid understanding of how AI works and to close the existing gap in knowledge. In addition, issues regarding data access, sharing, and security must be addressed to ensure full engagement by patients and the public. The application of AI within healthcare provides an opportunity for clinicians to deliver a more personalized approach to medical care by accounting for confounders, interactions, and the rising prevalence of multi-morbidity.


Subject(s)
Artificial Intelligence; Cardiovascular System; Humans; Algorithms; Machine Learning; Delivery of Health Care
4.
NPJ Digit Med ; 5(1): 186, 2022 Dec 21.
Article in English | MEDLINE | ID: mdl-36544046

ABSTRACT

Much of the knowledge and information needed for enabling high-quality clinical research is stored in free-text format. Natural language processing (NLP) has been used to extract information from these sources at scale for several decades. This paper aims to present a comprehensive review of clinical NLP for the past 15 years in the UK to identify the community, depict its evolution, analyse methodologies and applications, and identify the main barriers. We collect a dataset of clinical NLP projects (n = 94; £ = 41.97 m) funded by UK funders or the European Union's funding programmes. Additionally, we extract details on 9 funders, 137 organisations, 139 persons and 431 research papers. Networks are created from timestamped data interlinking all entities, and network analysis is subsequently applied to generate insights. 431 publications are identified as part of a literature review, of which 107 are eligible for final analysis. Results show, not surprisingly, that clinical NLP in the UK has increased substantially in the last 15 years: the total budget in the period of 2019-2022 was 80 times that of 2007-2010. However, effort is required to deepen areas such as disease (sub-)phenotyping and broaden application domains. There is also a need to improve links between academia and industry and enable deployments in real-world settings for the realisation of clinical NLP's great potential in care delivery. The major barriers include research and development access to hospital data, lack of capable computational resources in the right places, the scarcity of labelled data and barriers to sharing of pretrained models.

5.
Sci Rep ; 12(1): 13094, 2022 Jul 30.
Article in English | MEDLINE | ID: mdl-35908043

ABSTRACT

In the extensive search for new physics, the precise measurement of the Higgs boson continues to play an important role. To this end, machine learning techniques have been recently applied to processes like the Higgs production via vector-boson fusion. In this paper, we propose to use algorithms for learning to rank, i.e., to rank events into a sorting order, first signal, then background, instead of algorithms for the classification into two classes, for this task. The fact that training is then performed on pairwise comparisons of signal and background events can effectively increase the amount of training data due to the quadratic number of possible combinations. This makes it robust to unbalanced data set scenarios and can improve the overall performance compared to pointwise models like the state-of-the-art boosted decision tree approach. In this work we compare our pairwise neural network algorithm, which is a combination of a convolutional neural network and the DirectRanker, with convolutional neural networks, multilayer perceptrons or boosted decision trees, which are commonly used algorithms in multiple Higgs production channels. Furthermore, we use so-called transfer learning techniques to improve overall performance on different data types.
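
The claim that pairwise training enlarges the effective training set can be made concrete with a small sketch. It builds every signal/background pair in both orderings, so 3 signal and 100 background events (103 pointwise examples) yield 600 pairwise examples. The event features and the function name are hypothetical; this is not the authors' DirectRanker implementation.

```python
from itertools import product

def pairwise_examples(signal, background):
    """Build (first, second, label) training pairs from two event lists.

    Each signal/background combination yields two examples, one per
    ordering, so the set grows with len(signal) * len(background).
    """
    examples = []
    for s, b in product(signal, background):
        examples.append((s, b, +1))  # signal listed first: should rank higher
        examples.append((b, s, -1))  # mirrored pair with the opposite label
    return examples

signal = [[0.9, 1.2], [1.1, 0.8], [1.0, 1.0]]       # 3 toy signal events
background = [[0.1 * i, 0.2] for i in range(100)]   # 100 toy background events
pairs = pairwise_examples(signal, background)
print(len(signal) + len(background), len(pairs))
```

Because every rare-class event is paired with every common-class event, each pair contains exactly one example of each class, which is one way to see why the approach tolerates unbalanced data.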

6.
JAMA Psychiatry ; 79(5): 498-507, 2022 05 01.
Article in English | MEDLINE | ID: mdl-35353173

ABSTRACT

Importance: Previous in vitro and postmortem research suggests that inflammation may lead to structural brain changes via activation of microglia and/or astrocytic dysfunction in a range of neuropsychiatric disorders. Objective: To investigate the relationship between inflammation and changes in brain structures in vivo and to explore a transcriptome-driven functional basis with relevance to mental illness. Design, Setting, and Participants: This study used multistage linked analyses, including mendelian randomization (MR), gene expression correlation, and connectivity analyses. The study included 20 688 participants in the UK Biobank, which includes clinical, genomic, and neuroimaging data, and 6 postmortem brains from neurotypical individuals in the Allen Human Brain Atlas (AHBA), which includes RNA microarray data. Data were extracted in February 2021 and analyzed between March and October 2021. Exposures: Genetic variants regulating levels and activity of circulating interleukin 1 (IL-1), IL-2, IL-6, C-reactive protein (CRP), and brain-derived neurotrophic factor (BDNF) were used as exposures in MR analyses. Main Outcomes and Measures: Brain imaging measures, including gray matter volume (GMV) and cortical thickness (CT), were used as outcomes. Associations were considered significant at a multiple testing-corrected threshold of P < 1.1 × 10⁻⁴. Differential gene expression in AHBA data was modeled in brain regions mapped to areas significant in MR analyses; genes were tested for biological and disease overrepresentation in annotation databases and for connectivity in protein-protein interaction networks. Results: Of 20 688 participants in the UK Biobank sample, 10 828 (52.3%) were female, and the mean (SD) age was 55.5 (7.5) years.
In the UK Biobank sample, genetically predicted levels of IL-6 were associated with GMV in the middle temporal cortex (z score, 5.76; P = 8.39 × 10⁻⁹), inferior temporal (z score, 3.38; P = 7.20 × 10⁻⁵), fusiform (z score, 4.70; P = 2.60 × 10⁻⁷), and frontal (z score, -3.59; P = 3.30 × 10⁻⁵) cortex together with CT in the superior frontal region (z score, -5.11; P = 3.22 × 10⁻⁷). No significant associations were found for IL-1, IL-2, CRP, or BDNF after correction for multiple comparison. In the AHBA sample, 5 of 6 participants (83%) were male, and the mean (SD) age was 42.5 (13.4) years. Brain-wide coexpression analysis showed a highly interconnected network of genes preferentially expressed in the middle temporal gyrus (MTG), which further formed a highly connected protein-protein interaction network with IL-6 (enrichment test of expected vs observed network given the prevalence and degree of interactions in the STRING database: 43 nodes/30 edges observed vs 8 edges expected; mean node degree, 1.4; genome-wide significance, P = 4.54 × 10⁻⁹). MTG differentially expressed genes were functionally enriched for biological processes in schizophrenia, autism spectrum disorder, and epilepsy. Conclusions and Relevance: In this study, genetically determined IL-6 was associated with brain structure and potentially affects areas implicated in developmental neuropsychiatric disorders, including schizophrenia and autism.


Subject(s)
Autism Spectrum Disorder; Schizophrenia; Adult; Brain/diagnostic imaging; Brain-Derived Neurotrophic Factor/genetics; C-Reactive Protein/genetics; Female; Genome-Wide Association Study; Humans; Inflammation/epidemiology; Inflammation/genetics; Interleukin-1/genetics; Interleukin-2/genetics; Interleukin-6/genetics; Magnetic Resonance Imaging; Male; Mendelian Randomization Analysis; Middle Aged; Schizophrenia/genetics
7.
BMC Med Inform Decis Mak ; 22(1): 33, 2022 02 05.
Article in English | MEDLINE | ID: mdl-35123470

ABSTRACT

BACKGROUND: Semantic similarity is a valuable tool for analysis in biomedicine. When applied to phenotype profiles derived from clinical text, it has the capacity to enable and enhance 'patient-like me' analyses, automated coding, differential diagnosis, and outcome prediction. While a large body of work exists exploring the use of semantic similarity for multiple tasks, including protein interaction prediction and rare disease differential diagnosis, there is less work exploring comparison of patient phenotype profiles for clinical tasks. Moreover, there are no experimental explorations of optimal parameters or better methods in the area. METHODS: We develop a platform for reproducible benchmarking and comparison of experimental conditions for patient phenotype similarity. Using the platform, we evaluate the task of ranking shared primary diagnosis from uncurated phenotype profiles derived from all text narrative associated with admissions in the medical information mart for intensive care (MIMIC-III). RESULTS: 300 semantic similarity configurations were evaluated, as well as one embedding-based approach. On average, measures that did not make use of an external information content measure performed slightly better; however, the best-performing configurations when measured by area under receiver operating characteristic curve and Top Ten Accuracy used term-specificity and annotation-frequency measures. CONCLUSION: We identified and interpreted the performance of a large number of semantic similarity configurations for the task of classifying diagnosis from text-derived phenotype profiles in one setting. We also provided a basis for further research on other settings and related tasks in the area.
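
To show what an annotation-frequency information content measure looks like in practice, the sketch below computes IC(t) = -log p(t) over a toy corpus and uses it in Resnik similarity, one widely used IC-based measure. The hierarchy, class names, and profiles are hypothetical, and the abstract does not say that this particular configuration was among those evaluated.

```python
import math
from collections import Counter

# Toy is-a hierarchy (class -> direct superclasses); hypothetical labels.
PARENTS = {
    "root": [],
    "cardiac": ["root"],
    "respiratory": ["root"],
    "arrhythmia": ["cardiac"],
    "heart_failure": ["cardiac"],
    "dyspnoea": ["respiratory"],
}

def ancestors(term):
    """Return the term together with all of its superclasses."""
    seen, stack = {term}, [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def information_content(profiles):
    """IC(t) = -log(fraction of profiles annotated with t or a subclass)."""
    counts = Counter()
    for profile in profiles:
        counts.update(set().union(*(ancestors(t) for t in profile)))
    return {t: -math.log(c / len(profiles)) for t, c in counts.items()}

def resnik(a, b, ic):
    """IC of the most informative common ancestor of two classes."""
    return max(ic.get(t, 0.0) for t in ancestors(a) & ancestors(b))

profiles = [{"arrhythmia"}, {"heart_failure"}, {"dyspnoea"},
            {"arrhythmia", "dyspnoea"}]
ic = information_content(profiles)
print(resnik("arrhythmia", "heart_failure", ic))
print(resnik("arrhythmia", "dyspnoea", ic))
```

Classes that share only the root get a similarity of zero, because the root annotates every profile and so carries no information.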


Subject(s)
Rare Diseases; Semantics; Humans; Phenotype; ROC Curve
8.
Regul Toxicol Pharmacol ; 128: 105089, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34861320

ABSTRACT

Respiratory irritation is an important human health endpoint in chemical risk assessment. There are two established modes of action of respiratory irritation: 1) sensory irritation mediated by the interaction with sensory neurons, potentially stimulating the trigeminal nerve, and 2) direct tissue irritation. The aim of our research was to develop a QSAR method to predict human respiratory irritants, and to potentially reduce the reliance on animal testing for the identification of respiratory irritants. Compounds are classified as irritating based on combined evidence from different types of toxicological data, including inhalation studies with acute and repeated exposure. The curated project database comprised 1997 organic substances, 1553 being classified as irritating and 444 as non-irritating. A comparison of machine learning approaches, including Logistic Regression (LR), Random Forests (RFs), and Gradient Boosted Decision Trees (GBTs), showed that the best classification was obtained by GBTs. The LR model resulted in an area under the curve (AUC) of 0.65, while the optimal performance for both RFs and GBTs gave an AUC of 0.71. In addition to the classification and the information on the applicability domain, the web-based tool provides a list of structurally similar analogues together with their experimental data to facilitate expert review for read-across purposes.


Subject(s)
Irritants/chemistry; Machine Learning; Quantitative Structure-Activity Relationship; Respiratory System/drug effects; Administration, Inhalation; Animal Testing Alternatives/methods; Risk Assessment
9.
Front Digit Health ; 3: 781227, 2021.
Article in English | MEDLINE | ID: mdl-34939069

ABSTRACT

Semantic similarity is a useful approach for comparing patient phenotypes, and holds the potential of an effective method for exploiting text-derived phenotypes for differential diagnosis, text and document classification, and outcome prediction. While approaches for context disambiguation are commonly used in text mining applications, forming a standard component of information extraction pipelines, their effects on semantic similarity calculations have not been widely explored. In this work, we evaluate how inclusion and exclusion of negated and uncertain mentions of concepts from text-derived phenotypes affect similarity of patients, and the use of those profiles to predict diagnosis. We report on the effectiveness of these approaches and observe a very small, yet significant, improvement in performance when classifying primary diagnosis over MIMIC-III patient visits.
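
The effect of context disambiguation on similarity can be seen in a toy example. The sketch below builds a patient profile with or without negated and uncertain mentions and compares two patients with Jaccard similarity, used here as a simple stand-in for the semantic similarity measures in the paper. The mention statuses, term labels, and function names are hypothetical.

```python
def build_profile(mentions, include_negated=False, include_uncertain=False):
    """Collect terms from (term, status) mentions, filtered by status."""
    keep = {"affirmed"}
    if include_negated:
        keep.add("negated")
    if include_uncertain:
        keep.add("uncertain")
    return {term for term, status in mentions if status in keep}

def jaccard(a, b):
    """Set overlap in [0, 1]; defined as 0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Patient 1 has "no cough" in the notes; patient 2 does have a cough.
patient_1 = [("fever", "affirmed"), ("cough", "negated")]
patient_2 = [("fever", "affirmed"), ("cough", "affirmed")]

naive = jaccard(build_profile(patient_1, include_negated=True),
                build_profile(patient_2, include_negated=True))
filtered = jaccard(build_profile(patient_1), build_profile(patient_2))
print(naive, filtered)
```

Keeping the negated mention makes the two patients look identical, while excluding it exposes the difference between them.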

10.
Comput Biol Med ; 138: 104904, 2021 11.
Article in English | MEDLINE | ID: mdl-34600327

ABSTRACT

Identification of ontology concepts in clinical narrative text enables the creation of phenotype profiles that can be associated with clinical entities, such as patients or drugs. Constructing patient phenotype profiles using formal ontologies enables their analysis via semantic similarity, in turn enabling the use of background knowledge in clustering or classification analyses. However, traditional semantic similarity approaches collapse complex relationships between patient phenotypes into a unitary similarity score for each pair of patients. Moreover, single scores may be based only on matching terms with the greatest information content (IC), ignoring other dimensions of patient similarity. This process necessarily leads to a loss of information in the resulting representation of patient similarity, and is especially apparent when using very large text-derived and highly multi-morbid phenotype profiles. Moreover, it renders finding a biological explanation for similarity very difficult: the black box problem. In this article, we explore the generation of multiple semantic similarity scores for patients based on different facets of their phenotypic manifestation, which we define through different sub-graphs in the Human Phenotype Ontology. We further present a new methodology for deriving sets of qualitative class descriptions for groups of entities described by ontology terms. Leveraging this strategy to obtain meaningful explanations for our semantic clusters alongside other evaluation techniques, we show that semantic clustering with ontology-derived facets enables the representation, and thus identification of, clinically relevant phenotype relationships not easily recoverable using overall clustering alone. In this way, we demonstrate the potential of faceted semantic clustering for gaining a deeper and more nuanced understanding of text-derived patient phenotypes.
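
The contrast between a unitary score and faceted scores can be sketched as follows. Each term is assigned to a facet, standing in for a sub-graph of the ontology, and a separate similarity is computed per facet. The facet assignments, term labels, and the use of Jaccard overlap are assumptions for illustration and do not reproduce the paper's measures.

```python
def jaccard(a, b):
    """Set overlap in [0, 1]; defined as 0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def faceted_similarity(profile_a, profile_b, facet_of):
    """Return one similarity score per facet instead of one overall score."""
    scores = {}
    for facet in sorted(set(facet_of.values())):
        a = {t for t in profile_a if facet_of.get(t) == facet}
        b = {t for t in profile_b if facet_of.get(t) == facet}
        scores[facet] = jaccard(a, b)
    return scores

# Hypothetical mapping from term to ontology sub-graph (facet)
facet_of = {
    "arrhythmia": "cardiovascular", "oedema": "cardiovascular",
    "dyspnoea": "respiratory", "cough": "respiratory",
}
patient_a = {"arrhythmia", "oedema", "cough"}
patient_b = {"arrhythmia", "oedema", "dyspnoea"}

print(jaccard(patient_a, patient_b))
print(faceted_similarity(patient_a, patient_b, facet_of))
```

The overall score hides that the two patients agree completely on cardiovascular findings and not at all on respiratory ones, which the per-facet scores make explicit.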


Subject(s)
Semantics; Cluster Analysis; Humans; Phenotype
11.
Gigascience ; 10(9), 2021 09 11.
Article in English | MEDLINE | ID: mdl-34508578

ABSTRACT

BACKGROUND: High-quality phenotype definitions are desirable to enable the extraction of patient cohorts from large electronic health record repositories and are characterized by properties such as portability, reproducibility, and validity. Phenotype libraries, where definitions are stored, have the potential to contribute significantly to the quality of the definitions they host. In this work, we present a set of desiderata for the design of a next-generation phenotype library that is able to ensure the quality of hosted definitions by combining the functionality currently offered by disparate tooling. METHODS: A group of researchers examined work to date on phenotype models, implementation, and validation, as well as contemporary phenotype libraries developed as a part of their own phenomics communities. Existing phenotype frameworks were also examined. This work was translated and refined by all the authors into a set of best practices. RESULTS: We present 14 library desiderata that promote high-quality phenotype definitions, in the areas of modelling, logging, validation, and sharing and warehousing. CONCLUSIONS: There are a number of choices to be made when constructing phenotype libraries. Our considerations distil the best practices in the field and include pointers towards their further development to support portable, reproducible, and clinically valid phenotype design. The provision of high-quality phenotype definitions enables electronic health record data to be more effectively used in medical domains.


Subject(s)
Electronic Health Records; Humans; Phenotype; Reproducibility of Results
12.
Lancet ; 398(10309): 1427-1435, 2021 10 16.
Article in English | MEDLINE | ID: mdl-34474011

ABSTRACT

BACKGROUND: Mortality remains unacceptably high in patients with heart failure and reduced left ventricular ejection fraction (LVEF) despite advances in therapeutics. We hypothesised that a novel artificial intelligence approach could better assess multiple and higher-dimension interactions of comorbidities, and define clusters of β-blocker efficacy in patients with sinus rhythm and atrial fibrillation. METHODS: Neural network-based variational autoencoders and hierarchical clustering were applied to pooled individual patient data from nine double-blind, randomised, placebo-controlled trials of β blockers. All-cause mortality during median 1·3 years of follow-up was assessed by intention to treat, stratified by electrocardiographic heart rhythm. The number of clusters and dimensions was determined objectively, with results validated using a leave-one-trial-out approach. This study was prospectively registered with ClinicalTrials.gov (NCT00832442) and the PROSPERO database of systematic reviews (CRD42014010012). FINDINGS: 15 659 patients with heart failure and LVEF of less than 50% were included, with median age 65 years (IQR 56-72) and LVEF 27% (IQR 21-33). 3708 (24%) patients were women. In sinus rhythm (n=12 822), most clusters demonstrated a consistent overall mortality benefit from β blockers, with odds ratios (ORs) ranging from 0·54 to 0·74. One cluster in sinus rhythm of older patients with less severe symptoms showed no significant efficacy (OR 0·86, 95% CI 0·67-1·10; p=0·22). In atrial fibrillation (n=2837), four of five clusters were consistent with the overall neutral effect of β blockers versus placebo (OR 0·92, 0·77-1·10; p=0·37). One cluster of younger atrial fibrillation patients at lower mortality risk but similar LVEF to average had a statistically significant reduction in mortality with β blockers (OR 0·57, 0·35-0·93; p=0·023).
The robustness and consistency of clustering was confirmed for all models (p<0·0001 vs random), and cluster membership was externally validated across the nine independent trials. INTERPRETATION: An artificial intelligence-based clustering approach was able to distinguish prognostic response from β blockers in patients with heart failure and reduced LVEF. This included patients in sinus rhythm with suboptimal efficacy, as well as a cluster of patients with atrial fibrillation where β blockers did reduce mortality. FUNDING: Medical Research Council, UK, and EU/EFPIA Innovative Medicines Initiative BigData@Heart.


Subject(s)
Adrenergic beta-Antagonists/therapeutic use; Atrial Fibrillation/drug therapy; Cluster Analysis; Heart Failure/drug therapy; Machine Learning; Aged; Comorbidity; Double-Blind Method; Female; Heart Failure/mortality; Humans; Male; Middle Aged; Stroke Volume; Ventricular Function, Left
13.
Comput Biol Med ; 135: 104542, 2021 08.
Article in English | MEDLINE | ID: mdl-34139439

ABSTRACT

BACKGROUND: Unstructured text created by patients represents a rich, but relatively inaccessible resource for advancing patient-centred care. This study aimed to develop an ontology for ocular immune-mediated inflammatory diseases (OcIMIDo), as a tool to facilitate data extraction and analysis, illustrating its application to online patient support forum data. METHODS: We developed OcIMIDo using clinical guidelines, domain expertise, and cross-references to classes from other biomedical ontologies. We developed an approach to add patient-preferred synonyms text-mined from oliviasvision.org online forum, using statistical ranking. We validated the approach with split-sampling and comparison to manual extraction. Using OcIMIDo, we then explored the frequency of OcIMIDo classes and synonyms, and their potential association with natural language sentiment expressed in each online forum post. FINDINGS: OcIMIDo (version 1.2) includes 661 classes, describing anatomy, clinical phenotype, disease activity status, complications, investigations, interventions and functional impacts. It contains 1661 relationships and axioms, 2851 annotations, including 1131 database cross-references, and 187 patient-preferred synonyms. To illustrate OcIMIDo's potential applications, we explored 9031 forum posts, revealing frequent mention of different clinical phenotypes, treatments, and complications. Language sentiment analysis of each post was generally positive (median 0.12, IQR 0.01-0.24). In multivariable logistic regression, the odds of a post expressing negative sentiment were significantly associated with first posts as compared to replies (OR 3.3, 95% CI 2.8 to 3.9, p < 0.001). CONCLUSION: We report the development and validation of a new ontology for inflammatory eye diseases, which includes patient-preferred synonyms, and can be used to explore unstructured patient or physician-reported text data, with many potential applications.


Subject(s)
Biological Ontologies; Databases, Factual; Humans; Language; Phenotype
14.
Comput Biol Med ; 133: 104360, 2021 06.
Article in English | MEDLINE | ID: mdl-33836447

ABSTRACT

Ontology-based phenotype profiles have been utilised for the purpose of differential diagnosis of rare genetic diseases, and for decision support in specific disease domains. Particularly, semantic similarity facilitates diagnostic hypothesis generation through comparison with disease phenotype profiles. However, the approach has not been applied for differential diagnosis of common diseases, or generalised clinical diagnostics from uncurated text-derived phenotypes. In this work, we describe the development of an approach for deriving patient phenotype profiles from clinical narrative text, and apply this to text associated with MIMIC-III patient visits. We then explore the use of semantic similarity with those text-derived phenotypes to classify primary patient diagnosis, comparing the use of patient-patient similarity and patient-disease similarity using phenotype-disease profiles previously mined from literature. We also consider a combined approach, in which literature-derived phenotypes are extended with the content of text-derived phenotypes we mined from 500 patients. The results reveal a powerful approach, showing that in one setting, uncurated text phenotypes can be used for differential diagnosis of common diseases, making use of information both inside and outside the setting. While the methods themselves should be explored for further optimisation, they could be applied to a variety of clinical tasks, such as differential diagnosis, cohort discovery, document and text classification, and outcome prediction.


Subject(s)
Rare Diseases; Semantics; Diagnosis, Differential; Humans; Phenotype; Rare Diseases/diagnosis; Rare Diseases/genetics
15.
Heart ; 107(11): 902-908, 2021 06.
Article in English | MEDLINE | ID: mdl-33692093

ABSTRACT

OBJECTIVE: To improve the echocardiographic assessment of heart failure in patients with atrial fibrillation (AF) by comparing conventional averaging of consecutive beats with an index-beat approach, whereby measurements are taken after two cycles with similar R-R interval. METHODS: Transthoracic echocardiography was performed using a standardised and blinded protocol in patients enrolled in the RATE-AF (RAte control Therapy Evaluation in permanent Atrial Fibrillation) randomised trial. We compared reproducibility of the index-beat and conventional consecutive-beat methods to calculate left ventricular ejection fraction (LVEF), global longitudinal strain (GLS) and E/e' (mitral E wave max/average diastolic tissue Doppler velocity), and assessed intraoperator/interoperator variability, time efficiency and validity against natriuretic peptides. RESULTS: 160 patients were included, 46% of whom were women, with a median age of 75 years (IQR 69-82) and a median heart rate of 100 beats per minute (IQR 86-112). The index-beat had the lowest within-beat coefficient of variation for LVEF (32%, vs 51% for 5 consecutive beats and 53% for 10 consecutive beats), GLS (26%, vs 43% and 42%) and E/e' (25%, vs 41% and 41%). Intraoperator (n=50) and interoperator (n=18) reproducibility were both superior for index-beats and this method was quicker to perform (p<0.001): 35.4 s to measure E/e' (95% CI 33.1 to 37.8) compared with 44.7 s for 5-beat (95% CI 41.8 to 47.5) and 98.1 s for 10-beat (95% CI 91.7 to 104.4) analyses. Using a single index-beat did not compromise the association of LVEF, GLS or E/e' with natriuretic peptide levels. CONCLUSIONS: Compared with averaging of multiple beats in patients with AF, the index-beat approach improves reproducibility and saves time without a negative impact on validity, potentially improving the diagnosis and classification of heart failure in patients with AF.
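
The two ideas in this abstract, selecting an index beat and comparing methods by coefficient of variation, can be sketched briefly. The selection rule below picks beats whose two preceding R-R intervals differ by no more than a tolerance. The 60 ms default, the function names, and the sample values are assumptions for illustration; the abstract only states that the two preceding cycles have a similar R-R interval.

```python
from statistics import mean, stdev

def index_beats(rr_ms, tolerance_ms=60.0):
    """Indices of beats preceded by two cycles of similar R-R interval.

    rr_ms[i] is the R-R interval (ms) of cycle i; beat i qualifies when
    cycles i-1 and i-2 differ by at most tolerance_ms (assumed threshold).
    """
    return [i for i in range(2, len(rr_ms))
            if abs(rr_ms[i - 1] - rr_ms[i - 2]) <= tolerance_ms]

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)

rr = [600, 820, 800, 610, 900, 905, 700]   # hypothetical irregular rhythm
print(index_beats(rr))
print(coefficient_of_variation([50.0, 55.0, 45.0]))
```

A lower coefficient of variation across repeated measurements indicates better reproducibility, which is how the abstract compares the index-beat and consecutive-beat methods.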


Subject(s)
Atrial Fibrillation/physiopathology; Echocardiography, Doppler, Pulsed; Heart Failure/diagnosis; Aged; Aged, 80 and over; Biomarkers/blood; Diastole/physiology; Female; Humans; Male; Natriuretic Peptide, Brain/blood; Peptide Fragments/blood; Reproducibility of Results; Stroke Volume/physiology; Systole/physiology; Ventricular Function, Left/physiology
16.
BMC Med ; 19(1): 23, 2021 01 21.
Article in English | MEDLINE | ID: mdl-33472631

ABSTRACT

BACKGROUND: The National Early Warning Score (NEWS2) is currently recommended in the UK for the risk stratification of COVID-19 patients, but little is known about its ability to detect severe cases. We aimed to evaluate NEWS2 for the prediction of severe COVID-19 outcome and identify and validate a set of blood and physiological parameters routinely collected at hospital admission to improve upon the use of NEWS2 alone for medium-term risk stratification. METHODS: Training cohorts comprised 1276 patients admitted to King's College Hospital National Health Service (NHS) Foundation Trust with COVID-19 disease from 1 March to 30 April 2020. External validation cohorts included 6237 patients from five UK NHS Trusts (Guy's and St Thomas' Hospitals, University Hospitals Southampton, University Hospitals Bristol and Weston NHS Foundation Trust, University College London Hospitals, University Hospitals Birmingham), one hospital in Norway (Oslo University Hospital), and two hospitals in Wuhan, China (Wuhan Sixth Hospital and Taikang Tongji Hospital). The outcome was severe COVID-19 disease (transfer to intensive care unit (ICU) or death) at 14 days after hospital admission. Age, physiological measures, blood biomarkers, sex, ethnicity, and comorbidities (hypertension, diabetes, cardiovascular, respiratory and kidney diseases) measured at hospital admission were considered in the models. RESULTS: A baseline model of 'NEWS2 + age' had poor-to-moderate discrimination for severe COVID-19 infection at 14 days (area under receiver operating characteristic curve (AUC) in training cohort = 0.700, 95% confidence interval (CI) 0.680, 0.722; Brier score = 0.192, 95% CI 0.186, 0.197). A supplemented model adding eight routinely collected blood and physiological parameters (supplemental oxygen flow rate, urea, age, oxygen saturation, C-reactive protein, estimated glomerular filtration rate, neutrophil count, neutrophil/lymphocyte ratio) improved discrimination (AUC = 0.735; 95% CI 0.715, 0.757), and these improvements were replicated across seven UK and non-UK sites. However, there was evidence of miscalibration, with the model tending to underestimate risks in most sites. CONCLUSIONS: The NEWS2 score had poor-to-moderate discrimination for medium-term COVID-19 outcome, which raises questions about its use as a screening tool at hospital admission. Risk stratification was improved by including readily available blood and physiological parameters measured at hospital admission, but there was evidence of miscalibration in external sites. This highlights the need for a better understanding of the use of early warning scores for COVID.
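
The abstract above reports discrimination as AUC and overall accuracy of the predicted risks as a Brier score. As a minimal sketch of what those two numbers measure, the following computes both from scratch on made-up data. The function names and the toy values are assumptions for illustration only; this is not the study's code or data.

```python
def auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome.

    Lower is better; 0 means perfect probabilistic predictions.
    """
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


if __name__ == "__main__":
    # Hypothetical outcomes (1 = severe outcome) and predicted risks.
    outcomes = [0, 0, 1, 1]
    risks = [0.1, 0.4, 0.35, 0.8]
    print("AUC:", auc(outcomes, risks))
    print("Brier score:", round(brier_score(outcomes, risks), 4))
```

AUC depends only on how the predicted risks rank patients, so a model can discriminate reasonably well and still be miscalibrated, which is the pattern the abstract describes for the external sites.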


Subject(s)
COVID-19/diagnosis , Early Warning Score , Aged , COVID-19/epidemiology , COVID-19/virology , Cohort Studies , Electronic Health Records , Female , Humans , Male , Middle Aged , Pandemics , Prognosis , SARS-CoV-2/isolation & purification , State Medicine , United Kingdom/epidemiology
17.
J Am Med Inform Assoc ; 28(4): 791-800, 2021 03 18.
Article in English | MEDLINE | ID: mdl-33185672

ABSTRACT

OBJECTIVE: Risk prediction models are widely used to inform evidence-based clinical decision making. However, few models developed from single cohorts can perform consistently well at population level where diverse prognoses exist (such as the SARS-CoV-2 [severe acute respiratory syndrome coronavirus 2] pandemic). This study aims at tackling this challenge by synergizing prediction models from the literature using ensemble learning. MATERIALS AND METHODS: In this study, we selected and reimplemented 7 prediction models for COVID-19 (coronavirus disease 2019) that were derived from diverse cohorts and used different implementation techniques. A novel ensemble learning framework was proposed to synergize them for realizing personalized predictions for individual patients. Four diverse international cohorts (2 from the United Kingdom and 2 from China; N = 5394) were used to validate all 8 models on discrimination, calibration, and clinical usefulness. RESULTS: Results showed that individual prediction models could perform well on some cohorts while poorly on others. Conversely, the ensemble model achieved the best performances consistently on all metrics quantifying discrimination, calibration, and clinical usefulness. Performance disparities were observed in cohorts from the 2 countries: all models achieved better performances on the China cohorts. DISCUSSION: When individual models were learned from complementary cohorts, the synergized model had the potential to achieve better performances than any individual model. Results indicate that blood parameters and physiological measurements might have better predictive powers when collected early, which remains to be confirmed by further studies. CONCLUSIONS: Combining a diverse set of individual prediction models, the ensemble method can synergize a robust and well-performing model by choosing the most competent ones for individual patients.
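
The idea of choosing the most competent model for each individual patient can be illustrated with a simple dynamic-selection scheme. The sketch below is an assumption-laden toy, not the framework proposed in the paper: it rates each candidate model by its squared error on the `k` validation cases most similar to the new patient and uses the best-rated model for the prediction. The single feature (age), the two constant-risk models, and all names are hypothetical.

```python
def local_error(model, neighbours):
    """Mean squared error of a model on the neighbouring validation cases."""
    return sum((model(x) - y) ** 2 for x, y in neighbours) / len(neighbours)


def select_and_predict(models, validation, x_new, k=3):
    """Pick the model with the lowest error near x_new, then predict with it.

    models     -- dict mapping a name to a callable returning a risk in [0, 1]
    validation -- list of (feature, outcome) pairs with known outcomes
    x_new      -- feature value of the new patient
    """
    # Similarity here is plain distance on one feature (an assumption).
    neighbours = sorted(validation, key=lambda case: abs(case[0] - x_new))[:k]
    best = min(models, key=lambda name: local_error(models[name], neighbours))
    return best, models[best](x_new)


if __name__ == "__main__":
    # Two hypothetical models: one always predicts low risk, one high risk.
    models = {
        "low_risk_model": lambda age: 0.1,
        "high_risk_model": lambda age: 0.9,
    }
    # Hypothetical validation cases: (age, outcome).
    validation = [(30, 0), (35, 0), (40, 0), (70, 1), (75, 1), (80, 1)]
    print(select_and_predict(models, validation, x_new=33))
    print(select_and_predict(models, validation, x_new=78))
```

A real implementation would measure similarity over many features and might weight several models rather than picking only one; the sketch only shows why per-patient selection can outperform any single model when the models were learned from complementary cohorts.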


Subject(s)
COVID-19/mortality , Statistical Models , Prognosis , Adult , Aged , Aged 80 and over , COVID-19/epidemiology , COVID-19/prevention & control , China/epidemiology , Female , Humans , Male , Middle Aged , Risk Assessment/methods , SARS-CoV-2 , United Kingdom/epidemiology
18.
Sci Rep ; 9(1): 17405, 2019 11 22.
Article in English | MEDLINE | ID: mdl-31757986

ABSTRACT

Identifying and distinguishing cancer driver genes among thousands of candidate mutations remains a major challenge. Accurate identification of driver genes and driver mutations is critical for advancing cancer research and personalizing treatment based on accurate stratification of patients. Due to inter-tumor genetic heterogeneity, many driver mutations within a gene occur at low frequencies, which makes it challenging to distinguish them from non-driver mutations. We have developed a novel method for identifying cancer driver genes. Our approach utilizes multiple complementary types of information, specifically cellular phenotypes, cellular locations, functions, and whole body physiological phenotypes as features. We demonstrate that our method can accurately identify known cancer driver genes and distinguish between their roles in different types of cancer. In addition to confirming known driver genes, we identify several novel candidate driver genes. We demonstrate the utility of our method by validating its predictions in nasopharyngeal cancer and colorectal cancer using whole exome and whole genome sequencing.


Subject(s)
Computational Biology/methods , Genetic Association Studies , Genetic Predisposition to Disease , Neoplasms/etiology , Oncogenes , Tumor Biomarkers , Exome , Gene Ontology , Genetic Association Studies/methods , Genomics/methods , High-Throughput Nucleotide Sequencing , Humans , Machine Learning , Molecular Sequence Annotation , Mutation , Neoplasms/diagnosis , ROC Curve
19.
Mol Inform ; 32(5-6): 516-28, 2013 Jun.
Article in English | MEDLINE | ID: mdl-27481669

ABSTRACT

(Q)SAR model validation is essential to ensure the quality of inferred models and to indicate future model predictivity on unseen compounds. Proper validation is also one of the requirements of regulatory authorities in order to accept the (Q)SAR model, and to approve its use in real world scenarios as an alternative testing method. However, at the same time, the question of how to validate a (Q)SAR model, in particular whether to employ variants of cross-validation or external test set validation, is still under discussion. In this paper, we empirically compare a k-fold cross-validation with external test set validation. To this end we introduce a workflow that realistically simulates the common problem setting of building predictive models for relatively small datasets. The workflow allows the built and validated models to be applied to large amounts of unseen data, and the performance of the different validation approaches to be compared. The experimental results indicate that cross-validation produces better-performing (Q)SAR models than external test set validation and reduces the variance of the results, while at the same time underestimating the performance on unseen compounds. The experimental results reported in this paper suggest that, contrary to the current conception in the community, cross-validation may play a significant role in evaluating the predictivity of (Q)SAR models.
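
The two validation strategies compared in this abstract can be sketched side by side. The example below is a toy under stated assumptions, not the paper's workflow: the "model" is a one-feature threshold classifier that assumes class 1 has larger feature values, the data are synthetic, and every function name is hypothetical. It estimates accuracy once by k-fold cross-validation on a development set and once on a held-out external set.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_threshold(xs, ys):
    """Toy model: threshold halfway between the two class means."""
    x0 = [x for x, y in zip(xs, ys) if y == 0]
    x1 = [x for x, y in zip(xs, ys) if y == 1]
    if not x0 or not x1:
        raise ValueError("training data must contain both classes")
    return (sum(x0) / len(x0) + sum(x1) / len(x1)) / 2


def accuracy(threshold, xs, ys):
    """Fraction of cases classified correctly (predict 1 above the threshold)."""
    hits = sum(int(x > threshold) == y for x, y in zip(xs, ys))
    return hits / len(xs)


def cross_val_accuracy(xs, ys, k=5, seed=0):
    """Average held-out accuracy over k folds."""
    scores = []
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        model = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        scores.append(accuracy(model, [xs[i] for i in fold], [ys[i] for i in fold]))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Synthetic, well-separated data: 20 cases per class.
    xs = [0.1 * i for i in range(20)] + [10 + 0.1 * i for i in range(20)]
    ys = [0] * 20 + [1] * 20

    # Split once into a development set and an external test set.
    order = list(range(len(xs)))
    random.Random(1).shuffle(order)
    dev, ext = order[:30], order[30:]
    dev_x, dev_y = [xs[i] for i in dev], [ys[i] for i in dev]
    ext_x, ext_y = [xs[i] for i in ext], [ys[i] for i in ext]

    print("5-fold CV accuracy:", cross_val_accuracy(dev_x, dev_y, k=5))
    final_model = fit_threshold(dev_x, dev_y)
    print("external test accuracy:", accuracy(final_model, ext_x, ext_y))
```

With cross-validation every development case is used for both training and testing (in different folds), whereas the external test set is never seen during model building; this is the trade-off the abstract's comparison concerns.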

20.
J Cheminform ; 4(1): 7, 2012 Mar 17.
Article in English | MEDLINE | ID: mdl-22424447

ABSTRACT

Analyzing chemical datasets is a challenging task for scientific researchers in the field of chemoinformatics. It is important, yet difficult, to understand the relationship between the structure of chemical compounds, their physico-chemical properties, and biological or toxic effects. In that respect, visualization tools can help to better comprehend the underlying correlations. Our recently developed 3D molecular viewer CheS-Mapper (Chemical Space Mapper) divides large datasets into clusters of similar compounds and consequently arranges them in 3D space, such that their spatial proximity reflects their similarity. The user can indirectly determine similarity by selecting which features to employ in the process. The tool can use and calculate different kinds of features, such as structural fragments as well as quantitative chemical descriptors. These features can be highlighted within CheS-Mapper, which helps the chemist to better understand patterns and regularities and relate the observations to established scientific knowledge. As a final function, the tool can also be used to select and export specific subsets of a given dataset for further analysis.

...