Search | BVS CLAP/SMR-OPS/OMS

1.

Covariate shift estimation based adaptive ensemble learning for handling non-stationarity in motor imagery related EEG-based brain-computer interface.

Raza, Haider; Rathee, Dheeraj; Zhou, Shang-Ming; Cecotti, Hubert; Prasad, Girijesh.

Neurocomputing (Amst) ; 343: 154-166, 2019 May 28.

Article in English | MEDLINE | ID: mdl-32226230

ABSTRACT

The non-stationary nature of electroencephalography (EEG) signals makes an EEG-based brain-computer interface (BCI) a dynamic system; thus, improving its performance is a challenging task. In addition, it is well known that, due to non-stationarity based covariate shifts, the input data distributions of EEG-based BCI systems change during inter- and intra-session transitions, which poses great difficulty for the development of online adaptive data-driven systems. Ensemble learning approaches have been used previously to tackle this challenge. However, passive scheme based implementation leads to poor efficiency and high computational cost. This paper presents a novel integration of covariate shift estimation and unsupervised adaptive ensemble learning (CSE-UAEL) to tackle non-stationarity in motor-imagery (MI) related EEG classification. The proposed method first employs an exponentially weighted moving average model to detect the covariate shifts in the common spatial pattern features extracted from MI related brain responses. Then, a classifier ensemble is created and updated over time to account for changes in the streaming input data distribution, wherein new classifiers are added to the ensemble in accordance with estimated shifts. Furthermore, using two publicly available BCI-related EEG datasets, the proposed method was extensively compared with the state-of-the-art single-classifier based passive scheme, single-classifier based active scheme and ensemble based passive schemes. The experimental results show that the proposed active scheme based ensemble learning algorithm significantly enhances the BCI performance in MI classifications.
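The shift-detection step described in this abstract can be illustrated with a textbook EWMA control chart. The sketch below is an assumption-laden illustration, not the authors' CSE-UAEL detector: the function name `ewma_shift_points`, the smoothing constant `lam`, the control-limit multiplier `L`, and the `warmup` length are all hypothetical choices, and the input is a single 1-D feature stream rather than multivariate common spatial pattern features.

```python
import numpy as np


def ewma_shift_points(x, lam=0.1, L=3.0, warmup=50):
    """Return indices where a 1-D feature stream leaves its EWMA control limits.

    The first `warmup` samples are assumed stationary and are used to
    estimate the in-control mean and standard deviation (an assumption of
    this sketch, not something stated in the abstract).
    """
    x = np.asarray(x, dtype=float)
    mu = x[:warmup].mean()
    sigma = x[:warmup].std(ddof=1)

    z = mu  # EWMA statistic starts at the in-control mean
    shifts = []
    for t in range(warmup, len(x)):
        # Exponentially weighted update: recent samples get weight lam.
        z = lam * x[t] + (1.0 - lam) * z
        # Time-varying control limit of the standard EWMA chart.
        k = t - warmup + 1
        half_width = L * sigma * np.sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * k))
        )
        if abs(z - mu) > half_width:
            shifts.append(t)
    return shifts
```

In an active ensemble scheme of the kind the abstract describes, each flagged index would be a natural point at which to train and add a new classifier; how that is done in the paper is not specified here.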

2.

Incidence of Campylobacter and Salmonella infections following first prescription for PPI: a cohort study using routine data.

Brophy, Sinead; Jones, Kerina H; Rahman, Muhammad A; Zhou, Shang-Ming; John, Ann; Atkinson, Mark D; Francis, Nick; Lyons, Ronan A; Dunstan, Frank.

Am J Gastroenterol ; 108(7): 1094-100, 2013 Jul.

Article in English | MEDLINE | ID: mdl-23588238

ABSTRACT

OBJECTIVES: To examine the incidence of Campylobacter and Salmonella infection in patients prescribed proton pump inhibitors (PPIs) compared with controls. METHODS: Retrospective cohort study using anonymous general practitioner (GP) data. Anonymised individual-level records from the Secure Anonymised Information Linkage (SAIL) system between 1990 and 2010 in Wales were selected. Data were available from 1,913,925 individuals including 358,938 prescribed a PPI. The main outcome measures examined included incidence of Campylobacter or Salmonella infection following a prescription for PPI. RESULTS: The rate of Campylobacter and Salmonella infections was already at 3.1-6.9 times that of non-PPI patients even before PPI prescription. The PPI group had an increased hazard rate of infection (after prescription for PPI) of 1.46 for Campylobacter and 1.2 for Salmonella, compared with baseline. However, the non-PPI patients also had an increased hazard ratio with time. In fact, the ratio of events in the PPI group compared with the non-PPI group using the prior event rate ratio was 1.17 (95% CI 0.74-1.61) for Campylobacter and 1.00 (0.5-1.5) for Salmonella. CONCLUSIONS: People who go on to be prescribed PPIs have a greater underlying risk of gastrointestinal (GI) infection beforehand and they have a higher prevalence of risk factors before PPI prescription. The rate of diagnosis of infection is increasing with time regardless of PPI use, and there is no evidence that PPI is associated with an increase in diagnosed GI infection. It is likely that factors associated with the demographic profile of the patient are the main contributors to increased rate of GI infection for patients prescribed PPIs.

Subject(s)

Campylobacter Infections/epidemiology , Proton Pump Inhibitors , Salmonella Infections/epidemiology , Adult , Aged , Chi-Square Distribution , Drug Prescriptions , Female , Humans , Incidence , Male , Middle Aged , Proportional Hazards Models , Proton Pump Inhibitors/therapeutic use , Retrospective Studies , Risk Factors , Time Factors , Wales/epidemiology
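The prior event rate ratio mentioned in the abstract of this record adjusts the post-prescription comparison by the pre-prescription comparison. A minimal sketch follows, assuming crude incidence rates rather than the hazard ratios from survival models that the study reports; the function names and all counts in the usage note are hypothetical and are not taken from the paper.

```python
def incidence_rate(events, person_years):
    """Crude incidence rate: events per person-year of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return events / person_years


def prior_event_rate_ratio(exposed_prior, unexposed_prior,
                           exposed_post, unexposed_post):
    """Ratio of the post-period rate ratio to the prior-period rate ratio.

    Each argument is an (events, person_years) pair. A value near 1 means
    the excess risk in the exposed group was already present before
    exposure, so it is unlikely to be caused by the exposure itself.
    """
    rr_prior = incidence_rate(*exposed_prior) / incidence_rate(*unexposed_prior)
    rr_post = incidence_rate(*exposed_post) / incidence_rate(*unexposed_post)
    return rr_post / rr_prior
```

With made-up counts of 30 and 10 events per 10,000 person-years before prescription and 45 and 15 afterwards, both rate ratios are 3 and the adjusted ratio is 1, which mirrors the pattern the abstract describes: a group that was already at higher risk, with no additional increase after exposure.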

3.

Identifying Dynamic Patterns of Polypharmacy for Patients with Dementia from Primary Care Electronic Health Records: A Machine Learning Driven Longitudinal Study.

Longo, Elisabetta; Burnett, Bruce; Bauermeister, Sarah; Zhou, Shang-Ming.

Aging Dis ; 14(2): 548-559, 2023 Apr 01.

Article in English | MEDLINE | ID: mdl-37008054

ABSTRACT

It is unclear how medication use evolved before diagnosis of dementia (DoD). This study aims to identify varied patterns of polypharmacy before DoD, their prevalence and possible complications. We collected primary care e-health records for 33,451 dementia patients in Wales from 1990 to 2015. Medication use in each 5-year period over the 20 years prior to dementia diagnosis was considered. Exploratory factor analysis was used to identify clusters of medicines for every 5-year period. The prevalence of patients taking three or more medications was 82.16%, 69.7%, 41.1% and 5.5% in Period 1 (0-5 years before DoD) to Period 4 (16-20 years before DoD), respectively. Period 1 showed 3 clusters of polypharmacy - medicines for respiratory/urinary infections, arthropathies and rheumatism, and cardio-vascular disease (CVD) (66.55%); medicines for infections, arthropathies and rheumatism (AR), cardio-metabolic disease (CMD) and depression (22.02%); and medicines for arthropathies, rheumatism and osteoarthritis (2.6%). Period 2 showed 4 clusters of polypharmacy - medicines for infections, arthropathies, and CVD (69.7%); medicines for CVD and depression (3%); medicines for CMD and arthropathies (0.3%); and medicines for AR and CVD (2.5%). Period 3 showed 6 clusters of polypharmacy - medicines for infections, arthropathies, and CVD (41.1%); medicines for CVD, acute-respiratory-infection (ARI), and arthropathies (1.25%); medicines for AR (1.16%); medicines for depression and anxiety (0.06%); medicines for CMD (1.4%); and medicines for dermatologic disorders (0.9%). Period 4 showed 3 main clusters of polypharmacy - medicines for infections, arthropathy, and CVD (5.5%); medicines for anxiety and ARI (2.4%); and medicines for ARI and CVD (2.1%). As the development towards dementia progressed, the associative diseases tended to cluster with a larger prevalence in each cluster. Farther away before DoD, the clusters of polypharmacy tended to be clearly distinct from each other, resulting in an increasing number of patterns but with a smaller prevalence.

4.

Machine Learning in Colorectal Cancer Risk Prediction from Routinely Collected Data: A Review.

Burnett, Bruce; Zhou, Shang-Ming; Brophy, Sinead; Davies, Phil; Ellis, Paul; Kennedy, Jonathan; Bandyopadhyay, Amrita; Parker, Michael; Lyons, Ronan A.

Diagnostics (Basel) ; 13(2)2023 Jan 13.

Article in English | MEDLINE | ID: mdl-36673111

ABSTRACT

The inclusion of machine-learning-derived models in systematic reviews of risk prediction models for colorectal cancer is rare. Whilst such reviews have highlighted methodological issues and limited performance of the models included, it is unclear why machine-learning-derived models are absent and whether such models suffer similar methodological problems. This scoping review aims to identify machine-learning models, assess their methodology, and compare their performance with that found in previous reviews. A literature search of four databases was performed for colorectal cancer prediction and prognosis model publications that included at least one machine-learning model. A total of 14 publications were identified for inclusion in the scoping review. Data was extracted using an adapted CHARM checklist against which the models were benchmarked. The review found similar methodological problems with machine-learning models to that observed in systematic reviews for non-machine-learning models, although model performance was better. The inclusion of machine-learning models in systematic reviews is required, as they offer improved performance despite similar methodological omissions; however, to achieve this the methodological issues that affect many prediction models need to be addressed.

5.

Analysing patient-generated data to understand behaviours and characteristics of women with epilepsy of childbearing years: A prospective cohort study.

Zhou, Shang-Ming; McLean, Brendan; Roberts, Elis; Baines, Rebecca; Hannon, Peter; Ashby, Samantha; Newman, Craig; Sen, Arjune; Wilkinson, Ellen; Laugharne, Richard; Shankar, Rohit.

Seizure ; 108: 24-32, 2023 May.

Article in English | MEDLINE | ID: mdl-37060628

ABSTRACT

BACKGROUND: Women with epilepsy (WWE) are vulnerable in pregnancy, with increased risks to mother and baby including teratogenic risks, especially from valproate. The free EpSMon mobile-phone app allows self-monitoring to afford its users patient-centred feedback on seizure related risks, such as sudden death in epilepsy (SUDEP). We sought to generate insights into various seizure related risks and their treatments in WWE of childbearing age (16 to 60 years) using EpSMon. METHODS: The study utilizes a prospective real-world cohort of 5.5 years. Patient reported data on demographics, medication taken, diagnoses, seizure types and recognised biological, psychological, and social factors of seizure related harm were extracted. Data were stratified according to frequent and infrequent users and those scoring lower and higher risk scores. Multivariate logistic regression and different statistical tests were conducted. FINDINGS: Data from 2158 WWE of childbearing age encompassing 4016 self-assessments were analysed. Overall risk awareness was 25.3% for pregnancy and 54.1% for SUDEP. Frequent users were more aware of pregnancy risks but not of SUDEP. Repeated EpSMon use increased awareness of SUDEP but not of pregnancy risks. Valproate was used by 11% of WWE, ranging from 6.5% of younger to 31.5% of older women. CONCLUSIONS: Awareness of the risks relating to pregnancy, SUDEP and valproate is low. Valproate is being used by a significant minority. It is imperative that risk communication continues for WWE based on their individual situation and need. This is unlikely to be delivered by current clinical models. Digital solutions hold promise but require work to improve implementation and acceptability.

Subject(s)

Epilepsy , Sudden Unexpected Death in Epilepsy , Female , Humans , Aged , Adolescent , Young Adult , Adult , Middle Aged , Valproic Acid/therapeutic use , Prospective Studies , Epilepsy/drug therapy , Epilepsy/epidemiology , Epilepsy/complications , Seizures/drug therapy , Death, Sudden/etiology , Anticonvulsants/adverse effects

6.

Incorporation of expert variability into breast cancer treatment recommendation in designing clinical protocol guided fuzzy rule system models.

Garibaldi, Jonathan M; Zhou, Shang-Ming; Wang, Xiao-Ying; John, Robert I; Ellis, Ian O.

J Biomed Inform ; 45(3): 447-59, 2012 Jun.

Article in English | MEDLINE | ID: mdl-22265814

ABSTRACT

It has been often demonstrated that clinicians exhibit both inter-expert and intra-expert variability when making difficult decisions. In contrast, the vast majority of computerized models that aim to provide automated support for such decisions do not explicitly recognize or replicate this variability. Furthermore, the perfect consistency of computerized models is often presented as a de facto benefit. In this paper, we describe a novel approach to incorporate variability within a fuzzy inference system using non-stationary fuzzy sets in order to replicate human variability. We apply our approach to a decision problem concerning the recommendation of post-operative breast cancer treatment; specifically, whether or not to administer chemotherapy based on assessment of five clinical variables: NPI (the Nottingham Prognostic Index), estrogen receptor status, vascular invasion, age and lymph node status. In doing so, we explore whether such explicit modeling of variability provides any performance advantage over a more conventional fuzzy approach, when tested on a set of 1310 unselected cases collected over a fourteen year period at the Nottingham University Hospitals NHS Trust, UK. The experimental results show that the standard fuzzy inference system (that does not model variability) achieves overall agreement to clinical practice around 84.6% (95% CI: 84.1-84.9%), while the non-stationary fuzzy model can significantly increase performance to around 88.1% (95% CI: 88.0-88.2%), p<0.001. We conclude that non-stationary fuzzy models provide a valuable new approach that may be applied to clinical decision support systems in any application domain.

Subject(s)

Breast Neoplasms/drug therapy , Clinical Protocols/standards , Models, Biological , Decision Support Systems, Clinical , Fuzzy Logic , Humans , United Kingdom

7.

Automatically Generating Natural Language Descriptions of Images by a Deep Hierarchical Framework.

Huo, Lin; Bai, Lin; Zhou, Shang-Ming.

IEEE Trans Cybern ; 52(8): 7441-7452, 2022 Aug.

Article in English | MEDLINE | ID: mdl-33400668

ABSTRACT

Automatically generating an accurate and meaningful description of an image is very challenging. However, the recent scheme of generating an image caption by maximizing the likelihood of target sentences lacks the capacity of recognizing the human-object interaction (HOI) and semantic relationship between HOIs and scenes, which are the essential parts of an image caption. This article proposes a novel two-phase framework to generate an image caption by addressing the above challenges: 1) a hybrid deep learning and 2) an image description generation. In the hybrid deep-learning phase, a novel factored three-way interaction machine was proposed to learn the relational features of the human-object pairs hierarchically. In this way, the image recognition problem is transformed into a latent structured labeling task. In the image description generation phase, a lexicalized probabilistic context-free tree growing scheme is innovatively integrated with a description generator to transform the descriptions generation task into a syntactic-tree generation process. Extensively comparing state-of-the-art image captioning methods on benchmark datasets, we demonstrated that our proposed framework outperformed the existing captioning methods in different ways, such as significantly improving the performance of the HOI and relationships between HOIs and scenes (RHIS) predictions, and quality of generated image captions in a semantically and structurally coherent manner.

Subject(s)

Algorithms , Language , Humans , Semantics

8.

Concept Libraries for Repeatable and Reusable Research: Qualitative Study Exploring the Needs of Users.

Almowil, Zahra; Zhou, Shang-Ming; Brophy, Sinead; Croxall, Jodie.

JMIR Hum Factors ; 9(1): e31021, 2022 Mar 15.

Article in English | MEDLINE | ID: mdl-35289755

ABSTRACT

BACKGROUND: Big data research in the field of health sciences is hindered by a lack of agreement on how to identify and define different conditions and their medications. This means that researchers and health professionals often have different phenotype definitions for the same condition. This lack of agreement makes it difficult to compare different study findings and hinders the ability to conduct repeatable and reusable research. OBJECTIVE: This study aims to examine the requirements of various users, such as researchers, clinicians, machine learning experts, and managers, in the development of a data portal for phenotypes (a concept library). METHODS: This was a qualitative study using interviews and focus group discussion. One-to-one interviews were conducted with researchers, clinicians, machine learning experts, and senior research managers in health data science (N=6) to explore their specific needs in the development of a concept library. In addition, a focus group discussion with researchers (N=14) working with the Secured Anonymized Information Linkage databank, a national eHealth data linkage infrastructure, was held to perform a SWOT (strengths, weaknesses, opportunities, and threats) analysis for the phenotyping system and the proposed concept library. The interviews and focus group discussion were transcribed verbatim, and 2 thematic analyses were performed. RESULTS: Most of the participants thought that the prototype concept library would be a very helpful resource for conducting repeatable research, but they specified that many requirements are needed before its development. Although all the participants stated that they were aware of some existing concept libraries, most of them expressed negative perceptions about them. The participants mentioned several facilitators that would stimulate them to share their work and reuse the work of others, and they pointed out several barriers that could inhibit them from sharing their work and reusing the work of others. The participants suggested some developments that they would like to see to improve reproducible research output using routine data. CONCLUSIONS: The study indicated that most interviewees valued a concept library for phenotypes. However, only half of the participants felt that they would contribute by providing definitions for the concept library, and they reported many barriers regarding sharing their work on a publicly accessible platform. Analysis of interviews and the focus group discussion revealed that different stakeholders have different requirements, facilitators, barriers, and concerns about a prototype concept library.

9.

Predicting Hospital Readmission for Campylobacteriosis from Electronic Health Records: A Machine Learning and Text Mining Perspective.

Zhou, Shang-Ming; Lyons, Ronan A; Rahman, Muhammad A; Holborow, Alexander; Brophy, Sinead.

J Pers Med ; 12(1)2022 Jan 10.

Article in English | MEDLINE | ID: mdl-35055401

ABSTRACT

(1) Background: This study investigates influential risk factors for predicting 30-day readmission to hospital for Campylobacter infections (CI). (2) Methods: We linked general practitioner and hospital admission records of 13,006 patients with CI in Wales (1990-2015). A technique called TF-zR (term frequency-zRelevance) was presented to evaluate how relevant a clinical term is to a patient in a cohort characterized by coded health records. The zR is a supervised term-weighting metric that assigns a weight to a term based on the relative frequencies of the term across different classes. A cost-sensitive classifier with swarm optimization and weighted subset learning was integrated to identify influential clinical signals as predictors and the optimal model for readmission prediction. (3) Results: From a pool of up to 17,506 variables, the 33 most predictive factors were identified, including age, gender, Townsend deprivation quintiles, comorbidities, medications, and procedures. The predictive model predicted readmission with 73% sensitivity and 54% specificity. Variables associated with readmission included male gender, recurrent tonsillitis, non-healing open wounds, and operation for ingrown toenails. Cystitis, paracetamol/codeine use, age (21-25), and heliclear triple pack use were associated with a lower risk of readmission. (4) Conclusions: This study gives a profile of clustered variables that are predictive of readmission associated with campylobacteriosis.

10.

Concept libraries for automatic electronic health record based phenotyping: A review.

Almowil, Zahra A; Zhou, Shang-Ming; Brophy, Sinead.

Int J Popul Data Sci ; 6(1): 1362, 2021 Jun 16.

Article in English | MEDLINE | ID: mdl-34189274

ABSTRACT

INTRODUCTION: Electronic health records (EHR) are linked together to examine disease history and to undertake research into the causes and outcomes of disease. However, the process of constructing algorithms for phenotyping (e.g., identifying disease characteristics) or health characteristics (e.g., smoker) is very time consuming and resource costly. In addition, results can vary greatly between researchers. Reusing or building on algorithms that others have created is a compelling solution to these problems. However, sharing algorithms is not a common practice and many published studies do not detail the clinical code lists used by the researchers in the disease/characteristic definition. To address these challenges, a number of centres across the world have developed health data portals which contain concept libraries (e.g., algorithms for defining concepts such as disease and characteristics) in order to facilitate disease phenotyping and health studies. OBJECTIVES: This study aims to review the literature of existing concept libraries, examine their utilities, identify the current gaps, and suggest future developments. METHODS: The five-stage framework of Arksey and O'Malley was used for the literature search. This approach included defining the research questions, identifying relevant studies through literature review, selecting eligible studies, charting and extracting data, and summarising and reporting the findings. RESULTS: This review identified seven publicly accessible electronic health data concept libraries which were developed in different countries including the UK, USA, and Canada. The concept libraries (n = 7) investigated were either general libraries that hold phenotypes of multiple specialties (n = 4) or specialized libraries that manage only certain specialities such as rare diseases (n = 3). There were some clear differences between the general libraries, such as archiving data from different electronic sources and using a range of different types of coding systems. However, they share some clear similarities, such as enabling users to upload their own code lists and allowing users to use/download the publicly accessible code. In addition, there were some differences between the specialized libraries, such as differences in the ability to search and in whether different types of search query, such as simple or complex searches, were supported. Conversely, there were some similarities between the specialized libraries, such as enabling users to upload their own concepts into the libraries and to show where they were published, which facilitates assessing the validity of the concepts. All the specialized libraries aimed to encourage the reuse of research methods such as lists of clinical code and/or metadata. CONCLUSION: The seven libraries identified have been developed independently and appear to replicate similar concepts but in different ways. Collaboration between similar libraries would greatly facilitate the use of these libraries for the user. The process of building code lists takes time and effort. Access to existing code lists increases consistency and accuracy of definitions across studies. Concept library developers should collaborate with each other to raise awareness of their existence and of their various functions, which could increase users' contributions to those libraries and promote their wide-ranging adoption.

Subject(s)

Electronic Health Records , Libraries , Data Collection , Publications , Research Report

11.

Modeling Large Sparse Data for Feature Selection: Hospital Admission Predictions of the Dementia Patients Using Primary Care Electronic Health Records.

Tsang, Gavin; Zhou, Shang-Ming; Xie, Xianghua.

IEEE J Transl Eng Health Med ; 9: 3000113, 2021.

Article in English | MEDLINE | ID: mdl-33354439

ABSTRACT

A growing elderly population suffering from incurable, chronic conditions such as dementia presents a continual strain on medical services due to mental impairment paired with high comorbidity, resulting in increased hospitalization risk. The identification of at-risk individuals allows for preventative measures to alleviate said strain. Electronic health records provide an opportunity for big data analysis to address such applications. Such data, however, provide a challenging problem space for traditional statistics and machine learning due to high dimensionality and sparse data elements. This article proposes a novel machine learning methodology: entropy regularization with ensemble deep neural networks (ECNN), which simultaneously provides high predictive performance of hospitalization of patients with dementia whilst enabling an interpretable heuristic analysis of the model architecture, able to identify individual features of importance within a large feature domain space. Experimental results on health records containing 54,647 features were able to identify 10 event indicators within a patient timeline: a collection of diagnostic events, medication prescriptions and procedural events, the highest ranked being essential hypertension. The resulting subset was still able to provide a highly competitive hospitalization prediction (Accuracy: 0.759) as compared to the full feature domain (Accuracy: 0.755) or traditional feature selection techniques (Accuracy: 0.737), a significant reduction in feature size. The discovery and heuristic evidence of correlation provide evidence for further clinical study of said medical events as potential novel indicators. There also remains great potential for adaptation of ECNN within other medical big data domains as a data mining tool for novel risk factor identification.

Subject(s)

Dementia , Electronic Health Records , Aged , Dementia/epidemiology , Hospitalization , Hospitals , Humans , Primary Health Care

12.

Identifying Prenatal and Postnatal Determinants of Infant Growth: A Structural Equation Modelling Based Cohort Analysis.

Morgan, Kelly; Zhou, Shang-Ming; Hill, Rebecca; Lyons, Ronan A; Paranjothy, Shantini; Brophy, Sinead T.

Int J Environ Res Public Health ; 18(19)2021 Sep 29.

Article in English | MEDLINE | ID: mdl-34639581

ABSTRACT

BACKGROUND: The growth and maturation of infants reflect their overall health and nutritional status. The purpose of this study is to examine the associations of prenatal and early postnatal factors with infant growth (IG). METHODS: A data-driven model was constructed by structural equation modelling to examine the relationships between pre- and early postnatal environmental factors and IG at age 12 months. The IG was a latent variable created from infant weight and waist circumference. Data were obtained on 274 mother-child pairs during pregnancy and the postnatal periods. RESULTS: Maternal pre-pregnancy BMI emerged as an important predictor of IG with both direct and indirect (mediated through infant birth weight) effects. Infants who gained more weight from birth to 6 months and consumed starchy foods daily at age 12 months were more likely to be larger by age 12 months. Infant physical activity (PA) levels also emerged as a determinant. The constructed model provided a reasonable fit (χ2 (11) = 21.5, p < 0.05; RMSEA = 0.07; CFI = 0.94; SRMR = 0.05) to the data with significant pathways for all examined variables. CONCLUSION: Promoting healthy weight amongst women of childbearing age is important in preventing childhood obesity, and increasing daily infant PA is as important as a healthy infant diet.

Subject(s)

Pediatric Obesity , Birth Weight , Child , Cohort Studies , Female , Humans , Infant , Latent Class Analysis , Pregnancy , Waist Circumference

13.

Mining Primary Care Electronic Health Records for Automatic Disease Phenotyping: A Transparent Machine Learning Framework.

Fernández-Gutiérrez, Fabiola; Kennedy, Jonathan I; Cooksey, Roxanne; Atkinson, Mark; Choy, Ernest; Brophy, Sinead; Huo, Lin; Zhou, Shang-Ming.

Diagnostics (Basel) ; 11(10)2021 Oct 15.

Article in English | MEDLINE | ID: mdl-34679609

ABSTRACT

(1) Background: We aimed to develop a transparent machine-learning (ML) framework to automatically identify patients with a condition from electronic health records (EHRs) via a parsimonious set of features. (2) Methods: We linked multiple sources of EHRs, including 917,496,869 primary care records, 40,656,805 secondary care records and 694,954 records from specialist surgeries between 2002 and 2012, to generate a unique dataset. Then, we treated patient identification as a problem of text classification and proposed a transparent disease-phenotyping framework. This framework comprises generation of a patient representation, feature selection, and optimal phenotyping algorithm development to tackle the imbalanced nature of the data. This framework was extensively evaluated by identifying rheumatoid arthritis (RA) and ankylosing spondylitis (AS). (3) Results: Applied to the linked dataset of 9657 patients with 1484 cases of RA and 204 cases of AS, this framework achieved accuracy and positive predictive values of 86.19% and 88.46%, respectively, for RA and 99.23% and 97.75% for AS, comparable with expert knowledge-driven methods. (4) Conclusions: This framework could potentially be used as an efficient tool for identifying patients with a condition of interest from EHRs, helping clinicians in the clinical decision-support process.
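The two metrics reported in this abstract, accuracy and positive predictive value (PPV), can be computed from a confusion matrix as in the minimal sketch below. The function name `accuracy_and_ppv` is hypothetical, and the abstract does not give the confusion-matrix counts behind its percentages, so none are reproduced here.

```python
def accuracy_and_ppv(tp, fp, tn, fn):
    """Return (accuracy, PPV) from confusion-matrix counts.

    accuracy = (TP + TN) / all cases
    PPV      = TP / (TP + FP), undefined (NaN) if nothing is predicted positive
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    predicted_positive = tp + fp
    ppv = tp / predicted_positive if predicted_positive else float("nan")
    return accuracy, ppv
```

As an illustration of why PPV is reported alongside accuracy on imbalanced data: if the 9657-patient cohort were the evaluation set for AS (an assumption of this note), a trivial rule that labels no one as an AS case would give `accuracy_and_ppv(0, 0, 9453, 204)`, an accuracy of about 97.9% with an undefined PPV, even though it identifies no patients at all.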

14.

Harnessing the Power of Machine Learning in Dementia Informatics Research: Issues, Opportunities, and Challenges.

Tsang, Gavin; Xie, Xianghua; Zhou, Shang-Ming.

IEEE Rev Biomed Eng ; 13: 113-129, 2020.

Article in English | MEDLINE | ID: mdl-30872241

ABSTRACT

Dementia is a chronic and degenerative condition affecting millions globally. The care of patients with dementia presents an ever-continuing challenge to healthcare systems in the 21st century. Medical and health sciences have generated unprecedented volumes of data related to health and wellbeing for patients with dementia due to advances in information technology, such as genetics, neuroimaging, cognitive assessment, free texts, routine electronic health records, etc. Making the best use of these diverse and strategic resources will lead to high-quality care of patients with dementia. As such, machine learning becomes a crucial factor in achieving this objective. The aim of this paper is to provide a state-of-the-art review of machine learning methods applied to health informatics for dementia care. We collate and review the existing scientific methodologies and identify the relevant issues and challenges when faced with big health data. Machine learning has demonstrated promising applications to neuroimaging data analysis for dementia care, while relatively less effort has been made to make use of integrated heterogeneous data via advanced machine learning approaches. We further indicate future potential and research directions in applying advanced machine learning, such as deep learning, to dementia informatics.

Subject(s)

Dementia , Machine Learning , Medical Informatics , Biomedical Research , Dementia/diagnostic imaging , Dementia/therapy , Humans , Mental Status and Dementia Tests , Natural Language Processing , Neuroimaging

15.

Response to Fujita et al.

Brophy, Sinead; Jones, Kerina H; Rahman, Muhammad A; Zhou, Shang-Ming; John, Ann; Atkinson, Mark D; Francis, Nick; Lyons, Ronan A; Dunstan, Frank.

Am J Gastroenterol ; 109(1): 138-9, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24402542

Subject(s)

Campylobacter Infections/epidemiology , Proton Pump Inhibitors , Salmonella Infections/epidemiology , Female , Humans , Male

16.

Predictors of objectively measured physical activity in 12-month-old infants: A study of linked birth cohort data with electronic health records.

Raza, Haider; Zhou, Shang-Ming; Todd, Charlotte; Christian, Danielle; Marchant, Emily; Morgan, Kelly; Khanom, Ashrafunnesa; Hill, Rebecca; Lyons, Ronan A; Brophy, Sinead.

Pediatr Obes ; 14(7): e12512, 2019 07.

Article in English | MEDLINE | ID: mdl-30729733

ABSTRACT

BACKGROUND: Physical activity (PA) levels are associated with long-term health, and levels of PA when young are predictive of adult activity levels. OBJECTIVES: This study examines factors associated with PA levels in 12-month infants. METHOD: One hundred forty-one mother-infant pairs were recruited via a longitudinal birth cohort study (April 2010 to March 2013). The PA level was collected using accelerometers and linked to postnatal notes and electronic medical records via the Secure Anonymised Information Linkage databank. Univariable and multivariable linear regressions were used to examine the factors associated with PA levels. RESULTS: Using univariable analysis, higher PA was associated with the following (P value less than 0.05): being male, larger infant size, healthy maternal blood pressure levels, full-term gestation period, higher consumption of vegetables (infant), lower consumption of juice (infant), low consumption of adult crisps (infant), longer breastfeeding duration, and more movement during sleep (infant) but fewer night wakings. Combined into a multivariable regression model (R2 = 0.654), all factors remained significant, showing lower PA levels were associated with female gender, smaller infant, preterm birth, higher maternal blood pressure, low vegetable consumption, high crisp consumption, and less night movement. CONCLUSION: The PA levels of infants were strongly associated with both gestational and postnatal environmental factors. Healthy behaviours appear to cluster, and a healthy diet was associated with a more active infant. Boys were substantially more active than girls, even at age 12 months. These findings can help inform interventions to promote healthier lives for infants and to understand the determinants of their PA levels.

Subject(s)

Electronic Health Records; Exercise; Adult; Body Weight; Cohort Studies; Diet; Female; Health Behavior; Humans; Infant; Male; Pregnancy

17.

Learning Differentially Expressed Gene Pairs in Microarray Data.

Xia, Xiao-Lei; Brophy, Sinead; Zhou, Shang-Ming.

Stud Health Technol Inform ; 235: 191-195, 2017.

Article in English | MEDLINE | ID: mdl-28423781

ABSTRACT

To identify differentially expressed genes (DEGs) in the analysis of microarray data, a majority of existing filter methods rank genes individually. Such a paradigm could overlook genes with trivial individual discriminant power but significant discriminative power in combination. This paper proposed an impurity metric in which the number of split intervals for each feature is considered as a parameter to be optimized for gaining maximal discrimination. The proposed method was first evaluated by applying it to a synthesized noisy rectangular grid dataset, in which the significant feature pair that forms a rectangular grid pattern was successfully recognized. Furthermore, when applied to the identification of DEGs in colon microarray data, the proposed method demonstrated that it could become an alternative to Fisher's test for the prescreening of genes, which led to better performance of the SVM-RFE method.

Subject(s)

Gene Expression Profiling/methods; Microarray Analysis; Algorithms; Machine Learning; Oligonucleotide Array Sequence Analysis/methods; Pattern Recognition, Automated

18.

Defining Disease Phenotypes in Primary Care Electronic Health Records by a Machine Learning Approach: A Case Study in Identifying Rheumatoid Arthritis.

Zhou, Shang-Ming; Fernandez-Gutierrez, Fabiola; Kennedy, Jonathan; Cooksey, Roxanne; Atkinson, Mark; Denaxas, Spiros; Siebert, Stefan; Dixon, William G; O'Neill, Terence W; Choy, Ernest; Sudlow, Cathie; Brophy, Sinead.

PLoS One ; 11(5): e0154515, 2016.

Article in English | MEDLINE | ID: mdl-27135409

ABSTRACT

OBJECTIVES: 1) To use a data-driven method to examine clinical codes (risk factors) of a medical condition in primary care electronic health records (EHRs) that can accurately predict a diagnosis of the condition in secondary care EHRs. 2) To develop and validate a disease phenotyping algorithm for rheumatoid arthritis (RA) using primary care EHRs. METHODS: This study linked routine primary and secondary care EHRs in Wales, UK. A machine learning based scheme was used to identify patients with rheumatoid arthritis from primary care EHRs via the following steps: i) selection of variables by comparing relative frequencies of Read codes in the primary care dataset between disease cases and non-disease controls (disease/non-disease based on the secondary care diagnosis); ii) reduction of predictors/associated variables using a Random Forest method; iii) induction of decision rules from a decision tree model. The proposed method was then extensively validated on an independent dataset, and compared for performance with two existing deterministic algorithms for RA which had been developed using expert clinical knowledge. RESULTS: Primary care EHRs were available for 2,238,360 patients over the age of 16 and of these 20,667 were also linked in the secondary care rheumatology clinical system. In the linked dataset, 900 predictors (out of a total of 43,100 variables) in the primary care record were discovered more frequently in those with versus those without RA. These variables were reduced to 37 groups of related clinical codes, which were used to develop a decision tree model. The final algorithm identified 8 predictors related to diagnostic codes for RA, medication codes, such as those for disease modifying anti-rheumatic drugs, and absence of alternative diagnoses such as psoriatic arthritis. The proposed data-driven method performed as well as the expert clinical knowledge based methods. CONCLUSION: Data-driven schemes, such as ensemble machine learning methods, have the potential to identify the most informative predictors in a cost-effective and rapid way to accurately and reliably classify rheumatoid arthritis or other complex medical conditions in primary care EHRs.

Subject(s)

Electronic Health Records; Machine Learning; Algorithms; Antirheumatic Agents/therapeutic use; Arthritis, Rheumatoid/drug therapy; Humans; Primary Health Care

19.

Classification of accelerometer wear and non-wear events in seconds for monitoring free-living physical activity.

Zhou, Shang-Ming; Hill, Rebecca A; Morgan, Kelly; Stratton, Gareth; Gravenor, Mike B; Bijlsma, Gunnar; Brophy, Sinead.

BMJ Open ; 5(5): e007447, 2015 May 11.

Article in English | MEDLINE | ID: mdl-25968000

ABSTRACT

OBJECTIVE: To classify wear and non-wear time of accelerometer data for accurately quantifying physical activity in public health or population level research. DESIGN: A bi-moving-window-based approach was used to combine acceleration and skin temperature data to identify wear and non-wear time events in triaxial accelerometer data that monitor physical activity. SETTING: Local residents in Swansea, Wales, UK. PARTICIPANTS: 50 participants aged under 16 years (n=23) and over 17 years (n=27) were recruited in two phases: phase 1: design of the wear/non-wear algorithm (n=20) and phase 2: validation of the algorithm (n=30). METHODS: Participants wore a triaxial accelerometer (GeneActiv) against the skin surface on the wrist (adults) or ankle (children). Participants kept a diary to record the timings of wear and non-wear and were asked to ensure that events of wear/non-wear last for a minimum of 15 min. RESULTS: The overall sensitivity of the proposed method was 0.94 (95% CI 0.90 to 0.98) and specificity 0.91 (95% CI 0.88 to 0.94). It performed equally well for children compared with adults, and females compared with males. Using surface skin temperature data in combination with acceleration data significantly improved the classification of wear/non-wear time when compared with methods that used acceleration data only (p<0.01). CONCLUSIONS: Using either accelerometer seismic information or temperature information alone is prone to considerable error. Combining both sources of data can give accurate estimates of non-wear periods thus giving better classification of sedentary behaviour. This method can be used in population studies of physical activity in free-living environments.

Subject(s)

Accelerometry/methods; Exercise; Monitoring, Ambulatory/methods; Sedentary Behavior; Acceleration; Adolescent; Adult; Algorithms; Ankle; Body Temperature; Child; Female; Humans; Male; Motor Activity; Skin; Wales; Wrist; Young Adult

20.

Local modelling techniques for assessing micro-level impacts of risk factors in complex data: understanding health and socioeconomic inequalities in childhood educational attainments.

Zhou, Shang-Ming; Lyons, Ronan A; Bodger, Owen G; John, Ann; Brunt, Huw; Jones, Kerina; Gravenor, Mike B; Brophy, Sinead.

PLoS One ; 9(11): e113592, 2014.

Article in English | MEDLINE | ID: mdl-25409038

ABSTRACT

Although inequalities in health and socioeconomic status have an important influence on childhood educational performance, the interactions between these multiple factors relating to variation in educational outcomes at micro-level are unknown, and how to evaluate the many possible interactions of these factors is not well established. This paper aims to examine multi-dimensional deprivation factors and their impact on childhood educational outcomes at micro-level, focusing on geographic areas having widely different disparity patterns, in which each area is characterised by six deprivation domains (Income, Health, Geographical Access to Services, Housing, Physical Environment, and Community Safety). Traditional health statistical studies tend to use one global model to describe the whole population for macro-analysis. In this paper, we combine linked educational and deprivation data across small areas (median population of 1500), then use a local modelling technique, the Takagi-Sugeno fuzzy system, to predict area educational outcomes at ages 7 and 11. We define two new metrics, "Micro-impact of Domain" and "Contribution of Domain", to quantify the variations of local impacts of multidimensional factors on educational outcomes across small areas. The two metrics highlight differing priorities. Our study reveals complex multi-way interactions between the deprivation domains, which could not be provided by traditional health statistical methods based on a single global model. We demonstrate that although Income has an expected central role, all domains contribute, and in some areas Health, Environment, Access to Services, Housing and Community Safety each could be the dominant factor. Thus the relative importance of health and socioeconomic factors varies considerably for different areas, depending on the levels of each of the other factors, and therefore each component of deprivation must be considered as part of a wider system. Childhood educational achievement could benefit from policies and intervention strategies that are tailored to the local geographic areas' profiles.

Subject(s)

Health Status; Models, Theoretical; Socioeconomic Factors; Child; Databases, Factual; Housing; Humans; Income; Risk Factors; Social Environment
