Results 1 - 10 of 10
1.
BMC Med Inform Decis Mak ; 24(1): 170, 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38886772

ABSTRACT

BACKGROUND: Artificial intelligence (AI) has become a pivotal tool in advancing contemporary personalised medicine, with the goal of tailoring treatments to individual patient conditions. This has heightened the demand for access to diverse data from clinical practice and daily life for research, posing challenges due to the sensitive nature of medical information, including genetics and health conditions. Regulations such as the Health Insurance Portability and Accountability Act (HIPAA) in the U.S. and the General Data Protection Regulation (GDPR) in Europe aim to strike a balance between data security, privacy, and the imperative for access.

RESULTS: We present the Gemelli Generator - Real World Data (GEN-RWD) Sandbox, a modular multi-agent platform designed for distributed analytics in healthcare. Its primary objective is to empower external researchers to leverage hospital data while upholding privacy and ownership, obviating the need for direct data sharing. Docker compatibility adds an extra layer of flexibility, and scalability is ensured by the modular design, which allows Proxy and Processor modules to be combined with various graphical interfaces. Security and reliability are reinforced through components such as an Identity and Access Management (IAM) agent and a blockchain-based notarisation module. Certification processes verify the identities of information senders and receivers.

CONCLUSIONS: The GEN-RWD Sandbox architecture achieves a good level of usability while ensuring a blend of flexibility, scalability, and security. Featuring a user-friendly graphical interface catering to diverse levels of technical expertise, its external accessibility enables personnel outside the hospital to use the platform. Overall, the GEN-RWD Sandbox emerges as a comprehensive solution for distributed healthcare analytics, maintaining a delicate equilibrium between accessibility, scalability, and security.
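To illustrate the proxy/processor pattern this abstract describes, here is a minimal, self-contained sketch of distributed analytics without data sharing. All class and method names are hypothetical and do not reflect the actual GEN-RWD Sandbox API; the point is that raw records stay inside the hospital perimeter and only aggregates cross it.

    # Hypothetical sketch of the proxy/processor pattern: the researcher
    # never sees row-level data, only aggregate results computed locally.
    from dataclasses import dataclass
    from statistics import mean

    @dataclass
    class AnalysisRequest:
        requester_id: str   # identity checked by an IAM-like component
        query: str          # name of a pre-approved aggregate analysis

    class Processor:
        """Runs approved analyses on local data; raw records never leave."""
        def __init__(self, records):
            self._records = records  # e.g. list of dicts with patient data

        def run(self, query):
            if query == "mean_age":
                return {"mean_age": mean(r["age"] for r in self._records)}
            raise ValueError(f"Analysis '{query}' is not approved")

    class Proxy:
        """Validates identity and forwards only the aggregate result."""
        def __init__(self, processor, allowed_ids):
            self._processor = processor
            self._allowed_ids = allowed_ids

        def handle(self, request):
            if request.requester_id not in self._allowed_ids:
                raise PermissionError("Unknown requester")
            return self._processor.run(request.query)

    processor = Processor([{"age": 71}, {"age": 64}, {"age": 58}])
    proxy = Proxy(processor, allowed_ids={"researcher-42"})
    print(proxy.handle(AnalysisRequest("researcher-42", "mean_age")))
    # {'mean_age': 64.33...}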


Subject(s)
Computer Security , Confidentiality , Humans , Computer Security/standards , Confidentiality/standards , Artificial Intelligence , Hospitals
2.
Sci Rep ; 14(1): 11514, 2024 05 20.
Article in English | MEDLINE | ID: mdl-38769364

ABSTRACT

Comorbidity is widespread in the ageing population, implying multiple and complex medical needs for individuals and a public health burden. Determining risk factors and predicting comorbidity development can help identify at-risk subjects and design prevention strategies. Using socio-demographic and clinical data from approximately 11,000 subjects monitored over 11 years in the English Longitudinal Study of Ageing, we develop a dynamic Bayesian network (DBN) to model the onset and interaction of three cardio-metabolic comorbidities, namely type 2 diabetes (T2D), hypertension, and heart problems. The DBN allows us to identify risk factors for developing each morbidity, simulate ageing progression over time, and stratify the population based on the risk of outcome occurrence. By applying hierarchical agglomerative clustering to the simulated, dynamic risk of experiencing morbidities, we identify patients with similar risk patterns and the variables that contribute to discriminating them. The network reveals a direct joint effect of biomarkers and lifestyle on outcomes over time, such as the impact of fasting glucose, HbA1c, and BMI on T2D development. Mediated cross-relationships between comorbidities also emerge, showcasing the interconnected nature of these health issues. The model shows good calibration and discrimination ability, particularly in predicting the onset of T2D (iAUC-ROC = 0.828, iAUC-PR = 0.294) and survival (iAUC-ROC = 0.827, iAUC-PR = 0.311). Stratification analysis unveils two distinct clusters for all comorbidities, effectively discriminated by variables like HbA1c for T2D and age at baseline for heart problems. The developed DBN constitutes an effective, highly explainable predictive risk tool for simulating and stratifying the dynamic risk of developing cardio-metabolic comorbidities. Its use could help identify the effects of risk factors and develop health policies that prevent the occurrence of comorbidities.
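A minimal sketch of the stratification step described above: hierarchical agglomerative clustering applied to per-subject risk trajectories. The trajectories here are synthetic stand-ins for the risks a fitted DBN would simulate; two latent risk groups are planted so the clustering has something to find.

    # Sketch: Ward-linkage agglomerative clustering of simulated dynamic
    # risk trajectories. The trajectories are synthetic, for illustration.
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    rng = np.random.default_rng(0)
    n_waves = 6  # biennial waves over ~11 years, as in ELSA

    low = np.cumsum(rng.uniform(0.00, 0.02, size=(100, n_waves)), axis=1)
    high = 0.2 + np.cumsum(rng.uniform(0.02, 0.06, size=(100, n_waves)), axis=1)
    trajectories = np.vstack([low, high])  # (200, 6) risk-over-time matrix

    Z = linkage(trajectories, method="ward")          # build the dendrogram
    labels = fcluster(Z, t=2, criterion="maxclust")   # cut into 2 clusters

    for k in (1, 2):
        print(f"cluster {k}: n={np.sum(labels == k)}, "
              f"final risk={trajectories[labels == k, -1].mean():.2f}")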


Subject(s)
Aging , Bayes Theorem , Comorbidity , Diabetes Mellitus, Type 2 , Models, Statistical , Humans , Diabetes Mellitus, Type 2/epidemiology , Female , Male , Aged , Middle Aged , Longitudinal Studies , Risk Factors , Hypertension/epidemiology , Adult , Aged, 80 and over , Heart Diseases/epidemiology
3.
Artif Intell Med ; 142: 102588, 2023 08.
Article in English | MEDLINE | ID: mdl-37316101

ABSTRACT

BACKGROUND: Amyotrophic Lateral Sclerosis (ALS) is a fatal neurodegenerative disorder characterised by the progressive loss of motor neurons in the brain and spinal cord. The fact that ALS's disease course is highly heterogeneous, and its determinants not fully known, combined with ALS's relatively low prevalence, renders the successful application of artificial intelligence (AI) techniques particularly arduous.

OBJECTIVE: This systematic review aims to identify areas of agreement and unanswered questions regarding two notable applications of AI in ALS, namely the automatic, data-driven stratification of patients according to their phenotype, and the prediction of ALS progression. Unlike previous works, this review focuses on the methodological landscape of AI in ALS.

METHODS: We conducted a systematic search of the Scopus and PubMed databases, looking for studies on data-driven stratification methods based on unsupervised techniques resulting in (A) automatic group discovery or (B) a transformation of the feature space allowing patient subgroups to be identified; and for studies on internally or externally validated methods for the prediction of ALS progression. We described the selected studies according to the following characteristics, when applicable: variables used, methodology, splitting criteria and number of groups, prediction outcomes, validation schemes, and metrics.

RESULTS: Of the initial 1604 unique reports (2837 combined hits between Scopus and PubMed), 239 were selected for thorough screening, leading to the inclusion of 15 studies on patient stratification, 28 on prediction of ALS progression, and 6 on both stratification and prediction. In terms of variables used, most stratification and prediction studies included demographics and features derived from the ALSFRS or ALSFRS-R scores, which were also the main prediction targets. The most represented stratification methods were K-means clustering, hierarchical clustering, and expectation-maximisation clustering, while random forests, logistic regression, the Cox proportional hazards model, and various flavours of deep learning were the most widely used prediction methods. Predictive model validation was, perhaps unexpectedly, quite rarely performed in absolute terms (its absence led to the exclusion of 78 otherwise eligible studies), with the overwhelming majority of included studies resorting to internal validation only.

CONCLUSION: This systematic review highlighted a general agreement in terms of input variable selection for both stratification and prediction of ALS progression, and in terms of prediction targets. A striking lack of validated models emerged, as well as a general difficulty in reproducing many published studies, mainly due to the absence of the corresponding parameter lists. While deep learning seems promising for prediction applications, its superiority with respect to traditional methods has not been established; there is, instead, ample room for its application in the subfield of patient stratification. Finally, an open question remains on the role of new environmental and behavioural variables collected via novel, real-time sensors.


Subject(s)
Amyotrophic Lateral Sclerosis , Humans , Amyotrophic Lateral Sclerosis/diagnosis , Artificial Intelligence , Brain , Cluster Analysis , Databases, Factual
4.
BMC Med Inform Decis Mak ; 22(Suppl 6): 346, 2023 02 02.
Article in English | MEDLINE | ID: mdl-36732801

ABSTRACT

BACKGROUND: Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease whose spreading and progression mechanisms are still unclear. The ability to predict ALS prognosis would improve the patients' quality of life and support clinicians in planning treatments. In this paper, we investigate ALS evolution trajectories using Process Mining (PM) techniques, enhanced both to mine processes easily and to reveal automatically how the pathways differ according to patients' characteristics.

METHODS: We consider data from two distinct sources, namely the Pooled Resource Open-Access ALS Clinical Trials (PRO-ACT) dataset and a real-world clinical register (ALS-BS) including data on patients followed up at two tertiary clinical centers in Brescia (Italy). With a focus on the functional abilities progressively impaired as the disease progresses, we use two Process Discovery methods, namely the Directly-Follows Graph and the CareFlow Miner, to mine the population disease trajectories on the PRO-ACT dataset (a minimal sketch of the former follows this abstract). We characterise the impairment trajectories in terms of patterns, timing, and probabilities, and investigate the effect of some patients' characteristics at onset on the followed paths. Finally, we perform a comparative study of the impairment trajectories mined in PRO-ACT versus ALS-BS.

RESULTS: We delineate the progression pathways on PRO-ACT, identifying the predominant disabilities at different stages of the disease: for instance, 85% of patients enter the trials without disabilities, and 48% of them experience the impairment of Walking/Self-care abilities first. We then show that a spinal onset increases the risk of losing Walking/Self-care abilities first (developed as the first impairment by 52% vs. 27% of patients in the spinal vs. bulbar cohorts, respectively), and that an older age at onset corresponds to a more rapid progression to death. When compared, the PRO-ACT and ALS-BS patient populations present some similarities in the natural progression of the disease, as well as some differences in the observed trajectories, plausibly due to trial scheduling and recruitment criteria.

CONCLUSIONS: We exploited PM to provide an overview of the evolution scenarios of an ALS trial population and to perform a preliminary comparison with the progression observed in a clinical cohort. Future work will focus on further improving the understanding of the disease progression mechanisms, by including additional real-world subjects as well as by extending the set of events considered in the impairment trajectories.
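A Directly-Follows Graph can be sketched in a few lines: count how often one event directly follows another across patient traces. The toy traces below are invented for illustration and are not PRO-ACT data.

    # Sketch of a Directly-Follows Graph: tally consecutive event pairs
    # across per-patient traces of impairment events (toy data).
    from collections import Counter

    traces = [
        ["NoDisability", "Walking/Self-care", "Swallowing", "Death"],
        ["NoDisability", "Walking/Self-care", "Death"],
        ["NoDisability", "Communicating", "Swallowing", "Death"],
    ]

    dfg = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):  # consecutive event pairs
            dfg[(a, b)] += 1

    for (a, b), n in dfg.most_common():
        print(f"{a} -> {b}: {n}")
    # NoDisability -> Walking/Self-care: 2, Swallowing -> Death: 2, ...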


Subject(s)
Amyotrophic Lateral Sclerosis , Neurodegenerative Diseases , Humans , Amyotrophic Lateral Sclerosis/therapy , Disease Progression , Quality of Life , Prognosis
5.
PLoS Comput Biol ; 18(12): e1010718, 2022 12.
Article in English | MEDLINE | ID: mdl-36520712

ABSTRACT

Applying computational statistics or machine learning methods to data is a key component of many scientific studies, in any field, but alone it might not be sufficient to generate robust and reliable outcomes. Before applying any discovery method, preprocessing steps are necessary to prepare the data for computational analysis. In this framework, data cleaning and feature engineering are key pillars of any scientific study involving data analysis, and they should be adequately designed and performed from the first phases of the project. We call a "feature" a variable describing a particular trait of a person or an observation, usually recorded as a column in a dataset. Although pivotal, these data cleaning and feature engineering steps are sometimes done poorly or inefficiently, especially by beginners and inexperienced researchers. For this reason, we propose our quick tips for data cleaning and feature engineering: guidance on how to carry out these important preprocessing steps correctly while avoiding common mistakes and pitfalls. Although we designed these guidelines with bioinformatics and health informatics scenarios in mind, we believe they can be applied more generally to any scientific area. We therefore address these guidelines to any researcher or practitioner wanting to perform data cleaning or feature engineering. We believe our simple recommendations can help researchers and scholars perform better computational analyses that can lead, in turn, to more solid outcomes and more reliable discoveries.
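A minimal sketch of the kind of cleaning and feature-engineering steps such tips address, using pandas; the column names and sentinel value are invented for illustration.

    # Sketch: normalise categories, convert sentinels to missing values,
    # impute, and derive new features. Toy data, hypothetical columns.
    import numpy as np
    import pandas as pd

    df = pd.DataFrame({
        "age": [54, 61, np.nan, 47],
        "sex": ["F", "m", "M", "f"],
        "glucose_mg_dl": [92, 110, 105, 9999],  # 9999 = missing sentinel
    })

    # Cleaning: unify category spelling, turn sentinels into real NaNs
    df["sex"] = df["sex"].str.upper()
    df["glucose_mg_dl"] = df["glucose_mg_dl"].replace(9999, np.nan)

    # Simple median imputation -- any such choice should be documented
    df["age"] = df["age"].fillna(df["age"].median())

    # Feature engineering: binary encoding and a derived clinical flag
    df["sex_male"] = (df["sex"] == "M").astype(int)
    df["hyperglycaemia"] = (df["glucose_mg_dl"] > 100).astype(int)

    print(df)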


Subject(s)
Computational Biology , Machine Learning , Humans , Computational Biology/methods , Engineering
6.
Front Oncol ; 12: 1043675, 2022.
Article in English | MEDLINE | ID: mdl-36568192

ABSTRACT

During the acute phase of the COVID-19 pandemic, hospitals faced the challenge of managing patients, especially those with other comorbidities and medical needs, such as cancer patients. Here, we use Process Mining to analyze real-world therapeutic pathways after COVID-19 infection in a cohort of 1182 cancer patients of the Lausanne University Hospital. The algorithm builds trees representing sequences of coarse-grained events such as Home, Hospitalization, Intensive Care, and Death. The same trees can also show the probability of death or time-to-event statistics in each node. We introduce a new tool, called Differential Process Mining, which enables the comparison of two patient strata in each node of the tree, in terms of hits and death rate, together with a statistical significance test. We thus compare the management of COVID-19 patients with an active cancer in the first vs. the second COVID-19 wave, to quantify hospital adaptation to the pandemic. We also compare patients who had undergone systemic therapy within 1 year with the rest of the cohort, to understand the impact of an active cancer and/or its treatment on COVID-19 outcome. This study demonstrates the value of Process Mining for analyzing complex event-based real-world data and generating hypotheses on hospital resource management or clinical patient care.
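The per-node comparison behind Differential Process Mining can be illustrated with a two-proportion significance test: for one node of the tree (say, Hospitalization), compare the death rates of two strata. The counts below are hypothetical, and Fisher's exact test stands in for whatever significance test the tool actually uses.

    # Sketch: compare death rates of two strata in one node of the event
    # tree with Fisher's exact test. Counts are invented for illustration.
    from scipy.stats import fisher_exact

    # stratum A: first wave; stratum B: second wave (hypothetical counts)
    deaths_a, survived_a = 18, 82
    deaths_b, survived_b = 9, 91

    table = [[deaths_a, survived_a],
             [deaths_b, survived_b]]
    odds_ratio, p_value = fisher_exact(table)

    print(f"death rate A: {deaths_a / (deaths_a + survived_a):.0%}, "
          f"B: {deaths_b / (deaths_b + survived_b):.0%}, p = {p_value:.3f}")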

7.
J Neurol ; 269(7): 3858-3878, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35266043

ABSTRACT

OBJECTIVE: To employ Artificial Intelligence to model, predict, and simulate amyotrophic lateral sclerosis (ALS) progression over time in terms of variable interactions, functional impairments, and survival.

METHODS: We employed demographic and clinical variables, including functional scores and the utilisation of support interventions, of 3940 ALS patients from four Italian and two Israeli registers to develop a new approach based on Dynamic Bayesian Networks (DBNs) that models ALS evolution over time, in two distinct scenarios of variable availability. The method makes it possible to simulate patients' disease trajectories and predict the probability of functional impairment and survival at different time points.

RESULTS: DBNs explicitly represent the relationships between the variables and the pathways along which they influence disease progression. Several notable inter-dependencies were identified and validated by comparison with the literature. Moreover, the implemented tool allows the effect of different markers on the disease course to be assessed, reproducing the probabilistically expected clinical progressions. The tool shows high concordance between predicted and real prognosis, assessed as time to functional impairments and survival (integral of the AU-ROC in the first 36 months between 0.80 and 0.93, and between 0.84 and 0.89, for the two scenarios respectively).

CONCLUSIONS: Provided only with measurements commonly collected during the first visit, our models can predict time to the loss of independence in walking, breathing, swallowing, and communicating, as well as survival, and can be used to generate in silico patient cohorts with specific characteristics. Our tool provides a comprehensive framework to support physicians in treatment planning and clinical decision-making.
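A toy forward simulation in the spirit of a two-time-slice DBN: each month, the probability of losing an ability depends on a patient covariate (here, bulbar vs. spinal onset). The hazards are invented for illustration and are not the fitted model's parameters.

    # Sketch: forward simulation of time to one impairment under a
    # covariate-dependent monthly hazard. All probabilities hypothetical.
    import numpy as np

    rng = np.random.default_rng(1)

    def simulate(onset, months=36, n=1000):
        """Return month of swallowing impairment (or None) for n patients."""
        base_hazard = 0.04 if onset == "bulbar" else 0.015  # invented
        times = []
        for _ in range(n):
            t = next((m for m in range(1, months + 1)
                      if rng.random() < base_hazard), None)
            times.append(t)
        return times

    for onset in ("bulbar", "spinal"):
        times = simulate(onset)
        impaired = [t for t in times if t is not None]
        print(f"{onset}: {len(impaired) / len(times):.0%} impaired by month 36")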


Subject(s)
Amyotrophic Lateral Sclerosis , Amyotrophic Lateral Sclerosis/diagnosis , Artificial Intelligence , Bayes Theorem , Disease Progression , Humans , Models, Statistical
8.
Article in English | MEDLINE | ID: mdl-32932877

ABSTRACT

In the age of Evidence-Based Medicine, Clinical Guidelines (CGs) are recognized as an indispensable tool to support physicians in their daily clinical practice. Medical Informatics is expected to play a relevant role in facilitating the diffusion and adoption of CGs. However, past pioneering approaches, often fragmented across many disciplines, did not lead to solutions that are actually used in hospitals. Process Mining for Healthcare (PM4HC) is an emerging discipline gaining the interest of healthcare experts, and it seems able to deal with many important issues in representing CGs. In this position paper, we briefly describe the history and the state of the art of CGs, as well as the efforts and results of past approaches in medical informatics. We then describe PM4HC and address questions such as: How can PM4HC cope with this challenge? What role does PM4HC play, and which rules should the PM4HC scientific community adopt?


Subject(s)
Delivery of Health Care , Evidence-Based Medicine
9.
BMC Med Inform Decis Mak ; 20(Suppl 5): 174, 2020 08 20.
Article in English | MEDLINE | ID: mdl-32819346

ABSTRACT

BACKGROUND: Clinical registers constitute an invaluable resource in the medical data-driven decision-making context. Accurate machine learning and data mining approaches applied to these data can lead to faster diagnoses, the definition of tailored interventions, and improved outcome prediction. A typical issue when implementing such approaches is the almost unavoidable presence of missing values in the collected data. In this work, we propose an imputation algorithm based on a mutual information-weighted k-nearest neighbours approach, able to handle the simultaneous presence of missing information in different types of variables. We developed and validated the method on a clinical register constituted by the information collected over subsequent screening visits of a cohort of patients affected by amyotrophic lateral sclerosis.

METHODS: For each subject with missing data to be imputed, we create a feature vector constituted by the information collected over his/her first three months of visits. This vector is used as the sample in a k-nearest neighbours procedure to select, among the other patients, the ones with the most similar temporal evolution of the disease. An ad hoc similarity metric was implemented for the sample comparison, capable of handling the different nature of the data and the presence of multiple missing values, and of including the cross-information among features captured by the mutual information statistic.

RESULTS: We validated the proposed imputation method on an independent test set, comparing its performance with that of three state-of-the-art competitors, and obtained better performance. We further assessed the validity of our algorithm by comparing the performance of a survival classifier built on the data imputed with our method versus one built on the data imputed with the best-performing competitor.

CONCLUSIONS: Imputation of missing data is a crucial, and often mandatory, step when working with real-world datasets. The algorithm proposed in this work could effectively impute an amyotrophic lateral sclerosis clinical dataset, by handling the temporal and mixed-type nature of the data and by exploiting the cross-information among features. We also showed how the imputation quality can affect a machine learning task.
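A minimal sketch of mutual information-weighted k-nearest neighbours imputation: distances are computed only over features observed in both samples, weighted by precomputed mutual-information values. The weights and data below are toy values; the paper's ad hoc metric additionally handles mixed-type variables.

    # Sketch: impute one missing value from the k nearest neighbours under
    # an MI-weighted distance over jointly observed features. Toy data.
    import numpy as np

    def weighted_knn_impute(X, row, col, weights, k=2):
        """Impute X[row, col] from the k nearest neighbours with that value."""
        target = X[row]
        dists = []
        for i, other in enumerate(X):
            if i == row or np.isnan(other[col]):
                continue  # neighbour must observe the value to be imputed
            mask = ~np.isnan(target) & ~np.isnan(other)
            mask[col] = False  # never compare on the column being imputed
            if not mask.any():
                continue
            d = np.sqrt(np.sum(weights[mask] * (target[mask] - other[mask]) ** 2))
            dists.append((d, other[col]))
        nearest = sorted(dists)[:k]
        return np.mean([v for _, v in nearest])

    X = np.array([[1.0, 2.0, np.nan],
                  [1.1, 1.9, 3.0],
                  [0.9, 2.1, 2.8],
                  [5.0, 7.0, 9.0]])
    mi_weights = np.array([0.8, 0.6, 0.3])  # toy per-feature MI weights

    print(weighted_knn_impute(X, row=0, col=2, weights=mi_weights))
    # ~2.9: driven by the two rows most similar to row 0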


Subject(s)
Algorithms , Computational Biology/methods , Data Mining , Datasets as Topic , Amyotrophic Lateral Sclerosis , Bayes Theorem , Disease/classification , Humans , Information Storage and Retrieval
10.
J Healthc Inform Res ; 4(2): 174-188, 2020 Jun.
Article in English | MEDLINE | ID: mdl-35415441

ABSTRACT

The presence of missing data is a common problem that affects almost all clinical datasets. Since most available data mining and machine learning algorithms require complete datasets, accurately imputing (i.e., "filling in") the missing data is an essential step. This paper presents a methodology for imputing missing values in longitudinal clinical data, based on the integration of linear interpolation and a weighted K-Nearest Neighbours (KNN) algorithm. The Maximal Information Coefficient (MIC) values among features are employed as weights for the distance computation in the KNN algorithm, in order to integrate intra- and inter-patient information. An interpolation-based imputation approach was also tested both independently and in combination with the KNN algorithm; the final imputation is carried out by applying the best-performing method for each feature. The methodology was validated on a dataset of clinical laboratory test results covering 13 commonly measured analytes of patients in an intensive care unit (ICU) setting. The performance results are compared with those of 3D-MICE, a state-of-the-art imputation method for cross-sectional and longitudinal patient data. This work was presented in the context of the 2019 ICHI Data Analytics Challenge on Missing data Imputation (DACMI).
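A minimal sketch of the intra-patient interpolation step using pandas: interior gaps in a patient's lab series are filled linearly over charted time, while gaps interpolation cannot reach (leading or trailing) are left for the cross-patient KNN step, similar to the sketch under the previous entry. The series is invented for illustration.

    # Sketch: linear-in-time interpolation of one patient's lab series;
    # only interior gaps are filled. Toy values, hypothetical analyte.
    import numpy as np
    import pandas as pd

    # One patient's creatinine measurements over five charting times
    series = pd.Series([1.1, np.nan, np.nan, 1.7, np.nan],
                       index=[0, 6, 12, 18, 24])  # hours since admission

    interpolated = series.interpolate(method="index", limit_area="inside")
    print(interpolated)
    # 6h -> 1.3, 12h -> 1.5; the trailing 24h value stays NaN and would
    # be handled by the cross-patient KNN step.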
