Results 1 - 20 of 343
1.
Nat Rev Genet ; 23(3): 169-181, 2022 03.
Article in English | MEDLINE | ID: mdl-34837041

ABSTRACT

The scale of genetic, epigenomic, transcriptomic, cheminformatic and proteomic data available today, coupled with easy-to-use machine learning (ML) toolkits, has propelled the application of supervised learning in genomics research. However, the assumptions behind the statistical models and performance evaluations in ML software frequently are not met in biological systems. In this Review, we illustrate the impact of several common pitfalls encountered when applying supervised ML in genomics. We explore how the structure of genomics data can bias performance evaluations and predictions. To address the challenges associated with applying cutting-edge ML methods to genomics, we describe solutions and appropriate use cases where ML modelling shows great potential.
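One way the structure of genomics data can bias an evaluation is when related samples (for example, members of one gene family) land on both sides of a train/test split. The following is a minimal sketch of a group-aware k-fold split that keeps each group on one side only. The function name `group_k_fold` and the family labels are hypothetical and are not taken from the Review.

```python
from collections import defaultdict


def group_k_fold(groups, k):
    """Yield (train_idx, test_idx) pairs in which no group spans both sides."""
    members = defaultdict(list)
    for idx, g in enumerate(groups):
        members[g].append(idx)

    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest samples.
    folds = [[] for _ in range(k)]
    for g in sorted(members, key=lambda g: (-len(members[g]), str(g))):
        smallest = min(range(k), key=lambda f: len(folds[f]))
        folds[smallest].extend(members[g])

    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for j in range(k) if j != f for i in folds[j])
        yield train, test


# Hypothetical data: six samples drawn from three gene families.
groups = ["famA", "famA", "famB", "famB", "famC", "famC"]
splits = list(group_k_fold(groups, k=3))
```

A plain random split of the same six samples could put one `famA` sample in training and the other in testing, which tends to inflate the measured performance.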


Subjects
Genomics/methods, Machine Learning, Animals, Genomics/standards, Genomics/trends, Humans, Machine Learning/standards, Statistical Models, Software
8.
Semin Cell Dev Biol ; 121: 135-142, 2022 01.
Article in English | MEDLINE | ID: mdl-34446357

ABSTRACT

Assigning function to single nucleotide polymorphisms (SNPs) to understand the mechanisms that link genetic and phenotypic variation and disease is an area of intensive research that is necessary to contribute to the continuing development of precision medicine. However, despite the apparent simplicity captured in the name SNP, 'single nucleotide' changes are not easy to characterize functionally. This complexity arises from multiple features of the genome, including the fact that function is development and environment specific. As such, we are often fooled by our terminology and the underlying assumption that there is a single function for a SNP. Here we discuss some of what is known about SNPs, their functions and how we can go about characterizing them.


Subjects
Genetic Variation/genetics, Machine Learning/standards, Single Nucleotide Polymorphism/genetics, Precision Medicine/methods, Humans
9.
Crit Care ; 28(1): 180, 2024 05 28.
Article in English | MEDLINE | ID: mdl-38802973

ABSTRACT

BACKGROUND: Sepsis, an acute and potentially fatal systemic response to infection, significantly impacts global health by affecting millions annually. Prompt identification of sepsis is vital, as treatment delays lead to increased fatalities through progressive organ dysfunction. While recent studies have delved into leveraging Machine Learning (ML) for predicting sepsis, focusing on aspects such as prognosis, diagnosis, and clinical application, there remains a notable deficiency in the discourse regarding feature engineering. Specifically, the role of feature selection and extraction in enhancing model accuracy has been underexplored. OBJECTIVES: This scoping review aims to fulfill two primary objectives: to identify pivotal features for predicting sepsis across a variety of ML models, providing valuable insights for future model development, and to assess model efficacy through performance metrics including AUROC, sensitivity, and specificity. RESULTS: The analysis included 29 studies across diverse clinical settings such as Intensive Care Units (ICU), Emergency Departments, and others, encompassing 1,147,202 patients. The review highlighted the diversity in prediction strategies and timeframes. It was found that feature extraction techniques notably outperformed others in terms of sensitivity and AUROC values, thus indicating their critical role in improving sepsis prediction models. CONCLUSION: Key dynamic indicators, including vital signs and critical laboratory values, are instrumental in the early detection of sepsis. Applying feature selection methods significantly boosts model precision, with models like Random Forest and XGBoost showing promising results. Furthermore, Deep Learning (DL) models reveal unique insights, spotlighting the pivotal role of feature engineering in sepsis prediction, which could greatly benefit clinical practice.
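The three performance metrics named in the objectives can be computed directly from labels and model scores. The sketch below is illustrative only: the labels, scores and the 0.5 decision threshold are made-up values, not data from the reviewed studies, and AUROC is computed by pairwise comparison of positive and negative scores (ties count as one half).

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true, scores):
    """Probability that a random positive case scores above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = sepsis, 0 = no sepsis.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = auroc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds.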


Subjects
Machine Learning, Sepsis, Humans, Sepsis/diagnosis, Sepsis/therapy, Machine Learning/trends, Machine Learning/standards
10.
Crit Care ; 28(1): 230, 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38987802

ABSTRACT

BACKGROUND: Impaired microcirculation is a cornerstone of sepsis development and leads to reduced tissue oxygenation, influenced by fluid and catecholamine administration during treatment. Hyperspectral imaging (HSI) is a non-invasive bedside technology for visualizing physicochemical tissue characteristics. Machine learning (ML) for skin HSI might offer an automated approach for bedside microcirculation assessment, providing an individualized tissue fingerprint of critically ill patients in intensive care. The study aimed to determine if machine learning could be utilized to automatically identify regions of interest (ROIs) in the hand, thereby distinguishing between healthy individuals and critically ill patients with sepsis using HSI. METHODS: HSI raw data from 75 critically ill sepsis patients and from 30 healthy controls were recorded using TIVITA® Tissue System and analyzed using an automated ML approach. Additionally, patients were divided into two groups based on their SOFA scores for further subanalysis: less severely ill (SOFA ≤ 5) and severely ill (SOFA > 5). The analysis of the HSI raw data was fully-automated using MediaPipe for ROI detection (palm and fingertips) and feature extraction. HSI Features were statistically analyzed to highlight relevant wavelength combinations using Mann-Whitney-U test and Benjamini, Krieger, and Yekutieli (BKY) correction. In addition, Random Forest models were trained using bootstrapping, and feature importances were determined to gain insights regarding the wavelength importance for a model decision. RESULTS: An automated pipeline for generating ROIs and HSI feature extraction was successfully established. HSI raw data analysis accurately distinguished healthy controls from sepsis patients. Wavelengths at the fingertips differed in the ranges of 575-695 nm and 840-1000 nm. For the palm, significant differences were observed in the range of 925-1000 nm. 
Feature importance plots indicated relevant information in the same wavelength ranges. Combining palm and fingertip analysis provided the highest reliability, with an AUC of 0.92 to distinguish between sepsis patients and healthy controls. CONCLUSION: Based on this proof of concept, the integration of automated and standardized ROIs along with automated skin HSI analyses was able to differentiate between healthy individuals and patients with sepsis. This approach offers a reliable and objective assessment of skin microcirculation, facilitating the rapid identification of critically ill patients.
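The Mann-Whitney U test mentioned in the methods compares one feature between two groups using ranks rather than raw values. The sketch below is a simplified illustration, not the study's pipeline: it uses midranks for ties and a normal approximation for the two-sided p-value, without tie or continuity correction, and the input values are invented. The BKY multiple-testing correction is not shown.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, two-sided p) via midranks and a normal approximation."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    n1, n2 = len(x), len(y)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


# Invented feature values at a single wavelength for two groups.
controls = [1.0, 2.0, 3.0, 4.0]
patients = [5.0, 6.0, 7.0, 8.0]
u_stat, p_value = mann_whitney_u(controls, patients)
```

For samples this small an exact test would normally be preferred; the approximation is used here only to keep the example short.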


Subjects
Critical Illness, Hyperspectral Imaging, Machine Learning, Microcirculation, Humans, Machine Learning/standards, Male, Female, Microcirculation/physiology, Middle Aged, Aged, Hyperspectral Imaging/methods, Sepsis/physiopathology, Sepsis/diagnosis, Adult, Proof of Concept Study, Physiologic Monitoring/methods, Physiologic Monitoring/instrumentation
11.
Crit Care ; 28(1): 189, 2024 06 04.
Article in English | MEDLINE | ID: mdl-38834995

ABSTRACT

BACKGROUND: The aim of this retrospective cohort study was to develop and validate on multiple international datasets a real-time machine learning model able to accurately predict persistent acute kidney injury (AKI) in the intensive care unit (ICU). METHODS: We selected adult patients admitted to ICU classified as AKI stage 2 or 3 as defined by the "Kidney Disease: Improving Global Outcomes" criteria. The primary endpoint was the ability to predict AKI stage 3 lasting for at least 72 h while in the ICU. An explainable tree regressor was trained and calibrated on two tertiary, urban, academic, single-center databases and externally validated on two multi-center databases. RESULTS: A total of 7759 ICU patients were enrolled for analysis. The incidence of persistent stage 3 AKI was 11% and 6% in the development and internal validation cohorts, respectively, and 19% in the external validation cohorts. The model achieved an area under the receiver operating characteristic curve of 0.94 (95% CI 0.92-0.95) in the US external validation cohort and 0.85 (95% CI 0.83-0.88) in the Italian external validation cohort. CONCLUSIONS: A machine learning approach fed with the proper data pipeline can accurately predict onset of Persistent AKI Stage 3 during ICU patient stay in retrospective, multi-centric and international datasets. This model has the potential to improve management of AKI episodes in ICU if implemented in clinical practice.


Subjects
Acute Kidney Injury, Intensive Care Units, Machine Learning, Humans, Acute Kidney Injury/diagnosis, Acute Kidney Injury/therapy, Machine Learning/trends, Machine Learning/standards, Male, Female, Retrospective Studies, Middle Aged, Intensive Care Units/organization & administration, Intensive Care Units/statistics & numerical data, Aged, Cohort Studies, ROC Curve, Adult
18.
BMC Palliat Care ; 23(1): 124, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38769564

ABSTRACT

BACKGROUND: Ex-ante identification of the last year in life facilitates a proactive palliative approach. Machine learning models trained on electronic health records (EHR) demonstrate promising performance in cancer prognostication. However, gaps in literature include incomplete reporting of model performance, inadequate alignment of model formulation with implementation use-case, and insufficient explainability hindering trust and adoption in clinical settings. Hence, we aim to develop an explainable machine learning EHR-based model that prompts palliative care processes by predicting 365-day mortality risk among patients with advanced cancer within an outpatient setting. METHODS: Our cohort consisted of 5,926 adults diagnosed with Stage 3 or 4 solid organ cancer between July 1, 2017, and June 30, 2020 and receiving ambulatory cancer care within a tertiary center. The classification problem was modelled using Extreme Gradient Boosting (XGBoost) and aligned to our envisioned use-case: "Given a prediction point that corresponds to an outpatient cancer encounter, predict for mortality within 365-days from prediction point, using EHR data up to 365-days prior." The model was trained with 75% of the dataset (n = 39,416 outpatient encounters) and validated on a 25% hold-out dataset (n = 13,122 outpatient encounters). To explain model outputs, we used Shapley Additive Explanations (SHAP) values. Clinical characteristics, laboratory tests and treatment data were used to train the model. Performance was evaluated using area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC), while model calibration was assessed using the Brier score. RESULTS: In total, 17,149 of the 52,538 prediction points (32.6%) had a mortality event within the 365-day prediction window. The model demonstrated an AUROC of 0.861 (95% CI 0.856-0.867) and AUPRC of 0.771.
The Brier score was 0.147, indicating slight overestimations of mortality risk. Explanatory diagrams utilizing SHAP values allowed visualization of feature impacts on predictions at both the global and individual levels. CONCLUSION: Our machine learning model demonstrated good discrimination and precision-recall in predicting 365-day mortality risk among individuals with advanced cancer. It has the potential to provide personalized mortality predictions and facilitate earlier integration of palliative care.
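For readers unfamiliar with the Brier score, it is the mean squared difference between predicted probabilities and observed binary outcomes, so lower is better. The sketch below uses invented probabilities and outcomes, not the study's data. The second helper, `mean_risk_gap`, is a hypothetical addition that is not mentioned in the abstract: it compares mean predicted risk with the observed event rate, which is one simple way to see the direction of any miscalibration.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def mean_risk_gap(probs, outcomes):
    """Mean predicted risk minus observed event rate (positive = overestimation)."""
    return sum(probs) / len(probs) - sum(outcomes) / len(outcomes)


# Invented predictions for four prediction points (1 = death within 365 days).
probs = [0.9, 0.2, 0.6, 0.1]
outcomes = [1, 0, 1, 0]

score = brier_score(probs, outcomes)
gap = mean_risk_gap(probs, outcomes)
```

In this toy example the mean predicted risk (0.45) is below the observed rate (0.5), so the gap is negative.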


Subjects
Electronic Health Records, Machine Learning, Palliative Care, Humans, Machine Learning/standards, Electronic Health Records/statistics & numerical data, Palliative Care/methods, Palliative Care/standards, Palliative Care/statistics & numerical data, Male, Female, Middle Aged, Aged, Risk Assessment/methods, Neoplasms/mortality, Neoplasms/therapy, Cohort Studies, Adult, Medical Oncology/methods, Medical Oncology/standards, Aged 80 and over, Mortality/trends
19.
Brief Bioinform ; 22(1): 497-514, 2021 01 18.
Article in English | MEDLINE | ID: mdl-31982914

ABSTRACT

How to accurately estimate protein-ligand binding affinity remains a key challenge in computer-aided drug design (CADD). In many cases, it has been shown that the binding affinities predicted by classical scoring functions (SFs) do not correlate well with experimentally measured biological activities. In the past few years, machine learning (ML)-based SFs have gradually emerged as potential alternatives and outperformed classical SFs in a series of studies. In this study, to better recognize the potential of classical SFs, we have conducted a comparative assessment of 25 commonly used SFs. Accordingly, the scoring power was systematically estimated using state-of-the-art ML methods that replaced the original multiple linear regression method to refit individual energy terms. The results show that the newly developed ML-based SFs consistently performed better than classical ones. In particular, gradient boosting decision tree (GBDT) and random forest (RF) achieved the best predictions in most cases. The newly developed ML-based SFs were also tested on another benchmark modified from PDBbind v2007, and the impacts of structural and sequence similarities were evaluated. The results indicated that the superiority of the ML-based SFs could be fully guaranteed when sufficient similar targets were contained in the training set. Moreover, the effect of the combinations of features from multiple SFs was explored, and the results indicated that combining NNscore2.0 with one to four other classical SFs could yield the best scoring power. However, it was not feasible to derive a generic target-specific SF or SF combination.


Subjects
Drug Development/methods, Machine Learning/standards, Proteomics/methods, Animals, Drug Development/standards, Humans, Ligands, Protein Binding, Proteome/metabolism, Proteomics/standards
20.
Proc Natl Acad Sci U S A ; 117(38): 23393-23400, 2020 09 22.
Article in English | MEDLINE | ID: mdl-32887799

ABSTRACT

Most real-world networks are incompletely observed. Algorithms that can accurately predict which links are missing can dramatically speed up network data collection and improve network model validation. Many algorithms now exist for predicting missing links, given a partially observed network, but it has remained unknown whether a single best predictor exists, how link predictability varies across methods and networks from different domains, and how close to optimality current methods are. We answer these questions by systematically evaluating 203 individual link predictor algorithms, representing three popular families of methods, applied to a large corpus of 550 structurally diverse networks from six scientific domains. We first show that individual algorithms exhibit a broad diversity of prediction errors, such that no one predictor or family is best, or worst, across all realistic inputs. We then exploit this diversity using network-based metalearning to construct a series of "stacked" models that combine predictors into a single algorithm. Applied to a broad range of synthetic networks, for which we may analytically calculate optimal performance, these stacked models achieve optimal or nearly optimal levels of accuracy. Applied to real-world networks, stacked models are superior, but their accuracy varies strongly by domain, suggesting that link prediction may be fundamentally easier in social networks than in biological or technological networks. These results indicate that the state of the art for link prediction comes from combining individual algorithms, which can achieve nearly optimal predictions. We close with a brief discussion of limitations and opportunities for further improvements.
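To make the idea of combining link predictors concrete, the sketch below computes two classic topological scores for a candidate node pair and returns them as a feature vector of the kind a stacked model could take as input. The choice of common neighbors and the Jaccard coefficient, the function names, and the toy graph are assumptions for illustration; the abstract does not list which of the 203 predictors were used, and the meta-learner itself is not shown.

```python
def common_neighbors(adj, u, v):
    """Number of neighbors shared by nodes u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def pair_features(adj, u, v):
    """Feature vector for one candidate link; one entry per predictor."""
    return [common_neighbors(adj, u, v), jaccard(adj, u, v)]


# Toy undirected graph stored as an adjacency mapping of sets.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

# Nodes a and d are not linked but share the neighbor c.
features_ad = pair_features(adj, "a", "d")
```

A stacked model would be trained on such vectors, with labels indicating whether each held-out pair was a true link.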


Subjects
Machine Learning, Computer Neural Networks, Humans, Machine Learning/standards, Statistical Models, Predictive Value of Tests, Social Networking