Results 1 - 20 of 24
1.
Proc Natl Acad Sci U S A ; 121(24): e2318124121, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38830100

ABSTRACT

There is much excitement about the opportunity to harness the power of large language models (LLMs) when building problem-solving assistants. However, the standard methodology of evaluating LLMs relies on static pairs of inputs and outputs; this is insufficient for making an informed decision about which LLMs are best to use in an interactive setting, and how that varies by setting. Static assessment therefore limits how we understand language model capabilities. We introduce CheckMate, an adaptable prototype platform for humans to interact with and evaluate LLMs. We conduct a study with CheckMate to evaluate three language models (InstructGPT, ChatGPT, and GPT-4) as assistants in proving undergraduate-level mathematics, with a mixed cohort of participants from undergraduate students to professors of mathematics. We release the resulting interaction and rating dataset, MathConverse. By analyzing MathConverse, we derive a taxonomy of human query behaviors and uncover that despite a generally positive correlation, there are notable instances of divergence between correctness and perceived helpfulness in LLM generations, among other findings. Further, we garner a more granular understanding of GPT-4 mathematical problem-solving through a series of case studies, contributed by experienced mathematicians. We conclude with actionable takeaways for ML practitioners and mathematicians: models that communicate uncertainty, respond well to user corrections, and can provide a concise rationale for their recommendations, may constitute better assistants. Humans should inspect LLM output carefully given their current shortcomings and potential for surprising fallibility.


Subject(s)
Language, Mathematics, Problem Solving, Humans, Problem Solving/physiology, Students/psychology
2.
PLoS Comput Biol ; 19(4): e1010719, 2023 04.
Article in English | MEDLINE | ID: mdl-37058541

ABSTRACT

The computational principles adopted by the hippocampus in associative memory (AM) tasks have been one of the most studied topics in computational and theoretical neuroscience. Recent theories suggest that AM and the predictive activities of the hippocampus can be described within a unitary account, and that predictive coding underlies the computations supporting AM in the hippocampus. Following this theory, a computational model based on classical hierarchical predictive networks was proposed and was shown to perform well in various AM tasks. However, this fully hierarchical model did not incorporate recurrent connections, an architectural component of the CA3 region of the hippocampus that is crucial for AM. This makes the structure of the model inconsistent with the known connectivity of CA3 and with classical recurrent models such as Hopfield networks, which learn the covariance of inputs through their recurrent connections to perform AM. Earlier predictive coding (PC) models that learn the covariance information of inputs explicitly via recurrent connections appear to resolve these issues. Here, we show that although these models can perform AM, they do so in an implausible and numerically unstable way. Instead, we propose alternatives to these earlier covariance-learning predictive coding networks, which learn the covariance information implicitly and plausibly, and which can use dendritic structures to encode prediction errors. We show analytically that our proposed models are exactly equivalent to the earlier predictive coding models that learn covariance explicitly, and that they encounter no numerical issues when performing AM tasks in practice. We further show that our models can be combined with hierarchical predictive coding networks to model hippocampo-neocortical interactions. Our models provide a biologically plausible approach to modelling the hippocampal network, pointing to a potential computational mechanism during hippocampal memory formation and recall that employs both predictive coding and covariance learning based on the recurrent network structure of the hippocampus.
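For readers who want a concrete picture of the classical covariance-learning recurrent associative memory this abstract contrasts with, here is a minimal Hopfield-style sketch in Python: a weight matrix accumulates the covariance of stored patterns via a Hebbian outer-product rule, and retrieval runs recurrent sign dynamics from a noisy cue. Function names and parameter values are illustrative assumptions, not the paper's proposed networks.

```python
import numpy as np

def train_covariance_memory(patterns):
    """Hebbian covariance learning: W accumulates outer products of zero-mean
    patterns, with self-connections removed (classical Hopfield-style AM)."""
    X = np.asarray(patterns, dtype=float)
    X = X - X.mean(axis=0)              # learn the covariance structure of the inputs
    W = X.T @ X / len(X)
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, probe, steps=20):
    """Recurrent retrieval: iterate the sign dynamics from a noisy cue."""
    s = np.sign(probe).astype(float)
    for _ in range(steps):
        s = np.sign(W @ s)
    return s

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    memories = rng.choice([-1.0, 1.0], size=(5, 64))   # 5 binary patterns, 64 units
    W = train_covariance_memory(memories)
    noisy = memories[0].copy()
    noisy[:10] *= -1                                   # corrupt 10 entries of pattern 0
    # overlap close to 1 indicates successful retrieval (depends on pattern load)
    print("overlap:", recall(W, noisy) @ memories[0] / 64)
```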


Subject(s)
Hippocampus, Learning, Mental Recall, Classical Conditioning, Neurological Models
3.
IEEE Trans Med Imaging ; 43(1): 76-95, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37379176

ABSTRACT

Existing self-supervised medical image segmentation usually encounters the domain-shift problem (i.e., the input distribution of pre-training differs from that of fine-tuning) and/or the multimodality problem (i.e., it is based on single-modal data only and cannot exploit the rich multimodal information of medical images). To solve these problems, we propose multimodal contrastive domain sharing (Multi-ConDoS) generative adversarial networks to achieve effective multimodal contrastive self-supervised medical image segmentation. Compared to existing self-supervised approaches, Multi-ConDoS has three advantages: (i) it utilizes multimodal medical images to learn more comprehensive object features via multimodal contrastive learning; (ii) domain translation is achieved by integrating the cyclic learning strategy of CycleGAN and the cross-domain translation loss of Pix2Pix; (iii) novel domain-sharing layers are introduced to learn not only domain-specific but also domain-sharing information from the multimodal medical images. Extensive experiments on two public multimodal medical image segmentation datasets show that, with only 5% (resp., 10%) of the labeled data, Multi-ConDoS not only greatly outperforms state-of-the-art self-supervised and semi-supervised medical image segmentation baselines using the same ratio of labeled data, but also achieves similar (and sometimes even better) performance than fully supervised segmentation methods with 50% (resp., 100%) of the labeled data, demonstrating that our approach achieves superior segmentation performance with a very low labeling workload. Furthermore, ablation studies show that the above three improvements are all effective and essential for Multi-ConDoS to achieve this performance.
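As an illustration of the multimodal contrastive learning the abstract refers to, the sketch below shows a generic InfoNCE-style loss that pulls together embeddings of the same case from two imaging modalities and pushes apart embeddings of different cases. It is a standard contrastive loss written for clarity; the function name, temperature, and encoder setup are assumptions, not the Multi-ConDoS objective itself.

```python
import torch
import torch.nn.functional as F

def multimodal_info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE-style contrastive loss: the i-th sample of modality A should be
    most similar to the i-th sample of modality B (positive pair) and dissimilar
    to all other samples in the batch (negatives)."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature           # pairwise cosine similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# usage: embeddings from, e.g., a CT encoder and an MR encoder of the same cases
z_ct, z_mr = torch.randn(8, 128), torch.randn(8, 128)
print(multimodal_info_nce(z_ct, z_mr).item())
```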


Subject(s)
Computer-Assisted Image Processing, Respiratory Rate, Supervised Machine Learning
4.
Nat Neurosci ; 27(2): 348-358, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38172438

ABSTRACT

For both humans and machines, the essence of learning is to pinpoint which components of the information-processing pipeline are responsible for an error in the output, a challenge known as 'credit assignment'. It has long been assumed that credit assignment is best solved by backpropagation, which is also the foundation of modern machine learning. Here, we set out a fundamentally different principle of credit assignment called 'prospective configuration'. In prospective configuration, the network first infers the pattern of neural activity that should result from learning, and then the synaptic weights are modified to consolidate that change in neural activity. We demonstrate that this distinct mechanism, in contrast to backpropagation, (1) underlies learning in a well-established family of models of cortical circuits, (2) enables learning that is more efficient and effective in many contexts faced by biological organisms, and (3) reproduces surprising patterns of neural activity and behavior observed in diverse human and rat learning experiments.
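The following toy sketch illustrates the general infer-then-consolidate idea described above, using a two-layer predictive-coding-style network: neural activities are first relaxed toward a configuration consistent with the target, and only then are the weights updated locally toward those settled activities. It is a schematic illustration under simplifying assumptions (linear layers, a single training pair, hand-picked step sizes), not the authors' implementation.

```python
import numpy as np

def pc_step(x, y, W1, W2, infer_steps=50, lr_x=0.1, lr_w=0.01):
    """One training step of a 2-layer predictive-coding-style network.
    Inference first settles the hidden activity h toward the configuration the
    network 'should' have after learning; the weights are then updated locally
    to consolidate that settled activity."""
    h = W1 @ x                                  # initial feed-forward guess
    for _ in range(infer_steps):                # relax activities, weights fixed
        e_h = h - W1 @ x                        # prediction error at hidden layer
        e_y = y - W2 @ h                        # prediction error at output layer
        h += lr_x * (W2.T @ e_y - e_h)          # gradient descent on the total energy
    # consolidate: local, Hebbian-like weight updates toward the settled activities
    W2 += lr_w * np.outer(y - W2 @ h, h)
    W1 += lr_w * np.outer(h - W1 @ x, x)
    return W1, W2

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(16, 8)) * 0.1, rng.normal(size=(4, 16)) * 0.1
x, y = rng.normal(size=8), rng.normal(size=4)
for _ in range(100):
    W1, W2 = pc_step(x, y, W1, W2)
print("output error:", np.linalg.norm(y - W2 @ (W1 @ x)))   # shrinks over training
```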


Subject(s)
Machine Learning, Neural Networks (Computer), Humans, Rats, Animals, Prospective Studies, Neuronal Plasticity
5.
Comput Biol Med ; 168: 107744, 2024 01.
Article in English | MEDLINE | ID: mdl-38006826

ABSTRACT

Data augmentation is widely applied to medical image analysis tasks on limited datasets with imbalanced classes and insufficient annotations. However, traditional augmentation techniques cannot supply extra information, so diagnostic performance remains unsatisfactory. GAN-based generative methods have therefore been proposed to obtain additional useful information and realize more effective data augmentation; but existing generative data augmentation techniques mainly encounter two problems: (i) current generative data augmentation lacks the capability to use cross-domain differential information to extend limited datasets; (ii) existing generative methods cannot provide effective supervised information in medical image segmentation tasks. To solve these problems, we propose an attention-guided cross-domain tumor image generation model (CDA-GAN) with an information enhancement strategy. CDA-GAN can generate diverse samples to expand the scale of datasets, improving the performance of medical image diagnosis and treatment tasks. In particular, we incorporate channel attention into a CycleGAN-based cross-domain generation network that captures inter-domain information and generates positive or negative samples of brain tumors. In addition, we propose a semi-supervised spatial attention strategy to guide the spatial information of features at the pixel level in tumor generation. Furthermore, we add spectral normalization to prevent mode collapse in the discriminator and stabilize the training procedure. Finally, to resolve an inapplicability problem in the segmentation task, we further propose an application strategy that uses this data augmentation model to achieve more accurate medical image segmentation with limited data. Experimental studies on two public brain tumor datasets (BraTS and TCIA) show that the proposed CDA-GAN model greatly outperforms state-of-the-art generative data augmentation in both practical medical image classification and segmentation tasks; e.g., CDA-GAN is 0.50%, 1.72%, 2.05%, and 0.21% better than the best SOTA baseline in terms of ACC, AUC, Recall, and F1, respectively, in the classification task on BraTS, while its improvements over the best SOTA baseline in terms of Dice, Sens, HD95, and mIOU in the segmentation task on TCIA are 2.50%, 0.90%, 14.96%, and 4.18%, respectively.
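The channel attention mentioned above is commonly realized with a squeeze-and-excitation style gating block; a generic sketch is given below. The module name and reduction ratio are illustrative assumptions and are not taken from the CDA-GAN architecture.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention: global-average-pool each
    feature map, pass the pooled vector through a small bottleneck MLP, and
    rescale the channels with the resulting gates."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                        # x: (batch, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))          # per-channel gating weights
        return x * w[:, :, None, None]

feat = torch.randn(2, 32, 64, 64)
print(ChannelAttention(32)(feat).shape)          # torch.Size([2, 32, 64, 64])
```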


Subject(s)
Brain Neoplasms, Humans, Computer-Assisted Image Processing
6.
Comput Biol Med ; 169: 107877, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38157774

ABSTRACT

Although existing deep reinforcement learning-based approaches have achieved some success in image augmentation tasks, their effectiveness and adequacy for data augmentation in intelligent medical image analysis are still unsatisfactory. Therefore, we propose a novel Adaptive Sequence-length based Deep Reinforcement Learning (ASDRL) model for Automatic Data Augmentation (AutoAug) in intelligent medical image analysis. The improvements of ASDRL-AutoAug are two-fold: (i) To remedy the problem of some augmented images being invalid, we construct a more accurate reward function based on different variations of the augmentation trajectories. This reward function assesses the validity of each augmentation transformation more accurately by introducing different information about the validity of the augmented images. (ii) Then, to alleviate the problem of insufficient augmentation, we further propose a more intelligent automatic stopping mechanism (ASM). ASM feeds a stop signal to the agent automatically by judging the adequacy of image augmentation. This ensures that each transformation before stopping the augmentation can smoothly improve the model performance. Extensive experimental results on three medical image segmentation datasets show that (i) ASDRL-AutoAug greatly outperforms the state-of-the-art data augmentation methods in medical image segmentation tasks, (ii) the proposed improvements are both effective and essential for ASDRL-AutoAug to achieve superior performance, and the new reward evaluates the transformations more accurately than existing reward functions, and (iii) we also demonstrate that ASDRL-AutoAug is adaptive for different images in terms of sequence length, as well as generalizable across different segmentation models.

7.
Comput Biol Med ; 153: 106487, 2023 02.
Article in English | MEDLINE | ID: mdl-36603432

ABSTRACT

Pre-processing is widely applied in medical image analysis to remove interference information. However, existing pre-processing solutions mainly encounter two problems: (i) they rely heavily on the assistance of clinical experts, making it hard to deploy intelligent CAD systems quickly; (ii) owing to personnel and information barriers, it is difficult for different medical institutions to apply the same pre-processing operations, so a deep model that performs well at one medical institution often fails to achieve similar performance on the same task at other institutions. To overcome these problems, we propose a deep-reinforcement-learning-based, task-oriented, homogenized automatic pre-processing (DRL-HAPre) framework. This framework uses deep reinforcement learning to learn a policy network that automatically and adaptively selects the optimal pre-processing operations for input medical images according to the analysis task, thus helping intelligent CAD systems achieve rapid deployment (i.e., painless) and maintain satisfactory performance (i.e., accurate) across medical institutes. To verify the effectiveness and advantages of the proposed DRL-HAPre framework, we further develop a homogenized automatic pre-processing model based on it to realize automatic key-region selection (HAPre-KRS) in a pneumonia image classification task. Extensive experimental studies on three pediatric pneumonia classification datasets with different image qualities show that: (i) a hard-to-reproduce problem does exist in clinical practice, and differing medical image quality across institutes is an important cause of it, so a homogenized automatic pre-processing method is compelling; (ii) the proposed HAPre-KRS model and DRL-HAPre framework greatly outperform three kinds of state-of-the-art baselines (i.e., pre-processing, attention, and pneumonia baselines), and the lower the medical image quality, the greater the improvement from using HAPre-KRS and DRL-HAPre; (iii) with the help of homogenized pre-processing, HAPre-KRS (and the DRL-HAPre framework) largely avoids performance degradation in real-world cross-source applications, thus overcoming the hard-to-reproduce problem.


Subject(s)
Deep Learning, Humans, Child, Computer-Assisted Image Processing/methods
8.
Med Image Anal ; 83: 102656, 2023 01.
Article in English | MEDLINE | ID: mdl-36327656

ABSTRACT

Semi-supervised learning has great potential in medical image segmentation tasks with few labeled data, but most existing methods consider only single-modal data. The complementary information in multi-modal data can improve the performance of semi-supervised segmentation for each image modality. However, a shortcoming of most existing multi-modal solutions is that, because the processing models for the different modalities are highly coupled, multi-modal data are required not only in the training stage but also in the inference stage, which limits their use in clinical practice. Consequently, we propose a semi-supervised contrastive mutual learning (Semi-CML) segmentation framework, in which a novel area-similarity contrastive (ASC) loss leverages cross-modal information and prediction consistency between different modalities to conduct contrastive mutual learning. Although Semi-CML can improve the segmentation performance of both modalities simultaneously, there is a performance gap between the two modalities, i.e., one modality's segmentation performance is usually better than the other's. Therefore, we further develop a soft pseudo-label re-learning (PReL) scheme to remedy this gap. We conducted experiments on two public multi-modal datasets. The results show that Semi-CML with PReL greatly outperforms state-of-the-art semi-supervised segmentation methods and achieves similar (and sometimes even better) performance than fully supervised segmentation methods with 100% labeled data, while reducing the cost of data annotation by 90%. We also conducted ablation studies to evaluate the effectiveness of the ASC loss and the PReL module.


Subject(s)
Supervised Machine Learning, Humans
9.
Comput Biol Med ; 163: 107149, 2023 09.
Article in English | MEDLINE | ID: mdl-37348265

ABSTRACT

Feature pyramid networks (FPNs) are widely used in existing deep detection models to help them exploit multi-scale features. However, FPN-based deep detection models face two multi-scale feature fusion problems in medical image detection tasks: insufficient multi-scale feature fusion and equal importance being assigned to multi-scale features. Therefore, in this work we propose a new enhanced backbone model, EFPN, to overcome these problems and help existing FPN-based detection models achieve much better medical image detection performance. We first introduce an additional top-down pyramid to help the detection networks fuse deeper multi-scale information; then, a scale enhancement module is developed that uses kernels of different sizes to generate more diverse multi-scale features. Finally, we propose a feature fusion attention module to estimate and assign different importance weights to features of different depths and scales. Extensive experiments are conducted on two public lesion detection datasets for different medical image modalities (X-ray and MRI). On the mAP and mR evaluation metrics, EFPN-based Faster R-CNNs improve by 1.55% and 4.3% on the PenD (X-ray) dataset, and by 2.74% and 3.1% on the BraTS (MRI) dataset, respectively, achieving much better performance than the state-of-the-art baselines in medical image detection tasks. The proposed three improvements are all essential and effective for EFPN to achieve superior performance, and besides Faster R-CNNs, EFPN can easily be applied to other deep models to significantly enhance their performance in medical image detection tasks.
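To make the feature fusion attention idea concrete, here is a toy sketch that assigns a learned importance weight to each pyramid level and fuses the resized levels as a weighted sum. The class name, pooling choice, and weighting scheme are assumptions for illustration only, not the EFPN module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureFusionAttention(nn.Module):
    """Toy fusion attention: score each pyramid level from its globally pooled
    response, normalise the scores with a softmax, and fuse the (resized) levels
    as a weighted sum."""
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Linear(channels, 1)

    def forward(self, feats):                     # feats: list of (B, C, Hi, Wi)
        target = feats[0].shape[-2:]
        pooled = torch.stack([f.mean(dim=(2, 3)) for f in feats], dim=1)  # (B, L, C)
        w = torch.softmax(self.score(pooled).squeeze(-1), dim=1)          # (B, L)
        resized = [F.interpolate(f, size=target, mode="bilinear", align_corners=False)
                   for f in feats]
        return sum(w[:, i, None, None, None] * resized[i] for i in range(len(feats)))

p3, p4, p5 = torch.randn(1, 64, 80, 80), torch.randn(1, 64, 40, 40), torch.randn(1, 64, 20, 20)
print(FeatureFusionAttention(64)([p3, p4, p5]).shape)   # torch.Size([1, 64, 80, 80])
```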


Subject(s)
Benchmarking, Computer-Assisted Image Processing
10.
Comput Biol Med ; 160: 106963, 2023 06.
Article in English | MEDLINE | ID: mdl-37150087

ABSTRACT

Although existing deep supervised solutions have achieved great success in medical image segmentation, they have the following shortcomings: (i) the semantic difference problem: since they are produced by very different convolution or deconvolution processes, the intermediate masks and predictions in deep supervised baselines usually contain semantics of different depths, which hinders the models' learning capability; (ii) the low learning efficiency problem: additional supervision signals inevitably make training more time-consuming. Therefore, in this work we first propose two deep supervised learning strategies, U-Net-Deep and U-Net-Auto, to overcome the semantic difference problem. Then, to resolve the low learning efficiency problem, we build on these two strategies and propose a new deep supervised segmentation model, called µ-Net, which achieves not only effective but also efficient deep supervised medical image segmentation by introducing a tied-weight decoder that generates pseudo-labels with more diverse information and also speeds up convergence in training. Finally, three different types of µ-Net-based deep supervision strategies are explored, and a Similarity Principle of Deep Supervision is derived to guide future research in deep supervised learning. Experimental studies on four public benchmark datasets show that µ-Net greatly outperforms all state-of-the-art baselines, including state-of-the-art deeply supervised segmentation models, in terms of both effectiveness and efficiency. Ablation studies confirm the soundness of the proposed Similarity Principle of Deep Supervision, the necessity and effectiveness of the tied-weight decoder, and the benefit of using both the segmentation and reconstruction pseudo-labels for deep supervised learning.
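For context, the sketch below shows a textbook deep-supervision objective: the final prediction receives a full-weight segmentation loss, and each intermediate decoder output receives a down-weighted auxiliary loss. The function name and auxiliary weight are assumptions; this is not the µ-Net objective or its tied-weight decoder.

```python
import torch
import torch.nn.functional as F

def deep_supervision_loss(main_logits, aux_logits_list, target, aux_weight=0.4):
    """Generic deep supervision: full-weight loss on the final prediction plus a
    down-weighted loss on each intermediate (upsampled) decoder output."""
    loss = F.cross_entropy(main_logits, target)
    for aux in aux_logits_list:
        aux = F.interpolate(aux, size=target.shape[-2:], mode="bilinear",
                            align_corners=False)
        loss = loss + aux_weight * F.cross_entropy(aux, target)
    return loss

target = torch.randint(0, 2, (2, 128, 128))                # binary segmentation masks
main = torch.randn(2, 2, 128, 128)                         # final logits
aux = [torch.randn(2, 2, 64, 64), torch.randn(2, 2, 32, 32)]  # intermediate logits
print(deep_supervision_loss(main, aux, target).item())
```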


Subject(s)
Benchmarking, Computer-Assisted Image Processing, Semantics, Sound
11.
Front Bioeng Biotechnol ; 11: 1049555, 2023.
Article in English | MEDLINE | ID: mdl-36815901

ABSTRACT

Automatic medical image detection aims to use artificial intelligence techniques to detect lesions in medical images accurately and efficiently. It is one of the most important tasks in computer-aided diagnosis (CAD) systems and can be embedded into portable imaging devices for intelligent Point of Care (PoC) diagnostics. Feature Pyramid Network (FPN) based models are widely used deep-learning solutions for automatic medical image detection. However, FPN-based medical lesion detection models have two shortcomings: the object position offset problem and the degradation problem of IoU-based losses. Therefore, in this work we propose a novel FPN-based backbone model, Multi-Pathway Feature Pyramid Networks with Position Attention Guided Connections and Vertex Distance IoU (abbreviated as PAC-Net), to replace the vanilla FPN for more accurate lesion detection. Two innovative improvements, a position attention guided connection (PAC) module and a Vertex Distance Intersection over Union (VDIoU) loss, are proposed to address the above shortcomings of the vanilla FPN. Extensive experiments are conducted on a public medical image detection dataset, DeepLesion, and the results show that (i) PAC-Net outperforms all state-of-the-art FPN-based deep models on both evaluation metrics of lesion detection on the DeepLesion dataset, (ii) the proposed PAC module and VDIoU loss are both effective and important for PAC-Net to achieve superior performance in automatic medical image detection tasks, and (iii) the proposed VDIoU loss converges more quickly than existing IoU-based losses, making PAC-Net an accurate and also highly efficient 3D medical image detection model.

12.
Article in English | MEDLINE | ID: mdl-38145508

ABSTRACT

To reduce doctors' workload, deep-learning-based automatic medical report generation has recently attracted more and more research efforts, where deep convolutional neural networks (CNNs) are employed to encode the input images, and recurrent neural networks (RNNs) are used to decode the visual features into medical reports automatically. However, these state-of-the-art methods mainly suffer from three shortcomings: 1) incomprehensive optimization; 2) low-order and unidimensional attention; and 3) repeated generation. In this article, we propose a hybrid reinforced medical report generation method with m-linear attention and repetition penalty mechanism (HReMRG-MR) to overcome these problems. Specifically, a hybrid reward with different weights is employed to remedy the limitations of single-metric-based rewards, and a local optimal weight search algorithm is proposed to significantly reduce the complexity of searching the weights of the rewards from exponential to linear. Furthermore, we use m-linear attention modules to learn multidimensional high-order feature interactions and to achieve multimodal reasoning, while a new repetition penalty is proposed to apply penalties to repeated terms adaptively during the model's training process. Extensive experimental studies on two public benchmark datasets show that HReMRG-MR greatly outperforms the state-of-the-art baselines in terms of all metrics. The effectiveness and necessity of all components in HReMRG-MR are also proved by ablation studies. Additional experiments are further conducted and the results demonstrate that our proposed local optimal weight search algorithm can significantly reduce the search time while maintaining superior medical report generation performances.
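As background for the repetition penalty idea, the sketch below shows the familiar inference-time repetition penalty used in many text generators: logits of tokens that have already been produced are down-weighted before sampling. The paper's penalty is adaptive and applied during training, so this is only an illustrative stand-in.

```python
import torch

def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Decoding-time repetition penalty: reduce the logits of tokens that have
    already been generated so the decoder is discouraged from repeating them."""
    for tok in set(generated_ids):
        if logits[tok] > 0:
            logits[tok] /= penalty
        else:
            logits[tok] *= penalty
    return logits

logits = torch.randn(1000)                       # vocabulary of 1000 tokens
penalized = apply_repetition_penalty(logits.clone(), [3, 17, 17, 42])
print(logits[[3, 17, 42]], penalized[[3, 17, 42]])
```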

13.
IEEE J Biomed Health Inform ; 27(2): 1106-1117, 2023 02.
Article in English | MEDLINE | ID: mdl-36427286

ABSTRACT

Electronic health records (EHR) represent a holistic overview of patients' trajectories. Their increasing availability has fueled new hopes of leveraging them to develop accurate risk prediction models for a wide range of diseases. Given the complex interrelationships of medical records and patient outcomes, deep learning models have shown clear merits in achieving this goal. However, a key limitation of current studies remains their capacity to process long sequences, and long-sequence modelling and its application in the context of healthcare and EHR remain underexplored. Capturing the whole history of medical encounters is expected to lead to more accurate predictions, but the inclusion of records collected over decades and from multiple sources can inevitably exceed the receptive field of most existing deep learning architectures, which can result in missing crucial long-term dependencies. To address this gap, we present Hi-BEHRT, a hierarchical Transformer-based model that significantly expands the receptive field of Transformers and extracts associations from much longer sequences. Using a multimodal large-scale linked longitudinal EHR, Hi-BEHRT exceeds state-of-the-art deep learning models by 1% to 5% in area under the receiver operating characteristic (AUROC) curve and 1% to 8% in area under the precision-recall (AUPRC) curve on average, and by 2% to 8% (AUROC) and 2% to 11% (AUPRC) for patients with long medical histories, for 5-year heart failure, diabetes, chronic kidney disease, and stroke risk prediction. Additionally, because pretraining for hierarchical Transformers is not well established, we provide an effective end-to-end contrastive pre-training strategy for Hi-BEHRT using EHR, improving its transferability for predicting clinical events with relatively small training datasets.
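A minimal sketch of the hierarchical Transformer idea is given below: a long sequence of encounter embeddings is split into fixed windows, each window is encoded by a local Transformer and pooled into a summary token, and a global Transformer then attends across the summaries, extending the effective receptive field. Dimensions, window size, and pooling are illustrative assumptions, not the Hi-BEHRT configuration.

```python
import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    """Hierarchical encoding sketch: local Transformer per window, then a global
    Transformer over the pooled window summaries."""
    def __init__(self, d_model=64, window=32, nhead=4):
        super().__init__()
        self.window = window
        local_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        global_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.local = nn.TransformerEncoder(local_layer, num_layers=2)
        self.global_ = nn.TransformerEncoder(global_layer, num_layers=2)

    def forward(self, x):                          # x: (batch, seq_len, d_model)
        b, t, d = x.shape                          # assumes seq_len % window == 0
        x = x.reshape(b * (t // self.window), self.window, d)
        summaries = self.local(x).mean(dim=1)      # one summary token per window
        summaries = summaries.reshape(b, t // self.window, d)
        return self.global_(summaries)             # (batch, n_windows, d_model)

records = torch.randn(2, 256, 64)                  # e.g. 256 encounter embeddings per patient
print(HierarchicalEncoder()(records).shape)        # torch.Size([2, 8, 64])
```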


Subject(s)
Electronic Health Records, Heart Failure, Humans, Area Under Curve, Electric Power Supplies, ROC Curve
14.
Proc AAAI Conf Artif Intell ; 36(7): 8150-8158, 2022 Jun 28.
Article in English | MEDLINE | ID: mdl-37205168

ABSTRACT

Deep learning has redefined AI thanks to the rise of artificial neural networks, which are inspired by neuronal networks in the brain. Over the years, these interactions between AI and neuroscience have brought immense benefits to both fields, allowing neural networks to be used in a plethora of applications. Neural networks use an efficient implementation of reverse differentiation, called backpropagation (BP). This algorithm, however, is often criticized for its biological implausibility (e.g., the lack of local update rules for the parameters). Therefore, biologically plausible learning methods that rely on predictive coding (PC), a framework for describing information processing in the brain, are increasingly studied. Recent works prove that these methods can approximate BP up to a certain margin on multilayer perceptrons (MLPs), and asymptotically on any other complex model, and that zero-divergence inference learning (Z-IL), a variant of PC, is able to exactly implement BP on MLPs. However, the recent literature also shows that there is no biologically plausible method yet that can exactly replicate the weight update of BP on complex models. To fill this gap, in this paper we generalize (PC and) Z-IL by directly defining it on computational graphs, and show that it can perform exact reverse differentiation. The result is the first PC (and hence biologically plausible) algorithm that is equivalent to BP in the way it updates parameters on any neural network, providing a bridge between the interdisciplinary research of neuroscience and deep learning. Furthermore, the above results also immediately provide a novel local and parallel implementation of BP.

15.
Proc Mach Learn Res ; 162: 15561-15583, 2022 Jul.
Article in English | MEDLINE | ID: mdl-36751405

ABSTRACT

A large number of neural network models of associative memory have been proposed in the literature. These include the classical Hopfield networks (HNs), sparse distributed memories (SDMs), and, more recently, the modern continuous Hopfield networks (MCHNs), which possess close links with self-attention in machine learning. In this paper, we propose a general framework for understanding the operation of such memory networks as a sequence of three operations: similarity, separation, and projection. We derive all these memory models as instances of our general framework with differing similarity and separation functions. We extend the mathematical framework of Krotov & Hopfield (2020) to express general associative memory models using neural network dynamics with local computation, and derive a general energy function that is a Lyapunov function of the dynamics. Finally, using our framework, we empirically investigate the capacity of these associative memory models under different similarity functions, beyond the dot-product similarity measure, and demonstrate that Euclidean or Manhattan distance similarity metrics perform substantially better in practice on many tasks, enabling more robust retrieval and higher memory capacity than existing models.
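The three-operation view lends itself to a compact sketch: retrieval scores each stored pattern (similarity), sharpens the scores into attention-like weights (separation), and reads out a weighted combination of the memories (projection). With dot-product similarity and a softmax separation, this reduces to a modern continuous Hopfield update; swapping in a negative Euclidean or Manhattan distance gives the variants the abstract reports to work better. Parameter values below are illustrative.

```python
import numpy as np

def retrieve(query, memories, beta=8.0):
    """Associative memory retrieval as similarity -> separation -> projection."""
    sim = memories @ query                          # similarity: score every stored pattern
    sep = np.exp(beta * sim - np.max(beta * sim))
    sep /= sep.sum()                                # separation: sharpen scores into weights
    return sep @ memories                           # projection: read out a blended memory

rng = np.random.default_rng(1)
memories = rng.normal(size=(10, 32))                # 10 stored patterns of dimension 32
probe = memories[3] + 0.3 * rng.normal(size=32)     # noisy cue for pattern 3
out = retrieve(probe, memories)
print("nearest memory:", int(np.argmax(memories @ out)))   # expect 3
```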

16.
AMIA Jt Summits Transl Sci Proc ; 2022: 130-139, 2022.
Article in English | MEDLINE | ID: mdl-35854727

ABSTRACT

Machine learning can be used to identify relevant trajectory-shape features for improved predictive risk modeling, which can help inform decisions for individualized patient management in intensive care during COVID-19 outbreaks. We present explainable random forests to dynamically predict next-day mortality risk in COVID-19 positive and negative patients admitted to the Mount Sinai Health System between March 1st and June 8th, 2020, using patient time-series data of vitals, blood and other laboratory measurements from the previous 7 days. Three different models were assessed using time series with: 1) the most recent patient measurements, 2) summary statistics of trajectories (min/max/median/first/last/count), and 3) coefficients of cubic splines fitted to trajectories. AUROC and AUPRC with cross-validation were used to compare models. We found that the second and third models performed statistically significantly better than the first. Model interpretations are provided at the patient-specific level to inform resource allocation and patient care.
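The trajectory features used by the second and third models can be illustrated with a small helper: summary statistics (min/max/median/first/last/count) of one vital or lab trajectory, plus coefficients of a fitted curve. The cubic polynomial fit below is a rough stand-in for the cubic-spline coefficients used in the study, and the function name is an assumption.

```python
import numpy as np

def trajectory_features(values):
    """Summary-statistic features of one trajectory over the look-back window,
    plus cubic polynomial coefficients as a simple stand-in for spline coefficients."""
    v = np.asarray(values, dtype=float)
    feats = {"min": v.min(), "max": v.max(), "median": np.median(v),
             "first": v[0], "last": v[-1], "count": len(v)}
    t = np.linspace(0.0, 1.0, len(v))                     # normalized time axis
    feats.update({f"c{i}": c for i, c in enumerate(np.polyfit(t, v, deg=3))})
    return feats

# e.g. seven daily temperature readings for one patient
print(trajectory_features([98.1, 98.6, 99.4, 101.2, 100.8, 99.9, 99.5]))
```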

17.
Adv Neural Inf Process Syst ; 35: 38232-38244, 2022 Nov.
Article in English | MEDLINE | ID: mdl-37090087

ABSTRACT

Training with backpropagation (BP) in standard deep learning consists of two main steps: a forward pass that maps a data point to its prediction, and a backward pass that propagates the error of this prediction back through the network. This process is highly effective when the goal is to minimize a specific objective function. However, it does not allow training on networks with cyclic or backward connections. This is an obstacle to reaching brain-like capabilities, as the highly complex heterarchical structure of the neural connections in the neocortex is potentially fundamental to its effectiveness. In this paper, we show how predictive coding (PC), a theory of information processing in the cortex, can be used to perform inference and learning on arbitrary graph topologies. We experimentally show how this formulation, called PC graphs, can be used to flexibly perform different tasks with the same network by simply stimulating specific neurons. This enables the model to be queried on stimuli with different structures, such as partial images, images with labels, or images without labels. We conclude by investigating how the topology of the graph influences the final performance, and by comparing against simple baselines trained with BP.

18.
AMIA Jt Summits Transl Sci Proc ; 2022: 120-129, 2022.
Article in English | MEDLINE | ID: mdl-35854750

ABSTRACT

Incorporating repeated measurements of vitals and laboratory values can improve mortality risk prediction and identify key risk factors in the individualized treatment of hospitalized COVID-19 patients. In this observational study, demographic and laboratory data of all patients admitted to five hospitals of the Mount Sinai Health System, New York, with positive COVID-19 tests between March 1st and June 8th, 2020, were extracted from electronic medical records and compared between survivors and non-survivors. Next-day mortality risk was assessed using a Transformer-based model, BEHRTDAY, fitted to patient time-series data of vital signs, blood and other laboratory measurements over each patient's entire hospital stay. The study population includes 3699 COVID-19 positive patients (57% male, median age: 67). This model had a very high average precision score (0.96) and area under the receiver operating characteristic curve (0.92) for next-day mortality prediction given entire patient trajectories, and, through masking, it learnt each variable's context.

19.
IEEE J Biomed Health Inform ; 26(7): 3362-3372, 2022 07.
Article in English | MEDLINE | ID: mdl-35130176

ABSTRACT

Predicting the incidence of complex chronic conditions such as heart failure is challenging. Deep learning models applied to rich electronic health records may improve prediction but remain unexplainable, hampering their wider use in medical practice. We aimed to develop a deep-learning framework for accurate yet explainable prediction of 6-month incident heart failure (HF). Using 100,071 patients from longitudinal linked electronic health records across the U.K., we applied a novel Transformer-based risk model using all community and hospital diagnoses and medications contextualized within the age and calendar year of each patient's clinical encounter. Feature importance was investigated with an ablation analysis, comparing model performance when features were alternately removed and comparing the variability of temporal representations. A post-hoc perturbation technique was used to propagate changes in the input to the outcome for feature contribution analyses. Our model achieved an area under the receiver operating characteristic curve of 0.93 and an area under the precision-recall curve of 0.69 on internal 5-fold cross-validation and outperformed existing deep learning models. Ablation analysis indicated that medication is important for predicting HF risk and that calendar year is more important than chronological age, which was further reinforced by the temporal variability analysis. Contribution analyses identified risk factors that are closely related to HF. Many of them were consistent with existing knowledge from clinical and epidemiological research, but several new associations were revealed that had not been considered in expert-driven risk prediction models. In conclusion, the results highlight that our deep learning model, in addition to high predictive performance, can inform data-driven risk factor identification.
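The post-hoc perturbation idea can be sketched generically: nudge one input feature at a time and record how much the predicted risk changes. The example below uses a toy logistic risk model and a simple finite-difference perturbation; it is not the paper's exact perturbation scheme.

```python
import numpy as np

def perturbation_contributions(predict, x, delta=1e-2):
    """Perturb each input feature in turn and measure the change in the
    predicted risk (a finite-difference sensitivity per feature)."""
    base = predict(x)
    return np.array([predict(x + delta * np.eye(len(x))[i]) - base
                     for i in range(len(x))]) / delta

# toy risk model: logistic score over three features
w = np.array([0.8, -0.2, 1.5])
predict = lambda x: 1.0 / (1.0 + np.exp(-(w @ x)))
print(perturbation_contributions(predict, np.array([0.5, 1.0, -0.3])))
```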


Subject(s)
Deep Learning, Heart Failure, Chronic Disease, Electronic Health Records, Heart Failure/diagnosis, Heart Failure/epidemiology, Humans, Risk Factors
20.
Eur Heart J Digit Health ; 3(4): 535-547, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36710898

ABSTRACT

Aims: Deep learning has dominated predictive modelling across different fields, but in medicine it has been met with mixed reception. In clinical practice, simple statistical models and risk scores continue to inform cardiovascular disease risk predictions. This is due in part to the knowledge gap about how deep learning models perform in practice when they are subject to dynamic data shifts, a key criterion that common internal validation procedures do not address. We evaluated the performance of a novel deep learning model, BEHRT, under data shifts and compared it with several ML-based and established risk models.

Methods and results: Using linked electronic health records of 1.1 million patients across England aged at least 35 years between 1985 and 2015, we replicated three established statistical models for predicting 5-year risk of incident heart failure, stroke, and coronary heart disease. The results were compared with a widely accepted machine learning model (random forests) and a novel deep learning model (BEHRT). In addition to internal validation, we investigated how data shifts affect model discrimination and calibration. To this end, we tested the models on cohorts from (i) distinct geographical regions and (ii) different time periods. Using internal validation, the deep learning models substantially outperformed the best statistical models by 6%, 8%, and 11% in heart failure, stroke, and coronary heart disease, respectively, in terms of the area under the receiver operating characteristic curve.

Conclusion: The performance of all models declined as a result of data shifts; despite this, the deep learning models maintained the best performance in all risk prediction tasks. Updating the model with the latest information can improve discrimination, but if the prior distribution changes, the model may remain miscalibrated.
