ABSTRACT
BACKGROUND/OBJECTIVES: Pancreatic cyst management can be distilled into three separate pathways (discharge, monitoring, or surgery) based on the risk of malignant transformation. This study compares the performance of artificial intelligence (AI) models to clinical care for this task. METHODS: Two explainable boosting machine (EBM) models were developed and evaluated using either clinical features alone or clinical features plus cyst fluid molecular markers (CFMM), using a publicly available dataset of 850 cases (median age 64; 65% female) with independent training (429 cases) and holdout test (421 cases) cohorts. There were 137 cysts with no malignant potential, 114 malignant cysts, and 599 intraductal papillary mucinous neoplasms (IPMNs) and mucinous cystic neoplasms (MCNs). RESULTS: Compared with current clinical care, the EBM and EBM-CFMM models had higher accuracy for identifying patients requiring monitoring (0.88 and 0.82 vs 0.62) and surgery (0.66 and 0.82 vs 0.58), respectively. For discharge, the EBM-CFMM model had higher accuracy (0.91) than either the EBM model (0.84) or current clinical care (0.86). In the cohort of patients who underwent surgical resection, use of the EBM-CFMM model would have decreased the number of unnecessary surgeries by 59% (n = 92), increased correct surgeries by 7.5% (n = 11), increased identification of patients who require monitoring by 122% (n = 76), and increased the number of patients correctly classified for discharge by 138% (n = 18) compared to clinical care. CONCLUSIONS: EBM models had greater sensitivity and specificity for identifying the correct management than either clinical management or previous AI models. The model predictions are demonstrated to be interpretable by clinicians.
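As a minimal sketch of the kind of model described above, an explainable boosting machine can be trained with the interpret library; the feature names, data file, and split below are hypothetical stand-ins, not the published model or dataset.

```python
# Sketch: three-way cyst triage with an explainable boosting machine (EBM).
# Feature names and "cysts.csv" are hypothetical; only the technique is real.
import pandas as pd
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.metrics import accuracy_score

df = pd.read_csv("cysts.csv")  # hypothetical file of clinical features
X = df[["age", "cyst_size_mm", "main_duct_dilation", "cea_level"]]
y = df["management"]  # "discharge", "monitor", or "surgery"

train, test = df.index[:429], df.index[429:]  # independent train/holdout split
ebm = ExplainableBoostingClassifier(random_state=0)
ebm.fit(X.loc[train], y.loc[train])

print(accuracy_score(y.loc[test], ebm.predict(X.loc[test])))
# ebm.explain_global() exposes per-feature shape functions, which is what
# makes EBM predictions interpretable by clinicians.
```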
ABSTRACT
Proteomics has been revolutionized by large protein language models (PLMs), which learn unsupervised representations from large corpora of sequences. These models are typically fine-tuned in a supervised setting to adapt them to specific downstream tasks. However, the computational and memory footprint of fine-tuning (FT) large PLMs presents a barrier for many research groups with limited computational resources. Natural language processing has seen a similar explosion in model size, where these challenges have been addressed by methods for parameter-efficient fine-tuning (PEFT). In this work, we introduce this paradigm to proteomics by leveraging the parameter-efficient method LoRA and training new models for two important tasks: predicting protein-protein interactions (PPIs) and predicting the symmetry of homooligomer quaternary structures. We show that these approaches are competitive with traditional FT while requiring reduced memory and substantially fewer parameters. We additionally show that for the PPI prediction task, training only the classification head also remains competitive with full FT, using five orders of magnitude fewer parameters, and that each of these methods outperforms state-of-the-art PPI prediction methods with substantially reduced compute. We further perform a comprehensive evaluation of the hyperparameter space, demonstrate that PEFT of PLMs is robust to variations in these hyperparameters, and elucidate where best practices for PEFT in proteomics differ from those in natural language processing. All our model adaptation and evaluation code is available open-source at https://github.com/microsoft/peft_proteomics. Thus, we provide a blueprint to democratize the power of PLM adaptation to groups with limited computational resources.
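The repository linked above contains the authors' actual code; the sketch below only illustrates the LoRA mechanism on a small ESM2 checkpoint using the Hugging Face transformers and peft packages, with assumed hyperparameters and no task data.

```python
# Sketch: attaching LoRA adapters to a protein language model (ESM2).
# The rank, alpha, and target modules here are illustrative choices.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, get_peft_model

checkpoint = "facebook/esm2_t6_8M_UR50D"  # small ESM2 used for illustration
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
base = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

config = LoraConfig(
    task_type="SEQ_CLS",                 # keeps the classification head trainable
    r=8, lora_alpha=16, lora_dropout=0.1,
    target_modules=["query", "value"],   # ESM self-attention projections
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only adapters (+head) train; backbone is frozen
```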
Subject(s)
Proteomics, Proteomics/methods, Proteins/chemistry, Proteins/metabolism, Natural Language Processing, Protein Interaction Mapping/methods, Computational Biology/methods, Humans, Algorithms
ABSTRACT
The majority of proteins must form higher-order assemblies to perform their biological functions. Despite the importance of protein quaternary structure, there are few machine learning models that can accurately and rapidly predict the symmetry of assemblies involving multiple copies of the same protein chain. Here, we address this gap by training several classes of protein foundation models, including ESM-MSA, ESM2, and RoseTTAFold2, to predict homo-oligomer symmetry. Our best model, named Seq2Symm and built on ESM2, outperforms existing template-based and deep learning methods. It achieves an average PR-AUC of 0.48 and 0.44 across homo-oligomer symmetries on two different held-out test sets, compared to 0.32 and 0.23 for the template-based method. Because Seq2Symm can rapidly predict homo-oligomer symmetries from a single sequence as input (~80,000 proteins/hour), we have applied it to 5 entire proteomes and ~3.5 million unlabeled protein sequences to identify patterns in protein assembly complexity across biological kingdoms and species.
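A sketch of single-sequence symmetry prediction in the spirit of Seq2Symm follows: mean-pool ESM2 residue embeddings and classify with a small head. This is an illustrative reconstruction with a hypothetical label set, not the published architecture.

```python
# Sketch: sequence -> symmetry class via pooled ESM2 embeddings.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")
plm = AutoModel.from_pretrained("facebook/esm2_t6_8M_UR50D").eval()
head = torch.nn.Linear(plm.config.hidden_size, 7)  # hypothetical symmetry classes (C1, C2, ...)

def predict_symmetry(seq: str) -> torch.Tensor:
    inputs = tok(seq, return_tensors="pt")
    with torch.no_grad():
        residue_emb = plm(**inputs).last_hidden_state  # (1, L, hidden)
    pooled = residue_emb.mean(dim=1)                   # average over residues
    return head(pooled).softmax(dim=-1)                # class probabilities

print(predict_symmetry("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"))
```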
ABSTRACT
Differentially private (DP) synthetic datasets are a solution for sharing data while preserving the privacy of individual data providers. Understanding the effects of utilizing DP synthetic data in end-to-end machine learning pipelines is important for areas such as health care and humanitarian action, where data is scarce and regulated by restrictive privacy laws. In this work, we investigate the extent to which synthetic data can replace real tabular data in machine learning pipelines and identify the most effective synthetic data generation techniques for training and evaluating machine learning models. We systematically investigate the impacts of differentially private synthetic data on downstream classification tasks from the point of view of utility as well as fairness. Our analysis is comprehensive and includes representatives of the two main types of synthetic data generation algorithms: marginal-based and GAN-based. To the best of our knowledge, our work is the first that: (i) proposes a training and evaluation framework that does not assume that real data is available for testing the utility and fairness of machine learning models trained on synthetic data; (ii) presents the most extensive analysis of synthetic dataset generation algorithms in terms of utility and fairness when used for training machine learning models; and (iii) encompasses several different definitions of fairness. Our findings demonstrate that marginal-based synthetic data generators surpass GAN-based ones regarding model training utility for tabular data. Indeed, we show that models trained using data generated by marginal-based algorithms can exhibit similar utility to models trained using real data. Our analysis also reveals that the marginal-based synthetic data generated using the AIM and MWEM PGM algorithms can train models that simultaneously achieve utility and fairness characteristics close to those obtained by models trained with real data.
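The following is a minimal sketch of the train-on-synthetic evaluation studied above: fit a classifier on DP synthetic data, then measure utility and a simple fairness gap. The synthesizer call is elided, the column names are hypothetical, and a binary 0/1 label is assumed.

```python
# Sketch: utility and fairness of a model trained on DP synthetic data.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

synthetic = pd.read_csv("synthetic.csv")  # e.g., output of a marginal-based DP generator
holdout = pd.read_csv("holdout.csv")      # may itself be synthetic in the proposed framework

features = [c for c in synthetic.columns if c != "label"]
clf = RandomForestClassifier(random_state=0).fit(synthetic[features], synthetic["label"])

pred = clf.predict(holdout[features])  # assumed binary 0/1 labels
print("accuracy:", accuracy_score(holdout["label"], pred))

# Demographic parity gap across a (hypothetical) protected attribute column:
rates = pd.Series(pred, index=holdout.index).groupby(holdout["sex"]).mean()
print("demographic parity gap:", rates.max() - rates.min())
```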
Subject(s)
Algorithms, Health Facilities, Interior Design and Furnishings, Knowledge, Machine Learning
ABSTRACT
Importance: Deep learning image analysis often depends on large, labeled datasets, which are difficult to obtain for rare diseases. Objective: To develop a self-supervised approach for automated classification of macular telangiectasia type 2 (MacTel) on optical coherence tomography (OCT) with limited labeled data. Design, Setting, and Participants: This was a retrospective comparative study. OCT images were collected by the Lowy Medical Research Institute, La Jolla, California, from May 2014 to May 2019, and by the University of Washington, Seattle, from January 2016 to October 2022. Clinical diagnoses of patients with and without MacTel were confirmed by retina specialists. Data were analyzed from January to September 2023. Exposures: Two convolutional neural networks were pretrained using the Bootstrap Your Own Latent algorithm on unlabeled training data and fine-tuned with labeled training data to predict MacTel (self-supervised method). ResNet18 and ResNet50 models were also trained using all labeled data (supervised method). Main Outcomes and Measures: The ground truth MacTel vs no MacTel diagnosis was determined by retina specialists based on spectral-domain OCT. The models' predictions were compared against human graders using accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), area under the precision recall curve (AUPRC), and area under the receiver operating characteristic curve (AUROC). Uniform manifold approximation and projection was performed for dimension reduction, and GradCAM visualizations were generated for the supervised and self-supervised methods. Results: A total of 2636 OCT scans from 780 patients with MacTel and 131 patients without MacTel were included from the MacTel Project (mean [SD] age, 60.8 [11.7] years; 63.8% female), and another 2564 from 1769 patients without MacTel from the University of Washington (mean [SD] age, 61.2 [18.1] years; 53.4% female). The self-supervised approach fine-tuned on 100% of the labeled training data with ResNet50 as the feature extractor performed the best, achieving an AUPRC of 0.971 (95% CI, 0.969-0.972), an AUROC of 0.970 (95% CI, 0.970-0.973), accuracy of 0.898, sensitivity of 0.898, specificity of 0.949, PPV of 0.935, and NPV of 0.919. With only 419 OCT volumes (185 MacTel patients; 10% of the labeled training dataset), the ResNet18 self-supervised model achieved comparable performance, with an AUPRC of 0.958 (95% CI, 0.957-0.960), an AUROC of 0.966 (95% CI, 0.964-0.967), and accuracy, sensitivity, specificity, PPV, and NPV of 0.902, 0.884, 0.916, 0.896, and 0.906, respectively. The self-supervised models showed better agreement with the more experienced human expert graders. Conclusions and Relevance: The findings suggest that self-supervised learning may improve the accuracy of automated MacTel vs non-MacTel binary classification on OCT with limited labeled training data, and these approaches may be applicable to other rare diseases, although further research is warranted.
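A sketch of the fine-tuning stage described above: take a ResNet encoder pretrained with BYOL on unlabeled scans, attach a binary head, and train on the labeled subset. The checkpoint path is hypothetical, and the BYOL pretraining itself and data loading are omitted.

```python
# Sketch: fine-tuning a self-supervised (BYOL-pretrained) encoder for MacTel vs non-MacTel.
import torch
import torchvision

encoder = torchvision.models.resnet50(weights=None)
# Hypothetical checkpoint; assumes keys match the torchvision ResNet50 layout.
encoder.load_state_dict(torch.load("byol_pretrained_resnet50.pt"))
encoder.fc = torch.nn.Linear(encoder.fc.in_features, 2)  # binary classification head

optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    optimizer.zero_grad()
    loss = loss_fn(encoder(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```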
Subject(s)
Deep Learning, Retinal Telangiectasis, Humans, Female, Middle Aged, Male, Tomography, Optical Coherence/methods, Retrospective Studies, Rare Diseases, Retinal Telangiectasis/diagnostic imaging, Supervised Machine Learning
ABSTRACT
The sustainable management of fisheries and aquaculture requires an understanding of how these activities interact with natural fish populations. GoPro cameras were used to collect an underwater video dataset on and around shellfish aquaculture farms in an estuary in the NE Pacific from June to August 2017 and June to August 2018, to better understand habitat use by the local fish and crab communities. Images extracted from these videos were labeled to produce a dataset suitable for training computer vision models. The labeled dataset contains 77,739 images sampled from the collected video; 67,990 objects (fishes and crustaceans) have been annotated in 30,384 images (the remainder have been annotated as "empty"). The metadata also indicate whether a physical magenta filter was used during video collection to counteract reduced visibility. These data have the potential to help researchers address system-level and in-depth regional questions about ecosystem services and shellfish aquaculture interactions.
Subject(s)
Brachyura, Fishes, Animals, Aquaculture, Ecosystem, Fisheries
ABSTRACT
Proteomics has been revolutionized by large pre-trained protein language models, which learn unsupervised representations from large corpora of sequences. The parameters of these models are then fine-tuned in a supervised setting to tailor the model to a specific downstream task. However, as model size increases, the computational and memory footprint of fine-tuning becomes a barrier for many research groups. In the field of natural language processing, which has seen a similar explosion in the size of models, these challenges have been addressed by methods for parameter-efficient fine-tuning (PEFT). In this work, we bring parameter-efficient fine-tuning methods to proteomics. Using the parameter-efficient method LoRA, we train new models for two important proteomic tasks: predicting protein-protein interactions (PPI) and predicting the symmetry of homooligomers. We show that for homooligomer symmetry prediction, these approaches achieve performance competitive with traditional fine-tuning while requiring reduced memory and using three orders of magnitude fewer parameters. On the PPI prediction task, we surprisingly find that PEFT models actually outperform traditional fine-tuning while using two orders of magnitude fewer parameters. Here, we go even further to show that freezing the parameters of the language model and training only a classification head also outperforms fine-tuning, using five orders of magnitude fewer parameters, and that both of these models outperform state-of-the-art PPI prediction methods with substantially reduced compute. We also demonstrate that PEFT is robust to variations in training hyperparameters, and elucidate where best practices for PEFT in proteomics differ from those in natural language processing. Thus, we provide a blueprint to democratize the power of protein language model tuning to groups with limited computational resources.
ABSTRACT
To evaluate the generalizability of artificial intelligence (AI) algorithms that use deep learning methods to identify middle ear disease from otoscopic images, we compared internal and external performance. 1842 otoscopic images were collected from three independent sources: (a) Van, Turkey; (b) Santiago, Chile; and (c) Ohio, USA. Diagnostic categories consisted of (i) normal or (ii) abnormal. Deep learning methods were used to develop models and evaluate internal and external performance using area under the curve (AUC) estimates. A pooled assessment was performed by combining all cohorts with fivefold cross-validation. AI-otoscopy algorithms achieved high internal performance (mean AUC: 0.95, 95% CI: 0.80-1.00). However, performance was reduced when tested on external otoscopic images not used for training (mean AUC: 0.76, 95% CI: 0.61-0.91). Overall, external performance was significantly lower than internal performance (mean difference in AUC: -0.19, p ≤ 0.04). Combining cohorts achieved strong pooled performance (AUC: 0.96, standard error: 0.01). Internally applied algorithms for otoscopy performed well in identifying middle ear disease from otoscopic images, but performance was reduced when applied to new test cohorts. Further efforts are required to explore data augmentation and pre-processing techniques that might improve external performance and to develop a robust, generalizable algorithm for real-world clinical applications.
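A sketch of the internal-vs-external evaluation pattern follows: train on one cohort, report AUC on its own held-out split (internal) and on each unseen cohort (external). A logistic regression stands in for the deep model, and the cohort dictionary with (features, labels) arrays is an assumed interface.

```python
# Sketch: internal vs external AUC across independent cohorts.
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def internal_external_auc(cohorts: dict, train_name: str) -> None:
    X, y = cohorts[train_name]
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print("internal:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
    for name, (X_ext, y_ext) in cohorts.items():
        if name != train_name:  # cohorts never seen during training
            print("external", name + ":",
                  roc_auc_score(y_ext, model.predict_proba(X_ext)[:, 1]))
```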
Subject(s)
Deep Learning, Ear Diseases, Humans, Artificial Intelligence, Otoscopy/methods, Algorithms, Ear Diseases/diagnostic imaging
ABSTRACT
PURPOSE: Automatic and accurate segmentation of lesions in images of metastatic castration-resistant prostate cancer has the potential to enable personalized radiopharmaceutical therapy and advanced treatment response monitoring. The aim of this study is to develop a convolutional neural network-based framework for fully automated detection and segmentation of metastatic prostate cancer lesions in whole-body PET/CT images. METHODS: 525 whole-body PET/CT images of patients with metastatic prostate cancer were available for the study, acquired with the [18F]DCFPyL radiotracer that targets prostate-specific membrane antigen (PSMA). U-Net-based convolutional neural networks (CNNs) were trained to identify lesions on paired axial PET/CT slices. Baseline models were trained using batch-wise dice loss, as well as the proposed weighted batch-wise dice loss (wDice), and the lesion detection performance was quantified, with a particular emphasis on lesion size, intensity, and location. We used 418 images for model training, 30 for model validation, and 77 for model testing. In addition, we allowed our model to take n = 0, 2, ..., 12 neighboring axial slices to examine how incorporating greater amounts of 3D context influences model performance. We selected the number of neighboring axial slices that maximized the detection rate on the 30 validation images, and trained five neural networks with different architectures. RESULTS: Model performance was evaluated using the detection rate, Dice similarity coefficient (DSC), and sensitivity. We found that the proposed wDice loss significantly improved the lesion detection rate, lesion-wise DSC, and lesion-wise sensitivity compared to the baseline, with corresponding average increases of 0.07 (p-value = 0.01), 0.03 (p-value = 0.01), and 0.04 (p-value = 0.01), respectively. Including the first two neighboring axial slices in the input likewise increased the detection rate by 0.17, lesion-wise DSC by 0.05, and lesion-wise mean sensitivity by 0.16, whereas including more distant neighboring slices had minimal effect. We therefore used two neighboring slices and the wDice loss function to train our final model. To evaluate the model's performance, we trained three models using identical hyperparameters on three different data splits. On average, the model detected 80% of all testing lesions, with a detection rate of 93% for lesions with maximum standardized uptake values (SUVmax) greater than 5.0. In addition, the average median lesion-wise DSC on the testing set was 0.51 for all lesions and 0.60 for lesions with SUVmax > 5.0. Four additional neural networks with different architectures were trained, all of which performed better at segmenting lesions with SUVmax > 5.0 than the remaining lesions. CONCLUSION: Our results demonstrate that prostate cancer metastases in PSMA PET/CT images can be detected and segmented using CNNs. The segmentation performance depends strongly on the intensity, size, and location of lesions, and can be improved by using specialized loss functions; the models performed best at detecting lesions with SUVmax > 5.0. Accurately segmenting lesions close to the bladder remained a challenge. Future work will focus on improving the detection of lesions with lower SUV values by designing custom loss functions that take into account the lesion intensity, using additional data augmentation techniques, and reducing the number of false positive lesions by developing methods to better separate signal from noise.
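The following is a sketch of a batch-wise Dice loss with a per-sample weighting term, in the spirit of the wDice loss described above; the paper's exact weighting scheme is not reproduced, and the weights are left to the caller.

```python
# Sketch: weighted batch-wise Dice loss for binary lesion segmentation.
import torch

def weighted_batch_dice_loss(pred: torch.Tensor, target: torch.Tensor,
                             weights: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """pred: (B, H, W) probabilities (post-sigmoid); target: (B, H, W) binary
    masks; weights: (B,) per-sample weights chosen by the caller."""
    pred = pred.flatten(1)
    target = target.flatten(1)
    intersection = (weights * (pred * target).sum(dim=1)).sum()
    denom = (weights * (pred.sum(dim=1) + target.sum(dim=1))).sum()
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)
```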
Subject(s)
Positron Emission Tomography Computed Tomography, Prostatic Neoplasms, Male, Humans, Positron Emission Tomography Computed Tomography/methods, Prostatic Neoplasms/diagnostic imaging, Neural Networks, Computer, Radiopharmaceuticals
ABSTRACT
In response to the COVID-19 global pandemic, recent research has proposed creating deep learning based models that use chest radiographs (CXRs) in a variety of clinical tasks to help manage the crisis. However, existing datasets of CXRs from COVID-19+ patients are relatively small, and researchers often pool CXR data from multiple sources, for example, using different x-ray machines in various patient populations under different clinical scenarios. Deep learning models trained on such datasets have been shown to overfit to erroneous features instead of learning pulmonary characteristics, a phenomenon known as shortcut learning. We propose adding feature disentanglement to the training process. This technique forces the models to identify pulmonary features from the images and penalizes them for learning features that can discriminate between the original datasets that the images come from. We find that models trained in this way indeed have better generalization performance on unseen data; in the best case, it improved AUC by 0.13 on held-out data. We further find that this outperforms masking out non-lung parts of the CXRs and performing histogram equalization, both of which are recently proposed methods for removing biases in CXR datasets.
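One common realization of this idea is adversarial disentanglement via a gradient reversal layer: an auxiliary head tries to identify the source dataset, and reversed gradients penalize the encoder for making that possible. The sketch below assumes PyTorch and is illustrative rather than the paper's exact formulation.

```python
# Sketch: feature disentanglement with a gradient reversal layer.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)  # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None  # reversed gradient to the encoder

def disentangled_loss(encoder, disease_head, source_head,
                      x, y_disease, y_source, lam=1.0):
    features = encoder(x)
    task_loss = torch.nn.functional.cross_entropy(disease_head(features), y_disease)
    # The source head learns to identify the dataset, but the reversed gradient
    # pushes the encoder to erase dataset-identifying (shortcut) features.
    adv_loss = torch.nn.functional.cross_entropy(
        source_head(GradReverse.apply(features, lam)), y_source)
    return task_loss + adv_loss
```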
Subject(s)
COVID-19, Deep Learning, COVID-19/diagnostic imaging, Humans, Lung/diagnostic imaging, Radiography, Thoracic/methods, X-Rays
ABSTRACT
BACKGROUND: Several risk factors have been identified for severe COVID-19 disease by the scientific community. In this paper, we focus on understanding the risks for severe COVID-19 infections after vaccination (ie, in breakthrough SARS-CoV-2 infections). Studying these risks by vaccine type, age, sex, comorbidities, and any prior SARS-CoV-2 infection is important to policy makers planning further vaccination efforts. OBJECTIVE: We performed a comparative study of the risks of hospitalization (n=1140) and mortality (n=159) in a SARS-CoV-2-positive cohort of 19,815 patients who were all fully vaccinated with the Pfizer, Moderna, or Janssen vaccines. METHODS: We performed Cox regression analysis to calculate the risk factors for developing a severe breakthrough SARS-CoV-2 infection in the study cohort, controlling for vaccine type, age, sex, comorbidities, and prior SARS-CoV-2 infection. RESULTS: We found lower hazard ratios for those receiving the Moderna vaccine (P<.001) and Pfizer vaccine (P<.001) compared to those who received the Janssen vaccine, with the lowest hazard ratios for Moderna, independent of age, sex, comorbidities, and prior SARS-CoV-2 infection. Further, individuals who had a SARS-CoV-2 infection prior to vaccination had some protection from hospitalization (P=.001) and death (P=.04) over and above that already provided by the vaccines, independent of age, sex, comorbidities, and vaccine type. We found that the top statistically significant risk factors for severe breakthrough SARS-CoV-2 infections were age >50 years, male sex, moderate and severe renal failure, severe liver disease, leukemia, chronic lung disease, coagulopathy, and alcohol abuse. CONCLUSIONS: Among individuals who were fully vaccinated, the risk of severe breakthrough SARS-CoV-2 infection was lower for recipients of the Moderna or Pfizer vaccines and higher for recipients of the Janssen vaccine. These results from our analysis at a population level will be helpful to public health policy makers. Our result on the influence of a previous SARS-CoV-2 infection necessitates further research into the impact of multiple exposures on the risk of developing severe COVID-19.
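A sketch of this kind of Cox regression follows, assuming the lifelines package; the column names and file are hypothetical, with covariates mirroring those described above (vaccine type, age, sex, comorbidities, prior infection).

```python
# Sketch: Cox proportional hazards model for breakthrough hospitalization risk.
import pandas as pd
from lifelines import CoxPHFitter

df = pd.read_csv("breakthrough_cohort.csv")  # hypothetical cohort extract
covariates = ["age", "sex", "vaccine_moderna", "vaccine_pfizer",
              "prior_infection", "renal_failure", "liver_disease"]

cph = CoxPHFitter()
cph.fit(df[covariates + ["days_to_event", "hospitalized"]],
        duration_col="days_to_event", event_col="hospitalized")
cph.print_summary()  # hazard ratios with confidence intervals and p-values
```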
Subject(s)
COVID-19, Viral Vaccines, Humans, Male, COVID-19/epidemiology, COVID-19/prevention & control, SARS-CoV-2, Vaccination, Hospitalization
ABSTRACT
The rapid evolution of the novel coronavirus disease (COVID-19) pandemic has resulted in an urgent need for effective clinical tools to reduce transmission and manage severe illness. Numerous teams are quickly developing artificial intelligence approaches to these problems, including using deep learning to predict COVID-19 diagnosis and prognosis from chest computed tomography (CT) imaging data. In this work, we assess the value of aggregated chest CT data for COVID-19 prognosis compared to clinical metadata alone. We develop a novel patient-level algorithm to aggregate the chest CT volume into a 2D representation that can be easily integrated with clinical metadata to distinguish COVID-19 pneumonia from the chest CT volumes of healthy participants and participants with other viral pneumonia. Furthermore, we present a multitask model for joint segmentation of different classes of pulmonary lesions present in COVID-19 infected lungs that can outperform individual segmentation models for each task. We directly compare this multitask segmentation approach to combining feature-agnostic volumetric CT classification feature maps with clinical metadata for predicting mortality. We show that combining features derived from the chest CT volumes with clinical metadata improves the AUC to 0.80 from the 0.52 obtained by using patients' clinical data alone. These approaches enable the automated extraction of clinically relevant features from chest CT volumes for risk stratification of COVID-19 patients.
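The following sketch illustrates the general idea of collapsing a CT volume into a 2D representation and fusing it with clinical metadata; the paper's patient-level aggregation algorithm is not reproduced, and simple projections stand in for it.

```python
# Sketch: 3D CT volume -> 2D representation, fused with clinical metadata.
import numpy as np

def aggregate_volume(volume: np.ndarray) -> np.ndarray:
    """volume: (slices, H, W) HU values -> (2, H, W) stack of projections."""
    mip = volume.max(axis=0)    # maximum-intensity projection
    mean = volume.mean(axis=0)  # average projection
    return np.stack([mip, mean])

def fuse_with_metadata(image_features: np.ndarray, metadata: np.ndarray) -> np.ndarray:
    # Concatenate flattened imaging features with clinical variables
    # (age, sex, labs, ...) ahead of a downstream mortality classifier.
    return np.concatenate([image_features.ravel(), metadata])
```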
Subject(s)
COVID-19/diagnosis, COVID-19/virology, Deep Learning, SARS-CoV-2, Thorax/diagnostic imaging, Thorax/pathology, Tomography, X-Ray Computed, Algorithms, COVID-19/mortality, Databases, Genetic, Humans, Image Interpretation, Computer-Assisted/methods, Image Processing, Computer-Assisted/methods, Prognosis, Tomography, X-Ray Computed/methods, Tomography, X-Ray Computed/standards
ABSTRACT
Malnutrition is a global health crisis and a leading cause of death among children under 5 years. Detecting malnutrition requires anthropometric measurements of weight, height, and middle-upper arm circumference. However, measuring these accurately is a challenge, especially in the global south, due to limited resources. In this work, we propose a CNN-based approach to estimate the height of standing children under 5 years from depth images collected using a smartphone. According to the SMART Methodology Manual, the acceptable accuracy for height is less than 1.4 cm. Trained on 87,131 depth images, our deep learning model achieved a mean absolute error of 1.64% on 57,064 test images. For 70.3% of test images, we estimated height accurately within the acceptable 1.4 cm range. Thus, our proposed solution can accurately detect stunting (low height-for-age) in standing children below 5 years of age.
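A sketch of a CNN regressor for height from single-channel depth maps follows, assuming PyTorch; the architecture and placeholder tensors are illustrative, not the paper's exact model or preprocessing.

```python
# Sketch: depth image -> height regression with a 1-channel ResNet.
import torch
import torchvision

model = torchvision.models.resnet18(weights=None)
model.conv1 = torch.nn.Conv2d(1, 64, kernel_size=7, stride=2,
                              padding=3, bias=False)   # single-channel depth input
model.fc = torch.nn.Linear(model.fc.in_features, 1)   # regress height in cm

loss_fn = torch.nn.L1Loss()  # mean absolute error, matching the reported metric
depth = torch.randn(8, 1, 224, 224)      # placeholder batch of depth maps
height_cm = torch.rand(8, 1) * 40 + 70   # placeholder height targets
print(loss_fn(model(depth), height_cm))
```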
Subject(s)
Body Height, Growth Disorders, Arm, Body Weight, Child, Child, Preschool, Humans
ABSTRACT
The goal of this project is to use acoustic signatures to detect, classify, and count the calls of four acoustic populations of blue whales so that, ultimately, the conservation status of each population can be better assessed. We used manual annotations from 350 h of audio recordings from underwater hydrophones in the Indian Ocean to build a deep learning model to detect, classify, and count the calls from four acoustic song types. We used Siamese neural networks (SNNs), a class of neural network architectures that assess the similarity of inputs by comparing their feature vectors, and found that they outperformed the more widely used convolutional neural networks (CNNs). Specifically, the SNN achieved a 2% accuracy improvement in population classification and a 1.7%-6.4% accuracy improvement in call count estimation for each blue whale population. In addition, even though we treat call count estimation as a classification task and encode the number of calls in each spectrogram as a categorical variable, the SNN surprisingly learned the ordinal relationship among the classes. SNNs are robust and are shown here to be an effective way to automatically mine large acoustic datasets for blue whale calls.
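The sketch below shows the core of a Siamese network for comparing call spectrograms: a shared CNN encoder embeds two inputs, and their embedding distance scores similarity. The layer sizes are illustrative, not the paper's exact model.

```python
# Sketch: Siamese network over spectrogram pairs.
import torch

class SiameseNet(torch.nn.Module):
    def __init__(self, embed_dim: int = 64):
        super().__init__()
        # One encoder, shared by both branches (weight tying is what makes it "Siamese").
        self.encoder = torch.nn.Sequential(
            torch.nn.Conv2d(1, 16, 3, padding=1), torch.nn.ReLU(),
            torch.nn.MaxPool2d(2),
            torch.nn.Conv2d(16, 32, 3, padding=1), torch.nn.ReLU(),
            torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(),
            torch.nn.Linear(32, embed_dim))

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # Small distance -> the two spectrograms likely share a class.
        return torch.nn.functional.pairwise_distance(self.encoder(a), self.encoder(b))

net = SiameseNet()
dist = net(torch.randn(4, 1, 128, 128), torch.randn(4, 1, 128, 128))
```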
Subject(s)
Balaenoptera, Acoustics, Animals, Indian Ocean, Neural Networks, Computer, Vocalization, Animal
ABSTRACT
Over a decade after the Cook Inlet beluga (Delphinapterus leucas) was listed as endangered in 2008, the population has shown no sign of recovery. Lack of ecological knowledge limits the understanding of, and ability to manage, potential threats impeding the recovery of this declining population. National Oceanic and Atmospheric Administration Fisheries, in partnership with the Alaska Department of Fish and Game, initiated a passive acoustic monitoring program in 2017 to investigate beluga seasonal occurrence by deploying a series of passive acoustic moorings. Data have been processed with semi-automated tonal detectors followed by time-intensive manual validation. To reduce this labor-intensive and time-consuming process, and to increase the accuracy of classification results, the authors constructed an ensemble deep learning convolutional neural network model to classify beluga detections as true or false. Using a 0.5 threshold, the final model achieves 96.57% precision and 92.26% recall on the test dataset. This methodology proves successful at classifying beluga signals, and the framework can be easily generalized to other acoustic classification problems.
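A sketch of the ensembling step follows: average the member models' probabilities and apply the 0.5 threshold used above. The member models and spectrogram pipeline are elided, and two-class logits are assumed.

```python
# Sketch: ensemble of CNN classifiers for true/false beluga detections.
import torch

def ensemble_predict(models: list, spectrograms: torch.Tensor,
                     threshold: float = 0.5) -> torch.Tensor:
    with torch.no_grad():
        # Probability of the "true detection" class from each member (assumes 2-class logits).
        probs = torch.stack([m(spectrograms).softmax(dim=-1)[:, 1] for m in models])
    mean_prob = probs.mean(dim=0)   # average over ensemble members
    return mean_prob >= threshold   # True -> accept detection as beluga
```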