Results 1 - 20 of 48
1.
Res Sq ; 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38746169

ABSTRACT

The majority of proteins must form higher-order assemblies to perform their biological functions. Despite the importance of protein quaternary structure, there are few machine learning models that can accurately and rapidly predict the symmetry of assemblies involving multiple copies of the same protein chain. Here, we address this gap by training several classes of protein foundation models, including ESM-MSA, ESM2, and RoseTTAFold2, to predict homo-oligomer symmetry. Our best model, Seq2Symm, which is based on ESM2, outperforms existing template-based and deep learning methods. It achieves an average PR-AUC of 0.48 and 0.44 across homo-oligomer symmetries on two different held-out test sets, compared with 0.32 and 0.23 for the template-based method. Because Seq2Symm can rapidly predict homo-oligomer symmetries from a single sequence as input (~80,000 proteins/hour), we have applied it to five entire proteomes and ~3.5 million unlabeled protein sequences to identify patterns in protein assembly complexity across biological kingdoms and species.
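A minimal sketch of the headline metric (not the authors' code): PR-AUC per symmetry class approximated as average precision, then macro-averaged across classes. The class groupings shown in the test are hypothetical.

```python
def average_precision(y_true, scores):
    """Average precision for one symmetry class (binary labels, scores);
    a step-wise approximation of the area under the PR curve."""
    pairs = sorted(zip(scores, y_true), reverse=True)
    tp = fp = 0
    total_pos = sum(y_true)
    ap = 0.0
    for _, label in pairs:
        if label:
            tp += 1
            ap += tp / (tp + fp)   # precision at this recall step
        else:
            fp += 1
    return ap / total_pos if total_pos else 0.0

def macro_pr_auc(per_class):
    """Mean PR-AUC across symmetry classes (e.g. C2, C3, D2, ...)."""
    return sum(average_precision(y, s) for y, s in per_class) / len(per_class)
```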

4.
Sci Rep ; 14(1): 6002, 2024 03 12.
Article in English | MEDLINE | ID: mdl-38472269

ABSTRACT

In the United States the rate of stillbirth after 28 weeks' gestation (late stillbirth) is 2.7/1000 births. Fetuses that are small for gestational age (SGA) or large for gestational age (LGA) are at increased risk of stillbirth. SGA and LGA are often categorized as growth or birthweight ≤ 10th and ≥ 90th centile, respectively; however, these cut-offs are arbitrary. We sought to characterize the relationship between birthweight and stillbirth risk in greater detail. Data on singleton births between 28 and 44 weeks' gestation from 2014 to 2015 were extracted from the US Centers for Disease Control and Prevention live birth and fetal death files. Growth was assessed using customized birthweight centiles (Gestation Related Optimal Weight; GROW). The analyses included logistic regression using SGA/LGA categories and a generalized additive model (GAM) using birthweight centile as a continuous exposure. Although the SGA and LGA categories identified infants at risk of stillbirth, categorical models provided poor fits to the data within the high-risk bins, and in particular markedly underestimated the risk for the extreme centiles. For example, for fetuses in the lowest GROW centile, the observed rate was 39.8/1000 births compared with a predicted rate of 11.7/1000 from the category-based analysis. In contrast, the model-predicted risk from the GAM tracked closely with the observed risk, with the GAM providing an accurate characterization of stillbirth risk across the entire birthweight continuum. This study provides stillbirth risk estimates for each GROW centile, which clinicians can use in conjunction with other clinical details to guide obstetric management.
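For illustration only: the categorical exposure used in the logistic-regression arm, and the per-1000 rate scale used throughout the abstract. The 10th/90th-centile cut-offs are the conventional ones the authors describe as arbitrary; a GAM instead treats the centile as a continuous smooth term, avoiding the within-bin averaging that underestimates risk at the extreme centiles.

```python
def growth_category(grow_centile):
    """Bin a customized (GROW) birthweight centile into SGA/AGA/LGA."""
    if grow_centile <= 10:
        return "SGA"
    if grow_centile >= 90:
        return "LGA"
    return "AGA"

def rate_per_1000(events, births):
    """Event rate expressed per 1000 births, as in '2.7/1000 births'."""
    return 1000 * events / births
```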


Subjects
Fetal Development, Stillbirth, Pregnancy, Infant, Newborn, Infant, Female, Humans, United States, Birth Weight, Infant, Small for Gestational Age, Gestational Age, Fetal Growth Retardation
6.
PLoS One ; 19(2): e0297271, 2024.
Article in English | MEDLINE | ID: mdl-38315667

ABSTRACT

Differentially private (DP) synthetic datasets are a solution for sharing data while preserving the privacy of individual data providers. Understanding the effects of utilizing DP synthetic data in end-to-end machine learning pipelines impacts areas such as health care and humanitarian action, where data is scarce and regulated by restrictive privacy laws. In this work, we investigate the extent to which synthetic data can replace real, tabular data in machine learning pipelines and identify the most effective synthetic data generation techniques for training and evaluating machine learning models. We systematically investigate the impacts of differentially private synthetic data on downstream classification tasks from the point of view of utility as well as fairness. Our analysis is comprehensive and includes representatives of the two main types of synthetic data generation algorithms: marginal-based and GAN-based. To the best of our knowledge, our work is the first that: (i) proposes a training and evaluation framework that does not assume that real data is available for testing the utility and fairness of machine learning models trained on synthetic data; (ii) presents the most extensive analysis of synthetic dataset generation algorithms in terms of utility and fairness when used for training machine learning models; and (iii) encompasses several different definitions of fairness. Our findings demonstrate that marginal-based synthetic data generators surpass GAN-based ones regarding model training utility for tabular data. Indeed, we show that models trained using data generated by marginal-based algorithms can exhibit similar utility to models trained using real data. Our analysis also reveals that the marginal-based synthetic data generated using AIM and MWEM PGM algorithms can train models that simultaneously achieve utility and fairness characteristics close to those obtained by models trained with real data.
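A toy sketch of the marginal-based family of generators the study favors (not the AIM or MWEM PGM algorithms themselves): perturb a one-way marginal with Laplace noise calibrated to a privacy budget, clip negatives, and sample synthetic records from the noisy distribution.

```python
import math
import random

def dp_marginal_counts(counts, epsilon, rng):
    """Add Laplace(1/epsilon) noise to each cell of a 1-way marginal
    (sensitivity 1 per count) and clip negatives to zero."""
    noisy = []
    for c in counts:
        u = rng.random() - 0.5
        # Inverse-CDF sampling of Laplace(0, 1/epsilon)
        noise = -(1.0 / epsilon) * math.copysign(1, u) * math.log(1 - 2 * abs(u))
        noisy.append(max(0.0, c + noise))
    return noisy

def sample_synthetic(categories, noisy_counts, n, rng):
    """Draw n synthetic records from the normalized noisy marginal."""
    total = sum(noisy_counts)
    probs = [c / total for c in noisy_counts]
    return rng.choices(categories, weights=probs, k=n)
```

Real marginal-based generators measure many noisy marginals and fit a graphical model over them; this shows only the core noise-then-sample step.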


Subjects
Algorithms, Health Facilities, Interior Design and Furnishings, Knowledge, Machine Learning
7.
JAMA Ophthalmol ; 142(3): 226-233, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38329740

ABSTRACT

Importance: Deep learning image analysis often depends on large, labeled datasets, which are difficult to obtain for rare diseases. Objective: To develop a self-supervised approach for automated classification of macular telangiectasia type 2 (MacTel) on optical coherence tomography (OCT) with limited labeled data. Design, Setting, and Participants: This was a retrospective comparative study. OCT images were collected by the Lowy Medical Research Institute, La Jolla, California (May 2014 to May 2019), and the University of Washington, Seattle (January 2016 to October 2022). Clinical diagnoses of patients with and without MacTel were confirmed by retina specialists. Data were analyzed from January to September 2023. Exposures: Two convolutional neural networks were pretrained using the Bootstrap Your Own Latent algorithm on unlabeled training data and fine-tuned with labeled training data to predict MacTel (self-supervised method). ResNet18 and ResNet50 models were also trained using all labeled data (supervised method). Main Outcomes and Measures: The ground-truth MacTel diagnosis (yes vs no) was determined by retina specialists based on spectral-domain OCT. The models' predictions were compared against human graders using accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), area under the precision-recall curve (AUPRC), and area under the receiver operating characteristic curve (AUROC). Uniform manifold approximation and projection was performed for dimension reduction, and GradCAM visualizations were generated for the supervised and self-supervised methods. Results: A total of 2636 OCT scans from 780 patients with MacTel and 131 patients without MacTel were included from the MacTel Project (mean [SD] age, 60.8 [11.7] years; 63.8% female), and another 2564 scans from 1769 patients without MacTel from the University of Washington (mean [SD] age, 61.2 [18.1] years; 53.4% female).
The self-supervised approach fine-tuned on 100% of the labeled training data with ResNet50 as the feature extractor performed best, achieving an AUPRC of 0.971 (95% CI, 0.969-0.972), an AUROC of 0.970 (95% CI, 0.970-0.973), accuracy of 0.898, sensitivity of 0.898, specificity of 0.949, PPV of 0.935, and NPV of 0.919. With only 419 OCT volumes (185 patients with MacTel; 10% of the labeled training dataset), the ResNet18 self-supervised model achieved comparable performance, with an AUPRC of 0.958 (95% CI, 0.957-0.960), an AUROC of 0.966 (95% CI, 0.964-0.967), and accuracy, sensitivity, specificity, PPV, and NPV of 0.902, 0.884, 0.916, 0.896, and 0.906, respectively. The self-supervised models showed better agreement with the more experienced human expert graders. Conclusions and Relevance: The findings suggest that self-supervised learning may improve the accuracy of automated MacTel vs non-MacTel binary classification on OCT with limited labeled training data, and these approaches may be applicable to other rare diseases, although further research is warranted.
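The diagnostic metrics reported above all derive from one confusion matrix; a minimal sketch of their standard definitions:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary diagnostic metrics from confusion-matrix counts:
    true/false positives (tp/fp) and true/false negatives (tn/fn)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # recall among true MacTel cases
        "specificity": tn / (tn + fp),   # recall among non-MacTel cases
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }
```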


Subjects
Deep Learning, Retinal Telangiectasis, Humans, Female, Middle Aged, Male, Tomography, Optical Coherence/methods, Retrospective Studies, Rare Diseases, Retinal Telangiectasis/diagnostic imaging, Supervised Machine Learning
8.
Med Phys ; 51(2): 1203-1216, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37544015

ABSTRACT

BACKGROUND: Prostate-specific membrane antigen (PSMA) PET imaging represents a valuable source of information reflecting disease stage, response rate, and treatment optimization options, particularly with PSMA radioligand therapy. Quantification of radiopharmaceutical uptake in healthy organs from PSMA images has the potential to minimize toxicity by extrapolation of the radiation dose delivery towards personalization of therapy. However, segmentation and quantification of uptake in organs require labor-intensive organ delineations that are often neither feasible in the clinic nor scalable for large clinical trials. PURPOSE: In this work we develop and test the PSMA Healthy organ segmentation network (PSMA-Hornet), a fully-automated deep neural network for simultaneous segmentation of 14 healthy organs representing the normal biodistribution of [18F]DCFPyL on PET/CT images. We also propose a modified U-Net architecture, a self-supervised pre-training method for PET/CT images, a multi-target Dice loss, and multi-target batch balancing to effectively train PSMA-Hornet and similar networks. METHODS: The study used manually-segmented [18F]DCFPyL PET/CT images from 100 subjects, and 526 similar images without segmentations. The unsegmented images were used for self-supervised model pretraining. For supervised training, Monte-Carlo cross-validation was used to evaluate the network performance, with 85 subjects in each trial reserved for model training, 5 for validation, and 10 for testing. Image segmentation and quantification metrics were evaluated on the test folds with respect to manual segmentations by a nuclear medicine physician, and compared to inter-rater agreement. The model's segmentation performance was also evaluated on a separate set of 19 images with high tumor load. RESULTS: With our best model, the lowest mean Dice coefficient on the test set was 0.826 for the sublingual gland, and the highest was 0.964 for the liver.
The highest mean error in tracer uptake quantification was 13.9% in the sublingual gland. Self-supervised pretraining improved training convergence, train-to-test generalization, and segmentation quality. In addition, we found that a multi-target network produced significantly higher segmentation accuracy than single-organ networks. CONCLUSIONS: The developed network can be used to automatically obtain high-quality organ segmentations for PSMA image analysis tasks. It can be used to reproducibly extract imaging data, and holds promise for clinical applications such as personalized radiation dosimetry and improved radioligand therapy.
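The Dice coefficients reported above follow the standard overlap definition; a minimal sketch on flattened binary masks:

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks (flat lists):
    twice the intersection over the sum of the mask sizes."""
    inter = sum(a and b for a, b in zip(mask_a, mask_b))
    size = sum(mask_a) + sum(mask_b)
    return 2 * inter / size if size else 1.0  # two empty masks agree perfectly
```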


Subjects
Antigens, Surface, Glutamate Carboxypeptidase II, Prostatic Neoplasms, Animals, Humans, Male, Image Processing, Computer-Assisted/methods, Positron Emission Tomography Computed Tomography/methods, Prostatic Neoplasms/diagnostic imaging, Prostatic Neoplasms/radiotherapy, Tissue Distribution
9.
bioRxiv ; 2023 Nov 10.
Article in English | MEDLINE | ID: mdl-37986761

ABSTRACT

Proteomics has been revolutionized by large pre-trained protein language models, which learn unsupervised representations from large corpora of sequences. The parameters of these models are then fine-tuned in a supervised setting to tailor the model to a specific downstream task. However, as model size increases, the computational and memory footprint of fine-tuning becomes a barrier for many research groups. In the field of natural language processing, which has seen a similar explosion in the size of models, these challenges have been addressed by methods for parameter-efficient fine-tuning (PEFT). In this work, we bring parameter-efficient fine-tuning methods to proteomics for the first time. Using the parameter-efficient method LoRA, we train new models for two important proteomic tasks: predicting protein-protein interactions (PPI) and predicting the symmetry of homooligomers. We show that for homooligomer symmetry prediction, these approaches achieve performance competitive with traditional fine-tuning while requiring reduced memory and using three orders of magnitude fewer parameters. On the PPI prediction task, we surprisingly find that PEFT models actually outperform traditional fine-tuning while using two orders of magnitude fewer parameters. Here, we go even further to show that freezing the parameters of the language model and training only a classification head also outperforms fine-tuning, using five orders of magnitude fewer parameters, and that both of these models outperform state-of-the-art PPI prediction methods with substantially reduced compute. We also demonstrate that PEFT is robust to variations in training hyper-parameters, and elucidate where best practices for PEFT in proteomics differ from those in natural language processing. Thus, we provide a blueprint to democratize the power of protein language model tuning for groups with limited computational resources.
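A back-of-the-envelope sketch of where LoRA's savings come from: instead of updating a full d_in × d_out weight matrix, it trains two rank-r factors. The 1280-dimensional projection below is a hypothetical ESM2-scale example; the orders-of-magnitude figures in the abstract refer to whole models, not a single matrix.

```python
def lora_params(d_in, d_out, rank):
    """Trainable parameters for a LoRA adapter on one d_in x d_out weight:
    two low-rank factors A (d_in x r) and B (r x d_out)."""
    return rank * (d_in + d_out)

full = 1280 * 1280                    # one full-rank projection matrix
adapter = lora_params(1280, 1280, 8)  # rank-8 LoRA update for the same matrix
reduction = full / adapter            # 80x fewer trainable parameters here
```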

10.
medRxiv ; 2023 Nov 29.
Article in English | MEDLINE | ID: mdl-37745463

ABSTRACT

Purpose: To gain insights into potential genetic factors contributing to the infant's vulnerability to Sudden Unexpected Infant Death (SUID). Methods: Whole Genome Sequencing (WGS) was performed on 145 infants who succumbed to SUID and 576 healthy adults. Variants were filtered by gnomAD allele frequencies and predictions of functional consequences. Results: Variants of interest were identified in 86 genes, affecting 63.4% of our cohort. Seventy-one of these genes have been previously associated with SIDS/SUID/SUDP. Forty-three can be characterized as cardiac genes and are related to cardiomyopathies, arrhythmias, and other conditions. Variants in 22 genes were associated with neurologic functions. Variants were also found in 13 genes reported to be pathogenic for various systemic disorders. Variants in eight genes are implicated in the response to hypoxia and the regulation of reactive oxygen species (ROS) and have not been previously described in SIDS/SUID/SUDP. Seventy-two infants met the triple risk hypothesis criteria (Figure 1). Conclusion: Our study confirms and further expands the list of genetic variants associated with SUID. The abundance of genes associated with heart disease and the discovery of variants associated with redox metabolism have important mechanistic implications for the pathophysiology of SUID.
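A schematic sketch of the filtering step described in Methods. The field names, frequency threshold, and prediction labels here are illustrative assumptions, not the study's actual pipeline.

```python
def filter_variants(variants, max_af=0.01,
                    damaging=frozenset({"pathogenic", "likely_pathogenic"})):
    """Keep rare variants (gnomAD allele frequency below a threshold)
    whose predicted functional consequence is damaging."""
    return [v for v in variants
            if v["gnomad_af"] < max_af and v["prediction"] in damaging]
```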

11.
Int J Equity Health ; 22(1): 181, 2023 09 05.
Article in English | MEDLINE | ID: mdl-37670348

ABSTRACT

BACKGROUND: Socioeconomic status has long been associated with population health and health outcomes. While ameliorating social determinants of health may improve health, identifying and targeting areas where feasible interventions are most needed would help improve health equity. We sought to identify inequities in health and social determinants of health (SDOH) associated with local economic distress at the county-level. METHODS: For 3,131 counties in the 50 US states and Washington, DC (wherein approximately 325,711,203 people lived in 2019), we conducted a retrospective analysis of county-level data collected from County Health Rankings in two periods (centering around 2015 and 2019). We used ANOVA to compare thirty-three measures across five health and SDOH domains (Health Outcomes, Clinical Care, Health Behaviors, Physical Environment, and Social and Economic Factors) that were available in both periods, changes in measures between periods, and ratios of measures for the least to most prosperous counties across county-level prosperity quintiles, based on the Economic Innovation Group's 2015-2019 Distressed Community Index Scores. RESULTS: With seven exceptions, in both periods, we found a worsening of values with each progression from more to less prosperous counties, with least prosperous counties having the worst values (ANOVA p < 0.001 for all measures). Between 2015 and 2019, all except six measures progressively worsened when comparing higher to lower prosperity quintiles, and gaps between the least and most prosperous counties generally widened. CONCLUSIONS: In the late 2010s, the least prosperous US counties overwhelmingly had worse values in measures of Health Outcomes, Clinical Care, Health Behaviors, the Physical Environment, and Social and Economic Factors than more prosperous counties. Between 2015 and 2019, for most measures, inequities between the least and most prosperous counties widened. 
Our findings suggest that local economic prosperity may serve as a proxy for health and SDOH status of the community. Policymakers and leaders in public and private sectors might use long-term, targeted economic stimuli in low prosperity counties to generate local, community health benefits for vulnerable populations. Doing so could sustainably improve health; not doing so will continue to generate poor health outcomes and ever-widening economic disparities.
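The comparisons above rest on two simple operations: assigning counties to prosperity quintiles by distress score, and taking the ratio of a measure in the least versus most prosperous quintile. A minimal sketch with illustrative cutoffs (the actual Distressed Community Index scoring is more involved):

```python
def prosperity_quintile(score, cutoffs):
    """Assign a county to a prosperity quintile (1 = most prosperous)
    given four ascending distress-score cutoffs."""
    for q, cut in enumerate(cutoffs, start=1):
        if score <= cut:
            return q
    return 5

def inequity_ratio(measure_by_quintile):
    """Ratio of a health measure in the least (5) vs most (1) prosperous
    quintile; values above 1 indicate worse values in distressed counties."""
    return measure_by_quintile[5] / measure_by_quintile[1]
```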


Subjects
Health Behavior, Social Determinants of Health, Humans, Retrospective Studies, Economic Factors, Outcome Assessment, Health Care
12.
PLoS One ; 18(8): e0289405, 2023.
Article in English | MEDLINE | ID: mdl-37647261

ABSTRACT

BACKGROUND: In the United States (US), late stillbirth (at 28 weeks or more of gestation) occurs in 3/1000 births. AIM: We examined risk factors for late stillbirth with the specific goal of identifying modifiable factors that contribute substantially to stillbirth burden. SETTING: All singleton births in the US for 2014-2015. METHODS: We used a retrospective population-based design to assess the effects of multiple factors on the risk of late stillbirth in the US. Data were drawn from the US Centers for Disease Control and Prevention live birth and fetal death data files. RESULTS: There were 6,732,157 live births and 18,334 stillbirths available for analysis (late stillbirth rate = 2.72/1000 births). The importance of sociodemographic determinants was shown by higher risks for Black and Native Hawaiian and Other Pacific Islander mothers compared with White mothers, mothers with low educational attainment, and older mothers. Among modifiable risk factors, delayed/absent prenatal care, diabetes, hypertension, and maternal smoking were associated with increased risk, though they accounted for only 3-6% of stillbirths each. Two factors accounted for the largest proportion of late stillbirths: high maternal body mass index (BMI; 15%) and infants who were small for gestational age (38%). Participation in the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC) was associated with a 28% reduction in overall stillbirth burden. CONCLUSIONS: This study provides population-based evidence for stillbirth risk in the US. A high proportion of late stillbirths was associated with high maternal BMI and small for gestational age, whereas participation in supplemental nutrition programs was associated with a large reduction in stillbirth burden. Addressing obesity and fetal growth restriction, as well as broadening participation in nutritional supplementation programs, could reduce late stillbirths.
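The "proportion of stillbirth burden" attributable to a risk factor is conventionally computed as a population attributable fraction; a minimal sketch of Levin's formula (shown as the standard quantity behind such estimates, not necessarily the authors' exact method):

```python
def paf(prevalence, relative_risk):
    """Levin's population attributable fraction: the share of cases that
    would be avoided if the exposure were removed from the population."""
    x = prevalence * (relative_risk - 1)
    return x / (1 + x)
```

For example, an exposure present in 20% of the population that doubles risk accounts for about one sixth of cases.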


Subjects
Fetal Growth Retardation, Stillbirth, United States/epidemiology, Child, Infant, Pregnancy, Humans, Female, Stillbirth/epidemiology, Gestational Age, Retrospective Studies, Risk Factors, Hawaii
13.
J Health Care Poor Underserved ; 34(2): 521-534, 2023.
Article in English | MEDLINE | ID: mdl-37464515

ABSTRACT

Understanding how post-acute COVID-19 syndrome (PACS or long COVID) manifests among underserved populations, who experienced a disproportionate burden of acute COVID-19, can help providers and policymakers better address this ongoing crisis. To identify clinical sequelae of long COVID among underserved populations treated in the primary care safety net, we conducted a causal impact analysis with electronic health records (EHR) to compare symptoms among community health center patients who tested positive (n=4,091) and negative (n=7,118) for acute COVID-19. We found 18 sequelae with statistical significance and causal dependence among patients who had a visit after 60 days or more following acute COVID-19. These sequelae encompass most organ systems and include breathing abnormalities, malaise and fatigue, and headache. This study adds to current knowledge about how long COVID manifests in a large, underserved population.


Subjects
COVID-19, Health Equity, Humans, Post-Acute COVID-19 Syndrome, Data Science, Medically Underserved Area, COVID-19/epidemiology, Disease Progression
14.
Arch Public Health ; 81(1): 137, 2023 Jul 26.
Article in English | MEDLINE | ID: mdl-37495995

ABSTRACT

BACKGROUND: In 1991, Halpern and Coren claimed that left-handed people die nine years younger than right-handed people. Most subsequent studies did not find support for the difference in age of death or its magnitude, primarily because of the realization that there have been historical changes in reported rates of left-handedness. METHODS: We created a model that allowed us to determine whether the historical change in left-handedness explains the original finding of a nine-year difference in life expectancy. We calculated all deaths in the United States by birth year, gender, and handedness for 1989 (the Halpern and Coren study was based on data from that year) and contrasted those findings with the modeled age of death by reported and counterfactual estimated handedness for each birth year, 1900-1989. RESULTS: In 1989, 2,019,512 individuals died, of whom 6.4% were reportedly left-handed based on concurrent annual handedness reporting. However, it is widely believed that cultural pressures may have caused an underestimation of the true rate of left-handedness. Using a simulation that assumed no age-of-death difference between left-handed and right-handed individuals in this cohort and adjusting for the reported rates of left-handedness, we found that left-handed individuals were expected to die 9.3 years earlier than their right-handed counterparts purely because of changes in the rate of left-handedness over time. This 9.3-year difference was not statistically significantly different from the 8.97 years reported by Halpern and Coren. When we assumed no change in the rate of left-handedness over time, the survival advantage for right-handed individuals was reduced to 0.02 years, solely driven by not controlling for gender. When we considered the estimated age of death for each birth cohort, we found a mean difference of 0.43 years between left-handed and right-handed individuals, also driven by handedness differences by gender.
CONCLUSION: We found that the changing rate of left-handedness reporting over the years entirely explains the originally reported observation of a nine-year difference in life expectancy. In epidemiology, new information on past reporting biases could warrant re-exploration of initial findings. The simulation modeling approach that we use here might facilitate such analyses.
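A toy version of the simulation's core idea (not the authors' model): under the null hypothesis that handedness does not affect longevity, a reported left-handedness rate that rises across birth cohorts concentrates left-handers in younger cohorts, mechanically lowering their mean age at death. The cohort figures in the test are invented for illustration.

```python
def mean_age(cohorts, weight):
    """Death-weighted mean age at death; `weight` maps a cohort's
    left-handedness rate to that group's share of its deaths."""
    num = sum(age * deaths * weight(rate) for age, deaths, rate in cohorts)
    den = sum(deaths * weight(rate) for age, deaths, rate in cohorts)
    return num / den

def handedness_gap(cohorts):
    """Right-handed minus left-handed mean age at death under the null.

    cohorts: list of (age_at_death, deaths, left_handed_rate) per birth year.
    """
    left = mean_age(cohorts, lambda r: r)
    right = mean_age(cohorts, lambda r: 1 - r)
    return right - left
```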

17.
PLoS One ; 18(4): e0284614, 2023.
Article in English | MEDLINE | ID: mdl-37083949

ABSTRACT

BACKGROUND: Infection is thought to play a part in some infant deaths. Research on maternal infection in pregnancy has focused on chlamydia, with some reports suggesting an association with sudden unexpected infant death (SUID). OBJECTIVES: We hypothesized that maternal infections in pregnancy are associated with subsequent SUID in the offspring. SETTING: All births in the United States, 2011-2015. DATA SOURCE: Centers for Disease Control and Prevention (CDC) Birth Cohort Linked Birth-Infant Death Data Files. STUDY DESIGN: Cohort study, although the data were analysed as a case-control study. Cases were infants who died from SUID. Controls were randomly sampled infants who survived their first year of life; approximately 10 controls per SUID case. EXPOSURES: Chlamydia, gonorrhea, and hepatitis C. RESULTS: There were 19,849,690 live births in the U.S. for the period 2011-2015. There were 37,143 infant deaths, of which 17,398 were classified as SUID cases (a rate of 0.86/1000 live births). Among control mothers, the prevalence of chlamydia was 1.7%, gonorrhea 0.2%, and hepatitis C 0.3%. Chlamydia was present in 3.8% of mothers whose infants subsequently died of SUID compared with 1.7% of controls (unadjusted OR = 2.35, 95% CI = 2.15, 2.56; adjusted OR = 1.08, 95% CI = 0.98, 1.19). Gonorrhea was present in 0.7% of mothers of SUID cases compared with 0.2% of mothers of controls (OR = 3.09 (2.50, 3.79); aOR = 1.20 (0.95, 1.49)), and hepatitis C was present in 1.3% of mothers of SUID cases compared with 0.3% of mothers of controls (OR = 4.69 (3.97, 5.52); aOR = 1.80 (1.50, 2.15)). CONCLUSIONS: The marked attenuation of SUID risk after adjustment for a wide variety of socioeconomic and demographic factors suggests that the small remaining increase in SUID risk among offspring of mothers with hepatitis C infection in pregnancy is due to residual confounding.
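The unadjusted odds ratios and confidence intervals above follow the standard 2x2-table computation; a minimal sketch using the Wald interval (adjusted ORs require a regression model and are not shown):

```python
import math

def odds_ratio_ci(a, b, c, d):
    """Unadjusted odds ratio with Wald 95% CI from a 2x2 table:
    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)  # SE of log(OR)
    lo = math.exp(math.log(or_) - 1.96 * se)
    hi = math.exp(math.log(or_) + 1.96 * se)
    return or_, lo, hi
```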


Subjects
Gonorrhea, Hepatitis C, Sudden Infant Death, Infant, Pregnancy, Female, Humans, United States/epidemiology, Cohort Studies, Case-Control Studies, Sudden Infant Death/epidemiology, Sudden Infant Death/etiology, Infant Mortality, Hepacivirus, Death
18.
Sci Rep ; 13(1): 5368, 2023 04 01.
Article in English | MEDLINE | ID: mdl-37005441

ABSTRACT

To evaluate the generalizability of artificial intelligence (AI) algorithms that use deep learning methods to identify middle ear disease from otoscopic images, we compared internal and external performance. 1842 otoscopic images were collected from three independent sources: (a) Van, Turkey, (b) Santiago, Chile, and (c) Ohio, USA. Diagnostic categories consisted of (i) normal or (ii) abnormal. Deep learning methods were used to develop models to evaluate internal and external performance, using area under the curve (AUC) estimates. A pooled assessment was performed by combining all cohorts together with fivefold cross-validation. AI-otoscopy algorithms achieved high internal performance (mean AUC: 0.95, 95% CI: 0.80-1.00). However, performance was reduced when tested on external otoscopic images not used for training (mean AUC: 0.76, 95% CI: 0.61-0.91). Overall, external performance was significantly lower than internal performance (mean difference in AUC: -0.19, p ≤ 0.04). Combining cohorts achieved a substantial pooled performance (AUC: 0.96, standard error: 0.01). Internally applied algorithms for otoscopy performed well in identifying middle ear disease from otoscopic images. However, external performance was reduced when applied to new test cohorts. Further efforts are required to explore data augmentation and pre-processing techniques that might improve external performance and develop a robust, generalizable algorithm for real-world clinical applications.
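The AUC comparisons above use the standard ROC area; a minimal rank-based sketch of that metric (the probability a random abnormal image is scored above a random normal one):

```python
def auroc(y_true, scores):
    """Area under the ROC curve via the probabilistic interpretation:
    the chance a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, y_true) if y]
    neg = [s for s, y in zip(scores, y_true) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```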


Subjects
Deep Learning, Ear Diseases, Humans, Artificial Intelligence, Otoscopy/methods, Algorithms, Ear Diseases/diagnostic imaging
19.
Comput Biol Med ; 158: 106882, 2023 05.
Article in English | MEDLINE | ID: mdl-37037147

ABSTRACT

PURPOSE: Automatic and accurate segmentation of lesions in images of metastatic castration-resistant prostate cancer has the potential to enable personalized radiopharmaceutical therapy and advanced treatment response monitoring. The aim of this study is to develop a convolutional neural network-based framework for fully-automated detection and segmentation of metastatic prostate cancer lesions in whole-body PET/CT images. METHODS: 525 whole-body PET/CT images of patients with metastatic prostate cancer were available for the study, acquired with the [18F]DCFPyL radiotracer that targets prostate-specific membrane antigen (PSMA). U-Net-based convolutional neural networks (CNNs) were trained to identify lesions on paired axial PET/CT slices. Baseline models were trained using batch-wise Dice loss, as well as the proposed weighted batch-wise Dice loss (wDice), and the lesion detection performance was quantified, with a particular emphasis on lesion size, intensity, and location. We used 418 images for model training, 30 for model validation, and 77 for model testing. In addition, we allowed our model to take n = 0, 2, …, 12 neighboring axial slices to examine how incorporating greater amounts of 3D context influences model performance. We selected the optimal number of neighboring axial slices that maximized the detection rate on the 30 validation images, and trained five neural networks with different architectures. RESULTS: Model performance was evaluated using the detection rate, Dice similarity coefficient (DSC), and sensitivity. We found that the proposed wDice loss significantly improved the lesion detection rate, lesion-wise DSC, and lesion-wise sensitivity compared to the baseline, with corresponding average increases of 0.07 (p-value = 0.01), 0.03 (p-value = 0.01), and 0.04 (p-value = 0.01), respectively.
The inclusion of the first two neighboring axial slices in the input likewise increased the detection rate by 0.17, lesion-wise DSC by 0.05, and lesion-wise mean sensitivity by 0.16. However, there was a minimal effect from including more distant neighboring slices. We ultimately chose to use two neighboring slices and the wDice loss function to train our final model. To evaluate the model's performance, we trained three models using identical hyperparameters on three different data splits. The results showed that, on average, the model was able to detect 80% of all testing lesions, with a detection rate of 93% for lesions with maximum standardized uptake values (SUVmax) greater than 5.0. In addition, the average median lesion-wise DSC was 0.51 and 0.60 for all lesions and for lesions with SUVmax > 5.0, respectively, on the testing set. Four additional neural networks with different architectures were trained, and all yielded stronger performance in segmenting lesions with SUVmax > 5.0 than in segmenting the remaining lesions. CONCLUSION: Our results demonstrate that prostate cancer metastases in PSMA PET/CT images can be detected and segmented using CNNs. The segmentation performance strongly depends on the intensity, size, and location of lesions, and can be improved by using specialized loss functions. Specifically, the models performed best in detecting lesions with SUVmax > 5.0. Another challenge was to accurately segment lesions close to the bladder. Future work will focus on improving the detection of lesions with lower SUV values by designing custom loss functions that take into account lesion intensity, using additional data augmentation techniques, and reducing the number of false lesions by developing methods to better separate signal from noise.


Subjects
Positron Emission Tomography Computed Tomography, Prostatic Neoplasms, Male, Humans, Positron Emission Tomography Computed Tomography/methods, Prostatic Neoplasms/diagnostic imaging, Neural Networks, Computer, Radiopharmaceuticals
20.
Nat Commun ; 14(1): 1177, 2023 03 01.
Article in English | MEDLINE | ID: mdl-36859488

ABSTRACT

Cryptic pockets expand the scope of drug discovery by enabling targeting of proteins currently considered undruggable because they lack pockets in their ground state structures. However, identifying cryptic pockets is labor-intensive and slow. The ability to accurately and rapidly predict if and where cryptic pockets are likely to form from a structure would greatly accelerate the search for druggable pockets. Here, we present PocketMiner, a graph neural network trained to predict where pockets are likely to open in molecular dynamics simulations. Applying PocketMiner to single structures from a newly curated dataset of 39 experimentally confirmed cryptic pockets demonstrates that it accurately identifies cryptic pockets (ROC-AUC: 0.87) >1,000-fold faster than existing methods. We apply PocketMiner across the human proteome and show that predicted pockets open in simulations, suggesting that over half of proteins thought to lack pockets based on available structures likely contain cryptic pockets, vastly expanding the potentially druggable proteome.


Subjects
Labor, Obstetric, Proteome, Humans, Pregnancy, Female, Drug Discovery, Molecular Dynamics Simulation, Neural Networks, Computer