1.
Med Image Anal ; 97: 103280, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-39096845

ABSTRACT

Medical image segmentation is crucial for healthcare, yet convolution-based methods like U-Net face limitations in modeling long-range dependencies. To address this, Transformers designed for sequence-to-sequence predictions have been integrated into medical image segmentation. However, a comprehensive understanding of Transformers' self-attention in U-Net components is lacking. TransUNet, first introduced in 2021, is widely recognized as one of the first models to integrate Transformers into medical image analysis. In this study, we present the versatile framework of TransUNet that encapsulates Transformers' self-attention into two key modules: (1) a Transformer encoder tokenizing image patches from a convolutional neural network (CNN) feature map, facilitating global context extraction, and (2) a Transformer decoder refining candidate regions through cross-attention between proposals and U-Net features. These modules can be flexibly inserted into the U-Net backbone, resulting in three configurations: Encoder-only, Decoder-only, and Encoder+Decoder. TransUNet provides a library encompassing both 2D and 3D implementations, enabling users to easily tailor the chosen architecture. Our findings highlight the encoder's efficacy in modeling interactions among multiple abdominal organs and the decoder's strength in handling small targets like tumors. TransUNet excels in diverse medical applications, such as multi-organ segmentation, pancreatic tumor segmentation, and hepatic vessel segmentation. Notably, our TransUNet achieves a significant average Dice improvement of 1.06% and 4.30% for multi-organ segmentation and pancreatic tumor segmentation, respectively, when compared to the highly competitive nn-UNet, and surpasses the top-1 solution in the BraTS2021 challenge. 2D and 3D code and models are available at https://github.com/Beckschen/TransUNet and https://github.com/Beckschen/TransUNet-3D, respectively.

2.
Med Image Anal ; 97: 103285, 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39116766

ABSTRACT

We introduce the largest abdominal CT dataset (termed AbdomenAtlas) of 20,460 three-dimensional CT volumes sourced from 112 hospitals across diverse populations, geographies, and facilities. AbdomenAtlas provides 673K high-quality masks of anatomical structures in the abdominal region annotated by a team of 10 radiologists with the help of AI algorithms. We start by having expert radiologists manually annotate 22 anatomical structures in 5,246 CT volumes. Following this, a semi-automatic annotation procedure is performed for the remaining CT volumes, where radiologists revise the annotations predicted by AI, and in turn, AI improves its predictions by learning from revised annotations. Such a large-scale, multi-center dataset with detailed annotations is needed for two reasons. Firstly, AbdomenAtlas provides important resources for AI development at scale, branded as large pre-trained models, which can alleviate the annotation workload of expert radiologists to transfer to broader clinical applications. Secondly, AbdomenAtlas establishes a large-scale benchmark for evaluating AI algorithms: the more data we use to test the algorithms, the better we can guarantee reliable performance in complex clinical scenarios. An ISBI & MICCAI challenge named BodyMaps: Towards 3D Atlas of Human Body was launched using a subset of our AbdomenAtlas, aiming to stimulate AI innovation and to benchmark segmentation accuracy, inference efficiency, and domain generalizability. We hope our AbdomenAtlas can set the stage for larger-scale clinical trials and offer exceptional opportunities to practitioners in the medical imaging community. Codes, models, and datasets are available at https://www.zongweiz.com/dataset.

3.
Proc Natl Acad Sci U S A ; 121(24): e2317707121, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38830105

ABSTRACT

Human pose, defined as the spatial relationships between body parts, carries instrumental information supporting the understanding of motion and action of a person. A substantial body of previous work has identified cortical areas responsive to images of bodies and different body parts. However, the neural basis underlying the visual perception of body part relationships has received less attention. To broaden our understanding of body perception, we analyzed high-resolution fMRI responses to a wide range of poses from over 4,000 complex natural scenes. Using ground-truth annotations and an application of three-dimensional (3D) pose reconstruction algorithms, we compared similarity patterns of cortical activity with similarity patterns built from human pose models with different levels of depth availability and viewpoint dependency. Targeting the challenge of explaining variance in complex natural image responses with interpretable models, we achieved statistically significant correlations between pose models and cortical activity patterns (though performance levels are substantially lower than the noise ceiling). We found that the 3D view-independent pose model, compared with two-dimensional models, better captures the activation from distinct cortical areas, including the right posterior superior temporal sulcus (pSTS). These areas, together with other pose-selective regions in the LOTC, form a broader, distributed cortical network with greater view-tolerance in more anterior patches. We interpret these findings in light of the computational complexity of natural body images, the wide range of visual tasks supported by pose structures, and possible shared principles for view-invariant processing between articulated objects and ordinary, rigid objects.


Subject(s)
Brain , Magnetic Resonance Imaging , Humans , Magnetic Resonance Imaging/methods , Male , Female , Adult , Brain/physiology , Brain/diagnostic imaging , Brain Mapping/methods , Visual Perception/physiology , Posture/physiology , Young Adult , Imaging, Three-Dimensional/methods , Photic Stimulation/methods , Algorithms
4.
Med Image Anal ; 97: 103226, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38852215

ABSTRACT

The advancement of artificial intelligence (AI) for organ segmentation and tumor detection is propelled by the growing availability of computed tomography (CT) datasets with detailed, per-voxel annotations. However, these AI models often struggle with flexibility for partially annotated datasets and extensibility for new classes due to limitations in the one-hot encoding, architectural design, and learning scheme. To overcome these limitations, we propose a universal, extensible framework enabling a single model, termed Universal Model, to deal with multiple public datasets and adapt to new classes (e.g., organs/tumors). Firstly, we introduce a novel language-driven parameter generator that leverages language embeddings from large language models, enriching semantic encoding compared with one-hot encoding. Secondly, the conventional output layers are replaced with lightweight, class-specific heads, allowing Universal Model to simultaneously segment 25 organs and six types of tumors and ease the addition of new classes. We train our Universal Model on 3410 CT volumes assembled from 14 publicly available datasets and then test it on 6173 CT volumes from four external datasets. Universal Model achieves first place on six CT tasks in the Medical Segmentation Decathlon (MSD) public leaderboard and leading performance on the Beyond The Cranial Vault (BTCV) dataset. In summary, Universal Model exhibits remarkable computational efficiency (6× faster than other dataset-specific models), demonstrates strong generalization across different hospitals, transfers well to numerous downstream tasks, and more importantly, facilitates the extensibility to new classes while alleviating the catastrophic forgetting of previously learned classes. Codes, models, and datasets are available at https://github.com/ljwztc/CLIP-Driven-Universal-Model.

5.
Nature ; 629(8012): 679-687, 2024 May.
Article in English | MEDLINE | ID: mdl-38693266

ABSTRACT

Pancreatic intraepithelial neoplasias (PanINs) are the most common precursors of pancreatic cancer, but their small size and inaccessibility in humans make them challenging to study. Critically, the number, dimensions and connectivity of human PanINs remain largely unknown, precluding important insights into early cancer development. Here, we provide a microanatomical survey of human PanINs by analysing 46 large samples of grossly normal human pancreas with a machine-learning pipeline for quantitative 3D histological reconstruction at single-cell resolution. To elucidate genetic relationships between and within PanINs, we developed a workflow in which 3D modelling guides multi-region microdissection and targeted and whole-exome sequencing. From these samples, we calculated a mean burden of 13 PanINs per cm³ and extrapolated that the normal intact adult pancreas harbours hundreds of PanINs, almost all with oncogenic KRAS hotspot mutations. We found that most PanINs originate as independent clones with distinct somatic mutation profiles. Some spatially continuous PanINs were found to contain multiple KRAS mutations; computational and in situ analyses demonstrated that different KRAS mutations localize to distinct cell subpopulations within these neoplasms, indicating their polyclonal origins. The extensive multifocality and genetic heterogeneity of PanINs raise important questions about mechanisms that drive precancer initiation and confer differential progression risk in the human pancreas. This detailed 3D genomic mapping of molecular alterations in human PanINs provides an empirical foundation for early detection and rational interception of pancreatic cancer.


Subject(s)
Genetic Heterogeneity , Genomics , Imaging, Three-Dimensional , Pancreatic Neoplasms , Precancerous Conditions , Single-Cell Analysis , Adult , Female , Humans , Male , Clone Cells/metabolism , Clone Cells/pathology , Exome Sequencing , Machine Learning , Mutation , Pancreas/anatomy & histology , Pancreas/cytology , Pancreas/metabolism , Pancreas/pathology , Pancreatic Neoplasms/genetics , Pancreatic Neoplasms/pathology , Precancerous Conditions/genetics , Precancerous Conditions/pathology , Workflow , Disease Progression , Early Detection of Cancer , Oncogenes/genetics
6.
Nat Med ; 29(12): 3033-3043, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37985692

ABSTRACT

Pancreatic ductal adenocarcinoma (PDAC), the most deadly solid malignancy, is typically detected late and at an inoperable stage. Early or incidental detection is associated with prolonged survival, but screening asymptomatic individuals for PDAC using a single test remains unfeasible due to the low prevalence and potential harms of false positives. Non-contrast computed tomography (CT), routinely performed for clinical indications, offers the potential for large-scale screening; however, identification of PDAC using non-contrast CT has long been considered impossible. Here, we develop a deep learning approach, pancreatic cancer detection with artificial intelligence (PANDA), that can detect and classify pancreatic lesions with high accuracy via non-contrast CT. PANDA is trained on a dataset of 3,208 patients from a single center. PANDA achieves an area under the receiver operating characteristic curve (AUC) of 0.986-0.996 for lesion detection in a multicenter validation involving 6,239 patients across 10 centers, outperforms the mean radiologist performance by 34.1% in sensitivity and 6.3% in specificity for PDAC identification, and achieves a sensitivity of 92.9% and specificity of 99.9% for lesion detection in a real-world multi-scenario validation consisting of 20,530 consecutive patients. Notably, PANDA utilized with non-contrast CT shows non-inferiority to radiology reports (using contrast-enhanced CT) in the differentiation of common pancreatic lesion subtypes. PANDA could potentially serve as a new tool for large-scale pancreatic cancer screening.


Subject(s)
Carcinoma, Pancreatic Ductal , Deep Learning , Pancreatic Neoplasms , Humans , Artificial Intelligence , Pancreatic Neoplasms/diagnostic imaging , Pancreatic Neoplasms/pathology , Tomography, X-Ray Computed , Pancreas/diagnostic imaging , Pancreas/pathology , Carcinoma, Pancreatic Ductal/diagnostic imaging , Carcinoma, Pancreatic Ductal/pathology , Retrospective Studies
7.
Cogn Sci ; 47(9): e13347, 2023 09.
Article in English | MEDLINE | ID: mdl-37718474

ABSTRACT

Advances in artificial intelligence have raised a basic question about human intelligence: Is human reasoning best emulated by applying task-specific knowledge acquired from a wealth of prior experience, or is it based on the domain-general manipulation and comparison of mental representations? We address this question for the case of visual analogical reasoning. Using realistic images of familiar three-dimensional objects (cars and their parts), we systematically manipulated viewpoints, part relations, and entity properties in visual analogy problems. We compared human performance to that of two recent deep learning models (Siamese Network and Relation Network) that were directly trained to solve these problems and to apply their task-specific knowledge to analogical reasoning. We also developed a new model using part-based comparison (PCM) by applying a domain-general mapping procedure to learned representations of cars and their component parts. Across four-term analogies (Experiment 1) and open-ended analogies (Experiment 2), the domain-general PCM model, but not the task-specific deep learning models, generated performance similar in key aspects to that of human reasoners. These findings provide evidence that human-like analogical reasoning is unlikely to be achieved by applying deep learning with big data to a specific type of analogy problem. Rather, humans do (and machines might) achieve analogical reasoning by learning representations that encode structural information useful for multiple tasks, coupled with efficient computation of relational similarity.


Subject(s)
Artificial Intelligence , Intelligence , Humans , Knowledge , Problem Solving
8.
IEEE Trans Pattern Anal Mach Intell ; 45(7): 9225-9232, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37018583

ABSTRACT

Batch normalization (BN) is a fundamental unit in modern deep neural networks. However, BN and its variants focus on normalization statistics but neglect the recovery step that uses linear transformation to improve the capacity of fitting complex data distributions. In this paper, we demonstrate that the recovery step can be improved by aggregating the neighborhood of each neuron rather than just considering a single neuron. Specifically, we propose a simple yet effective method named batch normalization with enhanced linear transformation (BNET) to embed spatial contextual information and improve representation ability. BNET can be easily implemented using the depth-wise convolution and seamlessly transplanted into existing architectures with BN. To our best knowledge, BNET is the first attempt to enhance the recovery step for BN. Furthermore, BN is interpreted as a special case of BNET from both spatial and spectral views. Experimental results demonstrate that BNET achieves consistent performance gains based on various backbones in a wide range of visual tasks. Moreover, BNET can accelerate the convergence of network training and enhance spatial information by assigning important neurons with large weights accordingly.
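The recovery step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it handles a single channel along one spatial axis in plain Python, whereas the abstract refers to depth-wise convolution on feature maps, and the function names `bnet_1d` and `plain_bn_1d` are hypothetical. The point it shows is that replacing BN's scalar scale with a small per-channel filter lets each output mix neighboring normalized activations, and that a filter with a single non-zero center tap reduces to ordinary BN.

```python
import math

def batch_norm_stats(batch):
    """Mean and variance of one channel over all samples and positions."""
    values = [v for sample in batch for v in sample]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

def plain_bn_1d(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Standard BN recovery: one scalar scale and shift per channel."""
    mean, var = batch_norm_stats(batch)
    std = math.sqrt(var + eps)
    return [[gamma * ((v - mean) / std) + beta for v in sample] for sample in batch]

def bnet_1d(batch, kernel, bias=0.0, eps=1e-5):
    """Sketch of an enhanced recovery step (assumed 1D, single channel).

    kernel is an odd-length per-channel filter applied to the normalized
    activations; it plays the role of BN's scalar gamma.
    """
    mean, var = batch_norm_stats(batch)
    std = math.sqrt(var + eps)
    half = len(kernel) // 2
    out = []
    for sample in batch:
        normed = [(v - mean) / std for v in sample]
        padded = [0.0] * half + normed + [0.0] * half  # zero padding keeps length
        out.append([
            sum(k * padded[i + j] for j, k in enumerate(kernel)) + bias
            for i in range(len(sample))
        ])
    return out

# Hypothetical toy activations: two samples, four positions each.
toy_batch = [[1.0, 2.0, 3.0, 4.0], [0.0, -1.0, 5.0, 2.0]]
smoothed = bnet_1d(toy_batch, kernel=[0.25, 0.5, 0.25])
```

With `kernel=[0.0, gamma, 0.0]` and `bias=beta`, `bnet_1d` returns the same values as `plain_bn_1d`, which mirrors the abstract's remark that BN is a special case of BNET.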

9.
bioRxiv ; 2023 Jan 28.
Article in English | MEDLINE | ID: mdl-36747709

ABSTRACT

Pancreatic intraepithelial neoplasia (PanIN) is a precursor to pancreatic cancer and represents a critical opportunity for cancer interception. However, the number, size, shape, and connectivity of PanINs in human pancreatic tissue samples are largely unknown. In this study, we quantitatively assessed human PanINs using CODA, a novel machine-learning pipeline for 3D image analysis that generates quantifiable models of large pieces of human pancreas with single-cell resolution. Using a cohort of 38 large slabs of grossly normal human pancreas from surgical resection specimens, we identified striking multifocality of PanINs, with a mean burden of 13 spatially separate PanINs per cm³ of sampled tissue. Extrapolating this burden to the entire pancreas suggested a median of approximately 1000 PanINs in an entire pancreas. In order to better understand the clonal relationships within and between PanINs, we developed a pipeline for CODA-guided multi-region genomic analysis of PanINs, including targeted and whole exome sequencing. Multi-region assessment of 37 PanINs from eight additional human pancreatic tissue slabs revealed that almost all PanINs contained hotspot mutations in the oncogene KRAS, but no gene other than KRAS was altered in more than 20% of the analyzed PanINs. PanINs contained a mean of 13 somatic mutations per region when analyzed by whole exome sequencing. The majority of analyzed PanINs originated from independent clonal events, with distinct somatic mutation profiles between PanINs in the same tissue slab. A subset of the analyzed PanINs contained multiple KRAS mutations, suggesting a polyclonal origin even in PanINs that are contiguous by rigorous 3D assessment. This study leverages a novel 3D genomic mapping approach to describe, for the first time, the spatial and genetic multifocality of human PanINs, providing important insights into the initiation and progression of pancreatic neoplasia.

10.
IEEE Trans Med Imaging ; 41(6): 1346-1357, 2022 06.
Article in English | MEDLINE | ID: mdl-34968179

ABSTRACT

The spleen is one of the most commonly injured solid organs in blunt abdominal trauma. The development of automatic segmentation systems from multi-phase CT for splenic vascular injury can augment severity grading for improving clinical decision support and outcome prediction. However, accurate segmentation of splenic vascular injury is challenging for the following reasons: 1) Splenic vascular injury can be highly variant in shape, texture, size, and overall appearance; and 2) Data acquisition is a complex and expensive procedure that requires intensive efforts from both data scientists and radiologists, which makes large-scale well-annotated datasets hard to acquire in general. In light of these challenges, we hereby design a novel framework for multi-phase splenic vascular injury segmentation, especially with limited data. On the one hand, we propose to leverage external data to mine pseudo splenic masks as the spatial attention, dubbed external attention, for guiding the segmentation of splenic vascular injury. On the other hand, we develop a synthetic phase augmentation module, which builds upon generative adversarial networks, for populating the internal data by fully leveraging the relation between different phases. By jointly enforcing external attention and populating internal data representation during training, our proposed method outperforms other competing methods and substantially improves the popular DeepLab-v3+ baseline by more than 7% in terms of average DSC, which confirms its effectiveness.


Subject(s)
Spleen , Vascular System Injuries , Abdomen , Attention , Humans , Image Processing, Computer-Assisted/methods , Spleen/diagnostic imaging , Tomography, X-Ray Computed
11.
AJR Am J Roentgenol ; 217(5): 1104-1112, 2021 11.
Article in English | MEDLINE | ID: mdl-34467768

ABSTRACT

OBJECTIVE. Pancreatic ductal adenocarcinoma (PDAC) is often a lethal malignancy with limited preoperative predictors of long-term survival. The purpose of this study was to evaluate the prognostic utility of preoperative CT radiomics features in predicting postoperative survival of patients with PDAC. MATERIALS AND METHODS. A total of 153 patients with surgically resected PDAC who underwent preoperative CT between 2011 and 2017 were retrospectively identified. Demographic, clinical, and survival information was collected from the medical records. Survival time after the surgical resection was used to stratify patients into a low-risk group (survival time > 3 years) and a high-risk group (survival time < 1 year). The 3D volume of the whole pancreatic tumor and background pancreas were manually segmented. A total of 478 radiomics features were extracted from tumors and 11 extra features were computed from pancreas boundaries. The 10 most relevant features were selected by feature reduction. Survival analysis was performed on the basis of clinical parameters both with and without the addition of the selected features. Survival status and time were estimated by a random survival forest algorithm. Concordance index (C-index) was used to evaluate performance of the survival prediction model. RESULTS. The mean age of patients with PDAC was 67 ± 11 (SD) years. The mean tumor size was 3.31 ± 2.55 cm. The 10 most relevant radiomics features showed 82.2% accuracy in the classification of high-risk versus low-risk groups. The C-index of survival prediction with clinical parameters alone was 0.6785. The addition of CT radiomics features improved the C-index to 0.7414. CONCLUSION. Addition of CT radiomics features to standard clinical factors improves survival prediction in patients with PDAC.
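The concordance index used in this abstract to evaluate survival prediction can be sketched as follows. This is a generic version of Harrell's C-index for right-censored data, not the study's code; the function name, the convention that a higher risk score means shorter expected survival, and the toy values are all assumptions made for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if death was observed, 0 if the subject was censored
    risks  : predicted risk scores (assumed: higher means shorter survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # model ranks the earlier death as riskier
                elif risks[i] == risks[j]:
                    concordant += 0.5   # ties count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical toy cohort: four patients, the third one censored.
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.4, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))  # 1.0
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the reported change from 0.6785 to 0.7414.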


Subject(s)
Carcinoma, Pancreatic Ductal/diagnostic imaging , Carcinoma, Pancreatic Ductal/mortality , Pancreatic Neoplasms/diagnostic imaging , Pancreatic Neoplasms/mortality , Preoperative Care , Tomography, X-Ray Computed , Adult , Aged , Aged, 80 and over , Carcinoma, Pancreatic Ductal/surgery , Female , Humans , Machine Learning , Male , Middle Aged , Pancreatic Neoplasms/surgery , Prognosis , Retrospective Studies , Survival Analysis , Tumor Burden
12.
J Comput Assist Tomogr ; 45(3): 343-351, 2021.
Article in English | MEDLINE | ID: mdl-34297507

ABSTRACT

Artificial intelligence is poised to revolutionize medical imaging. It takes advantage of the high-dimensional quantitative features present in medical images that may not be fully appreciated by humans. Artificial intelligence has the potential to facilitate automatic organ segmentation, disease detection and characterization, and prediction of disease recurrence. This article reviews the current status of artificial intelligence in liver imaging and reviews the opportunities and challenges in clinical implementation.


Subject(s)
Liver Neoplasms/diagnostic imaging , Radiographic Image Interpretation, Computer-Assisted/methods , Deep Learning , Humans , Liver/diagnostic imaging , Neoplasm Recurrence, Local
13.
IEEE Trans Med Imaging ; 40(10): 2723-2735, 2021 10.
Article in English | MEDLINE | ID: mdl-33600311

ABSTRACT

Pancreatic ductal adenocarcinoma (PDAC) is the third most common cause of cancer death in the United States. Predicting tumors like PDACs (including both classification and segmentation) from medical images by deep learning is becoming a growing trend, but usually a large amount of annotated data is required for training, which is very labor-intensive and time-consuming. In this paper, we consider a partially supervised setting, where cheap image-level annotations are provided for all the training data, and the costly per-voxel annotations are only available for a subset of them. We propose an Inductive Attention Guidance Network (IAG-Net) to jointly learn a global image-level classifier for normal/PDAC classification and a local voxel-level classifier for semi-supervised PDAC segmentation. We instantiate both the global and the local classifiers by multiple instance learning (MIL), where the attention guidance, indicating roughly where the PDAC regions are, is the key to bridging them: for global MIL based normal/PDAC classification, attention serves as a weight for each instance (voxel) during MIL pooling, which eliminates the distraction from the background; for local MIL based semi-supervised PDAC segmentation, the attention guidance is inductive, which not only provides bag-level pseudo-labels to training data without per-voxel annotations for MIL training, but also acts as a proxy of an instance-level classifier. Experimental results show that our IAG-Net boosts PDAC segmentation accuracy by more than 5% compared with the state of the art.
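The idea that attention acts as a per-instance weight during MIL pooling can be illustrated with a small sketch. The abstract does not give the pooling formula, so the normalized weighted average below is an assumption, and `attention_mil_pool` and the toy values are hypothetical rather than taken from IAG-Net.

```python
def attention_mil_pool(instance_scores, attention):
    """Attention-weighted average of per-instance (voxel) scores.

    instance_scores : per-voxel tumor scores in [0, 1]
    attention       : non-negative weights, larger where tumor is suspected
    The weights are normalized to sum to one (an assumed convention).
    """
    total = sum(attention)
    if total <= 0:
        raise ValueError("attention weights must have a positive sum")
    return sum(a * s for a, s in zip(attention, instance_scores)) / total

# Hypothetical bag: two suspicious voxels followed by two background voxels.
toy_scores = [0.9, 0.8, 0.1, 0.1]
focused = attention_mil_pool(toy_scores, [1.0, 1.0, 0.0, 0.0])
uniform = attention_mil_pool(toy_scores, [1.0, 1.0, 1.0, 1.0])
```

In this toy bag the focused weights give a bag score near 0.85, while uniform weights dilute it to about 0.475, which is the "distraction from the background" the abstract says attention removes.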


Subject(s)
Adenocarcinoma , Pancreatic Neoplasms , Attention , Humans , Pancreatic Neoplasms/diagnostic imaging , Supervised Machine Learning
14.
Abdom Radiol (NY) ; 46(6): 2556-2566, 2021 06.
Article in English | MEDLINE | ID: mdl-33469691

ABSTRACT

PURPOSE: In patients presenting with blunt hepatic injury (BHI), the utility of CT for triage to hepatic angiography remains uncertain since simple binary assessment of contrast extravasation (CE) as being present or absent has only modest accuracy for major arterial injury on digital subtraction angiography (DSA). American Association for the Surgery of Trauma (AAST) liver injury grading is coarse and subjective, with limited diagnostic utility in this setting. Volumetric measurements of hepatic injury burden could improve prediction. We hypothesized that in a cohort of patients that underwent catheter-directed hepatic angiography following admission trauma CT, a deep learning quantitative visualization method that calculates % liver parenchymal disruption (the LPD index, or LPDI) would add value to CE assessment for prediction of major hepatic arterial injury (MHAI). METHODS: This retrospective study included adult patients with BHI between 1/1/2008 and 5/1/2017 from two institutions that underwent admission trauma CT prior to hepatic angiography (n = 73). Presence (n = 41) or absence (n = 32) of MHAI (pseudoaneurysm, AVF, or active contrast extravasation on DSA) served as the outcome. Voxelwise measurements of liver laceration were derived using an existing multiscale deep learning algorithm trained on manually labeled data using cross-validation with a 75-25% split in four unseen folds. Liver volume was derived using a pre-trained whole liver segmentation algorithm. LPDI was automatically calculated for each patient by determining the percentage of liver involved by laceration. Classification and regression tree (CART) analyses were performed using a combination of automated LPDI measurements and either manually segmented CE volumes, or CE as a binary sign. Performance metrics for the decision rules were compared for significant differences with binary CE alone (the current standard of care for predicting MHAI), and the AAST grade. 
RESULTS: 36% of patients (n = 26) had contrast extravasation on CT. Median [Q1-Q3] automated LPDI was 4.0% [1.0-12.1%]. 41/73 (56%) of patients had MHAI. A decision tree based on auto-LPDI and volumetric CE measurements (CEvol) had the highest accuracy (0.84, 95% CI 0.73-0.91) with significant improvement over binary CE assessment (0.68, 95% CI 0.57-0.79; p = 0.01). AAST grades at different cut-offs performed poorly for predicting MHAI, with accuracies ranging from 0.44-0.63. Decision tree analysis suggests an auto-LPDI cut-off of ≥ 12% for minimizing false negative CT exams when CE is absent or diminutive. CONCLUSION: Current CT imaging paradigms are coarse, subjective, and limited for predicting which BHIs are most likely to benefit from AE. LPDI, automated using deep learning methods, may improve objective personalized triage of BHI patients to angiography at the point of care.
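The LPDI described above is the percentage of liver involved by laceration, which can be sketched from two binary masks. The function name `lpd_index`, the flattened-list mask representation, and the choice to count only laceration voxels that fall inside the liver mask are assumptions for illustration; the study derives both masks from deep learning segmentation algorithms.

```python
def lpd_index(liver_mask, laceration_mask):
    """Percent of liver voxels that are also labeled as laceration.

    Both masks are flattened sequences of 0/1 values of equal length.
    Laceration voxels outside the liver mask are ignored (an assumption).
    """
    liver_voxels = 0
    injured_voxels = 0
    for liver, laceration in zip(liver_mask, laceration_mask):
        if liver:
            liver_voxels += 1
            if laceration:
                injured_voxels += 1
    if liver_voxels == 0:
        raise ValueError("empty liver mask")
    return 100.0 * injured_voxels / liver_voxels

# Hypothetical toy masks: 8 liver voxels, 1 of them lacerated.
toy_liver = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1]
toy_laceration = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
print(lpd_index(toy_liver, toy_laceration))  # 12.5
```

Because every voxel has the same volume within one scan, the voxel-count ratio equals the volume ratio, so no voxel spacing is needed for the percentage itself.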


Subject(s)
Deep Learning , Adult , Decision Trees , Humans , Liver/diagnostic imaging , Retrospective Studies , Tomography, X-Ray Computed
15.
Cogsci ; 43: 223-229, 2021 Jul.
Article in English | MEDLINE | ID: mdl-35969705

ABSTRACT

Perceiving 3D structure in natural images is an immense computational challenge for the visual system. While many previous studies focused on the perception of rigid 3D objects, we applied a novel method to a common set of non-rigid objects: static images of the human body in the natural world. We investigated to what extent human ability to interpret 3D poses in natural images depends on the typicality of the underlying 3D pose and the informativeness of the viewpoint. Using a novel 2AFC pose matching task, we measured how well subjects were able to match a target natural pose image with one of two synthetic comparison body images from a different viewpoint: one was rendered with the same 3D pose parameters as the target while the other was a distractor rendered with added noise on joint angles. We found that performance for typical poses was measurably better than atypical poses; however, we found no significant difference between informative and less informative viewpoints. Further comparisons of 2D and 3D pose matching models on the same task showed that 3D body knowledge is particularly important when interpreting images of atypical poses. These results suggested that human ability to interpret 3D poses depends on pose typicality but not viewpoint informativeness, and that humans probably use prior knowledge of 3D pose structures.

16.
IEEE Trans Pattern Anal Mach Intell ; 43(2): 404-419, 2021 02.
Article in English | MEDLINE | ID: mdl-31449007

ABSTRACT

Age estimation from facial images is typically cast as a label distribution learning or regression problem, since aging is a gradual process. The main challenge is that the facial feature space with respect to age is inhomogeneous, due to the large variation in facial appearance across different persons of the same age and the non-stationary property of aging. In this paper, we propose two Deep Differentiable Random Forests methods, Deep Label Distribution Learning Forest (DLDLF) and Deep Regression Forest (DRF), for age estimation. Both of them connect split nodes to the top layer of convolutional neural networks (CNNs) and deal with inhomogeneous data by jointly learning input-dependent data partitions at the split nodes and age distributions at the leaf nodes. This joint learning follows an alternating strategy: (1) Fixing the leaf nodes and optimizing the split nodes and the CNN parameters by Back-propagation; (2) Fixing the split nodes and optimizing the leaf nodes by Variational Bounding. Two Deterministic Annealing processes are introduced into the learning of the split and leaf nodes, respectively, to avoid poor local optima and obtain better estimates of tree parameters free of initial values. Experimental results show that DLDLF and DRF achieve state-of-the-art performance on three age estimation datasets.


Subject(s)
Algorithms , Neural Networks, Computer , Face , Learning
17.
Curr Probl Diagn Radiol ; 50(4): 540-550, 2021.
Article in English | MEDLINE | ID: mdl-32988674

ABSTRACT

Computed tomography is the most commonly used imaging modality to detect and stage pancreatic cancer. Previous advances in pancreatic cancer imaging have focused on optimizing image acquisition parameters and reporting standards. However, current state-of-the-art imaging approaches still misdiagnose some potentially curable pancreatic cancers and do not provide prognostic information or inform optimal management strategies beyond stage. Several recent developments in pancreatic cancer imaging, including artificial intelligence and advanced visualization techniques, are rapidly changing the field. The purpose of this article is to review how these recent advances have the potential to revolutionize pancreatic cancer imaging.


Subject(s)
Artificial Intelligence , Pancreatic Neoplasms , Humans , Pancreatic Neoplasms/diagnostic imaging , Tomography, X-Ray Computed
18.
Radiol Artif Intell ; 2(6): e190220, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33330848

ABSTRACT

PURPOSE: To evaluate the feasibility of a multiscale deep learning algorithm for quantitative visualization and measurement of traumatic hemoperitoneum and to compare diagnostic performance for relevant outcomes with categorical estimation. MATERIALS AND METHODS: This retrospective, single-institution study included 130 patients (mean age, 38 years; interquartile range, 25-50 years; 79 men) with traumatic hemoperitoneum who underwent CT of the abdomen and pelvis at trauma admission between January 2016 and April 2019. Labeled cases were separated into five combinations of training (80%) and test (20%) sets, and fivefold cross-validation was performed. Dice similarity coefficients (DSCs) were compared with those from a three-dimensional (3D) U-Net and a coarse-to-fine deep learning method. Areas under the receiver operating characteristic curve (AUCs) for a composite outcome, including hemostatic intervention, transfusion, and in-hospital mortality, were compared with consensus categorical assessment by two radiologists. An optimal cutoff was derived by using a radial basis function-based support vector machine. RESULTS: Mean DSC for the multiscale algorithm was 0.61 ± 0.15 (standard deviation) compared with 0.32 ± 0.16 for the 3D U-Net method and 0.52 ± 0.17 for the coarse-to-fine method (P < .0001). Correlation and agreement between automated and manual volumes were excellent (Pearson r = 0.97, intraclass correlation coefficient = 0.93). The algorithm produced intuitive and explainable visual results. AUCs for automated volume measurement and categorical estimation were 0.86 and 0.77, respectively (P = .004). An optimal cutoff of 278.9 mL yielded accuracy of 84%, sensitivity of 82%, specificity of 93%, positive predictive value of 86%, and negative predictive value of 83%. 
CONCLUSION: A multiscale deep learning method for traumatic hemoperitoneum quantitative visualization had improved diagnostic performance for predicting hemorrhage-control interventions and mortality compared with subjective volume estimation. Supplemental material is available for this article. © RSNA, 2020.
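The Dice similarity coefficient used to compare the segmentation methods is defined as twice the overlap of two masks divided by the sum of their sizes. A minimal sketch follows, assuming binary masks flattened to equal-length sequences of 0/1 values; the function name `dice` and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the study.

```python
def dice(pred, truth):
    """Dice similarity coefficient between two flattened binary masks.

    DSC = 2 * |pred AND truth| / (|pred| + |truth|).
    Assumed convention: two empty masks count as a perfect match (1.0).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if size_sum == 0:
        return 1.0
    return 2.0 * overlap / size_sum
```

For example, a prediction covering two voxels of which one overlaps a one-voxel reference mask gives 2 × 1 / (2 + 1) ≈ 0.67.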

19.
Article in English | MEDLINE | ID: mdl-32956057

ABSTRACT

Deep learning of optical flow has been an active research area owing to its empirical success. Because accurate dense correspondence labels are difficult to obtain, unsupervised learning of optical flow has drawn increasing attention, although its accuracy is still far from satisfactory. Holding the philosophy that better estimation models can be trained with better-approximated labels, which in turn can be obtained from better estimation models, we propose a self-taught learning framework that continually improves accuracy using self-generated pseudo labels. The estimated optical flow is first filtered by bidirectional flow consistency validation, and occlusion-aware dense labels are then generated by edge-aware interpolation from the selected sparse matches. Moreover, combining the reconstruction loss with a regression loss on the generated pseudo labels further improves performance. The experimental results demonstrate that our models achieve state-of-the-art results among unsupervised methods on the public KITTI, MPI-Sintel and Flying Chairs datasets.
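The bidirectional consistency filter can be sketched as a forward-backward check: follow the forward flow from a pixel, read the backward flow at the landing point, and keep the match only if the two roughly cancel. The abstract does not give the exact criterion, so the fixed tolerance `tol`, the nearest-pixel lookup, and the name `consistent_mask` below are all assumptions made for illustration.

```python
import math


def consistent_mask(flow_fw, flow_bw, tol=1.0):
    """Mark pixels whose forward and backward flows agree.

    flow_fw and flow_bw are H x W grids (lists of rows) of (dx, dy) tuples.
    A pixel is kept when the backward flow at its forward-warped position
    cancels the forward flow to within `tol` pixels (assumed threshold).
    """
    height = len(flow_fw)
    width = len(flow_fw[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow_fw[y][x]
            # Nearest-pixel lookup; a real system would likely interpolate.
            tx = int(round(x + dx))
            ty = int(round(y + dy))
            if not (0 <= tx < width and 0 <= ty < height):
                continue  # the pixel leaves the frame: treat as unreliable
            bx, by = flow_bw[ty][tx]
            mask[y][x] = math.hypot(dx + bx, dy + by) <= tol
    return mask
```

Pixels that fail the check are typically occluded or badly estimated; in the pipeline described above, only the surviving sparse matches would feed the interpolation step that produces dense pseudo labels.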

20.
Med Image Anal ; 65: 101766, 2020 10.
Article in English | MEDLINE | ID: mdl-32623276

ABSTRACT

Although deep learning-based approaches have achieved great success in medical image segmentation, they usually require large amounts of well-annotated data, which can be extremely expensive to obtain in medical image analysis. Unlabeled data, on the other hand, is much easier to acquire. Semi-supervised learning and unsupervised domain adaptation both take advantage of unlabeled data, and they are closely related to each other. In this paper, we propose uncertainty-aware multi-view co-training (UMCT), a unified framework that addresses these two tasks for volumetric medical image segmentation. Our framework is capable of efficiently utilizing unlabeled data for better performance. We first rotate and permute the 3D volumes into multiple views and train a 3D deep network on each view. We then apply co-training by enforcing multi-view consistency on unlabeled data, where an uncertainty estimate for each view is used to achieve accurate labeling. Experiments on the NIH pancreas segmentation dataset and a multi-organ segmentation dataset show state-of-the-art performance of the proposed framework on semi-supervised medical image segmentation. Under unsupervised domain adaptation settings, we validate the effectiveness of this work by adapting our multi-organ segmentation model to two pathological organs from the Medical Segmentation Decathlon datasets. Additionally, we show that our UMCT-DA model can effectively handle even the challenging situation where labeled source data is inaccessible, demonstrating strong potential for real-world applications.
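One way to picture how per-view uncertainty can guide labeling is to fuse the views' class probabilities with weights that shrink as uncertainty grows. The sketch below is an illustration only: the inverse-uncertainty weighting, the per-voxel interface, and the names `fuse_views` and `pseudo_label` are assumptions, and the abstract does not state how the uncertainty is estimated or combined.

```python
def fuse_views(view_probs, view_uncertainty, eps=1e-8):
    """Fuse class probabilities for one voxel across several views.

    view_probs: one class-probability list per view.
    view_uncertainty: one non-negative uncertainty score per view.
    Assumed scheme: each view is weighted by 1 / (uncertainty + eps).
    """
    if len(view_probs) != len(view_uncertainty):
        raise ValueError("one uncertainty value per view is required")
    weights = [1.0 / (u + eps) for u in view_uncertainty]
    total = sum(weights)
    n_classes = len(view_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, view_probs)) / total
        for c in range(n_classes)
    ]


def pseudo_label(view_probs, view_uncertainty):
    """Class index with the highest fused probability."""
    fused = fuse_views(view_probs, view_uncertainty)
    return max(range(len(fused)), key=lambda c: fused[c])
```

In a co-training setup, the fused prediction from the other views could serve as the pseudo label for the view being trained, so that confident views steer uncertain ones on unlabeled data.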


Subject(s)
Supervised Machine Learning , Humans , Uncertainty