Results 1 - 11 of 11
1.
Acad Radiol ; 31(4): 1518-1527, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37951778

ABSTRACT

OBJECTIVES: To develop a deep learning (DL) model for segmentation of the suprapatellar capsule (SC) and infrapatellar fat pad (IPFP) based on sagittal proton density-weighted images and to distinguish between three common types of knee synovitis. MATERIALS AND METHODS: This retrospective study included 376 consecutive patients with pathologically confirmed knee synovitis (rheumatoid arthritis, gouty arthritis, and pigmented villonodular synovitis) from two institutions. A semantic segmentation model was trained on manually annotated sagittal proton density-weighted images. The segmentation results for the regions of interest, together with patients' sex and age, were used to classify knee synovitis after feature processing. Classification by the DL method was compared with classification by radiologists. RESULTS: Data from the 376 patients (mean age, 42 ± 15 years; 216 men) were separated into a training set (n = 233), an internal test set (n = 93), and an external test set (n = 50). The automated segmentation model showed good performance (mean accuracy: 0.99 in both the internal and external test sets). On the internal test set, the DL model performed better than the senior radiologist (accuracy: 0.86 vs. 0.79; area under the curve [AUC]: 0.83 vs. 0.79). On the external test set, the DL diagnostic model based on automatic segmentation performed as well as or better than the senior and junior radiologists (accuracy: 0.79 vs. 0.79 vs. 0.73; AUC: 0.76 vs. 0.77 vs. 0.70). CONCLUSION: DL models for segmentation of the SC and IPFP can accurately classify knee synovitis and aid radiologic diagnosis.
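As an illustration of how such a segmentation-plus-classification pipeline is typically evaluated and assembled, the sketch below computes a Dice overlap for a predicted mask and appends age and sex to segmentation-derived region features, as the abstract describes. The feature layout and age scaling are assumptions for illustration, not the paper's code.

```python
import numpy as np

def dice_score(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-7) -> float:
    """Dice overlap between two binary masks (1.0 = perfect agreement)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return float((2.0 * inter + eps) / (pred.sum() + gt.sum() + eps))

def build_classifier_features(region_feats: np.ndarray, age: float, sex: int) -> np.ndarray:
    """Append age and sex to segmentation-derived region features, mirroring
    the abstract's use of both for the synovitis classifier (layout assumed)."""
    return np.concatenate([region_feats, [age / 100.0, float(sex)]])
```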


Subject(s)
Deep Learning, Synovitis, Male, Humans, Adult, Middle Aged, Retrospective Studies, Protons, Synovitis/diagnostic imaging, Magnetic Resonance Imaging/methods
2.
IEEE Trans Image Process ; 32: 2608-2619, 2023.
Article in English | MEDLINE | ID: mdl-37040250

ABSTRACT

Visual object navigation is an essential task in embodied AI, in which an agent must navigate to a goal object specified by the user. Previous methods often focus on single-object navigation. In real life, however, human demands are generally continuous and multiple, requiring the agent to complete several tasks in sequence. Such demands can be addressed by repeatedly running single-task methods, but dividing multiple tasks into independent sub-tasks forgoes global optimization across tasks: the agent's trajectories may overlap, reducing navigation efficiency. In this paper, we propose an efficient reinforcement learning framework with a hybrid policy for multi-object navigation, aiming to eliminate ineffective actions as far as possible. First, visual observations are embedded to detect semantic entities (such as objects). The detected objects are memorized and projected into a semantic map, which serves as a long-term memory of the observed environment. A hybrid policy consisting of exploration and long-term planning strategies then predicts the potential target position. When the target is directly observed, the policy function plans a long-term path to it based on the semantic map, implemented as a sequence of motion actions. When the target is not observed, the policy function estimates its potential position by exploring the objects (positions) most closely related to the target. The relations between objects are obtained from prior knowledge and, integrated with the memorized semantic map, are used to predict the potential target position; the policy function then plans a path to it.
We evaluate our proposed method on two large-scale 3D realistic environment datasets, Gibson and Matterport3D, and the experimental results demonstrate the effectiveness and generalization of the proposed method.
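A minimal sketch of the hybrid-policy idea described above, assuming a grid-world semantic map and a hand-supplied object-relation table (neither is the paper's actual implementation):

```python
from collections import deque

def plan_path(grid, start, goal):
    """Shortest 4-connected path on an occupancy grid (0 free, 1 blocked) via BFS."""
    rows, cols = len(grid), len(grid[0])
    prev = {start: None}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = prev[cur]
            return path[::-1]
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] == 0 and nxt not in prev):
                prev[nxt] = cur
                queue.append(nxt)
    return None  # goal unreachable

def hybrid_policy(semantic_map, relations, target, agent_pos, grid):
    """If the target has already been observed, plan a long-term path to it;
    otherwise explore toward the known object most related to the target."""
    if target in semantic_map:
        return "plan", plan_path(grid, agent_pos, semantic_map[target])
    related = max(semantic_map,
                  key=lambda obj: relations.get((obj, target), 0.0),
                  default=None)
    if related is None:
        return "explore", None  # nothing observed yet: free exploration
    return "explore", plan_path(grid, agent_pos, semantic_map[related])
```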

3.
IEEE Trans Pattern Anal Mach Intell ; 45(1): 229-246, 2023 Jan.
Article in English | MEDLINE | ID: mdl-35201982

ABSTRACT

The goal of few-shot image recognition (FSIR) is to identify novel categories with a small number of annotated samples by exploiting transferable knowledge from training data (base categories). Most current studies assume that the transferable knowledge can be used directly to identify novel categories. However, this transferability may be impacted by dataset bias, a problem that has rarely been investigated. In addition, most few-shot learning methods perform differently on different datasets, another important issue that needs deeper investigation. In this paper, we first investigate the impact of the transferable capabilities learned from base categories. Specifically, we use relevance to measure the relationship between base and novel categories, and depict the distribution of base categories via instance density and category diversity. The FSIR model learns better transferable knowledge from relevant training data; within relevant data, dense instances or diverse categories further enrich the learned knowledge. Experimental results on different sub-datasets of ImageNet demonstrate that category relevance, instance density, and category diversity can characterize the transferability bias arising from the distributions of base categories. Second, we investigate performance differences across datasets from the perspectives of dataset structure and few-shot learning method. Specifically, we introduce image complexity, intra-concept visual consistency, and inter-concept visual similarity to quantify characteristics of dataset structures. Using these quantitative characteristics and eight few-shot learning methods, we analyze performance differences on multiple datasets. Based on this experimental analysis, we obtain insightful observations from the perspective of both dataset structures and few-shot learning methods, which we hope will guide future few-shot learning research on new datasets or tasks.
Our data is available at http://123.57.42.89/dataset-bias/dataset-bias.html.
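The two base-category distribution measures named in the abstract can be sketched as follows. The paper's exact formulas are not given in the abstract, so mean instances-per-category stands in for instance density and Shannon entropy stands in for category diversity:

```python
import math
from collections import Counter

def instance_density(labels):
    """Mean number of training instances per base category (stand-in definition)."""
    counts = Counter(labels)
    return sum(counts.values()) / len(counts)

def category_diversity(labels):
    """Shannon entropy of the category distribution, a simple stand-in for
    the paper's diversity measure."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())
```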

4.
Ecol Evol ; 13(10): e10653, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37869444

ABSTRACT

The endangerment mechanisms of various species are a focus of studies on biodiversity and conservation biology. Hipposideros pomona is an endangered species, but the reasons behind its endangerment remain unclear. We investigated the endangerment mechanisms of H. pomona using mitochondrial DNA, nuclear DNA, and microsatellite markers. The results showed that the nucleotide diversity of mitochondrial DNA and the heterozygosity of microsatellite markers were high (π = 0.04615, HO = 0.7115), whereas the nucleotide diversity of the nuclear genes was low (THY: π = 0.00508, SORBS2: π = 0.00677, ACOX2: π = 0.00462, COPS7A: π = 0.00679). The phylogenetic tree and median-joining network based on mitochondrial DNA sequences clustered the species into three clades: the North Vietnam-Fujian, Myanmar-West Yunnan, and Laos-Hainan clades. However, joint analysis of the nuclear genes did not exhibit this clustering. Analysis of molecular variance revealed a strong population genetic structure; IMa2 analysis did not reveal significant gene flow between any groups (p > .05), and isolation-by-distance analysis revealed a significant positive correlation between genetic and geographic distances (p < .05). Mismatch distribution analysis, neutrality tests, and Bayesian skyline plots indicated that the H. pomona population was relatively stable overall but showed a contraction trend. The results imply that H. pomona exhibits female philopatry and male-biased dispersal. The Hengduan Mountains could have acted as a geographical barrier to gene flow between the North Vietnam-Fujian and Myanmar-West Yunnan clades, whereas the Qiongzhou Strait may have limited interaction between the Hainan populations and the other clades. The warm climate during the second Quaternary interglacial period (c. 0.33 Mya) could have driven species differentiation, whereas the cold climate during the late Quaternary last glacial maximum (c. 10 ka BP) might have caused the overall contraction of the species. The lack of significant gene flow at nuclear microsatellite loci among the populations investigated reflects recent habitat fragmentation due to anthropogenic activities; thus, on-site conservation of the species and restoration of gene-flow corridors among populations need immediate implementation.
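The nucleotide diversity (π) statistics quoted above are, in essence, the mean pairwise proportion of differing sites across aligned sequences. A minimal sketch of that computation (not the study's pipeline, which would have used dedicated population-genetics software):

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Nucleotide diversity (pi): mean proportion of differing sites over
    all pairs of aligned, equal-length sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0

    def pairwise_diff(a, b):
        return sum(x != y for x, y in zip(a, b)) / len(a)

    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)
```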

5.
IEEE Trans Image Process ; 32: 5678-5691, 2023.
Article in English | MEDLINE | ID: mdl-37812539

ABSTRACT

The goal of few-shot image recognition is to classify different categories with only one or a few training samples. Previous work on few-shot learning has mainly focused on simple images, such as object or character images, typically using a convolutional neural network (CNN) to learn global image representations from training tasks, which are then adapted to novel tasks. However, the real world contains many more abstract and complex images, such as scene images, which consist of many object entities with flexible spatial relations among them. In such cases, global features can hardly achieve satisfactory generalization due to the large diversity of object relations in scenes, which may hinder adaptability to novel scenes. This paper proposes a composite object relation modeling method for few-shot scene recognition that captures the spatial structural characteristics of scene images to enhance adaptability to novel scenes, exploiting the fact that objects commonly co-occur across different scenes. In different few-shot scene recognition tasks, the objects in the same image usually play different roles; we therefore propose a task-aware region selection module (TRSM) to further select the detected regions for each few-shot task. Beyond detecting object regions, we mainly focus on exploiting the relations between objects, which are more consistent with the scenes and can be used to distinguish different scenes. The objects and relations in each image are used to construct a graph, which is then modeled with a graph convolutional network. The graph modeling is jointly optimized with few-shot recognition, so the few-shot learning loss can also adjust the graph-based representations. Notably, the proposed graph-based representations can be plugged into different types of few-shot architectures, such as metric-based and meta-learning methods. Experimental results on few-shot scene recognition show the effectiveness of the proposed method.
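The graph-modeling step can be sketched with a single graph-convolution layer over an object adjacency matrix. This is a generic GCN layer under standard symmetric-normalization assumptions, not the paper's architecture:

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph-convolution layer over the object graph: add self-loops,
    symmetrically normalise the adjacency, linearly transform, apply ReLU."""
    a = adj + np.eye(adj.shape[0])                  # self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a.sum(axis=1)))
    a_norm = d_inv_sqrt @ a @ d_inv_sqrt            # D^-1/2 (A+I) D^-1/2
    return np.maximum(a_norm @ feats @ weight, 0.0)  # ReLU
```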

6.
Integr Zool ; 2023 Oct 03.
Article in English | MEDLINE | ID: mdl-37789567

ABSTRACT

Hipposideros pratti has low genetic diversity and is divided into two clades, a central-western clade and an eastern clade. We did not detect a clear east-to-west dispersal route along the longitudinal direction; instead, we found that the eastern clade spread outward from one population to another, while the central-western clade spread gradually. The glacial-interglacial cycles of the Quaternary influenced the migration and dispersal of H. pratti. The species did not experience a significant population increase in the past, and its average population trajectory was decreasing. Given the valuable ecosystem services provided by bats, preserving bat populations is particularly critical. Nonetheless, the majority of H. pratti's distributional sites were not recovered in this study. Based on our results, in situ conservation measures should be applied as soon as possible for effective protection.

7.
Front Oncol ; 12: 814667, 2022.
Article in English | MEDLINE | ID: mdl-35359400

ABSTRACT

Background: Recently, the Turing test has been used to investigate whether machines have intelligence similar to humans. Our study aimed to assess the ability of an artificial intelligence (AI) system for spine tumor detection using the Turing test. Methods: Our retrospective study data included 12,179 images from 321 patients for developing the AI detection system and 6,635 images from 187 patients for the Turing test. We utilized a deep learning-based tumor detection system with a Faster R-CNN architecture, which generates region proposals with a Region Proposal Network in the first stage and refines the position and size of the bounding box of the lesion area in the second stage. Each multiple-choice question featured four bounding boxes enclosing an identical tumor: three were detected by the proposed deep learning model, whereas the fourth was annotated by a doctor. The questions were shown to six doctors as respondents. If a respondent did not correctly identify the human-annotated box, the answer was counted as a misclassification. If all misclassification rates were >30%, the respondents were considered unable to distinguish the AI-detected tumors from the human-annotated ones, indicating that the AI system passed the Turing test. Results: The average misclassification rates in the Turing test were 51.2% (95% CI: 45.7%-57.5%) in the axial view (maximum 62%, minimum 44%) and 44.5% (95% CI: 38.2%-51.8%) in the sagittal view (maximum 59%, minimum 36%). The misclassification rates of all six respondents were >30%; therefore, our AI system passed the Turing test. Conclusion: Our proposed intelligent spine tumor detection system has a detection ability similar to that of annotating doctors and may be an efficient tool to assist radiologists or orthopedists in primary spine tumor detection.
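The pass criterion described in the abstract reduces to a simple computation over respondents' answers; a sketch, with hypothetical answer encodings:

```python
def misclassification_rate(choices, human_box):
    """Fraction of questions on which a respondent failed to pick the
    human-annotated bounding box."""
    return sum(c != human_box for c in choices) / len(choices)

def passes_turing_test(rates, threshold=0.30):
    """The abstract's criterion: every respondent's misclassification rate
    must exceed 30% for the AI system to pass."""
    return all(r > threshold for r in rates)
```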

8.
Article in English | MEDLINE | ID: mdl-32305915

ABSTRACT

Exploiting the spatial structure of scene images is a key research direction in scene recognition. Due to the large intra-class structural diversity, building and modeling a flexible structural layout that adapts to varied image characteristics is a challenge. Existing structural modeling methods in scene recognition either focus on predefined grids or rely on learned prototypes, both of which have limited representative ability. In this paper, we propose a Prototype-agnostic Scene Layout (PaSL) construction method that builds the spatial structure of each image without conforming to any prototype. PaSL can flexibly capture the diverse spatial characteristics of scene images and has considerable generalization capability. Given a PaSL, we build a Layout Graph Network (LGN) in which the regions of the PaSL are defined as nodes and two kinds of independent relations between regions are encoded as edges. The LGN incorporates two topological structures (formed along the spatial and semantic-similarity dimensions) into image representations through graph convolution. Extensive experiments show that our approach achieves state-of-the-art results on the widely used MIT67 and SUN397 datasets without multi-model or multi-scale fusion. Moreover, we conduct experiments on one of the largest-scale datasets, Places365. The results demonstrate that the proposed method generalizes well and obtains competitive performance.

9.
Article in English | MEDLINE | ID: mdl-31425031

ABSTRACT

Scene recognition is challenging due to intra-class diversity and inter-class similarity. Previous works recognize scenes either with global representations or with intermediate representations of objects. In contrast, we investigate more discriminative image representations of object-to-object relations for scene recognition, based on ⟨object, relation, object⟩ triplets obtained with detection techniques. In particular, two types of representations are proposed to describe objects and their relative relations in different forms: the co-occurring frequency of object-to-object relations (COOR) and the sequential representation of object-to-object relations (SOOR). COOR is an intermediate representation of the co-occurring frequency of objects and their relations, expressed as a third-order tensor that can be fed to a scene classifier without further embedding. SOOR takes a more explicit and freer form that sequentially describes image contents with local captions; a sequence encoding model (e.g., a recurrent neural network (RNN)) encodes SOOR into features fed to the classifiers. To better capture spatial information, the proposed COOR and SOOR are adapted to RGB-D data, with an RGB-D proposal fusion method proposed for RGB-D object detection. With the proposed COOR and SOOR approaches, we obtain state-of-the-art results for RGB-D scene recognition on the SUN RGB-D and NYUD2 datasets.
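The COOR representation as described, a third-order co-occurrence tensor over detected triplets, can be sketched as follows; the integer index vocabularies for objects and relations are assumptions:

```python
import numpy as np

def coor_tensor(triplets, num_objects, num_relations):
    """Build a third-order COOR-style tensor T[subject, relation, object],
    counting how often each detected triplet occurs in an image."""
    t = np.zeros((num_objects, num_relations, num_objects))
    for subj, rel, obj in triplets:
        t[subj, rel, obj] += 1.0
    return t
```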

10.
Article in English | MEDLINE | ID: mdl-30281448

ABSTRACT

Deep convolutional networks (CNNs) can achieve impressive results on RGB scene recognition thanks to large datasets such as Places. In contrast, RGB-D scene recognition remains underdeveloped, due to two limitations of RGB-D data that we address in this paper. The first is the lack of depth data for training deep learning models. Rather than fine-tuning or transferring RGB-specific features, we propose an architecture and a two-step training approach that directly learns effective depth-specific features using weak supervision via patches. The resulting RGB-D model also benefits from more complementary multimodal features. The second limitation is the short range of depth sensors (typically 0.5 m to 5.5 m), which means depth images fail to capture distant objects in scenes that RGB images can. We show that this limitation can be addressed with RGB-D videos, where more comprehensive depth information is accumulated as the camera travels across the scene. Focusing on this scenario, we introduce the ISIA RGB-D video dataset to evaluate RGB-D scene recognition with videos. Our video recognition architecture combines convolutional and recurrent neural networks (RNNs) trained in three steps with increasingly complex data to learn effective features (i.e., patches, frames, and sequences). Our approach obtains state-of-the-art performance on RGB-D image (NYUD2 and SUN RGB-D) and video (ISIA RGB-D) scene recognition.
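The depth-range limitation cited above (0.5-5.5 m) can be illustrated by masking out-of-range readings in a single depth frame; a minimal sketch, not part of the paper's method:

```python
import numpy as np

def mask_valid_depth(depth_m, near=0.5, far=5.5):
    """Mask depth readings outside the typical sensor range (0.5-5.5 m);
    out-of-range pixels become NaN. Accumulating depth across video frames,
    as the abstract proposes, fills in what single frames miss."""
    valid = (depth_m >= near) & (depth_m <= far)
    return np.where(valid, depth_m, np.nan), valid
```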

11.
IEEE Trans Image Process ; 26(6): 2721-2735, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28333637

ABSTRACT

Before the big data era, scene recognition was often approached with two-step inference using localized intermediate representations (objects, topics, and so on). One such approach is the semantic manifold (SM), in which patches and images are modeled as points on a semantic probability simplex. Patch models are learned using weak supervision via image labels, which leads to the problem of scene categories co-occurring in this semantic space. Fortunately, each category has its own co-occurrence patterns that are consistent across the images in that category, so discovering and modeling these patterns is critical to improving recognition performance in this representation. Since the emergence of large datasets such as ImageNet and Places, these approaches have been relegated in favor of the much more powerful convolutional neural networks (CNNs), which can automatically learn multi-layered representations from the data. In this paper, we address many limitations of the original SM approach and related works. We propose discriminative patch representations using neural networks and further propose a hybrid architecture in which the semantic manifold is built on top of multi-scale CNNs. Both representations can be computed significantly faster than the Gaussian mixture models of the original SM. To combine multiple scales, spatial relations, and multiple features, we formulate rich context models using Markov random fields. To solve the optimization problem, we analyze global and local approaches and find that a top-down hierarchical algorithm has the best performance. Experimental results show that jointly exploiting different types of contextual relations consistently improves recognition accuracy.
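Mapping raw patch scores onto the semantic probability simplex can be sketched with a numerically stable softmax; this illustrates the representation the SM approach builds on, not the paper's learned patch models:

```python
import math

def to_simplex(scores):
    """Softmax: map raw patch scores to a point on the semantic probability
    simplex (non-negative entries summing to 1). Subtracting the max score
    keeps the exponentials numerically stable."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```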
