Results 1 - 12 of 12
1.
IEEE Trans Image Process ; 32: 5138-5152, 2023.
Article in English | MEDLINE | ID: mdl-37676804

ABSTRACT

Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms. Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner. Distortion type identification and degradation level determination are employed as auxiliary tasks to train a deep learning model containing a deep Convolutional Neural Network (CNN) that extracts spatial features, as well as a recurrent unit that captures temporal information. The model is trained using a contrastive loss, and we therefore refer to this training framework and resulting model as CONtrastive VIdeo Quality EstimaTor (CONVIQT). During testing, the weights of the trained model are frozen, and a linear regressor maps the learned features to quality scores in a no-reference (NR) setting. We conduct comprehensive evaluations of the proposed model against leading algorithms on multiple VQA databases containing wide ranges of spatial and temporal distortions. We analyze the correlations between model predictions and ground-truth quality ratings, and show that CONVIQT achieves competitive performance when compared to state-of-the-art NR-VQA models, even though it is not trained on those databases. Our ablation experiments demonstrate that the learned representations are highly robust and generalize well across synthetic and realistic distortions. Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
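As a rough illustration of the contrastive training signal described above, the sketch below implements an NT-Xent-style loss over paired clip embeddings. The encoder is omitted and the embeddings are random placeholders; this is a generic contrastive loss, not the published CONVIQT objective.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    """Contrastive (NT-Xent) loss over a batch of paired embeddings.

    z1, z2: (N, D) projections of two views of the same N clips;
    matching rows are positives, all other rows are negatives.
    """
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D), unit norm
    sim = z @ z.t() / temperature                        # cosine similarities
    n = z1.shape[0]
    # Mask out self-similarities so a sample is never its own negative.
    sim.fill_diagonal_(float("-inf"))
    # The positive for row i is row i + n (and vice versa).
    targets = torch.arange(2 * n, device=z.device)
    targets = (targets + n) % (2 * n)
    return F.cross_entropy(sim, targets)

# Toy usage with random "embeddings" standing in for encoder outputs.
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(nt_xent_loss(z1, z2).item())
```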

2.
IEEE Trans Image Process ; 31: 4149-4161, 2022.
Article in English | MEDLINE | ID: mdl-35700254

ABSTRACT

We consider the problem of obtaining image quality representations in a self-supervised manner. We use prediction of distortion type and degree as an auxiliary task to learn features from an unlabeled image dataset containing a mixture of synthetic and realistic distortions. We then train a deep Convolutional Neural Network (CNN) using a contrastive pairwise objective to solve the auxiliary problem. We refer to the proposed training framework and resulting deep IQA model as the CONTRastive Image QUality Evaluator (CONTRIQUE). During evaluation, the CNN weights are frozen and a linear regressor maps the learned representations to quality scores in a No-Reference (NR) setting. We show through extensive experiments that CONTRIQUE achieves competitive performance when compared to state-of-the-art NR image quality models, even without any additional fine-tuning of the CNN backbone. The learned representations are highly robust and generalize well across images afflicted by either synthetic or authentic distortions. Our results suggest that powerful quality representations with perceptual relevance can be obtained without requiring large labeled subjective image quality datasets. The implementations used in this paper are available at https://github.com/pavancm/CONTRIQUE.
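The frozen-encoder evaluation protocol lends itself to a brief sketch. Random vectors stand in for features from a frozen, contrastively trained backbone, and a Ridge regressor stands in for the linear map, so treat this as the general recipe rather than the exact CONTRIQUE configuration (the authors' code is at the linked repository).

```python
import numpy as np
from sklearn.linear_model import Ridge
from scipy.stats import spearmanr

# Stand-ins: rows are features from a frozen, pretrained encoder;
# y are mean opinion scores (MOS) for the same images.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(800, 2048)), rng.uniform(0, 100, 800)
X_test, y_test = rng.normal(size=(200, 2048)), rng.uniform(0, 100, 200)

# The encoder stays frozen; only this linear map is fit on labeled data.
regressor = Ridge(alpha=1.0).fit(X_train, y_train)
pred = regressor.predict(X_test)

# NR quality models are usually reported via rank correlation with MOS.
srocc, _ = spearmanr(pred, y_test)
print(f"SROCC: {srocc:.3f}")
```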

3.
IEEE Trans Image Process ; 30: 7446-7457, 2021.
Article in English | MEDLINE | ID: mdl-34449359

ABSTRACT

We consider the problem of conducting frame rate dependent video quality assessment (VQA) on videos of diverse frame rates, including high frame rate (HFR) videos. More generally, we study how perceptual quality is affected by frame rate, and how frame rate and compression combine to affect perceived quality. We devise an objective VQA model called Space-Time GeneRalized Entropic Difference (GREED) which analyzes the statistics of spatial and temporal band-pass video coefficients. A generalized Gaussian distribution (GGD) is used to model band-pass responses, while entropy variations between reference and distorted videos under the GGD model are used to capture video quality variations arising from frame rate changes. The entropic differences are calculated across multiple temporal and spatial subbands, and merged using a learned regressor. We show through extensive experiments that GREED achieves state-of-the-art performance on the LIVE-YT-HFR Database when compared with existing VQA models. The features used in GREED are highly generalizable and obtain competitive performance even on standard, non-HFR VQA databases. The implementation of GREED has been made available online: https://github.com/pavancm/GREED.
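To make the entropic-difference idea concrete, the sketch below fits a generalized Gaussian to band-pass coefficients by standard moment matching and compares differential entropies. The moment-matching estimator and entropy formula are textbook GGD results; the multi-scale band-pass decomposition, temporal subbands, and learned regressor of GREED itself are omitted.

```python
import numpy as np
from scipy.special import gamma

def fit_ggd(x):
    """Moment-matching fit of a GGD f(x) ~ exp(-(|x|/alpha)^beta)."""
    betas = np.arange(0.2, 10.0, 0.001)
    # Theoretical ratio E[|x|]^2 / E[x^2] as a function of the shape beta.
    rho = gamma(2 / betas) ** 2 / (gamma(1 / betas) * gamma(3 / betas))
    r_hat = np.mean(np.abs(x)) ** 2 / np.mean(x ** 2)
    beta = betas[np.argmin(np.abs(rho - r_hat))]
    alpha = np.std(x) * np.sqrt(gamma(1 / beta) / gamma(3 / beta))
    return alpha, beta

def ggd_entropy(alpha, beta):
    """Differential entropy of the GGD, in nats."""
    return 1 / beta + np.log(2 * alpha * gamma(1 / beta) / beta)

# Stand-ins for band-pass coefficients of reference/distorted videos.
rng = np.random.default_rng(0)
ref = rng.laplace(scale=1.0, size=100_000)    # heavy-tailed, beta ~ 1
dist = rng.normal(scale=1.0, size=100_000)    # Gaussian, beta ~ 2

h_ref = ggd_entropy(*fit_ggd(ref))
h_dist = ggd_entropy(*fit_ggd(dist))
print(f"entropic difference: {abs(h_ref - h_dist):.4f} nats")
```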

4.
IEEE Trans Image Process ; 30: 7511-7526, 2021.
Article in English | MEDLINE | ID: mdl-34460374

ABSTRACT

Because of the increasing ease of video capture, many millions of consumers create and upload large volumes of User-Generated-Content (UGC) videos to social and streaming media sites over the Internet. UGC videos are commonly captured by naive users having limited skills and imperfect techniques, and tend to be afflicted by mixtures of highly diverse in-capture distortions. These UGC videos are then often uploaded for sharing onto cloud servers, where they are further compressed for storage and transmission. Our paper tackles the highly practical problem of predicting the quality of compressed videos (perhaps during the process of compression, to help guide it), with only (possibly severely) distorted UGC videos as references. To address this problem, we have developed a novel Video Quality Assessment (VQA) framework that we call 1stepVQA (to distinguish it from two-step methods that we discuss). 1stepVQA overcomes limitations of Full-Reference, Reduced-Reference and No-Reference VQA models by exploiting the statistical regularities of both natural videos and distorted videos. We also describe a new dedicated video database, which was created by applying a realistic VMAF-Guided perceptual rate distortion optimization (RDO) criterion to create realistically compressed versions of UGC source videos, which typically have pre-existing distortions. We show that 1stepVQA is able to more accurately predict the quality of compressed videos, given imperfect reference videos, and outperforms other VQA models in this scenario.
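The "statistical regularities" exploited by models in this family are commonly measured over mean-subtracted, contrast-normalized (MSCN) luminance coefficients from the natural scene statistics literature. The sketch below shows that normalization step only, as generic background; it is not the 1stepVQA feature set.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn(frame, sigma=7 / 6):
    """Mean-subtracted, contrast-normalized (MSCN) coefficients.

    NSS-based quality models characterize quality by how distortion
    perturbs the statistics of these normalized luminances."""
    mu = gaussian_filter(frame, sigma)
    var = gaussian_filter(frame ** 2, sigma) - mu ** 2
    return (frame - mu) / (np.sqrt(np.clip(var, 0, None)) + 1.0)

rng = np.random.default_rng(0)
frame = rng.uniform(0, 255, (128, 128))   # stand-in for a video frame
coeffs = mscn(frame)
# Pristine content yields near-Gaussian MSCN statistics; distortions
# such as blur, blocking, or noise shift the variance and tail behavior.
print(coeffs.mean(), coeffs.std())
```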

5.
IEEE Trans Image Process ; 30: 4449-4464, 2021.
Article in English | MEDLINE | ID: mdl-33856995

ABSTRACT

Recent years have witnessed an explosion of user-generated content (UGC) videos shared and streamed over the Internet, thanks to the evolution of affordable and reliable consumer capture devices and the tremendous popularity of social media platforms. Accordingly, there is a great need for accurate video quality assessment (VQA) models for UGC/consumer videos to monitor, control, and optimize this vast content. Blind quality prediction of in-the-wild videos is quite challenging, since the quality degradations of UGC videos are unpredictable, complicated, and often commingled. Here we contribute to advancing the UGC-VQA problem by conducting a comprehensive evaluation of leading no-reference/blind VQA (BVQA) features and models on a fixed evaluation architecture, yielding new empirical insights on both subjective video quality studies and objective VQA model design. By employing a feature selection strategy on top of efficient BVQA models, we select 60 of the 763 statistical features used in existing methods to create a new fusion-based model, the VIDeo quality EVALuator (VIDEVAL), which effectively balances the trade-off between VQA performance and efficiency. Our experimental results show that VIDEVAL achieves state-of-the-art performance at considerably lower computational cost than other leading models. Our study protocol also defines a reliable benchmark for the UGC-VQA problem, which we believe will facilitate further research on deep learning-based VQA modeling, as well as perceptually optimized, efficient UGC video processing, transcoding, and streaming. To promote reproducible research and public evaluation, an implementation of VIDEVAL has been made available online: https://github.com/vztu/VIDEVAL.
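The select-then-fuse recipe can be sketched as follows, with random features standing in for the 763 candidate BVQA statistics. The importance-based selector and SVR regressor are illustrative choices under stated assumptions, not necessarily VIDEVAL's exact components.

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR

# Stand-ins: 763 candidate BVQA statistics per video and MOS labels.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(500, 763)), rng.uniform(0, 100, 500)

# Rank features by importance and keep the top 60, then fit a compact
# regressor on the reduced set; the same select-then-fuse pattern as
# VIDEVAL, though its exact selector and regressor may differ.
selector = SelectFromModel(
    RandomForestRegressor(n_estimators=100, random_state=0),
    max_features=60, threshold=-np.inf,
).fit(X, y)
X_small = selector.transform(X)          # (500, 60)
model = SVR(C=1.0).fit(X_small, y)
print(X_small.shape, model.predict(X_small[:3]))
```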

6.
Med Image Comput Comput Assist Interv ; 17(Pt 1): 372-80, 2014.
Article in English | MEDLINE | ID: mdl-25333140

ABSTRACT

Patient-specific orthopedic knee surgery planning requires precisely segmenting multiple knee bones, namely the femur, tibia, fibula, and patella, from 3D CT images of knee joints with severe pathologies. In this work, we propose a fully automated, highly precise, and computationally efficient segmentation approach for multiple bones. First, each bone is initially segmented using a model-based marginal space learning framework for pose estimation, followed by non-rigid boundary deformation. To recover shape details, we then refine the bone segmentation using a graph cut that incorporates shape priors derived from the initial segmentation. Finally, we remove overlap between neighboring bones using multi-layer graph partition. In experiments, we achieve simultaneous segmentation of the femur, tibia, patella, and fibula with an overall accuracy of less than 1 mm surface-to-surface error, in under 90 s, on hundreds of 3D CT scans with pathological knee joints.
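A toy rendering of the marginal space learning idea: rather than scoring the full 9-D pose space (position x orientation x scale) exhaustively, candidates are pruned one subspace at a time. The quadratic scoring functions below stand in for trained discriminative classifiers, and the non-rigid deformation and graph-cut refinement stages are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for stage classifiers; a real system learns these from data.
def score_position(p):        return -np.sum((p - 50) ** 2)
def score_orientation(p, r):  return score_position(p) - abs(r - 0.3)
def score_full(p, r, s):      return score_orientation(p, r) - abs(s - 1.1)

# Stage 1: scan positions only, keep the 100 best candidates.
positions = rng.uniform(0, 100, size=(5000, 3))
top_p = positions[np.argsort([score_position(p) for p in positions])[-100:]]

# Stage 2: augment survivors with orientation hypotheses, prune again.
rotations = rng.uniform(-np.pi, np.pi, 50)
pr = [(p, r) for p in top_p for r in rotations]
top_pr = sorted(pr, key=lambda c: score_orientation(*c))[-100:]

# Stage 3: add scale and pick the best full pose.
scales = rng.uniform(0.5, 2.0, 20)
prs = [(p, r, s) for (p, r) in top_pr for s in scales]
best = max(prs, key=lambda c: score_full(*c))
print("pose estimate:", best)
```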


Subject(s)
Bone and Bones/diagnostic imaging , Information Storage and Retrieval/methods , Knee Joint/diagnostic imaging , Pattern Recognition, Automated/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Surgery, Computer-Assisted/methods , Tomography, X-Ray Computed/methods , Algorithms , Artificial Intelligence , Bone and Bones/surgery , Humans , Imaging, Three-Dimensional/methods , Knee Joint/surgery , Preoperative Care/methods , Radiographic Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity
7.
Med Image Comput Comput Assist Interv ; 17(Pt 1): 804-11, 2014.
Article in English | MEDLINE | ID: mdl-25333193

ABSTRACT

The diversity in appearance of diseased lung tissue makes automatic segmentation of lungs from CT with severe pathologies challenging. To overcome this challenge, we rely on contextual constraints from neighboring anatomies to detect and segment lung tissue across a variety of pathologies. We propose an algorithm that combines statistical learning with these anatomical constraints to seek a segmentation of the lung consistent with adjacent structures, such as the heart, liver, spleen, and ribs. We demonstrate that our algorithm reduces the number of failed detections and increases the accuracy of the segmentation on unseen test cases with severe pathologies.
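A minimal sketch of fusing appearance evidence with an anatomical constraint: candidate lung locations are rescored by a Gaussian prior anchored at a detected neighboring structure. The detector scores, offsets, and weights are toy stand-ins, not the paper's learned models.

```python
import numpy as np

rng = np.random.default_rng(1)
candidates = rng.uniform(0, 200, size=(1000, 3))   # candidate lung centers
appearance = rng.uniform(size=1000)                # detector confidences

heart = np.array([100.0, 90.0, 80.0])              # detected neighbor
expected_offset = np.array([60.0, 0.0, 0.0])       # lung relative to heart
sigma = 15.0                                       # tolerance of the prior

# Gaussian spatial prior from the neighboring anatomy: candidates far
# from the anatomically expected location are downweighted.
d = np.linalg.norm(candidates - (heart + expected_offset), axis=1)
context = np.exp(-0.5 * (d / sigma) ** 2)

posterior = appearance * context                   # fuse both sources
print("best candidate:", candidates[np.argmax(posterior)])
```

Even when diseased tissue makes the appearance term unreliable, the context term keeps implausible detections from winning, which is the failure-reduction effect the abstract reports.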


Subject(s)
Anatomic Landmarks/diagnostic imaging , Imaging, Three-Dimensional/methods , Lung Diseases/diagnostic imaging , Lung/diagnostic imaging , Pattern Recognition, Automated/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Tomography, X-Ray Computed/methods , Algorithms , Humans , Radiographic Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity
8.
Med Image Comput Comput Assist Interv ; 16(Pt 3): 235-42, 2013.
Article in English | MEDLINE | ID: mdl-24505766

ABSTRACT

Automatic segmentation techniques, despite demonstrating excellent overall accuracy, can often produce inaccuracies in local regions. As a result, correcting segmentations remains an important task that is often laborious, especially when done manually for 3D datasets. This work presents a powerful tool called Intelligent Learning-Based Editor of Segmentations (IntellEditS) that minimizes user effort and further improves segmentation accuracy. The tool partners interactive learning with an energy-minimization approach to editing. Based on interactive user input, a discriminative classifier is trained and applied to the edited 3D region to produce soft voxel labeling. The labels are integrated into a novel energy functional along with the existing segmentation and image data. Unlike the state of the art, IntellEditS is designed to correct segmentation results represented not only as masks but also as meshes. In addition, IntellEditS accepts intuitive boundary-based user interactions. The versatility and performance of IntellEditS are demonstrated on both MRI and CT datasets consisting of varied anatomical structures and resolutions.
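One plausible shape for such an editing energy, sketched on a 2-D toy problem: unary terms pull pixels toward the interactive classifier's soft labels and the existing segmentation, a pairwise term enforces smoothness, and iterated conditional modes (ICM) performs the minimization. The weights and the ICM solver are illustrative assumptions; the published functional and optimizer may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 32
prob = rng.uniform(size=(H, W))                       # classifier P(foreground)
prior = (rng.uniform(size=(H, W)) > 0.5).astype(int)  # existing segmentation
w_cls, w_prior, w_smooth = 1.0, 0.5, 0.8              # illustrative weights

labels = prior.copy()
for _ in range(10):                                   # ICM sweeps
    for i in range(H):
        for j in range(W):
            costs = []
            for lab in (0, 1):
                # Unary: agreement with soft labels and prior segmentation.
                u = w_cls * abs(lab - prob[i, j]) + w_prior * (lab != prior[i, j])
                # Pairwise: penalize disagreement with 4-neighbors.
                nbrs = [labels[x, y] for x, y in
                        ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                        if 0 <= x < H and 0 <= y < W]
                u += w_smooth * sum(lab != n for n in nbrs)
                costs.append(u)
            labels[i, j] = int(np.argmin(costs))
print("foreground pixels after editing:", labels.sum())
```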


Subject(s)
Algorithms , Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Pattern Recognition, Automated/methods , Software , Tomography, X-Ray Computed/methods , Documentation/methods , Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity
9.
Inf Process Med Imaging ; 23: 450-62, 2013.
Article in English | MEDLINE | ID: mdl-24683990

ABSTRACT

We propose a novel framework for rapid and accurate segmentation of a cohort of organs. First, it integrates local and global image context through a product rule to simultaneously detect multiple landmarks on the target organs. The global posterior integrates evidence over all volume patches, while the local image context is modeled with a local discriminative classifier. Through non-parametric modeling of the global posterior, it exploits sparsity in the global context for efficient detection. The complete surface of the target organs is then inferred by robust alignment of a shape model to the resulting landmarks and finally deformed using discriminative boundary detectors. Using our approach, we demonstrate efficient detection and accurate segmentation of liver, kidneys, heart, and lungs in challenging low-resolution MR data in less than one second, and of prostate, bladder, rectum, and femoral heads in CT scans, in roughly one to three seconds and in both cases with accuracy fairly close to inter-user variability.
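The product rule itself is simple to sketch: a dense local classifier response map is multiplied by a sparse global posterior, and the sparsity of the global term is what makes detection cheap. Both maps below are synthetic stand-ins for the paper's learned models.

```python
import numpy as np

rng = np.random.default_rng(0)
shape = (64, 64, 64)

local = rng.uniform(size=shape)          # local classifier response map
global_post = np.zeros(shape)            # sparse global posterior
global_post[28:36, 28:36, 28:36] = 1.0   # mass only at plausible locations

posterior = local * global_post          # product rule: combine evidence
idx = np.unravel_index(np.argmax(posterior), shape)
print("detected landmark voxel:", idx)
# Because the global term is zero almost everywhere, only voxels inside
# the plausible region ever need their local classifier evaluated,
# which is the efficiency gain the abstract describes.
```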


Subject(s)
Artificial Intelligence , Imaging, Three-Dimensional/methods , Magnetic Resonance Imaging/methods , Pattern Recognition, Automated/methods , Tomography, X-Ray Computed/methods , Viscera/anatomy & histology , Viscera/diagnostic imaging , Algorithms , Computer Simulation , Discriminant Analysis , Humans , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Models, Biological , Models, Statistical , Reproducibility of Results , Sensitivity and Specificity , Systems Integration
10.
Article in English | MEDLINE | ID: mdl-23286081

ABSTRACT

In this paper, we present a novel method that incorporates information theory into a learning-based approach for automatic and accurate pelvic organ segmentation (including the prostate, bladder, and rectum). We target 3D CT volumes generated using different scanning protocols (e.g., contrast and non-contrast, with and without an implant in the prostate, various resolutions and positions) and coming from largely diverse sources (e.g., patients with diseases in different organs). Three key ingredients are combined to solve this challenging segmentation problem. First, marginal space learning (MSL) is applied to efficiently and effectively localize the multiple organs in the largely diverse CT volumes. Second, learned boundary detectors based on steerable features are applied for robust boundary detection, which enables the handling of highly heterogeneous texture patterns. Third, a novel information-theoretic scheme is incorporated into the boundary inference process: the Jensen-Shannon divergence further drives the mesh to the best fit of the image, thus improving segmentation performance. The proposed approach is tested on a challenging dataset containing 188 volumes from diverse sources. Our approach not only produces excellent segmentation accuracy, but also runs about eighty times faster than previous state-of-the-art solutions. The proposed method can be applied to CT images to provide visual guidance to physicians during computer-aided diagnosis, treatment planning, and image-guided radiotherapy for cancers in the pelvic region.
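The Jensen-Shannon term can be illustrated directly: a candidate boundary position is scored by the divergence between the intensity distributions on its two sides, so a mesh that cleanly separates two tissue types scores high. The histograms below are synthetic; only the divergence computation is shown, not the mesh inference.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

# Synthetic stand-ins for intensities sampled inside and outside a
# candidate organ boundary.
rng = np.random.default_rng(0)
inside = rng.normal(40, 8, 5000)      # organ-like intensities
outside = rng.normal(80, 12, 5000)    # surrounding tissue

bins = np.linspace(0, 150, 64)
p, _ = np.histogram(inside, bins=bins, density=True)
q, _ = np.histogram(outside, bins=bins, density=True)

# scipy returns the JS *distance* (square root of the divergence),
# so square it to recover the divergence itself.
jsd = jensenshannon(p, q) ** 2
print(f"JS divergence: {jsd:.3f}")
```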


Subject(s)
Algorithms , Artificial Intelligence , Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Tomography, X-Ray Computed/methods , Viscera/diagnostic imaging , Humans , Radiographic Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity
11.
Med Image Comput Comput Assist Interv ; 14(Pt 3): 338-45, 2011.
Article in English | MEDLINE | ID: mdl-22003717

ABSTRACT

We present a novel generic segmentation system for fully automatic multi-organ segmentation from CT medical images. We thereby combine the advantages of learning-based approaches operating on point-cloud shape representations, such as speed, robustness, and point correspondences, with those of PDE-optimization-based level set approaches, such as high accuracy and the straightforward prevention of segment overlaps. In a benchmark on 10-100 annotated datasets for the liver, the lungs, and the kidneys, we show that the proposed system yields segmentation accuracies of 1.17-2.89 mm average surface error. The level set segmentation (which is initialized by the learning-based segmentations) thereby contributes a 20%-40% increase in accuracy.
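A compact sketch of the initialize-then-refine pattern, using morphological Chan-Vese from scikit-image as the level-set-style stage. The synthetic blob and deliberately rough initial mask stand in for the CT data and the learning-based segmentation; the multi-organ overlap handling is not shown.

```python
import numpy as np
from skimage.segmentation import morphological_chan_vese

# Synthetic 2-D "scan": a bright disk plus noise.
yy, xx = np.mgrid[0:128, 0:128]
image = ((yy - 64) ** 2 + (xx - 64) ** 2 < 30 ** 2).astype(float)
image += np.random.default_rng(0).normal(0, 0.2, image.shape)

# Imperfect "learned" mask: offset and too small, like a fast detector.
init = (yy - 60) ** 2 + (xx - 58) ** 2 < 24 ** 2

# Morphological Chan-Vese: a PDE-free approximation of level-set
# evolution. The initialization supplies speed and robustness; the
# evolution supplies the final boundary accuracy.
refined = morphological_chan_vese(image, 50, init_level_set=init)
print("refined area:", refined.sum(), "initial area:", init.sum())
```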


Subject(s)
Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Tomography, X-Ray Computed/methods , Algorithms , Artificial Intelligence , Databases, Factual , Humans , Kidney/pathology , Learning , Liver/pathology , Lung/pathology , Models, Anatomic , Models, Statistical , Principal Component Analysis , Reproducibility of Results , Software
12.
Med Image Comput Comput Assist Interv ; 14(Pt 3): 667-74, 2011.
Article in English | MEDLINE | ID: mdl-22003757

ABSTRACT

Simple algorithms for segmenting healthy lung parenchyma in CT are unable to deal with the high-density tissue common in pulmonary diseases. To overcome this problem, we propose a multi-stage learning-based approach that combines anatomical information to predict an initialization of a statistical shape model of the lungs. The initialization first detects the carina of the trachea, and uses it to detect a set of automatically selected stable landmarks on regions near the lung (e.g., ribs, spine). These landmarks are used to align the shape model, which is then refined through boundary detection to obtain a fine-grained segmentation. Robustness is obtained through the hierarchical use of discriminative classifiers trained on a range of manually annotated data of diseased and healthy lungs. We demonstrate fast detection (35 s per volume on average) and segmentation with 2 mm accuracy on challenging data.
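Aligning a shape model to detected landmarks is classically done with a least-squares similarity transform (Umeyama's method), sketched below on synthetic points. The landmark detectors (carina, ribs, spine) and the boundary refinement are not modeled here.

```python
import numpy as np

def similarity_align(src, dst):
    """Least-squares similarity transform (Umeyama) mapping src -> dst,
    e.g., placing a mean shape model at detected stable landmarks."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    xs, yd = src - mu_s, dst - mu_d
    cov = yd.T @ xs / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[-1, -1] = -1                     # avoid a reflection
    R = U @ S @ Vt
    c = np.trace(np.diag(D) @ S) / (xs ** 2).sum() * len(src)
    t = mu_d - c * R @ mu_s
    return c, R, t

# Toy check: recover a known pose of a "mean lung shape" point set.
rng = np.random.default_rng(0)
mean_shape = rng.normal(size=(12, 3))      # model landmarks
R_true, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(R_true) < 0:              # make it a proper rotation
    R_true[:, 0] *= -1
landmarks = 1.3 * mean_shape @ R_true.T + np.array([5.0, -2.0, 8.0])

c, R, t = similarity_align(mean_shape, landmarks)
aligned = c * mean_shape @ R.T + t
print("max alignment error:", np.abs(aligned - landmarks).max())
```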


Subject(s)
Cone-Beam Computed Tomography/methods , Lung Neoplasms/diagnostic imaging , Lung Neoplasms/diagnosis , Lung/diagnostic imaging , Algorithms , Diagnostic Imaging/methods , Humans , Learning , Lung/pathology , Lung Neoplasms/pathology , Models, Statistical , Pattern Recognition, Automated/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Software