1.
Orthod Craniofac Res ; 2024 Jun 02.
Article in English | MEDLINE | ID: mdl-38825845

ABSTRACT

OBJECTIVE: In many medical disciplines, facial attractiveness is part of the diagnosis, yet its scoring might be confounded by facial expressions. The intent was to apply deep convolutional neural networks (CNN) to identify how facial expressions affect facial attractiveness and to explore whether dedicated training of the CNN can reduce the bias of facial expressions. MATERIALS AND METHODS: Frontal facial images (n = 840) of 40 female participants (mean age: 24.5 years) were taken while adopting a neutral facial expression and the six universal facial expressions. Facial attractiveness was computed by means of a face detector, deep convolutional neural networks, standard support vector regression for facial beauty, visual regularized collaborative filtering and a regression technique for handling visual queries without rating history. The CNN was first trained on random facial photographs from a dating website and then further trained on the Chicago Face Database (CFD) to increase its suitability for medical conditions. Both algorithms scored every image for attractiveness. RESULTS: Facial expressions affect facial attractiveness scores significantly. Scores from the CNN additionally trained on the CFD showed less variability between the expressions (range: 54.3-60.9 compared to 32.6-49.5) and less variance within the scores (P ≤ .05), but the additional training also caused a shift in the ranking of the expressions' facial attractiveness. CONCLUSION: Facial expressions confound attractiveness scores. Training on norming images generated scores less susceptible to distortion, but more difficult to interpret. Scoring facial attractiveness with CNNs seems promising, but AI solutions must be built on CNNs trained to recognize facial expressions as distractors.
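The variability comparison reported above can be illustrated with a minimal sketch; the per-expression scores below are hypothetical values chosen only to mirror the reported ranges, not the study's data:

```python
from statistics import pvariance

def variability(scores):
    """Return (range, population variance) of attractiveness scores
    across the seven facial expressions for one model."""
    return max(scores) - min(scores), pvariance(scores)

# Hypothetical per-expression mean scores for the two models, chosen
# only to mirror the ranges reported in the abstract.
baseline_cnn = [32.6, 40.1, 44.7, 35.2, 49.5, 38.8, 42.0]
cfd_tuned_cnn = [54.3, 57.8, 60.9, 55.6, 59.2, 56.4, 58.1]

r1, v1 = variability(baseline_cnn)
r2, v2 = variability(cfd_tuned_cnn)
assert r2 < r1 and v2 < v1  # CFD fine-tuning narrows expression-induced spread
```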

2.
IEEE Trans Image Process ; 33: 2171-2182, 2024.
Article in English | MEDLINE | ID: mdl-38451763

ABSTRACT

Video restoration aims to restore high-quality frames from low-quality frames. Unlike single-image restoration, video restoration generally requires utilizing temporal information from multiple adjacent but usually misaligned video frames. Existing deep methods generally tackle this with a sliding-window strategy or a recurrent architecture, both of which are restricted to frame-by-frame restoration. In this paper, we propose a Video Restoration Transformer (VRT) with parallel frame prediction ability. More specifically, VRT is composed of multiple scales, each of which consists of two kinds of modules: temporal reciprocal self attention (TRSA) and parallel warping. TRSA divides the video into small clips, on which reciprocal attention is applied for joint motion estimation, feature alignment and feature fusion, while self attention is used for feature extraction. To enable cross-clip interactions, the video sequence is shifted for every other layer. In addition, parallel warping is used to further fuse information from neighboring frames by parallel feature warping. Experimental results on five tasks, including video super-resolution, video deblurring, video denoising, video frame interpolation and space-time video super-resolution, demonstrate that VRT outperforms the state-of-the-art methods by large margins (up to 2.16 dB) on fourteen benchmark datasets. The code is available at https://github.com/JingyunLiang/VRT.
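The alternating shift that enables cross-clip interactions can be sketched on a toy frame sequence; clip size and shift are illustrative, and the real TRSA operates on feature maps rather than frame indices:

```python
def partition_clips(frames, clip_size, shift):
    """Split a frame sequence into clips of `clip_size`, optionally
    shifting the sequence cyclically by `shift` frames first — a toy
    analogue of VRT's alternating-layer shift for cross-clip mixing."""
    shifted = frames[shift:] + frames[:shift]
    return [shifted[i:i + clip_size] for i in range(0, len(shifted), clip_size)]

frames = list(range(8))
even_layer = partition_clips(frames, 2, 0)  # clips [0,1],[2,3],[4,5],[6,7]
odd_layer = partition_clips(frames, 2, 1)   # clips straddle the previous boundaries
```

Alternating the two partitions layer by layer lets information propagate across clip boundaries without attending over the whole sequence at once.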

3.
IEEE Trans Pattern Anal Mach Intell ; 45(8): 10247-10266, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37027599

ABSTRACT

Establishing robust and accurate correspondences between a pair of images is a long-standing computer vision problem with numerous applications. While classically dominated by sparse methods, emerging dense approaches offer a compelling alternative paradigm that avoids the keypoint detection step. However, dense flow estimation is often inaccurate in the case of large displacements, occlusions, or homogeneous regions. In order to apply dense methods to real-world applications, such as pose estimation, image manipulation, or 3D reconstruction, it is therefore crucial to estimate the confidence of the predicted matches. We propose the Enhanced Probabilistic Dense Correspondence Network, PDC-Net+, capable of estimating accurate dense correspondences along with a reliable confidence map. We develop a flexible probabilistic approach that jointly learns the flow prediction and its uncertainty. In particular, we parametrize the predictive distribution as a constrained mixture model, ensuring better modelling of both accurate flow predictions and outliers. Moreover, we develop an architecture and an enhanced training strategy tailored for robust and generalizable uncertainty prediction in the context of self-supervised training. Our approach obtains state-of-the-art results on multiple challenging geometric matching and optical flow datasets. We further validate the usefulness of our probabilistic confidence estimation for the tasks of pose estimation, 3D reconstruction, image-based localization, and image retrieval.
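One common way to turn a predictive mixture into a scalar confidence is the probability that the flow error falls within a fixed radius; the sketch below does this for an isotropic 2-D Gaussian mixture and is a simplification in the spirit of the paper's confidence measure, not a reproduction of it:

```python
import math

def confidence(alphas, sigmas, radius=1.0):
    """P(flow error within `radius`) under an isotropic 2-D Gaussian
    mixture with weights `alphas` and std-devs `sigmas`. For one
    isotropic component, the mass inside the radius is the Rayleigh
    CDF 1 - exp(-R^2 / (2 sigma^2))."""
    return sum(a * (1.0 - math.exp(-radius**2 / (2.0 * s**2)))
               for a, s in zip(alphas, sigmas))

# A sharp inlier component dominating the mixture -> high confidence
print(confidence([0.9, 0.1], [0.5, 16.0]))
# The broad outlier component dominating -> low confidence
print(confidence([0.2, 0.8], [0.5, 16.0]))
```

Thresholding such a map is what allows downstream tasks like pose estimation to discard unreliable matches.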


Subject(s)
Algorithms; Pattern Recognition, Automated; Pattern Recognition, Automated/methods
4.
IEEE Trans Neural Netw Learn Syst ; 34(3): 1132-1145, 2023 Mar.
Article in English | MEDLINE | ID: mdl-34428157

ABSTRACT

The entropy of the codes usually serves as the rate loss in recent learned lossy image compression methods. Precise estimation of the probability distribution of the codes plays a vital role in reducing the entropy and boosting the joint rate-distortion performance. However, existing deep-learning-based entropy models generally assume the latent codes are statistically independent or depend on some side information or local context, which fails to take the global similarity within the context into account and thus hinders accurate entropy estimation. To address this issue, we propose a special nonlocal operation for context modeling that employs the global similarity within the context. Specifically, due to the causality constraint of the context, the nonlocal operation cannot be computed directly in context modeling. We exploit the relationship between the code maps produced by deep neural networks and introduce proxy similarity functions as a workaround. Then, we combine the local and the global context via a nonlocal attention block and employ it in masked convolutional networks for entropy modeling. Considering that the width of the transforms is essential in training low-distortion models, we finally introduce a U-net block in the transforms to increase the width with manageable memory consumption and time complexity. Experiments on the Kodak and Tecnick datasets demonstrate the superiority of the proposed context-based nonlocal attention block in entropy modeling and of the U-net block in low-distortion situations. On the whole, our model performs favorably against the existing image compression standards and recent deep image compression models.
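The rate term such entropy models minimize is the code length under the learned distribution; a minimal sketch with a toy, fully factorized pmf (the paper's contribution is precisely to make this pmf context-dependent rather than fixed):

```python
import math

def rate_bits(codes, pmf):
    """Estimated rate in bits of quantized codes under a probability
    model: the cross-entropy term -sum log2 p(code) that learned
    compression methods minimize as the rate loss."""
    return -sum(math.log2(pmf[c]) for c in codes)

pmf = {0: 0.7, 1: 0.2, 2: 0.1}  # toy, fully factorized entropy model
codes = [0, 0, 1, 0, 2, 0, 1, 0]
print(f"{rate_bits(codes, pmf):.2f} bits")
```

A context model that predicts a sharper pmf for each symbol given its neighbors lowers this sum, which is why better context modeling directly improves rate-distortion performance.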

5.
Clin Oral Investig ; 26(12): 6871-6879, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36153437

ABSTRACT

OBJECTIVES: This review aims to share the current developments of artificial intelligence (AI) solutions in the field of medico-dental diagnostics of the face. The primary focus of this review is to present the applicability of artificial neural networks (ANN) for interpreting medical images, together with the associated opportunities, obstacles, and ethico-legal concerns. MATERIAL AND METHODS: Narrative literature review. RESULTS: Curated facial images are widely available and easily accessible and are as such particularly suitable big data for ANN training. New AI solutions have the potential to change contemporary dentistry by optimizing existing processes and enriching dental care with the introduction of new tools for assessment or treatment planning. The analyses of health-related big data may also contribute to revolutionizing personalized medicine through the detection of previously unknown associations. With regard to facial images, advances in medico-dental AI-based diagnostics include software solutions for the detection and classification of pathologies, for rating attractiveness, and for the prediction of age or gender. For an ANN to be suitable for medical diagnostics of the face, the arising challenges regarding computation and management of the software are discussed, with special emphasis on the use of non-medical big data for ANN training. The legal and ethical ramifications of feeding patients' facial images to a neural network for diagnostic purposes relate to patient consent, data privacy, data security, liability, and intellectual property. Current ethico-legal regulation practices seem incapable of addressing all concerns and ensuring accountability.
CLINICAL SIGNIFICANCE: While this review confirms the many benefits derived from AI solutions used for the diagnosis of medical images, it highlights the evident lack of regulatory oversight, the urgent need to establish licensing protocols, and the imperative to investigate the moral quality of new norms set with the implementation of AI applications in medico-dental diagnostics.


Subject(s)
Artificial Intelligence; Humans
6.
Eur J Orthod ; 44(4): 445-451, 2022 08 16.
Article in English | MEDLINE | ID: mdl-35532375

ABSTRACT

BACKGROUND: Facial aesthetics is a major motivating factor for undergoing orthodontic treatment. OBJECTIVES: To ascertain, by means of artificial intelligence (AI), the influence of dental alignment on facial attractiveness and perceived age, compared with other modifications such as wearing glasses, earrings, or lipstick. MATERIAL AND METHODS: Forty volunteering females (mean age: 24.5 years) with near perfectly aligned upper front teeth [Aesthetic Component scale of the Index of Orthodontic Treatment Need (AC-IOTN) = 1 and Peer Assessment Rating Index (PAR Index) = 0 or 1] were photographed in a standardized pose while smiling, in the following settings (number of photographs = 960): without modifications, and wearing eyeglasses, earrings, or lipstick. These pictures were taken with the natural aligned dentition and with an individually manufactured crooked-teeth mock-up (AC-IOTN = 8) to create the illusion of misaligned teeth. Images were assessed for attractiveness and perceived age using AI, consisting of a face detector and deep convolutional neural networks trained on dedicated datasets for attractiveness and age prediction. Each image received an attractiveness score from 0 to 100 and one value for the age prediction. The scores were reviewed descriptively for each setting, and whether the facial modifications affected the attractiveness score was tested statistically. The relationship between predicted age and attractiveness scores was examined with linear regression models. RESULTS: All modifications showed a significant effect (for all: P < 0.001) on facial attractiveness. In faces with misaligned teeth, wearing eyeglasses (-17.8%) and earrings (-3.2%) had an adverse effect on facial aesthetics. Tooth alignment (+6.9%) and wearing lipstick (+7.9%) increased attractiveness. There was no relevant effect of any assessed modification or of tooth alignment on perceived age (all: <1.5 years).
Mean attractiveness score declined with predicted age, except when wearing glasses, in which case attractiveness was rated higher with increasing predicted age. CONCLUSIONS: Alignment of teeth improves facial attractiveness to a similar extent as wearing lipstick, but has no discernible effect on perceived age. Wearing glasses reduces attractiveness considerably, but this effect vanishes with age.
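The age-attractiveness relationship was examined with linear regression; a self-contained ordinary-least-squares sketch on hypothetical data illustrating a declining trend:

```python
def ols(x, y):
    """Ordinary-least-squares slope and intercept, the model family
    used to relate predicted age to attractiveness score."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

ages = [20, 22, 24, 26, 28]
scores = [62, 60, 59, 57, 55]  # hypothetical scores, declining with age
slope, intercept = ols(ages, scores)
assert slope < 0  # matches the reported overall trend
```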


Subject(s)
Artificial Intelligence; Malocclusion; Adult; Esthetics, Dental; Face; Female; Humans; Index of Orthodontic Treatment Need; Infant; Malocclusion/therapy; Smiling; Young Adult
7.
IEEE Trans Pattern Anal Mach Intell ; 44(10): 6360-6376, 2022 Oct.
Article in English | MEDLINE | ID: mdl-34125670

ABSTRACT

Recent works on plug-and-play image restoration have shown that a denoiser can implicitly serve as the image prior for model-based methods to solve many inverse problems. Such a property induces considerable advantages for plug-and-play image restoration (e.g., integrating the flexibility of model-based methods with the effectiveness of learning-based methods) when the denoiser is discriminatively learned via a deep convolutional neural network (CNN) with large modeling capacity. However, while deeper and larger CNN models are rapidly gaining popularity, the performance of existing plug-and-play image restoration is hindered by the lack of a suitable denoiser prior. In order to push the limits of plug-and-play image restoration, we set up a benchmark deep denoiser prior by training a highly flexible and effective CNN denoiser. We then plug the deep denoiser prior as a modular part into a half-quadratic-splitting-based iterative algorithm to solve various image restoration problems. Meanwhile, we provide a thorough analysis of parameter setting, intermediate results and empirical convergence to better understand the working mechanism. Experimental results on three representative image restoration tasks, including deblurring, super-resolution and demosaicing, demonstrate that the proposed plug-and-play image restoration with deep denoiser prior not only significantly outperforms other state-of-the-art model-based methods but also achieves competitive or even superior performance against state-of-the-art learning-based methods. The source code is available at https://github.com/cszn/DPIR.
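The half-quadratic-splitting scheme alternates a closed-form data step with a plug-in denoiser acting as the prior; a 1-D toy sketch with a box-filter denoiser standing in for the deep CNN denoiser (step weight, iteration count, and the denoiser itself are illustrative):

```python
def hqs_restore(y, denoise, mu=0.5, iters=8):
    """Half-quadratic-splitting sketch for denoising y = x + noise:
    alternate a closed-form data-fidelity step with a plug-in
    denoiser serving as the image prior (1-D toy)."""
    x = list(y)
    for _ in range(iters):
        z = [(yi + mu * xi) / (1 + mu) for yi, xi in zip(y, x)]  # data term
        x = denoise(z)                                           # prior step
    return x

def box_denoiser(v):
    """3-tap moving average: a crude stand-in for a learned denoiser."""
    return [(v[max(i - 1, 0)] + v[i] + v[min(i + 1, len(v) - 1)]) / 3
            for i in range(len(v))]

noisy = [0.0, 1.2, -0.3, 1.1, 0.1, 0.9]
print(hqs_restore(noisy, box_denoiser))
```

The modularity is the point: swapping `box_denoiser` for a stronger denoiser changes the prior without touching the data step, which is exactly what makes the plug-and-play framing attractive.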

8.
Front Plant Sci ; 12: 774068, 2021.
Article in English | MEDLINE | ID: mdl-35058948

ABSTRACT

Robust and automated segmentation of leaves and other backgrounds is a core prerequisite of most approaches in high-throughput field phenotyping. So far, the possibilities of deep learning approaches for this purpose have not been explored adequately, partly due to a lack of publicly available, appropriate datasets. This study presents a workflow based on DeepLab v3+ and on a diverse annotated dataset of 190 RGB images (350 × 350 pixels). Images of winter wheat plants of 76 different genotypes and developmental stages were acquired over multiple years at high resolution in outdoor conditions using a nadir view, encompassing a wide range of imaging conditions. Inconsistencies of human annotators in complex images were quantified, and metadata on camera settings was included. The proposed approach achieves an intersection over union (IoU) of 0.77 and 0.90 for plants and soil, respectively. This outperforms the benchmarked machine learning methods, which use a Support Vector Classifier and/or Random Forest. The results show that a small but carefully chosen and annotated set of images can provide a good basis for a powerful segmentation pipeline. Compared to earlier methods based on machine learning, the proposed method achieves better performance on the selected dataset in spite of using a deep learning approach with limited data. Increasing the amount of publicly available data with high human agreement on annotations, together with further development of deep neural network architectures, offers high potential for robust field-based plant segmentation in the near future. This, in turn, will be a cornerstone of data-driven improvement in crop breeding and agricultural practices of global benefit.
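The reported metric is intersection over union per class; a minimal sketch on toy flattened masks (the 0 = soil, 1 = plant labels are illustrative):

```python
def iou(pred, target, cls):
    """Intersection over union for one class on flattened masks — the
    metric reported above (0.77 for plants, 0.90 for soil)."""
    inter = sum(p == cls and t == cls for p, t in zip(pred, target))
    union = sum(p == cls or t == cls for p, t in zip(pred, target))
    return inter / union if union else 1.0

pred   = [1, 1, 0, 0, 1, 0]  # toy masks: 1 = plant, 0 = soil
target = [1, 0, 0, 0, 1, 1]
print(iou(pred, target, 1))  # 0.5
```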

9.
Article in English | MEDLINE | ID: mdl-31870979

ABSTRACT

The depth images acquired by consumer depth sensors (e.g., Kinect and ToF) are usually of low resolution and insufficient quality. One natural solution is to incorporate a high-resolution RGB camera and exploit the statistical correlation between its data and depth. In recent years, both optimization-based and learning-based approaches have been proposed to deal with guided depth reconstruction problems. In this paper, we introduce a weighted analysis sparse representation (WASR) model for guided depth image enhancement, which can be considered a generalized formulation of a wide range of previous optimization-based models. We unfold the optimization of the WASR model and conduct guided depth reconstruction with dynamically changing stage-wise operations. Such a guidance strategy enables us to dynamically adjust the stage-wise operations that update the depth image, thus improving the reconstruction quality and speed. To learn the stage-wise operations in a task-driven manner, we propose two parameterizations and their corresponding methods: dynamic guidance with Gaussian RBF nonlinearity parameterization (DG-RBF) and dynamic guidance with CNN nonlinearity parameterization (DG-CNN). The network structures of the proposed DG-RBF and DG-CNN methods are designed with the objective function of our WASR model in mind, and the optimal network parameters are learned from paired training data. Such optimization-inspired network architectures enable our models to leverage previous expertise as well as benefit from training data. The effectiveness is validated for guided depth image super-resolution and for realistic depth image reconstruction tasks using standard benchmarks. Our DG-RBF and DG-CNN methods achieve the best quantitative results (RMSE) and better visual quality than the state-of-the-art approaches at the time of writing. The code is available at https://github.com/ShuhangGu/GuidedDepthSR.
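The unfolding idea, stage-wise operations with learned per-stage parameters, can be sketched on a 1-D toy where each stage balances data fidelity against RGB guidance; the fixed weights below stand in for the parameters learned in the actual WASR unfolding:

```python
def unrolled_enhance(depth, guide, steps):
    """Unrolled-optimization sketch: each stage pulls the estimate
    toward the observed depth (data fidelity) and toward the guidance
    signal, with per-stage weights (alpha, beta). In the real method
    these stage-wise operations are learned; here they are fixed."""
    x = list(depth)
    for alpha, beta in steps:  # one (alpha, beta) pair per stage
        x = [xi - alpha * (xi - d) - beta * (xi - g)
             for xi, d, g in zip(x, depth, guide)]
    return x

lowres_depth = [1.0, 1.0, 3.0, 3.0]
rgb_guide    = [1.0, 1.0, 1.0, 3.0]  # toy guidance: edge sits one pixel later
print(unrolled_enhance(lowres_depth, rgb_guide, [(0.5, 0.3)] * 4))
```

Where depth and guidance agree, the estimate is untouched; where they disagree, the guidance term shifts the depth edge toward the RGB edge, which is the intuition behind guided reconstruction.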

10.
Eur J Orthod ; 41(4): 428-433, 2019 Aug 08.
Article in English | MEDLINE | ID: mdl-30788496

ABSTRACT

OBJECTIVES: To evaluate the facial attractiveness of treated cleft patients and controls by artificial intelligence (AI) and to compare these results with panel ratings performed by laypeople, orthodontists, and oral surgeons. MATERIALS AND METHODS: Frontal and profile images of 20 treated left-sided cleft patients (10 males, mean age: 20.5 years) and 10 controls (5 males, mean age: 22.1 years) were evaluated for facial attractiveness with dedicated convolutional neural networks trained on >17 million ratings for attractiveness, and compared to the assessments of 15 laypeople, 14 orthodontists, and 10 oral surgeons performed on a visual analogue scale (n = 2323 scorings). RESULTS: AI evaluation of cleft patients (mean score: 4.75 ± 1.27) was comparable to human ratings (laypeople: 4.24 ± 0.81, orthodontists: 4.82 ± 0.94, oral surgeons: 4.74 ± 0.83) and was not statistically different (all Ps ≥ 0.19). Facial attractiveness of controls was rated significantly higher by humans than by AI (all Ps ≤ 0.02), which yielded lower scores than in cleft subjects. Variance was considerably large in all human rating groups when considering cases separately, and especially accentuated in the assessment of cleft patients (coefficient of variation: laypeople, 38.73 ± 9.64; orthodontists, 32.56 ± 8.21; oral surgeons, 42.19 ± 9.80). CONCLUSIONS: AI-based results were comparable with the average scores of cleft patients seen in all three rating groups (with especially strong agreement with both professional panels) but overall lower for control cases. The variance observed in panel ratings revealed a large imprecision based on a problematic lack of consensus among raters. IMPLICATION: Current panel-based evaluations of facial attractiveness suffer from dispersion-related issues and remain practically unavailable for patients.
AI could become a helpful tool for describing facial attractiveness, but the present results indicate that important adjustments to AI models are needed to improve the interpretation of the impact of cleft features on facial attractiveness.
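The dispersion measure used to quantify panel disagreement is the coefficient of variation; a minimal sketch on hypothetical visual-analogue-scale ratings for a single face:

```python
from statistics import mean, pstdev

def cv_percent(ratings):
    """Coefficient of variation (%) of one case's panel ratings —
    the dispersion measure used above to show rater disagreement."""
    return 100.0 * pstdev(ratings) / mean(ratings)

panel = [3.0, 5.5, 4.0, 6.5, 2.5]  # hypothetical VAS scores, one face
print(f"{cv_percent(panel):.1f}%")
```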


Subject(s)
Artificial Intelligence; Face; Adult; Humans; Intelligence; Male; Young Adult
11.
IEEE Trans Image Process ; 25(8): 3862-74, 2016 08.
Article in English | MEDLINE | ID: mdl-27254866

ABSTRACT

Color demosaicing is a key image processing step that aims to reconstruct the missing pixels of a recorded raw image. On the one hand, numerous interpolation methods focusing on spatial-spectral correlations have proved computationally very efficient, but they yield poor image quality and strong visible artifacts. On the other hand, optimization strategies, such as learned simultaneous sparse coding and sparsity- and adaptive principal component analysis-based algorithms, have been shown to greatly improve image quality compared with interpolation methods, but are unfortunately computationally heavy. In this paper, we propose efficient regression priors as a novel, fast post-processing algorithm that learns the regression priors offline from training data. We also propose an independent, efficient demosaicing algorithm based on directional difference regression, and introduce an enhanced version based on fused regression. We achieve image quality comparable to that of the state-of-the-art methods on three benchmarks, while being order(s) of magnitude faster.
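The directional-difference idea can be sketched as choosing the interpolation axis with the smaller gradient; this toy green-channel interpolation is a simplified stand-in for the baseline that the learned regression priors then refine:

```python
def green_at(raw, i, j):
    """Directional green interpolation at a non-green Bayer site:
    interpolate along the axis with the smaller gradient, so edges
    are not averaged across (a simplified directional-difference
    heuristic, not the paper's learned regression)."""
    h = abs(raw[i][j - 1] - raw[i][j + 1])  # horizontal gradient
    v = abs(raw[i - 1][j] - raw[i + 1][j])  # vertical gradient
    if h <= v:
        return (raw[i][j - 1] + raw[i][j + 1]) / 2
    return (raw[i - 1][j] + raw[i + 1][j]) / 2

raw = [[0, 10, 0],
       [10, 0, 10],
       [0, 10, 0]]            # toy mosaic; green samples surround the center
print(green_at(raw, 1, 1))   # 10.0
```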


Subject(s)
Algorithms; Image Enhancement; Image Interpretation, Computer-Assisted; Artifacts; Colorimetry