Results 1 - 20 of 53
1.
J Neural Eng ; 21(3)2024 May 13.
Article in English | MEDLINE | ID: mdl-38684154

ABSTRACT

Objective. The patterns of brain activity associated with different brain processes can be used to identify different brain states and make behavioural predictions. However, the relevant features are not readily apparent and accessible. Our aim is to design a system for learning informative latent representations from multichannel recordings of ongoing EEG activity. Approach. We propose a novel differentiable decoding pipeline consisting of learnable filters and a pre-determined feature extraction module. Specifically, we introduce filters parameterized by generalized Gaussian functions that offer a smooth derivative for stable end-to-end model training and allow for learning interpretable features. For the feature module, we use signal magnitude and functional connectivity estimates. Main results. We demonstrate the utility of our model on a new EEG dataset of unprecedented size (i.e. 721 subjects), where we identify consistent trends of music perception and related individual differences. Furthermore, we train and apply our model on two additional datasets, specifically for emotion recognition on SEED and workload classification on simultaneous task EEG workload. The discovered features align well with previous neuroscience studies and offer new insights, such as marked differences in the functional connectivity profile between left and right temporal areas during music listening. This agrees with the specialisation of the temporal lobes regarding music perception proposed in the literature. Significance. The proposed method offers strong interpretability of learned features while reaching similar levels of accuracy achieved by black-box deep learning models. This improved trustworthiness may promote the use of deep learning models in real-world applications. The model code is available at https://github.com/SMLudwig/EEGminer/.
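The generalized Gaussian filters described above admit a compact sketch. The following numpy example is an illustrative assumption, not the released EEGminer implementation (see the linked repository for that): a frequency-domain mask exp(-(|f - center| / width) ** shape) that is smooth in all three parameters, which is what makes end-to-end gradient training stable.

```python
import numpy as np

def generalized_gaussian_mask(freqs, center, width, shape):
    """Frequency mask exp(-(|f - center| / width) ** shape).

    shape = 2 gives a Gaussian; larger shape values approach a
    rectangular band-pass. All parameters have smooth derivatives.
    """
    return np.exp(-np.abs((freqs - center) / width) ** shape)

def filter_signal(x, fs, center, width, shape):
    """Apply the mask to a 1-D signal in the Fourier domain."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    mask = generalized_gaussian_mask(freqs, center, width, shape)
    return np.fft.irfft(np.fft.rfft(x) * mask, n=len(x))

# toy: isolate a 10 Hz component from a 10 Hz + 40 Hz mixture at 250 Hz
fs = 250
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 40 * t)
y = filter_signal(x, fs, center=10.0, width=3.0, shape=4.0)
```

In the paper's setting such filters would sit in front of the magnitude and connectivity feature module, with center, width and shape learned by backpropagation.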


Subjects
Brain, Electroencephalography, Humans, Electroencephalography/methods, Brain/physiology, Male, Adult, Female, Music, Young Adult, Auditory Perception/physiology, Machine Learning, Emotions/physiology
2.
J Neural Eng ; 21(3)2024 May 03.
Article in English | MEDLINE | ID: mdl-38621380

ABSTRACT

Objective. Machine learning (ML) models have opened up enormous opportunities in the field of brain-computer interfaces (BCIs). Despite their great success, they usually face severe limitations when employed in real-life applications outside a controlled laboratory setting. Approach. Mixing causal reasoning, which identifies causal relationships between variables of interest, with brainwave modeling can change one's viewpoint on some of the major challenges found at various stages of the ML pipeline, ranging from data collection and pre-processing to training methods and techniques. Main results. In this work, we employ causal reasoning and present a framework aiming to break down and analyze important challenges of brainwave modeling for BCIs. Significance. Furthermore, we present how general ML practices as well as brainwave-specific techniques can be utilized to solve some of these identified challenges. Finally, we discuss appropriate evaluation schemes to measure the performance of these techniques and to compare them efficiently with methods developed in the future.


Subjects
Brain-Computer Interfaces, Machine Learning, Brain-Computer Interfaces/trends, Humans, Electroencephalography/methods, Brain Waves/physiology, Brain/physiology, Algorithms
3.
IEEE Trans Image Process ; 32: 5721-5736, 2023.
Article in English | MEDLINE | ID: mdl-37824316

ABSTRACT

The long-tailed distribution is a common phenomenon in the real world. Large-scale image datasets inevitably demonstrate the long-tailed property, and models trained with imbalanced data obtain high performance for the over-represented categories but struggle for the under-represented ones, leading to biased predictions and performance degradation. To address this challenge, we propose a novel de-biasing method named Inverse Image Frequency (IIF). IIF is a multiplicative margin adjustment transformation of the logits in the classification layer of a convolutional neural network. Our method achieves stronger performance than similar works, and it is especially useful for downstream tasks such as long-tailed instance segmentation as it produces fewer false positive detections. Our extensive experiments show that IIF surpasses the state of the art on many long-tailed benchmarks such as ImageNet-LT, CIFAR-LT, Places-LT and LVIS, reaching 55.8% top-1 accuracy with ResNet50 on ImageNet-LT and 26.3% segmentation AP with MaskRCNN ResNet50 on LVIS. Code is available at https://github.com/kostas1515/iif.
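As a rough illustration of the idea of a frequency-derived multiplicative margin on the logits (the helper names and the exact weighting formula below are assumptions for the sketch; the paper's IIF transformation may differ in detail, see the linked repository):

```python
import numpy as np

def inverse_frequency_weights(class_counts):
    """Per-class weight log(1 / relative frequency): larger for rare classes."""
    freq = class_counts / class_counts.sum()
    return np.log(1.0 / freq)

def adjust_logits(logits, weights):
    """Multiplicative margin: scale each class logit by its frequency weight."""
    return logits * weights

counts = np.array([1000.0, 100.0, 10.0])   # long-tailed: head, mid, tail class
weights = inverse_frequency_weights(counts)
adjusted = adjust_logits(np.array([2.0, 2.0, 2.0]), weights)
# tied raw logits are re-ranked in favour of the tail class
```

The effect is that, for equal raw evidence, the decision is biased away from head classes, counteracting the bias the long-tailed training data induced.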

4.
J Neural Eng ; 20(5)2023 09 22.
Article in English | MEDLINE | ID: mdl-37678229

ABSTRACT

Objective. Brain-computer interfaces (BCIs) enable direct communication between the brain and the external world using one's neural activity, measured by electroencephalography (EEG) signals. In recent years, convolutional neural networks (CNNs) have been widely used to perform automatic feature extraction and classification in various EEG-based tasks. However, their undeniable benefits are counterbalanced by a lack of interpretability and an inability to perform sufficiently well when only a limited amount of training data is available. Approach. In this work, we introduce a novel, lightweight, fully-learnable neural network architecture that relies on Gabor filters to delocalize EEG signal information into scattering decomposition paths along frequency and slow-varying temporal modulations. Main results. We utilize our network in two distinct modeling settings, for building either a generic (training across subjects) or a personalized (training within a subject) classifier. Significance. In both cases, using two different publicly available datasets and one in-house collected dataset, we demonstrate high performance for our model with a considerably smaller number of trainable parameters and shorter training time compared to other state-of-the-art deep architectures. Moreover, our network demonstrates enhanced interpretability properties, emerging at the level of the temporal filtering operation, and enables us to train efficient personalized BCI models with a limited amount of training data.
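A minimal sketch of the kind of temporal Gabor filter the architecture relies on, a Gaussian envelope modulated by a sinusoidal carrier (the parameter names and the single-kernel setup are illustrative, not the paper's learnable filter bank):

```python
import numpy as np

def gabor_kernel(length, fs, f0, sigma):
    """1-D Gabor filter: Gaussian envelope modulated by a cosine at f0 Hz.

    length : kernel size in samples, fs : sampling rate in Hz,
    f0     : carrier frequency in Hz, sigma : envelope width in seconds.
    """
    t = (np.arange(length) - length // 2) / fs
    return np.exp(-t**2 / (2 * sigma**2)) * np.cos(2 * np.pi * f0 * t)

# convolving each EEG channel with a bank of such kernels yields
# band-limited responses; here a single 10 Hz (alpha-band) kernel
kernel = gabor_kernel(length=125, fs=250, f0=10.0, sigma=0.05)
```

In a scattering-style pipeline, cascades of such filters followed by modulus and averaging produce the frequency and slow temporal-modulation paths the abstract mentions.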


Subjects
Brain Waves, Brain-Computer Interfaces, Humans, Electroencephalography, Recognition (Psychology), Brain
5.
IEEE Trans Image Process ; 32: 3664-3678, 2023.
Article in English | MEDLINE | ID: mdl-37384475

ABSTRACT

Perspective distortions and crowd variations make crowd counting a challenging task in computer vision. To tackle it, many previous works have used multi-scale architectures in deep neural networks (DNNs). Multi-scale branches can be either directly merged (e.g. by concatenation) or merged through the guidance of proxies (e.g. attention) in the DNNs. Despite their prevalence, these combination methods are not sophisticated enough to deal with the per-pixel performance discrepancy over multi-scale density maps. In this work, we redesign the multi-scale neural network by introducing a hierarchical mixture of density experts, which hierarchically merges multi-scale density maps for crowd counting. Within the hierarchical structure, an expert competition and collaboration scheme is presented to encourage contributions from all scales; pixel-wise soft gating nets are introduced to provide pixel-wise soft weights for scale combinations in different hierarchies. The network is optimized using both the crowd density map and the local counting map, where the latter is obtained by local integration of the former. Optimizing both can be problematic because of their potential conflicts. We introduce a new relative local counting loss based on relative count differences among hard-predicted local regions in an image, which proves to be complementary to the conventional absolute error loss on the density map. Experiments show that our method achieves state-of-the-art performance on five public datasets, i.e. ShanghaiTech, UCF_CC_50, JHU-CROWD++, NWPU-Crowd and Trancos. Our code will be available at https://github.com/ZPDu/Redesigning-Multi-Scale-Neural-Network-for-Crowd-Counting.
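The pixel-wise soft gating can be illustrated with a small numpy toy (a hedged sketch, not the paper's network; in the real model the gate logits come from a learned gating net and the combination is applied hierarchically):

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_density_maps(maps, gate_logits):
    """Pixel-wise convex combination of per-scale density maps.

    maps        : (S, H, W) density predictions from S scales
    gate_logits : (S, H, W) unnormalised scores from a gating net
    """
    gates = softmax(gate_logits, axis=0)   # per-pixel weights, sum to 1
    return (gates * maps).sum(axis=0)

# toy: two scales predicting constant densities 1 and 3; zero logits
# give uniform gates, so the fused map is their average
maps = np.stack([np.full((4, 4), 1.0), np.full((4, 4), 3.0)])
fused = fuse_density_maps(maps, np.zeros((2, 4, 4)))
```

Summing the fused density map over a region gives the local count used by the counting losses the abstract describes.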

6.
IEEE Trans Pattern Anal Mach Intell ; 45(11): 12726-12737, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37030770

ABSTRACT

Self-attention mechanisms and non-local blocks have become crucial building blocks for state-of-the-art neural architectures thanks to their unparalleled ability to capture long-range dependencies in the input. However, their cost is quadratic in the number of spatial positions, making their use impractical in many real-world applications. In this work, we analyze these methods through a polynomial lens and show that self-attention can be seen as a special case of a 3rd-order polynomial. Within this polynomial framework, we are able to design polynomial operators capable of accessing the same data patterns as non-local and self-attention blocks while reducing the complexity from quadratic to linear. As a result, we propose two modules (Poly-NL and Poly-SA) that can be used as "drop-in" replacements for more complex non-local and self-attention layers in state-of-the-art CNN and ViT architectures. Our modules can achieve comparable, if not better, performance across a wide range of computer vision tasks while keeping complexity equivalent to a standard linear layer.

7.
IEEE Trans Pattern Anal Mach Intell ; 45(8): 9743-9756, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37028333

ABSTRACT

We present Free-HeadGAN, a person-generic neural talking-head synthesis system. We show that modeling faces with sparse 3D facial landmarks is sufficient for achieving state-of-the-art generative performance without relying on strong statistical priors of the face, such as 3D Morphable Models. Apart from 3D pose and facial expressions, our method is capable of fully transferring eye gaze from a driving actor to a source identity. Our complete pipeline consists of three components: a canonical 3D key-point estimator that regresses 3D pose and expression-related deformations, a gaze estimation network, and a generator built upon the architecture of HeadGAN. We further experiment with an extension of our generator that accommodates few-shot learning using an attention mechanism, in case multiple source images are available. Compared to recent methods for reenactment and motion transfer, our system achieves higher photo-realism combined with superior identity preservation, while offering explicit gaze control.


Subjects
Algorithms, Face, Humans, Fixation (Ocular), Learning, Facial Expression
8.
Article in English | MEDLINE | ID: mdl-37023162

ABSTRACT

Deep Convolutional Neural Networks (CNNs) have recently demonstrated impressive results in electroencephalogram (EEG) decoding for several Brain-Computer Interface (BCI) paradigms, including Motor-Imagery (MI). However, the neurophysiological processes underpinning EEG signals vary across subjects, causing covariate shifts in data distributions and hence hindering the generalization of deep models across subjects. In this paper, we aim to address the challenge of inter-subject variability in MI. To this end, we employ causal reasoning to characterize all possible distribution shifts in the MI task and propose a dynamic convolution framework to account for shifts caused by inter-subject variability. Using publicly available MI datasets, we demonstrate improved generalization performance (up to 5%) across subjects in various MI tasks for four well-established deep architectures.


Subjects
Algorithms, Brain-Computer Interfaces, Humans, Neural Networks (Computer), Electroencephalography/methods, Generalization (Psychology), Imagination/physiology
9.
IEEE Trans Pattern Anal Mach Intell ; 45(3): 3968-3978, 2023 03.
Article in English | MEDLINE | ID: mdl-35687621

ABSTRACT

Recent deep face hallucination methods show stunning performance in super-resolving severely degraded facial images, even surpassing human ability. However, these algorithms are mainly evaluated on non-public synthetic datasets. It is thus unclear how they perform on public face hallucination datasets. Meanwhile, most of the existing datasets do not adequately consider the distribution of races, which makes face hallucination methods trained on them biased toward specific races. To address these two problems, in this paper we build a public Ethnically Diverse Face dataset, EDFace-Celeb-1M, and design a benchmark task for face hallucination. Our dataset includes 1.7 million photos that cover different countries, with a relatively balanced race composition. To the best of our knowledge, it is the largest publicly available face hallucination dataset in the wild. Associated with this dataset, this paper also contributes various evaluation protocols and provides a comprehensive analysis to benchmark the existing state-of-the-art methods. The benchmark evaluations demonstrate the performance and limitations of state-of-the-art algorithms. https://github.com/HDCVLab/EDFace-Celeb-1M.


Subjects
Algorithms, Benchmarking, Humans, Hallucinations
10.
IEEE Trans Pattern Anal Mach Intell ; 45(1): 657-668, 2023 Jan.
Article in English | MEDLINE | ID: mdl-35201983

ABSTRACT

While Graph Neural Networks (GNNs) have achieved remarkable results in a variety of applications, recent studies exposed important shortcomings in their ability to capture the structure of the underlying graph. It has been shown that the expressive power of standard GNNs is bounded by the Weisfeiler-Leman (WL) graph isomorphism test, from which they inherit proven limitations such as the inability to detect and count graph substructures. On the other hand, there is significant empirical evidence, e.g. in network science and bioinformatics, that substructures are often intimately related to downstream tasks. Motivated by this, we propose "Graph Substructure Networks" (GSN), a topologically-aware message passing scheme based on substructure encoding. We theoretically analyse the expressive power of our architecture, showing that it is strictly more expressive than the WL test, and provide sufficient conditions for universality. Importantly, we do not attempt to adhere to the WL hierarchy; this allows us to retain multiple attractive properties of standard GNNs, such as locality and linear network complexity, while being able to disambiguate even hard instances of graph isomorphism. We perform an extensive experimental evaluation on graph classification and regression tasks and obtain state-of-the-art results in diverse real-world settings, including molecular graphs and social networks.
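A concrete instance of the limitation and the remedy: a hexagon and two disjoint triangles are both 2-regular and indistinguishable to the 1-WL test, yet per-node triangle counts, one example of the substructure features GSN can encode, separate them immediately (an illustrative numpy sketch, not the GSN implementation):

```python
import numpy as np

def cycle_adj(n):
    """Adjacency matrix of the cycle graph C_n."""
    a = np.zeros((n, n), dtype=int)
    for i in range(n):
        a[i, (i + 1) % n] = a[(i + 1) % n, i] = 1
    return a

def triangle_counts(adj):
    """Per-node triangle counts: diag(A^3) counts closed 3-walks through
    each node, and every triangle is walked twice (once per direction)."""
    return np.diag(np.linalg.matrix_power(adj, 3)) // 2

hexagon = cycle_adj(6)                       # one 6-cycle: no triangles
two_triangles = np.zeros((6, 6), dtype=int)  # two disjoint 3-cycles
two_triangles[:3, :3] = cycle_adj(3)
two_triangles[3:, 3:] = cycle_adj(3)
# Both graphs are 2-regular, so 1-WL assigns identical colourings,
# but the substructure feature tells them apart.
```

Appending such counts to the node features before message passing is the spirit of the substructure encoding the abstract describes.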

11.
Bone Rep ; 16: 101528, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35399871

ABSTRACT

Background/aim: To develop a 3D morphable model of the normal paediatric mandible to analyse shape development and growth patterns for males and females. Methods: Computed tomography (CT) data were collected for 242 healthy children referred for CT scans between 2011 and 2018, aged between 0 and 47 months (mean, 20.6 ± 13.4 months; 59.9% male). Thresholding techniques were used to segment the mandible from the CT scans. All mandible meshes were annotated using a defined set of 52 landmarks and processed so that all meshes followed a consistent triangulation. Following this, the mandible meshes were rigidly aligned to remove translation and rotation effects, while size effects were retained. Principal component analysis (PCA) was applied to the processed meshes to construct a generative 3D morphable model. Partial least squares (PLS) regression was also applied to the processed data to extract the shape modes with which to evaluate shape differences by age and sex. Growth curves were constructed for anthropometric measurements. Results: A 3D morphable model of the paediatric mandible was constructed and validated with good generalisation, compactness, and specificity. Growth curves of the assessed anthropometric measurements were plotted, without significant differences between male and female subjects. The first principal component was dominated by size effects and is highly correlated with age at the time of scan (Spearman's r = 0.94, p < 0.01). As with PCA, the first extracted PLS mode captures much of the size variation within the dataset and is highly correlated with age (Spearman's r = -0.94, p < 0.01). Little correlation was observed between extracted shape modes and sex with either PCA or PLS for this study population. Conclusion: The presented 3D morphable model of the paediatric mandible enables an understanding of mandibular shape development and variation by age and sex.
It allowed for the construction of growth curves, which contain valuable information that can be used to enhance our understanding of various disorders that affect mandibular development. Knowledge of shape changes in the growing mandible has the potential to improve diagnostic accuracy for craniofacial conditions that impact mandibular morphology, as well as objective evaluation, surgical planning, and patient follow-up.
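The PCA construction step generalizes to any set of meshes in dense correspondence. A hedged numpy sketch on toy data (the study's pipeline additionally handles segmentation, landmarking, rigid alignment and PLS, none of which is shown here):

```python
import numpy as np

def build_morphable_model(meshes):
    """PCA shape model from meshes in dense correspondence.

    meshes : (N, V, 3) array, N meshes with V corresponding vertices.
    Returns the mean shape, principal components and per-mode std devs.
    """
    X = meshes.reshape(len(meshes), -1)
    mean = X.mean(axis=0)
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt, S / np.sqrt(len(meshes) - 1)

def synthesize(mean, components, stds, coeffs):
    """Generate a shape instance from standardised mode coefficients."""
    k = len(coeffs)
    return (mean + (coeffs * stds[:k]) @ components[:k]).reshape(-1, 3)

rng = np.random.default_rng(0)
meshes = rng.normal(size=(30, 52, 3))   # toy stand-in for aligned meshes
mean, comps, stds = build_morphable_model(meshes)
new_shape = synthesize(mean, comps, stds, np.array([2.0, -1.0]))
```

Sampling the coefficients from a standard normal yields synthetic shape instances, which is exactly how such a model is used generatively.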

12.
Sci Rep ; 12(1): 2230, 2022 02 09.
Article in English | MEDLINE | ID: mdl-35140239

ABSTRACT

Clinical diagnosis of craniofacial anomalies requires expert knowledge. Recent studies have shown that artificial intelligence (AI) based facial analysis can match the diagnostic capabilities of expert clinicians in syndrome identification. In general, these systems use 2D images and analyse texture and colour. They are powerful tools for photographic analysis but are not suitable for use with medical imaging modalities such as ultrasound, MRI or CT, and are unable to take shape information into consideration when making a diagnostic prediction. 3D morphable models (3DMMs), and their recently proposed successors, mesh autoencoders, analyse surface topography rather than texture, enabling analysis from photography and all common medical imaging modalities, and present an alternative to image-based analysis. We present a craniofacial analysis framework for syndrome identification using Convolutional Mesh Autoencoders (CMAs). The models were trained using 3D photographs of the general population (LSFM and LYHM) and computed tomography (CT) scans from healthy infants and patients with 3 genetically distinct craniofacial syndromes (Muenke, Crouzon, Apert). Machine diagnosis outperformed expert clinical diagnosis with an accuracy of 99.98%, sensitivity of 99.95% and specificity of 100%. The diagnostic precision of this technique supports its potential inclusion in clinical decision support systems. Its reliance on 3D topography characterisation makes it suitable for AI-assisted diagnosis in medical imaging as well as photographic analysis in the clinical setting.


Subjects
Artificial Intelligence, Craniosynostoses/classification, Craniosynostoses/diagnosis, Image Processing (Computer-Assisted)/methods, Imaging (Three-Dimensional)/methods, Computer Simulation, Craniosynostoses/diagnostic imaging, Face/abnormalities, Head/abnormalities, Humans, Infant, Tomography (X-Ray Computed)
13.
IEEE Trans Pattern Anal Mach Intell ; 44(8): 4021-4034, 2022 08.
Article in English | MEDLINE | ID: mdl-33571091

ABSTRACT

Deep convolutional neural networks (DCNNs) are currently the method of choice for both generative and discriminative learning in computer vision and machine learning. The success of DCNNs can be attributed to the careful selection of their building blocks (e.g., residual blocks, rectifiers, and sophisticated normalization schemes, to mention but a few). In this paper, we propose Π-Nets, a new class of function approximators based on polynomial expansions. Π-Nets are polynomial neural networks, i.e., the output is a high-order polynomial of the input. The unknown parameters, which are naturally represented by high-order tensors, are estimated through collective tensor factorization with factor sharing. We introduce three tensor decompositions that significantly reduce the number of parameters and show how they can be efficiently implemented by hierarchical neural networks. We empirically demonstrate that Π-Nets are very expressive and even produce good results without the use of non-linear activation functions on a large battery of tasks and signals, i.e., images, graphs, and audio. When used in conjunction with activation functions, Π-Nets produce state-of-the-art results in three challenging tasks, i.e., image generation, face verification and 3D mesh representation learning. The source code is available at https://github.com/grigorisg9gr/polynomial_nets.
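The core idea, high-order polynomials of the input with factorized parameter tensors, can be sketched for degree 2 (an illustrative toy, not one of the paper's three decompositions; all names are assumptions):

```python
import numpy as np

def poly2_forward(x, W1, U, V, b):
    """Degree-2 polynomial of the input with a factorized interaction term.

    A full second-order term needs a k x d x d parameter tensor;
    factorizing it as (U x) * (V x) (a Hadamard product) keeps the
    parameter count linear in the input dimension d.
    """
    return W1 @ x + (U @ x) * (V @ x) + b

rng = np.random.default_rng(1)
d, k = 8, 4
W1 = rng.normal(size=(k, d))
U = rng.normal(size=(k, d))
V = rng.normal(size=(k, d))
b = np.zeros(k)
x = rng.normal(size=d)
y = poly2_forward(x, W1, U, V, b)
```

Stacking such blocks composes the polynomial degree multiplicatively, which is how high-order expansions are reached without ever materializing the full parameter tensors.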


Subjects
Algorithms, Neural Networks (Computer), Machine Learning
14.
IEEE Trans Pattern Anal Mach Intell ; 44(10): 5962-5979, 2022 10.
Article in English | MEDLINE | ID: mdl-34106845

ABSTRACT

Recently, a popular line of research in face recognition has adopted margins in the well-established softmax loss function to maximize class separability. In this paper, we first introduce an Additive Angular Margin Loss (ArcFace), which not only has a clear geometric interpretation but also significantly enhances discriminative power. Since ArcFace is susceptible to massive label noise, we further propose sub-center ArcFace, in which each class contains K sub-centers and training samples only need to be close to any of the K positive sub-centers. Sub-center ArcFace encourages one dominant sub-class that contains the majority of clean faces and non-dominant sub-classes that include hard or noisy faces. Based on this self-propelled isolation, we boost performance by automatically purifying raw web faces under massive real-world noise. Besides discriminative feature embedding, we also explore the inverse problem: mapping feature vectors to face images. Without training any additional generator or discriminator, the pre-trained ArcFace model can generate identity-preserving face images for subjects both inside and outside the training data, using only the network gradient and Batch Normalization (BN) priors. Extensive experiments demonstrate that ArcFace can enhance the discriminative feature embedding as well as strengthen the generative face synthesis.
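The additive angular margin itself is simple to state: add a constant m to the angle between an embedding and its ground-truth class centre before the softmax. A hedged numpy sketch of the logit computation (training details such as clamping theta + m and the sub-center extension are omitted):

```python
import numpy as np

def arcface_logits(embeddings, weights, labels, s=64.0, m=0.5):
    """Additive angular margin on the target-class similarity.

    embeddings : (N, d) features, weights : (C, d) class centres.
    Both are L2-normalised, so logits are scaled cosines; the margin m
    is added to the *angle* of the ground-truth class only, making the
    decision boundary depend on geodesic distance on the hypersphere.
    """
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    theta = np.arccos(np.clip(e @ w.T, -1.0, 1.0))
    theta[np.arange(len(labels)), labels] += m   # penalise the target class
    return s * np.cos(theta)

rng = np.random.default_rng(2)
emb = rng.normal(size=(3, 16))
W = rng.normal(size=(5, 16))
labels = np.array([0, 2, 4])
logits = arcface_logits(emb, W, labels)
```

Feeding these logits to a standard cross-entropy loss forces the target angle to be smaller by at least the margin, which is the source of the improved class separability.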


Subjects
Facial Recognition, Algorithms, Face, Humans
15.
IEEE Trans Pattern Anal Mach Intell ; 44(12): 9269-9284, 2022 Dec.
Article in English | MEDLINE | ID: mdl-34748477

ABSTRACT

Over the last few years, with the advent of Generative Adversarial Networks (GANs), many face analysis tasks have accomplished astounding performance, with applications including, but not limited to, face generation and 3D face reconstruction from a single "in-the-wild" image. Nevertheless, to the best of our knowledge, there is no method that can produce render-ready high-resolution 3D faces from "in-the-wild" images, which can be attributed to (a) the scarcity of available data for training, and (b) the lack of robust methodologies that can successfully be applied to very high-resolution data. In this paper, we introduce the first method able to reconstruct photorealistic render-ready 3D facial geometry and BRDF from a single "in-the-wild" image. To achieve this, we capture a large dataset of facial shape and reflectance, which we have made public. Moreover, we define a fast and photorealistic differentiable rendering methodology with accurate facial skin diffuse and specular reflection, self-occlusion, and a subsurface scattering approximation. With this, we train a network that disentangles the facial diffuse and specular reflectance components from a mesh and texture with baked illumination, scanned or reconstructed with a 3DMM fitting method. As we demonstrate in a series of qualitative and quantitative experiments, our method outperforms the existing art by a significant margin and reconstructs authentic, 4K-by-6K-resolution 3D faces from a single low-resolution image that are ready to be rendered in various applications and bridge the uncanny valley.


Subjects
Algorithms, Image Processing (Computer-Assisted), Image Processing (Computer-Assisted)/methods, Face/diagnostic imaging, Lighting
16.
IEEE Trans Pattern Anal Mach Intell ; 44(9): 4879-4893, 2022 09.
Article in English | MEDLINE | ID: mdl-34043505

ABSTRACT

A lot of work has been done towards reconstructing the 3D facial structure from single images by capitalizing on the power of deep convolutional neural networks (DCNNs). In recent works, the texture features either correspond to components of a linear texture space or are learned by auto-encoders directly from in-the-wild images. In all cases, the quality of facial texture reconstruction is still not capable of modeling facial texture with high-frequency details. In this paper, we take a radically different approach and harness the power of generative adversarial networks (GANs) and DCNNs to reconstruct facial texture and shape from single images. That is, we utilize GANs to train a very powerful facial texture prior from a large-scale 3D texture dataset. Then, we revisit the original 3D Morphable Model (3DMM) fitting, making use of non-linear optimization to find the optimal latent parameters that best reconstruct the test image, but under a new perspective. In order to be robust to initialisation and to expedite the fitting process, we propose a novel self-supervised regression-based approach. We demonstrate excellent results in photorealistic and identity-preserving 3D face reconstructions and achieve, for the first time to the best of our knowledge, facial texture reconstruction with high-frequency details.


Subjects
Algorithms, Neural Networks (Computer), Face/diagnostic imaging, Image Processing (Computer-Assisted)/methods
17.
Bone Rep ; 15: 101154, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34917697

ABSTRACT

BACKGROUND: This study aims to capture the 3D shape of the human skull in a healthy paediatric population (0-4 years old) and construct a generative statistical shape model. METHODS: The skull bones of 178 healthy children (55% male, 20.8 ± 12.9 months) were reconstructed from computed tomography (CT) images. Twenty-nine anatomical landmarks were placed on the 3D skull reconstructions. Rotation, translation and size were removed, and all skull meshes were placed in dense correspondence using a dimensionless skull mesh template and a non-rigid iterative closest point algorithm. A 3D morphable model (3DMM) was created using principal component analysis, and intrinsically and geometrically validated with anthropometric measurements. Synthetic skull instances were generated exploiting the 3DMM and validated by comparison of the anthropometric measurements with the selected input population. RESULTS: The 3DMM of the paediatric skull 0-4 years was successfully constructed. The model was reasonably compact: 90% of the model shape variance was captured within the first 10 principal components. The generalisation error, quantifying the ability of the 3DMM to represent shape instances not encountered during training, was 0.47 mm when all model components were used. The specificity value was <0.7 mm, demonstrating that novel skull instances generated by the model are realistic. The 3DMM mean shape was representative of the selected population (differences <2%). Overall, good agreement was observed in the anthropometric measures extracted from the selected population compared to normative literature data (maximum difference in the intertemporal distance) and to the synthetically generated cases.
CONCLUSION: This study presents a reliable statistical shape model of the paediatric skull 0-4 years that adheres to known skull morphometric measures, can accurately represent unseen skull samples not used during model construction, and can generate novel realistic skull instances, thus presenting a solution to the limited availability of normative data in this field.

18.
IEEE Trans Pattern Anal Mach Intell ; 43(11): 4142-4160, 2021 11.
Article in English | MEDLINE | ID: mdl-32356737

ABSTRACT

Three-dimensional morphable models (3DMMs) are powerful statistical tools for representing the 3D shapes and textures of an object class. Here we present the most complete 3DMM of the human head to date, which includes the face, cranium, ears, eyes, teeth and tongue. To achieve this, we propose two methods for combining existing 3DMMs of different overlapping head parts: (i) use a regressor to complete missing parts of one model using the other, and (ii) use the Gaussian Process framework to blend covariance matrices from multiple models. Thus, we build a new combined face-and-head shape model that blends the variability and facial detail of an existing face model (the LSFM) with the full head modelling capability of an existing head model (the LYHM). Then we construct and fuse a highly-detailed ear model to extend the variation of the ear shape. Eye and eye region models are incorporated into the head model, along with basic models of the teeth, tongue and inner mouth cavity. The new model achieves state-of-the-art performance. We use our model to reconstruct full head representations from single, unconstrained images, allowing us to parameterize craniofacial shape and texture, along with ear shape, eye gaze and eye color.


Subjects
Imaging (Three-Dimensional), Pattern Recognition (Automated), Algorithms, Face, Head/diagnostic imaging, Humans
19.
Sci Rep ; 9(1): 13597, 2019 09 19.
Article in English | MEDLINE | ID: mdl-31537815

ABSTRACT

Current computational tools for planning and simulation in plastic and reconstructive surgery lack sufficient precision and are time-consuming, thus resulting in limited adoption. Although computer-assisted surgical planning systems help to improve clinical outcomes, shorten operation time and reduce cost, they are often too complex and require extensive manual input, which ultimately limits their use in doctor-patient communication and clinical decision making. Here, we present the first large-scale clinical 3D morphable model, a machine-learning-based framework involving supervised learning for diagnostics, risk stratification, and treatment simulation. The model, trained and validated with 4,261 faces of healthy volunteers and orthognathic (jaw) surgery patients, diagnoses patients with 95.5% sensitivity and 95.2% specificity, and simulates surgical outcomes with a mean accuracy of 1.1 ± 0.3 mm. We demonstrate how this model could fully-automatically aid diagnosis and provide patient-specific treatment plans from a 3D scan alone, to help efficient clinical decision making and improve clinical understanding of face shape as a marker for primary and secondary surgery.


Subjects
Image Interpretation (Computer-Assisted)/methods, Orthognathic Surgical Procedures/methods, Adolescent, Adult, Aged, Aged (80 and over), Clinical Decision-Making, Computer Simulation, Female, Healthy Volunteers, Humans, Male, Middle Aged, Patient-Specific Computational Modeling, Plastic Surgery Procedures, Supervised Machine Learning, Surgery (Computer-Assisted), Young Adult
20.
IEEE Trans Pattern Anal Mach Intell ; 41(10): 2349-2364, 2019 10.
Article in English | MEDLINE | ID: mdl-30843800

ABSTRACT

Robust principal component analysis (RPCA) is a powerful method for learning low-rank feature representations of various visual data. However, for certain types and significant amounts of error corruption, it fails to yield satisfactory results; a drawback that can be alleviated by exploiting domain-dependent prior knowledge or information. In this paper, we propose two models for RPCA that take such side information into account, even in the presence of missing values. We apply this framework to the task of UV completion, which is widely used in pose-invariant face recognition. Moreover, we construct a generative adversarial network (GAN) to extract side information as well as subspaces. These subspaces not only assist in the recovery but also speed up the process in the case of large-scale data. We quantitatively and qualitatively evaluate the proposed approaches on both synthetic data and eight real-world datasets to verify their effectiveness.
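For intuition, plain RPCA without side information can be sketched as alternating proximal steps on a principal-component-pursuit-style objective (a simplified illustration, not the paper's side-information models; lambda follows the standard 1/sqrt(max(m, n)) choice, while mu and the iteration count are arbitrary toy settings):

```python
import numpy as np

def shrink(x, tau):
    """Soft-thresholding: proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svd_shrink(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(S, tau)) @ Vt

def rpca(M, lam=None, mu=0.3, n_iter=200):
    """Split M into low-rank L plus sparse S by alternating proximal steps
    on mu*||L||_* + lam*mu*||S||_1 + 0.5*||M - L - S||_F^2."""
    lam = 1.0 / np.sqrt(max(M.shape)) if lam is None else lam
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(n_iter):
        L = svd_shrink(M - S, mu)       # low-rank update
        S = shrink(M - L, lam * mu)     # sparse update
    return L, S

# toy: a rank-1 matrix with two large sparse corruptions
rng = np.random.default_rng(3)
low_rank = rng.normal(size=(20, 1)) @ rng.normal(size=(1, 20))
corrupt = low_rank.copy()
corrupt[2, 3] += 10.0
corrupt[7, 11] -= 10.0
L, S = rpca(corrupt)
```

The sparse component absorbs the gross corruptions while the low-rank component recovers the underlying structure; side information, as in the paper, would further constrain the subspace of L.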
