Results 1 - 20 of 25,595
1.
Sensors (Basel) ; 21(5), 2021 Feb 28.
Article in English | MEDLINE | ID: mdl-33670835

ABSTRACT

At present, in the field of video-based human action recognition, deep neural networks are mainly divided into two branches: the 2D convolutional neural network (CNN) and 3D CNN. However, 2D CNN's temporal and spatial feature extraction processes are independent of each other, which means that it is easy to ignore the internal connection, affecting the performance of recognition. Although 3D CNN can extract the temporal and spatial features of the video sequence at the same time, the parameters of the 3D model increase exponentially, resulting in the model being difficult to train and transfer. To solve this problem, this article is based on 3D CNN combined with a residual structure and attention mechanism to improve the existing 3D CNN model, and we propose two types of human action recognition models (the Residual 3D Network (R3D) and Attention Residual 3D Network (AR3D)). Firstly, in this article, we propose a shallow feature extraction module and improve the ordinary 3D residual structure, which reduces the parameters and strengthens the extraction of temporal features. Secondly, we explore the application of the attention mechanism in human action recognition and design a 3D spatio-temporal attention mechanism module to strengthen the extraction of global features of human action. Finally, in order to make full use of the residual structure and attention mechanism, an Attention Residual 3D Network (AR3D) is proposed, and its two fusion strategies and corresponding model structure (AR3D_V1, AR3D_V2) are introduced in detail. Experiments show that the fused structure shows different degrees of performance improvement compared to a single structure.


Subjects
Human Activities , Neural Networks, Computer , Humans , Pattern Recognition, Automated , Video Recording
2.
Sensors (Basel) ; 21(5), 2021 Feb 24.
Article in English | MEDLINE | ID: mdl-33668254

ABSTRACT

Speech emotion recognition (SER) is a natural method of recognizing individual emotions in everyday life. To distribute SER models to real-world applications, some key challenges must be overcome, such as the lack of datasets tagged with emotion labels and the weak generalization of the SER model for an unseen target domain. This study proposes a multi-path and group-loss-based network (MPGLN) for SER to support multi-domain adaptation. The proposed model includes a bidirectional long short-term memory-based temporal feature generator and a transferred feature extractor from the pre-trained VGG-like audio classification model (VGGish), and it learns simultaneously based on multiple losses according to the association of emotion labels in the discrete and dimensional models. For the evaluation of the MPGLN SER as applied to multi-cultural domain datasets, the Korean Emotional Speech Database (KESD), including KESDy18 and KESDy19, is constructed, and the English-speaking Interactive Emotional Dyadic Motion Capture database (IEMOCAP) is used. The evaluation of multi-domain adaptation and domain generalization showed 3.7% and 3.5% improvements, respectively, of the F1 score when comparing the performance of MPGLN SER with a baseline SER model that uses a temporal feature generator. We show that the MPGLN SER efficiently supports multi-domain adaptation and reinforces model generalization.


Subjects
Databases, Factual , Emotions/classification , Machine Learning , Pattern Recognition, Automated , Speech , Humans
3.
Sensors (Basel) ; 21(4), 2021 Feb 15.
Article in English | MEDLINE | ID: mdl-33671966

ABSTRACT

In recent years, flexible sensors for data gloves have been developed that aim to achieve excellent wearability, but they are associated with difficulties due to complicated manufacturing and embedding into the glove. This study proposes a knitted glove integrated with strain sensors for pattern recognition of hand postures. The proposed sensing glove is fabricated all at once by a knitting technique, without sewing or bonding, and is composed of strain sensors knitted with conductive yarn and a glove body knitted with non-conductive yarn. To verify the performance of the developed glove, electrical resistance variations were measured according to the flexion angle and speed. These data showed different values depending on the speed or angle of movement. We carried out experiments on hand posture pattern recognition to verify the practicability of the knitted sensing glove. For this purpose, 10 able-bodied subjects participated in recognition experiments on 10 target hand postures. The average classification accuracy across the 10 subjects reached 94.17% when each subject's own data were used. An accuracy of up to 97.1% was achieved for the grasp posture among the 10 target postures. When the mixed data from all 10 subjects were used for pattern recognition, the average classification accuracy given by the confusion matrix was 89.5%. The experimental results therefore demonstrate the effectiveness of the knitted sensing glove. In addition, the simple manufacturing process of the knitted sensing glove is expected to reduce cost.


Subjects
Gloves, Protective , Hand , Posture , Hand Strength , Humans , Pattern Recognition, Automated , Range of Motion, Articular
4.
Sensors (Basel) ; 21(4), 2021 Feb 13.
Article in English | MEDLINE | ID: mdl-33668544

ABSTRACT

Surgeons' procedural skills and intraoperative decision making are key elements of clinical practice. However, the objective assessment of these skills remains a challenge to this day. Surgical workflow analysis (SWA) is emerging as a powerful tool to solve this issue in surgical educational environments in real time. Typically, SWA makes use of video signals to automatically identify the surgical phase. We hypothesize that the analysis of surgeons' speech using natural language processing (NLP) can provide deeper insight into the surgical decision-making processes. As a preliminary step, this study proposes to use audio signals registered in the educational operating room (OR) to classify the phases of a laparoscopic cholecystectomy (LC). To do this, we firstly created a database with the transcriptions of audio recorded in surgical educational environments and their corresponding phase. Secondly, we compared the performance of four feature extraction techniques and four machine learning models to find the most appropriate model for phase recognition. The best resulting model was a support vector machine (SVM) coupled to a hidden-Markov model (HMM), trained with features obtained with Word2Vec (82.95% average accuracy). The analysis of this model's confusion matrix shows that some phrases are misplaced due to the similarity in the words used. The study of the model's temporal component suggests that further attention should be paid to accurately detect surgeons' normal conversation. This study proves that speech-based classification of LC phases can be effectively achieved. This lays the foundation for the use of audio signals for SWA, to create a framework of LC to be used in surgical training, especially for the training and assessment of procedural and decision-making skills (e.g., to assess residents' procedural knowledge and their ability to react to adverse situations).


Subjects
Cholecystectomy, Laparoscopic , Clinical Competence , General Surgery , Pattern Recognition, Automated , General Surgery/standards , Humans , Operating Rooms , Speech
5.
J Chromatogr A ; 1639: 461922, 2021 Feb 22.
Article in English | MEDLINE | ID: mdl-33540183

ABSTRACT

A peak-tracking algorithm was developed for use in comprehensive two-dimensional liquid chromatography coupled to mass spectrometry. Chromatographic peaks were tracked across two different chromatograms, utilizing the available spectral information, the statistical moments of the peaks and the relative retention times in both dimensions. The algorithm consists of three branches. In the pre-processing branch, system peaks are removed based on mass spectra compared to low-intensity regions, and search windows are applied, relative to the retention times in each dimension, to reduce the required computational power by eliminating unlikely pairs. In the comparison branch, similarity between the spectral information and statistical moments of peaks within the search windows is calculated. Lastly, in the evaluation branch, extracted-ion-current chromatograms are utilized to assess the validity of the pairing results. The algorithm was applied to peptide retention data recorded under varying chromatographic conditions for use in retention modelling as part of method optimization tools. Moreover, the algorithm was applied to complex peptide mixtures obtained from enzymatic digestion of monoclonal antibodies. The algorithm yielded no false positives. However, due to limitations in the peak-detection algorithm, cross-pairing within the same peaks occurred and six trace compounds remained falsely unpaired.
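
The windowing and comparison steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine similarity measure, the window widths, the `min_similarity` threshold, and the peak dictionary layout are all assumptions made for illustration, and the statistical-moment comparison and the extracted-ion-current validation are left out.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def pair_peaks(ref_peaks, new_peaks, window=(0.2, 0.2), min_similarity=0.9):
    """Pair each reference peak with its most similar candidate inside the window.

    Each peak is a dict with relative retention times 'rt1' and 'rt2' and a
    binned mass spectrum 'spectrum' (hypothetical layout, assumed here).
    """
    pairs = {}
    for i, ref in enumerate(ref_peaks):
        best_j, best_sim = None, min_similarity
        for j, cand in enumerate(new_peaks):
            # Search window: discard unlikely pairs before comparing spectra.
            if abs(ref["rt1"] - cand["rt1"]) > window[0]:
                continue
            if abs(ref["rt2"] - cand["rt2"]) > window[1]:
                continue
            sim = cosine_similarity(ref["spectrum"], cand["spectrum"])
            if sim >= best_sim:
                best_j, best_sim = j, sim
        if best_j is not None:
            pairs[i] = best_j
    return pairs

# Toy data: two reference peaks, three candidates in a second chromatogram.
ref_peaks = [
    {"rt1": 0.30, "rt2": 0.50, "spectrum": [0.0, 10.0, 1.0, 0.0]},
    {"rt1": 0.70, "rt2": 0.20, "spectrum": [5.0, 0.0, 0.0, 5.0]},
]
new_peaks = [
    {"rt1": 0.72, "rt2": 0.25, "spectrum": [4.0, 0.0, 0.5, 5.0]},
    {"rt1": 0.35, "rt2": 0.55, "spectrum": [0.0, 9.0, 1.5, 0.0]},
    {"rt1": 0.95, "rt2": 0.90, "spectrum": [0.0, 10.0, 1.0, 0.0]},
]
pairs = pair_peaks(ref_peaks, new_peaks)
print(pairs)
```

In this toy data the third candidate has the same spectrum as the first reference peak but lies outside the retention window, so it is never compared; that is the role the abstract assigns to the search windows.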


Subjects
Algorithms , Antibodies, Monoclonal/analysis , Chromatography, Liquid/methods , Peptides/analysis , Mass Spectrometry/methods , Pattern Recognition, Automated , Reference Standards
6.
J Korean Med Sci ; 36(5): e46, 2021 Feb 01.
Article in English | MEDLINE | ID: mdl-33527788

ABSTRACT

BACKGROUND: It is difficult to distinguish subtle differences shown in computed tomography (CT) images of coronavirus disease 2019 (COVID-19) and bacterial pneumonia patients, which often leads to an inaccurate diagnosis. It is desirable to design and evaluate interpretable feature extraction techniques to describe the patient's condition. METHODS: This is a retrospective cohort study of 170 confirmed patients with COVID-19 or bacterial pneumonia acquired at Yeungnam University Hospital in Daegu, Korea. The lung and lesion regions were segmented to crop the lesion into 2D patches to train a classifier model that could differentiate between COVID-19 and bacterial pneumonia. The K-means algorithm was used to cluster deep features extracted by the trained model into 20 groups. Each lesion patch cluster was described by a characteristic imaging term for comparison. For each CT image containing multiple lesions, a histogram of lesion types was constructed using the cluster information. Finally, a Support Vector Machine classifier was trained with the histogram and radiomics features to distinguish diseases and severity. RESULTS: The 20 clusters constructed from 170 patients were reviewed based on common radiographic appearance types. Two clusters showed typical findings of COVID-19, with two other clusters showing typical findings related to bacterial pneumonia. Notably, one cluster showed bilateral diffuse ground-glass opacities (GGOs) in the central and peripheral lungs and was considered to be a key factor for severity classification. The proposed method achieved an accuracy of 91.2% for classifying COVID-19 and bacterial pneumonia patients, and 95% for severity classification. The CT quantitative parameters represented by the values of cluster 8 were correlated with existing laboratory data and clinical parameters.
CONCLUSION: Deep chest CT analysis with constructed lesion clusters revealed well-known COVID-19 CT manifestations comparable to manual CT analysis. The constructed histogram features improved accuracy for both diseases and severity classification, and showed correlations with laboratory data and clinical parameters. The constructed histogram features can provide guidance for improved analysis and treatment of COVID-19.
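
The histogram step in the METHODS above can be sketched as follows. This is only an illustration under stated assumptions: the cluster ids are taken as already assigned (for example by K-means on deep features), the function name `lesion_histogram` is hypothetical, and normalizing counts to fractions is an assumption, since the abstract does not say whether raw counts or fractions were used.

```python
from collections import Counter

N_CLUSTERS = 20  # the abstract reports 20 lesion clusters

def lesion_histogram(cluster_ids, n_clusters=N_CLUSTERS):
    """Return the fraction of a scan's lesion patches that fall in each cluster."""
    counts = Counter(cluster_ids)
    total = len(cluster_ids)
    if total == 0:
        # A scan without lesion patches gets an all-zero feature vector.
        return [0.0] * n_clusters
    return [counts.get(k, 0) / total for k in range(n_clusters)]

# Toy scan with five lesion patches; the ids are illustrative only.
patch_clusters = [8, 8, 3, 8, 15]
hist = lesion_histogram(patch_clusters)
print(hist)
```

Each scan then yields a fixed-length vector regardless of how many lesions it contains, which is what lets a conventional classifier such as the Support Vector Machine named in the abstract take it as input alongside radiomics features.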


Subjects
/diagnostic imaging , Lung/diagnostic imaging , Pneumonia, Bacterial/diagnostic imaging , Tomography, X-Ray Computed , Adult , Aged , Algorithms , Artificial Intelligence , Cluster Analysis , Deep Learning , Female , Humans , Male , Middle Aged , Pattern Recognition, Automated , Reproducibility of Results , Republic of Korea/epidemiology , Retrospective Studies , Severity of Illness Index , Support Vector Machine
7.
Biomed Eng Online ; 20(1): 22, 2021 Feb 17.
Article in English | MEDLINE | ID: mdl-33596908

ABSTRACT

BACKGROUND: The detection and dissection of epidermal subgroups could lead to an improved understanding of skin homeostasis and wound healing. Flow cytometric analysis provides an effective method to detect the surface markers of epidermal cells while producing high-dimensional data files. METHODS: A 9-color flow cytometric panel was optimized to reveal the heterogeneous subgroups in the epidermis of human skin. The subsets of epidermal cells were characterized using automated methods based on dimensional reduction approaches (viSNE) and clustering with Spanning-tree Progression Analysis of Density-normalized Events (SPADE). RESULTS: The manual analysis revealed differences in epidermal distribution between body sites based on a series of biaxial gating steps starting with the expression of CD49f and CD29. The computational analysis with SPADE divided the whole epidermal cell population into 25 clusters according to the surface marker phenotype. This automatic analysis delineated the differences between body sites. The consistency of the results was confirmed with PhenoGraph. CONCLUSION: A multicolor flow cytometry panel with a streamlined computational analysis pipeline is a feasible approach to delineate the heterogeneity of the epidermis in human skin.


Subjects
Epidermis/physiology , Flow Cytometry/methods , Skin/cytology , Algorithms , Cluster Analysis , Color , Computer Simulation , Humans , Machine Learning , Pattern Recognition, Automated , Phenotype , Software
8.
Biomed Eng Online ; 20(1): 23, 2021 Feb 25.
Article in English | MEDLINE | ID: mdl-33632226

ABSTRACT

BACKGROUND: Precise visualization of meshes and their position would greatly aid in mesh shrinkage evaluation, hernia recurrence risk assessment, and the preoperative planning of salvage repair. Lightweight (LW) meshes are able to preserve abdominal wall compliance by generating less post-implantation fibrosis and rigidity. However, conventional 3D imaging techniques such as computed tomography (CT) and magnetic resonance imaging (MRI) cannot visualize LW meshes. Patients sometimes have to undergo a second-look operation to visualize the mesh implants. The goal of this work is to investigate the potential advantages of automated 3D breast ultrasound (ABUS) pore texture analysis for implanted LW hernia mesh identification. METHODS: In vitro, the appearances of four different flat meshes in both ABUS and 2D hand-held ultrasound (HHUS) images were evaluated and compared. In vivo, pore texture patterns of 87 hernia regions were analyzed both in ABUS images and in their corresponding HHUS images. RESULTS: In vitro, ABUS visualized implanted LW meshes much more clearly and effectively than HHUS. In vivo, the inter-class distance of 40 texture features was calculated. The texture features of the 2D sectional planes (axial and sagittal) made no significant contribution to implanted LW mesh identification. A significant contribution was observed in the coronal plane. However, since the mesh may show spatial variation such as shrinkage after implantation surgery, the inter-class distance of the 3D coronal plane pore texture features is larger than that of the 2D coronal plane features, so the 3D coronal plane pore texture features are more valuable than the 2D ones for implanted LW mesh identification. The use of 3D pore texture features significantly improved the robustness of the identification method in distinguishing between LW mesh and fascia.
CONCLUSIONS: ABUS provides additional pore texture visualization by separating the LW mesh from the fascia tissues. Therefore, ABUS has the potential to provide more accurate features to characterize pore texture patterns, and ultimately more accurate measures for implanted LW mesh identification.


Subjects
Abdominal Wall/diagnostic imaging , Hernia/diagnostic imaging , Image Processing, Computer-Assisted/methods , Pattern Recognition, Automated , Surgical Mesh , Adult , Aged , Female , Fourier Analysis , Hernia/therapy , Humans , Imaging, Three-Dimensional , Magnetic Resonance Imaging , Male , Middle Aged , Prostheses and Implants , Ultrasonography
9.
Sensors (Basel) ; 21(4), 2021 Feb 05.
Article in English | MEDLINE | ID: mdl-33562715

ABSTRACT

Sign language is the most important means of communication for hearing-impaired people. Research on sign language recognition can help hearing people understand sign language. We reviewed the classic methods of sign language recognition and found that their recognition accuracy is not high enough because of redundant information, finger occlusion, motion blurring, the diverse signing styles of different people, and so on. To overcome these shortcomings, we propose a multi-scale and dual sign language recognition network (SLR-Net) based on a graph convolutional network (GCN). The original input data were RGB videos. We first extracted the skeleton data from them and then used the skeleton data for sign language recognition. SLR-Net is mainly composed of three sub-modules: a multi-scale attention network (MSA), a multi-scale spatiotemporal attention network (MSSTA) and an attention-enhanced temporal convolution network (ATCN). MSA allows the GCN to learn the dependencies between long-distance vertices; MSSTA can directly learn the spatiotemporal features; ATCN allows the GCN to better learn long temporal dependencies. Three different attention mechanisms (multi-scale, spatiotemporal, and temporal) are proposed to further improve robustness and accuracy. In addition, a keyframe extraction algorithm is proposed, which can greatly improve efficiency at the cost of a small loss in accuracy. Experimental results showed that our method reaches a 98.08% accuracy rate on the CSL-500 dataset with a 500-word vocabulary. Even on the challenging DEVISIGN-L dataset with a 2000-word vocabulary, it reached a 64.57% accuracy rate, outperforming other state-of-the-art sign language recognition methods.


Subjects
Pattern Recognition, Automated , Sign Language , Algorithms , Humans , Motion , Vocabulary
10.
J Chromatogr A ; 1640: 461896, 2021 Mar 15.
Article in English | MEDLINE | ID: mdl-33548825

ABSTRACT

Gas chromatography electron impact ionization mass spectrometry (GC-EI-MS) has been, and remains, the most widely applied analytical technique for metabolomic studies of essential oils. GC-EI-MS analysis of complex samples, such as essential oils, creates a large volume of data. Creating predictive models for such samples and observing patterns within complex data sets presents a significant challenge and requires the application of robust data handling and data analysis methods. Accordingly, a wide variety of software and algorithms has been investigated and developed for this purpose over the years. This review provides an overview and summary of that research effort, and attempts to classify and compare the different data handling and data analysis procedures that have been reported to date in the metabolomic study of essential oils using GC-EI-MS.


Subjects
Data Analysis , Gas Chromatography-Mass Spectrometry/methods , Metabolomics , Oils, Volatile/metabolism , Algorithms , Pattern Recognition, Automated
11.
Phys Rev Lett ; 126(4): 048101, 2021 Jan 29.
Article in English | MEDLINE | ID: mdl-33576647

ABSTRACT

Recent advances in microscopy techniques make it possible to study the growth, dynamics, and response of complex biophysical systems at single-cell resolution, from bacterial communities to tissues and organoids. In contrast to ordered crystals, it is less obvious how one can reliably distinguish two amorphous yet structurally different cellular materials. Here, we introduce a topological earth mover's (TEM) distance between disordered structures that compares local graph neighborhoods of the microscopic cell-centroid networks. Leveraging structural information contained in the neighborhood motif distributions, the TEM metric allows an interpretable reconstruction of equilibrium and nonequilibrium phase spaces and embedded pathways from static system snapshots alone. Applied to cell-resolution imaging data, the framework recovers time ordering without prior knowledge about the underlying dynamics, revealing that fly wing development solves a topological optimal transport problem. Extending our topological analysis to bacterial swarms, we find a universal neighborhood size distribution consistent with a Tracy-Widom law.


Subjects
Models, Theoretical , Pattern Recognition, Automated/methods , Algorithms , Animals , Biophysical Phenomena , Colloids/chemistry , Cryoelectron Microscopy , Drosophila , Entropy , Epithelial Cells/cytology , Image Interpretation, Computer-Assisted/methods , Models, Biological , Models, Chemical , RNA/chemistry
12.
Neural Netw ; 135: 158-176, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33388507

ABSTRACT

The sparse coding algorithm has served as a model for early processing in mammalian vision. It has been assumed that the brain uses sparse coding to exploit statistical properties of the sensory stream. We hypothesize that sparse coding discovers patterns from the data set, which can be used to estimate a set of stimulus parameters by simple readout. In this study, we chose a model of stereo vision to test our hypothesis. We used the Locally Competitive Algorithm (LCA), followed by a naïve Bayes classifier, to infer stereo disparity. From the results we report three observations. First, disparity inference was successful with this naturalistic processing pipeline. Second, an expanded, highly redundant representation is required to robustly identify the input patterns. Third, the inference error can be predicted from the number of active coefficients in the LCA representation. We conclude that sparse coding can generate a suitable general representation for subsequent inference tasks.


Subjects
Algorithms , Data Interpretation, Statistical , Pattern Recognition, Automated/methods , Vision Disparity/physiology , Visual Perception/physiology , Bayes Theorem , Humans , Vision, Ocular/physiology , Visual Cortex/physiology
13.
Neural Netw ; 135: 177-191, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33395588

ABSTRACT

This paper introduces a novel framework for generative models based on Restricted Kernel Machines (RKMs) with joint multi-view generation and uncorrelated feature learning, called Gen-RKM. To enable joint multi-view generation, this mechanism uses a shared representation of data from various views. Furthermore, the model has a primal and dual formulation to incorporate both kernel-based and (deep convolutional) neural network based models within the same setting. When using neural networks as explicit feature-maps, a novel training procedure is proposed, which jointly learns the features and shared subspace representation. The latent variables are given by the eigen-decomposition of the kernel matrix, where the mutual orthogonality of eigenvectors represents the learned uncorrelated features. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of generated samples on various standard datasets.


Subjects
Deep Learning , Neural Networks, Computer , Pattern Recognition, Automated/methods , Photic Stimulation/methods , Humans
14.
Neural Netw ; 135: 201-211, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33401226

ABSTRACT

Discriminative learning based on convolutional neural networks (CNNs) aims to perform image restoration by learning from training examples of noisy-clean image pairs. It has become the go-to methodology for tackling image restoration and has outperformed the traditional non-local class of methods. However, the top-performing networks are generally composed of many convolutional layers and hundreds of neurons, with trainable parameters in excess of several million. We claim that this is due to the inherently linear nature of convolution-based transformation, which is inadequate for handling severe restoration problems. Recently, a non-linear generalization of CNNs, called the operational neural networks (ONN), has been shown to outperform CNN on AWGN denoising. However, its formulation is burdened by a fixed collection of well-known non-linear operators and an exhaustive search to find the best possible configuration for a given architecture, whose efficacy is further limited by a fixed output layer operator assignment. In this study, we leverage the Taylor series-based function approximation to propose a self-organizing variant of ONNs, Self-ONNs, for image restoration, which synthesizes novel nodal transformations on-the-fly as part of the learning process, thus eliminating the need for redundant training runs for operator search. In addition, it enables a finer level of operator heterogeneity by diversifying individual connections of the receptive fields and weights. We perform a series of extensive ablation experiments across three severe image restoration tasks. Even when a strict equivalence of learnable parameters is imposed, Self-ONNs surpass CNNs by a considerable margin across all problems, improving the generalization performance by up to 3 dB in terms of PSNR.
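
One way to picture the Taylor-series idea in this abstract is a convolution in which every kernel position carries a truncated power series instead of a single weight. The sketch below is an illustrative reading, not the paper's formulation: the 1D setting, the function name `self_onn_1d`, the weight layout, and the omission of any activation function are assumptions.

```python
def self_onn_1d(x, weights, bias=0.0):
    """Apply a power-series kernel to a 1D signal (valid positions only).

    weights[q][k] multiplies x[i + k] ** (q + 1); len(weights) is the series
    order Q. With Q = 1 this reduces to an ordinary linear convolution
    (cross-correlation form).
    """
    kernel_size = len(weights[0])
    out = []
    for i in range(len(x) - kernel_size + 1):
        total = bias
        for q, w_q in enumerate(weights):
            for k, w in enumerate(w_q):
                total += w * x[i + k] ** (q + 1)
        out.append(total)
    return out

x = [1.0, 2.0, 3.0]
linear = self_onn_1d(x, [[1.0, 1.0]])                  # Q = 1: plain convolution
quadratic = self_onn_1d(x, [[1.0, 1.0], [0.5, 0.0]])   # Q = 2: adds squared terms
print(linear, quadratic)
```

Because the higher-order weights are learned like any others, the shape of the nodal transformation is fitted during training, which matches the abstract's point that no separate operator search is needed.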


Subjects
Image Processing, Computer-Assisted/methods , Neural Networks, Computer , Pattern Recognition, Automated/methods , Humans , Neurons/physiology , Photic Stimulation/methods
15.
Sensors (Basel) ; 21(2), 2021 Jan 11.
Article in English | MEDLINE | ID: mdl-33440785

ABSTRACT

Graph convolutional networks (GCNs) have brought considerable improvement to the skeleton-based action recognition task. Existing GCN-based methods usually use a fixed spatial graph size across all layers. This severely limits the model's ability to exploit global and semantic discriminative information because of the limited receptive fields. Furthermore, the fixed graph size causes many redundancies in the representation of actions, which is inefficient for the model. The redundancies can also hinder the model from focusing on beneficial features. To address these issues, we propose a plug-and-play channel adaptive merging module (CAMM) specific to the human skeleton graph, which can merge the vertices from the same part of the skeleton graph adaptively and efficiently. The merge weights differ across channels, so every channel has the flexibility to integrate the joints in its own way. We then build a novel shallow graph convolutional network (SGCN) based on the module, which achieves state-of-the-art performance at lower computational cost. Experimental results on NTU-RGB+D and Kinetics-Skeleton illustrate the superiority of our method.


Subjects
Neural Networks, Computer , Skeleton , Humans , Pattern Recognition, Automated
17.
Neural Netw ; 135: 1-12, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33310193

ABSTRACT

Knowledge graph reasoning aims to find reasoning paths for relations over incomplete knowledge graphs (KG). Prior works may not take into account that the rewards for each position (vertex in the graph) may be different. We propose the distance-aware reward in the reinforcement learning framework to assign different rewards for different positions. We observe that KG embeddings are learned from independent triples and therefore cannot fully cover the information described in the local neighborhood. To this effect, we integrate a graph self-attention (GSA) mechanism to capture more comprehensive entity information from the neighboring entities and relations. To let the model remember the path, we incorporate the GSA mechanism with GRU to consider the memory of relations in the path. Our approach can train the agent in one-pass, thus eliminating the pre-training or fine-tuning process, which significantly reduces the problem complexity. Experimental results demonstrate the effectiveness of our method. We found that our model can mine more balanced paths for each relation.


Subjects
Databases, Factual , Deep Learning , Pattern Recognition, Automated/methods , Reinforcement, Psychology , Algorithms , Databases, Factual/trends , Deep Learning/trends , Humans , Knowledge , Pattern Recognition, Automated/trends
18.
Neural Netw ; 135: 68-77, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33360149

ABSTRACT

The goal of n-shot learning is the classification of input data from small datasets. This type of learning is challenging for neural networks, which typically need a large amount of data during the training process. Recent advancements in data augmentation allow us to produce an infinite number of target conditions from the primary condition. This process includes two main steps: finding the best augmentations, and training on the data with the new augmentation techniques. Optimizing these two steps for n-shot learning is still an open problem. In this paper, we propose a new method for auto-augmentation to address both of these problems. The proposed method can potentially extract many possible types of information from a small number of available data points in n-shot learning. The results of our experiments on five prominent n-shot learning datasets show the effectiveness of the proposed method.


Subjects
Databases, Factual , Deep Learning , Neural Networks, Computer , Pattern Recognition, Automated/methods , Photic Stimulation/methods , Databases, Factual/trends , Deep Learning/trends , Humans , Pattern Recognition, Automated/trends
19.
Ultrasonics ; 111: 106326, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33348233

ABSTRACT

Biometric recognition systems based on ultrasonic images have several advantages over other technologies, including the capability of capturing 3D images and detecting liveness. In this work, a recognition system based on hand geometry obtained from ultrasound images is proposed and experimentally evaluated. 3D images of the human hand are acquired by performing parallel mechanical scans with a commercial ultrasound probe. Several 2D images are then extracted at increasing under-skin depths and, from each of them, up to 26 distances among key points of the hand are defined and computed to obtain a 2D template. A 3D template is then obtained by combining, in several ways, the 2D templates of two or more images. A preliminary evaluation of the system is carried out through verification experiments on a home-made database. Results have shown good recognition accuracy: the Equal Error Rate was 1.15% when a single 2D image was used and improved to 0.98% with the 3D template. Finally, the possibility of upgrading the proposed system to a multimodal system, by extracting other features such as palmprint and hand veins from the same volume, is discussed along with other possible improvements.
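
The Equal Error Rate quoted in this abstract is the operating point where the false rejection rate and the false acceptance rate coincide. The sketch below shows one simple way to estimate it from template distances. The scores are invented toy values, the function names are hypothetical, and the threshold scan over observed distances is an assumption rather than the evaluation protocol used in the paper.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FRR, FAR) when a comparison is accepted if distance <= threshold."""
    frr = sum(d > threshold for d in genuine) / len(genuine)
    far = sum(d <= threshold for d in impostor) / len(impostor)
    return frr, far

def equal_error_rate(genuine, impostor):
    """Approximate the EER by scanning every observed distance as a threshold."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        frr, far = error_rates(genuine, impostor, t)
        gap = abs(frr - far)
        if best is None or gap < best[0]:
            # Keep the threshold where FRR and FAR are closest to each other.
            best = (gap, (frr + far) / 2, t)
    return best[1], best[2]

# Toy template distances: same-hand comparisons vs. different-hand comparisons.
genuine = [0.10, 0.20, 0.25, 0.60]
impostor = [0.50, 0.70, 0.80, 0.90]
eer, threshold = equal_error_rate(genuine, impostor)
print(eer, threshold)
```

With real data the two rates rarely match exactly at any observed threshold, so averaging them at the closest point, as done here, is a common approximation.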


Subjects
Biometry/methods , Hand/diagnostic imaging , Imaging, Three-Dimensional/methods , Ultrasonography/methods , Algorithms , Humans , Image Processing, Computer-Assisted/methods , Pattern Recognition, Automated/methods
20.
Neural Netw ; 133: 220-228, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33232858

ABSTRACT

Attribute editing has achieved remarkable progress in recent years owing to the encoder-decoder structure and the generative adversarial network (GAN). However, it remains challenging to generate high-quality images with accurate attribute transformation. To address these problems, this work proposes a novel selective attribute editing model based on a classification adversarial network (referred to as ClsGAN) that strikes a good balance between attribute transfer accuracy and photo-realism. Considering that edited images are prone to be affected by the original attributes because of the skip-connections in the encoder-decoder structure, an upper convolution residual network (referred to as Tr-resnet) is presented to selectively extract information from the source image and target label. In addition, to further improve the transfer accuracy of generated images, an attribute adversarial classifier (referred to as Atta-cls) is introduced to guide the generator from the perspective of attributes by learning the defects of attribute transfer images. Experimental results on CelebA demonstrate that ClsGAN performs favorably against state-of-the-art approaches in image quality and transfer accuracy. Moreover, ablation studies are designed to verify the contribution of Tr-resnet and Atta-cls.


Subjects
Neural Networks, Computer , Pattern Recognition, Automated/classification , Humans , Pattern Recognition, Automated/methods