ABSTRACT
Milling is a main processing mode in modern manufacturing and strongly affects the quality and precision of the machined workpiece. However, it is difficult to monitor tool wear during continuous cutting, especially under variable-speed conditions. Existing tool wear monitoring methods analyze signals only at a constant rotational speed. Unlike these general monitoring methods, this paper puts forward a milling cutter wear monitoring method based on order analysis (OA) and a stacked sparse autoencoder (SSAE). The methodology comprises signal feature extraction and tool wear state monitoring, and it was designed to analyze three-phase spindle current signals instead of the traditional force and vibration signals. The variable-speed signals were transformed into angle-domain stationary signals by order analysis, and the SSAE neural network was used to monitor the tool wear state. The proposed method was verified on laboratory signals; the results showed better performance than the other methods and better applicability to actual industrial manufacturing.
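The order-analysis step described above can be illustrated with a minimal sketch. It assumes the cumulative shaft angle is known at every time sample (for example from a tachometer), which the abstract does not state; the function name `resample_to_angle_domain`, the linear interpolation, and the synthetic run-up are illustrative choices, not the authors' implementation.

```python
import math

def resample_to_angle_domain(signal, angles, samples_per_rev):
    """Resample a time-sampled signal at equal increments of shaft angle.

    signal -- samples taken at a fixed time rate
    angles -- cumulative shaft angle in radians at each sample (strictly increasing)
    """
    step = 2.0 * math.pi / samples_per_rev
    n_out = int((angles[-1] - angles[0]) / step) + 1
    resampled = []
    j = 0
    for k in range(n_out):
        target = angles[0] + k * step
        # Advance to the time interval that brackets the target angle.
        while j < len(angles) - 2 and angles[j + 1] < target:
            j += 1
        weight = (target - angles[j]) / (angles[j + 1] - angles[j])
        weight = min(max(weight, 0.0), 1.0)  # guard against rounding at the ends
        resampled.append(signal[j] + weight * (signal[j + 1] - signal[j]))
    return resampled

# Synthetic run-up (assumed data): shaft speed ramps from 10 Hz to 30 Hz in 1 s,
# and the signal carries a third-order component (3 cycles per revolution).
fs = 10_000
times = [i / fs for i in range(fs + 1)]
angles = [2.0 * math.pi * (10.0 * t + 10.0 * t * t) for t in times]
signal = [math.sin(3.0 * a) for a in angles]

angle_signal = resample_to_angle_domain(signal, angles, samples_per_rev=32)
```

In the time domain the frequency of this component sweeps with the speed; after resampling it repeats exactly three times per revolution, which is the stationarity that order analysis relies on.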
ABSTRACT
Gearbox fault diagnosis based on the analysis of vibration signals has been a major research topic for a few decades due to the advantages of vibration characteristics. Such characteristics are used for early fault detection to guarantee the enhanced safety of complex systems and their cost-effective operation. Many fault diagnosis models have been developed for classifying various fault types in gearboxes. However, the classification results of conventional fault classification models degrade when they are applied to gearbox systems with multi-level tooth cut gear (MTCG) faults operating under variable shaft speeds. These conditions make it difficult to discriminate the gear fault types. Owing to the improved computational capabilities of modern systems, deep neural networks (DNNs) are becoming popular in a variety of research fields, such as image and natural language processing. DNNs are capable of improving classification results even when addressing complex problems such as diagnosing gearbox MTCG faults. In this research, adaptive noise control (ANC) and a stacked sparse autoencoder-based deep neural network (SSA-DNN) are used to construct a sensitive fault diagnosis model that can diagnose a gearbox system with MTCG fault types under varying shaft rotation speeds, despite the complexity of the problem. ANC is applied to the gear vibration signals to remove a significant level of noise along their frequency spectrum and to isolate the most fault-informative components of each fault case. Next, the autoencoder learns the characteristic features of the gear faults from these fault-informative components to separate the fault types considered in this study. Furthermore, the SSA-DNN replaces the separate feature extraction, feature selection, and classification processes of traditional fault diagnosis schemes with a single high-performance unit. 
The experimental results show that the proposed model outperforms conventional methodologies with higher classification accuracy.
ABSTRACT
Automatic detection of left ventricle myocardium is essential to subsequent cardiac image registration and tissue segmentation. However, it is considered challenging mainly because of the complex and varying shape of the myocardium and surrounding tissues across slices and phases. In this study, a hybrid model is proposed to detect myocardium in cardiac magnetic resonance (MR) images combining region proposal and deep feature classification and regression. The model firstly generates candidate regions using new structural similarity-enhanced supervoxel over-segmentation plus hierarchical clustering. Then it adopts a deep stacked sparse autoencoder (SSAE) network to learn the discriminative deep feature to represent the regions. Finally, the features are fed to train a novel nonlinear within-class neighborhood preserved soft margin support vector (C-SVC) classifier and multiple-output support vector ( ε -SVR) regressor for refining the location of myocardium. To improve the stability and generalization, the model also takes hard negative sample mining strategy to fine-tune the SSAE and the classifier. The proposed model with impacts of different components were extensively evaluated and compared to related methods on public cardiac data set. Experimental results verified the effectiveness of proposed integrated components, and demonstrated that it was robust in myocardium localization and outperformed the state-of-the-art methods in terms of typical metrics. This study would be beneficial in some cardiac image processing such as region-of-interest cropping and left ventricle volume measurement.
Subjects
Cardiac Imaging Techniques , Heart Ventricles/diagnostic imaging , Computer-Assisted Image Processing , Myocardium/pathology , Heart Ventricles/pathology , Humans , Magnetic Resonance Imaging/trends , Support Vector Machine
ABSTRACT
Automatic vertebrae localization and identification in medical computed tomography (CT) scans is of great value for computer-aided spine diseases diagnosis. In order to overcome the disadvantages of the approaches employing hand-crafted, low-level features and based on field-of-view priori assumption of spine structure, an automatic method is proposed to localize and identify vertebrae by combining deep stacked sparse autoencoder (SSAE) contextual features and structured regression forest (SRF). The method employs SSAE to learn image deep contextual features instead of hand-crafted ones by building larger-range input samples to improve their contextual discrimination ability. In the localization and identification stage, it incorporates the SRF model to achieve whole spine localization, then screens those vertebrae within the image, thus relieves the assumption that the part of spine in the field of image is visible. In the end, the output distribution of SRF and spine CT scans properties are assembled to develop a two-stage progressive refining strategy, where the mean-shift kernel density estimation and Otsu method instead of Markov random field (MRF) are adopted to reduce model complexity and refine vertebrae localization results. Extensive evaluation was performed on a challenging data set of 98 spine CT scans. Compared with the hidden Markov model and the method based on convolutional neural network (CNN), the proposed approach could effectively and automatically locate and identify spinal targets in CT scans, and achieve higher localization accuracy, low model complexity, and no need for any assumptions about visual field in CT scans.
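The Otsu method named above chooses the threshold that maximizes the between-class variance of a histogram. The sketch below shows that generic technique on made-up integer scores; how the authors apply it to the SRF output distribution is not specified in the abstract, and the function name and data are illustrative assumptions.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t that best separates values <= t from values > t."""
    histogram = [0] * levels
    for v in values:
        histogram[v] += 1
    total = len(values)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(levels):
        weight_low += histogram[t]
        sum_low += t * histogram[t]
        weight_high = total - weight_low
        if weight_low == 0:
            continue  # no values at or below t yet
        if weight_high == 0:
            break  # every value is already in the low class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance = variance
            best_threshold = t
    return best_threshold

# Hypothetical bimodal scores: a weak-response group and a strong-response group.
scores = [10] * 50 + [200] * 50
threshold = otsu_threshold(scores)
```

Because the threshold is derived from the histogram alone, no pairwise model between neighboring candidates is needed, which is consistent with the abstract's aim of lower model complexity than an MRF.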
Subjects
Computer Neural Networks , Automated Pattern Recognition/methods , Computer-Assisted Radiographic Image Interpretation/methods , Spine/diagnostic imaging , X-Ray Computed Tomography , Humans , Spinal Diseases/diagnostic imaging
ABSTRACT
The feasibility of using the Fourier transform infrared (FTIR) spectroscopic technique with a stacked sparse auto-encoder (SSAE) to identify orchid varieties was studied. Spectral data of 13 orchid varieties covering the spectral range of 4000-550 cm⁻¹ were acquired to establish discriminant models and to select optimal spectral variables. K nearest neighbors (KNN), support vector machine (SVM), and SSAE models were built using full spectra. The SSAE model performed better than the KNN and SVM models and obtained a classification accuracy of 99.4% in the calibration set and 97.9% in the prediction set. Then, three algorithms, principal component analysis loading (PCA-loading), competitive adaptive reweighted sampling (CARS), and stacked sparse auto-encoder guided backward (SSAE-GB), were used to select 39, 300, and 38 optimal wavenumbers, respectively. The KNN and SVM models were then built on the optimal wavenumbers. Most of the models based on optimal wavenumbers performed slightly better than the models based on all wavenumbers. SSAE-GB performed better than the other two selection algorithms in terms of both the accuracy of the discriminant models and the number of optimal wavenumbers. The results of this study showed that the FTIR spectroscopic technique combined with the SSAE algorithm could be adopted for the identification of orchid varieties.
Subjects
Orchidaceae/chemistry , Orchidaceae/classification , Fourier Transform Infrared Spectroscopy , Algorithms , Theoretical Models , Reproducibility of Results , Spectrum Analysis , Support Vector Machine
ABSTRACT
Brain tumor detection is a difficult task because of variations in tumor shape, size, and appearance. In this manuscript, a deep learning model is deployed to classify input slices as tumor (unhealthy) or non-tumor (healthy). A high-pass filtered image is used to make the inhomogeneity field effect of the MR slices more prominent and is fused with the input slices. A median filter is then applied to the fused slices. The quality of the resulting slices is improved through smoothing and highlighted edges. After that, based on the intensity of these slices, a 4-connected seed growing algorithm is applied, in which an optimal threshold clusters similar pixels from the input slices. The segmented slices are then supplied to the proposed fine-tuned two-layer stacked sparse autoencoder (SSAE) model. The hyperparameters of the model were selected after extensive experiments: 200 hidden units are used at the first layer and 400 at the second. Testing is performed at the softmax layer, which predicts whether an image contains a tumor. The suggested model is trained and validated on the BRATS datasets, i.e., 2012 (challenge and synthetic), 2013, 2013 Leaderboard, 2014, and 2015. The presented model is evaluated with a number of performance metrics, which demonstrate improved performance.
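A 4-connected seed growing step of the kind mentioned above can be sketched as a breadth-first flood from a seed pixel. The acceptance rule used here (absolute intensity difference from the seed within a fixed threshold) is an assumption for illustration; the abstract does not say how the optimal threshold is chosen or applied, and the toy image is made up.

```python
from collections import deque

def region_grow(image, seed, threshold):
    """Grow a region from `seed` over 4-connected pixels of similar intensity."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # 4-connectivity: only the up, down, left and right neighbors are visited.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= threshold:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region

# Toy 3x4 "slice" with a dark area on the left and a bright band through the middle.
image = [
    [10, 11, 50, 52],
    [12, 10, 51, 12],
    [11, 49, 50, 11],
]
region = region_grow(image, seed=(0, 0), threshold=5)
```

The dark pixels at (1, 3) and (2, 3) are similar in intensity to the seed but are cut off by the bright band, so a 4-connected search does not reach them.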
Subjects
Algorithms , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/pathology , Deep Learning , Computer-Assisted Diagnosis/methods , Humans , Computer-Assisted Image Processing/methods
ABSTRACT
In this paper, a building extraction method is proposed based on a stacked sparse autoencoder with an optimized structure and optimized training samples. Building extraction plays an important role in urban construction and planning. However, some negative effects reduce the accuracy of extraction, such as exceeding resolution, bad correction and terrain influence. Data collected by multiple sensors, such as light detection and ranging (LIDAR) and optical sensors, are used to improve the extraction. Using a digital surface model (DSM) obtained from LIDAR data together with optical images, traditional methods can improve the extraction to a certain extent, but they have some defects in feature extraction. Since a stacked sparse autoencoder (SSAE) neural network can learn the essential characteristics of the data in depth, an SSAE was employed to extract buildings from the combined DSM data and optical image. A better strategy for setting the SSAE network structure is given, and an approach to setting the number and proportion of training samples for better training of the SSAE is presented. The optical data and DSM were combined as the input of the optimized SSAE; after training on the optimized samples, the resulting network extracts buildings with high accuracy and good robustness.
ABSTRACT
Hearing loss, a partial or total inability to hear, is also known as hearing impairment. Untreated hearing loss can harm normal social communication, and it can cause psychological problems in patients. Therefore, we design a three-category classification system to detect the specific category of hearing loss, which helps patients receive timely treatment. Before the training and test stages, we use data augmentation to produce a balanced dataset. Then we use a deep autoencoder neural network to classify magnetic resonance brain images. In the deep autoencoder stage, we use a stacked sparse autoencoder to generate visual features and a softmax layer to classify the brain images into three categories of hearing loss. The overall accuracy of our method is 99.5%, and the processing time is 0.078 s per brain image. Our proposed method based on the stacked sparse autoencoder works well in the classification of hearing loss images. Its overall accuracy is 4% higher than the best of the state-of-the-art approaches.
Subjects
Hearing Loss , Algorithms , Brain , Humans , Computer-Assisted Image Interpretation , Computer-Assisted Image Processing , Magnetic Resonance Imaging , Computer Neural Networks , Reproducibility of Results
ABSTRACT
Facial expression is a type of communication and is useful in many areas of computer vision, including intelligent visual surveillance, human-robot interaction and human behavior analysis. A deep learning approach is presented to classify happy, sad, angry, fearful, contemptuous, surprised and disgusted expressions. Accurate detection and classification of human facial expression is a critical task in image processing due to the inconsistencies amid the complexity, including change in illumination, occlusion, noise and the over-fitting problem. A stacked sparse auto-encoder for facial expression recognition (SSAE-FER) is used for unsupervised pre-training and supervised fine-tuning. SSAE-FER automatically extracts features from input images, and the softmax classifier is used to classify the expressions. Our method achieved an accuracy of 92.50% on the JAFFE dataset and 99.30% on the CK+ dataset. SSAE-FER performs well compared to the other comparative methods in the same domain.
Subjects
Deep Learning , Facial Recognition , Humans , Communication , Fear , Computer-Assisted Image Processing
ABSTRACT
Zinc (Zn) content plays a decisive role in plant growth. Accurate management of Zn fertilizer application can promote high-quality development of the oilseed rape industry. This study adopted a deep learning (DL) method to predict the Zn content of oilseed rape leaves using hyperspectral imaging (HSI). The dropout mechanism was introduced to improve the stacked sparse autoencoder (SSAE) and named modified SSAE (MSSAE). MSSAE extracted deep spectral features of samples based on pixel-level spectral information (the wavelength range of the spectrum is 431-962 nm). Subsequently, the deep spectral features were applied as the inputs for support vector regression (SVR) and least squares support vector regression (LSSVR) to predict the Zn content in oilseed rape leaves. In addition, the successive projections algorithm (SPA) and the variable iterative space shrinkage approach (VISSA) were investigated as wavelength selection algorithms for comparison. The results showed that the MSSAE-LSSVR model had the best prediction performance (the coefficient of determination (R2) and root mean square error (RMSE) of the prediction set were 0.9566 and 1.0240 mg/kg, respectively). The overall results showed that the MSSAE was able to extract the deep features of HSI data and validated the possibility of HSI combined with a DL method for nondestructive testing of Zn content in oilseed rape leaves.
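The two figures of merit quoted above, the coefficient of determination (R2) and the root mean square error (RMSE), follow directly from their standard definitions. The sketch below uses made-up reference and predicted values purely to show the arithmetic; they are not data from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical Zn contents in mg/kg (illustrative values only).
measured = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 17.0]

error = rmse(measured, predicted)            # sqrt(3 / 4)
fit_quality = r_squared(measured, predicted)  # 1 - 3 / 20
```

RMSE carries the unit of the measured quantity (mg/kg here), while R2 is dimensionless, which is why the two are usually reported together.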
Subjects
Brassica napus , Hyperspectral Imaging , Algorithms , Least-Squares Analysis , Plant Leaves , Support Vector Machine , Vegetables , Zinc
ABSTRACT
OBJECTIVES: Computer-aided pathological voice detection is efficient for the initial screening of pathological voice and has received considerable academic and clinical attention. This paper proposes an automatic method for diagnosing pathological voice based on a deep neural network (DNN). Two other classification models (support vector machines and random forests) were used to verify the effectiveness of the DNN. METHODS: We extracted 12 Mel frequency cepstral coefficients from each voice sample as raw features. The constructed DNN consists of a two-layer stacked sparse autoencoder network and a softmax layer. The stacked sparse autoencoder layers learn high-level features from the raw Mel frequency cepstral coefficient features. The softmax layer then diagnoses pathological voice according to the high-level features. The DNN and the two comparison models used the same training set and test set for the experiment. RESULTS: Experimental results reveal that the sensitivity, specificity, precision, accuracy, and F1 score of the DNN reach 97.8%, 99.4%, 99.4%, 98.6%, and 98.4%, respectively. These five indexes are at least 6.2%, 5%, 5.6%, 5.7%, and 6.2% higher than those of the comparison models (support vector machine and random forest). CONCLUSIONS: The proposed DNN can learn advanced features from raw acoustic features and distinguish pathological voice from healthy voice. Building on this preliminary study, future studies can further explore the application of DNNs in other experiments and in clinical practice.
Subjects
Computer Neural Networks , Voice , Acoustics , Humans , Support Vector Machine
ABSTRACT
The intima-media thickness (IMT) of a common carotid artery in an ultrasound image is considered an important indicator of the onset of atherosclerosis. However, it is challenging to segment the intima-media complex (IMC) directly in ultrasound images. This study proposes a fully automatic method to segment the IMC on longitudinal B-mode ultrasound images. Our method consists of two stages: (i) extraction of the region of interest with a continuous max-flow algorithm and region-of-interest reconstruction using a stacked sparse auto-encoder model, and (ii) IMC segmentation using a trained random forest classifier. The proposed method has been tested on three databases from three different imaging centres, comprising a total of 228 ultrasound images of the common carotid artery. On the three databases, our method yields mean absolute errors of 0.028 ± 0.016 mm, 0.579 ± 0.288 pixel and 0.582 ± 0.341 pixel; polyline distance (PD) measures of 0.026 ± 0.017 mm, 0.657 ± 0.275 pixel and 0.731 ± 0.282 pixel; Hausdorff distance measures of 0.249 ± 0.101 mm, 4.760 ± 1.085 pixels and 5.825 ± 2.059 pixels; and correlation coefficients of 95.19%, 93.79%, and 98.96%, respectively. These results indicate that the proposed method performs well in segmentation of the IMC and measurement of the IMT.
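Of the boundary measures reported above, the Hausdorff distance is the simplest to state: it is the largest distance from any point on one contour to its nearest point on the other, taken in both directions. The sketch below applies the standard definition to two small made-up point sets; the exact contour sampling used in the study is not given in the abstract.

```python
import math

def directed_hausdorff(points_a, points_b):
    """Largest distance from a point in A to its nearest point in B."""
    return max(
        min(math.hypot(ax - bx, ay - by) for bx, by in points_b)
        for ax, ay in points_a
    )

def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(
        directed_hausdorff(points_a, points_b),
        directed_hausdorff(points_b, points_a),
    )

# Hypothetical boundary samples in pixel coordinates (illustrative only).
manual_contour = [(0, 0), (1, 0), (2, 0)]
automatic_contour = [(0, 1), (1, 0), (2, 3)]

distance = hausdorff(manual_contour, automatic_contour)
```

Because it reports the single worst mismatch, the Hausdorff distance is typically larger than the mean absolute error on the same contours, as in the figures quoted in the abstract.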
Subjects
Carotid Arteries/diagnostic imaging , Carotid Intima-Media Thickness , Carotid Arteries/physiology , Humans , Regional Blood Flow , Ultrasonography/methods
ABSTRACT
Pregnancy is a complex process, and the prediction of premature birth is uncertain. Many researchers are exploring non-invasive approaches to enhance its predictability. Currently, electrohysterogram (EHG) and tocography (TOCO) signals provide a real-time, non-invasive means of predicting preterm birth. For this purpose, a sparse autoencoder (SAE) based deep neural network (SAE-based DNN) is developed. The deep neural network has three layers: a stacked sparse autoencoder (SSAE) network with two hidden layers and one final softmax layer. The bursts of all 26 recordings of the publicly available TPEHGT DS database, corresponding to uterine contraction intervals and non-contraction intervals (dummy intervals), were manually segmented. Twenty features were extracted by two feature extraction algorithms, sample entropy and wavelet entropy. Afterwards, the SSAE network is adopted to learn high-level features from the raw features by unsupervised learning. The softmax layer is added on top of the SSAE network for classification. To verify the effectiveness of the proposed method, this study used 10-fold cross-validation and four indicators to evaluate classification performance. Experimental results show that the deep neural network achieves a sensitivity of 98.2%, a specificity of 97.74%, and an accuracy of 97.9% on the publicly available TPEHGT DS database. The deep neural network outperforms the comparison models, including deep belief networks (DBN) and the hierarchical extreme learning machine (H-ELM). Finally, the experimental results reveal that the proposed method can be validly applied to semi-automatic identification of term and preterm uterine recordings.
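Sample entropy, one of the two feature extractors named above, measures how often short patterns that match within a tolerance continue to match when extended by one sample. The sketch below follows the standard definition with embedding dimension `m` and absolute tolerance `r`; these parameter values and the two test series are assumptions, since the abstract does not state the settings used on the EHG and TOCO bursts.

```python
import math
import random

def sample_entropy(series, m=2, r=0.2):
    """Sample entropy -ln(A / B), excluding self-matches."""
    n = len(series)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between the two templates.
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b = count_matches(m)       # matching pairs of length m
    a = count_matches(m + 1)   # matching pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")    # undefined for this series and tolerance
    return -math.log(a / b)

periodic = [0.0, 1.0] * 20                        # perfectly regular
rng = random.Random(0)
irregular = [rng.random() for _ in range(200)]    # uniform noise

regular_value = sample_entropy(periodic)
irregular_value = sample_entropy(irregular)
```

A perfectly repeating series gives a value of zero because every pattern that matches keeps matching, whereas noise gives a clearly positive value; this contrast is what makes the measure usable as a signal feature.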
Subjects
Premature Birth , Algorithms , Factual Databases , Female , Humans , Newborn Infant , Computer Neural Networks , Pregnancy , Premature Birth/diagnosis , Uterus/diagnostic imaging
ABSTRACT
Malignant mesothelioma (MM) is a rare but aggressive cancer. The definitive diagnosis of MM is critical for effective treatment and has important medicolegal significance. However, the definitive diagnosis of MM is challenging due to its composite epithelial/mesenchymal pattern. The aim of the current study was to develop a deep learning method to automatically diagnose MM. A retrospective analysis of 324 participants with or without MM was performed. Significant features were selected using a genetic algorithm (GA) or a ReliefF algorithm performed in MATLAB software. Subsequently, the current study constructed and trained several models based on a backpropagation (BP) algorithm, extreme learning machine algorithm and stacked sparse autoencoder (SSAE) to diagnose MM. A confusion matrix, F-measure and a receiver operating characteristic (ROC) curve were used to evaluate the performance of each model. A total of 34 potential variables were analyzed, while the GA and ReliefF algorithms selected 19 and 5 effective features, respectively. The selected features were used as the inputs of the three models. SSAE and GA+SSAE demonstrated the highest performance in terms of classification accuracy, specificity, F-measure and the area under the ROC curve. Overall, the GA+SSAE model was the preferred model since it required a shorter CPU time and fewer variables. Therefore, the SSAE with GA feature selection was selected as the most accurate model for the diagnosis of MM. The deep learning methods developed based on the GA+SSAE model may assist physicians with the diagnosis of MM.
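The evaluation measures listed above (accuracy, specificity and F-measure) can all be read off a binary confusion matrix. The sketch below uses the standard definitions with made-up counts that are not taken from the study; the function name `binary_metrics` is a hypothetical helper.

```python
def binary_metrics(tp, fn, fp, tn):
    """Derive common diagnostic measures from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)              # positive predictive value
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }

# Hypothetical counts: 50 diseased and 50 healthy cases (illustrative only).
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
```

The F-measure is the harmonic mean of precision and sensitivity, so it equals 2·TP / (2·TP + FP + FN), which gives 80/95 for these counts.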
ABSTRACT
A novel method to determine the Grade Group (GG) in prostate cancer (PCa) using multi-parametric magnetic resonance imaging (mpMRI) biomarkers is investigated in this paper. In this method, high-level features are extracted from hand-crafted texture features using a deep network of stacked sparse autoencoders (SSAE) and classified using a softmax classifier (SMC). Transaxial T2 Weighted (T2W), Apparent Diffusion Coefficient (ADC) and high B-Value Diffusion-Weighted (BVAL) images obtained from the PROSTATEx-2 2017 challenge dataset are used in this technique. The method was evaluated on the challenge dataset, composed of a training set of 112 lesions and a test set of 70 lesions. It achieved a quadratic-weighted kappa score of 0.2772 on the test dataset of the challenge. It also reached a Positive Predictive Value (PPV) of 80% in predicting PCa with GG > 1. The method achieved first place in the challenge, winning over 43 methods submitted by 21 groups. A 3-fold cross-validation using the training data of the challenge was further performed, and the method achieved a quadratic-weighted kappa score of 0.2326 and a PPV of 80.26% in predicting PCa with GG > 1. Even though the training dataset is highly imbalanced, the method was able to achieve a fair kappa score. As one of the pioneering methods that attempt to classify prostate cancer into 5 grade groups from MRI images, it could serve as a base method for further investigations and improvements.
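The quadratic-weighted kappa used as the challenge score above compares observed disagreement with the disagreement expected by chance, weighting each disagreement by the squared distance between the two grades. The sketch below follows the standard definition for grades coded 0 to 4; the grade sequences are made up and the function name is illustrative.

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_classes):
    """Quadratic-weighted Cohen's kappa for integer labels 0 .. n_classes - 1."""
    n = len(rater_a)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            expected = hist_a[i] * hist_b[j] / n   # chance agreement count
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    return 1.0 - weighted_observed / weighted_expected

# Hypothetical grade groups, coded 0-4 (illustrative only).
reference = [0, 1, 2, 3, 4]
perfect = quadratic_weighted_kappa(reference, [0, 1, 2, 3, 4], n_classes=5)
reversed_grades = quadratic_weighted_kappa(reference, [4, 3, 2, 1, 0], n_classes=5)
```

A score of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement; large grade errors are penalized far more heavily than adjacent-grade errors.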
Subjects
Computer-Assisted Image Interpretation/methods , Magnetic Resonance Imaging/methods , Neoplasm Grading/methods , Prostatic Neoplasms/pathology , Algorithms , Humans , Male , Prostate/diagnostic imaging
ABSTRACT
BACKGROUND AND OBJECTIVE: Cancer has become a complex health problem due to its high mortality. Over the past few decades, with the rapid development of the high-throughput sequencing technology and the application of various machine learning methods, remarkable progress in cancer research has been made based on gene expression data. At the same time, a growing amount of high-dimensional data has been generated, such as RNA-seq data, which calls for superior machine learning methods able to deal with mass data effectively in order to make accurate treatment decision. METHODS: In this paper, we present a semi-supervised deep learning strategy, the stacked sparse auto-encoder (SSAE) based classification, for cancer prediction using RNA-seq data. The proposed SSAE based method employs the greedy layer-wise pre-training and a sparsity penalty term to help capture and extract important information from the high-dimensional data and then classify the samples. RESULTS: We tested the proposed SSAE model on three public RNA-seq data sets of three types of cancers and compared the prediction performance with several commonly-used classification methods. The results indicate that our approach outperforms the other methods for all the three cancer data sets in various metrics. CONCLUSIONS: The proposed SSAE based semi-supervised deep learning model shows its promising ability to process high-dimensional gene expression data and is proved to be effective and accurate for cancer prediction.
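The sparsity penalty term mentioned above is commonly implemented as a Kullback-Leibler divergence between a small target activation and the mean activation of each hidden unit. The abstract does not say which form the authors used, so the sketch below shows that common choice as an assumption; the target `rho` and the example activations are illustrative.

```python
import math

def kl_sparsity_penalty(mean_activations, rho=0.05):
    """Sum over hidden units of KL(rho || rho_hat) for Bernoulli distributions.

    mean_activations -- average activation of each hidden unit over the
                        training batch, each strictly between 0 and 1
    rho              -- target average activation (small means sparse)
    """
    penalty = 0.0
    for rho_hat in mean_activations:
        penalty += rho * math.log(rho / rho_hat)
        penalty += (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat))
    return penalty

# Units that already fire at the target rate incur no penalty ...
on_target = kl_sparsity_penalty([0.05, 0.05, 0.05])
# ... while units that are active half of the time are penalized.
too_active = kl_sparsity_penalty([0.5, 0.5, 0.5])
```

In training, this penalty is typically scaled by a weighting coefficient and added to the reconstruction error of each autoencoder layer during greedy layer-wise pre-training.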
Subjects
Breast Neoplasms/diagnosis , Deep Learning , Computer-Assisted Diagnosis/methods , Neoplastic Gene Expression Regulation , Lung Neoplasms/diagnosis , Stomach Neoplasms/diagnosis , Supervised Machine Learning , Algorithms , Breast Neoplasms/genetics , False Positive Reactions , Female , Humans , Lung Neoplasms/genetics , Male , Automated Pattern Recognition , RNA , Reproducibility of Results , RNA Sequence Analysis , Software , Stomach Neoplasms/genetics
ABSTRACT
Early diagnosis remains a significant challenge for many neurological disorders, especially for rare disorders where studying large cohorts is not possible. A novel solution that investigators have undertaken is combining advanced machine learning algorithms with resting-state functional Magnetic Resonance Imaging to unveil hidden pathological brain connectome patterns to uncover diagnostic and prognostic biomarkers. Recently, state-of-the-art deep learning techniques are outperforming traditional machine learning methods and are hailed as a milestone for artificial intelligence. However, whole brain classification that combines brain connectome with deep learning has been hindered by insufficient training samples. Inspired by the transfer learning strategy employed in computer vision, we exploited previously collected resting-state functional MRI data for healthy subjects from existing databases and transferred this knowledge for new disease classification tasks. We developed a deep transfer learning neural network (DTL-NN) framework for enhancing the classification of whole brain functional connectivity patterns. Briefly, we trained a stacked sparse autoencoder (SSAE) prototype to learn healthy functional connectivity patterns in an offline learning environment. Then, the SSAE prototype was transferred to a DTL-NN model for a new classification task. To test the validity of our framework, we collected resting-state functional MRI data from the Autism Brain Imaging Data Exchange (ABIDE) repository. Using autism spectrum disorder (ASD) classification as a target task, we compared the performance of our DTL-NN approach with a traditional deep neural network and support vector machine models across four ABIDE data sites that enrolled at least 60 subjects. As compared to traditional models, our DTL-NN approach achieved an improved performance in accuracy, sensitivity, specificity and area under receiver operating characteristic curve. 
These findings suggest that DTL-NN approaches could enhance disease classification for neurological conditions, where accumulating large neuroimaging datasets has been challenging.
ABSTRACT
Investigation of the brain's functional connectome can improve our understanding of how an individual brain's organizational changes influence cognitive function and could result in improved individual risk stratification. Brain connectome studies in adults and older children have shown that abnormal network properties may be useful as discriminative features and have exploited machine learning models for early diagnosis in a variety of neurological conditions. However, analogous studies in neonates are rare and have yielded few significant findings. In this paper, we propose an artificial neural network (ANN) framework for early prediction of cognitive deficits in very preterm infants based on functional connectome data from resting state fMRI. Specifically, we conducted feature selection via a stacked sparse autoencoder and outcome prediction via a support vector machine (SVM). The proposed ANN model was trained in an unsupervised manner using brain connectome data from 884 subjects in the Autism Brain Imaging Data Exchange database, and the SVM was cross-validated on 28 very preterm infants (born at 23-31 weeks of gestation and without brain injury; scanned at term-equivalent postmenstrual age). Using 90 regions of interest, we found that the ANN model applied to functional connectome data from very premature infants can predict cognitive outcome at 2 years of corrected age with an accuracy of 70.6% and an area under the receiver operating characteristic curve of 0.76. We also noted that several frontal lobe and somatosensory regions contributed significantly to the prediction of cognitive deficits 2 years later. Our work can be considered a proof of concept for utilizing ANN models on functional connectome data to capture the individual variability inherent in the developing brains of preterm infants. The full potential of ANNs will be realized, and more robust conclusions drawn, when they are applied to much larger neuroimaging datasets, as we plan to do.
Subjects
Brain/physiopathology , Cognition Disorders/diagnosis , Connectome , Extremely Premature Infant , Computer Neural Networks , Brain/diagnostic imaging , Female , Gestational Age , Humans , Computer-Assisted Image Processing , Infant , Magnetic Resonance Imaging , Male , Neuropsychological Tests , Oxygen/blood , Support Vector Machine
ABSTRACT
PURPOSE: Wireless capsule endoscopy (WCE) enables physicians to examine the digestive tract without any surgical operations, at the cost of a large volume of images to be analyzed. In the computer-aided diagnosis of WCE images, the main challenge arises from the difficulty of robust characterization of images. This study aims to provide discriminative description of WCE images and assist physicians to recognize polyp images automatically. METHODS: We propose a novel deep feature learning method, named stacked sparse autoencoder with image manifold constraint (SSAEIM), to recognize polyps in the WCE images. Our SSAEIM differs from the traditional sparse autoencoder (SAE) by introducing an image manifold constraint, which is constructed by a nearest neighbor graph and represents intrinsic structures of images. The image manifold constraint enforces that images within the same category share similar learned features and images in different categories should be kept far away. Thus, the learned features preserve large intervariances and small intravariances among images. RESULTS: The average overall recognition accuracy (ORA) of our method for WCE images is 98.00%. The accuracies for polyps, bubbles, turbid images, and clear images are 98.00%, 99.50%, 99.00%, and 95.50%, respectively. Moreover, the comparison results show that our SSAEIM outperforms existing polyp recognition methods with relative higher ORA. CONCLUSION: The comprehensive results have demonstrated that the proposed SSAEIM can provide descriptive characterization for WCE images and recognize polyps in a WCE video accurately. This method could be further utilized in the clinical trials to help physicians from the tedious image reading work.