ABSTRACT
Ultra-wideband (UWB) signals, which occupy higher frequency bands than other wireless communication protocols, enable highly accurate location estimation. The waveform of the channel impulse response (CIR) from UWB transceivers can also be used to estimate the number of people in a room. In this paper, we apply deep neural networks to UWB CIR signals for the purpose of estimating the number of people in a room. We focus especially on empirically investigating various network architectures for classification from a single UWB CIR sample, as well as from various ensemble configurations. We present our processes for acquiring and preprocessing CIR data, our designs of the different network architectures and ensembles that were applied, and the comparative experimental evaluations. We demonstrate that deep neural networks can accurately classify the number of people in line-of-sight (LoS) conditions, achieving 99% accuracy while remaining efficient with respect to both memory size and FLOPs (floating-point operations).
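As a concrete illustration of the single-CIR classification setting, the following is a minimal PyTorch sketch of a compact 1D CNN over raw CIR waveforms. The input length (1024 samples) and the class count (six classes, for 0-5 people) are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of a 1D-CNN classifier over UWB CIR waveforms (PyTorch).
# The CIR length (1024) and number of classes (6) are illustrative assumptions.
import torch
import torch.nn as nn

class CIRClassifier(nn.Module):
    def __init__(self, num_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),          # global pooling keeps the model small
        )
        self.head = nn.Linear(64, num_classes)

    def forward(self, x):                     # x: (batch, 1, cir_len)
        return self.head(self.features(x).squeeze(-1))

logits = CIRClassifier()(torch.randn(8, 1, 1024))  # (8, 6) class scores
```

An ensemble configuration could then, for instance, average the softmax outputs of several such networks over consecutive CIR measurements.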
Subjects
Deep Learning; Humans; Communication; Neural Networks, Computer

ABSTRACT
BACKGROUND: Although ophthalmic devices have made remarkable progress and are widely used, most lack standardization of both image review and results reporting, making interoperability unachievable. We developed and validated new software for extracting, transforming, and storing information from report images produced by ophthalmic examination devices to generate standardized, structured, and interoperable information to assist ophthalmologists in eye clinics. RESULTS: We selected report images derived from optical coherence tomography (OCT). The new software consists of three parts: (1) the Area Explorer, which determines whether a designated area in the configuration file contains numeric values or tomographic images; (2) the Value Reader, which converts images to text according to ophthalmic measurements; and (3) the Finding Classifier, which classifies pathologic findings from the tomographic images included in the report. After human experts assessed the accuracy of the Value Reader, all report images were converted and stored in a database. We applied the Value Reader, which achieved 99.67% accuracy, to a total of 433,175 OCT report images acquired in a single tertiary hospital from 07/04/2006 to 08/31/2019. The Finding Classifier provided pathologic findings (e.g., macular edema and subretinal fluid) and disease activity. Patient longitudinal data could be easily reviewed to document changes in measurements over time. The final results were loaded into a common data model (CDM), and the cropped tomographic images were loaded into the Picture Archiving and Communication System (PACS). CONCLUSIONS: The newly developed software extracts valuable information from OCT report images and may be extended to other types of report image files produced by medical devices. Furthermore, powerful databases such as the CDM may be implemented or augmented by adding the information captured through our program.
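To make the three-part flow concrete, here is a hedged sketch of how such a report parser might be wired together. The region names, coordinates, configuration format, and the use of Tesseract OCR as the Value Reader are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch of the report-parsing flow (Area Explorer -> Value Reader /
# Finding Classifier). All region names and coordinates are hypothetical.
from PIL import Image
import pytesseract

CONFIG = {
    "central_subfield_um": {"box": (820, 310, 960, 350), "kind": "numeric"},
    "b_scan":              {"box": (80, 400, 780, 900),  "kind": "image"},
}

def parse_report(path, finding_classifier):
    report = Image.open(path)
    result = {}
    for name, area in CONFIG.items():
        crop = report.crop(area["box"])
        if area["kind"] == "numeric":
            # Value Reader role: convert the cropped measurement to text
            result[name] = pytesseract.image_to_string(crop, config="--psm 7").strip()
        else:
            # Finding Classifier role: label the cropped tomographic image
            result[name] = finding_classifier(crop)  # e.g. "macular edema"
    return result
```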
Subjects
Macular Edema; Humans; Software; Tomography, Optical Coherence

ABSTRACT
The aim of this study was to introduce a novel vector field analysis for the quantitative measurement of retinal displacement after epiretinal membrane (ERM) removal. We developed a novel framework to measure retinal displacement from retinal fundus images as follows: (1) rigid registration of preoperative retinal fundus images in reference to postoperative retinal fundus images, (2) extraction of retinal vessel segmentation masks from these retinal fundus images, (3) non-rigid registration of preoperative vessel masks in reference to postoperative vessel masks, and (4) calculation of the transformation matrix required for non-rigid registration at each pixel. These pixel-wise vector field results were summarized according to 24 predefined sectors after standardization. We applied this framework to 20 patients who underwent ERM removal to obtain their retinal displacement vector fields between retinal fundus images taken preoperatively and at 1, 4, 10, and 22 months postoperatively. The mean direction of the displacement vectors was nasal. The mean standardized magnitudes of retinal displacement between the preoperative and 1-month, 1- and 4-month, 4- and 10-month, and 10- and 22-month postoperative images were 38.6, 14.9, 7.6, and 5.4, respectively. In conclusion, the proposed method provides a computerized, reproducible, and scalable way to analyze structural changes in the retina, together with a powerful visualization tool. Retinal structural changes were mostly concentrated in the early postoperative period and tended to move nasally.
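As a sketch of step (4) and the sector summary, the following shows one way to reduce a per-pixel displacement field to 24 sector means. The fovea-centred polar layout with equal angular wedges is an illustrative assumption about how the sectors are defined.

```python
# Hedged sketch: summarizing a pixel-wise displacement field into 24 sectors.
import numpy as np

def sector_summary(flow, center, n_sectors=24):
    """flow: (H, W, 2) per-pixel displacements from non-rigid registration;
    center: (cx, cy) assumed sector origin, e.g. the fovea."""
    h, w = flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    angle = np.arctan2(ys - center[1], xs - center[0])   # pixel angle around center
    sector = ((angle + np.pi) / (2 * np.pi) * n_sectors).astype(int) % n_sectors
    mags = np.linalg.norm(flow, axis=-1)                 # displacement magnitudes
    return np.array([mags[sector == s].mean() for s in range(n_sectors)])
```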
Subjects
Epiretinal Membrane; Humans; Epiretinal Membrane/surgery; Visual Acuity; Retina/diagnostic imaging; Retina/surgery; Retinal Vessels; Fundus Oculi; Vitrectomy; Tomography, Optical Coherence/methods; Retrospective Studies

ABSTRACT
Ultra-widefield (UWF) retinal imaging is a pivotal modality for detecting major eye diseases such as diabetic retinopathy and retinal detachment. However, UWF imaging has a well-documented limitation of low resolution and artifacts in the macular area, constraining its clinical diagnostic accuracy, particularly for macular diseases such as age-related macular degeneration. Conventional supervised super-resolution techniques address this limitation by enhancing the resolution of the macular region using meticulously paired and aligned fundus image ground truths. However, obtaining such refined paired ground truths is a formidable challenge. To tackle this issue, we propose an unpaired, degradation-aware super-resolution technique for enhancing UWF retinal images. Our approach leverages recent advances in deep learning, specifically generative adversarial networks and attention mechanisms. Notably, our method enhances and super-resolves UWF images without relying on paired, clean ground truths. Through extensive experimentation and evaluation, we demonstrate that our approach not only produces visually pleasing results but also establishes state-of-the-art performance in enhancing and super-resolving UWF retinal images. We anticipate that our method will contribute to improving the accuracy of clinical assessments and treatments, ultimately leading to better patient outcomes.
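As a sketch of what an unpaired objective in this spirit can look like, the following combines an adversarial term with a degradation-consistency term, since no paired clean ground truth is available. The loss form and weight are illustrative assumptions, not the paper's exact objective.

```python
# Hedged sketch of an unpaired super-resolution generator loss: adversarial
# realism plus consistency with the low-resolution input after re-degradation.
import torch
import torch.nn.functional as F

def generator_loss(disc, sr_img, lr_img, downsample, lam=10.0):
    pred = disc(sr_img)                             # discriminator logits on SR output
    adv = F.binary_cross_entropy_with_logits(pred, torch.ones_like(pred))
    cyc = F.l1_loss(downsample(sr_img), lr_img)     # degradation-consistency term
    return adv + lam * cyc                          # lam is an illustrative weight
```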
ABSTRACT
PROBLEM: Low-quality fundus images with complex degradation can cause costly re-examinations of patients or inaccurate clinical diagnoses. AIM: This study aims to create an automatic fundus macular image enhancement framework that improves low-quality fundus images and removes complex image degradation. METHOD: We propose a new deep learning-based model that automatically enhances low-quality retinal fundus images suffering from complex degradation. We collected a dataset comprising 1068 pairs of high-quality (HQ) and low-quality (LQ) fundus images from the Kangbuk Samsung Hospital's health screening program and ophthalmology department from 2017 to 2019. We then used this dataset to develop data augmentation methods that simulate major aspects of retinal image degradation and to propose a customized convolutional neural network (CNN) architecture that enhances LQ images depending on the nature of the degradation. Peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), r-value (linear index of fuzziness), and the proportion of ungradable fundus photographs before and after the enhancement process were calculated to assess the performance of the proposed model. A comparative evaluation was conducted on an external database and four different open-source databases. RESULTS: The evaluation on the external test dataset showed a significant increase in PSNR and SSIM compared with the original LQ images. Moreover, PSNR and SSIM increased by over 4 dB and 0.04, respectively, compared with the previous state-of-the-art methods (P < 0.05). The proportion of ungradable fundus photographs decreased from 42.6% to 26.4% (P = 0.012). CONCLUSION: Our enhancement process significantly improves LQ fundus images suffering from complex degradation. Moreover, our customized CNN achieves improved performance over the existing state-of-the-art methods. Overall, our framework can have a clinical impact by reducing re-examinations and improving diagnostic accuracy.
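For reference, the two full-reference metrics reported above can be computed with scikit-image as in this minimal sketch, assuming float RGB images scaled to [0, 1]:

```python
# Minimal sketch: PSNR and SSIM between an enhanced image and its HQ reference.
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(enhanced, reference):
    """Both inputs: float arrays in [0, 1] with shape (H, W, 3)."""
    psnr = peak_signal_noise_ratio(reference, enhanced, data_range=1.0)
    ssim = structural_similarity(reference, enhanced, data_range=1.0,
                                 channel_axis=-1)  # last axis holds RGB channels
    return psnr, ssim
```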
Subjects
Deep Learning; Humans; Fundus Oculi; Neural Networks, Computer; Signal-To-Noise Ratio; Image Enhancement; Image Processing, Computer-Assisted/methods

ABSTRACT
Retinal fundus images are used to detect organ damage from vascular diseases (e.g., diabetes mellitus and hypertension) and to screen for ocular diseases. We aimed to assess convolutional neural network (CNN) models that predict age and sex from retinal fundus images in normal participants and in participants with underlying systemic vascular-altered status. In addition, we investigated clues regarding the differences between normal ageing and pathologic vascular changes using the CNN models. In this study, we developed CNN age and sex prediction models using 219,302 fundus images from normal participants without hypertension, diabetes mellitus (DM), or any smoking history. The trained models were assessed on four test-sets: 24,366 images from normal participants, 40,659 images from participants with hypertension, 14,189 images from participants with DM, and 113,510 images from smokers. The CNN model accurately predicted age in normal participants; the correlation between predicted age and chronologic age was R2 = 0.92, and the mean absolute error (MAE) was 3.06 years. MAEs in the test-sets with hypertension (3.46 years), DM (3.55 years), and smoking (2.65 years) were similar to that of normal participants; however, R2 values were relatively low (hypertension, R2 = 0.74; DM, R2 = 0.75; smoking, R2 = 0.86). In subgroups of participants over 60 years, the MAEs increased to above 4.0 years and the accuracies declined for all test-sets. Fundus-predicted sex demonstrated acceptable accuracy (area under the curve > 0.96) in all test-sets. Retinal fundus images from participants with underlying vascular-altered conditions (hypertension, DM, or smoking) showed similar MAEs but low coefficients of determination (R2) between predicted age and chronologic age, suggesting that the ageing process and pathologic vascular changes exhibit different features. Our models demonstrate the best performance reported to date and provide clues to the relationship and differences between ageing and pathologic changes from underlying systemic vascular conditions. Systemic vascular diseases are thus thought to affect the fundus differently from ageing.

Research in context. Evidence before this study: The human retina and optic disc continuously change with ageing, and they share physiologic and pathologic characteristics with the brain and systemic vascular status. As retinal fundus images provide high-resolution in-vivo images of retinal vessels and parenchyma without any invasive procedure, they have been used to screen for ocular diseases and have attracted significant attention as a predictive biomarker for cerebral and systemic vascular diseases. Recently, deep neural networks have revolutionised the field of medical image analysis, including retinal fundus images, and have shown reliable results in predicting age, sex, and the presence of cardiovascular diseases. Added value of this study: This is the first study demonstrating how a convolutional neural network (CNN) trained on retinal fundus images from normal participants measures the age of participants with underlying vascular conditions such as hypertension, diabetes mellitus (DM), or a history of smoking, using a large database, SBRIA, which contains 412,026 retinal fundus images from 155,449 participants. Our results indicated that the model accurately predicted age in normal participants, while correlations (coefficient of determination, R2) in the test-sets with hypertension, DM, and smoking were relatively low. Additionally, a subgroup analysis indicated that mean absolute errors (MAEs) increased and accuracies declined significantly in subgroups of participants over 60 years of age, in both normal participants and participants with vascular-altered conditions. These results suggest that the pathologic retinal vascular changes occurring in systemic vascular diseases differ from the changes of the spontaneous ageing process, and that the ageing process observed in retinal fundus images may saturate at approximately 60 years of age. Implications of all available evidence: Based on this study and previous reports, the CNN could accurately and reliably predict age and sex using retinal fundus images. The fact that retinal changes caused by ageing and by systemic vascular diseases occur differently motivates a deeper understanding of the retina. Deep learning-based fundus image reading may, after further development, be a more useful and beneficial tool for screening and diagnosing systemic and ocular diseases.
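The two headline statistics of this evaluation, MAE and the coefficient of determination, can be reproduced with scikit-learn as in this minimal sketch:

```python
# Minimal sketch: per-test-set evaluation of a fundus age-prediction model.
from sklearn.metrics import mean_absolute_error, r2_score

def evaluate_age_model(age_true, age_pred):
    return {"MAE_years": mean_absolute_error(age_true, age_pred),
            "R2": r2_score(age_true, age_pred)}

# A low R2 with a similar MAE (as in the hypertension and DM test-sets)
# indicates weaker agreement with chronologic age despite comparable average error.
```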
Subjects
Diabetes Mellitus/epidemiology; Fundus Oculi; Hypertension/epidemiology; Retina/diagnostic imaging; Smoking/epidemiology; Adult; Aged; Algorithms; Area Under Curve; Diabetes Mellitus/pathology; Female; Humans; Hypertension/pathology; Image Processing, Computer-Assisted/methods; Male; Middle Aged; Neural Networks, Computer; Public Health Surveillance; ROC Curve; Republic of Korea; Retina/pathology

ABSTRACT
BACKGROUND AND OBJECTIVE: Retinal fundus images are widely used to diagnose retinal diseases and can potentially be used for early diagnosis and prevention of chronic vascular diseases and diabetes. While various automatic retinal vessel segmentation methods using deep learning have been proposed, they are mostly based on common CNN structures developed for other tasks such as classification. METHODS: We present a novel and simple multi-scale convolutional neural network (CNN) structure for retinal vessel segmentation. We first provide a theoretical analysis of existing multi-scale structures based on signal processing. In previous structures, multi-scale representations are achieved through downsampling by subsampling and decimation. By incorporating scale-space theory, we propose a simple yet effective multi-scale structure for CNNs using upsampling, which we term the scale-space approximated CNN (SSANet). Based on further analysis of the effects of the SSA structure within a CNN, we also incorporate residual blocks, resulting in a multi-scale CNN that outperforms current state-of-the-art methods. RESULTS: Quantitative evaluations are presented as the area under the curve (AUC) of the receiver operating characteristic (ROC) curve and the precision-recall curve, as well as accuracy, for four publicly available datasets, namely DRIVE, STARE, CHASE_DB1, and HRF. For the CHASE_DB1 set, the SSANet achieves a state-of-the-art AUC of 0.9916 for the ROC curve. An ablation analysis examines the contribution of the different components of the SSANet to the performance improvement. CONCLUSIONS: The proposed retinal SSANet achieves state-of-the-art or comparable accuracy across publicly available datasets, especially improving segmentation for thin vessels, vessel junctions, and central vessel reflexes.
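One plausible reading of the scale-space idea is a block that applies shared convolution weights to progressively Gaussian-smoothed copies of the feature map and fuses the responses. The sketch below is an illustrative approximation in this spirit, not the exact SSANet block:

```python
# Hedged sketch of a scale-space-style multi-scale block (PyTorch): one shared
# convolution is applied across several Gaussian smoothing levels and fused.
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF

class ScaleSpaceBlock(nn.Module):
    def __init__(self, ch, sigmas=(0.0, 1.0, 2.0)):    # smoothing levels are assumptions
        super().__init__()
        self.sigmas = sigmas
        self.conv = nn.Conv2d(ch, ch, 3, padding=1)    # weights shared across scales
        self.fuse = nn.Conv2d(ch * len(sigmas), ch, 1)

    def forward(self, x):
        outs = []
        for sigma in self.sigmas:
            xs = TF.gaussian_blur(x, kernel_size=5, sigma=sigma) if sigma > 0 else x
            outs.append(self.conv(xs))
        return self.fuse(torch.cat(outs, dim=1))
```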
Subjects
Neural Networks, Computer; Retinal Diseases/diagnostic imaging; Retinal Vessels/diagnostic imaging; Algorithms; Area Under Curve; Deep Learning; False Positive Reactions; Fundus Oculi; Humans; Image Processing, Computer-Assisted; Normal Distribution; ROC Curve; Signal Processing, Computer-Assisted

ABSTRACT
We propose a novel deep learning-based system for vessel segmentation. Existing CNN-based methods have mostly relied on local appearances learned on the regular image grid, without considering the graphical structure of vessel shape. Effective use of the strong relationships that exist between vessel neighborhoods can help improve vessel segmentation accuracy. To this end, we incorporate a graph neural network into a unified CNN architecture to jointly exploit both local appearances and global vessel structures. We perform extensive comparative evaluations on four retinal image datasets and a coronary artery X-ray angiography dataset, showing that the proposed method outperforms or is on par with current state-of-the-art methods in terms of average precision and the area under the receiver operating characteristic curve. Statistical significance of the performance difference between the proposed method and each comparable method is supported by paired t-tests. In addition, ablation studies support the particular choices of algorithmic details and hyperparameter values of the proposed method. The proposed architecture is widely applicable, since it can extend any type of CNN-based vessel segmentation method to enhance performance.
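The coupling can be pictured as message passing over vessel-candidate nodes whose features come from the CNN; the refined node features then feed back into the segmentation head. The layer below is an illustrative sketch of that idea, not the paper's exact architecture:

```python
# Hedged sketch: one graph message-passing layer over CNN features sampled at
# vessel-candidate locations, with mean aggregation and a residual update.
import torch
import torch.nn as nn

class VesselGNNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, node_feats, adj):
        # node_feats: (N, dim) CNN features; adj: (N, N) with self-loops included
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        msg = (adj @ node_feats) / deg                   # average neighbour features
        return torch.relu(self.lin(msg) + node_feats)    # residual update keeps locality
```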
Subjects
Coronary Vessels/diagnostic imaging; Image Processing, Computer-Assisted/methods; Neural Networks, Computer; Retinal Vessels/diagnostic imaging; Angiography; Humans

ABSTRACT
We present multiple random forest methods for human pose estimation from single depth images that can operate at very high frame rates. We introduce four algorithms: random forest walk, greedy forest walk, random forest jumps, and greedy forest jumps. The proposed approaches can accurately infer the 3D positions of body joints without additional information such as temporal priors. A regression forest is trained to estimate the probability distribution of the direction or offset toward a particular joint, relative to the current position. During pose estimation, the next position is chosen from a set of representative directions or offsets, and the distribution for the following step is found by traversing the regression tree from that new position. This continual position sampling through 3D space eventually produces an expectation over the sampled positions, which we take as the joint position estimate. Experiments show that the accuracy is higher than that of current state-of-the-art pose estimation methods, with an additional advantage in computation time.
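The walk itself can be sketched as follows; `forest.predict` is a hypothetical interface standing in for a trained regression forest, and the step size and walk length are assumptions for illustration:

```python
# Hedged sketch of a "forest walk": repeatedly sample a step direction from a
# regression forest and return the mean of visited positions as the joint estimate.
import numpy as np

def forest_walk(forest, depth_image, start, n_steps=64, step=0.05):
    pos, visited = np.asarray(start, dtype=float), []
    for _ in range(n_steps):
        # Hypothetical interface: representative directions and their weights
        directions, probs = forest.predict(depth_image, pos)
        d = directions[np.random.choice(len(directions), p=probs)]
        pos = pos + step * d                  # move toward the target joint
        visited.append(pos.copy())
    return np.mean(visited, axis=0)           # expectation over sampled positions
```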
Subjects
Joints; Models, Theoretical; Posture; Algorithms; Humans

ABSTRACT
We present a novel interactive segmentation framework incorporating a priori knowledge learned from training data. The knowledge is learned as a structured patch model (StPM) comprising sets of corresponding local patch priors and their pairwise spatial distribution statistics, which represent the local shape and appearance along the boundary and the global shape structure, respectively. When successive user annotations are given, the StPM is appropriately adjusted in the target image and used together with the annotations to guide the segmentation. The StPM reduces the dependency on the placement and quantity of user annotations with little increase in complexity, since the time-consuming StPM construction is performed offline. Furthermore, a seamless learning system can be established by directly adding the patch priors and pairwise statistics of segmentation results to the StPM. The proposed method was evaluated on three datasets: 2D chest CT, 3D knee MR, and 3D brain MR. The experimental results demonstrate that, within an equal amount of time, the proposed interactive segmentation framework outperforms recent state-of-the-art methods in terms of accuracy, while requiring significantly less computing and editing time to obtain results of comparable accuracy.
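The seamless learning step can be sketched as appending new patch priors and pairwise offset statistics from an accepted segmentation to the model; the data structures below are illustrative assumptions, not the paper's representation:

```python
# Hedged sketch: updating a structured patch model (StPM) with the boundary
# patches and pairwise spatial statistics of a newly accepted segmentation.
import numpy as np

def update_stpm(stpm, boundary_patches, boundary_points):
    stpm["patch_priors"].extend(boundary_patches)   # local shape/appearance priors
    for i in range(len(boundary_points)):
        for j in range(i + 1, len(boundary_points)):
            offset = boundary_points[j] - boundary_points[i]
            stpm["pairwise"].setdefault((i, j), []).append(offset)  # global structure
    return stpm
```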
Subjects
Algorithms; Image Interpretation, Computer-Assisted/methods; Machine Learning; Models, Biological; Models, Statistical; Pattern Recognition, Automated/methods; Computer Simulation; Image Enhancement/methods; Reproducibility of Results; Sensitivity and Specificity; Subtraction Technique

ABSTRACT
In this paper, we present a novel cascaded classification framework for automatic detection of individual and clustered microcalcifications (µCs). Our framework comprises three classification stages: (i) a random forest (RF) classifier operating on simple features that capture the second-order local structure of individual µCs, which efficiently eliminates non-µC pixels in the target mammogram; (ii) a more complex discriminative restricted Boltzmann machine (DRBM) classifier applied to the µC candidates determined in the RF stage, which automatically learns the detailed morphology of µC appearances for improved discriminative power; and (iii) a detector that identifies clusters of µCs from the individual µC detection results, using two different criteria. With the two-stage RF-DRBM classifier, we are able to distinguish µCs using explicitly computed features, as well as learn implicit features that further discriminate between confusing cases. Experimental evaluation is conducted on the original Mammographic Image Analysis Society (MIAS) and mini-MIAS databases, as well as our own Seoul National University Bundang Hospital digital mammographic database. We show that the proposed method outperforms comparable methods in terms of receiver operating characteristic (ROC) and precision-recall curves for the detection of individual µCs, and the free-response receiver operating characteristic (FROC) curve for the detection of clustered µCs.
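At inference time, the cascade can be sketched as below. The classifiers are stand-ins with a scikit-learn-like `predict_proba` interface (there is no off-the-shelf DRBM, so the second stage is a placeholder), and the threshold is an assumption:

```python
# Hedged sketch of cascaded detection: a cheap RF stage prunes non-candidate
# pixels, and only the survivors reach the more expensive second stage.
import numpy as np

def cascade_detect(rf, second_stage, simple_feats, patches, rf_thresh=0.5):
    p1 = rf.predict_proba(simple_feats)[:, 1]   # stage 1: simple structural features
    keep = p1 >= rf_thresh                      # eliminate obvious non-microcalcifications
    scores = np.zeros(len(simple_feats))
    if keep.any():
        scores[keep] = second_stage.predict_proba(patches[keep])[:, 1]  # stage 2
    return scores                               # per-pixel microcalcification scores
```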