ABSTRACT
OBJECTIVE: The prospect of automatically gaining relevant information from cardiovascular magnetic resonance (CMR) image analysis opens up new potential to assist the evaluating physician. For machine-learning-based classification of complex congenital heart disease, only a few studies have used CMR. MATERIALS AND METHODS: This study presents a tailor-made neural network architecture for detecting 7 distinctive anatomic landmarks in CMR images of patients with hypoplastic left heart syndrome (HLHS) in Fontan circulation or healthy controls, and demonstrates the potential of the spatial arrangement of the landmarks to identify HLHS. The method was applied to the axial SSFP CMR scans of 46 patients with HLHS and 33 healthy controls. RESULTS: The displacement between predicted and annotated landmarks had a standard deviation of 8-17 mm and exceeded the interobserver variability by a factor of 1.1-2.0. A high overall classification accuracy of 98.7% was achieved. DISCUSSION: Decoupling the identification of clinically meaningful anatomic landmarks from the actual classification improved the transparency of the classification results. Information from such automated analysis could be used to jump quickly to anatomic positions and to guide the physician more efficiently through the analysis depending on the detected condition, which may ultimately improve workflow and save analysis time.
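The abstract above classifies HLHS from the spatial arrangement of 7 detected landmarks. A minimal sketch of one possible arrangement encoding (not the paper's actual classifier; the function name and normalization are illustrative assumptions) is a scale-invariant vector of normalized pairwise landmark distances that could feed any downstream classifier:

```python
import itertools
import math

def pairwise_distance_features(landmarks):
    """Return the 21 pairwise distances of 7 landmarks, normalized by their
    maximum so the feature vector is invariant to image scale."""
    dists = [math.dist(p, q) for p, q in itertools.combinations(landmarks, 2)]
    m = max(dists)
    return [d / m for d in dists]
```

With 7 landmarks there are 7 choose 2 = 21 distances; dividing by the maximum removes the dependence on acquisition scale.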
Subjects
Cardiovascular System , Hypoplastic Left Heart Syndrome , Humans , Hypoplastic Left Heart Syndrome/diagnostic imaging , Hypoplastic Left Heart Syndrome/surgery , Magnetic Resonance Imaging/methods , Machine Learning , Neural Networks, Computer
ABSTRACT
Among the common applications of plenoptic cameras are depth reconstruction and post-shot refocusing. These require a calibration relating the camera-side light field to that of the scene. Numerous methods with this goal have been developed based on thin lens models for the plenoptic camera's main lens and microlenses. Our work addresses the often-overlooked role of the main lens exit pupil in these models, specifically in the decoding process of standard plenoptic camera (SPC) images. We formally deduce the connection between the refocusing distance and the resampling parameter for the decoded light field and provide an analysis of the errors that arise when the exit pupil is not considered. In addition, previous work is revisited with respect to the exit pupil's role, and all theoretical results are validated through a ray tracing-based simulation. With the public release of the evaluated SPC designs alongside our simulation and experimental data, we aim to contribute to a more accurate and nuanced understanding of plenoptic camera optics.
ABSTRACT
Injurious pecking against conspecifics is a serious problem in turkey husbandry. Bloody injuries act as a trigger mechanism inducing further pecking, and timely detection and intervention can prevent massive animal welfare impairments and costly losses. The overarching aim is therefore to develop a camera-based system that monitors the flock and detects injuries using neural networks. In a preliminary study, images of turkeys were annotated by labelling potential injuries, and these annotations were used to train a network for injury detection. Here, we applied a keypoint detection model to provide more information on animal position and to indicate injury location. To this end, seven turkey keypoints were defined, and 244 images (showing 7660 birds) were manually annotated. Two state-of-the-art approaches for pose estimation were adapted, and their results were compared. Subsequently, the better-performing keypoint detection model (HRNet-W48) was combined with the segmentation model for injury detection, so that individual injuries could, for example, be classified with "near tail" or "near head" labels. In summary, keypoint detection showed good results and could clearly differentiate between individual animals even in crowded situations.
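The combination of keypoints and injury detections described above suggests a simple localization rule. A hedged sketch (function and keypoint names are illustrative, not from the paper): label each detected injury by its nearest keypoint.

```python
import math

def injury_location_label(injury_xy, keypoints):
    """keypoints: dict mapping keypoint name -> (x, y) pixel position.
    Returns a coarse location label such as 'near tail' for the keypoint
    closest to the injury detection."""
    name = min(keypoints, key=lambda k: math.dist(injury_xy, keypoints[k]))
    return f"near {name}"
```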
Subjects
Animal Welfare , Turkeys , Animals , Neural Networks, Computer
ABSTRACT
Deep learning has been successfully applied to many classification problems, including underwater challenges. However, a long-standing issue with deep learning is the need for large and consistently labeled datasets. Although current approaches in semi-supervised learning can decrease the required amount of annotated data by a factor of 10 or even more, this line of research still assumes distinct classes. For underwater classification, and uncurated real-world datasets in general, clean class boundaries often cannot be drawn due to the limited information content of the images and transitional stages of the depicted objects. This leads to different experts holding different opinions and thus producing fuzzy labels, which could also be considered ambiguous or divergent. We propose a novel framework for semi-supervised classification of such fuzzy labels. It is based on the idea of overclustering to detect substructures in these fuzzy labels. We propose a novel loss to improve the overclustering capability of our framework and show the benefit of overclustering for fuzzy labels. We show that our framework is superior to previous state-of-the-art semi-supervised methods when applied to real-world plankton data with fuzzy labels. Moreover, we obtain 5 to 10% more consistent predictions of substructures.
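The core idea above is overclustering: using more clusters than classes so that substructures of fuzzy classes surface as separate clusters. A minimal sketch of the read-out step (names and the majority-vote rule are illustrative assumptions, not the paper's loss): map each cluster back to a class by majority vote over whatever labels are available.

```python
from collections import Counter

def clusters_to_classes(cluster_ids, labels):
    """cluster_ids: per-sample cluster index; labels: per-sample label or
    None for unlabeled samples. Returns a dict mapping each cluster that
    contains labeled samples to its majority label."""
    votes = {}
    for c, y in zip(cluster_ids, labels):
        if y is not None:
            votes.setdefault(c, Counter())[y] += 1
    return {c: counter.most_common(1)[0][0] for c, counter in votes.items()}
```

Two clusters mapping to the same class then indicate a substructure within that class.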
Subjects
Supervised Machine Learning , Entropy
ABSTRACT
In this work, we present MorphoCluster, a software tool for data-driven, fast, and accurate annotation of large image data sets. The volume and complexity of marine image data have already surpassed what human experts can annotate and will continue to increase in the coming years; still, this data requires interpretation. MorphoCluster augments the human ability to discover patterns and perform object classification in large amounts of data by embedding unsupervised clustering in an interactive process. By aggregating similar images into clusters, our novel approach to image annotation increases consistency, multiplies the throughput of an annotator, and allows experts to adapt the granularity of their sorting scheme to the structure in the data. Using MorphoCluster, we sorted a set of 1.2 M objects into 280 data-driven classes in 71 h (16 k objects per hour), with 90% of these classes having a precision of 0.889 or higher. This shows that MorphoCluster is at the same time fast, accurate, and consistent; provides a fine-grained and data-driven classification; and enables novelty detection.
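As a stand-in for the unsupervised grouping step that MorphoCluster embeds in its interactive loop, the sketch below shows greedy leader clustering in feature space. This is an illustrative simplification under assumed names, not MorphoCluster's actual algorithm: an object joins the first existing cluster whose center lies within `radius`, otherwise it founds a new cluster.

```python
import math

def leader_clustering(features, radius):
    """Greedy one-pass clustering: assign each feature vector to the first
    cluster center within `radius`, or start a new cluster."""
    centers, assignment = [], []
    for f in features:
        for i, c in enumerate(centers):
            if math.dist(f, c) <= radius:
                assignment.append(i)
                break
        else:
            centers.append(f)
            assignment.append(len(centers) - 1)
    return assignment, centers
```

In an interactive tool, `radius` plays the role of the granularity knob the abstract describes: smaller radii yield finer, more numerous clusters for the annotator to validate.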
Subjects
Image Processing, Computer-Assisted , Pattern Recognition, Automated , Plankton , Software , Cluster Analysis
ABSTRACT
Behavioural research on pigs can be greatly simplified if automatic recognition systems are used. Systems based on computer vision in particular have the advantage that they allow an evaluation without affecting the normal behaviour of the animals. In recent years, methods based on deep learning have been introduced and have shown excellent results. Object and keypoint detectors have frequently been used to detect individual animals. Despite promising results, bounding boxes and sparse keypoints do not trace the contours of the animals, so much of the available information is lost. Therefore, this paper follows the relatively new approach of panoptic segmentation and aims at the pixel-accurate segmentation of individual pigs. A framework consisting of a neural network for semantic segmentation as well as different network heads and postprocessing methods is discussed. The method was tested on a data set of 1000 hand-labeled images created specifically for this experiment and achieved detection rates of around 95% (F1 score) despite disturbances such as occlusions and dirty lenses.
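One classical postprocessing step for turning a semantic "pig" mask into separate instances, as panoptic segmentation requires, is connected-component labelling. The sketch below is a generic illustration (the paper uses dedicated network heads; this is only the simplest possible stand-in):

```python
from collections import deque

def connected_components(mask):
    """mask: 2D list of 0/1. Returns (instance_map, n_instances), where
    instance_map assigns a positive instance id to each foreground pixel
    using 4-connectivity breadth-first search."""
    h, w = len(mask), len(mask[0])
    inst = [[0] * w for _ in range(h)]
    n = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not inst[y][x]:
                n += 1
                queue = deque([(y, x)])
                inst[y][x] = n
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not inst[ny][nx]:
                            inst[ny][nx] = n
                            queue.append((ny, nx))
    return inst, n
```

Note that plain connected components cannot separate touching animals, which is exactly why learned instance heads are needed in crowded pens.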
Subjects
Image Processing, Computer-Assisted , Neural Networks, Computer , Posture , Swine , Animals
ABSTRACT
Light field technologies have seen a rise in recent years, and microscopy is a field where such technology has had a deep impact. The possibility to capture spatial and angular information at the same time and in a single shot brings several advantages and allows for new applications. A common goal in these applications is the calculation of a depth map to reconstruct the three-dimensional geometry of the scene. Many approaches are applicable, but most of them cannot achieve high accuracy because of the nature of such images: biological samples are usually poor in features and do not exhibit the sharp colors of natural scenes. Under such conditions, standard approaches produce noisy depth maps. In this work, a robust approach is proposed in which accurate depth maps can be produced by exploiting the information recorded in the light field, in particular images produced with a Fourier integral microscope. The proposed approach can be divided into three main parts. First, it creates two cost volumes using different focal cues, namely correspondences and defocus. Second, it applies filtering methods that exploit multi-scale and superpixel cost aggregation to reduce noise and enhance accuracy. Finally, it merges the two cost volumes and extracts a depth map through multi-label optimization.
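The final step above merges two cost volumes and extracts depth. As a hedged sketch (the paper uses multi-label optimization; here a simple weighted sum with per-pixel winner-takes-all stands in, and the weighting is an illustrative assumption):

```python
def merge_and_extract_depth(cost_corr, cost_defocus, weight=0.5):
    """Cost volumes are indexed as [depth_label][pixel]. Merge the
    correspondence and defocus cues by a weighted sum, then pick the
    lowest-cost depth label per pixel (winner-takes-all)."""
    n_labels = len(cost_corr)
    n_pix = len(cost_corr[0])
    depth = []
    for p in range(n_pix):
        merged = [weight * cost_corr[l][p] + (1 - weight) * cost_defocus[l][p]
                  for l in range(n_labels)]
        depth.append(min(range(n_labels), key=merged.__getitem__))
    return depth
```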
ABSTRACT
Coherent imaging has a wide range of applications in, for example, microscopy, astronomy, and radar imaging. Particularly interesting is the field of microscopy, where the optical quality of the lens is the main limiting factor. In this article, novel algorithms for the restoration of blurred images in a system with known optical aberrations are presented. Physically motivated by scalar diffraction theory, the new algorithms are based on Haugazeau POCS and FISTA, and are faster and more robust than previously presented methods. With the new approach, the restoration quality on real images is very high; blurring and ringing caused by defocus can be effectively removed. In classical microscopy, lenses with very low aberration must be used, which puts a practical limit on their size and numerical aperture. A coherent microscope using the novel restoration method overcomes this limitation. In contrast to incoherent microscopy, severe optical aberrations including defocus can be removed, hence the requirements on the quality of the optics are lower. This can be exploited for a substantial price reduction of the optical system. It can also be used to achieve higher resolution than in classical microscopy, using lenses with high numerical aperture and high aberration. All this makes coherent microscopy superior to traditional incoherent microscopy in suitable applications.
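To illustrate the FISTA branch mentioned above, the sketch below runs plain FISTA on a toy 1D circular deconvolution with a known, symmetric blur kernel. This is a generic textbook reduction under assumed names, not the paper's coherent (complex-valued) restoration; the step size 1.0 assumes a non-negative kernel summing to 1, so the Lipschitz constant of the gradient is at most 1.

```python
def blur(x, kernel):
    """Circular convolution of a 1D signal with a centered kernel."""
    n, k = len(x), len(kernel)
    half = k // 2
    return [sum(kernel[j] * x[(i + j - half) % n] for j in range(k))
            for i in range(n)]

def fista_deconvolve(b, kernel, iters=500, step=1.0):
    """Minimize 0.5*||A x - b||^2 with FISTA, where A is the (symmetric)
    circular blur operator, so the gradient is A(A y - b)."""
    x = list(b)          # start from the blurred observation
    y = list(b)
    t = 1.0
    for _ in range(iters):
        residual = [ri - bi for ri, bi in zip(blur(y, kernel), b)]
        grad = blur(residual, kernel)
        x_new = [yi - step * g for yi, g in zip(y, grad)]
        t_new = (1 + (1 + 4 * t * t) ** 0.5) / 2
        y = [xn + (t - 1) / t_new * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x
```

With an invertible kernel such as [0.2, 0.6, 0.2], the iterates approach the sharp signal.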
ABSTRACT
Several acoustic and optical techniques have been used for characterizing natural and anthropogenic gas leaks (carbon dioxide, methane) from the ocean floor. Here, single-camera based methods for bubble stream observation have become an important tool, as they help estimate flux and bubble sizes under certain assumptions. However, they record only a projection of a bubble into the camera and therefore cannot capture the full 3D shape, which is particularly important for larger, non-spherical bubbles. The unknown distance of the bubble to the camera (making it appear larger or smaller than expected) as well as refraction at the camera interface introduce extra uncertainties. In this article, we introduce our wide-baseline stereo-camera deep-sea sensor bubble box that overcomes these limitations, as it observes bubbles from two orthogonal directions using calibrated cameras. Besides the setup and the hardware of the system, we discuss appropriate calibration and the automated processing steps of deblurring, detection, tracking, and 3D fitting that are crucial to arrive at a 3D ellipsoidal shape and rise speed for each bubble. The obtained values for single bubbles can be aggregated into statistical bubble size distributions or fluxes for extrapolation based on diffusion and dissolution models and large-scale acoustic surveys. We demonstrate and evaluate the wide-baseline stereo measurement model using a controlled test setup with ground truth information.
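As a rough illustration of why two orthogonal views recover what one projection cannot: each camera sees an ellipse, and the two ellipses jointly constrain an ellipsoid. The sketch below assumes a simplified axis-aligned model (the shared vertical semi-axis appears in both views and is averaged); the paper's 3D fitting is more general.

```python
import math

def ellipsoid_from_orthogonal_views(view1_axes, view2_axes):
    """view1_axes = (a, c1): horizontal and vertical semi-axes seen by
    camera 1; view2_axes = (b, c2): the same for the orthogonal camera 2.
    Returns the ellipsoid semi-axes (a, b, c) and its volume."""
    a, c1 = view1_axes
    b, c2 = view2_axes
    c = 0.5 * (c1 + c2)                    # vertical axis seen by both
    volume = 4.0 / 3.0 * math.pi * a * b * c
    return a, b, c, volume
```

A single camera would have to assume b = a (a spheroid), which biases volume and flux estimates for flattened, wobbling bubbles.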
ABSTRACT
Extensive research has explored human motion generation, but the generated sequences are influenced by different motion styles. For instance, walking with joy and walking with sorrow have distinct effects on a character's motion. Because motion capture with styles is difficult, the data available for style research are also limited. To address these problems, we propose ASMNet, an action- and style-conditioned motion generative network. This network ensures that the generated human motion sequences not only comply with the provided action label but also exhibit distinctive stylistic features. To extract motion features from human motion sequences, we design a spatial-temporal extractor. Moreover, we use an adaptive instance normalization layer to inject style into the target motion. Our results are comparable to state-of-the-art approaches and show advantages in both quantitative and qualitative evaluations. The code is available at https://github.com/ZongYingLi/ASMNet.git.
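The adaptive instance normalization (AdaIN) step mentioned above has a compact closed form: normalize the content features, then rescale and shift them with the style features' statistics. The sketch below shows it on a single flat feature channel (real AdaIN operates per channel on network activations; the list-based form here is a simplification):

```python
import math

def adain(content, style, eps=1e-5):
    """Align the mean and standard deviation of `content` with `style`:
    AdaIN(x) = (x - mu_c) / sigma_c * sigma_s + mu_s."""
    def stats(values):
        m = sum(values) / len(values)
        var = sum((v - m) ** 2 for v in values) / len(values)
        return m, math.sqrt(var + eps)
    cm, cs = stats(content)
    sm, ss = stats(style)
    return [(x - cm) / cs * ss + sm for x in content]
```

After the transfer, the output carries the content's structure but the style's first- and second-order statistics.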
ABSTRACT
Hyperfluorescence (HF) and reduced autofluorescence (RA) are important biomarkers in fundus autofluorescence (FAF) images for assessing the health of the retinal pigment epithelium (RPE), an important indicator of disease progression in geographic atrophy (GA) or central serous chorioretinopathy (CSCR). Autofluorescence images have been annotated by human raters, but distinguishing biomarkers (whether signals are increased or decreased) from the normal background proves challenging, with borders being particularly open to interpretation. Consequently, significant variations emerge among different graders, and even within the same grader during repeated annotations. Tests on in-house FAF data show that even highly skilled medical experts, despite previously discussing and settling on precise annotation guidelines, reach a pair-wise agreement, measured as a Dice score, of no more than 63-80% for HF segmentations and only 14-52% for RA. The data further show that the agreement of our primary annotation expert with herself is 72% Dice for HF and 51% for RA. Given these numbers, the task of automated HF and RA segmentation cannot simply be reduced to improving a segmentation score. Instead, we propose the use of a segmentation ensemble. Learning from images with a single annotation, the ensemble reaches expert-like performance with an agreement of 64-81% Dice for HF and 21-41% for RA with all our experts. In addition, utilizing the mean predictions of the ensemble networks and their variance, we devise ternary segmentations in which FAF image areas are labeled either as confident background, confident HF, or potential HF, ensuring that predictions are reliable where they are confident (97% precision), while detecting all instances of HF (99% recall) annotated by all experts.
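The ternary labelling above combines ensemble mean and variance per pixel. A minimal sketch of that decision rule (the threshold values here are illustrative assumptions, not the paper's):

```python
def ternary_label(probs, confident=0.9, background=0.1, max_var=0.05):
    """probs: HF probabilities for one pixel, one per ensemble member.
    A pixel is only 'confident' if the members agree (low variance) AND
    the mean is extreme; everything else stays 'potential HF'."""
    mean = sum(probs) / len(probs)
    var = sum((p - mean) ** 2 for p in probs) / len(probs)
    if mean >= confident and var <= max_var:
        return "confident HF"
    if mean <= background and var <= max_var:
        return "confident background"
    return "potential HF"
```

The "potential HF" band is what lets the method keep near-perfect recall without sacrificing precision in the confident regions.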
ABSTRACT
Medical phantoms mimic aspects of procedures such as computed tomography (CT), ultrasound (US) imaging, and surgical practice. However, the materials for current commercial phantoms are expensive, and fabrication with them is complex and lacks versatility, so existing material solutions are not suitable for creating patient-specific phantoms. We present a novel and cost-effective material system (utilizing ubiquitous sodium alginate hydrogel and coconut fat) with independently and accurately tailorable CT, US, and mechanical properties. By varying the concentration of alginate, cross-linker, and coconut fat, the radiological parameters and the elastic modulus were adjusted independently over a wide range. The independence was demonstrated by creating phantoms with features hidden in US while visible in CT imaging, and vice versa. This system is particularly beneficial in resource-scarce areas, since the materials are cheap (< USD 1/kg) and easy to obtain, offering realistic and versatile phantoms to practice surgeries and ultimately enhance patient care.
ABSTRACT
Accurately measuring the size, morphology, and structure of nanoparticles is very important, because many of their application-relevant properties depend strongly on these characteristics. In this paper, we present a deep-learning-based method for nanoparticle measurement and classification trained on a small data set of scanning transmission electron microscopy images including overlapping nanoparticles. Our approach comprises two stages: localization, i.e., detection of nanoparticles, and classification, i.e., categorization of their ultrastructure. For each stage, we optimize the segmentation and classification by analyzing different state-of-the-art neural networks. We show how the generation of synthetic images, either using image processing or using various image generation neural networks, can improve the results in both stages. Finally, the application of the algorithm to bimetallic nanoparticles demonstrates automated data collection of size distributions, including the classification of complex ultrastructures. The developed method can be easily transferred to other material systems and nanoparticle structures.
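A common final step in such automated measurement, converting segmented particle areas into a size distribution, can be sketched as follows (function name and the pixel scale are illustrative assumptions; the paper's pipeline derives areas from its segmentation stage):

```python
import math

def equivalent_diameters(pixel_areas, nm_per_pixel):
    """Area-equivalent circle diameter for each segmented particle:
    d = 2 * sqrt(area / pi), converted from pixels to nanometers."""
    return [2.0 * math.sqrt(a / math.pi) * nm_per_pixel for a in pixel_areas]
```

Histogramming these diameters per predicted ultrastructure class yields the class-resolved size distributions the abstract mentions.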
ABSTRACT
Mapping and monitoring of seafloor habitats are key tasks for fully understanding ocean ecosystems and resilience, which contributes towards sustainable use of ocean resources. Habitat mapping relies on seafloor classification, typically based on acoustic methods and ground truthing through direct sampling and optical imaging. With the increasing capability to record high-resolution underwater images, manual approaches for analyzing these images to create seafloor classifications are no longer feasible. Automated workflows have been proposed as a solution, in which algorithms assign pre-defined seafloor categories to each image. However, in order to provide consistent and repeatable analysis, these automated workflows need to address, e.g., underwater illumination artefacts, variations in resolution, and class imbalances, which could bias the classification. Here, we present a generic implementation of an Automated and Integrated Seafloor Classification Workflow (AI-SCW). The workflow classifies the seafloor into habitat categories based on automated analysis of optical underwater images with only a minimal amount of human annotation. AI-SCW incorporates laser point detection for scale determination and color normalization. It further includes semi-automatic generation of the training data set for fitting the seafloor classifier. As a case study, we applied the workflow to a seafloor image dataset from the Belgian and German contract areas for manganese-nodule exploration in the Pacific Ocean. Based on this, we provide seafloor classifications along the camera deployment tracks and discuss the results in the context of seafloor multibeam bathymetry. Our results show that the seafloor in the Belgian area predominantly comprises densely distributed nodules, which are intermingled with qualitatively larger-sized nodules at local elevations and within depressions.
The German area, on the other hand, primarily comprises nodules that only partly cover the seabed; these occur alongside turned-over sediment (artificial seafloor) caused by the settling plume following a dredging experiment conducted in the area.
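The laser-point scale determination step of AI-SCW mentioned above reduces to a simple computation once the laser points are detected: two points projected with a known physical spacing give the image resolution directly. A hedged sketch (the function name and units are illustrative):

```python
import math

def cm_per_pixel(p1, p2, laser_spacing_cm):
    """p1, p2: detected pixel coordinates of two parallel laser points whose
    real-world separation is `laser_spacing_cm`. Returns cm per pixel."""
    return laser_spacing_cm / math.dist(p1, p2)
```

This scale then allows nodule sizes and coverage to be reported in physical units across images taken at varying altitudes.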
Subjects
Ecosystem , Manganese , Algorithms , Humans , Pacific Ocean , Workflow
ABSTRACT
Optical coherence tomography (OCT) and fundus autofluorescence (FAF) are important imaging modalities for the assessment and prognosis of central serous chorioretinopathy (CSCR). However, setting the findings from both into spatial and temporal contexts, as is desirable for disease analysis, remains a challenge because the two modalities are captured from different perspectives: sparse three-dimensional (3D) cross sections for OCT and two-dimensional (2D) en face images for FAF. To bridge this gap, we propose a visualisation pipeline capable of projecting OCT labels onto en face image modalities such as FAF. By mapping OCT B-scans onto the accompanying en face infrared (IR) image and then registering the IR image onto the FAF image with a neural network, we can directly compare OCT labels to other labels in the en face plane. We also present a U-Net-inspired segmentation model to predict segmentations in unlabeled OCTs. Evaluations show that both our networks achieve high precision (0.853 Dice score and 0.913 area under the curve). Furthermore, medical analysis performed on exemplary, chronologically arranged CSCR progressions of 12 patients visualized with our pipeline indicates that two patterns emerge in CSCR: subretinal fluid (SRF) in OCT preceding hyperfluorescence (HF) in FAF, and vice versa.
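The first mapping step above, projecting a point on an OCT B-scan onto the en face IR image, amounts to interpolating along the B-scan's known line of acquisition. The sketch below assumes straight B-scan lines given by their endpoints on the IR image (an illustrative simplification; the actual geometry comes from the scanner metadata):

```python
def bscan_to_enface(bscan_idx, ascan_frac, bscan_lines):
    """bscan_lines: list of ((x0, y0), (x1, y1)) endpoints of each B-scan on
    the en face image. `ascan_frac` in [0, 1] is the position along the
    B-scan. Returns the corresponding en face (x, y) coordinate."""
    (x0, y0), (x1, y1) = bscan_lines[bscan_idx]
    return (x0 + ascan_frac * (x1 - x0), y0 + ascan_frac * (y1 - y0))
```

Labels projected this way for all B-scans form a sparse en face label map that can then be registered onto FAF.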
ABSTRACT
This study aimed to develop a camera-based system using artificial intelligence for automated detection of pecking injuries in turkeys. Videos were recorded and split into individual images for further processing. Using specifically developed software, the injuries visible in these images were marked by humans, and a neural network was trained with these annotations. Due to unacceptable agreement between the annotations of humans and the network, several work steps were initiated to improve the training data. First, a costly work step was used to create high-quality annotations (HQA), in which multiple observers evaluated already annotated injuries: each labeled detection had to be validated by three observers before it was saved as "finished", and for each image, all detections had to be verified three times. Then, a network was trained with these HQA to assist observers in annotating more data. Finally, the benefit of the HQA work step was tested, and it was shown that the agreement between the annotations of humans and the network could be doubled. Although the system is not yet capable of ensuring adequate detection of pecking injuries, the study demonstrates the importance of such validation steps for obtaining good training data.
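The triple-validation rule described above can be expressed as a one-line filter. A hedged sketch (data layout and names are illustrative, not from the study's software):

```python
def finished_detections(detections):
    """detections: dict mapping a detection id to the set of observers who
    confirmed it. A detection only counts as 'finished' once at least three
    distinct observers have validated it."""
    return [d for d, observers in detections.items() if len(observers) >= 3]
```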
ABSTRACT
In this paper, we address the camera pose estimation problem from a set of 2D/3D line correspondences, also known as the PnL (Perspective-n-Line) problem. We carry out our study by comparing PnL with the well-studied PnP (Perspective-n-Point) problem, and our contributions are three-fold: (1) We provide a complete 3D configuration analysis for P3L, which includes the well-known P3P problem as well as several existing analyses as special cases. (2) By exploring the similarity between PnL and PnP, we propose a new subset-based PnL approach as well as a series of linear-formulation-based PnL approaches inspired by their PnP counterparts. (3) The proposed linear-formulation-based methods can be easily extended to handle line and point features simultaneously.
ABSTRACT
Sleep abnormalities in idiopathic Parkinson's disease (PD) frequently consist of a reduction in total sleep time and efficiency and subsequent excessive daytime sleepiness. As it remains unclear whether these phenomena are part of the disease itself or result from pharmacological treatment, animal models for investigating the pathophysiology of sleep alterations in PD may add knowledge to this research area. In the present study, we investigate whether changes in circadian motor activity occur in the 6-OHDA lesion model of PD and allow screening for disturbed sleep-waking behaviour. Activity measurements of six male Wistar rats with 6-OHDA lesions in the medial forebrain bundle and six controls were carried out over two consecutive 12:12 h light-dark (LD) cycles. A computer-based video-analysis system recording the animals' movement tracks was used. Distance travelled and the number of transitions between movement periods and resting periods were determined. Although 6-OHDA-lesioned animals showed reduced locomotor activity compared to non-lesioned rats, the circadian distribution basically remained intact. However, some lesioning effects were more pronounced in the resting phase than in the activity phase, possibly paralleling nocturnal akinesia in PD. To further elucidate the described phenomena, it will be necessary to perform studies combining sleep recordings with locomotor activity measurements.
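The two reported measures, distance travelled and the number of movement/rest transitions, can be derived from a sampled 2D track with a simple threshold rule. A hedged sketch (the thresholding scheme is an illustrative assumption; the study's video-analysis software may define movement periods differently):

```python
import math

def track_measures(track, move_threshold):
    """track: list of (x, y) positions, one per time step. A step longer
    than `move_threshold` counts as movement; count the total distance and
    the number of switches between movement and resting."""
    distance = 0.0
    transitions = 0
    moving = None
    for p, q in zip(track, track[1:]):
        step = math.dist(p, q)
        distance += step
        now_moving = step > move_threshold
        if moving is not None and now_moving != moving:
            transitions += 1
        moving = now_moving
    return distance, transitions
```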
Subjects
Circadian Rhythm/physiology , Functional Laterality/physiology , Motor Activity/physiology , Oxidopamine , Parkinsonian Disorders , Analysis of Variance , Animals , Behavior, Animal/drug effects , Behavior, Animal/physiology , Circadian Rhythm/drug effects , Disease Models, Animal , Exploratory Behavior/drug effects , Exploratory Behavior/physiology , Functional Laterality/drug effects , Male , Motor Activity/drug effects , Parkinsonian Disorders/chemically induced , Parkinsonian Disorders/pathology , Parkinsonian Disorders/physiopathology , Rats , Rats, Wistar , Rotarod Performance Test/methods , Tyrosine 3-Monooxygenase/metabolism
ABSTRACT
Spatial augmented reality is especially interesting for the design process of a car, because a lot of virtual content and corresponding real objects are used. One important issue in such a process is that the designer must be able to trust the visualized colors on the real object, because design decisions are made on the basis of the projection. In this paper, we present an interactive visualization technique that computes the exact RGB values for the projected image, so that the resulting colors on the real object are perceived as equal to the desired colors. Our approach uses a physically based computation to determine the influences of the ambient light, the material, and the pose and color model of the projector on the colors resulting from the projected RGB values. This information allows us to compute the adjustment of the RGB values for varying projector positions at interactive rates. Since the range of projectable colors depends not only on the material and the ambient light but also on the pose of the projector, our method can be used to interactively adjust the range of projectable colors by moving the projector to arbitrary positions around the real object. We further extend this method so that it is applicable to multiple projectors. All methods are evaluated in a number of experiments.
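The core inversion above, computing the projector input that yields a desired surface color, can be sketched with a per-channel reflectance model: observed = reflectance * projected + ambient. This is a deliberately simplified stand-in (the paper's physically based model also accounts for projector pose and color mixing between channels), with all names being illustrative:

```python
def compensate_rgb(desired, reflectance, ambient):
    """Per channel, solve desired = r * value + a for the projector value,
    clamped to the projectable range [0, 1]."""
    out = []
    for d, r, a in zip(desired, reflectance, ambient):
        value = (d - a) / r if r > 0 else 0.0
        out.append(min(1.0, max(0.0, value)))
    return out
```

Channels that clamp at 1.0 mark desired colors outside the projectable range, which is exactly the quantity the method lets the user enlarge by repositioning the projector.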
ABSTRACT
Head-mounted displays (HMDs) allow users to observe virtual environments (VEs) from an egocentric perspective. However, several experiments have provided evidence that egocentric distances are perceived as compressed in VEs relative to the real world. Recent experiments suggest that the virtual view frustum set for rendering the VE has an essential impact on the user's estimation of distances. In this article, we analyze whether distance estimation can be improved by calibrating the view frustum for a given HMD and user. Unfortunately, in an immersive virtual reality (VR) environment, a full per-user calibration is not trivial, and manual per-user adjustment often leads to minification or magnification of the scene. Therefore, we propose a novel per-user calibration approach with optical see-through displays commonly used in augmented reality (AR). This calibration takes advantage of a geometric scheme based on 2D point-3D line correspondences, which can be used intuitively by inexperienced users and requires less than a minute to complete. The required user interaction is based on taking aim at a distant target marker with a close marker, which ensures non-planar measurements covering a large area of the interaction space while also reducing the number of required measurements to five. We found a tendency that a calibrated view frustum reduced the average distance underestimation of users in an immersive VR environment, but even the correctly calibrated view frustum could not entirely compensate for the distance underestimation effects.
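Once the calibration has located the user's eye relative to the display, the per-user view frustum is the off-axis frustum through the display rectangle, obtained by similar triangles onto the near plane. The sketch below is a generic off-axis projection computation under assumed names and a simplified flat-screen geometry, not the paper's HMD-specific calibration:

```python
def offaxis_frustum(eye, screen_left, screen_right, screen_bottom, screen_top,
                    screen_dist, near):
    """Screen extents are measured in the screen plane located `screen_dist`
    in front of the origin; eye = (x, y, z) is the calibrated eye offset.
    Returns (left, right, bottom, top) of the frustum at the near plane."""
    ex, ey, ez = eye
    d = screen_dist - ez            # eye-to-screen distance along the axis
    scale = near / d                # similar triangles onto the near plane
    return ((screen_left - ex) * scale, (screen_right - ex) * scale,
            (screen_bottom - ey) * scale, (screen_top - ey) * scale)
```

Note how a lateral eye offset makes the frustum asymmetric, which is the effect a naive symmetric frustum misses and a plausible source of the mini- and magnification artifacts mentioned above.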