Results 1 - 14 of 14
1.
Data Brief ; 55: 110700, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39071960

ABSTRACT

In this work, we present a novel dataset, SynthOutdoor, comprising 39,086 high-resolution images, aimed at addressing the data scarcity in the field of 3D light direction estimation under the assumption of distant lighting. SynthOutdoor was generated using our software (also publicly available), which in turn is based on the Unity3D engine. Our dataset provides a set of images rendered from a given input scene, with the camera moving along a predefined path within the scene. The dataset captures a wide variety of lighting conditions through the implementation of a solar cycle. Its ground truth comprises the following elements: the 3D light direction and color intensity of the sun; the color intensity of the ambient light; the instance segmentation mask of each object; and the surface normal map, in which each pixel is assigned the 3D surface normal at that point (encoded as 3 color channels). By providing not only the light direction and intensity but also the geometric and semantic information of the rendered images, our dataset can be used not only for light estimation but also for more general tasks such as 3D geometry and shading estimation from 2D images.
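For illustration, a minimal sketch of how such an RGB-encoded normal map could be decoded back into per-pixel unit vectors, assuming the common linear mapping from [0, 255] to [-1, 1]; SynthOutdoor's exact encoding convention should be checked in the dataset documentation, and the file name is hypothetical:

```python
# Sketch: recover per-pixel 3D surface normals from an RGB-encoded normal map.
# Assumes the common linear encoding n = 2 * (rgb / 255) - 1; the dataset's
# actual convention may differ -- verify against its documentation.
import numpy as np
from PIL import Image

def decode_normal_map(path: str) -> np.ndarray:
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32)
    normals = rgb / 127.5 - 1.0                   # map [0, 255] -> [-1, 1]
    norms = np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals / np.clip(norms, 1e-6, None)   # re-normalise to unit length

normals = decode_normal_map("frame_0001_normals.png")  # hypothetical file
```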

2.
Med Image Anal ; 93: 103090, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38241763

ABSTRACT

Many clinical and research studies of the human brain require accurate structural MRI segmentation. While traditional atlas-based methods can be applied to volumes from any acquisition site, recent deep learning algorithms ensure high accuracy only when tested on data from the same sites exploited in training (i.e., internal data). The performance degradation experienced on external data (i.e., unseen volumes from unseen sites) is due to the inter-site variability in intensity distributions and to unique artefacts caused by different MR scanner models and acquisition parameters. To mitigate this site dependency, often referred to as the scanner effect, we propose LOD-Brain, a 3D convolutional neural network with progressive levels of detail (LOD) able to segment brain data from any site. Coarser network levels are responsible for learning a robust anatomical prior helpful in identifying brain structures and their locations, while finer levels refine the model to handle site-specific intensity distributions and anatomical variations. We ensure robustness across sites by training the model on an unprecedentedly rich dataset aggregating data from open repositories: almost 27,000 T1w volumes from around 160 acquisition sites, at 1.5-3T, from a population aged 8 to 90 years. Extensive tests demonstrate that LOD-Brain produces state-of-the-art results, with no significant difference in performance between internal and external sites, and robustness to challenging anatomical variations. Its portability paves the way for large-scale applications across different healthcare institutions, patient populations, and imaging technology manufacturers. Code, model, and demo are available on the project website.
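A minimal PyTorch sketch of the coarse-to-fine idea described above; this illustrates progressive levels of detail only and is not the published LOD-Brain architecture (all layer sizes are invented):

```python
# Two-level coarse-to-fine 3D segmentation sketch: a coarse branch learns a
# spatial prior on a downsampled volume; a fine branch refines it at full
# resolution. Illustrative only -- not the authors' network.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoLevelLODNet(nn.Module):
    def __init__(self, n_classes: int = 7):
        super().__init__()
        # Coarse level: sees a downsampled volume, learns an anatomical prior.
        self.coarse = nn.Sequential(
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv3d(8, 8, 3, padding=1), nn.ReLU(),
        )
        # Fine level: full resolution, refines using the upsampled prior.
        self.fine = nn.Sequential(
            nn.Conv3d(1 + 8, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, n_classes, 1),
        )

    def forward(self, x):                         # x: (B, 1, D, H, W)
        low = F.avg_pool3d(x, 2)                  # downsample for coarse level
        prior = self.coarse(low)
        prior = F.interpolate(prior, size=x.shape[2:], mode="trilinear",
                              align_corners=False)
        return self.fine(torch.cat([x, prior], dim=1))  # per-voxel logits
```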


Subjects
Magnetic Resonance Imaging; Neuroimaging; Humans; Child; Adolescent; Young Adult; Adult; Middle Aged; Aged; Aged, 80 and over; Brain/diagnostic imaging; Algorithms; Artifacts
3.
Data Brief ; 51: 109627, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37822886

ABSTRACT

The position and orientation of the camera in relation to the subject(s) in a movie scene, namely the camera "level" and camera "angle", are essential features in the film-making process due to their influence on the viewer's perception of the scene. We provide a database containing camera feature annotations for camera angle and camera level, covering about 25,000 image frames. Frames are sampled from a wide range of movies, freely available images, and shots from cinematographic websites, and are annotated on five camera-angle categories - Overhead, High, Neutral, Low, and Dutch - and on six camera-level classes: Aerial, Eye, Shoulder, Hip, Knee, and Ground level. This dataset is an extension of the CineScale dataset [1], which contains movie frames and related annotations regarding shot scale. The CineScale2 database enables AI-driven interpretation of shot scale data and opens the door to a large set of research activities related to the automatic visual analysis of cinematic material, such as movie stylistic analysis, video recommendation, and media psychology. To these purposes, we also provide the model and the code for building a Convolutional Neural Network (CNN) architecture for automated camera feature recognition. All the material is provided on the project website; video frames can also be provided upon request to the authors, for research purposes under fair use.
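As an illustration of the multi-task setup such a network could use, here is a hedged PyTorch sketch with a shared backbone and one head per annotation type; the released CineScale2 model may be structured quite differently:

```python
# Sketch: two-head classifier for camera angle (5 classes) and camera level
# (6 classes) on a shared image backbone. Illustrative only.
import torch.nn as nn
from torchvision.models import resnet18

class CameraFeatureNet(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights=None)   # requires torchvision >= 0.13
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()         # expose shared 512-d features
        self.backbone = backbone
        self.angle_head = nn.Linear(feat_dim, 5)  # Overhead/High/Neutral/Low/Dutch
        self.level_head = nn.Linear(feat_dim, 6)  # Aerial/Eye/Shoulder/Hip/Knee/Ground

    def forward(self, x):                   # x: (B, 3, H, W)
        f = self.backbone(x)
        return self.angle_head(f), self.level_head(f)
```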

4.
Hum Brain Mapp ; 42(17): 5563-5580, 2021 12 01.
Article in English | MEDLINE | ID: mdl-34598307

ABSTRACT

Ultra-high-field magnetic resonance imaging (MRI) enables sub-millimetre resolution imaging of the human brain, allowing the study of functional circuits of cortical layers at the meso-scale. An essential step in many functional and structural neuroimaging studies is segmentation, the operation of partitioning the MR images into anatomical structures. Despite recent efforts in brain imaging analysis, the literature lacks accurate and fast methods for segmenting 7-tesla (7T) brain MRI. We here present CEREBRUM-7T, an optimised end-to-end convolutional neural network, which allows fully automatic segmentation of a whole 7T T1w MRI brain volume at once, without partitioning the volume, pre-processing it, or aligning it to an atlas. The trained model is able to produce accurate multi-structure segmentation masks on six different classes plus background in only a few seconds. The experimental part, a combination of objective numerical evaluations and subjective analysis, confirms that the proposed solution outperforms the training labels it was trained on and is suitable for neuroimaging studies, such as layer functional MRI studies. Taking advantage of a fine-tuning operation on a reduced set of volumes, we also show how CEREBRUM-7T can be effectively applied to data from different sites. Furthermore, we release the code, 7T data, and other materials, including the training labels and the Turing test.
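A sketch of the fine-tuning idea under stated assumptions: it presumes the pretrained network exposes its final refinement block as `model.fine` (a hypothetical name, matching the illustrative sketch under item 2) and freezes everything else; the actual CEREBRUM-7T fine-tuning recipe is in the released code:

```python
# Sketch: site adaptation by fine-tuning only the last block of a pretrained
# segmentation network on a handful of volumes from a new site. The attribute
# name `model.fine` is hypothetical.
import torch

def fine_tune(model, loader, epochs: int = 5, lr: float = 1e-4):
    for p in model.parameters():            # freeze everything...
        p.requires_grad = False
    for p in model.fine.parameters():       # ...except the final block
        p.requires_grad = True
    opt = torch.optim.Adam(
        [p for p in model.parameters() if p.requires_grad], lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for volume, labels in loader:       # (B,1,D,H,W) float, (B,D,H,W) long
            opt.zero_grad()
            loss = loss_fn(model(volume), labels)
            loss.backward()
            opt.step()
```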


Subjects
Brain/anatomy & histology; Brain/diagnostic imaging; Image Processing, Computer-Assisted/methods; Magnetic Resonance Imaging/methods; Neural Networks, Computer; Neuroimaging/methods; Humans
5.
Data Brief ; 36: 107002, 2021 Jun.
Article in English | MEDLINE | ID: mdl-33997191

ABSTRACT

We provide a database containing shot scale annotations (i.e., the apparent distance of the camera from the subject of a filmed scene) for more than 792,000 image frames. Frames belong to 124 full movies from the entire filmographies of 6 important directors: Martin Scorsese, Jean-Luc Godard, Béla Tarr, Federico Fellini, Michelangelo Antonioni, and Ingmar Bergman. Each frame, extracted from the videos at 1 frame per second, is annotated on the following scale categories: Extreme Close Up (ECU), Close Up (CU), Medium Close Up (MCU), Medium Shot (MS), Medium Long Shot (MLS), Long Shot (LS), Extreme Long Shot (ELS), Foreground Shot (FS), and Insert Shot (IS). Two independent coders annotated all frames from the 124 movies, whilst a third checked their coding and made decisions in cases of disagreement. The CineScale database enables AI-driven interpretation of shot scale data and opens the door to a large set of research activities related to the automatic visual analysis of cinematic material, such as the automatic recognition of a director's style, or the unfolding of the relationship between shot scale and viewers' emotional experience. To these purposes, we also provide the model and the code for building a Convolutional Neural Network (CNN) architecture for automated shot scale recognition. All this material is provided through the project website, where video frames can also be requested from the authors, for research purposes under fair use.
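For reference, a minimal OpenCV sketch of the 1-frame-per-second sampling procedure described above (the video path is a placeholder):

```python
# Sketch: sample roughly one frame per second from a movie file, mirroring
# the dataset's extraction procedure.
import cv2

def sample_frames_1fps(video_path: str):
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0   # fall back if FPS is unknown
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % round(fps) == 0:             # keep ~1 frame per second
            frames.append(frame)
        idx += 1
    cap.release()
    return frames

frames = sample_frames_1fps("movie.mp4")      # placeholder path
```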

6.
Med Image Anal ; 71: 102046, 2021 07.
Article in English | MEDLINE | ID: mdl-33862337

ABSTRACT

In this work we design an end-to-end deep learning architecture for predicting, from chest X-ray images (CXR), a multi-regional score conveying the degree of lung compromise in COVID-19 patients. This semi-quantitative scoring system, namely the Brixia score, is applied in the serial monitoring of such patients, showing significant prognostic value, in one of the hospitals that experienced one of the highest pandemic peaks in Italy. To solve this challenging visual task, we adopt a weakly supervised learning strategy structured to handle different tasks (segmentation, spatial alignment, and score estimation) trained with a "from-the-part-to-the-whole" procedure involving different datasets. In particular, we exploit a clinical dataset of almost 5,000 CXR annotated images collected in the same hospital. Our BS-Net demonstrates self-attentive behavior and a high degree of accuracy in all processing stages. Through inter-rater agreement tests and a gold-standard comparison, we show that our solution outperforms single human annotators in rating accuracy and consistency, thus supporting the possibility of using this tool in contexts of computer-assisted monitoring. Highly resolved (super-pixel level) explainability maps are also generated, with an original technique, to visually help the understanding of the network activity on the lung areas. We also consider other scores proposed in the literature and provide a comparison with a recently proposed non-specific approach. Finally, we test the performance robustness of our model on an assorted public COVID-19 dataset, for which we also provide Brixia score annotations, observing good direct generalization and fine-tuning capabilities that highlight the portability of BS-Net to other clinical settings. The CXR dataset, along with the source code and the trained model, is publicly released for research purposes.
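Context for the score itself: the Brixia system rates six lung zones from 0 to 3 and sums them into a global score of 0-18. A minimal sketch of that aggregation step, assuming per-zone class probabilities from a network such as BS-Net:

```python
# Sketch: turn per-region predictions into a global Brixia score. Each of six
# lung zones is scored 0-3; the global score is their sum (range 0-18).
import numpy as np

def global_brixia_score(region_probs: np.ndarray) -> int:
    # region_probs: (6, 4) array, one row per lung zone, one column per
    # severity class 0..3 (e.g. softmax outputs from a scoring network).
    per_region = region_probs.argmax(axis=1)   # severity per zone
    return int(per_region.sum())               # global score in [0, 18]

print(global_brixia_score(np.eye(4)[[0, 1, 3, 2, 1, 0]]))  # toy input -> 7
```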


Subjects
COVID-19; Deep Learning; Radiography, Thoracic; COVID-19/diagnostic imaging; Humans; SARS-CoV-2; X-Rays
7.
Int J Cosmet Sci ; 43(4): 405-418, 2021 Aug.
Article in English | MEDLINE | ID: mdl-33848366

ABSTRACT

OBJECTIVE: The first objective of this study was to apply computer vision and machine learning techniques to quantify the effects of haircare treatments on hair assembly and to identify correctly whether unknown tresses were treated or not. The second objective was to explore and compare the performance of human assessment with that obtained from artificial intelligence (AI) algorithms. METHODS: Machine learning was applied to a data set of hair tress images (virgin and bleached), both untreated and treated with a shampoo and conditioner set aimed at increasing hair volume whilst improving alignment and reducing the flyaway of the hair. The automatic quantification of the following hair image features was conducted: local and global hair volumes and hair alignment. These features were assessed at three time points: t0 (no treatment), t1 (two treatments) and t2 (three treatments). Classifier tests were applied to test the accuracy of the machine learning. A sensory test (paired comparison of t0 vs t2) and an online front-image-based survey (paired comparisons of t0 vs t1, t1 vs t2, and t0 vs t2) were conducted to compare human assessment with that of the algorithms. RESULTS: The automatic image analysis identified changes to hair volume and alignment which enabled the successful application of the classification tests, especially when the hair images were grouped into untreated and treated groups. The human assessment of hair presented in pairs confirmed the automatic image analysis. The image assessment for both virgin and bleached hair only partially agreed with the analysis of the subset of images used in the online survey. One hypothesis is that the treatments somewhat changed the shape of the hair tress, with the effect being more pronounced in bleached hair. This made human assessment of flat images more challenging than when hair was viewed directly in 3D. Overall, the bleached hair exhibited effects of higher magnitude than the virgin hair. CONCLUSIONS: This study illustrated the capacity of artificial intelligence for hair image detection and classification, and for image analysis of hair assembly features following treatments. The human assessment partially confirmed the image analysis and highlighted the challenges imposed by the presentation mode.
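A hedged sketch of the kind of classifier test mentioned above, using placeholder features and labels (the actual features, classifier, and protocol are those described in the paper):

```python
# Sketch: predict treated vs. untreated tresses from extracted image features
# (e.g. global volume, local volume, alignment). X and y here are random
# placeholders, so the reported accuracy is chance level by construction.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))        # [global volume, local volume, alignment]
y = rng.integers(0, 2, size=120)     # 0 = untreated, 1 = treated (dummy labels)

clf = SVC(kernel="rbf")
print(cross_val_score(clf, X, y, cv=5).mean())
```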




Subjects
Artificial Intelligence; Hair/chemistry; Algorithms; Humans; Proof of Concept Study
8.
Data Brief ; 34: 106635, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33364270

ABSTRACT

The provided database of 260 ECG signals was collected from patients with out-of-hospital cardiac arrest while treated by the emergency medical services. Each ECG signal contains a 9-second waveform showing ventricular fibrillation, followed by 1 min of post-shock waveform. Patients' ECGs are made available in multiple formats. All ECGs recorded during the prehospital treatment are provided as PDF files, after being anonymized, printed on paper, and scanned. For each ECG, the dataset also includes the whole digitized waveform (9 s pre-shock and 1 min post-shock) and numerous features in the temporal and frequency domains extracted from the 9-s episode immediately prior to the first defibrillation shock. Based on the shock outcome, each ECG file has been annotated by three expert cardiologists, using majority decision, as successful (56 cases), unsuccessful (195 cases), or indeterminable (9 cases). The code for preprocessing, for feature extraction, and for limiting the investigation to different temporal intervals before the shock is also provided. These data could be reused to design algorithms that predict shock outcome based on ventricular fibrillation analysis, with the goal of optimizing the defibrillation strategy (immediate defibrillation versus cardiopulmonary resuscitation and/or drug administration) to enhance resuscitation.
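As one example of a frequency-domain VF feature of the kind this dataset supports, below is a sketch computing AMSA (amplitude spectrum area) from the 9-s pre-shock episode; whether AMSA is among the released features is an assumption, and the band limits follow common usage in the defibrillation literature:

```python
# Sketch: AMSA = sum of spectral amplitude * frequency over a VF-relevant
# band (commonly ~2-48 Hz), a standard shock-outcome predictor.
import numpy as np

def amsa(ecg: np.ndarray, fs: float, f_lo: float = 2.0, f_hi: float = 48.0) -> float:
    ecg = ecg - ecg.mean()                              # remove DC offset
    spectrum = np.abs(np.fft.rfft(ecg * np.hanning(ecg.size)))
    freqs = np.fft.rfftfreq(ecg.size, d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return float(np.sum(spectrum[band] * freqs[band]))

fs = 250.0                                              # assumed sampling rate
print(amsa(np.random.default_rng(0).normal(size=int(9 * fs)), fs))  # toy signal
```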

9.
Data Brief ; 31: 105964, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32671161

ABSTRACT

The published database is composed of 1,080 images taken from 120 hair tresses made of medium blond, fine Caucasian hair, with the aim of facilitating quantitative and qualitative studies of shampoo and conditioner efficacy. Two types of hair tresses were used: Caucasian hair which had not been subjected to oxidation with bleaching agents - virgin (60 tresses) - and Caucasian hair previously subjected to light oxidative bleaching - lightly bleached (the remaining 60 tresses). Since cosmetic products such as shampoos and conditioners are often designed to subtly augment hair assembly features via the carefully balanced cumulative effects of deposited actives, each tress was subjected to consecutive washing + conditioning + drying cycles, referred to as a cosmetic treatment. The shampoo and conditioner used for this project were specifically selected for their suitability for fine hair. Each tress was photographed at three different time points: before the cosmetic treatment, after two cosmetic treatments, and after an additional third cosmetic treatment. At each time point, each tress was photographed from three different angles (-45, 0, and +45°), resulting in a total of nine images per tress. For each image in the database, we also provide a corresponding hair segmentation mask, which identifies the hair area in the original image.
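A small sketch of how the provided masks could be used, assuming binary masks with nonzero pixels marking hair (the file name is hypothetical):

```python
# Sketch: measure hair area and a per-row width profile from a segmentation
# mask -- crude proxies for the volume features such a database supports.
import numpy as np
from PIL import Image

mask = np.asarray(Image.open("tress_mask.png").convert("L")) > 0  # placeholder file
hair_area = int(mask.sum())              # number of hair pixels in the image
width_profile = mask.sum(axis=1)         # hair width at each image row
print(hair_area, int(width_profile.max()))
```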

10.
Med Image Anal ; 62: 101688, 2020 05.
Article in English | MEDLINE | ID: mdl-32272345

ABSTRACT

Many functional and structural neuroimaging studies call for accurate morphometric segmentation of different brain structures starting from the image intensity values of MRI scans. Current automatic (multi-)atlas-based segmentation strategies often lack accuracy on difficult-to-segment brain structures and, since these methods rely on atlas-to-scan alignment, may require long processing times. Alternatively, recent methods deploying solutions based on Convolutional Neural Networks (CNNs) enable the direct analysis of out-of-the-scanner data. However, current CNN-based solutions partition the test volume into 2D or 3D patches, which are processed independently. This entails a loss of global contextual information, thereby negatively impacting segmentation accuracy. In this work, we design and test an optimised end-to-end CNN architecture that makes the exploitation of global spatial information computationally tractable, allowing a whole MRI volume to be processed at once. We adopt a weakly supervised learning strategy by exploiting a large dataset of 947 out-of-the-scanner MR images (3 Tesla, T1-weighted, 1 mm isotropic MP-RAGE 3D sequences). The resulting model is able to produce accurate multi-structure segmentation results in only a few seconds. Different quantitative measures demonstrate an improved accuracy of our solution when compared to state-of-the-art techniques. Moreover, through a randomised survey involving expert neuroscientists, we show that subjective judgements favour our solution with respect to widely adopted atlas-based software.
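To make the whole-volume-at-once idea concrete, here is a toy sketch of single-pass 3D inference without patch partitioning; the network below is a stand-in, not the paper's architecture:

```python
# Sketch: segment an entire T1w volume in one forward pass, so every voxel's
# prediction can draw on global context, unlike independent patch processing.
import torch
import torch.nn as nn

net = nn.Sequential(                        # toy stand-in for the optimised CNN
    nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv3d(8, 8, 3, padding=1), nn.ReLU(),
    nn.Conv3d(8, 7, 1),                     # e.g. 6 structures + background
).eval()

volume = torch.randn(1, 1, 160, 192, 160)   # one whole dummy volume, no patches
with torch.no_grad():
    labels = net(volume).argmax(dim=1)      # full-volume label map in one pass
```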


Subjects
Brain; Cerebrum; Image Processing, Computer-Assisted; Magnetic Resonance Imaging; Brain/diagnostic imaging; Humans; Neural Networks, Computer
11.
J Neurosci Methods ; 328: 108319, 2019 12 01.
Article in English | MEDLINE | ID: mdl-31585315

ABSTRACT

BACKGROUND: Deep neural networks have revolutionised machine learning, with unparalleled performance in object classification. However, in brain imaging (e.g., fMRI), the direct application of Convolutional Neural Networks (CNNs) to decoding subject states or perception from imaging data seems impractical given the scarcity of available data. NEW METHOD: In this work we propose a robust method to transfer information from deep learning (DL) features to brain fMRI data with the goal of decoding. By adopting Reduced Rank Regression with Ridge Regularisation, we establish a multivariate link between the imaging data and the fully connected layer (fc7) of a CNN. We exploit the reconstructed fc7 features by performing an object image classification task on two datasets: one of the largest fMRI databases, acquired on different scanners from more than two hundred subjects watching different movie clips, and another with fMRI data acquired while watching static images. RESULTS: The fc7 features could be significantly reconstructed from the imaging data, and led to significant decoding performance. COMPARISON WITH EXISTING METHODS: The decoding based on reconstructed fc7 outperformed the decoding based on imaging data alone. CONCLUSION: In this work we show how to improve fMRI-based decoding by benefiting from the mapping between functional data and CNN features. The potential advantage of the proposed method is twofold: the extraction of stimulus representations by means of an automatic (unsupervised) procedure, and the embedding of high-dimensional neuroimaging data into a space designed for visual object discrimination, which is more manageable from a dimensionality point of view.
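A sketch of one standard construction of reduced-rank ridge regression (a ridge fit followed by an SVD-based rank constraint on the fitted values); the paper's exact estimator and hyperparameters may differ, and all sizes below are toy:

```python
# Sketch: map fMRI features X to CNN fc7 targets Y with ridge regression,
# then project the coefficients onto the top-r right singular vectors of the
# fitted values to obtain a rank-r solution.
import numpy as np

def reduced_rank_ridge(X, Y, lam: float, rank: int):
    n, p = X.shape
    B_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ Y)
    _, _, Vt = np.linalg.svd(X @ B_ridge, full_matrices=False)
    V_r = Vt[:rank].T                    # top-r right singular vectors
    return B_ridge @ V_r @ V_r.T         # rank-r coefficient matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))           # 200 samples x 50 voxels (toy sizes)
Y = rng.normal(size=(200, 4096))         # fc7 has 4096 units in AlexNet/VGG
B = reduced_rank_ridge(X, Y, lam=10.0, rank=10)
fc7_hat = X @ B                          # reconstructed fc7 features
```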


Subjects
Brain Mapping/methods; Brain/physiology; Deep Learning; Image Processing, Computer-Assisted/methods; Magnetic Resonance Imaging/methods; Transfer, Psychology; Visual Perception/physiology; Adult; Brain/diagnostic imaging; Humans
12.
Data Brief ; 24: 103881, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31008162

ABSTRACT

The FASSEG repository is composed of four subsets containing face images useful for training and testing automatic methods for the task of face segmentation. Three subsets, namely frontal01, frontal02, and frontal03, are specifically built for frontal face segmentation. Frontal01 contains 70 original RGB images and the corresponding roughly labelled ground-truth masks. Frontal02 contains the same image data, with high-precision labelled ground-truth masks. Frontal03 consists of 150 annotated face masks of twins captured under various orientations, illumination conditions, and facial expressions. The last subset, namely multipose01, contains more than 200 faces in multiple poses and the corresponding ground-truth masks. For all face images, ground-truth masks are labelled on six classes (mouth, nose, eyes, hair, skin, and background).
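A small sketch of how an RGB ground-truth mask could be converted to integer class labels for training; the colour-to-class table is entirely hypothetical and should be replaced with FASSEG's documented coding:

```python
# Sketch: map mask colours to the six FASSEG classes. The RGB values below
# are placeholders -- consult the repository for the real colour coding.
import numpy as np
from PIL import Image

COLOURS = {                               # hypothetical RGB -> class mapping
    (0, 0, 0): "background", (255, 0, 0): "mouth", (0, 255, 0): "nose",
    (0, 0, 255): "eyes", (255, 255, 0): "hair", (255, 0, 255): "skin",
}

def mask_to_labels(path: str) -> np.ndarray:
    rgb = np.asarray(Image.open(path).convert("RGB"))
    labels = np.zeros(rgb.shape[:2], dtype=np.uint8)
    for idx, colour in enumerate(COLOURS):
        labels[np.all(rgb == colour, axis=-1)] = idx
    return labels
```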

13.
J Imaging ; 5(5)2019 May 08.
Article in English | MEDLINE | ID: mdl-34460490

ABSTRACT

Modern hyperspectral imaging systems produce huge datasets potentially conveying a great abundance of information; such a resource, however, poses many challenges in the analysis and interpretation of the data. Deep learning approaches certainly offer a great variety of opportunities for solving classical imaging tasks, and also for approaching new, stimulating problems in the spatial-spectral domain. This is fundamental in the driving sector of Remote Sensing, where hyperspectral technology was born and has mostly developed, but it is perhaps even more true in the multitude of current and evolving application sectors that involve these imaging technologies. The present review proceeds on two fronts: on the one hand, it is aimed at domain professionals who want an updated overview of how hyperspectral acquisition techniques can be combined with deep learning architectures to solve specific tasks in different application fields; on the other hand, it targets machine learning and computer vision experts by giving them a picture of how deep learning technologies are applied to hyperspectral data from a multidisciplinary perspective. The presence of these two viewpoints and the inclusion of application fields other than Remote Sensing are the original contributions of this review, which also highlights some potentialities and critical issues related to the observed development trends.

14.
Neuroimage ; 163: 244-263, 2017 12.
Article in English | MEDLINE | ID: mdl-28939433

ABSTRACT

Major methodological advancements have recently been made in the field of neural decoding, which is concerned with the reconstruction of mental content from neuroimaging measures. However, in the absence of a large-scale examination of the validity of decoding models across subjects and content, the extent to which these models can be generalized is not clear. This study addresses the challenge of producing generalizable decoding models, which allow the reconstruction of perceived audiovisual features from human functional magnetic resonance imaging (fMRI) data without prior training of the algorithm on the decoded content. We applied an adapted version of kernel ridge regression combined with temporal optimization on data acquired during film viewing (234 runs) to generate standardized brain models for sound loudness, speech presence, perceived motion, face-to-frame ratio, lightness, and color brightness. The prediction accuracies were tested on data collected from different subjects watching other movies, mainly in another scanner. Substantial and significant (Q(FDR) < 0.05) correlations between the reconstructed and the original descriptors were found for the first three features (loudness, speech, and motion) in all of the 9 test movies (mean R = 0.62, 0.60, and 0.60, respectively), with high reproducibility of the predictors across subjects. The face-ratio model produced significant correlations in 7 out of 8 movies (mean R = 0.56). The lightness and brightness models did not show robustness (mean R = 0.23 and 0, respectively). Further analysis of additional data (95 runs) indicated that loudness reconstruction veridicality can consistently reveal relevant group differences in musical experience. The findings point to the validity and generalizability of our loudness, speech, motion, and face-ratio models for complex cinematic stimuli (as well as for music in the case of loudness). While future research should further validate these models using controlled stimuli and explore the feasibility of extracting more complex models via this method, the reliability of our results indicates the potential usefulness of the approach and the resulting models in basic scientific and diagnostic contexts.
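For orientation, a minimal kernel ridge regression sketch in the spirit of the modelling family used here, with dummy data; the paper's method additionally involves temporal optimisation and cross-subject, cross-scanner testing:

```python
# Sketch: kernel ridge regression predicting a time-resolved stimulus feature
# (e.g. loudness) from fMRI responses, evaluated on held-out data. All data
# below are random placeholders with toy dimensions.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
X_train = rng.normal(size=(400, 1000))   # 400 time points x 1000 voxels
y_train = rng.normal(size=400)           # loudness per time point (dummy)
X_test = rng.normal(size=(100, 1000))    # held-out time points

model = KernelRidge(kernel="linear", alpha=1.0).fit(X_train, y_train)
y_hat = model.predict(X_test)            # reconstructed loudness time course
```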


Subjects
Brain Mapping/methods; Brain/physiology; Image Interpretation, Computer-Assisted/methods; Magnetic Resonance Imaging/methods; Algorithms; Humans