Pesquisa | BVS Doenças Infecciosas e Parasitárias

Optimizing Strawberry Disease and Quality Detection with Vision Transformers and Attention-Based Convolutional Neural Networks.

Aghamohammadesmaeilketabforoosh, Kimia; Nikan, Soodeh; Antonini, Giorgio; Pearce, Joshua M.

Foods ; 13(12)2024 Jun 14.

Artigo em Inglês | MEDLINE | ID: mdl-38928810

RESUMO

Machine learning and computer vision have proven to be valuable tools for farmers to streamline their resource utilization to lead to more sustainable and efficient agricultural production. These techniques have been applied to strawberry cultivation in the past with limited success. To build on this past work, in this study, two separate sets of strawberry images, along with their associated diseases, were collected and subjected to resizing and augmentation. Subsequently, a combined dataset consisting of nine classes was utilized to fine-tune three distinct pretrained models: vision transformer (ViT), MobileNetV2, and ResNet18. To address the imbalanced class distribution in the dataset, each class was assigned weights to ensure nearly equal impact during the training process. To enhance the outcomes, new images were generated by removing backgrounds, reducing noise, and flipping them. The performances of ViT, MobileNetV2, and ResNet18 were compared after being selected. Customization specific to the task was applied to all three algorithms, and their performances were assessed. Throughout this experiment, none of the layers were frozen, ensuring all layers remained active during training. Attention heads were incorporated into the first five and last five layers of MobileNetV2 and ResNet18, while the architecture of ViT was modified. The results indicated accuracy factors of 98.4%, 98.1%, and 97.9% for ViT, MobileNetV2, and ResNet18, respectively. Despite the data being imbalanced, the precision, which indicates the proportion of correctly identified positive instances among all predicted positive instances, approached nearly 99% with the ViT. MobileNetV2 and ResNet18 demonstrated similar results. Overall, the analysis revealed that the vision transformer model exhibited superior performance in strawberry ripeness and disease classification. The inclusion of attention heads in the early layers of ResNet18 and MobileNet18, along with the inherent attention mechanism in ViT, improved the accuracy of image identification. These findings offer the potential for farmers to enhance strawberry cultivation through passive camera monitoring alone, promoting the health and well-being of the population.

PWD-3DNet: A Deep Learning-Based Fully-Automated Segmentation of Multiple Structures on Temporal Bone CT Scans.

Nikan, Soodeh; Van Osch, Kylen; Bartling, Mandolin; Allen, Daniel G; Rohani, S Alireza; Connors, Ben; Agrawal, Sumit K; Ladak, Hanif M.

IEEE Trans Image Process ; 30: 739-753, 2021.

Artigo em Inglês | MEDLINE | ID: mdl-33226942

RESUMO

The temporal bone is a part of the lateral skull surface that contains organs responsible for hearing and balance. Mastering surgery of the temporal bone is challenging because of this complex and microscopic three-dimensional anatomy. Segmentation of intra-temporal anatomy based on computed tomography (CT) images is necessary for applications such as surgical training and rehearsal, amongst others. However, temporal bone segmentation is challenging due to the similar intensities and complicated anatomical relationships among critical structures, undetectable small structures on standard clinical CT, and the amount of time required for manual segmentation. This paper describes a single multi-class deep learning-based pipeline as the first fully automated algorithm for segmenting multiple temporal bone structures from CT volumes, including the sigmoid sinus, facial nerve, inner ear, malleus, incus, stapes, internal carotid artery and internal auditory canal. The proposed fully convolutional network, PWD-3DNet, is a patch-wise densely connected (PWD) three-dimensional (3D) network. The accuracy and speed of the proposed algorithm was shown to surpass current manual and semi-automated segmentation techniques. The experimental results yielded significantly high Dice similarity scores and low Hausdorff distances for all temporal bone structures with an average of 86% and 0.755 millimeter (mm), respectively. We illustrated that overlapping in the inference sub-volumes improves the segmentation performance. Moreover, we proposed augmentation layers by using samples with various transformations and image artefacts to increase the robustness of PWD-3DNet against image acquisition protocols, such as smoothing caused by soft tissue scanner settings and larger voxel sizes used for radiation reduction. The proposed algorithm was tested on low-resolution CTs acquired by another center with different scanner parameters than the ones used to create the algorithm and shows potential for application beyond the particular training data used in the study.

Assuntos

Aprendizado Profundo , Processamento de Imagem Assistida por Computador/métodos , Osso Temporal/diagnóstico por imagem , Tomografia Computadorizada por Raios X/métodos , Algoritmos , Humanos

RESUMO

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA