Results 1 - 20 of 71
1.
Sensors (Basel) ; 24(6)2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38544251

ABSTRACT

Restricted mouth opening (trismus) is one of the most common complications following head and neck cancer treatment. Early initiation of mouth-opening exercises is crucial for preventing or minimizing trismus. Current methods for these exercises predominantly involve finger exercises and traditional mouth-opening training devices. Our research group designed an intelligent mouth-opening training device (IMOTD) that addresses the limitations of traditional home training methods, including the inability to quantify mouth-opening exercises, a lack of guided training resulting in temporomandibular joint injuries, and poor training continuity leading to poor training outcomes. For this device, an interactive remote guidance mode is introduced to address these concerns. The device was designed with a focus on the safety and effectiveness of medical devices. The accuracy of the training data was verified through piezoelectric sensor calibration. Through mechanical analysis, the stress points of the structure were identified, and finite element analysis of the connecting rod and the occlusal plate connection structure was conducted to ensure the safety of the device. Preclinical experiments comparing the device with conventional mouth-opening training methods support its effectiveness in rehabilitation. This intelligent device facilitates the quantification and visualization of mouth-opening training indicators, ensuring both the comfort and safety of the training process. Additionally, it enables remote supervision and guidance for patient training, thereby enhancing patient compliance and ultimately ensuring the effectiveness of mouth-opening exercises.
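The calibration step mentioned above can be illustrated with a minimal sketch (not the IMOTD's actual procedure): fit a linear calibration curve that maps raw piezoelectric readings to reference mouth-opening distances. All values below are made-up placeholders.

```python
import numpy as np

# hypothetical calibration pairs: raw piezoelectric readings vs. reference opening (mm)
raw = np.array([0.12, 0.25, 0.41, 0.58, 0.73, 0.90])       # sensor output (V)
reference = np.array([5.0, 10.0, 15.0, 20.0, 25.0, 30.0])  # reference opening measured externally (mm)

slope, intercept = np.polyfit(raw, reference, deg=1)        # linear calibration curve
calibrated = slope * raw + intercept
print("max calibration error (mm):", np.max(np.abs(calibrated - reference)))
```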


Subjects
Head and Neck Neoplasms, Trismus, Humans, Trismus/etiology, Trismus/rehabilitation, Exercise Therapy/methods, Exercise, Mouth
2.
Sensors (Basel) ; 23(10)2023 May 13.
Article in English | MEDLINE | ID: mdl-37430638

ABSTRACT

New CMOS imaging sensor (CIS) techniques in smartphones have helped user-generated content overtake traditional DSLR photography in our daily lives. However, tiny sensor sizes and fixed focal lengths also lead to more grainy details, especially in zoom photos. Moreover, multi-frame stacking and post-sharpening algorithms can produce zigzag textures and over-sharpened appearances, whose quality traditional image-quality metrics may over-estimate. To solve this problem, a real-world zoom photo database is first constructed in this paper, which includes 900 tele-photos from 20 different mobile sensors and ISPs. We then propose a novel no-reference zoom quality metric which incorporates the traditional estimation of sharpness and the concept of image naturalness. More specifically, for the measurement of image sharpness, we are the first to combine the total energy of the predicted gradient image with the entropy of the residual term under the framework of free-energy theory. To further compensate for the influence of the over-sharpening effect and other artifacts, a set of model parameters of mean-subtracted contrast-normalized (MSCN) coefficients are utilized as natural-statistics representatives. Finally, these two measures are combined linearly. Experimental results on the zoom photo database demonstrate that our quality metric achieves SROCC and PLCC over 0.91, while the performance of a single sharpness or naturalness index is around 0.85. Moreover, compared with the best tested general-purpose and sharpness models, our zoom metric outperforms them by 0.072 and 0.064 in SROCC, respectively.
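As a hedged illustration of the two ingredients named above, and not the authors' implementation, the sketch below computes MSCN coefficients and a crude gradient-energy sharpness term, then fuses them linearly; the weights and the simplified proxies are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def mscn(img, sigma=7 / 6, c=1.0):
    """Mean-subtracted contrast-normalized coefficients of a grayscale image."""
    mu = gaussian_filter(img, sigma)
    var = gaussian_filter(img * img, sigma) - mu * mu
    return (img - mu) / (np.sqrt(np.abs(var)) + c)

def zoom_quality(img, w1=0.5, w2=0.5):
    img = img.astype(np.float64)
    sharpness = np.mean(sobel(img, 0) ** 2 + sobel(img, 1) ** 2)  # crude gradient-energy term
    naturalness = -abs(np.var(mscn(img)) - 1.0)                   # crude MSCN-statistics term
    return w1 * sharpness + w2 * naturalness                      # linear fusion of the two measures

score = zoom_quality(np.random.default_rng(0).random((256, 256)))
```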

3.
Eur Radiol ; 31(7): 5032-5040, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33439312

ABSTRACT

OBJECTIVES: To develop a radiomics model using preoperative multiphasic CT for predicting distant metastasis after surgical resection in patients with localized clear cell renal cell carcinoma (ccRCC) and to identify key biological pathways underlying the predictive radiomics features using RNA sequencing data. METHODS: In this multi-institutional retrospective study, a CT radiomics metastasis score (RMS) was developed from a radiomics analysis cohort (n = 184) for distant metastasis prediction. Using a gene expression analysis cohort (n = 326), radiomics-associated gene modules were identified. Based on a radiogenomics discovery cohort (n = 42), key biological pathways were enriched from the gene modules. Furthermore, a multigene signature associated with RMS was constructed and validated on an independent radiogenomics validation cohort (n = 37). RESULTS: The 9-feature-based RMS predicted distant metastasis with an AUC of 0.861 in the validation set and was independent of clinical factors (p < 0.001). A gene module comprising 114 genes was identified to be associated with all nine radiomics features (p < 0.05). Four enriched pathways were identified, including the ECM-receptor interaction, focal adhesion, protein digestion and absorption, and PI3K-Akt pathways, most of which play important roles in tumor progression and metastasis. A 19-gene signature was constructed from the radiomics-associated gene module and predicted metastasis with an AUC of 0.843 in the radiogenomics validation cohort. CONCLUSIONS: CT radiomics features can predict distant metastasis after surgical resection of localized ccRCC, and the predictive radiomics phenotypes may be driven by key biological pathways related to cancer progression and metastasis. KEY POINTS: • Radiomics features from the primary tumor in preoperative CT predicted distant metastasis after surgical resection in patients with localized ccRCC. • CT radiomics features predictive of distant metastasis were associated with key signaling pathways related to tumor progression and metastasis. • A gene signature associated with the radiomics metastasis score predicted distant metastasis in localized ccRCC.
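The general radiomics-score workflow described in the methods can be sketched as follows; this is an illustrative stand-in (feature extraction, e.g. with pyradiomics, is assumed to have been done already, and the arrays are random placeholders rather than the study's data).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(184, 9)), rng.integers(0, 2, 184)  # 9 radiomics features per patient
X_val, y_val = rng.normal(size=(80, 9)), rng.integers(0, 2, 80)        # independent validation set

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
rms = model.decision_function(X_val)        # a "radiomics metastasis score" for each validation patient
print("validation AUC:", roc_auc_score(y_val, rms))
```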


Subjects
Renal Cell Carcinoma, Kidney Neoplasms, Neoplasm Metastasis/diagnostic imaging, Renal Cell Carcinoma/diagnostic imaging, Renal Cell Carcinoma/genetics, Renal Cell Carcinoma/surgery, Humans, Kidney Neoplasms/diagnostic imaging, Kidney Neoplasms/genetics, Kidney Neoplasms/surgery, Phosphatidylinositol 3-Kinases, Retrospective Studies, X-Ray Computed Tomography
4.
BMC Ophthalmol ; 21(1): 169, 2021 Apr 09.
Article in English | MEDLINE | ID: mdl-33836706

ABSTRACT

BACKGROUND: To establish a decision model based on two-dimensional (2D) and three-dimensional (3D) eye data of patients with ptosis for developing personalized surgery plans. METHODS: Data for this retrospective, case-control study were collected from March 2019 to June 2019 at the Department of Ophthalmology, Shanghai Ninth People's Hospital, and the patients were then followed up for 3 months. One hundred fifty-two complete-feature eyes from 100 voluntary patients with ptosis and satisfactory surgical results were selected, with 48 eyes excluded because of severe conditions or improper collection and shooting angles. Three experimental schemes were set as follows: using the 2D distances alone, using the 3D distances alone, and using both distances together. The five most common evaluation indicators for binary classification problems were used to test the decision model: accuracy (ACC), precision, recall, F1-score, and area under the curve (AUC). RESULTS: For diagnostic discrimination, the recall of the "3D", "2D", and "Both" schemes was 0.875, 0.875, and 0.938, respectively, while for surgical procedure classification, the precision of the three schemes was 0.8333, 0.7778, and 1.0000. The "Both" scheme, which combined 2D and 3D data, achieved the highest values in both classifications. CONCLUSIONS: In this study, 3D eye data are introduced into clinical practice to construct a decision model for ptosis surgery. Our decision model shows excellent predictive performance, especially when 2D and 3D data are employed jointly.
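For reference, the five evaluation indicators listed above can be computed with scikit-learn as in the sketch below; the labels and scores are hypothetical stand-ins for a ptosis-surgery decision model's output, not the study's data.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 1]                    # hypothetical ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]                    # hypothetical predicted classes
y_score = [0.9, 0.2, 0.8, 0.4, 0.3, 0.7, 0.1, 0.95]  # hypothetical predicted probabilities

print("ACC      :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))
print("AUC      :", roc_auc_score(y_true, y_score))
```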


Subjects
Machine Learning, Area Under Curve, Case-Control Studies, China, Humans, Retrospective Studies
5.
Acta Radiol ; 62(1): 87-92, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32252533

ABSTRACT

BACKGROUND: Orbital computed tomography (CT) is commonly used for the diagnosis and digital evaluation of orbital diseases. Yet, this approach requires longer scanning times, increased radiation exposure, and, especially, difficult patient positioning that can affect judgment and data processing. For high-quality research on orbital imaging, computer-assisted surgery, and artificial intelligence-based diagnosis, correction of the coordinate system is a necessary procedure. Nevertheless, existing manual calibration methods are difficult to reproduce, and there is no objective evaluation system for errors. PURPOSE: To establish a method for automatic calibration of orbital CT images and to implement quantitative error evaluation. MATERIAL AND METHODS: A standard three-dimensional (3D) orbit model was manually adjusted, and optimized orbital models were reconstructed based on the initial registration of the skull-bound oriented bounding box and registration by the mutual information method. The calibration error was calculated based on the signed distance field. Seventeen cases of orbital CT were quantitatively evaluated. RESULTS: A new method for automatic calibration and quantitative error evaluation of orbital CT was established. Calibration errors within ±2 mm accounted for 81.61% ± 6.91% of the models, and errors within ±1 mm accounted for 53.49% ± 7.07%. CONCLUSION: This convenient tool for automatic orbital CT calibration may promote related quantitative research based on orbital CT. Its automated operation and small errors favor the popularization and application of the tool, and the quantitative evaluation facilitates work with other coordinate systems.
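A minimal sketch of a signed-distance-field error measure in the spirit of the one described, assuming binary voxel masks on a 1 mm isotropic grid; it is an illustration, not the paper's algorithm.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt, binary_erosion

def signed_distance_field(mask):
    """Signed distance (voxels) to the mask surface: positive outside, negative inside."""
    return distance_transform_edt(~mask) - distance_transform_edt(mask)

def fraction_within(ref_mask, test_mask, tol_mm=2.0):
    """Fraction of test-model surface voxels lying within ±tol_mm of the reference surface."""
    sdf = signed_distance_field(ref_mask)               # assumes 1 mm isotropic voxels
    surface = test_mask & ~binary_erosion(test_mask)    # surface voxels of the calibrated model
    return float(np.mean(np.abs(sdf[surface]) <= tol_mm))

ref = np.zeros((64, 64, 64), bool); ref[20:44, 20:44, 20:44] = True
test = np.roll(ref, 1, axis=0)                          # a slightly shifted model for illustration
print(fraction_within(ref, test))
```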


Subjects
Three-Dimensional Imaging/methods, Orbital Fractures/diagnostic imaging, Computer-Assisted Radiographic Image Interpretation/methods, X-Ray Computed Tomography/methods, Evaluation Studies as Topic, Humans, Biological Models, Orbit/diagnostic imaging
6.
IEEE Sens J ; 21(9): 11084-11093, 2021 May 01.
Article in English | MEDLINE | ID: mdl-36820762

ABSTRACT

Coronavirus Disease 2019 (COVID-19) has spread all over the world since it broke out massively in December 2019, causing enormous losses worldwide; both confirmed cases and deaths have reached frightening numbers. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of COVID-19, can be transmitted by small respiratory droplets. To curb its spread at the source, wearing masks is a convenient and effective measure. In most cases, people use face masks frequently but for short periods each time. To address the problem of not knowing which service stage a mask is in, we propose a detection system based on a mobile phone. We first extract four features from the gray-level co-occurrence matrices (GLCMs) of micro-photos of the face mask. Next, a three-class detection system is built using the K-Nearest Neighbor (KNN) algorithm. The results of validation experiments show that our system reaches an accuracy of 82.87% (measured by macro-measures) on the testing dataset. The precision for Type I 'normal use' and the recall for Type III 'not recommended' reach 92.00% and 92.59%, respectively. In future work, we plan to expand the detection objects to more mask types. This work demonstrates that the proposed mobile microscope system can be used as an aid for assessing face masks in use, which may play a positive role in fighting against COVID-19.
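The GLCM-plus-KNN pipeline summarized above might look roughly like the sketch below; the dummy images, the three assumed stage labels, and the default KNN settings are placeholders, not the authors' implementation.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.neighbors import KNeighborsClassifier

def glcm_features(gray_img):
    """Four Haralick-style features from a single gray-level co-occurrence matrix."""
    glcm = graycomatrix(gray_img, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    return [graycoprops(glcm, p)[0, 0]
            for p in ("contrast", "homogeneity", "energy", "correlation")]

rng = np.random.default_rng(0)
micro_photos = [rng.integers(0, 256, (64, 64), dtype=np.uint8) for _ in range(30)]  # dummy micro-photos
stages = rng.integers(0, 3, 30)   # assumed labels: 0 = normal use, 1 = intermediate, 2 = not recommended

X = np.array([glcm_features(img) for img in micro_photos])
clf = KNeighborsClassifier(n_neighbors=5).fit(X, stages)
```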

7.
J Oral Maxillofac Surg ; 78(4): 662.e1-662.e13, 2020 Apr.
Article in English | MEDLINE | ID: mdl-31857063

ABSTRACT

PURPOSE: The aim of the present study was to redetermine the position of the key points (skeletal marker points) in damaged female and male jaws to improve the accuracy of jaw reconstruction. MATERIALS AND METHODS: To develop a personalized jaw reconstruction guidance program for each patient, we first performed 3 statistical analyses to compare gender differences in the jaw. Next, we proposed and compared 3 methods for restoring the key skeletal marker points of the damaged jaw according to our statistics. RESULTS: We collected 111 groups of computed tomography data of the jaw from normal people as experimental material. Our statistical analyses showed that gender differences are present in the shape of the jaw. In addition, some key angles and distances of the jaw followed a Gaussian distribution. The reconstruction results showed that our methods yield better results than the widely used method. CONCLUSIONS: To reduce errors, gender differences should be considered when designing a reconstruction approach for the jaw. In addition, our methods can improve the accuracy of jaw reconstruction.
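As a hedged illustration of the kind of statistics described (normality of key measurements and a gender comparison), the sketch below uses simulated values for a single hypothetical jaw angle rather than the study's CT data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
angle_f = rng.normal(120.0, 5.0, 55)   # hypothetical jaw angle (degrees), female group
angle_m = rng.normal(124.0, 5.5, 56)   # hypothetical jaw angle (degrees), male group

# normality check and between-gender comparison for one measurement
print("normality (Shapiro, female) p =", stats.shapiro(angle_f).pvalue)
print("gender difference (t-test)  p =", stats.ttest_ind(angle_f, angle_m).pvalue)
```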


Subjects
Jaw, X-Ray Computed Tomography, Female, Humans, Male
8.
IEEE Sens J ; 20(22): 13674-13681, 2020 Nov 15.
Article in English | MEDLINE | ID: mdl-37974650

ABSTRACT

Coronavirus Disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has become a serious global pandemic in the past few months and has caused enormous losses to human society worldwide. For such a large-scale pandemic, early detection and isolation of potential virus carriers is essential to curb its spread. Recent studies have shown that one important feature of COVID-19 is the abnormal respiratory status caused by viral infection. During the pandemic, many people tend to wear masks to reduce the risk of getting sick. Therefore, in this paper, we propose a portable non-contact method to screen the health conditions of people wearing masks through analysis of respiratory characteristics from RGB-infrared sensors. We first develop a respiratory data capture technique for people wearing masks by using face recognition. Then, a bidirectional GRU neural network with an attention mechanism is applied to the respiratory data to obtain the health screening result. The results of validation experiments show that our model can identify respiratory health status with 83.69% accuracy, 90.23% sensitivity, and 76.31% specificity on the real-world dataset. This work demonstrates that the proposed RGB-infrared sensors on a portable device can be used as a pre-screening method for respiratory infections, which provides a theoretical basis to encourage controlled clinical trials and thus helps fight the current COVID-19 pandemic. Demo videos of the proposed system are available at: https://doi.org/10.6084/m9.figshare.12028032.
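A minimal PyTorch sketch of a bidirectional GRU with additive attention for sequence classification, in the spirit of the screening model described above; the layer sizes, input dimension, and two-class head are assumptions rather than the authors' configuration.

```python
import torch
import torch.nn as nn

class BiGRUAttention(nn.Module):
    def __init__(self, in_dim=1, hidden=64, n_classes=2):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden, batch_first=True, bidirectional=True)
        self.att = nn.Linear(2 * hidden, 1)           # attention score per time step
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                             # x: (batch, time, in_dim)
        h, _ = self.gru(x)                            # (batch, time, 2*hidden)
        w = torch.softmax(self.att(h), dim=1)         # (batch, time, 1) attention weights
        context = (w * h).sum(dim=1)                  # attention-weighted sequence summary
        return self.fc(context)

logits = BiGRUAttention()(torch.randn(8, 120, 1))     # 8 dummy respiratory waveforms, 120 samples each
```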

9.
Sensors (Basel) ; 20(3)2020 Jan 31.
Article in English | MEDLINE | ID: mdl-32023963

ABSTRACT

Advances in science and technology are playing an increasingly important role in solving difficult criminal cases. Thermal cameras can help the police crack difficult cases by capturing heat traces left on the ground by perpetrators, which cannot be spotted by the naked eye. Therefore, the purpose of this study is to establish a thermalfoot model using a thermal imaging system to estimate the departure time. To this end, we use a thermal camera to acquire the thermal sequence left on the floor and convert it into a heat signal via an image processing algorithm. We establish the thermalfoot model based on the observation that the residual temperature decreases exponentially with elapsed time, in accordance with Newton's Law of Cooling. The correlation coefficients of 107 thermalfoot models derived from the corresponding 107 heat signals are almost all above 0.99. In a validation experiment, a residual analysis shows that the residuals between estimated departure time points and ground-truth times almost all fall within a range of -150 s to +150 s. The inverse-estimation accuracy of the thermalfoot model for departure time at one-third, one-half, two-thirds, three-fourths, four-fifths, and five-sixths capture time points is 71.96%, 50.47%, 42.06%, 31.78%, 21.70%, and 11.21%, respectively. The results of comparison experiments with two subjective evaluation methods (subjective 1: directly estimating the departure time from the obtained local curves; subjective 2: using auxiliary means such as a ruler to estimate the departure time from the obtained local curves) further demonstrate the effectiveness of the thermalfoot model for inversely estimating the departure time. Experimental results also show that the thermalfoot model performs well when the time window after departure is short, whereas its accuracy drops to only about 15% when that window is long. The influence of outliers, ROI (Region of Interest) selection, ROI size, different capture time points, and environmental temperature on the performance of the thermalfoot model can be explored in future work. Overall, the thermalfoot model can help the police solve crimes to some extent, which in turn provides greater assurance for people's health, social security, and stability.
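The cooling-curve idea can be illustrated with a short sketch: fit Newton's Law of Cooling to a simulated residual-heat signal and invert the fitted model to estimate elapsed time. This is not the study's code or data; all parameter values are placeholders.

```python
import numpy as np
from scipy.optimize import curve_fit

def cooling(t, t_env, dt0, k):
    # Newton's Law of Cooling: T(t) = T_env + (T0 - T_env) * exp(-k t)
    return t_env + dt0 * np.exp(-k * t)

t = np.linspace(0, 300, 60)                                           # seconds since departure
temp = cooling(t, 24.0, 8.0, 0.02) + np.random.default_rng(0).normal(0, 0.05, t.size)

(t_env, dt0, k), _ = curve_fit(cooling, t, temp, p0=(20.0, 5.0, 0.01))

# invert the fitted model: elapsed time corresponding to an observed residual temperature
T_obs = 26.0
elapsed = -np.log((T_obs - t_env) / dt0) / k
print(f"estimated time since departure: {elapsed:.0f} s")
```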

10.
Sensors (Basel) ; 20(17)2020 Sep 02.
Article in English | MEDLINE | ID: mdl-32887391

ABSTRACT

Relying on large-scale labeled datasets, deep learning has achieved good performance in image classification tasks. In agricultural and biological engineering, however, image annotation is time-consuming and expensive, and it requires annotators to have technical skills in specific areas. Obtaining the ground truth is difficult because natural images are expensive to acquire. In addition, images in these areas are usually stored as multichannel images, such as computed tomography (CT) images, magnetic resonance images (MRI), and hyperspectral images (HSI). In this paper, we present a framework using active learning and deep learning for multichannel image classification. We use three active learning algorithms, including least confidence, margin sampling, and entropy, as the selection criteria. Based on this framework, we further introduce an "image pool" to take full advantage of images generated by data augmentation. To demonstrate the applicability of the proposed framework, we present a case study on agricultural hyperspectral image classification. The results show that the proposed framework achieves better performance than the deep learning model alone. Manually annotating the entire training set achieves an encouraging accuracy; in comparison, using the entropy-based active learning algorithm with the image pool achieves similar accuracy with only part of the training set manually annotated. In practical applications, the proposed framework can remarkably reduce labeling effort during model development and updating, and it can be applied to multichannel image classification in agricultural and biological engineering.
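The entropy-based selection step can be sketched as below; `probs` stands in for the softmax outputs of any classifier on the unlabeled pool, and the query size is arbitrary rather than taken from the paper.

```python
import numpy as np

def entropy_query(probs, n_query):
    """probs: (n_samples, n_classes) softmax outputs; returns indices of samples to annotate."""
    ent = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    return np.argsort(ent)[::-1][:n_query]          # most uncertain samples first

probs = np.random.default_rng(0).dirichlet(np.ones(4), size=100)  # dummy unlabeled pool
to_label = entropy_query(probs, n_query=10)
```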


Subjects
Deep Learning, Computer-Assisted Image Processing, Algorithms, Cost-Benefit Analysis, Magnetic Resonance Imaging
11.
Eur Radiol ; 29(8): 3996-4007, 2019 Aug.
Article in English | MEDLINE | ID: mdl-30523454

ABSTRACT

OBJECTIVES: To develop a radiomics model with all-relevant imaging features from multiphasic computed tomography (CT) for differentiating clear cell renal cell carcinoma (ccRCC) from non-ccRCC and to investigate the possible radiogenomics link between the imaging features and mutation of a key ccRCC driver gene, the von Hippel-Lindau (VHL) gene. METHODS: In this retrospective two-center study, two radiomics models were built using random forest from a training cohort (170 patients), where one model was built with all-relevant features and the other with minimum redundancy maximum relevance (mRMR) features. A model combining all-relevant features and clinical factors (sex, age) was also built. The radiogenomics association between selected features and VHL mutation was investigated by Wilcoxon rank-sum test. All models were tested on an independent validation cohort (85 patients) with ROC curve analysis. RESULTS: The model with eight all-relevant features from corticomedullary-phase CT achieved an AUC of 0.949 and an accuracy of 92.9% in the validation cohort, significantly outperforming the model with eight mRMR features (seven from the nephrographic phase and one from the corticomedullary phase), which had an AUC of 0.851 and an accuracy of 81.2%. Combining age and sex did not improve performance. Five of the eight all-relevant features were significantly associated with VHL mutation, while all eight mRMR features were significantly associated with VHL mutation (false discovery rate-adjusted p < 0.05). CONCLUSIONS: All-relevant features in corticomedullary-phase CT can be used to differentiate ccRCC from non-ccRCC. Most subtype-discriminative imaging features were found to be significantly associated with VHL mutation, which may underlie the molecular basis of the radiomics features. KEY POINTS: • All-relevant features in corticomedullary-phase CT can be used to differentiate ccRCC from non-ccRCC with high accuracy. • Most RCC-subtype-discriminative CT features were associated with mutation of the key RCC driver gene, VHL. • A radiomics model can be more accurate and interpretable when its imaging features reflect the underlying molecular basis of RCC.
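The radiogenomics association test described above (a Wilcoxon rank-sum test per feature with FDR adjustment) can be sketched with simulated data as follows; the feature matrix and mutation labels are random placeholders.

```python
import numpy as np
from scipy.stats import ranksums
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
features = rng.normal(size=(42, 8))                 # 42 tumours x 8 radiomics features (simulated)
vhl_mut = rng.integers(0, 2, 42).astype(bool)       # hypothetical VHL mutation status

pvals = [ranksums(features[vhl_mut, j], features[~vhl_mut, j]).pvalue
         for j in range(features.shape[1])]
rejected, p_fdr, *_ = multipletests(pvals, method="fdr_bh")  # Benjamini-Hochberg correction
print("features associated with VHL (FDR < 0.05):", np.where(rejected)[0])
```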


Subjects
Renal Cell Carcinoma/diagnosis, Neoplasm DNA/genetics, Kidney Neoplasms/diagnosis, Multidetector Computed Tomography/methods, Mutation, Neoplasm Staging/methods, Von Hippel-Lindau Tumor Suppressor Protein/genetics, Adult, Aged, Aged 80 and over, Renal Cell Carcinoma/genetics, Renal Cell Carcinoma/metabolism, Cell Differentiation, DNA Mutational Analysis, Differential Diagnosis, Female, Humans, Kidney Neoplasms/genetics, Kidney Neoplasms/metabolism, Male, Middle Aged, ROC Curve, Retrospective Studies, Von Hippel-Lindau Tumor Suppressor Protein/metabolism, Young Adult
12.
Biomed Eng Online ; 18(1): 111, 2019 Nov 15.
Article in English | MEDLINE | ID: mdl-31729983

ABSTRACT

BACKGROUND: Head-mounted displays (HMDs) and virtual reality (VR) have been used frequently in recent years, and a user's experience and computational efficiency can be assessed with mounted eye-trackers. However, in addition to visually induced motion sickness (VIMS), eye fatigue has increasingly emerged during and after the viewing experience, highlighting the need for quantitative assessment of these detrimental effects. As no measurement method for the eye fatigue caused by HMDs has been widely accepted, we measured parameters related to optometric tests and proposed a novel computational approach for estimating eye fatigue by providing various verifiable models. RESULTS: We implemented three classifications and two regressions to investigate different feature sets, which led to two valid assessment models for eye fatigue employing blinking features and eye movement features, with indicators from optometric tests as the ground truth. Three graded results and one continuous result were provided by the respective models, making the overall results repeatable and comparable. CONCLUSION: We showed differences between VIMS and eye fatigue, and we presented a new scheme to assess the eye fatigue of HMD users by analyzing eye tracker parameters.


Subjects
Asthenopia/diagnosis, Eye Movements, Head, Adult, Asthenopia/physiopathology, Female, Humans, Male, Middle Aged, Young Adult
13.
J Oral Maxillofac Surg ; 77(3): 664.e1-664.e16, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30598300

ABSTRACT

PURPOSE: For severe mandibular or maxillary defects across the midline, doctors often lack data on the shape of the jaws when designing virtual surgery. This study sought to repair the personalized 3-dimensional shape of the jaw, particularly when the jaw is severely damaged. MATERIALS AND METHODS: Two linear regression methods, denoted method I and method II, were used to reconstruct key points of the severely damaged maxilla or mandible based on the remaining jaw. The predictor variable was the position of key points. Outcome variables were the position of key points and the error between the predicted and actual positions. Another variable was the average error. In the final data analysis, the effect of each method was judged based on the mean error and the error probability distribution. RESULTS: Computed tomographic data of jaws from 44 normal adults in East China were collected over 2 years by the Shanghai Jiao Tong University School of Medicine (Shanghai, China). Sixteen key points were extracted for each jaw. Method I showed that 2-dimensional regression yields the best overall result and that the position error of most points can be reduced to less than 5 mm. The result of method II was similar to that of method I but showed cumulative errors. CONCLUSIONS: Linear regression can be used to locate key points. Two-dimensional regression has the best effect and can be used as a reference for developing a surgical plan and performing surgery.
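A hedged sketch of the key-point regression idea: predict the coordinates of missing landmarks from the remaining ones with linear regression. The split into 8 intact and 8 missing landmarks and the random arrays are assumptions for illustration only, not the study's 16-point protocol or data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
intact_pts = rng.normal(size=(44, 8 * 3))     # 8 remaining landmarks per jaw, flattened (x, y, z)
missing_pts = rng.normal(size=(44, 8 * 3))    # 8 landmarks to be reconstructed

reg = LinearRegression().fit(intact_pts, missing_pts)          # multi-output linear regression
pred = reg.predict(intact_pts[:1])                             # reconstruct landmarks for one jaw
err = np.linalg.norm((pred - missing_pts[:1]).reshape(-1, 3), axis=1)  # per-landmark error (mm)
```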


Subjects
Mandible, Maxilla, Adult, Cephalometry, China, Humans, Linear Models
14.
Sensors (Basel) ; 18(4)2018 Apr 07.
Article in English | MEDLINE | ID: mdl-29642454

ABSTRACT

Deep learning has become a widely used and powerful tool in many research fields, although not yet widely so in agricultural technologies. In this work, two deep convolutional neural networks (CNNs), viz. Residual Network (ResNet) and its improved version ResNeXt, are used to detect internal mechanical damage of blueberries using hyperspectral transmittance data. The original structure and size of the hypercubes are adapted for deep CNN training. To ensure that the models are applicable to hypercubes, we adjust the number of filters in the convolutional layers. Moreover, a total of 5 traditional machine learning algorithms, viz. Sequential Minimal Optimization (SMO), Linear Regression (LR), Random Forest (RF), Bagging, and Multilayer Perceptron (MLP), are run as comparison experiments. For model assessment, k-fold cross-validation is used to verify that model performance does not vary with different dataset partitions. In real-world applications, selling damaged berries leads to greater losses than discarding sound ones. Thus, precision, recall, and F1-score are also used as evaluation indicators alongside accuracy to quantify the false positive rate; the first three indicators are seldom used by investigators in the agricultural engineering domain. Furthermore, ROC curves and precision-recall curves are plotted to visualize the performance of the classifiers. The fine-tuned ResNet/ResNeXt achieve average accuracy and F1-score of 0.8844/0.8784 and 0.8952/0.8905, respectively. Classifiers SMO/LR/RF/Bagging/MLP obtain average accuracy and F1-score of 0.8082/0.7606/0.7314/0.7113/0.7827 and 0.8268/0.7796/0.7529/0.7339/0.7971, respectively. The two deep learning models achieve better classification performance than the traditional machine learning methods. Classification of each testing sample takes only 5.2 ms and 6.5 ms for ResNet and ResNeXt, respectively, indicating that the deep learning framework has great potential for online fruit sorting. The results of this study demonstrate the potential of deep CNNs for analyzing internal mechanical damage of fruit.
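One way to adapt a ResNet to multi-band hyperspectral input, as the abstract describes adjusting the convolutional filters for hypercubes, is sketched below; the band count, input size, and two-class head are assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

n_bands = 64                                          # hypothetical number of spectral bands
model = resnet18(weights=None)
# replace the RGB stem so the first convolution accepts n_bands channels
model.conv1 = nn.Conv2d(n_bands, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.fc = nn.Linear(model.fc.in_features, 2)         # sound vs. internally damaged

logits = model(torch.randn(4, n_bands, 224, 224))     # a dummy batch of 4 hypercubes
```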


Subjects
Blueberry Plants, Machine Learning, Algorithms, Neural Networks (Computer), ROC Curve, Time Factors
15.
Int Dent J ; 2024 Aug 03.
Article in English | MEDLINE | ID: mdl-39098480

ABSTRACT

INTRODUCTION AND AIMS: In the face of escalating oral cancer rates, the application of large language models like Generative Pretrained Transformer (GPT)-4 presents a novel pathway for enhancing public awareness about prevention and early detection. This research aims to explore the capabilities and possibilities of GPT-4 in addressing open-ended inquiries in the field of oral cancer. METHODS: Using 60 questions accompanied by reference answers, covering concepts, causes, treatments, nutrition, and other aspects of oral cancer, evaluators from diverse backgrounds were selected to evaluate the capabilities of GPT-4 and a customized version. A P value under .05 was considered significant. RESULTS: Analysis revealed that GPT-4 and its adaptation notably excelled in answering open-ended questions, with the majority of responses receiving high scores. Although the median score for standard GPT-4 was marginally better, statistical tests showed no significant difference in capabilities between the two models (P > .05). Although the evaluators' diverse backgrounds showed a statistically significant difference in scoring (P < .05), a post hoc test and comprehensive analysis demonstrated that both editions of GPT-4 have equivalent capabilities in answering questions concerning oral cancer. CONCLUSIONS: GPT-4 has demonstrated its capability to furnish responses to open-ended inquiries concerning oral cancer. Utilizing this advanced technology to boost public awareness about oral cancer is viable and has much potential. When unable to locate pertinent information, the model resorts to its inherent knowledge base or recommends consulting professionals after offering some basic information. Therefore, it cannot supplant the expertise and clinical judgment of surgical oncologists but could be used as an adjunctive evaluation tool.
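The model-versus-model comparison could, for example, be run as a nonparametric test on evaluator ratings, as in the hedged sketch below; the rating scale and data are invented, and the study's exact tests may differ.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
scores_standard = rng.integers(6, 11, 60)     # hypothetical 1-10 ratings for standard GPT-4
scores_custom = rng.integers(6, 11, 60)       # hypothetical ratings for the customized version

stat, p = mannwhitneyu(scores_standard, scores_custom)
print("difference between models significant at 0.05:", p < 0.05)
```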

16.
Comput Biol Med ; 174: 108431, 2024 May.
Article in English | MEDLINE | ID: mdl-38626507

ABSTRACT

Skin wrinkles result from intrinsic aging processes and extrinsic influences, including prolonged exposure to ultraviolet radiation and tobacco smoking. Hence, the identification of wrinkles holds significant importance in skin aging and medical aesthetic investigation. Nevertheless, current methods lack the comprehensiveness to identify facial wrinkles, particularly those that may appear insignificant. Furthermore, the current assessment techniques neglect to consider the blurred boundary of wrinkles and cannot differentiate images with varying resolutions. This research introduces a novel wrinkle detection algorithm and a distance-based loss function to identify full-face wrinkles. Furthermore, we develop a wrinkle detection evaluation metric that assesses outcomes based on curve, location, and gradient similarity. We collected and annotated a dataset for wrinkle detection consisting of 1021 images of Chinese faces. The dataset will be made publicly available to further promote wrinkle detection research. The research demonstrates a substantial enhancement in detecting subtle wrinkles through implementing the proposed method. Furthermore, the suggested evaluation procedure effectively considers the indistinct boundaries of wrinkles and is applicable to images with various resolutions.
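One plausible way to realize a "distance-based" loss for wrinkle segmentation, sketched here purely as an illustration and not as the paper's loss, is to down-weight the binary cross-entropy of background pixels that lie close to an annotated wrinkle, so slightly offset predictions at blurred boundaries are not over-penalized.

```python
import torch
import torch.nn.functional as F
from scipy.ndimage import distance_transform_edt

def distance_tolerant_bce(logits, target, tol=3.0):
    """Assumed distance-weighted BCE: background pixels within `tol` px of a wrinkle count less."""
    # distance (pixels) from every pixel to the nearest annotated wrinkle pixel
    dist = torch.from_numpy(
        distance_transform_edt((1 - target).cpu().numpy())).to(logits)
    weights = torch.where(target > 0, torch.ones_like(dist),
                          torch.clamp(dist / tol, max=1.0))
    bce = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    return (weights * bce).mean()

logits = torch.randn(1, 1, 64, 64)
target = (torch.rand(1, 1, 64, 64) > 0.98).float()     # sparse synthetic wrinkle mask
loss = distance_tolerant_bce(logits, target)
```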


Subjects
Algorithms, Factual Databases, Face, Skin Aging, Humans, Skin Aging/physiology, Face/diagnostic imaging, Female, Male, Computer-Assisted Image Processing/methods, Adult
17.
IEEE Trans Image Process ; 33: 1898-1910, 2024.
Article in English | MEDLINE | ID: mdl-38451761

ABSTRACT

In this paper, we present a simple yet effective continual learning method for blind image quality assessment (BIQA) with improved quality prediction accuracy, plasticity-stability trade-off, and task-order/-length robustness. The key step in our approach is to freeze all convolution filters of a pre-trained deep neural network (DNN) for an explicit promise of stability, and learn task-specific normalization parameters for plasticity. We assign each new IQA dataset (i.e., task) a prediction head, and load the corresponding normalization parameters to produce a quality score. The final quality estimate is computed by a weighted summation of predictions from all heads with a lightweight K-means gating mechanism. Extensive experiments on six IQA datasets demonstrate the advantages of the proposed method in comparison to previous training techniques for BIQA.
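The weighted-head inference described above might be gated roughly as follows: keep a K-means codebook per task and weight each head's prediction by the test feature's distance to the nearest centroid of that task. The shapes, softmax-style weighting, and temperature are assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
task_feats = [rng.normal(loc=i, size=(200, 16)) for i in range(3)]   # simulated features per task
codebooks = [KMeans(n_clusters=8, n_init=10).fit(f).cluster_centers_ for f in task_feats]

def gated_quality(feature, head_scores, tau=1.0):
    """Weight each task head by the feature's distance to that task's nearest centroid."""
    d = np.array([np.min(np.linalg.norm(cb - feature, axis=1)) for cb in codebooks])
    w = np.exp(-d / tau)
    w /= w.sum()                                                     # soft gating weights
    return float(np.dot(w, head_scores))

q = gated_quality(rng.normal(size=16), head_scores=[55.0, 62.0, 48.0])
```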

18.
IEEE Trans Pattern Anal Mach Intell ; 46(8): 5852-5872, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38376963

ABSTRACT

Video compression is indispensable to most video analysis systems. Despite saving the transportation bandwidth, it also deteriorates downstream video understanding tasks, especially at low-bitrate settings. To systematically investigate this problem, we first thoroughly review the previous methods, revealing that three principles, i.e., task-decoupled, label-free, and data-emerged semantic prior, are critical to a machine-friendly coding framework but are not fully satisfied so far. In this paper, we propose a traditional-neural mixed coding framework that simultaneously fulfills all these principles, by taking advantage of both traditional codecs and neural networks (NNs). On one hand, the traditional codecs can efficiently encode the pixel signal of videos but may distort the semantic information. On the other hand, highly non-linear NNs are proficient in condensing video semantics into a compact representation. The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved w.r.t. the coding procedure, which is spontaneously learned from unlabeled data in a self-supervised manner. The videos collaboratively decoded from two streams (codec and NN) are of rich semantics, as well as visually photo-realistic, empirically boosting several mainstream downstream video analysis task performances without any post-adaptation procedure. Furthermore, by introducing the attention mechanism and adaptive modeling scheme, the video semantic modeling ability of our approach is further enhanced. Finally, we build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach. All codes, data, and models will be open-sourced for facilitating future research.

19.
Article in English | MEDLINE | ID: mdl-39167507

ABSTRACT

The rapid development of Multi-modality Large Language Models (MLLMs) has navigated a paradigm shift in computer vision, moving towards versatile foundational models. However, evaluating MLLMs in low-level visual perception and understanding remains a yet-to-explore domain. To this end, we design benchmark settings to emulate human language responses related to low-level vision: the low-level visual perception (A1) via visual question answering related to low-level attributes (e.g. clarity, lighting); and the low-level visual description (A2), on evaluating MLLMs for low-level text descriptions. Furthermore, given that pairwise comparison can better avoid ambiguity of responses and has been adopted by many human experiments, we further extend the low-level perception-related question-answering and description evaluations of MLLMs from single images to image pairs. Specifically, for perception (A1), we construct the LLVisionQA+ dataset, comprising 2,990 single images and 1,999 image pairs each accompanied by an open-ended question about its low-level features; for description (A2), we propose the LLDescribe+ dataset, evaluating MLLMs for low-level descriptions on 499 single images and 450 pairs. Additionally, we evaluate MLLMs on assessment (A3) ability, i.e. predicting score, by employing a softmax-based approach to enable all MLLMs to generate quantifiable quality ratings, tested against human opinions in 7 image quality assessment (IQA) datasets. With 24 MLLMs under evaluation, we demonstrate that several MLLMs have decent low-level visual competencies on single images, but only GPT-4V exhibits higher accuracy on pairwise comparisons than single image evaluations (like humans). We hope that our benchmark will motivate further research into uncovering and enhancing these nascent capabilities of MLLMs. Datasets will be available at https://github.com/Q-Future/Q-Bench.
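The softmax-based scoring mentioned for the assessment task (A3) can be sketched as follows: take an MLLM's logits for two contrasting answer tokens (e.g. "good" vs. "poor") and map their softmax probability to a scalar rating; the logit values here are placeholders, not outputs of any particular model.

```python
import numpy as np

def softmax_quality(logit_good, logit_poor):
    """Quantifiable rating in [0, 1] from the logits of two contrasting answer tokens."""
    p_good = np.exp(logit_good) / (np.exp(logit_good) + np.exp(logit_poor))
    return float(p_good)

print(softmax_quality(logit_good=2.3, logit_poor=0.4))
```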

20.
Article in English | MEDLINE | ID: mdl-38625773

ABSTRACT

Blind video quality assessment (BVQA) plays an indispensable role in monitoring and improving the end-users' viewing experience in various real-world video-enabled media applications. As an experimental field, the improvements of BVQA models have been measured primarily on a few human-rated VQA datasets. Thus, it is crucial to gain a better understanding of existing VQA datasets in order to properly evaluate the current progress in BVQA. Towards this goal, we conduct a first-of-its-kind computational analysis of VQA datasets via designing minimalistic BVQA models. By minimalistic, we restrict our family of BVQA models to build only upon basic blocks: a video preprocessor (for aggressive spatiotemporal downsampling), a spatial quality analyzer, an optional temporal quality analyzer, and a quality regressor, all with the simplest possible instantiations. By comparing the quality prediction performance of different model variants on eight VQA datasets with realistic distortions, we find that nearly all datasets suffer from the easy dataset problem of varying severity, some of which even admit blind image quality assessment (BIQA) solutions. We additionally justify our claims by comparing our model generalization capabilities on these VQA datasets, and by ablating a dizzying set of BVQA design choices related to the basic building blocks. Our results cast doubt on the current progress in BVQA, and meanwhile shed light on good practices of constructing next-generation VQA datasets and models.
