Results 1 - 20 of 27
2.
JAMA Dermatol ; 159(11): 1223-1231, 2023 11 01.
Article in English | MEDLINE | ID: mdl-37792351

ABSTRACT

Importance: Artificial intelligence (AI) training for diagnosing dermatologic images requires large amounts of clean data. Dermatologic images vary in composition, and many are inaccessible because of privacy concerns, which hinders the development of AI.
Objective: To build a training data set for discriminative and generative AI from unstandardized internet images of melanoma and nevus.
Design, Setting, and Participants: In this diagnostic study, a total of 5619 (CAN5600 data set) and 2006 (CAN2000 data set; a manually revised subset of CAN5600) cropped lesion images of either melanoma or nevus were semiautomatically annotated from approximately 500 000 photographs on the internet using convolutional neural networks (CNNs), region-based CNNs, and large-mask inpainting. For unsupervised pretraining, 132 673 possible lesions (LESION130k data set) were also created, with diversity achieved by collecting images from 18 482 websites in approximately 80 countries. A total of 5000 synthetic images (GAN5000 data set) were generated using a generative adversarial network (StyleGAN2-ADA; training, CAN2000 data set; pretraining, LESION130k data set).
Main Outcomes and Measures: The area under the receiver operating characteristic curve (AUROC) for determining malignant neoplasms was analyzed. In each test, 1 of the 7 preexisting public data sets (2312 images in total; Edinburgh, an SNU subset, Asan test, Waterloo, 7-point criteria evaluation, PAD-UFES-20, and MED-NODE) was used as the test data set. The performance of an EfficientNet Lite0 CNN trained on the proposed data sets was then compared with that of the same network trained on the remaining 6 preexisting data sets.
Results: The EfficientNet Lite0 CNN trained on the annotated or synthetic images achieved mean (SD) AUROCs higher than or equivalent to those of the EfficientNet Lite0 trained on the pathologically confirmed public data sets combined (0.809 [0.063]): CAN5600, 0.874 (0.042; P = .02); CAN2000, 0.848 (0.027; P = .08); and GAN5000, 0.838 (0.040; P = .31; Wilcoxon signed rank test), a benefit of the increased size of the training data set.
Conclusions and Relevance: The synthetic data set in this diagnostic study was created from internet images using various AI technologies. A neural network trained on the created data set (CAN5600) performed better than the same network trained on the preexisting data sets combined. Both the annotated (CAN5600 and LESION130k) and synthetic (GAN5000) data sets could be shared for AI training and consensus between physicians.
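The comparisons above rest on the AUROC, which for a binary malignant-vs-benign task equals the probability that a randomly chosen melanoma receives a higher score than a randomly chosen nevus. A minimal, illustrative sketch of this rank-based interpretation (toy scores, not the study's evaluation pipeline):

```python
def auroc(scores, labels):
    """AUROC via the Mann-Whitney U statistic: the fraction of
    (positive, negative) pairs ranked correctly, ties counting half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# A model that scores every melanoma above every nevus gets AUROC 1.0;
# random scoring hovers around 0.5.
print(auroc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))  # -> 1.0
```

In practice a library routine (e.g. scikit-learn's `roc_auc_score`) would be used; the pure-Python version above just makes the pairwise-ranking definition explicit.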


Subject(s)
Melanoma , Nevus, Pigmented , Nevus , Skin Neoplasms , Humans , Artificial Intelligence , Melanoma/diagnosis , Melanoma/pathology , Nevus/diagnosis , Nevus/pathology , Neural Networks, Computer , Skin Neoplasms/diagnosis , Skin Neoplasms/pathology
3.
J Clin Med ; 12(8)2023 Apr 13.
Article in English | MEDLINE | ID: mdl-37109186

ABSTRACT

Facial telangiectasias are small, dilated blood vessels frequently located on the face. They are cosmetically disfiguring and require an effective treatment. We aimed to investigate the effect of the pinhole method using a carbon dioxide (CO2) laser to treat facial telangiectasias. This study included 155 facial telangiectasia lesions in 72 patients who visited Kangnam Sacred Heart Hospital, Hallym University. Treatment efficacy and improvement were evaluated quantitatively by two trained evaluators, who measured the percentage of residual lesion length using the same tape measure. Lesions were evaluated before laser therapy and 1, 3, and 6 months after the first treatment. Relative to the initial lesion length (100%), the average residual lengths at 1, 3, and 6 months were 48.26% (p < 0.01), 4.25% (p < 0.01), and 1.41% (p < 0.01), respectively. Complications were evaluated using the Patient and Observer Scar Assessment Scale (POSAS). The average POSAS score improved from 46.09 at the first visit to 23.42 (p < 0.01) and 15.24 (p < 0.01) at the 3- and 6-month follow-ups, respectively. No recurrence was noted at the 6-month follow-up. CO2 laser treatment using the pinhole method is a safe, inexpensive, and effective treatment for facial telangiectasias that provides patients with excellent aesthetic satisfaction.

4.
Sci Rep ; 12(1): 16260, 2022 09 28.
Article in English | MEDLINE | ID: mdl-36171272

ABSTRACT

Model Dermatology ( https://modelderm.com ; Build2021) is a publicly testable neural network that can classify 184 skin disorders. We aimed to investigate whether our algorithm could classify clinical images from an Internet community as well as tertiary care center datasets. Consecutive images from an Internet skin cancer community ('RD' dataset, 1,282 images posted between 25 January 2020 and 30 July 2021; https://reddit.com/r/melanoma ) were analyzed retrospectively, along with hospital datasets (Edinburgh dataset, 1,300 images; SNU dataset, 2,101 images; TeleDerm dataset, 340 consecutive images). The algorithm's performance was equivalent to that of dermatologists on the curated clinical datasets (Edinburgh and SNU). However, its performance deteriorated on the RD and TeleDerm datasets because of insufficient image quality and the presence of out-of-distribution disorders, respectively. On the RD dataset, the algorithm's Top-1/3 accuracy (39.2%/67.2%) and AUC (0.800) were equivalent to those of general physicians (36.8%/52.9%), and it was more accurate than laypersons using random Internet searches (19.2%/24.4%). The algorithm's Top-1/3 accuracy was affected by inadequate image quality (adequate = 43.2%/71.3% versus inadequate = 32.9%/60.8%), whereas participant performance did not deteriorate (adequate = 35.8%/52.7% vs. inadequate = 38.4%/53.3%). In this report, algorithm performance was significantly affected by departures from the intended setting, which implies that AI algorithms with dermatologist-level performance in an in-distribution setting may not show the same level of performance in out-of-distribution settings.
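Top-1/3 accuracy, used throughout this report, simply asks whether the confirmed diagnosis appears among a model's k highest-ranked suggestions. A short illustrative sketch (the diagnoses and rankings below are hypothetical, not the study's data):

```python
def top_k_accuracy(ranked_lists, truths, k):
    """Fraction of cases whose confirmed diagnosis appears among the
    model's k highest-ranked suggestions."""
    hits = sum(truth in ranked[:k] for ranked, truth in zip(ranked_lists, truths))
    return hits / len(truths)

# Hypothetical ranked outputs for three cases:
ranked = [["melanoma", "nevus", "lentigo"],
          ["wart", "corn", "callus"],
          ["nevus", "melanoma", "lentigo"]]
truth = ["melanoma", "callus", "lentigo"]
print(round(top_k_accuracy(ranked, truth, 1), 3))  # -> 0.333
print(top_k_accuracy(ranked, truth, 3))            # -> 1.0
```

This is why Top-3 figures in the abstract are always at least as high as Top-1: widening k can only add hits.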


Subject(s)
Skin Neoplasms , Humans , Internet , Neural Networks, Computer , Retrospective Studies , Skin , Skin Neoplasms/diagnosis
5.
J Invest Dermatol ; 142(9): 2353-2362.e2, 2022 09.
Article in English | MEDLINE | ID: mdl-35183551

ABSTRACT

TRIAL DESIGN: This was a single-center, unmasked, parallel-group, randomized controlled trial. METHODS: A randomized trial was conducted in a tertiary care institute in South Korea to validate whether artificial intelligence (AI) could augment the accuracy of nonexpert physicians in real-world settings, which included diverse out-of-distribution conditions. Consecutive patients aged >19 years with one or more skin lesions suspicious for skin cancer, detected by either the patient or a physician, were randomly allocated to four nondermatology trainees and four dermatology residents. After simple randomization of the patients, the attending dermatologists examined the allocated patients with (AI-assisted group) or without (unaided group) the real-time assistance of the AI algorithm (https://b2020.modelderm.com#world; convolutional neural networks; unmasked design). RESULTS: Using 576 consecutive cases (Fitzpatrick skin phototypes III or IV) with suspicious lesions out of the initial 603 recruited, the accuracy of the AI-assisted group (n = 295; 53.9%) was significantly higher than that of the unaided group (n = 281; 43.8%; P = 0.019). The improvement was greater among the nondermatology trainees, who had the least experience in dermatology (from 30.7% [n = 138] to 54.7% [n = 150]; P < 0.0001), whereas it was not significant among the dermatology residents. The algorithm helped trainees in the AI-assisted group include more differential diagnoses than the unaided group (2.09 vs. 1.95 diagnoses; P = 0.0005). However, a 12.2% drop in the trainees' Top-1 accuracy was observed in cases in which all Top-3 predictions given by the algorithm were incorrect. CONCLUSIONS: The multiclass AI algorithm augmented the diagnostic accuracy of nonexpert physicians in dermatology.


Subject(s)
Artificial Intelligence , Skin Neoplasms , Algorithms , Diagnosis, Differential , Humans , Neural Networks, Computer , Skin Neoplasms/diagnosis , Skin Neoplasms/pathology
6.
PLoS One ; 17(1): e0260895, 2022.
Article in English | MEDLINE | ID: mdl-35061692

ABSTRACT

BACKGROUND: Although deep neural networks have shown promising results in the diagnosis of skin cancer, prospective evaluation in a real-world setting is needed to confirm these results. This study aimed to evaluate whether an algorithm (http://b2019.modelderm.com) improves the accuracy of nondermatologists in diagnosing skin neoplasms. METHODS: A total of 285 cases (random series) with skin neoplasms suspected of malignancy by either physicians or patients were recruited at two tertiary care centers in South Korea. The artificial intelligence (AI) group (144 cases; mean [SD] age, 57.0 [17.7] years; 62 [43.1%] men) was diagnosed via routine examination with photographic review and assistance by the algorithm, whereas the control group (141 cases; mean [SD] age, 61.0 [15.3] years; 52 [36.9%] men) was diagnosed via routine examination with photographic review only. The accuracy of the nondermatologists before and after the interventions was compared. RESULTS: In the AI group, the Top-1 accuracy of the first impression after AI assistance (58.3%) was higher than before assistance (46.5%; P = .008). The number of differential diagnoses per participant increased from 1.9 ± 0.5 to 2.2 ± 0.6 after assistance (P < .001). In the control group, the difference in Top-1 accuracy before and after reviewing photographs was not significant (before, 46.1%; after, 51.8%; P = .19), and the number of differential diagnoses did not significantly increase (before, 2.0 ± 0.4; after, 2.1 ± 0.5; P = .57). CONCLUSIONS: In real-world settings, AI augmented the diagnostic accuracy of trainee doctors. A limitation of this study is that the algorithm was tested only in Asian patients recruited from a single region. Additional international randomized controlled trials involving various ethnicities are required.


Subject(s)
Artificial Intelligence
8.
PLoS One ; 15(12): e0244899, 2020.
Article in English | MEDLINE | ID: mdl-33373424

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pone.0234334.].

9.
PLoS Med ; 17(11): e1003381, 2020 11.
Article in English | MEDLINE | ID: mdl-33237903

ABSTRACT

BACKGROUND: The diagnostic performance of convolutional neural networks (CNNs) for diagnosing several types of skin neoplasms has been demonstrated as comparable with that of dermatologists using clinical photography. However, the generalizability should be demonstrated using a large-scale external dataset that includes most types of skin neoplasms. In this study, the performance of a neural network algorithm was compared with that of dermatologists in both real-world practice and experimental settings. METHODS AND FINDINGS: To demonstrate generalizability, the skin cancer detection algorithm (https://rcnn.modelderm.com) developed in our previous study was used without modification. We conducted a retrospective study of all single-lesion biopsied cases (43 disorders; 40,331 clinical images from 10,426 cases: 1,222 malignant and 9,204 benign; mean [SD] age, 52.1 [18.3] years; 4,701 men [45.1%]) obtained from the Department of Dermatology, Severance Hospital in Seoul, Korea, between January 1, 2008, and March 31, 2019. Using this external validation dataset, the predictions of the algorithm were compared with the clinical diagnoses of 65 attending physicians who had recorded the clinical diagnoses after thorough examinations in real-world practice. In addition, the results obtained by the algorithm for randomly selected batches of 30 patients were compared with those obtained by 44 dermatologists in experimental settings; the dermatologists were provided only with multiple images of each lesion, without clinical information. With regard to the determination of malignancy, the area under the curve (AUC) achieved by the algorithm was 0.863 (95% confidence interval [CI] 0.852-0.875) when unprocessed clinical photographs were used. The sensitivity and specificity of the algorithm at the predefined high-specificity threshold were 62.7% (95% CI 59.9-65.1) and 90.0% (95% CI 89.4-90.6), respectively.
Furthermore, the sensitivity and specificity of the first clinical impression of 65 attending physicians were 70.2% and 95.6%, respectively, which were superior to those of the algorithm (McNemar test; p < 0.0001). The positive and negative predictive values of the algorithm were 45.4% (CI 43.7-47.3) and 94.8% (CI 94.4-95.2), respectively, whereas those of the first clinical impression were 68.1% and 96.0%, respectively. In the reader test conducted using images corresponding to batches of 30 patients, the sensitivity and specificity of the algorithm at the predefined threshold were 66.9% (95% CI 57.7-76.0) and 87.4% (95% CI 82.5-92.2), respectively. Furthermore, the sensitivity and specificity derived from the first impression of 44 of the participants were 65.8% (95% CI 55.7-75.9) and 85.7% (95% CI 82.4-88.9), respectively, which are values comparable with those of the algorithm (Wilcoxon signed-rank test; p = 0.607 and 0.097). Limitations of this study include the exclusive use of high-quality clinical photographs taken in hospitals and the lack of ethnic diversity in the study population. CONCLUSIONS: Our algorithm could diagnose skin tumors with nearly the same accuracy as a dermatologist when the diagnosis was performed solely with photographs. However, as a result of limited data relevancy, the performance was inferior to that of actual medical examination. To achieve more accurate predictive diagnoses, clinical information should be integrated with imaging information.
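Sensitivity, specificity, and the positive/negative predictive values reported above all derive from the four confusion-matrix counts at a chosen threshold. A sketch with made-up counts (not the study's data) chosen to echo the pattern in the abstract, where low malignancy prevalence keeps the PPV well below the specificity:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # malignant cases correctly flagged
        "specificity": tn / (tn + fp),  # benign cases correctly cleared
        "ppv": tp / (tp + fp),          # flagged cases that are truly malignant
        "npv": tn / (tn + fn),          # cleared cases that are truly benign
    }

# Made-up counts with ~10% prevalence: sensitivity and specificity look
# respectable, yet a large share of positive calls are still false alarms.
m = diagnostic_metrics(tp=63, fp=90, tn=810, fn=37)
print({k: round(v, 3) for k, v in m.items()})
# -> {'sensitivity': 0.63, 'specificity': 0.9, 'ppv': 0.412, 'npv': 0.956}
```

This prevalence effect is why the algorithm's PPV (45.4%) sits so far below its specificity (90.0%) despite the large benign majority being handled well.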


Subject(s)
Dermatologists/statistics & numerical data , Skin Neoplasms/diagnosis , Skin Neoplasms/pathology , Skin/pathology , Biopsy , Female , Humans , Male , Melanoma/diagnosis , Melanoma/pathology , Middle Aged , Retrospective Studies , Sensitivity and Specificity
10.
PLoS One ; 15(6): e0234334, 2020.
Article in English | MEDLINE | ID: mdl-32525908

ABSTRACT

BACKGROUND: Onychomycosis is the most common nail disorder and is associated with diagnostic challenges. Emerging non-invasive, real-time techniques such as dermoscopy and deep convolutional neural networks have been proposed for the diagnosis of this condition. However, comparative studies of the two tools in the diagnosis of onychomycosis have not previously been conducted. OBJECTIVES: This study evaluated the diagnostic abilities of a deep neural network (http://nail.modelderm.com) and dermoscopic examination in patients with onychomycosis. METHODS: A prospective observational study was performed in patients presenting with dystrophic features in the toenails. Clinical photographs were taken by research assistants, and the ground truth was determined either by direct microscopy using the potassium hydroxide test or by fungal culture. Five board-certified dermatologists determined a diagnosis of onychomycosis using the clinical photographs. The diagnosis was also made using the algorithm and dermoscopic examination. RESULTS: A total of 90 patients (mean age, 55.3 years; male, 43.3%) assessed between September 2018 and July 2019 were included in the analysis. The detection of onychomycosis by the algorithm (AUC, 0.751; 95% CI, 0.646-0.856) and by dermoscopy (AUC, 0.755; 95% CI, 0.654-0.855) was comparable (DeLong's test; P = 0.952). The sensitivity and specificity of the algorithm at the operating point were 70.2% and 72.7%, respectively. The sensitivity and specificity of diagnosis by the five dermatologists were 73.0% and 49.7%, respectively. The Youden index of the algorithm (0.429) was also comparable to that of the dermatologists' diagnoses (0.230 ± 0.176; Wilcoxon rank-sum test; P = 0.667). CONCLUSIONS: As a standalone method, the algorithm analyzed photographs taken by non-physicians and showed accuracy for the diagnosis of onychomycosis comparable to that of experienced dermatologists and dermoscopic examination. Large-scale, worldwide, multicenter studies are needed to confirm the performance of the algorithm.
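The Youden index quoted above is J = sensitivity + specificity − 1; it is commonly used to compare diagnostic performance at a single operating point, or to choose that point on the ROC curve. A sketch with hypothetical operating points (the thresholds and rates below are illustrative, not from the study):

```python
def youden_index(sensitivity, specificity):
    """Youden's J = sensitivity + specificity - 1 (0 = chance, 1 = perfect)."""
    return sensitivity + specificity - 1

def best_operating_point(points):
    """Among (threshold, sensitivity, specificity) triples, return the one
    maximizing Youden's J."""
    return max(points, key=lambda p: youden_index(p[1], p[2]))

# Hypothetical ROC operating points (threshold, sensitivity, specificity):
roc_points = [(0.2, 0.95, 0.40), (0.5, 0.80, 0.70), (0.8, 0.50, 0.95)]
print(best_operating_point(roc_points))  # -> (0.5, 0.8, 0.7)
```

Applied to the algorithm's reported operating point (sensitivity 70.2%, specificity 72.7%), J = 0.702 + 0.727 − 1 = 0.429, matching the value in the abstract.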


Subject(s)
Deep Learning , Dermoscopy , Foot Dermatoses/diagnosis , Onychomycosis/diagnosis , Adult , Algorithms , Computer Systems , Dermatologists , Diagnosis, Computer-Assisted , Diagnostic Errors , Female , Foot Dermatoses/diagnostic imaging , Foot Dermatoses/microbiology , Humans , Hydroxides , Male , Middle Aged , Mycological Typing Techniques , Neural Networks, Computer , Onychomycosis/diagnostic imaging , Onychomycosis/microbiology , Photography , Potassium Compounds , Predictive Value of Tests , Prospective Studies , Young Adult
11.
J Invest Dermatol ; 140(9): 1753-1761, 2020 09.
Article in English | MEDLINE | ID: mdl-32243882

ABSTRACT

Although deep learning algorithms have demonstrated expert-level performance, previous efforts were mostly binary classifications of limited disorders. We trained an algorithm with 220,680 images of 174 disorders and validated it using the Edinburgh (1,300 images; 10 disorders) and SNU datasets (2,201 images; 134 disorders). The algorithm could accurately predict malignancy, suggest primary treatment options, render multi-class classification among 134 disorders, and improve the performance of medical professionals. The areas under the curve (AUCs) for malignancy detection were 0.928 ± 0.002 (Edinburgh) and 0.937 ± 0.004 (SNU). The AUCs for primary treatment suggestion (SNU) were 0.828 ± 0.012, 0.885 ± 0.006, 0.885 ± 0.006, and 0.918 ± 0.006 for steroids, antibiotics, antivirals, and antifungals, respectively. For multi-class classification, the mean top-1 and top-5 accuracies were 56.7 ± 1.6% and 92.0 ± 1.1% (Edinburgh) and 44.8 ± 1.2% and 78.1 ± 0.3% (SNU), respectively. With the assistance of our algorithm, the sensitivity and specificity of 47 clinicians (21 dermatologists and 26 dermatology residents) for malignancy prediction (SNU; 240 images) were improved by 12.1% (P < 0.0001) and 1.1% (P < 0.0001), respectively. The malignancy prediction sensitivity of 23 non-medical professionals was significantly increased by 83.8% (P < 0.0001). The top-1 and top-3 accuracies of four doctors in the multi-class classification of 134 diseases (SNU; 2,201 images) were increased by 7.0% (P = 0.045) and 10.1% (P = 0.0020), respectively. The results suggest that our algorithm may serve as augmented intelligence that can empower medical professionals in diagnostic dermatology.


Subject(s)
Deep Learning , Dermatology/methods , Image Interpretation, Computer-Assisted , Skin Diseases/drug therapy , Skin Neoplasms/diagnosis , Adolescent , Adult , Aged , Anti-Bacterial Agents/therapeutic use , Antifungal Agents/therapeutic use , Antiviral Agents/therapeutic use , Clinical Competence/statistics & numerical data , Datasets as Topic , Dermatologists/statistics & numerical data , Dermoscopy/methods , Drug Therapy, Computer-Assisted , Feasibility Studies , Female , Glucocorticoids/therapeutic use , Humans , Internship and Residency/statistics & numerical data , Male , Middle Aged , Photography/methods , ROC Curve , Skin/diagnostic imaging , Skin Diseases/diagnosis , Skin Diseases/microbiology , Young Adult
12.
JAMA Dermatol ; 156(1): 29-37, 2020 01 01.
Article in English | MEDLINE | ID: mdl-31799995

ABSTRACT

Importance: Detection of cutaneous cancer on the face using deep-learning algorithms has been challenging because various anatomic structures create curves and shades that confuse the algorithm and can potentially lead to false-positive results. Objective: To evaluate whether an algorithm can automatically locate suspected areas and predict the probability of a lesion being malignant. Design, Setting, and Participants: Region-based convolutional neural network technology was used to create 924 538 possible lesions by extracting nodular benign lesions from 182 348 clinical photographs. After manually or automatically annotating these possible lesions based on image findings, convolutional neural networks were trained with 1 106 886 image crops to locate and diagnose cancer. Validation data sets (2844 images from 673 patients; mean [SD] age, 58.2 [19.9] years; 308 men [45.8%]; 185 patients with malignant tumors, 305 with benign tumors, and 183 free of tumor) were obtained from 3 hospitals between January 1, 2010, and September 30, 2018. Main Outcomes and Measures: The area under the receiver operating characteristic curve, F1 score (harmonic mean of precision and recall; range, 0.000-1.000), and Youden index score (sensitivity + specificity − 1; 0%-100%) were used to compare the performance of the algorithm with that of the participants. Results: The algorithm analyzed a mean (SD) of 4.2 (2.4) photographs per patient and reported the malignancy score according to the highest malignancy output. The area under the receiver operating characteristic curve for the validation data set (673 patients) was 0.910. At a high-sensitivity cutoff threshold, the sensitivity and specificity of the model with the 673 patients were 76.8% and 90.6%, respectively.
With the test partition (325 images; 80 patients), the performance of the algorithm was compared with the performance of 13 board-certified dermatologists, 34 dermatology residents, 20 nondermatologic physicians, and 52 members of the general public with no medical background. When disease screening performance was evaluated at high-sensitivity areas using the F1 score and Youden index score, the algorithm showed a higher F1 score (0.831 vs 0.653 [0.126]; P < .001) and Youden index score (0.675 vs 0.417 [0.124]; P < .001) than the nondermatologic physicians. The accuracy of the algorithm was comparable with that of dermatologists (F1 score, 0.831 vs 0.835 [0.040]; Youden index score, 0.675 vs 0.671 [0.100]). Conclusions and Relevance: The results of the study suggest that the algorithm could localize and diagnose skin cancer without preselection of suspicious lesions by dermatologists.
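The F1 score used in this comparison is the harmonic mean of precision and recall, which rewards balance between the two. A small illustration (the precision/recall values are hypothetical, not from the study):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; range 0.000-1.000, higher is better."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# The harmonic mean penalizes imbalance: high recall cannot compensate
# for poor precision the way an arithmetic mean would.
print(round(f1_score(0.5, 1.0), 3))  # -> 0.667 (arithmetic mean would be 0.75)
print(round(f1_score(0.8, 0.8), 3))  # -> 0.8
```

This penalty for imbalance is what makes F1 a stricter screening metric than raw accuracy when the classes are skewed.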


Subject(s)
Carcinoma, Basal Cell/diagnosis , Carcinoma, Squamous Cell/diagnosis , Image Processing, Computer-Assisted/methods , Neural Networks, Computer , Skin Neoplasms/diagnosis , Adult , Aged , Carcinoma, Basal Cell/pathology , Carcinoma, Squamous Cell/pathology , Datasets as Topic , Face , Female , Humans , Keratinocytes/pathology , Male , Middle Aged , Photography , ROC Curve , Skin/cytology , Skin/diagnostic imaging , Skin/pathology , Skin Neoplasms/pathology
15.
Acta Orthop ; 89(4): 468-473, 2018 Aug.
Article in English | MEDLINE | ID: mdl-29577791

ABSTRACT

Background and purpose - We aimed to evaluate the ability of artificial intelligence (a deep learning algorithm) to detect and classify proximal humerus fractures using plain anteroposterior shoulder radiographs. Patients and methods - 1,891 images (1 image per person) of normal shoulders (n = 515) and 4 proximal humerus fracture types (greater tuberosity, 346; surgical neck, 514; 3-part, 269; 4-part, 247), classified by 3 specialists, were evaluated. We trained a deep convolutional neural network (CNN) after augmentation of the training dataset. The ability of the CNN to detect and classify proximal humerus fractures, as measured by top-1 accuracy, area under the receiver operating characteristic curve (AUC), sensitivity/specificity, and Youden index, was evaluated in comparison with humans (28 general physicians, 11 general orthopedists, and 19 orthopedists specialized in the shoulder). Results - The CNN showed high performance, with 96% top-1 accuracy, 1.00 AUC, 0.99/0.97 sensitivity/specificity, and 0.97 Youden index for distinguishing normal shoulders from proximal humerus fractures. In addition, the CNN showed promising results, with 65-86% top-1 accuracy, 0.90-0.98 AUC, 0.88/0.83-0.97/0.94 sensitivity/specificity, and 0.71-0.90 Youden index for classifying fracture type. Compared with the human groups, the CNN outperformed general physicians and general orthopedists and performed similarly to orthopedists specialized in the shoulder; its superiority was most marked for complex 3- and 4-part fractures. Interpretation - Artificial intelligence can accurately detect and classify proximal humerus fractures on plain shoulder AP radiographs. Further studies are necessary to determine the feasibility of applying artificial intelligence in the clinic and whether its use could improve care and outcomes compared with current orthopedic assessments.


Subject(s)
Deep Learning , Shoulder Fractures/diagnostic imaging , Adult , Aged , Aged, 80 and over , Algorithms , Area Under Curve , Arthrography , Female , Humans , Male , Middle Aged , Shoulder Fractures/classification , Young Adult
16.
J Invest Dermatol ; 138(7): 1529-1538, 2018 07.
Article in English | MEDLINE | ID: mdl-29428356

ABSTRACT

We tested the use of a deep learning algorithm to classify the clinical images of 12 skin diseases: basal cell carcinoma, squamous cell carcinoma, intraepithelial carcinoma, actinic keratosis, seborrheic keratosis, malignant melanoma, melanocytic nevus, lentigo, pyogenic granuloma, hemangioma, dermatofibroma, and wart. The convolutional neural network (Microsoft ResNet-152 model; Microsoft Research Asia, Beijing, China) was fine-tuned with images from the training portion of the Asan dataset, the MED-NODE dataset, and atlas site images (19,398 images in total). The trained model was validated with the testing portions of the Asan, Hallym, and Edinburgh datasets. With the Asan dataset, the areas under the curve for the diagnosis of basal cell carcinoma, squamous cell carcinoma, intraepithelial carcinoma, and melanoma were 0.96 ± 0.01, 0.83 ± 0.01, 0.82 ± 0.02, and 0.96 ± 0.00, respectively. With the Edinburgh dataset, the areas under the curve for the corresponding diseases were 0.90 ± 0.01, 0.91 ± 0.01, 0.83 ± 0.01, and 0.88 ± 0.01, respectively. With the Hallym dataset, the sensitivity for basal cell carcinoma diagnosis was 87.1% ± 6.0%. The tested algorithm's performance with 480 Asan and Edinburgh images was comparable to that of 16 dermatologists. To improve the performance of the convolutional neural network, additional images covering a broader range of ages and ethnicities should be collected.


Subject(s)
Deep Learning , Image Processing, Computer-Assisted/methods , Skin/diagnostic imaging , Adult , Aged , Aged, 80 and over , Area Under Curve , Biopsy , Datasets as Topic , Diagnosis, Differential , False Positive Reactions , Female , Granuloma, Pyogenic/diagnostic imaging , Granuloma, Pyogenic/pathology , Humans , Keratosis, Actinic/diagnostic imaging , Keratosis, Actinic/pathology , Keratosis, Seborrheic/diagnostic imaging , Keratosis, Seborrheic/pathology , Lentigo/diagnostic imaging , Lentigo/pathology , Male , Middle Aged , Photography , Predictive Value of Tests , ROC Curve , Skin/pathology , Skin Neoplasms/diagnostic imaging , Skin Neoplasms/pathology , Software , Warts/diagnostic imaging , Warts/pathology , Young Adult
18.
PLoS One ; 13(1): e0191493, 2018.
Article in English | MEDLINE | ID: mdl-29352285

ABSTRACT

Although there have been reports of the successful diagnosis of skin disorders using deep learning, unrealistically large clinical image datasets are required for artificial intelligence (AI) training. We created datasets of standardized nail images using a region-based convolutional neural network (R-CNN) trained to distinguish the nail from the background. We used R-CNN to generate training datasets of 49,567 images, which we then used to fine-tune the ResNet-152 and VGG-19 models. The validation datasets comprised 100 and 194 images from Inje University (B1 and B2 datasets, respectively), 125 images from Hallym University (C dataset), and 939 images from Seoul National University (D dataset). The AI (ensemble model: ResNet-152 + VGG-19 + feedforward neural networks) achieved test sensitivity/specificity/area-under-the-curve values of 96.0%/94.7%/0.98, 82.7%/96.7%/0.95, 92.3%/79.3%/0.93, and 87.7%/69.3%/0.82 for the B1, B2, C, and D datasets, respectively. On the combined B1 and C datasets, the AI's Youden index was significantly (p = 0.01) higher than that of 42 dermatologists performing the same assessment manually. For the B1+C and B2+D dataset combinations, almost none of the dermatologists performed as well as the AI. By training with a dataset comprising 49,567 images, we achieved a diagnostic accuracy for onychomycosis using deep learning that was superior to that of most of the dermatologists who participated in this study.
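The ensemble combined ResNet-152, VGG-19, and feedforward networks; one common way to fuse such models is to average their per-class probabilities (soft voting). A minimal sketch under that assumption (the paper's exact fusion scheme may differ, and the model outputs and class labels below are hypothetical):

```python
def ensemble_predict(prob_vectors):
    """Soft voting: average per-class probabilities across models."""
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    return [sum(p[c] for p in prob_vectors) / n_models for c in range(n_classes)]

# Hypothetical per-model outputs for one nail image,
# classes ordered [onychomycosis, other]:
resnet_out = [0.70, 0.30]
vgg_out = [0.60, 0.40]
ffnn_out = [0.80, 0.20]
print([round(x, 3) for x in ensemble_predict([resnet_out, vgg_out, ffnn_out])])
# -> [0.7, 0.3]
```

Averaging tends to cancel the uncorrelated errors of individual models, which is one reason ensembles often beat their strongest member.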


Subject(s)
Diagnosis, Computer-Assisted , Neural Networks, Computer , Onychomycosis/diagnosis , Adult , Aged , Algorithms , Area Under Curve , Artificial Intelligence , Databases, Factual , Dermatologists , Female , Foot Dermatoses/diagnosis , Foot Dermatoses/pathology , Hand Dermatoses/diagnosis , Hand Dermatoses/pathology , Humans , Image Interpretation, Computer-Assisted , Machine Learning , Male , Middle Aged , Onychomycosis/pathology , Young Adult