A Deep Learning-Based Innovative Technique for Phishing Detection in Modern Security with Uniform Resource Locators.

Aldakheel, Eman Abdullah; Zakariah, Mohammed; Gashgari, Ghada Abdalaziz; Almarshad, Fahdah A; Alzahrani, Abdullah I A

Affiliation

Aldakheel EA; Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia.
Zakariah M; Department of Computer Science, College of Computer and Information Science, King Saud University, Riyadh 12372, Saudi Arabia.
Gashgari GA; Department of Cybersecurity, College of Computer Science and Engineering, University of Jeddah, Ar Rabwah Jeddah 23449, Saudi Arabia.
Almarshad FA; Department of Information Systems, College of Computer Engineering and Sciences, Prince Sattam Bin Abdul-Aziz University, Al Kharj 11942, Saudi Arabia.
Alzahrani AIA; Department of Computer Science, College of Science and Humanities in Al Quwaiiyah, Shaqra University, Shaqra 11961, Saudi Arabia.

Sensors (Basel); 23(9), 2023 Apr 30.

Article in En | MEDLINE | ID: mdl-37177607

ABSTRACT

Organizations and individuals worldwide are increasingly vulnerable to cyberattacks as the number of phishing websites continues to grow. Improved cyber defense therefore requires more effective phishing detection (PD). In this paper, we introduce a novel method for detecting phishing sites with high accuracy. Our approach uses a Convolutional Neural Network (CNN)-based model that distinguishes legitimate websites from phishing websites with a high degree of accuracy. We evaluate the model on the PhishTank dataset, which is widely used for detecting phishing websites based solely on Uniform Resource Locator (URL) features. We created a real data set by crawling 10,000 phishing URLs from PhishTank and 10,000 legitimate websites, and then ran experiments on it using standard evaluation metrics. The approach is founded on integrated and deep learning (DL). Experimental results show that the proposed method performs well in terms of both accuracy and false-positive rate, and that it outperforms previous state-of-the-art models. With binary-categorical loss and the Adam optimizer, the k-nearest neighbors (KNN), Natural Language Processing (NLP), Recurrent Neural Network (RNN), and Random Forest (RF) models reach accuracies of 87%, 97.98%, 97.4%, and 94.26%, respectively, in contrast to previous publications. We attribute our model's improvement over previous work to several factors: the use of more layers, larger training sizes, and the extraction of additional features from the PhishTank dataset.
Specifically, our proposed model comprises seven layers, starting with the input layer and proceeding through convolutional, pooling, and linear layers (linear 1 and 2), with the final linear layers serving as the output layers. These design choices contribute to the model's high accuracy of 98.77%.
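The abstract reports results in terms of accuracy and false-positive rate on a binary task (phishing versus legitimate). As a minimal sketch of how these two metrics are commonly computed from predicted labels, the Python snippet below may help. The names `BinaryMetrics` and `evaluate`, the label convention (1 = phishing, 0 = legitimate), and the toy labels are assumptions made for illustration; they are not taken from the paper and do not reproduce its numbers.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    false_positive_rate: float


def evaluate(y_true, y_pred):
    """Compute accuracy and false-positive rate for binary labels.

    Assumed convention (not from the paper): 1 = phishing, 0 = legitimate.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # phishing caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # phishing missed

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # False-positive rate: share of legitimate sites wrongly flagged as phishing.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return BinaryMetrics(accuracy=accuracy, false_positive_rate=fpr)


# Toy labels, purely hypothetical: four phishing and four legitimate URLs.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
metrics = evaluate(y_true, y_pred)
print(f"accuracy={metrics.accuracy:.2f}, FPR={metrics.false_positive_rate:.2f}")
```

On a balanced data set such as the 10,000 phishing and 10,000 legitimate URLs described above, accuracy is not dominated by one class, but the false-positive rate is still worth reporting separately because it measures how often legitimate sites would be blocked.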

Key words

PhishTank data set; URL analysis; convolutional neural network; deep learning; machine-learning; phishing detection system

Full text: 1 | Collection: 01-internacional | Database: MEDLINE | Type of study: Diagnostic_studies / Prognostic_studies | Language: En | Journal: Sensors (Basel) | Year: 2023 | Document type: Article | Affiliation country: Saudi Arabia
