A new phishing-website detection framework using ensemble classification and clustering
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Alsharaiah, M.A. | - |
dc.contributor.author | Abu-Shareha, A.A. | - |
dc.contributor.author | Abualhaj, M. | - |
dc.contributor.author | Baniata, L.H. | - |
dc.contributor.author | Adwan, O. | - |
dc.contributor.author | Al-Saaidah, A. | - |
dc.contributor.author | Oraiqat, M. | - |
dc.date.accessioned | 2023-05-22T08:40:19Z | - |
dc.date.available | 2023-05-22T08:40:19Z | - |
dc.date.created | 2023-05-22 | - |
dc.date.issued | 2023-03 | - |
dc.identifier.issn | 2561-8148 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/87874 | - |
dc.description.abstract | Phishing websites are characterized by distinctive visual, address, domain, and embedded features, which are used to identify and defend against such threats. Yet, phishing website detection is challenged by the overlap of these features with legitimate websites’ features. As the inter-class variance between legitimate and phishing websites becomes low, commonly utilized machine learning algorithms perform poorly in overlapping-feature cases. Alternatively, ensemble learning, which combines multiple predictions to address low inter-class variation in the classified data, improves performance in such cases. Ensemble learning utilizes multiple classifiers of similar or different types with multiple variations of the training data. This paper develops a framework based on random forest ensemble techniques. A limitation of the random forest is its inability to capture the high correlation between features and their joint dependency on the label. The random forest is therefore combined with k-means clustering to capture the feature correlation. The framework is evaluated for phishing detection with a dataset of 5000 samples. The results showed that the proposed framework outperformed the random forest classifier, all other ensemble classifiers, and the conventional classification algorithms. The proposed framework achieved an accuracy of 98.64%, precision of 0.986, recall of 0.987, and F-measure of 0.986. © 2023 by the authors; licensee Growing Science, Canada. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | Growing Science | - |
dc.relation.isPartOf | International Journal of Data and Network Science | - |
dc.title | A new phishing-website detection framework using ensemble classification and clustering | - |
dc.type | Article | - |
dc.type.rims | ART | - |
dc.description.journalClass | 1 | - |
dc.identifier.doi | 10.5267/j.ijdns.2023.1.003 | - |
dc.identifier.bibliographicCitation | International Journal of Data and Network Science, v.7, no.2, pp.857 - 864 | - |
dc.description.isOpenAccess | Y | - |
dc.identifier.scopusid | 2-s2.0-85151405527 | - |
dc.citation.endPage | 864 | - |
dc.citation.startPage | 857 | - |
dc.citation.title | International Journal of Data and Network Science | - |
dc.citation.volume | 7 | - |
dc.citation.number | 2 | - |
dc.contributor.affiliatedAuthor | Baniata, L.H. | - |
dc.type.docType | Article | - |
dc.subject.keywordAuthor | Classification | - |
dc.subject.keywordAuthor | Clustering | - |
dc.subject.keywordAuthor | Ensemble Learning | - |
dc.subject.keywordAuthor | Phishing Detection | - |
dc.description.journalRegisteredClass | scopus | - |
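
The abstract reports accuracy, precision, recall, and F-measure. As a minimal sketch of how these four metrics relate, the Python below computes them from confusion-matrix counts using the standard definitions. The function names and the example counts are hypothetical and are not taken from the paper; only the precision (0.986) and recall (0.987) passed to the last call come from the abstract.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, F-measure) from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives, with "phishing" treated as the positive class.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_measure


def f_measure_from(precision, recall):
    """F-measure (F1) as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to illustrate the formulas.
accuracy, precision, recall, f_measure = classification_metrics(
    tp=90, fp=10, fn=5, tn=95
)

# Precision and recall as reported in the abstract.
reported_f = f_measure_from(0.986, 0.987)
print(round(reported_f, 3))
```

Under these definitions, the harmonic mean of the reported precision and recall rounds to 0.986, which agrees with the F-measure stated in the abstract.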