TY  - JOUR
T1  - Türkçe Oltalama E-Postalarının Anlamsal Tespiti: Doğal Dil İşleme ve Derin Öğrenme Tabanlı Bir Yaklaşım
TT  - Semantic Detection of Turkish Phishing Emails: A Natural Language Processing and Deep Learning-Based Approach
AU  - Taş, Merve Gül
PY  - 2025
DA  - June
Y2  - 2025
JF  - Siber Güvenlik ve Dijital Ekonomi
PB  - Düzce Üniversitesi
WT  - DergiPark
SN  - 3108-3706
SP  - 29
EP  - 42
VL  - 1
IS  - 1
LA  - tr
AB  - Bu çalışma, Türkçe e-posta içeriklerindeki kimlik avı (phishing) saldırılarını anlamsal düzeyde tespit etmeye yönelik bir metin sınıflandırma yaklaşımı sunmaktadır. Gerçek ve sahte e-postalardan dengeli olarak oluşturulmuş bir veri kümesi kullanılmıştır. Ön işleme sürecinde küçük harfe dönüştürme, noktalama işaretlerinin temizlenmesi ve TF-IDF tabanlı vektörleştirme uygulanmış; bağlamsal temsiller ise BERTurk modeli aracılığıyla elde edilmiştir. Sınıflandırma işlemi Naive Bayes, SVM, LSTM, ELM ve BERT algoritmalarıyla gerçekleştirilmiştir. Modeller Google Colab ortamında eğitilmiş ve doğruluk, F1 skoru ile ROC-AUC metrikleri üzerinden değerlendirilmiştir. Sonuçlar, BERT modelinin Türkçe phishing e-postalarındaki anlamsal farkları başarılı biçimde ayırt ettiğini ortaya koymaktadır. Bu çalışma, morfolojik açıdan zengin dillerde phishing tespiti konusunda literatürdeki boşluğu doldurmayı amaçlamakta ve gerçek zamanlı siber güvenlik sistemlerine entegre edilebilecek ölçeklenebilir bir model önermektedir. Elde edilen bulgular, düşük kaynaklı dillerde bağlamsal doğal dil işleme yöntemlerinin etkinliğini de ortaya koymaktadır.
KW  - Derin Öğrenme
KW  - Doğal Dil İşleme
KW  - Kimlik Avı
KW  - Makine Öğrenmesi
KW  - Türkçe E-posta
N2  - This study proposes a semantic text classification approach for detecting phishing attacks in Turkish email content. A balanced dataset consisting of legitimate and fraudulent emails was constructed. The preprocessing phase included lowercasing, punctuation removal, and TF-IDF-based vectorization, while contextual embeddings were obtained using the BERTurk model. Classification was performed using Naive Bayes, SVM, LSTM, ELM, and BERT algorithms. All models were trained in the Google Colab environment, and their performance was assessed using accuracy, F1-score, and ROC-AUC metrics. The results indicate that the BERT model provides superior performance in identifying semantic differences in Turkish phishing emails. The study addresses the gap in phishing detection for morphologically rich languages such as Turkish and presents a scalable model suitable for integration into real-time cybersecurity systems. The findings also demonstrate the viability of contextual NLP techniques in low-resource language settings.
CR  - Ahi, Ş., & Soğukpınar, İ. (2023). Derin öğrenme modelleri ile kimlik avı e-posta tespiti. Türkiye Bilim Vakfı Bilgisayar Bilimleri ve Mühendisliği Dergisi, 13(2), 17–29.
CR  - Aldakheel, E. A., Zakariah, M., Gashgari, G. A., Almarshad, F. A., & Alzahrani, A. I. A. (2023). A deep learning-based innovative technique for phishing detection in modern security with uniform resource locators. Sensors, 23(9), 4403. https://doi.org/10.3390/s23094403
CR  - Alhogail, A., & Alsabih, A. (2021). Applying machine learning and natural language processing to detect phishing email. Computers & Security, 110, 102414. https://doi.org/10.1016/j.cose.2021.102414
CR  - Al-Yozbaky, R. Sh., & Alanezi, M. (2023). Detection and analyzing phishing emails using NLP techniques. 2023 5th International Conference on Human-Computer Interaction, Optimization and Robotic Applications (HORA), 1–6. https://doi.org/10.1109/HORA58378.2023.10156738
CR  - AlJamal, M., Alquran, R., Aljaidi, M., AlJamal, O. S., Alsarhan, A., AL-Aiash, I., Samara, G., BaniSalman, M., & Khouj, M. (2024). Harnessing ML and NLP for enhanced cybersecurity: A comprehensive approach for phishing email detection. 2024 25th International Arab Conference on Information Technology (ACIT), 1–6. https://doi.org/10.1109/ACIT62805.2024.10877181
CR  - Anilkumar, C., Karrothu, A., Sri Mouli, N., & Bhanu Tej, C. (2023). Recognition and processing of phishing emails using NLP: A survey. 2023 International Conference on Computer Communication and Informatics (ICCCI), 1–6. https://doi.org/10.1109/ICCCI56745.2023.10128481
CR  - Atawneh, S., & Aljehani, H. (2023). Phishing email detection model using deep learning. Electronics, 12(20), 4261. https://doi.org/10.3390/electronics12204261
CR  - Buber, E., Diri, B., & Şahingöz, Ö. K. (2017). DDİ yöntemleri ile oltalama saldırılarının URL’den tespiti. 2017 Uluslararası Bilgisayar Bilimleri ve Mühendisliği Konferansı (UBMK), 253–258. https://doi.org/10.1109/UBMK.2017.8093406
CR  - Egozi, G., & Verma, R. (2018). Phishing email detection using robust NLP techniques. 2018 IEEE International Conference on Data Mining Workshops (ICDMW), 1–8. https://doi.org/10.1109/ICDMW.2018.00009
CR  - Eryılmaz, E. E., Şahin, D. Ö., & Kılıç, E. (2020). Türkçe için makine öğrenimi tabanlı yaramaz elektronik posta algılama sistemi. 5. Uluslararası Bilgisayar Bilimleri ve Mühendisliği Konferansı (UBMK 2020), Kocaeli, Türkiye, pp. 122–130. https://doi.org/10.1109/UBMK50275.2020.9115625
CR  - Fahim, R. A., Arman, M. S., Sultana, I., Tasnim, N., Ahmed, K. R., & Mahmud, I. (2024). PhishGuard: Leveraging NLP and machine learning for email phishing detection. 2024 International Conference on Big Data Analytics in Bioinformatics (DABCon), 1–6. https://doi.org/10.1109/DABCON63472.2024.10919349
CR  - Fette, I., Sadeh, N., & Tomasic, A. (2007). Learning to detect phishing. 16th International Conference on World Wide Web (WWW '07), Banff, Alberta, Canada, pp. 649–656. ACM. https://doi.org/10.1145/1242572.1242650
CR  - Hasanov, I., Virtanen, S., Hakkala, A., & Isoaho, J. (2024). Application of large language models in cybersecurity: A systematic literature review. IEEE Access, 12, 176751–176773. https://doi.org/10.1109/ACCESS.2024.3505983
CR  - Ibrahim, A., Aljarah, I., & Al-Betar, M. A. (2024). Phishing detection in Arabic SMS messages using natural language processing. Proceedings of the 2024 Seventh International Women in Data Science Conference at Prince Sultan University (WiDS PSU), Riyadh, Saudi Arabia. https://doi.org/10.1109/WiDS-PSU61003.2024.00040
CR  - Kopparaju, S. T., Chavarriaga, C., & Galarreta, E. (2024). Natural language processing-enhanced machine learning framework for comprehensive phishing email identification. 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), 1–6. https://doi.org/10.1109/ICCCNT61001.2024.10723950
CR  - Kulal, D., Shiferaw, L., & Niyaz, Q. (2025). Phishing email detection through machine learning and word error correction. 2025 17th International Conference on COMmunication Systems & NETworkS (COMSNETS), Bengaluru, India, pp. 216–223. https://doi.org/10.1109/COMSNETS63942.2025.10885558
CR  - Opara, C., Modesti, P., & Golightly, L. (2025). Evaluating spam filters and stylometric detection of AI-generated phishing emails. Expert Systems with Applications, 235, 127044. https://doi.org/10.1016/j.eswa.2025.127044
CR  - Patel, H., Rehman, U., & Iqbal, F. (2024). Evaluating the efficacy of large language models in identifying phishing attempts. 2024 16th International Conference on Human System Interaction (HSI), 1–7. https://doi.org/10.1109/HSI61632.2024.10613528
CR  - Peng, T., Harris, I. G., & Sawa, Y. (2018). Detecting phishing attacks using natural language processing and machine learning. 2018 IEEE 12th International Conference on Semantic Computing (ICSC), 300–303. https://doi.org/10.1109/ICSC.2018.00056
CR  - Pimpason, N., Viboonsang, P., & Kosolsombat, S. (2025). Phishing email detection model using deep learning. 2025 IEEE International Conference on Cybernetics and Innovations (ICCI), Bangkok, Thailand, pp. 1–6. https://doi.org/10.1109/ICCI64209.2025.10987422
CR  - Özker, U. (2021). İçerik tabanlı oltalama saldırısı tespit sistemi (Master's thesis). İstanbul Kültür Üniversitesi.
CR  - Rabbi, M. F., Champa, A. I., & Zibran, M. F. (2023). Phishy? Detecting phishing emails using ML and NLP. 2023 IEEE/ACIS 21st International Conference on Software Engineering Research, Management and Applications (SERA), 1–6. https://doi.org/10.1109/SERA57763.2023.10197758
CR  - Roumeliotis, K. I., Tselikas, N. D., & Nasiopoulos, D. K. (2024). Next-generation spam filtering: Comparative fine-tuning of LLMs, NLPs, and CNN models for email spam classification. Electronics, 13(11), 2034. https://doi.org/10.3390/electronics13112034
CR  - Salloum, S., Gaber, T., Vadera, S., & Shaalan, K. (2022). A systematic literature review on phishing email detection using natural language processing techniques. IEEE Access, 10, 65703–65734. https://doi.org/10.1109/ACCESS.2022.3183083
CR  - Salloum, S., Gaber, T., Vadera, S., & Shaalan, K. (2021). Phishing email detection using natural language processing techniques: A literature survey. Procedia Computer Science, 189, 19–28. https://doi.org/10.1016/j.procs.2021.05.077
CR  - Sawant, S., Savakhande, R., Sankhe, O., & Tamboli, S. (2023). Phishing detection by integrating machine learning and deep learning. 2023 International Conference on Advances in Computing and Communications (ICACC), New Delhi, India, pp. 104–111. https://doi.org/10.1109/ICACC58235.2023.10117854
CR  - Toğaçar, M. (2021). Web sitelerinde gerçekleştirilen oltalama saldırılarının yapay zekâ yaklaşımı ile tespiti. Bitlis Eren Üniversitesi Fen Bilimleri Dergisi, 10(4), 1603–1614. https://doi.org/10.17798/bitlisfen.988001
CR  - Turhanlar, M. (2019). Detecting Turkish phishing attacks with machine learning classifiers (Master's thesis). Sakarya Üniversitesi.
CR  - Uçar, M. (2020). Phishing detection system using extreme learning machines with different activation function based on majority voting. Politeknik Dergisi, 23(4), 1227–1235. https://doi.org/10.2339/politeknik.1098037
CR  - Verma, S., Ayala-Rivera, V., & Portillo-Dominguez, A. O. (2023). Detection of phishing in mobile instant messaging using natural language processing and machine learning. In CONISOFT 2023: 11th International Conference in Software Engineering Research and Innovation (pp. 106–113). IEEE. https://doi.org/10.1109/CONISOFT58849.2023.00029
UR  - https://dergipark.org.tr/tr/pub/cdej/issue//1702075
L1  - https://dergipark.org.tr/tr/download/article-file/4882417
ER  - 