Research Article

COMPARATIVE ANALYSIS OF ETHICAL INCIDENT APPROACHES IN GENERATIVE ARTIFICIAL INTELLIGENCE APPLICATIONS UTILIZING LARGE LANGUAGE MODELS (LLM)

Year 2025, Volume: 11 Issue: 1, 55 - 72, 30.06.2025
https://doi.org/10.22531/muglajsci.1637684

Abstract

This study observes how generative artificial intelligence applications used across many different business domains respond to ethical situations. Careful examination of the responses produced by generative AI applications built on Large Language Models (LLMs) is important in all business domains. In the study, responses to sample ethical cases were examined through five major LLM-based generative AI applications: DeepSeek, ChatGPT-4o, Qwen Chat 2.5 Max, Gemini 2.0 Flash, and Copilot. Each application was asked to provide the reasons, explanations, justifying elements, and interpretations underlying its responses to the ethical cases. Based on the comparison results, agreements and disagreements between the applications were examined, and the approaches of the LLMs to ethical issues were revealed through their answers. Ethical cases were chosen for examination because they have no absolute, single correct answer; ethical evaluations that vary from person to person are also a challenging problem area for LLM applications. Thirteen sample ethical cases were described, and questions were posed to the five generative AI applications without any prior preparation stage. In their answers, the applications were asked to justify their positions and to comment on the cases. Based on the findings obtained, evaluations were made according to common points, differences, and general tendencies. The findings show that LLMs have made progress in addressing ethical issues, but the applications still need to improve at producing consistent and fair solutions to ethical dilemmas. This once again emphasizes the importance of human oversight in the ethical decision-making processes of LLMs and of the integration of ethical rules.
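The abstract describes the comparison protocol only at a high level (13 cases posed to 5 applications, followed by an agreement/disagreement analysis) and does not specify how responses were collected or scored. As an illustration only, the sketch below shows one way such a protocol could be organized in Python; the `ModelClient` stubs, the coarse verdict labels, and the pairwise-agreement tally are assumptions made for demonstration, not the study's actual procedure.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Dict, List, Tuple


@dataclass
class EthicalCase:
    """One of the 13 sample ethical cases posed, unmodified, to every application."""
    case_id: int
    description: str


# Hypothetical client type: given a case, return the application's coarse verdict
# label extracted from its free-text answer (verdict labels are an assumption; the
# article reports reasons, explanations, and justifications qualitatively).
ModelClient = Callable[[EthicalCase], str]


def make_stub_client(name: str) -> ModelClient:
    """Return a placeholder client; a real one would prompt DeepSeek, ChatGPT-4o,
    Qwen Chat 2.5 Max, Gemini 2.0 Flash, or Copilot and parse the response."""
    def query_model(case: EthicalCase) -> str:
        # Fixed stub verdict so the sketch runs without any external service.
        return "undecided"
    return query_model


MODELS: Dict[str, ModelClient] = {
    name: make_stub_client(name)
    for name in ("DeepSeek", "ChatGPT-4o", "Qwen Chat 2.5 Max",
                 "Gemini 2.0 Flash", "Copilot")
}


def pairwise_agreement(cases: List[EthicalCase]) -> Dict[Tuple[str, str], float]:
    """Fraction of cases on which each pair of applications gives the same verdict."""
    verdicts = {name: [client(case) for case in cases]
                for name, client in MODELS.items()}
    scores: Dict[Tuple[str, str], float] = {}
    for a, b in combinations(MODELS, 2):
        same = sum(va == vb for va, vb in zip(verdicts[a], verdicts[b]))
        scores[(a, b)] = same / len(cases)
    return scores


if __name__ == "__main__":
    cases = [EthicalCase(i, f"Sample ethical case #{i}") for i in range(1, 14)]  # 13 cases
    for (a, b), score in sorted(pairwise_agreement(cases).items()):
        print(f"{a} vs {b}: agreement {score:.2f}")
```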

References

  • Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., & Amodei, D., “Language models are few-shot learners.”, Advances in Neural Information Processing Systems, 33, 1877-1901, 2020.
  • Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., & Liu, P. J., “Exploring the limits of transfer learning with a unified text-to-text transformer.”, Journal of Machine Learning Research, 21(140), 1-67, 2020.
  • Bommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., von Arx, S., & Liang, P., “On the opportunities and risks of foundation models.”, arXiv preprint arXiv:2108.07258, 2021.
  • Strubell, E., Ganesh, A., & McCallum, A., “Energy and policy considerations for deep learning in NLP.”, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 3645-3650, 2019.
  • Carlini, N., Tramer, F., Wallace, E., Jagielski, M., Herbert-Voss, A., Lee, K., & Raffel, C., “Extracting training data from large language models.”, USENIX Security Symposium, 2633-2650, 2021.
  • Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S., “On the dangers of stochastic parrots: Can language models be too big?”, Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610-623, 2021.
  • Marcus, G., “The next decade in AI: Four steps towards robust artificial intelligence.”, arXiv preprint arXiv:2002.06177, 2020.
  • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I., “Attention is all you need.”, arXiv preprint, https://arxiv.org/abs/1706.03762, 2017.
  • Devlin, J., Chang, M. W., Lee, K., & Toutanova, K., “BERT: Pre-training of deep bidirectional transformers for language understanding.”, arXiv preprint, https://arxiv.org/abs/1810.04805, 2019.
  • Hendrycks, D., & Gimpel, K., “Gaussian error linear units (GELUs).”, arXiv preprint, https://arxiv.org/abs/1606.08415, 2016.
  • Ba, J. L., Kiros, J. R., & Hinton, G. E., “Layer normalization.”, arXiv preprint, https://arxiv.org/abs/1607.06450, 2016.
  • He, K., Zhang, X., Ren, S., & Sun, J., “Deep residual learning for image recognition.”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770-778, 2016.
  • DeepSeek AI, “DeepSeek-V2: Advancing Large Language Model Capabilities.”, DeepSeek AI Technical Report, 2024.
  • OpenAI, “GPT-4 Technical Report.”, arXiv preprint, https://arxiv.org/abs/2303.08774, 2023.
  • Alibaba Cloud, “Qwen Language Models: Multilingual and Scalable AI Solutions.”, Alibaba Technical Paper, 2024.
  • Google DeepMind, “Gemini: A Family of Multimodal AI Models.”, Google Research, 2024.
  • GitHub & OpenAI, “GitHub Copilot: AI-Powered Code Assistance.”, GitHub Blog, 2023.
  • Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., & Galstyan, A., “A survey on bias and fairness in machine learning.”, ACM Computing Surveys (CSUR), 54(6), 1-35, 2021.
  • European Commission, “Ethics guidelines for trustworthy AI.”, European Commission, 2019.
  • OECD, “OECD principles on artificial intelligence.”, OECD, 2019.


Details

Primary Language English
Subjects Reinforcement Learning
Journal Section Articles
Authors

Serdar Biroğul 0000-0003-4966-5970

Publication Date June 30, 2025
Submission Date February 11, 2025
Acceptance Date June 1, 2025
Published in Issue Year 2025 Volume: 11 Issue: 1

Cite

APA Biroğul, S. (2025). COMPARATIVE ANALYSIS OF ETHICAL INCIDENT APPROACHES IN GENERATIVE ARTIFICIAL INTELLIGENCE APPLICATIONS UTILIZING LARGE LANGUAGE MODELS (LLM). Mugla Journal of Science and Technology, 11(1), 55-72. https://doi.org/10.22531/muglajsci.1637684
AMA Biroğul S. COMPARATIVE ANALYSIS OF ETHICAL INCIDENT APPROACHES IN GENERATIVE ARTIFICIAL INTELLIGENCE APPLICATIONS UTILIZING LARGE LANGUAGE MODELS (LLM). Mugla Journal of Science and Technology. June 2025;11(1):55-72. doi:10.22531/muglajsci.1637684
Chicago Biroğul, Serdar. “COMPARATIVE ANALYSIS OF ETHICAL INCIDENT APPROACHES IN GENERATIVE ARTIFICIAL INTELLIGENCE APPLICATIONS UTILIZING LARGE LANGUAGE MODELS (LLM)”. Mugla Journal of Science and Technology 11, no. 1 (June 2025): 55-72. https://doi.org/10.22531/muglajsci.1637684.
EndNote Biroğul S (June 1, 2025) COMPARATIVE ANALYSIS OF ETHICAL INCIDENT APPROACHES IN GENERATIVE ARTIFICIAL INTELLIGENCE APPLICATIONS UTILIZING LARGE LANGUAGE MODELS (LLM). Mugla Journal of Science and Technology 11 1 55–72.
IEEE S. Biroğul, “COMPARATIVE ANALYSIS OF ETHICAL INCIDENT APPROACHES IN GENERATIVE ARTIFICIAL INTELLIGENCE APPLICATIONS UTILIZING LARGE LANGUAGE MODELS (LLM)”, Mugla Journal of Science and Technology, vol. 11, no. 1, pp. 55–72, 2025, doi: 10.22531/muglajsci.1637684.
ISNAD Biroğul, Serdar. “COMPARATIVE ANALYSIS OF ETHICAL INCIDENT APPROACHES IN GENERATIVE ARTIFICIAL INTELLIGENCE APPLICATIONS UTILIZING LARGE LANGUAGE MODELS (LLM)”. Mugla Journal of Science and Technology 11/1 (June2025), 55-72. https://doi.org/10.22531/muglajsci.1637684.
JAMA Biroğul S. COMPARATIVE ANALYSIS OF ETHICAL INCIDENT APPROACHES IN GENERATIVE ARTIFICIAL INTELLIGENCE APPLICATIONS UTILIZING LARGE LANGUAGE MODELS (LLM). Mugla Journal of Science and Technology. 2025;11:55–72.
MLA Biroğul, Serdar. “COMPARATIVE ANALYSIS OF ETHICAL INCIDENT APPROACHES IN GENERATIVE ARTIFICIAL INTELLIGENCE APPLICATIONS UTILIZING LARGE LANGUAGE MODELS (LLM)”. Mugla Journal of Science and Technology, vol. 11, no. 1, 2025, pp. 55-72, doi:10.22531/muglajsci.1637684.
Vancouver Biroğul S. COMPARATIVE ANALYSIS OF ETHICAL INCIDENT APPROACHES IN GENERATIVE ARTIFICIAL INTELLIGENCE APPLICATIONS UTILIZING LARGE LANGUAGE MODELS (LLM). Mugla Journal of Science and Technology. 2025;11(1):55-72.

Mugla Journal of Science and Technology (MJST) is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License.