Research Article

A Comprehensive Study on Transformer Architecture: Theoretical Foundations, Structural Components, and a Clinical Application

Year 2026, Volume: 38, Issue: 1, pp. 85-104, 29.03.2026
https://doi.org/10.35234/fumbd.1727903
https://izlik.org/JA37JU48XP

Abstract

This study provides a comprehensive examination of the Transformer architecture, which has revolutionized the field of Natural Language Processing (NLP). It begins with an overview of previous architectures such as FNN, RNN, LSTM, and GRU, highlighting their limitations. The innovative structure of the Transformer is then analyzed in depth, focusing on components such as self-attention and multi-head attention mechanisms. Mathematical formulations are presented to explain how attention computations are carried out within the model. Furthermore, in the application section of the paper, a BioClinicalBERT-based drug recommendation system is developed, and its performance on clinical texts is evaluated. The results demonstrate the potential of Transformer-based models to deliver specialized solutions in the healthcare domain.
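The attention computation the abstract refers to is the scaled dot-product formulation of Vaswani et al., Attention(Q, K, V) = softmax(QK^T / sqrt(d_k))V. A minimal NumPy sketch of that formula (illustrative only; not the paper's implementation, and the toy shapes are assumptions) is:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)  # (tokens, tokens)
    weights = softmax(scores, axis=-1)              # each row sums to 1
    return weights @ V, weights

# Toy example: 3 tokens, head dimension d_k = 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out, w = scaled_dot_product_attention(Q, K, V)
# Each row of w is a probability distribution over the 3 tokens.
```

In multi-head attention, this same computation runs in parallel over several learned projections of Q, K, and V, and the per-head outputs are concatenated and projected back.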

References

  • Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I. Attention Is All You Need, 2017, arXiv: 1706.03762. doi: 10.48550/arXiv.1706.03762.
  • Cho K, van Merrienboer B, Gülçehre Ç, Bahdanau D, Bougares F, Schwenk H, Bengio Y. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, 2014, arXiv: 1406.1078. doi: 10.48550/arXiv.1406.1078.
  • Hochreiter S, Schmidhuber J. Long Short-Term Memory, Neural Comput, vol. 9, no. 8, pp. 1735-1780, 1997, doi: 10.1162/neco.1997.9.8.1735.
  • Chung J, Gülçehre Ç, Cho K, Bengio Y. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014, arXiv: 1412.3555. doi: 10.48550/arXiv.1412.3555.
  • Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, arXiv: 1810.04805. doi: 10.48550/arXiv.1810.04805.
  • Radford A, Narasimhan K. Improving Language Understanding by Generative Pre-Training, 2018. Accessed: 18 August 2025. [Online]. Available: https://www.semanticscholar.org/paper/Improving-Language-Understanding-by-Generative-Radford-Narasimhan/cd18800a0fe0b668a1cc19f2ec95b5003d0a5035
  • Raffel C, Shazeer N, Roberts A, Lee K, Narang S, Matena M, Zhou Y, Li W, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2023, arXiv: 1910.10683. doi: 10.48550/arXiv.1910.10683.
  • Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, 2020, arXiv: 2010.11929. doi: 10.48550/arXiv.2010.11929.
  • Zhou H, Zhang S, Peng J, Zhang S, Li J, Xiong B, Zhang W, Xiong W, et al. Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting, 2020, arXiv: 2012.07436. doi: 10.48550/arXiv.2012.07436.
  • Ling X, Li Z, Wang Y, You Z. Transformers in Protein: A Survey, 2025, arXiv: 2505.20098. doi: 10.48550/arXiv.2505.20098.
  • Nerella S, Bandyopadhyay S, Zhang J, Contreras M, Siegel S, Bumin A, Silva B, Sena J, et al. Transformers and large language models in healthcare: A review, Artif Intell Med, vol. 154, p. 102900, 2024, doi: 10.1016/j.artmed.2024.102900.
  • Patil R, Boit S, Gudivada V, Nandigam J. A Survey of Text Representation and Embedding Techniques in NLP, IEEE Access, vol. 11, pp. 36120-36146, 2023, doi: 10.1109/ACCESS.2023.3266377.
  • Ethayarajh K. How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings, 2019, arXiv: 1909.00512. doi: 10.48550/arXiv.1909.00512.
  • Luo Q, Zeng W, Chen M, Peng G, Yuan X, Yin Q. Self-Attention and Transformers: Driving the Evolution of Large Language Models, in 2023 IEEE 6th International Conference on Electronic Information and Communication Technology (ICEICT), 2023, pp. 401-405. doi: 10.1109/ICEICT57916.2023.10245906.
  • Deora P, Ghaderi R, Taheri H, Thrampoulidis C. On the Optimization and Generalization of Multi-head Attention, 2023, arXiv: 2310.12680. doi: 10.48550/arXiv.2310.12680.
  • Li J, Wang X, Tu Z, Lyu MR. On the diversity of multi-head attention, Neurocomputing, vol. 454, pp. 14-24, 2021, doi: 10.1016/j.neucom.2021.04.038.
  • He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition, in Proc IEEE Conf Comput Vis Pattern Recognit (CVPR), 2016, pp. 770-778. doi: 10.1109/CVPR.2016.90.
  • Ba JL, Kiros JR, Hinton GE. Layer Normalization, 2016, arXiv: 1607.06450. doi: 10.48550/arXiv.1607.06450.
  • Xiong R, Yang Y, He D, Zheng K, Zheng S, Xing C, Zhang H, Lan Y, et al. On Layer Normalization in the Transformer Architecture, 2020, arXiv: 2002.04745. doi: 10.48550/arXiv.2002.04745.
  • Gerber I. Attention Is Not All You Need: The Importance of Feedforward Networks in Transformer Models, 2025, arXiv: 2505.06633. doi: 10.48550/arXiv.2505.06633.
  • İncidelen M, Aydoğan M. Developing Question-Answering Models in Low-Resource Languages: A Case Study on Turkish Medical Texts Using Transformer-Based Approaches, in Proc 8th Int Artif Intell Data Process Symp (IDAP), 2024, pp. 1-4. doi: 10.1109/IDAP64064.2024.10711128.
  • Zhou B, Yang G, Shi Z, Ma S. Natural Language Processing for Smart Healthcare, IEEE Rev Biomed Eng, vol. 17, pp. 4-18, 2024, doi: 10.1109/RBME.2022.3210270.
  • Turchin A, Masharsky S, Zitnik M. Comparison of BERT implementations for natural language processing of narrative medical documents, Inf Med Unlocked, vol. 36, p. 101139, 2023, doi: 10.1016/j.imu.2022.101139.
  • Lee J, Yoon W, Kim S, Kim D, Kim S, So CH, Kang J. BioBERT: a pre-trained biomedical language representation model for biomedical text mining, Bioinformatics, vol. 36, no. 4, pp. 1234-1240, 2020, doi: 10.1093/bioinformatics/btz682.
  • Biseda B, Desai G, Lin H, Philip A. Prediction of ICD Codes with Clinical BERT Embeddings and Text Augmentation with Label Balancing using MIMIC-III, 2020, arXiv: 2008.10492. doi: 10.48550/arXiv.2008.10492.
  • Alsentzer E, Murphy J, Boag W, Weng WH, Jin D, Naumann T, McDermott M. Publicly Available Clinical BERT Embeddings, 2019, arXiv: 1904.03323. doi: 10.48550/arXiv.1904.03323.
  • emilyalsentzer/Bio_ClinicalBERT · Hugging Face. Accessed: 17 August 2025. [Online]. Available: https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT
  • Lyu W, Dong X, Wong R, Zheng S, Abell-Hart K, Wang F, Chen C. A Multimodal Transformer: Fusing Clinical Notes with Structured EHR Data for Interpretable In-Hospital Mortality Prediction, 2022, arXiv: 2208.10240. doi: 10.48550/arXiv.2208.10240.
  • Yang Z, Mitra A, Liu W, Berlowitz D, Yu H. TransformEHR: transformer-based encoder-decoder generative model to enhance prediction of disease outcomes using electronic health records, Nat Commun, vol. 14, no. 1, p. 7857, 2023, doi: 10.1038/s41467-023-43715-z.
  • Cho HN, Jun TJ, Kim YH, Kang H, Ahn I, Gwon H, Kim Y, Seo J, et al. Task-Specific Transformer-Based Language Models in Health Care: Scoping Review, JMIR Med Inform, vol. 12, p. e49724, 2024, doi: 10.2196/49724.
  • Omolayo O, Taiwo A, Aduloju T. Transformer-based language models for clinical text mining: A systematic review of applications in diagnostic decision support, risk stratification, and electronic health record summarization, Gulf Journal of Advance Business Research, vol. 1, pp. 81-98, 2025.
  • Levra AG, Gatti M, Mene R, Shiffer D, Costantino G, Solbiati M, Furlan R, Dipaola F. A large language model-based clinical decision support system for syncope recognition in the emergency department: A framework for clinical workflow integration, Eur J Intern Med, vol. 131, pp. 113-120, 2025, doi: 10.1016/j.ejim.2024.09.017.
  • Ran WH, Xi X, Li F, Lu J, Jiang J, Huang H, Zhang Y, Li S. Structured Semantics from Unstructured Notes: Language Model Approaches to EHR-Based Decision Support, 2025, arXiv: 2506.06340. doi: 10.48550/arXiv.2506.06340.
  • UCI ML Drug Review dataset. Accessed: 25 May 2025. [Online]. Available: https://www.kaggle.com/datasets/jessicali9530/kuc-hackathon-winter-2018.

Turkish title: Transformer Mimarisi Üzerine Kapsamlı Bir İnceleme: Teorik Temeller, Yapısal Özellikler ve Klinik Bir Uygulama



Details

Primary Language: Turkish
Subjects: Natural Language Processing
Section: Research Article
Authors

Şahika Ercan (ORCID: 0009-0005-4596-7914)

Erkan Tanyıldızı (ORCID: 0000-0003-2973-9389)

Submission Date: 26 June 2025
Acceptance Date: 2 November 2025
Publication Date: 29 March 2026
DOI: https://doi.org/10.35234/fumbd.1727903
IZ: https://izlik.org/JA37JU48XP
Published Issue: Year 2026, Volume: 38, Issue: 1

How to Cite

APA Ercan, Ş., & Tanyıldızı, E. (2026). Transformer Mimarisi Üzerine Kapsamlı Bir İnceleme: Teorik Temeller, Yapısal Özellikler ve Klinik Bir Uygulama. Fırat Üniversitesi Mühendislik Bilimleri Dergisi, 38(1), 85-104. https://doi.org/10.35234/fumbd.1727903
AMA 1.Ercan Ş, Tanyıldızı E. Transformer Mimarisi Üzerine Kapsamlı Bir İnceleme: Teorik Temeller, Yapısal Özellikler ve Klinik Bir Uygulama. Fırat Üniversitesi Mühendislik Bilimleri Dergisi. 2026;38(1):85-104. doi:10.35234/fumbd.1727903
Chicago Ercan, Şahika, ve Erkan Tanyıldızı. 2026. “Transformer Mimarisi Üzerine Kapsamlı Bir İnceleme: Teorik Temeller, Yapısal Özellikler ve Klinik Bir Uygulama”. Fırat Üniversitesi Mühendislik Bilimleri Dergisi 38 (1): 85-104. https://doi.org/10.35234/fumbd.1727903.
EndNote Ercan Ş, Tanyıldızı E (01 Mart 2026) Transformer Mimarisi Üzerine Kapsamlı Bir İnceleme: Teorik Temeller, Yapısal Özellikler ve Klinik Bir Uygulama. Fırat Üniversitesi Mühendislik Bilimleri Dergisi 38 1 85–104.
IEEE [1]Ş. Ercan ve E. Tanyıldızı, “Transformer Mimarisi Üzerine Kapsamlı Bir İnceleme: Teorik Temeller, Yapısal Özellikler ve Klinik Bir Uygulama”, Fırat Üniversitesi Mühendislik Bilimleri Dergisi, c. 38, sy 1, ss. 85–104, Mar. 2026, doi: 10.35234/fumbd.1727903.
ISNAD Ercan, Şahika - Tanyıldızı, Erkan. “Transformer Mimarisi Üzerine Kapsamlı Bir İnceleme: Teorik Temeller, Yapısal Özellikler ve Klinik Bir Uygulama”. Fırat Üniversitesi Mühendislik Bilimleri Dergisi 38/1 (01 Mart 2026): 85-104. https://doi.org/10.35234/fumbd.1727903.
JAMA 1.Ercan Ş, Tanyıldızı E. Transformer Mimarisi Üzerine Kapsamlı Bir İnceleme: Teorik Temeller, Yapısal Özellikler ve Klinik Bir Uygulama. Fırat Üniversitesi Mühendislik Bilimleri Dergisi. 2026;38:85–104.
MLA Ercan, Şahika, ve Erkan Tanyıldızı. “Transformer Mimarisi Üzerine Kapsamlı Bir İnceleme: Teorik Temeller, Yapısal Özellikler ve Klinik Bir Uygulama”. Fırat Üniversitesi Mühendislik Bilimleri Dergisi, c. 38, sy 1, Mart 2026, ss. 85-104, doi:10.35234/fumbd.1727903.
Vancouver 1.Şahika Ercan, Erkan Tanyıldızı. Transformer Mimarisi Üzerine Kapsamlı Bir İnceleme: Teorik Temeller, Yapısal Özellikler ve Klinik Bir Uygulama. Fırat Üniversitesi Mühendislik Bilimleri Dergisi. 01 Mart 2026;38(1):85-104. doi:10.35234/fumbd.1727903