Research Article

Transformer-Based Summarization Model for Story Books

Year 2025, Volume: 37, Issue: 1, 401-410, 27.03.2025
https://doi.org/10.35234/fumbd.1599232

Abstract

Although many models have been proposed for document summarization, summarization models that address storybooks remain limited. To tackle this problem, a new transformer-based summarization model is proposed. The proposed model consists of KeyBERT, DistilBERT-NER, TF-IDF, and BART. KeyBERT and TF-IDF are used to extract keywords, while DistilBERT-NER is used to extract named entities from the storybooks. By combining these keywords with the named entities, a transformer-based summarization model for English storybooks is constructed. The proposed model is compared with fine-tuned T5, BART, and PEGASUS models, and the outputs are also evaluated by human judges. The experimental results show that the proposed model achieves higher Rouge-L F1 scores in story summarization than the baseline methods.
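
The abstract describes a four-component pipeline. The sketch below, in Python with the keybert, scikit-learn, and transformers libraries, shows one plausible way to wire the components together. The specific checkpoints (dslim/distilbert-NER, facebook/bart-large-cnn), the strategy of prepending the merged guide terms to the input, the function name summarize_story, and all parameter values are illustrative assumptions, not the authors' exact configuration.

# A minimal sketch of the described pipeline under the assumptions stated above.
from keybert import KeyBERT
from sklearn.feature_extraction.text import TfidfVectorizer
from transformers import pipeline

def summarize_story(story: str, top_n: int = 10) -> str:
    # 1) Embedding-based keywords with KeyBERT.
    keybert_terms = [kw for kw, _ in KeyBERT().extract_keywords(story, top_n=top_n)]

    # 2) Statistical keywords with TF-IDF over a crude sentence split of the story.
    vectorizer = TfidfVectorizer(stop_words="english")
    tfidf = vectorizer.fit_transform(story.split(". "))
    scores = tfidf.max(axis=0).toarray().ravel()
    vocab = vectorizer.get_feature_names_out()
    tfidf_terms = [vocab[i] for i in scores.argsort()[::-1][:top_n]]

    # 3) Named entities with a DistilBERT NER model (assumed checkpoint).
    ner = pipeline("ner", model="dslim/distilbert-NER", aggregation_strategy="simple")
    entities = [ent["word"] for ent in ner(story)]

    # 4) Merge keywords and entities (deduplicated, order-preserving) and
    #    prepend them to the story as a guide signal for BART generation.
    guide = ", ".join(dict.fromkeys(keybert_terms + tfidf_terms + entities))
    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    result = summarizer(guide + ". " + story, max_length=150, min_length=40,
                        truncation=True, do_sample=False)
    return result[0]["summary_text"]

How exactly the keywords and entities condition BART's generation is not specified in the abstract, so the prepend-and-summarize step above is only one reasonable realization of that coupling.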

References

  • Alomari A, Idris N, Sabri AQM and Alsmadi I. Deep reinforcement and transfer learning for abstractive text summarization: A review, Comput Speech Lang 2022; 71:101276.
  • Li P, Lu W and Cheng Q. Generating a related work section for scientific papers: an optimized approach with adopting problem and method information, Scientometrics 2022; 127(8), 4397-4417.
  • Cai X, Liu S, Yang L, Lu Y, Zhao J, Shen D and Liu T. COVIDSum: A linguistically enriched SciBERT-based summarization model for COVID-19. J Biomed Inform 2022; 127:103999.
  • Altmami NI and Menai MEB. Automatic summarization of scientific articles: A survey, J King Saud Univ-Comput Inf Sci 2020; 34(4), 1011-1028.
  • Wang Z, Duan Z, Zhang H, Wang C, Tian L, Chen B, Zhou M. Friendly topic assistant for transformer based abstractive summarization. Empir Methods Nat Lang Process 2020; 485-497.
  • Koncel-Kedziorski R, Bekal D, Luan Y, Lapata M and Hajishirzi H. Text generation from knowledge graphs with graph transformers, arXiv preprint arXiv:1904.02342, 2019.
  • Huang Z and Xie Z. A patent keywords extraction method using TextRank model with prior public knowledge, Complex Intell Syst 2022; 8(1), 1-12.
  • Qiu D, Zheng Q. Improving TextRank Algorithm for Automatic Keyword Extraction with Tolerance Rough Set. Int J Fuzzy Syst 2022; 24(3).
  • Khademi ME, Fakhredanesh M. Persian automatic text summarization based on named entity recognition. Iran J Sci Technol Trans Electr Eng 2020; 1-12.
  • Du Y, Zhao Y, Yan J and Li Q. UGDAS: Unsupervised graph-network-based denoiser for abstractive summarization in biomedical domain, Methods 2022; 203, 160-166.
  • Xiao L, He H, Jin Y. FusionSum: Abstractive summarization with sentence fusion and cooperative reinforcement learning. Knowl-Based Syst 2022; 243: 108483.
  • Moirangthem DS and Lee M. Abstractive summarization of long texts by representing multiple compositionalities with temporal hierarchical pointer generator network, Neural Networks 2020; 124, 1-11.
  • Liang Z, Du J and Li C. Abstractive social media text summarization using selective reinforced Seq2Seq attention model, Neurocomputing 2020; 410, 432-440.
  • Lewis M, Liu Y, Goyal N, Ghazvininejad M, Mohamed A, Levy O, Stoyanov V, Zettlemoyer L. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension, arXiv preprint arXiv:1910.13461, 2019.
  • Yu T, Su D, Dai W and Fung P. Dimsum @LaySumm 20: BART-based approach for scientific document summarization, arXiv preprint arXiv:2010.09252, 2020.
  • Bajaj A, Dangati P, Krishna K, Ashok Kumar P, Uppaal R, Windsor B, Brenner E, Dotterrer D, Das R, McCallum A. Long Document Summarization in a Low Resource Setting using Pretrained Language Models, arXiv preprint arXiv:2103.00751, 2021.
  • Christian H, Agus MP, Suhartono D. Single document automatic text summarization using term frequency-inverse document frequency (TF-IDF). ComTech 2016; 7(4): 285-294.
  • Deng Z, Ma F, Lan R, Huang W and Luo X. A two-stage Chinese text summarization algorithm using keyword information and adversarial learning, Neurocomputing 2021; 425:117-126.
  • Li C, Xu W, Li S, Gao S. Guiding generation for abstractive text summarization based on key information guide network. NAACL HLT 2018; 2: 55-60.
  • Salton G, Buckley C. Term-weighting approaches in automatic text retrieval. Inf Process Manag 1988; 24(5): 513-523.
  • Li J, Tang T, Zhao WX, Wei Z, Yuan NJ and Wen JR. Few-shot knowledge graph-to-text generation with pretrained language models, arXiv preprint arXiv:2106.01623, 2021.
  • Wang X and Iwaihara M. Extractive summarization utilizing keyphrases by finetuning BERT-based model. In International Conference on Asian Digital Libraries, Cham: Springer International Publishing, 2022; 59-72.
  • Yoo C, Lee H. Improving Abstractive Dialogue Summarization Using Keyword Extraction. Appl Sci 2023; 13(17): 9771.
  • Zhao Z, Hou Y, Wang D, Yu M, Liu C and Ma X. Educational question generation of children storybooks via question type distribution learning and event-centric summarization. arXiv preprint arXiv:2203.14187, 2022.
  • Ling Z, Xie Y, Dong C and Shen Y. Enhancing Factual Consistency in Text Summarization via Counterfactual Debiasing. In Proceedings of the 31st International Conference on Computational Linguistics 2025, pp. 7912-7924.
  • Upadhyay A, Bhavsar N, Bhatnagar A, Singh M and Motlicek P. Automatic summarization for creative writing: BART based pipeline method for generating summary of movie scripts. In Proceedings of The Workshop on Automatic Summarization for Creative Writing, 2022, pp. 44-50.
  • Liu Y, Maier W, Minker W and Ultes S. Empathetic dialogue generation with pre-trained RoBERTa-GPT2 and external knowledge. In Conversational AI for Natural Human-Centric Interaction: International Workshop on Spoken Dialogue System Technology, Singapore: Springer Nature Singapore, 2022; 67-81.
  • Sanh V, Debut L, Chaumond J and Wolf T. DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter, arXiv preprint arXiv:1910.01108, 2019.
  • Lin CY. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, 2004; 74-81.
  • Falotico R, Quatto P. Fleiss’ kappa statistic without paradoxes. Qual Quant 2015; 49(2): 463-470.
  • Raffel C, Shazeer N, Roberts A, Lee K, Narang S, Matena M, Zhou Y, Li W, Liu PJ. Exploring the limits of transfer learning with a unified text-to-text transformer. J Mach Learn Res 2020; 21(140): 1-67.
  • Internet: T5-base fine-tuned for news summarization. URL: https://huggingface.co/mrm8488/t5-base-finetuned-summarize-news, Last accessed: 08.12.2024.
  • Xu J, Desai S and Durrett G. Understanding neural abstractive summarization models via uncertainty. arXiv preprint arXiv:2010.07882, 2020.
  • Internet: BART-large-CNN fine-tuned on the SAMSum dialogue corpus. URL: https://huggingface.co/philschmid/bart-large-cnn-samsum, Last accessed: 08.12.2024.
  • Zhang J, Zhao Y, Saleh M and Liu P. Pegasus: Pre-training with extracted gap-sentences for abstractive summarization. In International Conference on Machine Learning, 2020; 11328-11339.
  • Internet: PEGASUS fine-tuned for financial summarization. URL: https://huggingface.co/human-centered-summarization/financial-summarization-pegasus, Last accessed: 08.12.2024.


Details

Primary Language: Turkish
Subjects: Natural Language Processing
Section: MBD
Authors

Mehtap Ülker (ORCID: 0000-0001-8680-8518)

Ahmet Bedri Özer (ORCID: 0000-0002-8005-7386)

Publication Date: 27 March 2025
Submission Date: 10 December 2024
Acceptance Date: 13 March 2025
Published in Issue: Year 2025, Volume: 37, Issue: 1

How to Cite

APA Ülker, M., & Özer, A. B. (2025). Hikâye Kitapları için Transformatör Tabanlı Bir Özetleme Modeli. Fırat Üniversitesi Mühendislik Bilimleri Dergisi, 37(1), 401-410. https://doi.org/10.35234/fumbd.1599232
AMA Ülker M, Özer AB. Hikâye Kitapları için Transformatör Tabanlı Bir Özetleme Modeli. Fırat Üniversitesi Mühendislik Bilimleri Dergisi. March 2025;37(1):401-410. doi:10.35234/fumbd.1599232
Chicago Ülker, Mehtap, and Ahmet Bedri Özer. “Hikâye Kitapları için Transformatör Tabanlı Bir Özetleme Modeli”. Fırat Üniversitesi Mühendislik Bilimleri Dergisi 37, no. 1 (March 2025): 401-10. https://doi.org/10.35234/fumbd.1599232.
EndNote Ülker M, Özer AB (01 March 2025) Hikâye Kitapları için Transformatör Tabanlı Bir Özetleme Modeli. Fırat Üniversitesi Mühendislik Bilimleri Dergisi 37 1 401–410.
IEEE M. Ülker and A. B. Özer, “Hikâye Kitapları için Transformatör Tabanlı Bir Özetleme Modeli”, Fırat Üniversitesi Mühendislik Bilimleri Dergisi, vol. 37, no. 1, pp. 401–410, 2025, doi: 10.35234/fumbd.1599232.
ISNAD Ülker, Mehtap - Özer, Ahmet Bedri. “Hikâye Kitapları için Transformatör Tabanlı Bir Özetleme Modeli”. Fırat Üniversitesi Mühendislik Bilimleri Dergisi 37/1 (March 2025), 401-410. https://doi.org/10.35234/fumbd.1599232.
JAMA Ülker M, Özer AB. Hikâye Kitapları için Transformatör Tabanlı Bir Özetleme Modeli. Fırat Üniversitesi Mühendislik Bilimleri Dergisi. 2025;37:401–410.
MLA Ülker, Mehtap, and Ahmet Bedri Özer. “Hikâye Kitapları için Transformatör Tabanlı Bir Özetleme Modeli”. Fırat Üniversitesi Mühendislik Bilimleri Dergisi, vol. 37, no. 1, 2025, pp. 401-10, doi:10.35234/fumbd.1599232.
Vancouver Ülker M, Özer AB. Hikâye Kitapları için Transformatör Tabanlı Bir Özetleme Modeli. Fırat Üniversitesi Mühendislik Bilimleri Dergisi. 2025;37(1):401-10.