Research Article

A Multi-Metric Model for analyzing and comparing extractive text summarization approaches and algorithms on scientific papers

Volume: 15, Issue: 1, 29 March 2024

Abstract

As data and information continue to proliferate, text summarization technologies play a critical role in making large volumes of text more accessible and meaningful. In business, the news industry, academic research, and many other fields, text summarization supports quick decisions, faster access to information, and more effective use of resources. Research in this area aims to improve these technologies and to develop new methods and algorithms that summarize texts better. In this study, a new operating model for text summarization that can be applied to different algorithms is proposed and evaluated. Sixteen summarization algorithms covering six approaches (statistical, graph-based, content-based, pointer-based, position-based, and user-oriented) were implemented and tested on 50 different full-text article datasets. Four evaluation metrics (BLEU, ROUGE-N, ROUGE-L, and METEOR) were used to assess the similarity between the generated summaries and the original summaries. The performance of the algorithms within each approach was averaged, and the overall best-performing algorithm was selected. This algorithm was then analyzed further through topic modelling and keyword extraction to identify the key topics and keywords in the summarized text. The proposed model provides a standardized workflow for developing and thoroughly testing summarization algorithms across datasets and evaluation metrics in order to determine the most appropriate summarization approach. The study demonstrates the effectiveness of the model on a variety of algorithm types and text sources.
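The evaluation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes whitespace tokenization, reports F1 variants of ROUGE-N and ROUGE-L, and uses hypothetical helper names (`rouge_n`, `rouge_l`, `mean_score_by_approach`). BLEU and METEOR are omitted for brevity, and any scores passed in are placeholders, not results from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when there is no overlap."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1 based on clipped n-gram overlap (assumed whitespace tokens)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


def mean_score_by_approach(scores):
    """Average per-algorithm scores within each approach.

    `scores` is a hypothetical mapping {approach: {algorithm: score}}.
    """
    return {
        approach: sum(per_algo.values()) / len(per_algo)
        for approach, per_algo in scores.items()
        if per_algo
    }
```

In this sketch, each generated summary would be scored against its original summary with `rouge_n` and `rouge_l`, and the per-algorithm scores would then be grouped by approach and passed to `mean_score_by_approach` before the best-performing algorithm is picked.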

Details

Primary Language

English

Subjects

Natural Language Processing

Section

Research Article

Early Pub Date

29 March 2024

Publication Date

29 March 2024

Submission Date

16 October 2023

Acceptance Date

18 February 2024

Published in Issue

Year 2024, Volume: 15, Issue: 1

Cite

IEEE
[1] M. A. Dursun and S. Serttaş, “A Multi-Metric Model for analyzing and comparing extractive text summarization approaches and algorithms on scientific papers”, DÜMF MD, vol. 15, no. 1, pp. 31–48, Mar. 2024, doi: 10.24012/dumf.1376978.

All articles published by DUJE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit, and adapt the work, provided the original work and source are appropriately cited.