Research Article

A Multi-Metric Model for analyzing and comparing extractive text summarization approaches and algorithms on scientific papers

Volume: 15 Number: 1 March 29, 2024

Abstract

As the volume of data and information keeps growing, text summarization technologies play a critical role in making large amounts of text more accessible and meaningful. In business, the news industry, academic research, and many other fields, text summarization supports quick decisions, faster access to information, and more effective use of resources. Research in this area continues to improve these technologies and to develop new methods and algorithms that summarize texts better. Text summarization and research in this field are therefore of great importance in the information age. In this study, a new operating model for text summarization that can be applied to different algorithms is proposed and evaluated. Sixteen summarization algorithms covering six approaches (statistical, graph-based, content-based, pointer-based, position-based, and user-oriented) were implemented and tested on 50 different full-text article datasets. Four evaluation metrics (BLEU, ROUGE-N, ROUGE-L, METEOR) were used to assess the similarity between the generated summaries and the original summaries. The performance of the algorithms within each approach was averaged, and the overall best-performing algorithm was selected. This best algorithm was then analyzed further through topic modeling and keyword extraction to identify the key topics and keywords in the summarized text. The proposed model provides a standardized workflow for developing and thoroughly testing summarization algorithms across datasets and evaluation metrics in order to determine the most appropriate summarization approach. The study demonstrates the effectiveness of the model on a variety of algorithm types and text sources.
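The evaluation workflow described in the abstract (score each algorithm's summaries against the original summaries, average the scores, and select the best performer) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes whitespace tokenization, covers only ROUGE-N and ROUGE-L (BLEU and METEOR are left out for brevity), and every function name and the `approach_of` mapping are hypothetical.

```python
from collections import Counter
from statistics import mean


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Assumption: only two of the four metrics named in the abstract are sketched.
METRICS = {"rouge_1": rouge_n, "rouge_l": rouge_l}


def score_algorithm(summaries, references):
    """Average each metric over all documents, then average the metrics."""
    per_metric = [
        mean(metric(s, r) for s, r in zip(summaries, references))
        for metric in METRICS.values()
    ]
    return mean(per_metric)


def best_algorithm(outputs, references):
    """outputs maps an algorithm name to its list of generated summaries."""
    scores = {name: score_algorithm(s, references) for name, s in outputs.items()}
    return max(scores, key=scores.get), scores


def approach_averages(scores, approach_of):
    """Average algorithm scores within each approach (hypothetical mapping)."""
    grouped = {}
    for name, score in scores.items():
        grouped.setdefault(approach_of[name], []).append(score)
    return {approach: mean(values) for approach, values in grouped.items()}
```

Calling `best_algorithm(outputs, references)` returns the top-scoring algorithm name together with all scores, and `approach_averages` then groups those scores by approach. In practice, established metric implementations would likely replace these simplified functions.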


Details

Primary Language: English
Subjects: Natural Language Processing
Journal Section: Research Article
Early Pub Date: March 29, 2024
Publication Date: March 29, 2024
Submission Date: October 16, 2023
Acceptance Date: February 18, 2024
Published in Issue: Year 2024, Volume 15, Number 1

IEEE
[1] M. A. Dursun and S. Serttaş, “A Multi-Metric Model for analyzing and comparing extractive text summarization approaches and algorithms on scientific papers,” DUJE, vol. 15, no. 1, pp. 31–48, Mar. 2024, doi: 10.24012/dumf.1376978.
