Research Article

NOVEL TERM WEIGHTING METHODS FOR TEXT CLASSIFICATION BASED ON ECONOMIC INEQUALITY METRICS

Volume: 27 Number: 1 March 27, 2026

Abstract

Term weighting plays a critical role in text classification. Traditional methods, with a few exceptions, make limited or inadequate use of the distributional characteristics of terms across classes. The core hypothesis of this study is that a term’s weight should be proportional to the unevenness of its distribution across classes. Accordingly, the proposed methods prioritize terms concentrated in one or a few classes over terms distributed almost evenly across all classes. To implement this idea, we introduce a family of novel term weighting methods based on economic inequality metrics. These metrics are typically used to measure the unfairness of income distribution in a population; we adapt them to characterize term distributions. To quantify distributional unevenness as a measure of term significance, we select one representative index from each of three major categories of inequality indices: Lorenz curve-based (Schutz), entropy-based (Theil, with two variants), and social welfare-based (Atkinson). Experiments were conducted on four benchmark datasets (20NG, R8, R52, and WebKB) with two classifiers (Multinomial Naïve Bayes and Support Vector Machines), evaluated using the micro-F1 and macro-F1 metrics. The results show that the proposed term weighting methods, particularly the one based on the Schutz index, consistently achieve superior or highly competitive performance compared with both traditional and state-of-the-art term weighting approaches. These findings support the validity of exploiting economic inequality principles to quantify the inter-class distributional characteristics of terms in term weighting. Thus, this work not only validates the effectiveness of the proposed methods but also demonstrates the value of interdisciplinary work in the term weighting literature.
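The idea of scoring a term by how unevenly it is spread across classes can be illustrated with a minimal sketch. It uses the textbook definitions of the three index families named above, applied to a vector of per-class counts for a single term. The function names, the choice of the Theil T variant, the Atkinson parameter epsilon = 0.5, and the example counts are all assumptions made for illustration; the exact formulations and normalizations used in the article may differ.

```python
import math


def schutz_index(counts):
    """Schutz (Hoover) index: half the total absolute deviation of the
    class shares from the uniform share 1/n. Ranges from 0 to 1 - 1/n."""
    total = sum(counts)
    n = len(counts)
    if total == 0:
        return 0.0  # term never occurs: treat as perfectly even
    return 0.5 * sum(abs(c / total - 1 / n) for c in counts)


def theil_t_index(counts):
    """Theil T index on class shares: sum of p * ln(n * p), with the
    convention 0 * ln(0) = 0. Ranges from 0 to ln(n)."""
    total = sum(counts)
    n = len(counts)
    if total == 0:
        return 0.0
    return sum((c / total) * math.log(n * c / total) for c in counts if c > 0)


def atkinson_index(counts, epsilon=0.5):
    """Atkinson index for 0 <= epsilon < 1 (assumed range, so that zero
    counts remain valid). Returns 1 - (equally distributed equivalent / mean)."""
    if not 0 <= epsilon < 1:
        raise ValueError("this sketch only supports 0 <= epsilon < 1")
    total = sum(counts)
    n = len(counts)
    if total == 0:
        return 0.0
    mean = total / n
    ede = (sum(c ** (1 - epsilon) for c in counts) / n) ** (1 / (1 - epsilon))
    return 1 - ede / mean


# Hypothetical per-class document counts for two terms over four classes.
concentrated_term = [40, 0, 0, 0]   # appears in one class only
even_term = [10, 10, 10, 10]        # spread evenly across classes

for name, counts in [("concentrated", concentrated_term), ("even", even_term)]:
    print(name,
          schutz_index(counts),
          theil_t_index(counts),
          atkinson_index(counts))
```

Under all three indices the concentrated term scores higher than the evenly spread one, which is the behaviour the core hypothesis calls for. How such a score is combined with term frequency to form a final weight is not specified in the abstract and is left out of this sketch.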

Keywords

Details

Primary Language

English

Subjects

Supervised Learning, Classification Algorithms, Natural Language Processing

Journal Section

Research Article

Publication Date

March 27, 2026

Submission Date

September 15, 2025

Acceptance Date

March 17, 2026

Published in Issue

Year 2026 Volume: 27 Number: 1

APA
Okkalıoğlu, M. (2026). NOVEL TERM WEIGHTING METHODS FOR TEXT CLASSIFICATION BASED ON ECONOMIC INEQUALITY METRICS. Eskişehir Technical University Journal of Science and Technology A - Applied Sciences and Engineering, 27(1), 125-148. https://doi.org/10.18038/estubtda.1784468
AMA
1. Okkalıoğlu M. NOVEL TERM WEIGHTING METHODS FOR TEXT CLASSIFICATION BASED ON ECONOMIC INEQUALITY METRICS. Estuscience - Se. 2026;27(1):125-148. doi:10.18038/estubtda.1784468
Chicago
Okkalıoğlu, Murat. 2026. “NOVEL TERM WEIGHTING METHODS FOR TEXT CLASSIFICATION BASED ON ECONOMIC INEQUALITY METRICS”. Eskişehir Technical University Journal of Science and Technology A - Applied Sciences and Engineering 27 (1): 125-48. https://doi.org/10.18038/estubtda.1784468.
EndNote
Okkalıoğlu M (March 1, 2026) NOVEL TERM WEIGHTING METHODS FOR TEXT CLASSIFICATION BASED ON ECONOMIC INEQUALITY METRICS. Eskişehir Technical University Journal of Science and Technology A - Applied Sciences and Engineering 27 1 125–148.
IEEE
[1] M. Okkalıoğlu, “NOVEL TERM WEIGHTING METHODS FOR TEXT CLASSIFICATION BASED ON ECONOMIC INEQUALITY METRICS”, Estuscience - Se, vol. 27, no. 1, pp. 125–148, Mar. 2026, doi: 10.18038/estubtda.1784468.
ISNAD
Okkalıoğlu, Murat. “NOVEL TERM WEIGHTING METHODS FOR TEXT CLASSIFICATION BASED ON ECONOMIC INEQUALITY METRICS”. Eskişehir Technical University Journal of Science and Technology A - Applied Sciences and Engineering 27/1 (March 1, 2026): 125-148. https://doi.org/10.18038/estubtda.1784468.
JAMA
1. Okkalıoğlu M. NOVEL TERM WEIGHTING METHODS FOR TEXT CLASSIFICATION BASED ON ECONOMIC INEQUALITY METRICS. Estuscience - Se. 2026;27:125–148.
MLA
Okkalıoğlu, Murat. “NOVEL TERM WEIGHTING METHODS FOR TEXT CLASSIFICATION BASED ON ECONOMIC INEQUALITY METRICS”. Eskişehir Technical University Journal of Science and Technology A - Applied Sciences and Engineering, vol. 27, no. 1, Mar. 2026, pp. 125-48, doi:10.18038/estubtda.1784468.
Vancouver
1. Murat Okkalıoğlu. NOVEL TERM WEIGHTING METHODS FOR TEXT CLASSIFICATION BASED ON ECONOMIC INEQUALITY METRICS. Estuscience - Se. 2026 Mar. 1;27(1):125-48. doi:10.18038/estubtda.1784468