Research Article

Efficient Text Classification with Deep Learning on Imbalanced Data Improved with Better Distribution

Volume: 17 Number: 1 March 20, 2022

Abstract

Technological developments and the widespread use of the internet are causing the volume of data produced each day to grow exponentially. A significant part of this deluge is text data from applications such as social media, communication tools, and customer service. Processing this much text requires automation. Text processing has seen significant successes recently; with deep learning in particular, text classification performance has become quite satisfactory. In this study, we propose an innovative data distribution algorithm that reduces the data imbalance problem to further improve text classification performance. Experimental results show that the algorithm, which optimizes the data distribution, improves classification accuracy by approximately 3.5% and the F1 score by over 3.

Details

Primary Language: English
Subjects: Engineering
Journal Section: Research Article
Publication Date: March 20, 2022
Submission Date: February 6, 2022
Acceptance Date: February 22, 2022
Published in Issue: Year 2022 Volume: 17 Number: 1

APA
Yıldız, B. (2022). Efficient Text Classification with Deep Learning on Imbalanced Data Improved with Better Distribution. Turkish Journal of Science and Technology, 17(1), 89-98. https://doi.org/10.55525/tjst.1068940