Technological developments and the widespread use of the internet have caused the volume of data produced each day to grow exponentially. A significant part of this data deluge is text generated by applications such as social media, communication tools, and customer service. Processing text at this scale requires automation. Text processing has seen significant advances in recent years; with deep learning in particular, text classification performance has become quite satisfactory. In this study, we propose an innovative data distribution algorithm that mitigates the data imbalance problem in order to further improve text classification performance. Experimental results show that optimizing the data distribution with this algorithm improves classification accuracy by approximately 3.5% and the F1 score by more than 3 points.
Text classification, Data imbalance, Data distribution, Deep learning, Word embedding
Primary Language | English |
---|---|
Subjects | Engineering |
Section | TJST |
Authors | |
Publication Date | March 20, 2022 |
Submission Date | February 6, 2022 |
Published in Issue | Year 2022 Volume: 17 Issue: 1 |