Research Article

MOODETECTR: MOOD DETECTION FOR TURKISH LYRICS THROUGH WORD VECTORS

Volume: 8 Number: 3 September 3, 2020

Abstract

With the increasing use of online music platforms, catalogue-based search has given way to mood-based search. In this study, we propose MooDetecTR, a semi-supervised learning framework that employs word vectors for Turkish song mood detection. In this framework, word vectors are first trained with the Word2Vec and GloVe algorithms on a large text collection of more than 2.5 million Turkish documents. Subsequently, a lyrics vector is generated for each song selected for mood detection by combining the pre-trained vectors of the words in its lyrics. Lastly, the lyrics vectors are fed as features into various machine-learning algorithms to build models for music mood detection. For comparison, Turkish music mood detection is also performed with a traditional bag-of-words model with TF-IDF weights and with the Doc2Vec algorithm. The effects of stemming and stop-word removal on the results are investigated as well. The best micro-F1 score obtained by the proposed framework (54.36%) is 3.81 and 2.92 percentage points higher (7.54% and 5.68% relative improvements) than the best scores obtained with the Doc2Vec and bag-of-words methods, respectively. These results show that incorporating word vectors trained on large text collections into Turkish text classification improves classification performance.
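As an illustration of the second step, the sketch below builds a lyrics vector from pre-trained word vectors. The abstract does not state how the word vectors are combined, so simple averaging is an assumption here, and the three-dimensional vectors, the `WORD_VECTORS` table, and the `lyrics_vector` helper are hypothetical names and values chosen only for the example.

```python
from typing import Dict, List

# Hypothetical 3-dimensional word vectors. Real Word2Vec or GloVe vectors
# would be trained on a large corpus and have many more dimensions.
WORD_VECTORS: Dict[str, List[float]] = {
    "gece": [1.0, 0.0, 2.0],   # "night"
    "deniz": [0.0, 2.0, 0.0],  # "sea"
    "yol": [2.0, 1.0, 1.0],    # "road"
}


def lyrics_vector(tokens: List[str], vectors: Dict[str, List[float]]) -> List[float]:
    """Average the vectors of the tokens found in the vocabulary.

    Averaging is an assumed combination rule, not necessarily the one
    used by the framework. Unknown tokens are skipped.
    """
    dim = len(next(iter(vectors.values())))
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        # No known word: fall back to a zero vector of the right size.
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# "bilinmeyen" is not in the vocabulary, so only two vectors are averaged.
features = lyrics_vector(["gece", "yol", "bilinmeyen"], WORD_VECTORS)
```

The resulting `features` list is what a downstream classifier would receive as the feature vector for one song.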

Details

Primary Language: English
Subjects: Engineering
Journal Section: Research Article
Publication Date: September 3, 2020
Submission Date: February 15, 2019
Acceptance Date: February 2, 2020
Published in Issue: Year 2020 Volume: 8 Number: 3

IEEE
[1] B. Çimen and A. O. Durahim, “MOODETECTR: MOOD DETECTION FOR TURKISH LYRICS THROUGH WORD VECTORS”, KONJES, vol. 8, no. 3, pp. 499–509, Sept. 2020, doi: 10.36306/konjes.788046.