Research Article

XGBoost-Driven Evaluation of Clustering Methods for MOBA Player Segmentation

Year 2025, Volume: 5 Issue: 1, 84 - 95, 31.12.2025
https://doi.org/10.57020/ject.1770052

Abstract

Multiplayer Online Battle Arena (MOBA) games are real-time digital games in which two teams compete strategically on a predefined map. Datasets from these games span diverse variables, including player performance, interactions, strategic behavior, and team dynamics. This study clusters MOBA players by behavioral patterns and assesses the validity of the resulting segments with a supervised classifier. During preprocessing, categorical variables were numerically encoded and all features were standardized with StandardScaler. K-means and hierarchical clustering were applied to the dataset, and the clusters were visualized in two dimensions using Principal Component Analysis (PCA). To evaluate cluster consistency, an XGBoost classifier was trained with stratified 5-fold cross-validation to predict cluster membership. The model achieved a mean accuracy of 0.91 ± 0.11 for the K-means clusters and 0.74 ± 0.07 for the hierarchical clusters. These results indicate a strong internal structure in the K-means clusters. The study also examined how player demographics (age, gender, weekly playtime) and psychological measures (IGD-20, BIS-11, DERS) relate to the behavioral groupings. The findings highlight the value of combining unsupervised and supervised learning to understand the complex profiles of MOBA players. Future work should improve generalizability by incorporating larger, more diverse datasets and validating models with robust cross-validation protocols.
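The pipeline summarized in the abstract (standardize, cluster, project with PCA, then test how well a classifier can recover the cluster labels) might look roughly like the following sketch. It is not the author's code: the synthetic data from make_blobs, the choice of three clusters, Ward linkage, the classifier hyperparameters, and the helper names make_classifier and cluster_predictability are all assumptions made for illustration. If the xgboost package is not installed, the sketch falls back to scikit-learn's GradientBoostingClassifier as a stand-in.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.preprocessing import StandardScaler

try:
    from xgboost import XGBClassifier

    def make_classifier():
        # Hyperparameters are illustrative assumptions, not the paper's settings.
        return XGBClassifier(n_estimators=50, max_depth=3, random_state=0)
except ImportError:
    from sklearn.ensemble import GradientBoostingClassifier

    def make_classifier():
        # Stand-in used only when xgboost is unavailable.
        return GradientBoostingClassifier(n_estimators=50, max_depth=3, random_state=0)


def cluster_predictability(features, labels, n_splits=5):
    """Mean and standard deviation of accuracy when predicting cluster labels."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    scores = cross_val_score(make_classifier(), features, labels, cv=cv, scoring="accuracy")
    return scores.mean(), scores.std()


# Synthetic stand-in for the already encoded player features (assumption).
X_raw, _ = make_blobs(n_samples=300, centers=3, n_features=6, random_state=0)
X = StandardScaler().fit_transform(X_raw)

# Two clusterings of the same standardized data; k=3 is an arbitrary choice here.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
hier_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)

# Two-dimensional PCA coordinates, e.g. for a scatter plot colored by cluster.
coords_2d = PCA(n_components=2).fit_transform(X)

results = {
    "kmeans": cluster_predictability(X, kmeans_labels),
    "hierarchical": cluster_predictability(X, hier_labels),
}
for name, (mean_acc, std_acc) in results.items():
    print(f"{name}: accuracy {mean_acc:.2f} ± {std_acc:.2f}")
```

A higher mean accuracy suggests that the cluster boundaries are easier to recover from the features, which is the sense in which the abstract compares the two clusterings. On well-separated synthetic blobs both scores will be high, so the numbers printed here say nothing about the study's dataset.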

Ethical Statement

This study adhered to the principles of scientific research and publication ethics. No animal subjects were involved. The study is a secondary analysis of a publicly available, anonymized dataset hosted on the OpenNeuro platform and shared in accordance with data security and ethical guidelines; therefore, no additional ethics committee approval was required.

Details

Primary Language: English
Subjects: Computer Gaming and Animation
Journal Section: Research Article
Authors: Sema Yildirim (ORCID: 0000-0003-0807-8550)

Submission Date: August 22, 2025
Acceptance Date: December 15, 2025
Early Pub Date: December 15, 2025
Publication Date: December 31, 2025
Published in Issue: Year 2025, Volume: 5, Issue: 1

Cite

APA Yildirim, S. (2025). XGBoost-Driven Evaluation of Clustering Methods for MOBA Player Segmentation. Journal of Emerging Computer Technologies, 5(1), 84-95. https://doi.org/10.57020/ject.1770052
Journal of Emerging Computer Technologies is indexed and abstracted by Harvard Hollis, Scilit, ROAD, Google Scholar, and OpenAIRE.

Publisher
Izmir Academy Association
