Evaluating the Performance of Large Language Models (LLMs) Through Grid-Based Game Competitions: An Extensible Benchmark and Leaderboard on the Path to Artificial General Intelligence (AGI)
Abstract
Keywords
The Journal of Cognitive Systems, Vol. 9, No. 2, 2024. Copyright © The Journal of Cognitive Systems (JCS). ISSN: 2548-0650. http://dergipark.gov.tr/jcs
Details
Primary Language
English
Subjects
Artificial Intelligence (Other)
Journal Section
Research Article
Authors
Oguzhan Topsakal *
ORCID: 0000-0002-9731-6946
United States
Edell Colby
United States
Harper Jackson
United States
Publication Date
February 1, 2025
Submission Date
January 1, 2025
Acceptance Date
January 10, 2025
Published in Issue
Year 2024 Volume: 9 Number: 2