Research Article

CodelessML: A Beginner's Web Application for Getting Started with Machine Learning

Year 2024, Volume: 12 Issue: 24, 582 - 599, 21.10.2024
https://doi.org/10.18009/jcer.1506864

Abstract

Building machine learning models typically requires substantial coding and the installation of specialized software, which is frequently a barrier for beginners. To lower this barrier, we present CodelessML, a reproducible web-based application for machine learning beginners that requires neither coding nor installation and is published as a Code Ocean capsule. It provides a common workflow that eases the process of building machine learning models and using them for predictions. CodelessML was developed with the Agile method using Python, Anaconda, and Streamlit. With CodelessML, users get an interactive walkthrough of building a machine learning model through a simplified process: exploratory data analysis (EDA), modelling, and prediction. The impact of the software was evaluated using feedback from 79 respondents: on a 5-point Likert scale, CodelessML received average ratings of 4.4 for accessibility, 4.3 for content, and 4.4 for functionality. CodelessML serves as an accessible entry point for learning machine learning, offering online, free, and reproducible features.
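
As a rough illustration of the three-stage process named in the abstract (EDA, modelling, prediction), the following minimal Python sketch walks through each stage using only the standard library. It is not CodelessML's implementation: the toy data, the function names (`explore`, `fit`, `predict`), and the nearest-centroid classifier are assumptions chosen purely to keep the example short and self-contained.

```python
from statistics import mean

# Hypothetical toy data: (feature_1, feature_2) rows with a class label.
# These values are illustrative only and do not come from the paper.
rows = [
    ((1.0, 1.2), "A"),
    ((0.8, 1.0), "A"),
    ((1.2, 0.9), "A"),
    ((3.0, 3.2), "B"),
    ((3.1, 2.9), "B"),
    ((2.8, 3.0), "B"),
]


def explore(data):
    """EDA step: per-class row counts and per-feature means."""
    summary = {}
    for label in sorted({lab for _, lab in data}):
        feats = [f for f, lab in data if lab == label]
        summary[label] = {
            "count": len(feats),
            "feature_means": tuple(mean(col) for col in zip(*feats)),
        }
    return summary


def fit(data):
    """Modelling step: a nearest-centroid classifier (one centroid per class)."""
    return {label: info["feature_means"] for label, info in explore(data).items()}


def predict(model, features):
    """Prediction step: return the class whose centroid is closest."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


summary = explore(rows)
model = fit(rows)
prediction = predict(model, (2.9, 3.1))
print(summary)
print(prediction)
```

A codeless tool of the kind described would place these same three stages behind a web interface, so that the user supplies a dataset and reads the results without writing any of the code above.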

Ethical Statement

Due to the scope and method of the study, ethics committee permission was not required.

Project Number

01

References

  • Adeniran, A. O. (2019). Application of Likert scale’s type and Cronbach’s alpha analysis in an airport perception study. Scholar Journal of Applied Sciences and Research, 2(4), 1-5.
  • Aeberhard, S., & Forina, M. (1991). Wine [Data set]. UCI Machine Learning Repository. https://doi.org/10.24432/C5PC7J.
  • Botchkarev, A. (2018). Performance metrics (Error Measures) in machine learning regression, forecasting and prognostics: Properties and typology. ArXiv.
  • Botchkarev, A. (2019). A new typology design of performance metrics to measure errors in machine learning regression algorithms. Interdisciplinary Journal of Information Knowledge and Management, 14, 045–076. https://doi.org/10.28945/4184.
  • Burscher, B., Odijk, D., Vliegenthart, R., De Rijke, M., & De Vreese, C. H. (2014). Teaching the computer to code frames in news: comparing two supervised machine learning approaches to frame analysis. Communication Methods and Measures, 8(3), 190–206.
  • Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C. & Wirth, R. (2000). CRISP-DM 1.0 - Step-by-step data mining guide. CRISP-DM Consortium.
  • Chen, T., & Guestrin, C. (2016, August). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785-794. https://doi.org/10.1145/2939672.2939785.
  • Dyck, J. (2018). Machine learning for engineering. In: Proceedings of the 23rd Asia and South Pacific Design Automation Conference. IEEE Press, pp. 422–427.
  • Fabian, P. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825-2830. https://doi.org/10.1145/3369834.
  • Ferguson, A. L. (2017). Machine learning and data science in soft materials engineering. Journal of Physics: Condensed Matter, 30(4).
  • Fradkov, A. L. (2020). Early history of machine learning. IFAC-PapersOnLine, 53(2), 1385–1390. https://doi.org/10.1016/j.ifacol.2020.12.1888.
  • Grandini, M., Bagli, E., & Visani, G. (2020). Metrics for multi-class classification: An overview. arXiv (Cornell University). https://doi.org/10.48550/arxiv.2008.05756.
  • Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (2019). Multivariate data analysis. Cengage Learning.
  • Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., & Liu, T.-Y. (2017). LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in Neural Information Processing Systems, 30.
  • Kononenko, I. (2001). Machine learning for medical diagnosis: History, state of the art and perspective. Artificial Intelligence in Medicine, 23(1), 89-109.
  • Liu, Y. (2020). Python machine learning by example - Third Edition.
  • Martinez-Plumed, F., Contreras-Ochando, L., Ferri, C., Hernandez-Orallo, J., Kull, M., Lachiche, N., Ramirez-Quintana, M. J., & Flach, P. (2021). CRISP-DM Twenty years later: From data mining processes to data science trajectories. IEEE Transactions on Knowledge and Data Engineering, 33(8), 3048–3061.
  • McKinney, W. (2010, June). Data structures for statistical computing in python. Proceedings of the 9th Python in Science Conference, 445(1), 51-56.
  • Mohamed, A. E. (2017). Comparative study of four supervised machine learning techniques for classification. International Journal of Applied, 7(2), 1-15.
  • Nabil, D., Mosad, A., & Hefny, H. A. (2011). Web-Based applications quality factors: A survey and a proposed conceptual model. Egyptian Informatics Journal, 12(3), 211-217. https://doi.org/10.1016/j.eij.2011.09.003.
  • Naqa, I. E., & Murphy, M. J. (2015). What is machine learning? In Springer eBooks (pp. 3–11). https://doi.org/10.1007/978-3-319-18305-3_1
  • Nohara, Y., Matsumoto, K., Soejima, H., & Nakashima, N. (2022). Explanation of machine learning models using shapley additive explanation and application for real data in hospital. Computer Methods and Programs in Biomedicine, 214, 106584.
  • Novakovic, J. Dj., Veljovic, A., S. Ilic, S., Papic, Z., & Tomovic, M. (2017). Evaluation of classification models in machine learning. Theory and Applications of Mathematics & Computer Science, 7(1), 39–46.
  • Opitz, J. (2024). A closer look at classification evaluation metrics and a critical reflection of common evaluation practice. arXiv (Cornell University).
  • Ozgur, C., Colliau, T., Rogers, G., & Hughes, Z. (2021). MatLab vs. Python vs. R. Journal of Data Science, 15(3), 355–372. https://doi.org/10.6339/jds.201707_15(3).0001.
  • Register, Y., & Ko, A. J. (2020, August). Learning machine learning with personal data helps stakeholders ground advocacy arguments in model mechanics. Proceedings of the 2020 ACM Conference on International Computing Education Research, 67-78.
  • Sallow, A. B., Asaad, R. R., Ahmad, H. B., Abdulrahman, S. M., Hani, A. A., & Zeebaree, S. R. (2024). Machine learning skills to K–12. Journal of Soft Computing and Data Mining, 5(1), 132-141.
  • Sarkar, D., Bali, R., & Sharma, T. (2017). The python machine learning ecosystem. In Apress eBooks (pp. 67–118). https://doi.org/10.1007/978-1-4842-3207-1_2.
  • Schlimmer, J. (1987). Automobile [Data set]. UCI Machine Learning Repository. https://doi.org/10.24432/C5B01C.
  • Tetzlaff, L. M., & Szepannek, G. (2022). mlr3shiny—State-of-the-art machine learning made easy. SoftwareX, 20, 101246. https://doi.org/10.1016/j.softx.2022.101246.
  • Tukey, J. W. (1962). The future of data analysis. The annals of mathematical statistics, 33(1), 1-67.
  • Wang, T., & Cheng, E. C. K. (2021). An investigation of barriers to Hong Kong K-12 schools incorporating Artificial Intelligence in education. Computers and Education Artificial Intelligence, 2, 100031. https://doi.org/10.1016/j.caeai.2021.100031.
  • Woodruff, K., Hutson, J., & Arnone, K. (2023). Perceptions and barriers to adopting artificial intelligence in k-12 education: A survey of educators in fifty states. In IntechOpen eBooks. https://doi.org/10.5772/intechopen.1002741.
  • Zhou, Z.-H. (2017). Machine learning challenges and impact: An interview with Thomas Dietterich. National Science Review, 5(1), 54–58.

There are 34 citations in total.

Details

Primary Language English
Subjects Development of Science, Technology and Engineering Education and Programs, Educational Technology and Computing
Journal Section Research Article
Authors

Hanif Noer Rofiq 0009-0004-0234-3055

Galuh Mafela Mutiara Sujak 0009-0004-6706-5308

Project Number 01
Early Pub Date September 17, 2024
Publication Date October 21, 2024
Submission Date June 28, 2024
Acceptance Date September 6, 2024
Published in Issue Year 2024

Cite

APA Rofiq, H. N., & Sujak, G. M. M. (2024). CodelessML: A Beginner’s Web Application for Getting Started with Machine Learning. Journal of Computer and Education Research, 12(24), 582-599. https://doi.org/10.18009/jcer.1506864



Creative Commons License


This work is licensed under a Creative Commons Attribution 4.0 International License.

