Research Article

A Basic and Brief Scheme of an Application of a Machine Learning Process

Year 2017, Volume: 1, Issue: 1, pp. 16-22, 31.12.2017

Abstract

Machine learning methods are powerful tools for modeling systems and for extracting knowledge about a phenomenon from samples. This paper was written to make the machine learning process clearer. To that end, the rationale behind each stage of the process is explained briefly, and the Highleyman dataset is then employed to test machine learning methods.
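The abstract outlines a standard machine learning workflow: prepare the data, train a model, and validate it. As a minimal sketch of such a workflow, the Python snippet below builds a two-class Gaussian dataset (a stand-in for the Highleyman classes; the means and covariances are illustrative assumptions, not necessarily those used in the paper), standardizes the features, trains a k-nearest-neighbour classifier, and scores it with 10-fold cross-validation using scikit-learn.

# Illustrative sketch of a basic machine learning pipeline:
# synthetic two-class Gaussian data (stand-in for the Highleyman
# classes; parameters below are assumptions), feature scaling,
# a k-NN classifier, and 10-fold cross-validation.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200  # samples per class

# Two Gaussian classes (illustrative parameters only)
X1 = rng.multivariate_normal([1.0, 1.0], np.diag([1.0, 0.25]), n)
X2 = rng.multivariate_normal([2.0, 0.0], np.diag([0.01, 4.0]), n)
X = np.vstack([X1, X2])
y = np.hstack([np.zeros(n), np.ones(n)])

# Preprocessing and classifier chained in one pipeline, scored by 10-fold CV
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
scores = cross_val_score(model, X, y, cv=10)
print(f"10-fold CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")

Any other classifier could be swapped into the same pipeline in place of the k-NN model; the preprocessing and cross-validation steps stay unchanged.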


Details

Journal Section: Articles
Authors

Ömer Faruk Ertuğrul

Mehmet Emin Tağluk

Yılmaz Kaya

Publication Date: December 31, 2017
Published in Issue: Year 2017, Volume: 1, Issue: 1

Cite

APA: Ertuğrul, Ö. F., Tağluk, M. E., & Kaya, Y. (2017). A Basic and Brief Scheme of an Application of a Machine Learning Process. Journal of Engineering and Technology, 1(1), 16-22.