Research Article
Year 2018, Volume: 3 Issue: 3, 189 - 204, 29.12.2018
https://doi.org/10.30931/jetas.475215

Multiple Regression Analysis System in Machine Learning and Estimating Effects of Data Transformation&Min-Max Normalization

Abstract

Machine learning is a recent topic in data analysis, and a researcher or practitioner in the area is called a “Data Scientist”, which has become a highly sought-after job title in computing. This study has two aims. The first is to implement a multiple regression analysis system, developed on the Ubuntu operating system on the Anaconda platform using Python 3, that constructs a model of each attribute so that its values can be estimated for future decisions, reducing risk by drawing on the past experience hidden in accumulated data. The second is to determine the effects of data transformation and min-max normalization during data preparation, before the models are built. After implementing the system, we test it to determine the best estimation model for each attribute of the vehicles sold in five European countries between 1970 and 1999. We construct six versions of the original dataset and use them to build regression models for further estimation. Finally, we compute the R-squared value of each constructed model and compare the models on that criterion. The computational results are promising: the system can be used efficiently, and the effects of data transformation and min-max normalization are significant for some attributes.
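The pipeline outlined in the abstract (min-max normalization, a multiple regression fit, and comparison by R-squared) can be illustrated with a minimal sketch. This is not the authors' system: the function names, the synthetic two-predictor data, and the use of NumPy's least-squares solver are all assumptions made for illustration only.

```python
import numpy as np

def min_max_normalize(X):
    """Rescale each column of X to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    # Guard against constant columns, which would otherwise divide by zero.
    col_range[col_range == 0] = 1.0
    return (X - col_min) / col_range

def fit_multiple_regression(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bp]."""
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef

def r_squared(X, y, coef):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])
    residual = y - A @ coef
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic data (hypothetical; NOT the European car market dataset).
rng = np.random.default_rng(0)
X_raw = rng.uniform([50, 800], [200, 2000], size=(100, 2))
y = 3.0 + 0.05 * X_raw[:, 0] + 0.002 * X_raw[:, 1] + rng.normal(0, 0.5, 100)

# Build two dataset versions and compare the fitted models by R-squared.
X_norm = min_max_normalize(X_raw)
r2_raw = r_squared(X_raw, y, fit_multiple_regression(X_raw, y))
r2_norm = r_squared(X_norm, y, fit_multiple_regression(X_norm, y))
print(f"R-squared, raw predictors:        {r2_raw:.4f}")
print(f"R-squared, normalized predictors: {r2_norm:.4f}")
```

In this particular sketch the two R-squared values agree up to rounding, because a linear rescaling of the predictors alone does not change an ordinary least-squares fit with an intercept. Differences between dataset versions would be expected only where a nonlinear transformation is applied or where the modelling procedure is sensitive to scale.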

References

  • [1] Zikopoulos, P.C., Eaton, C., deRoos, D., Deutsch, T., Lapis, G., Understanding Big Data, McGraw-Hill, New York, 2012.
  • [2] Witten, Ian H., et al., Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 2016.
  • [3] Friedman, J., Hastie T., and Tibshirani R., The Elements of Statistical Learning, Vol. 1. Springer series in statistics, New York, 2001.
  • [4] Weidner CI, Lin Q, Koch CM, Eisele L, Beier F, Ziegler P, Bauerschlag DO, Jöckel KH, Erbel R, Mühleisen TW, Zenke M, Brümmendorf TH, Wagner W., “Aging of Blood Can Be Tracked by DNA Methylation Changes at Just Three CpG Sites”, Genome Biol 15.2 (2014):1–11.
  • [5] James G., Witten D., Hastie T., Tibshirani R., An Introduction to Statistical Learning, Springer, New York, ISBN 978-1-4614-7137-0, 2015.
  • [6] Putin E, Mamoshina P, Aliper A, Korzinkin M, Moskalev A., “Deep Biomarkers of Human Aging: Application of Deep Neural Networks to Biomarker Development”, Aging 8.5 (2016):1–13.
  • [7] Hox J. J., Moerbeek M., and Van de Schoot R., Multilevel Analysis: Techniques and Applications, Routledge, 2017.
  • [8] Hu, Rui, et al., "A Short-term Power Load Forecasting Model based on the Generalized Regression Neural Network with Decreasing Step Fruit Fly Optimization Algorithm", Neurocomputing, 221 (2017): 24-31.
  • [9] De Witte K. and López-Torres L., "Efficiency in Education: a Review of Literature and a Way Forward", Journal of the Operational Research Society, 68.4 (2017): 339-363.
  • [10] Gunasekaran M. and Lopez D., "Health Data Analytics Using Scalable Logistic Regression with Stochastic Gradient Descent", International Journal of Advanced Intelligence Paradigms, 10.1-2 (2018): 118-132.
  • [11] Hang M., et al., "Economic Development Matters: A Meta‐Regression Analysis on the Relation between Environmental Management and Financial Performance", Journal of Industrial Ecology, 22.4 (2018): 720-744.
  • [12] https://sites.google.com/site/frankverbo/data-and-software/data-set-on-the-european-car-market.
  • [13] Aggarwal, C. C., An introduction to outlier analysis. In Outlier analysis, New York NY: Springer, (2013): 1-40.
  • [14] Ilango, V., Subramanian, R., & Vasudevan, V., “A five step procedure for outlier analysis in data mining”, European Journal of Scientific Research, 75(3) (2012): 327-339.
There are 14 citations in total.

Details

Primary Language English
Subjects Engineering
Journal Section Research Article
Authors

Ayla Saylı

Ceyda Akbulut

Kemal Kosuta

Publication Date December 29, 2018
Published in Issue Year 2018 Volume: 3 Issue: 3

Cite

APA Saylı, A., Akbulut, C., & Kosuta, K. (2018). Multiple Regression Analysis System in Machine Learning and Estimating Effects of Data Transformation&Min-Max Normalization. Journal of Engineering Technology and Applied Sciences, 3(3), 189-204. https://doi.org/10.30931/jetas.475215
AMA Saylı A, Akbulut C, Kosuta K. Multiple Regression Analysis System in Machine Learning and Estimating Effects of Data Transformation&Min-Max Normalization. JETAS. December 2018;3(3):189-204. doi:10.30931/jetas.475215
Chicago Saylı, Ayla, Ceyda Akbulut, and Kemal Kosuta. “Multiple Regression Analysis System in Machine Learning and Estimating Effects of Data Transformation&Min-Max Normalization”. Journal of Engineering Technology and Applied Sciences 3, no. 3 (December 2018): 189-204. https://doi.org/10.30931/jetas.475215.
EndNote Saylı A, Akbulut C, Kosuta K (December 1, 2018) Multiple Regression Analysis System in Machine Learning and Estimating Effects of Data Transformation&Min-Max Normalization. Journal of Engineering Technology and Applied Sciences 3 3 189–204.
IEEE A. Saylı, C. Akbulut, and K. Kosuta, “Multiple Regression Analysis System in Machine Learning and Estimating Effects of Data Transformation&Min-Max Normalization”, JETAS, vol. 3, no. 3, pp. 189–204, 2018, doi: 10.30931/jetas.475215.
ISNAD Saylı, Ayla et al. “Multiple Regression Analysis System in Machine Learning and Estimating Effects of Data Transformation&Min-Max Normalization”. Journal of Engineering Technology and Applied Sciences 3/3 (December 2018), 189-204. https://doi.org/10.30931/jetas.475215.
JAMA Saylı A, Akbulut C, Kosuta K. Multiple Regression Analysis System in Machine Learning and Estimating Effects of Data Transformation&Min-Max Normalization. JETAS. 2018;3:189–204.
MLA Saylı, Ayla et al. “Multiple Regression Analysis System in Machine Learning and Estimating Effects of Data Transformation&Min-Max Normalization”. Journal of Engineering Technology and Applied Sciences, vol. 3, no. 3, 2018, pp. 189-204, doi:10.30931/jetas.475215.
Vancouver Saylı A, Akbulut C, Kosuta K. Multiple Regression Analysis System in Machine Learning and Estimating Effects of Data Transformation&Min-Max Normalization. JETAS. 2018;3(3):189-204.