Research Article

Mining Housing Features to Classify Housing Unit Price (Turkish title: Konut Özellikleri Madenciliğiyle Konut Birim Fiyatlarını Sınıflandırması)

Year 2022, Volume: 26 Issue: 3, 420 - 426, 20.12.2022
https://doi.org/10.19113/sdufenbed.1073504

Abstract (translated from the Turkish)

Through data mining, classification creates a common field among statistics, computer science, mathematics and many other disciplines. Many parametric and non-parametric statistical applications are frequently used to classify the relationship between dependent and independent variables. In this study, two widely used classification techniques are applied. The aim of the study is to compare the classification success of ordinal logistic regression and classification trees. For this purpose, the potential factors affecting housing unit prices in Eskişehir, treated as a three-class dependent variable, were examined. The real data set was split into three groups: training, validation and test. The classification success of these techniques was demonstrated with 5-fold cross-validation. According to the results, the more successful classification was obtained with the classification trees algorithm.

Mining Housing Features to Classify Housing Unit Price

Abstract

In data mining, classification is an interdisciplinary field that draws on statistics, computer science, mathematics and many other disciplines. Many parametric and non-parametric statistical methods are used to estimate, from training data, a function that maps predictors to a response. In this study, two of the most widely used techniques are applied to a real data set. The goal is to compare the classification success of ordinal logistic regression and classification trees in predicting a categorical response. For this purpose, the potential factors affecting the unit sale price of housing in Eskişehir, treated as a dependent variable with three classes, were examined. The data set was split into training, validation and test sets. The classification performance of the two techniques was assessed with 5-fold cross-validation. According to the results, the classification trees algorithm classified more successfully.
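
The evaluation protocol named in the abstract, 5-fold cross-validation used to compare two classifiers on a three-class response, can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the data are synthetic, and the two classifiers (a majority-class baseline and a two-threshold rule on one feature) are simple stand-ins, not the ordinal logistic regression or classification tree models of the study. All function and variable names (`k_fold_indices`, `cv_accuracy`, `majority_fit`, `threshold_fit`) are hypothetical.

```python
import random
from collections import Counter

def k_fold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and deal the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def majority_fit(X, y):
    """Stand-in baseline: always predict the most frequent training class."""
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label

def threshold_fit(X, y):
    """Stand-in rule on feature 0 (NOT a full classification tree):
    cut points are midpoints between consecutive per-class means."""
    means = {}
    for c in set(y):
        vals = [x[0] for x, t in zip(X, y) if t == c]
        means[c] = sum(vals) / len(vals)
    classes = sorted(means, key=means.get)  # classes ordered by mean
    cuts = [(means[a] + means[b]) / 2 for a, b in zip(classes, classes[1:])]

    def predict(x):
        for cut, c in zip(cuts, classes):
            if x[0] < cut:
                return c
        return classes[-1]
    return predict

def cv_accuracy(fit, X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds; each fold is held out once."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(model(X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / k

# Synthetic three-class data (assumption): classes 0, 1, 2 play the role
# of low, medium and high unit price, separated along a single feature.
rng = random.Random(42)
X, y = [], []
for label, centre in enumerate([1.0, 2.0, 3.0]):
    for _ in range(50):
        X.append([rng.gauss(centre, 0.3)])
        y.append(label)

acc_baseline = cv_accuracy(majority_fit, X, y)
acc_rule = cv_accuracy(threshold_fit, X, y)
print(f"baseline accuracy:       {acc_baseline:.3f}")
print(f"threshold-rule accuracy: {acc_rule:.3f}")
```

Passing the same `seed` to both calls means both classifiers are scored on identical folds, so the difference in mean accuracy reflects the models rather than the partition.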

Details

Primary Language: English
Subjects: Engineering
Journal Section: Articles
Authors:

Betül Kan Kılınç (ORCID 0000-0002-3746-2327)

Simay Mirgen (ORCID 0000-0002-8858-6610)

Publication Date: December 20, 2022
Published in Issue: Year 2022, Volume 26, Issue 3

Cite

APA Kan Kılınç, B., & Mirgen, S. (2022). Mining Housing Features to Classify Housing Unit Price. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 26(3), 420-426. https://doi.org/10.19113/sdufenbed.1073504
AMA Kan Kılınç B, Mirgen S. Mining Housing Features to Classify Housing Unit Price. SDÜ Fen Bil Enst Der. December 2022;26(3):420-426. doi:10.19113/sdufenbed.1073504
Chicago Kan Kılınç, Betül, and Simay Mirgen. “Mining Housing Features to Classify Housing Unit Price”. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi 26, no. 3 (December 2022): 420-26. https://doi.org/10.19113/sdufenbed.1073504.
EndNote Kan Kılınç B, Mirgen S (December 1, 2022) Mining Housing Features to Classify Housing Unit Price. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi 26 3 420–426.
IEEE B. Kan Kılınç and S. Mirgen, “Mining Housing Features to Classify Housing Unit Price”, SDÜ Fen Bil Enst Der, vol. 26, no. 3, pp. 420–426, 2022, doi: 10.19113/sdufenbed.1073504.
ISNAD Kan Kılınç, Betül - Mirgen, Simay. “Mining Housing Features to Classify Housing Unit Price”. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi 26/3 (December 2022), 420-426. https://doi.org/10.19113/sdufenbed.1073504.
JAMA Kan Kılınç B, Mirgen S. Mining Housing Features to Classify Housing Unit Price. SDÜ Fen Bil Enst Der. 2022;26:420–426.
MLA Kan Kılınç, Betül and Simay Mirgen. “Mining Housing Features to Classify Housing Unit Price”. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi, vol. 26, no. 3, 2022, pp. 420-6, doi:10.19113/sdufenbed.1073504.
Vancouver Kan Kılınç B, Mirgen S. Mining Housing Features to Classify Housing Unit Price. SDÜ Fen Bil Enst Der. 2022;26(3):420-6.

e-ISSN: 1308-6529