A WEB-BASED ASSIGNMENT EVALUATION SYSTEM USING A HIERARCHICAL CLUSTERING MODEL

Erdinç Uzun; Cihat Erdoğan; Ahmet Saygılı

Research Article

Year 2016, Volume: 6, Issue: 3, 87 - 98, 20.12.2016

Abstract

Assignments play an important role in students' learning. In a conventional
assignment evaluation process, an assignment is judged only on whether it is
correct. However, for assignments to contribute more to learning, plagiarism
committed by students must also be taken into account. Detecting plagiarism
and its extent is an extremely difficult part of assignment evaluation. This
study introduces a web-based application that eases this procedure by
integrating document similarity measures with a hierarchical clustering model.
The application makes it possible to assess which students submitted similar
assignments and how similar those assignments are. For document similarity,
the Cosine, Jaccard, and Dice measures were tried. For hierarchical
clustering, three algorithms were examined: Single Linkage, Complete Linkage,
and Average Group. A test data set was built from 54 assignments belonging to
18 courses taught by 6 instructors over two academic terms in previous years.
For each assignment, crossing the document similarity measures with the
clustering algorithms yielded 9 different results, and cophenetic correlation
coefficients were computed to test how well the hierarchical clustering
algorithms performed. Analysis of the results showed that the Jaccard measure
for document similarity, combined with the Average Group algorithm for
hierarchical clustering, is the most suitable pair for assignment evaluation.
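
The three document similarity measures named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the application tokenizes or weights documents, so the whitespace tokenizer, the use of raw term counts for Cosine, and the use of term sets for Jaccard and Dice are all assumptions made here for illustration; the function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive, assumed tokenizer)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity over raw term-frequency vectors (assumed weighting)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a, b):
    """Jaccard similarity: shared terms divided by all distinct terms."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def dice(a, b):
    """Dice similarity: twice the shared terms divided by the sum of set sizes."""
    sa, sb = set(a), set(b)
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0


# Two toy "assignments" that share three of their four distinct terms.
doc1 = tokenize("a b c d")
doc2 = tokenize("a b c e")
print(jaccard(doc1, doc2))  # 0.6
print(dice(doc1, doc2))     # 0.75
print(cosine(doc1, doc2))   # 0.75
```

On the same pair of documents Dice is never lower than Jaccard, since Dice counts the shared terms twice; a system comparing the measures would therefore see different similarity scales even when the ranking of pairs is the same.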

Keywords

Hierarchical clustering, Document similarity, Local plagiarism detection, Software development

A WEB-BASED ASSIGNMENT EVALUATION APPLICATION USING HIERARCHICAL CLUSTERING MODEL

Abstract

Assignments are one of the most important parts of students' education. In the
classical assignment evaluation process, an assignment is evaluated only on
whether it is correct. However, for assignments to contribute more to
education, plagiarism committed by students should also be considered.
Detecting plagiarism and its extent is an extremely difficult assignment
evaluation procedure. To facilitate this procedure, this study introduces a
web-based application that combines document similarity measures with a
hierarchical clustering model. The application makes it possible to evaluate
which students submitted similar assignments and how similar those assignments
are. The Cosine, Dice, and Jaccard similarity measures were investigated for
the document similarity calculation. On the clustering side, three algorithms
were examined: Single Linkage, Complete Linkage, and Average Group. Test data
covering two education periods from previous years and containing 54
assignments from 18 courses taught by 6 lecturers were created. Combining the
document similarity measures with the hierarchical clustering algorithms
yielded 9 different results for each assignment, and cophenetic correlation
coefficients were calculated to test how well the hierarchical clustering
algorithms performed. Analysis of the results showed that the Jaccard measure
for document similarity and the Average Group algorithm for hierarchical
clustering form the best assignment evaluation pair.
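
The clustering side of the evaluation can be sketched as follows. This is not the application's code: it is a naive agglomerative procedure written for illustration, and it assumes that a similarity is turned into a distance as 1 minus the similarity, which the abstract does not state. The function names and the toy distance matrix are hypothetical. The cophenetic correlation is computed as the Pearson correlation between the original pairwise distances and the heights at which each pair is first merged.

```python
import math
from itertools import combinations


def cophenetic_matrix(dist, method="average"):
    """Naive agglomerative clustering on a symmetric distance matrix.

    Returns a matrix holding, for every pair of items, the linkage
    distance at which the two items first end up in the same cluster.
    """
    linkage = {
        "single": min,                              # closest members
        "complete": max,                            # farthest members
        "average": lambda ds: sum(ds) / len(ds),    # mean over all member pairs
    }[method]
    n = len(dist)
    clusters = [[i] for i in range(n)]
    coph = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        best = None
        for x, y in combinations(range(len(clusters)), 2):
            d = linkage([dist[i][j] for i in clusters[x] for j in clusters[y]])
            if best is None or d < best[0]:
                best = (d, x, y)
        d, x, y = best
        # Record the merge height for every pair joined by this merge.
        for i in clusters[x]:
            for j in clusters[y]:
                coph[i][j] = coph[j][i] = d
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected
    return coph


def cophenetic_correlation(dist, coph):
    """Pearson correlation between original and cophenetic distances."""
    pairs = list(combinations(range(len(dist)), 2))
    xs = [dist[i][j] for i, j in pairs]
    ys = [coph[i][j] for i, j in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy) if sx and sy else 0.0


# Toy distances (assumed to be 1 - similarity) for four submissions:
# items 0 and 1 are near-copies, as are items 2 and 3.
dist = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]
for method in ("single", "complete", "average"):
    coph = cophenetic_matrix(dist, method)
    print(method, cophenetic_correlation(dist, coph))
```

A coefficient closer to 1 means the dendrogram preserves the original pairwise distances more faithfully, which is the sense in which the study compares the three linkage algorithms.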

Keywords

Hierarchical Clustering, Document similarity, Plagiarism Detection, Software Development

Details

Primary Language	Turkish
Subjects	Engineering
Section	Articles
Authors	Erdinç Uzun, Cihat Erdoğan, Ahmet Saygılı
Publication Date	December 20, 2016
Submission Date	June 1, 2016
Published in Issue	Year 2016, Volume: 6, Issue: 3

Cite

APA	Uzun, E., Erdoğan, C., & Saygılı, A. (2016). HİYERARŞİK KÜMELEME MODELİ KULLANAN WEB TABANLI BİR ÖDEV DEĞERLENDİRME SİSTEMİ. Ejovoc (Electronic Journal of Vocational Colleges), 6(3), 87-98.
