TY - JOUR
TT - Türkçe Dokümanlar İçin N-gram Tabanlı Yeni Bir Sınıflandırma (Ng-ind): Yazar, Tür ve Cinsiyet
AU - Doğan, Sibel
AU - Diri, Banu
PY - 2016
DA - June
JF - Türkiye Bilişim Vakfı Bilgisayar Bilimleri ve Mühendisliği Dergisi
JO - TBV-BBMD
PB - Akademik Bilişim Vakfı
WT - DergiPark
SN - 1305-8991
SP - 11
EP - 19
VL - 3
IS - 1
N2 - This study attempts to determine the genre, the author, and the author's gender of a Turkish document using a Turkish n-gram model. The n-gram model used 2-, 3-, and 4-grams, and a total of 6 feature vectors were produced on 3 different data sets. The Ng-ind method that we developed was tested alongside other classifiers, namely Naive Bayes (NB), Support Vector Machine (SVM), Random Forest (RF), and K-Nearest Neighbor (K-NN), and their success rates were compared. The Ng-ind method gave better results than the other classifiers in gender and genre determination, and it also performed better than the combined classifiers in genre determination.
CR - 1. Doğan, S., 2006, “Türkçe Dokümanlar için N-gram Tabanlı Sınıflandırma: Yazar, Tür ve Cinsiyet”, Yıldız Teknik Üniv., Master's thesis
CR - 2. Cavnar, W. B. and Trenkle, J. M., 1994, “N-gram-based text categorization”, Proceedings of SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval. Information Systems Project Management, Jolyon E. Hallows, AMACOM Pres
CR - 3. Peng F., Keselj V., Cercone N., Thomas C., 2003, “N-Gram-Based Author Profiles For Authorship Attribution”, Faculty of Computing Science, Dalhousie University, Canada
CR - 4. Stamatatos E., Fakotakis N., Kokkinakis G., 2000, “Automatic Text Categorization in Terms of Genre and Author”, Computational Linguistics, pp. 471-495
CR - 5. Peng F., Schuurmans D., 2003, “Combining Naive Bayes and N-gram Language Models for Text Classification”, School of Computer Science, University of Waterloo
CR - 6. Amasyalı M.F., Diri B., 2006, “Automatic Written Turkish Text Categorization in Terms of Author, Genre and Gender”, 11th International Conference on Applications of Natural Language to Information Systems, Austria
CR - 7. Peng F., Wang S., Schuurmans D., 2003, “Language and Task Independent Text Categorization with Simple Language Models”, School of Computer Science, University of Waterloo
CR - 8. Nowson S., Oberlander J., 2006, “Openness and gender in personal weblogs”, School of Informatics, University of Edinburgh, 2 Buccleuch Place, Edinburgh, EH8 9LW
CR - 9. Dupont P., 2006, “Noisy Sequence Classification with Smoothed Markov Chains”, Department of Computing Science and Engineering (INGI), Université catholique de Louvain, Place Sainte Barbe 2, B-1348 Louvain-la-Neuve, Belgium
CR - 10. George H., 1995, “Estimating Continuous Distributions in Bayesian Classifiers”, Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, pp. 338-345. Morgan Kaufmann, San Mateo
CR - 11. Breiman L., 1999, “Random forests – random features”, Technical Report 567, Department of Statistics, University of California, Berkeley
CR - 12. Peng F., Schuurmans D., 2003, “Combining Naive Bayes and N-gram Language Models for Text Classification”, School of Computer Science, University of Waterloo
CR - 13. Doyle J., Keselj V., 2005, “Automatic Categorization of Author Gender via N-Gram Analysis”, In The 6th Symposium on Natural Language Processing, SNLP'2005, Chiang Rai, Thailand, December
CR - 14. http://sourceforge.net/projects/weka/
UR - https://dergipark.org.tr/tr/pub/tbbmd/issue//238775
L1 - https://dergipark.org.tr/tr/download/article-file/207187
ER -