TY  - JOUR
TT  - Türkçe Dokümanlar İçin N-gram Tabanlı Yeni Bir Sınıflandırma (Ng-ind): Yazar, Tür ve Cinsiyet
AU  - Doğan, Sibel
AU  - Diri, Banu
PY  - 2016
DA  - June
JF  - Türkiye Bilişim Vakfı Bilgisayar Bilimleri ve Mühendisliği Dergisi
JO  - TBV-BBMD
PB  - Akademik Bilişim Vakfı
WT  - DergiPark
SN  - 1305-8991
SP  - 11
EP  - 19
VL  - 3
IS  - 1
N2  - This study aims to determine the genre, the author, and the author's gender of a Turkish document using a Turkish n-gram model. In the n-gram model, 2-, 3-, and 4-grams were used, and a total of 6 feature vectors were produced on 3 different data sets. The Ng-ind method that we developed was tested alongside other classifiers such as Naive Bayes (NB), Support Vector Machine (SVM), Random Forest (RF), and K-Nearest Neighbor (K-NN), and their success rates were compared with each other. The Ng-ind method gave better results than the other classifiers in gender and genre determination, and it also performed better than the compound classifiers in genre determination.
CR  - 1. Doğan, S., 2006, “Türkçe Dokümanlar için N-gram Tabanlı Sınıflandırma: Yazar, Tür ve Cinsiyet”, Yıldız Teknik Üniv., Master's Thesis
CR  - 2. Cavnar, W. B. and Trenkle, J. M., 1994, “N-gram-based text categorization”, Proceedings of SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval
CR  - 3. Peng F., Keselj V., Cercone N., Thomas C., 2003, “N-Gram-Based Author Profiles For Authorship Attribution”, Faculty of Computing Science, Dalhousie University, Canada
CR  - 4. Stamatatos E., Fakotakis N., Kokkinakis G., 2000, “Automatic Text Categorization in Terms of Genre and Author”, Computational Linguistics, pp.471-495
CR  - 5. Peng F., Schuurmans D., 2003, “Combining Naive Bayes and N-gram Language Models for Text Classification”, School of Computer Science, University of Waterloo
CR  - 6. Amasyalı M.F., Diri B., 2006, “Automatic Written Turkish Text Categorization in Terms of Author, Genre and Gender”, 11th International Conference on Applications of Natural Language to Information Systems, Austria
CR  - 7. Peng F., Wang S., Schuurmans D., 2003, “Language and Task Independent Text Categorization with Simple Language Models”, School of Computer Science, University of Waterloo
CR  - 8. Nowson S., Oberlander J., 2006, “Openness and gender in personal weblogs”, School of Informatics, University of Edinburgh, 2 Buccleuch Place, Edinburgh, EH8 9LW
CR  - 9. Dupont P., 2006, “Noisy Sequence Classification with Smoothed Markov Chains”, Department of Computing Science and Engineering (INGI), Université catholique de Louvain Place Sainte Barbe, 2 B-1348 Louvain-la-Neuve – Belgium
CR  - 10. John G. H., Langley P., 1995, “Estimating Continuous Distributions in Bayesian Classifiers”, Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, pp. 338-345. Morgan Kaufmann, San Mateo
CR  - 11. Breiman L., 1999, “Random forests – random features”, Technical Report 567, Department of Statistics, University of California, Berkeley
CR  - 12. Peng F., Schuurmans D., 2003, “Combining Naive Bayes and N-gram Language Models for Text Classification”, School of Computer Science, University of Waterloo
CR  - 13. Doyle J., Keselj V., 2005, “Automatic Categorization of Author Gender via N-Gram Analysis”, In The 6th Symposium on Natural Language Processing, SNLP'2005, Chiang Rai, Thailand, December
CR  - 14. http://sourceforge.net/projects/weka/
UR  - https://dergipark.org.tr/tr/pub/tbbmd/issue//238775
L1  - https://dergipark.org.tr/tr/download/article-file/207187
ER  -