TY  - JOUR
T1  - Serbest Sırada Birliktelik İstatistiklerinin Kullanımıyla Türkçe'nin Biçimbirimsel Belirsizliği'nin Giderilmesi
TT  - Morphological Disambiguation of Turkish with Free-order Co-occurrence Statistics
AU  - Arslan, Enis
AU  - Orhan, Umut
AU  - Tahiroğlu, B. Tahir
PY  - 2018
DA  - November
Y2  - 2018
DO  - 10.17714/gumusfenbil.430034
JF  - Gümüşhane Üniversitesi Fen Bilimleri Dergisi
PB  - Gümüşhane Üniversitesi
WT  - DergiPark
SN  - 2146-538X
SP  - 46
EP  - 52
LA  - en
AB  - Bu makalede, Türkçe gibi biçimsel olarak karmaşık yapıda olan dillerde sıklıkla karşılaşılan biçimbirimsel belirsizlik problemi için bir çözüm önerilmiştir. Genellikle, bu tipte bir problemin çözümü için bir cümledeki muhtemel kelime sıralarından uygun olanın seçilmesi için bilgiyi maksimuma çıkaran istatistiksel yöntemler uygulanmaktadır. Olasılıkların hesaplanması ve uygun sıranın seçilmesi için tercih edilecek metot uygulanacak dilin doğasına bağlıdır. Cümlelerde geçen kelimelerin madde başlarının oluşturduğu bir anlamsal çizgeden elde edilen birliktelik istatistikleri kullanılarak alternatifler arasından uygun olan kelime sıra dizilimi seçilmektedir. Bu çizge ağının belirsizlik içermeyen serbest sıralı karakteri istatistiklerin bağımsız olarak hesaplanmasında oldukça faydalıdır. Olasılıksal değerler Naive Bayes (NB) yöntemi kullanılarak elde edilmekte ve her kelime sıraları arasından uygun olanının, Viterbi algoritmasından esinlenilerek, maksimumu seçilmektedir.
KW  - Birliktelik
KW  - Biçimbirimsel belirsizlik
KW  - Naive Bayes
KW  - Viterbi algoritması
N2  - In this article, a solution is proposed to the morphological ambiguity problem, which occurs frequently in morphologically complex languages like Turkish. Generally, statistical methods that maximize the information obtained for a probable word order sequence in a sentence are applied to such tasks. The choice of method for calculating the probabilities and for selecting the sequence depends on the nature of the language. By using the co-occurrence statistics obtained from a semantic graph network which represents the lemmas of the sentences, the best word order sequence is selected from the alternatives. The non-ambiguous and free-word-order character of this network is helpful in determining the statistics independently. The probability values are obtained by using the Naive Bayes (NB) method, and each word sequence is selected by maximization, inspired by the Viterbi algorithm.
CR  - Ballesteros, L. and Croft, W.B., 1998, August. Resolving ambiguity for cross-language retrieval. In Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval (pp. 64-71). ACM.
CR  - Beluga, S., Meštrović, A. and Martinčić-Ipšić, S., 2015. An overview of graph-based keyword extraction methods and approaches. Journal of information and organizational sciences, 39 (1), pp.1-20.
CR  - Borge-Holthoefer, J. and Arenas, A., 2010. Semantic networks: Structure and dynamics. Entropy, 12 (5), pp.1264-1302.
CR  - Duque, A., Stevenson, M., Martinez-Romo, J. and Araujo, L., 2018. Co-occurrence graphs for word sense disambiguation in the biomedical domain. Artificial intelligence in medicine.
CR  - Eryiğit, G., 2012. Biçimbilimsel Çözümleme. Türkiye Bilişim Vakfı Bilgisayar Bilimleri ve Mühendisliği Dergisi, 5 (2).
CR  - Fan, X., Wang, J., Pu, X., Zhou, L. and Lv, B., 2011. On graph-based name disambiguation. Journal of Data and Information Quality (JDIQ), 2 (2), p.10.
CR  - Hessami, E., Mahmoudi, J. and Jadidinejad, A.H., 2011. Unsupervised graph-based word sense disambiguation using the lexical relation of WordNet. Int. J. Comput. Sci. Issues (IJCSI).
CR  - Lahiri, S., Choudhury, S.R. and Caragea, C., 2014. Keyword and keyphrase extraction using centrality measures on collocation networks. arXiv preprint arXiv:1401.6571.
CR  - Litvak, M., Last, M., Aizenman, H., Gobits, I. and Kandel, A., 2011. DegExt—A language-independent graph-based keyphrase extractor. In Advances in Intelligent Web Mastering–3 (pp. 121-130). Springer, Berlin, Heidelberg.
CR  - Martinez-Romo, J., Araujo, L., Borge-Holthoefer, J., Arenas, A., Capitán, J.A. and Cuesta, J.A., 2011. Disentangling categorical relationships through a graph of co-occurrences. Physical Review E, 84 (4), p.046108.
CR  - Matsuo, Y., Ohsawa, Y. and Ishizuka, M., 2001, November. KeyWorld: Extracting keywords from a document as a small world. In International Conference on Discovery Science (pp. 271-281). Springer, Berlin, Heidelberg.
CR  - Mihalcea, R. and Tarau, P., 2004. Textrank: Bringing order into text. In Proceedings of the 2004 conference on empirical methods in natural language processing.
CR  - Minkov, E., Cohen, W.W. and Ng, A.Y., 2006, August. Contextual search and name disambiguation in email using graphs. In Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval (pp. 27-34). ACM.
CR  - Niwa, Y. and Nitta, Y., 1994, August. Co-occurrence vectors from corpora vs. distance vectors from dictionaries. In Proceedings of the 15th conference on Computational linguistics-Volume 1 (pp. 304-309). Association for Computational Linguistics.
CR  - Sak, H., Güngör, T. and Saraçlar, M., 2007, February. Morphological disambiguation of Turkish text with perceptron algorithm. In International Conference on Intelligent Text Processing and Computational Linguistics (pp. 107-118). Springer, Berlin, Heidelberg.
CR  - Sinha, R. and Mihalcea, R., 2007. Unsupervised graph-based word sense disambiguation using measures of word semantic similarity. In Semantic Computing, 2007. ICSC 2007. International Conference on (pp. 363-369). IEEE.
CR  - Tahiroğlu, B.T., 2014. Türkçe Çevrim İçi Haber Metinlerinde Yeni Sözlerin (Neolojizm) Otomatik Çıkarımı. In Derlem Dilbilim Uygulamaları, Özkan, B., Tahiroğlu, B.T. and Özkan, A.E. (Eds.), Karahan Kitabevi Yayınları, Adana, pp. 1-22.
UR  - https://doi.org/10.17714/gumusfenbil.430034
L1  - https://dergipark.org.tr/tr/download/article-file/585618
ER  -