Türkçe'de Kalıp Anlatımlar: Türkçe Ulusal Derlemi'nden Görünümler

Selma Ayşe Özel; Yasin Bektaş; Hakan Yılmazer

Research Article

Formulaicity in Turkish: Evidence from the Turkish National Corpus

Year 2016, Volume: 13 Issue: 2, 1 - 33, 15.07.2016

Abstract

Formulaic sequences are the most frequently occurring forms in a language. Identifying them is useful in a wide range of areas, including linguistics, second language learning, and natural language processing. The most common way to identify formulaic sequences is to use a corpus, which may be built from written texts or tape-recorded conversations in the language, and to count the frequencies of sequences in it. The most frequent sequences are then examined to find formulas. Numerous studies have identified formulas in languages such as English. Only a few studies address formulaicity in Turkish, and most of them focus on formulas in the form of multi-word units. Turkish, however, is an agglutinating language with a rich and complex morphology, so formulaic sequences in affixation should also be investigated. Very few studies of formulaicity in Turkish affixation exist in the literature. In this study, we try to discover formulaic sequences in Turkish affixation by counting frequent suffix n-grams in written and spoken Turkish, using the Turkish National Corpus, a balanced, large-scale, general-purpose corpus of contemporary Turkish. We list the most frequent suffix combinations not only for verbs but for all lexical categories, such as noun, adjective, verb, and adverb, in both the written and spoken parts of the Turkish National Corpus, and we discuss similarities and differences between affixation in written and spoken Turkish. We observe that shorter suffix sequences are preferred in spoken Turkish than in written Turkish, and that as the length of the suffix n-grams increases, written and spoken Turkish use different formulaic sequences.
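
As a rough illustration of what counting suffix n-grams can look like, the following Python sketch assumes that each word has already been morphologically segmented into a stem and a list of suffix tags. The tag names, the toy word lists, and the helper functions `suffix_ngrams`, `count_suffix_ngrams`, and `mean_suffix_length` are hypothetical; they are not taken from the study or from the Turkish National Corpus, and the study's actual procedure may differ.

```python
from collections import Counter


def suffix_ngrams(suffixes, n):
    """Return all contiguous n-grams from one word's suffix sequence."""
    return [tuple(suffixes[i:i + n]) for i in range(len(suffixes) - n + 1)]


def count_suffix_ngrams(words, n):
    """Count suffix n-grams over an iterable of (stem, [suffix, ...]) pairs."""
    counts = Counter()
    for _stem, suffixes in words:
        counts.update(suffix_ngrams(suffixes, n))
    return counts


def mean_suffix_length(words):
    """Average number of suffixes per word (0.0 for an empty list)."""
    words = list(words)
    if not words:
        return 0.0
    return sum(len(suffixes) for _stem, suffixes in words) / len(words)


# Hypothetical, hand-segmented toy data with illustrative tag names.
written = [
    ("ev", ["PL", "POSS1PL", "ABL"]),      # evlerimizden
    ("kitap", ["PL", "POSS1PL", "ABL"]),   # kitaplarımızdan
    ("gel", ["FUT", "3PL"]),               # gelecekler
]
spoken = [
    ("ev", ["DAT"]),                       # eve
    ("gel", ["PAST", "1SG"]),              # geldim
    ("kitap", ["PL", "ACC"]),              # kitapları
]

written_bigrams = count_suffix_ngrams(written, 2)
spoken_bigrams = count_suffix_ngrams(spoken, 2)

print(written_bigrams.most_common(3))
print(spoken_bigrams.most_common(3))
print(mean_suffix_length(written), mean_suffix_length(spoken))
```

Working with suffix tags rather than surface forms keeps allomorphs such as -ler and -lar in the same count; whether that normalization is appropriate depends on the annotation scheme of the corpus being used.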

Keywords

Frequent suffix n-grams, written Turkish, spoken Turkish, Turkish National Corpus


Türkçe'de Kalıp Anlatımlar: Türkçe Ulusal Derlemi'nden Görünümler

Abstract

Formulaic expressions, or fixed sequences of expressions (formulas), consist of the forms observed most frequently in a language. Identifying the formulas in a language is useful for many fields, such as linguistics, foreign language learning, and natural language processing. The most common way to identify fixed expression sequences in a language is to use a corpus and count the sequences in it. Few studies address formulas in Turkish. In this study, we use the Turkish National Corpus to try to reveal the frequency distribution of the most frequent morpheme n-grams in written and spoken Turkish, and thereby the formulaic sequences found in Turkish affixation. The most frequent suffix combinations are listed for all lexical categories. We observe that shorter suffix sequences are preferred in spoken Turkish than in written Turkish, and that as the length of the morpheme n-grams increases, written and spoken Turkish use different formulaic sequences.

Keywords

Frequent morpheme n-grams, written Turkish, spoken Turkish, Turkish National Corpus


Details

Journal Section	Articles
Authors	Selma Ayşe Özel, Yasin Bektaş, Hakan Yılmazer
Publication Date	July 15, 2016
Published in Issue	Year 2016 Volume: 13 Issue: 2

Cite

APA	Özel, S. A., Bektaş, Y., & Yılmazer, H. (2016). Türkçe’de Kalıp Anlatımlar: Türkçe Ulusal Derlemi’nden Görünümler. Mersin Üniversitesi Dil Ve Edebiyat Dergisi, 13(2), 1-33.