TY - JOUR
T1 - The Development of an Error-tagged Learner Corpus: TELC (Turkish-English Learner Corpus) and its Web-interface
TT - Hata Etiketli Öğrenen Derlemi Geliştirilmesi: TELC (Türkçe-İngilizce Öğrenen Derlemi) ve Web-Arayüzü
AU - Cangır, Hakan
AU - Uzun, Kutay
AU - Can, Taner
AU - Oğuz, Enis
AU - Kaya, Ömer Faruk
PY - 2024
DA - December
Y2 - 2024
DO - 10.18492/dad.1489654
JF - Dilbilim Araştırmaları Dergisi
JO - JLR
PB - Dilbilim Derneği
WT - DergiPark
SN - 1300-8552
SP - 279
EP - 307
VL - 35
IS - 2
LA - en
AB - Though rather rare and not favoured by corpus linguists due to computationally hard-to-handle problems, learner corpora consisting of spoken and written texts by students from different L1 backgrounds can benefit both researchers in the field of second language acquisition and language teachers. Growing from this need and considering corpora’s potential importance for the language teachers and learners in the Turkish context, our L2 English learner corpus is yet another humble attempt to build an error-tagged learner corpus particularly scrutinizing lexical errors, which play a key role in the language production of second language learners. Building on Hemchua and Schmitt’s lexical error taxonomy and developed following the strict methodological considerations in the literature (e.g., error naming and fixing through several rounds of tagging), the corpus consists of 369 written texts by 231 university students (with 104,864 words, 3000+ tagged and fixed errors). The corpus database is provided with a user-friendly web-interface, which consists of statistical output, modules highlighting lexical errors and correct versions, different search options including error types, and an error-tagging add-in for further development.
In addition to being a resourceful website trying to guide language practitioners and second language learners, it can be considered a platform with a capacity to be developed further by applied linguists conducting studies in this line of research. Finally, thanks to its easy-to-use interface and versatile features, it has the potential to become a reference learner corpus for English as a foreign/second language with the contribution of other universities in Türkiye.
KW - learner corpus
KW - error-tagging
KW - lexical errors
KW - second language acquisition
N2 - Oldukça nadir olmasına ve derlem dilbilimciler tarafından geliştirmedeki zorlukları nedeniyle tercih edilmemesine rağmen, farklı D1 geçmişlerine sahip öğrencilerin sözlü ve yazılı metinlerinden oluşan öğrenen derlemleri, hem ikinci dil edinimi alanındaki araştırmacılara hem de dil öğretmenlerine fayda sağlayabilir. Bu ihtiyaçtan yola çıkarak ve derlemlerin Türkiye bağlamında dil öğretmenleri ve öğrenenler için potansiyel önemini göz önünde bulundurarak, D2 İngilizce öğrenen derlemimiz, özellikle ikinci dil öğrenenlerin dil üretiminde kilit rol oynayan sözcük hatalarını inceleyen, hata etiketli bir öğrenen derlemi oluşturmaya yönelik bir girişimdir. Hemchua ve Schmitt'in sözcüksel hata taksonomisine dayanan ve alanyazındaki katı metodolojik hususlar (örneğin, hata adlandırma ve birkaç tur etiketleme yoluyla düzeltme) izlenerek geliştirilen derlem, 231 üniversite öğrencisinin 369 yazılı metninden (104.864 sözcük, 3000'den fazla etiketlenmiş ve düzeltilmiş hatadan) oluşmaktadır. Kullanıcı dostu arayüze sahip derlem veri tabanı, kullanıcıların istatistiksel çıktılara ulaşmasına ve sözcüksel hataları ve doğru versiyonlarını görüntüleyebilmesine ve derlem içinde farklı hata türlerini aramasına imkân sağlar. Ayrıca, arayüzde veri tabanının gelişimine olanak sağlayan hata etiketleme eklentisi mevcuttur.
TELC, dil öğretenlere ve ikinci dil öğrenenlere rehber kaynak niteliğinde bir internet sitesi olmasının yanı sıra, bu alanda çalışmalar yürüten uygulamalı dilbilimciler tarafından geliştirilebilecek bir dijital platform olarak da değerlendirilebilir. Son olarak, kullanımı kolay arayüzü ve çok yönlü özellikleri sayesinde, Türkiye'deki diğer üniversitelerin de katkısıyla yabancı/ikinci dil olarak İngilizce öğretimi / öğrenimi için referans bir öğrenen derlemi olma potansiyeline sahiptir.
CR - Anthony, L. (2023). AntConc (Version 4.2.4) [Computer Software]. Waseda University. Available from https://www.laurenceanthony.net/software
CR - Berberich, K., & Kleiber, I. (2023). Tools for corpus linguistics. https://corpus-analysis.com/
CR - Biber, D., Gray, B., & Poonpon, K. (2011). Should we use characteristics of conversation to measure grammatical complexity in L2 writing development? TESOL Quarterly, 45(1), 5–35. https://doi.org/10.5054/tq.2011.244483
CR - Bley-Vroman, R. (1989). What is the logical problem of foreign language learning? In S. M. Gass & J. Schachter (Eds.), Linguistic perspectives on second language acquisition (pp. 41–68). Cambridge University Press. https://doi.org/10.1017/CBO9781139524544.005
CR - Cangır, H., Uzun, K., Can, T., Küllü, K., Oğuz, E., & Kaya, Ö. M. (2025). Linguistic features and L2 English writing quality: A multidimensional analysis. [Manuscript submitted for publication]. AILA Review.
CR - Cortes, V. (2018). Corpus tools for Writing Teachers. In The TESOL Encyclopedia of English Language Teaching (pp. 1–6). John Wiley & Sons, Ltd. https://doi.org/10.1002/9781118784235.eelt0553
CR - Crosthwaite, P. (Ed.). (2024). Corpora for language learning: Bridging the research-practice divide (1st ed.). Routledge. https://doi.org/10.4324/9781003413301
CR - Ellis, N. C., & Laporte, N. (2014). Contexts of acquisition: Effects of formal instruction and naturalistic exposure on second language acquisition. In Tutorials in bilingualism (pp. 53–83).
Psychology Press.
CR - Francis, W., & Kučera, H. (1964). Manual of information to accompany a standard corpus of present-day edited American English, for use with digital computers. Brown University.
CR - Friginal, E. (2013). Developing research report writing skills using corpora. English for Specific Purposes, 32(4), 208–220. https://doi.org/10.1016/j.esp.2013.06.001
CR - Gablasova, D., Brezina, V., & McEnery, T. (2017). Exploring learner language through corpora: Comparing and interpreting corpus frequency information. Language Learning, 67(S1), 130–154. https://doi.org/10.1111/lang.12226
CR - Gilquin, G. (2015). From design to collection of learner corpora. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The Cambridge handbook of learner corpus research (pp. 9–34). Cambridge University Press. https://doi.org/10.1017/CBO9781139649414.002
CR - Gilquin, G., & Granger, S. (2015). Learner language. In D. Biber & R. Reppen (Eds.), The Cambridge handbook of English corpus linguistics (pp. 418–436). Cambridge University Press. https://doi.org/10.1017/CBO9781139764377.024
CR - Gilquin, G. (2023). Written learner corpora to inform teaching. In R. R. Jablonkai & E. Csomay (Eds.), The Routledge Handbook of Corpora and English Language Teaching and Learning (pp. 281–295). Routledge.
CR - Granger, S. (1993). International Corpus of learner English. In J. Aarts, P. de Haan, & N. Oostdijk (Eds.), English language corpora: Design, analysis and exploitation (pp. 57–71). Rodopi. https://doi.org/10.1163/9789004653559_007
CR - Granger, S. (2002). A Bird’s-eye review of learner corpus research. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.), Computer learner corpora, second language acquisition and foreign language teaching (pp. 3–33). John Benjamins. https://doi.org/10.1075/lllt.6.04gra
CR - Granger, S. (2003). The International Corpus of Learner English: A new resource for foreign language learning and teaching and second language acquisition research.
TESOL Quarterly, 37(3), 538–546. https://doi.org/10.2307/3588404
CR - Granger, S. (2015). The contribution of learner corpora to reference and instructional materials design. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The Cambridge handbook of learner corpus research (pp. 485–510). Cambridge University Press. https://doi.org/10.1017/CBO9781139649414.022
CR - Granger, S. (2021). Commentary: Have Learner Corpus Research and Second Language Acquisition Finally Met? In B. Le Bruyn & M. Paquot (Eds.), Learner corpus research meets second language acquisition (pp. 243–257). Cambridge University Press. https://doi.org/10.1017/9781108674577.012
CR - Granger, S., Dupont, M., Meunier, F., Naets, H., & Paquot, M. (2020). The International Corpus of Learner English. Version 3. Presses universitaires de Louvain.
CR - Granger, S., Gilquin, G., & Meunier, F. (2015). Introduction: learner corpus research – past, present and future. In S. Granger, G. Gilquin, & F. Meunier (Eds.), The Cambridge handbook of learner corpus research (pp. 1–6). Cambridge University Press. https://doi.org/10.1017/cbo9781139649414.001
CR - Hemchua, S., & Schmitt, N. (2006). An analysis of lexical errors in the English compositions of Thai learners. Prospect, 21(3), 3–25.
CR - Hunston, S. (2002). Corpora in applied linguistics. Cambridge University Press.
CR - Kaya, F. Ö., Uzun, K., & Cangır, H. (2022). Using corpora for language teaching and assessment in L2: A narrative review. Focus on ELT Journal, 4(3), 46–62. https://doi.org/10.14744/felt.2022.4.3.4
CR - Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J., Rychlý, P., & Suchomel, V. (2014). The Sketch Engine: ten years on. Lexicography, 1(1), 7–36. https://doi.org/10.1007/s40607-014-0009-9
CR - Kučera, H., & Francis, W. (1967). Computational analysis of present day American English. Brown University Press. https://doi.org/10.1002/asi.5090190414
CR - Kyle, K. (2016).
Measuring syntactic development in L2 writing: Fine grained indices of syntactic complexity and usage-based indices of syntactic sophistication [Doctoral dissertation, Georgia State University]. ScholarWorks @Georgia State University. http://scholarworks.gsu.edu/alesl_diss/35
CR - Kyle, K., Crossley, S. A., & Berger, C. (2018). The tool for the automatic analysis of lexical sophistication (TAALES): Version 2.0. Behavior Research Methods, 50, 1030–1046. https://doi.org/10.3758/s13428-017-0924-4
CR - Lee, J. J., Bychkovska, T., & Maxwell, J. D. (2019). Breaking the rules? A corpus-based comparison of informal features in L1 and L2 undergraduate student writing. System, 80, 143–153. https://doi.org/10.1016/j.system.2018.11.010
CR - Lee, S. (2011). Challenges of using corpora in language teaching and learning: Implications for secondary education. Linguistic Research, 28(1), 159–178. https://doi.org/10.17250/khisli.28.1.201104.009
CR - Leech, G. (1981). Semantics: the study of meaning (2nd ed.). Penguin.
CR - Liao, Y., & Fukuya, Y. J. (2004). Avoidance of phrasal verbs: The case of Chinese learners of English. Language Learning, 54(2), 193–226. https://doi.org/10.1111/j.1467-9922.2004.00254.x
CR - Meunier, F. (2020). Introduction to learner Corpus research. In The Routledge handbook of second language acquisition and corpora (pp. 23–36). Routledge. https://doi.org/10.4324/9781351137904-4
CR - Moore, J. (2005). Common mistakes at Proficiency ... and how to avoid them. Cambridge University Press.
CR - Murakami, A., & Alexopoulou, T. (2016). L1 influence on the acquisition order of English grammatical morphemes: A learner corpus study. Studies in Second Language Acquisition, 38(3), 365–401. https://doi.org/10.1017/S0272263115000352
CR - Myles, F. (2005). Interlanguage corpora and second language acquisition research. Second Language Research, 21(4), 373–391. https://doi.org/10.1191/0267658305sr252oa
CR - Myles, F. (2021).
Commentary: An SLA perspective on learner corpus research. In B. Le Bruyn & M. Paquot (Eds.), Learner Corpus Research Meets Second Language Acquisition (pp. 258–273). Cambridge University Press. https://doi.org/10.1017/9781108674577.013
CR - Nesselhauf, N. (2003). The Use of Collocations by Advanced Learners of English and Some Implications for Teaching. Applied Linguistics, 24(2), 223–242. https://doi.org/10.1093/applin/24.2.223
CR - O’Keeffe, A., McCarthy, M., & Carter, R. (2007). From Corpus to Classroom: Language Use and Language Teaching. Cambridge University Press.
CR - Paquot, M., & Granger, S. (2012). Formulaic language in learner corpora. Annual Review of Applied Linguistics, 32, 130–149. https://doi.org/10.1017/S0267190512000098
CR - Paquot, M., Larsson, T., Hasselgård, H., Ebeling, S. O., De Meyere, D., Valentin, L., Laso, N. J., Verdaguer, I., & van Vuuren, S. (2022). The varieties of English for specific purposes database (VESPA): Towards a multi-L1 and multi-register learner corpus of disciplinary writing. Research in Corpus Linguistics, 10(2), 1–15. https://doi.org/10.32714/ricl.10.02.02
CR - Pérez-Paredes, P. (2022). A systematic review of the uses and spread of corpora and data-driven learning in CALL research during 2011–2015. Computer Assisted Language Learning, 35(1-2), 36–61. https://doi.org/10.1080/09588221.2019.1667832
CR - Schneider, G. (2023). Detecting and analysing learner difficulties using a learner corpus without error tagging. In K. Harrington & P. Ronan (Eds.), Demystifying corpus linguistics for English language teaching (pp. 229–257). Springer International Publishing. https://doi.org/10.1007/978-3-031-11220-1_12
CR - Selivan, L. (2023). Corpus linguistics and vocabulary teaching. In K. Harrington & P. Ronan (Eds.), Demystifying corpus linguistics for English language teaching (pp. 139–161). Springer International Publishing. https://doi.org/10.1007/978-3-031-11220-1_8
CR - Sinclair, J. M. (1990). Collins COBUILD English grammar. Collins.
CR - Thewissen, J. (2013). Capturing L2 accuracy developmental patterns: Insights from an error‐tagged EFL learner corpus. The Modern Language Journal, 97(S1), 77–101.
CR - Thewissen, J. (2015). Accuracy across proficiency levels: A learner corpus approach. Presses universitaires de Louvain.
CR - Xiao, R. (2009). How can corpora help in language pedagogy. In Postgraduate Conference in Applied Linguistics, Ningbo, China.
CR - Xu, Q. (2016). Application of learner corpora to second language learning and teaching: An overview. English Language Teaching, 9(8), 46–52. https://eric.ed.gov/?id=EJ1104573
UR - https://doi.org/10.18492/dad.1489654
L1 - https://dergipark.org.tr/en/download/article-file/3954694
ER -