A Novel Syntactic-Based Approach to Calculate Similarities Among Languages

Metin Bilgin

doi:10.19113/sdufenbed.1168260

Research Article

Turkish title: Diller Arasındaki Benzerliği Hesaplamak için Sözdizimsel Yeni Bir Yaklaşım ("A New Syntactic Approach for Calculating the Similarity Between Languages")

Year 2023, Volume: 27 Issue: 1, 125 - 136, 25.04.2023

Abstract

The approach presented in this study calculates how similar languages are to one another by using a new feature template obtained from the syntactic analysis phase. To show that language similarity can be calculated with the proposed feature template, studies were carried out on six language datasets belonging to two different language families. In the first study, the triplet-pattern templates obtained from the syntactic analysis of Turkish, Kazakh, and Uyghur, members of the Turkic group within the Ural-Altaic language family, are extracted automatically by the developed software, and a separate module of the same software performs the similarity analysis of the selected languages. This makes it possible to show the structural similarities among languages of the same family and also to expose the structural differences among them. Although the initial goal in developing the approach was to determine the similarities among Turkic languages, the real aim is to build a language-independent system. A second study was carried out to show that the system is language independent. In the second study, the similarities among languages were measured using corpora of English, Swedish, and Norwegian from the Germanic language family. For the Turkic family, with the Jaccard, Dice, and Similarity Matching metrics, the highest similarity is between Turkish and Uyghur, with values of 25.21%, 40.27%, and 50.42%, respectively. For the Germanic family, the highest similarity is between Norwegian and Swedish, with values of 37.15%, 54.17%, and 74.3%, respectively.

Keywords

Language Similarity, Universal Dependencies, Natural Language Processing

Abstract

This study presents an approach for calculating the similarity among languages using a new feature template obtained from the syntactic analysis phase. To show that language similarity can be calculated with the proposed feature template, experiments were conducted on six language datasets from two language families. In the first study, triplet-pattern templates were extracted automatically, by software developed for this purpose, from the syntactic analyses of Turkish, Kazakh, and Uyghur, which belong to the Turkic group of the Ural-Altaic language family; a separate module of the same software performs the similarity analysis of the selected languages. As a result, both the structural similarities and the structural differences among languages of the same family can be revealed. Although the initial goal in developing the approach was to determine the similarities among Turkic languages, the real aim is a language-independent system. A second study was carried out to show that the system is language independent. In the second study, the similarities among languages were measured using treebanks of English, Swedish, and Norwegian from the Germanic language family. For the Turkic family, the highest similarity is between Turkish and Uyghur, with Jaccard, Dice, and Similarity Matching values of 25.21%, 40.27%, and 50.42%, respectively. For the Germanic family, the highest similarity is between Norwegian and Swedish, with values of 37.15%, 54.17%, and 74.3%, respectively.
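
To make the set-based metrics concrete, the following is a minimal sketch, not the software described in the study. The abstract does not define the triplet pattern, so the code assumes a triplet is (head UPOS tag, dependency relation, dependent UPOS tag) read from CoNLL-U formatted treebank text. The names `extract_triplets`, `jaccard`, `dice`, `dice_from_jaccard`, and `row`, as well as the toy sentences, are hypothetical. Similarity Matching is left out because the abstract does not define it.

```python
from typing import Set, Tuple

# Assumed pattern shape: (head UPOS, dependency relation, dependent UPOS).
Triplet = Tuple[str, str, str]


def extract_triplets(conllu: str) -> Set[Triplet]:
    """Collect the set of assumed triplet patterns found in CoNLL-U text."""
    patterns: Set[Triplet] = set()
    sentence = []  # token rows of the sentence currently being read
    for line in conllu.splitlines() + [""]:  # trailing "" flushes the last sentence
        line = line.strip()
        if line.startswith("#"):  # sentence-level comment
            continue
        if line:
            cols = line.split("\t")
            # Keep plain tokens only; skip ranges ("1-2") and empty nodes ("1.1").
            if len(cols) >= 8 and cols[0].isdigit():
                sentence.append(cols)
            continue
        # Blank line marks a sentence boundary: resolve heads within the sentence.
        upos_by_id = {cols[0]: cols[3] for cols in sentence}
        upos_by_id["0"] = "ROOT"  # HEAD = 0 denotes the artificial root
        for cols in sentence:
            head_upos = upos_by_id.get(cols[6])
            if head_upos is not None:
                patterns.add((head_upos, cols[7], cols[3]))
        sentence = []
    return patterns


def jaccard(a: Set[Triplet], b: Set[Triplet]) -> float:
    """|A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a: Set[Triplet], b: Set[Triplet]) -> float:
    """2|A ∩ B| / (|A| + |B|), defined as 0.0 when both sets are empty."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def dice_from_jaccard(j: float) -> float:
    """Identity between the two set-based coefficients: D = 2J / (1 + J)."""
    return 2 * j / (1 + j)


def row(idx: int, form: str, upos: str, head: int, deprel: str) -> str:
    """Build one 10-column CoNLL-U token line for the toy data below."""
    return "\t".join([str(idx), form, "_", upos, "_", "_", str(head), deprel, "_", "_"])


# Toy, hand-written sentences (not taken from any treebank).
sample_a = "\n".join([row(1, "Dogs", "NOUN", 2, "nsubj"), row(2, "bark", "VERB", 0, "root")])
sample_b = "\n".join([
    row(1, "The", "DET", 2, "det"),
    row(2, "dog", "NOUN", 3, "nsubj"),
    row(3, "barks", "VERB", 0, "root"),
])

a, b = extract_triplets(sample_a), extract_triplets(sample_b)
print(f"Jaccard = {jaccard(a, b):.4f}, Dice = {dice(a, b):.4f}")
```

On the toy data the two sets share two of three distinct patterns, so the Jaccard coefficient is 2/3 and the Dice coefficient is 0.8. The reported Jaccard and Dice percentages are consistent with the identity D = 2J / (1 + J): a Jaccard value of 25.21% corresponds to a Dice value of 40.27%, and 37.15% corresponds to 54.17%.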

There are 46 citations in total.

Details

Primary Language	English
Subjects	Engineering
Journal Section	Articles
Authors	Metin Bilgin 0000-0002-4216-0542
Publication Date	April 25, 2023
Published in Issue	Year 2023 Volume: 27 Issue: 1

Cite

APA	Bilgin, M. (2023). A Novel Syntactic-Based Approach to Calculate Similarities Among Languages. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 27(1), 125-136. https://doi.org/10.19113/sdufenbed.1168260
AMA	Bilgin M. A Novel Syntactic-Based Approach to Calculate Similarities Among Languages. J. Nat. Appl. Sci. April 2023;27(1):125-136. doi:10.19113/sdufenbed.1168260
Chicago	Bilgin, Metin. “A Novel Syntactic-Based Approach to Calculate Similarities Among Languages”. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi 27, no. 1 (April 2023): 125-36. https://doi.org/10.19113/sdufenbed.1168260.
EndNote	Bilgin M (April 1, 2023) A Novel Syntactic-Based Approach to Calculate Similarities Among Languages. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi 27 1 125–136.
IEEE	M. Bilgin, “A Novel Syntactic-Based Approach to Calculate Similarities Among Languages”, J. Nat. Appl. Sci., vol. 27, no. 1, pp. 125–136, 2023, doi: 10.19113/sdufenbed.1168260.
ISNAD	Bilgin, Metin. “A Novel Syntactic-Based Approach to Calculate Similarities Among Languages”. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi 27/1 (April 2023), 125-136. https://doi.org/10.19113/sdufenbed.1168260.
JAMA	Bilgin M. A Novel Syntactic-Based Approach to Calculate Similarities Among Languages. J. Nat. Appl. Sci. 2023;27:125–136.
MLA	Bilgin, Metin. “A Novel Syntactic-Based Approach to Calculate Similarities Among Languages”. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi, vol. 27, no. 1, 2023, pp. 125-36, doi:10.19113/sdufenbed.1168260.
Vancouver	Bilgin M. A Novel Syntactic-Based Approach to Calculate Similarities Among Languages. J. Nat. Appl. Sci. 2023;27(1):125-36.

e-ISSN: 1308-6529
Linking ISSN (ISSN-L): 1300-7688

All published articles in the journal can be accessed free of charge and are open access under the Creative Commons CC BY-NC (Attribution-NonCommercial) license. All authors and other journal users are deemed to have accepted these terms.