Research Article

Phonetic Distance Tools Used in Linguistics

Year 2024, Volume: 20 Issue: 2, 105 - 126, 25.07.2024

Abstract

Studies that measure the phonetic distance between languages are quantitative investigations of linguistic diversity. They reveal how specific words or linguistic structures vary across regions. This study aims to determine which tools are used to measure phonetic distance between languages and to classify these tools by the methods they employ. To this end, research on phonetic distance from its beginnings to the present was examined. The study is thus intended to guide researchers who aim to analyze phonetic distances between languages or to develop tools for measuring phonetic distance.
Articles, books, and theses on phonetic distance between languages or dialects were identified. Their methodology sections were analyzed and compared, and studies sharing the same data collection and analysis methods were grouped together. Three types of methods were found to be used to calculate and analyze differences between languages and dialects: traditional, perceptual, and computational.

Original title (Turkish): Dilbilimde Kullanılan Ses Bilgisi Mesafe Araçları

There are 54 citations in total.

Details

Primary Language: Turkish
Subjects: Linguistic Structures (Incl. Phonology, Morphology and Syntax), Computational Linguistics
Journal Section: Research Article
Authors: Cemile Uzun (ORCID 0000-0002-0102-3306)

Publication Date: July 25, 2024
Submission Date: May 13, 2024
Acceptance Date: July 5, 2024
Published in Issue: Year 2024, Volume: 20, Issue: 2

Cite

APA: Uzun, C. (2024). Dilbilimde Kullanılan Ses Bilgisi Mesafe Araçları. Dil ve Edebiyat Dergisi, 20(2), 105-126.