<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20241031//EN"
        "https://jats.nlm.nih.gov/publishing/1.4/JATS-journalpublishing1-4.dtd">
<article  article-type="research-article"        dtd-version="1.4">
            <front>

                <journal-meta>
                                                                <journal-id>saucis</journal-id>
            <journal-title-group>
                                                                                    <journal-title>Sakarya University Journal of Computer and Information Sciences</journal-title>
            </journal-title-group>
                                        <issn pub-type="epub">2636-8129</issn>
                                                                                            <publisher>
                    <publisher-name>Sakarya University</publisher-name>
                </publisher>
                    </journal-meta>
                <article-meta>
                                        <article-id pub-id-type="doi">10.35377/saucis.7.87942.1544012</article-id>
                                                                <article-categories>
                                            <subj-group  xml:lang="en">
                                                            <subject>Software Engineering (Other)</subject>
                                                    </subj-group>
                                            <subj-group  xml:lang="tr">
                                                            <subject>Yazılım Mühendisliği (Diğer)</subject>
                                                    </subj-group>
                                    </article-categories>
                                                                                                                                                        <title-group>
                                                                                                                                                            <article-title>TurkishLex:Development of a Context-Aware Spell Checker for Detecting and Correcting Spelling Errors in Turkish Texts</article-title>
                                                                                                    </title-group>
            
                                                    <contrib-group content-type="authors">
                                                                        <contrib contrib-type="author">
                                                                <name>
                                    <surname>Savci</surname>
                                    <given-names>Pinar</given-names>
                                </name>
                                                                    <aff>Arçelik A.S.</aff>
                                                            </contrib>
                                                    <contrib contrib-type="author">
                                                                    <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-2498-3297</contrib-id>
                                                                <name>
                                    <surname>Daş</surname>
                                    <given-names>Bihter</given-names>
                                </name>
                                                                    <aff>Fırat University</aff>
                                                            </contrib>
                                                                                </contrib-group>
                        
                                        <pub-date pub-type="pub" iso-8601-date="20241231">
                    <day>31</day>
                    <month>12</month>
                    <year>2024</year>
                </pub-date>
                                        <volume>7</volume>
                                        <issue>3</issue>
                                        <fpage>404</fpage>
                                        <lpage>415</lpage>
                        
                        <history>
                                    <date date-type="received" iso-8601-date="20240905">
                        <day>05</day>
                        <month>09</month>
                        <year>2024</year>
                    </date>
                                                    <date date-type="accepted" iso-8601-date="20241010">
                        <day>10</day>
                        <month>10</month>
                        <year>2024</year>
                    </date>
                            </history>
                                        <permissions>
                    <copyright-statement>Copyright © 2018, Sakarya University Journal of Computer and Information Sciences</copyright-statement>
                    <copyright-year>2018</copyright-year>
                    <copyright-holder>Sakarya University Journal of Computer and Information Sciences</copyright-holder>
                </permissions>
            
                                                                                                                        <abstract><p>Accurate spelling correction in Turkish is crucial for effective communication and for preserving the integrity of written text. The main challenge is the complexity of Turkish morphology and orthography, which leads to frequent and diverse spelling errors. This study presents a spell checker adapted for Turkish through the creation of a new Turkish dataset. The proposed model detects spelling errors and captures both minor and major textual changes. Our findings show that the proposed system is reliable, reaching 98.21% accuracy with the SymSpell module when correcting Turkish texts. The study also provides insight into the strengths and weaknesses of existing spell checkers and contributes to the improvement of spelling correction tools for Turkish.</p></abstract>
                                                            
            
                                                                                        <kwd-group>
                                                    <kwd>Spell checker</kwd>
                                                    <kwd>Spelling errors</kwd>
                                                    <kwd>Natural Language Processing (NLP)</kwd>
                                                    <kwd>Spelling correction</kwd>
                                                    <kwd>Turkish texts</kwd>
                                            </kwd-group>
                            
                                                                                                                                                <funding-group specific-use="FundRef">
                    <award-group>
                                                                            <award-id>AR-22-087-0001</award-id>
                                            </award-group>
                </funding-group>
                                </article-meta>
    </front>
    <back>
                            <ref-list>
                                    <ref id="ref1">
                        <label>1</label>
                        <mixed-citation publication-type="journal">Y. Chaabi and F. Ataa Allah, “Amazigh spell checker using Damerau-Levenshtein algorithm and N-gram,” Journal of King Saud University - Computer and Information Sciences, vol. 34, no. 8, Part B, pp. 6116–6124, Sep. 2022, doi: 10.1016/j.jksuci.2021.07.015.</mixed-citation>
                    </ref>
                                    <ref id="ref2">
                        <label>2</label>
                        <mixed-citation publication-type="journal">V. J. Hodge and J. Austin, “A comparison of a novel neural spell checker and standard spell checking algorithms,” Pattern Recognition, vol. 35, no. 11, pp. 2571–2580, Nov. 2002, doi: 10.1016/S0031-3203(01)00174-1.</mixed-citation>
                    </ref>
                                    <ref id="ref3">
                        <label>3</label>
                        <mixed-citation publication-type="journal">R. Garfinkel, E. Fernandez, and R. Gopal, “Design of an interactive spell checker: Optimizing the list of offered words,” Decision Support Systems, vol. 35, no. 3, pp. 385–397, Jun. 2003, doi: 10.1016/S0167-9236(02)00115-X.</mixed-citation>
                    </ref>
                                    <ref id="ref4">
                        <label>4</label>
                        <mixed-citation publication-type="journal">M. Nejja and A. Yousfi, “The Context in Automatic Spell Correction,” Procedia Computer Science, vol. 73, pp. 109–114, Jan. 2015, doi: 10.1016/j.procs.2015.12.055.</mixed-citation>
                    </ref>
                                    <ref id="ref5">
                        <label>5</label>
                        <mixed-citation publication-type="journal">K. Sarıtaş, C. A. Öz, and T. Güngör, “A comprehensive analysis of static word embeddings for Turkish,” Expert Systems with Applications, vol. 252, p. 124123, Oct. 2024, doi: 10.1016/j.eswa.2024.124123.</mixed-citation>
                    </ref>
                                    <ref id="ref6">
                        <label>6</label>
                        <mixed-citation publication-type="journal">S. Demir and B. Topcu, “Graph-based Turkish text normalization and its impact on noisy text processing,” Engineering Science and Technology, an International Journal, vol. 35, p. 101192, Nov. 2022, doi: 10.1016/j.jestch.2022.101192.</mixed-citation>
                    </ref>
                                    <ref id="ref7">
                        <label>7</label>
                        <mixed-citation publication-type="journal">Y. B. Kaya and A. C. Tantuğ, “Effect of tokenization granularity for Turkish large language models,” Intelligent Systems with Applications, vol. 21, p. 200335, Mar. 2024, doi: 10.1016/j.iswa.2024.200335.</mixed-citation>
                    </ref>
                                    <ref id="ref8">
                        <label>8</label>
                        <mixed-citation publication-type="journal">K. Kukich, “Techniques for automatically correcting words in text,” ACM Computing Surveys, vol. 24, no. 4, pp. 377–439, Dec. 1992.</mixed-citation>
                    </ref>
                                    <ref id="ref9">
                        <label>9</label>
                        <mixed-citation publication-type="journal">P. T. Hacken and C. Tschichold, “Word Manager and CALL: Structured access to the lexicon as a tool for enriching learners’ vocabulary,” ReCALL, vol. 13, no. 1, pp. 121–131, May 2001, doi: 10.1017/S0958344001001112.</mixed-citation>
                    </ref>
                                    <ref id="ref10">
                        <label>10</label>
                        <mixed-citation publication-type="journal">W. Phatthiyaphaibun, K. Chaovavanich, C. Polpanumas, A. Suriyawongkul, L. Lowphansirikul, and P. Chormai, PyThaiNLP: Thai Natural Language Processing in Python. (Jun. 2024). Python. Accessed: Aug. 27, 2024. [Online]. Available: https://github.com/PyThaiNLP/pythainlp</mixed-citation>
                    </ref>
                                    <ref id="ref11">
                        <label>11</label>
                        <mixed-citation publication-type="journal">hunspell/hunspell. (Aug. 27, 2024). C++. hunspell. Accessed: Aug. 27, 2024. [Online]. Available: https://github.com/hunspell/hunspell</mixed-citation>
                    </ref>
                                    <ref id="ref12">
                        <label>12</label>
                        <mixed-citation publication-type="journal">A. Lertpiya, T. Chaiwachirasak, N. Maharattanamalai, T. Lapjaturapit, T. Chalothorn, N. Tirasaroj, et al., &quot;A preliminary study on fundamental Thai NLP tasks for user-generated Web content&quot;, Proc. Int. Joint Symp. Artif. Intell. Natural Lang. Process. (iSAI-NLP), pp. 1-8, Nov. 2018.</mixed-citation>
                    </ref>
                                    <ref id="ref13">
                        <label>13</label>
                        <mixed-citation publication-type="journal">S. Watcharabutsarakham, &quot;Spell checker for Thai document&quot;, Proc. IEEE Region Conf., pp. 1-4, Nov. 2005.</mixed-citation>
                    </ref>
                                    <ref id="ref14">
                        <label>14</label>
                        <mixed-citation publication-type="journal">M. Rodphon, K. Siriboon and B. Kruatrachue, &quot;Thai OCR error correction using token passing algorithm&quot;, Proc. IEEE Pacific Rim Conf. Commun. Comput. Signal Process., pp. 599-602, 2001.</mixed-citation>
                    </ref>
                                    <ref id="ref15">
                        <label>15</label>
                        <mixed-citation publication-type="journal">B. Kruatrachue, K. Somguntar and K. Siriboon, &quot;Thai OCR error correction using genetic algorithm&quot;, Proc. 1st Int. Symp. Cyber Worlds, pp. 137-141, 2002.</mixed-citation>
                    </ref>
                                    <ref id="ref16">
                        <label>16</label>
                        <mixed-citation publication-type="journal">H. T. Ng, S. M. Wu, T. Briscoe, C. Hadiwinoto, R. H. Susanto and C. Bryant, &quot;The CoNLL-2014 shared task on grammatical error correction&quot;, Proc. 18th Conf. Comput. Natural Lang. Learn. Shared Task, pp. 1-14, 2014.</mixed-citation>
                    </ref>
                                    <ref id="ref17">
                        <label>17</label>
                        <mixed-citation publication-type="journal">A. Rozovskaya and D. Roth, &quot;Grammatical error correction: Machine translation and classifiers&quot;, Proc. 54th Annu. Meeting Assoc. Comput. Linguistics, pp. 2205-2215, Aug. 2016, [online] Available: https://www.aclweb.org/anthology/P16-1208.</mixed-citation>
                    </ref>
                                    <ref id="ref18">
                        <label>18</label>
                        <mixed-citation publication-type="journal">M. Junczys-Dowmunt and R. Grundkiewicz, &quot;Phrase-based Machine Translation is State-of-the-Art for Automatic Grammatical Error Correction&quot;, Proc. Conf. Empirical Methods Natural Lang. Process., pp. 1546-1556, Nov. 2016, [online] Available: https://www.aclweb.org/anthology/D16-1161.</mixed-citation>
                    </ref>
                                    <ref id="ref19">
                        <label>19</label>
                        <mixed-citation publication-type="journal">J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint arXiv:1810.04805, 2019.</mixed-citation>
                    </ref>
                                    <ref id="ref20">
                        <label>20</label>
                        <mixed-citation publication-type="journal">A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems, 2017, pp. 5998–6008.</mixed-citation>
                    </ref>
                                    <ref id="ref21">
                        <label>21</label>
                        <mixed-citation publication-type="journal">Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “RoBERTa: A robustly optimized BERT pretraining approach,” arXiv preprint arXiv:1907.11692, 2019.</mixed-citation>
                    </ref>
                                    <ref id="ref22">
                        <label>22</label>
                        <mixed-citation publication-type="journal">I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.</mixed-citation>
                    </ref>
                                    <ref id="ref23">
                        <label>23</label>
                        <mixed-citation publication-type="journal">D. N. Mati, M. Hamiti, B. Selimi and J. Ajdari, &quot;Building Spell-Check Dictionary for Low-Resource Language by Comparing Word Usage,&quot; 2021 44th International Convention on Information, Communication and Electronic Technology (MIPRO), Opatija, Croatia, 2021, pp. 229-236, doi: 10.23919/MIPRO52101.2021.9597183.</mixed-citation>
                    </ref>
                                    <ref id="ref24">
                        <label>24</label>
                        <mixed-citation publication-type="journal">A. Kicsi, K. Szabó Ledenyi, and L. Vidács, “Radiologic text correction for better machine understanding,” Engineering Reports, p. e12891, doi: 10.1002/eng2.12891.</mixed-citation>
                    </ref>
                                    <ref id="ref25">
                        <label>25</label>
                        <mixed-citation publication-type="journal">D. Pogrebnoi, A. Funkner, and S. Kovalchuk, “RuMedSpellchecker: A new approach for advanced spelling error correction in Russian electronic health records,” Journal of Computational Science, vol. 82, p. 102393, Oct. 2024, doi: 10.1016/j.jocs.2024.102393.</mixed-citation>
                    </ref>
                                    <ref id="ref26">
                        <label>26</label>
                        <mixed-citation publication-type="journal">E. O’Neill, R. Young, E. Thiaville, M. MacCarthy, J. Carson-Berndsen, and A. Ventresque, “S-capade: Spelling correction aimed at particularly deviant errors,” in Statistical Language and Speech Processing: 8th International Conference, SLSP 2020, Cardiff, UK, October 14–16, 2020, Proceedings 8. Springer International Publishing, pp. 85–96.</mixed-citation>
                    </ref>
                                    <ref id="ref27">
                        <label>27</label>
                        <mixed-citation publication-type="journal">U. Liyanapathirana, K. Gunasinghe, and G. Dias, “Sinspell: A comprehensive spelling checker for Sinhala,” arXiv preprint arXiv:2107.02983, 2021.</mixed-citation>
                    </ref>
                                    <ref id="ref28">
                        <label>28</label>
                        <mixed-citation publication-type="journal">O. Abiola, A. Abayomi-Alli, O. A. Tale, S. Misra, and O. Abayomi-Alli, “Sentiment analysis of COVID-19 tweets from selected hashtags in Nigeria using VADER and Text Blob analyser,” Journal of Electrical Systems and Information Technology, vol. 10, no. 1, p. 5, Jan. 2023, doi: 10.1186/s43067-023-00070-9.</mixed-citation>
                    </ref>
                                    <ref id="ref29">
                        <label>29</label>
                        <mixed-citation publication-type="journal">N. Bölücü and B. Can, &quot;Context Based Automatic Spelling Correction for Turkish,&quot; 2019 Scientific Meeting on Electrical-Electronics &amp; Biomedical Engineering and Computer Science (EBBT), Istanbul, Turkey, 2019, pp. 1-4, doi: 10.1109/EBBT.2019.8742067.</mixed-citation>
                    </ref>
                                    <ref id="ref30">
                        <label>30</label>
                        <mixed-citation publication-type="journal">M. Aydoğan and A. Karci, “Spelling Correction with the Dictionary Method for the Turkish Language Using Word Embeddings,” Avrupa Bilim ve Teknoloji Dergisi, pp. 57–63, 2020, doi: 10.31590/ejosat.araconf8.</mixed-citation>
                    </ref>
                                    <ref id="ref31">
                        <label>31</label>
                        <mixed-citation publication-type="journal">O. Büyük, M. Erden and L. M. Arslan, &quot;Context Influence on Sequence to Sequence Turkish Spelling Correction,&quot; 2019 27th Signal Processing and Communications Applications Conference (SIU), Sivas, Turkey, 2019, pp. 1-4, doi: 10.1109/SIU.2019.8806476.</mixed-citation>
                    </ref>
                            </ref-list>
                    </back>
    </article>
