<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20241031//EN"
        "https://jats.nlm.nih.gov/publishing/1.4/JATS-journalpublishing1-4.dtd">
<article article-type="research-article" dtd-version="1.4">
            <front>

                <journal-meta>
                                                                <journal-id>süleyman demirel üniv. fen bilim. enst. derg.</journal-id>
            <journal-title-group>
                                                                                    <journal-title>Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi</journal-title>
            </journal-title-group>
                                        <issn pub-type="epub">1308-6529</issn>
                                                                                            <publisher>
                    <publisher-name>Süleyman Demirel Üniversitesi</publisher-name>
                </publisher>
                    </journal-meta>
                <article-meta>
                                        <article-id pub-id-type="doi">10.19113/sdufenbed.1753641</article-id>
                                                                <article-categories>
                                            <subj-group  xml:lang="en">
                                                            <subject>Speech Recognition</subject>
                                                    </subj-group>
                                            <subj-group  xml:lang="tr">
                                                            <subject>Konuşma Tanıma</subject>
                                                    </subj-group>
                                    </article-categories>
                <title-group>
                    <article-title>Hybrid Spectral and Statistical Feature Modelling with Cross-Correlation and Ensemble Learning for Robust Turkish Voice Command Classification</article-title>
                    <trans-title-group xml:lang="tr">
                        <trans-title>Türkçe Sesli Komut Sınıflandırması için Melez Spektral ve İstatistiksel Özellik Modellemesi: Çapraz Korelasyon ve Topluluk Öğrenmesi Yaklaşımı</trans-title>
                    </trans-title-group>
                </title-group>
            
                                                    <contrib-group content-type="authors">
                                                                        <contrib contrib-type="author">
                                <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-7632-1973</contrib-id>
                                                                <name>
                                    <surname>İkizler</surname>
                                    <given-names>Nuri</given-names>
                                </name>
                                <aff>Karadeniz Teknik Üniversitesi</aff>
                                                            </contrib>
                                                                                </contrib-group>
                        
                <pub-date pub-type="pub" iso-8601-date="20260424">
                    <day>24</day>
                    <month>04</month>
                    <year>2026</year>
                </pub-date>
                                        <volume>30</volume>
                                        <issue>1</issue>
                                        <fpage>67</fpage>
                                        <lpage>82</lpage>
                        
                        <history>
                    <date date-type="received" iso-8601-date="20250729">
                        <day>29</day>
                        <month>07</month>
                        <year>2025</year>
                    </date>
                    <date date-type="accepted" iso-8601-date="20251222">
                        <day>22</day>
                        <month>12</month>
                        <year>2025</year>
                    </date>
                            </history>
                                        <permissions>
                    <copyright-statement>Copyright © 1995, Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi</copyright-statement>
                    <copyright-year>1995</copyright-year>
                    <copyright-holder>Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi</copyright-holder>
                </permissions>
            
                                                                                                <trans-abstract xml:lang="tr">
                            <p>Türkçe sesli komutların doğru bir şekilde sınıflandırılması, sesle kontrol edilen teknolojilerin gelişimi ve ana dil bağlamında insan-bilgisayar etkileşiminin sorunsuz bir şekilde gerçekleşmesi açısından kritik öneme sahiptir. Bu çalışmada, konuşma sinyallerinin zamansal, spektral ve zaman-frekans temelli özelliklerini yakalayarak tanıma doğruluğunu artırmayı amaçlayan çeşitli özellik çıkarım modelleri sistematik olarak değerlendirilmiştir. Altı farklı özellik vektörü modeli geliştirilmiş; son modelde ise Bilgi Kazancı tabanlı özellik seçimi ile Doğrusal Öngörümleme Kodlama kullanılarak elde edilen formant frekansları entegre edilerek kapsamlı ve ayrıştırıcı bir temsil elde edilmiştir. Sınıflandırma süreci, yaygın olarak kullanılan altı algoritma ile gerçekleştirilmiştir: Rastgele Orman, k-En Yakın Komşu, Çok Katmanlı Algılayıcı, Lojistik Model Ağacı, Destek Vektör Makineleri ve Rastgele Orman, Çok Katmanlı Algılayıcı ve Lojistik Model Ağacı yöntemlerini birleştiren bir topluluk oylama yöntemi. Topluluk oylama sınıflandırıcısı, %93,94 doğruluk oranı ile en yüksek performansı sergileyerek bireysel sınıflayıcıları ve temel modelleri anlamlı şekilde geride bırakmıştır. Bu çalışma, Türkçe sesli komut tanıma uygulamalarına yönelik sağlam, açıklanabilir ve yüksek performanslı bir özellik çerçevesi sunarak literatüre önemli bir katkı sağlamaktadır. Spektral, zamansal ve artikülatuvar özelliklerin entegrasyonu, sesli komutların daha başarılı bir şekilde ayrıştırılmasını mümkün kılmakta ve Türkçe dilindeki sesli kontrol sistemlerinin gelecekteki uygulamaları için değerli çıkarımlar sunmaktadır.</p></trans-abstract>
                                                                                                                                    <abstract><p>Accurate classification of Turkish voice commands is essential for advancing voice-controlled technologies and enabling seamless human-computer interaction in native language contexts. This study systematically evaluates multiple feature extraction models capturing temporal, spectral, and time-frequency characteristics of speech signals to enhance recognition accuracy. Six feature vector models were developed, with the final model integrating Information Gain-based feature selection and Linear Predictive Coding-derived formant frequencies to create a comprehensive and discriminative representation. Classification was performed using six widely adopted algorithms: Random Forest, k-Nearest Neighbors, Multilayer Perceptron, Logistic Model Tree, Support Vector Machine, and an Ensemble voting method combining Random Forest, Multilayer Perceptron, and Logistic Model Tree. The Ensemble voting classifier demonstrated superior performance, achieving an accuracy of 93.94%, significantly outperforming individual classifiers and baseline models. This study contributes to the literature by presenting a robust, explainable, and high-performing feature framework tailored for Turkish voice command recognition. The integration of spectral, temporal, and articulatory features enables improved discrimination of speech commands, offering valuable insights for future voice-activated applications in Turkish language environments.</p></abstract>
                                                            
            
                <kwd-group xml:lang="en">
                    <kwd>Turkish voice command recognition</kwd>
                    <kwd>Feature extraction of speech</kwd>
                    <kwd>Ensemble learning</kwd>
                    <kwd>Information gain</kwd>
                    <kwd>Cross-correlation</kwd>
                </kwd-group>
                            
                <kwd-group xml:lang="tr">
                    <kwd>Türkçe sesli komut tanıma</kwd>
                    <kwd>Özellik çıkarımı</kwd>
                    <kwd>Topluluk öğrenmesi</kwd>
                    <kwd>Bilgi kazancı</kwd>
                    <kwd>Çapraz korelasyon</kwd>
                </kwd-group>
                                                                                                                                        </article-meta>
    </front>
    <back>
                            <ref-list>
                                    <ref id="ref1">
                        <label>1</label>
                        <mixed-citation publication-type="confproc">Jakob, D. 2022. Voice controlled devices and older adults – a systematic literature review. In Proc. International Conference on Human-Computer Interaction, Cham, Switzerland: Springer International Publishing, 175–200.</mixed-citation>
                    </ref>
                                    <ref id="ref2">
                        <label>2</label>
                        <mixed-citation publication-type="book">Saritha, B., Laskar, M. A., Laskar, R. H. 2022. A comprehensive review on speaker recognition. In Advances in Speech and Music Technology: Computational Aspects and Applications, 3–23.</mixed-citation>
                    </ref>
                                    <ref id="ref3">
                        <label>3</label>
                        <mixed-citation publication-type="journal">Gormez, Y. 2024. Customized deep learning based Turkish automatic speech recognition system supported by language model. PeerJ Computer Science, 10, e1981.</mixed-citation>
                    </ref>
                                    <ref id="ref4">
                        <label>4</label>
                        <mixed-citation publication-type="journal">Çelik, Y. 2024. Application of deep learning for voice command classification in Turkish language. Bitlis Eren University Journal of Science and Technology, 13(3), 701–708.</mixed-citation>
                    </ref>
                                    <ref id="ref5">
                        <label>5</label>
                        <mixed-citation publication-type="book">Kang, O., Pickering, L. 2024. Acoustic and temporal analysis for assessing speaking. In The Concise Companion to Language Assessment, 383.</mixed-citation>
                    </ref>
                                    <ref id="ref6">
                        <label>6</label>
                        <mixed-citation publication-type="journal">Abdul, Z. K., Al-Talabani, A. K. 2022. Mel frequency cepstral coefficient and its applications: A review. IEEE Access, 10, 122136–122158.</mixed-citation>
                    </ref>
                                    <ref id="ref7">
                        <label>7</label>
                        <mixed-citation publication-type="journal">Badhe, S. S., Shirbahadurkar, S. D., Gulhane, S. R. 2022. Renyi entropy and deep learning-based approach for accent classification. Multimedia Tools and Applications, 81(1), 1467–1499.</mixed-citation>
                    </ref>
                                    <ref id="ref8">
                        <label>8</label>
                        <mixed-citation publication-type="journal">Singh, V. K., Sharma, K., Sur, S. N. 2025. Acoustic scene classification using dynamic time warping technique based on short time Fourier transform and discrete wavelet transforms. Circuits, Systems, and Signal Processing, 44(3), 1887–1913.</mixed-citation>
                    </ref>
                                    <ref id="ref9">
                        <label>9</label>
                        <mixed-citation publication-type="thesis">İkizler, N. 1995. Doğrusal Öngörümleme Kodlama ve Yapay Sinir Ağı Yöntemlerinin Ses Tanımada Kullanılması. Karadeniz Teknik Üniversitesi, Fen Bilimleri Enstitüsü Yüksek Lisans Tezi, 83s, Trabzon.</mixed-citation>
                    </ref>
                                    <ref id="ref10">
                        <label>10</label>
                        <mixed-citation publication-type="journal">Malik, M., Malik, M. K., Mehmood, K., Makhdoom, I. 2021. Automatic speech recognition: a survey. Multimedia Tools and Applications, 80(6), 9411–9457.</mixed-citation>
                    </ref>
                                    <ref id="ref11">
                        <label>11</label>
                        <mixed-citation publication-type="journal">Debnath, S., Roy, P. 2021. Appearance and shape-based hybrid visual feature extraction: toward audio–visual automatic speech recognition. Signal, Image, and Video Processing, 15(1), 25–32.</mixed-citation>
                    </ref>
                                    <ref id="ref12">
                        <label>12</label>
                        <mixed-citation publication-type="journal">Vashisht, V., Pandey, A. K., Yadav, S. P. 2021. Speech recognition using machine learning. IEIE Transactions on Smart Processing and Computation, 10(3), 233–239.</mixed-citation>
                    </ref>
                                    <ref id="ref13">
                        <label>13</label>
                        <mixed-citation publication-type="confproc">Madhu, G., Bukka, A. 2023. Ensemble learning model for gender recognition using the human voice. In Proc. 2023 1st International Conference on Advanced Electronics, Electronics, Computer Intelligence (ICAEECI), October 2023, 1–5.</mixed-citation>
                    </ref>
                                    <ref id="ref14">
                        <label>14</label>
                        <mixed-citation publication-type="journal">Alsobhani, A., ALabboodi, H. M., Mahdi, H. 2021. Speech recognition using convolution deep neural networks. Journal of Physics: Conference Series, 1973(1), 012166, August 2021.</mixed-citation>
                    </ref>
                                    <ref id="ref15">
                        <label>15</label>
                        <mixed-citation publication-type="journal">Alharbi, S. et al. 2021. Automatic speech recognition: systematic literature review. IEEE Access, 9, 131858–131876.</mixed-citation>
                    </ref>
                                    <ref id="ref16">
                        <label>16</label>
                        <mixed-citation publication-type="journal">Bakır, H., Çayır, A. N., Navruz, T. S. 2024. A comprehensive experimental study for analyzing the effects of data augmentation techniques on voice classification. Multimedia Tools and Applications, 83(6), 17601–17628.</mixed-citation>
                    </ref>
                                    <ref id="ref17">
                        <label>17</label>
                        <mixed-citation publication-type="web">Kurtkaya, M. 2020. Turkish speech command dataset. https://www.kaggle.com/murat-kurtkaya/turkish-speech-command-dataset (Accessed: 15.07.2025).</mixed-citation>
                    </ref>
                                    <ref id="ref18">
                        <label>18</label>
                        <mixed-citation publication-type="thesis">İkizler, N. 2002. Türkçe’de Konuşmacıdan Bağımsız Hece Tanıma Sistemi, Karadeniz Teknik Üniversitesi, Fen Bilimleri Enstitüsü, Doktora Tezi, 143s, Trabzon.</mixed-citation>
                    </ref>
                                    <ref id="ref19">
                        <label>19</label>
                        <mixed-citation publication-type="confproc">İkizler, N., Çavdar, İ. H., Ekim, G. 2005. Türkçe’de konuşmacıdan bağımsız ayrık hece tanıma sistemi. In Proc. IEEE Signal Processing and Communications Applications Conference (SIU), vol. 1, Kayseri, Türkiye, May, 55–59.</mixed-citation>
                    </ref>
                                    <ref id="ref20">
                        <label>20</label>
                        <mixed-citation publication-type="confproc">Ekim, G., İkizler, N., Atasoy, A., Çavdar, İ. H. 2008. A speaker recognition system using cross correlation. In Proc. IEEE 16th Signal Processing and Communications Applications Conference (SIU), Aydın, Türkiye, April, 18–21.</mixed-citation>
                    </ref>
                                    <ref id="ref21">
                        <label>21</label>
                        <mixed-citation publication-type="journal">Liu, D., Xu, J., Zhang, P., Yan, Y. 2021. A unified system for multilingual speech recognition and language identification. Speech Communication, 127, 17–28.</mixed-citation>
                    </ref>
                                    <ref id="ref22">
                        <label>22</label>
                        <mixed-citation publication-type="journal">Iqbal, Y., Zhang, T., Gunawan, T. S., Pratondo, A., Zhao, X., Geng, Y., et al. 2025. A hybrid speech enhancement technique based on discrete wavelet transform and spectral subtraction. IEEE Access.</mixed-citation>
                    </ref>
                                    <ref id="ref23">
                        <label>23</label>
                        <mixed-citation publication-type="journal">Andriyanov, N. 2023. The use of correlation features in the problem of speech recognition. Algorithms, 16(2), 90.</mixed-citation>
                    </ref>
                                    <ref id="ref24">
                        <label>24</label>
                        <mixed-citation publication-type="journal">Ikizler, N., Ekim, G. 2025. Investigating the effects of Gaussian noise on epileptic seizure detection: The role of spectral flatness, bandwidth, and entropy. Engineering Science and Technology, International Journal, 64, 102005.</mixed-citation>
                    </ref>
                                    <ref id="ref25">
                        <label>25</label>
                        <mixed-citation publication-type="journal">Mohine, S., Gupta, P., Bansod, B. S., Bhalla, R., Basra, A. 2022. Evaluation of acoustic modality features for moving vehicle identification. Multidimensional Systems and Signal Processing, 33(4), 1349–1365.</mixed-citation>
                    </ref>
                                    <ref id="ref26">
                        <label>26</label>
                        <mixed-citation publication-type="confproc">Nawas, K. K., Barik, M. K., Khan, A. N. 2021. Speaker recognition using random forest. In Proc. ITM Web Conference, vol. 37, 01022.</mixed-citation>
                    </ref>
                                    <ref id="ref27">
                        <label>27</label>
                        <mixed-citation publication-type="journal">Singh, M. K. 2024. Feature extraction and classification efficiency analysis using machine learning approach for speech signal. Multimedia Tools and Applications, 83(16), 47069–47084.</mixed-citation>
                    </ref>
                                    <ref id="ref28">
                        <label>28</label>
                        <mixed-citation publication-type="journal">Goh, M., Yann, X. L. 2021. A novel sentiments analysis model using perceptron classifier. International Journal of Electronics Engineering Applications, 9(4), 1–10.</mixed-citation>
                    </ref>
                                    <ref id="ref29">
                        <label>29</label>
                        <mixed-citation publication-type="journal">Ferrer, L. 2022. Analysis and comparison of classification metrics. arXiv preprint, arXiv:2209.05355.</mixed-citation>
                    </ref>
                                    <ref id="ref30">
                        <label>30</label>
                        <mixed-citation publication-type="journal">Omuya, E. O., Okeyo, G. O., Kimwele, M. W. 2021. Feature selection for classification using principal component analysis and information gain. Expert Systems with Applications, 174, 114765.</mixed-citation>
                    </ref>
                                    <ref id="ref31">
                        <label>31</label>
                        <mixed-citation publication-type="confproc">Unluturk, A. 2023. Speech command based intelligent control of multiple home devices for physically handicapped. In Proc. 2023 5th Global Power, Energy and Communication Conference (GPECOM), June, 560–564, IEEE.</mixed-citation>
                    </ref>
                                    <ref id="ref32">
                        <label>32</label>
                        <mixed-citation publication-type="journal">Karakaş, B. 2025. Türkçe Sesli Komut Verilerinin Evrişimsel Sinir Ağı ile Sınıflandırılması. Mühendislik Bilimleri ve Araştırmaları Dergisi, 7(1), 51–59.</mixed-citation>
                    </ref>
                                    <ref id="ref33">
                        <label>33</label>
                        <mixed-citation publication-type="journal">Uslu, İ. B., Tora, H., Sümer, E., Türker, M. 2020. Yalıtık Sözcüklü bir Türkçe Konuşma Tanıma Sisteminin Yapay Veri Artırımı ile Tasarımı ve Gerçekleştirilimi. Afyon Kocatepe University Journal of Science and Engineering, 20(6), 1147–1155.</mixed-citation>
                    </ref>
                                    <ref id="ref34">
                        <label>34</label>
                        <mixed-citation publication-type="thesis">Işık, G. 2019. Türkçe Ağızların Tanınmasında Derin Öğrenme Tekniğinin Kullanılması. Hacettepe Üniversitesi Fen Bilimleri Enstitüsü, Doktora Tezi, 100s, Ankara.</mixed-citation>
                    </ref>
                            </ref-list>
                    </back>
    </article>
