This study presents syllable-level sound-text synchronization that takes the grammatical structure of Turkish into account, so that speech and text can be highlighted simultaneously. To this end, the vowels and consonants within Turkish words were classified. Two different Artificial Neural Network (ANN) models were used for the classification, and the Mel-Frequency Cepstrum Coefficients (MFCC) method was used to extract features from the voice data. The ANNs were observed to give their best results with deep learning. Feature extraction was tested with different numbers of coefficients. In the first stage, a set of recordings of Turkish vowels and consonants was collected. Their features were then extracted and prepared for network training. The best network structure and parameters were selected based on training and testing with different parameter settings. In this training, the networks were asked to distinguish vowels from consonants. Afterward, vowel-consonant discrimination was performed on 10 predetermined vectors of words and phrases. In deep learning training carried out in MathWorks MATLAB, the Layer-Recurrent Neural Network and the Pattern Recognition Network achieved average success rates of 97.43% and 98.04%, respectively. Because the Pattern Recognition Network achieved 98.82% success in recognizing vowels and 97.27% in recognizing consonants, it was chosen for vowel-consonant classification. After classification, timing files were created by determining the transition times of the vowels in each word. In the last step, an interface was built on the C# .NET platform for the synchronization process, and a syllabification algorithm was developed in this interface to highlight the text syllable by syllable in sync with the audio. In this way, the desired high precision was achieved in the simultaneous highlighting of words.
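The abstract does not specify the syllabification algorithm, so the following is only a minimal sketch of how vowel transition times might be paired with syllables. It assumes the common rule that every Turkish syllable contains exactly one vowel and that a single consonant directly before a vowel opens that vowel's syllable. The names `VOWELS`, `syllabify`, and `align_syllables` are hypothetical, the timestamps in the usage note are invented, and Python is used purely for illustration. The interface described above was written in C#. Loanwords with irregular hyphenation are not handled.

```python
# Turkish vowels in both cases (listed explicitly to avoid locale issues with I/İ).
VOWELS = set("aeıioöuüAEIİOÖUÜ")


def syllabify(word):
    """Split a word into syllables by scanning from the end (assumed rule set)."""
    syllables = []
    end = len(word)          # exclusive end of the syllable being built
    i = len(word) - 1
    while i >= 0:
        if word[i] in VOWELS:
            # A consonant directly before the vowel starts this syllable.
            start = i - 1 if i > 0 and word[i - 1] not in VOWELS else i
            syllables.append(word[start:end])
            end = start
            i = start - 1
        else:
            i -= 1
    syllables.reverse()
    if end > 0:
        # Leading consonants with no vowel of their own join the first syllable.
        leftover = word[:end]
        if syllables:
            syllables[0] = leftover + syllables[0]
        else:
            syllables = [leftover]
    return syllables


def align_syllables(word, vowel_times):
    """Pair each syllable with the onset time (seconds) of its single vowel."""
    syllables = syllabify(word)
    if len(syllables) != len(vowel_times):
        raise ValueError("expected one vowel time per syllable")
    return list(zip(syllables, vowel_times))
```

For example, `align_syllables("merhaba", [0.05, 0.31, 0.58])` pairs `mer`, `ha`, and `ba` with the three given times, which a highlighting interface could then use to decide when each syllable is emphasized.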
Keywords: Artificial Neural Networks, Deep Learning, Mel-Frequency Cepstrum Coefficients, Sound-Text Synchronization
Primary Language | English |
---|---|
Subjects | Computer Software, Engineering |
Journal Section | Bilgisayar Mühendisliği / Computer Engineering |
Authors | |
Publication Date | March 1, 2022 |
Submission Date | September 17, 2021 |
Acceptance Date | November 18, 2021 |
Published in Issue | Year 2022 Volume: 12 Issue: 1 |