Regional Dynamic Sign Language Recognition using Multimodal Conformer Architecture

Gurusiddappa Hugar; Ramesh M. Kagalkar

doi:10.35377/saucis...1747386

Abstract

This paper proposes a multimodal Conformer architecture for dynamic sign language recognition in Kannada Sign Language (KSL). The model combines visual features extracted by EfficientNet-B0 with 3D hand key points obtained from MediaPipe. The paper further proposes an Adaptive Confidence Correction (ACC) strategy, supported by k-NN classification, for cases where the confidence scores for the hand key points of a sign are low and inconsistent. The dataset comprises 1,180 video samples covering 11 dynamic signs. Experiments show that the proposed method achieves a state-of-the-art accuracy of 98.63% with a low inference time of 58 ms, outperforming baselines including CNN-LSTM, Transformer, I3D, and SlowFast. Cross-validation tests and statistical analyses further support the robustness of the method. This work makes the following key contributions: (1) a newly collected in-house Kannada Sign Language (KSL) dataset that addresses the data scarcity of underrepresented regional sign languages; (2) the first adaptation of the Conformer architecture for sign language recognition, validated through ablation and cross-validation studies; and (3) a deployable, low-latency framework designed for mobile-edge integration and privacy-aware deployment. The dataset will be released upon acceptance to researchers who submit a valid research request. By addressing accessibility challenges in human–computer interaction and offering a reproducible benchmark for regional sign-language technologies, this work supports future advances in cross-lingual generalization and low-resource optimization.
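The confidence-gated fallback mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' ACC implementation: the abstract does not give thresholds, feature definitions, or the value of k, so the names `MIN_MEAN_CONF`, `MAX_CONF_STD`, `knn_predict`, and `classify_with_fallback` are all hypothetical. It assumes the main model has already produced a label and that per-key-point confidence scores are available for the clip.

```python
from collections import Counter
from math import dist
from statistics import mean, pstdev

# Hypothetical thresholds; the abstract does not state the values used.
MIN_MEAN_CONF = 0.6   # below this mean, key points count as "low confidence"
MAX_CONF_STD = 0.2    # above this spread, key points count as "inconsistent"


def knn_predict(query, train_feats, train_labels, k=3):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    order = sorted(range(len(train_feats)),
                   key=lambda i: dist(query, train_feats[i]))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def classify_with_fallback(model_label, keypoint_confs, feats,
                           train_feats, train_labels, k=3):
    """Keep the model's label unless key-point confidence is low or inconsistent.

    Returns (label, source), where source is "model" or "knn".
    """
    low = mean(keypoint_confs) < MIN_MEAN_CONF
    inconsistent = pstdev(keypoint_confs) > MAX_CONF_STD
    if low or inconsistent:
        return knn_predict(feats, train_feats, train_labels, k), "knn"
    return model_label, "model"
```

In this sketch the k-NN path only replaces the model's output when the key-point scores fail either check, so clips with reliable hand tracking are unaffected by the fallback.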

Ethical Statement

The dataset comprised signers of diverse ages and genders to minimize potential bias. Written consent was obtained from the school for the use of their data in research.

Details

Primary Language

English

Subjects

Computer Software

Journal Section

Research Article

Authors

Gurusiddappa Hugar*
0000-0002-6350-6581
India

Ramesh M. Kagalkar
0000-0001-9985-1275
India

Early Pub Date

May 23, 2026

Publication Date

June 17, 2026

Submission Date

July 21, 2025

Acceptance Date

December 6, 2025

Published in Issue

Year 2026 Volume: 9 Number: 2

DOI

https://doi.org/10.35377/saucis...1747386

IZ

https://izlik.org/JA77BB63NP

Cite

APA

Hugar, G., & Kagalkar, R. M. (2026). Regional Dynamic Sign Language Recognition using Multimodal Conformer Architecture. Sakarya University Journal of Computer and Information Sciences, 9(2), 436-450. https://doi.org/10.35377/saucis...1747386

AMA

1. Hugar G, Kagalkar RM. Regional Dynamic Sign Language Recognition using Multimodal Conformer Architecture. SAUCIS. 2026;9(2):436-450. doi:10.35377/saucis.1747386

Chicago

Hugar, Gurusiddappa, and Ramesh M. Kagalkar. 2026. “Regional Dynamic Sign Language Recognition Using Multimodal Conformer Architecture”. Sakarya University Journal of Computer and Information Sciences 9 (2): 436-50. https://doi.org/10.35377/saucis.1747386.

EndNote

Hugar G, Kagalkar RM (June 1, 2026) Regional Dynamic Sign Language Recognition using Multimodal Conformer Architecture. Sakarya University Journal of Computer and Information Sciences 9 2 436–450.

IEEE

[1] G. Hugar and R. M. Kagalkar, “Regional Dynamic Sign Language Recognition using Multimodal Conformer Architecture”, SAUCIS, vol. 9, no. 2, pp. 436–450, June 2026, doi: 10.35377/saucis...1747386.

ISNAD

Hugar, Gurusiddappa - Kagalkar, Ramesh M. “Regional Dynamic Sign Language Recognition Using Multimodal Conformer Architecture”. Sakarya University Journal of Computer and Information Sciences 9/2 (June 1, 2026): 436-450. https://doi.org/10.35377/saucis.1747386.

JAMA

1. Hugar G, Kagalkar RM. Regional Dynamic Sign Language Recognition using Multimodal Conformer Architecture. SAUCIS. 2026;9:436–450.

MLA

Hugar, Gurusiddappa, and Ramesh M. Kagalkar. “Regional Dynamic Sign Language Recognition Using Multimodal Conformer Architecture”. Sakarya University Journal of Computer and Information Sciences, vol. 9, no. 2, June 2026, pp. 436-50, doi:10.35377/saucis.1747386.

Vancouver

1. Gurusiddappa Hugar, Ramesh M. Kagalkar. Regional Dynamic Sign Language Recognition using Multimodal Conformer Architecture. SAUCIS. 2026 Jun. 1;9(2):436-50. doi:10.35377/saucis.1747386