Regional Dynamic Sign Language Recognition using Multimodal Conformer Architecture
Abstract
This paper proposes a multimodal Conformer architecture for dynamic sign language recognition in Kannada Sign Language (KSL). The model combines visual features extracted by EfficientNet-B0 with 3D hand keypoints obtained from MediaPipe. It also introduces an Adaptive Confidence Correction (ACC) strategy, supported by k-NN classification, which is applied when the confidence scores for a sign's hand keypoints are low and inconsistent. The dataset comprises 1,180 video samples covering 11 dynamic signs. Experiments show that the proposed method achieves a state-of-the-art accuracy of 98.63% with a low inference time of 58 ms, outperforming baselines including CNN-LSTM, Transformer, I3D, and SlowFast. Cross-validation tests and statistical analyses further support the robustness of the approach. This work makes the following key contributions: (1) a newly collected in-house Kannada Sign Language (KSL) dataset that addresses the data scarcity of underrepresented regional sign languages; (2) the first adaptation of the Conformer architecture for sign language recognition, validated through ablation and cross-validation studies; and (3) a deployable, low-latency framework designed for mobile-edge integration and privacy-aware deployment. The dataset will be released upon acceptance, subject to a valid research request. By addressing accessibility challenges in human-computer interaction and offering a reproducible benchmark for regional sign-language technologies, this work supports future advances in cross-lingual generalization and low-resource optimization.
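The abstract does not specify how ACC is implemented, so the following is only a minimal sketch of one way a confidence-gated k-NN fallback could work. Everything in it is an assumption made for illustration: the function names (`knn_predict`, `acc_predict`), the thresholds (`min_mean`, `max_std`), the use of mean and standard deviation of per-frame keypoint confidences as the "low and inconsistent" test, the Euclidean distance, and the toy sign labels. It is not the authors' method.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def knn_predict(query, train_vectors, train_labels, k=3):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: math.dist(query, pair[0]),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def acc_predict(model_label, frame_confidences, query,
                train_vectors, train_labels,
                min_mean=0.6, max_std=0.2, k=3):
    """Hypothetical gate: keep the main model's label unless the keypoint
    confidences are both low (mean below min_mean) and inconsistent
    (population std above max_std); in that case use the k-NN vote."""
    low = mean(frame_confidences) < min_mean
    inconsistent = pstdev(frame_confidences) > max_std
    if low and inconsistent:
        return knn_predict(query, train_vectors, train_labels, k)
    return model_label


# Toy data: 2-D stand-ins for keypoint feature vectors, with made-up labels.
train_vectors = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0)]
train_labels = ["hello", "hello", "thanks", "thanks"]
query = (0.05, 0.0)

# Stable, high confidences: the main model's label is kept.
kept = acc_predict("thanks", [0.90, 0.95, 0.92], query,
                   train_vectors, train_labels)

# Low and erratic confidences: the k-NN vote replaces the model's label.
corrected = acc_predict("thanks", [0.1, 0.9, 0.2, 0.8], query,
                        train_vectors, train_labels)
```

In this sketch the fallback only fires when both conditions hold, mirroring the abstract's wording "low and inconsistent"; a real system might combine them differently or tune the thresholds on validation data.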
Keywords
- Computer vision and pattern recognition
- Sign language recognition (SLR)
- Conformer architecture
- Real-time embedded systems
- Human-computer interaction
Ethical Statement
Details
- Primary Language: English
- Subjects: Computer Software
- Journal Section: Research Article
- Early Pub Date: May 23, 2026
- Publication Date: June 17, 2026
- Submission Date: July 21, 2025
- Acceptance Date: December 6, 2025
- Published in Issue: Year 2026, Volume 9, Number 2
