Research Article

Deep Learning-Based Object Detection with Mobile Application and Expression Generation Using a Large Language Model

Number: Advanced Online Publication
Early Pub Date: April 8, 2026

Abstract

This work presents an integrated mobile solution that lets users detect objects in their environment, measure the distance to them, and understand the spatial relationships between them. The system combines YOLOv11-based real-time object detection, LiDAR-assisted distance measurement, and GPT-4o expression generation, so that users can locate a desired object and learn about the objects near it. Users thus learn not only that an object is present but also where it is and how it relates spatially to other objects. Images are captured by the mobile application during object detection, which ensures that the object is always within the frame. This prevents problems such as blurring and incorrect framing, which frequently occur in photos taken by visually impaired users. Experimental results show that the YOLOv11 model performs effectively, with an F1 score of 0.77 and a mAP of 0.806. Furthermore, the fine-tuned GPT-4o model identifies object locations in images and generates expressions that mention surrounding objects. By integrating object detection, LiDAR-based distance measurement, and expression generation with a large language model, the system provides a reference for implementing more advanced solutions in the future.
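
As a rough illustration of how detections and distance readings might be turned into a spatial description, the following Python sketch composes a sentence from a list of detected objects. It is not the authors' implementation: the `Detection` structure, the three-zone split of the frame, and the sentence templates are all assumptions made for this example, and in the described system the final wording would come from the fine-tuned GPT-4o model rather than from fixed templates.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object (hypothetical structure, not taken from the paper)."""
    label: str
    x_center: float    # horizontal box centre, normalised to [0, 1]
    distance_m: float  # distance reading for the object, in metres


def horizontal_zone(det: Detection) -> str:
    """Map the box centre to a coarse left / middle / right zone (assumed split)."""
    if det.x_center < 1 / 3:
        return "left"
    if det.x_center > 2 / 3:
        return "right"
    return "middle"


def relation(target: Detection, other: Detection) -> str:
    """Describe where `other` sits relative to `target`."""
    side = "left" if other.x_center < target.x_center else "right"
    depth = "closer to you" if other.distance_m < target.distance_m else "farther from you"
    return f"The {other.label} is to the {side} of the {target.label} and {depth}."


def describe(target_label: str, detections: list[Detection]) -> str:
    """Build a short description of the target and its neighbouring objects."""
    target = next((d for d in detections if d.label == target_label), None)
    if target is None:
        return f"No {target_label} was detected."
    sentences = [
        f"The {target.label} is {target.distance_m:.1f} m away, "
        f"in the {horizontal_zone(target)} part of the frame."
    ]
    for other in detections:
        if other is not target:
            sentences.append(relation(target, other))
    return " ".join(sentences)


if __name__ == "__main__":
    # Made-up example values, not results from the study.
    detections = [Detection("table", 0.5, 2.0), Detection("chair", 0.2, 1.5)]
    print(describe("table", detections))
```

A structured summary of this kind could also be passed to a language model as context, so that the generated expression stays consistent with the measured positions and distances.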

Details

Primary Language

English

Subjects

Computer Vision, Natural Language Processing

Journal Section

Research Article

Early Pub Date

April 8, 2026

Publication Date

-

Submission Date

November 21, 2025

Acceptance Date

December 16, 2025

Published in Issue

Year 2026 Number: Advanced Online Publication

APA
Dere, N., Yıldız, K., & Demir, Ö. (2026). Deep Learning-Based Object Detection with Mobile Application and Expression Generation Using a Large Language Model. Journal of Naval Sciences and Engineering, Advanced Online Publication, 69-93. https://doi.org/10.56850/jnse.1828189
AMA
1. Dere N, Yıldız K, Demir Ö. Deep Learning-Based Object Detection with Mobile Application and Expression Generation Using a Large Language Model. JNSE. 2026;(Advanced Online Publication):69-93. doi:10.56850/jnse.1828189
Chicago
Dere, Nurcihan, Kazım Yıldız, and Önder Demir. 2026. “Deep Learning-Based Object Detection With Mobile Application and Expression Generation Using a Large Language Model”. Journal of Naval Sciences and Engineering, no. Advanced Online Publication: 69-93. https://doi.org/10.56850/jnse.1828189.
EndNote
Dere N, Yıldız K, Demir Ö (April 1, 2026) Deep Learning-Based Object Detection with Mobile Application and Expression Generation Using a Large Language Model. Journal of Naval Sciences and Engineering Advanced Online Publication 69–93.
IEEE
[1] N. Dere, K. Yıldız, and Ö. Demir, “Deep Learning-Based Object Detection with Mobile Application and Expression Generation Using a Large Language Model”, JNSE, no. Advanced Online Publication, pp. 69–93, Apr. 2026, doi: 10.56850/jnse.1828189.
ISNAD
Dere, Nurcihan - Yıldız, Kazım - Demir, Önder. “Deep Learning-Based Object Detection With Mobile Application and Expression Generation Using a Large Language Model”. Journal of Naval Sciences and Engineering. Advanced Online Publication (April 1, 2026): 69-93. https://doi.org/10.56850/jnse.1828189.
JAMA
1. Dere N, Yıldız K, Demir Ö. Deep Learning-Based Object Detection with Mobile Application and Expression Generation Using a Large Language Model. JNSE. 2026;:69–93.
MLA
Dere, Nurcihan, et al. “Deep Learning-Based Object Detection With Mobile Application and Expression Generation Using a Large Language Model”. Journal of Naval Sciences and Engineering, no. Advanced Online Publication, Apr. 2026, pp. 69-93, doi:10.56850/jnse.1828189.
Vancouver
1. Nurcihan Dere, Kazım Yıldız, Önder Demir. Deep Learning-Based Object Detection with Mobile Application and Expression Generation Using a Large Language Model. JNSE. 2026 Apr. 1;(Advanced Online Publication):69-93. doi:10.56850/jnse.1828189