A Systematic Evaluation of Photometric Data Augmentation Combinations in Medical Object Detection
Abstract
This study examines how different data augmentation strategies influence medical object detection performance on the Kvasir-SEG dataset, using Faster R-CNN X101-FPN and YOLOv7 as benchmark models. Augmentation is widely used to improve robustness in image classification, but systematic analyses in object detection remain limited because bounding-box integrity must be preserved through every geometric transformation. In this work, eight photometric augmentation techniques (Hue, Noise, Saturation, Grayscale, Blur, Brightness, Contrast, and Cutout) were applied independently and in multi-level combinations, covering single, double, triple, and full pipelines. Model performance was evaluated with mean Average Precision (mAP) under the COCO evaluation standard. The results show that, for polyp detection, color-based augmentations improve detection accuracy more than distortion-based augmentations, and that excessive augmentation depth slows model convergence and prevents accuracy gains. The study provides a structured analysis of augmentation depth and diversity on a medical object detection dataset and offers guidance for designing effective augmentation pipelines for this task.
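To give a sense of the size of the search space implied by the single, double, triple, and full pipeline levels, the following minimal sketch enumerates unordered combinations of the eight listed techniques. It rests on assumptions not stated in the abstract: that the order of techniques within a pipeline is ignored, that "full" means all eight techniques together, and that every combination at each depth is a candidate. The names `TECHNIQUES`, `DEPTHS`, and `pipelines_at_depth` are hypothetical and used only for illustration; the counts are an upper bound for an exhaustive search, not the number of experiments actually run.

```python
from itertools import combinations

# Technique names as listed in the abstract.
TECHNIQUES = ["Hue", "Noise", "Saturation", "Grayscale",
              "Blur", "Brightness", "Contrast", "Cutout"]

# Hypothetical depth labels; "full" is ASSUMED to mean all eight techniques.
DEPTHS = {"single": 1, "double": 2, "triple": 3, "full": len(TECHNIQUES)}


def pipelines_at_depth(depth):
    """Return every unordered combination of `depth` techniques."""
    return list(combinations(TECHNIQUES, depth))


# Count candidate pipelines per depth (order of application is ignored).
counts = {name: len(pipelines_at_depth(d)) for name, d in DEPTHS.items()}
print(counts)  # {'single': 8, 'double': 28, 'triple': 56, 'full': 1}
```

Under these assumptions an exhaustive sweep would cover 93 pipelines per model, which illustrates why the number of training runs grows quickly with combination depth.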
Keywords
Deep learning, Computer vision, Data augmentation, Medical object detection