Research Article

Moving Object Detection in Video with Algorithms YOLO and Faster R-CNN in Different Conditions

Year 2022, Issue 33, 40 - 54, 31.01.2022
https://doi.org/10.31590/ejosat.1013049

Abstract

In this study, YOLOv4 was compared with YOLOv3 and Faster R-CNN for object detection both under challenging weather conditions and in the dark. Moving objects such as pedestrians, cars, buses, and motorcycles are difficult to detect at night or in adverse weather such as rain, fog, and snow. The objective of this study is to assess the performance of these three algorithms in such circumstances, as none of them was designed to work in bad weather or at night. Tesla P4 GPUs with 12 GB of RAM were used, and the algorithms were trained on Open Images datasets. YOLOv4 achieved the highest performance, peaking at 40,000 iterations with 72% mAP and 63% recall. YOLOv3 peaked at 36,000 iterations with 65.53% mAP and 54% recall, while Faster R-CNN peaked at 36,000 iterations with 51% mAP and 49% recall. According to these results, YOLOv4 outperformed both YOLOv3 and Faster R-CNN.
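For readers unfamiliar with the metrics reported above, the following is a minimal sketch of how precision and recall are commonly computed for a single image and class from bounding-box overlap. It is not the authors' evaluation code. The function names, the box layout `(x_min, y_min, x_max, y_max)`, the greedy matching rule, and the IoU threshold of 0.5 are all assumptions made for illustration. mAP additionally averages precision over recall levels and classes, which is omitted here.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(detections: List[Tuple[Box, float]],
                     ground_truth: List[Box],
                     iou_threshold: float = 0.5) -> Tuple[float, float]:
    """Greedily match detections (highest confidence first) to unmatched
    ground-truth boxes; a match with IoU >= iou_threshold is a true positive."""
    matched = set()
    true_positives = 0
    for box, _score in sorted(detections, key=lambda d: d[1], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue  # each ground-truth box may be matched only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx >= 0 and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(detections) if detections else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical example: two labelled objects, one correct and one spurious detection.
truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
found = [((1, 1, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)]
example_precision, example_recall = precision_recall(found, truth)
```

In this hypothetical example one of the two labelled objects is found and one of the two detections is correct, so both precision and recall are one half.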

References

  • J. Redmon and A. Farhadi, (2017, July), "YOLO9000: Better, Faster, Stronger." 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), [online], pages 6517-6525.
  • K. Simonyan, & A. Zisserman, (2015, April), "Very Deep Convolutional Networks for Large-Scale Image Recognition." Computer Vision and Pattern Recognition (cs.CV) ArXiv, [online], pages 1409.1556.
  • R. Girshick, J. Donahue, T. Darrell and J. Malik, (2014, June), "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation." 2014 IEEE Conference on Computer Vision and Pattern Recognition, [online], pages 580-587.
  • R. Girshick, (2015, December), "Fast R-CNN." 2015 IEEE International Conference on Computer Vision (ICCV), [online], pages 1440-1448.
  • S. Ren, K. He, R. Girshick and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks." IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, pp. 1137-1149.
  • G. Huang, Z. Liu, L. Van Der Maaten and K. Weinberger, (2017, July), "Densely Connected Convolutional Networks." 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), [online], pages 2261-2269.
  • J. Redmon, S. Divvala, R. Girshick and A. Farhadi, (2016, June), "You Only Look Once: Unified, Real-Time Object Detection." 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), [online], pages 779-788.
  • J. Redmon, A. Farhadi, (2018, April), "YOLOv3: An Incremental Improvement." Computer Vision and Pattern Recognition (cs.CV) ArXiv, [online], pages 1804.02767.
  • C. Ning, H. Zhou, Y. Song and J. Tang, (2017, July), "Inception Single Shot MultiBox Detector for object detection." 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), [online], pages 549-554.
  • A. Ćorović, V. Ilić, S. Ðurić, M. Marijan and B. Pavković, (2018, November), "The Real-Time Detection of Traffic Participants Using YOLO Algorithm." 2018 26th Telecommunications Forum (TELFOR), [online], pages 1-4.
  • K. He, X. Zhang, S. Ren and J. Sun, "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition." in IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, pp. 1904 - 1916.
  • A. Bochkovskiy, C. Wang, and H. Liao, (2020, April), "YOLOv4: Optimal Speed and Accuracy of Object Detection." Computer Science - Computer Vision and Pattern Recognition; Electrical Engineering and Systems Science - Image and Video Processing ArXiv, [online].
  • Z. Zheng, P. Wang, W. Liu, J. Li, R. Ye, and D. Ren, (2019, November), "Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression." Computer Vision and Pattern Recognition (cs.CV) ArXiv, [online], pages 1911.08287.
  • C. Wang, H. Mark Liao, Y. Wu, P. Chen, J. Hsieh and I. Yeh, (2020, June), "CSPNet: A New Backbone that can Enhance Learning Capability of CNN." 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), [online], pages 1571-1580.
  • J. Hosang, R. Benenson and B. Schiele, (2017, July), "Learning Non-maximum Suppression." 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), [online], pages 6469-6477.
  • I. Krasin, T. Duerig, N. Alldrin, V. Ferrari, S. Abu-El-Haija, A. Kuznetsova, H. Rom, J. Uijlings, S. Popov, S. Kamali, M. Malloci, J. Pont-Tuset, A. Veit, S. Belongie, V. Gomes, A. Gupta, C. Sun, G. Chechik, D. Cai, Z. Feng, D. Narayanan, K. Murphy, (2017), "OpenImages: A public dataset for large-scale multi-label and multi-class image classification."
  • Vittorio, Angelo, "OIDv4_ToolKit, Toolkit to download and visualize single or multiple classes from the huge Open Images v4 dataset." Internet: github.com/EscVM/OIDv4_ToolKit, Jun 25, 2019, 2018.
  • J. Redmon, "Darknet: Open Source Neural Networks in C." Internet: http://pjreddie.com/darknet/, 2013, [2016].

Farklı Koşullarda YOLO ve Faster R-CNN Algoritmaları ile Videoda Hareketli Nesne Tespiti


Abstract

In this study, YOLOv4 was compared with YOLOv3 and Faster R-CNN for object detection both in challenging weather conditions and in the dark. Moving objects such as pedestrians, cars, buses, and motorcycles are difficult to detect in bad weather or at night, especially in adverse weather such as rain, fog, and snow. The aim of this study is to evaluate the performance of these three algorithms in such situations, since none of them was designed to work in bad weather or at night. Tesla P4 GPUs with 12 GB of RAM were used for this study, and the algorithms were trained on open-image datasets; YOLOv4 achieved the highest performance, with 72% mAP and 63% recall at 40,000 iterations. YOLOv3 peaked at 36,000 iterations with 65.53% mAP and 54% recall, while Faster R-CNN peaked at 36,000 iterations with 51% mAP and 49% recall. According to the results, YOLOv4 performed best compared with YOLOv3 and Faster R-CNN.

There are 18 citations in total.

Details

Primary Language English
Subjects Engineering
Journal Section Articles
Authors

Abdulghani Mawlood A.ghani Abdulghani 0000-0002-5642-0245

Gonca Gökçe Menekşe Dalveren 0000-0002-8649-1909

Publication Date January 31, 2022
Published in Issue Year 2022

Cite

APA Abdulghani, A. M. A., & Menekşe Dalveren, G. G. (2022). Moving Object Detection in Video with Algorithms YOLO and Faster R-CNN in Different Conditions. Avrupa Bilim ve Teknoloji Dergisi, (33), 40-54. https://doi.org/10.31590/ejosat.1013049