Year 2019,
Volume: 2 Issue: 2, 28 - 44, 30.12.2019
Abdelmalek Bouguettaya, Ahmed Kechida, Amine Mohammed Taberkit
A Survey on Lightweight CNN-Based Object Detection Algorithms for Platforms with Limited Computational Resources
Abstract
Autonomous drones must detect one or more objects of interest in a complex environment with high accuracy and speed in order to fly safely. Most existing object detection techniques based on traditional machine learning algorithms cannot offer acceptable performance in complicated environments. Deep Convolutional Neural Networks (CNNs) provide this ability with high performance, and deep CNN-based object detection algorithms are increasingly used in Artificial Intelligence (AI) applications. However, it is still very difficult to deploy large CNN architectures on small devices with limited hardware resources, because they consist of millions of parameters, which makes them computationally very demanding. Lightweight CNN architectures have been proposed as a solution to make the deployment of deep neural networks on small devices feasible. This paper reviews recently used lightweight CNN architectures that can be implemented on embedded targets to improve object detection performance for systems based on small devices, such as drones. We need to select fast and lightweight CNN models for use on drone platforms. The purpose of this review is to choose the most accurate and fastest algorithm to implement on our drones.