Research Article

Real-Time Multi-Object Recognition Using the Fusion of LIDAR and Camera Data

Year 2023, Volume: 2 Issue: 2, 1 - 19, 29.12.2023

Abstract

Object recognition is one of today's most significant research topics, and its importance, given its wide range of applications, is expected to grow steadily. In this study, real-time object recognition was performed within a user-definable area using data captured simultaneously from the built-in camera and LIDAR sensor of an iPhone 13 Pro Max. The study was carried out using the MS-COCO dataset, the Swift language, and SwiftUI as the framework. The YOLO V5 algorithm was used for object recognition, and video processing was performed with Metal in Swift: on each frame, the image area was narrowed according to the minimum-maximum distance set in the interface, based on the real-time fused camera and LIDAR data. Regions outside the contours of objects within the selected distance range were darkened, and object recognition was then performed on each darkened frame. In the study, object recognition was performed over a range of 0-15 m, adjustable in the interface.
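
As a rough illustration of the per-frame depth gating described above, the sketch below builds a boolean mask from an ARKit LiDAR depth map; pixels outside the user's minimum-maximum range would then be darkened before detection. This is a simplified CPU sketch under stated assumptions, not the authors' implementation: the study performs this step in a Metal compute kernel for real-time speed, the function name `depthMask` and its parameters are illustrative, and the registration of the low-resolution depth map to the full camera frame is omitted.

```swift
import ARKit
import CoreVideo

// Simplified CPU sketch of the depth gating described in the abstract.
// The study implements this per-frame test in a Metal kernel; here a
// plain Swift loop over the LiDAR depth map illustrates the same idea.
// minDepth/maxDepth correspond to the interface's 0-15 m range.
func depthMask(from depthData: ARDepthData,
               minDepth: Float, maxDepth: Float) -> [Bool] {
    let map = depthData.depthMap              // Float32 depth in metres
    CVPixelBufferLockBaseAddress(map, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(map, .readOnly) }

    let width  = CVPixelBufferGetWidth(map)
    let height = CVPixelBufferGetHeight(map)
    let rowStride = CVPixelBufferGetBytesPerRow(map) / MemoryLayout<Float32>.size
    let base = CVPixelBufferGetBaseAddress(map)!
        .assumingMemoryBound(to: Float32.self)

    // true  = pixel lies inside the user-selected range and is kept;
    // false = pixel lies outside the range and would be darkened
    //         before the frame is passed to YOLO V5.
    var mask = [Bool](repeating: false, count: width * height)
    for y in 0..<height {
        for x in 0..<width {
            let d = base[y * rowStride + x]
            mask[y * width + x] = d >= minDepth && d <= maxDepth
        }
    }
    return mask
}
```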

References

  • [1] R. Solovyev, W. Wang and T. Gabruseva, “Weighted boxes fusion: Ensembling boxes from different object detection models,” Image and Vision Computing, vol. 107, 104117, pp. 1-6, 2021. doi: 10.1016/j.imavis.2021.104117
  • [2] S. Qi, X. Ning, G. Yang, L. Zhang, P. Long, W. Cai and W. Li, “Review of multi-view 3D object recognition methods based on deep learning,” Displays, vol. 69, 102053, pp. 1-12, 2021. doi: 10.1016/j.displa.2021.102053
  • [3] Z. Zou, K. Chen, Z. Shi, Y. Guo and J. Ye, "Object Detection in 20 Years: A Survey," in Proceedings of the IEEE, vol. 111, no. 3, pp. 257-276, 2023. doi: 10.1109/JPROC.2023.3238524.
  • [4] M. Nikhitha, S. Roopa Sri and B. Uma Maheswari, "Fruit Recognition and Grade of Disease Detection using Inception V3 Model," 2019 3rd International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, pp. 1040-1043, 2019. doi: 10.1109/ICECA.2019.8822095.
  • [5] M. Sogabe, N. Ito, T. Miyazaki, T. Kawase, T. Kanno and K. Kawashima, “Detection of Instruments Inserted into Eye in Cataract Surgery Using Single-shot Multibox Detector,” Sensors & Materials, vol. 34, no. 1, pp. 47–54, 2022. doi: 10.18494/SAM3762
  • [6] B. Janakiramaiah, G. Kalyani, A. Karuna et al., “Military object detection in defense using multi-level capsule networks,” Soft Computing, vol. 27, pp. 1045–1059, 2023. doi: 10.1007/s00500-021-05912-0
  • [7] M. Rezaei, M. Azarmi and F. M. P. Mir, “3D-Net: Monocular 3D object recognition for traffic monitoring,” Expert Systems with Applications, vol. 227, 120253, pp. 1-17, 2023. doi: 10.1016/j.eswa.2023.120253
  • [8] S. Kottner, M. J. Thali and D. Gascho, “Using the iPhone's LiDAR technology to capture 3D forensic data at crime and crash scenes,” Forensic Imaging, vol. 32, 200535, pp. 1-7, 2023. doi: 10.1016/j.fri.2023.200535
  • [9] Y. Okochi, H. Rizk, T. Amano and H. Yamaguchi, "Object Recognition from 3D Point Cloud on Resource-Constrained Edge Device," 2022 18th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), Thessaloniki, Greece, pp. 369-374, 2022. doi: 10.1109/WiMob55322.2022.9941552.
  • [10] A. Guyot, M. Lennon, T. Lorho and L. Hubert-Moy, “Combined detection and segmentation of archeological structures from LiDAR data using a deep learning approach,” Journal of Computer Applications in Archaeology, vol. 4, no. 1, pp. 1-19, 2021. doi: 10.5334/jcaa.64
  • [11] S. Tatsumi, K. Yamaguchi and N. Furuya, “ForestScanner: A mobile application for measuring and mapping trees with LiDAR-equipped iPhone and iPad,” Methods in Ecology and Evolution, vol. 14, no. 7, pp. 1603-1609, 2023. doi: 10.1111/2041-210X.13900
  • [12] V. Partel, L. Costa and Y. Ampatzidis, “Smart tree crop sprayer utilizing sensor fusion and artificial intelligence,” Computers and Electronics in Agriculture, vol. 191, 106556, pp. 1-13, 2021. doi: 10.1016/j.compag.2021.106556
  • [13] Y. Ji, S. Li, C. Peng, H. Xu, R. Cao and M. Zhang, “Obstacle detection and recognition in farmland based on fusion point cloud data,” Computers and Electronics in Agriculture, vol. 189, 106409, pp. 1-6, 2021. doi: 10.1016/j.compag.2021.106409
  • [14] X. Zuo, P. Geneva, W. Lee, Y. Liu and G. Huang, "LIC-Fusion: LiDAR-Inertial-Camera Odometry," 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, pp. 5848-5854, 2019. doi: 10.1109/IROS40897.2019.8967746
  • [15] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001, Kauai, HI, USA, 2001. doi: 10.1109/CVPR.2001.990517
  • [16] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, USA, vol. 1, pp. 886-893, 2005. doi: 10.1109/CVPR.2005.177
  • [17] P. Felzenszwalb, D. McAllester and D. Ramanan, "A discriminatively trained, multiscale, deformable part model," 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, pp. 1-8, 2008. doi: 10.1109/CVPR.2008.4587597
  • [18] T. W. Wu, H. Zhang, W. Peng, F. Lü and P. J. He, “Applications of convolutional neural networks for intelligent waste identification and recycling: A review,” Resources, Conservation and Recycling, vol. 190, 106813, pp. 1-16, 2023. doi: 10.1016/j.resconrec.2022.106813
  • [19] M. M. Taye, “Theoretical understanding of convolutional neural network: concepts, architectures, applications, future directions,” Computation, vol. 11, no. 3, 52, pp. 1-23, 2023. doi: 10.3390/computation11030052
  • [20] A. F. Agarap, “Deep learning using rectified linear units (ReLU),” arXiv preprint arXiv:1803.08375, pp. 1-7, 2018. doi: 10.48550/arXiv.1803.08375
  • [21] J. Bharadiya, “Convolutional Neural Networks for Image Classification,” International Journal of Innovative Science and Research Technology, vol. 8, no. 5, pp. 673-677, 2023. doi: 10.5281/zenodo.8020781
  • [22] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, “You only look once: Unified, real-time object detection,” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779-788, 2016. doi: 10.1109/CVPR.2016.91.
  • [23] R. Girshick, F. Iandola, T. Darrell and J. Malik, “Deformable part models are convolutional neural networks,” In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp. 437-446, 2015. doi: 10.48550/arXiv.1409.5403.
  • [24] R. Girshick, J. Donahue, T. Darrell and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 580-587, 2014. doi: 10.1109/CVPR.2014.81.
  • [25] Y. H. Lee and Y. Kim, “Comparison of CNN and YOLO for Object Detection,” Journal of Semiconductor and Display Technology, vol. 19, no. 1, pp. 85-92, 2020.
  • [26] P. Jiang, D. Ergu, F. Liu, Y. Cai and B. Ma, “A Review of YOLO algorithm developments,” Procedia Computer Science, vol. 199, pp. 1066-1073, 2022. doi: 10.1016/j.procs.2022.01.135.
  • [27] X. Zhu, S. Lyu, X. Wang and Q. Zhao, “TPH-YOLOv5: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios,” In Proceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 2778-2788. doi: 10.1109/ICCVW54120.2021.00312
  • [28] L. Ting, Z. Baijun, Z. Yongsheng and Y. Shun, “Ship detection algorithm based on improved YOLO V5,” In 2021 6th International Conference on Automation, Control and Robotics Engineering (CACRE), IEEE, Dalian, China, pp. 483-487, 2021. doi: 10.1109/CACRE52464.2021.9501331.
  • [29] J. Kim, J. Kim and J. Cho, "An advanced object classification strategy using YOLO through camera and LiDAR sensor fusion," 2019 13th International Conference on Signal Processing and Communication Systems (ICSPCS), Gold Coast, QLD, Australia, December 16-18, 2019, pp. 1-5. doi: 10.1109/ICSPCS47537.2019.9008742
  • [30] J. Clayton, “I: Metal Basics,” Metal Programming Guide: Tutorial and Reference Via Swift, Addison-Wesley, USA 2017.
  • [31] B. Behroozpour, P. A. Sandborn, M. C. Wu and B. E. Boser, “Lidar system architectures and circuits,” IEEE Communications Magazine, vol. 55, no. 10, pp. 135-142, 2017. doi: 10.1109/MCOM.2017.1700030.
  • [32] Y. Li et al., "Deep Learning for LiDAR Point Clouds in Autonomous Driving: A Review," in IEEE Transactions on Neural Networks and Learning Systems, vol. 32, no. 8, pp. 3412-3432, Aug. 2021. doi: 10.1109/TNNLS.2020.3015992.
  • [33] S. Kottner, M. J. Thali and D. Gascho, “Using the iPhone's LiDAR technology to capture 3D forensic data at crime and crash scenes,” Forensic Imaging, vol. 32, 200535, pp. 1-7, 2023. doi: 10.1016/j.fri.2023.200535.

LIDAR ve Kamera Verilerinin Birleştirilmesiyle Gerçek Zamanlı Çoklu Nesne Tanıma

Year 2023, Volume: 2 Issue: 2, 1 - 19, 29.12.2023

Abstract

Object recognition is one of today's most significant research topics. Its importance is expected to grow steadily, given its extensive use in every field from agriculture to the defense and space industries. In this study, real-time object recognition was performed within a user-definable area using data captured simultaneously from the built-in camera and LIDAR sensor of an iPhone 13 Pro Max. The study was carried out using the MS-COCO dataset, the Swift language, and SwiftUI as the framework. The YOLO V5 algorithm was used for object recognition, and video processing was performed with Metal on the real-time data: on each frame obtained from the fused camera and LIDAR streams, the image area was narrowed according to the minimum-maximum distance set by the user in the interface. In each frame of the real-time video, the regions outside the contours of objects within the user-selected range were darkened, and object recognition was then performed on each darkened frame. In the study, object recognition with real-time LIDAR and camera data was performed over a range of 0-15 m, adjustable with a slider in the interface.
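
To complement the masking sketch above, the snippet below shows one plausible way to run a YOLO V5 Core ML export on a darkened frame using Apple's Vision framework. The paper does not describe how its model was invoked, so the generated model class `yolov5s`, the frame orientation, and the completion-based shape of `detectObjects` are assumptions for illustration only.

```swift
import Vision
import CoreML
import CoreVideo

// Hypothetical sketch: running a YOLO V5 Core ML export on a darkened
// frame via Vision. `yolov5s` stands in for whatever model class Xcode
// generates from the bundled .mlmodel; the paper does not name it.
func detectObjects(in maskedFrame: CVPixelBuffer,
                   completion: @escaping ([VNRecognizedObjectObservation]) -> Void) throws {
    let coreMLModel = try yolov5s(configuration: MLModelConfiguration()).model
    let vnModel = try VNCoreMLModel(for: coreMLModel)

    let request = VNCoreMLRequest(model: vnModel) { request, _ in
        // Each observation carries a bounding box plus MS-COCO class labels.
        completion(request.results as? [VNRecognizedObjectObservation] ?? [])
    }
    request.imageCropAndScaleOption = .scaleFill

    // Portrait capture on iPhone: the camera buffer arrives landscape,
    // so Vision is told the frame is rotated.
    let handler = VNImageRequestHandler(cvPixelBuffer: maskedFrame,
                                        orientation: .right)
    try handler.perform([request])
}
```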


Details

Primary Language English
Subjects Computer Vision, Video Processing, Deep Learning, Software Engineering (Other), Electronics
Journal Section Research Articles
Authors

Mert Can Yaman 0009-0008-5413-895X

Şerafettin Erel 0000-0002-2437-1127

Publication Date December 29, 2023
Published in Issue Year 2023 Volume: 2 Issue: 2

Cite

APA Yaman, M. C., & Erel, Ş. (2023). Real-Time Multi-Object Recognition Using the Fusion of LIDAR and Camera Data. Bozok Journal of Engineering and Architecture, 2(2), 1-19.
AMA Yaman MC, Erel Ş. Real-Time Multi-Object Recognition Using the Fusion of LIDAR and Camera Data. BJEA. December 2023;2(2):1-19.
Chicago Yaman, Mert Can, and Şerafettin Erel. “Real-Time Multi-Object Recognition Using the Fusion of LIDAR and Camera Data”. Bozok Journal of Engineering and Architecture 2, no. 2 (December 2023): 1-19.
EndNote Yaman MC, Erel Ş (December 1, 2023) Real-Time Multi-Object Recognition Using the Fusion of LIDAR and Camera Data. Bozok Journal of Engineering and Architecture 2 2 1–19.
IEEE M. C. Yaman and Ş. Erel, “Real-Time Multi-Object Recognition Using the Fusion of LIDAR and Camera Data”, BJEA, vol. 2, no. 2, pp. 1–19, 2023.
ISNAD Yaman, Mert Can - Erel, Şerafettin. “Real-Time Multi-Object Recognition Using the Fusion of LIDAR and Camera Data”. Bozok Journal of Engineering and Architecture 2/2 (December 2023), 1-19.
JAMA Yaman MC, Erel Ş. Real-Time Multi-Object Recognition Using the Fusion of LIDAR and Camera Data. BJEA. 2023;2:1–19.
MLA Yaman, Mert Can and Şerafettin Erel. “Real-Time Multi-Object Recognition Using the Fusion of LIDAR and Camera Data”. Bozok Journal of Engineering and Architecture, vol. 2, no. 2, 2023, pp. 1-19.
Vancouver Yaman MC, Erel Ş. Real-Time Multi-Object Recognition Using the Fusion of LIDAR and Camera Data. BJEA. 2023;2(2):1-19.