Review

An overview of the classification of 3D point clouds with deep learning

Year 2022, Volume: 13 Issue: 1, 1 - 9, 30.03.2022
https://doi.org/10.24012/dumf.1067736

Abstract

A point cloud (PC) is a set of points in a vector space that mathematically represents information about an object in an x, y, z coordinate system. Once classified within the spatial coordinate system in which they were recorded, the points describe semantic information that expresses an object or an area. With advancing technology, 3-dimensional (3D) point clouds have recently become very popular in object classification, detection, and recognition. Objects scanned with laser scanning systems have been converted into 3D point clouds, and transferring these data to a virtual environment has yielded different datasets. Methods developed to classify 3D point data successfully with deep networks are examined in detail. By adding further information (depth or RGB (red-green-blue)) to the 3D coordinates, point clouds of different dimensions or different densities have been created. In addition, external or internal information has been attached to each point in the point cloud dataset, and objects have been colored using RGB values. This study analyzes the performance, advantages, and disadvantages of methods that classify 3D point clouds with deep networks. In particular, the applied algorithms, the methods tried, and the models built are compared and discussed. Finally, current methods are presented comprehensively to accelerate and guide future work.
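As a rough illustration of the representation the abstract describes, a point cloud can be held as an N x 3 array of x, y, z coordinates, and extra per-point information such as RGB or depth can be appended as additional columns. The sketch below is not taken from the article: the toy coordinates, the colour values, the helper name `attach_features`, and the use of distance from the origin as a stand-in for depth are all assumptions made for the example.

```python
import numpy as np

# Hypothetical toy cloud: 5 points, columns are x, y, z.
xyz = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
])

# Assumed per-point RGB colours in [0, 255], one row per point.
rgb = np.array([
    [255, 0, 0],
    [0, 255, 0],
    [0, 0, 255],
    [255, 255, 0],
    [255, 255, 255],
], dtype=np.uint8)


def attach_features(xyz, features):
    """Append per-point features to coordinates, giving an (N, 3 + F) array."""
    xyz = np.asarray(xyz, dtype=np.float32)
    features = np.asarray(features, dtype=np.float32)
    if xyz.ndim != 2 or xyz.shape[1] != 3:
        raise ValueError("xyz must have shape (N, 3)")
    if features.ndim == 1:
        # A single scalar feature per point becomes one extra column.
        features = features[:, None]
    if features.shape[0] != xyz.shape[0]:
        raise ValueError("features must have one row per point")
    return np.concatenate([xyz, features], axis=1)


# x, y, z plus colour scaled to [0, 1]: a 6-dimensional point cloud.
cloud_xyzrgb = attach_features(xyz, rgb / 255.0)

# x, y, z plus a depth-like scalar (here: distance from the origin).
depth = np.linalg.norm(xyz, axis=1)
cloud_xyzd = attach_features(xyz, depth)

print(cloud_xyzrgb.shape, cloud_xyzd.shape)
```

The rows carry no inherent order, so shuffling them leaves the same point set; this is one reason the classification methods surveyed here need to treat the input as a set and not as a sequence.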


Details

Primary Language Turkish
Section Articles
Authors

Muhammed Ahmet Demirtaş 0000-0003-4092-7284

Publication Date 30 March 2022
Submission Date 3 February 2022
Published in Issue Year 2022 Volume: 13 Issue: 1

Cite

IEEE M. A. Demirtaş, “Derin öğrenme ile 3B nokta bulutlarının sınıflandırılmasına genel bir bakış” [An overview of the classification of 3D point clouds with deep learning], DÜMF MD, vol. 13, no. 1, pp. 1–9, 2022, doi: 10.24012/dumf.1067736.
All articles published by DUJE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit, and adapt the work, provided the original work and source are appropriately cited.