Dissimilarity weighting for graph-based point cloud segmentation using local surface gradients

Processing 3D point cloud data is challenging because of the difficulty of handling millions of unstructured points. Point cloud segmentation is a crucial pre-classification stage: it reduces the high processing time required to extract meaningful information from raw data and produces distinctive features for the classification stage. Local surface inclinations of objects are the most effective features of 3D point clouds for providing meaningful information about the objects. Sampling the points into sub-volumes (voxels) is a technique commonly used in the literature to obtain the neighboring point groups needed to calculate local surface directions (as normal vectors). Graph-based segmentation approaches are widely used for surface segmentation based on the orientations and continuities of local surfaces. In this study, only two geometrical primitives, the normal vectors and the barycenters of point groups, are used to weight the connections between adjacent voxels (vertices). Fourteen dissimilarity calculations, defined on three angular values derived from these primitives, are tested and evaluated on five sample datasets that have reference data for segmentation. Finally, the results of the measures are compared in terms of accuracy and F1 score. According to the results, the weight measure W7 (the seventh calculation) gives 0.8026 accuracy and 0.7305 F1 score with higher standard deviations, while the original weight measure of the segmentation method (W8) gives 0.7890 accuracy and 0.6774 F1 score with lower standard deviations.
This work is licensed under CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0/)


Introduction
Segmentation is an important and challenging problem in computer vision and computer graphics. In the segmentation stage, spatially close data elements are grouped according to features such as color, lightness, intensity, and geometry [1]. This stage is important because it reduces the data to be processed in subsequent stages and produces more distinctive features [2]. Because a large amount of raw data must be handled, segmentation is a fundamental stage of 3D point cloud processing [3].
Local geometric features are the most commonly used features in point cloud segmentation. As basic geometric features, local plane inclinations and the estimated common planes between adjacent local point groups form the foundation of the dissimilarity weighting calculations tested in this work. The normal vectors of the local point groups represent the local plane inclinations, and the tangent vectors between the barycenters of adjacent local point groups represent the estimated common planes [3], [4]. Together, these local geometric features make up the local surface gradients. In the literature, the local point groups are usually determined by one of two techniques. The first is to determine the nearest neighbors of each point, and the second is to group the points into cubic volumes, called "voxels", within a regular grid structure [5], [6]. Voxel-based sampling has several advantages over point-based nearest-neighbor search [7]. The local point groups are determined faster than with the point-based technique, and neighboring groups are easy to reach because the voxels are regularly indexed. In addition, noise and over-dense points within voxels are suppressed. In this way, the amount of data evaluated during segmentation is reduced, because similar points in very dense regions need not be considered one by one. The octree is the most commonly used data structure for voxelizing the data because of its low memory usage and convenient indexing [5], [8].
Graph-based approaches are commonly used in both 2D image segmentation and 3D point cloud segmentation because they define the connections between data elements discretely [9]. In these approaches, the data elements are represented by vertices and the links between them by connections (edges) [10], [11].
The weight values of the connections are determined by similarity or dissimilarity measures between the vertices at the ends of the connections. "Efficient Graph-based Segmentation" (EGS) [10], one of the best-known graph-based segmentation methods, has gained popularity in image processing because of its correctness, execution time, and ease of implementation. This method has also been adapted to 3D point cloud segmentation in the literature [12], [13]. A deficiency of this method is that its segmentation parameter has no fixed range; it changes according to both the connection weights and the sizes of the segments. A novel graph-based point cloud segmentation method, referred to in this work as "Boundary Constrained Voxel Segmentation" (BCVS), was proposed by Saglam et al. (2020) [14]. They first voxelize the points and then merge the voxels by evaluating the weight values of the connections between adjacent voxels. The parameter they use for this evaluation lies in a fixed range (0° to 90°). Their method performs notably well compared with methods that are popular in the literature, in terms of both accuracy and execution time.
In some studies in the literature, the graph vertices represent individual points. Rabbani et al. (2006) consider connections between the points of the seed regions (each point is a seed region at the start) and a specified number of their nearest neighbors, starting from the points with the minimum residual [15]. Whether a connection is added to the graph being constructed is decided according to the angle difference between the normals and the spatial distance between the neighboring points. Strom et al. (2010) adapted the EGS method to 3D point cloud segmentation by using the normal differences and the color differences as connection weights between points within a specified radius [12]. Bergamasco et al. (2012) proposed a semi-supervised region growing procedure that starts from a set of user-specified seed points [16]. They weighted the connections between connected seed regions, derived from the triangulated raw points, with a similarity measure that uses the scalar products of the normals and the Euclidean distance between the spatial centroids. Dutta et al. (2014) used the graph-cut image segmentation method [17] with three different weight values formed from surface normals, RGB color values, and the eigenvalues of the covariance matrix of the local point distribution [18].
In this study, the BCVS method [14] with octree voxelization is used as the graph-based segmentation method. For each voxel, the surface normal and the spatial center (barycenter) of the points in the voxel are calculated. Using only these features, different weight calculations between adjacent voxels are evaluated on five point cloud datasets that have reference data, in order to find the most suitable weight measure for graph-based 3D point cloud segmentation approaches.

The voxelization stage
In this stage, the local point groups are determined by sampling nearby points into equal-sized cubic volumes. Since a traditional three-dimensional array unnecessarily consumes a large amount of memory for empty space, a hierarchically divided data structure such as the octree is preferred in the literature for voxelizing point cloud data [4], [5], [7], [14].
An octree is a tree data structure in which each internal node has eight child nodes. The entire dataset is first enclosed in a cubic bounding box, which is then divided into eight equal-sized sub-volumes. Each sub-volume that contains at least one point is divided further in the same way until the sub-volumes reach the desired size [8], [19]-[21].
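The subdivision described above can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function name `octree_voxelize`, the use of NumPy, and the choice to size the root box as the voxel size times a power of two (so that the leaves have exactly the requested size) are assumptions made for illustration.

```python
import numpy as np

def octree_voxelize(points, voxel_size):
    """Group point indices into leaf voxels by recursive octree subdivision.

    Returns a dict mapping an octant path (tuple of child codes 0..7)
    to the array of indices of the points inside that leaf voxel.
    """
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    extent = float((points.max(axis=0) - lo).max())

    # Assumption: the root cube side is voxel_size * 2**depth, chosen to be
    # strictly larger than the data extent so every point lies inside it.
    depth = 0
    while voxel_size * 2 ** depth <= extent:
        depth += 1
    root_side = voxel_size * 2 ** depth

    leaves = {}

    def split(idx, origin, side, level, key):
        if level == depth:  # desired voxel size reached
            leaves[key] = idx
            return
        half = side / 2.0
        center = origin + half
        # Bit k of the octant code is set when the coordinate on axis k
        # lies in the upper half of the current cube.
        codes = ((points[idx] >= center) * np.array([1, 2, 4])).sum(axis=1)
        for c in range(8):
            child = idx[codes == c]
            if child.size == 0:
                continue  # empty sub-volumes are neither stored nor divided
            offset = np.array([(c >> k) & 1 for k in range(3)]) * half
            split(child, origin + offset, half, level + 1, key + (c,))

    split(np.arange(len(points)), lo, root_side, 0, ())
    return leaves
```

Because empty sub-volumes are skipped, memory grows with the number of occupied voxels rather than with the volume of the bounding box, which is the advantage over a dense three-dimensional array noted above.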

Weighting the connections between the adjacent voxels
In this study, dissimilarity measures in different forms provide the weight values of the connections between adjacent voxels. Two geometric primitives are used to find a suitable geometric dissimilarity measure between the local point groups: the PCA normals [22] of the point groups in the voxels and the barycenters of those groups.
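As an illustration of these two primitives, the following Python sketch computes the barycenter of a point group and a PCA normal, taken as the eigenvector of the covariance matrix with the smallest eigenvalue. The function name `barycenter_and_normal` and the convention of returning `None` for groups with fewer than three points are assumptions of this sketch, not details taken from [22].

```python
import numpy as np

def barycenter_and_normal(points):
    """Return (barycenter, unit PCA normal) of a point group.

    The normal is None when the group has fewer than three points,
    since no plane can be estimated from such a group.
    """
    points = np.asarray(points, dtype=float)
    center = points.mean(axis=0)
    if len(points) < 3:
        return center, None
    # 3x3 covariance matrix of the coordinates (rows of the input are variables).
    cov = np.cov((points - center).T)
    # eigh returns eigenvalues in ascending order for symmetric matrices,
    # so the first eigenvector is the direction of least variance.
    _, eigvecs = np.linalg.eigh(cov)
    normal = eigvecs[:, 0]
    return center, normal / np.linalg.norm(normal)
```

The sign of a PCA normal is arbitrary, which is one reason angular measures built on it are usually folded into the range 0° to 90°.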
The angular difference $\theta$ between the two unit normal vectors $\tilde{n}_i$ and $\tilde{n}_j$ is calculated as the inverse cosine of the dot product of the two vectors, as in Equation (1) [14], [23], [24]. The angle $\theta$ ranges from 0° to 90°, corresponding to the lowest and the highest dissimilarity between the inclinations of the two planes, respectively.
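A minimal Python sketch of this angle computation follows. The function name `normal_angle_deg` is hypothetical, and taking the absolute value of the dot product to keep the result within 0° to 90° is an assumption of the sketch, motivated by the stated range and by the arbitrary sign of estimated normals.

```python
import numpy as np

def normal_angle_deg(n_i, n_j):
    """Angle in degrees (0..90) between two normal vectors."""
    n_i = np.asarray(n_i, dtype=float)
    n_j = np.asarray(n_j, dtype=float)
    n_i = n_i / np.linalg.norm(n_i)
    n_j = n_j / np.linalg.norm(n_j)
    # Assumption: opposite-facing normals describe the same plane,
    # so the sign of the dot product is ignored.
    cosine = min(1.0, abs(float(np.dot(n_i, n_j))))
    return float(np.degrees(np.arccos(cosine)))
```

For example, two parallel planes give 0 regardless of which way their normals point, and two perpendicular planes give 90.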
The other local geometrical primitive is the estimated tangent vector. The relationship between the surface normals of two nearby local planes and the vector $d$ that joins the barycenters $\bar{b}_i$ and $\bar{b}_j$ of the points belonging to the adjacent planes, computed as in Equation (2), gives a clue about the tangent vector of the plane fitted to the two surfaces. The angular difference and the tangent vector of two adjacent planes are two properties commonly used in the literature.
In Figure 1 (a), the angle $\alpha$ between the normal vector and the surface can be obtained from the angle $\beta$ between the unit normal vector $\tilde{n}$ and the unit tangent vector $\tilde{d}$, which is computed with Equation (3), where $\lVert d \rVert$ is the Euclidean distance between the barycenters $\bar{b}_i$ and $\bar{b}_j$. The angle $\beta$ is calculated as the inverse cosine of the dot product of the vectors $\tilde{n}$ and $\tilde{d}$, as in Equation (4). If the angle between the unit normal vector $\tilde{n}$ and the unit tangent vector $\tilde{d}$ is greater than 90°, it is denoted $\beta'$ and $\beta$ is computed as $180° - \beta'$; that is, $\beta$ is the supplement of the angle between the vectors $\tilde{n}$ and $\tilde{d}$, as seen in Figure 1 (b). Once $\beta$ is obtained, the angle $\alpha = 90° - \beta$ follows.
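The steps in this paragraph can be sketched in Python as follows. The function name `surface_angle_deg` and its argument names are hypothetical; the sketch simply joins the two barycenters, normalizes the joining vector, takes the angle between it and the normal, replaces an angle above 90° by its supplement, and subtracts the result from 90°.

```python
import numpy as np

def surface_angle_deg(normal, b_from, b_to):
    """Angle in degrees (0..90) between the plane of `normal` and the
    vector joining the barycenters b_from -> b_to."""
    normal = np.asarray(normal, dtype=float)
    d = np.asarray(b_to, dtype=float) - np.asarray(b_from, dtype=float)
    length = np.linalg.norm(d)
    if length == 0.0:
        raise ValueError("barycenters coincide; tangent vector is undefined")
    d_unit = d / length                      # unit tangent vector
    n_unit = normal / np.linalg.norm(normal)
    cosine = float(np.clip(np.dot(n_unit, d_unit), -1.0, 1.0))
    between = float(np.degrees(np.arccos(cosine)))
    if between > 90.0:
        between = 180.0 - between            # use the supplementary angle
    return 90.0 - between
```

Calling the function once with each voxel's normal gives one angle per voxel. A result of 0 means the joining vector lies in the voxel's plane, and a result of 90 means it is parallel to the normal.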
The angles $\alpha_i$ and $\alpha_j$, obtained in this way for the two adjacent voxels, indicate dissimilarity with the same tendency as the angle $\theta$ between the normals, unless the surfaces have specific shapes, such as parallel or cylindrical surfaces [23]. Like $\theta$, the angles $\alpha_i$ and $\alpha_j$ lie in the range of 0° to 90°.
In this study, the angles $\theta$, $\alpha_i$, and $\alpha_j$ are used in the 14 different forms listed in Table 1 as weight measures. All of the measures in Table 1 lie in the range of 0° to 90°. The weight value 0 corresponds to the lowest dissimilarity, and the weight value 90 corresponds to the highest dissimilarity.
Another important issue in voxel-based point cloud segmentation is residual voxels. These voxels contain fewer than three points, so their normal vectors give no information about the local inclination [5]. In this study, these voxels are labeled as residual and are connected only to non-residual voxels; that is, there is no connection between two residual voxels. If the voxel $v_i$ is residual and the voxel $v_j$ is non-residual, the weight value of the connection between them is $\alpha_j$, and vice versa, because the normal $\tilde{n}_i$ of a residual voxel $v_i$ is unreliable and should not be evaluated in any measure. A residual voxel can be merged only once, with a non-residual voxel.
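The residual-voxel rule can be sketched as a small Python function. The sketch assumes that the weight of a connection with one residual end is the normal-to-tangent angle of the non-residual voxel, since the residual voxel's normal is excluded. The function name `connection_weight` is hypothetical, and the default `measure=max` is only a placeholder for illustration; it is not claimed to be one of the Table 1 definitions.

```python
def connection_weight(theta, alpha_i, alpha_j, residual_i, residual_j, measure=max):
    """Weight of the connection between voxels i and j, or None if no
    connection exists.

    theta    : angle between the two normals (degrees)
    alpha_i  : angle computed from voxel i's normal and the tangent vector
    alpha_j  : angle computed from voxel j's normal and the tangent vector
    measure  : placeholder callable combining the three angles
    """
    if residual_i and residual_j:
        return None          # two residual voxels are never connected
    if residual_i:
        return alpha_j       # only the non-residual normal is evaluated
    if residual_j:
        return alpha_i
    return measure(theta, alpha_i, alpha_j)
```

Keeping the residual case separate from `measure` means any candidate weight definition can be swapped in without changing how residual voxels are treated.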

The segmentation stage
The BCVS algorithm [14] is used in this study as the segmentation method for testing the weight measures. The method follows a region-growing approach in the segmentation stage. After the normal vectors and the barycenters of the voxels are obtained, each voxel is initially treated as a separate segment. The connections between adjacent voxels are weighted according to the selected weight measure and sorted in ascending order of their weight values. Starting from the lowest-weighted connection, the segments at the two ends of each connection are considered for merging. The mutually adjacent voxels on the boundary between the two segments under consideration are paired one-to-one according to the lowest-weight connections between them. If the weight values of all connections between the paired voxels are less than or equal to the segmentation parameter, which lies in the range 0 to 90, the two segments are merged. Residual voxels are merged only once and are not paired with any voxel in the pairing process.
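The merging loop described above can be outlined with the following simplified Python sketch. It is not the BCVS implementation of [14]: residual voxels are left out, the one-to-one pairing is approximated by a greedy pass over the boundary connections in ascending weight order, and the boundary is found by a plain scan of all connections, which is slow but short. The function name `segment_voxels` and the `(weight, u, v)` edge format are assumptions.

```python
def segment_voxels(num_voxels, edges, threshold):
    """Merge voxels into segments.

    edges     : iterable of (weight, u, v) for adjacent voxels u and v
    threshold : segmentation parameter in the same unit as the weights
    Returns a list with one segment label per voxel.
    """
    parent = list(range(num_voxels))  # each voxel starts as its own segment

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    ordered = sorted(edges)  # ascending by weight
    for _, u, v in ordered:
        a, b = find(u), find(v)
        if a == b:
            continue
        # All connections that cross the boundary between segments a and b.
        boundary = [(w, p, q) for w, p, q in ordered
                    if {find(p), find(q)} == {a, b}]
        # Greedy one-to-one pairing: take the lowest-weight connections
        # whose end voxels have not been paired yet (an approximation).
        used, paired = set(), []
        for w, p, q in boundary:
            if p not in used and q not in used:
                used.update((p, q))
                paired.append(w)
        # Merge only if every paired connection satisfies the threshold.
        if all(w <= threshold for w in paired):
            parent[b] = a
    return [find(x) for x in range(num_voxels)]
```

The boundary check is what distinguishes this scheme from merging on a single connection: two segments that touch smoothly at one voxel pair but sharply at another remain separate.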

Evaluation metrics
Accuracy and F1 score are used as quantitative metrics to compare the segmentation performance of the weight measures. A larger segment has more impact on the accuracy score, while each segment has the same impact on the F1 score [14]. For both metrics, the result segments are first matched one-to-one with the reference segments they overlap most. The matching follows the study [26] and is performed in two stages. In the first stage, result and reference segments that are mutually each other's most overlapping segment, measured by the number of shared points, are paired. In the second stage, the segments left unpaired after the first stage are paired with unpaired segments of the opposite set when each is among the other's two most overlapping segments. Segments that are still unpaired at the end of the second stage remain unpaired.
After the matching is completed, the accuracy of the entire result and the precision and recall values of each reference segment are calculated as explained in the study [14]. The precision and recall values of unpaired segments are 0. The F1 score is the harmonic mean of the average precision and the average recall of the reference segments.
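The first matching stage can be sketched in Python as below. Only the mutual-best-overlap stage is shown; the second stage is omitted. The function name `mutual_best_matches` and the tie-breaking rule (the first overlap encountered wins) are assumptions of this sketch rather than details of [26].

```python
from collections import Counter

def mutual_best_matches(result_labels, reference_labels):
    """Pair result and reference segments that are each other's largest overlap.

    Both arguments are per-point label sequences of equal length.
    Returns a dict mapping result segment -> reference segment.
    """
    # Number of shared points for every (result, reference) segment pair.
    overlap = Counter(zip(result_labels, reference_labels))

    best_ref = {}  # result segment    -> (count, best reference segment)
    best_res = {}  # reference segment -> (count, best result segment)
    for (res, ref), count in overlap.items():
        if count > best_ref.get(res, (0, None))[0]:
            best_ref[res] = (count, ref)
        if count > best_res.get(ref, (0, None))[0]:
            best_res[ref] = (count, res)

    # Keep only the pairs where the preference is mutual.
    return {res: ref for res, (_, ref) in best_ref.items()
            if best_res[ref][1] == res}
```

With result labels `[0, 0, 0, 1, 1, 2]` and reference labels `['a', 'a', 'b', 'b', 'b', 'b']`, segments 0 and 1 are paired with `'a'` and `'b'`, while segment 2 stays unpaired after this stage.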
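A Python sketch of the two metrics, given an existing one-to-one matching, is shown below. The exact definitions are those of [14]; here it is assumed that accuracy is the fraction of points lying in the overlap of matched segment pairs, that the precision of a reference segment is the overlap divided by the size of its matched result segment, and that its recall is the overlap divided by its own size. The function name `accuracy_and_f1` is hypothetical.

```python
from collections import Counter

def accuracy_and_f1(result_labels, reference_labels, matches):
    """Compute (accuracy, F1) from per-point labels and a matching.

    matches : dict mapping result segment -> reference segment (one-to-one)
    """
    overlap = Counter(zip(result_labels, reference_labels))
    res_size = Counter(result_labels)
    ref_size = Counter(reference_labels)
    ref_to_res = {ref: res for res, ref in matches.items()}

    correct = 0
    precisions, recalls = [], []
    for ref, size in ref_size.items():
        if ref in ref_to_res:
            res = ref_to_res[ref]
            hit = overlap[(res, ref)]
            correct += hit
            precisions.append(hit / res_size[res])
            recalls.append(hit / size)
        else:
            # Unpaired reference segments contribute zero precision and recall.
            precisions.append(0.0)
            recalls.append(0.0)

    accuracy = correct / len(reference_labels)
    avg_p = sum(precisions) / len(precisions)
    avg_r = sum(recalls) / len(recalls)
    # Harmonic mean of the averaged precision and recall.
    f1 = 2 * avg_p * avg_r / (avg_p + avg_r) if avg_p + avg_r > 0 else 0.0
    return accuracy, f1
```

Because precision and recall are averaged over reference segments before the harmonic mean is taken, a small reference segment weighs as much as a large one in the F1 score, whereas accuracy counts points and therefore favors large segments.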

Datasets
In our experiments, five datasets are used to test the weight measurements. Three datasets (Sample 1, Sample 2, and Sample 3) are obtained from [14] and the other two datasets (Sample 4 and Sample 5) are obtained from [25].
Each dataset has reference segments. The sizes of the datasets and the numbers of their reference segments are listed in Table 2. RGB-colored renderings of the raw datasets are shown in Figure 2.

Experimental results
Tables 3 and 4 compare the accuracy and the F1 scores, respectively, of the weight measures at their best segmentation parameters, together with the average and standard deviation (SD) values. Figure 3 shows the ground truths of the datasets and the segmentation results of the weight measures 7 and 8, with the segments colored randomly.

Conclusions
In 3D point cloud processing, the segmentation stage is very important because it allows meaningful features to be extracted from large amounts of cluttered data for the high-level stages. Graph tools discretize the connections between voxels and facilitate the segmentation process. The attributes of the points and their local similarities and dissimilarities are among the major concerns of a segmentation algorithm. In the literature, a variety of local similarity and dissimilarity measures are used to segment 3D point clouds according to their geometrical features. In this work, we have tested 14 different forms of the basic local geometric features explained in this paper as dissimilarity measures in a graph-based segmentation method. Five datasets, each with a reference set, have been used with the 14 weight measures to determine the most suitable one. In the numerical results, weight measure 7 stands out among them. Weight measure 8, on the other hand, is among the measures that give the best results with a small standard deviation. This study also proposes an appropriate weight measure for other point cloud segmentation methods that use graph-based approaches.

Acknowledgment
This work is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) (project no: 119E012). This study was carried out in the scope of the Doctoral Thesis of Ali SAGLAM.

Author's Note
An abstract version of this paper was presented at the 9th International Conference on Advanced Technologies (ICAT'20), 10-12 August 2020, Istanbul, Turkey, under the title "Dissimilarity Weighting for Graph Based Point Cloud Segmentation Using Local Surface Gradients".