A new approach for matching road lines using efficiency rates of similarity measures

The lack of common semantic information among corresponding geo-objects in different datasets has required new matching approaches based on geometric and topological measures. In this study, a semi-automated matching approach based on the matching capabilities of geometric and topological measures is proposed. In the first stage, after the initial matching performed by a scoring system, the efficiency of each measure on the matching accuracy is evaluated manually by an operator. In the second stage, (1) the score of each measure is updated in accordance with the accuracy distributions, so that the score of a measure is increased if it is relatively more significant than the others. Finally, (2) the matching process is repeated with the new scores. The proposed approach was tested by matching tree-, cellular-, and hybrid-patterned road lines in municipal, private navigation, and OpenStreetMap datasets. The experimental testing shows satisfactory results in both accuracy and completeness: the F-measure is over 86% in the hybrid-patterned Bosphorus datasets.


INTRODUCTION
Geometric integration establishes the relationships between the objects in a spatial dataset and the corresponding objects in another dataset and ensures that the target dataset reaches the required competence. Producing better (geometrically and semantically richer and more up-to-date) maps by using two different maps representing the same entities is also an important issue of integration and is called map conflation (Lynch and Saalfeld, 1985; Saalfeld, 1988). The integration process can be used for different purposes. Cobb et al. (1998) listed the requirements for map conflation as updating a dataset with objects transferred from another dataset, optimizing geometric and semantic accuracy, and transferring data to a dataset with missing information. The conflation process enables the spatial data generated by different sources to be used together. Geometric, topological, and semantic similarities between objects are important criteria for the conflation process. The greater the similarity, the lower the operator effort.
The conflation process is based on the principle of matching geometries (point, line, and polygon) that represent the same real entities (Yuan and Tao, 1999). Determining the correspondences between the objects according to their relations and similarities is called the matching process (Hacar and Gökgöz, 2019b).
In this study, a semi-automated matching approach based on the efficiency rates of the measures is proposed. The remainder of this section examines related studies in the literature. The following section presents the study area and datasets, the geometric and topological measures used to determine the similarities between the objects, and the proposed approach. Section 3 presents the experiment with tree-, cellular-, and hybrid-patterned road networks, and the evaluation of the results with the statistics of the study. Section 4 concludes the study by discussing the results and giving several suggestions for further work.

Related Works
Many methods have been developed to match line objects since line matching was first applied in the 1980s by Rosen and Saalfeld (1985) and Saalfeld (1988). The main problem in line matching is that no corresponding line geometries from different sources are geometrically identical. In other words, the geometric properties of corresponding line objects, such as orientation, length, shape, and location, do not have equal values. According to Hacar and Gökgöz (2019b), there are three important reasons why researchers prefer to work on line matching rather than point and polygon matching: (1) difficulties in establishing relationships between complex representations such as patterns, intersections, roundabouts, and dead ends, (2) the need to keep navigation datasets up-to-date, and (3) the rise of Volunteered Geographical Information (VGI) datasets.
The concepts of the matching process in spatial data integration have also been a focus of research. Yuan and Tao (1999) classified the matching process by geometry, topology, and semantics. Ruiz et al. (2011) also discussed the integration process by match type: geometric, topological, and semantic. Volz (2006) classified the process by similarity measures: point-, linear-, and area-based, and hybrid. Xavier et al. (2016) classified the measures as geometric, topological, attribute, context, and semantic. Memduhoğlu and Başaraner (2018) compared thematic geographic ontologies created for cities and discussed the possible contributions of basic integration methods and technologies of spatial semantics to creating a multirepresentation spatial database paradigm. Hacar and Gökgöz (2019a) designed a conceptual model for the matching process under spatial data integration by classifying the types of geometry, measure, relationship, and spatial information.
Many matching methods have been developed. While some of them work fully automatically, others allow user intervention. Xiong and Sperling (2004) proposed a semi-automatic method for matching road networks. By using a cluster-based matching process, strong relationships between nodes, edges, and segments in the two road networks are determined. Their method allows missing matches to be identified and corrected, but requires significant interaction (operator intervention) during the process. Li and Goodchild (2011) proposed an automated optimization model to match road lines using geometric and semantic measures, as well as an affine transformation. They used the asymmetric property of the one-way Hausdorff distance as a measure of dissimilarity. In addition, the Hamming distance was used as a criterion of dissimilarity to show the difference between road names. Lei and Lei (2019) also developed a flow-based optimization model that seeks to minimize the total discrepancy between two datasets. Moreover, Araújo et al. (2019) proposed a Spark-based approach using the names of places (semantic) and context information (e.g., neighbouring streets) to compare the corresponding objects in real-world data sources of New York and Curitiba.
Some researchers focused on matching objects in datasets that have a significant scale difference. To work with this kind of source dataset, researchers often use topological measures (e.g., the degree of connectivity (or the valence), spider function, buffer-growing, etc.) to match the corresponding objects. Mustière and Devogele (2008) proposed an approach relying on the comparison of geometrical, attributive, and topological properties of objects for matching networks with different levels of detail. Olteanu-Raimond et al. (2015) used belief theory to represent and fuse knowledge from different sources to model imperfection (imprecision, uncertainty, and incompleteness), and make a decision. Chehreghan and Abbaspour (2018) developed an optimization-based matching approach for multi-source spatial datasets by taking into account several geometric criteria. The approach benefits from a genetic algorithm and sensitivity analysis to identify corresponding objects. Moreover, Guo et al. (2019) designed a new matching method for the objects in multi-scale geodatabases using weights of some well-known geometric and topological measures. The method has three stages: (1) entire matching, (2) partial matching, and (3) roundabout detection and matching. The authors used a splitting process to match the unmatched road segments.
Studies in urban areas are also important integration cases. Recently, VGI, social media, and geocoding data have been used to extract and combine new spatial data in urban areas (Hacar 2020; Kılıç and Gülgen 2020; Bilgi et al. 2019). VGI enables the generation of maps by crowd-sourced volunteer contributors. Each volunteer has an equal role in contributing the geometric and semantic properties of the geographical objects. However, since there is no requirement for becoming a volunteer in VGI, non-expert contributors may draw features irregularly or inconsistently with the basics of cartography. Therefore, the resulting map may have low quality. In this context, geo-object matching is used as a process for analysing and increasing the quality and accuracy of VGI data. Koukoletsos et al. (2012) proposed a matching approach to assess the completeness of VGI data. They developed a multi-step approach matching OpenStreetMap (OSM) road data with data from the UK's official mapping agency, Ordnance Survey (OS), taking into account the similarities in geometry (search distance, direction, line-based buffer zone) and attributes (road names). Pourabdollah et al. (2013) also conducted a conflation study with attribute-rich OS data to improve the quality of OSM road data. Besides, Hacar and Gökgöz (2019b) conducted a matching study with OSM and TomTom navigation data. In some cases, line-based (linear) approaches to matching road objects may be insufficient. In such cases, an area-based (spatial) matching approach, such as the one proposed by Fan et al. (2016), can be used. This method finds the corresponding blocks in the source datasets with a spatial overlapping ratio. It then matches the surrounding roads using the matched blocks. Fan et al. (2016) tested their method by matching OSM and public city data and achieved satisfactory results in Heidelberg (Germany), with a regular network, and Shanghai (China), with a relatively more complex network.
The sources and patterns of road networks are two important factors to consider in the matching process. Yang et al. (2014) classify the pattern groups of the blocks that the roads surround and match the nodes in the groups hierarchically. Hacar (2019) and Hacar and Gökgöz (2019b) developed a score-based multi-stage method and tested it with cellular-, tree-, and hybrid-patterned road networks. According to the method, the candidate matches are scored in accordance with the geometric and topological similarity and then the objects with high scores are matched incrementally.
The matching methods differ from each other according to the hierarchical steps of the approaches, even if they have some common stages, metrics, or rules. The design of the method primarily affects how well it performs in a case study. The complexity of road networks can also reduce its performance. Previous approaches paid little attention to complex road networks such as those in Istanbul. In this study, the scope is to design a new matching model and to examine its applicability to Istanbul road networks.

THE PROPOSED APPROACH
The proposed approach matches road lines by means of efficiency rates. The rates are calculated using geometric and topological measures. The measures are selected to determine the similarities of corresponding matching pairs from different source datasets. As seen in Fig. 1, the matching process is managed in two stages in addition to a pre-process. Firstly, the two road networks are aligned as a pre-process. In the first stage, road lines closer to each other than a predefined threshold distance value T are identified as candidate matchings. The Hausdorff distance is used to determine the closeness between candidates. T should be large enough to identify the possible correct matches and small enough not to cause too many mismatches. The threshold can be determined by examining the source datasets and the structure of the road networks, and by conducting several experimental matching observations. After the selection of corresponding pairs, for each candidate matching, (1) similarity scores are calculated based on the measures of Hausdorff distance, orientation, sinuosity, mean perpendicular distance, mean length of triangle edges, and modified degree of connectivity (Fig. 2). The maximum similarity score assigned to a candidate pair is 4 for all measures apart from sinuosity and mean perpendicular distance. Sinuosity and mean perpendicular distance represent similar characteristics of lines. The maximum score for each of these two measures is 2, so that their maximum total score is 4, the same as for the other measures, for fairness (Hacar and Gökgöz, 2019b). Table 1 shows the computation criteria of scores for each measure.
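The candidate selection step can be illustrated with a minimal Python sketch. It assumes that each road line is a list of planar (x, y) vertices in metres and approximates the Hausdorff distance using vertices only; the text does not state how the distance is computed, and the function names and sample coordinates are hypothetical.

```python
from math import hypot

Point = tuple[float, float]
Line = list[Point]


def directed_hausdorff(a: Line, b: Line) -> float:
    """Largest distance from a vertex of `a` to its nearest vertex of `b`."""
    return max(min(hypot(ax - bx, ay - by) for bx, by in b) for ax, ay in a)


def hausdorff(a: Line, b: Line) -> float:
    """Symmetric, vertex-based (discrete) Hausdorff distance."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def candidate_pairs(source: dict[str, Line], target: dict[str, Line], t: float):
    """Return (source_id, target_id, distance) for pairs closer than threshold t."""
    pairs = []
    for sid, sline in source.items():
        for tid, tline in target.items():
            d = hausdorff(sline, tline)
            if d < t:
                pairs.append((sid, tid, d))
    return pairs


# Made-up sample data: one source road and two target roads.
source = {"a1": [(0.0, 0.0), (100.0, 0.0)]}
target = {
    "b1": [(0.0, 10.0), (100.0, 10.0)],    # parallel road 10 m away
    "b2": [(0.0, 200.0), (100.0, 200.0)],  # parallel road 200 m away
}
print(candidate_pairs(source, target, t=85.0))  # [('a1', 'b1', 10.0)]
```

A vertex-based distance can differ from the continuous Hausdorff distance when the lines are sparsely digitized, so densifying the vertices first may be advisable in practice.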
(2) The sum of the similarity scores is obtained for each candidate pair, and (3) the candidates whose total similarity scores are maximum are selected as matched pairs, while the other candidates are eliminated. The efficiency of each measure is determined by comparing the matched pairs with the result of the manual matching. After the numbers of correct and incorrect matches are determined for each measure, it is ensured that the score of a measure that performs better in terms of the numbers of correct and incorrect matches is higher than that of a relatively insignificant one (fewer correct and/or many more incorrect matches). For this purpose, an efficiency ratio is used, in which the numbers of correct and incorrect matches are combined. Each measure has its own efficiency ratio.
The Maximum-Minimum normalization method was adapted to calculate the efficiency ratio. Briefly, the similarity scores are multiplied by the ratio to increase the effect of a measure that matches with high accuracy and to reduce, but not disable, the effect of a measure with low accuracy.
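Steps (2) and (3) can be sketched as follows. The sketch assumes that the maximum is taken separately for each source line and that tied candidates are all kept; neither detail is stated explicitly in the text. The function name and the score values are made up for illustration.

```python
def select_matches(scores):
    """Pick, for each source line, the candidate(s) with the highest total score.

    scores: {(source_id, target_id): {measure_name: score}}
    Returns a sorted list of matched (source_id, target_id) pairs.
    """
    # Step (2): total similarity score of every candidate pair.
    totals = {pair: sum(by_measure.values()) for pair, by_measure in scores.items()}

    # Step (3): best total per source line (assumption: per-source maximum).
    best = {}
    for (source_id, _), total in totals.items():
        best[source_id] = max(best.get(source_id, float("-inf")), total)

    # Keep the best candidates, eliminate the rest (ties are all kept).
    return sorted(pair for pair, total in totals.items() if total == best[pair[0]])


# Hypothetical scores; the maxima follow the text (4 per measure, 2 for
# sinuosity and mean perpendicular distance).
scores = {
    ("a1", "b1"): {"hausdorff": 4, "orientation": 4, "sinuosity": 2,
                   "mean_perp": 2, "triangle_edges": 4, "connectivity": 4},
    ("a1", "b2"): {"hausdorff": 2, "orientation": 2, "sinuosity": 1,
                   "mean_perp": 0, "triangle_edges": 2, "connectivity": 4},
    ("a2", "b2"): {"hausdorff": 4, "orientation": 4, "sinuosity": 2,
                   "mean_perp": 1, "triangle_edges": 2, "connectivity": 2},
}
print(select_matches(scores))  # [('a1', 'b1'), ('a2', 'b2')]
```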
The normalization consists of two equations: Profit (P) and Loss (L) (Eq. 1 and Eq. 2). While P represents how far the value $X_i$ is from the minimum value, L represents how close the value $X_i$ is to the maximum value. The following formulas are used as the original Maximum-Minimum normalization measures (Başaraner, 2011; Şen, 2013):
\[ P = \frac{X_i - X_{\min}}{X_{\max} - X_{\min}} \tag{1} \]
\[ L = \frac{X_{\max} - X_i}{X_{\max} - X_{\min}} \tag{2} \]
These criteria can be adapted to calculate the normalized values $P_i$ and $L_i$ for each similarity measure with regard to the correct and incorrect match numbers as follows:
\[ P_i = \frac{C_i - C_{\min}}{C_{\max} - C_{\min}} \tag{3} \]
\[ L_i = \frac{I_{\max} - I_i}{I_{\max} - I_{\min}} \tag{4} \]
where $C_i$ (i = 1, 2, ..., n) represents the number of correct matches of the respective measure, $C_{\min}$ represents the least number of correct matches, and $C_{\max}$ represents the maximum number of correct matches among all the measures. In addition, $I_i$ (i = 1, 2, ..., n) represents the number of incorrect matches of the respective measure, $I_{\min}$ represents the least number of incorrect matches, and $I_{\max}$ represents the maximum number of incorrect matches among all the measures.
The efficiency ratio of each measure is calculated from these normalized values (Eq. 5). However, the efficiency ratio (Eq. 5) is zero for the measure that produces the maximum number of incorrect or the minimum number of correct matches. This results in the score of the respective measure being multiplied by a factor of 0 (zero) and the measure being ineffective (disabled) in the second stage of the approach. Since there is no correlation between the numbers of correct and incorrect matches, making any measure ineffective may reduce the success of the process. Also, our experience in matching cases motivates us to consider all of the measures, even one that is relatively less significant (generating many incorrect matches). Therefore, the exponential function should be used with the previous formula (Eq. 5). The exponential function prevents the least important measure from taking the value 0 (Eq. 6). In other words, the least important measure also affects the results in the second stage.

Table 1. Computation criteria of the similarity scores for each measure (Hacar, 2019; Hacar and Gökgöz, 2019b)

Measure / Criteria

Hausdorff distance
For each candidate pair, the three closest matches are scored as 4, 2, and 1, respectively. The fourth and subsequent matches are scored as 0.

Orientation
Candidate pairs in the same class are scored as 4. If they are in adjacent classes (see Fig. 2), the score is 2. Otherwise, the score is 0.

Mean perpendicular distance
If the difference between the mean perpendicular distances of Line n and Line m is less than or equal to σ/2 (σ is the standard deviation of all mean perpendicular distances), then it is scored as 2. If the difference is greater than σ/2 and less than or equal to σ, then it is scored as 1. Otherwise, it is scored as 0.

Mean length of triangle edges
If the difference between the mean lengths of triangle edges of Line n and Line m is less than or equal to σ/2 (σ is the standard deviation of all mean lengths of triangle edges), then the matching is scored as 4. If the difference is greater than σ/2 and less than or equal to σ, then it is scored as 2. Otherwise, it is scored as 0.

Modified degree of connectivity
If the candidates have the same degree, then the matching is scored as 4. If there is just one degree of difference between the candidates, then it is scored as 2. Otherwise, it is scored as 0.
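Several of the criteria in Table 1 can be written as small scoring functions. The following sketch covers the rank-based Hausdorff rule, the two rules based on the standard deviation, and the connectivity rule; the orientation classes depend on Fig. 2 and are therefore left out. All function names are hypothetical, and the standard deviation is passed in as a parameter because the text does not say how it is estimated.

```python
def hausdorff_rank_scores(distances):
    """Score the candidates of one source line by closeness rank: 4, 2, 1, then 0.

    distances: {target_id: Hausdorff distance to the source line}
    """
    ranked = sorted(distances, key=distances.get)
    points = [4, 2, 1]
    return {tid: (points[i] if i < len(points) else 0) for i, tid in enumerate(ranked)}


def sigma_score(value_n, value_m, sigma, max_score):
    """Score a difference against the standard deviation of the measure.

    difference <= sigma/2 -> max_score; <= sigma -> max_score // 2; otherwise 0.
    Use max_score=2 for mean perpendicular distance and max_score=4 for the
    mean length of triangle edges.
    """
    diff = abs(value_n - value_m)
    if diff <= sigma / 2:
        return max_score
    if diff <= sigma:
        return max_score // 2
    return 0


def connectivity_score(degree_n, degree_m):
    """Same degree -> 4, one degree of difference -> 2, otherwise 0."""
    return {0: 4, 1: 2}.get(abs(degree_n - degree_m), 0)


# Made-up values for illustration.
print(hausdorff_rank_scores({"b1": 12.0, "b2": 5.0, "b3": 40.0, "b4": 60.0}))
print(sigma_score(10.0, 12.0, sigma=6.0, max_score=2))  # 2
print(connectivity_score(3, 4))                         # 2
```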
In the second stage, the matching process is repeated with the similarity scores updated (optimized) with the efficiency rates. This means that the score of a measure is increased if it is relatively more significant than the others. Finally, the candidates with the highest total similarity scores are determined as certain matches.
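The efficiency rates and the score update can be sketched in Python. Eq. 5 and Eq. 6 are not legible in this text, so the sketch rests on two assumptions: the efficiency ratio is the product of the normalized correct-match value (profit) and the normalized incorrect-match value (loss), which is zero in exactly the cases described for Eq. 5, and the exponential step raises 2 to that ratio, which yields rates between 1 and 2 as stated in the conclusion. The function names and the counts are hypothetical.

```python
def efficiency_rates(correct, incorrect):
    """Return {measure: rate} from per-measure correct and incorrect counts.

    Assumed form: rate = 2 ** (profit * loss), with Maximum-Minimum
    normalized profit (correct matches) and loss (incorrect matches).
    """
    c_min, c_max = min(correct.values()), max(correct.values())
    i_min, i_max = min(incorrect.values()), max(incorrect.values())
    rates = {}
    for measure in correct:
        # Assumption: if all measures tie, the normalized value is taken as 1.
        profit = (correct[measure] - c_min) / (c_max - c_min) if c_max > c_min else 1.0
        loss = (i_max - incorrect[measure]) / (i_max - i_min) if i_max > i_min else 1.0
        rates[measure] = 2 ** (profit * loss)
    return rates


def rescale(scores, rates):
    """Multiply every first-stage score by the rate of its measure."""
    return {
        pair: {measure: score * rates[measure] for measure, score in by_measure.items()}
        for pair, by_measure in scores.items()
    }


# Made-up counts for three measures.
correct = {"hausdorff": 90, "orientation": 85, "mean_perp": 80}
incorrect = {"hausdorff": 10, "orientation": 20, "mean_perp": 30}
rates = efficiency_rates(correct, incorrect)
print(rates["hausdorff"], rates["mean_perp"])  # 2.0 1.0
```

Under these assumptions the weakest measure keeps its original score (rate 1) instead of being disabled, and the strongest measure has its score doubled (rate 2).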

Study Area and Datasets
The proposed method was tested with tree-, cellular-, and hybrid-patterned road networks in Istanbul. We used data from different sources, namely Istanbul Metropolitan Municipality (IMM), two private navigation companies (Başarsoft and TomTom), and OSM, one of the popular VGI projects, to show how efficient the proposed approach is with different samples (Fig. 3, Table 2). Also, an additional matching process was conducted with a large amount of data covering the Bosphorus area of Istanbul to demonstrate its efficiency under realistic conditions (Fig. 4). In the Bosphorus area, major elevation differences exist from the coast to the outer boundary. Such local surface changes make the road networks complex and give the roads shapes similar to hybrid patterns.

Pre-processing
The source datasets have different coordinate systems. This difference affects the calculation of similarity negatively. For example, the objects in the Başarsoft, TomTom, and OSM datasets have geographical coordinates in the WGS84 datum. However, the measures used in the study are calculated in metric units. Therefore, the geographical coordinates of the objects were transformed into the ITRF96 datum (Gauss-Krüger projection, central meridian 30°, GRS80 ellipsoid), in which the IMM dataset is defined. Furthermore, the two road networks were aligned using a linear rubber-sheet transformation. Moreover, based on our previous matching experience with the source datasets and the study area, we set the distance threshold T to 85 m for the tree- and cellular-patterned road networks and 50 m for the hybrid-patterned road network.

Results and Evaluation
The results of the matching process were compared with the results of manual matching, and the numbers of correct and incorrect matches in Table 3 were then determined. The evaluation was performed both for all measures together and separately for each geometric and topological measure. In the first stage of the approach, some results differ according to the type of measure and road pattern. While the Hausdorff distance measure produced the maximum number of correct and the least number of incorrect matches in both tree and cellular patterns, its result in the hybrid pattern is different. In the hybrid pattern, mean perpendicular distance produced the maximum number of correct matches but also the largest number of incorrect matches. Therefore, we examined the results of the measures by using their correctness and incorrectness percentages (Table 3). The Hausdorff distance measure produced the maximum correctness and the minimum incorrectness in all patterns. Sinuosity and mean perpendicular distance gave the least correctness and the maximum incorrectness in the cellular pattern. Orientation was the second-best similarity measure in terms of both correct and incorrect matching in all patterns. From this point of view, it can be observed from Table 3 that mean perpendicular distance was the worst in all patterns. Similarly, mean length of triangle edges and modified degree of connectivity produced the least correctness and the most incorrectness in the hybrid pattern. However, these measures gave results similar to those of orientation and sinuosity in the tree pattern.
The similarity scores used in the first stage were optimized by the efficiency rates in Table 4, and the new similarity scores to be used in the second stage were calculated as in Table 5.
In the second stage, the relationships between the candidates were determined with the new similarity scores in Table 5 and the process was performed for the last time. Accordingly, while the proposed approach, with the updated (optimized) scores, produced almost the same number of matches as manual matching in tree and cellular patterns, some missing matches occurred in the hybrid pattern (Table 6). The missing matches are related to two parameters: (1) the matching capability of the approach and (2) the distance threshold. While the approach was common to all the source patterns, the distance threshold T was different in the hybrid pattern. Therefore, the likely reason for the missing matches in the hybrid pattern was T.
With the updated similarity scores, the number of correct matches increased by 4 and the number of incorrect matches decreased by 7 in tree-patterned roads. Although the number of incorrect matches decreased by 2 in cellular-patterned roads, the number of correct matches also decreased by 1. While there is no change in the number of correct matches in hybrid roads after second stage, the number of incorrect matches decreased by 8.
Checking the matching manually would have been too laborious with over a thousand corresponding matching pairs in the Bosphorus datasets. Therefore, after generating the final matching with the whole datasets, the correct and incorrect matches were determined by comparing randomly selected sample data with manual matching (Fig. 5). In Tables 6 and 7, the results are based on this sample of the Bosphorus datasets. Since the Bosphorus datasets consist of several types of patterns, it is better to examine matching instances separately according to pattern type. Fig. 6 shows correctly matched road lines with a cellular pattern. They were matched correctly in both the first and the second stage. Both this visual instance and Table 6 show that the second stage of the proposed method gives almost the same result as the first one in cellular-patterned road networks. Besides, while the northwest roads with a hybrid pattern in Fig. 7 were matched correctly, the southern one was a missing match. The possible reason is that the corresponding roads have quite different geometric properties such as sinuosity and centroid. Moreover, road 1 in Fig. 8 was matched incorrectly with road 2ˈ in both the first and second stages, since the geometric and topological properties of road 1 are more similar to those of road 2ˈ than to those of road 1ˈ. As expected, there were also instances showing that the second stage optimized the matching process by eliminating incorrect matches from the first stage. Road 1 in Fig. 9b was matched with three roads in the other datasets in the first stage. However, the matches with roads 2ˈ and 3ˈ were incorrect. In the second stage, the efficiency rates ensured the elimination of the incorrect matches.
Determining the accuracy of a matching study only by the correct matches is not sufficient. For example, suppose that in a study area there are 100 manually detected possible matches and a selected automated method produced only 10 matches.
If none of the 10 matches is incorrect, the method is considered to have worked with 100% correctness. However, according to manual matching, the method could not identify 90 matches. This shows that completeness should also be taken into account when assessing accuracy. Therefore, three frequently used statistical measures in data science, precision (Eq. 7), recall (Eq. 8), and F-measure (Eq. 9), were used to evaluate the proposed method (Samal et al., 2004; Song et al., 2011; Fan et al., 2016). Precision is the ratio of the number of correct matches to the total number of matches. Therefore, precision was used as the accuracy indicator. F-measure is an evaluation measure in which precision (accuracy) and recall (completeness) are combined in a balanced way. In the second stage of the method, the accuracy increased by 5.4%, 1.2%, 2.9%, and 4.4% in tree-, cellular-, and hybrid-patterned roads and the Bosphorus sample, respectively (Table 7). It can be said that the results are satisfactory in terms of accuracy.
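The worked example above (100 manual matches, 10 automated matches, none incorrect) can be checked with a short Python sketch. Eq. 7 to Eq. 9 are not legible in this text, so the sketch assumes the usual textbook definitions; in particular, the way the study counts matches for recall may differ from the assumption made here. The function names are hypothetical.

```python
def precision(correct, incorrect):
    """Share of the produced matches that are correct."""
    return correct / (correct + incorrect)


def recall(correct, manual_total):
    """Share of the manually detected matches that were found (assumed form)."""
    return correct / manual_total


def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


p = precision(correct=10, incorrect=0)   # 1.0, i.e. 100% correctness
r = recall(correct=10, manual_total=100)  # 0.1, i.e. 90 matches not identified
print(p, r, round(f_measure(p, r), 3))    # 1.0 0.1 0.182
```

The low F-measure in this example reflects the point made in the text: perfect correctness alone does not make a matching result complete.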
Recall is a measure of how complete the matching is. For instance, when Table 6 is examined carefully against the manual matching result, the proposed approach produced two more matches (over-matches) in the first stage and one missing match in the second stage with tree-patterned roads. As seen in Table 7, the completeness is 100% in the first stage and 98.9% in the second stage. This means that over-matches do not affect the value of the recall measure. This also indicates that the recall value cannot be a standalone measure for the evaluation, but can be used to interpret the accuracy. From this point of view, the recall value showed that the proposed approach ensured high completeness (almost fully complete). Therefore, the accuracy of the study is quite reliable. In hybrid-patterned roads, the recall value decreased in the second stage. This is because, while the number of incorrect matches decreased, the number of missing matches increased. Also, the F-measure increased by 3.1% and 0.7% in tree and cellular patterns. It did not change in the hybrid pattern since (1) the number of correct matches did not change, and (2) the incorrect matches removed in the second stage were added to the number of missing matches. The numbers of correct matches of the measures are close to each other (Table 3). Therefore, the numbers of correct matches alone are not distinctive. This assessment supports the proposed efficiency formula, in which the incorrect matches are also used. Moreover, the Hausdorff distance produced at least 3.5 times as many correct matches as incorrect matches (Table 3). The other measures produced many incorrect matches. Sinuosity and mean perpendicular distance performed the worst in the cellular pattern since most of the corresponding road lines have low curvature. The results show that some of the similarity measures are more important than others for the pattern type on which they are used.
For instance, in our experiments, while the Hausdorff distance was the best matcher for all patterns, the mean length of triangle edges was the worst matcher only for the hybrid pattern. This changing order between measures clearly supports the proposed approach, which optimizes the similarity scores using the efficiency rates.

CONCLUSION
This paper proposes a semi-automated approach for matching road objects in line geometry. Since the efficiency rates were determined for the tree-, cellular-, and hybrid-patterned road network datasets, the second stage of the proposed approach can be performed automatically for road networks with a similar pattern. For a road network with a different pattern, the efficiency rates must be recalculated, since the similarity measures have different correctness and incorrectness depending on the pattern type (Table 3). In addition, for datasets containing a large number of road objects, the efficiency rates can be calculated using small samples and then applied to the source datasets. In this case, after the efficiency rates are determined semi-automatically by a manual matching operator using randomly selected samples, the actual large data is matched automatically using these efficiency rates. To demonstrate the efficiency of the proposed approach, we conducted an additional matching process with OSM and TomTom road networks in the Bosphorus area of Istanbul. Since the Bosphorus networks were hybrid-patterned, the efficiency rates did not need to be computed again. This enables the matching process for roads with the same pattern to start directly from the second stage.
The use of Maximum-Minimum normalization and the exponential function enabled the efficiency rates to range between 1 and 2. Thus, even mean perpendicular distance, the least significant measure, was used in the similarity calculation.
The proposed approach does not use any semantic information to determine the similarity between objects. Instead, the similarities are calculated on the basis of scores based on geometric and topological measures. The optimization process updates the scores using the efficiency rates.
In this study, the scoring rules and the geometric and topological measures were taken from the study of Hacar and Gökgöz (2019b). However, the proposed approach can be adapted to different kinds of scoring rules using different geometric and topological measures that are specific to the characteristics of the source datasets.
The proposed approach has an F-measure over 86% in the hybrid-patterned Bosphorus datasets. The results are satisfactory in terms of accuracy and completeness. The experimental testing also shows that there is no need to conduct a second stage for the cellular-patterned road networks.
Computing the time of the matching process is a hard task since the process is conducted semi-automatically. The process time changes according to the experience of the matching operator in the manual matching stage. This may be a disadvantage that hinders the planning of geo-processing routines.