DATA AUGMENTATION FOR A LEARNING-BASED VEHICLE MAKE-MODEL AND LICENSE PLATE MATCHING SYSTEM

The most important requirement for deep learning algorithms to run with a low error ratio is that training is carried out with a sufficient amount of data. Using synthetic data is one of the most common approaches when the dataset is not large enough for training. Synthetic data production must be based on a real dataset to improve the prediction and classification abilities of deep learning algorithms. Enriching an existing dataset with techniques such as creating modified copies of existing data is called data augmentation. Depending on the type of problem, it can be difficult to collect enough data, especially in image classification. In such cases, a dataset can be generated by duplicating and/or modifying existing pictures of the objects. In this study, data augmentation for a learning-based vehicle make-model and license plate matching system has been performed and a new vehicle image dataset has been generated. The approach used to create the dataset is presented in detail. The generated vehicle image dataset is available to developers as open source.


Introduction
In this study, all stages of generating a dataset that can be used for a vehicle make-model and license plate matching system are presented in detail. In general, learning-based software systems are used to monitor and identify vehicles in traffic instantly. One of the most essential requirements for learning-based prediction and identification systems to run with a low error ratio is that training is carried out with a sufficient amount of data [1][2][3]. A sufficient amount of data for learning and testing increases the accuracy of the training stage [4][5][6]. Hence, learning-based algorithms may become capable of detecting small details and differences [7][8][9]. Moreover, the quality and content of datasets differ according to the aim of each study. However, the main problem in this approach is capturing and storing different vehicle images to generate datasets. According to our literature review on vehicle make-model datasets, the existing datasets do not contain enough front-view and rear-view vehicle pictures for deep learning [10][11][12][13]. It has also been observed that only a few studies have focused on developing datasets of vehicle images.
In recent years, one of the most important innovations in software development has been deep learning, a machine learning method based on deep neural networks [14][15][16][17]. It is an artificial neural network (ANN) system that allows the created models to be trained to predict outputs from a given dataset. The main difference between a conventional ANN and deep learning is that deep learning needs a lot of data. In other words, deep learning is a more advanced form of machine learning that needs a lot of data to find and model correlations between inputs and outputs in the training stage. The other important point is that it creates new features by learning from the data itself. The difference between machine learning and deep learning for vehicle identification systems is shown as block diagrams in Figure 1. In machine learning, the features must be defined or created by the user before the training stage. In deep learning, on the other hand, the features are defined or created by the network itself. The deep learning system identifies and tags the extracted features and then produces outputs, again using the ANN.
The size of the dataset is one of the important points when designing such systems. A learning-based vehicle make-model and license plate matching system that uses deep learning methods must be based on digital images, so the number of data should be in the thousands or more, not in the hundreds. In this study, two different vehicle brands, Honda and Ford, referred to herein as H and F, are used. If data are to be classified as two different objects (vehicles, in this study), it is necessary to create datasets that contain an almost equal number of data. For example, assume that the H and F brands have 2000 and 5000 data in their own datasets, respectively. It is not possible to train and test deep learning algorithms accurately with datasets of this kind. Hence, datasets of approximately the same size should be used for the learning stage. This approach increases the performance of the algorithms and decreases the prediction error ratio.
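The balancing requirement can be illustrated with a small calculation. The sketch below is not part of the published pipeline; the function name `augmentation_plan` and the rule of raising every class to the size of the largest class are assumptions made for illustration. It uses the hypothetical counts from the example above (2000 images for H and 5000 for F).

```python
import math


def augmentation_plan(class_counts, target=None):
    """Return, per class, how many synthetic images are needed to reach `target`.

    `target` defaults to the size of the largest class (an assumption,
    not a rule stated in the paper).
    """
    if target is None:
        target = max(class_counts.values())
    plan = {}
    for name, count in class_counts.items():
        missing = max(target - count, 0)
        # Upper bound on new images to derive from each original image.
        per_image = math.ceil(missing / count) if count else 0
        plan[name] = {"missing": missing, "per_image": per_image}
    return plan


# Hypothetical class sizes taken from the example in the text.
plan = augmentation_plan({"H": 2000, "F": 5000})
print(plan)
```

Under these assumptions, the H class needs 3000 additional images (at most 2 new images per original), while the F class needs none.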
Increasing the size of the dataset does not mean that the success rate of the model will increase linearly. Model accuracy can remain stable, or it can improve only by very small amounts, such as one-thousandth. When the performance of the model is observed to improve only at this rate, the number of data in the datasets should be fixed once an obvious breakpoint has been determined. On the other hand, the opposite situation is also possible: the available data may not be sufficient to reach the desired performance. In this circumstance, the logical way is to use synthetic and/or augmented data for training and testing of the model. Thus, the training and testing performance of the model can be brought closer to the desired level, where possible. As a result, it is possible to generate datasets that include thousands of data with different quantities and contents.
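One possible way to locate such a breakpoint is sketched below. The function `find_breakpoint`, the threshold of one-thousandth, and the accuracy values are all hypothetical and serve only to illustrate the idea; they are not measurements from this study.

```python
def find_breakpoint(sizes, accuracies, min_gain=0.001):
    """Return the smallest dataset size after which every further
    accuracy gain is at most `min_gain`, or None if there is no plateau."""
    for i in range(len(sizes) - 1):
        later_gains = [
            accuracies[j + 1] - accuracies[j] for j in range(i, len(sizes) - 1)
        ]
        if all(gain <= min_gain for gain in later_gains):
            return sizes[i]
    return None


# Illustrative (made-up) learning curve: accuracy per dataset size.
sizes = [1000, 2000, 4000, 8000, 16000]
accuracies = [0.80, 0.88, 0.92, 0.9205, 0.9208]
print(find_breakpoint(sizes, accuracies))
```

With these made-up values, the gains after 4000 images stay below one-thousandth, so 4000 would be taken as the breakpoint.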
In this study, data augmentation for a learning-based vehicle make-model and license plate matching system has been performed and vehicle image datasets that belong to the F and the H brands have been generated. The generated datasets with augmented data are available to developers as open source. Developers can download them from the following website [18]. The paper is organized as follows. The preparation of datasets, data augmentation approaches, and properties of datasets are presented in Section II. In Section III, the new datasets are presented. Finally, the conclusion is presented in the last section.

Dataset Preparation
In this study, two datasets have been generated: one for the Focus (F) model of the Ford (F) brand from the 2012-2014 model years and one for the Civic (C) model of the Honda (H) brand from the 2016-2019 model years. Each dataset contains not only the front view of the vehicle but also the rear view. The block diagram of the data preparation steps is presented in Figure 2. As a first step, the low-resolution images obtained from the vehicle sales website were recorded automatically. As a second step, each image was tagged according to the year-brand-model pattern. For example, the tag 2012_2014_F_F_F was used for the front view of the F model of the F brand, and the tag 2012_2014_F_F_R was used for the rear view of the same vehicle. Likewise, the tag 2016_2019_H_C_F was used for the front view of the C model of the H brand, and the tag 2016_2019_H_C_R was used for the rear view of the same vehicle. Thus, 2 different classes were generated for each vehicle model. Each image downloaded from [19] and [20] has dimensions of 600x450 pixels.
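The tagging pattern can be expressed compactly in code. The following is a minimal sketch, assuming the tag fields are, in order, first model year, last model year, brand, model, and view; the class `VehicleTag` and the function `parse_tag` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VehicleTag:
    year_from: int
    year_to: int
    brand: str   # e.g. "F" (Ford) or "H" (Honda)
    model: str   # e.g. "F" (Focus) or "C" (Civic)
    view: str    # "F" for front view, "R" for rear view

    def __str__(self):
        return f"{self.year_from}_{self.year_to}_{self.brand}_{self.model}_{self.view}"


def parse_tag(tag):
    """Split a tag such as '2012_2014_F_F_R' back into its fields."""
    parts = tag.split("_")
    if len(parts) != 5:
        raise ValueError(f"unexpected tag format: {tag!r}")
    year_from, year_to, brand, model, view = parts
    if view not in ("F", "R"):
        raise ValueError(f"unknown view code: {view!r}")
    return VehicleTag(int(year_from), int(year_to), brand, model, view)


# The four classes described in the text.
tags = [
    str(VehicleTag(2012, 2014, "F", "F", "F")),
    str(VehicleTag(2012, 2014, "F", "F", "R")),
    str(VehicleTag(2016, 2019, "H", "C", "F")),
    str(VehicleTag(2016, 2019, "H", "C", "R")),
]
print(tags)
```

Keeping the tag in a structured form makes it straightforward to group images by brand, model, or view later on.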

Image augmentation process
In the literature, data augmentation, which produces synthetic data from one or more real data, is described as a morphological process [1,4,9,21]. In this process, new images are acquired by applying operations such as rotation, horizontal and vertical inversion, translation, scaling, shifting, conversion to grayscale, scrolling, Gauss noise (adding pixels), etc. to existing images [22]. Small datasets can be transformed into a larger training set with the new synthetic data created in this way. Approximately 40 random new images can be obtained from a single image. Thus, overlapping in the training stage can be prevented thanks to the images produced with this technique [14,17]. In this study, 10000 images were generated by using rotation, horizontal inversion, and vertical inversion techniques. These morphological processes can be explained as follows:
-Rotation: An important point to note is that the image dimensions may not be preserved after rotation. If an image is square, rotating it by 90 degrees keeps the image size the same. If it is rectangular, rotating it by 180 degrees keeps the size the same.
-Horizontal inversion (flip): It is the complete reversal of the image on the horizontal axis. For example, a vehicle image taken from the left side is converted to a right-side view as a mirror reflection.
-Vertical inversion (flip): It is the complete reversal of the image on the vertical axis. Images generated from the original image by rotation, vertical inversion, and horizontal inversion are presented in Figure 3.
-Translation: It involves moving the image in the X or Y direction (or both).
-Shifting: An image is shifted to the left or right by a selected pixel ratio on the x-axis. The part of the image that overflows is added back on the other side of the image.
-Convert to grayscale: All colored pixels are converted to grayscale. Images generated from the original image by shifting and conversion to grayscale are presented in Figure 4.
-Gauss noise (adding pixels): Gaussian noise with zero mean has data points at essentially all frequencies and effectively distorts high-frequency characteristics. Adding a suitable amount of noise can increase learning ability.
-Blurring: The filter used for blurring is a low-pass filter. It passes low frequencies and cuts high frequencies, so a blurred image does not have sharp edges. Thus, this kind of filter helps prevent the model from over-fitting. Images generated from the original image by adding noise (Gauss noise) and blurring are presented in Figure 5.
-Cropping: It is used to sample a random part of the original image. The cropped part is then scaled to the original size of the image. This method is commonly known as random cropping.
-Scaling: The image can be scaled outward or inward with this technique. The generated image may be larger than the original image. If this occurs, the new image must be resized to match the original image size.
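Several of the operations listed above can be sketched with NumPy array operations. The code below is an illustration only, not the implementation used to build the dataset: the function names, the luminance weights for grayscale conversion, the box filter used for blurring, the nearest-neighbour resize after cropping, and the random stand-in image are all assumptions.

```python
import numpy as np


def flip_horizontal(img):
    """Mirror the image left to right."""
    return img[:, ::-1]


def flip_vertical(img):
    """Turn the image upside down."""
    return img[::-1, :]


def shift_wrap(img, pixels):
    """Shift along the x-axis; the overflow re-enters on the other side."""
    return np.roll(img, pixels, axis=1)


def to_grayscale(img):
    """Weighted sum of the R, G, B channels (common luminance weights)."""
    gray = img[..., :3] @ np.array([0.299, 0.587, 0.114])
    return gray.round().astype(np.uint8)


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise and clip back to the 0..255 range."""
    noisy = img.astype(np.float64) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).round().astype(np.uint8)


def box_blur(img, k=3):
    """Simple k x k mean filter, a basic low-pass filter for RGB images."""
    pad = k // 2
    padded = np.pad(img.astype(np.float64),
                    ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return (out / (k * k)).round().astype(np.uint8)


def random_crop_resize(img, crop_h, crop_w, rng):
    """Take a random crop and scale it back with nearest-neighbour sampling."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    crop = img[top:top + crop_h, left:left + crop_w]
    rows = np.arange(h) * crop_h // h
    cols = np.arange(w) * crop_w // w
    return crop[rows][:, cols]


rng = np.random.default_rng(0)
# Random stand-in for a 600x450 RGB vehicle image (height 450, width 600).
img = rng.integers(0, 256, size=(450, 600, 3), dtype=np.uint8)

augmented = {
    "rot180": np.rot90(img, 2),          # size preserved for rectangles
    "hflip": flip_horizontal(img),
    "vflip": flip_vertical(img),
    "shift": shift_wrap(img, 60),
    "gray": to_grayscale(img),
    "noise": add_gaussian_noise(img, sigma=10.0, rng=rng),
    "blur": box_blur(img),
    "crop": random_crop_resize(img, 300, 400, rng),
}
for name, out in augmented.items():
    print(name, out.shape)
```

Note that `np.rot90(img)` (a 90-degree rotation) would change the shape of this rectangular image from 450x600 to 600x450, which is the size issue described under rotation.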

Clearing similar images
Because the images are generated randomly, similar synthetic images may be produced, and some images can even be exactly the same. Having similar images in the dataset may cause overlapping. Therefore, it is necessary to clear similar images from the datasets. In this study, similar images (size, scale, etc.) were removed from the datasets by using a recursive clearing algorithm which compares all generated images recursively.
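The details of the clearing algorithm are not given here, so the following is only a sketch of one way such a comparison could work. It is written iteratively rather than recursively, and the similarity measure (mean absolute difference of raw pixel bytes), the threshold, and the names `mean_abs_diff` and `remove_similar` are assumptions rather than the method used in this study.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference between two equally sized pixel buffers.

    Buffers of different size (or empty buffers) are treated as dissimilar.
    """
    if len(a) != len(b) or not a:
        return float("inf")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def remove_similar(images, threshold=2.0):
    """Keep an image only if it differs from every image kept so far.

    `images` maps a file name to its raw pixel bytes. The first image of
    each group of similar images is kept; later ones are dropped.
    """
    kept = {}
    for name, data in images.items():
        if all(mean_abs_diff(data, other) > threshold for other in kept.values()):
            kept[name] = data
    return kept


# Tiny hypothetical "images" of four pixels each.
images = {
    "a": bytes([10, 10, 10, 10]),
    "b": bytes([10, 10, 10, 11]),      # nearly identical to "a"
    "c": bytes([200, 200, 200, 200]),  # clearly different
    "d": bytes([10, 10, 10, 10]),      # exact copy of "a"
}
print(sorted(remove_similar(images)))
```

The pairwise comparison grows quadratically with the number of images, so a dataset of thousands of images may need a cheaper pre-filter before the full comparison.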

Tagging image names
Each augmented image was tagged with a name that matches the folder name throughout the datasets. Samples of augmented and tagged images are presented in Figure 6. As seen in Table 1, 10000 training, 1000 test, and 100 validation images were created in order to perform training, testing, and validation. Samples from the generated dataset are shown in Figure 7.
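The naming and splitting step could look like the sketch below. The file name pattern (tag plus a five-digit counter), the seeded random split, and the function names `tagged_names` and `split_dataset` are hypothetical; the subset sizes are the ones quoted from Table 1, and whether they apply per class or to the whole dataset is not specified in this section.

```python
import random


def tagged_names(tag, count, extension="jpg"):
    """Build file names such as '2012_2014_F_F_F_00001.jpg' (assumed pattern)."""
    return [f"{tag}_{i:05d}.{extension}" for i in range(1, count + 1)]


def split_dataset(names, n_train, n_test, n_val, seed=0):
    """Shuffle the names reproducibly and cut them into three disjoint subsets."""
    total = n_train + n_test + n_val
    if len(names) < total:
        raise ValueError(f"need {total} images, got {len(names)}")
    shuffled = list(names)
    random.Random(seed).shuffle(shuffled)
    return {
        "train": shuffled[:n_train],
        "test": shuffled[n_train:n_train + n_test],
        "validation": shuffled[n_train + n_test:total],
    }


names = tagged_names("2012_2014_F_F_F", 11100)
splits = split_dataset(names, n_train=10000, n_test=1000, n_val=100)
print({subset: len(files) for subset, files in splits.items()})
```

Fixing the seed keeps the split reproducible, so the same images end up in the same subset on every run.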

Conclusion
In this study, the steps of creating a dataset for a learning-based vehicle make-model and license plate matching system are presented. The generated dataset consists of 4 classes based on one model from each of the two vehicle brands. The dataset is organized by the brand and model of the vehicle, as well as by model year, front view, and rear view. The approach used to create the dataset is presented in detail. The generated vehicle image dataset is available to developers as open source.