The purpose of this study is to classify 14 different animals using
different deep learning models. Deep learning, an area of artificial intelligence, has
been used widely in recent years, especially for advanced
image processing, voice recognition and natural language processing.
One of the most important reasons for its wide use in image
analysis is that it performs feature extraction on the image itself and
gives high-accuracy results. It learns by creating representations of each
image at different levels. Unlike other machine learning methods,
it does not need an expert to extract features from the images. Convolutional
Neural Network (CNN), which is the basic architecture of deep learning models, consists
of different layers: the convolution layer, ReLU layer, pooling layer and
fully connected layer. Deep learning models are designed using different numbers
of these layers. The AlexNet and VggNet models are used to classify 14
different animals: horse, camel, cow, goat, sheep, wolf, dog,
cat, deer, pig, bear, leopard, elephant and kangaroo. Animals that
are most likely to be encountered on the road while driving were selected, because
this work is intended as a preliminary study for the control of autonomous
vehicle driving. Color (RGB) images of the animals were collected from the
internet. To increase data diversity, images were also taken from
existing data sets. A total of 150 images were collected for each animal, 125 for
training and 25 for testing. Two different data sets were created, with
image dimensions of 224x224 and 227x227. As a result of the study,
the animals were classified with 91.2% accuracy by VggNet
and 67.65% by AlexNet. The high error rate of AlexNet is attributed to the small
number of layers in the network and the large parameter values chosen. For
example, the filter size in the first convolution layer of the AlexNet architecture is
11x11 and the stride is 4. This causes data loss when
information is transferred to the next layer. In contrast, VggNet has a
filter size of 3x3 and a stride of 1, so there is no data loss in the
transfer to the next layer.
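
The effect of filter size and stride described above can be illustrated with the standard formula for the spatial size of a convolution output, floor((n + 2p - k) / s) + 1. The sketch below is not the authors' code. The function name `conv_output_size` is hypothetical, and the padding values (0 for AlexNet, 1 for VggNet) and the pairing of the 227x227 images with AlexNet and the 224x224 images with VggNet are assumptions that the abstract does not state.

```python
def conv_output_size(input_size, kernel_size, stride, padding=0):
    """Spatial size of a convolution output along one axis.

    Uses the standard formula floor((n + 2p - k) / s) + 1.
    """
    return (input_size + 2 * padding - kernel_size) // stride + 1


# Assumed AlexNet first layer: 227x227 input, 11x11 filter, stride 4, no padding.
alexnet_out = conv_output_size(227, kernel_size=11, stride=4, padding=0)

# Assumed VggNet first layer: 224x224 input, 3x3 filter, stride 1, padding 1.
vggnet_out = conv_output_size(224, kernel_size=3, stride=1, padding=1)

print(f"AlexNet: 227 -> {alexnet_out}")  # AlexNet: 227 -> 55
print(f"VggNet:  224 -> {vggnet_out}")   # VggNet:  224 -> 224
```

Under these assumptions, the first AlexNet layer reduces each side of the image from 227 to 55 in a single step, while the first VggNet layer keeps the full 224x224 resolution, which is consistent with the explanation given for the difference in accuracy.
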
Primary Language | English |
---|---|
Journal Section | Articles |
Authors | |
Publication Date | April 20, 2018 |
Published in Issue | Year 2018 Volume: 7 Issue: 1 |
As of 2021, JNRS is licensed under a Creative Commons Attribution-NonCommercial 4.0 International Licence (CC BY-NC).