Classification of Animals with Different Deep Learning Models

Özkan İnik; Bulent Turan

Abstract

The purpose of this study is to classify 14 different animals using different deep learning models. Deep learning, a subfield of artificial intelligence, has been used in a wide range of applications in recent years. It is used at an advanced level especially in image processing, speech recognition, and natural language processing. One of the most important reasons for its wide use in image analysis is that it performs feature extraction on the image itself and yields highly accurate results. It learns by building representations of each image at different levels. Unlike other machine learning methods, it does not require an expert to extract features from the images. The Convolutional Neural Network (CNN), the basic architecture of deep learning models, consists of different layers: the convolution layer, the ReLU layer, the pooling layer, and the fully connected layer. Deep learning models are designed using different numbers of these layers. The AlexNet and VggNet models were used to classify the 14 animals: Horse, Camel, Cow, Goat, Sheep, Wolf, Dog, Cat, Deer, Pig, Bear, Leopard, Elephant, and Kangaroo. The animals most likely to be encountered on the road while driving were selected, because this work is intended as a preliminary study for the control of autonomous vehicles. Color (RGB) images of the animals were collected from the internet. To increase data diversity, images were also taken from existing data sets. For each animal, a total of 150 images were collected: 125 for training and 25 for testing. Two data sets were created, one with images of 224x224 pixels and one with images of 227x227 pixels. The animals were classified with 91.2% accuracy by VggNet and 67.65% by AlexNet. The high error rate of AlexNet is attributed to the small number of layers in the network and the large parameter values chosen. For example, the filter size of the convolution layer in the AlexNet architecture is 11x11 and the stride is 4, which causes information loss when data is passed to the next layer. In contrast, VggNet uses a 3x3 filter with a stride of 1, so no such loss occurs in the transfer to the next layer.
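
The effect of the filter size and stride values quoted in the abstract can be illustrated with the standard formula for the spatial output size of a convolution, floor((n + 2p - k) / s) + 1. The sketch below is only an illustration, not the authors' code: the helper name `conv_output_size` is hypothetical, and the padding values (0 for the 11x11 filter, 1 for the 3x3 filter) and the pairing of each input size with a network are assumptions, since the abstract does not state them.

```python
def conv_output_size(input_size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial output size of a square convolution: floor((n + 2p - k) / s) + 1."""
    return (input_size + 2 * padding - kernel) // stride + 1


# 11x11 filter with stride 4, as quoted for AlexNet in the abstract.
# Assumptions: 227x227 input and no padding (not stated in the abstract).
alexnet_out = conv_output_size(227, kernel=11, stride=4, padding=0)

# 3x3 filter with stride 1, as quoted for VggNet in the abstract.
# Assumptions: 224x224 input and a padding of 1 (not stated in the abstract).
vgg_out = conv_output_size(224, kernel=3, stride=1, padding=1)

print(f"11x11 filter, stride 4: 227 -> {alexnet_out}")
print(f"3x3 filter, stride 1:  224 -> {vgg_out}")
```

Under these assumptions the larger filter and stride shrink the feature map to roughly a quarter of the input width in a single layer, while the 3x3 filter with stride 1 keeps the spatial size unchanged, which is one way to read the abstract's remark about information being lost between layers.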

Keywords

AlexNet, CNN, Classification of Animals, Deep Learning, VggNet

Details

Primary Language

English

Subjects

-

Journal Section

Research Article

Authors

Özkan İnik*
Türkiye

Bulent Turan

Publication Date

April 20, 2018

Submission Date

January 31, 2018

Acceptance Date

April 30, 2018

Published in Issue

Year 2018 Volume: 7 Number: 1

IZ

https://izlik.org/JA86ZT45MD

APA

İnik, Ö., & Turan, B. (2018). Classification of Animals with Different Deep Learning Models. Journal of New Results in Science, 7(1), 9-16. https://izlik.org/JA86ZT45MD
