Review

A Review On Object Detection And Tracking With Deep Learning Techniques

Year 2021, Issue: 25, 159 - 171, 31.08.2021
https://doi.org/10.31590/ejosat.878552

Abstract

Deep learning is an artificial intelligence approach that has recently gained popularity for its ability to minimize human error. Deep learning techniques can successfully detect and interpret patterns in many domains by exploiting large amounts of data. In particular, the rapid growth of labeled data in the field of image processing has made it necessary to turn to deep learning algorithms. As data in these areas keeps increasing, deep learning methods are used to extract useful information from big data and to give meaning to text, images, and audio files. In recent years, the number of studies on object detection and object tracking has grown. Extracting meaningful information is harder when an object must be tracked after detection and analysis in non-stationary imagery such as video. In such cases, deep learning algorithms allow image processing problems to be solved more easily. The aim of this study is to examine applications of deep learning for object detection and tracking, to describe the latest developments, and to help researchers entering this field by providing information about popular libraries, datasets, and algorithms.
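To make the tracking-by-detection idea concrete, the sketch below implements a greedy IoU-based association step in Python. It is an illustrative simplification, not code from any of the reviewed works: methods such as SORT (Bewley et al., 2016) additionally use a Kalman-filter motion model and optimal (Hungarian) assignment, and the bounding boxes in the example are invented values.

```python
# Minimal sketch: associate per-frame detections with existing tracks by
# bounding-box IoU. A simplified stand-in for the matching step used in
# tracking-by-detection methods such as SORT (Bewley et al., 2016).
# All box coordinates below are made-up example values.

from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def associate(tracks: Dict[int, Box], detections: List[Box],
              iou_threshold: float = 0.3) -> Dict[int, Box]:
    """Greedily match each track to the best unclaimed detection;
    unmatched detections start new tracks, unmatched tracks are dropped."""
    updated: Dict[int, Box] = {}
    used = set()
    for track_id, track_box in tracks.items():
        best_j, best_iou = None, iou_threshold
        for j, det in enumerate(detections):
            if j in used:
                continue
            score = iou(track_box, det)
            if score >= best_iou:
                best_j, best_iou = j, score
        if best_j is not None:
            updated[track_id] = detections[best_j]
            used.add(best_j)
    next_id = max(tracks, default=0) + 1
    for j, det in enumerate(detections):
        if j not in used:
            updated[next_id] = det
            next_id += 1
    return updated

if __name__ == "__main__":
    # Frame t: two tracked objects; frame t+1: detector output (example values).
    tracks = {1: (10, 10, 50, 50), 2: (100, 80, 140, 160)}
    detections = [(12, 11, 52, 49), (300, 300, 340, 360)]
    print(associate(tracks, detections))
    # Track 1 is matched to the nearby detection, the far-away detection
    # starts a new track, and track 2 is dropped (no match this frame).
```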

References

  • Amidi, A., Amidi, S. (2020). Derin Öğrenme El Kitabı. Retrieved from https://stanford.edu/~shervine/l/tr/teaching/cs-229/cheatsheet-deep-learning.
  • Avidan, S. (2004). Support Vector Tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1064-1072, doi: 10.1109/TPAMI.2004.53.
  • Avila, S., Thome, N., Cord, M., Valle, E., De A. Araújo, A. (2013). Pooling in image representation: The visual codeword point of view. Computer Vision and Image Understanding, 453-465, doi: 10.1016/j.cviu.2012.09.007.
  • Bewley, A., Ge, Z., Ott, L., Ramos, F., Upcroft, B. (2016). Simple online and realtime tracking. Proceedings - International Conference on Image Processing, ICIP, 3464–3468, doi: 10.1109/ICIP.2017.8296962.
  • Bolouri, H. (1995). Book Review: Fundamentals of Neural Networks — Architectures, Algorithms, and Applications: L. FAUSETT. International Journal of Electrical Engineering Education, doi: 10.1177/002072099503200320.
  • Brocardo, M., Traore, I., Woungang, I., Obaidat, M. (2017). Authorship verification using deep belief network systems. International Journal of Communication Systems, doi: 10.1002/dac.3259.
  • Brunetti, A., Buongiorno, D., Trotta, G., Bevilacqua, V. (2018). Computer vision and deep learning techniques for pedestrian detection and tracking: A survey. Neurocomputing, doi: 10.1016/j.neucom.2018.01.092.
  • CBINSIGHTS. (2019). The Race For AI: Here Are The Tech Giants Rushing To Snap Up Artificial Intelligence Startups. Retrieved 2020 from CBINSIGHTS.
  • Chaudhary, S., Khan, M., Bhatnagar, C. (2018). Multiple Anomalous Activity Detection in Videos. Procedia Computer Science, 336-345, doi: 10.1016/j.procs.2017.12.045.
  • Cheng, X., Song, C., Gu, Y., Chen, B. (2020). Learning Attention for Object Tracking with Adversarial Learning Network, doi: 10.21203/rs.3.rs-15512/v3.
  • Ciaparrone, G., Luque Sánchez, F., Tabik, S., Troiano, L., Tagliaferri, R., Herrera, F. (2020). Deep learning in video multi-object tracking: A survey. Neurocomputing, 61-88, doi: 10.1016/j.neucom.2019.11.023.
  • Collobert, R., Farabet, C., Kavukcuoğlu, K. (2017). Torch | Scientific computing for LuaJIT. NIPS Workshop on Machine Learning Open Source Software.
  • Cortes, C., Vapnik, V. (1995). Support-Vector Networks. Machine Learning, doi: 10.1023/A:1022627411411.
  • Cömert, O., Hekim, M., Adem, K. (2019). Faster R-CNN Kullanarak Elmalarda Çürük Tespiti. Uluslararası Mühendislik Araştırma ve Geliştirme Dergisi, 335-341, doi: 10.29137/umagd.469929.
  • Daş, R., Polat, B., Tuna, G. (2019). Derin Öğrenme ile Resim ve Videolarda Nesnelerin Tanınması ve Takibi. Fırat Üniversitesi Müh. Bil. Dergisi, 571-581, doi: 10.35234/fumbd.608778.
  • DeepLearning4j. (2020). Retrieved from https://deeplearning4j.org/.
  • Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Li, F. (2009). ImageNet: a Large-Scale Hierarchical Image Database. 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 20-25, doi: 10.1109/cvpr.2009.5206848.
  • Deng, L., Yu, D. (2013). Deep learning: Methods and applications. Foundations and Trends in Signal Processing, 197-387, doi: 10.1561/2000000039.
  • Deori, B., Meitei, D. (2014). A survey on moving object tracking in video. International Journal on Information Theory, 31-46, doi: 10.5121/ijit.2014.3304.
  • Everingham, M., Van Gool, L., Williams, C., Winn, J., Zisserman, A. (2008). The PASCAL Visual Object Classes Challenge 2008 (VOC) Results. Retrieved from http://www.pascal-network.org/challenges/VOC/voc2008/workshop/index.html.
  • Fukushima, K. (1980). Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 193-202, doi: 10.1007/BF00344251.
  • Geiger, A., Lenz, P., Urtasun, R. (2012). Are we ready for autonomous driving? the KITTI vision benchmark suite. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 3354-3361, doi: 10.1109/CVPR.2012.6248074.
  • Gibney, E. (2016). Google AI algorithm masters ancient game of Go. Nature, doi: 10.1038/529445a.
  • Girshick, R. (2015). Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision (ICCV), 1440-1448, doi: 10.1109/ICCV.2015.169.
  • Girshick, R., Donahue, J., Darrell, T., Malik, J. (2016). Region-Based Convolutional Networks for Accurate Object Detection and Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 142-158, doi: 10.1109/TPAMI.2015.2437384.
  • Griffin, G., Holub, A., Perona, P. (2007). Caltech-256 object category dataset. Caltech mimeo.
  • Hanbay, K., Üzen, H. (2017). Nesne tespit ve takip metotları: Kapsamlı bir derleme. Tr. Doğa ve Fen Dergisi.
  • Havaei, M., Davy, A., Warde-Farley, D., Biard, A., Courville, A., Bengio, Y., . . . Larochelle, H. (2017). Brain tumor segmentation with Deep Neural Networks. Medical Image Analysis, 18-31, doi: 10.1016/j.media.2016.05.004.
  • He, K., Gkioxari, G., Dollar, P., Girshick, R. (2017). Mask R-CNN. Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2961-2969, doi: 10.1109/ICCV.2017.322.
  • Hinton, G., Osindero, S., Teh, Y.-W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 1527-1554, doi: 10.1162/neco.2006.18.7.1527.
  • Hochreiter, S., Schmidhuber, J. (1997). LONG SHORT-TERM MEMORY. Neural Computation, 1735-1780, doi: 10.1162/neco.1997.9.8.1735.
  • Iliadis, M., Spinoulas, L., Katsaggelos, A. (2020). DeepBinaryMask: Learning a binary mask for video compressive sensing. Digital Signal Processing: A Review Journal, doi: 10.1016/j.dsp.2019.102591.
  • Ivakhnenko, A., Lapa, V. (1965). Cybernetic predicting devices. CCM Information Corporation.
  • Jacob, A., Anitha, J. (2012). Inspection of various object tracking techniques. International Journal of Engineering and Innovative Technology, 118-124, doi: 10.17605/OSF.IO/Y5K3H.
  • Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., . . . Darrell, T. (2014). Caffe: Convolutional Architecture for Fast Feature Embedding. Proceedings of the ACM International Conference on Multimedia, doi: 10.1145/2647868.2654889.
  • Keras: The Python Deep Learning API. (2020). Retrieved from https://keras.io/.
  • Kim, D. (2020). Deeplearning Method For Voice Recognition Model And Voice Recognition Device Based On Artificial Neural Network.
  • Kim, S., Nam, J., Ko, B. (2018). Online Tracker Optimization for Multi-Pedestrian Tracking Using a Moving Vehicle Camera. IEEE Access, 48675-48687, doi: 10.1109/ACCESS.2018.2867621.
  • Leijnen, S., Veen, F. (2020). The Neural Network Zoo. Proceedings, 9.
  • Li, D., Wang, R., Xie, C., Liu, L., Zhang, J., Li, R., . . . Liu, W. (2020). A recognition method for rice plant diseases and pests video detection based on deep convolutional neural network. Sensors, Switzerland, doi: 10.3390/s20030578.
  • Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., . . . Zitnick, C. (2014). Microsoft COCO: Common Objects in Context. European Conference on Computer Vision, 740-755.
  • Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., Berg, A. C. (2016). SSD: Single Shot MultiBox Detector. European Conference on Computer Vision, 21-37, doi: 10.1007/978-3-319-46448-0_2.
  • Liu, W., Wang, Z., Liu, X., Zeng, N., Liu, Y., Alsaadi, F. (2017). A survey of deep neural network architectures and their applications. Neurocomputing, doi: 10.1016/j.neucom.2016.12.038.
  • Lu, Y., Lu, C., Tang, C. (2017). Online Video Object Detection Using Association LSTM. Proceedings of the IEEE International Conference on Computer Vision, 2344-2352, doi: 10.1109/ICCV.2017.257.
  • Luo, W., Xing, J., Milan, A., Zhang, X., Liu, W., Zhao, X., Kim, T.-K. (2014). Multiple Object Tracking: A Literature Review. Computer Vision and Pattern Recognition, doi: 10.1016/j.artint.2020.103448.
  • Everingham, M., Van Gool, L., Williams, C., Winn, J., Zisserman, A. (2010). The PASCAL Visual Object Classes (VOC) Challenge. International Journal of Computer Vision, 303-338, doi: 10.1007/s11263-009-0275-4.
  • McCulloch, W., Pitts, W. (1988). A logical calculus of the ideas immanent in nervous activity. Neurocomputing, 15-27, doi: 10.1016/s0092-8240(05)80006-0.
  • Medsker, L., Jain, L. (2001). Recurrent Neural Network Design and Applications. Boca Raton London New York Washington, D.C.: CRC Press.
  • Mikada, T., Kanno, T., Kawase, T., Miyazaki, T., Kawashima, K. (2020). Suturing Support by Human Cooperative Robot Control Using Deep Learning. IEEE Access, 167739-167746, doi: 10.1109/ACCESS.2020.3023786.
  • Moreira, D., Avila, S., Perez, M., Moraes, D., Testoni, V., Valle, E., . . . Rocha, A. (2016). Pornography classification: the hidden clues. Forensic Science International, 46-61, doi: 10.1016/j.forsciint.2016.09.010.
  • Nawaratne, R., Alahakoon, D., De Silva, D., Yu, X. (2020). Spatiotemporal anomaly detection using deep learning for real-time video surveillance. IEEE Transactions on Industrial Informatics, doi: 10.1109/TII.2019.2938527.
  • NVIDIA DIGITS. (2020). NVIDIA Developer. Retrieved from https://developer.nvidia.com/digits.
  • Omeroglu, A., Kumbasar, N., Oral, E., Ozbek, I. (2019). Mask R-CNN Algoritması ile Hangar Tespiti. 27th Signal Processing and Communications Applications Conference (SIU), doi: 10.1109/siu.2019.8806552.
  • O’Shea, K., Nash, R. (2015). An Introduction to Convolutional Neural Networks. Neural and Evolutionary Computing.
  • Ojha, S., Sakhare, S. (2015). Image processing techniques for object tracking in video surveillance-a survey. In Pervasive Computing (ICPC), 2015 International Conference on, 1-6, doi: 10.1109/PERVASIVE.2015.7087180.
  • Ozbaysar, E., Borandag, E. (2018). Vehicle plate tracking system. 26th IEEE Signal Processing and Communications Applications Conference, 1-4, doi: 10.1109/SIU.2018.8404648.
  • Perez, M., Avila, S., Moreira, D., Moraes, D., Testoni, V., Valle, E., . . . Rocha, A. (2017). Video pornography detection through deep learning techniques and motion information. Neurocomputing, 279-293, doi: 10.1016/j.neucom.2016.12.017.
  • Prasad, P., Pathak, R., Gunjan, V., Rao, H. (2019). Deep Learning Based Representation for Face Recognition. ICCCE, 419-424, doi: 10.1007/978-981-13-8715-9_50.
  • Redmon, J., Divvala, S., Girshick, R., Farhadi, A. (2016). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 779-788, doi: 10.1109/CVPR.2016.91.
  • Ren, S., He, K., Girshick, R., Sun, J. (2016). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1137-1149, doi: 10.1109/TPAMI.2016.2577031.
  • Russell, B., Torralba, A., Murphy, K., Freeman, W. (2008). LabelMe: A database and web-based tool for image annotation. International Journal of Computer Vision, 157-173, doi: 10.1007/s11263-007-0090-8.
  • Salakhutdinov, R., Hinton, G. (2009). Replicated softmax: An undirected topic model. Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference, 1607-1614.
  • Saleem, M., Potgieter, J., Arif, K. (2019). Plant Disease Detection and Classification by Deep Learning. Plants, 8(11), 468, doi: 10.3390/plants8110468.
  • Sarkar, M., Bruyn, A. (2020). LSTM Response Models for Direct Marketing Analytics: Replacing Feature Engineering with Deep Learning. Journal of Interactive Marketing (forthcoming), doi: 10.2139/ssrn.3601025.
  • Sejnowski, T., Rosenberg, C. R. (1986). NETtalk: a parallel network that learns to read aloud. The Johns Hopkins University Electrical Engineering and Computer Science Technical Report.
  • Shetty, D., Varma, J., Navi, S., Ahmed, M. (2020). Diving Deep into Deep Learning: History, Evolution, Types and Applications. International Journal of Innovative Technology and Exploring Engineering, doi: 10.35940/ijitee.A4865.019320.
  • Shotton, J., Winn, J., Rother, C., Criminisi, A. (2006). TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-class Object Recognition and Segmentation. European Conference on Computer Vision, 1-15, doi: 10.1007/11744023_1.
  • Singla, Z., Randhawa, S., Jain, S. (2017). Statistical and sentiment analysis of consumer product reviews. 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT), 1-6, doi: 10.1109/ICCCNT.2017.8203960.
  • Sorin, V., Barash, Y., Konen, E., Klang, E. (2020). Deep Learning for Natural Language Processing in Radiology—Fundamentals and a Systematic Review. Journal of the American College of Radiology, 639-648.
  • Sun, W., Zheng, B., Qian, W. (2017). Automatic feature learning using multichannel ROI based on deep structured algorithms for computerized lung cancer diagnosis. Computers in Biology and Medicine, 530-539, doi: 10.1016/j.compbiomed.2017.04.006.
  • Şeker, A., Diri, B., Balık, H. (2017). Derin Öğrenme Yöntemleri ve Uygulamaları Hakkında Bir İnceleme. Gazi Mühendislik Bilimleri Dergisi, 47-64.
  • TensorFlow. (2020). Retrieved from https://www.tensorflow.org/.
  • Theano Development Team. (2016). Theano: A Python framework for fast computation of mathematical expressions. arXiv e-prints, vol. abs/1605.02688.
  • Toğaçar, M., Ergen, B. (2019). Biyomedikal Görüntülerde Derin Öğrenme ile Mevcut Yöntemlerin Kıyaslanması. Fırat Üniversitesi Mühendislik Bilimleri Dergisi.
  • Toğaçar, M., Ergen, B., Sertkaya, M. (2019). Zatürre Hastalığının Derin Öğrenme Modeli ile Tespiti. Fırat Üniversitesi Mühendislik Bilimleri Dergisi.
  • Torralba, A., Fergus, R., Freeman, W. (2008). 80 million tiny images: A large data set for nonparametric object and scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1958-1970, doi: 10.1109/TPAMI.2008.128.
  • Turing, A. (2012). Computing machinery and intelligence. Machine Intelligence: Perspectives on the Computational Model, 433-460, doi: 10.1093/mind/lix.236.433.
  • Vignesh Kanna, J., Ebenezer Raj, S., Meena, M., Meghana, S., Mansoor Roomi, S. (2020). Deep Learning Based Video Analytics for Person Tracking. International Conference on Emerging Trends in Information Technology and Engineering, ic-ETITE 2020, doi: 10.1109/ic-ETITE47903.2020.173.
  • Viola, P., Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, doi: 10.1109/cvpr.2001.990517.
  • Von Ahn, L., Dabbish, L. (2004). Labeling images with a computer game. Conference on Human Factors in Computing Systems - Proceedings, 319-326, doi: 10.1145/985692.985733.
  • Wang, D. (1998). Unsupervised video segmentation based on watersheds and temporal tracking. IEEE Transactions on Circuits and Systems for Video Technology, 539-546, doi: 10.1109/76.718501.
  • Yang, S., Bailey, E., Yang, Z., Ostrometzky, J., Zussman, G., Seskar, I., Kostic, Z. (2020). COSMOS Smart Intersection: Edge Compute and Communications for Bird's Eye Object Tracking. IEEE Annual Conference on Pervasive Computing and Communications Workshops (PerCom), 1-7, doi: 10.1109/PerComWorkshops48775.2020.9156225.
  • Yao, B., Yang, X., Zhu, S. (2007). Introduction to a large-scale general purpose ground truth database: Methodology, annotation tool and benchmarks. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 169-183, doi: 10.1007/978-3-540-74198-5_14.
  • You, L., Li, Y., Wang, Y., Zhang, J., Yang, Y. (2016). A deep learning based RNNs model for automatic security audit of short messages. 2016 16th International Symposium on Communications and Information Technologies, ISCIT 2016, doi: 10.1109/ISCIT.2016.7751626.
  • Yousefi-Azar, M., Hamey, L. (2017). Text summarization using unsupervised deep learning. Expert Systems with Applications, doi: 10.1016/j.eswa.2016.10.017.
  • Yu, F., Li, W., Li, Q., Liu, Y., Shi, X., Yan, J. (2016). POI: Multiple object tracking with high performance detection and appearance feature. European Conference on Computer Vision, 36-42, doi: 10.1007/978-3-319-48881-3_3.
  • Yuret, D. (2016). Julia ve Knet ile Derin Öğrenmeye Giriş. Retrieved from http://www.denizyuret.com/2016/09/julia-ve-knet-ile-derin-ogrenmeye-giris.html.
  • Zhang, L., Gray, H., Ye, X., Collins, L., Allinson, N. (2019). Automatic individual pig detection and tracking in pig farms. Sensors (Switzerland), doi: 10.3390/s19051188.
  • Zhao, D., Fu, H., Xiao, L., Wu, T., Dai, B. (2018). Multi-object tracking with correlation filter for autonomous vehicle. Sensors (Switzerland), doi: 10.3390/s18072004.

There are 90 citations in total.

Details

Primary Language Turkish
Subjects Engineering
Journal Section Articles
Authors

Fatma Gülşah Tan 0000-0002-2748-0396

Asım Sinan Yüksel 0000-0003-1986-5269

Erdal Aydemir 0000-0003-4834-725X

Mevlüt Ersoy 0000-0003-2963-7729

Publication Date August 31, 2021
Published in Issue Year 2021 Issue: 25

Cite

APA Tan, F. G., Yüksel, A. S., Aydemir, E., Ersoy, M. (2021). Derin Öğrenme Teknikleri İle Nesne Tespiti Ve Takibi Üzerine Bir İnceleme. Avrupa Bilim Ve Teknoloji Dergisi(25), 159-171. https://doi.org/10.31590/ejosat.878552
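For BibTeX users, the record below is assembled from the metadata on this page (the journal also offers BibTeX export); the entry key tan2021nesne is an arbitrary choice, and the non-ASCII characters assume a Unicode-aware setup such as biblatex with biber.

```bibtex
@article{tan2021nesne,
  author  = {Tan, Fatma Gülşah and Yüksel, Asım Sinan and Aydemir, Erdal and Ersoy, Mevlüt},
  title   = {Derin Öğrenme Teknikleri İle Nesne Tespiti Ve Takibi Üzerine Bir İnceleme},
  journal = {Avrupa Bilim ve Teknoloji Dergisi},
  year    = {2021},
  number  = {25},
  pages   = {159--171},
  doi     = {10.31590/ejosat.878552}
}
```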