Classification of Invoice Images By Using Convolutional Neural Networks

Ömer Arslan; Sait Ali Uymaz

doi:10.28979/jarnas.953634

Research Article

Year 2022, Volume 8, Issue 1, pp. 8-25, March 10, 2022

Cited By: 2

Abstract

As companies grow, both their workforce and the number of suppliers they work with increase. Spending on behalf of the company rises accordingly, and more invoices are generated. Because invoices must be retained for legal reasons, physical invoices are transferred to the digital environment. Large companies handle large numbers of invoices, so digitizing them demands considerable labor, and the number of possible data-entry errors grows with the number of invoices to be digitized. This paper aims to automate the transfer of invoices to the digital environment. Invoices belonging to four different templates were used. Invoice images taken from a bank system were used for the first time in this study to prepare the original invoice dataset. Two further datasets were obtained by applying preprocessing methods (zero-padding and brightness augmentation) to the original dataset. The invoice classification system, developed with the Convolutional Neural Network (CNN) architectures LeNet-5, VGG-19, and MobileNetV2, was trained on these three datasets. Preprocessing techniques such as correcting the curvature and aspect ratio of the invoices and image augmentation with variable brightness ratios were applied to create the datasets. The datasets created with these preprocessing techniques increased the classification success of the proposed models. With the proposed approach, invoice images were automatically classified according to their templates using CNN architectures. In the experiments, a classification success rate of 99.83% was achieved when training on the dataset produced by data augmentation.
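
The two preprocessing steps named in the abstract, zero-padding and brightness augmentation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the centered placement of the padded image, and the brightness factors are all assumptions chosen for illustration, and the curvature correction step is not shown.

```python
import numpy as np


def zero_pad_to_square(image: np.ndarray) -> np.ndarray:
    """Pad a grayscale image (H, W) with zeros so that it becomes square.

    The content is centered (an assumption), so its aspect ratio is
    preserved when the result is later resized to a fixed CNN input size.
    """
    height, width = image.shape
    side = max(height, width)
    top = (side - height) // 2
    left = (side - width) // 2
    padded = np.zeros((side, side), dtype=image.dtype)
    padded[top:top + height, left:left + width] = image
    return padded


def adjust_brightness(image: np.ndarray, factor: float) -> np.ndarray:
    """Scale pixel intensities by `factor` and clip to the 8-bit range."""
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)


def augment_with_brightness(image: np.ndarray,
                            factors=(0.6, 0.8, 1.2, 1.4)) -> list:
    """Return one brightness-shifted copy of `image` per factor.

    The default factors are hypothetical; the abstract does not state
    the brightness ratios that were actually used.
    """
    return [adjust_brightness(image, f) for f in factors]
```

In a pipeline of this kind, each scanned invoice would first be passed through `zero_pad_to_square`, then resized to the network's input size, and `augment_with_brightness` would be applied to the training images only, so that evaluation images remain unmodified.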

Keywords

Convolutional Neural Networks, Deep Learning, Image Classification, Invoice

References

Afzal, M. Z., Capobianco, S., Malik, M. I., Marinai, S., Breuel, T. M., Dengel, A., & Liwicki, M. (2015). Deepdocclassifier: Document classification with deep convolutional neural network. Paper presented at the 2015 13th international conference on document analysis and recognition (ICDAR).
Aloysius, N., & Geetha, M. (2017). A review on deep convolutional neural networks. Paper presented at the 2017 International Conference on Communication and Signal Processing (ICCSP).
Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32.
Brown, J. M. (2017). Predicting math test scores using k-nearest neighbor. Paper presented at the 2017 IEEE Integrated STEM Education Conference (ISEC).
Carvalho, T., De Rezende, E. R., Alves, M. T., Balieiro, F. K., & Sovat, R. B. (2017). Exposing computer generated images by eye’s region classification via transfer learning of VGG19 CNN. Paper presented at the 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA).
Casey, R., Ferguson, D., Mohiuddin, K., & Walach, E. (1992). Intelligent forms processing system. Machine Vision and Applications, 5(3), 143-155.
Chunhavittayatera, S., Chitsobhuk, O., & Tongprasert, K. (2006). Image registration using Hough transform and phase correlation. Paper presented at the 2006 8th International Conference Advanced Communication Technology.
Duda, R. O., & Hart, P. E. (1972). Use of the Hough transformation to detect lines and curves in pictures. Communications of the ACM, 15(1), 11-15.
Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., Shuai, B., . . . Cai, J. (2018). Recent advances in convolutional neural networks. Pattern Recognition, 77, 354-377.
Ha, P. S., & Shakeri, M. (2016). License Plate Automatic Recognition based on edge detection. Paper presented at the 2016 Artificial Intelligence and Robotics (IRANOPEN).
Kang, L., Kumar, J., Ye, P., Li, Y., & Doermann, D. (2014). Convolutional neural networks for document image classification. Paper presented at the 2014 22nd International Conference on Pattern Recognition.
Khan, M., & Mufti, N. (2016). Comparison of various edge detection filters for ANPR. Paper presented at the 2016 Sixth International Conference on Innovative Computing Technology (INTECH).
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. Paper presented at the Advances in neural information processing systems.
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
Liu, T., Fang, S., Zhao, Y., Wang, P., & Zhang, J. (2015). Implementation of training convolutional neural networks. arXiv preprint arXiv:1506.01195.
Nguyen, A.-D., Choi, S., Kim, W., Ahn, S., Kim, J., & Lee, S. (2019). Distribution Padding in Convolutional Neural Networks. Paper presented at the 2019 IEEE International Conference on Image Processing (ICIP).
O'Shea, K., & Nash, R. (2015). An introduction to convolutional neural networks. arXiv preprint arXiv:1511.08458.
Rawat, W., & Wang, Z. (2017). Deep convolutional neural networks for image classification: A comprehensive review. Neural computation, 29(9), 2352-2449.
Reghunath, A., Nair, S. V., & Shah, J. (2019). Deep learning based Customized Model for Features Extraction. Paper presented at the 2019 International Conference on Communication and Electronics Systems (ICCES).
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L.-C. (2018). Mobilenetv2: Inverted residuals and linear bottlenecks. Paper presented at the Proceedings of the IEEE conference on computer vision and pattern recognition.
Saxen, F., Werner, P., Handrich, S., Othman, E., Dinges, L., & Al-Hamadi, A. (2019). Face attribute detection with mobilenetv2 and nasnet-mobile. Paper presented at the 2019 11th International Symposium on Image and Signal Processing and Analysis (ISPA).
Shaha, M., & Pawar, M. (2018). Transfer learning for image classification. Paper presented at the 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA).
Sidhwa, H., Kulshrestha, S., Malhotra, S., & Virmani, S. (2018). Text extraction from bills and invoices. Paper presented at the 2018 International Conference on Advances in Computing, Communication Control and Networking (ICACCCN).
Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
Tang, Y. Y., Suen, C. Y., De Yan, C., & Cheriet, M. (1995). Financial document processing based on staff line and description language. IEEE transactions on systems, man, and cybernetics, 25(5), 738-754.
Tarawneh, A. S., Hassanat, A. B., Chetverikov, D., Lendak, I., & Verma, C. (2019). Invoice classification using deep features and machine learning techniques. Paper presented at the 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology (JEEIT).
Toğaçar, M., Cömert, Z., & Ergen, B. (2021). Intelligent skin cancer detection applying autoencoder, MobileNetV2 and spiking neural networks. Chaos, Solitons & Fractals, 144, 110714.
Wang, G., & Gong, J. (2019). Facial expression recognition based on improved LeNet-5 CNN. Paper presented at the 2019 Chinese Control And Decision Conference (CCDC).
Xia, Y., Cai, M., Ni, C., Wang, C., Shiping, E., & Li, H. (2019). A Switch State Recognition Method based on Improved VGG19 network. Paper presented at the 2019 IEEE 4th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC).
Zou, Y., Zhao, L., Qin, S., Pan, M., & Li, Z. (2020). Ship target detection and identification based on SSD_MobilenetV2. Paper presented at the 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC).

There are 29 citations in total.

Details

Primary Language	English
Subjects	Artificial Intelligence
Journal Section	Research Article
Authors	Ömer Arslan (0000-0003-3474-2988); Sait Ali Uymaz (0000-0003-2748-8483)
Early Pub Date	March 10, 2022
Publication Date	March 10, 2022
Submission Date	June 17, 2021
Published in Issue	Year 2022 Volume: 8 Issue: 1

Cite

APA	Arslan, Ö., & Uymaz, S. A. (2022). Classification of Invoice Images By Using Convolutional Neural Networks. Journal of Advanced Research in Natural and Applied Sciences, 8(1), 8-25. https://doi.org/10.28979/jarnas.953634
AMA	Arslan Ö, Uymaz SA. Classification of Invoice Images By Using Convolutional Neural Networks. JARNAS. March 2022;8(1):8-25. doi:10.28979/jarnas.953634
Chicago	Arslan, Ömer, and Sait Ali Uymaz. “Classification of Invoice Images By Using Convolutional Neural Networks”. Journal of Advanced Research in Natural and Applied Sciences 8, no. 1 (March 2022): 8-25. https://doi.org/10.28979/jarnas.953634.
EndNote	Arslan Ö, Uymaz SA (March 1, 2022) Classification of Invoice Images By Using Convolutional Neural Networks. Journal of Advanced Research in Natural and Applied Sciences 8 1 8–25.
IEEE	Ö. Arslan and S. A. Uymaz, “Classification of Invoice Images By Using Convolutional Neural Networks”, JARNAS, vol. 8, no. 1, pp. 8–25, 2022, doi: 10.28979/jarnas.953634.
ISNAD	Arslan, Ömer - Uymaz, Sait Ali. “Classification of Invoice Images By Using Convolutional Neural Networks”. Journal of Advanced Research in Natural and Applied Sciences 8/1 (March 2022), 8-25. https://doi.org/10.28979/jarnas.953634.
JAMA	Arslan Ö, Uymaz SA. Classification of Invoice Images By Using Convolutional Neural Networks. JARNAS. 2022;8:8–25.
MLA	Arslan, Ömer and Sait Ali Uymaz. “Classification of Invoice Images By Using Convolutional Neural Networks”. Journal of Advanced Research in Natural and Applied Sciences, vol. 8, no. 1, 2022, pp. 8-25, doi:10.28979/jarnas.953634.
Vancouver	Arslan Ö, Uymaz SA. Classification of Invoice Images By Using Convolutional Neural Networks. JARNAS. 2022;8(1):8-25.