TY - JOUR
T1 - Lightweight Transformer Model for Agricultural Land Use and Land Cover Classification
AU - Çelik, Kemal
PY - 2025
DA - September
Y2 - 2025
DO - 10.15832/ankutbd.1624812
JF - Journal of Agricultural Sciences
JO - J Agr Sci-Tarim Bili
PB - Ankara University
WT - DergiPark
SN - 1300-7580
SP - 941
EP - 959
VL - 31
IS - 4
LA - en
AB - Monitoring agricultural land use with remote sensing imagery is essential for ensuring food security, estimating yields, and planning efficient exports; nonetheless, precise classification remains difficult because of the varied and evolving characteristics of agricultural environments. This research evaluates and optimizes advanced deep learning architectures, particularly Vision Transformer (ViT) models, for agricultural land-use classification. Specifically, we employed ViTBase-16 together with the lightweight models DeiT-Tiny and EfficientNet-B0, applying techniques such as model layer compression and advanced data augmentation (CutMix and Cutout) to achieve high accuracy while significantly reducing computational complexity. Evaluation was performed on three benchmark remote sensing datasets (EuroSAT, NWPU-RESISC45, and SIRI-WHU), which cover diverse spatial resolutions and agricultural classes relevant to practical monitoring. Findings indicate that the optimized ViT model is highly effective at capturing global spatial relationships, consistently achieving classification accuracy exceeding 99% on a newly assembled dataset of around 200 Google Earth image samples. Furthermore, for the first time in agricultural image classification, compressing the ViTBase model by pruning 50% of its layers significantly reduced complexity while maintaining competitive accuracy (97.9% on SIRI-WHU). The resulting models are particularly suitable for deployment on devices with limited computational resources, supporting real-world operational agricultural monitoring systems. This study highlights the transformative potential and practical utility of optimized transformer-based models, offering scalable and efficient solutions tailored to precision agriculture applications.
KW - Agriculture
KW - Data augmentation
KW - Deep learning
KW - Vision transformers
KW - Land use
CR - Ahmad M, Shabbir S, Roy S K, Hong D, Wu X, Yao J & Chanussot J (2022). Hyperspectral Image Classification - Traditional to Deep Models: A Survey for Future Prospects. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 15: 968–999. doi: 10.1109/JSTARS.2021.3133021
CR - Albarakati H M, Khan M A, Hamza A, Khan F, Kraiem N, Jamel L & Alroobaea R (2024). A Novel Deep Learning Architecture for Agriculture Land Cover and Land Use Classification from Remote Sensing Images Based on Network-Level Fusion of Self-Attention Architecture. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 17: 6338–6353. doi: 10.1109/JSTARS.2024.3369950
CR - Albert A, Kaur J & Gonzalez M C (2017). Using convolutional networks and satellite imagery to identify patterns in urban environments at a large scale. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Part F129685, 1357–1366. Association for Computing Machinery. doi: 10.1145/3097983.3098070
CR - Alzahrani M S & Alsaade F W (2023). Transform and Deep Learning Algorithms for the Early Detection and Recognition of Tomato Leaf Disease. Agronomy 13(5). doi: 10.3390/agronomy13051184
CR - Attri I, Awasthi L K, Sharma T P & Rathee P (2023). A review of deep learning techniques used in agriculture. Ecological Informatics, Vol. 77. Elsevier B.V. doi: 10.1016/j.ecoinf.2023.102217
CR - Basu S, Ganguly S, Mukhopadhyay S, DiBiano R, Karki M & Nemani R (2015). DeepSat - A learning framework for satellite imagery. GIS: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems, 03-06-November-2015. Association for Computing Machinery. doi: 10.1145/2820783.2820816
CR - Bazi Y, Bashmal L, Al Rahhal M M, Dayil R Al & Ajlan N Al (2021). Vision transformers for remote sensing image classification. Remote Sensing, 13(3): 1–20. doi: 10.3390/rs13030516
CR - Beyer L, Zhai X & Kolesnikov A (2022). Better plain ViT baselines for ImageNet-1k. Retrieved from http://arxiv.org/abs/2205.01580
CR - Bowles C, Chen L, Guerrero R, Bentley P, Gunn R, Hammers A & Rueckert D (2018). GAN Augmentation: Augmenting Training Data using Generative Adversarial Networks. Retrieved from http://arxiv.org/abs/1810.10863
CR - Celik F, Balik Sanli F & Bozic D (2024). Transformer Networks to Classify Weeds and Crops in High-Resolution Aerial Images From North East Serbia. Turkish Journal of Field Crops 29(2): 112–120. doi: 10.17557/tjfc.1511404
CR - Chen W, Du X, Yang F, Beyer L, Zhai X, Lin T-Y & Zhou D (2021). A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation. Retrieved from http://arxiv.org/abs/2112.09747
CR - Cheng G, Han J & Lu X (2017, October 1). Remote Sensing Image Scene Classification: Benchmark and State of the Art. Proceedings of the IEEE, Vol. 105, pp. 1865–1883. Institute of Electrical and Electronics Engineers Inc. doi: 10.1109/JPROC.2017.2675998
CR - Cubuk E D, Zoph B, Mane D, Vasudevan V & Le Q V (2019). AutoAugment: Learning augmentation strategies from data. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019-June, 113–123. IEEE Computer Society. doi: 10.1109/CVPR.2019.00020
CR - DeVries T & Taylor G W (2017). Improved Regularization of Convolutional Neural Networks with Cutout. Retrieved from http://arxiv.org/abs/1708.04552
CR - Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T & Houlsby N (2020). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. Retrieved from http://arxiv.org/abs/2010.11929
CR - Gandhar A, Gupta K, Pandey A K & Raj D (2024, June 1). Fraud Detection Using Machine Learning and Deep Learning. SN Computer Science, Vol. 5. Springer. doi: 10.1007/s42979-024-02772-x
CR - Gong B, Dai K, Shao J, Jing L & Chen Y (2023). Fish-TViT: A novel fish species classification method in multi water areas based on transfer learning and vision transformer. Heliyon 9(6). doi: 10.1016/j.heliyon.2023.e16761
CR - Guo N, Jiang M, Gao L, Li K, Zheng F, Chen X & Wang M (2023). HFCC-Net: A Dual-Branch Hybrid Framework of CNN and CapsNet for Land-Use Scene Classification. Remote Sensing, 15(20). doi: 10.3390/rs15205044
CR - Hamza A, Khan M A, Ur Rehman S, Albarakati H M, Alroobaea R, Baqasah A M & Masood A (2023). An Integrated Parallel Inner Deep Learning Models Information Fusion with Bayesian Optimization for Land Scene Classification in Satellite Images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 16: 9888–9903. doi: 10.1109/JSTARS.2023.3324494
CR - Hamza A, Khan M A, Ur Rehman S, Al-Khalidi M, Alzahrani A I, Alalwan N & Masood A (2024). A Novel Bottleneck Residual and Self-Attention Fusion-Assisted Architecture for Land Use Recognition in Remote Sensing Images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 17: 2995–3009. doi: 10.1109/JSTARS.2023.3348874
CR - Han K, Wang Y, Chen H, Chen X, Guo J, Liu Z & Tao D (2020). A Survey on Visual Transformer. doi: 10.1109/TPAMI.2022.3152247
CR - He K, Zhang X, Ren S & Sun J (2016a). Deep residual learning for image recognition. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2016-December, 770–778. IEEE Computer Society. doi: 10.1109/CVPR.2016.90
CR - He K, Zhang X, Ren S & Sun J (2016b). Deep residual learning for image recognition. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2016-December, 770–778. IEEE Computer Society. doi: 10.1109/CVPR.2016.90
CR - Helber P, Bischke B, Dengel A & Borth D (2019). EuroSAT: A novel dataset and deep learning benchmark for land use and land cover classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 12(7): 2217–2226. doi: 10.1109/JSTARS.2019.2918242
CR - Hinton G, Vinyals O & Dean J (2015). Distilling the Knowledge in a Neural Network. Retrieved from http://arxiv.org/abs/1503.02531
CR - Huete A, Didan K, Miura T, Rodriguez E P, Gao X & Ferreira L G (2002). Overview of the radiometric and biophysical performance of the MODIS vegetation indices. Retrieved from www.elsevier.com/locate/rse
CR - Jackson P T, Atapour-Abarghouei A, Bonner S, Breckon T & Obara B (2018). Style Augmentation: Data Augmentation via Style Randomization. Retrieved from http://arxiv.org/abs/1809.05375
CR - Joshi A, Pradhan B, Gite S & Chakraborty S (2023, April 1). Remote-Sensing Data and Deep-Learning Techniques in Crop Mapping and Yield Prediction: A Systematic Review. Remote Sensing, Vol. 15. MDPI. doi: 10.3390/rs15082014
CR - Khan S D & Basalamah S (2023). Multi-Branch Deep Learning Framework for Land Scene Classification in Satellite Imagery. Remote Sensing, 15(13). doi: 10.3390/rs15133408
CR - Khan S, Naseer M, Hayat M, Zamir S W, Khan F S & Shah M (2021). Transformers in Vision: A Survey. doi: 10.1145/3505244
CR - Li X & Li S (2022). Transformer Help CNN See Better: A Lightweight Hybrid Apple Disease Identification Model Based on Transformers. Agriculture (Switzerland), 12(6). doi: 10.3390/agriculture12060884
CR - Martins V S, Kaleita A L, Gelder B K, da Silveira H L F & Abe C A (2020). Exploring multiscale object-based convolutional neural network (multi-OCNN) for remote sensing image classification at high spatial resolution. ISPRS Journal of Photogrammetry and Remote Sensing, 168: 56–73. doi: 10.1016/j.isprsjprs.2020.08.004
CR - Maurício J, Domingues I & Bernardino J (2023, May 1). Comparing Vision Transformers and Convolutional Neural Networks for Image Classification: A Literature Review. Applied Sciences (Switzerland), Vol. 13. MDPI. doi: 10.3390/app13095521
CR - Miotto R, Wang F, Wang S, Jiang X & Dudley J T (2017). Deep learning for healthcare: Review, opportunities and challenges. Briefings in Bioinformatics, 19(6): 1236–1246. doi: 10.1093/bib/bbx044
CR - Nguyen T T, Hoang T D, Pham M T, Vu T T, Nguyen T H, Huynh Q T & Jo J (2020). Monitoring agriculture areas with satellite images and deep learning. Applied Soft Computing Journal, 95. doi: 10.1016/j.asoc.2020.106565
CR - Nie J, Yuan Y, Li Y, Wang H, Li J, Wang Y & Ercisli S (2024). Few-shot Learning in Intelligent Agriculture: A Review of Methods and Applications. Tarim Bilimleri Dergisi 30(2): 216–228. doi: 10.15832/ankutbd.1339516
CR - Papageorgiou E I, Markinos A T & Gemtos T A (2011). Fuzzy cognitive map based approach for predicting yield in cotton crop production as a basis for decision support system in precision agriculture application. Applied Soft Computing Journal, 11(4): 3643–3657. doi: 10.1016/j.asoc.2011.01.036
CR - Park S, Im J, Park S, Yoo C, Han H & Rhee J (2018). Classification and mapping of paddy rice by combining Landsat and SAR time series data. Remote Sensing, 10(3). doi: 10.3390/rs10030447
CR - Ranftl R, Bochkovskiy A & Koltun V (2021). Vision Transformers for Dense Prediction. Retrieved from https://github.com/intel-isl/DPT
CR - Reedha R, Dericquebourg E, Canals R & Hafiane A (2022). Transformer Neural Network for Weed and Crop Classification of High Resolution UAV Images. Remote Sensing, 14(3). doi: 10.3390/rs14030592
CR - Sandler M, Howard A, Zhu M, Zhmoginov A & Chen L C (2018). MobileNetV2: Inverted Residuals and Linear Bottlenecks. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 4510–4520. IEEE Computer Society. doi: 10.1109/CVPR.2018.00474
CR - Shastry K A, Sanjay H A & Deexith G (2017). Quadratic-radial-basis-function-kernel for classifying multi-class agricultural datasets with continuous attributes. Applied Soft Computing Journal, 58: 65–74. doi: 10.1016/j.asoc.2017.04.049
CR - Shin H, Jeon S, Seol Y, Kim S & Kang D (2023). Vision Transformer Approach for Classification of Alzheimer’s Disease Using 18F Florbetaben Brain Images. Applied Sciences (Switzerland), 13(6). doi: 10.3390/app13063453
CR - Singh G, Singh S, Sethi G & Sood V (2022). Deep Learning in the Mapping of Agricultural Land Use Using Sentinel-2 Satellite Data. Geographies, 2(4): 691–700. doi: 10.3390/geographies2040042
CR - Slavkovikj V, Verstockt S, De Neve W, Van Hoecke S & Van De Walle R (2015). Hyperspectral image classification with convolutional neural networks. MM 2015 - Proceedings of the 2015 ACM Multimedia Conference, 1159–1162. Association for Computing Machinery, Inc. doi: 10.1145/2733373.2806306
CR - Suh H K, IJsselmuiden J, Hofstee J W & van Henten E J (2018). Transfer learning for the classification of sugar beet and volunteer potato under field conditions. Biosystems Engineering, 174: 50–65. doi: 10.1016/j.biosystemseng.2018.06.017
CR - Suravarapu V K & Patil H Y (2023). Person Identification and Gender Classification Based on Vision Transformers for Periocular Images. Applied Sciences (Switzerland), 13(5). doi: 10.3390/app13053116
CR - Szegedy C, Vanhoucke V, Ioffe S, Shlens J & Wojna Z (2016). Rethinking the Inception Architecture for Computer Vision. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2016-December, 2818–2826. IEEE Computer Society. doi: 10.1109/CVPR.2016.308
CR - Tan M & Le Q V (2019). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. Retrieved from http://arxiv.org/abs/1905.11946
CR - Tang X, Zhang X, Liu F & Jiao L (2018). Unsupervised deep feature learning for remote sensing image retrieval. Remote Sensing, 10(8). doi: 10.3390/rs10081243
CR - Temenos A, Temenos N, Kaselimi M, Doulamis A & Doulamis N (2023). Interpretable Deep Learning Framework for Land Use and Land Cover Classification in Remote Sensing Using SHAP. IEEE Geoscience and Remote Sensing Letters, 20. doi: 10.1109/LGRS.2023.3251652
CR - Thakur P S, Chaturvedi S, Khanna P, Sheorey T & Ojha A (2023). Vision transformer meets convolutional neural network for plant disease classification. Ecological Informatics, 77. doi: 10.1016/j.ecoinf.2023.102245
CR - Thirumaladevi S, Veera Swamy K & Sailaja M (2023a). Remote sensing image scene classification by transfer learning to augment the accuracy. Measurement: Sensors, 25. doi: 10.1016/j.measen.2022.100645
CR - Thirumaladevi S, Veera Swamy K & Sailaja M (2023b). Remote sensing image scene classification by transfer learning to augment the accuracy. Measurement: Sensors, 25. doi: 10.1016/j.measen.2022.100645
CR - Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L & Polosukhin I (2017). Attention Is All You Need.
CR - Vohra R & Tiwari K C (2023). Land cover classification using multi-fusion based dense transpose convolution in fully convolutional network with feature alignment for remote sensing images. Earth Science Informatics, 16(1): 983–1003. doi: 10.1007/s12145-022-00891-8
CR - Xia Z, Pan X, Song S, Erran Li L & Huang G (2022). Vision Transformer with Deformable Attention. Retrieved from https://github.com/LeapLabTHU/DAT
CR - Yun S, Han D, Oh S J, Chun S, Choe J & Yoo Y (2019). CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features. Retrieved from http://arxiv.org/abs/1905.04899
CR - Zahra U, Khan M A, Alhaisoni M, Alasiry A, Marzougui M & Masood A (2024). An Integrated Framework of Two-Stream Deep Learning Models Optimal Information Fusion for Fruits Disease Recognition. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 17: 3038–3052. doi: 10.1109/JSTARS.2023.3339297
CR - Zhai X, Kolesnikov A, Houlsby N & Beyer L (2021). Scaling Vision Transformers. Retrieved from http://arxiv.org/abs/2106.04560
CR - Zhang H, Cisse M, Dauphin Y N & Lopez-Paz D (2017). mixup: Beyond Empirical Risk Minimization. Retrieved from http://arxiv.org/abs/1710.09412
CR - Zhang M, Lin H, Wang G, Sun H & Fu J (2018). Mapping paddy rice using a Convolutional Neural Network (CNN) with Landsat 8 datasets in the Dongting Lake Area, China. Remote Sensing, 10(11). doi: 10.3390/rs10111840
CR - Zhang X, Zhou X, Lin M & Sun J (2018). ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 6848–6856. IEEE Computer Society. doi: 10.1109/CVPR.2018.00716
CR - Zhao B, Zhong Y, Xia G S & Zhang L (2016). Dirichlet-derived multiple topic scene classification model for high spatial resolution remote sensing imagery. IEEE Transactions on Geoscience and Remote Sensing, 54(4): 2108–2123. doi: 10.1109/TGRS.2015.2496185
CR - Zhao S, Tu K, Ye S, Tang H, Hu Y & Xie C (2023, November 3). Land Use and Land Cover Classification Meets Deep Learning: A Review. Sensors (Basel, Switzerland), Vol. 23. doi: 10.3390/s23218966
CR - Zuo S, Xiao Y, Chang X & Wang X (2022). Vision transformers for dense prediction: A survey. Knowledge-Based Systems, 253. doi: 10.1016/j.knosys.2022.109552
UR - https://doi.org/10.15832/ankutbd.1624812
L1 - https://dergipark.org.tr/en/download/article-file/4542577
ER -
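The abstract above describes compressing ViTBase-16 by pruning 50% of its transformer layers before fine-tuning on land-use scenes. The sketch below is only a rough illustration of that idea, not the author's implementation: the timm model name vit_base_patch16_224, the evenly spaced block-selection rule, the keep_ratio parameter, and the 10-class head (e.g. for EuroSAT) are all assumptions introduced here.

```python
# Hedged sketch of the 50% layer-pruning idea summarized in the abstract.
# Assumptions (not from the paper): timm's vit_base_patch16_224 backbone,
# evenly spaced block selection, and a 10-class head (e.g. EuroSAT).
import torch
import torch.nn as nn
import timm


def build_pruned_vit(num_classes: int = 10,
                     keep_ratio: float = 0.5,
                     pretrained: bool = True) -> nn.Module:
    """Return a ViT-Base/16 whose encoder keeps only `keep_ratio` of its blocks."""
    model = timm.create_model("vit_base_patch16_224",
                              pretrained=pretrained,
                              num_classes=num_classes)
    blocks = list(model.blocks)                     # 12 transformer blocks
    n_keep = max(1, int(len(blocks) * keep_ratio))  # 6 blocks for keep_ratio=0.5
    # Keep evenly spaced blocks; the paper may select layers differently.
    keep_idx = torch.linspace(0, len(blocks) - 1, n_keep).round().long().tolist()
    model.blocks = nn.Sequential(*[blocks[i] for i in keep_idx])
    return model


if __name__ == "__main__":
    # Smoke test without downloading weights; use pretrained=True when fine-tuning.
    model = build_pruned_vit(num_classes=10, pretrained=False)
    x = torch.randn(2, 3, 224, 224)                 # batch of 224x224 RGB patches
    print(model(x).shape)                           # -> torch.Size([2, 10])
```

The pruned backbone would then be fine-tuned in the usual way (for example with the CutMix and Cutout augmentations the abstract mentions); dropping half the encoder blocks removes roughly half the encoder parameters and compute, which is the lightweight-deployment argument the abstract makes.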