Research Article

Machine Learning–Based Fault Diagnosis in Solar Photovoltaic Systems Using Data Balancing Techniques

Year 2025, Volume: 5 Issue: 3, 168 - 175, 30.10.2025
https://doi.org/10.5152/tepes.2025.25038
https://izlik.org/JA89BX27NW

Abstract

As solar energy adoption continues to rise, the demand for reliable photovoltaic (PV) systems has increased. Efficient and safe operation of PV systems depends on accurate fault detection, which makes fault diagnosis a critical research area. This study investigates the diagnosis of short-circuit faults in PV systems by combining machine learning algorithms with data balancing techniques. Four classifiers were employed for fault classification: Random Forest, CatBoost, Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LGBM). Three oversampling methods were used to address class imbalance: Synthetic Minority Oversampling Technique (SMOTE), Random Oversampling, and Adaptive Synthetic Sampling (ADASYN). Two datasets were analyzed: Dataset-1 with 11 features and Dataset-2 with 13 features. For Dataset-1, LGBM achieved the highest accuracy (79.28%) on the imbalanced data, which improved to 86.59% after applying SMOTE, a gain of 7.31 percentage points. With the two additional features in Dataset-2, fault diagnosis accuracy rose to 98.57% on the imbalanced data and reached 100% when balanced with SMOTE. These findings indicate that combining LGBM with SMOTE substantially improves short-circuit fault detection performance in PV systems.
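The oversampling step named in the abstract can be illustrated with a minimal sketch of SMOTE-style interpolation. This is not the study's implementation: the function names `smote` and `balance`, the toy data, the choice of Euclidean distance, the random selection of base samples, and the restriction to a two-class problem are all assumptions made for illustration. Each synthetic point is placed at a random position on the line segment between a minority-class sample and one of its k nearest minority-class neighbors.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors.

    Hypothetical helper for illustration; `minority` is a list of feature lists.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))  # pick a base sample at random
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbor = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class until both classes have equal counts.

    Assumes a two-class problem (an assumption of this sketch).
    """
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new_points = smote(minority, n_new, k=k, seed=seed)
    return X + new_points, y + [minority_label] * n_new


# Made-up toy data: label 1 is the rare "fault" class, label 0 is "normal".
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
     [5.0, 5.0], [5.0, 6.0], [6.0, 5.0], [6.0, 6.0], [5.5, 5.5], [5.2, 5.8]]
y = [1, 1, 1, 0, 0, 0, 0, 0, 0]
X_bal, y_bal = balance(X, y, minority_label=1, k=2)
print(y_bal.count(0), y_bal.count(1))
```

In practice, oversampling of this kind is usually applied only to the training split, so that synthetic points derived from test samples do not inflate the reported accuracy. The balanced training set would then be passed to a classifier such as LGBM.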

Details

Primary Language English
Subjects Photovoltaic Power Systems
Journal Section Research Article
Authors

Merve Demirci 0000-0001-8192-7366

Submission Date September 6, 2025
Acceptance Date September 22, 2025
Publication Date October 30, 2025
DOI https://doi.org/10.5152/tepes.2025.25038
IZ https://izlik.org/JA89BX27NW
Published in Issue Year 2025 Volume: 5 Issue: 3

Cite

APA Demirci, M. (2025). Machine Learning–Based Fault Diagnosis in Solar Photovoltaic Systems Using Data Balancing Techniques. Turkish Journal of Electrical Power and Energy Systems, 5(3), 168-175. https://doi.org/10.5152/tepes.2025.25038
AMA 1. Demirci M. Machine Learning–Based Fault Diagnosis in Solar Photovoltaic Systems Using Data Balancing Techniques. TEPES. 2025;5(3):168-175. doi:10.5152/tepes.2025.25038
Chicago Demirci, Merve. 2025. “Machine Learning–Based Fault Diagnosis in Solar Photovoltaic Systems Using Data Balancing Techniques”. Turkish Journal of Electrical Power and Energy Systems 5 (3): 168-75. https://doi.org/10.5152/tepes.2025.25038.
EndNote Demirci M (October 1, 2025) Machine Learning–Based Fault Diagnosis in Solar Photovoltaic Systems Using Data Balancing Techniques. Turkish Journal of Electrical Power and Energy Systems 5 3 168–175.
IEEE [1] M. Demirci, “Machine Learning–Based Fault Diagnosis in Solar Photovoltaic Systems Using Data Balancing Techniques”, TEPES, vol. 5, no. 3, pp. 168–175, Oct. 2025, doi: 10.5152/tepes.2025.25038.
ISNAD Demirci, Merve. “Machine Learning–Based Fault Diagnosis in Solar Photovoltaic Systems Using Data Balancing Techniques”. Turkish Journal of Electrical Power and Energy Systems 5/3 (October 1, 2025): 168-175. https://doi.org/10.5152/tepes.2025.25038.
JAMA 1. Demirci M. Machine Learning–Based Fault Diagnosis in Solar Photovoltaic Systems Using Data Balancing Techniques. TEPES. 2025;5:168–175.
MLA Demirci, Merve. “Machine Learning–Based Fault Diagnosis in Solar Photovoltaic Systems Using Data Balancing Techniques”. Turkish Journal of Electrical Power and Energy Systems, vol. 5, no. 3, Oct. 2025, pp. 168-75, doi:10.5152/tepes.2025.25038.
Vancouver 1. Merve Demirci. Machine Learning–Based Fault Diagnosis in Solar Photovoltaic Systems Using Data Balancing Techniques. TEPES. 2025 Oct. 1;5(3):168-75. doi:10.5152/tepes.2025.25038