Research Article

Application of machine learning algorithms in the investigation of groundwater quality parameters over YSR district, India

Year 2023, Volume 7, Issue 1, 64 - 72, 15.01.2023
https://doi.org/10.31127/tuje.1032314

Abstract

Human life has been sustained for decades by the availability of basic needs, and freshwater is one of them. However, groundwater quality is constantly under pressure, which can be attributed to anthropogenic activities that are not limited to urban areas but extend to rural zones. Five machine learning methods, namely Linear Discriminant Analysis (LDA), Classification and Regression Trees (CART), k-Nearest Neighbour (KNN), Support Vector Machines (SVM), and Random Forest (RF), were used to analyse groundwater quality variables. The mean accuracy of each classifier was calculated: 77.5% (LDA), 87% (CART), 96% (KNN), 93.5% (SVM), and 96% (RF). RF and KNN were selected as the optimal models because they achieved the highest mean accuracy (96%). The study indicates that machine learning algorithms can estimate and predict water quality variables with high accuracy. The observations and variables were compared with the water quality index and the drinking water limits provided by the Bureau of Indian Standards, and the water quality index was calculated for each observation. An observation was assigned a value of 1 if at least four variables exceeded the prescribed limits, and a value of 2 if more than four variables exceeded them.
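
The labelling rule and the mean-accuracy comparison described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's code. The values in `LIMITS` are placeholders rather than figures taken from the article or from the Bureau of Indian Standards, and all function names are hypothetical. Because "at least four" and "more than four" overlap, the sketch assumes that exactly four exceedances give class 1, more than four give class 2, and fewer than four give class 0. The small k-nearest-neighbour classifier and the interleaved fold split stand in for whatever implementation and validation scheme the study actually used.

```python
from collections import Counter
from math import dist
from statistics import mean

# Placeholder upper limits in mg/L (assumption: NOT values from the article).
LIMITS = {
    "TDS": 500.0,
    "hardness": 200.0,
    "chloride": 250.0,
    "nitrate": 45.0,
    "fluoride": 1.0,
    "sulphate": 200.0,
}


def count_exceedances(sample, limits=LIMITS):
    """Count the variables in `sample` whose value is above its limit."""
    return sum(1 for name, limit in limits.items()
               if sample.get(name, 0.0) > limit)


def quality_class(sample, limits=LIMITS, threshold=4):
    """Assumed reading of the rule: 0 below threshold, 1 at it, 2 above it."""
    n = count_exceedances(sample, limits)
    if n > threshold:
        return 2
    if n == threshold:
        return 1
    return 0


def knn_predict(train_X, train_y, x, k=3):
    """Return the majority label among the k training points closest to x."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def mean_cv_accuracy(X, y, folds=5, k=3):
    """Mean accuracy over `folds` interleaved cross-validation folds."""
    scores = []
    for f in range(folds):
        # Every folds-th observation is held out; the rest form the training set.
        test_idx = [i for i in range(len(X)) if i % folds == f]
        train_idx = [i for i in range(len(X)) if i % folds != f]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i]
                   for i in test_idx)
        scores.append(hits / len(test_idx))
    return mean(scores)
```

In practice each classifier (LDA, CART, KNN, SVM, RF) would be passed through the same cross-validation loop so that the mean accuracies are comparable; only KNN is shown here to keep the sketch free of external dependencies.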

Supporting Institution

None

Project Number

None

Thanks

Central Ground Water Board (CGWB), Ministry of Jal Shakti, Department of Water Resources, River Development and Ganga Rejuvenation, Government of India

There are 37 citations in total.

Details

Primary Language: English
Subjects: Engineering
Journal Section: Articles
Authors

Jagadish Kumar Mogaraju (ORCID 0000-0002-6461-8614)

Project Number: None
Publication Date: January 15, 2023
Published in Issue: Year 2023

Cite

APA Mogaraju, J. K. (2023). Application of machine learning algorithms in the investigation of groundwater quality parameters over YSR district, India. Turkish Journal of Engineering, 7(1), 64-72. https://doi.org/10.31127/tuje.1032314