Abstract
A decision tree-based approach is proposed to predict groundwater quality classes according to the United States Salinity Laboratory (USSL) diagram, using data from aquifers in agricultural lands of Ardebil province, northwest Iran. Several combinations of hydrochemical parameters of groundwater and monthly precipitation with different lag times were considered to find an accurate and economical alternative for groundwater quality classification. Performance was evaluated using the number of correctly classified instances (CCI) and the kappa statistic. The results suggested that decision tree-based classification is suitable for the data sets used. The overall average CCI and kappa statistic for predicting groundwater quality classes based on the USSL diagram were 0.88 and 0.83, respectively. Principal component analysis (PCA) was also used to determine the important parameters for groundwater quality classification. The results showed that groundwater quality classification by decision tree is more precise and efficient than classification by PCA. The best alternative could evaluate the groundwater quality class with only two parameters: electrical conductivity and cumulative precipitation of 11 months earlier. Predicting the water quality class from only two variables reduces the number of variables analyzed on a routine basis, resulting in a significant reduction in laboratory costs and in the latency between sampling and the outcome of the laboratory analyses.
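The two evaluation measures named above can be illustrated with a short sketch. It computes the fraction of correctly classified instances and Cohen's kappa (observed agreement corrected for chance agreement) from paired class labels. The function name `cci_and_kappa` and the ten sample labels are hypothetical and chosen only for illustration; they are not taken from the Ardebil data, and the study's own tooling may differ.

```python
import math
from collections import Counter


def cci_and_kappa(actual, predicted):
    """Return (fraction correctly classified, Cohen's kappa) for paired labels."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(actual)

    # Observed agreement: share of instances whose predicted class is correct.
    p_o = sum(a == p for a, p in zip(actual, predicted)) / n

    # Chance agreement: sum over classes of the product of marginal frequencies.
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    p_e = sum(actual_counts[c] * predicted_counts[c] for c in actual_counts) / (n * n)

    # Kappa is undefined when chance agreement is already perfect.
    if math.isclose(p_e, 1.0):
        return p_o, float("nan")
    return p_o, (p_o - p_e) / (1.0 - p_e)


# Hypothetical USSL-style class labels (assumed values, for illustration only).
actual = ["C2S1"] * 4 + ["C3S1"] * 4 + ["C4S2"] * 2
predicted = ["C2S1"] * 3 + ["C3S1"] * 5 + ["C4S2"] * 2

cci, kappa = cci_and_kappa(actual, predicted)
print(f"CCI = {cci:.2f}, kappa = {kappa:.3f}")
```

In this made-up example nine of ten samples are classified correctly, and kappa comes out lower than the raw agreement because part of that agreement would be expected by chance given the class frequencies.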
Acknowledgments
This study was supported by the Ahar Branch, Islamic Azad University, Ahar, Iran.
Cite this article
Saghebian, S.M., Sattari, M.T., Mirabbasi, R. et al. Ground water quality classification by decision tree method in Ardebil region, Iran. Arab J Geosci 7, 4767–4777 (2014). https://doi.org/10.1007/s12517-013-1042-y
DOI: https://doi.org/10.1007/s12517-013-1042-y