Probability mapping of groundwater contamination by hydrocarbon from the deep oil reservoirs using GIS-based machine-learning algorithms: a case study of the Dammam aquifer (middle of Iraq)

Huda M. Al-Mayahi, Alaa M. Al-Abadi, Alan E. Fryar

Research output: Contribution to journal › Article › peer-review

3 Scopus citations


The Dammam Formation in the southern and western deserts of Iraq is an important aquifer because it holds a large groundwater reserve suitable for various uses. In the Karbala-Najaf plateau and neighboring areas of the middle of Iraq, groundwater wells often fail because the aquifer is contaminated with hydrocarbons from the deep oil reservoirs. This work proposes a method for the spatial delineation of groundwater contamination in this aquifer. Three machine-learning classifiers coupled with GIS were used to map the probability of contamination: a backpropagation multi-layer perceptron artificial neural network (ANN), a support vector machine with a radial basis function kernel (SVM-radial), and random forest (RF). An inventory map of 139 groundwater boreholes (contaminated and non-contaminated) was used to build the models with seven factors considered to control contamination: fault density, distance to faults in general, distance to the Abu Jir fault in particular, groundwater depth, hydraulic conductivity, aquifer saturated thickness, and land-surface elevation. The Relief-F feature selection method indicated that all factors were relevant. Five statistical measures were used to compare model performance: accuracy, sensitivity, specificity, kappa, and the area under the receiver operating characteristic curve (AUC). Applying the models in the R statistical package indicated that all models had excellent goodness-of-fit (accuracy > 90%), but the ANN (accuracy = 97%, sensitivity = 100%, specificity = 96%, kappa = 0.93, and AUC = 0.97) and RF (accuracy = 95%, sensitivity = 100%, specificity = 93%, kappa = 0.88, and AUC = 0.98) outperformed SVM-radial (accuracy = 92%, sensitivity = 100%, specificity = 90%, kappa = 0.82, and AUC = 0.95). The contamination probability values produced by the three models were categorized into contamination zones ranging from very low to very high. The findings of this analysis may be used as a guide for drilling uncontaminated groundwater wells.
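For readers unfamiliar with the five performance measures named in the abstract, the following is a minimal sketch of how they are conventionally computed for a binary (contaminated / non-contaminated) classifier. It is written in Python for illustration only; the study itself used R, and the function names and all counts and scores below are hypothetical and are not taken from the paper.

```python
from itertools import product


def confusion_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity and Cohen's kappa from 2x2 counts.

    tp/fn: contaminated wells predicted contaminated / non-contaminated.
    fp/tn: non-contaminated wells predicted contaminated / non-contaminated.
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    # Agreement expected by chance, from the row and column marginals.
    p_yes = ((tp + fn) / n) * ((tp + fp) / n)
    p_no = ((fp + tn) / n) * ((fn + tn) / n)
    p_e = p_yes + p_no
    kappa = (accuracy - p_e) / (1 - p_e)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


def auc(scores_pos, scores_neg):
    """AUC as the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count half)."""
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p, q in product(scores_pos, scores_neg)
    )
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical validation counts and predicted probabilities (not study data).
metrics = confusion_metrics(tp=8, fn=2, fp=1, tn=9)
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(metrics, example_auc)
```

Kappa corrects accuracy for the agreement expected by chance, which is why it can rank models differently from raw accuracy when the two classes are unbalanced.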

Original language: English
Pages (from-to): 13736-13751
Number of pages: 16
Journal: Environmental Science and Pollution Research
Issue number: 11
State: Published - Mar 2021

Bibliographical note

Funding Information:
We thank the General Commission for Groundwater/Iraq for providing the necessary facilities for conducting this study, such as field visits to the region and taking groundwater samples from contaminated and uncontaminated wells.

Publisher Copyright:
© 2020, Springer-Verlag GmbH Germany, part of Springer Nature.

Copyright 2021 Elsevier B.V., All rights reserved.


Keywords

  • Dammam aquifer
  • Groundwater contamination
  • Iraq
  • Machine learning

ASJC Scopus subject areas

  • Environmental Chemistry
  • Pollution
  • Health, Toxicology and Mutagenesis

