Groundwater data, predictor variables, and rasters used for predicting the probability of high arsenic and high manganese in the Glacial Aquifer System, northern continental United States

This data release contains the input data used in model development and the TIF raster files used to predict the probability of high arsenic (As) and high manganese (Mn) in groundwater within the glacial aquifer system of the northern United States. Input data include measured As and Mn concentrations at groundwater wells and the associated predictor variable data. The probabilities of high As and high Mn were predicted with boosted regression tree methods, using the gbm package in R version 4.0.0. The response variables for the individual models were the occurrence of (1) As >10 µg/L and (2) Mn >300 µg/L. Water-quality data were compiled from three sources, as described in Wilson and others (2019): a compilation of data from numerous agencies and organizations at the state, regional, and local level; the U.S. Geological Survey National Water Information System; and the U.S. Environmental Protection Agency Safe Drinking Water Information System. The resultant dataset consisted of 10,001 As and 14,565 Mn measurements across the study area. A total of 108 predictor variables were originally considered for model development, including well characteristics, soil properties, aquifer properties, predicted nitrate, hydrologic position on the landscape, groundwater age, predicted pH, and predicted anoxic conditions. After model refinement, 79 and 55 predictor variables were used to predict the probability of high As and high Mn, respectively. The probabilities were predicted at two depths, representative of public and domestic drinking water supply depths, at a resolution of 1 km across the glacial aquifer.
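The two response variables described above are binary exceedance indicators. As a minimal sketch of how they could be derived from measured concentrations, the Python snippet below applies the stated thresholds (As >10 µg/L, Mn >300 µg/L). The function name `is_high` and the sample values are hypothetical illustrations, not part of the data release, and the original modeling was done in R with the gbm package, not with this code.

```python
# Thresholds stated in the data release description, in micrograms per liter.
THRESHOLDS_UG_PER_L = {"As": 10.0, "Mn": 300.0}


def is_high(element: str, concentration_ug_per_l: float) -> int:
    """Return 1 if the concentration strictly exceeds the element's threshold, else 0."""
    return int(concentration_ug_per_l > THRESHOLDS_UG_PER_L[element])


# Hypothetical measurements for illustration only (not taken from the dataset).
samples = [("As", 4.2), ("As", 10.0), ("As", 12.5), ("Mn", 300.0), ("Mn", 450.0)]

# One binary response value per measurement.
responses = [is_high(element, value) for element, value in samples]
```

Because the thresholds are written as strict inequalities, a value exactly equal to the threshold is coded as 0 in this sketch.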

Data and Resources

Field Value
accessLevel public
bureauCode {010:12}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
datagov_dedupe_retained 20220722114234
identifier USGS:5f21cf8982cef313ed94004a
metadata_type geospatial
modified 20210406
old-spatial {"type": "Polygon", "coordinates": [[[-124.7542, 35.0921], [-124.7542, 51.5222], [ -65.3793, 51.5222], [ -65.3793, 35.0921], [-124.7542, 35.0921]]]}
publisher U.S. Geological Survey
publisher_hierarchy Department of the Interior > U.S. Geological Survey
resource-type Dataset
source_datajson_identifier true
source_hash c04bc0b5eb83fd80ef4b92212e27889c7abd6121
source_schema_version 1.1
spatial {"type": "Polygon", "coordinates": [[[-124.7542, 35.0921], [-124.7542, 51.5222], [ -65.3793, 51.5222], [ -65.3793, 35.0921], [-124.7542, 35.0921]]]}
theme {geospatial}
Groups
  • AmeriGEOSS
  • National Provider
  • North America
Tags
  • amerigeo
  • amerigeoss
  • aquifer-system
  • arsenic
  • ckan
  • connecticut
  • drinking-water-use
  • geo
  • geoss
  • glacial-aquifer-system
  • groundwater
  • hydrogeology
  • idaho
  • illinois
  • indiana
  • iowa
  • kansas
  • maine
  • manganese
  • massachusetts
  • michigan
  • minnesota
  • missouri
  • montana
  • national
  • nawqa
  • nebraska
  • new-hampshire
  • new-jersey
  • new-york
  • north-america
  • north-dakota
  • ohio
  • pennsylvania
  • rhode-island
  • south-dakota
  • united-states
  • usgs-5f21cf8982cef313ed94004a
  • vermont
  • washington
  • water-quality
  • wisconsin
isopen False
license_id notspecified
license_title License not specified
maintainer Sarah M. Elliott
maintainer_email selliott@usgs.gov
metadata_created 2025-11-22T12:10:40.742339
metadata_modified 2025-11-22T12:10:40.742343
num_resources 2
num_tags 42
title Groundwater data, predictor variables, and rasters used for predicting the probability of high arsenic and high manganese in the Glacial Aquifer System, northern continental United States
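The spatial field above is a GeoJSON polygon describing the dataset extent. As a minimal sketch, assuming the field value is available as a JSON string, the Python snippet below reduces it to a (west, south, east, north) bounding box. The helper name `bounding_box` is a hypothetical illustration and not part of any catalog API.

```python
import json

# Value of the "spatial" field, copied from the metadata record.
spatial = (
    '{"type": "Polygon", "coordinates": [[[-124.7542, 35.0921], '
    "[-124.7542, 51.5222], [-65.3793, 51.5222], [-65.3793, 35.0921], "
    "[-124.7542, 35.0921]]]}"
)


def bounding_box(geojson_text: str) -> tuple:
    """Return (west, south, east, north) from the outer ring of a GeoJSON polygon."""
    geometry = json.loads(geojson_text)
    outer_ring = geometry["coordinates"][0]  # first ring is the exterior boundary
    longitudes = [point[0] for point in outer_ring]
    latitudes = [point[1] for point in outer_ring]
    return min(longitudes), min(latitudes), max(longitudes), max(latitudes)


bbox = bounding_box(spatial)
```

GeoJSON positions are ordered longitude first, then latitude, which is why index 0 is read as the west-east coordinate.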