Data used to model and map arsenic concentration exceedances in private wells throughout the conterminous United States for human health studies

This data release contains the data used to develop models and maps that estimate the probability that arsenic concentrations in private domestic wells exceed selected thresholds throughout the conterminous United States. Three boosted regression tree (BRT) models were developed separately to estimate the probability that private well arsenic concentrations exceed 1, 5, and 10 micrograms per liter (µg/L). A random forest (RF) model was developed to estimate the most probable arsenic concentration category (≤5, >5 to ≤10, or >10 µg/L). The models use arsenic concentration data from private domestic wells located throughout the conterminous United States, together with independent variables that are available as geospatial data. The models were used to produce the maps included in this data release. The model input data (predictor variables) used to make the maps are in a zipped folder (Map_Input_Data.zip) that contains 85 TIFF raster (.tif) files, one for each model predictor variable. The model output maps are in a zipped folder (Map_Output_Data.zip) that contains 13 TIFF raster (.tif) files: one estimate map for each of the three BRT models, four maps for the RF model, and two confidence interval maps for each BRT model.
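
As a reading aid, the thresholds, categories, and file counts described above can be sketched in a few lines of Python. This is only an illustration of the stated definitions, not the models themselves. All function and constant names are hypothetical, and the sketch assumes "exceeding" means strictly greater than the threshold.

```python
from bisect import bisect_left

# Thresholds (µg/L) for the three BRT exceedance models, as listed in the description.
BRT_THRESHOLDS_UG_L = (1.0, 5.0, 10.0)

# RF concentration categories and the upper bounds (µg/L) of the first two.
RF_CATEGORIES = ("≤5", ">5 to ≤10", ">10")
RF_UPPER_BOUNDS_UG_L = (5.0, 10.0)


def exceedance_flags(conc_ug_l):
    """Return {threshold: True/False} for a measured concentration.

    Assumption: exceedance is strict (concentration > threshold).
    """
    return {t: conc_ug_l > t for t in BRT_THRESHOLDS_UG_L}


def rf_category(conc_ug_l):
    """Map a measured concentration to its RF category label.

    bisect_left places a value equal to an upper bound in the lower
    category, which matches the "≤5" and "≤10" boundaries.
    """
    return RF_CATEGORIES[bisect_left(RF_UPPER_BOUNDS_UG_L, conc_ug_l)]


def expected_output_raster_count():
    """Count the output rasters described for Map_Output_Data.zip."""
    n_brt = len(BRT_THRESHOLDS_UG_L)
    brt_estimate_maps = n_brt        # one estimate map per BRT model
    rf_maps = 4                      # four maps for the RF model
    brt_interval_maps = 2 * n_brt    # two confidence interval maps per BRT model
    return brt_estimate_maps + rf_maps + brt_interval_maps


if __name__ == "__main__":
    print(exceedance_flags(7.2))
    print(rf_category(7.2))
    print(expected_output_raster_count())
```

The count function simply checks that the three groups of output maps add up to the 13 files stated for Map_Output_Data.zip.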

Data and Resources

Field Value
accessLevel public
bureauCode {010:12}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
identifier USGS:5f2d4ce382ceae4cb3c2e1d6
metadata_type geospatial
modified 20210312
old-spatial -126.9141, 23.2413, -65.7422, 49.8380
publisher U.S. Geological Survey
publisher_hierarchy Department of the Interior > U.S. Geological Survey
resource-type Dataset
source_datajson_identifier true
source_hash 24ecf9746827905332c5edbbd8895fe72d458c94
source_schema_version 1.1
spatial {"type": "Polygon", "coordinates": [[[-126.9141, 23.2413], [-126.9141, 49.8380], [ -65.7422, 49.8380], [ -65.7422, 23.2413], [-126.9141, 23.2413]]]}
theme {geospatial}
Groups
  • AmeriGEOSS
  • National Provider
  • North America
Tags
  • amerigeo
  • amerigeoss
  • arsenic
  • ckan
  • domestic-well-water-use
  • drinking-water-use
  • environment
  • environmental-health-human
  • geo
  • geoscientificinformation
  • geoss
  • groundwater-quality
  • health
  • machine-learning-models
  • national
  • north-america
  • united-states
  • usgs-5f2d4ce382ceae4cb3c2e1d6
isopen False
license_id notspecified
license_title License not specified
maintainer Melissa A Lombard
maintainer_email mlombard@usgs.gov
metadata_created 2025-11-22T15:24:08.555483
metadata_modified 2025-11-22T15:24:08.555487
num_resources 2
num_tags 18
title Data used to model and map arsenic concentration exceedances in private wells throughout the conterminous United States for human health studies