Coast Train--Labeled imagery for training and evaluation of data-driven models for image segmentation

Coast Train is a library of images of coastal environments, annotations, and corresponding thematic label masks (or ‘label images’) collated for training and evaluating machine learning (ML), deep learning, and other models for image segmentation. It includes image sets from geospatial satellite, aerial, and UAV imagery and orthomosaics, as well as non-geospatial oblique and nadir imagery. The images cover a diverse range of coastal environments from the U.S. Pacific, Gulf of Mexico, Atlantic, and Great Lakes coastlines, and consist of time series of high-resolution (≤1 m) orthomosaics and satellite image tiles (10–30 m). Each image, image annotation, and labelled image is available as a single NPZ zipped file. NPZ files follow this naming convention: {datasource}_{numberofclasses}_{threedigitdatasetversion}.zip, where {datasource} is the source of the original images (for example, NAIP, Landsat 8, Sentinel 2), {numberofclasses} is the number of classes used to annotate the images, and {threedigitdatasetversion} is the three-digit code corresponding to the dataset version (for example, 001 is version 1). Each zipped folder contains a collection of NPZ format files, each of which corresponds to an individual image. An individual NPZ file is named after the image that it represents and contains (1) a CSV file with detailed information for every image in the zip folder and (2) a collection of the following NPY files: orig_image.npy (original, unedited input image), image.npy (original input image after color balancing and normalization), classes.npy (list of classes annotated and present in the labelled image), doodles.npy (integer image of all image annotations), color_doodles.npy (color image of doodles.npy), label.npy (labelled image created from the classes present in the annotations), and settings.npy (annotation and machine learning settings used to generate the labelled image from annotations).
All NPZ files can be extracted using the utilities available in Doodler (Buscombe, 2022). A merged CSV file containing detail information on the complete imagery collection is available at the top level of this data release, details of which are available in the Entity and Attribute section of this metadata file.
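
The naming convention and file layout described above can be checked programmatically. The following is a minimal Python sketch, not part of the data release or of Doodler: the function names parse_archive_name and missing_arrays are hypothetical, the archive name NAIP_11_001.zip is made up for illustration, and the pattern assumes the three name fields are separated by underscores. It relies only on the fact that an NPZ file is a zip archive of NPY members.

```python
import re
import zipfile

# Assumed pattern: {datasource}_{numberofclasses}_{threedigitdatasetversion}.zip
ARCHIVE_NAME = re.compile(
    r"^(?P<datasource>.+)_(?P<numberofclasses>\d+)_(?P<version>\d{3})\.zip$"
)

# NPY members that the description lists for each per-image NPZ file.
EXPECTED_ARRAYS = (
    "orig_image", "image", "classes", "doodles",
    "color_doodles", "label", "settings",
)


def parse_archive_name(filename):
    """Split a zipped-folder name into data source, class count, and version."""
    match = ARCHIVE_NAME.match(filename)
    if match is None:
        raise ValueError(f"name does not follow the convention: {filename!r}")
    return {
        "datasource": match["datasource"],
        "numberofclasses": int(match["numberofclasses"]),
        "version": int(match["version"]),
    }


def missing_arrays(npz_path):
    """Return the expected array names that are absent from one NPZ file."""
    # An NPZ file is a zip archive whose members are named <array>.npy.
    with zipfile.ZipFile(npz_path) as archive:
        present = {
            name[:-len(".npy")]
            for name in archive.namelist()
            if name.endswith(".npy")
        }
    return [name for name in EXPECTED_ARRAYS if name not in present]
```

With NumPy installed, numpy.load(npz_path) returns a mapping keyed by the same array names. Some members (possibly settings.npy or classes.npy) may be stored as Python objects and need allow_pickle=True, which should only be used on files from a trusted source.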

Data and Resources

Field Value
accessLevel public
bureauCode {010:12}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
identifier USGS:9cdb71c1-cc5a-4786-9232-93d7e7a340cf
metadata_type geospatial
modified 20220319
old-spatial -180.0, -90.0, 180.0, 90.0
publisher U.S. Geological Survey
publisher_hierarchy Department of the Interior > U.S. Geological Survey
resource-type Dataset
source_datajson_identifier true
source_hash 7dd7295b03176b16c157c3dfa8cab41bf95c121f
source_schema_version 1.1
spatial {"type": "Polygon", "coordinates": [[[-180.0, -90.0], [-180.0, 90.0], [ 180.0, 90.0], [ 180.0, -90.0], [-180.0, -90.0]]]}
theme {geospatial}
Groups
  • AmeriGEOSS
  • National Provider
  • North America
Tags
  • aerial-and-satellite-photography
  • aerial-photography
  • agents-of-coastal-change
  • amerigeo
  • amerigeoss
  • anthropogenic-agents-of-coastal-change
  • bay
  • beach
  • beach-zone-communities
  • biological-and-physical-processes
  • biology
  • botany
  • breakwater-shoreline-stabilization-structure
  • bridge
  • canal
  • cape
  • ckan
  • cliff
  • cmhrp
  • coast
  • coastal-and-marine-hazards-and-resources-program
  • coastal-barrier
  • coastal-development
  • coastal-ecosystems
  • coastal-plain
  • coastal-processes
  • coastal-protection-structures
  • computer-science
  • cove
  • datasets
  • distributions
  • dune
  • earth-sciences
  • earth-system
  • ecology
  • effects-of-coastal-change
  • environment
  • environmental-geography
  • erosion
  • floods
  • geo
  • geography
  • geology
  • geoscientificinformation
  • geospatial-datasets
  • geoss
  • habitat
  • hazards
  • hazards-and-disasters
  • human-impacts
  • human-responses-to-coastal-change
  • image-analysis
  • image-collections
  • information-science
  • infrastructure
  • island
  • jetty
  • lagoon
  • lake
  • land-use-and-land-cover
  • land-use-change
  • life-sciences
  • marsh
  • mitigation-of-coastal-hazards
  • mudflat
  • multispectral-imaging
  • national
  • north-america
  • ocean
  • pacific-coastal-and-marine-science-center
  • pcmsc
  • physical-geography
  • physical-habitats-and-geomorphology
  • remote-sensing
  • shore
  • social-sciences
  • spcmsc
  • st-petersburg-coastal-and-marine-science-center
  • structures
  • swamp
  • tidal-flat
  • tidal-inlet
  • u-s-geological-survey
  • united-states
  • usgs
  • usgs-9cdb71c1-cc5a-4786-9232-93d7e7a340cf
  • visible-light-imaging
  • warc
  • wetland-and-aquatic-research-center
  • whcmsc
  • woods-hole-coastal-and-marine-science-center
isopen False
license_id notspecified
license_title License not specified
maintainer PCMSC Science Data Coordinator
maintainer_email pcmsc_data@usgs.gov
metadata_created 2025-11-20T10:17:34.159200
metadata_modified 2025-11-20T10:17:34.159205
num_resources 2
num_tags 91
title Coast Train--Labeled imagery for training and evaluation of data-driven models for image segmentation