SDNist: Benchmark data and evaluation tools for data synthesizers.

SDNist is a set of benchmark data and metrics for evaluating synthetic data generators on structured tabular data. The benchmarks are distributed as a simple open-source Python package to allow standardized, reproducible comparison of synthetic generator models on real-world data and use cases. The data and metrics were developed for and vetted through the NIST PSCR Differential Privacy Temporal Map Challenge, where the evaluation tools, k-marginal and Higher Order Conjunction, proved effective in distinguishing competing models in the competition environment. SDNist can be installed with pip (pip install sdnist, for Python >=3.6) or obtained from the USNIST GitHub repository (https://github.com/usnistgov/SDNist/). The sdnist Python module downloads data from NIST as necessary, so users do not need to download data manually.
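
To give a sense of what a k-marginal style comparison measures, the following is a minimal illustrative sketch in plain Python. It is not the sdnist API and not necessarily how SDNist scores submissions: the function names (marginal_distance, k_marginal_score), the parameters (max_marginals, seed), the list-of-dicts record format, and the 0-to-1 scaling are all assumptions made for this example. The idea shown is to pick sets of k columns, compare how often each combination of values occurs in the real and synthetic data, and average the resulting distances.

```python
import itertools
import random
from collections import Counter


def marginal_distance(real, synth, columns):
    """Total variation distance between the joint distributions of `columns`.

    `real` and `synth` are lists of dict records (a hypothetical format
    chosen for this sketch). Returns 0.0 for identical marginals and 1.0
    for marginals with no cells in common.
    """
    p = Counter(tuple(row[c] for c in columns) for row in real)
    q = Counter(tuple(row[c] for c in columns) for row in synth)
    # Counter returns 0 for cells that appear in only one dataset.
    cells = set(p) | set(q)
    return 0.5 * sum(abs(p[cell] / len(real) - q[cell] / len(synth)) for cell in cells)


def k_marginal_score(real, synth, columns, k=2, max_marginals=50, seed=0):
    """Average similarity over (a sample of) all k-column marginals.

    Assumption for this sketch: the score is 1 minus the mean total
    variation distance, so 1.0 means every sampled marginal matches.
    """
    combos = list(itertools.combinations(columns, k))
    if len(combos) > max_marginals:
        # Sample a fixed number of marginals to keep the cost bounded.
        combos = random.Random(seed).sample(combos, max_marginals)
    mean_distance = sum(marginal_distance(real, synth, c) for c in combos) / len(combos)
    return 1.0 - mean_distance


if __name__ == "__main__":
    # Toy records, invented purely for illustration.
    real = [{"age": 30, "sex": "F", "state": "OH"}, {"age": 41, "sex": "M", "state": "IL"}]
    synth = [{"age": 30, "sex": "F", "state": "OH"}, {"age": 41, "sex": "F", "state": "IL"}]
    print(k_marginal_score(real, synth, columns=["age", "sex", "state"], k=2))
```

A real benchmark would typically bin numeric columns before counting and may sample and scale marginals differently; the SDNist repository is the reference for the actual scoring rules.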


Field Value
accessLevel public
bureauCode {006:55}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/data.json
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
identifier ark:/88434/mds2-2515
issued 2021-12-28
landingPage https://data.nist.gov/od/id/mds2-2515
language {en}
license https://www.nist.gov/open/license
modified 2021-12-06 00:00:00
programCode {006:045}
publisher National Institute of Standards and Technology
resource-type Dataset
source_datajson_identifier true
source_hash d7cdf4595dc7357337a23e4b2d6f4debc32fa998
source_schema_version 1.1
theme {"Information Technology:Artificial Intelligence","Information Technology:Privacy","Public Safety:Public safety communications research"}
Groups
  • AmeriGEOSS
  • National Provider
  • North America
Tags
  • amerigeo
  • amerigeoss
  • benchmarks
  • ckan
  • differential-privacy
  • geo
  • geoss
  • national
  • north-america
  • privacy
  • private-information-sharing
  • synthetic-data
  • united-states
isopen False
license_id other-license-specified
license_title other-license-specified
maintainer Gary Howarth II
maintainer_email gary.howarth@nist.gov
metadata_created 2025-11-22T17:04:07.863256
metadata_modified 2025-11-22T17:04:07.863260
num_resources 17
num_tags 13
title SDNist: Benchmark data and evaluation tools for data synthesizers.