Active Evaluation Software for Selection of Ground Truth Labels

This software repository contains Aegis (Active Evaluator Germane Interactive Selector), a Python package for evaluating a machine learning system's performance (according to a metric such as accuracy) by adaptively sampling trials to label from an unlabeled test set, in order to minimize the number of labels needed. The repository includes sample (public) data as well as a simulation script that tests different label-selecting strategies on already labelled test sets. The software is configured so that users can add their own data and system outputs to test the evaluation.
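
To make the idea of simulating a label-selecting strategy on an already labelled test set concrete, here is a minimal sketch in plain Python. It does not use the Aegis API, and it implements only a uniform random sampling baseline, not an adaptive strategy. The function name `estimate_accuracy`, the `budget` parameter, and the synthetic test set are all hypothetical and chosen for illustration.

```python
import math
import random


def estimate_accuracy(correct_flags, budget, seed=0):
    """Estimate accuracy from `budget` trials drawn uniformly without replacement.

    correct_flags[i] is True when the system output for trial i matches the
    ground-truth label; looking a flag up stands in for asking an annotator.
    """
    rng = random.Random(seed)
    sampled = rng.sample(range(len(correct_flags)), budget)
    hits = sum(1 for i in sampled if correct_flags[i])
    estimate = hits / budget
    # Normal-approximation 95% half-width (ignores the finite-population correction).
    half_width = 1.96 * math.sqrt(estimate * (1 - estimate) / budget)
    return estimate, half_width


# Hypothetical fully labelled test set: 1000 trials, 800 answered correctly.
flags = [True] * 800 + [False] * 200
true_accuracy = sum(flags) / len(flags)
for budget in (50, 200, 1000):
    est, hw = estimate_accuracy(flags, budget)
    print(f"labels={budget:4d} estimate={est:.3f} +/- {hw:.3f} (true {true_accuracy:.3f})")
```

A simulation of this kind compares how quickly the estimate from each strategy approaches the accuracy computed on the full labelled set as the label budget grows. An adaptive strategy would replace the uniform draw with a rule that picks the next trials based on the labels seen so far.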

Data and Resources

Field Value
accessLevel public
bureauCode {006:55}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/data.json
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
identifier ark:/88434/mds2-2227
issued 2020-07-09
landingPage https://github.com/usnistgov/active-evaluation
language {en}
license https://www.nist.gov/open/license
modified 2020-04-28 00:00:00
programCode {006:045}
publisher National Institute of Standards and Technology
resource-type Dataset
source_datajson_identifier true
source_hash 12e1285c4b4fb916d3e1c426cbfd61768bc581e0
source_schema_version 1.1
theme {"Information Technology:Data and informatics"}
Groups
  • AmeriGEOSS
  • National Provider
  • North America
Tags
  • active-evaluation
  • amerigeo
  • amerigeoss
  • ar
  • ckan
  • geo
  • geoss
  • machine-learning
  • national
  • north-america
  • united-states
isopen False
license_id other-license-specified
license_title other-license-specified
maintainer Peter Fontana
maintainer_email peter.fontana@nist.gov
metadata_created 2025-11-22T12:30:42.607955
metadata_modified 2025-11-22T12:30:42.607959
num_resources 1
num_tags 11
title Active Evaluation Software for Selection of Ground Truth Labels