Theory aware Machine Learning (TaML)

A code repository and accompanying data for incorporating imperfect theory into machine learning for improved prediction and explainability. Specifically, it focuses on a case study: the dimensions of a polymer chain in solvents of different quality. Included are Jupyter Notebooks for quickly testing concepts and reproducing figures, as well as source code that computes the mean squared error as a function of dataset size for various machine learning models. For additional details on the data, please refer to the README.md associated with the data. For additional details on the code, please refer to the README.md provided with the code repository (GitHub Repo for Theory aware Machine Learning). For additional details on the methodology, see a forthcoming manuscript titled "Leveraging theory for enhanced machine learning," by Debra J. Audus, Austin McDannald and Brian DeCost.
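To illustrate what "mean squared error as a function of dataset size" can look like in practice, here is a minimal Python sketch. It is not taken from the TaML repository. The function names, the synthetic power-law data (chain size scaling as N**nu, with nu = 0.588 chosen as a typical good-solvent exponent), the noise level, and the straight-line model in log-log space are all assumptions made for illustration only.

```python
import math
import random


def make_toy_data(n_samples, nu=0.588, noise=0.05, seed=0):
    """Synthetic (log N, log R) pairs from R = N**nu with log-normal noise.

    Illustrative only: this is NOT the dataset distributed with this record.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n_samples):
        n_monomers = rng.randint(10, 1000)
        xs.append(math.log(n_monomers))
        ys.append(nu * math.log(n_monomers) + rng.gauss(0.0, noise))
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse_vs_dataset_size(x_train, y_train, x_test, y_test, sizes):
    """Train on the first `size` points and report the test MSE for each size."""
    results = {}
    for size in sizes:
        slope, intercept = fit_line(x_train[:size], y_train[:size])
        errors = [(slope * x + intercept - y) ** 2 for x, y in zip(x_test, y_test)]
        results[size] = sum(errors) / len(errors)
    return results


# Hypothetical training and test sets drawn with different seeds.
x_train, y_train = make_toy_data(200, seed=0)
x_test, y_test = make_toy_data(500, seed=1)

learning_curve = mse_vs_dataset_size(
    x_train, y_train, x_test, y_test, sizes=[5, 20, 80, 200]
)
for size, mse in learning_curve.items():
    print(f"n_train={size:4d}  test MSE={mse:.5f}")
```

The same loop could wrap any model with a fit and predict step. The linear fit is used here only so that the sketch has no dependencies beyond the Python standard library.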

Field Value
accessLevel public
accrualPeriodicity irregular
bureauCode {006:55}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/data.json
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
identifier ark:/88434/mds2-2637
landingPage https://data.nist.gov/od/id/mds2-2637
language {en}
license https://www.nist.gov/open/license
modified 2022-05-06 00:00:00
programCode {006:045}
publisher National Institute of Standards and Technology
resource-type Dataset
source_datajson_identifier true
source_hash 498b138270b1882131a62ea892565a540ae794e8
source_schema_version 1.1
theme {"Mathematics and Statistics:Uncertainty quantification","Materials:Modeling and computational material science","Information Technology:Data and informatics",Materials:Polymers}
Groups
  • AmeriGEOSS
  • National Provider
  • North America
Tags
  • amerigeo
  • amerigeoss
  • ckan
  • geo
  • geoss
  • machine-learning
  • national
  • north-america
  • polymers
  • theory
  • transfer-learning
  • united-states
isopen False
license_id other-license-specified
license_title other-license-specified
maintainer Debra Audus
maintainer_email debra.audus@nist.gov
metadata_created 2025-11-21T18:34:38.616247
metadata_modified 2025-11-21T18:34:38.616251
num_resources 87
num_tags 12
title Theory aware Machine Learning (TaML)