Data release for: Evaluating k-nearest neighbor (kNN) imputation models for species-level aboveground forest biomass mapping in Northeast China

Quantifying spatially explicit or pixel-level aboveground forest biomass (AFB) across large regions is critical for measuring forest carbon sequestration capacity, assessing forest carbon balance, and revealing changes in the structure and function of forest ecosystems. When AFB is measured at the species level using widely available remote sensing data, regional changes in forest composition can readily be monitored. In this study, wall-to-wall maps of species-level AFB were generated for forests in Northeast China by integrating forest inventory data with Moderate Resolution Imaging Spectroradiometer (MODIS) images and environmental variables using the optimal k-nearest neighbor (kNN) imputation model. Comparing the prediction accuracy of 630 kNN models, we found that the models using random forest (RF) as the distance metric had the highest accuracy. Using multi-month MODIS data gave no appreciable improvement in the estimation accuracy of species-level AFB over single-month MODIS data for September. For k > 7, further accuracy gains of the RF-based kNN models using the single-month September MODIS predictors were essentially negligible. Therefore, the kNN model with the RF distance metric, single-month (September) MODIS predictors, and k = 7 was selected as the optimal model for imputing species-level AFB across all of Northeast China. Our imputation results showed that the average AFB of all species over Northeast China was 101.98 Mg/ha around 2000. Among the 17 widespread species, larch was the most dominant, with the largest AFB (20.88 Mg/ha), followed by white birch (13.84 Mg/ha). Amur corktree and willow had low AFB (0.91 and 0.96 Mg/ha, respectively). Environmental variables (e.g., climate and topography) had strong relationships with species-level AFB.
By integrating forest inventory data and remote sensing data with complete spatial coverage using the optimal kNN model, we successfully mapped the AFB distribution of the 17 tree species over Northeast China. We also evaluated the accuracy of AFB at different spatial scales. AFB estimation accuracy improved significantly from the stand level up to the ecotype level, indicating that the AFB maps generated in this study are better suited to forest ecosystem models (e.g., LINKAGES) that require species-level attributes at the ecotype scale.
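
As a rough illustration of the imputation step described above, the following minimal sketch averages the species-level AFB of the k nearest reference plots for one target pixel. It is not the study's implementation: it uses plain Euclidean distance in predictor space, whereas the study found an RF-based distance metric most accurate, and the function name `knn_impute`, the data layout, and the sample values are all hypothetical.

```python
import math


def knn_impute(target, references, k=7):
    """Impute per-species AFB (Mg/ha) for one target pixel.

    target:     list of predictor values for the pixel (hypothetical layout).
    references: list of (predictors, afb_by_species) pairs, one per inventory plot.
    k:          number of nearest reference plots to average (7 in the abstract).

    Assumption: Euclidean distance stands in for the RF-based distance
    used by the study's optimal model.
    """
    # Rank reference plots by distance to the target in predictor space.
    ranked = sorted(references, key=lambda ref: math.dist(target, ref[0]))
    nearest = ranked[:k]
    species = nearest[0][1].keys()
    # Average each species' AFB over the selected neighbors.
    return {s: sum(afb[s] for _, afb in nearest) / len(nearest) for s in species}


# Made-up reference plots with a single predictor each.
plots = [
    ([0.0], {"larch": 10.0, "white_birch": 2.0}),
    ([1.0], {"larch": 20.0, "white_birch": 4.0}),
    ([10.0], {"larch": 100.0, "white_birch": 50.0}),
]
estimate = knn_impute([0.4], plots, k=2)
```

With k = 2, the two plots closest to the target are averaged and the distant third plot is ignored.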

Data and Resources

Field Value
accessLevel public
bureauCode {010:12}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
identifier USGS:5d56e4e0e4b01d82ce8ebad2
metadata_type geospatial
modified 20200820
old-spatial 114.296759671, 37.184155145, 136.525997731, 54.239622795
publisher U.S. Geological Survey
publisher_hierarchy Department of the Interior > U.S. Geological Survey
resource-type Dataset
source_datajson_identifier true
source_hash 562a60435508fb44ea07156da33725f4542c0146
source_schema_version 1.1
spatial {"type": "Polygon", "coordinates": [[[114.296759671, 37.184155145], [114.296759671, 54.239622795], [136.525997731, 54.239622795], [136.525997731, 37.184155145], [114.296759671, 37.184155145]]]}
theme {geospatial}
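
The `old-spatial` and `spatial` fields above appear to describe the same extent, once as a bounding box and once as a GeoJSON polygon. The sketch below shows one way to derive the polygon from the box, assuming the box is ordered west, south, east, north (inferred from the values, not stated in the record). The helper name `bbox_to_polygon` is hypothetical.

```python
def bbox_to_polygon(bbox):
    """Convert a 'west, south, east, north' string into a GeoJSON Polygon dict.

    Assumption: the comma-separated order is min lon, min lat, max lon, max lat.
    """
    west, south, east, north = (float(v) for v in bbox.split(","))
    # Closed ring in the same vertex order as the record's `spatial` field.
    ring = [
        [west, south],
        [west, north],
        [east, north],
        [east, south],
        [west, south],
    ]
    return {"type": "Polygon", "coordinates": [ring]}


polygon = bbox_to_polygon("114.296759671, 37.184155145, 136.525997731, 54.239622795")
```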
Groups
  • AmeriGEOSS
  • National Provider
  • North America
Tags
  • aboveground-forest-biomass
  • amerigeo
  • amerigeoss
  • biomass-imputation
  • changbai-mountains
  • ckan
  • field-inventory-and-monitoring
  • forest-ecosystems
  • geo
  • geoscientificinformation
  • geoss
  • greater-khingan-mountains
  • heilongjiang
  • hulun-buir-plateau
  • imagerybasemapsearthcover
  • inner-mongolia
  • jilin
  • lesser-khingan-mountains
  • liaohe-plain
  • liaoning
  • modis
  • multispectral-imaging
  • national
  • north-america
  • northeast-china
  • random-forest-model
  • sanjiang-plain
  • songnen-plain
  • united-states
  • usgs-5d56e4e0e4b01d82ce8ebad2
isopen False
license_id notspecified
license_title License not specified
maintainer Paul D Henne
maintainer_email phenne@usgs.gov
metadata_created 2025-11-21T01:17:05.830187
metadata_modified 2025-11-21T01:17:05.830191
num_resources 2
num_tags 30
title Data release for: Evaluating k-nearest neighbor (kNN) imputation models for species-level aboveground forest biomass mapping in Northeast China