POPMAPS: An R package to estimate ancestry probability surfaces

This software was developed to estimate the probability that individuals found at a geographic location belong to the same genetic cluster as individuals at the nearest empirical sampling location for which ancestry is known. POPMAPS includes five main functions to calculate and visualize these results (see Table 1 for functions and arguments). Population assignment coefficients and a raster surface must be estimated before using POPMAPS functions (see Fig. 1a and b). With these data in hand, users can run a jackknife function to choose the parameter combination that best reconstructs the empirical data (Figs. 2 and S2). The pertinent parameters are 1) how many empirical sampling localities are used to estimate ancestry coefficients and 2) how the influence of empirical sites on ancestry coefficient estimation changes as distance increases (Fig. 2). After choosing these parameters, a user can estimate the entire ancestry probability surface (Fig. 1c and d, Fig. 3). The package can be used to estimate ancestry coefficients from empirical genetic data across a user-defined geospatial layer. Estimated ancestry coefficients are used to calculate ancestry probabilities, which, together with 'hard population boundaries,' compose an ancestry probability surface. Within a hard boundary, the ancestry probability indicates how confident a user can be that individuals of the focal organism found at a location match the genetic identity of the principal population. This confidence can be adjusted across the ancestry probability surface by changing the parameters that control how much the empirical data contribute to the estimation of ancestry coefficients. This information may help inform decision-making for organisms with management needs.
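
The two parameters and the jackknife step described above can be illustrated with a small, self-contained sketch. It is written in Python purely for illustration and does not use or reproduce the POPMAPS R functions. It assumes that ancestry coefficients at a location are an inverse-distance-weighted average of the k nearest sampling sites, which is one common way to realize "number of sites" and "influence with distance"; the package itself may weight sites differently. The names `idw_ancestry`, `jackknife_error`, `choose_parameters`, and the toy `sites` data are all hypothetical.

```python
import math
from itertools import product


def idw_ancestry(target, sites, k, power):
    """Estimate ancestry coefficients at `target` from the k nearest sites.

    sites: list of ((x, y), [q1, q2, ...]) pairs with known coefficients.
    Weights fall off as 1 / distance**power (an assumed weighting scheme).
    """
    ranked = sorted(sites, key=lambda s: math.dist(target, s[0]))[:k]
    # A target that coincides with a site takes that site's coefficients.
    if math.dist(target, ranked[0][0]) == 0.0:
        return list(ranked[0][1])
    weights = [1.0 / math.dist(target, xy) ** power for xy, _ in ranked]
    total = sum(weights)
    n_clusters = len(ranked[0][1])
    return [
        sum(w * q[c] for w, (_, q) in zip(weights, ranked)) / total
        for c in range(n_clusters)
    ]


def jackknife_error(sites, k, power):
    """Mean absolute error when each site is predicted from all the others."""
    errors = []
    for i, (xy, q_obs) in enumerate(sites):
        others = sites[:i] + sites[i + 1:]  # leave site i out
        q_est = idw_ancestry(xy, others, k, power)
        errors.extend(abs(a - b) for a, b in zip(q_est, q_obs))
    return sum(errors) / len(errors)


def choose_parameters(sites, ks, powers):
    """Return the (k, power) pair with the lowest jackknife error."""
    return min(product(ks, powers), key=lambda kp: jackknife_error(sites, *kp))


# Hypothetical toy data: five sites along a transect, two genetic clusters.
sites = [
    ((0.0, 0.0), [1.0, 0.0]),
    ((1.0, 0.0), [0.9, 0.1]),
    ((2.0, 0.0), [0.5, 0.5]),
    ((3.0, 0.0), [0.1, 0.9]),
    ((4.0, 0.0), [0.0, 1.0]),
]

best_k, best_power = choose_parameters(sites, ks=[1, 2, 3], powers=[1.0, 2.0])
estimate = idw_ancestry((1.5, 0.0), sites, best_k, best_power)
```

Applying `idw_ancestry` to every cell of a raster with the chosen parameters would give a surface of estimated coefficients. How POPMAPS converts those coefficients into ancestry probabilities and hard population boundaries is not specified here, so that step is left out of the sketch.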

Data and Resources

Field Value
accessLevel public
bureauCode {010:12}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
identifier USGS:627e7b24d34e3bef0c9a2cc2
metadata_type geospatial
modified 20220524
old-spatial -180.000, 90.0000, 180.000, 90.0000
publisher U.S. Geological Survey
publisher_hierarchy Department of the Interior > U.S. Geological Survey
resource-type Dataset
source_datajson_identifier true
source_hash 627ae16774f20acb9d41310260fb4906ae182130
source_schema_version 1.1
spatial {"type": "Polygon", "coordinates": [[[-180.000, 90.0000], [-180.000, 90.0000], [ 180.000, 90.0000], [ 180.000, 90.0000], [-180.000, 90.0000]]]}
theme {geospatial}
Groups
  • AmeriGEOSS
  • National Provider
  • North America
Tags
  • amerigeo
  • amerigeoss
  • biogeography
  • biota
  • ckan
  • demographics
  • environmental-gradients
  • evolution
  • genetic-diversity
  • geo
  • geoss
  • national
  • native-plant-materials-development
  • native-species
  • north-america
  • phylogeny
  • phylogeography
  • restoration
  • seed-transfer-guidelines
  • united-states
  • usgs-627e7b24d34e3bef0c9a2cc2
isopen False
license_id notspecified
license_title License not specified
maintainer Robert T Massatti
maintainer_email rmassatti@usgs.gov
metadata_created 2025-11-21T19:37:14.285020
metadata_modified 2025-11-21T19:37:14.285025
num_resources 2
num_tags 21
title POPMAPS: An R package to estimate ancestry probability surfaces