Data from: Genetic variation among 481 diverse soybean accessions

These data accompany the manuscript "Genetic variation among 481 diverse soybean accessions, inferred from genomic re-sequencing". SNP calls were obtained by resequencing 481 diverse soybean lines comprising 52 wild (Glycine soja) and 429 cultivated (Glycine max) accessions. The dataset contains 6 gzipped VCF (Variant Call Format) files with variant calls for all 481 USB accessions, all G. max accessions, the G. soja accessions, accessions sequenced at 15x coverage, accessions sequenced at 40x coverage, and 106 accessions re-sequenced from a previous study (Valliyodan et al. 2016). SNPs were called using the HaplotypeCaller algorithm from the Genome Analysis Toolkit (GATK), version gatk-2.5-2-gf57256b. A total of 7.8 million SNPs were identified among the 481 re-sequenced accessions. SNPs were assigned IDs using the script "assign_name.awk", available at https://github.com/soybase/SoySNP-Names. SNP effects were predicted using SnpEff 3.0.

The dataset is also available at https://soybase.org/data/v2/Glycine/max/diversity/Wm82.gnm2.div.Valliyodan_Brown_2021/

Funding support was provided by the United Soybean Board for the large-scale sequencing of soybean genomes (project #1320-532-5615), Bayer (previously Monsanto and Bayer), and Corteva (previously Dow AgroSciences), with in-kind support for analysis from USDA Agricultural Research Service project 5030-21000-069-00-D.
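As a rough illustration of working with the gzipped VCF files described above, the following Python sketch reads one file with the standard library only, lists the sample (accession) names from the header line, and counts single-nucleotide records separately from other variant records. The function names and the file name in the usage note are hypothetical, not part of the dataset; the sketch assumes only the standard tab-delimited VCF layout (REF in column 4, ALT in column 5, samples from column 10 onward).

```python
import gzip
from typing import List, Tuple

BASES = {"A", "C", "G", "T"}


def open_vcf(path: str):
    """Open a VCF as text; files ending in .gz are decompressed on the fly."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def summarize_vcf(path: str) -> Tuple[List[str], int, int]:
    """Return (sample_names, snp_count, other_variant_count) for one VCF.

    A record is counted as a SNP when REF and every ALT allele are a
    single A/C/G/T base; everything else (indels, symbolic alleles)
    is counted as "other".
    """
    samples: List[str] = []
    snps = 0
    others = 0
    with open_vcf(path) as handle:
        for line in handle:
            if line.startswith("##"):
                continue  # meta-information lines
            fields = line.rstrip("\n").split("\t")
            if line.startswith("#CHROM"):
                samples = fields[9:]  # sample columns follow FORMAT
                continue
            if len(fields) < 5:
                continue  # skip blank or malformed lines
            ref = fields[3].upper()
            alts = fields[4].upper().split(",")
            if ref in BASES and all(alt in BASES for alt in alts):
                snps += 1
            else:
                others += 1
    return samples, snps, others
```

Calling `summarize_vcf("example.vcf.gz")` (a placeholder path; substitute one of the downloaded files) returns the sample names and the two counts. Because the file is streamed line by line, memory use stays small even for files with millions of records, although a full pass may take a while.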

Field Value
accessLevel public
bureauCode {005:18}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
describedBy https://data.nal.usda.gov/dataset/data-genetic-variation-among-481-diverse-soybean-accessions/resource/dcd60c82-ae7d-4514-9d79-66fdaa7e5a57
identifier c816b299-60fd-47a4-8a6a-4ad1e7285daf
license https://creativecommons.org/publicdomain/zero/1.0/
modified 2021-10-27
programCode {005:040}
publisher Agricultural Research Service
resource-type Dataset
source_datajson_identifier true
source_hash 67e9397de2ac95e74466855df877b04c6711fa4d
source_schema_version 1.1
Groups
  • AmeriGEOSS
  • National Provider
  • North America
Tags
  • amerigeo
  • amerigeoss
  • ckan
  • genetic-variation
  • geo
  • geoss
  • national
  • north-america
  • np301
  • resequencing
  • snps
  • soybase
  • soybean
  • united-states
isopen True
license_id cc-zero
license_title Creative Commons CCZero
license_url http://www.opendefinition.org/licenses/cc-zero
maintainer Brown, Anne V.
maintainer_email anne.brown@usda.gov
metadata_created 2025-11-20T23:36:48.045099
metadata_modified 2025-11-20T23:36:48.045102
notes <p>This data is from the manuscript titled: "Genetic variation among 481 diverse soybean accessions, inferred from genomic re-sequencing". SNP calls were obtained from resequencing 481 diverse soybean lines comprising 52 wild (<em>Glycine soja</em>) and 429 cultivated (<em>Glycine max</em>). This dataset contains 6 gzipped VCF (Variant Call Format) files with variant calls for all 481 USB accessions, all <em>G. max</em> accessions, <em>G. soja</em> accessions, accessions sequenced at 15x coverage, accessions sequenced at 40x coverage, and 106 accessions re-sequenced from a previous study (Valliyodan et al. 2016). SNPs were called using the HaplotypeCaller algorithm from the Genome Analysis Toolkit (GATK) version gatk-2.5-2-gf57256b. A total of 7.8 million SNPs were identified among the 481 re-sequenced accessions. SNPs were assigned IDs using the script "assign_name.awk" available at <a href="https://github.com/soybase/SoySNP-Names">https://github.com/soybase/SoySNP-Names</a>. SNP effects were predicted using SnpEff 3.0.</p> <p>Dataset also available at <a href="https://soybase.org/data/v2/Glycine/max/diversity/Wm82.gnm2.div.Valliyodan_Brown_2021/">https://soybase.org/data/v2/Glycine/max/diversity/Wm82.gnm2.div.Valliyod...</a></p> <p>Funding support provided by the United Soybean Board for the large-scale sequencing of soybean genomes (project #1320-532-5615), Bayer (previously Monsanto and Bayer), and Corteva (previously Dow AgroSciences), with in-kind support for analysis from USDA Agricultural Research Service project 5030-21000-069-00-D.</p>
num_resources 15
num_tags 14
title Data from: Genetic variation among 481 diverse soybean accessions