Data and code from: The Impacts of Parental Choice and Intrapopulation Selection for Seed Size on the Uprightness of Progeny Derived from Interspecific Hybridization between Glycine max and Glycine soja

This dataset contains all data and code necessary to reproduce the analysis described under the heading "Experiment 3" in the manuscript:

Taliercio, E., Eickholt, D., Read, Q. D., Carter, T., Waldeck, N., & Fallen, B. (2023). Parental choice and seed size impact the uprightness of progeny from interspecific Glycine hybridizations. Crop Science. https://doi.org/10.1002/csc2.21015

The attached files are:

G_max_G_soja_seedweight_seedcolor_analysis.Rmd: RMarkdown notebook containing all analysis code. The CSV data files should be placed in a subdirectory called data within the working directory from which the notebook is rendered.

G_max_G_soja_seedweight_seedcolor_analysis.html: Rendered HTML output from RMarkdown notebook, including figures, tables, and explanatory text.

counts_seedwt.csv: CSV file containing the number of progeny selected and average 100-seed weight data for each combination of cross, size class, and replicate. Columns are:

  • F3_location: text identifier of F3 nursery location, either "CLA" or "FF"
  • plot: numeric ID of plot
  • pop: numeric ID of population
  • max: name of G. max parent
  • soja: name of G. soja parent
  • F2_location: text identifier of F2 nursery location, either "Caswell" or "Hugo"
  • n_planted: number of seeds planted (raw)
  • n_selected: number of progeny selected
  • size_ordered: seed size class, to be converted to an ordered factor
  • size_combined: seed size class aggregated to fewer unique levels
  • ave_100sw: average 100-seed weight for the given size class
  • n_planted_trials: number of seeds planted rounded to nearest integer

seedcolor.csv: CSV file with additional data on number of seeds of each color by population. Columns are:

  • cross: text identifier of cross
  • line: text identifier of line
  • light: number of light seeds
  • mid: number of mid-green seeds
  • brown: number of brown seeds
  • dark: number of dark or black seeds
  • population: identifier of population type (F2 derived or selected)
  • max: name of G. max parent
  • n_total: sum of the light, mid, brown, and dark columns
  • soja: name of G. soja parent
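The column descriptions above suggest a quick sanity check before rendering the notebook. The following is a minimal Python sketch, not part of the published R analysis: it confirms that a CSV header contains the documented columns and that n_total equals the sum of the four color counts. The helper names and the sample rows are hypothetical placeholders; the rows are made up and are not values from the real seedcolor.csv, and the actual encoding of the population column may differ.

```python
import csv
import io

# Column names as listed in the dataset description.
COUNTS_COLUMNS = [
    "F3_location", "plot", "pop", "max", "soja", "F2_location",
    "n_planted", "n_selected", "size_ordered", "size_combined",
    "ave_100sw", "n_planted_trials",
]
SEEDCOLOR_COLUMNS = [
    "cross", "line", "light", "mid", "brown", "dark",
    "population", "max", "n_total", "soja",
]
COLOR_COLUMNS = ["light", "mid", "brown", "dark"]


def missing_columns(header, expected):
    """Return the expected column names that are absent from a CSV header."""
    return [name for name in expected if name not in header]


def n_total_mismatches(rows):
    """Return indices of rows where n_total differs from the color-count sum."""
    bad = []
    for i, row in enumerate(rows):
        # float() tolerates counts stored as "10" or "10.0" (an assumption).
        color_sum = sum(float(row[c]) for c in COLOR_COLUMNS)
        if abs(color_sum - float(row["n_total"])) > 1e-9:
            bad.append(i)
    return bad


# Made-up placeholder rows, NOT values from the real seedcolor.csv.
SAMPLE = """cross,line,light,mid,brown,dark,population,max,n_total,soja
crossA,line1,10,5,3,2,F2 derived,maxParent,20,sojaParent
crossA,line2,4,4,1,1,selected,maxParent,11,sojaParent
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
rows = list(reader)
print(missing_columns(reader.fieldnames, SEEDCOLOR_COLUMNS))  # no columns missing
print(n_total_mismatches(rows))  # flags the second row (colors sum to 10, not 11)
```

To run the same checks on the real files, replace io.StringIO(SAMPLE) with an open file handle for data/seedcolor.csv, and pass COUNTS_COLUMNS when checking the header of data/counts_seedwt.csv.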

The data processing and analysis pipeline in the RMarkdown notebook includes:

  • Importing the data (slightly cleaned version is provided)
  • Creating boxplots of proportion selected by cross, nursery location, and size class
  • Fitting logistic GLMM to estimate the probability of selection as a function of parent, 100-seed weight, and their interactions
  • Extracting and plotting random effect estimates from model
  • Calculating and plotting estimated marginal means from model
  • Taking contrasts between pairs of estimated marginal means and trends
  • Calculating Bayes Factors associated with the contrasts
  • Generating figures and tables for all above results
  • Additional seed color analysis: importing data (slightly cleaned version is provided)
  • Additional seed color analysis: drawing exploratory bar plot
  • Additional seed color analysis: fitting multinomial GLM modeling the proportion of seeds with each color as a function of population
  • Additional seed color analysis: generating expected value predictions from GLM and taking contrasts
  • Additional seed color analysis: creating figures and tables for model results
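As a rough illustration of the quantity behind the boxplots and the logistic model, the sketch below computes the proportion of planted seeds that were selected for each combination of parents and size class. It is written in Python rather than the notebook's R and is not the authors' code. It pools counts within each group, which is a simplification; the notebook may instead plot one proportion per plot. The size class labels, the parent names, and every number in the sample are made-up placeholders, and the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical ordering of size classes; the real labels in size_ordered may differ.
SIZE_LEVELS = ["small", "medium", "large"]

# Made-up placeholder rows, NOT values from the real counts_seedwt.csv.
SAMPLE = """F3_location,max,soja,size_ordered,n_selected,n_planted_trials
CLA,maxParent,sojaParent,small,2,40
FF,maxParent,sojaParent,small,3,60
CLA,maxParent,sojaParent,large,9,30
FF,maxParent,sojaParent,large,6,30
"""


def proportion_selected(rows, keys):
    """Pool n_selected / n_planted_trials within each combination of `keys`."""
    selected = defaultdict(int)
    planted = defaultdict(int)
    for row in rows:
        group = tuple(row[k] for k in keys)
        selected[group] += int(row["n_selected"])
        planted[group] += int(row["n_planted_trials"])
    return {g: selected[g] / planted[g] for g in planted}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
props = proportion_selected(rows, ["max", "soja", "size_ordered"])

# Report groups in size-class order rather than alphabetical order,
# mimicking what an ordered factor provides in R.
for group in sorted(props, key=lambda g: SIZE_LEVELS.index(g[2])):
    print(group, round(props[group], 3))
```

Sorting by position in SIZE_LEVELS stands in for the ordered factor mentioned in the column description of size_ordered, so that summaries follow size order instead of alphabetical order.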

This research was funded by CRIS 6070-21220-069-00D, United Soybean Board Project # 2333-203-0101, and falls under National Program NP301.

Resources in this dataset:

  • Resource Title: RMarkdown document with all analysis code. File Name: G_max_G_soja_seedweight_seedcolor_analysis.Rmd
  • Resource Title: Rendered HTML version of notebook. File Name: G_max_G_soja_seedweight_seedcolor_analysis.html
  • Resource Title: Progeny counts and seed weight data. File Name: counts_seedwt.csv
  • Resource Title: Seed color counts data. File Name: seedcolor.csv

Data and Resources

Field Value
accessLevel public
bureauCode {005:18}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
identifier 10.15482/USDA.ADC/1528604
license https://www.usa.gov/publicdomain/label/1.0/
modified 2024-02-21
old-spatial {"type": "MultiPoint", "coordinates": [[-77.58, 35.26], [-77.79, 35.95], [-78.46, 35.65], [-67, 18.45]]}
programCode {005:040}
publisher Agricultural Research Service
resource-type Dataset
source_datajson_identifier true
source_hash 0dd5061bc9aab8cc8435e03745eef0fc08357a9dad944d56d5649ef325e7b748
source_schema_version 1.1
spatial {"type": "MultiPoint", "coordinates": [[-77.58, 35.26], [-77.79, 35.95], [-78.46, 35.65], [-67, 18.45]]}
temporal 2013-01-01/2021-12-31
Groups
  • AmeriGEOSS
  • National Provider
  • North America
Tags
  • AmeriGEO
  • AmeriGEOSS
  • CKAN
  • GEO
  • GEOSS
  • National
  • North America
  • United States
  • ars
  • data-gov
  • glycine-max
  • glycine-soja
  • hybrids
  • np301
  • plant-breeding
  • response-to-selection
  • seed-size
  • soybean
  • uprightness
isopen False
license_id us-pd
license_title us-pd
maintainer Read, Quentin
maintainer_email quentin.read@usda.gov
metadata_created 2025-09-24T06:23:03.018377
metadata_modified 2025-09-24T06:23:03.018387
num_resources 4
num_tags 19