Two-stage models improve machine learning classifiers in wildlife research: A case study in identifying false positive detections of Ruffed Grouse

Autonomous recording units are increasingly used to monitor wildlife across large geographic and temporal scales, paired with machine learning (ML) to automate detection. However, false positive detections from ML classifiers can produce erroneous ecological models, which in turn can lead to misguided management and conservation actions. We used a general two-stage approach to understand and reduce false positive detections: outputs of a primary classification model are passed to a secondary classification model, which yields the probability that a detection from the primary model is a true positive. We demonstrate this approach on two open-source models, BirdNET and the Drumming Model, that detect Ruffed Grouse (Bonasa umbellus). We analyzed over 9500 hours of acoustic data collected in 2022-2023 from the Green Mountain National Forest in Vermont, USA, and found that the two models detected different types of acoustic signals associated with differing life history traits. The Drumming Model yielded 4106 detections (71.5% true positives); BirdNET yielded 524 detections (17.0% true positives). Secondary logistic regression models separated true positives from false positives with high accuracy (BirdNET = 84.5%; Drumming Model = 89.8%). Beyond improving Ruffed Grouse monitoring and conservation, our findings illustrate more broadly how two-stage ML approaches can improve the use of model-derived detections in wildlife research.
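
The two-stage idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the primary model's only output is a single confidence score per detection, uses synthetic scores in place of manually verified detections, and fits the secondary logistic regression with plain gradient descent. All function names and parameter values are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(scores, labels, lr=1.0, epochs=3000):
    """Fit P(true positive | score) = sigmoid(w * score + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(scores, labels):
            err = sigmoid(w * x + b) - y  # gradient of the log-loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def true_positive_probability(score, w, b):
    # Stage 2: map a primary-model score to a probability of being a true positive.
    return sigmoid(w * score + b)


# Synthetic stand-in for verified detections (assumption: true positives
# tend to receive higher primary-model scores than false positives).
rng = random.Random(0)
scores, labels = [], []
for _ in range(200):
    is_true = rng.random() < 0.5
    mean = 0.75 if is_true else 0.35
    scores.append(min(1.0, max(0.0, rng.gauss(mean, 0.12))))
    labels.append(1 if is_true else 0)

w, b = fit_logistic(scores, labels)
predictions = [1 if true_positive_probability(s, w, b) >= 0.5 else 0 for s in scores]
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
```

In practice the secondary model would be trained on detections whose true or false status has been checked by a person, and its predicted probabilities could be thresholded or carried forward as weights in later analyses. The accuracy computed here is on the synthetic training data only and says nothing about the values reported in the abstract.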

Field Value
accessLevel public
bureauCode {010:12}
catalog_@context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_@id https://ddi.doi.gov/usgs-data.json
catalog_conformsTo https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy https://project-open-data.cio.gov/v1.1/schema/catalog.json
identifier http://datainventory.doi.gov/id/dataset/usgs-679392d5d34e88f5864c50b5
metadata_type geospatial
modified 2025-04-30T00:00:00Z
old-spatial -73.2300, 42.7300, -72.7400, 44.1600
publisher U.S. Geological Survey
resource-type Dataset
source_datajson_identifier true
source_hash e4a3319955c2791805943f6794510c8e567ad282b3eec54e785789f062039355
source_schema_version 1.1
spatial {"type": "Polygon", "coordinates": [[[-73.2300, 42.7300], [-73.2300, 44.1600], [ -72.7400, 44.1600], [ -72.7400, 42.7300], [-73.2300, 42.7300]]]}
theme {geospatial}
Groups
  • AmeriGEOSS
  • National Provider
  • North America
Tags
  • AmeriGEO
  • AmeriGEOSS
  • CKAN
  • GEO
  • GEOSS
  • National
  • North America
  • United States
  • acoustic-monitoring
  • artificial-intelligence
  • bioacoustics
  • biota
  • green-mountain-national-forest
  • machine-learning
  • usgs-679392d5d34e88f5864c50b5
  • vermont
  • wildlife-monitoring
isopen False
license_id notspecified
license_title License not specified
maintainer Laurence Clarfeld
maintainer_email lclarfel@uvm.edu
metadata_created 2025-09-24T23:17:31.906103
metadata_modified 2025-09-24T23:17:31.906112
num_resources 1
num_tags 17