Two-stage models improve machine learning classifiers in wildlife research: A case study in identifying false positive detections of Ruffed Grouse

Autonomous recording units are increasingly used to monitor wildlife over large geographic and temporal scales, paired with machine learning (ML) to automate detection. However, false positive detections from ML classifiers can produce erroneous ecological models, which in turn can lead to misguided management and conservation actions. We used a general two-stage approach to understand and reduce false positive detections: outputs of a primary classification model are passed to a secondary classification model, which yields the probability that a detection from the primary model is a true positive. We demonstrate this approach on two open-source models that detect Ruffed Grouse (Bonasa umbellus), BirdNET and the Drumming Model. We analyzed over 9500 hours of acoustic data collected in 2022-2023 from the Green Mountain National Forest in Vermont, USA, and found that the two models detected different types of acoustic signals associated with differing life history traits. The Drumming Model yielded 4106 detections (71.5% true positives); BirdNET yielded 524 detections (17.0% true positives). Secondary logistic regression models separated true positives from false positives with high accuracy (BirdNET = 84.5%; Drumming Model = 89.8%). Beyond improving Ruffed Grouse monitoring and conservation efforts, our findings illustrate more broadly how two-stage ML approaches can improve the use of model-derived detections in wildlife research.
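
The two-stage idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: this record does not say which primary-model outputs feed the secondary models, so the sketch assumes a single confidence score per detection, uses synthetic data in place of the manually verified detections, and fits the logistic regression with a small hand-written gradient descent loop. All function names and parameter values are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, lr=1.0, epochs=2000):
    """Fit p(true positive | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the log loss
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def make_synthetic_detections(n, seed=0):
    """Stand-in for stage one: (confidence score, verified label) pairs.

    Assumption for illustration only: true positives tend to receive
    higher primary-model scores than false positives.
    """
    rng = random.Random(seed)
    scores, labels = [], []
    for _ in range(n):
        is_true = rng.random() < 0.5
        mean = 0.75 if is_true else 0.35
        scores.append(min(1.0, max(0.0, rng.gauss(mean, 0.12))))
        labels.append(1 if is_true else 0)
    return scores, labels


# Stage one output (synthetic), split into training and held-out sets.
scores, labels = make_synthetic_detections(400)
train_x, train_y = scores[:300], labels[:300]
test_x, test_y = scores[300:], labels[300:]

# Stage two: secondary logistic regression on the primary model's output.
w, b = fit_logistic(train_x, train_y)


def prob_true_positive(score):
    """Probability that a primary-model detection is a true positive."""
    return sigmoid(w * score + b)


predictions = [1 if prob_true_positive(s) >= 0.5 else 0 for s in test_x]
accuracy = sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

The accuracy printed here reflects only the synthetic data and has no bearing on the figures reported for BirdNET or the Drumming Model. In practice the secondary model's probability could be thresholded to filter detections or carried forward as a weight in downstream ecological models.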

Data and Resources

Digital Data (XML)
Landing page for access to the data

Field	Value
accessLevel	public
bureauCode	{010:12}
catalog_@context	https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_@id	https://ddi.doi.gov/usgs-data.json
catalog_conformsTo	https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy	https://project-open-data.cio.gov/v1.1/schema/catalog.json
identifier	http://datainventory.doi.gov/id/dataset/usgs-679392d5d34e88f5864c50b5
metadata_type	geospatial
modified	2025-04-30T00:00:00Z
old-spatial	-73.2300, 42.7300, -72.7400, 44.1600
publisher	U.S. Geological Survey
resource-type	Dataset
source_datajson_identifier	true
source_hash	e4a3319955c2791805943f6794510c8e567ad282b3eec54e785789f062039355
source_schema_version	1.1
spatial	{"type": "Polygon", "coordinates": [[[-73.2300, 42.7300], [-73.2300, 44.1600], [ -72.7400, 44.1600], [ -72.7400, 42.7300], [-73.2300, 42.7300]]]}
theme	{geospatial}
Groups	AmeriGEOSS National Provider North America
Tags	AmeriGEO AmeriGEOSS CKAN GEO GEOSS National North America United States acoustic-monitoring artificial-intelligence bioacoustics biota green-mountain-national-forest machine-learning usgs-679392d5d34e88f5864c50b5 vermont wildlife-monitoring
isopen	False
license_id	notspecified
license_title	License not specified
maintainer	Laurence Clarfeld
maintainer_email	lclarfel@uvm.edu
metadata_created	2025-09-24T23:17:31.906103
metadata_modified	2025-09-24T23:17:31.906112
num_resources	1
num_tags	17
title	Two-stage models improve machine learning classifiers in wildlife research: A case study in identifying false positive detections of Ruffed Grouse