ViTexOCR; a script to extract text overlays from digital video

The ViTexOCR script presents a new method for extracting navigation data from videos with text overlays using optical character recognition (OCR) software. Over the past few decades, videos recorded during surveys were commonly overlaid with real-time satellite positioning chyrons showing latitude, longitude, date, and time, as well as other ancillary data (such as speed, heading, or user-entered identifying fields). Embedding these data in the video frames keeps position and time tied to the imagery, but the location data cannot be used for other purposes, such as analysis in a geographic information system, while they exist only on the video display. Extracting the text from the imagery with software allows these videos to be located and analyzed in a geospatial context. The script lets a user select a video and specify the text data types (e.g., latitude, longitude, date, time, or other), the text color, and the pixel locations of the overlay text on a sample video frame. The script’s output is a data file containing the retrieved geospatial and temporal data. All functionality is bundled in a Python script that includes a graphical user interface and relies on several other software dependencies.
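The workflow described above (pick overlay regions on a sample frame, isolate pixels of the chosen text color, recognize the text, write a data file) can be sketched roughly as follows. This is a minimal illustration and not the ViTexOCR source: the names `OverlayRegion`, `crop`, `isolate_text`, `extract_row`, and `write_rows` are hypothetical, the frame is modeled as a plain nested list of RGB tuples, and the OCR step is passed in as a caller-supplied `recognize` function because this record does not name the OCR engine the script uses.

```python
import csv
import io
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), assumed 0-255 per channel
Frame = List[List[Pixel]]      # rows of pixels; a stand-in for a decoded video frame
Mask = List[List[int]]         # 1 = text-colored pixel, 0 = background


@dataclass(frozen=True)
class OverlayRegion:
    """Hypothetical record of one overlay field chosen on the sample frame."""
    name: str     # user-assigned data type, e.g. "latitude"
    x: int        # left edge in pixels
    y: int        # top edge in pixels
    width: int
    height: int


def crop(frame: Frame, region: OverlayRegion) -> Frame:
    """Cut the rectangle covered by `region` out of the frame."""
    rows = frame[region.y:region.y + region.height]
    return [row[region.x:region.x + region.width] for row in rows]


def isolate_text(patch: Frame, text_color: Pixel, tolerance: int = 40) -> Mask:
    """Mark pixels whose every channel is within `tolerance` of the text color."""
    return [
        [1 if all(abs(c - t) <= tolerance for c, t in zip(px, text_color)) else 0
         for px in row]
        for row in patch
    ]


def extract_row(frame: Frame,
                regions: Sequence[OverlayRegion],
                text_color: Pixel,
                recognize: Callable[[Mask], str]) -> Dict[str, str]:
    """Run the caller-supplied OCR function on each region of one frame."""
    return {r.name: recognize(isolate_text(crop(frame, r), text_color))
            for r in regions}


def write_rows(rows: Sequence[Dict[str, str]], fieldnames: Sequence[str]) -> str:
    """Serialize the per-frame results as CSV text (one line per frame)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(fieldnames), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

In a real pipeline, `recognize` would wrap whichever OCR engine is available, and `extract_row` would be called once per sampled frame before `write_rows` produces the output file. The per-channel color tolerance is an assumed simplification of how text pixels might be separated from the background.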

Data and Resources

Original Metadata (XML)
The metadata in its original format
Digital Data (XML)
Landing page for access to the data

Field	Value
accessLevel	public
bureauCode	{010:12}
catalog_@context	https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
catalog_conformsTo	https://project-open-data.cio.gov/v1.1/schema
catalog_describedBy	https://project-open-data.cio.gov/v1.1/schema/catalog.json
identifier	USGS:58dd56ace4b02ff32c685954
metadata_type	geospatial
modified	20201019
old-spatial	-180.0, -90.0, 180.0, 90.0
publisher	U.S. Geological Survey
publisher_hierarchy	Department of the Interior > U.S. Geological Survey
resource-type	Dataset
source_datajson_identifier	true
source_hash	9ed79a3f25edd58637c4ecdad0cfa5fa5717d2d0
source_schema_version	1.1
spatial	{"type": "Polygon", "coordinates": [[[-180.0, -90.0], [-180.0, 90.0], [ 180.0, 90.0], [ 180.0, -90.0], [-180.0, -90.0]]]}
theme	{geospatial}
Groups	AmeriGEOSS National Provider North America
Tags	amerigeo amerigeoss ckan cmgp coastal-and-marine-geology-program computer-science geo geoss national north-america pacific-coastal-and-marine-science-center pcmsc scientific-software software-development u-s-geological-survey united-states usgs usgs-58dd56ace4b02ff32c685954
isopen	False
license_id	notspecified
license_title	License not specified
maintainer	Evan T. Dailey
maintainer_email	edailey@usgs.gov
metadata_created	2025-11-21T07:32:42.216614
metadata_modified	2025-11-21T07:32:42.216618
num_resources	2
num_tags	18
title	ViTexOCR; a script to extract text overlays from digital video