The Queryable Earth: How the SpatioTemporal Asset Catalog Liberated Public Imagery from Legacy Silos

The rapid proliferation of Earth observation satellites, high-resolution drones, and aerial sensors has generated an unprecedented volume of geospatial data, capturing the dynamics of a changing planet in near-real-time. Yet, for decades, accessing this wealth of information remained a deeply fractured, frustrating, and inefficient endeavor. Scientists, analysts, and developers faced a highly fragmented landscape of custom application programming interfaces (APIs), proprietary platforms, and inconsistent data storage methods. Finding, filtering, and downloading imagery often required reverse-engineering bespoke interfaces for every new dataset, leaving researchers to wrestle with data management rather than focusing on analysis.

The emergence of the SpatioTemporal Asset Catalog (STAC) specification has fundamentally redefined the relationship between raw pixels and geographic exploration. Arriving precisely as cloud computing matured and remote sensing archives expanded exponentially, STAC replaced administrative obstacles with a developer-centric, open-source standard. By shifting the industry paradigm from heavy, on-premises database queries to cloud-native streaming, STAC has democratized access to the world’s public Earth observation archives. This technological leap is examined through the lens of publishing STAC Catalogs for Public Data, a process that has turned locked archives into a living, streamable public commons and brought the vision of a 'Queryable Earth' into reality.

Serving as a conceptual follow-on to the real-world tracking workflows discussed in My Journey from STAC APIs to Streaming SAR Imagery, this piece expands on how those core API mechanics are structured globally.

The Friction of the Past: Legacy Spatial Data Infrastructures and the XML Desert

To appreciate the impact of the STAC framework, the structural and human limitations of legacy cataloging methods must first be examined. For decades, spatial data infrastructures (SDIs) relied heavily on standards established by the Open Geospatial Consortium (OGC) and the International Organization for Standardization (ISO). Principal among these were the Catalog Service for the Web (CSW), the Web Feature Service (WFS), and metadata structured around the ISO 19115 and ISO 19139 XML schemas.

While these frameworks successfully organized regional geoportals, they struggled when scaled to modern cloud environments and petabyte-scale archives. Legacy catalogs were heavily database-bound, relying on complex application servers to parse nested XML payloads. In a WFS environment, users could retrieve discrete vector features, but the actual raster coverage remained detached. Web Coverage Services (WCS) attempted to return multi-dimensional physical properties, but legacy catalog specifications lacked formal, standardized procedures for modifying or updating the underlying metadata.

This technical design introduced several systemic challenges for developers and researchers:

  • XML Schema Complexity: ISO 19115 documents were notoriously long, rigid, and complex, making machine-to-machine scripting and rapid client-side parsing highly difficult.

  • The Download Bottleneck: Imagery search was structurally decoupled from data delivery. To analyze or inspect a satellite scene, users were forced to download multi-gigabyte zip files over File Transfer Protocol (FTP) or HTTP, only to find that the target area was obscured by clouds.

  • API Incompatibility: Every data provider implemented search parameters differently, requiring developers to write custom API wrappers for every satellite constellation or aerial campaign.

  • Maintenance Complexity: High-performance database clusters (such as Elasticsearch or PostGIS) required continuous maintenance, backups, and structural indices; they were prone to corruption and carried high operational costs.

Dimension | Legacy Geospatial Cataloging (CSW / ISO 19115) | Cloud-Native STAC Paradigm
Data Format | Complex, deeply nested XML schemas | Lightweight, developer-friendly JSON & GeoJSON
Search Interface | SOAP/XML-based CSW endpoints | RESTful STAC APIs aligned with OGC API - Features
Access Model | Bulk zip file downloads via FTP/HTTP | Direct programmatic streaming of cloud-optimized formats
Operational State | Rigid, database-backed infrastructure | Dual flexibility: active API or static, serverless flat files
Integration Path | Bespoke client integration and custom tooling | Standardized, highly reusable open-source libraries

These operational bottlenecks created deep frustration, setting the stage for a radical rethinking of metadata. This frustration boiled over in 2015, sparking a collaborative movement that would redefine the industry.

A Spark in the Mountains: The Boulder Sprints and the Guerrilla Playbook

The origins of the SpatioTemporal Asset Catalog specification trace back to a grassroots movement within the open-source geospatial community. The foundational concepts were drafted at an ad hoc meeting focused on OpenAerialMap during the 2015 FOSS4G-NA conference. Out of this gathering, the Open Imagery Network (OIN) was conceived. The driving idea behind OIN was to standardize a metadata layer below active web maps, utilizing simple JSON sidecar files to detail the basic attributes of hosted imagery. This open commons approach guaranteed an open license that any organization could use.

This framework was subsequently expanded in October 2017 during an intensive, multi-organizational sprint in Boulder, Colorado. Organized under the auspices of Radiant Earth, which served as a neutral convening body, approximately 20 developers representing 13 distinct geospatial organizations gathered to address the interoperability of satellite data APIs. The primary goal was to establish a core set of search fields capable of handling not only traditional satellite imagery, but also drone data, balloon imagery, LiDAR point clouds, synthetic aperture radar (SAR), hyperspectral files, and derived datasets like normalized difference vegetation indices (NDVI).

This effort bypassed traditional, bureaucratic standards-setting processes in favor of a community-driven methodology often called the "Guerrilla Standards Playbook". Coined by Howard Butler, this playbook prioritized working implementations over theoretical compliance, drawing inspiration from earlier grassroots standards like GeoJSON and Cloud Optimized GeoTIFFs (COGs). The community focused on meeting the most essential 80% of developer needs with a simple, lightweight core specification, leaving highly specialized requirements to an extensible system of community-developed plugins.

A core reason this playbook worked is that it actively avoided a common trap in open standards work: the “Field of Dreams” fallacy. As Matt Hanson from Element 84 noted, "You can have the most elegant standard, but if nobody uses it, it's not a success". The project’s direction was driven organically by the community, with influence earned through pull requests and debates on GitHub. Chris Holmes acted as the "benevolent dictator for life," guiding the overarching vision of a unified metadata standard, while Matthias Mohr of the openEO project joined after the first couple of sprints to provide meticulous technical oversight and structural rigor. By the end of the very first sprint, the community had built a working server. While simple, this rapid prototype validated their collaborative approach, ensuring that every subsequent change to the specification was tested against working code.

With the spirit of guerrilla standards driving the community, the sprint participants set out to translate their philosophy into code. The result of this practical work was a simple, elegant architecture built on four distinct pillars.

Defining the STAC Core: Simple Structures, Powerful Extensions

The STAC architecture is built on a simple foundation, organizing geospatial data using JSON and GeoJSON formats to maximize developer accessibility. Rather than relying on a monolithic system, STAC divides cataloging into four complementary specifications:

Specification Component | Underlying Structure | Primary Operational Purpose
STAC Catalog | Simple JSON document | Acts as a structural directory linking to other Catalogs, Collections, or Items.
STAC Collection | JSON document extending the Catalog spec | Groups similar Items, providing dataset-level metadata such as licensing, spatial-temporal extents, and descriptions.
STAC Item | GeoJSON Feature containing geometry, properties, and assets | Represents a single scene or dataset at a specific location and time, linking directly to the underlying raw files.
STAC API | Dynamic RESTful web service | Enables programmatic, high-speed querying of Items based on spatial, temporal, and property-based filters.

The STAC Item serves as the fundamental building block of the entire specification. Because it is designed as a GeoJSON Feature, it natively integrates with modern web mapping libraries. Crucially, a STAC Item contains an "assets" dictionary, a map of unique keys pointing directly to the actual files, such as multi-spectral bands, metadata files, or browse thumbnails.
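As a rough illustration of that structure, the sketch below assembles a minimal Item as a plain Python dictionary. The identifier, coordinates, timestamp, and asset URLs are hypothetical placeholders, and the field set is a small subset chosen for readability rather than a complete, validated Item.

```python
def make_item(item_id, bbox, when, assets):
    """Build a minimal STAC-Item-like GeoJSON Feature (illustrative subset only)."""
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": [west, south, east, north],
        # Footprint as a closed polygon ring derived from the bounding box.
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [west, south], [east, south], [east, north],
                [west, north], [west, south],
            ]],
        },
        "properties": {"datetime": when},
        "links": [],
        # The assets dictionary maps unique keys to the actual files.
        "assets": assets,
    }

# All values below are made up for the example.
item = make_item(
    "example-scene-001",
    [-121.6, 39.5, -121.2, 39.8],
    "2020-09-10T18:45:00Z",
    {
        "red": {
            "href": "https://example.com/scene-001/B04.tif",
            "type": "image/tiff; application=geotiff; profile=cloud-optimized",
        },
        "thumbnail": {
            "href": "https://example.com/scene-001/thumb.png",
            "type": "image/png",
        },
    },
)
```

Because the result is an ordinary GeoJSON Feature, it can be handed to any GeoJSON-aware mapping library, while a client interested in pixels only needs to look up a key in `item["assets"]`.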

To maintain stability and prevent version churn, the STAC community engaged in detailed debates to keep the core specification as small as possible, capturing only the most essential spatial and temporal fields. Any specialized fields, such as satellite view geometry, instrument-specific parameters, or cloud-cover thresholds, were pushed into an extensible system of community-governed extensions.
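One way to picture this split between core and extensions is the sketch below, which layers a cloud-cover field onto a core Item. The helper name `add_extension_fields` and the sample Item are hypothetical; the schema URI is assumed to be the published Electro-Optical (EO) extension schema, and the code performs no schema validation.

```python
# Assumed URI of the EO extension schema; pin whichever version you actually use.
EO_SCHEMA = "https://stac-extensions.github.io/eo/v1.1.0/schema.json"

def add_extension_fields(item, schema_uri, fields):
    """Return a copy of the item with extension fields added to its properties
    and the extension schema declared in stac_extensions."""
    updated = dict(item)
    updated["properties"] = {**item.get("properties", {}), **fields}
    declared = list(item.get("stac_extensions", []))
    if schema_uri not in declared:
        declared.append(schema_uri)
    updated["stac_extensions"] = declared
    return updated

# Hypothetical core item carrying only the essential temporal field.
core_item = {
    "type": "Feature",
    "id": "example-scene-001",
    "properties": {"datetime": "2020-09-10T18:45:00Z"},
}

extended = add_extension_fields(core_item, EO_SCHEMA, {"eo:cloud_cover": 12.5})
```

The prefixed field name (`eo:`) keeps specialized metadata visibly separate from the core fields, so a client that does not understand the extension can simply ignore it.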

The Dichotomy of Static Catalogs and Dynamic APIs

A key architectural strength of the STAC specification is its support for both static catalogs and dynamic APIs. A Static STAC consists of flat JSON files hosted on a simple web server or a cloud object store like Amazon S3 or Google Cloud Storage. Because it is simply a set of interconnected links, it cannot be queried directly; instead, clients discover data by crawling the catalog.
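The crawling pattern can be sketched in a few lines. To keep the example runnable without network access, the catalog below is a hypothetical in-memory stand-in for JSON files on an object store; a real crawler would replace the `fetch` argument with an HTTP GET.

```python
from urllib.parse import urljoin

# Hypothetical static catalog: URL -> parsed JSON document.
FILES = {
    "https://example.com/catalog.json": {
        "type": "Catalog", "id": "root",
        "links": [{"rel": "child", "href": "./landsat/collection.json"}],
    },
    "https://example.com/landsat/collection.json": {
        "type": "Collection", "id": "landsat",
        "links": [
            {"rel": "item", "href": "./scene-a.json"},
            {"rel": "item", "href": "./scene-b.json"},
            {"rel": "parent", "href": "../catalog.json"},
        ],
    },
    "https://example.com/landsat/scene-a.json": {"type": "Feature", "id": "scene-a", "links": []},
    "https://example.com/landsat/scene-b.json": {"type": "Feature", "id": "scene-b", "links": []},
}

def crawl(url, fetch=FILES.__getitem__, seen=None):
    """Yield every Item reachable from url by following child and item links."""
    seen = set() if seen is None else seen
    if url in seen:  # guard against link cycles
        return
    seen.add(url)
    doc = fetch(url)
    if doc.get("type") == "Feature":
        yield doc
        return
    for link in doc.get("links", []):
        if link.get("rel") in ("child", "item"):
            # Relative hrefs are resolved against the document that contains them.
            yield from crawl(urljoin(url, link["href"]), fetch, seen)

item_ids = [item["id"] for item in crawl("https://example.com/catalog.json")]
```

The trade-off is visible in the code: discovery cost grows with the number of files visited, which is why large archives usually pair a static catalog with a searchable API.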

This static approach provides high reliability. Active databases and search clusters (like Elasticsearch) can be difficult to maintain at scale, prone to corruption, and require extensive operational work. Because the core of a static catalog is composed of flat files, it serves as an extremely robust, corruption-resistant canonical point of truth. It allows smaller data providers, who may lack the budget or technical resources to maintain complex active databases, to expose their datasets simply by uploading JSON files, making them easily indexable by external crawlers.

Conversely, the STAC API represents the active, queryable implementation of the specification. Built on top of the OGC API - Features standard, the STAC API allows users to search massive datasets programmatically using spatial bounding boxes, precise date ranges, and custom properties like cloud-cover thresholds.
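A search of this kind is typically sent as a JSON body to the API's `/search` endpoint. The sketch below only builds that body and sends nothing over the network. The collection name and coordinates are hypothetical, and the `query` field assumes the server implements the optional Query extension, which not every deployment does.

```python
import json

def build_search_body(bbox, start, end, collections, max_cloud=None, limit=100):
    """Assemble a STAC API item-search request body (sketch, not validated)."""
    body = {
        "collections": list(collections),
        "bbox": list(bbox),             # [west, south, east, north]
        "datetime": f"{start}/{end}",   # closed interval of RFC 3339 timestamps
        "limit": limit,                 # page size, not a total cap
    }
    if max_cloud is not None:
        # Property filter; assumes the Query extension is available.
        body["query"] = {"eo:cloud_cover": {"lt": max_cloud}}
    return body

body = build_search_body(
    bbox=[-121.6, 39.5, -121.2, 39.8],
    start="2020-09-01T00:00:00Z",
    end="2020-09-30T23:59:59Z",
    collections=["example-optical-collection"],  # hypothetical collection id
    max_cloud=10,
)
payload = json.dumps(body)  # what a client would POST to <api-root>/search
```

In practice a client library such as pystac-client would construct and page through such requests; the plain dictionary is shown here only to make the shape of the query visible.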

Importantly, the shared structural design of STAC Items and links allows the API to serve as a bridge. Simple web clients can treat an active API exactly like a static catalog, crawling its endpoints to build external indices. This architectural alignment allows organizations to expose complex active databases to a wider range of tools without altering their underlying storage systems. Furthermore, if an active API's database is ever corrupted, a static catalog crawled from the API can serve as a reliable, complete backup to reconstruct the database. To bring these abstract specifications to life, developers began crafting specialized tools that turned JSON schemas into functional code.

The Tooling Explosion: From PySTAC to Serverless GeoParquet

As the STAC specification matured, the development of common, reusable tooling became essential to accelerate its adoption. In November 2019, Azavea released PySTAC 0.3, a core Python library that allowed users to read, write, and manipulate STAC data. PySTAC was initially developed as a contribution to sat-stac, a tool built by Matt Hanson as part of the sat-util suite. By providing a dependency-free implementation of the core concepts, PySTAC lowered the barrier to entry for Python developers, guiding them through familiar object-oriented concepts rather than raw JSON schemas.

The STAC software ecosystem has since expanded to support every stage of the data lifecycle. Open-source frameworks like stac-server and stac-fastapi provide out-of-the-box, scalable STAC API implementations backed by Elasticsearch and PostgreSQL, respectively. For visualization, tools like stac-browser allow organizations to deploy highly performant, lightweight web portals directly on top of their STAC endpoints, making petabytes of data easily discoverable by non-technical users.

[Figure: Direct stream processing. Cloud storage (S3 / Azure Blob); STAC client (pystac-client / eoAPI); steps: 1. Query API, 2. HTTP Range; direct stream into QGIS / Python.]

The maturation of cloud-native geospatial data formats has further expanded these capabilities. Cloud Optimized GeoTIFFs (COGs) organize pixel data internally, allowing users to retrieve only the portions of an image they need using HTTP range requests. While COGs enable efficient pixel-level streaming, STAC provides the standardized metadata framework required to find those pixels. Together, they support a modern, cloud-native workflow that reduces data transfer costs, eliminates redundant file copies, and enables complex distributed processing directly on cloud-hosted datasets.
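The range-request idea can be made concrete with a small sketch. It assumes the client has already read the tile offset and byte-count tables from the file header; the numbers below are invented for illustration, and the helper names are hypothetical rather than part of any COG library.

```python
def tile_index(col, row, tiles_across):
    """Position of a tile in the offset tables (tiles stored row by row)."""
    return row * tiles_across + col

def range_header(offset, byte_count):
    """HTTP Range header covering byte_count bytes starting at offset."""
    if offset < 0 or byte_count <= 0:
        raise ValueError("offset must be >= 0 and byte_count must be > 0")
    # Range end positions are inclusive, hence the -1.
    return {"Range": f"bytes={offset}-{offset + byte_count - 1}"}

# Hypothetical tables for a 2 x 2 tiled image, as read from the file header.
tile_offsets = [1024, 5120, 9216, 13312]
tile_byte_counts = [4096, 4096, 4096, 3000]

# Request only the bottom-right tile instead of the whole file.
idx = tile_index(col=1, row=1, tiles_across=2)
headers = range_header(tile_offsets[idx], tile_byte_counts[idx])
```

A client would attach `headers` to an ordinary HTTP GET for the asset's `href`, receiving a few kilobytes rather than the full scene.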

This landscape continues to evolve with the integration of GeoParquet. Developed by engineers at Development Seed, including Pete Gadomski and Anthony Lukach, stac-geoparquet converts STAC Items into highly compressed, columnar Parquet files. This approach addresses "STAC at small scale". As the development team noted:

"For smaller teams or lightweight workflows, setting up and maintaining a full backend can feel like using a sledgehammer to crack a walnut." 

By serving metadata through stac-fastapi-geoparquet, teams can run lightweight, serverless queries directly on cloud-hosted metadata, bypassing the need to manage active database backends for small-to-medium catalogs (under 100,000 items).
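The appeal of a columnar layout can be sketched with plain Python lists. This is not the stac-geoparquet API and writes no Parquet file; it only illustrates, with hypothetical items, how pivoting Items into columns lets a filter scan a single field instead of parsing every JSON document.

```python
def items_to_columns(items, property_keys):
    """Pivot a list of STAC-like items into a dict of equal-length columns."""
    columns = {"id": [], "bbox": []}
    for key in property_keys:
        columns[key] = []
    for item in items:
        columns["id"].append(item["id"])
        columns["bbox"].append(item.get("bbox"))
        for key in property_keys:
            # Missing properties become None so every column stays aligned.
            columns[key].append(item["properties"].get(key))
    return columns

# Hypothetical items for illustration.
items = [
    {"id": "a", "bbox": [0, 0, 1, 1],
     "properties": {"datetime": "2024-01-01T00:00:00Z", "eo:cloud_cover": 5}},
    {"id": "b", "bbox": [1, 1, 2, 2],
     "properties": {"datetime": "2024-01-02T00:00:00Z", "eo:cloud_cover": 60}},
    {"id": "c", "bbox": [2, 2, 3, 3],
     "properties": {"datetime": "2024-01-03T00:00:00Z"}},
]

columns = items_to_columns(items, ["datetime", "eo:cloud_cover"])

# A filter now touches one column only.
clear_ids = [
    columns["id"][i]
    for i, cover in enumerate(columns["eo:cloud_cover"])
    if cover is not None and cover < 20
]
```

A real columnar file adds compression and per-column statistics on top of this layout, which is what makes serverless queries over a single file practical.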

As these programmatic libraries lowered the barriers to entry, the stage was set for a massive validation of the standard. The ultimate test of the STAC specification would come from the world's most extensive, historical Earth observation archives.

Planetary Scale Legitimacy: Landsat Collection 2 and the Microsoft Planetary Computer

The transition of the STAC specification from a grassroots project to an industry standard was accelerated by two major cloud-native initiatives.

USGS Landsat Collection 2

On December 1, 2020, the U.S. Geological Survey (USGS) Earth Resources Observation and Science (EROS) Center completed Landsat Collection 2, a significant reprocessing of the 50-year Landsat archive. By leveraging commercial cloud architectures on AWS, the USGS reprocessed its entire global archive, spanning Landsats 1 through 8 at the time of release, with Landsat 9 data added after its 2021 launch, at an unprecedented speed of 450,000 scenes per day. All 40 million generated records were formatted as Cloud Optimized GeoTIFFs and published with STAC metadata.

The impact on global scientific research was immediate: the reprocessing brought Landsat into geometric alignment with Europe’s Sentinel-2 constellation and enabled global surface temperature and reflectance modeling. EROS Director Christopher "C.J." Loria highlighted this operational leap, noting that the move to cloud processing and cloud data access represented "a significant paradigm shift for the USGS". Calli Jenkerson, who managed the Land Science Research and Development team, summarized the scientific value simply: "It makes good data better. That's what it does". Furthermore, Crawford, a Landsat Science Team member, noted that bringing the Landsat and Sentinel-2 missions into geometric alignment "is a significant leap forward".

Microsoft Planetary Computer

The Microsoft Planetary Computer utilizes the STAC specification to organize and index petabytes of open environmental data. All spatial search, visualization, and cloud processing on the platform are driven by a high-performance STAC API. To secure access to datasets hosted on Azure Blob Storage while keeping them free for public use, Microsoft developed an active authentication model. Users access data by requesting a short-lived Shared Access Signature (SAS) token from a dedicated API endpoint. This token is then appended to the resource URL, allowing client tools to stream the underlying pixels directly. This credential management model is built directly into client libraries like the planetary-computer Python package, allowing developers to query STAC metadata and sign file paths automatically in a single step.
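The token-appending step can be sketched with the standard library alone. The token string, storage URL, and helper names below are hypothetical, and no token is actually requested; in practice the client package described above handles both the request and the signing.

```python
from urllib.parse import urlsplit, urlunsplit

def append_token(href, token):
    """Append a SAS-style query-string token to a URL, preserving any existing query."""
    parts = urlsplit(href)
    token = token.lstrip("?")
    query = f"{parts.query}&{token}" if parts.query else token
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))

def sign_assets(item, token):
    """Return a copy of the item whose asset hrefs all carry the token."""
    signed = dict(item)
    signed["assets"] = {
        key: {**asset, "href": append_token(asset["href"], token)}
        for key, asset in item["assets"].items()
    }
    return signed

# Hypothetical item and token for illustration only.
item = {
    "id": "example-scene-001",
    "assets": {"red": {"href": "https://example.blob.core.windows.net/data/B04.tif"}},
}
signed = sign_assets(item, "sv=2021-01-01&sig=EXAMPLE")
```

Because the token is short-lived, signed URLs should be generated close to the moment of use rather than stored alongside the catalog.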

In September 2025, Microsoft expanded this ecosystem with a public preview of Planetary Computer Pro, introducing native integration with Microsoft Entra ID. This enterprise feature allows private organizations to ingest and publish their own spatial datasets using the STAC standard. By securing internal catalogs with Entra ID, corporate and government users can manage and search private geospatial repositories using the same open-source tools developed for the public platform.

While these cloud-hosted public archives proved that STAC could manage data at a planetary scale, the benefits of this infrastructure had to reach beyond specialized cloud engineers. To truly democratize Earth observation, the technology needed to find a home on the everyday desktops of GIS professionals.

Empowering the Desktop: QGIS and the Lutra Consulting Revolution

Historically, desktop GIS analysts had to rely on custom, provider-specific plugins to connect to satellite archives, creating a fragmented user experience. In QGIS 3.40, Saber Razmjooei and the development team at Lutra Consulting addressed this fragmentation by implementing STAC as a core, native data provider within the software. This desktop integration was expanded in QGIS 3.42 to include full dynamic searching, filtering, and footprint previews.

Through the native Data Source Manager, an analyst can configure a connection to any public STAC API, such as SNAPPlanet or the Microsoft Planetary Computer. Using an intuitive interface, users can apply spatial bounding boxes and temporal filters to search millions of assets.

[Figure: Direct COG Streaming. Step 01: Query STAC API. Step 02: Display Footprints on Map Canvas. Step 03: Stream Pixels Directly.]

QGIS displays the exact geographic footprint of matching results directly on the map canvas as a vector layer. Once a target item is selected, the analyst can drag it into the layers panel. If the underlying asset is in a cloud-optimized format like COG or COPC, QGIS streams the pixels directly into the display in real-time, bypassing the need to download the file locally. Funded in part by Microsoft, this integration also supports Entra ID and SAS token authentication natively, allowing enterprise users to access both public and secure private catalogs within a unified desktop environment.

By bringing cloud-hosted data directly into the desktop environment, the integration of STAC in QGIS bypassed hours of traditional downloading. This rapid accessibility is not merely a convenience; in the face of sudden-onset natural disasters, it becomes a critical asset that directly impacts human survival.

Humanitarian Frontlines: STAC on the Edge of Disaster

When sudden-onset disasters occur, the speed of imagery discovery, processing, and delivery is vital for emergency responders. In these high-stakes scenarios, STAC has helped eliminate traditional administrative and data-ingestion delays.

During the 2020 California North Complex Fire, which devastated the Berry Creek region, Element 84 demonstrated the speed of cloud-native workflows by combining STAC metadata with Sentinel-2 COG datasets on AWS. Automated processing pipelines running on Amazon EKS clusters queried, retrieved, and processed pre- and post-fire imagery immediately. Rather than spending valuable time preparing datasets, response teams on the ground received analysis-ready imagery within minutes, letting them focus on saving lives rather than wrangling data.

This capability is further illustrated by the integration of near-real-time geostationary weather data. During Hurricane Hilary, Element 84 deployed a specialized, low-latency processing pipeline using GOES geostationary mesoscale data. The GOES Advanced Baseline Imager captures detailed regional data every sixty seconds. Using their FilmDrop system, Element 84 processed raw GOES streams in real-time, generated web-ready products, and immediately published the metadata as a STAC Item collection. This standardized output allowed first responders to integrate live hurricane tracking directly into active web maps, illustrating the power of STAC to make complex satellite data immediately queryable.

[Figure: GOES processing pipeline. Data Source: GOES-16/17 Geostationary Sensors. Data Acquisition (Rapid Imagery Capture): Gathers Mesoscale Imagery Every 60 Seconds. Processing Pipeline (Composite Conversion): Converts Raw Inputs to Corrected RGB & IR Composites. Ingestion Specs (Metadata Standardization): Standardizes Spatial, Temporal, & Processing Metadata. End User Action (Real-Time Map Visualization): First Responders Visualize Real-Time Hurricane Path.]

This rapid discovery is also used across commercial satellite constellations during humanitarian crises. Major commercial satellite operators have integrated the STAC specification into their humanitarian disaster response programs:

  • Vantor (Maxar) Open Data Program: Vantor releases before-and-after imagery of sudden-onset disasters under permissive Creative Commons licenses, designed for direct integration with humanitarian mapping tools.

  • Planet Disaster Data: Planet hosts a dedicated, open STAC catalog on Google Cloud Platform, exposing PlanetScope and SkySat imagery for global disaster events.

  • Federated STAC Map Browser: Platforms like "Mapping Disasters" federate public STAC catalogs from Maxar, Capella, Umbra, and Wyvern into a single, unified map interface.

By standardizing imagery search across these diverse constellations, response organizations can monitor affected areas using a single search query, eliminating the need to coordinate with multiple individual providers.
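Because every provider describes its scenes with the same `bbox` and `datetime` fields, a federated search reduces to one filter applied across all catalogs. The sketch below uses hypothetical provider names and items held in memory; it assumes timestamps share one UTC format so that string comparison matches chronological order.

```python
def bbox_intersects(a, b):
    """True if two [west, south, east, north] boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def federated_search(catalogs, bbox, start, end):
    """Apply one spatial and temporal filter to items from every provider."""
    hits = []
    for provider, items in catalogs.items():
        for item in items:
            when = item["properties"]["datetime"]
            if bbox_intersects(item["bbox"], bbox) and start <= when <= end:
                hits.append((when, provider, item["id"]))
    return sorted(hits)  # chronological order across all providers

# Hypothetical catalogs keyed by provider name.
catalogs = {
    "provider-a": [
        {"id": "a-1", "bbox": [10, 10, 11, 11],
         "properties": {"datetime": "2024-05-02T10:00:00Z"}},
        {"id": "a-2", "bbox": [50, 50, 51, 51],
         "properties": {"datetime": "2024-05-02T11:00:00Z"}},
    ],
    "provider-b": [
        {"id": "b-1", "bbox": [10.5, 10.5, 12, 12],
         "properties": {"datetime": "2024-05-01T09:00:00Z"}},
        {"id": "b-2", "bbox": [10.5, 10.5, 12, 12],
         "properties": {"datetime": "2024-06-01T09:00:00Z"}},
    ],
}

results = federated_search(
    catalogs,
    bbox=[10.2, 10.2, 10.8, 10.8],
    start="2024-05-01T00:00:00Z",
    end="2024-05-31T23:59:59Z",
)
```

Against live services, the inner loop would be replaced by one search request per provider API, but the merge step stays the same because the result Items share a common structure.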

Whether fighting a raging wildfire in California or coordinating aid after a sudden hurricane, the ability to rapidly query and stream pixels represents a profound shift in how humanity visualizes its home. This collaborative framework, born out of community sprints, now underpins a new era of open spatial intelligence.

The Democratic Future of Open Spatial Intelligence

The evolution of the SpatioTemporal Asset Catalog specification is a testament to the power of community-driven, open-source collaboration. By prioritizing developer needs, maintaining a simple core, and aligning with cloud-optimized formats, STAC has established an interoperable ecosystem used by commercial providers, independent creators, and open-source developers alike. Rather than perpetuating the rigid, non-standardized data silos that historically restricted imagery to specialized institutional gatekeepers, this architecture functions as a universal fabric connecting the entire geospatial landscape. Powered by a relentless open-source development community, STAC ensures that highly diverse datasets, whether optical, radar, or hyperspectral, can plug into the same client libraries and desktop tools without requiring custom engineering or proprietary platforms.

This transformation has shifted the focus of the geospatial industry away from the administrative and technical bottlenecks of legacy data management and toward accessibility for a much wider audience. The act of publishing STAC Catalogs for Public Data has broken down the closed sandboxes of the past, converting raw pixels into an open, democratic infrastructure that empowers individual analysts, independent researchers, and everyday users. By standardizing metadata definitions ecosystem-wide, the framework allows a single user to federate diverse data streams, so that a basic search query can cross-examine and overlay catalogs from entirely independent global providers. Whether an individual is building a localized mapping application, an analyst is routing emergency resources on the ground during a wildfire, or a GIS professional is streaming satellite imagery directly to their desktop display, the open-source tooling driving STAC ensures that Earth observation data is no longer trapped in siloed formats, but easily found, harmonized across platforms, and immediately put to work by anyone.

Adam Simmons

Geospatial Industry Consultant | Founder, Project Geospatial

Adam Simmons is a geospatial technology liaison and strategic advisor with over 20 years of experience across the defense and commercial sectors. A veteran of the U.S. Air Force, he specialized in imagery analysis and order of battle before transitioning to executive leadership as the CEO of Midgard Raven, LLC and the founder of Project Geospatial, a 501(c)(3) dedicated to highlighting innovation within the geospatial ecosystem. Adam bridges the gap between technical development and market storytelling, leveraging his extensive background as a journalist and industry consultant to help companies navigate complex technology landscapes.

https://www.linkedin.com/in/adamsimmonsgeo