The Case for the Resilient Foundation: Why Innovation Needs a Floor
In the current geospatial zeitgeist, a seductive narrative has taken hold. If you follow the industry’s most prominent voices, you’ve likely seen the posts: "The Shapefile is holding your career back." "The Traditional GIS Stack is broken." "Stop hoarding files and start leveraging the cloud."
The argument, championed by innovators and community voices like Matt Forrest and others, is intellectually compelling. On paper, the logic is flawless: cloud-native formats like GeoParquet, Cloud-Optimized GeoTIFFs (COGs), and PMTiles allow for streaming, distributed compute, and massive scalability. In a high-bandwidth, venture-backed office in San Francisco or London, this is the future. It promises a world where we no longer wait for downloads and where the petabyte is as accessible as the kilobyte.
But as someone who has spent decades in the "dirt", navigating the rigid requirements of Department of Defense (DoD) operations and the chaotic friction of inter-agency data sharing, I find this "modern-only" advocacy to be dangerously short-sighted. It conflates Efficiency with Resilience, and in doing so, it creates a massive operational blind spot that threatens to alienate the very people who do the most critical work on the ground.
The Lesson of the "Modern" Trap
I wasn’t always a skeptic of the “kill the legacy” movement. Twenty years ago, I was one of its loudest advocates. For example, I was heavily involved in pushing KML as a standard within the military and the broader community. I believed then, as many do now, that the “old ways” were simply hurdles to be cleared. I thought that by moving to an XML-based, human-readable, and web-friendly format, we would revolutionize the theater of operations. I envisioned a world where tactical data moved as seamlessly as a web page, rendered instantly in any viewer.
I was wrong.
What I learned, often the hard way, in high-pressure environments where connectivity was a luxury and the “cloud” was a literal weather pattern rather than a server farm, is that complexity is the enemy of the mission. KML was “modern,” and it was efficient, but it was also brittle over the long term. Large files choked low-bandwidth systems, and the flexibility of the schema meant that getting data back out of KML for cross-community sharing and downstream systems was never as simple as we expected. The format was eventually standardized, but the long-term consequence was that many users fell back to shapefiles and CSVs. Anyone remember GeoRSS? It was a genuinely useful, if primitive, innovation that worked before GeoJSON became popular. But I digress.
The geospatial ecosystem does not exist in a vacuum. We aren't just sharing data with other "Modern GIS" experts who have the latest Python libraries installed. We are sharing data with logistics officers, civil engineers, emergency responders, and field agents. These are people using software they inherited three budget cycles ago, or custom-built tactical tools that don't know what a "cloud bucket" is. When a mission is live, the last thing you want to hear is: "I can't open this; I don't have the right drivers for the parquet reader."
The "Modern Stack" assumes a level of technical homogeneity that simply does not exist in the real world. Despite popular rhetoric and heavy push in the last 15 years, not every GIS analyst knows python. In those moments, a "Cloud-Native" stream isn't an asset; it’s a barrier. If you need a stable, universal hand-off to a piece of software outside the GIS bubble, like a legacy CAD system, a custom tracking tool, or an air-gapped analytics engine—the "Modern Stack" hits a wall. But the Shapefile, the GeoTIFF, and the CSV? They are the universal translators of our industry. They work because they are the lowest common denominator of truth. They are the formats the other person already has the ability to read, regardless of their internet speed, software version, or IT budget.
The Standardization Treadmill: New Formats, Old Realities
Over the last 10 to 15 years, we have witnessed a dizzying parade of "next big things" in geospatial data. We’ve seen the rise and pivot of GML, the hype and plateau of KML, the proliferation of GeoJSON, and now the ascendancy of STAC (SpatioTemporal Asset Catalogs), GeoParquet, and Iceberg. Each of these formats promises to solve the sins of the previous generation, offering better compression, faster indexing, or more robust metadata.
STAC, for instance, is a brilliant solution for the "where is the data?" problem in the cloud. It provides a standard way to index and search millions of assets across disparate buckets. But we must be careful not to mistake the Index for the Asset. A STAC catalog is a map to the treasure, but if the treasure is locked behind a format that requires a specialized cloud-native environment or a high-speed persistent connection to open, the map is useless to a responder on the ground.
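To illustrate the distinction, here is a minimal sketch of using a STAC index online and then securing the asset itself as a plain local GeoTIFF that can travel on a thumb drive. The endpoint, collection, and asset key are assumptions based on the public Earth Search catalog and will differ for other catalogs; it assumes pystac-client is installed and a connection is available for the initial pull.

```python
# A minimal sketch of treating STAC as the map, not the treasure:
# search the index while connected, then pull the asset down to a local file.
from urllib.request import urlretrieve
from pystac_client import Client

# Hypothetical catalog, collection, and area of interest.
client = Client.open("https://earth-search.aws.element84.com/v1")
search = client.search(
    collections=["sentinel-2-l2a"],
    bbox=[-105.3, 39.9, -105.1, 40.1],
    datetime="2024-06-01/2024-06-30",
    max_items=1,
)
item = next(search.items())

# The catalog entry is only a pointer; this download is the step that
# actually secures the data for disconnected use.
urlretrieve(item.assets["visual"].href, "scene_visual.tif")
```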
The frequent emergence and change of these formats creates a "Standardization Treadmill." While the tech elite are busy debating the merits of the latest schema version or the most efficient way to partition an Iceberg table, the mission on the ground is still being run on what works. Every time we chase a new format, we introduce a new layer of potential failure and a new requirement for retraining. The "Old" remains for a very simple reason: it is the only thing that has proven it can survive the treadmill without breaking the end-user’s workflow. Stability is its own form of innovation.
The Gravity of Legacy Data
There is a pragmatic reality that the Modernists often ignore: Legacy Gravity. Millions of datasets across the globe are currently stored in Shapefiles, GeoTIFFs, and file-based databases. These aren't just "forgotten" files; they are the record of our infrastructure, our history, and our environment. The Modern Stack narrative suggests we should constantly convert and upgrade this data to the newest, most efficient medium to "stay ahead."
But is everything worth the cost of conversion? In the DoD, in municipal government, and in the NGO sector, the answer is often a resounding "No."
The labor, the compute power, and—more importantly—the risk of data loss or schema misalignment during conversion often outweigh the benefits of a marginally more efficient streaming format. If a dataset is accessed once a month by a human analyst to verify a property line or a tactical boundary, why move it to a complex, multi-layered cloud-native pipeline? There is a massive amount of "Good Enough" data that provides 90% of the value for 0% of the ongoing maintenance cost. By keeping this data in its foundational format, we ensure it remains accessible for the next thirty years, not just the next three months of the current tech cycle. We are protecting the integrity of the record from the volatility of the tools.
The Economics of Data Democracy and the "Complexity Tax"
The advocacy for the "Modern Stack" often ignores what I call the Complexity Tax. Innovation today almost always comes with a subscription fee, an egress cost, or the requirement for a DevOps engineer to manage API permissions. We are trading the simplicity of a file on a disk for the complexity of a managed service that requires constant upkeep.
When we talk about the democratization of data, we have to talk about accessibility at the lowest levels of the resource curve. We have to talk about "Free and Easy" as a core requirement, not a legacy byproduct.
Foundational formats are Free: They don't require a cloud credit card to open. They don't charge you to move your own data from point A to point B. In the modern stack, "egress fees" are a silent tax on information sharing. If it costs money to move data from a cloud bucket to a field unit, that data is no longer democratic; it is a toll-road.
They offer Data Sovereignty: A GeoPackage or a Shapefile on a thumb drive cannot be "deprovisioned" by an API change, a billing error, or a server outage. It is a sovereign asset that belongs to the user, not the provider. In the DoD, we understand the importance of "disconnected operations." If your data requires a heartbeat to a global cloud provider, you have effectively outsourced your mission success to a third-party vendor’s uptime.
They are Inclusive: They allow a student in a rural village with a ten-year-old laptop or a first responder in a disaster zone with zero bars of signal to participate in the data ecosystem. If a format requires a high-end GPU and a 5G link to visualize, it isn't democratic—it’s exclusive.
To call a Shapefile a "bottleneck" is a form of technological elitism. It assumes a world of perfect connectivity and infinite budgets. In the real world—the one I’ve lived in—those foundational tools aren't bottlenecks; they are the lifeboats that keep the mission afloat when the high-speed links are severed.
The AI Shift: Why the Foundation Becomes the Anchor
We are entering the age of AI, where feature extraction and spatial analysis are happening at speeds we couldn't imagine five years ago. I believe the "Modern Stack" will eventually become "invisible" infrastructure—working in the background of LLMs and automated pipelines. The machines will talk to machines in GeoParquet because it is efficient for their "brains."
But as the process becomes more automated and opaque, the Foundational Formats will become our anchor to reality. When an AI generates a million building footprints in seconds, we will need human-verifiable, "air-gapped" standards like GeoTIFF and Shapefile to audit what the AI is producing. If we cannot pull the data out of the machine and into a simple, static format that a human can inspect without a specialized environment, we lose the ability to trust the output.
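As a minimal sketch of what that audit step can look like, assume a hypothetical GeoParquet file of AI-generated building footprints on a machine with geopandas installed; the checks and file names are illustrative, not a prescribed workflow.

```python
# A minimal sketch of "grounding" automated output in a static, inspectable file.
import geopandas as gpd

footprints = gpd.read_parquet("ai_building_footprints.parquet")  # hypothetical AI output

# Basic human-auditable checks before anyone trusts the output.
print(f"features: {len(footprints)}")
print(f"invalid geometries: {(~footprints.geometry.is_valid).sum()}")
print(f"bounds: {footprints.total_bounds}")

# Freeze a snapshot of truth in a format any desktop GIS (or a future
# auditor) can open without the pipeline that produced it.
footprints.to_file("footprints_snapshot.gpkg", layer="footprints_v1", driver="GPKG")
```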
The foundational formats will be the way we ground the AI. They provide a “Snapshot of Truth” that doesn’t change when a model updates or a cloud service is patched. The newer formats will be used by the machines for velocity; the foundational formats will be used by people for accountability and verification.
The Power of the Balanced Stack
I don't believe in abandoning the old for the sake of the new. I believe in Balance. A tool is only "advanced" if it solves the problem at hand without creating three new ones.
The Modern "Ceiling"
STAC, GeoParquet, COG, PMTiles
High-velocity research, AI training, global-scale analytics, and cloud-native visualization.
Connectivity dependency, "Complexity Tax," and vendor lock-in.
The Foundational "Floor"
Shapefile, GeoTIFF, CSV, GeoPackage
Mission delivery, Interoperability, Field operations, and long-term data archiving.
Legacy Gravity, manual storage management, and lack of "streaming" efficiency at scale.
Respect the Floor
To the evangelists pushing to "kill the shapefile": stop being so short-sighted. Your Modern Stack is a brilliant ceiling, but it makes for a terrible floor. If you spend all your time building the ceiling without ensuring the floor is solid, you’ll find yourself with nowhere to stand when the storms of reality hit.
My career hasn't been held back by legacy formats; it has been sustained by them. I will continue to advocate for the latest methods and push for more efficient code, but I will never apologize for maintaining a foundation of resilient, free, and accessible tools. The older formats remain not because we are lazy, but because the data they hold is too valuable to be gambled on the next 15-year tech cycle.
A Call to Action
I want to pose a challenge to those pushing exclusively for the Modern Stack: How well does your data work for the partners who don't share your tools or your infrastructure? In our professional bubble, it's easy to forget that we are rarely the end of the data pipeline. Our work often passes through hands that don't have high-speed cloud access, specialized developers, or the latest hardware.
I’ll be the first to admit: I’m not blind to the Shapefile’s technical limitations. I know its flaws as well as anyone. But I also know its place. I know better than to advocate for its deprecation, because I understand that its value isn’t in its elegance; it’s in its reach.
If your "efficient" format requires your client to rebuild their entire workflow just to read it, have you really made progress? Let’s stop viewing foundational formats as hurdles and start seeing them as the essential bridges that allow our work to thrive in the messy, disconnected reality of the real world.
What is your "Lifeboat Format"? I want to hear your stories of when "boring and free" beat "new and expensive" in the comments below. Let's start talking about resilience again.
Final thought: the data format that has perplexed me most in my career is the GeoPDF. These days I can admit that even it has a place in our community, but for me it remains one of the most niche formats, and one I would never advocate for.