Zero-Downtime Deployments for Ponder Indexer
Using a schema-based blue-green approach for Ponder deployments
Blockchain indexers present a unique challenge for deployments. Unlike typical stateless web services, indexers maintain significant state. They need to process and store historical blockchain data, which can take hours or even days to synchronize. How do you deploy a new version without interrupting API consumers or losing indexed data?
This article explains how we implemented zero-downtime deployments for our Ponder indexer on Kubernetes. Ponder was already designed with this problem in mind. It ships with schema isolation, a standalone serve mode for horizontal scaling, and built-in health endpoints for orchestration. Our job was to wire those primitives into Kubernetes properly.
The Problem
Traditional deployment strategies fall short for blockchain indexers.
Rolling deployments assume new pods can start serving traffic quickly. That's not the case here. A fresh indexer might need 12+ hours to process historical blockchain data before it can respond to queries. During that time, the deployment appears stuck. Worse, if the new version has bugs, you've already terminated working pods.
Traditional blue-green deployments would require duplicating the entire database. Syncing a new database from scratch defeats the purpose of quick switchovers, and resource costs double during deployment windows.
Canary deployments split traffic between versions, but an unsynced indexer returns incomplete data. Users would get inconsistent responses depending on which pod handles their request.
We needed something different: a strategy that keeps serving traffic from the old version until the new one is fully ready, shares database resources between versions, allows the new indexer to take as long as needed to sync, and switches traffic atomically once ready.
The Solution: Schema-Based Blue-Green
Our approach uses PostgreSQL schema isolation combined with Kubernetes readiness probes to achieve zero-downtime deployments. The key insight:
Instead of duplicating databases, we duplicate schemas within the same database. Each deployment version writes to its own schema, and Kubernetes routes traffic only to pods that are ready to serve.
Architecture Overview
Two-Tier Deployment Pattern
We separate the indexer into two distinct Kubernetes deployments.
The Indexer Deployment (bleu-indexer-indexer) runs pnpm ponder start, the process that indexes blockchain data. It runs as a single replica since indexing is a sequential operation. This is where the heavy compute happens: it creates and populates its own database schema, consuming significant CPU and memory.
The API Deployment (bleu-indexer-api) runs pnpm ponder serve, a read-only HTTP server that Ponder provides specifically for this pattern. As described in their self-hosting guide, ponder serve runs the API layer without the indexing engine, so multiple replicas can operate behind a load balancer reading from the same database schema. We run 3 replicas in production.
This separation is crucial: the API tier can have multiple replicas for availability and load distribution, while the indexer runs as a single instance doing the heavy lifting. Ponder was designed with this decoupling in mind.
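Condensed, the two Deployments differ only in command and replica count. A sketch of the idea (selectors, labels, probes, and resources trimmed; the image tag and container names are assumptions):

```yaml
# Sketch only: not a complete manifest
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bleu-indexer-indexer
spec:
  replicas: 1                          # indexing is sequential: exactly one writer
  template:
    spec:
      containers:
        - name: indexer
          image: bleu-indexer:v1.2.3   # hypothetical tag
          command: ["pnpm", "ponder", "start"]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bleu-indexer-api
spec:
  replicas: 3                          # stateless readers behind the Service
  template:
    spec:
      containers:
        - name: api
          image: bleu-indexer:v1.2.3
          command: ["pnpm", "ponder", "serve"]
```

Both tiers read the same `DATABASE_SCHEMA`, so the API always serves whatever the indexer of the same release is writing.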
Implementation Details
Dynamic Schema Naming
Ponder uses [PostgreSQL schemas to isolate each deployment](https://ponder.sh/docs/api-reference/ponder/database). The target schema is controlled via the `DATABASE_SCHEMA` environment variable (or the `--schema` CLI flag). Ponder's docs suggest using Kubernetes pod names, git commit hashes, or deployment IDs. We compute it at pod startup:
```bash
# From the deployment entrypoint
export DATABASE_SCHEMA="bleu-indexer-v1.2.3-$(env | grep -E 'RPC_URL(_[0-9]+)?=' | sort | sha256sum | cut -c1-8)"
exec pnpm ponder start
```
The schema name combines the release name prefix (bleu-indexer), the image tag (e.g., v1.2.3), and a hash of all RPC URL environment variables. This ensures different versions never collide when we push a new release, and changing RPC providers triggers a fresh re-index (intentionally). Multiple schemas coexist in the same database; we clean up old ones periodically.
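The derivation is deterministic, which is what makes both crash recovery and intentional re-indexing work. A quick sketch (the RPC URLs are placeholders, not our real providers):

```bash
# Hypothetical RPC endpoints standing in for the real provider URLs
export RPC_URL_1="https://eth-mainnet.example.com"
export RPC_URL_2="https://base-mainnet.example.com"

# Same derivation as the entrypoint: hash the sorted RPC_URL_* variables
suffix() {
  env | grep -E 'RPC_URL(_[0-9]+)?=' | sort | sha256sum | cut -c1-8
}

a="$(suffix)"
b="$(suffix)"                       # unchanged env, so unchanged suffix
[ "$a" = "$b" ] && echo "stable across restarts: $a"

export RPC_URL_2="https://base-mainnet.other-provider.example.com"
c="$(suffix)"                       # changed provider, so a new schema and a fresh re-index
[ "$a" != "$c" ] && echo "provider change triggers new schema: $c"
```

A pod restart with identical configuration recomputes the same schema name and resumes from the checkpoint; swapping an RPC provider produces a new name and a clean backfill.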
Ponder enforces a safety rule here: once an instance claims a schema via ponder start, no other instance can use it, even after the original stops. This prevents data corruption during concurrent deployments. It also means crash recovery works automatically: restarting a pod with the same schema resumes indexing from the last checkpoint instead of starting over.
The Magic: Readiness Probes as Traffic Gates
The key to zero-downtime is in the readiness probe configuration:
```yaml
readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 129600   # at a 10s period, roughly 15 days of headroom
```
That failureThreshold: 129600 is not a typo.
A new indexer might need 12+ hours to process historical blockchain data. During that time, pods stay "not ready" and Kubernetes won't route traffic to them. The previous version continues handling all requests. Once /ready returns 200, Kubernetes adds the pod to the Service endpoints and traffic shifts automatically. No manual intervention, no deployment scripts.
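Multiplying out the probe settings shows how long a pod can keep failing its readiness checks within that threshold:

```bash
# Headroom = periodSeconds x failureThreshold
period=10          # periodSeconds
threshold=129600   # failureThreshold
total=$(( period * threshold ))
echo "${total} seconds"                    # 1296000 seconds
echo "$(( total / 3600 )) hours"           # 360 hours
echo "$(( total / 86400 )) days"           # 15 days
```

That is far more than the 12+ hours a typical backfill needs, which is the point: the budget should never be the thing that runs out.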
Ponder ships with two health endpoints designed for exactly this orchestration: /health returns 200 immediately on process startup, while /ready returns 200 only when indexing has reached realtime across all chains, returning 503 during the backfill. We didn't build custom health checks; we just pointed Kubernetes at what Ponder already provides.
Liveness vs Readiness
We use both probes with different purposes:
```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 129600
```
The liveness probe (/health) answers "Is the process alive?" If it fails three times, Kubernetes restarts the pod. The readiness probe (/ready) answers "Can this pod serve traffic?" If it fails, Kubernetes removes the pod from the Service endpoints.
A pod can be alive but not ready. That's the syncing state, and it's perfectly fine.
Deployment Flow
Here's what happens when we deploy a new indexer version:

1. Helm applies the new manifests. A fresh indexer pod starts, computes its schema name from the new image tag and RPC configuration, and begins backfilling into that new schema.
2. While it backfills, /ready returns 503, so the pod stays out of the Service endpoints. The previous version keeps handling every request.
3. Once indexing reaches realtime across all chains, /ready returns 200 and Kubernetes marks the pod ready.
4. Kubernetes updates the Service endpoints and traffic shifts to the new version automatically; the old pods are then terminated.
5. The old schema stays in the database, so rolling back is just pointing the API at it again.
Database Architecture
Ponder requires PostgreSQL and recommends keeping database roundtrip latency under 50ms. We use CloudNativePG to run PostgreSQL directly inside the Kubernetes cluster with a 3-replica cluster:
```yaml
# Simplified from our Helm template
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: bleu-indexer-postgres
spec:
  instances: 3
  postgresql:
    parameters:
      shared_buffers: "256MB"
      max_connections: "200"
  storage:
    size: 10Gi
```
All schemas live in the same database. Connection pools, memory, and storage are shared across versions. Ponder also maintains a shared ponder_sync schema that caches RPC requests across instances (it's lock-free, so multiple deployments can safely share it). Old indexing schemas stick around for easy rollback if needed. We just point the API back to a previous schema. Cleanup happens as a maintenance task: periodically drop schemas that are no longer in use.
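Cleanup can stay a simple maintenance script. A sketch, assuming the schema-name convention above (the active schema value here is hypothetical); note that the shared ponder_sync schema does not match the prefix, so it is never selected:

```bash
# Hypothetical: the schema currently serving traffic
ACTIVE_SCHEMA="bleu-indexer-v1.2.3-ab12cd34"

# List every indexing schema except the active one. The shared ponder_sync
# cache does not start with 'bleu-indexer-', so the LIKE pattern never matches it.
LIST_SQL="SELECT nspname FROM pg_namespace WHERE nspname LIKE 'bleu-indexer-%' AND nspname <> '${ACTIVE_SCHEMA}';"
echo "$LIST_SQL"

# Each stale schema would then be dropped with something like:
#   psql "$DATABASE_URL" -Atc "$LIST_SQL" | while read -r s; do
#     psql "$DATABASE_URL" -c "DROP SCHEMA \"$s\" CASCADE;"
#   done
```

Keeping the last one or two schemas around before dropping preserves the cheap-rollback property.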
Configuration
We use Helm with environment-specific value files. The base values.yaml defines defaults: 1 indexer replica, 2 API replicas, modest resource requests. Production overrides in instances/vultr1.prod.yaml bump resources and enable ingress:
```yaml
# values.yaml
indexer:
  replicas: 1
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
api:
  replicas: 2
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
postgres:
  instances: 3
  storage:
    size: 10Gi
```
```yaml
# instances/vultr1.prod.yaml (production overrides)
indexer:
  resources:
    requests:
      cpu: 1000m
      memory: 2Gi
    limits:
      memory: 4Gi
ingress:
  enabled: true
  hosts:
    - host: api-v3.bleu.builders
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: bleu-indexer-tls
      hosts:
        - api-v3.bleu.builders
```
Sensitive configuration, like RPC URLs and API keys, lives in Kubernetes Secrets referenced via envFrom. We use Stakater Reloader to automatically restart pods when secrets change: just annotate the deployment with reloader.stakater.com/auto: "true".
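Wiring that up is pure metadata. A trimmed sketch of the API Deployment (the Secret name is an assumption):

```yaml
# Sketch only: most fields omitted
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bleu-indexer-api
  annotations:
    reloader.stakater.com/auto: "true"   # Reloader restarts pods when referenced Secrets change
spec:
  template:
    spec:
      containers:
        - name: api
          envFrom:
            - secretRef:
                name: bleu-indexer-secrets   # hypothetical Secret holding RPC URLs and API keys
```

Because the schema name is derived from the RPC URL hash, a changed RPC secret both restarts the pods and lands them in a fresh schema.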
Why This Works
| Aspect | Traditional Blue-Green | Schema-Based Blue-Green |
|---|---|---|
| Database duplication | Full database copy required | Single database, multiple schemas |
| Resource cost | 2x during deployment | Minimal overhead |
| Sync time | Must pre-sync before cutover | Sync happens in-place |
| Rollback | Switch DNS/LB back | Old schema still available |
| Complexity | Separate infrastructure | Single cluster, schema isolation |
The old version serves traffic until the new one is fully ready (true zero-downtime). No duplicate infrastructure needed, so it's cost efficient. Kubernetes handles all the orchestration; we don't maintain custom deployment scripts. Everything lives in version control (GitOps), and we get standard Kubernetes metrics and logs for observability.
Monitoring
We track deployments through Kubernetes events for pod lifecycle, Prometheus metrics for indexer sync progress and API latency, and BetterStack for external uptime monitoring:
```hcl
resource "betteruptime_monitor" "bleu_indexer" {
  url             = "https://api-v3.bleu.builders/ready"
  monitor_type    = "status"
  check_frequency = 60
}
```
What We Learned
Readiness probes are your traffic switch: configure them carefully. A failure budget of roughly 15 days (129600 probes at a 10-second period) seemed excessive at first, but it's exactly the headroom long backfills need.
Separating indexing from serving was the right call. They have fundamentally different scaling needs: one is a single stateful process, the other is stateless and horizontally scalable.
Schema isolation beats database duplication. It's simpler, cheaper, and gives us easy rollback for free.
Let Kubernetes do the orchestration. We tried building custom deployment scripts early on. They were fragile and hard to maintain. The readiness probe approach is declarative and self-healing.
Plan for long syncs. Blockchain history only grows. What takes 4 hours today will take 8 hours next year.
Conclusion
Zero-downtime deployments for blockchain indexers don't require complex custom tooling. By combining schema-based isolation, Kubernetes readiness probes, and a two-tier architecture, we get reliable automated deployments where users never see downtime, even when the new indexer takes hours to synchronize.
The key insight is treating the indexer and API as separate concerns with different lifecycles. The indexer is a stateful, slow-starting process. The API is stateless and can scale horizontally. By separating them and using schemas for isolation, we get the benefits of blue-green deployments without the infrastructure overhead.
References
- Ponder — Self-Hosting in Production — schema isolation, ponder serve, health endpoints, and the views pattern
- Ponder — Database Configuration — schema naming, the ponder_sync cache, build ID recovery, and lifecycle rules
- Kubernetes — Configure Liveness, Readiness and Startup Probes — how readiness probes control traffic routing
- Kubernetes — Pod Lifecycle — probe types and their effect on Service endpoints
- CloudNativePG — PostgreSQL operator for Kubernetes
- Stakater Reloader — auto-restart pods on Secret/ConfigMap changes
- ArgoCD — GitOps continuous delivery for Kubernetes