Zero-Downtime Deployments for Ponder Indexer
Using a schema-based blue-green approach for Ponder deployments
Blockchain indexers present a unique challenge for deployments. Unlike typical stateless web services, indexers maintain significant state. They need to process and store historical blockchain data, which can take hours or even days to synchronize. How do you deploy a new version without interrupting API consumers or losing indexed data?
This article explains how we implemented zero-downtime deployments for our Ponder indexer on Kubernetes. Ponder was already designed with this problem in mind. It ships with schema isolation, a standalone serve mode for horizontal scaling, and built-in health endpoints for orchestration. Our job was to wire those primitives into Kubernetes properly.
The Problem
Traditional deployment strategies fall short for blockchain indexers.
Rolling deployments assume new pods can start serving traffic quickly. That's not the case here. A fresh indexer might need 12+ hours to process historical blockchain data before it can respond to queries. During that time, the deployment appears stuck. Worse, if the new version has bugs, you've already terminated working pods.
Traditional blue-green deployments would require duplicating the entire database. Syncing a new database from scratch defeats the purpose of quick switchovers, and resource costs double during deployment windows.
Canary deployments split traffic between versions, but an unsynced indexer returns incomplete data. Users would get inconsistent responses depending on which pod handles their request.
We needed something different: a strategy that keeps serving traffic from the old version until the new one is fully ready, shares database resources between versions, allows the new indexer to take as long as needed to sync, and switches traffic atomically once ready.
The Solution: Schema-Based Blue-Green
Our approach uses PostgreSQL schema isolation combined with Kubernetes readiness probes to achieve zero-downtime deployments. The key insight:
Instead of duplicating databases, we duplicate schemas within the same database. Each deployment version writes to its own schema, and Kubernetes routes traffic only to pods that are ready to serve.
Architecture Overview
Two-Tier Deployment Pattern
We separate the indexer into two distinct Kubernetes deployments.
The Indexer Deployment (bleu-indexer-indexer) runs pnpm ponder start, the process that indexes blockchain data. It runs as a single replica since indexing is a sequential operation. This is where the heavy compute happens: it creates and populates its own database schema, consuming significant CPU and memory.
The API Deployment (bleu-indexer-api) runs pnpm ponder serve, a read-only HTTP server that Ponder provides specifically for this pattern. As described in their self-hosting guide, ponder serve runs the API layer without the indexing engine, so multiple replicas can operate behind a load balancer reading from the same database schema. We run 3 replicas in production.
This separation is crucial: the API tier can have multiple replicas for availability and load distribution, while the indexer runs as a single instance doing the heavy lifting. Ponder was designed with this decoupling in mind.
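Condensed, the two Deployments differ only in command and replica count. A sketch of the idea (selectors, labels, probes, and resources trimmed; the image tag and container names are assumptions):

```yaml
# Sketch only: not a complete manifest
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bleu-indexer-indexer
spec:
  replicas: 1                          # indexing is sequential: exactly one writer
  template:
    spec:
      containers:
        - name: indexer
          image: bleu-indexer:v1.2.3   # hypothetical tag
          command: ["pnpm", "ponder", "start"]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bleu-indexer-api
spec:
  replicas: 3                          # stateless readers behind the Service
  template:
    spec:
      containers:
        - name: api
          image: bleu-indexer:v1.2.3
          command: ["pnpm", "ponder", "serve"]
```

Both tiers read the same `DATABASE_SCHEMA`, so the API always serves whatever the indexer of the same release is writing.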
Implementation Details
Dynamic Schema Naming
Ponder uses [PostgreSQL schemas to isolate each deployment](https://ponder.sh/docs/api-reference/ponder/database). The target schema is controlled via the `DATABASE_SCHEMA` environment variable (or the `--schema` CLI flag). Ponder's docs suggest using Kubernetes pod names, git commit hashes, or deployment IDs. We compute it at pod startup:
```bash
# From the deployment entrypoint
export DATABASE_SCHEMA="bleu-indexer-v1.2.3-$(env | grep -E 'RPC_URL(_[0-9]+)?=' | sort | sha256sum | cut -c1-8)"
exec pnpm ponder start
```
The schema name combines the release name prefix (bleu-indexer), the image tag (e.g., v1.2.3), and a hash of all RPC URL environment variables. This ensures different versions never collide when we push a new release, and changing RPC providers triggers a fresh re-index (intentionally). Multiple schemas coexist in the same database; we clean up old ones periodically.
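The derivation is deterministic, which is what makes both crash recovery and intentional re-indexing work. A quick sketch (the RPC URLs are placeholders, not our real providers):

```bash
# Hypothetical RPC endpoints standing in for the real provider URLs
export RPC_URL_1="https://eth-mainnet.example.com"
export RPC_URL_2="https://base-mainnet.example.com"

# Same derivation as the entrypoint: hash the sorted RPC_URL_* variables
suffix() {
  env | grep -E 'RPC_URL(_[0-9]+)?=' | sort | sha256sum | cut -c1-8
}

a="$(suffix)"
b="$(suffix)"                       # unchanged env, so unchanged suffix
[ "$a" = "$b" ] && echo "stable across restarts: $a"

export RPC_URL_2="https://base-mainnet.other-provider.example.com"
c="$(suffix)"                       # changed provider, so a new schema and a fresh re-index
[ "$a" != "$c" ] && echo "provider change triggers new schema: $c"
```

A pod restart with identical configuration recomputes the same schema name and resumes from the checkpoint; swapping an RPC provider produces a new name and a clean backfill.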
Ponder enforces a safety rule here: once an instance claims a schema via ponder start, no other instance can use it, even after the original stops. This prevents data corruption during concurrent deployments. It also means crash recovery works automatically: restarting a pod with the same schema resumes indexing from the last checkpoint instead of starting over.
The Magic: Readiness Probes as Traffic Gates
The key to zero-downtime is in the readiness probe configuration:
```yaml
readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 129600   # at a 10s period, roughly 15 days of headroom
```
That failureThreshold: 129600 is not a typo.
A new indexer might need 12+ hours to process historical blockchain data. During that time, pods stay "not ready" and Kubernetes won't route traffic to them. The previous version continues handling all requests. Once /ready returns 200, Kubernetes adds the pod to the Service endpoints and traffic shifts automatically. No manual intervention, no deployment scripts.
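Multiplying out the probe settings shows how long a pod can keep failing its readiness checks within that threshold:

```bash
# Headroom = periodSeconds x failureThreshold
period=10          # periodSeconds
threshold=129600   # failureThreshold
total=$(( period * threshold ))
echo "${total} seconds"                    # 1296000 seconds
echo "$(( total / 3600 )) hours"           # 360 hours
echo "$(( total / 86400 )) days"           # 15 days
```

That is far more than the 12+ hours a typical backfill needs, which is the point: the budget should never be the thing that runs out.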
Ponder ships with two health endpoints designed for exactly this orchestration: /health returns 200 immediately on process startup, while /ready returns 200 only when indexing has reached realtime across all chains, returning 503 during the backfill. We didn't build custom health checks; we just pointed Kubernetes at what Ponder already provides.
Liveness vs Readiness
We use both probes with different purposes:
```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 129600
```
The liveness probe (/health) answers "Is the process alive?" If it fails three times, Kubernetes restarts the pod. The readiness probe (/ready) answers "Can this pod serve traffic?" If it fails, Kubernetes removes the pod from the Service endpoints.
A pod can be alive but not ready. That's the syncing state, and it's perfectly fine.
Deployment Flow
Here's what happens when we deploy a new indexer version:

1. Helm applies the new manifests. A fresh indexer pod starts, computes its schema name from the new image tag and RPC configuration, and begins backfilling into that new schema.
2. While it backfills, /ready returns 503, so the pod stays out of the Service endpoints. The previous version keeps handling every request.
3. Once indexing reaches realtime across all chains, /ready returns 200 and Kubernetes marks the pod ready.
4. Kubernetes updates the Service endpoints and traffic shifts to the new version automatically; the old pods are then terminated.
5. The old schema stays in the database, so rolling back is just pointing the API at it again.
Database Architecture
Ponder requires PostgreSQL and recommends keeping database roundtrip latency under 50ms. We use CloudNativePG to run PostgreSQL directly inside the Kubernetes cluster with a 3-replica cluster:
```yaml
# Simplified from our Helm template
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: bleu-indexer-postgres
spec:
  instances: 3
  postgresql:
    parameters:
      shared_buffers: "256MB"
      max_connections: "200"
  storage:
    size: 10Gi
```
All schemas live in the same database. Connection pools, memory, and storage are shared across versions. Ponder also maintains a shared ponder_sync schema that caches RPC requests across instances (it's lock-free, so multiple deployments can safely share it). Old indexing schemas stick around for easy rollback if needed. We just point the API back to a previous schema. Cleanup happens as a maintenance task: periodically drop schemas that are no longer in use.
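Cleanup can stay a simple maintenance script. A sketch, assuming the schema-name convention above (the active schema value here is hypothetical); note that the shared ponder_sync schema does not match the prefix, so it is never selected:

```bash
# Hypothetical: the schema currently serving traffic
ACTIVE_SCHEMA="bleu-indexer-v1.2.3-ab12cd34"

# List every indexing schema except the active one. The shared ponder_sync
# cache does not start with 'bleu-indexer-', so the LIKE pattern never matches it.
LIST_SQL="SELECT nspname FROM pg_namespace WHERE nspname LIKE 'bleu-indexer-%' AND nspname <> '${ACTIVE_SCHEMA}';"
echo "$LIST_SQL"

# Each stale schema would then be dropped with something like:
#   psql "$DATABASE_URL" -Atc "$LIST_SQL" | while read -r s; do
#     psql "$DATABASE_URL" -c "DROP SCHEMA \"$s\" CASCADE;"
#   done
```

Keeping the last one or two schemas around before dropping preserves the cheap-rollback property.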
Configuration
We use Helm with environment-specific value files. The base values.yaml defines defaults: 1 indexer replica, 2 API replicas, modest resource requests. Production overrides in instances/vultr1.prod.yaml bump resources and enable ingress:
```yaml
# values.yaml
indexer:
  replicas: 1
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
api:
  replicas: 2
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
postgres:
  instances: 3
  storage:
    size: 10Gi
```
```yaml
# instances/vultr1.prod.yaml (production overrides)
indexer:
  resources:
    requests:
      cpu: 1000m
      memory: 2Gi
    limits:
      memory: 4Gi
ingress:
  enabled: true
  hosts:
    - host: api-v3.bleu.builders
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: bleu-indexer-tls
      hosts:
        - api-v3.bleu.builders
```
Sensitive configuration, like RPC URLs and API keys, lives in Kubernetes Secrets referenced via envFrom. We use Stakater Reloader to automatically restart pods when secrets change: just annotate the deployment with reloader.stakater.com/auto: "true".
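Wiring that up is pure metadata. A trimmed sketch of the API Deployment (the Secret name is an assumption):

```yaml
# Sketch only: most fields omitted
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bleu-indexer-api
  annotations:
    reloader.stakater.com/auto: "true"   # Reloader restarts pods when referenced Secrets change
spec:
  template:
    spec:
      containers:
        - name: api
          envFrom:
            - secretRef:
                name: bleu-indexer-secrets   # hypothetical Secret holding RPC URLs and API keys
```

Because the schema name is derived from the RPC URL hash, a changed RPC secret both restarts the pods and lands them in a fresh schema.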
Why This Works
| Aspect | Traditional Blue-Green | Schema-Based Blue-Green |
|---|---|---|
| Database duplication | Full database copy required | Single database, multiple schemas |
| Resource cost | 2x during deployment | Minimal overhead |
| Sync time | Must pre-sync before cutover | Sync happens in-place |
| Rollback | Switch DNS/LB back | Old schema still available |
| Complexity | Separate infrastructure | Single cluster, schema isolation |
The old version serves traffic until the new one is fully ready (true zero-downtime). No duplicate infrastructure needed, so it's cost efficient. Kubernetes handles all the orchestration; we don't maintain custom deployment scripts. Everything lives in version control (GitOps), and we get standard Kubernetes metrics and logs for observability.
Monitoring
We track deployments through Kubernetes events for pod lifecycle, Prometheus metrics for indexer sync progress and API latency, and BetterStack for external uptime monitoring:
```hcl
resource "betteruptime_monitor" "bleu_indexer" {
  url             = "https://api-v3.bleu.builders/ready"
  monitor_type    = "status"
  check_frequency = 60
}
```
What We Learned
Readiness probes are your traffic switch: configure them carefully. A failure budget of roughly 15 days (129600 probes at a 10-second period) seemed excessive at first, but it's exactly the headroom long backfills need.
Separating indexing from serving was the right call. They have fundamentally different scaling needs: one is a single stateful process, the other is stateless and horizontally scalable.
Schema isolation beats database duplication. It's simpler, cheaper, and gives us easy rollback for free.
Let Kubernetes do the orchestration. We tried building custom deployment scripts early on. They were fragile and hard to maintain. The readiness probe approach is declarative and self-healing.
Plan for long syncs. Blockchain history only grows. What takes 4 hours today will take 8 hours next year.
Conclusion
Zero-downtime deployments for blockchain indexers don't require complex custom tooling. By combining schema-based isolation, Kubernetes readiness probes, and a two-tier architecture, we get reliable automated deployments where users never see downtime, even when the new indexer takes hours to synchronize.
The key insight is treating the indexer and API as separate concerns with different lifecycles. The indexer is a stateful, slow-starting process. The API is stateless and can scale horizontally. By separating them and using schemas for isolation, we get the benefits of blue-green deployments without the infrastructure overhead.
References
- Ponder — Self-Hosting in Production — schema isolation, ponder serve, health endpoints, and the views pattern
- Ponder — Database Configuration — schema naming, the ponder_sync cache, build ID recovery, and lifecycle rules
- Kubernetes — Configure Liveness, Readiness and Startup Probes — how readiness probes control traffic routing
- Kubernetes — Pod Lifecycle — probe types and their effect on Service endpoints
- CloudNativePG — PostgreSQL operator for Kubernetes
- Stakater Reloader — auto-restart pods on Secret/ConfigMap changes
- ArgoCD — GitOps continuous delivery for Kubernetes