githublinkedinemail
← /articles

Idempotency in real systems

At-least-once is reality. A unique constraint is the only real guarantee. Idempotency-Key, dedupe tables — survival in distributed systems.

4 mindistributed

Idempotency in real systems

Every queue delivers at least once.

Every HTTP call can be retried.

Every webhook can arrive twice.

If your application wasn't designed for this, it's already broken — you just haven't seen the damage yet.


At-least-once is the reality, not the exception

The literature makes it sound like "exactly-once delivery" is a thing.

It isn't.

What exists is:

  • at-most-once: may drop messages (fast, unreliable)
  • at-least-once: may duplicate (reliable, you handle duplicates)
  • exactly-once: marketing

Every serious queue (SQS, RabbitMQ, Sidekiq, Kafka) delivers at-least-once by default.

Translation: sooner or later, your job will run twice for the same message.

If it charges the card twice, that's your problem, not the queue's.


Idempotency in one sentence

An idempotent operation is one that, executed N times with the same inputs, has the same effect as executing it once.

Note: effect, not result.

GET /users/42 is trivially idempotent — it changes nothing.

POST /charges has to be made idempotent on purpose.


Idempotent != commutative

This trips up smart people.

  • Idempotent: f(f(x)) = f(x) — calling again does nothing extra.
  • Commutative: f(g(x)) = g(f(x)) — order doesn't matter.
SET status = 'paid'        ← idempotent (call twice, same result)
balance += 100             ← NOT idempotent, but commutative with another +N
balance = 500              ← idempotent

When you design an operation, you choose which property you want.

In a payment system? Idempotency, always. Commutativity is a bonus.


The "check then write" trap

This is the most common sin:

def create_charge(order_id:, amount:)
  return if Charge.exists?(order_id: order_id)
  Charge.create!(order_id: order_id, amount: amount)
end

Looks idempotent. Isn't.

Between the exists? and the create!, another thread / process / worker replica can run the exact same check.

Result: two charges for the same order.

The window is small. With real production traffic, it happens.

The only real guarantee is at the database:

CREATE UNIQUE INDEX charges_order_id_idx ON charges (order_id);
def create_charge(order_id:, amount:)
  Charge.create!(order_id: order_id, amount: amount)
rescue ActiveRecord::RecordNotUnique
  Charge.find_by!(order_id: order_id)
end

Now we're talking. The database is the only source of truth that matters.


Unique constraint is the only real contract

Repeat after me:

Idempotency in application code, without a unique constraint in the database, doesn't exist.

Mutex in Ruby? Works inside ONE process.

Lock in Redis? Has TTL, clock skew, network issues.

if not exists in code? Race condition guaranteed.

UNIQUE INDEX? Atomic, enforced by Postgres, no chance of failure.

Difference between senior and junior:

  • junior protects idempotency with Mutex.synchronize
  • senior puts on a UNIQUE INDEX and sleeps soundly

Idempotency at the HTTP layer — the Stripe model

Stripe popularized the pattern:

POST /v1/charges
Idempotency-Key: 9f3a7e2b-1c4d-4e5f-8a9b-1234567890ab

{ "amount": 5000, "currency": "usd" }

The client generates a UUID. If it retries (timeout, retry), it sends the same key.

The server:

  1. receives the request
  2. looks up the idempotency_keys table for that key
  3. if it exists → returns the stored response
  4. if not → processes, stores response, returns
┌──────────────────────────────────────────────┐
│  idempotency_keys                            │
├──────────────────────────────────────────────┤
│ key            uuid           PRIMARY KEY    │
│ user_id        bigint                        │
│ request_hash   text                          │
│ response_body  jsonb                         │
│ response_code  integer                       │
│ created_at     timestamptz                   │
└──────────────────────────────────────────────┘

Important detail: you store the hash of the request body. If the same key arrives with a different body, return an error. Otherwise a clever client could "recycle" keys.

One more: created_at exists so you can expire old keys (Stripe uses 24h).


Idempotency in workers

Classic case: charge job.

class ChargeOrderJob
  def perform(order_id)
    order = Order.find(order_id)
    PaymentGateway.charge(order)
    order.update!(paid: true)
  end
end

Sidekiq may run this twice. SQS almost always will. Result: double charge.

The fix is not "try not to duplicate", it's assume it will:

class ChargeOrderJob
  def perform(order_id)
    order = Order.find(order_id)

    Charge.create!(
      order_id: order_id,
      idempotency_key: "charge_order_#{order_id}"
    )

    PaymentGateway.charge(order, idempotency_key: "charge_order_#{order_id}")
    order.update!(paid: true)
  rescue ActiveRecord::RecordNotUnique
    # job ran again, charge already exists, move on
  end
end

Two layers of idempotency:

  • yours, via UNIQUE INDEX on charges.idempotency_key
  • the provider's, via the Idempotency-Key header

And here's the detail that separates people who sleep from people who get woken up at 3am: the idempotency_key has to exist before the money leaves. If you generate the key inside the job, each retry creates a new one, and your protection is worthless.


The dedupe table — the generic pattern

Works for any event that needs to be processed once:

CREATE TABLE processed_events (
  event_id   text PRIMARY KEY,
  processed_at timestamptz NOT NULL DEFAULT now()
);
def handle_event(event)
  ProcessedEvent.create!(event_id: event.id)
  do_the_thing(event)
rescue ActiveRecord::RecordNotUnique
  # already processed
end

Order matters:

WRONG                         RIGHT
─────────                     ─────────
1. do_the_thing               1. INSERT processed_events
2. INSERT processed_events    2. do_the_thing

Why? Because if do_the_thing fails on the wrong path, you did the action but didn't mark it, and on retry you do it again.

On the right path: if the insert succeeded and the action failed, the retry sees the record and skips — meaning you need another strategy (e.g. store the result alongside). It's a trade-off, but an explicit one.

For most cases, the best move is to wrap everything in a transaction:

ActiveRecord::Base.transaction do
  ProcessedEvent.create!(event_id: event.id)
  do_the_thing(event)
end

All or nothing. Postgres handles it.


The most common mistakes

1. Idempotency only in application code.

Without a unique constraint, it doesn't count. Already said.

2. Idempotency key generated inside the worker.

Each retry generates a new key. Equivalent to no key.

3. Trusting auto-increment ids as dedupe.

The id is generated by the database. The client doesn't know what it'll be. Can't dedupe on it.

4. TTL too short on the keys table.

The client may take 30 minutes to retry (delayed push notification, backgrounded app). Key expired? Double charge.

5. Not testing the double path.

Write a test that hits the endpoint twice with the same key. Run the job twice in a row. Make sure the effect is the same.

6. Webhook handler without dedupe.

Stripe, GitHub, any serious provider resends webhooks when your endpoint returns non-2xx or takes too long. It will arrive again. Handle it.


Quick mental map

Got a mutating operation?
   ↓
Could it be called twice?  (answer: always yes)
   ↓
Define the idempotency KEY (resource id, client uuid, event hash)
   ↓
Create UNIQUE INDEX on that key
   ↓
Treat RecordNotUnique as "already done, ok"
   ↓
If there's an external side-effect, propagate the key (Idempotency-Key)

Five steps. In any language, any database, any queue.


Conclusion

A distributed system without idempotency is a time bomb.

It's not "if" it duplicates. It's "when".

The difference between those who survive and those who get woken at 3am:

  • assumes at-least-once as default
  • puts a unique constraint in the database
  • propagates the idempotency key across layers
  • tests the duplicate path

Idempotency isn't a pattern. It's a survival skill in distributed systems.

The day your queue duplicates — and it will — you'll be glad you took it seriously.

Or you'll learn, the expensive way.

Idempotency in real systems
Idempotency in real systems