Idempotency in real systems
At-least-once is reality. A unique constraint is the only real guarantee. Idempotency-Key, dedupe tables — survival in distributed systems.
Idempotency in real systems
Every queue delivers at least once.
Every HTTP call can be retried.
Every webhook can arrive twice.
If your application wasn't designed for this, it's already broken — you just haven't seen the damage yet.
At-least-once is the reality, not the exception
The literature makes it sound like "exactly-once delivery" is a thing.
It isn't.
What exists is:
- at-most-once: may drop messages (fast, unreliable)
- at-least-once: may duplicate (reliable, you handle duplicates)
- exactly-once: marketing
Every serious queue (SQS, RabbitMQ, Sidekiq, Kafka) delivers at-least-once by default.
Translation: sooner or later, your job will run twice for the same message.
If it charges the card twice, that's your problem, not the queue's.
Idempotency in one sentence
An idempotent operation is one that, executed N times with the same inputs, has the same effect as executing it once.
Note: effect, not result.
GET /users/42 is trivially idempotent — it changes nothing.
POST /charges has to be made idempotent on purpose.
Idempotent != commutative
This trips up smart people.
- Idempotent:
f(f(x)) = f(x)— calling again does nothing extra. - Commutative:
f(g(x)) = g(f(x))— order doesn't matter.
SET status = 'paid' ← idempotent (call twice, same result)
balance += 100 ← NOT idempotent, but commutative with another +N
balance = 500 ← idempotent
When you design an operation, you choose which property you want.
In a payment system? Idempotency, always. Commutativity is a bonus.
The "check then write" trap
This is the most common sin:
def create_charge(order_id:, amount:)
return if Charge.exists?(order_id: order_id)
Charge.create!(order_id: order_id, amount: amount)
end
Looks idempotent. Isn't.
Between the exists? and the create!, another thread / process / worker replica can run the exact same check.
Result: two charges for the same order.
The window is small. With real production traffic, it happens.
The only real guarantee is at the database:
CREATE UNIQUE INDEX charges_order_id_idx ON charges (order_id);
def create_charge(order_id:, amount:)
Charge.create!(order_id: order_id, amount: amount)
rescue ActiveRecord::RecordNotUnique
Charge.find_by!(order_id: order_id)
end
Now we're talking. The database is the only source of truth that matters.
Unique constraint is the only real contract
Repeat after me:
Idempotency in application code, without a unique constraint in the database, doesn't exist.
Mutex in Ruby? Works inside ONE process.
Lock in Redis? Has TTL, clock skew, network issues.
if not exists in code? Race condition guaranteed.
UNIQUE INDEX? Atomic, enforced by Postgres, no chance of failure.
Difference between senior and junior:
- junior protects idempotency with
Mutex.synchronize - senior puts on a
UNIQUE INDEXand sleeps soundly
Idempotency at the HTTP layer — the Stripe model
Stripe popularized the pattern:
POST /v1/charges
Idempotency-Key: 9f3a7e2b-1c4d-4e5f-8a9b-1234567890ab
{ "amount": 5000, "currency": "usd" }
The client generates a UUID. If it retries (timeout, retry), it sends the same key.
The server:
- receives the request
- looks up the
idempotency_keystable for that key - if it exists → returns the stored response
- if not → processes, stores response, returns
┌──────────────────────────────────────────────┐
│ idempotency_keys │
├──────────────────────────────────────────────┤
│ key uuid PRIMARY KEY │
│ user_id bigint │
│ request_hash text │
│ response_body jsonb │
│ response_code integer │
│ created_at timestamptz │
└──────────────────────────────────────────────┘
Important detail: you store the hash of the request body. If the same key arrives with a different body, return an error. Otherwise a clever client could "recycle" keys.
One more: created_at exists so you can expire old keys (Stripe uses 24h).
Idempotency in workers
Classic case: charge job.
class ChargeOrderJob
def perform(order_id)
order = Order.find(order_id)
PaymentGateway.charge(order)
order.update!(paid: true)
end
end
Sidekiq may run this twice. SQS almost always will. Result: double charge.
The fix is not "try not to duplicate", it's assume it will:
class ChargeOrderJob
def perform(order_id)
order = Order.find(order_id)
Charge.create!(
order_id: order_id,
idempotency_key: "charge_order_#{order_id}"
)
PaymentGateway.charge(order, idempotency_key: "charge_order_#{order_id}")
order.update!(paid: true)
rescue ActiveRecord::RecordNotUnique
# job ran again, charge already exists, move on
end
end
Two layers of idempotency:
- yours, via
UNIQUE INDEXoncharges.idempotency_key - the provider's, via the
Idempotency-Keyheader
And here's the detail that separates people who sleep from people who get woken up at 3am: the idempotency_key has to exist before the money leaves. If you generate the key inside the job, each retry creates a new one, and your protection is worthless.
The dedupe table — the generic pattern
Works for any event that needs to be processed once:
CREATE TABLE processed_events (
event_id text PRIMARY KEY,
processed_at timestamptz NOT NULL DEFAULT now()
);
def handle_event(event)
ProcessedEvent.create!(event_id: event.id)
do_the_thing(event)
rescue ActiveRecord::RecordNotUnique
# already processed
end
Order matters:
WRONG RIGHT
───────── ─────────
1. do_the_thing 1. INSERT processed_events
2. INSERT processed_events 2. do_the_thing
Why? Because if do_the_thing fails on the wrong path, you did the action but didn't mark it, and on retry you do it again.
On the right path: if the insert succeeded and the action failed, the retry sees the record and skips — meaning you need another strategy (e.g. store the result alongside). It's a trade-off, but an explicit one.
For most cases, the best move is to wrap everything in a transaction:
ActiveRecord::Base.transaction do
ProcessedEvent.create!(event_id: event.id)
do_the_thing(event)
end
All or nothing. Postgres handles it.
The most common mistakes
1. Idempotency only in application code.
Without a unique constraint, it doesn't count. Already said.
2. Idempotency key generated inside the worker.
Each retry generates a new key. Equivalent to no key.
3. Trusting auto-increment ids as dedupe.
The id is generated by the database. The client doesn't know what it'll be. Can't dedupe on it.
4. TTL too short on the keys table.
The client may take 30 minutes to retry (delayed push notification, backgrounded app). Key expired? Double charge.
5. Not testing the double path.
Write a test that hits the endpoint twice with the same key. Run the job twice in a row. Make sure the effect is the same.
6. Webhook handler without dedupe.
Stripe, GitHub, any serious provider resends webhooks when your endpoint returns non-2xx or takes too long. It will arrive again. Handle it.
Quick mental map
Got a mutating operation?
↓
Could it be called twice? (answer: always yes)
↓
Define the idempotency KEY (resource id, client uuid, event hash)
↓
Create UNIQUE INDEX on that key
↓
Treat RecordNotUnique as "already done, ok"
↓
If there's an external side-effect, propagate the key (Idempotency-Key)
Five steps. In any language, any database, any queue.
Conclusion
A distributed system without idempotency is a time bomb.
It's not "if" it duplicates. It's "when".
The difference between those who survive and those who get woken at 3am:
- assumes at-least-once as default
- puts a unique constraint in the database
- propagates the idempotency key across layers
- tests the duplicate path
Idempotency isn't a pattern. It's a survival skill in distributed systems.
The day your queue duplicates — and it will — you'll be glad you took it seriously.
Or you'll learn, the expensive way.
