mmillin

This is an excellent article, I’ve seen almost all of the issues it calls out in production for various APIs. I’ll be saving this to share with my team.

I’ve seen two separate engineers implement a “generic idempotent operation” library which used separate transactions to store the idempotency details without realizing the issues it had. That was in an organization of less than 100 engineers less than 5 years apart.

One other thing I would augment this with is Antithesis’ Definite vs Indefinite error definition (https://antithesis.com/docs/resources/reliability_glossary/#...). It helps to classify your failures in this way when considering replay behavior.

shiandow

This seems to assume retrying a command should result in the same response, but I am not sure I agree.

Idempotency is about state, not communication. Send the same payment twice and one of them should respond "payment already exists".

show comments
villgax

skill issue lol, it's not idempotent anymore, same key for different requests? Heard of a nonce?

zinkem

Idempotency is easy if you don't use mutable state in your middleware.

Auth, logging, and atomicity are all isolated concerns that should not affect the domain specific user contract with your API.

How you handle unique keys is going to vary by domain and tolerance-- and its probably not going to be the same in every table.

It's important to design a database schema that can work independently of your middleware layer.

chaz6

You keep the hash of the request so that you can reject a subsequent request with a different body. This has helped me surface bugs and data issues in other systems.

stavros

Half of the mentioned issues are issues of atomicity, not idempotency. If I make a request, and the server crashes midway and doesn't send some crucial events, that's an issue whether or not I send a second request.

From a cursory read, only the part up to "what if the second request comes while the first is running" is an idempotency problem, in which case all subsequent responses need to wait until the first one is generated.

Everything else is an atomicity issue, which is fine, let's just call it what it is.