Status columns lie. Events don't.
Why we derive customer-contact state from event tables instead of storing it as a column on the customer row.
The temptation, when a project asks “what state is this customer in?”, is to add a status column to the customer table. It’s the path of least resistance. It is also, with depressing reliability, the column that lies to you six months in.
Two events arrive out of order. Two services both update the row. A migration runs, and the row is rebuilt without one of the side effects. A retry fires, and the status flips back to a state it should not be in. By the time anyone notices, the column says one thing and the underlying events say another, and the only honest answer to “what state is the customer in?” is “let me check the logs”.
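The first of those failure modes is easy to reproduce. The sketch below is illustrative only: it assumes SQLite via Python's standard sqlite3 module, and the contact_event table, its columns, and the event names are hypothetical. Two events arrive in the wrong order, each write naively overwrites the status column, and the column ends up disagreeing with the events.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, status TEXT);
-- Hypothetical event table: one row per thing that happened.
CREATE TABLE contact_event (
    customer_id INTEGER NOT NULL,
    event_type  TEXT NOT NULL,   -- 'sent' or 'bounced' in this sketch
    occurred_at TEXT NOT NULL    -- ISO 8601 UTC, so text order is time order
);
INSERT INTO customer VALUES (1, NULL);
""")

# Listed in ARRIVAL order. The bounce happened later but arrived first.
arrivals = [
    (1, "bounced", "2024-03-01T09:05:00Z"),
    (1, "sent",    "2024-03-01T09:00:00Z"),
]

for customer_id, event_type, occurred_at in arrivals:
    # The status column is overwritten by whichever write lands last.
    conn.execute(
        "UPDATE customer SET status = ? WHERE customer_id = ?",
        (event_type, customer_id),
    )
    # The event row records when the thing actually happened.
    conn.execute(
        "INSERT INTO contact_event VALUES (?, ?, ?)",
        (customer_id, event_type, occurred_at),
    )

stored_status = conn.execute(
    "SELECT status FROM customer WHERE customer_id = 1"
).fetchone()[0]

# Derive the state from the events, ordered by when they happened.
derived_status = conn.execute(
    """
    SELECT event_type FROM contact_event
    WHERE customer_id = 1
    ORDER BY occurred_at DESC
    LIMIT 1
    """
).fetchone()[0]

print(stored_status, derived_status)  # sent bounced
```

The column reports whatever was written last. The events, ordered by when they occurred, still give the right answer.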
There is a better answer, and it has been the right answer in regulated data work for a long time: don’t store the state. Derive it.
The shape of the rule
Keep your event tables append-only. Each row is something that happened: an email sent, a delivery event received, a postal piece dispatched, a return logged. None of these are status; all of them are observations. They are immutable by intent. If something didn’t happen, you don’t have a row.
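One way to make "immutable by intent" enforceable is to have the database reject updates and deletes. This is a minimal sketch, assuming SQLite and its RAISE(ABORT, ...) trigger function; the table name, columns, and trigger names are hypothetical, and other databases would typically do the same job with permissions or their own trigger syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical event table: each row is an observation, never a status.
CREATE TABLE contact_event (
    event_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    channel     TEXT NOT NULL CHECK (channel IN ('email', 'post')),
    event_type  TEXT NOT NULL,
    occurred_at TEXT NOT NULL
);

-- Refuse any attempt to change or remove a recorded event.
CREATE TRIGGER contact_event_no_update
BEFORE UPDATE ON contact_event
BEGIN
    SELECT RAISE(ABORT, 'contact_event is append-only');
END;

CREATE TRIGGER contact_event_no_delete
BEFORE DELETE ON contact_event
BEGIN
    SELECT RAISE(ABORT, 'contact_event is append-only');
END;
""")

conn.execute(
    "INSERT INTO contact_event (customer_id, channel, event_type, occurred_at) "
    "VALUES (?, ?, ?, ?)",
    (1, "email", "sent", "2024-03-01T09:00:00Z"),
)


def is_rejected(sql):
    """Return True if the database refuses to run the statement."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False


update_rejected = is_rejected(
    "UPDATE contact_event SET event_type = 'delivered' WHERE event_id = 1"
)
delete_rejected = is_rejected("DELETE FROM contact_event WHERE event_id = 1")
row_count = conn.execute("SELECT COUNT(*) FROM contact_event").fetchone()[0]
```

Inserts still succeed, so a correction is recorded as a new event rather than an edit to an old one.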
Then write a view — call it v_customer_contact_status — that derives the current state from those events. The view is the source of truth for “what state is the customer in”. You can drop it and rebuild it. You can change the rules. You can audit the answer by reading the underlying events. The view is cheap; the events are honest.
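A view of this kind might look like the sketch below. Only the name v_customer_contact_status comes from the text above; the event types, the status labels, and the precedence between them (a postal return outranks a bounce, which outranks a plain send) are assumptions made for illustration, and the example again assumes SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contact_event (
    event_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    event_type  TEXT NOT NULL,
    occurred_at TEXT NOT NULL
);

-- Derived state: nothing here is stored, it is all computed from events.
-- The precedence of the CASE branches is an assumed business rule.
CREATE VIEW v_customer_contact_status AS
SELECT
    customer_id,
    CASE
        WHEN SUM(event_type = 'post_returned') > 0 THEN 'gone_away'
        WHEN SUM(event_type = 'email_bounced') > 0 THEN 'bounced'
        WHEN SUM(event_type IN ('email_sent', 'post_dispatched')) > 0
            THEN 'contacted'
        ELSE 'unknown'
    END AS contact_status,
    MAX(occurred_at) AS last_event_at
FROM contact_event
GROUP BY customer_id;
""")

conn.executemany(
    "INSERT INTO contact_event (customer_id, event_type, occurred_at) "
    "VALUES (?, ?, ?)",
    [
        (1, "email_sent",      "2024-03-01T09:00:00Z"),
        (1, "email_bounced",   "2024-03-01T09:05:00Z"),
        (2, "email_sent",      "2024-03-01T09:00:00Z"),
        (3, "post_dispatched", "2024-03-02T08:00:00Z"),
        (3, "post_returned",   "2024-03-09T11:00:00Z"),
    ],
)

# Map each customer to the state the view derives for them.
status = dict(
    conn.execute(
        "SELECT customer_id, contact_status FROM v_customer_contact_status"
    ).fetchall()
)
print(status)  # {1: 'bounced', 2: 'contacted', 3: 'gone_away'}
```

Changing the rules means editing the CASE expression and recreating the view. The event rows are never touched.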
What this looks like in practice
On a recent regulatory customer-contact build, every state the system cared about (which wave a customer had received, whether a delivery had bounced, whether a postal piece had returned “gone away”) was derived from event tables. There was no customer.status column. There was nothing that could disagree with itself.
When the regulator asked, late in the engagement, exactly what had been sent to a particular customer on a particular day, the answer was a single query against the events. The view was rebuilt twice during the project as the cohort definition evolved. Neither rebuild required a backfill — the events were already correct.
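Both halves of that story can be sketched in a few lines. The code below is not the project's actual query or cohort rule: the schema, the sample rows, and the min_sends threshold are invented for illustration, and SQLite is assumed. It shows an audit question answered directly from the events, then a view dropped and recreated under a new definition without any event row changing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contact_event (
    event_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    event_type  TEXT NOT NULL,
    detail      TEXT,
    occurred_at TEXT NOT NULL    -- ISO 8601 UTC
);
INSERT INTO contact_event (customer_id, event_type, detail, occurred_at) VALUES
    (7, 'email_sent',      'wave 1 notice',   '2024-03-01T09:00:00Z'),
    (7, 'post_dispatched', 'wave 1 letter',   '2024-03-01T16:30:00Z'),
    (7, 'email_sent',      'wave 2 reminder', '2024-03-15T09:00:00Z'),
    (8, 'email_sent',      'wave 1 notice',   '2024-03-01T09:00:00Z');
""")


def sent_on(customer_id, day):
    """Everything sent to one customer on one calendar day (YYYY-MM-DD)."""
    return conn.execute(
        """
        SELECT event_type, detail, occurred_at
        FROM contact_event
        WHERE customer_id = ?
          AND event_type IN ('email_sent', 'post_dispatched')
          AND substr(occurred_at, 1, 10) = ?
        ORDER BY occurred_at
        """,
        (customer_id, day),
    ).fetchall()


def rebuild_view(min_sends):
    """Drop and recreate the view under a new (hypothetical) cohort rule."""
    # A view definition cannot take bound parameters, so the integer
    # threshold is formatted into the SQL after being coerced with int().
    conn.executescript(f"""
    DROP VIEW IF EXISTS v_customer_contact_status;
    CREATE VIEW v_customer_contact_status AS
    SELECT customer_id,
           CASE WHEN COUNT(*) >= {int(min_sends)}
                THEN 'in_cohort' ELSE 'pending' END AS contact_status
    FROM contact_event
    WHERE event_type IN ('email_sent', 'post_dispatched')
    GROUP BY customer_id;
    """)


def current_status():
    return dict(
        conn.execute(
            "SELECT customer_id, contact_status FROM v_customer_contact_status"
        ).fetchall()
    )


audit_rows = sent_on(7, "2024-03-01")

rebuild_view(min_sends=1)
status_before = current_status()

rebuild_view(min_sends=2)   # the cohort definition changes
status_after = current_status()

event_count = conn.execute("SELECT COUNT(*) FROM contact_event").fetchone()[0]
```

After the second rebuild, customer 8 moves from in_cohort to pending while event_count is still 4. The definition changed and the data did not.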
Why this is hard to give up
The cost is more upfront thinking. You have to design your event tables before you can derive state. You have to write the view, which is more typing than UPDATE customer SET status = ?. You have to convince stakeholders who like the convenience of “just look at the status column” that the answer is going to be a query, not a column.
The reward is that derived state cannot drift from the events, because there is no stored copy to fall out of step. Every state can be re-derived. Every audit question has a SQL answer. And the line you have to defend when a regulator asks “how do you know?” is a line of SQL, not a hope.