More control with idempotency

Published on 21 March 2022, in #coding, #system-design

Idempotency has become one of my favorite principles for designing systems. When programs are designed to be idempotent, I feel much more in control and know better what is going on. So let's take a closer look at idempotency by trying to design a simplified real-life idempotent program.

Our use-case here will be the health monitoring of charge-points for electric vehicles. Let's assume that our system is already able to detect alerts on such charge-points, and our task is to send emails to the charge-point owners notifying them about the detected alerts. (To make things simpler for now, let's assume that there can be only one alert on a given charge-point during its whole lifetime. Later, we will also discuss how to approach multiple alerts.) Now, let's try to design such a program in an idempotent way.

Our visual grammar for this example #

In the drawings below, we will operate with two things. First, we will have an alert for a given charge-point:

Screenshot 2022-03-09T17.18.png

And second, we will have emails for charge-point owners:

Screenshot 2022-03-09T17.19.png

Back to idempotency: What is it? #

A program is idempotent if running it multiple times (on the same input) has the same effect as running it only once.

I like to think about idempotency this way: we have some desired state (the input) and some actual state, and the purpose of the program is to bring the actual state to the desired state. Such a program is idempotent because:

the first run will bring the actual state into the desired state,
the second run will do nothing because the actual state already is the desired state,
the third, fourth, etc. runs will also do nothing.
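To make the contrast concrete, here is a minimal Python sketch. Both function names are hypothetical and the "state" is just a list; the point is only that the idempotent version compares the actual state with the desired one before acting.

```python
def append_item(actual: list[str], item: str) -> None:
    """Not idempotent: every run changes the state again."""
    actual.append(item)


def ensure_item(actual: list[str], item: str) -> None:
    """Idempotent: acts only when the actual state differs from the desired one."""
    if item not in actual:
        actual.append(item)


not_idempotent: list[str] = []
idempotent: list[str] = []
for _ in range(3):  # run both "programs" three times on the same input
    append_item(not_idempotent, "a")
    ensure_item(idempotent, "a")

print(not_idempotent)  # ['a', 'a', 'a']
print(idempotent)      # ['a']
```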

Let's now try to apply this to the health-monitoring use-case. The two key questions that I always tend to ask myself are: What is the desired state, and what is the actual state? I think that in our example:

the desired state could be the list of emails that should be sent;
and the actual state could be the list of emails that were already sent.

We don't need to store the whole emails; I think we can get away with a list of charge-point IDs only.

Screenshot 2022-03-21T14.21.png

The goal of the program now is to bring the actual state—that is, emails that were already sent—into the desired state—that is, emails that should be sent. This can be accomplished by taking the difference between the two states and sending emails for that difference.
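As a rough illustration, the difference can be expressed as a plain set difference in Python. The function name `emails_to_send` and the `cp-…` IDs are made up for this sketch; a real system would more likely read both states from a database.

```python
def emails_to_send(should_be_sent: set[str], already_sent: set[str]) -> set[str]:
    """Charge-point IDs that are in the desired state but not yet in the actual state."""
    return should_be_sent - already_sent


desired = {"cp-1", "cp-2", "cp-3"}  # charge-points with a detected alert
actual = {"cp-1"}                   # charge-points whose owner was already emailed

missing = emails_to_send(desired, actual)
print(sorted(missing))  # ['cp-2', 'cp-3']
```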

Let's see how that will work:

Run #1: With three new alerts, the program sends three emails #

Let's say three alerts were detected. That means that three emails should be sent. However, no emails have been sent yet. The difference is the three emails that the program will send.

Screenshot 2022-03-21T14.22.png

Run #2: No new alert, the program sends no emails #

We run the program again and no new alert has been detected. That means that three emails should be sent, and three emails were already sent. The difference is empty, so the program is not going to send any emails.

Screenshot 2022-03-21T14.23.png

Run #3: Let's change the input: a new alert is detected #

Now let's say we run the program again and a new alert has been detected. At this point, four emails should be sent, and three emails were already sent. The difference is the one new email that the program will send.

Screenshot 2022-03-21T14.23.png
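The three runs above can be replayed in a few lines of Python. This is only a sketch: `run` is a hypothetical name, the charge-point IDs are invented, and the actual sending of an email is replaced by recording the ID as sent.

```python
def run(alerted: set[str], sent: set[str]) -> list[str]:
    """One program run.

    `alerted` is the desired state and `sent` is the actual state, which is
    updated in place. Returns the charge-point IDs emailed during this run.
    """
    emailed = []
    for charge_point_id in sorted(alerted - sent):
        # Placeholder for a real email call; here we only record the send.
        sent.add(charge_point_id)
        emailed.append(charge_point_id)
    return emailed


sent: set[str] = set()
run1 = run({"cp-1", "cp-2", "cp-3"}, sent)          # three new alerts
run2 = run({"cp-1", "cp-2", "cp-3"}, sent)          # no new alert
run3 = run({"cp-1", "cp-2", "cp-3", "cp-4"}, sent)  # one new alert

print(run1)  # ['cp-1', 'cp-2', 'cp-3']
print(run2)  # []
print(run3)  # ['cp-4']
```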

Support for multiple alerts #

Now, how can we add support for multiple alerts per charge-point? One option would be to start considering when each alert started and when it stopped. Let's call such an interval the alert interval.

To define the desired state, we first need to clarify the required business logic: should the charge-point owner be notified of all the alerts or only of the latest alert? Let's say we'd like to notify the owner of all the alerts. To support multiple alerts, we might need a different desired state: the desired state could now be a list of alerts, with their time intervals, that the owner should be emailed about. The actual state could be a list of alerts, with their time intervals, that the owner was already emailed about. The difference between the two states is the list of alerts that the program can transform into emails and send to the charge-point owner.
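One possible shape for these states, sketched in Python. The `Alert` class, its field names, and the sample data are assumptions made for illustration; the original design does not prescribe them.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)  # frozen makes instances hashable, so they can live in sets
class Alert:
    charge_point_id: str
    started_at: datetime
    stopped_at: datetime


def alerts_to_email(desired: set[Alert], actual: set[Alert]) -> list[Alert]:
    """Alerts the owner should be emailed about but was not emailed about yet."""
    return sorted(desired - actual, key=lambda a: (a.charge_point_id, a.started_at))


def at(hour: int) -> datetime:
    """Helper for the sample data: a fixed day, varying hour."""
    return datetime(2022, 3, 21, hour)


first = Alert("cp-1", at(9), at(10))
second = Alert("cp-1", at(14), at(15))
other = Alert("cp-2", at(11), at(12))

pending = alerts_to_email(desired={first, second, other}, actual={first})
print([(a.charge_point_id, a.started_at.hour) for a in pending])
# [('cp-1', 14), ('cp-2', 11)]
```

In this sketch an alert is identified by its charge-point ID and its full interval, so an alert whose end time changes between runs would count as a new alert. Keying on the charge-point ID and start time only would be one way around that.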

Now, imagine that we have such logic implemented and the program has been running in production. After months, however, we realize that, from the business perspective, it would indeed be better to send only the latest alert, not all of them. With such an idempotent design, it could be straightforward to update the system for that requirement: we'd quite possibly need to update how the desired state is calculated, and probably slightly touch the actual state and the difference calculation to accommodate the alert intervals. All in all, such a change would seem like a straightforward incremental improvement, as opposed to a BIG SYSTEM CHANGE.
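A sketch of what that change could look like in Python, assuming alerts are stored as records with a charge-point ID and an interval (the `Alert` class and the function name are hypothetical). Only the calculation of the desired state changes; the difference is still a set difference.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Alert:
    charge_point_id: str
    started_at: datetime
    stopped_at: datetime


def latest_alert_per_charge_point(alerts: set[Alert]) -> set[Alert]:
    """New desired state: keep only the most recently started alert per charge-point."""
    latest: dict[str, Alert] = {}
    for alert in alerts:
        current = latest.get(alert.charge_point_id)
        if current is None or alert.started_at > current.started_at:
            latest[alert.charge_point_id] = alert
    return set(latest.values())


def at(hour: int) -> datetime:
    """Helper for the sample data: a fixed day, varying hour."""
    return datetime(2022, 3, 21, hour)


old = Alert("cp-1", at(9), at(10))
new = Alert("cp-1", at(14), at(15))
other = Alert("cp-2", at(11), at(12))

desired = latest_alert_per_charge_point({old, new, other})
already_emailed = {other}
to_email = desired - already_emailed  # the difference step stays the same

print([(a.charge_point_id, a.started_at.hour) for a in to_email])
# [('cp-1', 14)]
```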

Pros & Cons #

Let's look into pros & cons of such an idempotent design. First, the pros:

full control over the frequency of sending emails,
resilient towards email service disruptions,
easier auditing and bug-fixing when something is not going according to plan,
flexibility to change the logic for the desired state later and re-run the whole program from the very beginning,
the dev experience might feel more ergonomic,
idempotent concepts might feel easier to think about and reason with (idempotency might be considered to fall into the same bucket of mental models as immutability and the declarative and functional code styles).

For the cons:

The key downside that I'm aware of is that calculating the difference between the desired state and the actual state can take a long time. I've encountered this issue when calculating the difference using Postgres cross-joins over two big tables. We had to optimize the calculation by storing the timestamp of the last execution and then computing the difference using that timestamp.
Another possible downside is that sending an email and recording that the email was sent would ideally need to happen atomically. That is, if we send an email and fail to record that fact, we might end up with an inconsistent actual state that could lead to, for example, emails being sent twice. Pragmatically speaking, this hasn't been an issue for the non-critical, low-load line-of-business software that I've worked on. However, for high-load or critical systems, this could be a potential downside that one needs to be aware of.
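The second downside, and the resilience towards email service disruptions listed among the pros, can both be seen in a small Python sketch. Everything here is hypothetical: `run`, `flaky_send`, and the simulated outage stand in for a real email service. An email is recorded as sent only after the send succeeds, so a failed send is retried on the next run, while a crash between the send and the record would produce a duplicate email.

```python
from typing import Callable


def run(desired: set[str], sent: set[str], send_email: Callable[[str], None]) -> list[str]:
    """Send the missing emails and return the IDs whose send failed."""
    failed = []
    for charge_point_id in sorted(desired - sent):
        try:
            send_email(charge_point_id)
        except ConnectionError:
            failed.append(charge_point_id)  # not recorded, so the next run retries it
            continue
        # Not atomic: a crash between the send above and the line below
        # would lead to a duplicate email on the next run.
        sent.add(charge_point_id)
    return failed


outbox: list[str] = []
service = {"down": True}  # simulated state of the email service


def flaky_send(charge_point_id: str) -> None:
    """Fake email call that fails for cp-2 while the service is down."""
    if service["down"] and charge_point_id == "cp-2":
        raise ConnectionError("email service unavailable")
    outbox.append(charge_point_id)


sent: set[str] = set()
alerted = {"cp-1", "cp-2", "cp-3"}

first_failed = run(alerted, sent, flaky_send)   # cp-2 fails during the outage
service["down"] = False
second_failed = run(alerted, sent, flaky_send)  # only cp-2 is retried

print(first_failed, second_failed)  # ['cp-2'] []
print(outbox)                       # ['cp-1', 'cp-3', 'cp-2']
```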

In general, I think that the pros outweigh the cons by a large margin for idempotency, at least for the use-cases that I have encountered. For those reasons, I've come to treat idempotency as a fantastic tool that I try to apply by default whenever I can.