Adding domain knowledge to code comments

Published on 16 December 2020, in #coding

I started experimenting with adding domain knowledge to code comments. It turned out that such comments were super helpful to others. They even inspired others to write such comments themselves. So if they are so useful, why is it that we don't write them more often?

A bit of context: Say hello to hybrid inverters

To give a bit of context, I've been working on an application that calculates investment costs for photovoltaic (PV) systems. Specifically, I've been adding new investment logic for hybrid inverters. I knew what an inverter was: a device that converts direct current (DC) to alternating current (AC). However, I didn't know a thing about hybrid inverters & how they relate to investments.

So first, I needed to figure out what a hybrid inverter even was. I learned that if you install a PV system with a battery, you need a hybrid inverter. It turns out a hybrid inverter is a device that combines a solar power inverter & a battery power inverter into one. For the investment costs, a hybrid inverter relates to both the PV system & the battery system.

[Image: hybrid inverter]

Updating the code & adding the comment

Once I had hybrid inverters figured out, I noticed it was easier to understand the functional requirement at hand. I could also better understand the existing code & I was quick to update it. I ended up with this TypeScript code:

const GetInvestmentCost = {
  forHybridInverter(/*...args...*/) {
    /*...body...*/
  },
  forBatterySystem(/*...args...*/) {
    /*...body...*/
  },
  forPVSystem(/*...args...*/) {
    /*...body...*/
  },
}

Of the total time I spent on this update, most went to understanding the domain background: hybrid inverters & their relation to investment. This knowledge, specific to this particular feature, hadn't been captured anywhere. I also noticed that the domain knowledge was still loaded in my mind, ready to use. So I added what I knew to the code as a comment, as brief & to the point as possible. It took a few minutes.

/*
 * If you install a PV system with a battery,
 * you need a hybrid inverter. A hybrid inverter
 * combines a solar inverter & a battery inverter
 * into one device.
 *
 * For investment cost, we list the hybrid
 * inverter separately, as it doesn't belong
 * exclusively to either the PV system cost or
 * the battery system cost. The investment costs
 * for the battery & PV systems exclude
 * the cost of the hybrid inverter.
 */
const GetInvestmentCost = {
  forHybridInverter(/*...args...*/) {
    /*...body...*/
  },
  forBatterySystem(/*...args...*/) {
    /*...body...*/
  },
  forPVSystem(/*...args...*/) {
    /*...body...*/
  },
}

It's not perfect, but it'll do. I aimed to bridge the gap between the code & the domain. I assumed the next person looking at the code would understand the logic faster & thus make their change faster.
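As a rough sketch of the rule the comment describes (the type, the function name & the figures below are hypothetical, not taken from the actual application), the hybrid inverter becomes its own line item next to the PV & battery costs:

```typescript
// Hypothetical cost breakdown in EUR; all names and figures are illustrative assumptions.
interface InvestmentCostBreakdown {
  pvSystem: number;       // excludes the hybrid inverter
  batterySystem: number;  // excludes the hybrid inverter
  hybridInverter: number; // counted once, as a separate line item
}

// Sums the three line items. Because the PV and battery costs exclude
// the hybrid inverter, adding it separately does not double count it.
function totalInvestmentCost(costs: InvestmentCostBreakdown): number {
  return costs.pvSystem + costs.batterySystem + costs.hybridInverter;
}

const sampleCosts: InvestmentCostBreakdown = {
  pvSystem: 6000,
  batterySystem: 4000,
  hybridInverter: 1500,
};

console.log(totalInvestmentCost(sampleCosts)); // 11500
```

With the inverter kept separate, neither the PV figure nor the battery figure has to "own" a device that serves both.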

There's another thing I aimed for: the explanation & the code now sit right next to each other. As Rich Hickey explains, when we put related things together, we understand the system more easily, and so we add features and fix bugs more quickly. But if we put related things apart, such as the code and the domain knowledge driving the code, the system is harder for us to understand. This scatteredness of things, these inconsistent locations, is a form of complexity. It slows down both our understanding of systems & our speed of change.

More examples

Encouraged, I added a bunch of other domain comments to the code I worked with:

/*
 * With net metering, also called reverse-count metering,
 * the meter rolls back when you inject energy back
 * to the grid. The grid serves as a free & fully
 * efficient virtual battery.
 */
function NetMeteringCashFlow(/*...args...*/) {
  /*...body...*/
}

/*
 * With market valorization policy, you are reimbursed
 * for the energy injected into the grid, according
 * to the injection tariff. For the energy consumed
 * off the grid, you pay the regular off-take tariff.
 * Both the injection tariff & the off-take are specific
 * to your grid operator.
 */
function MarketValorizationCashFlow(/*...args...*/) {
  /*...body...*/
}
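To see how the two comments above translate into arithmetic, here is a minimal sketch of a yearly bill under each policy. The interfaces, function names & sample tariffs are assumptions made for illustration; the real functions' arguments & bodies are elided in this post. Flooring the net-metering result at zero is also an assumption about how a surplus is treated.

```typescript
// Hypothetical input shapes; not the application's real types.
interface YearlyEnergy {
  offTakeKwh: number;  // energy consumed from the grid
  injectedKwh: number; // energy injected back into the grid
}

interface Tariffs {
  offTakePerKwh: number;   // price paid per kWh taken off the grid
  injectionPerKwh: number; // price received per kWh injected
}

// Net metering: injected energy rolls the meter back, so only the
// net consumption is billed, at the off-take tariff.
function netMeteringBill(energy: YearlyEnergy, tariffs: Tariffs): number {
  // Assumption: a surplus of injected energy is not paid out.
  const netKwh = Math.max(0, energy.offTakeKwh - energy.injectedKwh);
  return netKwh * tariffs.offTakePerKwh;
}

// Market valorization: off-take is billed at the off-take tariff and
// injection is reimbursed at the (usually lower) injection tariff.
function marketValorizationBill(energy: YearlyEnergy, tariffs: Tariffs): number {
  const offTakeCost = energy.offTakeKwh * tariffs.offTakePerKwh;
  const injectionRevenue = energy.injectedKwh * tariffs.injectionPerKwh;
  return offTakeCost - injectionRevenue;
}

const sampleYear: YearlyEnergy = { offTakeKwh: 3000, injectedKwh: 1000 };
const sampleTariffs: Tariffs = { offTakePerKwh: 0.25, injectionPerKwh: 0.05 };

console.log(netMeteringBill(sampleYear, sampleTariffs));
console.log(marketValorizationBill(sampleYear, sampleTariffs));
```

With these made-up numbers, the yearly bill comes to 500 under net metering and about 700 under market valorization, which shows why the comment calls the grid a "free & fully efficient virtual battery" in the net-metering case.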

Feedback from fellow developers

I was curious what other devs thought about this, so I sent them a few snippets and asked for their opinion. Their answers were so interesting that I asked whether I could include them here. They agreed, so here they are:

There are so many concepts and business rules to be changed relatively often, adding more context to the right place is incredibly helpful. However, I also think doing this is difficult and I'm very interested to see how well will the comments age with coming time.

When asked more about the difficulty, he replied:

My personal hypothesis for why I haven't been able to write such comments more is that for me (and I suspect many other developers), writing is a different skill from coding. I'm much less experienced at it and it's difficult for me to switch back and forth between. Because of the "free form". It also feels much less forgiving and more permanent than code.

The remote work environment, like we have, with more asynchronous and written communication, may be very helpful in promoting this. In my previous jobs we preferred talking over writing, attempting to make all knowledge shared by implicit osmosis (extreme being pair/mob programming) rather than explicit knowledge base (extreme being almost complete asynchronicity like Gitlab culture). Like with anything interesting, there are constraints and trade-offs for both approaches, but the more I think about it, the more comments like these seem worth pursuing and I'm excited to try it as well.

Another developer answered:

I am all for it. The downsides are hardly there. There is a maintenance burden for the comments, but that is usually removing/editing here or there. It also has the upside of assisting code review.

And another one:

I'm extremely in favor. I was looking for a value yesterday and was unsure of what it was from the ticket, and if it had a comment like the above, it may have been easier to decipher what that value actually did.

So it seems others also find such comments super helpful, but they see two possible issues: writing & maintenance.

The difficulty of writing

The difficulty of writing resonates. I feel it too, especially since English is not my native language. What has helped me is redirecting my attention to learning. It turns out that non-fiction writing is a skill I can try to improve. The best resource on non-fiction writing I have found so far is Steven Pinker's The Sense of Style (thanks to Gergely for this tip). The book is an eye-opener. It's an entertaining, practical & easy read that introduces the basic concepts behind non-fiction writing. It puts logic, process & rules into writing. I like that.

Another strategy (thanks to Alexey for the tip) is to use bullet points, as they are easier to write.

Comments maintenance

Now for the maintenance. Imagine I need to change some complex code. First, I need to understand it. I gather all the help I can get: reading the code, the comments & the docs, and discussing with the team. Once I understand the code & the domain context, I can make the change. As I finish, I am about to move on to another problem. Should I now invest a few minutes in writing (or updating) the comment, or not?

Whatever I decide now, the cost of gathering the knowledge has already been paid. To decide whether to comment, I need to consider only the cost of writing the comment, not the cost of obtaining the knowledge behind it.

So let's say I decide to invest my time and write the comment as part of change A. With the comment, the next person updating the code with change B will have most of the explanation they need already at hand. They won't need to spend as much time as I did to understand the code. So for the two of us, the total return on my investment would be positive & high.

[Image: time saved when the comment is written]

What if I decide to save time & not write the comment? The upside is that I will deliver the change A faster since I am not writing the comment. The downside is that the next person would need to start from scratch for their change B. They would need to figure out all the heavy stuff again. So for the two of us, the total return on my investment would be negative.

[Image: time saved when the comment is not written]

Now you may wonder: it's one change, does it matter? Perhaps not much for one change. But imagine more changes, more functions & more developers. The savings start to compound.

[Image: the investment across more changes]
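The reasoning above can be put into a small model. This is only a sketch with invented numbers: it assumes every change without a comment costs the full "understand from scratch" time, while with comments the first change pays that full cost plus the writing time, and each later change pays a reduced understanding time plus the time to update the comment.

```typescript
// Hypothetical time model in hours; all values are assumptions for illustration.
interface CommentTimeModel {
  understandFromScratch: number; // rebuilding the domain knowledge without a comment
  understandWithComment: number; // getting up to speed when a comment exists
  writeOrUpdateComment: number;  // writing or updating the comment
}

// Total hours spent on understanding and commenting across a number of changes.
function totalHours(changes: number, model: CommentTimeModel, withComments: boolean): number {
  if (changes <= 0) return 0;
  if (!withComments) return changes * model.understandFromScratch;
  // The first change has no comment to read, so it pays the full cost plus writing.
  const firstChange = model.understandFromScratch + model.writeOrUpdateComment;
  // Every later change reads the existing comment and keeps it up to date.
  const laterChanges =
    (changes - 1) * (model.understandWithComment + model.writeOrUpdateComment);
  return firstChange + laterChanges;
}

const sampleModel: CommentTimeModel = {
  understandFromScratch: 4,
  understandWithComment: 1,
  writeOrUpdateComment: 0.25,
};

for (const changes of [1, 2, 5]) {
  const without = totalHours(changes, sampleModel, false);
  const withComment = totalHours(changes, sampleModel, true);
  console.log(`${changes} change(s): ${without}h without, ${withComment}h with comments`);
}
```

Under these assumed numbers, a single change is slightly slower with the comment (4.25h versus 4h), two changes already come out ahead (5.5h versus 8h), and five changes take 9.25h instead of 20h.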

A friend remarked:

Taking the time to explain something for the next dev is usually even more of a good idea because the next dev to read the code will probably be yourself, so even when you feel selfish, you can think of it as a gift to your future self.

Conclusion to this experiment

All in all, it seems to me that if we adopt domain comments throughout the codebase, we accelerate the engineering pace. Perhaps even significantly. So for now, I'll try to write more of them and see how it goes.

Won't comments get out of sync?

It seems that domain comments don't change that often. A hybrid inverter's relation to a PV/battery system stays the same if we refactor imperative code into functional code or extract some code into a reusable component. Also, we can always try to encourage consistency, for example, via code reviews.
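A small illustration of that point, using hypothetical names & types rather than the application's real code: the same domain comment applies unchanged to an imperative version and to a functional rewrite, because it describes the domain rather than the implementation.

```typescript
// Hypothetical line-item type; not taken from the real application.
type CostLineItem = {
  name: "pvSystem" | "batterySystem" | "hybridInverter";
  cost: number;
};

/*
 * The hybrid inverter is its own line item. The PV system
 * & battery system costs exclude it.
 */
function sumLineItemsImperative(items: readonly CostLineItem[]): number {
  let sum = 0;
  for (const item of items) {
    sum += item.cost;
  }
  return sum;
}

/*
 * The hybrid inverter is its own line item. The PV system
 * & battery system costs exclude it.
 */
const sumLineItemsFunctional = (items: readonly CostLineItem[]): number =>
  items.reduce((sum, item) => sum + item.cost, 0);

const sampleLineItems: CostLineItem[] = [
  { name: "pvSystem", cost: 6000 },
  { name: "batterySystem", cost: 4000 },
  { name: "hybridInverter", cost: 1500 },
];

// Both versions give the same result; the comment did not need to change.
console.log(sumLineItemsImperative(sampleLineItems) === sumLineItemsFunctional(sampleLineItems)); // true
```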

Shouldn't good code document itself?

When I discussed domain comments with a friend of mine, he made this remark, which I find spot on:

Yeah, if code is so clear that it needs no further comments, that is great. But honestly, if you look at code, how often does that encapsulate everything you've learned about the domain. If it doesn't, why not add your knowledge in the comments?