Why comment code

Writing code is not simply about putting language constructs together. It’s about architecture, complexity analysis, tradeoffs, testing, measuring performance, etc. And it’s about making sure that developers (other people as well as the original author) can later read and understand that code, whether to fix a bug or to enhance it. This is where comments come into play (not to be confused with documentation!).

Developers put comments into their code to explain what it does and how it does it, to spell out assumptions, to warn about exceptions, etc. Commenting is taken seriously in the world of code quality. People have measured the ratio of comments to code and come up with empirical rules. It is often recommended that about 30% of the source code be comments.

So what is a good ratio? Is 30% comments sufficient? Should it be more, or is 15% OK? And why?

I label comments as follows:

  1. Redundant: the comment states the obvious and does not bring any information. It is a waste of space.
  2. Obsolete: the code changed, but the comment was not updated. It is irrelevant at best, misleading and confusing at worst.
  3. Incorrect: the comment only reflects the author’s confusion. It is as misleading as (2).
  4. Informative: none of the above.
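To make the four labels concrete, here is a small sketch in Python. The constant, the function, and every comment in it are hypothetical, invented only for illustration; each comment is followed by a bracketed tag naming the category it falls into.

```python
# Hypothetical snippet: the bracketed tags name the category of each comment.

MAX_RETRIES = 5  # retry at most 3 times        [obsolete: the value changed]


def mean(values):
    total = 0.0
    for v in values:
        total += v  # add v to total             [redundant: restates the code]
    # integer division keeps the result exact   [incorrect: "/" is true division]
    # Rounding error accumulates on long inputs;
    # math.fsum would be more accurate.         [informative: note for later]
    return total / len(values)
```

Only the last comment tells the reader something the code does not already say.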

In my experience, most comments fall into category 1: pollution that brings no information the source code does not already provide. Category 2 is quite common too: as the code evolves, the comments do not always keep up. How often do I see useful comments? Rarely, and for two reasons:

  • If a comment is necessary to explain what a function does or what a variable represents, I argue that the name of the entity (function, class, type, data member, variable, etc.) should carry that meaning.
  • If the comment is about an assumption, an invariant, or an exception, then some extra code guarded with an assert() is an unambiguous statement, better than any explanation.
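The two points above can be sketched as a before/after pair. This is a minimal illustration in Python, not a prescription; the function names and the travel-time example are assumptions made up for the purpose.

```python
# Before: the comments carry the meaning (hypothetical example).
def calc(d, r):
    # d is the distance in kilometres, r is the speed in km/h (must be > 0)
    # returns the travel time in hours
    return d / r


# After: the names and an assertion carry the meaning instead.
def travel_time_hours(distance_km, speed_kmh):
    assert speed_kmh > 0, "speed_kmh must be positive"
    return distance_km / speed_kmh
```

Both versions compute the same thing, but the second one cannot drift out of sync with itself: a call such as travel_time_hours(100, 0) fails loudly at the assertion instead of relying on someone having read the comment. Note that Python skips assert statements when run with the -O flag, so this suits internal assumptions rather than validation of user input.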

Code should be as self-explanatory as possible. The API, the names of the constructs, the architecture, the assertions, the exception handling: all those aspects convey so much more information and intent than a few sentences in English (replace with your favorite dialect used to write comments). Natural languages are inherently ambiguous; they cannot be trusted to convey formal or logical statements.

I find a comment to be valuable only in the following cases:

  • To refer to a specific, non-trivial algorithm.
  • To leave a note for future enhancement.
  • To make a joke.
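As an illustration of the first two cases, here is a short Python sketch; the function name is hypothetical. The loop is Kahan (compensated) summation, a technique a reader is unlikely to recognize from the arithmetic alone, so a one-line reference earns its place, as does the note for later.

```python
def compensated_sum(values):
    # Kahan summation: keeps a running correction for lost low-order bits.
    total = 0.0
    compensation = 0.0
    for value in values:
        adjusted = value - compensation
        new_total = total + adjusted
        compensation = (new_total - total) - adjusted
        total = new_total
    # TODO: consider math.fsum if a correctly rounded result is needed.
    return total
```

Everything else in the function is left to the names: no comment explains what total or value is.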

So what is the adequate percentage of comments? If your code reads like perfect prose, then you should have 0% comments. In practice, it is a few percent (5% is more than enough), for notes, references, or clarifications on hardcore algorithms. Anything more is pollution, or a symptom of code that could be written better.

“Who truly writes good code hardly needs any comment”

 
