Don’t comment code

Posted on October 27, 2009 2:30 AM by Andrew

I’d heard this before, but good advice is typically worth repeating. For those of you who program in R, I’d also recommend that scripts be written so they can run from scratch in an empty R environment. Many, many times I’ve found my R scripts and environments to be palimpsests whose meanings are difficult to unravel. (The official recommendation, I guess, is to put everything in R packages, but I’ve never actually learned how to do this.)

15 thoughts on “Don’t comment code”

Andrei on October 27, 2009 at 1:33 AM said:

One type of comment which is extremely useful is the type signature of a function. This is not obvious, especially in R, where vectors and matrices are easily confused with each other and functions can be passed as arguments.
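
As a rough illustration (written in Python rather than R, with hypothetical function names that are not from the post), a signature comment of this kind might look like the sketch below. The comment states whether an argument is a matrix or a flat vector and what kind of function is expected.

```python
# column_means : (rows: list of equal-length lists of float) -> list of float
# `rows` is a matrix given as a list of rows, NOT a flat vector.
# Returns one mean per column.
def column_means(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


# apply_columns : (rows: list of lists of float,
#                  summarize: (list of float) -> float) -> list of float
# `summarize` is a function passed as an argument; it is called once per column.
def apply_columns(rows, summarize):
    return [summarize(list(col)) for col in zip(*rows)]


matrix = [[1.0, 2.0], [3.0, 4.0]]
print(column_means(matrix))        # [2.0, 3.0]
print(apply_columns(matrix, max))  # [3.0, 4.0]
```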

In general, R can be used for a very functional style of programming, which many consider to be a good thing.

A.
andrewt on October 27, 2009 at 1:45 AM said:

You'd really advocate not supplying comments like those in this famous piece of code??

http://www.bsdlover.cn/study/UnixTree/V6/usr/sys/…
Richard D. Morey on October 27, 2009 at 2:17 AM said:

Surely readable code and comments are not mutually exclusive. Often, when I'm looking at source code, I'm looking for a particular snippet of code to do what I want. Comments help me scan tremendously faster than I could without them. I find comments help make the structure of code more transparent.

Comments aren't a replacement for clear code; rather, they complement one another.
harlan.myopenid.com on October 27, 2009 at 3:50 AM said:

FWIW, a lot of people strongly disagree with the title of this blog post. This is not to say that you should comment poorly, as the original posting points out nicely, but that you should use comments to explain *why* the code is the way it is, or to give summary explanations of a block of code (describe the forest, not the trees).

I agree with Prof. Gelman, however, that your scripts should load all required packages. I don't think everything should go in a package, but any code that you routinely use for your workflow (say, a routine to nicely format and display the anova() comparisons among a set of nested models), would best be packaged so you can use it wherever you are on your system. (Nothing worse than running into old copies of your frequently-used code in a dusty subdirectory, and being confused why it doesn't work!)
Carlo Hamalainen on October 27, 2009 at 5:15 AM said:

It's good to document your code with docstring examples that provide test data. Doing this has really improved the readability and quality of my code lately. (By docstring I mean a Python/Sage function comment with an input/output section that can be tested automatically.)
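
A minimal sketch of such a docstring, assuming plain Python and its standard `doctest` module (the function `running_total` is a made-up example, not code from this thread):

```python
import doctest


def running_total(values):
    """Return the cumulative sums of `values`.

    The examples below double as test data: doctest runs each
    `>>>` line and compares the result with the line after it.

    >>> running_total([1, 2, 3])
    [1, 3, 6]
    >>> running_total([])
    []
    """
    totals = []
    current = 0
    for v in values:
        current += v
        totals.append(current)
    return totals


if __name__ == "__main__":
    # Run every docstring example in this module and report mismatches.
    doctest.testmod()
```

If the function's behavior later drifts from the documented examples, running the file reports the mismatch, so this kind of comment cannot go stale silently.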
Andy Fugard on October 27, 2009 at 5:27 AM said:

Hmmm. No. Some comments are good. For instance: what kinds of parameters functions take and what they assume about those parameters (pre-conditions), and a summary of what a function computes (post-conditions). Loop invariants and other mid-code assertions (whether checked or not) can also be useful for conveying the intention behind code and suggesting test cases.
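
One way these three kinds of comment could look in practice, sketched in Python with a binary search chosen purely for illustration (the function name and the choice to check the pre-condition with `assert` are assumptions, not something from the thread):

```python
def binary_search(sorted_values, target):
    # Pre-condition (checked): sorted_values is in ascending order.
    assert all(a <= b for a, b in zip(sorted_values, sorted_values[1:]))

    lo, hi = 0, len(sorted_values)
    while lo < hi:
        # Loop invariant (unchecked): if target is present at all,
        # at least one matching index lies in the half-open range [lo, hi).
        mid = (lo + hi) // 2
        if sorted_values[mid] < target:
            lo = mid + 1
        elif sorted_values[mid] > target:
            hi = mid
        else:
            return mid

    # Post-condition: returns an index i with sorted_values[i] == target,
    # or -1 when target does not occur in sorted_values.
    return -1


print(binary_search([1, 3, 5, 7], 5))  # 2
print(binary_search([1, 3, 5, 7], 4))  # -1
```

Each comment also suggests a test case: an unsorted input should trip the assertion, and an absent value should return -1.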
Kevin Wright on October 27, 2009 at 6:17 AM said:

Please, don't propagate this mentality.

Useless comments are easy to ignore.

Comment-less code can take hours to understand.

Enough said.

Kevin Wright
dWj on October 27, 2009 at 6:25 AM said:

I very much agree with commenting forest rather than trees. The first assertion made by lingpipe, that comments lie and code doesn't, is correct, but as with NP problems, it's easier to verify information than to discover it. (That's why I still read the newspaper.) A good "this way up" gives you a framework for thinking about what you're about to read, but, barring very careful and deliberate subterfuge, will not make the code more confusing if it's a lie than if it weren't there.
wcw on October 27, 2009 at 7:50 AM said:

The point here about loading all libraries carefully is good. It's nice to feed scripts to a vanilla instance sometimes, especially for scheduled jobs.

As for commenting, yeah, this one is wrong, sorry. Comments are not for other people: they are for you. Write an ugly hack? Remind yourself why. Ignore error conditions that won't happen in the particular data on which you're working? Remind yourself. When you come back to reuse your stuff eighteen months later, it'll help.
Anonymous Coward on October 27, 2009 at 8:01 AM said:

Applying the logic of that blog post, we should not only stop writing comments, we should just stop writing code altogether. Your code can be wrong, and it can do useless things.

I see bad advice all the time, and I have to say, this is the worst advice I've seen in at least two weeks.

Write good comments, just like you should write good code, and do good statistical analysis. Good comments are necessary to have good code that can be reused in the future, by you, or by anyone you share it with.
Bill Mill on October 27, 2009 at 8:59 AM said:

Here's a game:

follow this link: http://github.com/DmitryBaranovskiy/g.raphael/blo…

and tell me what, exactly, the shrink() function does in this JavaScript code.

Whether or not you can read JavaScript, my claim is that it takes at least a few minutes to figure out what even this very short snippet of code does, minutes that could easily be saved with a one-line description of the purpose of the function.

All I did was pick the most recent example I've seen of this; they abound when trying to read others' code. Please leave your readers some simple, accurate descriptions of any code that's even moderately complex in the comments!
Bob Carpenter on October 27, 2009 at 9:45 AM said:

I never said that public methods or functions shouldn't be documented. The most important part of the doc is the user-facing documentation (aka the API doc), which should include all well-formedness conditions for arguments, not just types.

The API doc is your contract to the outside world about what the code does. Write it clearly. Ideally provide executable use examples. And write it so that it can be tested as easily as possible to verify it does what it says it does. Put as much reusable functionality as possible into such publicly documented libraries.

I was talking about comments in the code itself, which presumably your users won't see if you've done your job on the API doc.

Presumably you've opened the lid on a package because something's broken, you want to improve performance, add functionality, or just borrow a coding pattern or idiom.

The real danger is blindly believing comments. Code's written by people, and what they think they're doing and what the code's actually doing often diverge. My point is that the code's the only thing you can trust.

While comments that are accurate can help in some places, if you're going to be working with a particular piece of code, you really have to understand it. If you don't understand the code, you probably shouldn't be monkeying with it.

One place I noted where code comments help is when you're coding an obscure algorithm for efficiency that can't be folded into reasonably modular function calls. But the comment should be lightweight about what it does.

The second place I find comments helpful is when you code against the conventions of the language (hopefully for some well motivated reason), or you do something unexpected (e.g. for backward compatibility reasons or because a third-party library uses odd calling conventions). The reason to comment here is that someone literate in coding in the language in question will be thrown by non-idiomatic usages or unexpected patterns.
Richard D. Morey on October 27, 2009 at 10:37 AM said:

Bob, of course you shouldn't be monkeying around with code you don't understand. But comments can help you understand code more quickly, in exactly the same way that good variable names can. They make the intent of the author more transparent.

I agree with you with respect to really basic comments (like "return(blah + n); //add n to blah and return") that generally decrease the signal-to-noise ratio, but I think your position against comments is too strong.
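
To make the contrast concrete, here is a small Python sketch. Both functions and the retry requirement are invented for illustration. The first comment only restates the code; the second records intent that the code alone cannot convey.

```python
def add_offset(blah, n):
    return blah + n  # add n to blah and return (restates the code: noise)


def next_retry_delay(attempt):
    # Why the cap: delays double on every attempt, and without a ceiling
    # a long outage would push retries hours apart. The 60-second limit
    # is a hypothetical requirement chosen for this example.
    return min(2 ** attempt, 60)


print(add_offset(2, 3))      # 5
print(next_retry_delay(3))   # 8
print(next_retry_delay(10))  # 60
```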
Greg Lee on October 30, 2009 at 11:52 AM said:

After many years of professional programming in many languages, I follow three rules:

1. Don't allow incorrect comments. This is the universal rule. Don't write incorrect comments; fix or eliminate broken comments.

2. Follow community standards. This recognizes that different languages require different amounts of comments and that different communities benefit from different things. Style guides from places like Google are good starting points (http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html). The "masters" in each community often set good examples (e.g. Ken Thompson for the Unix kernel example posted above).

3. Let some time pass, then re-read your code. If you can't figure out what it does, you need something: better code, a comment, or a different job.
jseabold on October 31, 2009 at 11:06 AM said:

This really is not good advice. You should strive for good, readable code. And good, *relevant* comments. Update both appropriately. When working in development teams, comments in the initial stages are what keep us all on the same page.
