My Secret Life as a Spaghetti Coder
home | about | contact | privacy statement | getting started with cfrails
Several days ago I wrote about common excuses people use for commenting code and things you can do to get rid of them. I took the stance that rarely is there something so complex that it really requires comments.

In other words, much of the time, comments are crutches for understanding bad code.

To show a counter-example where comments were needed, Peter Farrell posted a snippet of code that was hard to understand:

<cfloop condition="patArr[patIdxEnd] NEQ '*' AND strIdxStart LTE strIdxEnd">
  <cfset ch = patArr[patIdxEnd] />
  <cfif ch NEQ '?'>
    <cfif ch NEQ strArr[strIdxEnd]>
      <cfreturn false />
    </cfif>
  </cfif>
  <cfset patIdxEnd = patIdxEnd - 1 />
  <cfset strIdxEnd = strIdxEnd - 1 />
</cfloop>

Indeed, that code is hard to understand, and comments would clear it up. And I'm not trying to pick on Peter (the code is certainly not something I'd be unlikely to write), but there are other ways to clear up the intent, which the clues of str, pat, * and ? indicate may have something to do with regular expressions. (I'll ignore the question of re-implementing the wheel for now.)

For example, even though pat, str, idx, ch, and arr are often programmer shorthand for pattern, string, index, character, and array respectively, I'd probably spell them out. In particular, str and array are often used to indicate data types, and for this example, the data type is not of secondary importance. Instead, because of the primary importance of the data type, I'd opt to spell them out.

Another way to increase the clarity of the code is to wrap this code in an appropriately named function. It appears as if it was extracted from one as there is a return statement, so including a descriptive function name is not unreasonable, and would do wonders for understandability.

But the most important ways in which the code could be improved have to do with the magic strings and boolean expressions. We might ask several questions (and I did, in a follow-up comment to Peter's):
  1. Why are we stopping when patArr[patIdxEnd] EQ '*' OR strIdxStart GT strIdxEnd?
  2. Why are we returning false when ch=="?" and ch!=strArr[strIdxEnd]?
  3. What is the significance of * and ?
In regular expression syntax, a subexpression followed by * tells the engine to find zero or more occurrences of the subexpression. So, we might put it in a variable named zeroOrMore, and set currentPatternToken = patArr[patIdxEnd]. We might also set outOfBounds = strIdxStart GT strIdxEnd, which would mean we continue looping when currentPatternToken NEQ zeroOrMore AND NOT outOfBounds.

Similarly, you could name '?' by putting it in a variable that explains its significance.

And finally, it would be helpful to further condense the continue/stop conditions into variable names descriptive of their purpose.

In the end, regular expression engines may indeed be one of those few applications that are complex enough to warrant using comments to explain what's going on. But if I was already aware of what this piece of code's intent was, I could also have easily cleared it up using the code itself. Of course it is near impossible to do after the fact, but I think I've shown how it might be done if one had that knowledge before-hand.

What do you think?

Hey! Why don't you make your life easier and subscribe to the full post or short blurb RSS feed? I'm so confident you'll love my smelly pasta plate wisdom that I'm offering a no-strings-attached, lifetime money back guarantee!


Comments
Leave a comment

I'm big on spelling out all but the most obvious of variable names. For me, I might leave "str" abbreviated, but the others I'd spell. I loathe ambiguity in code almost as much as I hate cryptic code (the latter often begets the former).

The rest I think I'd leave as is (I also hate "unnecessary" variable assignments), but add a short comment above the whole block to state the the block exists to parse a regular expression.

I'd probably consider myself a minimalist with respect to comments, but not an exclusionist.

Posted by Rob Wilkerson on Jul 07, 2008 at 10:27 AM UTC - 5 hrs

Definitely with you on choosing better variable names to help clarify the code. If you can make code read like English, comments do generally become unnecessary.

Posted by Sean Corfield on Jul 07, 2008 at 04:30 PM UTC - 5 hrs

Here here Sean :)

the same goes for database tables, the classic problem of people using ID as primary keys make things very confusing very quickly

Posted by zac spitzer on Jul 07, 2008 at 07:53 PM UTC - 5 hrs

Dont forget, sometimes you cannot see the internal code, and can only see the method signature or class name.

This is a place where comments attached are extremely helpful.

Posted by Zynasis on Jul 08, 2008 at 01:13 AM UTC - 5 hrs

@zac - I don't see ID as a problem with database tables - but I think you should prefix the table name if there is any ambiguity. It is as redundant to address a column as post.postID as it is to execute a method post.createPost().

@Zynasis: I agree, and in the original post mentioned that I feel API documentation is a valid use for comments.

Posted by Sammy Larbi on Jul 08, 2008 at 07:40 AM UTC - 5 hrs

It's nice that in the previous article there was a short list of criteria indicating when comments might still be necessary. I think that it cuts closest to the point of things in the flurry of back and forth I've seen in articles on this topic lately.
Zealous arguments on either side seem to miss the point. Working hard to eliminate comments - for the sake of eliminating comments - is just as crazy as mandating that every line of code will have an accompanying line of comment.
As a developer it’s important to make the intent of the code as obvious as possible to the _next_ guy (whether its someone sitting next to you, someone strolling through a year from now or even yourself a year from now); Hungarian notation, comments, common idioms, and so forth.
But as a long-time consultant (as I know some of the responders here are), I have inherited code whose comments and well-notated and smartly-named variable were no longer valid for what they are doing. Spend a half an hour (or more!) reading a company's coding standards - complete with copious examples of why it’s better to use positive vs. negative logic and datatype to notation tables - and then open the files to find that few, if any, "standards" are followed. A quick bug fix can result in "drift". Enough time goes by with additional changes and it’s tough to simply call it "drift" anymore.

And this speaks to the heart of what you're saying: make the effort to make the _code_ understandable.

Copious comments are red flag for me - it means that the content was perceived by the previous developer as so complicated that a volume was required to understand it. And that might indeed be the case - and so I would expect to refer to an external document on a particular process, perhaps the business document.

The real argument about comments is that the compiler/interpreter doesn't read the comments - they don't mean anything except to the reader and _worse_ comments are often grudgingly written. Whatever the comments say, you can only unwaveringly rely upon the code to tell you what it's doing.
But if you're trying to read the code, chances are what it is doing is only half the picture - the other half being: what's it supposed to be doing?

Attention to detail and a moderate approach seems a more considerate approach than the zealotry on either of the spectrum.

Posted by rish on Jul 08, 2008 at 12:36 PM UTC - 5 hrs

@rish - +1 to what you said. I'm of the same opinion, with that being the one VERY strongly held belief I have about software development, which I detailed in this post: Bringing Sacred Cows to the Slaughter @ http://www.codeodor.com/index.cfm/2007/6/25/Bringi...

Thanks for the great comment.

Posted by Sammy Larbi on Jul 13, 2008 at 03:30 PM UTC - 5 hrs

Leave a comment

Leave this field empty
Your Name
Email (not displayed, more info?)
Website

Comment:

Subcribe to this comment thread
Remember my details
Google
Web CodeOdor.com

Me
Picture of me

Topics
.NET (29)
AI/Machine Learning (15)
Answers To 100 Interview Questions (9)
Bioinformatics (3)
C and C++ (10)
cfrails (22)
ColdFusion (85)
Customer Relations (21)
Databases (2)
DRY (20)
DSLs (13)
Electronics (2)
Future Tech (6)
Games (8)
Groovy/Grails (9)
Hardware (2)
IDEs (10)
Java (45)
JavaScript (6)
Lisp (3)
Mac OS (3)
Management (9)
Miscellany (67)
OOAD (42)
Programming (164)
Programming Quotables (12)
Rails (25)
Ruby (66)
Save Your Job (66)
scriptaGulous (4)
Software Development Process (31)
TDD (46)
TDDing xorblog (6)
Tools (6)
Web Development (10)
With (1)
YAGNI (12)

Resources
Agile Manifesto & Principles
Principles Of OOD
ColdFusion
CFUnit
Ruby
Ruby on Rails
JUnit



RSS 2.0: Full Post | Short Blurb
Subscribe by email:

Delivered by FeedBurner