Archive for the ‘ruby’ Category

Die Semicolon Die!

Wednesday, May 25th, 2011

I’ve always filed JavaScript’s automatic semicolon insertion (ASI for short) under the bad parts of the language. That is based on Douglas Crockford’s explanation of how the feature is tricky and easily leads to mistakes, with the canonical example being:

// good, returns the object
return { ... }

// wrong! returns undefined
return
    {
        ...
    }
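For anyone who wants to try it, here is a minimal runnable sketch of the same example. The function names `sameLine` and `nextLine` and the `answer` property are made up for illustration; the output comments assume Node's `console.log` formatting.

```javascript
// Hypothetical names, for illustration only.
function sameLine() {
    return { answer: 42 }
}

function nextLine() {
    return          // ASI ends the statement right here
    {
        answer: 42  // parsed as a block containing a labeled statement, never reached
    }
}

console.log(sameLine())  // { answer: 42 }
console.log(nextLine())  // undefined
```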

Fair enough. Lately I’ve been doing more and more Ruby. Ruby is a language universally praised for its elegant, easy-to-read syntax. One of the strong points of the syntax is its terseness: you can omit a lot of punctuation. Semicolons as well. Wait a moment…

def test
    return
        {
            ...
        }
end

test # returns nil !!

Same thing! Having the meaning of a program change because of an end-of-line is not a good thing in Ruby either, but it’s widely accepted because of the benefits. The same must hold for JavaScript, so first point:

“Removing semicolons and other punctuation clutter is not just a liability. It actually makes your code look better.”

So both languages have to decide when a statement implicitly terminates. But is Ruby’s implementation really the same as JavaScript’s? It turns out it’s not: Ruby takes a much safer approach. A statement in Ruby ends at an end-of-line if it’s syntactically valid by itself, and spans multiple lines if it’s not:

# this works, the trailing dot means the statement is not finished
object.
    method1.
    method2.
    method3

# syntax error (before ruby 1.9), first line is a valid statement by itself, second line calls method1 on nothing
object
    .method1
    .method2
    .method3

It’s safe because how a line is parsed depends on the line itself, not on other lines that could be written “by others”. The bad part is that it makes method chaining on multiple lines look ugly. This is why Ruby 1.9 introduced the exception “the statement continues if the first character of the next line is a dot”.

JavaScript goes a step further to solve this bad part. A controversial step. A statement ends at an end-of-line only if the first token of the next line cannot be parsed as part of the current statement. Otherwise, the statement goes on. This removes the clutter and gives nice chaining:

// just works
object
    .method1()
    .method2()
    .method3()

Unfortunately, you now have a nasty problem. Two lines that are meant to be two separate statements, where the start of the second line is a valid continuation of the first, will be treated as one statement, with unpredictable results. In practice this happens only when a line starts with one of ( [ + - /

// function call instead of grouping
var a = b + c
(d + e).print()
// is really
var a = b + c(d + e).print()

// array index instead of array literal
var a = ["a", "b", "c"]
[0, 1].forEach( … )
// is really
var a = ["a", "b", "c"][0, 1].forEach( … )

// binary math operator instead of unary
var a = b + c
-1 == string.indexOf(query) || die()
// is really
var a = b + c - 1 == string.indexOf(query) || die()

// division instead of regular expression
var i=0
/[a-z]/g.exec(s)
// is really
var i=0 /[a-z]/g.exec(s)
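The first two cases can be reproduced in a small runnable sketch. Everything here is made up for illustration: `c` is a function on purpose, so that the accidental call succeeds quietly instead of throwing, and `seen` just records what it was called with. The output comments assume Node's `console.log` formatting.

```javascript
// Hypothetical names, for illustration only.
var b = 1
var d = 2
var e = 3
var seen = []

// c is a function, so "c" followed by "(d + e)" becomes the call c(d + e)
function c(x) {
    seen.push(x)
    return { print: function () { return 10 } }
}

// meant as two statements, parsed as: var a = b + c(d + e).print()
var a = b + c
(d + e).print()

console.log(seen)  // [ 5 ]
console.log(a)     // 11

// meant as an array followed by another array literal,
// parsed as: var list = ["a", "b", "c"][0, 1]  (comma operator, so index 1)
var list = ["a", "b", "c"]
[0, 1]

console.log(list)  // b
```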

Well, this sucks, so what should you do? I remember being caught by this problem just once in many years of JavaScript. The return problem, or starting a line the nasty way, is extremely rare. But even if you don't want to take the risk, why avoid ASI without even understanding it? Without even considering a reasonable fix, given the nicer syntax? And this leads me to the second point:

"To write semicolon-free code and avoid getting bitten, you just need to remember 2 rules

1) Don't put an end-of-line between return, break, continue, throw, postfix ++, postfix -- and their operand
2) Avoid starting a line with ( [ + - / but if you have to, prepend it with a semicolon"

// everything's fine
return { ... }
continue label
break label
throw error
counter++
counter--

var a = b + c
;(d + e).print()

var a = ["a", "b", "c"]
;[0, 1].forEach( ... )

var a = b + c
;-1 == string.indexOf(query) || die()

var i=0
;/[a-z]/g.exec(s)

Is it that taxing to remember? Automatic semicolon insertion is of course controversial, but using it is not a complete failure. It's a matter of taste, a trade-off between cleaner, nicer code and a few tough but avoidable pitfalls.
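A minimal runnable sketch of rule 2, assuming nothing beyond plain JavaScript. All variable names are hypothetical; the leading semicolons are the only part that matters. The output comments assume Node's `console.log` formatting.

```javascript
// Hypothetical names, for illustration only.
var b = 1
var c = 2
var d = 3
var e = 4
var printed = []

// the leading semicolon keeps "(" from being read as a call on c
var sum = b + c
;(function () { printed.push(d + e) })()

// the leading semicolon keeps "[" from being read as an index into the array
var list = ["a", "b", "c"]
;[0, 1].forEach(function (i) { printed.push(list[i]) })

console.log(sum)      // 3
console.log(list)     // [ 'a', 'b', 'c' ]
console.log(printed)  // [ 7, 'a', 'b' ]
```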

While I'm at it, let's debunk some well-known myths that always show up:

  • "I may know ASI but others don't and they will mess things up"
    Well, this may be true. It depends on where you work, the skill of your peers, etc. To me, a JavaScript programmer is simply supposed to know this stuff, just like prototypes and first-class functions. If they don't, then assuming they have opposable thumbs, anyone who can be told to put semicolons everywhere can just as well be told to remember the 2 simple rules above.
  • "It's not gonna work the same way on every browser"
    It has been in the spec for more than a decade. I think browser bugs are a thing of the past, and even proponents of this theory seem unable to find anything newer than 5 years old, so.
  • "It breaks the tools. You cannot minify code anymore, etc..."
    Let's be clear about this. It's officially part of the language. A tool unable to cope with ASI is a broken tool, period. Anyway, I have never had a problem with Google Closure Compiler.
  • "JSLint doesn't work with it"
    JSLint enforces Douglas's vision and is pretty strict about it. That is fair, yet for those with a different vision there is nothing wrong with using JSHint, which has an option to accept ASI.
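For reference, the JSHint option is called `asi`. A minimal sketch of switching it on for a single file with an inline directive follows; the `greet` function is a made-up example, and the directive only affects the linter, not how the code runs.

```javascript
/* jshint asi: true */

// Hypothetical function, for illustration only.
function greet(name) {
    return "hello " + name
}

console.log(greet("world"))  // hello world
```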

Let's close with two very nice articles that explain the details. And of course you can always read the ECMAScript spec:

The most well-written comprehensive article

Very good explanation of the pitfalls

The plain specs

Code Katas: Programmer’s Deep Practice

Wednesday, September 2nd, 2009

I’ve recently blogged about talent and how it’s grown through disciplined, committed, error-focused practice at the edge of your ability, known as deep practice. I guess it now makes sense to approach it from the perspective of programming: what is a programmer’s deep practice? Unsurprisingly, inspiration can be found in Japanese culture, which highly values discipline and self-improvement. Specifically, I’m talking about martial arts. If you were to learn, say, karate, you would go to a dojo and perform katas. If you happen to be a programmer, you can go to a coding dojo and practice code katas.

Code katas, a term coined by Pragmatic Programmer Dave Thomas, are small programming exercises geared to hone a specific programming skill. Traditionally they tend to be algorithmic, like parsing or traversing graphs, but they could as well aim to improve understanding of a particular programming paradigm, like functional or object-oriented, or of a specific language. Also, as Matteo aptly pointed out, katas can be crafted to master a certain technology, like the web or databases. Coding dojos, as you may guess, are sites, groups or communities that propose and maintain collections of katas, hopefully with solutions and reviews.

So, how do you practice? I suggest you solve a kata, review your work, compare it to other solutions, share your code with others and discuss it. Then solve it again trying to take a different path, balance pros and cons, then solve it again and again, until you feel you have internalized the essence of the problem. Finally, you can move to another kata. If it feels like a lot of work, then you got it right. No question, mastery requires time and effort but, then again, masters are those destined for greatness.

Resources