January 27, 2010

Over-specification is anti-competitive

If there are 300 implementations of a specification, all different, and you take the 4 "important implementations" and write a specification precise enough to cover exactly what those 4 "important implementations" do, normatively requiring
("MUST") that behavior, then you inevitably wind up making many of the remaining 296 implementations non-conforming, because the MUST requirements are too stringent.
The process then favors the 4 "important implementations" over the 296 others, and makes it harder for any of those others to be offered as compliant implementations.
This is an example of "structural bias", as I wrote about earlier.

This problem is widespread in the HTML specification, and unfortunately really difficult to eliminate.
The example where I explored this in depth was the calculation of "image.width" and "image.height", where a precise algorithm requires a state transition from:
[image not available, not loaded] to [image available, being loaded]
and then to EITHER [image available, completely loaded] OR [image not available, load failed].
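As a sketch, the permitted transitions could be encoded like this (the state names are mine; the spec describes them in prose):

    // Hypothetical encoding of the spec's image-availability state machine.
    var transitions = {
      "unavailable/not loaded":      ["available/being loaded"],
      "available/being loaded":      ["available/completely loaded",
                                      "unavailable/load failed"],
      "available/completely loaded": [],  // terminal
      "unavailable/load failed":     []   // terminal
    };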
HTML5 requires that if the image is "available" (whether "being loaded"
or "completely loaded"), then image.width and image.height are both non-zero.
This behavior, I was assured, was necessary because there was some (how much? how often? still deployed?) JavaScript code that relied on exactly this state transition behavior.
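For illustration only, a hypothetical script of the kind cited might look like this (the markup and names are my invention; it assumes an element with id "box"):

    // Hypothetical legacy script that depends on the mandated behavior:
    // once img.width is non-zero, the image is "available" and is assumed
    // to stay available, with width and height never reverting to zero.
    var img = new Image();
    img.src = "banner.png";
    (function poll() {
      if (img.width !== 0) {
        // Relies on the MUST: "available" implies non-zero width and height.
        document.getElementById("box").style.width  = img.width  + "px";
        document.getElementById("box").style.height = img.height + "px";
      } else {
        setTimeout(poll, 50);  // not yet "available"; keep polling
      }
    })();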
An implementation that did not cache the image width and height, or that let the cached image.width and image.height expire (thus allowing [image available, being loaded] to transition back to [image not available, not loaded]), would be non-compliant with the HTML spec.
This non-compliance is not justified by significant interoperability considerations. It's hard to imagine any reasonable programmer making such assumptions, and it is much more likely that the requirement is imaginary. When "compatibility" with a few rare occurrences of badly written software that only works in a few browsers becomes the primary objective of HTML5, the result is an impenetrable mess.
The same can be said for most of the current HTML spec. It is overly precise, in a way that is anti-competitive, due to the process by which it was written; however, it is not in the business interests of the sponsors of the self-selected "WHATWG" steering committee to change the priorities.
Much was written about the cost of reverse engineering and how somehow this precise definition increased competition by giving other implementors precise guidelines for what to implement, but those arguments don't hold water. The cause of "reverse engineering" is and always has been the willingness of implementors to ignore specifications and add new, proprietary and novel extensions, or to modify behavior in a way that is "embrace and pollute" rather than "embrace and extend". This was the behavior during Browser Wars 1.0 and the behavior continues today.
None of the current implementations of HTML technology were written by first consulting the current specification (because the spec was written following the implementations rather than vice versa) so we have no assurance whatsoever that the current specification is useful for any implementation purpose other than proving that a competitive technology is "non-compliant."

Medley Interlisp Project, by Larry Masinter et al.

I haven't been blogging -- most of my focus has been on Medley Interlisp. Tell me what you think!