March 25, 2013

Standardizing JSON

Update 4/2/2013: in an email to the IETF JSON mailing list, Barry Leiba (Applications Area director in IETF) noted that discussions had started with ECMA and ECMA TC 39 to reach agreement on where JSON will be standardized, before continuing with the chartering of an IETF working group.

JSON (JavaScript Object Notation) is a text representation for data interchange. It is derived from the way the JavaScript scripting language represents data structures and arrays. Although derived from JavaScript, it is language-independent, with parsers available for many programming languages.

JSON is often used for serializing and transmitting structured data over a network connection. It is commonly used to transmit data between a server and web application, serving as an alternative to XML.
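As a minimal sketch of what "serializing structured data" means in practice, here is a round trip through Python's standard `json` module. The record and its field names are made up for illustration; nothing here comes from any particular protocol.

```python
import json

# Hypothetical record; the field names are illustrative only.
record = {"id": 7, "tags": ["a", "b"], "active": True, "owner": None}

# Serialize to JSON text (what would be sent over the network).
text = json.dumps(record, sort_keys=True)

# Parse the text back into native data structures on the receiving side.
restored = json.loads(text)

print(text)
print(restored == record)
```

The same text could be parsed by a JSON library in any other language, which is the sense in which the format is language-independent.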

JSON was originally specified by Doug Crockford in RFC 4627, an "Informational" RFC.  IETF specifications known as RFCs come in lots of flavors: an "Informational" RFC isn't a standard that has gone through careful review, while a "standards track" RFC is.

An increasing number of other IETF documents want to include a normative reference to JSON, and the IETF rules generally require that such references point to documents at the same or a higher level of stability. For this reason and a few others, the IETF is starting a JSON working group (mailing list) to update RFC 4627.

The JavaScript language itself is standardized by a different committee (TC39) in a different standards organization (ECMA). For various reasons, the standard is called "ECMAScript" rather than JavaScript. TC39 published ECMAScript 5.1 and is working on ECMAScript 6, with a plan to be done in the same time frame as the IETF work.

The W3C is also developing standards that use JSON and need a stable specification.

Risk of divergence

Unfortunately, without coordination there is a possibility of (minor) divergence between the IETF and ECMA specifications. Coordination can be formal (an organizational liaison) or informal, e.g., by making sure there are participants who work in both committees.
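One concrete example of the kind of small difference at stake: RFC 4627 defines a JSON text as a serialized object or array, while ECMAScript 5.1's JSON.parse accepts any JSON value at the top level, including a bare string or number. The sketch below uses a hypothetical helper, `is_rfc4627_text`, to show how the same input can be acceptable under one reading and not the other. It checks only the top-level rule and relies on Python's `json` parser, which is lenient in other ways, so it should not be read as a full RFC 4627 validator.

```python
import json


def is_rfc4627_text(text: str) -> bool:
    """Return True if text parses as JSON and its top level is an object or array.

    Hypothetical helper for illustration; it checks only the top-level rule.
    """
    try:
        value = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return isinstance(value, (dict, list))


for sample in ['{"a": 1}', '[1, 2]', '"hello"', '42']:
    print(sample, is_rfc4627_text(sample))
```

A sender that emits a bare `"hello"` and a receiver that enforces the object-or-array rule would each be following a specification, yet they would not interoperate.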

There is a formal liaison between IETF and W3C. There is also a formal liaison between W3C and ECMA (and a mailing list, public-script-coord@w3.org). There is no formal liaison between TC39/ECMA and IETF.

Having multiple conflicting specifications for JSON would be bad. While some want to avoid the overhead of a formal liaison, there needs to be explicit assignment of responsibility. I'm in favor of a formal liaison as well as informal coordination. I think it makes sense for IETF to specify the "normative" definition of JSON, while ECMA TC-39's ECMAScript 6.0 and W3C specs should all point to the new IETF spec.

JSON vs. XML

JSON is often considered as an alternative to XML as a way of passing language-independent data structures as part of network protocols.
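To make the comparison concrete, here is a small sketch that writes the same data structure both ways using only the Python standard library. The record and the XML element names (`person`, `name`, `langs`, `lang`) are invented for illustration; XML has no single canonical mapping for a list, which is itself one of the considerations when choosing between the two.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record used for both encodings.
person = {"name": "Ada", "langs": ["en", "fr"]}

# JSON: the native dict/list structure maps directly onto the syntax.
as_json = json.dumps(person)

# XML: the author must choose element names and how to represent the list.
root = ET.Element("person")
ET.SubElement(root, "name").text = person["name"]
langs = ET.SubElement(root, "langs")
for code in person["langs"]:
    ET.SubElement(langs, "lang").text = code
as_xml = ET.tostring(root, encoding="unicode")

print(as_json)
print(as_xml)
```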

In the IETF, BCP 70 (also known as RFC 3470, "Guidelines for the Use of Extensible Markup Language (XML) within IETF Protocols") gives guidelines for use of XML in network protocols. However, this was published in 2003. (I was a co-author with Marshall Rose and Scott Hollenbeck.)

But of course these guidelines don't answer the question many have: When people want to pass data structures between applications in network protocols, do they use XML or JSON and when? What is the rough consensus of the community? Is it a choice? What are the alternatives and considerations? (Fashion? deployment? expressiveness? extensibility?) 

This is a critical bit of web architecture that needs attention. The community needs guidelines for understanding the competing benefits and costs of XML vs. JSON.  If there's interest, I'd like to see an update to BCP 70 which covers JSON as well as XML.

9 comments:

  1. According to http://www.w3.org/2001/11/StdLiaison there are formal W3C/ECMA liaisons, two people on each side. 

    Replies
    1. updated blog; not sure why they weren't part of the TAG WebIDL/ECMAScript 6 discussion.

  2. Although the concept of validation may be antithetical to many JSON adherents, it is critical for interoperable processes.

    Looking back now at the XML Schema wars, I feel confident I can say that
    1. everybody can live with W3C XML Schema Data Types,
    2. Relax NG is easier to write and better at validating loosely-coupled systems,
    3. W3C XML Schema Structures is better at describing XML structure that is designed to be compiled into code accessors (C#, Java) for more strongly coupled RPC-type systems.

    I feel that the various JSON schema proposals are all ignoring these and other lessons of the XML Schema history. I believe that a clear differentiation among the schema roles is necessary to come up with answers that satisfy the various use cases.

    The current approaches either ignore the problem and let JSON be defined by the code that generates it, or err too much on the side of using JSON syntax where it's awkward, that is, in describing other JSON syntax.

    Initiatives such as the binary format Avro that are schema-first may take over where Binary XML and JSON+Schema fail.

    But even Avro fails in the loosely-coupled front, with the only tool for describing polymorphism being the blunt UNION.

    A schema language with compact, readable syntax and based on Horn clauses (much like Relax NG) but using JSON or Avro types would be a good start for describing an interoperable validation language that could apply equally well to many different data serialization formats.

    Please consider the needs of loosely coupled systems, tightly coupled systems, and validation, when looking at the umbrella of standards around JSON.

  3. For those wondering what needs to be done to standardize JSON, see http://www.ietf.org/mail-archive/web/json/current/msg00193.html

  4. +1 for a single specification for JSON.

  5. Leigh - you _might_ talk the IETF into thinking that schemas for everything (or at least for some things) is a good idea, but I doubt you'll get much traction from the JSON community.

    Remember, the JSON folks found very very different lessons in XML and XML Schema history. They didn't just stumble into different or bad ideas, they made those choices with substantial awareness of the options. Not only that, but there are few signs that they are repenting of those choices.

    I've spent too much time on the xml-dev list lately explaining why schema-first is a bad idea, and seeing you propose inflicting that model on a field that has largely rejected it is, well, strange.

  6. While IETF and ECMA management continue to figure out how to coordinate, what is necessary (and sufficient) is to produce an IETF document that ECMAScript 6 can reference, and then to remove the JSON grammar from http://wiki.ecmascript.org/doku.php?id=harmony:specification_drafts section 15.2.

  7. (At my instigation) Barry Leiba noted the agreement in the IETF JSON working group charter:

    http://www.ietf.org/mail-archive/web/json/current/msg00267.html
