September 9, 2014

The multipart/form-data mess

OK, this is only a tiny mess in comparison with the URL mess, and I have more hope for this one.

Way back when (1995), I spec'ed a way of doing "file upload" in RFC 1867. I got into this because some Xerox printing product in the 90s wanted it, and enough other folks in the web community seemed to want it too. I was happy to find something that a Xerox product actually wanted from Xerox research.

It seemed natural, if you were sending files, to use MIME's methods for doing so, in the hope that the design constraints were similar and that implementors would already be familiar with email MIME implementations. The original file upload spec was done in the IETF because, at the time, all of the web, including HTML, was being standardized in the IETF. RFC 1867 was "experimental," which in the IETF used to be one way of floating a proposal for new stuff without having to declare it ready.

After some experimentation we wanted to move the spec toward standardization. Part of the process of making the proposal standard was to modularize the specification, so that it wasn't just about uploading files in web pages. Rather, all the stuff about extending forms and names of form fields and so forth went with HTML. And the container, the holder of "form data" (independent of what kind of form you had or whether it had any files at all), went into the definition of multipart/form-data (in RFC 2388). Now, I don't know if it was "theoretical purity" or just some sense of building things that are general purpose to allow unintended mash-ups, but RFC 2388 was pretty general. Meanwhile, HTML 3.2 and HTML 4.0 were being developed by people who were more interested in spec-ing a markup language than a form processing application. The result was a specification gap between RFC 2388 and HTML 4.0 about when and how browsers were supposed to process a form, and what they were supposed to produce as multipart/form-data.
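For readers who haven't looked at the wire format, here is a rough Python sketch of what the "container" amounts to: a series of parts separated by a boundary string, each carrying a Content-Disposition header that names the form field. The helper name encode_multipart, the field names, and the choice of application/octet-stream for file parts are all assumptions made for illustration; this is not what any particular browser does. It also deliberately ignores the questions that fall into the specification gap, such as how to escape quotes or encode non-ASCII characters in field names and file names.

```python
import uuid


def encode_multipart(fields, boundary=None):
    """Build a multipart/form-data body (illustrative sketch only).

    fields is a list of (name, value, filename) tuples, where value is
    bytes and filename is None for ordinary (non-file) fields.
    No escaping or charset handling is attempted here.
    """
    boundary = boundary or uuid.uuid4().hex
    chunks = []
    for name, value, filename in fields:
        disposition = f'form-data; name="{name}"'
        if filename is not None:
            disposition += f'; filename="{filename}"'
        headers = [f"--{boundary}", f"Content-Disposition: {disposition}"]
        if filename is not None:
            # Assumed default media type for file parts in this sketch.
            headers.append("Content-Type: application/octet-stream")
        # A blank line separates the part headers from the part content.
        chunks.append(
            "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n" + value + b"\r\n"
        )
    # The closing delimiter has two extra hyphens after the boundary.
    chunks.append(f"--{boundary}--\r\n".encode("ascii"))
    return f"multipart/form-data; boundary={boundary}", b"".join(chunks)


# One plain field and one file field, with a fixed boundary for readability.
content_type, body = encode_multipart(
    [("title", b"hello", None), ("upload", b"\x00\x01", "data.bin")],
    boundary="XYZ",
)
```

Note that nothing in the sketch depends on an HTML form existing at all, which is the sense in which the container was split off from the form markup.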

In February of last year (2013) I got a request to find someone to update RFC 2388. After many months of trying to find another volunteer (most declined because of lack of time to deal with the politics), I went ahead and started the work: update the spec, investigate what browsers did, make some known changes. See the GitHub repo for multipart/form-data and the latest Internet Draft spec.

Now, I admit I got distracted trying to build a test framework for a "test the web forward" kind of automated test, and spent way too much time building what wound up being a fairly arcane system. But I've updated the document, and recommended its "working group last call". The only problem is that I just made stuff up based on some unvalidated guesswork reported second hand, and there is no working group of people willing to do the work. As far as I can tell, no browser implementor has reviewed the latest drafts.

I'm not sure what it takes to get technical reviewers who will actually read the document and compare it to one or more implementations to justify the changes in the draft.

Go to it! Review the spec! Make concrete suggestions for changes, send comments, or even better, send GitHub pull requests!



