notated.org notes on learning, design, tools, & life

RFC 2397 - The “data” URL scheme

I’ve still been having way too much fun playing around with my self-editing data URL “page”, and wound up finding the original proposal for data URLs from the IETF dated August 1998:

Some applications that use URLs also have a need to embed (small) media type data directly inline. This document defines a new URL scheme that would work like ‘immediate addressing’. The URLs are of the form:

data:[<mediatype>][;base64],<data>

The <mediatype> is an Internet media type specification (with optional parameters.) The appearance of “;base64” means that the data is encoded as base64. Without “;base64”, the data (as a sequence of octets) is represented using ASCII encoding for octets inside the range of safe URL characters and using the standard %xx hex encoding of URLs for octets outside that range. If <mediatype> is omitted, it defaults to text/plain;charset=US-ASCII. As a shorthand, “text/plain” can be omitted but the charset parameter supplied.
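To make that grammar a bit more concrete, here is a minimal Python sketch of building and reading these URLs. The function names `make_data_url` and `parse_data_url` are my own, not from the RFC, and the parser takes shortcuts: it assumes a lowercase ";base64" marker and does not validate the media type.

```python
import base64
from typing import Tuple
from urllib.parse import quote, unquote_to_bytes


def make_data_url(data: bytes, mediatype: str = "", use_base64: bool = False) -> str:
    """Build a data URL of the form data:[<mediatype>][;base64],<data>."""
    if use_base64:
        payload = base64.b64encode(data).decode("ascii")
        return f"data:{mediatype};base64,{payload}"
    # Without ";base64", octets outside the safe range become %xx escapes.
    return f"data:{mediatype},{quote(data)}"


def parse_data_url(url: str) -> Tuple[str, bytes]:
    """Split a data URL into (mediatype, decoded bytes)."""
    if not url.startswith("data:"):
        raise ValueError("not a data URL")
    header, comma, payload = url[len("data:"):].partition(",")
    if not comma:
        raise ValueError("data URL is missing the comma separator")

    # Assumption: the marker is written in lowercase, as in the RFC's examples.
    is_base64 = header.endswith(";base64")
    if is_base64:
        header = header[: -len(";base64")]

    if not header:
        # Omitted media type falls back to the RFC's default.
        mediatype = "text/plain;charset=US-ASCII"
    elif header.startswith(";"):
        # Shorthand: "text/plain" omitted, but parameters supplied.
        mediatype = "text/plain" + header
    else:
        mediatype = header

    data = base64.b64decode(payload) if is_base64 else unquote_to_bytes(payload)
    return mediatype, data


print(make_data_url(b"A brief note"))  # data:,A%20brief%20note
print(parse_data_url("data:;charset=utf-8,caf%C3%A9"))
```

The first call reproduces the "A brief note" example from the RFC itself; the second exercises the shorthand where only the charset parameter is given.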

And apparently the idea goes back even further:

This idea was originally proposed August 1995. Some versions of the data URL scheme have been used in the definition of VRML, and a version has appeared as part of a proposal for embedded data in HTML. Various changes have been made, based on requests, to elide the media type, pack the indication of the base64 encoding more tightly, and eliminate “quoted printable” as an encoding since it would not easily yield valid URLs without additional %xx encoding, which itself is sufficient. The “data” URL scheme is in use in VRML, new applications of HTML, and various commercial products. It is being used for object parameters in Java and ActiveX applications.

I kind of love looking at older documents like this about the inner workings of the web. I primarily see data URLs used for inlining images, as outlined by Chris Coyier, but the idea of encoding an entire page is just somehow still tripping me out.

Maybe it’ll pass.