HTML Serialization

Serialization of HTML 5 Documents

(Whether a web page is being processed as HTML, xHTML or pure XML is a technical issue for advanced users. You may want to skip directly to the Basic HTML Code section.)

The serialization of an HTML document is the syntax used to convert the document from an internal document model to a stream of bytes for storage or transmission, and to convert that stream back again. The HTML 5 specifications allow HTML documents to be coded in either the HTML style, based on the 1997 HTML 4 and earlier specifications, or the xHTML style, based on the 1998 XML, 1999 Namespaces and 2000-2001 XHTML 1.x W3C recommendations. The xHTML style of code has a number of advantages, including broad compatibility with existing standards and browsers.

HTML 5 has been designed to be backward compatible with both the 1997-1999 HTML 4 standards and the 2000-2001 XHTML 1.x W3C recommendations. The XML serialization of HTML 5 merges these two standards and is already understood by virtually all web browsers, including xHTML-based mobile browsers.
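
To make the idea of serialization concrete, here is a minimal sketch in Python. It uses the standard library's xml.etree.ElementTree module as a stand-in for a browser's internal document model (an assumption made for illustration, since browsers use their own DOM implementations). It builds one small in-memory document and writes it out in both the HTML style and the xHTML style.

```python
import xml.etree.ElementTree as ET

# Build a tiny in-memory document model equivalent to:
#   a paragraph containing "Line one", a line break, then "Line two"
p = ET.Element("p")
p.text = "Line one"
br = ET.SubElement(p, "br")
br.tail = "Line two"

# Serialize the same model in two syntaxes.
html_style = ET.tostring(p, encoding="unicode", method="html")
xml_style = ET.tostring(p, encoding="unicode", method="xml")

print(html_style)  # <p>Line one<br>Line two</p>
print(xml_style)   # <p>Line one<br />Line two</p>
```

The document model is identical in both cases. Only the written syntax differs: the HTML style leaves the empty br element unclosed, while the xHTML style closes it explicitly.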

Processing HTML Code

HTML code can be processed in at least three different ways:

  1. as an HTML serialization of HTML, by web browsers and other software that process HTML documents from that serialization format
  2. as an XHTML-compatible XML-based serialization of HTML (xHTML), by web browsers and other software that process HTML documents from that serialization format
  3. as pure XML, by XSLT and other software that process documents as XML

Polyglot HTML documents are HTML documents coded so that they can be read in any of those three ways. This avoids restricting a document to parsers that handle only one serialization, and it avoids having to code the same content two or three times.
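
As a rough illustration of the polyglot idea, the Python sketch below feeds one sample page to both an HTML parser and an XML parser from the standard library and compares the elements each one reports. The sample page, the TagCollector class and the two helper functions are hypothetical names invented for this example. Note also that html.parser is a lenient tokenizer rather than a full HTML 5 parser, so this is only an approximation of what a browser does.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample page written in polyglot style: lowercase tags,
# quoted attributes, the XHTML namespace, and explicitly closed empty elements.
POLYGLOT_PAGE = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<meta charset="utf-8" />
<title>Polyglot example</title>
</head>
<body>
<p>Line one<br />Line two</p>
</body>
</html>"""


class TagCollector(HTMLParser):
    """Records the name of every start tag the HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tags_seen_as_html(markup):
    """Return the element names found when the markup is read as HTML."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def tags_seen_as_xml(markup):
    """Return the element names found when the markup is read as XML."""
    root = ET.fromstring(markup)
    # ElementTree reports names as "{namespace}tag"; keep only the local part.
    return [element.tag.split("}", 1)[-1] for element in root.iter()]


html_tags = tags_seen_as_html(POLYGLOT_PAGE)
xml_tags = tags_seen_as_xml(POLYGLOT_PAGE)
print(html_tags == xml_tags)  # True
```

Both parsers report the same seven elements in the same order (html, head, meta, title, body, p, br), which is the property a polyglot document is meant to have. If the br element were written without its closing slash, the XML parser would reject the page while the HTML parser would still accept it.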

Polyglot documents can be delivered as:

  1. text/html to traditional web browsers,
  2. application/xml to XML parsers or
  3. application/xhtml+xml, the combination of both, which works with web browsers on cell phones and other handheld devices as well as on desktop computers
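
One way a server might choose among these three media types is to look at the Accept header the client sends. The Python sketch below is a simplified illustration, not a complete implementation of HTTP content negotiation: the function name choose_media_type is hypothetical, and the sketch ignores quality values (such as ;q=0.9) and wildcards, both of which a real server would need to honor.

```python
def choose_media_type(accept_header):
    """Pick a Content-Type for a polyglot page from an HTTP Accept header.

    Simplifying assumption: quality values and wildcards are ignored.
    """
    # Keep only the media-type part of each entry, dropping any parameters.
    offered = {
        part.split(";", 1)[0].strip().lower()
        for part in accept_header.split(",")
    }
    if "application/xhtml+xml" in offered:
        return "application/xhtml+xml"
    if "application/xml" in offered and "text/html" not in offered:
        return "application/xml"
    # Fall back to the type every traditional web browser understands.
    return "text/html"


print(choose_media_type("text/html,application/xhtml+xml,application/xml;q=0.9"))
# application/xhtml+xml
print(choose_media_type("text/html"))
# text/html
```

Because the document itself is polyglot, the same bytes can be sent in every case. Only the Content-Type header changes.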

If you start creating polyglot documents now, your web pages will be well positioned for both current and future HTML browsers and mobile devices.