Other differences between HTML 4 / XHTML and HTML 5
- Coding special characters in HTML without a DTD
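- Because there is no HTML 5 DTD, named entities other than the five predefined by XML (&amp;, &lt;, &gt;, &quot; and &apos;) are not guaranteed to be defined when a document is processed as XML. Decimal or hexadecimal numeric character references should therefore be used when coding special characters. For a handy reference of special characters in HTML, see HTML Character Codes.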
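As a brief illustration, the fragment below writes the same characters first as a decimal and then as a hexadecimal numeric reference. The characters chosen (a non-breaking space, the copyright sign and the euro sign) and the surrounding text are only examples and do not come from the original page.

```html
<!-- Non-breaking space (U+00A0): decimal, then hexadecimal -->
<p>10&#160;km and 10&#xA0;km</p>

<!-- Copyright sign (U+00A9) -->
<p>&#169; Example Company and &#xA9; Example Company</p>

<!-- Euro sign (U+20AC) -->
<p>&#8364;5 and &#x20AC;5</p>
```

Both forms refer to the same code point, so the choice between decimal and hexadecimal is a matter of preference.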
- Alternative HTML syntax standards
- For backward compatibility with most existing web pages, HTML 5 allows documents to be coded using either the syntax traditionally used for the 1997-1999 versions of HTML (version 4.x) or the XML-based XHTML (Extensible HTML) syntax that the W3C introduced in 2000-2001, which is more mobile-friendly and allows using templates. (For example, compare how a site such as this one looks on a mobile device with how other sites such as WhiteHouse.gov look.)
- An HTML page can be coded such that it adheres to a common subset of the syntactic requirements that are shared by both syntaxes and avoids anything that is unique to one syntax or the other. HTML documents that can be properly delivered and processed as either syntactic flavor of HTML are known as polyglot HTML documents.
- Attribute coding for polyglot HTML documents
There are some specific requirements for coding attributes in HTML documents that can be parsed using either set of syntax rules. Every attribute, even a boolean attribute, must include an equal sign and a value, and the value must always be enclosed in quotes. The value of a boolean attribute must be the name of the attribute itself, or else the attribute must be left out of the tag entirely. The value of a boolean attribute should not be "true", "false", omitted (a minimized attribute) or an empty string. For example:
<option selected="selected"/>
The following ways of coding HTML attributes should be avoided:
<option selected/> <option selected=""/> <option selected="true"/> <option selected="yes"/>
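To put the attribute rules in context, here is a minimal sketch of a complete document intended to parse the same way under both syntaxes. It is an illustration under stated assumptions, not a definitive template: the form fields, the "/search" URL and the page title are hypothetical, and it assumes the file is saved as UTF-8.

```html
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <meta charset="utf-8"/>
    <title>Polyglot example</title>
  </head>
  <body>
    <!-- "/search" is a hypothetical URL used only for illustration -->
    <form action="/search" method="get">
      <p>
        <select name="size">
          <option value="s">Small</option>
          <!-- Boolean attribute written as name="name", in quotes -->
          <option value="m" selected="selected">Medium</option>
        </select>
        <input type="checkbox" name="gift" checked="checked"/>
        <input type="submit" value="Send"/>
      </p>
    </form>
  </body>
</html>
```

In this sketch, elements that can hold content are given explicit end tags, while empty elements such as meta and input use the trailing slash, so neither parser has to guess where an element ends.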
- HTML 5 Specifications focus on the DOM
- The specifications for HTML 5 define elements of the language in terms of the operation and effects of the document's internal object model, making HTML 5 more of an abstract language than earlier versions. As a result, the language can be encoded in more than one syntax, as determined by the media type (text/html for the HTML syntax or application/xhtml+xml for the XML syntax, for example). Documents can even exist without an external representation by using the DOM APIs internally.
- Separate rules for creating HTML code vs. parsing HTML
- HTML pages on the web have been somewhat haphazardly created under various different standards and proprietary formats and in many cases with no particular standard or format in mind. As a result, different browser vendors have developed a variety of incompatible methods of handling a lot of non-standard and just plain bad HTML coding.
- There are now different rules for HTML authors and parsing software. The rules for writing HTML code have been simplified, but they are stricter in that errors may be generated for some types of invalid coding, such as mismatched tags. The rules for parsing HTML are also more precise, especially for how to handle non-standard HTML documents that do not follow the authoring rules.
- For backward compatibility with the majority of existing documents, the HTML 5 specification requires document parsers and viewers (browsers) to support older, deprecated elements and attributes and other non-standard HTML coding in as consistent a manner as possible. This means that HTML 5 compliant user agents will gracefully handle some of the more common errors in HTML coding. It's sort of a "Do What I Mean, Not What I Say" feature of web browsers. It also means that different browsers should start handling certain types of errors in the coding of HTML documents in a more consistent and predictable way.
- In layman's terms, there are two different branches of the HTML standard - one for HTML authors and another for developers of browsers and other software that needs to parse HTML code. While browsers may recognize deprecated and non-standard HTML coding, developers creating new HTML pages should avoid deprecated HTML elements and attributes and try to create documents that conform to the authoring requirements of the HTML 5 specification. The benefits of adhering to the HTML authoring rules include:
- more consistent presentation in traditional web browsers from different vendors,
- better support in handheld and mobile devices with smaller screens and
- greater longevity of the HTML pages being created.
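The difference between what a parser must tolerate and what an author should write can be sketched with a small example. The text content and the class name "notice" are hypothetical; the first half shows markup that browsers are expected to recover from, and the second half shows one possible conforming equivalent.

```html
<!-- Non-conforming: mis-nested tags and deprecated presentational elements.
     Browsers should recover from this, but authors should not write it. -->
<p><b><i>Sale ends Friday</b></i></p>
<center><font color="red">Free shipping</font></center>

<!-- Conforming equivalent: properly nested tags, with presentation left
     to a style sheet rule for the hypothetical class "notice". -->
<p><b><i>Sale ends Friday</i></b></p>
<p class="notice">Free shipping</p>
```

The second version relies on a style sheet for centering and color, which keeps the markup itself within the authoring rules.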
- Separation of main content and navigation
- Earlier versions of HTML were not designed with accessibility in mind. Web page designers created workarounds such as "Skip Navigation" links (similar to "Skip Intro" buttons), which were placed near the beginning of web pages so that users "viewing" a site with a screen reader could jump straight to the main content instead of hearing the software vocalize the navigation links on every page.
- HTML 5 documents delivered in the XML syntax can continue to use XSL style sheets, which allow the navigation and other common elements to be completely removed from the documents with the main content and placed in one or more style sheet documents. The advantages of separating the navigation from the main content include:
- making the web pages more accessible, by not forcing screen readers to vocalize all of the navigation links before getting to the main content
- centralizing the "look and feel" of the web site using the templates in the style sheet documents
- improving page load times, since the style sheet documents can be cached by the browser and downloaded just once as opposed to when the common navigation elements are wrapped around the content and downloaded again and again with every web page
- making the web pages more mobile-friendly
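For comparison, the older "Skip Navigation" workaround described earlier in this section can be sketched as follows. This is only an illustration: the id values, link targets and text are hypothetical and are not taken from any particular site.

```html
<body>
  <!-- First link on the page: jumps past the navigation -->
  <p><a href="#main-content">Skip navigation</a></p>

  <!-- Navigation links repeated on every page -->
  <ul id="site-navigation">
    <li><a href="/">Home</a></li>
    <li><a href="/products">Products</a></li>
    <li><a href="/contact">Contact</a></li>
  </ul>

  <!-- Target of the skip link -->
  <div id="main-content">
    <h1>Page title</h1>
    <p>Main content starts here.</p>
  </div>
</body>
```

With this approach the navigation markup is still downloaded with every page, which is the cost that moving it into a shared style sheet document is meant to avoid.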