

They aren't HTML5 docs in the first place

How things change. These days, I have stopped judging whether a document is conforming or nonconforming by its Document Type Declaration. Documents are conforming or documents are nonconforming. Occasionally, they’re both. A conforming HTML 4.01 document may be a conforming XHTML 1.0 document. Conversely, a nonconforming XHTML 1.0 document may be a conforming HTML 5 document.

Ian Hickson offered the following in reply [June 17, 2007] to a W3C HTML WG message, Allow other doctypes.

Conforming HTML4 and XHTML1 docs will not become non-conforming HTML4 and XHTML1 docs. They'll remain conforming HTML4 and XHTML1 docs. They won't be conforming HTML5 docs because they aren't HTML5 docs in the first place. I don't see this as a problem.

HTML 4.01 or XHTML 1.0 document content that the (X)HTML 5 Validator finds nonconforming may be made HTML 5 conforming and, after DocType replacement, pass. Presently, conforming HTML 5 documents that have not included undefined W3C elements, e.g., <header>, will pass W3C Markup Validation (except for the various Document Type Declaration and Character Set requirements).
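
As a sketch of what DocType replacement means here, these are the three Document Type Declarations involved, assuming the Strict variants used in the test cases. They are alternatives, not one file: a document carries exactly one of them as its first line, and the comments are only labels.

```html
<!-- HTML 4.01 Strict -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">

<!-- XHTML 1.0 Strict -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<!-- HTML 5 -->
<!DOCTYPE html>
```

Swapping one declaration for another changes which rules the content is checked against; it does not change the content.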

I thought about what constitutes failure and acceptance between HTML 4.01, XHTML 1.0 and HTML 5. It’s content.

I did some simple test cases.

HTML 4.01/Strict Passed W3C Markup Validation

The specific HTML markup [with suspicious elements (highlighted in red)],


<p>This is a sentence with an <abbr>XHTML</abbr> <code><br /></code> in a <br />paragraph.</p>
<p>This is a sentence with an image <img src="http://www.elementary-group-standards.com/images/elementary-theory-rosette.jpg"
 alt="Elementary Rosette" /> with an <abbr>XHTML</abbr> <code><img /></code> in a paragraph.</p>

See Elementary Test Page. See W3C Markup Validation Service Results.

XHTML 1.0/Strict Passed W3C Markup Validation

The specific identical HTML markup as noted above,


<p>This is a sentence with an <abbr>XHTML</abbr> <code><br /></code> in a <br />paragraph.</p>
<p>This is a sentence with an image <img src="http://www.elementary-group-standards.com/images/elementary-theory-rosette.jpg"
 alt="Elementary Rosette" /> with an <abbr>XHTML</abbr> <code><img /></code> in a paragraph.</p>

See Elementary Test Page. See W3C Markup Validation Service Results

HTML 5 Passed HTML 5 Conformance Checker

The specific identical HTML markup as noted above,


<p>This is a sentence with an <abbr>XHTML</abbr> <code><br /></code> in a <br />paragraph.</p>
<p>This is a sentence with an image <img src="http://www.elementary-group-standards.com/images/elementary-theory-rosette.jpg"
 alt="Elementary Rosette" /> with an <abbr>XHTML</abbr> <code><img /></code> in a paragraph.</p>

See Elementary Test Page. See (X)HTML5 Validator Results

XHTML well-formed self-closing elements are acceptable in HTML 4.01 (as illustrated in the first test case). However, some things are not what they seem.

HTML 4.01/Strict (Second Test) Failed W3C Markup Validation

The specific HTML markup [with the failing element (highlighted in red)],


<meta http-equiv="content-type" content="text/html; charset=utf-8" />

See Elementary Test Page. See W3C Markup Validation Service Results

So. XHTML well-formed self-closing meta elements are not acceptable in HTML 4.01. However, self-closing meta elements are valid in HTML 5 documents. [Note: See Elementary Test Page.]
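
For comparison, a sketch of the same declaration written the HTML way, without the trailing solidus. This variant is not one of the test cases here; it follows the form the HTML 4.01 specification itself uses for meta elements, so it should be the one HTML 4.01 validation accepts.

```html
<meta http-equiv="content-type" content="text/html; charset=utf-8">
```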

XHTML 1.0/Strict (Second Test) Failed W3C Markup Validation

The specific HTML markup [with the failing element (highlighted in red)],


<p>This is a sentence with an HTML <code><br></code> in a <br> paragraph.</p>
<p>This is a sentence with an image <img src="http://www.elementary-group-standards.com/images/elementary-theory-rosette.jpg" >
     alt="Elementary Rosette" /> with an XHTML <code><img /></code> in a paragraph.</p>

See Elementary Test Page. See W3C Markup Validation Service Results

And, HTML empty elements are not acceptable in XHTML 1.0; missing one / (solidus) causes grave errors. That makes sense. XHTML well-formedness rules require empty elements to be self-closed, whereas the HTML 4.01 specification makes no mention of well-formedness. That silence has allowed W3C Quality Assurance to adjust their Validation Service so that, at this moment, it accepts self-closed XHTML elements in HTML 4.01.

The above test cases are very simple. Still. They illustrate that it is possible to have content which meets HTML 4.01, XHTML 1.0 and HTML 5 validation requirements; and, when the appropriate DocTypes are used, one has three conforming documents. However, for all this theory, it appears HTML 5 has an interesting present-day Validator née Conformance Checker loophole: some nonconforming XHTML 1.0 will pass as HTML 5 whereas nonconforming HTML 4.01 shall not.

And, HTML 4.01 Markup Validation accepts XHTML well-formed empty elements.

An Addendum: Thomas Broyer wrote this [26 July 2007],

There’s a misunderstanding of the difference in results from html5-html-pass and html5-html-fail:

The former passes but does not mean what you think. Read carefully the error description from the latter. Actually, in SGML, it seems (I don’t know SGML subtleties) that the / (solidus) closes the tag, and the > (greater-than sign) following it is then part of the textual content. While it’s not a problem with a paragraph, it becomes one in the head because textual content is not allowed there.

So: while the former passes validation, extracting textual content would prove you're wrong assuming XHTML well-formed self-closing elements are acceptable in HTML 4.01. It's not a problem with the meta element but really with using "/>" in start-tags.
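
To make the point concrete, here is a sketch of how a strict SGML parser would read the first line of the first test case, assuming the reading Broyer describes (browsers do not parse HTML this way). The second paragraph is an illustration of the equivalent markup, not something taken from the test pages.

```html
<!-- As written in the test case -->
<p>This is a sentence with an <abbr>XHTML</abbr> <code><br /></code> in a <br />paragraph.</p>

<!-- As read under HTML 4.01: each "/" ends the start-tag, each ">" after it is text -->
<p>This is a sentence with an <abbr>XHTML</abbr> <code><br>&gt;</code> in a <br>&gt;paragraph.</p>
```

A stray > is legal text inside a paragraph, which is why the first test passes. The same stray > after a meta element lands in the head, where text is not allowed, which is why the second test fails.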

[Published date: June 17 2007 (Revised May 14 2009)]