Well-formedness is a common discipline when constructing the markup of websites. That’s what most developers do, isn’t it? So, pages with well-formed elements should be displayed in browsers as web developers intended, right? And all developers need to do is view their pages in a browser. Perhaps not. What about ill-formedness? How are ill-formed documents displayed? The W3C offers a clue.
XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition), section 4, “Differences with HTML 4”, subsection 4.1, “Documents must be well-formed”, has this intriguingly ominous caveat:
“Although overlapping is illegal in SGML, it is widely tolerated in existing browsers.”
Every web standards advocate has seen ill-formed (X)HTML when peering at source code, but how many have identified it when viewing a web page? None. And I don’t mean CSS (Cascading Style Sheets) errors; those are simple to see. The W3C notes that overlapping is illegal, but what about other sorts of well-formedness errors, e.g., unclosed XHTML elements and nesting irregularities? How would one know that a web page has ill-formed markup by viewing it in a browser if “it is widely tolerated in existing browsers”? One wouldn’t; it’s widely tolerated.
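A strict XML parser, unlike a browser, does not tolerate any of these errors. As a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module (the `is_well_formed` helper and the three sample fragments are illustrative names and inputs, not taken from the W3C text), the contrast looks like this:

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Illustrative fragments (assumptions, not from the original test page).
samples = {
    "well-formed": "<p><em>text</em></p>",
    "overlapping": "<p><em>text</p></em>",
    "unclosed": "<div><p>text</div>",
}

for label, markup in samples.items():
    print(label, is_well_formed(markup))
```

Only the first fragment is accepted; the overlapping and unclosed fragments are rejected, even though a browser would display all three without complaint.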
How widely tolerated is ill-formedness in existing browsers?
I did this simple test.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>What's Wrong Here?</title>
<p>First Paragraph
<h1>Header
<dd>
<p>Second Paragraph
<ol>
<li>First Line Item
</h1>
<li>Second Line Item
That’s all, except for one inline style.
It is parsed and rendered similarly in all browsers. The browsers have taken their default attribute values (or settings) for each element and applied them. Well, they tried.
- The first <p> defaults to the browsers’ values.
- <dd> defaults and has indented the elements which follow, as expected (regardless of the missing […]).
- The second <p> has inherited the <h1> font size as well as being indented by the preceding <dd>. [Note: It has inherited the header’s font size because of the closing </h1>. The header violates HTML 4.01 syntax and XHTML 1.0 “nesting” requirements.]
- The first […] <dd> indentation. It has inherited […]
- </h1> behaves accordingly. It closes the header element.
- The second <li> defaults to an unordered list regardless of the missing <ul> and, since the header was closed, it does not inherit […]
[Note: If you have heard about “error-handling”, this ill-formed page is an example of what error-handling does.]
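To make the idea of tolerant error-handling concrete, here is a rough sketch using Python's standard `html.parser.HTMLParser`. It is a deliberate simplification and not the algorithm any browser actually uses: the `NestingTracker` class, its `open_elements` and `problems` attributes, and the recovery rule (close everything back to the matching open element) are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class NestingTracker(HTMLParser):
    """Track open elements and record end tags that arrive out of order."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # stack of currently open tag names
        self.problems = []       # (end tag, innermost open element) pairs

    def handle_starttag(self, tag, attrs):
        self.open_elements.append(tag)

    def handle_endtag(self, tag):
        if self.open_elements and self.open_elements[-1] == tag:
            self.open_elements.pop()
            return
        innermost = self.open_elements[-1] if self.open_elements else None
        self.problems.append((tag, innermost))
        # Tolerant recovery (assumed rule): close everything back to the match.
        if tag in self.open_elements:
            while self.open_elements.pop() != tag:
                pass


# Body of the test page, as given in the article.
body = (
    "<p>First Paragraph <h1>Header <dd> <p>Second Paragraph "
    "<ol> <li>First Line Item </h1> <li>Second Line Item"
)

tracker = NestingTracker()
tracker.feed(body)
tracker.close()

print(tracker.problems)       # end tags that did not match the innermost element
print(tracker.open_elements)  # elements still open at the end of the input
```

The parser raises no error at all. Under the assumed recovery rule, the stray `</h1>` closes the ordered list along with the header, so the second `<li>` ends up outside any list, which fits the rendering described above.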
What a web page appears to be in a browser is not an indication of well-formed markup, though some designers believe otherwise. [Elementary aside: What about potential clients?] If you had merely viewed the above example (and not peered at the source code), it would appear to be an oddly constructed CSS page. Validation (or conformance checking) is required.
The W3C (X)HTML Validation Service can indirectly identify markup well-formedness deficiencies. The error descriptions are not precise, but they do offer guidance when reviewing source code.
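As a rough local complement, and not a replacement for the validator, a generic XML parser can at least report the first well-formedness error in a page meant to be XHTML. The sketch below assumes Python's standard `xml.etree.ElementTree`; the `first_error` helper and the `TEST_PAGE` constant are hypothetical names, and the page text is the test document from this article.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# The test page from this article, reproduced as a string.
TEST_PAGE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>What's Wrong Here?</title>
<p>First Paragraph
<h1>Header
<dd>
<p>Second Paragraph
<ol>
<li>First Line Item
</h1>
<li>Second Line Item
"""


def first_error(markup: str) -> Optional[str]:
    """Return the parser's first well-formedness error, or None if there is none."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        # The message includes the line and column where parsing stopped.
        return str(exc)
    return None


print(first_error(TEST_PAGE))
print(first_error("<p>fine</p>"))
```

The parser stops at the first problem it meets (here, the closing `</h1>` that does not match the innermost open element), so it reports one error at a time. It checks XML well-formedness only; it says nothing about whether the elements are valid XHTML, which is still the validator's job.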
Ill-formed (X)HTML can (and often does) affect any page. It doesn’t matter if pages are generated by hand-coding, (X)HTML editors or Content Management Systems (CMS): ill-formed content may strike at any time. Anywhere. And until an (X)HTML Well-Formedness (or Semantics) Validator or Conformance Checker is invented, the W3C HTML Validation Service remains the best tool, but only if it’s used. Do not rely on browsers!