The Most Common HTML Markup Errors

Or, What does CSS Reboot illustrate about all other Sites?

The previous article, CSS Reboot as Web Standards Validation Indicator, quantified the valid sites that participated in CSS Reboot Spring 2006. This article qualifies the most popular errors found in the Markup of those sites during the validation process, using the W3C Quality Assurance Markup Validation Service. All of the errors are simple, fundamental errors. They are easily corrected. Some sites had one (1) Markup error; some had two hundred (200).

The error types below were the most common; or, popular.

HTML Specific Errors

1. Line 6 column 95:  character data is not allowed here.
....nt/themes/Jason/images/favicon.ico" />
You have used character data somewhere it is not permitted to appear. Mistakes that can cause this error include putting text directly in the body of the document without wrapping it in a container element (such as a <p>aragraph</p>) or forgetting to quote an attribute value (where characters such as "%" and "/" are common, but cannot appear without surrounding quotes).

The most common error found in HTML 4.01 code was the use of XHTML’s self-closing tag syntax, i.e., the trailing “/”.
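As a minimal sketch of the fix, assuming an HTML 4.01 Strict document and a hypothetical favicon path (the real path is truncated in the report above), the empty element simply ends with ">" and carries no trailing slash:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Hypothetical page</title>
<!-- Flagged in HTML 4.01: the XHTML-style trailing slash -->
<!-- <link rel="shortcut icon" href="/images/favicon.ico" /> -->
<!-- Accepted in HTML 4.01: the empty element has no trailing slash -->
<link rel="shortcut icon" href="/images/favicon.ico">
</head>
<body>
<p>Text lives inside a container element.</p>
</body>
</html>
```

The same applies to "meta", "img", "br" and "hr" in an HTML 4.01 document.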

HTML/XHTML Errors

1. Line 54 column 62:  required attribute "alt" not specified.
...src="http://www.neglected-things.com/images/burlesque.jpg" />
The attribute given above is required for an element that you’ve used, but you have omitted it. For instance, in most HTML and XHTML document types the "type" attribute is required on the "script" element and the "alt" attribute is required for the "img" element.

This has been required since HTML 4.01 Specification, 13 Objects, Images, and Applets, 13.2 Including an image: the IMG element which states,

“The alt attribute specifies alternate text that is rendered when the image cannot be displayed. User agents must render alternate text when they cannot support images, they cannot support a certain image type or when they are configured not to display images.”

And, Accessibility Guidelines, HTML Techniques for Web Content Accessibility Guidelines 1.0, 7.1 Short text equivalents for images ("alt-text") which states,

“Provide a text equivalent for every non-text element (e.g., via "alt", "longdesc", or in element content). This includes: images, graphical representations of text (including symbols), image map regions, animations (e.g., animated GIFs), applets and programmatic objects, ASCII art, frames, scripts, images used as list bullets, spacers, graphical buttons, sounds (played with or without user interaction), stand-alone audio files, audio tracks of video, and video. [Priority 1].”
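A short sketch of markup that satisfies the requirement, written in XHTML syntax. The file names and the alt text are hypothetical placeholders, not taken from the validated sites:

```html
<!-- Informative image: the alt text says what the image conveys -->
<img src="images/burlesque.jpg" alt="Poster for a burlesque revue" />

<!-- Purely decorative image: an empty alt still satisfies the requirement -->
<img src="images/spacer.gif" alt="" />

<!-- The "type" attribute the Validator also mentions -->
<script type="text/javascript" src="scripts/site.js"></script>
```

An empty alt passes validation, but it only suits images that carry no information of their own.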

2. Line 100 column 69:  cannot generate system identifier for general entity "contents".
...st.com/php/banner.php?size=rectangle&contents=286,288,287,293" width="304" he
An entity reference was found in the document, but there is no reference by that name defined. Often this is caused by misspelling the reference name, unencoded ampersands, or by leaving off the trailing semicolon (;). The most common cause of this error is unencoded ampersands in URLs as described by the WDG in "Ampersands in URLs".
Entity references start with an ampersand (&) and end with a semicolon (;). If you want to use a literal ampersand in your document you must encode it as "&amp;" (even inside URLs!). Be careful to end entity references with a semicolon or your entity reference may get interpreted in connection with the following text. Also keep in mind that named entity references are case-sensitive; &Aelig; and &aelig; are different characters.
If this error appears in some markup generated by PHP's session handling code, this article has explanations and solutions to your problem.
Note that in most documents, errors related to entity references will trigger up to 5 separate messages from the Validator. Usually these will all disappear when the original problem is fixed.

They lie. Errors related to entity references will trigger up to ten (10) separate messages. This error was commonly caused by a single “&” in URLs. [Elementary Note: Amazon Affiliate links do this. Imagine my surprise when I set up the "Musical Accompaniment" Amazon links on this site and found that my once-compliant Markup had seventy (70) errors after validation! I had forgotten about those pesky “&s”! After correction, all initial (and subsequent) errors disappeared.]
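A minimal sketch of the correction, using example.com as a placeholder host and query parameters modelled on the report above:

```html
<!-- Flagged: the parser reads "&contents" as the start of an entity reference -->
<!-- <a href="http://example.com/banner.php?size=rectangle&contents=286">Banner</a> -->

<!-- Valid: each "&" in the attribute value is written as "&amp;" -->
<a href="http://example.com/banner.php?size=rectangle&amp;contents=286">Banner</a>
```

The browser decodes "&amp;" back to "&" before it requests the URL, so the link behaves exactly as before.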

The above two error types were most infectious.

3. Line 90 column 18:  element "ab" undefined.
<p><ab>The Hunting of the Snark</a></p>
You have used the element named above in your document, but the document type you are using does not define an element of that name.

The above typographical error caused the next error.

4. Line 90 column 50:  end tag for element "a" which is not open.
<p><ab>The Hunting of the Snark</a></p>
The Validator found an end tag for the above element, but that element is not currently open. This is often caused by a leftover end tag from an element that was removed during editing, or by an implicitly closed element (if you have an error related to an element being used where it is not allowed, this is almost certainly the case). In the latter case this error will disappear as soon as you fix the original problem.

There were numerous other typographical errors, e.g., <di> instead of <div>, <o> instead of <ol> &c. [Elementary Note: And, Popular Error N° 3 has a few other errors, doesn’t it.]
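One plausible correction of errors N° 3 and N° 4, assuming the author meant an ordinary anchor. The href value is a hypothetical placeholder:

```html
<!-- Reported: <ab> is not a defined element, so </a> has nothing to close -->
<!-- <p><ab>The Hunting of the Snark</a></p> -->

<!-- Corrected start tag; the matching </a> now closes an open element -->
<p><a href="/snark/">The Hunting of the Snark</a></p>
```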

5. Line 39 column 9:  ID "home" already defined.
<li id="home">
An "id" is a unique identifier. Each time this attribute is used in a document it must have a different value. If you are using this attribute as a hook for style sheets it may be more appropriate to use classes (which group elements) than id (which are used to identify exactly one element).

HTML 4.01 Specification, 7 The global structure of an HTML document, 7.5.2 Element identifiers: the id and class attributes which states,

id = name [CS]
This attribute assigns a name to an element. This name must be unique in a document. [Elementary Note: It translates as use only once.]
class = cdata-list [CS]
This attribute assigns a class name or set of class names to an element. Any number of elements may be assigned the same class name or names. Multiple class names must be separated by white space characters.
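A small sketch of the distinction, assuming a navigation list. The "about" id, the "nav-item" class and the link text are hypothetical:

```html
<!-- Flagged: the same id appears twice in one document -->
<!--
<li id="home">Home</li>
<li id="home">Home (footer)</li>
-->

<!-- One id per document; a class carries the shared styling hook -->
<ul>
  <li id="home" class="nav-item">Home</li>
  <li id="about" class="nav-item">About</li>
</ul>
```

A style sheet can then address the single element as #home and the whole group as .nav-item.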

6. Line 788 column 10:  document type does not allow element "small" here; assuming missing "li" start-tag.
<ul><small>   </small></p>

Which caused the following.

7. Line 788 column 24:  end tag for element "p" which is not open.
<ul><small>   </small></p>
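What the author of that line intended is not recoverable from the report, so the following is only a sketch of a validly nested list, with hypothetical content:

```html
<!-- Reported: <small> directly inside <ul>, then a stray </p> -->
<!-- <ul><small>   </small></p> -->

<!-- A list may only contain list items; inline elements go inside <li> -->
<ul>
  <li><small>Hypothetical fine print</small></li>
</ul>
```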

Errors N° 8 and N° 9 were popular.

8. Line 542 column 8:  end tag for "div" omitted, but OMITTAG NO was specified.
</body>
You may have neglected to close an element, or perhaps you meant to "self-close" an element, that is, ending it with "/>" instead of ">".

The above error caused interesting arrays of compound, i.e., subsequent, errors similar to those found below.

9. Line 109 column 3:  document type does not allow element "h2" here; missing one of "object", "ins", "del", "map", "button" start-tag.
<h2>
The mentioned element is not allowed to appear in the context in which you’ve placed it; the other mentioned elements are the only ones that are both allowed there and can contain the element mentioned.
9a. Line 112 column 5:  document type does not allow element "hr" here; missing one of "object", "ins", "del", "map", "button" start-tag.
<hr />
9b. Line 113 column 23:  document type does not allow element "ul" here; missing one of "object", "ins", "del", "map", "button" start-tag.
<ul class='of-interest'>
9c. Line 131 column 3:  document type does not allow element "h2" here; missing one of "object", "ins", "del", "map", "button" start-tag.
<h2>

[Elementary Note: There were twenty subsequent errors.]

Popular Error N° 9 was caused by a paragraph, i.e., <p>, that was opened on Line 108 of the site’s source code but never closed. All twenty errors would have disappeared if the <p> had been closed.
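A reduced sketch of the situation, assuming an XHTML document. The text content is hypothetical; only the element names and the class come from the reports above:

```html
<!-- Flagged: the <p> is still open when the block-level elements arrive -->
<!--
<p>Introductory text
<h2>Heading</h2>
<hr />
<ul class='of-interest'><li>Item</li></ul>
-->

<!-- Valid: close the paragraph first -->
<p>Introductory text</p>
<h2>Heading</h2>
<hr />
<ul class='of-interest'>
  <li>Item</li>
</ul>
```

A paragraph may only hold inline content, which is why every block-level element after the open <p> drew its own message.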

10. Line 171 column 98:  there is no attribute "align".
..."><img src="img/movie1pre.jpg" align="Left" /></a></div>
11. Line 124 column 12:  there is no attribute "border".
<img border="0" src="http://www.the-flying-monkeys-paw/images/logo.gif" width="167"
12. Line 242 column 81:  end tag for element "a" which is not open.
..om/blog/archives/snark/">snark</a></a></h5>
You may have neglected to close an element, or perhaps you meant to "self-close" an element, that is, ending it with "/>" instead of ">".
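Errors N° 10 and N° 11 are typical of Strict document types, which drop presentational attributes in favour of CSS. A sketch of the replacement, assuming XHTML 1.0 Strict (the reports do not name the document type); the class name, alt text and href values are hypothetical:

```html
<!-- In the <head> -->
<style type="text/css">
  /* Replaces border="0" on linked images */
  a img { border: 0; }
  /* Replaces align="Left" */
  img.float-left { float: left; }
</style>

<!-- In the <body>: no align or border attributes -->
<div><a href="movie1.html"><img src="img/movie1pre.jpg" class="float-left" alt="Movie 1 preview" /></a></div>

<!-- Error 12: exactly one </a> for each <a> -->
<h5><a href="/blog/archives/snark/">snark</a></h5>
```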

XHTML Specific Errors

1. Line 10 column 65:  end tag for "link" omitted, but OMITTAG NO was specified.
...ortcut icon" href="favicon.ico" type="image/x-icon">
You may have neglected to close an element, or perhaps you meant to "self-close" an element, that is, ending it with "/>" instead of ">".

XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition), 4. Differences with HTML 4, 4.6. Empty Elements states,

“Empty elements must either have an end tag or the start tag must end with />. For instance, <br/> or <hr></hr>. See HTML Compatibility Guidelines for information on ways to ensure this is backward compatible with HTML 4 user agents.” [Elementary Note: Empty elements must be closed in the <head> and <body> sections.]
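A brief sketch of closed empty elements in both sections of an XHTML document. The file names and text are hypothetical; the space before "/>" follows the HTML Compatibility Guidelines mentioned in the quotation:

```html
<head>
  <title>Hypothetical XHTML page</title>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <link rel="shortcut icon" href="favicon.ico" type="image/x-icon" />
</head>
<body>
  <p>First line<br />second line</p>
  <hr />
  <p><img src="images/logo.gif" alt="Hypothetical logo" /></p>
</body>
```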

2. Line 9 column 67:  document type does not allow element "META" here.
..."Content-type" content="text/html; charset=utf-8" />
The element named above was found in a context where it is not allowed. This could mean that you have incorrectly nested elements -- such as a "style" element in the "body" section instead of inside "head" -- or two elements that overlap (which is not allowed).
One common cause for this error is the use of XHTML syntax in HTML documents. Due to HTML’s rules of implicitly closed elements, this error can create cascading effects. For instance, using XHTML’s "self-closing" tags for "meta" and "link" in the "head" section of a HTML document may cause the parser to infer the end of the "head" section and the beginning of the "body" section (where "link" and "meta" are not allowed; hence the reported error).

The element was placed in the <body>.

3. Line 191 column 52:  element "DIV" undefined.
<p><DIV class="left"><img src="http://static.flickr.com/28/4
You have used the element named above in your document, but the document type you are using does not define an element of that name. This error is often caused by:
  • using upper-case tags in XHTML (in XHTML attributes and elements must be all lower-case).
4. Line 531 column 74:  an attribute value specification must be an attribute value literal unless SHORTTAG YES is specified.
...t/themes/k2/images/footer.gif" WIDTH=150 HEIGHT=29 BORDER=0 ALT="" USEMAP="#f

XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition), 4. Differences with HTML 4, 4.4. Attribute values must always be quoted

“All attribute values must be quoted, even those which appear to be numeric.”
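A sketch of the corrections for XHTML Specific Errors 3 and 4, with a hypothetical image path. BORDER and USEMAP are left out because border is better handled in CSS and the USEMAP value is truncated in the report:

```html
<!-- Flagged in XHTML: upper-case names and unquoted values -->
<!-- <IMG SRC="images/footer.gif" WIDTH=150 HEIGHT=29 ALT=""> -->

<!-- Valid: lower-case element and attribute names, every value quoted -->
<img src="images/footer.gif" width="150" height="29" alt="" />
```

Lower-casing the <DIV> in error 3 would remove the "undefined" message, though a <div> placed inside a <p> would then draw a nesting error of its own.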

5. Line 513 column 18:  end tag for "br" omitted, but OMITTAG NO was specified.
<br class="clear">

HTML/XHTML Warning

This was a very common warning that appeared for roughly one-third of all sites.

 Line 102 column 32:  character "&" is the first character of a delimiter but occurred as data.
We’re a down to earth, friendly & totally reliable company with a focus on getti
This message may appear in several cases:
  • You tried to include the "<" character in your page: you should escape it as "&lt;".
  • You used an unescaped ampersand "&": this may be valid in some contexts, but it is recommended to use "&amp;", which is always safe.
  • Another possibility is that you forgot to close quotes in a previous tag.

The Markup Validator will give you a “Warning” for using “&” in the text, i.e., data. Curiously, the CSS Validation will not process your style sheets until this ampersand warning has been corrected. Further, the Markup Validator will give you an “Error” when using an ampersand in URLs. It would simplify things if the W3C made “&amp;” mandatory for all uses of “&”, wouldn’t it.
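A minimal sketch of escaped character data, with hypothetical sentences standing in for the site’s copy:

```html
<!-- Draws the warning: a bare "&" in character data -->
<!-- <p>We’re a friendly & reliable company.</p> -->

<!-- Always safe: the ampersand written as an entity reference -->
<p>We’re a friendly &amp; reliable company.</p>

<!-- The same goes for a literal less-than sign -->
<p>1 &lt; 2</p>
```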

Other errors occurred in the Markup from CSS Reboot sites but they were arcane and few.

Why validate?

One needn’t. Browser makers cannot ignore past Markup gobbledygook created before Web Standards and, consequently, are extremely forgiving. I validate for professional pride, for loathing errors, for simplicity, for future possibilities, e.g., HTML 5 or XHTML 1.1, where code with Content=“application/xhtml+xml; charset=utf-8” must be valid.

Here’s a simple test. If the sentence below makes sense to you or if you believe it would be acceptable for The New York Times to print it as a banner headline on their papers, I suggest HTML 3.2.

“Their Hunting Snarks.”

[The Most Common CSS Markup Errors is the next article in this series.]


Sean Fraser posted this on June 22, 2006 12:59 PM.


Comments

Stephen wrote this at September 13, 2006 11:14 AM

Whose hunting snarks?


Sean Fraser wrote this at September 13, 2006 11:21 AM

Stephen: They're hunting snarks.


stephen wrote this at September 15, 2006 06:52 AM

You've hit on two of the most interesting errors.

When validating all the HTML for a site I created over two years ago [www.sttc.org.au], the validator produced some interesting anomalies which were not exactly correct. The first thing it told me was the meta-tags were not correctly closed - and because of the unclosed element, the validator we have all grown to love (and hate) went down the page telling me there are 199 other open or superfluous elements. As you say, the validator was correct on the first one, but the rest were 'because' of the first. At first I was concerned - so I fixed that one problem. Yep, you guessed right. It validated. I see no problem with the validator making that presumption, but I can see how it might scare some designers that they have completely screwed up their code!

On the issue of ampersands, the validator was good for telling me where javascripts and RSS/XML/ATOM feeds had failed to show (ampersand)-amp; (or &amp; if your comments DO show that correctly.) I have discovered that RSS-readers will allow the feed to have a validating ampersand in the script.

I personally find the validating tool more than just about code - it's a validation of one's ability to write good HTML and CSS. Or not.


Sean Fraser wrote this at September 15, 2006 10:33 AM

stephen: Ampersands are niggly, aren't they.

I agree with your last sentence. The W3C Validation Tools are more than just about code. They offer validation of one's knowledge.

Novices can be pleased with knowing that they have valid code; after that, they can begin learning about semantics and shorthand et al. I believe that the validation process is a necessary first step in the fundamental progression of web standards knowledge. Neo-Standardistas can smirk at the validators knowing that they have valid code but the tools are flawed (or some external factor has rendered their code "Failed".)

And, that's where validation has its limitations. Real world projects may take one's valid code and hack it. CMS idiosyncrasies, client participation and scope of work are common culprits. However, I know that whatever I contribute to a project will be valid.

Thank you for the thought-provoking comment.



The Elementary Standards: A Compendium of Web Standards, CSS, Linguistics and Search Engine Optimization methodology Copyright ©2005-2007 Sean Fraser. All work is published under a Creative Commons License. All Rights Reserved.
