
CSS Reboot as Web Standards Validation Indicator

It was to be simple data collection. At the end of XHTML’s Gift, I wrote, “I guess we’ll have to wait and see what after effects May Day brings.” regarding those sites which chose HTML 4.01. That’s simple: HTML 4.01 sites beyond the prominent, well-thumbed bookmarked sites I visit. CSS Reboot Spring 2006 would be perfect. And, since I was collecting data for HTML 4.01, I included the Document Type Definition (DTD), or DOCTYPE, and the character set encoding since I've written about them before. [See Why XHTML™? and Which Character Set Encoding should be Used?, respectively.] I began data collection after CSS Reboot Spring 2006 ended on May 2nd. The first site which did not declare a character set was reviewed with the W3C Quality Assurance Markup Validation Service. Something odd occurred: the site failed markup validation. Curiosity set in. I went back through all of the previous sites; most failed markup validation. The scope of this project was then increased to cover HTML 4.01, DTD, charset and W3C validation results.
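
By way of reference, here is a complete DOCTYPE and character set declaration of the kind this survey looked for; a minimal sketch of my own, not taken from any surveyed site:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>A minimal valid HTML 4.01 Strict page</title>
    </head>
    <body>
    <p>This page declares its DTD and its character set.</p>
    </body>
    </html>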

CSS Reboot had seven hundred eighty-six (786) participating sites, according to its number of pages (15) and sites per page (54). CSS Reboot states that 744 sites officially rebooted; I could not verify that. Therefore, data was collected on all of the sites presented in CSS Reboot. However, the final number was adjusted for two reasons: tables-based sites (27) and sites which were not sites (21), e.g., “Site removed pending relaunch.” or “Server not found.” The final count, by my count, was seven hundred thirty-eight (738) sites. Data collection and analysis were performed between May 2 and June 6, 2006.
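
For anyone wishing to repeat the exercise, both W3C services accept a site's address directly; illustrative URLs, with example.com standing in for the site under review:

    http://validator.w3.org/check?uri=http://example.com/
    http://jigsaw.w3.org/css-validator/validator?uri=http://example.com/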

Because of the failure rate exhibited during preliminary data collection, a Proposed W3C Validation Icon was composed for the websites which participated in CSS Reboot Spring 2006.

71.8% of the websites failed validation for HTML Markup, for CSS, or for both!
28.2% were valid!

That’s atrocious.

The predominant Document Types declared and their failure rates are illustrated as follows.

The Final Breakdown goes like this. [Note: All DOCTYPEs have been given as they were written in the source code of the sites.]

DOCTYPE [None declared]
8 Sites (for which no server-side DTD was identified by validation)
All sites failed validation.
  • 3 failed Markup but passed CSS
  • 5 failed Markup/CSS
HTML 4.01/None
2 Sites (which were server-side Strict)
All Sites failed validation.
  • All failed Markup/CSS
HTML 4.01/Transitional
19 Sites
84.2% of the websites failed validation. 15.8% were valid.
  • 3 were Valid Markup/CSS
  • 9 failed Markup but passed CSS
  • 7 failed Markup/CSS
HTML 4.01/Strict
27 Sites
74.1% of the websites failed validation. 25.9% were valid.
  • 7 were Valid Markup/CSS
  • 4 passed Markup but failed CSS
  • 8 failed Markup but passed CSS
  • 8 failed Markup/CSS
XHTML 1.0/Frameset
1 Site [Note: All pages in frames were Standards-based.]
It passed validation.
  • It was Valid Markup/CSS
XHTML 1.0/Transitional
367 Sites
74.9% of the websites failed validation. 25.1% were valid.
  • 92 were Valid Markup/CSS
  • 68 passed Markup but failed CSS
  • 89 failed Markup but passed CSS
  • 118 failed Markup/CSS
XHTML 1.0/Strict
220 Sites
70.5% of the websites failed validation. 29.5% were valid.
  • 65 were Valid Markup/CSS
  • 49 passed Markup but failed CSS
  • 53 failed Markup but passed CSS
  • 53 failed Markup/CSS
XHTML 1.0/Strict content=“text/html+xml; charset=iso-8859-1”
1 Site
It failed validation.
  • It failed Markup but passed CSS
XHTML 1.1 [No Content type declared]
1 Site
It passed validation.
  • It was Valid Markup/CSS
XHTML 1.1 content-type=“text/html”
59 Sites
64.4% of the websites failed validation. 35.6% were valid.
  • 21 were Valid Markup/CSS
  • 14 passed Markup but failed CSS
  • 5 failed Markup but passed CSS
  • 19 failed Markup/CSS
XHTML 1.1 content=“application/xhtml+xml;charset=utf-8”
8 Sites
50% of the websites failed validation; 50% were valid.
  • 4 were Valid Markup/CSS
  • 3 passed Markup but failed CSS
  • 1 failed Markup/CSS
XML 1.0/XHTML 1.0/Transitional [No Content type declared]
2 Sites
50% of the websites failed validation; 50% were valid.
  • 1 was Valid Markup/CSS
  • 1 passed Markup but failed CSS
XML 1.0/XHTML 1.0/Transitional content-type=“text/html”
5 Sites
40% of the websites failed validation; 60% were valid.
  • 3 were Valid Markup/CSS
  • 1 failed Markup but passed CSS
  • 1 failed Markup/CSS
XML 1.0/XHTML 1.0/Strict content-type=“text/html”
10 Sites
40% of the websites failed validation; 60% were valid.
  • 6 were Valid Markup/CSS
  • 3 passed Markup but failed CSS
  • 1 failed Markup but passed CSS
XML 1.0/XHTML 1.0/Strict content=“application/xhtml+xml; charset=utf-8”
1 Site
It passed validation.
  • It was Valid Markup/CSS
XML 1.0/XHTML 1.1 [No Content type declared]
2 Sites
100% of the websites failed validation.
  • 1 failed Markup but passed CSS
  • 1 failed Markup/CSS
XML 1.0/XHTML 1.1 content-type=“text/html”
3 Sites
67% of the websites failed validation; 33% were valid.
  • 1 was Valid Markup/CSS
  • 1 passed Markup but failed CSS
  • 1 failed Markup/CSS

It appears that those sonorous messages from Big-Top Barkers, Standardistas, Web Standards Evangelists, Semantics Peddlers and Old Professionals are wee and small beyond the vernal arcades of the Standards Side Show. Still, it’s not all that dismal. Some sites failed for typographical errors. Some for The W3C CSS Validation Service not comprehending CSS3 Selectors. Some failures were caused by CSS Hacks. Hokey-Smokes! Even a few 9 Rules sites failed!

Still.

[The Most Common HTML Markup Errors is the next article in this series.]


Sean Fraser posted this on June 17, 2006 08:58 AM.

Comments

Sarven Capadisli wrote this at June 19, 2006 07:43 PM

Just randomly going through some of the designs submitted to CSS Reboot reveals that most are not standards-compliant. I am not shocked by the results; needless to say, it is great to have concrete analysis at hand like the one you have conducted. So, nice job!

Of course, sites not being fully valid is not the end of the world (as the well-formedness of a document doesn't necessarily mean semantic markup), and I think in general most sites did fairly well, at least in collectively pushing a new, cleaner wave of sites (even if some of the fundamentals were overlooked).

But let us be honest here: it is evident that (as I'm sure many have already highlighted) CSS Reboot's focus is to publicize the submitted sites, rather than to push compliance.

The qualifications for submission are vague (almost none) beyond some attempt towards standards-compliance.

I think what was a potential movement soon gave in to free marketing rather than developers truly showing their understanding of the technologies involved.

Just my two cents.


Sean Fraser wrote this at June 20, 2006 01:16 PM

Sarven: I agree.

This article does not impugn CSS Reboot nor belittle those who participated (excepting those sites which were all nested-tables). Rather, it addresses a single aspect of web standards and web development: continuous education.

The only showcase gallery site that doesn't allow failed Markup and/or CSS is css Zen Garden; all of the other showcase galleries do. And, as far as free marketing, css Zen Garden is a perfect marketing vehicle, if one's accepted.


Matthew Pennell wrote this at June 28, 2006 01:14 AM

The unwritten assumption in your testing is that all validation errors are created equal which, as we know, is not the case.

I would hazard a guess that a large number of the sites failing validation are blogs, and of those a large number of fails are caused by user input (i.e. reader comments) - I know that is what is making my site fail to validate at the moment.

When coding a site it is very difficult to take into account all the possible scenarios that may arise when the site is opened to the public; the best we can do is to note these aberrations when they do occur, and put in place controls to prevent them happening again. I'm sure I could invalidate this page if I tried hard enough - should that be a reflection on your skills as a developer?

I think not.


Mark Wubben wrote this at June 28, 2006 01:30 AM

There are CSS uses which are invalid according to the validator, but valid according to the spec. Mixing CSS 2 and CSS 3 is an example, as are vendor specific properties (properly prefixed or IE specials). Is this really a problem?

Same with HTML: I'm usually too lazy to add a datetime attribute to my insertions and deletions, and I don't think it really matters anyway. Invalid HTML, but is it important?

Web standards is a mindset, not absolute truth.


codesign wrote this at June 28, 2006 08:56 AM

Being a participant (and I'm afraid I don't see many others here), I'm very surprised by this result, especially for those who validate but are fired by votes.

The first thing to look at in this free "contest" is the fair play of the participants and their competence in the design of a page: by design, I mean graphical design, but also the architectural design inside the page.

Regarding the results, we can imagine that almost 50 percent of "web standards" web sites are not standards-compliant :(

(I'm happy, I'm in the 20 percent ;))


Joe Clark wrote this at June 28, 2006 10:27 AM

OK, Matthew and Mark, what proportion of the invalid sites do you think are invalid for the trivial reasons you describe? What if half of them are invalid for those reasons? That still leaves 37% of CSS Reboot sites with incorrect HTML or CSS.

Is the problem still overstated then?


Mark Wubben wrote this at June 28, 2006 12:02 PM

Joe, fair enough. I'm just not a believer in 100% validation. Big errors should still be avoided though, and if not, that is a problem.


Sean Fraser wrote this at June 28, 2006 12:52 PM

Matthew: All errors were considered equal for this exercise: it was, simply, Pass/Fail. I only used a site's home/index page for validation purposes. The great majority of errors were author-induced and not from commenters' poor HTML. Very few errors seemed to come from the CMS platform (of which WordPress was the most popular), and even those errors could be fixed if the site authors had a better understanding of the content entry fields, "includes" and modules.

Most of the HTML errors are easily corrected; some are not. The most common CSS errors are easily corrected; some are more complex and, therefore, CSS errors were categorized: Author-generated, Vendor-specific, and CSS Validator not following its own specifications.

Mark: The fundamentals of Web Standards should, however, be an absolute truth.

codesign: Nearly all "standards-compliant" sites which fail validation are compromised by user comments (when the site allows HTML in comments), external code (e.g., Amazon affiliate links and Google AdWords), CSS hacks or vendor-specific CSS. A site's index page is the best standards-compliance indicator I have found.
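
To illustrate (a hypothetical snippet; real affiliate code varies), a single unescaped ampersand in a pasted-in link is enough to fail markup validation:

    <!-- Invalid: a raw "&" in an attribute value must be escaped -->
    <a href="http://www.example.com/item?id=123&tag=mysite">Buy the book</a>

    <!-- Valid: the ampersand written as an entity -->
    <a href="http://www.example.com/item?id=123&amp;tag=mysite">Buy the book</a>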


David wrote this at June 28, 2006 06:42 PM

So instead of rejoicing that a venture like this helps to spread the attempt to be standards-compliant, we complain about them not being.

Extraordinary research, but I agree with Mark: 'Web standards is a mindset, not absolute truth.'

The web will not die; we will continue to self-educate and push on. Maybe our energies would be better spent trying to reach out and explain how to fix the errors as opposed to stating how 'atrocious' it is. I know when I came into the 'fold' of standards-based design it was because I was embarrassed and somewhat coddled by my peers, not because I had a finger wagged in my face ;). If I have misunderstood this article, then my apologies.


Montoya wrote this at June 28, 2006 08:58 PM

The scary thing is the number of XHTML sites that fail validation. See, validation doesn't matter much in old, fun HTML. It's the abuse of XHTML that's problematic. I can't help but blame the standards community for making developers think that XHTML is the right markup language for every site. I have yet to see a site that actually needs it!

Oh, and then there's stuff like this: march to your own standard.


Mike Haugland wrote this at June 28, 2006 09:09 PM

One thing that interested me was this: "Some for The W3C CSS Validation Service not comprehending CSS3 Selectors."

When these errors came up, did you change the validator to validate against CSS 3.0? Just curious as it may slightly alter the results.


Sean Fraser wrote this at June 28, 2006 10:57 PM

David: If CSS Reboot participants had used the W3C Markup and CSS Services then they would have - at least - been able to fix all of their simple typographical errors before rebooting.

Mike: Thank you for noticing that sentence. It's shrill alarmism. I'm finishing my article regarding CSS Reboot's common CSS errors, which I'll summarize: W3C CSS Validation's egregious!

And, to address your actual question, I didn't revalidate with CSS v.3 because the code was valid, e.g., "p::first-letter". I didn't adjust the failures for using the default CSS v.2 setting when CSS v.3 was found in the style sheet. Those sites which actually failed for a single validation error caused by the pseudo-element "::" convention were few. Maybe three. CSS validation was performed as I believe most users with an average knowledge of web development would have performed it (since the CSS tool doesn't offer any user-friendly explanation of how to use it effectively!)
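
To illustrate the convention at issue (a minimal sketch): CSS v.2 defined pseudo-elements with a single colon; CSS v.3 introduced the double colon to distinguish pseudo-elements from pseudo-classes, so the Validator's default CSS v.2 setting flags the newer form as an error:

    /* CSS v.2 syntax: accepted by the Validator's default setting */
    p:first-letter { font-size: 200%; }

    /* CSS v.3 syntax: valid per the Selectors specification, but
       reported as an error when validating against CSS v.2 */
    p::first-letter { font-size: 200%; }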


Bryan Culver wrote this at June 29, 2006 04:59 AM

A bit late, and this may have been mentioned before, but I think a lot of the non-validation occurred in the actual content part of the sites. For my site, I have a problem with keeping my content valid so that the site as a whole remains valid. If people removed their content, I bet some would pass or come close to validation. Others just ignore the standards, as I once did; I now always use XHTML 1.0 Strict.


James wrote this at June 29, 2006 07:03 AM

I didn't bother going by standards this time - the DOCTYPES are pissing me off.


Mike Haugland wrote this at July 1, 2006 09:20 PM

Ahh, fair enough. I totally understand the approach now. Most people will validate the way you did and arrive at the same result/assumption.

It's really unfortunate that CSS doesn't have a way to declare what version you're using. Or at least it would be nice if the validator could pick up CSS 2.1 or CSS 3 stuff and adjust the validation accordingly.

Also, how did you handle hacks and proprietary properties (eg: scrollbars)?


Sean Fraser wrote this at July 2, 2006 09:07 AM

Mike:

-moz- extensions, MS filter:progid types and The Underscore Hack were very popular errors and were left in the final results. They failed even when validating against CSS v.3 (which includes v.2.1, as you know).

CSS v.2.1, 4.1.2.1 Vendor-specific extensions offers,

“In CSS 2.1, identifiers may begin with '-' (dash) or '_' (underscore). Keywords and property names beginning with '-' or '_' are reserved for vendor-specific extensions.”

Further,

“An initial dash or underscore is guaranteed never to be used in a property or keyword by any current or future level of CSS.”

Still further.

CSS v.2.1, 4.1.3 Characters and case states,

“The following rules always hold: ...In CSS 2.1, identifiers (including element names, classes, and IDs in selectors) can contain only the characters [A-Za-z0-9] and ISO 10646 characters U+00A1 and higher, plus the hyphen (-) and the underscore (_); they cannot start with a digit, or a hyphen followed by a digit. Only properties, values, units, pseudo-classes, pseudo-elements, and at-rules may start with a hyphen (-); other identifiers (e.g. element names, classes, or IDs) may not…”
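
As a sketch of the kinds of declarations at issue (hypothetical selector; the properties and the hack are the point):

    #sidebar {
        -moz-border-radius: 4px;  /* Mozilla vendor-specific extension */
        filter: progid:DXImageTransform.Microsoft.Alpha(opacity=50);  /* MS filter */
        _height: 200px;  /* The Underscore Hack; only IE6 and earlier apply it */
    }

The spec text above sanctions identifiers prefixed with '-' or '_', yet the CSS Validator reports all three declarations as errors.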

I would suggest everyone use the CSS Validator's Advanced Interface with Version 3 only.

The CSS Validator doesn't seem to meet Web Standards Usability requirements, does it.


Mike Haugland wrote this at July 2, 2006 02:29 PM

Hehe. Thanks for clearing that up! I'm out of questions and definitely agree. The W3C should step things up a bit and make the validator more usable. But I suppose they likely won't make the push until 2.1 and 3.0 are out of working-draft status.

I was once bitten by the validator reporting a 2.1 property as invalid without a clue that I could choose between 2.0, 2.1 and 3.0. Even if they just made that choice more visible, I'd be happy. With how many people validate through something like the Web Developer Toolbar, as more of these properties and values get used, it'll become a problem.

Thanks for clearing it all up for me. :-)


david wrote this at July 5, 2006 02:42 PM

Hey Sean, you are right about that; people should use the validation services. Hey, I have a question about DOCTYPEs, and I know you have an article on here about that, but I didn't understand something ... I'll find that article and ask the question there. I think I'm using the wrong DOCTYPE in my pages; never really paid it any mind until I read that other article.


Sean Fraser wrote this at July 6, 2006 09:59 AM

David: Why XHTML™? was it. It doesn't really matter whether you use HTML 4.01 or XHTML 1.0 (served as "text/html") as long as it is "Strict" and it validates. I recommend HTML 4.01.


zcorpan wrote this at September 12, 2006 02:50 PM

What do you mean by "Content-Type"? Something you find in a META element in the markup, or the actual HTTP header?


Sean Fraser wrote this at September 12, 2006 05:26 PM

zcorpan: "Content-Type" for this analysis was the META element only and not from the HTTP Response Header.


Comments are closed.


