More about tidying up HTML

If you’re curious about the last post

I accept that making some HTML code validate at validator.w3.org shouldn’t be the be-all and end-all. A piece of code can validate and yet be… horrible. On the other hand, a piece of code may fail to validate because of a minor problem, and yet be better than 99% of what you see out there.

Still, that applies mainly to your own code. What if you’re aggregating other people’s code? What if they’re using bad HTML, which their blogging systems (mostly Blogger or WordPress) automatically convert to a feed, which is then converted back to (simplified) HTML by a Planet? And what if you want all of that to validate?

Well, tidy works very well; it fixes the worst problems, mainly badly nested code and unclosed tags. But… well, if you’re being pedantic (like the W3 validator is), then there are still problems.

They’re mostly one of the following: 1) img tags without an “alt” attribute, and 2) proprietary attributes.
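If you want to see how common the first problem is before fixing anything, a rough check is easy to write. The following is only a sketch using Python’s standard html.parser module; the class and function names are my own invention for illustration, and it looks only at missing alt attributes, not at proprietary ones.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src value of every img tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        # Self-closing tags (<img ... />) also end up here by default.
        if tag == "img":
            attributes = dict(attrs)
            if "alt" not in attributes:
                self.missing.append(attributes.get("src"))


def find_images_without_alt(html_text):
    """Return the src of each img tag in html_text that lacks an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html_text)
    finder.close()
    return finder.missing


if __name__ == "__main__":
    sample = '<p><img src="a.png"><img src="b.png" alt=""></p>'
    print(find_images_without_alt(sample))
```

Note that an empty alt="" counts as present here, which matches what the validator wants.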

tidy, by default, doesn’t deal with those (since its point is for you to correct your code, and those problems should really be fixed in the code itself). But you can make it do so.

How? Well, here’s the command line I’m using for Planet Atheism:

/usr/local/bin/tidy -wrap 79 -m -i -utf8 --alt-text "" --drop-proprietary-attributes 1 -asxhtml filename

Most of the parameters are standard: -wrap 79 wraps lines at column 79, -m modifies the file in place, -i indents the output, -utf8 sets the encoding, and -asxhtml converts the output to XHTML. The “--alt-text ""” part adds an empty alt text to any img tag that hasn’t got one. The “--drop-proprietary-attributes 1” part removes those weird attributes inside other tags, which make the W3 validator choke. I don’t want them anyway, since a Planet site is supposed to display a basic version of a post, not a Flash-y, YouTube-d, animated one.
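If you’d rather call tidy from a script than from the shell, the same command line can be wrapped in a few lines of Python. This is a minimal sketch, not what Planet Atheism actually runs: the function names are hypothetical, and it assumes tidy lives at /usr/local/bin/tidy, as in the command above.

```python
import subprocess

# The same options as the command line above, one list item per argument.
TIDY_OPTIONS = [
    "-wrap", "79",                          # wrap lines at column 79
    "-m",                                   # modify the input file in place
    "-i",                                   # indent the output
    "-utf8",                                # use UTF-8 for input and output
    "--alt-text", "",                       # empty alt text for img tags lacking one
    "--drop-proprietary-attributes", "1",   # strip non-standard attributes
    "-asxhtml",                             # convert the output to XHTML
]


def build_tidy_command(filename, tidy_bin="/usr/local/bin/tidy"):
    """Return the argument list for running tidy on filename."""
    return [tidy_bin, *TIDY_OPTIONS, filename]


def tidy_in_place(filename, tidy_bin="/usr/local/bin/tidy"):
    """Run tidy on filename; return its exit status (0 = clean, 1 = warnings)."""
    result = subprocess.run(
        build_tidy_command(filename, tidy_bin),
        capture_output=True,
        text=True,
    )
    # tidy exits with 2 when it found errors it could not fix.
    if result.returncode >= 2:
        raise RuntimeError(f"tidy failed on {filename}:\n{result.stderr}")
    return result.returncode
```

Passing the arguments as a list avoids shell quoting problems, which matters here because the alt text is an empty string.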

The result is: complete W3 validation, and readable code. From many other blogs, by many different authors. Automatically. What more could anyone want? :)

Creative Commons Attribution-NonCommercial-NoDerivs 2.5 Portugal