XHTML 2.0, HTML5 and SEO

Structured Data - the future of SEO

How will HTML 5 and XHTML 2.0 affect the way we do search engine optimisation?

According to xhtml.com, “The competition to become the next markup language for the Web is heating up.” As I’ve learned, “heating up” loosely translates to 7+ years of development, an expected delivery date of 2012 (if at all), and a lot of arguments along the way. What am I talking about? XHTML 2.0.

I’ve been looking at the draft XHTML 2.0 and HTML5 markup language standards from the point of view of an SEO consultant. What are these languages and how will they affect the way we do SEO, if at all? What do you need to know now, and when will the way we work be affected by the eventual replacement of XHTML 1.0 and HTML 4.01, the current common markup languages of the web?

What’s HTML 5?

HTML 5 is “the 5th major revision of the core language of the World Wide Web: the Hypertext Markup Language (HTML)”, according to the W3C Working Draft of 10 June 2008. The new language is generally considered a step forward from the previous version, HTML 4.01. Basically, HTML 5 is being created to fix some problems and improve “interoperability” between different “user agents” (browsers and other software that reads web pages).

How will HTML 5 affect SEO?

HTML 5 will introduce new features that help us (and search engines) better dissect a webpage. Where <div> elements have been used for everything in the past, HTML 5 will provide an array of elements to describe navigation, text sections, articles and headers. The improved sectioning could quite easily help a search engine understand the layout of a page – check out this post on block analysis to understand why that’s cool.

Here are two excellent diagrams from A List Apart explaining the differences:

Current layout with HTML:

[Diagram: page structure built with divs]

Layout with HTML 5:

[Diagram: page structure built with HTML 5 elements]
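
To make the block analysis idea a little more concrete, here is a minimal Python sketch of how a crawler might bucket a page's text by the HTML 5 element that contains it. This is only an illustration under my own assumptions: the names BlockCollector, collect_blocks and SAMPLE_PAGE are hypothetical, the sample page is made up, and nothing here reflects how any real search engine works.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Sectioning-style elements from the HTML 5 drafts (an assumed, non-exhaustive list).
SECTIONING = {"header", "nav", "article", "section", "aside", "footer"}


class BlockCollector(HTMLParser):
    """Groups visible text under the nearest enclosing sectioning element."""

    def __init__(self):
        super().__init__()
        self.stack = []                  # open sectioning elements, outermost first
        self.blocks = defaultdict(list)  # element name -> list of text fragments

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Only pop when the closing tag matches the innermost open block.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            # Text outside any sectioning element is filed under "body".
            key = self.stack[-1] if self.stack else "body"
            self.blocks[key].append(text)


def collect_blocks(html):
    """Return a dict mapping element names to the text found inside them."""
    parser = BlockCollector()
    parser.feed(html)
    parser.close()
    return dict(parser.blocks)


# Made-up page used purely for illustration.
SAMPLE_PAGE = """
<body>
  <header><h1>Monkey Reviews</h1></header>
  <nav><a href="/">Home</a> <a href="/about">About</a></nav>
  <article>
    <p>Our favourite monkeys of the year.</p>
  </article>
  <footer>Copyright notice</footer>
</body>
"""

if __name__ == "__main__":
    for name, fragments in collect_blocks(SAMPLE_PAGE).items():
        print(name, "->", fragments)
```

With div-only markup the same script would file everything under "body", which is roughly the point: the new elements hand the parser the page layout for free, instead of leaving it to guess from class names.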

There are a few other interesting additions too. The dialog element, for instance, will allow better representation of conversations in HTML. A WP Twitter plugin could output code like this:

<dialog id="Twittering">
<dt> <time>14:22</time> richardbaxter
<dd> Has anyone seen the latest Battlestar?
<dt> <time>14:23</time> ZakaZaka
<dd> @richardbaxter Get on with your job!
</dialog>

The best articles I found on HTML 5 were this Webmonkey article and the article I mentioned earlier on alistapart.com – excellent background reading.

What’s XHTML 2.0?

“XHTML 2 is a general-purpose markup language designed for representing documents for a wide range of purposes across the World Wide Web. To this end it does not attempt to be all things to all people, supplying every possible markup idiom, but to supply a generally useful set of elements.” – XHTML2.0 draft specification, July 2006

XHTML 2.0 is an upgrade or replacement for the existing markup standards. It’s not “backwards compatible” with HTML 4.01 and not yet supported by browsers such as Firefox and Internet Explorer. The standard is designed to generate better search results, and to create a more accessible Web for people of all abilities, whatever type of device they use.

Fundamentally, XHTML 2.0 is considered a significant leap forward in markup language development, which is why it breaks compatibility with the current HTML 4.01 standard. The current expectation is that the language, if it ever arrives, will land sometime around 2012.

How will XHTML 2.0 affect the way we do SEO?

Here’s a page created in XHTML 2.0. If you’re an SEO, here are the main things to look out for:

Headings

Where H1, H2, etc. describe the relationships between headings (and therefore the semantic structure of the document) in HTML, XHTML 2 lets you explicitly mark up the document structure with the section element and its related heading element, “h”.

So, in XHTML 2.0, your optimised document looks like this:

<h>A study of Monkeys and Dishwashers</h>

<p>An introductory text explaining the purpose of my study of dishwashers and monkeys.</p>
<section>
<h>Dishwashers</h>
<p>Text about Dishwashers</p>
</section>
<section>
<h>Monkeys</h>
<p>Text about monkeys</p>
</section>
<section>
<h>Conclusion</h>
<p>Dishwashers and monkeys have little, if anything, to do with each other</p>
</section>
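
Because the heading level is no longer written into the tag name, anything reading the page has to work it out from how deeply the sections are nested. Here is a rough Python sketch of that idea. The function names outline and build_outline are my own, the sample fragment is hypothetical (including the nested “Energy use” heading, which I added to show a third level), and XML namespaces are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML 2.0-style fragment; namespaces omitted for brevity.
SAMPLE = """
<body>
  <h>A study of Monkeys and Dishwashers</h>
  <section>
    <h>Dishwashers</h>
    <section>
      <h>Energy use</h>
    </section>
  </section>
  <section>
    <h>Monkeys</h>
  </section>
</body>
"""


def outline(element, depth=1):
    """Return (level, heading text) pairs, deriving the level from section nesting."""
    entries = []
    for child in element:
        if child.tag == "h":
            entries.append((depth, (child.text or "").strip()))
        elif child.tag == "section":
            # Each nested section pushes its headings one level deeper.
            entries.extend(outline(child, depth + 1))
    return entries


def build_outline(markup):
    """Parse the markup and return its heading outline in document order."""
    return outline(ET.fromstring(markup.strip()))


if __name__ == "__main__":
    for level, text in build_outline(SAMPLE):
        print("  " * (level - 1) + text)
```

In other words, the equivalent of an H1, H2 or H3 falls out of the nesting rather than being something you choose by hand, so moving a section elsewhere in the document changes its heading level automatically.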

You can add links or images to any “element”

<p src="images/picture.gif">some text here</p>

No more alt="" attribute

The alt attribute of the img element has been removed, so in XHTML 2 you give the descriptive text in the content of the element itself, e.g. <img src="image-profile.jpg">Profile picture – Richard</img>.

There are a great many changes to be aware of, and frankly, my work on this subject has not finished. That said, I think it’s extremely important for SEOs today to understand the markup languages of the future. It’s definitely worth paying attention to the subject of HTML and XHTML development. They will change the way we work with front-end code for SEO.

Photo credit: pasukaru76

Comments

  1. Tanner Christensen

    Interesting changes, without a doubt.

    But for avid SEOs and programmers, it doesn’t look like much of a dramatic change from what has been done in the past. What’s really going to be interesting is to see which (if any) search engine is first to fully support the changes.

  2. richardbaxterseo

    You’re right – from a developer / SEO perspective these new changes aren’t exactly daunting or difficult to digest. The less technical SEO should definitely refresh his / her knowledge though.

    I read somewhere that a Googler is on the HTML5 panel – can’t remember who though. Thoughts?

  3. Heath Huffman

    Why even bother worrying about it? Let the market decide when and which standards will be the new standards. That will be a few years by itself. Then you have the issue of all of the browsers catching up with the new standards… another few years. Once all that’s worked out, then I suggest you worry about what changes you need to learn. Until then, isn’t it kind of futile?

  4. richardbaxterseo

    I don’t think it’s futile, I think it’s a very interesting subject which is why I wrote about it. It will be a long time (a very long time) before any of this stuff is standardised but why not expand your understanding of what developments are ahead of us? Isn’t that why SEO and technology in general is so great to be involved in?

  5. Onl

    Interesting post, and I’d love for the next revisions to be released. However, looking at the current web browser market share indicators as well as overall web analytics, there is still a shocking number of IE6 users out there. Given that the current main browsers don’t yet support it, and a fair number of people still visit my website using Firefox 1.0.x, the last thing I want to do is limit my user base compatibility just to make my website slightly more optimized. Cheers.

  6. Patrice Albertus

    Big job and lots of laboratory tests to do with the new headline attributes! SEO semantic becomes framed like other metadata (microformats snippets, RDFa) and each sense paragraph will be structured with to provides search engine pre-formatted XML ;-)

  7. Dennis Franklin

    Does anyone know if any of the changes being discussed here have been implemented in Google, Yahoo, etc.? I heard another site talking about the same thing and this seems like a major change to me.

  8. richardbaxterseo

    Hi Dennis

    So far, there are very few visible changes in the way SERPs are presented. That said, there’s definitely a move from Google to show support for this format (and structured data pages) with the launch of the rich snippet preview tool. You might find this article useful reading on using structured markup as a “future of SEO” idea.

    Thanks for dropping by,

  9. Sam Langdon

    It seems search engine support for semantic mark-up is growing, with Google recently releasing events rich snippets in addition to people, reviews and videos.

    RDFa & microformats are both supported by HTML 5, so these SE changes will remain pertinent after it starts to permeate in the years to come.

    Google has also announced how it’s shifting its focus away from Gears toward HTML5.

    It looks like RDFa and HTML 5 will have a big impact on search as we seem to be moving ever closer to a Semantic Web.

  10. Ashley Sheridan

    I think that the proposed new HTML 5 spec is something that’s been long needed. I think the trouble though, is that a lot of people are hung up on how it will display in a browser, without realising the huge semantic benefits to be had.

    As soon as search engines start to look for HTML 5 tags in web pages, then anyone who doesn’t use them is likely to fall behind. The spec is designed to fail gracefully in non-compliant browsers (read Internet Explorer) and there are countless tutorials out there to help write code that can work in both old and new browsers alike.

    This new spec not only brings possible benefits for SEO, but accessibility also. I for one will be making a start on implementing this in my work.

  11. Michael Persson

    This is very interesting information. I am starting to develop sites in HTML5, having spent 5 years in XHTML 1.0. If it is better to use HTML5 than XHTML, I am eager to test it and make good examples of new ways of doing even better SEO than before…

  12. SEO??

    This is very interesting information. HTML5 is not that popular in China yet, but in order to take advantage of it for SEO, we need to know and implement the new coding style in new sites.

  13. Ryan

    I just hope all the browsers will be consistent with how they handle the padding and margins for these new page elements. Or will this just add to the tags at the top of the css file that have to be zeroed out?

  14. Jim Rudnick

    Interesting piece here @Richard….and I value the links you’ve added to the piece too…much more to investigate, eh!

    As a guy who’s used and CSS for more’n 15 years, I like this new approach to design — at least theoretically….BUT — implementation by all the browsers is yet to be seen…and that has me somewhat worried….
    :-)

    Jim

  15. Jackson Lo

    The most important thing is seeing consistency between all browsers. It doesn’t look like a dramatic change on the SEO side of things, but it will probably take time to get used to.