Ever since the web was born, it’s been constantly changing. It started off as a text-based medium for academics. Now it’s a global media powerhouse, creating entire industries. Yet the foundation of the web hasn’t changed. HTML isn’t even a correct description now, as most of the data is media.
Over the past few years there’s been a shift in the information schemas of the web. Data object notation is becoming the de facto standard; AJAX, config files and data storage are all adopting it. For example, the newer nginx uses JSON-like config files, versus Apache’s older XML-like files. Some bloggers have even claimed XML is dead.
Applying this paradigm to webpages is the next logical step; HTML is still using an ugly dialect of SGML. Hyper Text Data Objects (or HTDO) are better in many ways:
- it’s more readable
- can be built by objects. OOP guys will love this: build up the DOM server side using objects (though you can still spit HTDO out as a string). Imagine adding a class to your input tag object, and they all change!
- reflects the DOM better. Inline tags could be a simple string:string relationship, while block level elements could be arrays of other objects. There are a few exceptions, for example it’s handy having anchors as block level elements.
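To make the "built by objects" point concrete, here is a rough sketch of what it might look like server side. Everything in it is made up for illustration: the `Input` class, its `shared` dictionary and `build_form` are hypothetical names, and HTDO itself has no spec, so the output shape is only one possible guess.

```python
import json


class Input:
    """Hypothetical server-side object for an HTDO input element."""

    # Attributes shared by every Input; changing this dict changes them all.
    shared = {}

    def __init__(self, **attrs):
        self.attrs = attrs

    def to_htdo(self):
        # Per-instance attributes override the shared ones.
        return {"input": {**Input.shared, **self.attrs}}


def build_form():
    fields = [Input(name="email"), Input(name="password", type="password")]
    Input.shared["class"] = "wide"  # one change, applied to every input
    # The objects can still be spat out as a plain HTDO string.
    return json.dumps([field.to_htdo() for field in fields])
```

Calling `build_form()` gives a JSON string in which both inputs carry the `wide` class, even though it was only set once.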
Here’s a mockup of what it could look like:
{
  "title": "webpage",
  "meta": "charset:UTF-8",
  "stylesheet": "styles/style.css",
  "content": {
    "h1": "welcome to my webpage",
    "p": "this is my content",
    "a": {
      "href": "/page2.htdo",
      "title": "click me",
      "link": "the next page"
    }
  }
}
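As a sanity check on the mockup, here is a minimal sketch of how its "content" object could be turned back into ordinary HTML. The `render` function is hypothetical, and it assumes two rules read off the mockup: a string value is the element's text, and inside an anchor the "link" key holds the text while every other key is an attribute.

```python
from html import escape


def render(node: dict) -> str:
    """Render an HTDO "content" object to an HTML string (illustrative only)."""
    parts = []
    for tag, value in node.items():
        if isinstance(value, str):
            # Simple string:string relationship, e.g. "p": "some text".
            parts.append(f"<{tag}>{escape(value)}</{tag}>")
        elif tag == "a":
            # Assumed rule: "link" is the anchor text, the rest are attributes.
            attrs = "".join(
                f' {key}="{escape(val)}"'
                for key, val in value.items()
                if key != "link"
            )
            parts.append(f"<a{attrs}>{escape(value['link'])}</a>")
        else:
            # Any other object is treated as a container of child elements.
            parts.append(f"<{tag}>{render(value)}</{tag}>")
    return "".join(parts)


content = {
    "h1": "welcome to my webpage",
    "p": "this is my content",
    "a": {"href": "/page2.htdo", "title": "click me", "link": "the next page"},
}
html_output = render(content)
```

One thing the sketch makes obvious: JSON object keys should be unique, so two sibling "p" elements can't be written this way. That is where the arrays of objects mentioned above would have to come in.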
Because the DOM can be built both server and client side, it would make for an interesting use of the anchor tag’s target attribute. Imagine if you set #selectorID as the target, it was sent along in a request header, and the server responded with only the HTML you needed? It would keep your URL intact but feel just as snappy as an AJAX request.
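Here is a rough sketch of the server half of that idea. The header name `X-HTDO-Target`, the `"id"` key on objects, and the `respond` and `find_by_id` functions are all assumptions of mine, not part of any existing protocol or framework.

```python
import json

# Hypothetical page; "id" as a key on an element object is an assumption.
PAGE = {
    "title": "webpage",
    "content": {
        "nav": {"id": "menu", "p": "site links"},
        "div": {"id": "article", "p": "this is my content"},
    },
}


def find_by_id(node, wanted):
    """Depth-first search for the first object whose "id" equals wanted."""
    if isinstance(node, dict):
        if node.get("id") == wanted:
            return node
        for child in node.values():
            found = find_by_id(child, wanted)
            if found is not None:
                return found
    return None


def respond(headers: dict) -> str:
    """Return only the requested fragment, or the whole page as a fallback."""
    target = headers.get("X-HTDO-Target", "")  # hypothetical header name
    if target.startswith("#"):
        fragment = find_by_id(PAGE, target[1:])
        if fragment is not None:
            return json.dumps(fragment)
    return json.dumps(PAGE)
```

A request carrying `X-HTDO-Target: #article` would get back just that one object, while a request without the header, or with an unknown id, gets the full page.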
I’d love to see a DOM parser built as an extension in Firefox or Chrome for this experimental format. It might even fit in with Google’s “build a faster web” campaign.
There are a few disadvantages; for example, <em>emphasis or bold</em> tags are more cumbersome. Also, putting IDs and classes on inline elements is a hairy problem if they’re strings only. You could either wrap the element in an object that carries the id, use a CSS selector as the key (“string#IDname”:“string”), or have inline elements as arrays. For purity, I’d rather just wrap the inline element.
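For comparison, the selector-as-key option is the cheapest one to parse. A minimal sketch, assuming a key of the form "tag#id" and a made-up `render_inline` helper:

```python
from html import escape


def render_inline(key: str, text: str) -> str:
    """Render an inline element whose key may carry an id, e.g. "em#lead"."""
    # partition leaves ident empty when the key has no "#".
    tag, _, ident = key.partition("#")
    id_attr = f' id="{escape(ident)}"' if ident else ""
    return f"<{tag}{id_attr}>{escape(text)}</{tag}>"


with_id = render_inline("em#lead", "important")
without_id = render_inline("em", "plain")
```

It works, but it smuggles a second mini-language into the keys, which is part of why wrapping the inline element feels purer to me.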