Atom Microformat
Posted Jun 29, 2006 in XML 2
XML syndication feeds, whether RSS or Atom, are a brilliant idea with an unwieldy implementation. Whether it’s done manually or through one’s own authoring tool, putting the syndicatable content into a separate file for parsing by a feed-reading tool is an overly complicated extra step in the publishing process. Why not have all the necessary data and metadata embedded in the single definitive source for a document: the HTML page itself?
The problem is that HTML and XHTML as of version 1.1 lack the native elements to describe a syndication feed. XHTML 2.0 should have the elements and attributes to do it easily. But XHTML 2.0 is a long way off yet. Once again, a microformat has stepped into the breach.
The hAtom microformat embeds Atom metadata into existing, well-worn XHTML constructs. It’s modeled on the ad hoc design patterns used by many existing blogs.
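To make that concrete, here is a rough sketch of what a marked-up entry might look like and how a script could read it back out. The class names (hfeed, hentry, entry-title, published, entry-content) and rel="bookmark" are taken from the hAtom draft; the page fragment, the URL, the time of day in the timestamp, and the helper function names are all made up for illustration, and the parser assumes the page is well-formed XHTML without a namespace declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment. Only the class names and rel="bookmark"
# come from the hAtom draft; the URL and timestamp are placeholders.
PAGE = """
<div class="hfeed">
  <div class="hentry">
    <h2 class="entry-title">
      <a rel="bookmark" href="/2006/06/atom-microformat">Atom Microformat</a>
    </h2>
    <abbr class="published" title="2006-06-29T00:00:00Z">Jun 29, 2006</abbr>
    <div class="entry-content"><p>XML syndication feeds are a brilliant idea.</p></div>
  </div>
</div>
"""


def has_class(element, name):
    """True if `name` is one of the space-separated class values."""
    return name in element.get("class", "").split()


def find_by_class(root, name):
    """All descendants of `root` (and root itself) carrying the class."""
    return [el for el in root.iter() if has_class(el, name)]


def text_of(element):
    """Concatenated text content of an element, trimmed."""
    return "".join(element.itertext()).strip()


def read_entries(xhtml):
    """Collect title, date, and permalink for each hentry in the page."""
    root = ET.fromstring(xhtml)
    entries = []
    for entry in find_by_class(root, "hentry"):
        titles = find_by_class(entry, "entry-title")
        dates = find_by_class(entry, "published")
        links = [a for a in entry.iter("a")
                 if "bookmark" in a.get("rel", "").split()]
        entries.append({
            "title": text_of(titles[0]) if titles else "",
            "published": dates[0].get("title", "") if dates else "",
            "link": links[0].get("href", "") if links else "",
        })
    return entries


if __name__ == "__main__":
    for item in read_entries(PAGE):
        print(item)
```

A real page would need a more forgiving parser than ElementTree, since most pages in the wild are not well-formed XML.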
One area where it might have a lot of potential is in the creation of static XHTML pages. Currently, attempting to create a syndication feed for static content is a nightmare of hand-editing and duplication of effort. With hAtom, an author can write and mark up once, and publish in a format that is semantically rich and very readily transformable.
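As an illustration of "readily transformable", the sketch below turns a list of entries, such as might be pulled out of an hAtom page, into a small Atom document. It is only a sketch: the function name to_atom, the dictionary keys, and the example.org addresses are assumptions of mine, and it emits just a handful of Atom elements (title, id, updated, link) rather than a complete, validated feed.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def atom(tag):
    """Qualify a tag name with the Atom namespace."""
    return "{%s}%s" % (ATOM_NS, tag)


def to_atom(feed_title, feed_id, feed_updated, entries):
    """Build a minimal Atom feed from dicts with title, link, updated.

    The dict keys are hypothetical; they stand in for whatever an
    hAtom reader extracted from the page.
    """
    # Serialize Atom as the default namespace instead of an ns0: prefix.
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element(atom("feed"))
    ET.SubElement(feed, atom("title")).text = feed_title
    ET.SubElement(feed, atom("id")).text = feed_id
    ET.SubElement(feed, atom("updated")).text = feed_updated
    for item in entries:
        entry = ET.SubElement(feed, atom("entry"))
        ET.SubElement(entry, atom("title")).text = item["title"]
        # Using the permalink as the entry id is a simplification.
        ET.SubElement(entry, atom("id")).text = item["link"]
        ET.SubElement(entry, atom("updated")).text = item["updated"]
        ET.SubElement(entry, atom("link"), {"href": item["link"]})
    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    # Placeholder data; the URL and time of day are invented.
    print(to_atom(
        "Example blog",
        "http://example.org/",
        "2006-06-29T00:00:00Z",
        [{
            "title": "Atom Microformat",
            "link": "http://example.org/atom-microformat",
            "updated": "2006-06-29T00:00:00Z",
        }],
    ))
```

The point is that once the page itself carries the feed data, producing a conventional feed for older readers becomes a mechanical step rather than a second round of hand-editing.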
Update:
I reproduce below the comments that were lost when the database went ka-boom!
Colin Morris wrote:
I have to say I’m not that sure about the microformat for syndication as part of the usefulness of the external-file format is that you don’t have to download all the guff from the page and then parse it. Navigation, titles, etc. are wasted bandwidth for that application. You don’t really need microformats to syndicate a blog page as you can have a computer make best guesses about page structure (assuming you’ve used semantic markup of course, but if you’re going to microformat a page I’d assume you’re also able to semanticise the page). Atom/ RSS as external files have only the data you want/ need to download.
Nick Caldwell wrote:
Hi, Colin! I’ll try and take your points in turn.
part of the usefulness of the external-file format is that you don’t have to download all the guff from the page and then parse it [...]
I’d assume that an hAtom feed reader would simply grab the XHTML structure itself and parse that, ignoring CSS links and so forth.
Navigation, titles, etc. are wasted bandwidth for that application.
How much wasted bandwidth are we talking here, though? A few bytes? Compared to the work that feed readers do to parse and format busted RSS feeds (the majority), it’s surely trivial.
assuming you’ve used semantic markup of course, but if you’re going to microformat a page I’d assume you’re also able to semanticise the page
I think hAtom has the nice effect of enriching the semantics of an already well-marked-up document. And adopting it as a markup format should enhance the semantics of an otherwise poorly marked-up page.
Atom/ RSS as external files have only the data you want/ need to download.
Hmmm… My feeling is that you already have separation of presentation, content/structure, and behaviour with CSS, XHTML, and ECMAScript. Why split off content/structure into two separate content/structure containers?