
Category “XML”

Atom Microformat

Posted Jun 29, 2006 in XML Comments 2

XML syndication feeds, whether RSS or Atom, are a brilliant idea with an unwieldy implementation. Whether it’s done manually or through one’s own authoring tool, putting the syndicatable content into a separate file for parsing by a feed-reading tool is an overly complicated extra step in the publishing process. Why not have all the necessary data and metadata embedded in the single definitive source for a document: the HTML page itself?

The problem is that HTML and XHTML as of version 1.1 lack the native elements to describe a syndication feed. XHTML 2.0 should have the elements and attributes to do it easily. But XHTML 2.0 is a long way off yet. Once again, a microformat has stepped into the breach.

The hAtom microformat embeds Atom metadata into existing, well-worn XHTML constructs. It’s modeled on the ad hoc design patterns used by many existing blogs.
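To make that concrete, here is a rough sketch of what an hAtom-flavoured page fragment might look like and how a reader could pull the entry data back out of it. The class names (hfeed, hentry, entry-title, published, entry-content) and rel="bookmark" come from the hAtom draft; the URL, the timestamp, and the helper function names are made up for illustration, and the code assumes the XHTML is well-formed enough for a plain XML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment. The href and the time-of-day are invented.
XHTML = """
<div class="hfeed">
  <div class="hentry">
    <h2 class="entry-title">
      <a rel="bookmark" href="/2006/06/atom-microformat">Atom Microformat</a>
    </h2>
    <abbr class="published" title="2006-06-29T00:00:00Z">Jun 29, 2006</abbr>
    <div class="entry-content"><p>XML syndication feeds are a brilliant idea.</p></div>
  </div>
</div>
"""


def has_token(el, attr, token):
    """True if the space-separated attribute `attr` contains `token`."""
    return token in el.get(attr, "").split()


def find_by_class(root, name):
    """All elements at or below `root` carrying the given class name."""
    return [el for el in root.iter() if has_token(el, "class", name)]


def text_of(el):
    """Concatenated text content of an element, trimmed."""
    return "".join(el.itertext()).strip()


def parse_entries(xhtml):
    """Extract one dict per hentry (assumes each required part is present)."""
    root = ET.fromstring(xhtml)
    entries = []
    for entry in find_by_class(root, "hentry"):
        link = next(
            (a for a in entry.iter("a") if has_token(a, "rel", "bookmark")), None
        )
        entries.append({
            "title": text_of(find_by_class(entry, "entry-title")[0]),
            # The machine-readable date sits in the abbr's title attribute.
            "published": find_by_class(entry, "published")[0].get("title"),
            "content": text_of(find_by_class(entry, "entry-content")[0]),
            "link": link.get("href") if link is not None else None,
        })
    return entries


entries = parse_entries(XHTML)
```

Real-world pages are rarely this tidy, so an actual reader would need a forgiving HTML parser rather than a strict XML one.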

One area where it might have a lot of potential is in the creation of static XHTML pages. Currently, attempting to create a syndication feed for static content is a nightmare of hand-editing and duplication of effort. With hAtom, an author can write and mark up a page once, and publish it in a format that is semantically rich and very readily transformable.
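As a sketch of what "readily transformable" could mean in practice, the snippet below turns entry data that has already been pulled out of a page into a bare-bones Atom document. The dictionary shape, the function name to_atom, and the example.org URLs are all assumptions for illustration; the output carries the elements Atom requires (title, id, updated) but has not been checked against a validator.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical entry data, as it might come out of an hAtom parser.
ENTRIES = [
    {
        "title": "Atom Microformat",
        "link": "http://example.org/2006/06/atom-microformat",
        "updated": "2006-06-29T00:00:00Z",
        "content": "XML syndication feeds are a brilliant idea.",
    },
]


def atom(tag):
    """Qualified tag name in the Atom namespace."""
    return f"{{{ATOM_NS}}}{tag}"


def to_atom(feed_title, feed_id, entries):
    """Serialise a non-empty list of entry dicts as an Atom feed string."""
    ET.register_namespace("", ATOM_NS)  # emit Atom as the default namespace
    feed = ET.Element(atom("feed"))
    ET.SubElement(feed, atom("title")).text = feed_title
    ET.SubElement(feed, atom("id")).text = feed_id
    # Timestamps in the same UTC format sort correctly as strings,
    # so the newest entry's timestamp becomes the feed's.
    ET.SubElement(feed, atom("updated")).text = max(e["updated"] for e in entries)
    for e in entries:
        entry = ET.SubElement(feed, atom("entry"))
        ET.SubElement(entry, atom("title")).text = e["title"]
        ET.SubElement(entry, atom("id")).text = e["link"]  # permalink doubles as id
        ET.SubElement(entry, atom("updated")).text = e["updated"]
        ET.SubElement(entry, atom("link"), href=e["link"])
        ET.SubElement(entry, atom("content"), type="text").text = e["content"]
    return ET.tostring(feed, encoding="unicode")


atom_xml = to_atom("Example weblog", "http://example.org/", ENTRIES)
```

The point is that the page stays the single source: the separate feed file, if anyone still wants one, becomes a mechanical by-product rather than something to hand-edit.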

Update:

I reproduce below the comments that were lost when the database went ka-boom!

Colin Morris wrote:

I have to say I’m not that sure about the microformat for syndication, as part of the usefulness of the external-file format is that you don’t have to download all the guff from the page and then parse it. Navigation, titles, etc. are wasted bandwidth for that application. You don’t really need microformats to syndicate a blog page, as you can have a computer make best guesses about page structure (assuming you’ve used semantic markup of course, but if you’re going to microformat a page I’d assume you’re also able to semanticise the page). Atom/RSS as external files have only the data you want/need to download.

Nick Caldwell wrote:

Hi, Colin! I’ll try and take your points in turn.

part of the usefulness of the external-file format is that you don’t have to download all the guff from the page and then parse it [...]

I’d assume that an hAtom feed reader would simply grab the XHTML structure itself and parse that, ignoring CSS links and so forth.

Navigation, titles, etc. are wasted bandwidth for that application.

How much wasted bandwidth are we talking here, though? A few bytes? Compared to the work that feed readers do to parse and format busted RSS feeds (the majority), it’s surely trivial.

assuming you’ve used semantic markup of course, but if you’re going to microformat a page I’d assume you’re also able to semanticise the page

I think hAtom has the nice effect of enriching the semantics of an already well-marked-up document. And adopting it as a markup format should enhance the semantics of an otherwise poorly marked-up page.

Atom/RSS as external files have only the data you want/need to download.

Hmmm… My feeling is that you already have separation of presentation, content/structure, and behaviour with CSS, XHTML, and ECMAScript. Why split off content/structure into two separate content/structure containers?

On Open Office

Posted Mar 30, 2004 in XML Comments 0

Tim Bray visits the OpenOffice.org guys.

It turns out that OpenOffice already comes with a doohickey that will produce an XHTML approximation of most documents (Lauren tells me it’s shaky on tables); plus it’s got a nice HTTP library and APIs out the wazoo. Can you see what I’m thinking? There’s no reason this sucker shouldn’t have a “Blog this” button that XHTML-i-fies whatever you’re typing, lets you preview, and then lets you ship it out via one of the existing blogging APIs or the Atom API.

Geeks like me are fine with writing in Emacs, but lots of people seem to like writing in word processors, and as of this week, I think that any word processor without a “Blog This” button is just broken.

I’ve been thinking for a while that OO.o can make a pretty good XHTML editor. More soon.