Robert Hahn

inspired by integration

I'm always interested in infrastructure that brings people together and facilitates communication. I'm currently exploring social software, markup & scripting languages, and abstract games.

noted on Fri, 12 Dec 2003

What is a Web Site?

Furthering the discussion involving Tim Bray (where I picked it up), Dave Winer, Sam Ruby, Jeremy Zawodny, and Joe Gregorio, I started picking at the issue Tim brought up:

The problem is that all the Web knows about is URIs, and the Web can’t tell whether a URI points to a home page, a picture of a cute cat, or to one of a dozen daily entries on some blog… And I bet, down the road, once we really have the notion of a site, we’ll be able to think of all sorts of other useful things to do with it.

This series of posts builds on the thinking Tim puts out, so I recommend you look at his article before continuing.

In the following series of posts, I try to map out a model of what a web site is, break it down into its atomic pieces, and determine whether there’s a way to represent the data in a machine- and human-readable format. Some of these articles, like the “WSDF as RSS/WSDF as Atom” article, can safely be skipped if you’re pressed for time.

If you read through most or all of this, let me use this post to thank you (I wasn’t sure where else to put it). If you decide there’s merit enough in this proposal to implement a WSDF file for your site, please drop me a line. If you have constructive criticism related to anything here, please let me know. I’m convinced that the ideas outlined here will work, and I’d like to see where they go.

Update: added a warning to the WSDF as XHTML page not to download the wsdf file directly.


Once I started looking at the possibilities of using XHTML, I realized that I could change some of the properties. Let’s review the list:

  1. Each file describes the current ‘section’.
  2. If other sections exist, a link is provided to their WSDF.
  3. If it’s not a section, an entry is added to the current WSDF. This entry has the following properties:
    1. a URI to the asset
    2. an indication of what it is (possibly by mime-type, or a consistent use of terms)
    3. a description of what it is
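
To make the original list concrete, here is a minimal sketch of that per-section model as plain Python data. The class and field names (`Entry`, `SectionFile`, `section_links`) and the example.org URIs are my own assumptions for illustration; nothing in WSDF prescribes them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    """One non-section asset listed in a WSDF file (hypothetical names)."""
    uri: str          # a URI to the asset
    kind: str         # what it is, e.g. a MIME type
    description: str  # a human-readable description


@dataclass
class SectionFile:
    """One WSDF file, which describes the current section only."""
    title: str
    entries: List[Entry] = field(default_factory=list)
    # URIs of the WSDF files that describe other sections
    section_links: List[str] = field(default_factory=list)


# Placeholder data: the URIs below are invented for the example.
blog = SectionFile(
    title="Blog",
    entries=[
        Entry(
            uri="http://example.org/blog/what-is-a-web-site.html",
            kind="text/html",
            description="What is a Web Site?",
        )
    ],
    section_links=["http://example.org/photos/wsdf.xml"],
)
```

Under this model, every section carries its own file plus links out to the others, which is the overhead the rewritten requirements remove.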

I’ll show you the syntax in a bit, but what I realized was that the first three requirements could be rewritten as follows:

  1. One file can be used to describe an entire site
  2. Each section can be represented with a tag
  3. If it’s not a section, use another tag to describe what kind of resource it is.

As an XHTML file, there’s obviously going to be the usual required overhead: <html /> tags, <head /> tags, <title /> tags, and <body /> tags. Inside the <body /> tags, we can describe the structure of a web site using only two tags: the <div /> tag and the <object /> tag. An example is provided here. Note: since this is an HTML file that links in almost everything on the site, you might want to “Save to disk” instead of clicking it directly.

A section can be indicated with <div /> tags, and the title of that section can be reflected in the title attribute. Sections may be nested within each other to create sub-sections or sub-categories. This nesting can be as deep as required to describe your site.

The <object /> tags are wonderfully rich, and so are perfect for describing each asset that makes up a web site. The data attribute contains the URI for the asset in question, and the type attribute provides an intelligent way of discerning what kind of asset it is. Between the open and close tags can lie any content you care to use to provide a human-readable description of what that object describes. Even more interesting is that there’s nothing keeping you from putting richer markup in this area. For example, a link can be provided to a site as a citation for the source of the information represented in that asset.
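
As a rough illustration of how a program might read such a file, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The embedded sample document, its paths, and the helper names (`walk`, `list_assets`) are assumptions made up for this example. It is not the actual WSDF file linked above.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A made-up WSDF-style document: <div /> marks a section, <object /> an asset.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example site</title></head>
  <body>
    <div title="Blog">
      <object data="/blog/index.html" type="text/html">Blog home page</object>
      <div title="2003">
        <object data="/blog/2003/cat.jpg" type="image/jpeg">A picture of a cute cat,
          from <a href="http://example.org/">example.org</a></object>
      </div>
    </div>
  </body>
</html>"""


def walk(element, path=()):
    """Yield (section_path, uri, mime_type, description) for each <object />."""
    for child in element:
        if child.tag == f"{{{XHTML_NS}}}div":
            # Descend into the nested section, extending the path with its title.
            yield from walk(child, path + (child.get("title", ""),))
        elif child.tag == f"{{{XHTML_NS}}}object":
            # Flatten any richer markup inside the object into plain text.
            description = " ".join("".join(child.itertext()).split())
            yield path, child.get("data"), child.get("type"), description


def list_assets(xhtml_text):
    """Parse the XHTML and return every asset found under <body />."""
    root = ET.fromstring(xhtml_text)
    body = root.find(f"{{{XHTML_NS}}}body")
    return list(walk(body))


for section, uri, mime, text in list_assets(SAMPLE):
    print(" / ".join(section), uri, mime, text, sep=" | ")
```

Because the sections are just nested <div /> tags, a recursive walk is enough to recover the whole site outline from one file, and the type attribute lets a consumer filter assets without fetching them.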

After looking at implementations of the concepts in RSS, Atom, and XHTML, I decided that the best fit was XHTML. Let me outline the thinking in this section.
