Robert Hahn

inspired by integration

I'm always interested in infrastructure that brings people together and facilitates communication. I'm currently exploring social software, markup & scripting languages, and abstract games.

noted on Wed, 25 Feb 2004

RSS Feeds on This Site

I was just going through my access logs when I saw a request for an article in RSS format. Unsurprisingly (to me, anyway) that request 404’ed.

I only have two RSS feeds set up under /rhahn that you can subscribe to. This one is for my Inspired by Integration site, and this one is for the Handbook for Primitive Living web site that I’m helping to build with a friend of mine.

If you have a special request for a particular article feed, or you would like the ability to view all articles in RSS, please contact me and we’ll discuss.

At any rate, while I have your attention, I’d like to take a moment to thank you for reading these posts. I’m always willing to hear what you have to say, so please feel free to drop me a line.

Update: An alert reader with a newsreader was kind enough to point out that my RSS feed consisted of links to non-existent RSS feeds instead of HTML. The fix was really simple, fortunately. To those subscribed, I do apologize for the inconvenience.

noted on Sat, 21 Feb 2004

Conferences: Why bother, really?

So I was reading through Justin Hall’s reflection on the 2004 O’Reilly Emerging Technology Conference, and maybe it’s just me, but I can’t help but wonder: Is there any point to holding conferences in person anymore?

It seems to me that the amount of information generated around each presentation is far greater than what is in the presentation itself. If that’s the case, why not create a special web site for the purpose?

Here’s what I envision: A conference website is launched with a call for papers. There is a rigorous enough screening that the organizers are confident that the person doing the presentation is who they say they are. On the day of the presentation, that author’s work is added to the web site along with an IRC channel for the backtalk and questions, and a wiki that can only be edited by the author.

Visitors to the site, during a pre-set time slot, can read through the document and pop questions on IRC with other participants or the author. The author, if he wants to, has the chance to modify his document on the spot to take the questions into account. Even better, the author can annotate his presentation on the wiki to help answer the questions that come up. If the author can’t easily work in this format, then an assistant can transcribe what the author wants to say in the document.

Let’s now talk a bit about the business model for running an online conference like this. Obviously, setting up such a site saves a lot of the expenses a company would otherwise incur in hosting a conference. So, once this infrastructure is nailed down, it becomes a lot less expensive to run such a conference.

But the conference holders still need to make money. That’s why they put them on. It’s for profit. Where can a potential host make money? I think that web site ads would actually work very well here. Conference swag can also be sold through the site. Finally, the conference hosts could sell a limited number of anonymous registrations. I’ll explain that in a moment.

Where a host should not make money is by charging the participants an admission fee. If the point of these conferences is to share ideas, then nothing should be done to impair that sharing. But that doesn’t mean the hosts should be stupid about this. Anybody can participate, but they must not be allowed to be anonymous or to use a pseudonym. All potential participants must register before they are permitted to contribute; the registration must be structured in such a way as to ensure that the host can reasonably prove that the person who registers is who they say they are. This restriction should not be a problem for most people wanting to participate. They don’t go around with pseudonyms at real conferences, so why should the online version be any different?

However, there are, I’m sure, some people who feel that they must be anonymous for some reason. And since this is a feature that they hold to be valuable, the conference can sell them anonymous accounts. These accounts will be held to the same standards as the others. If anything, I would imagine that any trollish behaviour from these accounts would make them more likely to be cut off.

So what do you think? Is this workable? Are there details that need filling in? Would it work? Please let me know.

Net Services for Grocery Stores

My wife and I do not get the paper; ergo, we do not get flyers for our local grocery stores. Since we want to shop more price-consciously, the flyers would be nice pieces of information to have.

“Why don’t you check online?” I asked. “Too much bother,” she said, so we now have a routine where, once a week, we pick up flyers from the stores we want to shop at, and plan our trips accordingly.

So I got it into my head to go check these sites myself. I was thinking that maybe I could write a script that would grab the info and package it together in a nice but privately accessible page, and then my wife would have exactly what she wanted in one convenient package. More, once I had that data, I could slice and dice it with our shopping list to help her pick products that were on sale that week.

The two stores we tend to shop at are Zehrs and Food Basics, mostly because they’re the two closest to where we live. I am not linking them here because they do not deserve it. Google “Zehrs markets” and “Food Basics” if you’re interested.

Let’s talk about Food Basics first. What they did was... annoying, but I can understand where they’re coming from. Their online version of the weekly flyer is basically 7 jpgs on 7 pages. Not exactly scrapeable information, but it would be possible to at least bookmark the first page, and the images themselves seem to have predictable URIs.
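
If the URIs really are predictable, grabbing a local copy could be as simple as this sketch. The host and file names here are placeholders (I’m not linking them, remember), and the real pattern would need to be confirmed first:

#!/usr/bin/perl
# Sketch: fetch the week's flyer pages, assuming a predictable URI
# pattern. The host and path here are placeholders, not the real ones.
use strict;
use LWP::Simple qw(getstore is_success);

my $base = 'http://flyers.example.com/weekly';
for my $page (1 .. 7) {
    my $uri    = "$base/page$page.jpg";
    my $status = getstore($uri, "flyer-page$page.jpg");
    warn "couldn't fetch $uri (status $status)\n" unless is_success($status);
}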

Zehrs, now, is another thing altogether. First of all, the site’s in a frameset (which isn’t a cardinal sin in my book if it’s used properly, and it almost always isn’t), so the URI is masked from view. Selecting their ‘online flyer’ link took me to a city/store selector, which in turn brings up the flyer. Great. Let’s view this frame. Uh-oh. The URI is completely opaque. After stripping off the domain name, here’s what it looks like:

/eflyer/pages/PAGEweek8_A1_A2_A3E001V2004I008P001.asp

Cute, isn’t it? Basically, I can’t bookmark a single URI that would always take me to the first page of their flyer. I can infer that I’m looking at page 1 (the P001 part of the file name) and I can figure out that I’m on week 8 of the year, and I doubt that 2004 would represent anything BUT the year. I could look at it for a few weeks to infer the rest of the pattern, but I’m not done talking about why the Zehrs experience bugs me.

Their flyer, like the Food Basics one, is also a set of images... coincidentally, each image is stored in the same directory structure, with the same name, except that it starts with IMAGE instead of PAGE and ends in .jpg instead of .asp. On that count alone I’d be no more annoyed at Zehrs than at Food Basics, but combined with the opaque URI, Zehrs comes off worse.
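
For what it’s worth, if the pattern holds, guessing both URIs for a given week would only take a few lines. This sketch is built on a single sample, so the week and issue fields are pure inference and could easily be wrong:

# Sketch: guess this week's Zehrs flyer URIs from the apparent pattern.
# 'week8' and 'I008' look like the week number and '2004' like the
# year, but that's inferred from one sample.
use strict;
use POSIX qw(strftime);

my $week = strftime('%W', localtime) + 0;  # may be off from their numbering
my $year = strftime('%Y', localtime);

my $page_uri = sprintf '/eflyer/pages/PAGEweek%d_A1_A2_A3E001V%sI%03dP%03d.asp',
                       $week, $year, $week, 1;
(my $image_uri = $page_uri) =~ s/PAGE/IMAGE/;
$image_uri =~ s/\.asp$/.jpg/;

print "$page_uri\n$image_uri\n";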

But get this: there’s a feature where, if you mouse over certain products on each page, you get a layer containing the flyer text for that item. That’s good, right? That’s scrapeable, right? Well, probably, but not easily. See, I view-sourced the file to see what they had, and instead of finding nice <div />’s with the copy, I found something that looks like this:

%4B%52%41%46%54%20%50%45%41%4E%55%54%20%42%55%54%54%45%52%20

That’s right, dear readers, they hex-encoded all the characters that would make up their specials. More, they wrote this fairly impressive decoder right in the file. Heaven’s pity, but why? Why bother?
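
Undoing it is trivial, at least: each %XX escape is just a character code in hex, so one substitution recovers the copy. A sketch:

# Sketch: decode the hex-escaped flyer copy (it's the same trick as
# URI unescaping, so URI::Escape's uri_unescape would also do it).
use strict;

my $encoded = '%4B%52%41%46%54%20%50%45%41%4E%55%54%20%42%55%54%54%45%52%20';
(my $plain = $encoded) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
print "$plain\n";   # prints: KRAFT PEANUT BUTTER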

Both these stores had a (to my mind) fantastic means of creating brand loyalty: offer the data transparently enough that anyone could conceivably shuffle it in with their own personal data (like, in this case, a shopping list). Both stores could have created an API (like Amazon & Google have) for their specials. If the idea took off, they could then reduce the number of flyers they’d need to print for their offline audience.

What can I say? Guess I’ll continue to pick up flyers from the stores. I don’t have that much free time...

noted on Mon, 09 Feb 2004

Using Google to Create Comment Threads

I’m really happy that Bob DuCharme wrote his article on creating backlinks, because I have been struggling to write an article using the same technique to achieve a different end. What Bob wanted to do was find a way to work around the web’s fundamental restriction on linking: that it be one way. What he proposed was an easy hack: when he writes an article, he publishes, as part of the article, a link to Google that uses the article’s URI as the search term. If someone else wanted to write an article that contributed to the discussion he started, then they too could add the exact same Google link. With both sites so connected, a visitor who saw the Google link on one site could hit it, read the other article, hit that article’s Google link, and have a way to ‘go back’ to the first article, albeit in a two-step process.

I was surprised that Bob didn’t pick up on the other possibility inherent in this kind of linking. That is what I want to discuss here.

I would like to propose that if any collection of weblog posts related in content were to publish these kinds of Google links, then the search results begin to take on a new kind of significance. The results would illustrate a discussion thread. Let’s take an example, using Alice, Bob, and Carly.

Alice decides to write a post about a recipe, and publishes it to alice.ca/myrecipe.html. Alice understands the idea of using Google to store discussion threads, so she publishes a link to Google using link:alice.ca/myrecipe.html as the search term.
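
In HTML, that link on Alice’s page might look something like this (using the URIs from the example; Google’s link: operator searches for pages that link to the given URI):

<a href="http://www.google.com/search?q=link:alice.ca/myrecipe.html">Google Thread</a>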

Bob, also a culinary artist, sees Alice’s article and clicks on the Google link. No surprise: the search results contain a link to Alice’s article. Bob then decides to write a followup, possibly to suggest an alternate method of preparation for Alice’s recipe. He links to Alice’s post, and he also creates his own link to Google, using link:bob.net/alicesrecipe.html as the search term.

Alice decides to check on her article on her web site, and clicks on that Google link she made. She’s delighted that there are now two results: one for her own site, and a new link to Bob’s. She visits Bob’s page, reads his article, and clicks on his Google link. Alas, the discussion seems to have ended there for the time being.

Carly, who’s a fan of anything Bob writes, then goes to Bob’s site and discovers the post talking about Alice’s new recipe. Carly spots some technical errors in Bob’s post and decides to write a post of her own suggesting corrections. She includes a link to Bob’s page, and may likewise add her own Google link.

If Alice were then to check on her Google thread link, nothing would change, but if she were to follow Bob’s Google link, she would discover that the conversation had in fact continued on.

This example illustrates several aspects of using Google to show how a discussion evolves.

The first aspect to note is that anyone using the links will not be able to get a bird’s eye view of the entire thread. This is only a real drawback if you’re trying to determine which post was the one that started it all. However, I think there’s a way to make this a bit easier. After checking on a set of Google search results, I was surprised to see that some of the search results had a date tucked in beside the page size. Curious to see how this was done, I view-sourced the link, and found out they were using a meta tag. This is what it looked like:

<meta name="Date" content="2004-01-15" />

My theory is that if web log software made a habit of a) making permalinks to pages with one article on them, and b) adding this date tag, then it would be very possible to trace a series of comments to the oldest one.

If you’re going to explore a conversational thread, then you must follow a page-to-Google-to-page-to-Google pattern of browsing. Not exactly ideal from a usability perspective, but it may well be worth the tradeoff if you’re weighing this approach to creating discussion threads against allowing comments (and therefore comment spam).

There ends up being very little overhead. You don’t have to manage or maintain the thread; Google takes care of that for you. As long as people create their links to Google using their article’s URI as the search term, and also link to the source article they’re commenting on, it’s possible to follow a thread to each and every article.

Things get even more interesting when you play around with some what-if scenarios. What if, for example, Bob not only created the Google link for his own URI, but also added another Google link containing Alice’s article URI? Now you’ve made it convenient for anyone visiting Bob’s page to see both comments to his page and comments to Alice’s page without directly going to Alice’s page first.
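
In markup terms, Bob’s page might then carry two links along these lines (again, the URIs are just the ones from our example):

<a href="http://www.google.com/search?q=link:bob.net/alicesrecipe.html">comments on this post</a>
<a href="http://www.google.com/search?q=link:alice.ca/myrecipe.html">comments on Alice’s post</a>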

Another what-if: What if Alice and Bob were savvy web developers who got their Google API keys and decided to create an application that displayed the search results for their own URIs? Now it’s possible to see, right on their own pages, the top 10 comments on their own articles!
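
A sketch of what that might look like in Perl with SOAP::Lite follows. The key is a placeholder, and I’m working from memory of the Google Web APIs here, so treat the details as assumptions rather than working code:

# Sketch: fetch the top 10 'comments' on an article through the
# Google Web APIs (SOAP). The key is a placeholder; use your own.
use strict;
use SOAP::Lite;

my $key    = 'YOUR-GOOGLE-API-KEY';
my $google = SOAP::Lite->service('http://api.google.com/GoogleSearch.wsdl');
my $result = $google->doGoogleSearch(
    $key, 'link:alice.ca/myrecipe.html',   # the article URI as search term
    0, 10, 'false', '', 'false', '', 'latin1', 'latin1');

for my $hit (@{ $result->{resultElements} || [] }) {
    print "$hit->{title}\n  $hit->{URL}\n";
}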

So: I’m going to put my money where my mouth is, and I’ve already adjusted my templates to include a Google Thread link at the top of every page. I’ve also written and deployed a Blosxom plug-in to add the <meta /> date tag for individual entries. In time, I’ll also write a new plug-in that will augment all links I’m citing with an extra link going to Google, so that you can see other posts talking about the article I’m commenting on. The trick, of course, is in the implementation, not the functionality.

Unfortunately, I’m not (yet?) on the who’s who list of online personalities to watch, so almost all my Google Threads will come up zilch, but I hope you’ll realize that it’s not a failure of the idea. If you want to see it in action, then please comment on something I wrote! :)

Plug-in: meta_date

The inspiration for this plug-in came from a Google search result that turned up an HTML archive of a w3c mailing list. That particular result was interesting because the date of the file was displayed right beside its size. I thought that if anyone wanted to check how ‘new’ a particular page was, they would merely need to see that information.

This particular plug-in will work fine for static and dynamic Blosxom blogs alike, but only if permalinks refer to one specific post rather than to a page containing a series of posts, like a date archive.
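
To give you an idea of the shape of it, here’s a rough sketch of the approach (an illustration, not the downloadable plug-in itself, and the path handling is an assumption): expose a variable that the head flavour template can interpolate, filled in from the entry file’s modification time.

# meta_date (sketch): put $meta_date::tag in your head flavour template.
# This illustrates the approach; it is not the actual plug-in.
package meta_date;
use strict;

our $tag = '';

sub start { 1 }

sub head {
    my ($pkg, $path, $head_ref) = @_;
    $tag = '';
    # Map the request back to the entry file and use its mtime
    (my $file = "$blosxom::datadir/$path") =~
        s/\.\w+$/.$blosxom::file_extension/;
    if (my $mtime = $blosxom::files{$file}) {
        my @t = localtime($mtime);
        $tag = sprintf '<meta name="Date" content="%04d-%02d-%02d" />',
                       $t[5] + 1900, $t[4] + 1, $t[3];
    }
    1;
}

1;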

While the plug-in works, the benefits of adding a date are still a bit nebulous in my mind, and a few questions keep coming up for me.

Mind you, when I say that the benefits of this plug-in are nebulous, it’s not to say that I don’t know what it would be good for, but rather that I’m not sure what good it will do. You may wish to read my post on Using Google to Create Comment Threads to see what use I’d put the date tag to.

Download the plug-in here. It requires Blosxom 2.0.
