ConcretePageNames

ThoughtStorms Wiki

Some notes I made a few months ago when writing about wiki's FlatNameSpace. It's also a sort of rework of ideas from SemanticsOfProgrammingEnvironments, DoesAbstractionScale etc. and close to WikiIsLikeLanguage.

I believe that one of the reasons for the difficulty of building software is that most programmers are trained to think in terms of decoupling the concrete data from the logical representation. {1}

Often the TradeOff is good. Consider how much more efficient it is to use style-sheets in your word-processor. But sometimes it's worth having another look at this bargain. And this is what the original wiki made us do.

In wiki, the pages could have been identified by id numbers, with the text-title and link-text arbitrarily mapped onto numbers by the software. Instead it chose to be ruthlessly concrete: the plaintext name is the unique id of the page.

What did we lose? The ability to arbitrarily rename a page. To change HelloWorld into HelloToTheWorld and have nothing break. But we also gained. We gained quickness of link making. Quickness is always the primary Wiki virtue. And here it seems to have paid off. No-one needs to think about an id number, or to think about making that explicit mapping between id and plaintext. The constraint makes this an automatic "no-brainer".
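A rough sketch of the two designs, to make the bargain concrete. Everything here is made up for illustration (the dictionaries, the `[[2]]` link syntax for the id-based variant, the function names); it isn't how any particular wiki engine stores its pages.

```python
# Design A (hypothetical): abstract ids. Titles are mapped onto numbers
# by the software, and links refer to the number.
id_pages = {1: "Greetings, see [[2]]", 2: "Some other page"}
id_titles = {1: "HelloWorld", 2: "OtherPage"}


def rename_by_id(titles, page_id, new_title):
    """Renaming touches one mapping entry; links by id keep working."""
    titles[page_id] = new_title


# Design B (hypothetical): concrete names. The plaintext name is the key,
# and a link is just the name appearing in the text.
name_pages = {"HelloWorld": "Greetings", "OtherPage": "See HelloWorld"}


def rename_by_name(pages, old, new):
    """Move the page to its new key and report pages whose text still
    mentions the old name, i.e. the links that now dangle."""
    pages[new] = pages.pop(old)
    return [name for name, text in pages.items() if old in text.split()]
```

In design A the rename is free but every link needs a lookup from title to number. In design B writing the link costs nothing, and the rename leaves OtherPage pointing at a name that no longer exists.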

Secondly, we gained serendipity. You don't have to make a link to an existing page. Or even to know that a page exists. Just make the link and maybe there's a page at the end.
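A minimal sketch of that serendipity, assuming a simplified CamelCase rule (two or more capitalised word-parts run together) and a plain dictionary of pages. The regular expression and the function names are my own assumptions, not the rule of any real wiki engine.

```python
import re

# Assumed, simplified WikiWord rule: at least two capitalised parts.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


def find_links(text):
    """Return every WikiWord in the text, in order of appearance."""
    return WIKI_WORD.findall(text)


def resolve(pages, text):
    """Map each WikiWord in the text to whether a page of that name exists."""
    return {name: name in pages for name in find_links(text)}


pages = {"RicePudding": "A dessert."}
status = resolve(pages, "HelloWorld mentions RicePudding in passing.")
```

The writer never looked anything up. RicePudding happens to exist, so the link lands; HelloWorld doesn't yet, so it is an open invitation to write the page.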

But how do you know that this is a good form of words for a page name? Because the form of words is specified by something outside the scope of the program : the semantics of our normal language use.

This also explains why the trade-off is a good deal. Titles are not arbitrarily mapped to page contents or link text. There's already a semantic connection. It wouldn't make much sense to want to rename HelloWorld to RicePudding. If HelloWorld starts as being a reasonable summary of that page, then RicePudding and millions of other names and phrases are unlikely to be. As to the plausible transformations such as HelloWorld to HelloToTheWorld, these are often so trivial that the change is not particularly urgent.

So arbitrarily renaming pages is not something you need so often.

Yes, there are times when you do want to rename a page. And if it becomes urgent, you can hack an acceptablish substitute using #REDIRECT and, later, manually tweak the pages on a piecemeal basis. But that's the lesser demand. Not so important that it's worth paying the other cost (the higher cognitive-load of link-making) that would be needed to make it easier. {2}
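The #REDIRECT hack might look something like this. It's a sketch under assumptions: that a redirect page consists of the marker followed by the target name, and that the engine follows a bounded number of hops. The `fetch` function and the `max_hops` limit are hypothetical.

```python
REDIRECT = "#REDIRECT "  # assumed marker format: "#REDIRECT TargetName"


def fetch(pages, name, max_hops=5):
    """Follow redirect pages until real content (or a missing page) is
    reached. Returns (final_name, text); text is None if the page is missing."""
    for _ in range(max_hops):
        text = pages.get(name)
        if text is None or not text.startswith(REDIRECT):
            return name, text
        name = text[len(REDIRECT):].strip()
    raise RuntimeError("too many redirects")


pages = {
    "HelloWorld": "#REDIRECT HelloToTheWorld",  # old name left behind as a stub
    "HelloToTheWorld": "Greetings",
}
final_name, text = fetch(pages, "HelloWorld")
```

Old links to HelloWorld keep working through the stub, and the pages that mention it can be tweaked by hand whenever someone gets round to it.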

{1} See also JaronLanier for another theory of why big software is hard.

{2} Finding acceptable trade-offs like this is a difficult art. No one is really responsible for thinking about the long term adaptation between software and users. Or the effects of cumulative small-costs.

Software engineers aren't trained to do this; they expect the customer to know what she wants. Customers don't understand, and dislike, the notion that they can't have everything. Even when they do admit this, they are guided by not very accurate intuitions or by what the software guys tell them. The HCI and usability designers, who maybe ought to worry about this, don't have the leisure to do long term studies and are normally forced to work on fixing the obvious flaws encountered by new users on their first couple of tries of the software.

The only progress is through the slow build-up of conventions throughout SoftwareHistory. The invention of the button-bar seems to work, so everyone starts to adopt it. New users, using new software, understand and expect it, so it remains. It survives because it's good. But even when it isn't, the conservatism of users, the lock-in due to wide understanding, keeps it in place. (Although maybe we can use PatternLanguageForTheSocialNetwork to document good vernacular.)

Comment I made responding : http://alevin.com/weblog/archives/001620.html#001620,

here :

http://alevin.com/cgi-bin/mt-comments.cgi?entry_id=1620

on PurpleNumbers

Quite agree.

The real difference is this : wiki pages have names whereas purple numbers are, well, numbers.

PageNames are, to a certain extent, bound to their semantics by the conventions of the wider language use. The chances are I can't arbitrarily rename my HelloWorld page to be RicePudding because the contents are unlikely to be equal.

Numbers, OTOH, are completely abstract. They are only bound to a content, by the software. It's perfectly reasonable to change which content sits at which number.

This is also why wiki-like things where pages are named "node27" etc. don't work. And why weblogs, which haven't had semantically attached labels, have actually been quite clumsy to search. (Maybe tagging is fixing this.)

ChrisDent responds here : http://www.burningchrome.com/~cdent/mt/archives/000387.html

And me to him : http://blahsploitation.blogspot.com/2005/05/chris-dent-has-posted-purple-response.html

ManuelSimoni :

solved http://manuel.typepad.com/manuel/2007/02/wiki_naming_sol.html
a problem http://manuel.typepad.com/manuel/2007/02/titles_slugs_id.html
(maybe)

WardCunningham : "ILoveAGoodNamespaceCollision"