Draft of a talk I'm giving

Wikis and Weblogs

Introduction

I want to talk about two things that you may have already come across in your use of the web : weblogs and wikis. These can be thought of initially as two kinds of web-site. Or two genres of web-writing. And I will start by describing each of them in this way. After which I will go on to discuss their ramifications.

At the end of the talk there'll be a brief discussion about the similarities and differences between them.

What is a weblog?

A weblog is a site which consists of a number of short pieces of writing, organized by time. It appears as a news site or online diary, with the most recent entry at the top of the page, and earlier entries below.

Typically entries are called "posts", and only a small number of the most recent appear on the main or front page of the site. Earlier posts disappear from the front page, but can still be accessed through permanent links (known as "permalinks").
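
For the technically minded, here is a minimal sketch in Python of that structure: posts ordered by time, a front page showing only the newest few, and a permalink for each. The names (Post, front_page, permalink) and the address pattern are my own assumptions for illustration, not the design of any particular weblog tool.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Post:
    slug: str          # short identifier used in the permalink
    title: str
    body: str
    posted: datetime


def front_page(posts, limit=3):
    """Return the most recent posts, newest first."""
    return sorted(posts, key=lambda p: p.posted, reverse=True)[:limit]


def permalink(post):
    """Build a stable address that outlives the post's time on the front page."""
    return f"/archive/{post.posted:%Y/%m/%d}/{post.slug}"


posts = [
    Post("first", "First post", "Hello.", datetime(2003, 6, 1, 9, 0)),
    Post("links", "Some links", "Look at these.", datetime(2003, 6, 2, 9, 0)),
    Post("rant", "An opinion", "I disagree.", datetime(2003, 6, 3, 9, 0)),
    Post("notes", "Conference notes", "Live from the hall.", datetime(2003, 6, 4, 9, 0)),
]

for post in front_page(posts):
    print(post.posted.date(), post.title, permalink(post))
```

The oldest post drops off the front page here, but its permalink still works.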

The name "weblog" is often abbreviated to "blog" and writers of weblogs are known as "bloggers".

What are they for?

Originally, there were two main uses of weblogs:

  • online diaries, where people described their lives for their friends;
  • lists of links to other sites that the writer had discovered and wanted to point out (an analogy is made with the "cabinet of wonders": an early precursor to museums, curated by rich collectors).

More recently, weblogs have also been used in other ways:

  • as regular opinion columns (similar to newspaper opinion columns);
  • for niche or micro-journalism (for example, a blogger decides to write a news column about a particular area of technology or her home town);
  • to give news of a project (for example, a project manager may keep a team informed by posting relevant news items to a weblog);
  • as an extended CV or advertising. Freelance professionals write a regular column about an area of expertise. The idea is to gain a reputation in this area, in the hope that potential clients will become aware of them and their expertise. Clients can be referred to the weblog to get an idea of the professional's experience, specializations and approach.
  • blogging from live events. Audiences at a public lecture or conference use laptops wirelessly connected to the internet to give live coverage of what the speakers are saying. Those not at the conference can follow the debates remotely.

Some first notes on weblog "character"

  • Weblogs are typically written by one person, or a small related group. They often have "voice": a sense of a real person with idiosyncratic interests and opinions.
  • The weblog "post" is permission to write in smaller chunks than would otherwise be used. This means it's possible to write something as simple as a link and an "I agree" comment. Or an aphorism. Or a list of notes. The genre is tolerant of this style. It's an acceptable way to convey information. Compare this with book or journal publishing, where even a small idea must be expanded into an essay to be publishable.

Weblogs allow ideas to be published and distributed more quickly than other media genres. This is true even compared with other web-sites that use more traditional formats such as essays. Weblogs are often called an example of "micro-content".

Having said this, different weblogs are written at different time-scales. Some specialize in many short items a day. Others in fewer, longer, well-considered items, posted less frequently. Different audiences may have different requirements.

Some first notes on weblog "software"

[Show some weblog software]

Weblog software must enable publishing to be quick and easy. A weblog can be written by hand in HTML using a text editor with one file per page. But that's very time-consuming (especially the linking) and really isn't likely to encourage frequent posting. So typically weblog software manages a database of posts, and gives the user an easy way to add a new post.

There are many different types of weblog software. But most users will encounter it in one of three forms :

  • a hosted service (eg. blogger, typepad, diaryland, livejournal, aol). With a service like this, another company (or department) maintains a server which can manage a number of weblogs. You get an account and access it through your web-browser.

There are free services available which many people use. Or you can pay for enhanced accounts with extra features.

  • host your own. (movable type, greymatter, bloxsom) If you have access to your own web-server you can install a personal weblog server. These are often very simple (a couple of perl, php or python CGI scripts) and free. You then update the weblog through the browser.
  • a desktop client and server, also called a "microcontent client" (eg. Radio Userland)

This is a growing area. A specialist microcontent client is a piece of software which sits on your desktop like a word-processor. It allows you to write and organize your posts. It can either act as a webserver on its own (viable if you are on a PC with a permanent IP or LAN address) or synchronize the information with a dedicated server.

Each method has advantages and disadvantages. Desktop clients can offer more convenient interfaces for writing and organizing posts, but central servers can be updated from anywhere, for example a cybercafe in another city when you are travelling.

The Blogosphere : the world of weblogs

An individual weblog has some benefit as a simple way to publish information. However, the real significance and excitement comes from going beyond publishing.

The internet is a two way communication medium that encourages dialogue, and weblogs are evolving an interesting ecology of discourse that consists of both technical components and social conventions. Both are important.

We'll now look at the technical and social conventions of the sphere of discussion between weblogs which is sometimes called the "blogosphere".

  • Comments

Most weblogs allow readers to add comments at the bottom of a post. This has great value for the author. If he's publishing a news story which contains an error, maybe a reader will point this out. Or if the situation has changed since he posted the story, a reader might post an update.

For example, I recently saw a weblog page suggesting that there was likely to be a student revolution in Iran. A comment to this news post updated the story by saying it looked as though the prospective revolution had fizzled out. If this comment section hadn't been there, I would probably have been expecting the revolution for a couple of days until I noticed on a conventional news site that nothing had happened.

The fact that readers can keep a news story up-to-date keeps the story relevant as events overtake it.

If the weblog is focused on opinion, expect strong arguments in the comments section. The writer can gain an insight into those who disagree, and can learn many things.

In a project situation, comments can be used to debate a decision, or point out that a particular mile-stone hasn't been met, or needs to be changed. A project manager who uses a weblog this way can both inform project staff and get useful feedback.

  • Trackback and the conventions of multi-weblog discourse

A weblog is a personal opinion space, and its owner gains a lot of power from controlling it. (Often this includes the capacity to edit both the posts and the comments.) So perhaps everyone wants their own weblog, where they can express their views under their own control. In this case, a conversation started by a post can spread across multiple posts on multiple weblogs.

In my weblog I might say that Lula's pension reform is a necessary evil. You may want to disagree strongly. So you post a counter-argument. How are readers to know that these posts are connected?

When you write your post, you can link to my post, explaining that you are answering me. But how does a reader of my weblog know that your counter-argument is waiting on yours?

One solution that's been implemented is "Trackback". Trackback is a communication protocol which allows your weblog to inform mine that you have commented on one of my posts. Typically, your reply, which you posted on your weblog, then appears automatically as a Trackback link under my post. Readers of my post who are interested in what others have to say in response can then follow the link to yours.

[Show an example of Trackback]
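
For those who want to see what travels over the wire, here is a rough sketch in Python of a Trackback "ping", as I understand the protocol: an ordinary HTTP POST of a few form fields (url, title, excerpt, blog_name) to an address belonging to my post, answered by a small XML reply. The addresses and helper names (build_ping, ping_succeeded) are hypothetical, and the request is only built, never sent.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


def build_ping(ping_url, post_url, title, excerpt, blog_name):
    """Prepare (but do not send) the HTTP POST that announces a reply."""
    form = urllib.parse.urlencode({
        "url": post_url,        # where the reply lives
        "title": title,
        "excerpt": excerpt,
        "blog_name": blog_name,
    }).encode("utf-8")
    return urllib.request.Request(
        ping_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def ping_succeeded(response_xml):
    """A Trackback reply carries an <error> element; 0 means accepted."""
    root = ET.fromstring(response_xml)
    return root.findtext("error", default="1").strip() == "0"


request = build_ping(
    "http://example.org/my-weblog/trackback/42",   # hypothetical ping address
    "http://example.com/your-weblog/reply",        # hypothetical reply post
    "Why I disagree",
    "A counter-argument to the original post.",
    "Your Weblog",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

sample_reply = "<response><error>0</error></response>"
print(ping_succeeded(sample_reply))
```

My weblog software would take the fields from that request and display them as a link under my post.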

Topic Exchange

The Trackback mechanism has another application. Topic Exchange is set up as a number of channels (essentially empty posts) whose content is provided entirely by trackbacks from other weblogs. For example, if I am interested in chocolate cake, whenever I discover a new recipe or review a new bakery in my weblog, I "ping" Topic Exchange's chocolate cake channel as though I was commenting on it via trackback. The channel then shows a link to my new story.

If a channel is well supported it becomes a useful hub to find weblog posts on a particular subject.

Topic Exchange creates specialist publications about different subjects.

  • The blogroll

The "blogroll" is an important social convention which has appeared in the evolution of weblogs.

A blogroll is a column at the side of a weblog which contains a list of links to other weblogs that this blogger regularly reads and respects.

What's it for?

Firstly, it's a convenient bookmark list for yourself. You can treat your own weblog page as an index page to other weblogs you read.

For readers of your weblog, it acts rather like a citation list at the end of an academic paper. It's a way for readers who like your weblog to find more similarly interesting stuff. (If X finds Y's weblog interesting, and Y finds Z's interesting, maybe X will find Z's interesting too.)

It also acts as a way for a reader to make a judgement about the person behind a weblog. For example, in the last year, there's been an explosion of political weblogging around the American attack on Iraq, with both pro-war bloggers ("warbloggers") and anti-war bloggers analysing the case made by the US and UK governments, and following the continuing events.

If you find a blogger with many pro-war or anti-war sites in her blogroll, you can probably infer her position. (If she has a lot of links to both, she's probably a political junkie.)

Equally you can start to tell what technical or religious or musical interests and allegiances someone has by looking at his blogroll. You might even make a quick judgement as to whether this is your kind of person, someone to revisit on a regular basis.

All of this depends on getting a feel for the people in the blogosphere. And your judgment may be wrong, because webloggers are real people, and always more complicated than the boxes we try to categorize them in.

The social network

Notice this emphasis. A weblog is a type of software. It allows you to publish on the internet. It may also be a genre of writing which is useful. But what weblogs (in the plural) are about is people. To use weblogs effectively it isn't enough to think about which buttons to click, or how to configure a server. You need to think about the social world. About politics. About trust and reliability. Whose weblog do you invest time reading each day? Whose news is likely to be right? Whose opinions and speculations are valuable? How do you find the people who know the things you want to know?

This is why this software is sometimes part of a grouping we call "social software". Weblogs are not the first social software. Email and newsgroups precede them. But weblogs have sparked a new interest in thinking about the social network. And what we can do with it.

"Social network analysis" tries to extract useful information about the social network. For example, the search engine "Technorati" uses a technique similar to Google's PageRank to analyze only weblogs and the links between them. It can tell you which weblogs have the most links to them from other weblogs. (Who is well read and respected.) It can tell you what subjects are most popular and what links to other news stories are most popular. It can even identify what everyone's talking about using "word bursts", those words which have increased in frequency rapidly today.

Similar software can do this in companies. If there are weblogs in companies, they can be mined to discover who are the experts on a particular product. Email can be analyzed to find which employees often share information. There is, of course, a danger that this software can be misused for surveillance and control. But used well, it can help the "knowledge management" in an organization.
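
Two of the ideas above, counting inbound links and spotting "word bursts", can be sketched in a few lines of Python. This is a deliberately naive illustration with made-up data: plain link counting is much simpler than a PageRank-style technique, and the function name word_bursts and the doubling threshold are my own assumptions.

```python
from collections import Counter

# Hypothetical data: which weblogs link to which others.
links = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "dave": ["carol", "bob"],
    "carol": [],
}

# Count how many links point at each weblog.
inbound = Counter(target for targets in links.values() for target in targets)
print(inbound.most_common())


def word_bursts(yesterday, today, min_ratio=2.0):
    """Return words at least min_ratio times more frequent today than yesterday."""
    before = Counter(yesterday.lower().split())
    now = Counter(today.lower().split())
    # Treat unseen words as having occurred once, to avoid dividing by zero.
    return sorted(w for w, n in now.items() if n / max(before[w], 1) >= min_ratio)


yesterday = "wiki weblog wiki rss"
today = "trackback trackback trackback wiki weblog rss rss"
print(word_bursts(yesterday, today))
```

The same counting works whether the links come from public weblogs or from weblogs inside a company.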

Second notes on weblog character

Weblogging changes our perspective on the internet or an intranet. It makes us think more about the people behind information than about the documents. It changes our perspective in another way too.

We usually think of the web as a collection of static documents with unique addresses. However, a weblog is a news page. As we start to read weblogs, our relationship with the web becomes more like our relationship with a newspaper or television channel. We don't address or link to static pieces of information, but to channels or points at which we expect a regular supply of new, stimulating, important stuff.

This is what I call the "flow internet". It's why trust is important. I only make a regular commitment to read X's weblog because I trust that the things I want will appear there.

In other words, I stop thinking of the internet as somewhere to go for information. And start thinking that information comes to me (or to a weblog I read). This isn't magic, it's due to the busy effort of many webloggers reading stories from each other and passing them along.

For example, I trust a weblogger called Ross Mayfield to be interested in, and diligent in blogging stories about, Social Software. I regularly read his weblog and expect that if something significant happens in this area, pretty soon it will appear on Ross's weblog.

I also trust that, as many of these people are smart, as the story percolates through the flow internet, more errors are corrected than introduced. Initially, this might seem ridiculous. Surely people misunderstand and corrupt the information as they pass it from one to another? Weblogs have no formal editorial or fact-checking process to pick up on mistakes. No one stops malicious bloggers publishing lies.

But actually, most people aren't stupid or malicious. And if I write a stupid weblog, I'll lose trust and other webloggers won't read me or link to me (except to say that I'm stupid, so everyone knows). In the fields where I have some technical knowledge I believe that the knowledge coming through this informal filtering system is far better (and more timely) than knowledge that comes through the technical media or is issued by PR departments of companies.

If you take no other idea from this talk, take away the habit of looking at weblogs first, for information about products and companies, before wasting your time listening to the company's own PR or trade journals.

RSS Feeds

This is a world of dynamic information, always in motion. Why shouldn't it come directly to you? On your desktop?

RSS is a particular XML file format which represents a list of story titles and summaries. Several popular weblog systems make the posts available in this form, because it allows them to be pulled into another piece of software for display.

A common application today is the news aggregator (amphetadesk, netnewswire, radio userland). This allows me to subscribe to a number of RSS "feeds" from different weblogs (or news sites) and combines them into one convenient page, so I can scan the news coming in from many places. It's convenient (although sometimes I'm overwhelmed).

[Show aggregator]
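
The core of an aggregator is small enough to sketch here in Python. The two feeds below are hand-written stand-ins for files that would normally be fetched over HTTP, the addresses are hypothetical, and read_feed and aggregate are names I've made up. Real feeds vary a lot more than this, so treat it as an outline rather than a working newsreader.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two hand-written feeds standing in for files fetched over HTTP.
FEED_A = """<rss version="2.0"><channel><title>Weblog A</title>
<item><title>On wikis</title><link>http://example.org/a/1</link>
<pubDate>Tue, 03 Jun 2003 09:00:00 GMT</pubDate></item>
<item><title>On RSS</title><link>http://example.org/a/2</link>
<pubDate>Thu, 05 Jun 2003 09:00:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Weblog B</title>
<item><title>Trackback notes</title><link>http://example.org/b/1</link>
<pubDate>Wed, 04 Jun 2003 09:00:00 GMT</pubDate></item>
</channel></rss>"""


def read_feed(xml_text):
    """Yield (date, source, title, link) for each item in one RSS feed."""
    channel = ET.fromstring(xml_text).find("channel")
    source = channel.findtext("title")
    for item in channel.findall("item"):
        yield (
            parsedate_to_datetime(item.findtext("pubDate")),
            source,
            item.findtext("title"),
            item.findtext("link"),
        )


def aggregate(feeds):
    """Merge several feeds into one list, newest item first."""
    items = [entry for feed in feeds for entry in read_feed(feed)]
    return sorted(items, reverse=True)


for date, source, title, link in aggregate([FEED_A, FEED_B]):
    print(date.date(), source, title, link)
```

Items from both weblogs end up interleaved by date on one page, which is all "subscribing" really amounts to.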

Web Services

A recent phrase that's become very fashionable is "web services". What this usually means is that computers talk to each other, using the web-transport protocol HTTP, usually by exchanging fragments of XML.

Much of the hype comes from large companies who are trying to sell a replacement for existing EDI systems. However, RSS fits this description perfectly. RSS is one of several XML based protocols which run over HTTP and let computers talk together, pioneered by a close community of early weblogging tool developers including Pyra, Userland, Six Apart and other independents and free software developers. Other examples are the Trackback protocol, and XML-RPC, which is used to pass messages between microcontent clients like Radio Userland and a weblog server.

Interestingly, these commercial companies started with a strong philosophy of co-operation and interoperability, so that, for example, Radio Userland can update a weblog on Blogger's server.
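
To show how little is involved, here is what an XML-RPC message looks like when built with Python's standard library. The method name "weblog.newPost" and its arguments are hypothetical, chosen only for illustration. A real weblog server publishes its own method names.

```python
import xmlrpc.client

# "weblog.newPost" and its arguments are hypothetical, chosen for illustration.
payload = xmlrpc.client.dumps(
    ("my-weblog", "Hello", "A short post written in a desktop client."),
    methodname="weblog.newPost",
)
print(payload)

# The receiving server would decode the same XML back into a call.
params, method = xmlrpc.client.loads(payload)
print(method, params)
```

The printed payload is just a small XML document naming a method and listing its parameters, which a client would send to the server in an HTTP POST.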

Several large companies like Google, eBay and Amazon have now started to believe that engagement with this community of developers is valuable, and have opened up some of their systems to talk via some of these protocols. The RSS feed is proving fairly versatile, with search engines using it to make search results available and Amazon using it to make book reviews available.

RSS is so simple that almost anything can be put into it.

(For those who know Unix, I think this may turn out to be the equivalent of the Unix Pipe for tying stuff together.)

Pie / Atom / Echo / Whatever

However, there is a challenge to RSS. Some people want to improve upon it and build in more comprehensive capabilities for exchanging information between weblogs. This project is in its early stages and has no implementation. But anyone making strategic decisions should watch it: in a year's time it may have superseded RSS as a web-services standard. Or it may have collapsed due to infighting between several partisan groups.

SOAP and Microsoft .NET

Userland worked with Microsoft on a cousin of XML-RPC known as SOAP. And Microsoft have now committed their .NET project to it. This means that many .NET developers are building web-services that exchange data using SOAP over HTTP. It doesn't seem that the weblogging community are doing much with SOAP and .NET at the moment, but that may change.

Weblogs and Knowledge Management

There's a view that knowledge management is more about organizing the conversation between knowledgeable people in your organization than about a storage system for documents. And weblogs and social network analysis software fit well with this idea. In an ideal organization, everyone would spontaneously report on the state of their work using their weblog. In reality, it may be as hard to persuade employees to do this as it is to force any other kind of knowledge capture.

But the option can be made available cheaply. And where employees are keen to try, it can be encouraged.

Wikis

Now, to turn to the second type of social software : wiki.

What is a wiki? The short answer..

A wiki is a web-site where all users have the freedom to edit the pages.

What are they for?

Initially wikis were designed as simple and quick pieces of software to allow a group to collaborate on writing a collection of documents. They started in technical communities who wanted to discuss the design of a project they were working on. And later, to discuss design in general.

First notes on wiki "character"

The name wiki derives from the Hawaiian for "quick". And wikis emphasize quickness over other considerations such as presentation or security.

  • The basic wiki has no presentation values. It's intended for people to directly exchange information rather than to sell to or impress each other.
  • The basic wiki has no security. ANYONE can read it. Anyone can add to or delete text from pages. Initial use tended to be on private intranets where users knew and trusted each other. So this was not seen as a problem. Later, wikis appeared on the public internet and, surprisingly, have been found to work well there too, despite the potential for damage.
  • Wiki is quick and easy, but emphasizes the convenience of experienced users over new users. To enter information into a wiki you need to learn a simple markup convention. It isn't as complex as HTML. But it does require more effort than filling in a form on a typical discussion site.

No security

Most people's first reaction to wiki is to worry about the lack of security. For many, this is still the defining feature, and one which prevents them exploring wiki further.

In fact, although early wikis didn't support security, most pieces of current wiki software do support all the obvious security combinations :

  • restricted read/write
  • public read, restricted write
  • public read and comment, restricted over-write
  • public read and write

However, when wikis began to appear on the public internet, they proved surprisingly resistant to casual vandalism. Some ideas have been proposed as to why.

In essence, wiki is like any public space. It can be vandalized, but if people care enough to repair damage, the vandals won't win.

  • Most wikis back-up all changes that are made to pages, and there's an easy way to roll back to previous versions. That means that if you discover a damaged page, you can restore the last correct version. And as any good wiki will have more sympathetic users than hostile ones, when any of these sympathetic users encounters vandalism, they will undo it. In many cases it takes less effort to fix a piece of damage than to cause it in the first place.
  • Unlike a normal discussion forum, where readers' comments are immutable, wiki is also fairly resistant to commercial vandalism (spamming) for the same reason. You can post an unwanted advert on my wiki, but the next reader will remove it.
  • Damaging a wiki is no challenge. There's no achievement in damaging wiki. Nor is there a demonstration of cleverness that you've overcome the security provisions.
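
The first point, that repair is cheap because every version is kept, can be sketched in Python like this. The class and method names (WikiPage, edit, revert_to) are hypothetical; real wiki software stores its history in files or a database, but the principle is the same.

```python
class WikiPage:
    """A page that keeps every saved version, so damage can be undone."""

    def __init__(self, text=""):
        self.revisions = [text]

    @property
    def text(self):
        # The current text is simply the newest revision.
        return self.revisions[-1]

    def edit(self, new_text):
        self.revisions.append(new_text)

    def revert_to(self, index):
        """Restore an earlier version by saving it again as the newest one."""
        self.edit(self.revisions[index])


page = WikiPage("Meeting notes: agreed to ship in July.")
page.edit("BUY CHEAP WATCHES")   # a vandal or spammer overwrites the page
page.revert_to(0)                # any reader can restore the good version
print(page.text)
print(len(page.revisions))
```

Note that the vandalism itself stays in the history. Nothing is ever lost, which is also why the repair takes a single step.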

These explanations are, however, all related to casual damage. Things are different if the wiki is being vandalized to stop it performing its function of allowing people to exchange information (a denial of service attack). In this case, wiki does need some protection. Most wikis allow IP blocking to stop edits coming from, say, a hostile robot.

Demonstration of wiki technical conventions

It's worth showing a couple of examples in detail to help understand.

[Demonstration]

  • The WikiWord convention for linking and creating pages

The most fundamental wiki standard is for creating a link to a page using WikiWords (two or more words that begin with capitals and are smashed together). These automatically become links to a page of the same name.

If the page doesn't exist, the first use of the WikiWord will create the page. So sketching a number of interlinked documents is straightforward.

  • Titles, bolds, italics, tables etc.

There are simple markup conventions, eg. signs, quotes, vertical bars etc. to produce these effects.

  • Title searches for inbound links.

Click the title of a page to search for any inbound links to it.

  • REDIRECT

If you want one link to be forwarded to another page. Useful for thesaurus functions, moved (refactored) content.
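
For the programmers in the audience, the WikiWord convention above amounts to little more than one regular expression. This Python sketch is my own illustration, not the code of any particular wiki: the address patterns are hypothetical, and showing a question-mark link for a page that doesn't exist yet is an assumption about how the "create on first use" step is presented.

```python
import re

# Two or more capitalized words run together, e.g. WikiWord or FrontPage.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


def render(text, existing_pages):
    """Turn WikiWords into links; unknown pages link to an edit form instead."""
    def link(match):
        name = match.group(0)
        if name in existing_pages:
            return f'<a href="/wiki/{name}">{name}</a>'
        return f'{name}<a href="/edit/{name}">?</a>'
    return WIKI_WORD.sub(link, text)


print(render("See FrontPage and ProjectPlan, but not Wiki or HTML.", {"FrontPage"}))
```

Single capitalized words and all-capital abbreviations are left alone, so ordinary prose passes through untouched.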

Second notes on wiki "character"

The conventions I've described above are primitive, but they form a compact vocabulary in which it is possible to describe a sophisticated and powerful information system. But unlike other systems, which are built by programmers, these wiki based systems rely on very little further code. Once again, they are built of social conventions.

For example :

  • CategoryCategory. Users who create a page can mark it with the term CategoryCATEGORYNAME.

The word Category has no special meaning. There's no code written to understand it in a special way. Nor is there a special information field. Or any enforcement that you use it.

But because the community of wiki users has adopted it as a convention, you can click on it, to find all other pages that contain it, and this produces a classification scheme.

This is software written in social practice. If the community didn't keep using the convention, this facility wouldn't exist. But enough of them do.

Also note this is a very flexible classification scheme. Pages can be in many categories, nothing is forced into a hierarchy. But hierarchy can also be implemented.

Alternative terms can be allowed using #REDIRECT

  • Alternative indexes

Any user who has a preferred contents page can set one up.
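
To underline how little code the category convention needs, here is a sketch in Python. The only "feature" is a plain text search for a page name, and the pages and their contents are made up for illustration.

```python
# A hypothetical wiki: page name -> page text.
pages = {
    "ChocolateCake": "A rich cake. See also SpongeCake. CategoryRecipe",
    "SpongeCake": "Light and simple. CategoryRecipe CategoryBeginner",
    "LocalBakery": "Sells a good ChocolateCake. CategoryShop",
    "CategoryRecipe": "Pages that describe how to make something.",
}


def pages_mentioning(name):
    """The 'click the title' search: every other page whose text contains name."""
    return sorted(p for p, text in pages.items() if name in text and p != name)


print(pages_mentioning("CategoryRecipe"))   # acts as a category listing
print(pages_mentioning("ChocolateCake"))    # acts as a search for inbound links
```

The same function serves as both a category index and an inbound-link search. The classification scheme lives entirely in what people choose to type.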

Multi-wikis

Of course, although wiki is useful as a self-contained information space, there's also consideration of how wikis work together.

  • The intermap convention. Many pieces of wiki software support a simple convention for addressing pages on other wikis, in the form Wiki:WikiWord. This is set up in a simple configuration file.
  • The bus tour. An early convention that allowed wiki pioneers to thread their wikis together.

The bus maps and links are managed by volunteers.

  • Wikis have a recent changes list. Some pieces of wiki software now make this available as an RSS feed for aggregators.
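
The intermap convention from the list above is easy to picture as a lookup table. In this Python sketch the prefixes, addresses and the function name resolve are all hypothetical. A real wiki would read the table from its configuration file.

```python
# Hypothetical intermap table: prefix -> base address of another wiki.
INTERMAP = {
    "Sales": "http://sales.example.com/wiki/",
    "Eng": "http://eng.example.com/wiki/",
}


def resolve(reference):
    """Expand 'Prefix:PageName' to a full address, or return None if unknown."""
    prefix, sep, page = reference.partition(":")
    if not sep or prefix not in INTERMAP:
        return None
    return INTERMAP[prefix] + page


print(resolve("Eng:ReleasePlan"))
print(resolve("Legal:Contracts"))
```

Adding another wiki to the network means adding one line to the table.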

When should you use wiki?

There are many good reasons and situations for using wiki.

Here are three.

  • authoring shared documents.

This is what wiki was invented for. And much better than, say, exchanging Word documents through email. Everyone can see the current state. Comments can be added as notes which are simple to link.

  • prototyping other intranet services.

Before spending money on building an intranet system, consider prototyping in wiki. First, wiki is cheap. The scripts are usually free. And the designers / information architects can sketch out the pages and functionality fairly quickly.

If your intranet needs special integration with other software, wiki can be extended by a scripting programmer, to work with these.

  • wiki as user configurable portal or corporate dashboard.

Wikis are increasingly extensible. For example, there are wikis which can read RSS feeds from other sites and have them appear on pages, wikis which can do spreadsheet-like calculations, and wikis which can present the results of queries drawn from databases.

When you MIGHT use wiki

  • extranets

Wiki can be useful for informal discussion with suppliers and customers. But note wiki is probably not secure enough for taking orders or other EDI; though it could probably be integrated with some of these systems to provide an informal annotation system.

  • public internet communication

Unless yours is a very radical company, you probably won't be using wiki for your main public site. But for smaller, more informed audiences, it may be suitable as an alternative to discussion forums and complement to technical support FAQs.

Most wikis support style-sheets and a basic page template, but not necessarily more sophisticated presentation which may be an issue with some graphic-heavy sites.

When NOT to use wiki

  • space

Most good wiki systems are simple because they store information in a fairly naive way. And the fact that they keep backups of every change to pages means that they use a lot of space.

For this reason wiki doesn't scale well to a large number of documents. If you have more than a couple of thousand short pages you may find wiki is greedy for disk space and slow to access. Wikis can be adapted to use databases or other more efficient back ends. But if you want to keep backups of every page, this is still a large overhead. You may find yourself replacing the backup of every page-change regime with a nightly backup, and some more constraints on the users. But there comes a point when a different Content Management System may be better than adapting a wiki system.

  • names and temporary documents

A more subtle problem is that every wiki page typically has a name which is composed of meaningful words. For smaller collections of documents, this is a great benefit. But for very large numbers of documents it can be problematic. When you have a large number of documents in a large organization, different parts may want to use the same name for different things. Imagine many groups fighting over rights to use the page called MonthlyProjectReport.

Also, if your system has to handle a lot of temporary documents, indexed by date, the wiki naming convention may be less suitable. Imagine creating pages called Letter27thJune.

A possible solution to these first two problems is to think at a smaller scale. Wikis can be run by individual departments and connected via intermap. This will scale to any number of departments. But categorization and searching won't carry across departmental boundaries. And each department must be responsible for managing its own server.

  • structured data

Wiki is one big free-form field. If you need to store more structured documents, and want a system to do a lot of validation, another system is probably better.

  • permissions and workflow

Some wiki software is adding more sophisticated permissioning and workflow features. But it's rare.

Comparing wiki and weblogs

To finish, let's compare wikis and weblogs

Similarities

Both have the effect of changing the genre of writing. By allowing the publication of smaller chunks of information directly, they make communicating and sharing information simpler, quicker and easier.

Both are surrounded by active cultures who are experimenting and developing both new technologies and new cultural practices which take advantage of and advance them.

There are many attempts to synthesize them. To produce "wikiweblogs" or "blikis" which include elements of both.

Differences

Weblogs are focused on individuals and their voice. Understanding the social network and finding informed people you "like" is an important way to use the world of weblogs.

Wikis are often about the submersion of the individual in an anonymous group. You don't necessarily know who said what.

Weblogs are organized around time. Or around fixed people with information in flux. Wikis are organized around page names or ideas with associated definitions and discussions in flux.

Sometimes webloggers are not good wiki-citizens. They'd rather write complaints about things being disorganized than re-organize them themselves. Or they try to move discussions away from the confusing wiki pages and into their own weblogs.
