Read with CheapMetaData
There are problems with people marking up documents:
- they don't classify things well, because
  - classification is difficult
  - classification happens within a global context: if someone changes the overall taxonomy, everything has to be reclassified
- sometimes they have an incentive to lie, to try to draw more attention to their advert by claiming it contains more information than it has.

**But classification isn't an end in itself.** It's just one solution to a more general problem: how to make a collection of documents more navigable and useful.
For some time I've been interested in what I call typing of links, rather than classification of documents. The idea is to attach a type to each link from one document to another, so that readers navigating through the documents get a better idea of what they are going to see. This can help them prioritize their search.
The first example was an experiment called LinkShelf, which allowed users to build a directory of links, which were typed as "introduction", "deeper resource", "news resource" etc.
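A directory of typed links can be sketched in a few lines of Python. This is only an illustration of the idea, not LinkShelf's actual code: the `TypedLink` class, the `group_by_type` helper and the example URLs are all assumptions made up for the sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedLink:
    """A link plus a type chosen by whoever added it (hypothetical structure)."""
    url: str
    title: str
    link_type: str  # e.g. "introduction", "deeper resource", "news resource"


def group_by_type(links):
    """Group links by their type, keeping the order in which they were added."""
    shelf = defaultdict(list)
    for link in links:
        shelf[link.link_type].append(link)
    return dict(shelf)


# Made-up example data; the URLs are placeholders.
links = [
    TypedLink("https://example.org/intro", "What is a wiki?", "introduction"),
    TypedLink("https://example.org/patterns", "Wiki design patterns", "deeper resource"),
    TypedLink("https://example.org/news", "Wiki news feed", "news resource"),
    TypedLink("https://example.org/faq", "Wiki FAQ", "introduction"),
]

shelf = group_by_type(links)

# A newcomer can go straight to the introductions and skip the rest.
for link in shelf["introduction"]:
    print(link.title)
```

Note that nobody had to place any of these pages in a taxonomy: each type was chosen by looking at one link only.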
A more recent example is TypedThreadedDiscussion, which types the links between posts in a discussion forum by their relationship with the parent: are they "counter-arguments", "counter-evidence", "explanations", "questions", etc.?
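As a rough sketch of how a typed thread might be represented, assuming each reply stores a single `relation` to its parent: the `Post` class, the `outline` function and the sample posts below are hypothetical, and are not taken from the TypedThreadedDiscussion code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Post:
    """One post in a thread (hypothetical structure for illustration)."""
    author: str
    text: str
    # Type of the link to the parent post; None for the post that starts the thread.
    relation: Optional[str] = None
    replies: List["Post"] = field(default_factory=list)

    def reply(self, author, text, relation):
        """Attach a reply, typed by its relationship to this post."""
        child = Post(author, text, relation)
        self.replies.append(child)
        return child


def outline(post, depth=0):
    """Return the thread as indented lines, each reply labelled with its type."""
    label = f"[{post.relation}] " if post.relation else ""
    lines = ["  " * depth + label + post.text]
    for child in post.replies:
        lines.extend(outline(child, depth + 1))
    return lines


root = Post("ann", "Typed links are cheaper than taxonomies.")
question = root.reply("bob", "Cheaper for whom?", "question")
root.reply("cat", "Taxonomies allow global search.", "counter-argument")
question.reply("ann", "For the person adding the link.", "explanation")

print("\n".join(outline(root)))
```

A reader skimming the outline can decide from the labels alone whether to open the question, the explanation or the counter-argument first.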
The advantages these should have over other forms of human classification :
- easier: you don't need an idea of the global taxonomy and the document's place in it in order to give a type to your link. You just need to think about the document itself (or its relation to this document). In other words, you just need local information.
- less incentive to lie. No type is "better" than any other. Each is useful to the reader's specific situation and requirement. If you pretend your link is something that it's not, you'll not only alienate the readers who follow it by mistake (the false positives) but also lose the readers it would actually have suited (the false negatives).
Obviously local typed links wouldn't solve all the problems that classification is meant to solve. They don't allow global searching for the "right" information, or for all the relevant information. But given that they may make human browsing more efficient and improve the spread of knowledge of good links through the population, and given that they're much cheaper to produce than good classification, they may have a role to play.
Of course, this is all awaiting empirical testing ;-) But I'm starting to work on that. The TypedThreadedDiscussion code will be available soon, and I'm hoping to try it at a local university.
The GettingThingsDone methodology sorts tasks into "types" by how and when to do them, not into "categories" of what they're about.
Also, of course, this "locality" makes this a bottom-up kind of approach. Compare with ConcreteToAbstract.