AlgebraOfFeeds

ThoughtStorms Wiki

Origin of the PageName

SebPaquet has things he'd like to do with RSS / syndication feeds. (Read with RssAsAPlatform.) His original post is gone, but here's a copy :

Recent talk about RSS feed splicing and the ineluctable need for filtering open feeds got me thinking about the variety of operations one might want to perform on feeds.

Taking a cue from the operations of set theory we could for instance define the following:

Splicing (union): I want feed C to be the result of merging feeds A and B.

Intersecting: Given primary feeds A and B, I want feed C to consist of all items that appear in both primary feeds.

Subtracting (difference): I want to remove from feed A all of the items that also appear in feed B. Put the result in feed C.

Splitting (subset selection): I want to split feed D into feeds D1 and D2, according to some binary selection criterion on items.
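A minimal sketch of the four operations above, assuming each item carries a unique `guid` and a feed is simply an ordered list of items. The `Item` class and the function names are illustrative, not part of Seb's post or of any RSS library.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Item:
    guid: str          # assumed: a stable, unique identifier per item
    title: str
    author: str = ""

Feed = list[Item]

def splice(a: Feed, b: Feed) -> Feed:
    """Union: all items of a, then the items of b not already present."""
    seen: set[str] = set()
    merged: Feed = []
    for item in [*a, *b]:
        if item.guid not in seen:
            seen.add(item.guid)
            merged.append(item)
    return merged

def intersect(a: Feed, b: Feed) -> Feed:
    """Items of a that also appear in b."""
    guids_in_b = {item.guid for item in b}
    return [item for item in a if item.guid in guids_in_b]

def subtract(a: Feed, b: Feed) -> Feed:
    """Items of a that do not appear in b."""
    guids_in_b = {item.guid for item in b}
    return [item for item in a if item.guid not in guids_in_b]

def split(d: Feed, criterion: Callable[[Item], bool]) -> tuple[Feed, Feed]:
    """Subset selection: (items passing the criterion, items failing it)."""
    d1 = [item for item in d if criterion(item)]
    d2 = [item for item in d if not criterion(item)]
    return d1, d2
```

Matching on `guid` is the big assumption here. Real feeds often lack stable identifiers, so "the same item in both feeds" might have to fall back on comparing links or titles.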

The ultimate RSS bricolage tool would give users an interface to derive feeds from other feeds using the above operations, and spit out a working URL for the resulting feed.

I'm not sure how all of it would work, or even if all of it can work in practice. I'm completely abstracting out technical considerations here. While I'm not sure how large the space of useful applications of this could be, here are a couple example uses:

Splicing: All of the posts on the Many-to-many blog have to do with social software, so it would make sense to send its posts over to the social software channel. Now, since the blogging tool we use for that blog doesn't support TrackBack, it can't automatically ping the Topic Exchange. A workaround would be to merge both channels into a new one. In general, this would enable any combination of category feeds from various sources to be constructed very simply. A feed splicer can also serve as a poor man's aggregator.

Intersecting: Say I want to subscribe to all of Mark's posts that make the Blogdex Top 40; I'd just have to intersect the feeds. Or I could filter a Waypath keyword search feed in the same manner.

Subtracting: I'm interested in some topic that has an open channel, but find the items by one particular author uninteresting. (This is equivalent to the killfile idea from good ol' USENET.) Subtraction could also be used if you don't want to see your own contributions to a feed.

Splitting: One might want to manually split a feed into "good" and "bad" subfeeds according to a subjective assessment of quality or relevance, or automatically split according to language, author, etc. Note that this one doesn't qualify as an example of pure feed algebra, as it involves inputs beyond feeds.
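The killfile example can be read as a subtraction once the unwanted author's items are treated as a feed of their own. Here is a rough sketch, assuming items are plain dicts with `guid` and `author` keys; `killfile` and `subtract` are hypothetical helper names.

```python
def subtract(a, b):
    """Items of feed a whose guid does not appear in feed b."""
    guids_in_b = {item["guid"] for item in b}
    return [item for item in a if item["guid"] not in guids_in_b]

def killfile(feed, blocked_authors):
    """Express a killfile as A - B, where B holds the blocked authors' items."""
    blocked_feed = [item for item in feed if item["author"] in blocked_authors]
    return subtract(feed, blocked_feed)

channel = [
    {"guid": "1", "author": "ann"},
    {"guid": "2", "author": "bob"},
    {"guid": "3", "author": "ann"},
]
without_bob = killfile(channel, {"bob"})
```

Note that building `blocked_feed` is itself a split on the author field, so the list of blocked authors is exactly the kind of input beyond feeds that the splitting note mentions.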

SebPaquet

When writing [Social Routing] I was thinking about doing similar operations on feeds but not only for ourselves but also for our readers, to create some kind of 'closure'. By the way I think algebra is not enough - in the end we will need whole Turing Machine power for processing the feeds (one interesting technique would be to use Bayes rules or Neural Nets for filtering).

ZbigniewLukasiak

Additional operations

  • collapsing: many items a day to one item a day
  • extract images: a feed consisting only of images linking into the relevant posts - see http://feeds.scripting.com/picTuner (I'd run this on http://www.xmldatabases.org/WK/blog/)
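Collapsing could look something like the sketch below. It assumes each item is a dict with a `published` datetime and a `link`; the shape of the digest item (`guid`, `title`, `links`) is made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

def collapse_daily(items):
    """Collapse many items a day into one digest item per day."""
    by_day = defaultdict(list)
    for item in items:
        by_day[item["published"].date()].append(item)

    digests = []
    for day in sorted(by_day):
        entries = by_day[day]
        digests.append({
            "guid": f"digest-{day.isoformat()}",
            "title": f"{len(entries)} items on {day.isoformat()}",
            # Date the digest at midnight of the day it summarises.
            "published": datetime.combine(day, datetime.min.time()),
            "links": [entry["link"] for entry in entries],
        })
    return digests
```

Unlike the set operations, this one creates items that exist in no source feed, so the output can no longer be intersected with or subtracted from the original by item identity.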

Did anyone ever build this?

PipeDream is a newer service in the spirit of the now-abandoned YahooPipes

There's NodeRed

Other DataFlow systems?

Thoughts

Here's a comment I made in an email to someone in Feb 2007:

Or maybe you can create software "in real-time" by, for example, routing and mixing RSS feeds. E.g. you'd have feed-traffic-controllers, pulling together, mashing up and mixing the outputs of different web-services, maybe by dragging and dropping, or sketching pipelines with a pen or Wii-like controller. 10 years ago, PhilipGreenspun needed his own server and to be a serious programmer to make the Bill Gates Wealth Clock. 4-5 years ago, smart people were mashing up Google Maps with other services with little bits of glue script. Now we have reblogging services that help automate the process of getting a feed from one place and pushing it elsewhere. And there's Ning, which is a sort of platform for creating mash-ups and reusing other people's code.

Jump forward 5 more years. You can imagine feeds of "objects" (data + encapsulated behaviour) being published. And enterprising people noticing that objects from feed 1 have input signatures of this format, and objects of feed 2 have output signatures of that format and it only needs them to be wired together in the right way, for us to have this useful combination. So how to define the wiring? Why not just draw something like a patch diagram in a virtual modular synth?

A diagram of Unix-like pipes. How frequently does this need to be done? As data gets more dynamic, you might want to have people updating the wiring diagrams daily. Why make it hard work? Why not just let them draw it with a pen, or Wii controller?
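One way to picture the wiring is the toy below: each node declares the format it accepts and the format it emits, and wiring is only allowed where the signatures line up. Everything here (`Node`, `wire`, the format labels) is hypothetical; it only illustrates the idea of checking a patch diagram before running it.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Node:
    name: str
    accepts: str                  # hypothetical label for the input signature
    emits: str                    # hypothetical label for the output signature
    run: Callable[[Any], Any]

def wire(*nodes: Node) -> Callable[[Any], Any]:
    """Connect nodes left to right, like a Unix pipe, checking signatures first."""
    for upstream, downstream in zip(nodes, nodes[1:]):
        if upstream.emits != downstream.accepts:
            raise TypeError(
                f"{upstream.name} emits {upstream.emits!r} "
                f"but {downstream.name} accepts {downstream.accepts!r}"
            )

    def pipeline(value: Any) -> Any:
        for node in nodes:
            value = node.run(value)
        return value

    return pipeline

titles = Node("titles", "items", "strings", lambda items: [i["title"] for i in items])
shout = Node("shout", "strings", "strings", lambda strings: [s.upper() for s in strings])

headline_feed = wire(titles, shout)
result = headline_feed([{"title": "feed algebra"}, {"title": "pipes"}])
```

A drawn diagram would just be another front end for the same `wire` call, which is why redrawing it daily need not be hard work.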

There are now various GraphicalDataFlowProgramming systems. I'm not entirely convinced by them (particularly for low-level stuff). But TheUnixPipe obviously works, and they seem a reasonable fit for processing and orchestrating feeds.

Compare : CodeCasting

TimBray thinks it needs a messaging infrastructure : http://www.tbray.org/ongoing/When/200x/2004/11/23/Events

Technologies

Interesting to imagine a feed-processing language rather like SQL.
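As a rough illustration, plain SQL already covers the set-like operations if feed items are stored in a table. The sketch below uses Python's built-in `sqlite3` module; the `items` table, its columns and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (feed TEXT, guid TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        ("A", "post-1", "mark"),
        ("A", "post-2", "mark"),
        ("B", "post-2", "mark"),
        ("B", "post-3", "seb"),
    ],
)

def guids(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]

FEED_A = "SELECT guid FROM items WHERE feed = 'A'"
FEED_B = "SELECT guid FROM items WHERE feed = 'B'"

spliced = guids(f"{FEED_A} UNION {FEED_B} ORDER BY guid")          # splicing
intersected = guids(f"{FEED_A} INTERSECT {FEED_B} ORDER BY guid")  # intersecting
subtracted = guids(f"{FEED_A} EXCEPT {FEED_B} ORDER BY guid")      # subtracting
# Splitting is a WHERE clause: the selection criterion is the extra input.
seb_only = guids("SELECT guid FROM items WHERE feed = 'B' AND author = 'seb'")
```

What SQL lacks is the streaming side of feeds: these queries run over a snapshot, whereas a feed language would presumably have to re-evaluate as new items arrive.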

Maybe lines should be added to ProgrammingWithCirclesTrianglesAndRectangles ?

(Supplementary question. I never know when things should be an algebra and when they should be a calculus.)