Update : Now that ThoughtStorms has moved to a new engine, WikiSpam is no longer an issue. (At the moment.)

Maybe WhiteLists are the answer after all. :-(

See BotsOnWikis for more talk of bots reversing vandalism on WikiPedia.

Old Stuff

Thanks to everyone who's been helping fight the spam. My faith in wiki-readership's capacity to maintain wiki is revitalized. :-)

Particular thanks to RichardP, whose bot "WikiMinion" is now cleaning this wiki.

Other responses to Spam :

A new kind of spam?

Someone is going round creating blank pages for the UnfinishedLinks. Is this someone being helpful, someone learning about wiki, or a new kind of spammer? Perhaps a bot that thinks it can earn some credit with a site, or lull spam filters into thinking it's useful? Any ideas? – PhilJones

Phil, I think the most likely cause is buggy spammer software. The software might actually be performing edits on a bunch of pages, but for whatever reason (presumably a bug) it isn't actually changing the content. Because UseMod silently discards edits which make no changes, RecentChanges only shows the creation of new pages. However, UseMod doesn't discard an edit of a new page if you save back the default "new page" text. Another possibility is that this spam software is actually intended to actively look for UnfinishedLinks and place spam on those pages, perhaps because the spammer figures this results in spam that will be guaranteed to be findable by search engines and less likely to be noticed by a wiki's users. As a possible example, consider the recently added page ArsDigitaCommunitySystem. Like the other recently created pages, ArsDigitaCommunitySystem was one of the UnfinishedLinks on the TypedThreadedDiscussion page and clearly was created by a spammer. So maybe the spammer's software had a bug yesterday and he fixed it today?

RichardP
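The behaviour Richard describes can be sketched roughly as follows. This is not UseMod's actual code (UseMod is written in Perl); the function names and the `NEW_PAGE_TEXT` placeholder are hypothetical. The sketch only illustrates how dropping no-change edits while still accepting the default text on a new page would leave nothing but "empty" page creations in RecentChanges.

```python
from typing import Optional

# Hypothetical placeholder shown in the edit box of a page that does not exist yet.
NEW_PAGE_TEXT = "Describe the new page here."


def should_save(old_text: Optional[str], new_text: str) -> bool:
    """Return True if the edit would be stored and show up in RecentChanges.

    old_text is None when the page does not exist yet.
    """
    if old_text is None:
        # New page: even the untouched default text is accepted.
        return True
    # Existing page: an edit that changes nothing is silently dropped.
    return new_text != old_text


def should_save_stricter(old_text: Optional[str], new_text: str) -> bool:
    """Assumed variant that also rejects new pages left blank or at the default text."""
    if old_text is None:
        return new_text.strip() not in ("", NEW_PAGE_TEXT)
    return new_text != old_text
```

With the first function, a buggy bot that re-saves every page unchanged would only leave traces on pages that did not exist before, which matches what RecentChanges showed. The stricter variant is one possible way to close that gap.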

Wow, it couldn't have been much more than an hour after you unlocked your wiki until the first spammer arrived. They're mighty quick off the mark! – RichardP

: Yeah! I was hoping they might have forgotten about us for a while. I suspect their spam-bot didn't even notice. They aren't going to bother deleting the URL from their list. Anyway, now I'm back I'm going to implement the patch you suggested and then I have some thoughts about putting some kind of filter in front of ThoughtStorms which I'll play with. How are things with you anyway? – PhilJones

Perhaps migrating ThoughtStorms to another wiki engine would be a superior choice to patching UseMod to implement both META tags and content filtering. For instance, both OddMuse and MoinMoin already support both of those features. For that matter, I think MoinMoin has superior anti-spam features (besides, it is written in Python :) ); however, the migration path from UseMod to OddMuse is probably easier. Things are fine with me, thanks for asking. Oh, and I got my first death threat on my answering machine from a wiki spammer just last week! WikiMinion must be having an effect (it is now protecting approximately 50 wikis). – RichardP
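On the content filtering mentioned above, here is a minimal sketch, assuming a hand-maintained list of banned URL patterns that is checked before an edit is saved. The pattern list and the names `BANNED_PATTERNS` and `find_banned_links` are made up for the example; this is not how OddMuse or MoinMoin implement their filters.

```python
import re
from typing import List, Sequence

# Hypothetical banned patterns; a real list would be maintained by the wiki's admins.
BANNED_PATTERNS = (r"cheap-pills\.example", r"casino[0-9]*\.example")

# Rough URL matcher: stops at whitespace, brackets, angle brackets and quotes.
URL_RE = re.compile(r"https?://[^\s\]\[<>\"]+", re.IGNORECASE)


def find_banned_links(text: str, patterns: Sequence[str] = BANNED_PATTERNS) -> List[str]:
    """Return every URL in text that matches at least one banned pattern."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return [
        url
        for url in URL_RE.findall(text)
        if any(c.search(url) for c in compiled)
    ]
```

An edit would be rejected when `find_banned_links(new_text)` is non-empty. Matching on URLs rather than on arbitrary words keeps the filter from blocking ordinary discussion of spam topics.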

Wow, Richard. I guess this story about the death threat was not a joke, huh? Pheww. Care to tell us a little more about that?

Phil! I guess you noticed the stupid git that has been hammering ThoughtStorms (and other wikis, of course) with HTML spam that doesn't even work correctly. Over on the chongqed wiki we had a report about this spammer. It'd be interesting to know whether your logs also show this Back to the Future activity. Can you confirm that? – Manni

My God! Richard, is that true? I think I'll hurry up with some kind of strategy to reduce WikiMinion visibility. Maybe I will have to migrate to OddMuse. I like it a lot, it's just that I'd miss the sub-page feature. Will have another look at MoinMoin too. Manni, hadn't realized that. It's incredible. Will check logs to see. – PhilJones

:Phil, don't hurry your efforts on reducing WikiMinion visibility on my account. Not only am I convinced that the call was just harmless posturing on the part of a pissed-off spammer, but making the reverts less visible here won't significantly reduce WikiMinion's visibility overall now that it is protecting a bunch of wikis. This is particularly true since he complained about WikiMinion during a time that ThoughtStorms was locked, so he certainly couldn't have been annoyed at the cleaning going on here. MoinMoin supports subpages, but it doesn't directly support UseMod page data files, so migrating to MoinMoin involves an extra export-then-import step. The wiki-syntax parsers in OddMuse and MoinMoin both differ slightly from UseMod's; however, OddMuse is probably closer. Manni, I'd be happy to provide some more details about the call. Basically, a spammer left a rant on my answering machine. He started off by calling me names and asserting that I have no right to interfere with his legitimate business. He escalated to speculating about my sex life and ended the call with the suggestion that he knows someone who would, at his request, "break my head." He spoke in a Slavic or Russian accent. I am not concerned; many years ago I used to get occasional calls like that when I was involved with anti-spam efforts on usenet and later with my e-mail anti-spam efforts. I responded the same way I have in the past: I reported the call to my local police department, and they asked the phone company to provide the caller's phone number. They then recorded the number (but they don't tell me the number, although in this case the detective implied the call originated from one of the former Soviet republics). They then closed the case and told me to contact them if anything ever comes of the threat. Nothing ever has in the past. – RichardP

Phil! No need to waste your time looking through those log files. I was able to [http://wiki.chongqed.org//BackToTheFutureII solve the mystery]. Of course, if you insist, I wouldn't mind having a look at his referrers and his user agent strings ;-)

Thinking about migrating away from UseMod is certainly a good idea in my view. It's always the same old problem, though: What do you do with those subpages? Does MoinMoin support those? Seems like a decent engine to me. But I would always choose the one that's easier to install and written in a language that I'm familiar with. There's a wealth of extensions for OddMuse that are very easy to install. And I have rarely seen spammed OddMuse wikis, probably because its spider meta tags are so effective. In fact the only two spammed OddMuse wikis I remember are my own (which has lots of content that spammers are looking for) and an abandoned one where spam is never cleaned.

Manni

:Manni, MoinMoin does indeed support [http://moinmoin.wikiwikiweb.de/HelpOnEditing/SubPages subpages]. In my experience the biggest difficulties involved in migrating from UseMod to MoinMoin are getting the raw page data from UseMod into MoinMoin and updating the pages to account for the differences in wiki syntax. Similarly, the biggest difficulties involved in migrating from UseMod to OddMuse are flattening the wiki structure to eliminate subpages (wikis with many subpages require a great deal of manual editing before migrating) and updating the pages to account for the mild differences in wiki syntax (even with the usemod.pl extension installed there are still a few differences). – RichardP
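The "flattening" step could start from something like the sketch below. It assumes subpages are named `MainPage/SubPage` and simply joins the parts into one flat name, refusing to continue if two pages would end up with the same name. The function names and the joining rule are assumptions for illustration, not part of any OddMuse migration tool, and links inside the page text would presumably also need rewriting, which this sketch does not attempt.

```python
from typing import Dict, Iterable


def flatten_name(page_name: str, joiner: str = "") -> str:
    """Turn 'MainPage/SubPage' into a single flat page name."""
    return joiner.join(part for part in page_name.split("/") if part)


def build_rename_map(page_names: Iterable[str], joiner: str = "") -> Dict[str, str]:
    """Map every old page name to its flat name, failing loudly on collisions."""
    mapping: Dict[str, str] = {}
    taken: Dict[str, str] = {}  # flat name -> original name that claimed it
    for name in page_names:
        flat = flatten_name(name, joiner)
        if flat in taken and taken[flat] != name:
            raise ValueError(
                f"{name!r} and {taken[flat]!r} both flatten to {flat!r}"
            )
        taken[flat] = name
        mapping[name] = flat
    return mapping
```

Raising on a collision rather than picking a winner leaves the decision to a human, which fits the point that wikis with many subpages need manual editing before migrating.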

  • hi Manni, can you find some time to give a brief overview of what's happening at chongqed; are there any emerging trends, paradoxes, learnings, in general or perhaps from the spamming here, what kind of content are spammers looking for - are they more than shotgunning every input box they can find? – kk

Sure, if you think this is the right place, I'll give you a little overview.

  • I think that there are basically two kinds of spammers. The small ones and the big players. The former are desperate people trying to make their post-dot-com e-business outfit stay alive. The latter can be further divided:

    • There are people pushing affiliate programs and pay-per-click services. Seems most of them are from eastern Europe and they like to use bots.

    • Then there are the Chinese people. They spam manually and it's hard to tell exactly what they are spamming for and why they are doing it. But there are a couple of companies in China that like to sell (what they call) SEO-expertise.

Spammers aren't looking for any kind of specific content. At least in the sense of wiki content as a context for their spam. They are using Google to find places to spam. Most of them enter a previously spamvertized URL as a phrase into Google and this will help them find spammable wikis. Since UseMod has this serious deficiency of not providing sensible robots meta-tags, it's mostly UseMod wikis that get hammered.
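To make the robots meta-tag point concrete, here is a minimal sketch of the idea: let search engines index the current version of a page, but keep everything else (edit forms, history, diffs, old revisions) out of the index so that spammed revisions cannot be found by searching for a spamvertized URL. The action names and the function `robots_meta` are hypothetical; only the `index,follow` and `noindex,nofollow` values are standard robots directives.

```python
from html import escape

# Hypothetical set of wiki actions whose output is safe to index.
INDEXABLE_ACTIONS = {"browse"}


def robots_meta(action: str) -> str:
    """Return the robots meta tag for a page rendered by the given wiki action."""
    if action in INDEXABLE_ACTIONS:
        content = "index,follow"
    else:
        # Edit, history, diff, old revisions, etc. stay out of search engines.
        content = "noindex,nofollow"
    return f'<meta name="robots" content="{escape(content)}">'
```

The tag would go into the `<head>` of every generated page. Defaulting to `noindex,nofollow` for any unknown action means a newly added action is hidden from spiders until someone decides otherwise.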

It's hard to come up with any trends. I think that more and more spammers are using bots. I know that many people have always thought that any kind of spam must be from a bot, but our logs tell a different story. With the growing number of bots (and also with the growing number of Chinese people being paid for spamming), the amount of spam is getting worse. Not only the total amount, but also the amount of spam per spam incident. They post more and more links, spam more and more pages, and they don't care whether the last revision of a spammed page was their own spam.

It seems that there are similar trends over in the blogosphere. But spammers give the impression of concentrating on one 'medium'. Wiki spammers are focusing on wikis and blog spammers spam blogs. There isn't much overlap. A notable exception is the guestbook spammer who also spams wikis. I don't know why he is doing that and why he isn't adjusting his means of spamming. I guess he's just an idiot with good tools.

Manni