RichardP

ThoughtStorms Wiki

I'm a software engineer, primarily writing code for the Apple MacOS platform. To be polite I probably should put more here...

: Richard is also the author of the WikiMinion script, which is currently keeping this wiki safe from spam. See WikiMinion for more information. – PhilJones

Richard got chatting to EugeneEricKim : https://web.archive.org/web/20050204143039/http://www.civicactions.org/cgi-bin/wiki.pl?Eekim

Welcome Richard. Good to have you here. And thanks again. Please feel free to wander around, comment, ArgueAgainstMe, etc. ;-)

Discussion on MacOs moved to that page.

Message to Richard.

Hi Richard. Are you in the UK? And if so anywhere in the South? I'm going to be around the London / Brighton area for the next couple of weeks and I think I definitely owe you a drink :-)

PhilJones

I appreciate the kind offer, but unfortunately I am not in the UK, so I guess I won't get that drink :-) I live in the US, near Los Angeles, California.

RichardP

Shame. Ah well, next time I'm in that area then :-)

PhilJones

Hi Richard,

The last change to this page is Phil's note copying DavidAndel's request to WikiMinion.

My question is whether you are aware of the spam management methods used by other major wikis?

I have further questions later.

MicrosoftSlave

MS, I am pretty familiar with the anti-spam measures in the major wikis. Feel free to ask questions, I'll do my best to answer. – RichardP

Richard, thanks for the help. Does your auto-revert method work on C2, where a large number of users come and go? If a new page with several links gets deleted, it cannot easily be restored manually, as the history would be wiped out over there.

  • BTW do you use Recent Changes to get notified on messages for you in homepage, or is the email notification a better alternative? – MS

The most active wiki protected by WikiMinion gets approximately 100 legitimate edits per day and has several hundred active users. Since WikiMinion protects many wikis simultaneously, it is currently examining more than 500 legitimate edits a day (WikiMinion is protecting about 50 wikis, but the single most active wiki accounts for about 20% of the activity seen by WikiMinion). Interestingly, WikiMinion currently sees spam edits outnumbering legitimate edits by more than 10 to 1 (however, this is not a representative sample, since it seems likely that only wikis attracting a lot of spam would bother to ask for my help).

With my current bandwidth and processing constraints I think WikiMinion is operating at about 50% capacity, so, in theory, if WikiMinion were protecting only a single wiki, I think it could handle a wiki that received 1000 legitimate edits a day with a community of perhaps 5000 active users. Is that enough to handle C2? I know the C2 community is large, but since I'm not a member of the C2 community I don't know how it compares to these numbers.
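The arithmetic behind that estimate can be sketched roughly as follows. This is only a back-of-the-envelope reconstruction from the figures quoted above; the function name is made up for illustration, and the figure of 500 active users is an assumption standing in for "several hundred". Like the estimate itself, it counts legitimate edits only and ignores spam volume.

```python
def estimate_capacity(busiest_edits_per_day, busiest_share,
                      capacity_in_use, busiest_active_users):
    """Scale the busiest wiki's figures up to a single-wiki capacity estimate."""
    # The busiest wiki is ~20% of everything seen, so total = 100 / 0.20.
    total_edits_now = busiest_edits_per_day / busiest_share
    # Running at ~50% capacity means the ceiling is about twice the current load.
    max_edits = total_edits_now / capacity_in_use
    # Assume users scale with edits at the busiest wiki's ratio.
    users_per_daily_edit = busiest_active_users / busiest_edits_per_day
    max_users = max_edits * users_per_daily_edit
    return total_edits_now, max_edits, max_users


# 100 edits/day, 20% share, 50% capacity are from the text above;
# 500 active users is an assumed value for "several hundred".
total_now, max_edits, max_users = estimate_capacity(100, 0.20, 0.50, 500)
print(round(total_now), round(max_edits), round(max_users))
```

With those inputs the sketch lands on roughly 500 edits a day now, a ceiling of about 1000 edits a day, and about 5000 active users, which matches the numbers quoted above.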

With regard to your question about deletion and history: WikiMinion currently relies on the wiki engine to preserve older versions to which WikiMinion can revert; WikiMinion does not mirror older pages. For example, MoinMoin gives privileged users the option of deleting pages, but MoinMoin preserves the page's history so any logged-in user can use the page history to revert a page deletion, thus WikiMinion can revert deletions in MoinMoin. Similarly, in UseMod page deletion by normal users is done by marking the page with a special text badge, so anyone, including WikiMinion, can preempt a deletion by removing the badge (or reverting to an earlier version without the badge). However, in UseMod someone with Admin rights can delete a page in such a way that the page history is destroyed - in this case WikiMinion cannot revert the page deletion. If the C2 software allows normal users to delete pages, and if doing so destroys the page's history, then WikiMinion would not be able to revert the deletion.
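The rule in that paragraph boils down to "a deletion can be reverted only if the engine keeps the page history". A minimal sketch of it as a lookup table, assuming hypothetical labels for the kinds of deletion (none of these names come from WikiMinion or from the wiki engines themselves):

```python
from typing import Optional

# (engine, kind of deletion) -> does the page history survive the deletion?
# The entries restate the cases described above; the labels are illustrative.
HISTORY_SURVIVES = {
    ("MoinMoin", "privileged-user delete"): True,
    ("UseMod", "normal-user delete badge"): True,
    ("UseMod", "admin delete"): False,
}


def can_revert_deletion(engine: str, kind: str) -> Optional[bool]:
    """True or False when the case is known, None when it is not (e.g. C2)."""
    return HISTORY_SURVIVES.get((engine, kind))


print(can_revert_deletion("MoinMoin", "privileged-user delete"))
print(can_revert_deletion("UseMod", "admin delete"))
print(can_revert_deletion("C2", "normal-user delete"))
```

Returning None for an unlisted engine mirrors the open question about C2: the answer depends on whether its software destroys history on deletion.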

With regard to messages for me on my homepage, I generally rely on either Recent Changes or an examination of the page history. I don't use the email notification feature offered by some wikis, although I'm sure that is a handy feature if one has many home pages scattered around.

RichardP

Richard, not in the spirit of complaint, but you said you wanted bug reports. Looks like WikiMinion got confused on the SdiDesk wiki. If you look at its revision 58 to the home-page, it seems to have gone back to the spam 55. Did it not recognise 55 as spam? Or did something else go wrong?

Hope there've been no more threats or problems arising from your anti-spam activities.

cheers,

PhilJones

Phil, yeah, I noticed that problem. OddMuse treats lines that begin with spaces by wrapping them with PRE tags. Since the spammer's links were essentially inside a PRE tag they were non-functional, so WikiMinion didn't consider them spam. In addition, the spam edit originated from an IP address that WikiMinion didn't have in its database - so with no functional spam domains and what appeared to be a clean IP address, WikiMinion accepted that edit as legitimate. I don't necessarily want WikiMinion to begin considering spam links inside PRE tags as spam, since that is often exactly how pages by legitimate users discussing a particular spammer and his links appear. Hmm... I'm going to have to think about this problem a bit; I am not quite sure what the best way to fix it is.

RichardP
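The behaviour described above can be illustrated with a small sketch. It is not WikiMinion's code: the function names, the URL pattern, and the way the two signals are combined are all assumptions, and the domain and IP address are reserved example values. It only shows why a link on a line that starts with a space can slip past a check that looks at functional links and known IP addresses.

```python
import re
from urllib.parse import urlparse

# Loose URL pattern, good enough for illustration only.
URL_RE = re.compile(r"https?://[^\s\[\]<>\"']+")


def functional_link_domains(wiki_text):
    """Domains of URLs on lines that are not preformatted (no leading space)."""
    domains = set()
    for line in wiki_text.splitlines():
        if line.startswith(" "):
            continue  # rendered inside PRE, so the link is not clickable
        for url in URL_RE.findall(line):
            host = urlparse(url).hostname
            if host:
                domains.add(host.rstrip("."))
    return domains


def looks_like_spam(wiki_text, editor_ip, spam_domains, spam_ips):
    """Flag an edit if the IP is known or a functional link hits a spam domain."""
    if editor_ip in spam_ips:
        return True
    return bool(functional_link_domains(wiki_text) & spam_domains)


spam_domains = {"spam.example"}  # hypothetical blacklist entry
spam_ips = set()                 # the editor's IP was not in the database

print(looks_like_spam("Buy http://spam.example/pills", "192.0.2.1", spam_domains, spam_ips))
print(looks_like_spam(" Buy http://spam.example/pills", "192.0.2.1", spam_domains, spam_ips))
```

The only difference between the two calls is the leading space, which is enough to turn the link into preformatted text and let the edit through.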

Last night I put in a quick hack for this particular spammer. When cleaning OddMuse wikis, WikiMinion will now consider apparent links to this particular spammer's domains as spam, even if they're non-functional due to being embedded inside PRE tags. I checked today, and this fix appeared to work for the latest spam inserted by this particular spammer on the SdiDesk wiki, even though one edit had just "dead" spam links due to them being inside a pair of PRE tags.

RichardP
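One way such a per-spammer exception might look is sketched below. Again, this is an assumption-laden illustration rather than the actual hack: `always_flag_domains` is a hypothetical second list, and the domains are reserved example names.

```python
import re
from urllib.parse import urlparse

# Loose URL pattern, good enough for illustration only.
URL_RE = re.compile(r"https?://[^\s\[\]<>\"']+")


def spam_domains_in_edit(wiki_text, spam_domains, always_flag_domains):
    """Return the spam domains that count against an edit.

    Ordinary spam domains count only on normal lines. Domains in
    always_flag_domains (hypothetical) count even on preformatted lines.
    """
    hits = set()
    for line in wiki_text.splitlines():
        preformatted = line.startswith(" ")
        for url in URL_RE.findall(line):
            host = urlparse(url).hostname
            if not host:
                continue
            host = host.rstrip(".")
            if host in always_flag_domains:
                hits.add(host)
            elif not preformatted and host in spam_domains:
                hits.add(host)
    return hits


spam_domains = {"spam.example", "persistent.example"}
always_flag = {"persistent.example"}

text = " http://persistent.example/a\n http://spam.example/b"
print(spam_domains_in_edit(text, spam_domains, always_flag))
```

Keeping the exception list separate from the general blacklist means pages that merely discuss other spammers inside preformatted text are still left alone.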

Thanks Richard.

I can see there's a real problem with referring to spammer links without using them. Maybe the anti-spam community can come up with some kind of convention / escape sequence to use when they're talking about spammers' URLs, which will signal to WM that the link should be left alone. Hard to do though, I realize.

I wonder if non-link URLs get counted by PageRank?

PhilJones

I asked a Google engineer about non-link URLs. He said that it is his understanding that non-link URLs are treated as any other text on a page, i.e. they count as text for the purpose of content search and do not affect PageRank either positively or negatively. In addition, non-link URLs will not be used by Google to discover new pages to add to its index.

RichardP

CategoryPerson, CategorySuperHero