I wonder if you can build a kind of email ImmuneSystem. Basically you set up fake honey-pot email addresses on your site which nobody is meant to mail to. Any email which arrives at both a honey-pot and a real email address is subsequently deleted from all accounts that the server handles.
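To make the rule concrete, here is a minimal sketch of it, assuming exact matching on a normalised message body. The `Server` class, the `trap@example.org` address and the in-memory mailboxes are all hypothetical stand-ins, not part of any real mail server.

```python
import hashlib
from collections import defaultdict

# Hypothetical honey-pot address; a real deployment would have many.
HONEYPOTS = {"trap@example.org"}


def fingerprint(body: str) -> str:
    """SHA-256 of the lower-cased, whitespace-normalised body."""
    normalised = " ".join(body.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


class Server:
    """Toy stand-in for a mail server: mailboxes are lists of message bodies."""

    def __init__(self):
        self.mailboxes = defaultdict(list)  # address -> list of bodies
        self.spam_prints = set()            # fingerprints seen at a honey-pot

    def deliver(self, recipient: str, body: str) -> None:
        fp = fingerprint(body)
        if recipient in HONEYPOTS:
            # Mail to a honey-pot marks the message as spam for everyone
            # and removes copies that were already delivered.
            self.spam_prints.add(fp)
            for addr in list(self.mailboxes):
                self.mailboxes[addr] = [
                    m for m in self.mailboxes[addr] if fingerprint(m) != fp
                ]
            return
        if fp in self.spam_prints:
            return  # already caught by a honey-pot: drop silently
        self.mailboxes[recipient].append(body)
```

Delivering the same body to a real address and then to `trap@example.org` removes it from the real mailbox, and later copies to other accounts are dropped on arrival.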
The nice things about this system :
- it can probably be hacked directly into your organization's server without changing either the clients, or the routers.
- it's a real disincentive to the spammers' indiscriminate emailing : the more mail they send, the higher the chance they lose their audience.
- it's a disincentive to crawling pages for addresses, because pages are now going to be loaded with honey-pot email addresses with innocent sounding names like email@example.com .
- people can register their honey-pots with dodgy sites that they suspect sell their mailing lists. This might even kill the mailing-list reselling business.
- equally, you can send honey-pot addresses anywhere where you suspect leakages to spammers.
- many people will make firstname.lastname@example.org and other guessable email addresses honey-pots, which will make the automatic use of these names very dangerous for spammers.
Hmmm, but I suppose what's wrong with this idea, is that the spammers send slightly different messages in every mail (the way the fast evolving flu virus defeats the immune system.) So you still have a problem recognizing when the mail to your address is the same as the mail to the honey-pot. If you can't do that, it's back to white-lists ...
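One possible way round the "slightly different messages" problem is to compare overlapping word sequences instead of whole messages. The sketch below uses word shingles and Jaccard similarity, which is a swapped-in technique rather than anything proposed on this page. The function names and the 0.5 threshold are assumptions and would need tuning on real mail.

```python
def shingles(text: str, k: int = 3) -> set:
    """Set of overlapping k-word sequences from the lower-cased text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Shared shingles divided by all distinct shingles in either message."""
    sa, sb = shingles(a, k), shingles(b, k)
    return len(sa & sb) / len(sa | sb)


def looks_like(honeypot_msg: str, candidate: str, threshold: float = 0.5) -> bool:
    """True if the candidate is close enough to a message seen at a honey-pot.

    The threshold is an assumed value, not a recommendation.
    """
    return jaccard(honeypot_msg, candidate) >= threshold
```

A spam run that only swaps the greeting still shares most of its shingles with the honey-pot copy, while an unrelated message shares none. A spammer who rewrites most of every message would defeat this.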
Other possible objections :
- Q : seems like you'd need to store a lot of spam in the honey-pots. Is it feasible for everyone to store all this?
: A : Maybe there's a window of 2-3 weeks before a particular epidemic is over. After which you can forget the old stuff.
- Q : The spam will hit some legitimate email accounts before the honey-pot. So some spam will still get through.
: A : Yep, it's not perfect, but it can cut a significant quantity. Also I suppose if you raise the honey-pot to real-address ratio you reduce this.
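The ratio point in the last answer can be put in numbers, under a strong assumption: the spammer works through the server's addresses in an order that is random with respect to which ones are honey-pots. In that case the expected number of real accounts hit before the first honey-pot is real / (honeypots + 1). The sketch below states that formula and checks it by simulation; both function names are made up for illustration.

```python
import random


def expected_leak(real: int, honeypots: int) -> float:
    """Expected real addresses hit before the first honey-pot,
    assuming a uniformly random sending order."""
    if honeypots < 1:
        raise ValueError("need at least one honey-pot")
    return real / (honeypots + 1)


def simulate_leak(real: int, honeypots: int, trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity."""
    if honeypots < 1:
        raise ValueError("need at least one honey-pot")
    rng = random.Random(seed)
    addresses = [False] * real + [True] * honeypots  # True marks a honey-pot
    total = 0
    for _ in range(trials):
        rng.shuffle(addresses)
        # Index of the first honey-pot = number of real addresses before it.
        total += addresses.index(True)
    return total / trials
```

With 100 real accounts and 4 honey-pots the formula gives 20 accounts reached first; with 19 honey-pots it gives 5. Those early copies can still be deleted once the honey-pot is hit, as long as nobody has read them yet.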
I think this is a great idea! The email does not need to be exactly the same; after all, we already have filtering software that can check similarities between emails to build blacklists (I suspect that the same envelope From: might be a good indicator for blacklisting, but eventually a Bayes filter can combine that with other factors). – ZbigniewLukasiak
I'm starting to like this idea very much. More thoughts ...
Most spam solutions encourage spammers to fight back with more spam. For example, filters encourage spammers to send more nonsensical spam simply to confuse or choke the filters. Spammers keep hammering away with more spam than ever in order to break through the barriers you surround yourself with. And this is an arms race they're familiar with.
But a honey-pot system really seems different. The more spam they send, the less you receive. Their only solution is to forgo brute-force solutions, and to try to be cleverer to weed out the honey-pots. But as only your server knows which addresses are honey-pots, and which are valid, and the spammers get no feedback, it's not clear how they'll find out.
Also it makes all mailing lists suspect. You can't buy millions of email addresses on the grounds that only 20% of them are OK. If 5% are honey-pots they can kill your existing valid addresses.
In a sense, you are using spammers themselves to tell you which emails are spam. Taking advantage of their main weakness : that they don't care who they send to.
- so it's been thought before, but no account of anyone actually trying it in practice.
My main problem is mailing lists - I'm sure that for most spam I get, my address has been harvested through various mailing list archives that haven't obfuscated the e-mails. I guess you could include the "fake" address in your sig. Or post under a different e-mail, and set up some filtering thing there.
Interesting chat. You could look at this approach from the other direction and see it as 'infecting' or 'marking' the spammers' email lists in a way that enables others to monitor the activity of the spammers. The spammers would constantly be trying to 'clean' their lists of these marker email addresses that give them away. The anti-spammers would be trying to trick the spammers' email-gathering techniques into picking up as many spiked email addresses as possible. We'd be spamming the spammers with spiked email addresses ;)
Figures that someone should have thought of it before. In a sense, it's not very different from matching against central repositories of known spam.
But at first glance at the stories linked, there are differences from my suggestion.
- Mine is done at the individual server level, and administrated by the people who administrate the servers, rather than by the individual or globally. That seems to be the right balance. The global systems need someone to tell the system that this is spam, first. Individuals require both their personal email and their personal honey-pot to be hit together. But imagine this being done instead at the level of Runtime or even Hotmail.
- Mine is more simplistic. Literally it says, anything sent to a honey-pot is deleted for everyone on this server, no question.
: Seems this should lead to bad false-positives; but as long as you never use the honey-pot as a real email address it isn't clear why real messages will ever get sent there. Friends stupid enough to cc their message to the honey-pot really don't deserve to be read.
I see the concern about mailing lists. Yeah, you can certainly put your honey-pot somewhere so that it's available to archive-harvesters. I also imagine that there's a lot of cross-sharing between spammers, so that as long as your honey-pot gets into the spam-system somewhere, it will percolate through, and soon lots of spammers' lists will have it.
It is strange no-one's tried it though ... wonder why?
I like to think of it as poisoning the spammers lists. (Or it is a bit like (T-cells?) in the immune system learning the shape of bacteria.) The important point is that the spammers would really need to do something radically different from normal spammer-behaviour to clean the lists.
- They wouldn't be able to promiscuously share lists
- They wouldn't be able to rely on numbers, i.e. more is better even if a proportion are dud.
- And they wouldn't get any feedback as to which addresses were in fact bogus.
Ran across your discussion of an EmailImmuneSystem / Honey Pot system for trapping spammers. We've been working on something similar to what you describe and just opened it up to the public. Works by allowing websites to install honey pot pages on their sites. When these pages are accessed they contact our servers and generate a unique, though seemingly non-extraordinary, email address. The addresses are designed so that they all point back to our mail servers. We watch for mail as it arrives and then publish the information about the original harvester for the benefit of the Internet community.
As I said, we're just getting started. As someone who has thought about this a lot, we'd appreciate any feedback you have. Information online at:
We already have honey pots installed on every continent (other than Antarctica, unfortunately). Think that after we get about 1,000 installed hpots we'll start to get meaningful data on a relatively realtime basis. To that end, if you know anyone else who'd be interested, please pass the information along.
Keep up the great work!
CEO, Unspam, LLC
Adjunct Professor of Law
John Marshall Law School
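The honey-pot page scheme described in the message above (a fresh, ordinary-looking address per page access, later traced back to the harvester) might be sketched as follows. This is only a guess at the general shape, not a description of how the service actually works. The `TrapRegistry` class, the name lists and the `mail.example.net` domain are all hypothetical.

```python
import secrets
import time

DOMAIN = "mail.example.net"  # hypothetical trap domain
FIRST = ["anna", "brian", "carol", "david"]
LAST = ["miller", "jones", "clark", "baker"]


class TrapRegistry:
    """Issues one unique address per page access and remembers who received it."""

    def __init__(self):
        self.issued = {}  # address -> (harvester_ip, issued_at)

    def issue(self, visitor_ip: str) -> str:
        # Toy name space: with these short lists there are only 1,440
        # possible addresses, so a real system would need a far larger pool.
        while True:
            local = (
                f"{secrets.choice(FIRST)}.{secrets.choice(LAST)}"
                f"{secrets.randbelow(90) + 10}"
            )
            address = f"{local}@{DOMAIN}"
            if address not in self.issued:
                self.issued[address] = (visitor_ip, time.time())
                return address

    def harvester_for(self, recipient: str):
        """IP that was handed this address, or None if it was never issued."""
        record = self.issued.get(recipient.lower())
        return record[0] if record else None
```

When mail later arrives for an issued address, `harvester_for` returns the IP of whoever loaded the page that displayed it.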
Hi Matthew. Thanks. This sounds interesting. I'll take a look. One initial thought, what information are you getting about the original harvester? The IP address? If you make this available publicly, will the harvesters be able to know that they've been caught and keep changing their address?
Other thought. Can this be adapted against WikiSpam and CommentSpam to create blacklists against wiki and blog spammers?
I'll certainly pass this on anyway. Thanks again for getting in contact.
Hi Phil –
One of the things we're planning for v.2 is to include a form on the honey pots to catch comment/wiki spammers. My hunch is that the same robots doing the harvesting are also posting the garbage on wikis, blogs, etc, but we'll see.
As you suspect, I think that harvesters will certainly adapt as it becomes apparent that their IP addresses have been listed. However, we think the nature of harvesting makes it slightly trickier to pass off to a zombie than it has been with email. And, if you're not using zombies, it's fairly difficult to get ahold of a large pool of IP addresses. Even if harvesters are rotating their IPs, if we have enough installed HPOTs we should be able to catch them fast enough that hopefully they will have a tough time adjusting. Only time will tell for sure.
Some sort of HTTP RBL based on harvesters' IP addresses is definitely in our plan. Need to get more HPOTs up and running in order for the data to be coming in fast enough for it to be worthwhile. I'll let you know as things progress.
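As a rough illustration of what a harvester IP list could look like on the consuming side, here is a minimal in-memory sketch. It assumes that listings expire after a while, so that rotated IPs age out; that is an assumption of this sketch, not something stated above. The `HarvesterBlocklist` class and the one-week default are invented for the example, and it does not implement any real RBL protocol.

```python
import ipaddress


class HarvesterBlocklist:
    """In-memory list of harvester IPs with a time-to-live per listing."""

    def __init__(self, ttl_seconds: float = 7 * 24 * 3600):
        self.ttl = ttl_seconds  # assumed default: one week
        self.last_seen = {}     # ip_address object -> time of last report

    def report(self, ip: str, now: float) -> None:
        """Record that this IP was caught harvesting at time `now`."""
        self.last_seen[ipaddress.ip_address(ip)] = now

    def is_listed(self, ip: str, now: float) -> bool:
        """True if the IP was reported within the last `ttl` seconds."""
        seen = self.last_seen.get(ipaddress.ip_address(ip))
        return seen is not None and now - seen <= self.ttl
```

A web server could call `is_listed` on each request and refuse to render email addresses, or serve only honey-pot ones, to listed clients.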
- Good discussion on evolutionary principles at work in email-viruses : http://blogs.apress.com/archives/000041.html via JoelSpolsky : http://www.joelonsoftware.com/items/2004/04/16.html
: Of course, the really funny thing would be if these viruses, as they try to mimic real, useful emails, actually discovered the survival strategy of doing real, useful, work. For example, including valuable information (perhaps blagged from online news-feeds), to disguise themselves as digests; or mining address-books to make introductions SocialSoftware-style. Maybe they could even evolve to replace the dumb users who allow them to spread :-)