Saturday, March 26, 2005

Spamming Experiment

Kasia at unix-girl.com decided to run a spamming experiment on her blog. She posted a couple spams to her own blog and waited to see what would happen. In less than 24 hours she received 356 more spams.

We didn't test it as scientifically, but Manni and I discovered the same thing on our wiki. We keep a lot of spammer URLs there for informational purposes and discussion (as text only, so Google doesn't see them as links). I also had a page where I was holding loads of spam taken from many spammed sites before I got around to studying and chongqing them. We then started to see spammers arrive at those pages and spam them, with referrers showing a Google search for a specific spammer domain, often not their own. Those temporary pages have turned into really good spam honeypots for us.
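For anyone who wants to watch for the same pattern in their own logs, here is a rough sketch of the referrer check. It is not what we run on our wiki; the function name and the domain list are made up for illustration, and it only recognizes google.com hosts with the search terms in the `q` parameter, so country-specific Google domains would need to be added.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical list of spammer domains you are tracking (placeholders).
KNOWN_SPAM_DOMAINS = {"spam-example.test", "pills-example.test"}


def is_spammer_search_referrer(referrer, spam_domains=KNOWN_SPAM_DOMAINS):
    """Return True if the referrer is a Google search for a tracked spammer domain."""
    parsed = urlparse(referrer)
    host = (parsed.hostname or "").lower()
    # Simplification: only google.com and its subdomains are treated as Google.
    if not (host == "google.com" or host.endswith(".google.com")):
        return False
    # Search terms are assumed to be in the "q" query parameter.
    query_terms = " ".join(parse_qs(parsed.query).get("q", [])).lower()
    return any(domain in query_terms for domain in spam_domains)
```

A visitor who shows up with such a referrer and then tries to edit the page is a good candidate for closer inspection before the edit is accepted.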

When a spammer can find a page with existing spam through Google, they know the site is probably not well monitored or cleaned, so they know it's a good place to spam. If you have a blog, wiki, or guestbook, it is important to clean up any spam you get. If you don't, you are just inviting more and more spam.

Comments:
If you leave comment spam around, there's the possibility that Google might penalize you for linking to bad neighborhoods. Although, I have to assume that they should be able to differentiate between a link created by the blogger and one inserted into comments.
I doubt it. If Google were able to tell which sites were in a "bad neighborhood", they would ban those sites. That is the whole problem with Google. They are based heavily (if not totally) on interlinking between sites. Interlinks between related sites are better for PageRank, but Google can't really punish a site for linking outside its neighborhood, even if it is to a bad one. Supposedly, sites that have a large number of outgoing links are penalized, but I have not seen evidence of that in dealing with spammers.

Differentiating links by the blogger from links in the comments would really help. That is why they came up with the rel=nofollow attribute. Having Google detect comments in all types of blog software is not easily workable; there are too many versions and customizations of each blog package. With rel=nofollow, the blog software designers can mark the comment links themselves.
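As a rough illustration of what the blog software has to do, here is a minimal sketch that forces rel="nofollow" onto every link in a comment before it is displayed. It is not taken from any particular blog package; the class and function names are made up, and it silently drops HTML comments and doctype declarations, which is usually acceptable for user-submitted text.

```python
from html import escape
from html.parser import HTMLParser


class NofollowRewriter(HTMLParser):
    """Re-emit HTML, forcing rel="nofollow" on every <a> tag."""

    def __init__(self):
        # Keep character references as written so they can be re-emitted unchanged.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _render(self, tag, attrs, closing=">"):
        if tag == "a":
            # Drop any rel the commenter supplied and replace it with nofollow.
            attrs = [(k, v) for k, v in attrs if k != "rel"]
            attrs.append(("rel", "nofollow"))
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in attrs
        )
        self.out.append(f"<{tag}{rendered}{closing}")

    def handle_starttag(self, tag, attrs):
        self._render(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._render(tag, attrs, closing=" />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def add_nofollow(comment_html):
    """Return comment_html with rel="nofollow" set on all of its links."""
    parser = NofollowRewriter()
    parser.feed(comment_html)
    parser.close()
    return "".join(parser.out)
```

The blogger's own post body would skip this step, which is exactly the distinction between author links and commenter links that the search engine cannot easily make on its own.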

But many bloggers don't like the nofollow idea because then no link a commenter posts receives any PageRank gain. Currently bloggers have a lot of power in manipulating PageRank, which is both good and bad.

The other problem is that most spammers don't know to look for it, or for other kinds of PageRank-denying methods, before spamming. Unless it becomes very widespread, it won't ever be highly effective.

For wikis it's not a good solution at all. On a wiki there is no difference between the initial poster and commenters as there is with blogs; everyone is just an editor. Wikipedia is using it anyway for all links because their spam problem was just so great. I don't know if it has helped them much or not.
In fact, Wikipedia doesn't use the rel=nofollow attribute on links, partly because they feel they don't have too big a problem with spam, and they feel big enough to deal with it as and when it comes. (The wiki software has it enabled by default, but Wikipedia just disables it.)

Tom