Saturday, October 22, 2005

Advanced Wiki Spam

Today the chongqed wiki had its first big spam attack in a long time. Actually, I had even been missing the small-time spammers' attempts until yesterday.

RichardP and I both noticed the spam attack at about the same time. As I went to lock the wiki, he set WikiMinion cleaning. It turns out all I blocked was WikiMinion; the spammer had already moved on.

When I got around to looking at the attack more carefully I realized this spammer's program is more advanced than usual. Most of the pages spammed were newly created by him. But he was not just making up random page names or using his own keywords like most wiki spammers do. Every new page name he used was on topic for our wiki.

The secret is that each of those page names had been linked to from one of our pages. Some were spelling mistakes, many were just automatic CamelCase links, and others were links to pages that someone expected to write later but never did. He had to have crawled a large portion of the site to come up with all these page names.
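For wiki admins who want to see which names are exposed this way, here is a minimal sketch of listing linked-to pages that don't exist. It assumes the wiki text is available as a plain dictionary of page name to page text and that links are bare CamelCase words. The function name `dangling_links` and the regular expression are mine for illustration, not anything our wiki software provides.

```python
import re

# Assumption: a link is any word made of two or more capitalized runs,
# e.g. FrontPage or WikiSpam. Real wiki engines have more link forms.
CAMEL_CASE = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


def dangling_links(pages):
    """Return the set of CamelCase names that are linked to but have no page.

    pages: dict mapping page name -> page text (hypothetical input format).
    """
    wanted = set()
    for text in pages.values():
        for name in CAMEL_CASE.findall(text):
            if name not in pages:
                wanted.add(name)
    return wanted


pages = {
    "FrontPage": "See WikiSpam and SpamBlocking.",
    "WikiSpam": "Back to FrontPage.",
}
result = dangling_links(pages)
```

A sudden burst of new pages whose names all come from this list is worth a second look.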

When I first saw only a couple of edits at the top of the page, and the new page names seemed on topic, I briefly wondered if we had a new contributor. Scrolling down a bit settled that quickly. That is kind of sneaky, but not really effective, especially since he overdid it (22 spams). It would only work on an abandoned wiki, where you can make up any page names you want anyway. So good job on making spamming advances in an area that doesn't matter.

But the spamming software and its designer were also pretty stupid. He thought he was being smart by hiding his edits: he made another edit (from a different IP address) blanking the page while logged in, as if he were a legit user cleaning up the spam. Pretty sneaky, if it weren't so pointless. Apparently he missed the spammer classes on meta nofollow, noindex. Older revisions don't show up in search engines and thus do spammers no good on modern wikis.
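As a rough sketch of why this works, a wiki can emit a different robots meta tag depending on whether the page view is the current revision or an old one. The function name `robots_meta` and its argument are hypothetical; I am not quoting any particular wiki engine's code.

```python
def robots_meta(is_current_revision):
    """Return the robots meta tag for a page view.

    Hypothetical helper: only the current revision is offered to search
    engines; history views are marked noindex, nofollow.
    """
    if is_current_revision:
        return '<meta name="robots" content="index,follow">'
    return '<meta name="robots" content="noindex,nofollow">'


old_view = robots_meta(False)
current_view = robots_meta(True)
```

With something like this in place, spam that survives only in the page history earns the spammer nothing.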

He is also trying to hide his spam with the CSSHiddenSpam trick, which doesn't work on our kind of wiki.

Here is the spam (including tabs):

As you can see, he has also missed the concept of keywords. But I needed something for the database, and WTHP doesn't seem like that important a keyword. So after looking at his site, I chose to chongq him for DVD.

I am glad he hit us because now he is in our blacklist where he belongs. According to Google, there are already 151,000 pages that have "" on them. He has been busy.

If this is advanced wiki spam, I say keep up the good work, idiots!

your suggestion to solve this problem is not acceptable, if you close the div's for example, we will discover a new way to hide an keywords related texts and links :)
I am not suggesting getting rid of divs. Any allowed tag could be used like this; my solution is to block the use of the style attribute on any tag.
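To make that concrete, here is a minimal sketch of dropping the style attribute from every tag, using Python's standard `html.parser`. It assumes the wiki allows raw HTML in pages. The names `StyleStripper` and `strip_style` are made up for this example, and a real wiki would do this inside its own markup filter. Note that this sketch also discards HTML comments.

```python
from html import escape
from html.parser import HTMLParser


class StyleStripper(HTMLParser):
    """Re-emit HTML with every style attribute removed, on any tag."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _open_tag(self, tag, attrs, closer):
        parts = [tag]
        for name, value in attrs:
            if name == "style":  # the parser lowercases attribute names
                continue
            if value is None:
                parts.append(name)
            else:
                parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + closer)

    def handle_starttag(self, tag, attrs):
        self._open_tag(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._open_tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)


def strip_style(markup):
    parser = StyleStripper()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


spam = '<div style="display:none"><a href="http://example.com/">dvd</a></div>'
cleaned = strip_style(spam)
```

The links are still there after cleaning, but they are no longer hidden, so anyone reading the page sees the spam right away.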
we will discover a new way to hide an keywords related texts and links

Two hikers were being chased by a bear. As the bear closed in, one of them stopped to open his knapsack and pull out his jogging shoes. His friend said, "You'll never outrun that bear!" and the jogger replied, "I don't need to. I just have to outrun you."

The point you're missing, would-be spammer, is that all we need to do is put in just a bit of effort to make you work a whole lot harder, and you'll quickly move on to an easier target. There are plenty to be had.
