For some reason, my weblog became the target of hundreds of referrer spam hits from pornographic websites over the last week or so. I keep an eye on my referrer logs (a record of URLs that generated traffic to my site), and lately a bunch of URLs showed up which had no business being there. Some URLs were obviously pornographic, but there were one or two that looked innocent enough that when I clicked through to see who had linked to me, I got an eyeful. I really, really don’t need that.
So, I did some research. I didn’t want to get into a trap of having to hand-modify my .htaccess file or a whitelist or a blacklist file for obvious reasons: the universe of porn and poker sites is potentially infinite. I waste enough time on this blog anyhow!
Angsuman’s Referrer Bouncer looked good, but it doesn’t play well with wp-cache. Other well-documented tricks involved endlessly modifying my .htaccess file. Bad Behavior looked good, too, but I’ve already used that plugin, and disabled it because I saw occasions where it needlessly blocked legitimate access, requiring manual intervention.
So, I settled on and installed Referrer Karma. After the painless installation (it’s not anywhere near one-click; you do have to be careful and edit a file), I tested it by using one of the baddie referrers and tricking my Firefox browser into spoofing the referrer, and … success! It blocked my access. Then I went to a couple of my buddy-bloggers who link to me, tried to click through, and enjoyed more success. Checking the RK logfiles showed what happened: the bad referrers were added to a blacklist, and the good ones were added to a whitelist.
Referrer Karma is cleverly engineered. It requires no manual intervention on my part; it does everything automatically. When a page is requested, my server requests the referring (linking) page and checks whether my URL actually appears there. Apparently, RK even requests the javascript files to be sure that my link isn’t in some javascript widget on the site, or embedded in an iframe, or anything like that. Once my URL is found embedded in the remote page, the referrer is added to the whitelist and that page need not be checked again. If my URL is not found, the referrer is blocked. Under certain conditions, the IP is blocked, too.
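To make the check concrete, here is a rough sketch of the idea in Python. It is not Referrer Karma’s actual code (RK is a PHP script, and I haven’t reproduced its internals); the class name `ReferrerChecker`, the helper `fetch_page`, and the choice to treat a failed fetch as “link not found” are all my own assumptions for illustration. It also skips the javascript and iframe checks described above.

```python
from urllib.parse import urlparse
from urllib.request import urlopen


def fetch_page(url, timeout=5):
    """Download a page body as text (hypothetical helper, not RK's code)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


class ReferrerChecker:
    """Toy model of a verify-the-backlink referrer filter."""

    def __init__(self, my_host, fetch=fetch_page):
        self.my_host = my_host      # e.g. "myblog.example" (placeholder)
        self.fetch = fetch          # injectable so it can be tested offline
        self.whitelist = set()      # referrers confirmed to link to us
        self.blacklist = set()      # referrers that did not link to us

    def allow(self, referrer):
        """Return True if the request should be served, False if blocked."""
        if not referrer:
            return True             # no referrer header: nothing to verify
        if urlparse(referrer).netloc == self.my_host:
            return True             # internal navigation on my own site
        if referrer in self.whitelist:
            return True             # already verified, no second fetch
        if referrer in self.blacklist:
            return False
        try:
            body = self.fetch(referrer)
        except (OSError, ValueError):
            body = ""               # assumption: unreachable counts as "not found"
        if self.my_host in body:
            self.whitelist.add(referrer)
            return True
        self.blacklist.add(referrer)
        return False
```

Something like `ReferrerChecker("myblog.example").allow(referrer)` would then run once per incoming request. The point of the two sets is that each referring page is fetched at most once.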
There is some risk that the referring page sits behind a webmail client or a password-protected forum, where RK cannot fetch it. For that reason, RK comes with an already-extensive whitelist, and when a visitor arrives from one of those protected sites, they just need to click on the link in the error page to pass through to my site. In one word: Nifty.
There is also some risk in slowing down my page delivery, defeating the purposes of wp-cache. I’ll have to monitor that and see if it becomes a problem. And there’s some exposure in the bandwidth department: I could be subject to a virtual denial-of-service attack just by being hit with so many new referrers that RK has to request an endless stream of pages to check. That could happen, so, I’ll have to monitor my bandwidth utilization as well.
But, all in all, not bad for a little research and a few minutes’ effort.
[tags]BlogRodent, referrer-spam, referrer-karma, bad-behavior, whitelist, blacklist, wordpress, wordpress-plugins, plugins[/tags]
I receive about two or three reports of someone being blocked daily. Every single one of those has turned out to be something specific to that user’s environment — just like it says on the error page — like a poorly configured, malicious or abused proxy server, or the fact that their system is chock full of viruses sending you exactly the spam you’re trying to get rid of.
Not a single one has turned out to be a truly “legitimate” browser access. The very few which indeed were actual human beings were, as you are probably well aware, highly suspicious, and for good reason.
You claim legitimate accesses were blocked, and yet didn’t follow the very simple directions on that page you linked to? Shame on you.
> but it doesn’t play well with wp-cache
It works with wp-cache. However, because of the architecture of wp-cache, the plugin (any plugin, for that matter) is not invoked for a cached page, so it can only bounce when the plugin is invoked. In effect, that means it appears to be partially working. As it stands, no plugin-based solution will be able to serve your needs with wp-cache and its present architecture. The only way is to “continuously modify” .htaccess, either directly or through some nice UI.
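The cache short-circuit described here can be modeled in a few lines of Python. This is only a toy illustration, not wp-cache’s real code; the function `serve` and its parameters are hypothetical names. Once a page is in the cache, it is returned before any plugin gets to look at the referrer.

```python
def serve(path, referrer, cache, plugins, render):
    """Toy request handler: a cache hit returns before plugins run."""
    if path in cache:
        return cache[path]          # plugins are never consulted here
    for plugin in plugins:
        if plugin(referrer):        # a plugin returns True to block
            return "403 blocked"
    page = render(path)
    cache[path] = page              # later requests skip the plugin loop
    return page
```

In this model, a spam referrer hitting an already-cached page gets the cached copy, which matches the “partially working” behavior.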
I should also note that something’s terribly wrong with your Referrer Karma installation. It’s going into a nice endless loop and never actually displaying the blog.
This isn’t going to please people who come to your site from links to it, should they happen to run across one.
Thanks, Michael and Angsuman, for your quick responses to my blog post.
Yes, Michael, I read your blog post on dealing with blocked accesses. As I recall, I also decided to stop using Bad Behavior because Akismet provided enough protection that I felt I didn’t need Bad Behavior as much. At the time my chief problem was comment spam, and that was my reason for initially installing BB. When Akismet worked, I felt I didn’t need two plugins performing essentially the same service. Akismet does come with its own set of problems–notably false positives–but I haven’t experienced that problem yet. I apologize if you felt I gave your plugin a bad report–it worked fine, and I used it for some time, and enjoyed success with it. However, I had some friends who could not get through, and I had no control over their proxying environment, so I had to disable it several times in order for those folks to get through. In the end, Akismet provided what I needed without the extra protection BB provided.
Now that I was faced with the new problem of referrer spam, I wanted a solution that at least let me know when users were getting blocked–and Referrer Karma does that. When I saw IOERROR in the blocked logs, I immediately whitelisted it. Apparently, when RK attempted to access your aggregator feed, it didn’t find my content, and all subsequent clickthroughs from your domain got 302’d. Sorry about that.
Fortunately, I have been completely spam referrer free for the last 24 hours, and GrabPerf shows no obvious impact to site responsiveness.
Angsuman, I seem to have gotten RK to work with wp-cache with the help of the mclude wrapper for my RK PHP functions. It’s still too early to tell, and I may experience more problems requiring the removal of RK, but I suspect (now) that the mclude wrapper would work for your plugin as well.
Again, for both of you, I’m sorry if it seemed I was dissing your plugins. I appreciate the WordPress community a great deal–this is a great publishing and authoring platform, and I’m indebted to the many plugin authors like you two who tirelessly give away the fruits of your intellect and labor to people who all-too-rarely say thanks and all-too-easily criticize your labor.
Thanks for your work on our behalf, and thanks for stopping by.
Regards,
Rich.
BlogRodent