
March 11, 2006

Splogs and bots

There seems to be no end to the constant barrage of blogspot spam blogs, or 'splogs' as Mark Cuban calls them. I have to modify the rules in my .htaccess file almost daily. These splogs play havoc with your bandwidth. Ain't no free lunch here. Yet another reason why I despise blogspot and consider it to be the scum of the earth. If I could simply get the datacenter to drop blogspot traffic at their router I would be very happy. The only problem is the 1% of that traffic that is legitimate (I happen to correspond with a couple of those blogspot types). Google really needs to tighten the reins on the proliferation of these ghost accounts.

Another problem that I have with the blogspot setup is that they still do not have an integrated comment/trackback mechanism. For whatever reason, they still use what appear to be third-party solutions (i.e., Haloscan). It could be that Google owns Haloscan, but the Blogger API has not made provisions for a seamless, well-integrated trackback solution. The IP addresses of blogspot and Haloscan don't match at all. So if you're running RSBL against known open relays and you receive a trackback ping claiming to originate from a blogspot domain, you actually end up with two different IP addresses. So if it's _really_ legitimate traffic it will never see my blog. Pretty funny stuff. Actually, RSBL works quite well, as I have had very few false positives.
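For anyone wondering what a blocklist check looks like, here is a rough sketch. It assumes the list is a plain DNS-based one, where you reverse the octets of the sender's IPv4 address, append the list's zone, and treat any successful lookup as "listed". The zone dnsbl.example.org and both function names are made up for illustration; this is not the actual setup on this blog.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Return the DNS name to look up for an IPv4 address in a DNS blocklist."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return reversed_octets + "." + zone


def is_listed(ip, zone="dnsbl.example.org"):
    """True if the blocklist zone (hypothetical default) has a record for ip."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No record for this name: the address is not on the list.
        return False
```

The catch described above falls out of this: the check runs against the IP that actually sent the ping, not against the domain named in the ping.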

Another interesting development is the fairly new Googlebot 2.1, which seems to disobey everything in robots.txt. It is an extremely aggressive bot, and it will index absolutely everything in its path. It seems to love indexing content that is of no value to search engines. Could this be another winning Google strategy? Who knows. It appears that others have asked the same question.
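Before blaming the bot, it is worth confirming that the rules really say what you think they say. A minimal sketch using Python's standard urllib.robotparser: feed it the text of robots.txt plus a list of paths pulled from the access log, and it reports which ones the given user agent should not have requested. The function name and the sample paths in the usage note are hypothetical.

```python
from urllib import robotparser


def disallowed_hits(robots_txt, paths, agent):
    """Return the paths that robots_txt forbids for the given user agent."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]
```

For example, with a robots.txt that disallows /cgi-bin/ for Googlebot, calling disallowed_hits(rules, ["/cgi-bin/mt.cgi", "/archives/"], "Googlebot/2.1") should flag only the first path. If the log shows the bot fetching it anyway, the complaint stands.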

Bottom line is that I'm going to deploy a script to automatically modify .htaccess in docroot. WordPress users already have a means to dynamically build the .htaccess file. MT headz will have to hack as appropriate. Not a big deal tho.
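Roughly what I have in mind, as a sketch and not the finished script: keep the generated rules between two marker comments so the script can replace its own block on every run without touching anything else in the file. It assumes mod_rewrite is available and that the splogs are blocked by referrer; the marker text and function names are made up for this example.

```python
import re

BEGIN = "# BEGIN splog-block"
END = "# END splog-block"


def render_block(hosts):
    """Build a mod_rewrite block that answers 403 to the given referrer hosts."""
    hosts = sorted(set(hosts))
    lines = [BEGIN]
    if hosts:
        lines.append("RewriteEngine On")
        for i, host in enumerate(hosts):
            # Every condition except the last is chained with OR.
            flags = "[NC]" if i == len(hosts) - 1 else "[NC,OR]"
            escaped = host.replace(".", "\\.")
            lines.append(
                "RewriteCond %{HTTP_REFERER} ^https?://([^/]+\\.)?"
                + escaped + " " + flags
            )
        lines.append("RewriteRule .* - [F]")
    lines.append(END)
    return "\n".join(lines)


def update_htaccess(text, hosts):
    """Return .htaccess text with the managed block replaced or appended."""
    block = render_block(hosts)
    pattern = re.compile(re.escape(BEGIN) + ".*?" + re.escape(END), re.DOTALL)
    if pattern.search(text):
        # A function as replacement keeps the backslashes in block literal.
        return pattern.sub(lambda m: block, text, count=1)
    sep = "" if text == "" or text.endswith("\n") else "\n"
    return text + sep + block + "\n"
```

To use it, read the current .htaccess into a string, pass it through update_htaccess with the day's list of offending hosts, and write the result back (ideally to a temp file first, then rename, so Apache never sees a half-written file).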

Posted by AG at March 11, 2006 12:06 AM