Splogs and bots

There seems to be no end to the constant barrage of blogspot spam blogs, or 'splogs' as Mark Cuban calls them. I have to modify the rules in my .htaccess file almost daily. These splogs wreak havoc on your bandwidth. Ain't no free lunch here. Yet another reason why I despise blogspot and consider it to be the scum of the earth. If I could simply get the datacenter to drop blogspot traffic at their router I would be very happy. The only problem is the 1% of that traffic that is legitimate (I happen to correspond with a couple of those blogspot types). Google really needs to tighten the reins on the proliferation of these ghost accounts.
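For the curious, here is a rough sketch of how those daily .htaccess rules could be generated rather than typed by hand. It assumes the blocking is done on the Referer header with mod_rewrite, which is only one way to do it. The function name and the sample hostnames are made up for illustration.

```python
import re

# Accept only plain hostnames so nothing odd ends up inside the regex.
_HOST_RE = re.compile(r"^[a-z0-9-]+(\.[a-z0-9-]+)+$")


def referer_block_rules(hosts):
    """Return mod_rewrite lines that answer 403 to requests referred by `hosts`."""
    cleaned = sorted({h.strip().lower() for h in hosts if h.strip()})
    for host in cleaned:
        if not _HOST_RE.match(host):
            raise ValueError(f"not a plain hostname: {host!r}")
    if not cleaned:
        return []

    lines = ["RewriteEngine On"]
    for i, host in enumerate(cleaned):
        # Every condition except the last is OR-ed with the one after it.
        flags = "NC" if i == len(cleaned) - 1 else "NC,OR"
        pattern = "^https?://" + host.replace(".", r"\.") + "(/|$)"
        lines.append(f"RewriteCond %{{HTTP_REFERER}} {pattern} [{flags}]")
    # "-" leaves the URL alone, [F] sends 403 Forbidden.
    lines.append("RewriteRule .* - [F]")
    return lines


if __name__ == "__main__":
    # Hypothetical splog hostnames, not real offenders.
    for line in referer_block_rules(["spam-one.blogspot.com", "spamtwo.blogspot.com"]):
        print(line)
```

Listing individual hosts instead of all of blogspot.com is what keeps that legitimate 1% getting through.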

Another problem that I have with the blogspot setup is that they still do not have an integrated comment/trackback mechanism. For whatever reason, they still use what appear to be third-party solutions (e.g. Haloscan). It could be that Google owns Haloscan, but the Blogger API has made no provision for a seamless, well-integrated trackback solution. The IP addresses of blogspot and Haloscan don't match at all. So if you're running RSBL against known open relays and you receive a trackback ping originating from a blogspot domain, it will actually present two different IP addresses. So if it's _really_ legitimate traffic, it will never see my blog. Pretty funny stuff. Actually, RSBL works quite well, as I have had very few false positives.
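A minimal sketch of the kind of lookup involved, assuming the blocklist is an ordinary DNS-based one: reverse the IPv4 octets, append the list's zone, and see whether the name resolves. The zone `dnsbl.example.org` is a placeholder, not a real list, and the resolver is passed in so the logic can be tried without touching the network.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for `ip` in the blocklist `zone`."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on anything else
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the blocklist answers for `ip`.

    A failed lookup counts as "not listed", so a DNS outage lets pings
    through instead of blocking everyone.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address, the zone is hypothetical.
    print(dnsbl_query_name("192.0.2.10", "dnsbl.example.org"))
```

With a trackback ping you would run this against the address the ping actually came from, which for a blogspot blog using Haloscan is not the address the blog itself resolves to.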

Another interesting development is the fairly new Googlebot 2.1, which seems to disobey everything in robots.txt. It is an extremely aggressive bot, and it will index absolutely everything in its path. It seems to love indexing content that is of no value to search engines. Could this be another winning Google strategy? Who knows. It appears that others have asked the same question.
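One way to back up a claim like this is to run the paths a bot requested (pulled from the access log) against your own robots.txt and see which ones it should have skipped. A small sketch using Python's standard `urllib.robotparser`; the robots.txt content and the paths below are invented for the example.

```python
from urllib.robotparser import RobotFileParser


def disallowed_hits(robots_txt, user_agent, requested_paths):
    """Return the requested paths that robots.txt tells `user_agent` to avoid."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in requested_paths if not parser.can_fetch(user_agent, p)]


if __name__ == "__main__":
    # Hypothetical robots.txt and request paths.
    robots = "User-agent: *\nDisallow: /cgi-bin/\n"
    paths = ["/cgi-bin/mt/mt-tb.cgi", "/archives/2006/03/"]
    for path in disallowed_hits(robots, "Googlebot/2.1", paths):
        print("fetched despite Disallow:", path)
```

If the list comes back empty, the bot is within the rules and the rules are what need tightening.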

Bottom line is that I'm going to deploy a script to automatically modify .htaccess in docroot. WordPress users already have a means to dynamically build the .htaccess file. MT headz will have to hack as appropriate. Not a big deal tho.
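A sketch of what such a script might look like, not the one actually deployed. It assumes the generated rules live between two marker comments so that hand-written directives elsewhere in .htaccess are left alone. The marker text and function names are made up.

```python
import os

# Hypothetical markers fencing off the generated part of the file.
BEGIN = "# BEGIN splog-block (generated)"
END = "# END splog-block (generated)"


def replace_managed_block(htaccess_text, rule_lines):
    """Return `htaccess_text` with the marked block replaced by `rule_lines`."""
    block = [BEGIN, *rule_lines, END]
    lines = htaccess_text.splitlines()
    if BEGIN in lines and END in lines and lines.index(BEGIN) < lines.index(END):
        start, end = lines.index(BEGIN), lines.index(END)
        lines[start:end + 1] = block
    else:
        # No block yet: append one, separated by a blank line.
        if lines and lines[-1].strip():
            lines.append("")
        lines.extend(block)
    return "\n".join(lines) + "\n"


def update_htaccess(path, rule_lines):
    """Rewrite the file at `path` in place; return True if it changed."""
    try:
        with open(path, encoding="utf-8") as fh:
            current = fh.read()
    except FileNotFoundError:
        current = ""
    updated = replace_managed_block(current, rule_lines)
    if updated == current:
        return False
    # Write to a temp file first so Apache never reads a half-written file.
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        fh.write(updated)
    os.replace(tmp, path)
    return True
```

Run from cron with a fresh rule list, this only touches the file when the rules actually change.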



    About this Entry

    This page contains a single entry by AG published on March 11, 2006 12:06 AM.
