June 14, 2004

Server stuff

In part because of the recent organizational turmoil here, I started poking around in my server logs again (in ways I haven't done in quite some time). What have I learned? Nothing particularly thrilling, but I did get to the bottom of a couple of questions that had been bugging me.

Starting on Memorial Day I began seeing a lot of daily referrals linking straight back to Yahoo!, which seemed odd. I figured my hack of a parsing job was screwed up and moved on. Since they're still happening and I was digging around anyway, I decided to figure it out. How are all these referrals happening? Simple: they're not. I seem to be getting hit by a bunch of sites that act as redirect agents for some porn site in the Netherlands, which for some odd reason is using Yahoo! as the referrer. Weird stuff... I don't see the point of it all (unless it is to cause confusion and eventually make us go look; done that, now go away please).

There is also someone (until now known only as unknown.level3.net) who is pulling my RSS 1.0 feed every 15 to 35 minutes (it's somewhat erratic but seems to be on a schedule of slightly less than 20 minutes). The user agent is "Mozilla/5.0 [en] (Windows NT 5.0, U)", so I figured it's someone running a custom plug-in of some sort, and I just wanted to know where from. Since I usually have hostname lookups on (I'm not getting enough traffic to care about distributing name lookup load elsewhere; as a friend would put it, 'get over with it!'), I decided to create a temporary new log, if possible, and check the IP addresses directly. Apache's mod_log_config had what I needed: a format string, %a, that logs the IP address without turning off HostnameLookups. I created a new format (see below), added a temporary log file via CustomLog with the ipandhost format in httpd.conf, and restarted (sudo apachectl graceful).

LogFormat "%a %h" ipandhost
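Paired with its CustomLog line, the temporary setup looks roughly like this. The log file path is a placeholder for illustration, so substitute whatever suits your install.

```apache
# Temporary log: client IP (%a) next to the resolved host name (%h).
# %h only shows a name when HostnameLookups is on; otherwise it is the IP too.
LogFormat "%a %h" ipandhost

# Assumption: the path below is hypothetical, not the one actually used.
CustomLog /var/log/httpd/ipandhost_log ipandhost
```

Each request then writes one line with the address followed by the host name, which is enough to match unknown.level3.net to an IP.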

A few minutes later I had an answer, removed the temporary log entry and restarted again. It seems that someone up in the Bay Area (close to San Francisco, judging from the traceroute) is my unknown visitor. It might be fun at some point to see if I can do IP-based redirection and send a custom feed that helps to spark conversation.
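One way the IP-based redirection might be done is with mod_rewrite. This is only a sketch under assumptions: the address is a documentation-only stand-in for the visitor's real IP, and both feed paths are made-up names.

```apache
# Assumes mod_rewrite is loaded and this sits in the server config (httpd.conf).
RewriteEngine On

# Assumption: 192.0.2.10 is a placeholder address, not the real visitor.
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.10$

# Assumption: /index.rdf and /custom-feed.rdf are hypothetical feed paths.
# Requests for the normal feed from that one address get a temporary redirect.
RewriteRule ^/index\.rdf$ /custom-feed.rdf [R=302,L]
```

Everyone else keeps getting the normal feed, since the condition only matches the single address.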

Posted by Dave at June 14, 2004 10:47 PM
Comments

The "point" of the referral spam redirecting through Yahoo's servers is to take advantage of Google. A lot of bloggers "allow" Google to spider their referral logs, and/or post the "last ten referrers" on the front page of their site. By referral spamming, the porn site in the Netherlands attempts to increase its Google ranking.

Posted by: Jeff on June 19, 2004 08:03 AM