Wednesday, February 17, 2010

DNS Settings for Linux

The file that controls client DNS resolution and the default search domain is /etc/resolv.conf

Edit this file:
vi /etc/resolv.conf

so that it looks like this:
search relevantads.com
nameserver 4.2.2.1
nameserver 4.2.2.4


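No restart should be required: programs read this file when they start, so only long-running processes may keep using the old settings until they are restarted.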

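Replace "relevantads.com" with your local domain name. The resolver appends the search domain to short names, so you can ping a machine by its host name alone instead of its fully qualified domain name (for a hypothetical host named web01, "ping web01" rather than "ping web01.relevantads.com").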

The nameserver entries should ideally be the DNS servers provided by your ISP. The 4.2.2.x addresses above are widely used public resolvers operated by Level 3, which may block abusive traffic.
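
As a quick sanity check after editing, the sketch below lists the search domains and nameservers found in a resolv.conf-style file. The function name show_resolver_config is made up for illustration, and the sketch assumes a POSIX shell with awk available.

```shell
# Print the search domains and nameservers found in a resolv.conf-style file.
# Usage: show_resolver_config [path]   (defaults to /etc/resolv.conf)
# Note: show_resolver_config is a hypothetical helper, not a standard tool.
show_resolver_config() {
    awk '
        $1 == "search"     { for (i = 2; i <= NF; i++) print "search domain: " $i }
        $1 == "nameserver" { print "nameserver: " $2 }
    ' "${1:-/etc/resolv.conf}"
}
```

Running show_resolver_config with no argument reads /etc/resolv.conf; comment lines and other options in the file are ignored.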

Tuesday, February 02, 2010

Finding Googlebot IP Addresses In IIS Server Logs

I’ve put together a command pipeline that finds all references to Googlebot (case-insensitive) in our server logs, extracts the source IP addresses, and summarizes the hit count per address. To do this on Windows with your IIS log files, you will need the GNU tools grep, cut, sort, and uniq (the last three are part of GNU CoreUtils).

First, try this from a command prompt:

grep -i "GoogleBot" c:\temp\logs\*.log | grep " / " | more

You should see a list of log records, each containing a Googlebot request.

Notice:

· My log files are located in c:\temp\logs

· I’m looking only for requests to the root (" / "); this filter is optional

Next, extract the column containing the requester’s IP address; in my log files it is column number 9 (check the #Fields line at the top of your own logs to confirm the position of c-ip):

grep -i "GoogleBot" c:\temp\logs\*.log | grep " / " | cut -f 9 -d " "

Lastly, sort, summarize and store the results to a local file:

grep -i "GoogleBot" c:\temp\logs\*.log | grep " / " | cut -f 9 -d " " | sort | uniq -c > c:\temp\googlebot-class-c.txt
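
The pipeline above hard-codes column 9, which only holds for one particular logging configuration. As a rough alternative, the sketch below reads the column positions of c-ip and cs(User-Agent) from each log's #Fields header. It assumes W3C extended format logs, a POSIX shell with awk (for example on a Linux machine the logs were copied to), and a made-up function name, count_googlebot_ips. It does not apply the optional " / " filter.

```shell
# Count requests per client IP whose user agent mentions Googlebot.
# Column positions come from each log's "#Fields:" header, not a fixed number.
# Usage: count_googlebot_ips file.log [more.log ...]
# Note: count_googlebot_ips is a hypothetical helper name.
count_googlebot_ips() {
    awk '
        { sub(/\r$/, "") }                  # tolerate Windows line endings
        /^#Fields:/ {
            # Field names start at $2, so data column = header position - 1.
            ip = ua = 0
            for (i = 2; i <= NF; i++) {
                if ($i == "c-ip")           ip = i - 1
                if ($i == "cs(User-Agent)") ua = i - 1
            }
            next
        }
        /^#/ { next }                       # skip other header directives
        ip && ua && tolower($ua) ~ /googlebot/ { hits[$ip]++ }
        END { for (addr in hits) print hits[addr], addr }
    ' "$@" | sort -rn
}
```

Each output line is a hit count followed by an IP address, busiest address first. IIS writes spaces inside the user agent as plus signs, so the user agent stays in a single space-delimited column.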

Note: It’s possible, even likely, that some of these requests carry a forged Googlebot user agent and do not actually come from Google.
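
One way to weed out forged entries is the reverse-then-forward DNS check that Google describes for verifying its crawler: resolve the IP to a host name, confirm the name is under googlebot.com or google.com, then resolve that name and confirm it maps back to the same IP. The sketch below is only an illustration of that idea. The function names are made up, it assumes a POSIX shell with the host utility installed and network access, and the parsing assumes the usual "domain name pointer" and "has address" wording in host output.

```shell
# Succeeds if the host name falls under googlebot.com or google.com.
# A trailing dot (as printed by reverse lookups) is ignored.
is_google_hostname() {
    case "${1%.}" in
        *.googlebot.com|*.google.com) return 0 ;;
        *) return 1 ;;
    esac
}

# Reverse-resolve an IP, check the domain, then forward-resolve the name
# and confirm it maps back to the same IP. Requires network access.
# Both function names are hypothetical helpers for this sketch.
verify_googlebot_ip() {
    name=$(host "$1" | awk '/domain name pointer/ { print $NF; exit }')
    [ -n "$name" ] && is_google_hostname "$name" || return 1
    host "$name" | awk -v ip="$1" \
        '/has address/ && $NF == ip { found = 1 } END { exit !found }'
}
```

With the results file from the previous step, each address in the second column could be passed to verify_googlebot_ip, keeping only those for which it succeeds.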
