First, try this from a command prompt:
grep -i "GoogleBot" c:\temp\logs\*.log | grep " / " | more
The output should be a list of log records, each containing a GoogleBot request.
Notice:
· My log files are located in c:\temp\logs
· I’m looking only for requests to the root (" / "); this filter is optional
Next, extract the column containing the requestor’s IP address; in my log file, it is column number 9:
grep -i "GoogleBot" c:\temp\logs\*.log | grep " / " | cut -f 9 -d " "
Lastly, sort, summarize and store the results to a local file:
grep -i "GoogleBot" c:\temp\logs\*.log | grep " / " | cut -f 9 -d " " | sort | uniq -c > c:\temp\googlebot-class-c.txt
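To see what the finished pipeline produces without touching real logs, here is a small sketch that runs the same steps against a made-up log file. The sample records, the field layout (arranged so the client IP falls in column 9, as in my logs), and the variable names logdir and counts are all assumptions for illustration; your own log format may put the IP in a different column.

```shell
#!/bin/sh
# Build a tiny, made-up log so the pipeline can be tried without real data.
# The field layout is an assumption chosen so the client IP is field 9.
logdir=$(mktemp -d)
cat > "$logdir/sample.log" <<'EOF'
2024-01-01 00:00:01 W3SVC1 10.0.0.5 GET / - 80 192.0.2.10 Mozilla/5.0+(compatible;+Googlebot/2.1) 200
2024-01-01 00:00:02 W3SVC1 10.0.0.5 GET / - 80 192.0.2.10 Mozilla/5.0+(compatible;+Googlebot/2.1) 200
2024-01-01 00:00:03 W3SVC1 10.0.0.5 GET /about - 80 192.0.2.11 Mozilla/5.0+(compatible;+Googlebot/2.1) 200
2024-01-01 00:00:04 W3SVC1 10.0.0.5 GET / - 80 198.51.100.7 Mozilla/5.0+(compatible;+Googlebot/2.1) 200
2024-01-01 00:00:05 W3SVC1 10.0.0.5 GET / - 80 203.0.113.9 Mozilla/5.0+(Windows+NT+10.0) 200
EOF

# Same steps as above: match the user agent, keep root requests,
# take field 9 (the client IP), then count how often each IP appears.
counts=$(grep -i "GoogleBot" "$logdir"/*.log | grep " / " | cut -f 9 -d " " | sort | uniq -c)

# Print one line per IP: the request count followed by the address.
printf '%s\n' "$counts"
rm -rf "$logdir"
```

The /about request and the non-GoogleBot request are filtered out, so only the two IPs that asked for the root as GoogleBot are counted. The sort step matters because uniq only collapses adjacent duplicate lines.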
Note: It’s possible, even probable, that some of the request headers are fabricated and the requests are not actually coming from Google.
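One way to weed out the fakes is to do a reverse DNS lookup on each IP and check the hostname it maps to. The sketch below covers only the hostname check. The function name is_google_hostname is hypothetical, the two domains are ones I understand Google to list for its crawlers (treat that list as an assumption and confirm it against Google's current documentation), and the sample hostname is made up.

```shell
#!/bin/sh
# Return success if a reverse-DNS hostname falls under a domain assumed to
# belong to Google's crawlers. The function name is hypothetical.
is_google_hostname() {
  name=${1%.}   # drop the trailing dot that DNS tools often print
  case "$name" in
    *.googlebot.com|*.google.com) return 0 ;;
    *) return 1 ;;
  esac
}

# Example with a made-up hostname; a real check would take the name from a
# reverse lookup of the IP and then confirm it resolves back to the same IP.
if is_google_hostname "crawl-192-0-2-10.googlebot.com."; then
  echo "hostname looks like a Google crawler"
fi
```

The pattern requires a dot before the domain, so a look-alike such as fakegooglebot.com does not pass. A hostname match on its own is not proof, since reverse DNS records can be set by whoever controls the IP range; the forward lookup back to the same IP is what closes that gap.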