robots using bandwidth
I have two accounts with a host: 100 MB space and 1 GB bandwidth. They are adequate to run three Nuke sites, as I use photobucket.com to host my images.
One account exceeded its bandwidth last month after only ten days, and the second account is presently sitting close to 90% after about half a month.
I have altered a line in my meta.php file to read:
Code:
echo "<META NAME=\"ROBOTS\" CONTENT=\"INDEX, NOFOLLOW\">\n";
which I hope will help prevent this in the future, even though I would have preferred not to, since "Google is your friend" is a popular internet saying.
In AWStats the robot is unnamed (only referred to as "crawl"), so I cannot ban it using robots.txt unless I ban all robots with "crawl" in their names.
Any ideas?
I understand that the REVISIT-AFTER tag is useless, as apparently only one robot uses it.
I checked "web ftp stats/awstats" and saw that I had a lot of "Traffic not viewed" (seven times more "not viewed" than "viewed").
"Unknown robot (identified by 'crawl')" accounted for 95% of this. I trawled through my "raw access log" to the time of the bot's "last visit" and found that "Feedster Crawler" had checked in with robots.txt and loaded my backend.php and a few articles.
I've subsequently added Feedster Crawler to my list of banned bots:
Code:
User-agent: Feedster Crawler
Disallow: /
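For anyone wanting to repeat the log trawl that turned up Feedster Crawler, here is a rough sketch in Python that totals the bytes served per user-agent string. It assumes the raw access log is in the Apache "combined" format, which may not match every host. The sample lines, IP addresses, dates, sizes, and user-agent strings are made up for illustration, and `bytes_by_agent` is just a name chosen here.

```python
import re
from collections import Counter

# Assumes the Apache "combined" log format; adjust if the host logs differently.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?P<size>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)


def bytes_by_agent(lines):
    """Sum response bytes per user-agent string, skipping lines that do not parse."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        size = match.group("size")
        # A "-" size means no body was sent (for example a 304 response).
        totals[match.group("agent")] += 0 if size == "-" else int(size)
    return totals


# Hypothetical sample entries, not taken from a real log.
SAMPLE_LOG = [
    '192.0.2.1 - - [15/Mar/2005:10:00:00 +0000] "GET /robots.txt HTTP/1.0" 200 120 "-" "Feedster Crawler/1.0"',
    '192.0.2.1 - - [15/Mar/2005:10:00:01 +0000] "GET /backend.php HTTP/1.0" 200 8000 "-" "Feedster Crawler/1.0"',
    '192.0.2.7 - - [15/Mar/2005:10:05:00 +0000] "GET /index.php HTTP/1.1" 200 15000 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.1 - - [15/Mar/2005:10:06:00 +0000] "GET /modules.php?name=News HTTP/1.0" 304 - "-" "Feedster Crawler/1.0"',
]

if __name__ == "__main__":
    # Print the heaviest consumers first.
    for agent, total in bytes_by_agent(SAMPLE_LOG).most_common():
        print(f"{total:>8}  {agent}")
```

To run it against a real log, pass an open file instead of `SAMPLE_LOG`, for example `bytes_by_agent(open("access.log"))`, where the file name is whatever the host calls its raw log.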
With my other domain I did the same and added:
Code:
#Disallow IBM Crawler
User-agent: http://www.almaden.ibm.com/cs/crawler
Disallow: /
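A quick way to sanity-check rules like these before uploading them is to feed them to the robots.txt parser in Python's standard library. This is only a sketch: it shows how Python's `urllib.robotparser` reads the rules, which is not necessarily how any particular crawler matches its own name, and robots.txt only helps with bots that choose to obey it. The user-agent strings passed in below are assumed examples, and `is_allowed` is a helper name made up here.

```python
from urllib.robotparser import RobotFileParser

# The two ban rules from this thread, combined into one file for the check.
ROBOTS_TXT = """\
User-agent: Feedster Crawler
Disallow: /

#Disallow IBM Crawler
User-agent: http://www.almaden.ibm.com/cs/crawler
Disallow: /
"""


def is_allowed(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if Python's parser would let user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Assumed user-agent strings; check the raw access log for the real ones.
    for agent in ("Feedster Crawler/1.0", "Googlebot/2.1"):
        verdict = "allowed" if is_allowed(agent, "/backend.php") else "blocked"
        print(f"{agent}: {verdict}")
```

Under these assumptions the Feedster agent is blocked from backend.php while other bots stay allowed, so Google is not shut out by these rules.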