source: web/trunk/www/robots.txt

Last change on this file was 170, checked in by Sam Hocevar, 14 years ago

Add a robots.txt.

  • Property svn:keywords set to Id
File size: 1.1 KB
# $Id: robots.txt 170 2009-12-03 22:24:18Z sam $

# Do not crawl CVS and .svn directories (they are 403 Forbidden anyway)
User-agent: *
Disallow: /CVS
Disallow: /.svn
Disallow: /.git
# Prevent excessive search engine hits
Disallow: /cgi-bin/trac.cgi
Disallow: /log

# "This robot collects content from the Internet for the sole purpose of
# helping educational institutions prevent plagiarism. [...] we compare
# student papers against the content we find on the Internet to see if we
# can find similarities." (http://www.turnitin.com/robot/crawlerinfo.html)
#  --> fuck off.
User-Agent: TurnitinBot
Disallow: /

# "NameProtect engages in crawling activity in search of a wide range of
# brand and other intellectual property violations that may be of interest
# to our clients." (http://www.nameprotect.com/botinfo.html)
#  --> fuck off.
User-Agent: NPBot
Disallow: /

# "iThenticate® is a new service we have developed to combat the piracy
# of intellectual property and ensure the originality of written work for
# publishers, non-profit agencies, corporations, and newspapers."
# (http://www.slysearch.com/)
#  --> fuck off.
User-Agent: SlySearch
Disallow: /
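The wildcard group and the per-bot records above interact slightly differently across robots.txt parsers, so it can be worth spot-checking the rules programmatically. The snippet below is a minimal sketch using Python's standard urllib.robotparser; the local filename "robots.txt" and the host example.org are illustrative assumptions, not part of the file.

# Parse the robots.txt above and spot-check a few URLs.
# The local filename "robots.txt" and the host "example.org" are
# illustrative assumptions only.
from urllib.robotparser import RobotFileParser

with open("robots.txt") as fp:
    rules = fp.read().splitlines()

rp = RobotFileParser()
rp.parse(rules)

# TurnitinBot has its own record and is denied everything.
print(rp.can_fetch("TurnitinBot", "http://example.org/"))               # False
# Crawlers without a dedicated record fall back to the wildcard group,
# which excludes the Trac CGI and /log paths.
print(rp.can_fetch("Googlebot", "http://example.org/cgi-bin/trac.cgi")) # False
print(rp.can_fetch("Googlebot", "http://example.org/log/changeset"))    # False
# Ordinary pages stay crawlable.
print(rp.can_fetch("Googlebot", "http://example.org/index.html"))       # True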