Manage the robots.txt file created by default
This guide provides information about the robots.txt file that is created by default for Web hostings on which this file is missing.
Preamble
- The robots.txt file acts as a guide for search engine crawlers.
- It is placed at the root of a website and contains specific instructions for these robots, indicating which directories or pages they are allowed to explore and which they must ignore.
- However, robots can choose to ignore these guidelines, making robots.txt a voluntary guide rather than a strict rule.
File content
If the robots.txt file is missing from an Infomaniak site, a file of the same name is automatically generated with the following directives:
User-agent: *
Crawl-delay: 10
These instructions tell robots to space their requests 10 seconds apart, which avoids unnecessarily overloading the servers.
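Since this default file is only generated when robots.txt is missing, uploading your own robots.txt to the root of the site should be enough for your own directives to apply instead. The directives below are purely illustrative examples (the paths are placeholders), not Infomaniak defaults:

User-agent: *
Disallow: /private/
Crawl-delay: 10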
Bypass the robots.txt created by default
It is possible to bypass this default robots.txt by following these steps:
- Create an empty robots.txt file (it will only serve as a placeholder; its rules will not actually be applied).
- Redirect the robots.txt URI (Uniform Resource Identifier) to the file of your choice using a .htaccess file, as in the example below.
Example
<IfModule mod_rewrite.c>
# Serve requests for robots.txt through index.php instead
RewriteEngine On
RewriteCond %{REQUEST_URI} /robots.txt$
RewriteRule ^robots\.txt$ index.php [QSA,L]
</IfModule>
Explanation
- The Apache module mod_rewrite is enabled to allow rewrites.
- The condition RewriteCond %{REQUEST_URI} /robots.txt$ checks whether the request concerns the robots.txt file.
- The rule RewriteRule ^robots\.txt$ index.php [QSA,L] redirects all requests for robots.txt to index.php; the [QSA] flag keeps the query string parameters and [L] stops further rewrite rules from being applied.
It is recommended to place these instructions at the beginning of the .htaccess file.
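With this redirection in place, index.php must detect that the original request was for robots.txt and return the directives itself. The following is a minimal sketch of such an index.php, assuming a plain PHP entry point; the check and the directives shown are illustrative, not an Infomaniak-provided implementation:

<?php
// Minimal sketch: when the rewritten request was originally for
// /robots.txt, return custom directives as plain text.
// The directives below are examples only.
if (preg_match('#/robots\.txt$#', $_SERVER['REQUEST_URI'])) {
    header('Content-Type: text/plain; charset=utf-8');
    echo "User-agent: *\n";
    echo "Disallow: /private/\n";
    echo "Crawl-delay: 10\n";
    exit;
}
// ...otherwise, continue with the normal logic of the site.

Because the rewrite is internal (no [R] flag), the visitor still sees the /robots.txt URL while PHP receives the original request URI in $_SERVER['REQUEST_URI'], which is what the condition above relies on.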