Manage the automatically created default robots.txt file
This guide provides information about the robots.txt file automatically created for Web hosting where this file is missing.
Preamble
- The robots.txt file acts as a guide for search engine crawlers.
- It is placed at the root of a website and contains specific instructions for these robots, indicating which directories or pages they are allowed to explore and which they should ignore.
- However, robots may choose to ignore these directives, making the robots.txt a voluntary guide rather than a strict rule.
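For illustration, here is what a site's own robots.txt could look like; the directory names below are hypothetical examples, not part of any default file:
User-agent: *        # these rules apply to all crawlers
Disallow: /private/  # robots should not explore this directory
Disallow: /tmp/      # nor this one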
File Content
If the robots.txt file is missing from an Infomaniak site, a file with the same name is automatically generated with the following directives:
User-agent: *
Crawl-delay: 10
These directives tell robots to space out their requests by 10 seconds, which helps to avoid unnecessarily overloading the servers.
Bypassing the default robots.txt
It is possible to bypass the default robots.txt by following these steps:
- Create an empty robots.txt file (it will serve only as a placeholder so that the default rules do not apply).
- Manage the redirection of the URI (Uniform Resource Identifier) robots.txt to the file of your choice using a .htaccess file, as shown in the example below.
Example
<IfModule mod_rewrite.c>
    # Enable the URL rewriting engine
    RewriteEngine On
    # Only act on requests whose URI ends with /robots.txt
    RewriteCond %{REQUEST_URI} /robots.txt$
    # Serve index.php instead, keeping any query string (QSA) and stopping here (L)
    RewriteRule ^robots\.txt$ index.php [QSA,L]
</IfModule>
Explanations
- The mod_rewrite module of Apache is enabled to allow redirections.
- The condition RewriteCond %{REQUEST_URI} /robots.txt$ checks whether the request concerns the robots.txt file.
- The rule RewriteRule ^robots\.txt$ index.php [QSA,L] redirects all requests for robots.txt to index.php; the [QSA] flag preserves the query parameters and the [L] flag stops further rewrite rules from being applied.
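As a variant (a sketch, not taken from the original guide), the same mechanism can point robots.txt to a static file instead of index.php; the file name custom-robots.txt below is a hypothetical example:
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Serve the hypothetical custom-robots.txt whenever robots.txt is requested
    RewriteRule ^robots\.txt$ custom-robots.txt [L]
</IfModule>
Combined with the empty placeholder robots.txt described above, this serves your own directives without involving PHP.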
It is recommended to place these instructions at the beginning of the .htaccess file.