stanford_metatag_nobots: blocking search engine robots
This module prevents search engine robots from crawling and indexing a website while it is still in development. This module should only be enabled if you do not want your website to be indexed. Please disable this module when a site is ‘live’.
This is a simple Drupal Features module that blocks search engine robots from indexing a site via the X-Robots-Tag HTTP header.
See https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag for more information on that HTTP header.
To use: enable the Feature. All responses from your site will then include the following HTTP header:

X-Robots-Tag: noindex,nofollow,noarchive
You will probably want to disable this module before launching a site.
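If you manage the site with Drush, the Feature can be toggled from the command line. A sketch, assuming Drush's Drupal 7-era pm commands and the module machine name stanford_metatag_nobots:

```shell
# Enable the nobots Feature while the site is in development
# (assumes the machine name is stanford_metatag_nobots).
drush en -y stanford_metatag_nobots

# Disable it again before the site goes live.
drush dis -y stanford_metatag_nobots
```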
To test if it's working, you can use curl:
curl -I https://foo.stanford.edu/
The HTTP headers will be written to stdout.
Or, if you want to be more fancy:
curl -sS -I https://foo.stanford.edu/ | grep 'X-Robots'
That should output "X-Robots-Tag: noindex,nofollow,noarchive" if the headers are being sent correctly.
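For scripted checks (for example in a deployment pipeline), the grep test above can be wrapped in a small script that fails when the header is absent. A sketch: the printf line simulates the server's response for illustration and would be replaced with the real curl call shown above.

```shell
# Simulated response headers for illustration; in practice use:
#   headers="$(curl -sS -I https://foo.stanford.edu/)"
headers="$(printf 'HTTP/1.1 200 OK\r\nX-Robots-Tag: noindex,nofollow,noarchive\r\n')"

# grep -i: HTTP header names are case-insensitive.
if printf '%s' "$headers" | grep -qi '^x-robots-tag:'; then
    echo "nobots header present"
else
    echo "nobots header missing" >&2
    exit 1
fi
```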
This will block (well-behaved) search engine robots from crawling your website.
Install this module like any other module; nothing special is needed. See the Drupal documentation on installing modules.
If you are experiencing issues with this module, try reverting the Feature first. If the problem persists, post an issue on the GitHub issues page.
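With the Features module's Drush integration, reverting can also be done from the command line. A sketch, again assuming the machine name stanford_metatag_nobots:

```shell
# Revert the Feature to its exported (in-code) state, discarding
# any overridden configuration stored in the database.
drush features-revert -y stanford_metatag_nobots

# Clear caches so the restored settings take effect.
drush cache-clear all
```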
You are welcome to contribute functionality, bug fixes, or documentation to this module. To suggest a fix or new functionality, add an issue to the GitHub issue queue, or fork this repository and submit a pull request. For more help, see GitHub's article on forking, branching, and pull requests.