vipnytt / robotstagparser
X-Robots-Tag HTTP header parser class
v0.2.1
2016-07-15 21:18 UTC
Requires
- php: >=5.6
- ext-mbstring: *
- guzzlehttp/guzzle: 6.*
- vipnytt/useragentparser: ~1.0
Requires (Dev)
- codeclimate/php-test-reporter: >=0.2.0
- phpunit/phpunit: >=4.0
Suggests
- vipnytt/robotstxtparser: Robots.txt parser.
README
X-Robots-Tag HTTP header parser
PHP class to parse X-Robots-Tag HTTP headers according to Google X-Robots-Tag HTTP header specifications.
Requirements: PHP >=5.6 with the mbstring extension.
Note: HHVM support is planned once facebook/hhvm#4277 is fixed.
Installation
The library is available via Composer. Add this to your composer.json file:

```json
{
    "require": {
        "vipnytt/robotstagparser": "~0.2"
    }
}
```
Then run `composer update`.
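Alternatively (assuming Composer is installed and available on your PATH), the same dependency can be added from the command line:

```shell
composer require vipnytt/robotstagparser:~0.2
```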
Getting Started
Basic example
Get all rules affecting you. This includes:
- All generic rules
- Rules specific to your User-Agent (if there are any)
```php
use vipnytt\XRobotsTagParser;

$headers = [
    'X-Robots-Tag: noindex, noodp',
    'X-Robots-Tag: googlebot: noindex, noarchive',
    'X-Robots-Tag: bingbot: noindex, noarchive, noimageindex',
];

$parser = new XRobotsTagParser('myUserAgent', $headers);
$rules = $parser->getRules(); // <-- returns an array of rules
```
Different approaches:
Get the HTTP headers by requesting a URL
```php
use vipnytt\XRobotsTagParser;

$parser = new XRobotsTagParser\Adapters\Url('http://example.com/', 'myUserAgent');
$rules = $parser->getRules();
```
Use your existing GuzzleHttp request
```php
use vipnytt\XRobotsTagParser;
use GuzzleHttp\Client;

$client = new Client();
$response = $client->request('GET', 'http://example.com/');

$parser = new XRobotsTagParser\Adapters\GuzzleHttp($response, 'myUserAgent');
$array = $parser->getRules();
```
Provide HTTP headers as a string
```php
use vipnytt\XRobotsTagParser;

$string = <<<STRING
HTTP/1.1 200 OK
Date: Tue, 25 May 2010 21:42:43 GMT
X-Robots-Tag: noindex
X-Robots-Tag: nofollow
STRING;

$parser = new XRobotsTagParser\Adapters\TextString($string, 'myUserAgent');
$array = $parser->getRules();
```
Export all rules
Returns an array containing all rules for any User-Agent.
```php
use vipnytt\XRobotsTagParser;

$parser = new XRobotsTagParser('myUserAgent', $headers);
$array = $parser->export();
```
Directives:
- `all` - There are no restrictions for indexing or serving.
- `none` - Equivalent to `noindex` and `nofollow`.
- `noindex` - Do not show this page in search results and do not show a "Cached" link in search results.
- `nofollow` - Do not follow the links on this page.
- `noarchive` - Do not show a "Cached" link in search results.
- `nosnippet` - Do not show a snippet in the search results for this page.
- `noodp` - Do not use metadata from the Open Directory project for titles or snippets shown for this page.
- `notranslate` - Do not offer translation of this page in search results.
- `noimageindex` - Do not index images on this page.
- `unavailable_after` - Do not show this page in search results after the specified date/time.
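For example, `unavailable_after` takes its date/time in RFC 850 format, sent as part of the header value (example date taken from Google's documentation):

```
X-Robots-Tag: unavailable_after: 25 Jun 2010 15:00:00 PST
```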
Source: https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag