glow/robots

This package is abandoned and no longer maintained. No replacement package was suggested.

Robots.txt parser and generator toolset

Version 1.5 (2015-12-08 18:38 UTC)

This package is not auto-updated.

Last update: 2018-06-27 18:59:32 UTC


README

A PHP 5.5 (or greater) toolset for parsing, validating, and generating a robots.txt file.

Installing

The recommended way to install Glow\Robots is to use Composer.

composer require glow/robots

Badges

Service      Badge
SensioLabs   SensioLabsInsight
Codecov      codecov.io
Travis CI    Build Status
Gitter Chat  Join the chat at https://gitter.im/KingdomCompany/glow-robots
Packagist    Packagist Total Downloads https://packagist.org/packages/glow/robots
Codacy       Codacy Badge

Usage

Parser

The Parser class parses the contents of a robots.txt file into human-readable arrays. It gracefully skips errors it encounters and, in some cases, fixes them during parsing.
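To make "parsing into arrays while skipping errors" concrete, here is a minimal, self-contained sketch. It is not the Glow\Robots implementation: the function name sketchParseRobots, the shape of the returned array, and the set of recognised directives are all assumptions made for this illustration, and it is written for PHP 7 or newer.

```php
<?php
// Hypothetical sketch, NOT the Glow\Robots implementation.
// Groups allow/disallow rules by user agent, collects sitemaps, and
// records malformed lines as errors instead of aborting the parse.
function sketchParseRobots(string $source): array
{
    $result = ['agents' => [], 'sitemaps' => [], 'errors' => []];
    $current = [];     // user agents the following rules apply to
    $inRules = false;  // true once a rule has followed the current agents

    foreach (preg_split('/\r\n|\r|\n/', $source) as $index => $line) {
        $lineNo = $index + 1;
        // Strip comments and surrounding whitespace.
        $line = trim(preg_replace('/#.*$/', '', $line));
        if ($line === '') {
            continue;
        }
        // Every directive has the form "field: value".
        $parts = explode(':', $line, 2);
        if (count($parts) !== 2) {
            $result['errors'][] = "Line $lineNo: missing colon";
            continue;
        }
        $field = strtolower(trim($parts[0]));
        $value = trim($parts[1]);

        if ($field === 'user-agent') {
            if ($inRules) {
                // A user-agent line after rules starts a new group.
                $current = [];
                $inRules = false;
            }
            $current[] = $value;
            if (!isset($result['agents'][$value])) {
                $result['agents'][$value] = ['allow' => [], 'disallow' => []];
            }
        } elseif ($field === 'allow' || $field === 'disallow') {
            if ($current === []) {
                $result['errors'][] = "Line $lineNo: rule before any user-agent";
                continue;
            }
            $inRules = true;
            foreach ($current as $agent) {
                $result['agents'][$agent][$field][] = $value;
            }
        } elseif ($field === 'sitemap') {
            $result['sitemaps'][] = $value;
        } else {
            $result['errors'][] = "Line $lineNo: unknown directive '$field'";
        }
    }
    return $result;
}
```

Calling sketchParseRobots() on a source that contains a line without a colon still returns the rules from the well-formed lines, with the bad line listed under 'errors'. That is the tolerant behaviour described above, in simplified form.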

Methods

Method                Visibility  Description
__construct           public      Class constructor
parse                 protected   Parses the contents of the robots.txt source
parseLine             protected   Workhorse method for parsing the robots.txt lines
getParsed             public      Returns the parsed contents
getErrors             public      Returns all of the errors that occurred
setError              protected   Sets an error
incrementCounter      protected   Increments counters used throughout the parsing process
isAllowed             public      Determines whether a URL path is allowed to be crawled
isDisallowed          public      Determines whether a URL path is not allowed to be crawled
getTR                 public      Returns the tomverran/robots-txt-checker class
getElements           public      Returns the elements that are searched for during parsing
setElements           public      Sets the elements that are searched for during parsing
getMeta               public      Returns the metadata extracted during parsing
getSitemaps           public      Returns an array of sitemaps discovered during parsing
getUserAgentData      public      Returns all of the parsed directives for a specified user agent
getUserAgentAllow     public      Returns all of the allowed elements for a specified user agent
getUserAgentDisallow  public      Returns all of the disallowed elements for a specified user agent
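As a rough illustration of the kind of decision isAllowed and isDisallowed make, the sketch below checks a URL path against plain-prefix allow and disallow rules, letting the longest matching rule win and preferring allow on a tie. This is an assumption-laden simplification. The helper name sketchIsAllowed and its signature are hypothetical, wildcards are not handled, and the library's actual behaviour (which relies on tomverran/robots-txt-checker) may differ.

```php
<?php
// Hypothetical helper, NOT part of Glow\Robots.
// Decides whether $path may be crawled given plain-prefix allow and
// disallow rules: the longest matching rule wins, and allow wins ties.
function sketchIsAllowed(string $path, array $allow, array $disallow): bool
{
    $bestAllow = -1;     // length of the longest matching allow rule
    $bestDisallow = -1;  // length of the longest matching disallow rule

    foreach ($allow as $rule) {
        if ($rule !== '' && strpos($path, $rule) === 0) {
            $bestAllow = max($bestAllow, strlen($rule));
        }
    }
    foreach ($disallow as $rule) {
        // An empty Disallow value means "nothing is disallowed".
        if ($rule !== '' && strpos($path, $rule) === 0) {
            $bestDisallow = max($bestDisallow, strlen($rule));
        }
    }
    // With no matching rule at all, the path is allowed by default.
    return $bestAllow >= $bestDisallow;
}
```

For example, with an allow rule of /private/open and a disallow rule of /private, the path /private/open/a is allowed while /private/x is not.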

Basic Usage

A basic example of parser usage:

// Pass the raw robots.txt contents to the parser
$p = new Glow\Robots\Parser(file_get_contents('http://cnn.com/robots.txt'));

Validate

The Validate class scans the robots.txt contents for errors and reports whether they are valid.

Methods

Method       Visibility  Description
__construct  public      Class constructor
check        public      Checks the source for errors

Basic Usage

A basic example of validate usage:

$p = new Glow\Robots\Validate(file_get_contents('http://cnn.com/robots.txt'));
if ($p->check() === false) {
    // Something is wrong with our robots.txt file
} else {
    // Hooray, everything is good with our file!
}