hugsbrugs / php-robots-txt
PHP Robots.txt Utilities
Requires
- hugsbrugs/php-filesystem: dev-master
This library provides utility functions to ease robots.txt manipulation. If you want to check whether URLs respect a site's robots.txt policy, with an optional cache, then it's your lucky day ;)
Install
Install the package with composer:
composer require hugsbrugs/php-robots-txt
In your PHP code, load the library:
require_once __DIR__ . '/../vendor/autoload.php';
use Hug\Robots\Robots;
Usage
Returns whether a page is accessible under the site's robots.txt policy. Optionally pass a user agent to also check against user-agent-specific rules.
Robots::is_allowed($url, $user_agent = null);
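For example, a minimal sketch (the URL and user agent string below are illustrative, not part of the library):
require_once __DIR__ . '/vendor/autoload.php';
use Hug\Robots\Robots;

// Illustrative values: any URL and user agent string will do
$url = 'https://example.com/some/page.html';
$user_agent = 'MyBot/1.0';

// Check against the generic rules only
if (Robots::is_allowed($url)) {
    // fetch the page...
}

// Also honor rules targeting this specific user agent
if (Robots::is_allowed($url, $user_agent)) {
    // fetch the page...
}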
With this simple method, a request for the remote robots.txt is fired on every call. To enable a cache, define the following constants:
define('HUG_ROBOTS_CACHE_PATH', '/path/to/robots-cache/');
define('HUG_ROBOTS_CACHE_DURATION', 7*86400);
The cache duration is in seconds (86400 = 1 day, so 7*86400 = 7 days). Don't forget to make the cache path writable by the webserver user. Cached robots.txt files are gzcompressed to save disk space.
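Putting it together, a minimal bootstrap sketch, assuming a writable robots-cache/ directory next to your script (the path, duration, URL and user agent are examples):
require_once __DIR__ . '/vendor/autoload.php';
use Hug\Robots\Robots;

// Example cache settings: adjust the path and duration to your setup
define('HUG_ROBOTS_CACHE_PATH', __DIR__ . '/robots-cache/');
define('HUG_ROBOTS_CACHE_DURATION', 7 * 86400); // 7 days

// The first call downloads and caches robots.txt for example.com;
// later calls for the same host reuse the cached copy until it expires
$allowed = Robots::is_allowed('https://example.com/', 'MyBot/1.0');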
You should not need the following methods unless you want to dig into the code and tweak it:
Robots::download_robots($url, $user_agent);
Robots::get_robots($url, $user_agent);
Robots::is_cache_obsolete($file);
Robots::empty_cache();
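For instance, if you tweak the caching internals, you can clear all cached files and let them be re-downloaded on demand (a purely illustrative maintenance snippet):
require_once __DIR__ . '/vendor/autoload.php';
use Hug\Robots\Robots;

// Remove every cached robots.txt file; the next is_allowed() call
// will download a fresh copy for the host it needs
Robots::empty_cache();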
Unit Tests
phpunit --bootstrap vendor/autoload.php tests
Author
Hugo Maugey. Visit my website ;)