m6web/roboxt

This package is abandoned and no longer maintained. No replacement package was suggested.

Library for parsing robots.txt files

1.1.0 2014-12-08 22:06 UTC

This package is not auto-updated.

Last update: 2021-09-27 00:06:57 UTC


README

Roboxt is a PHP robots.txt file parser.

Usage

    # Create a Parser instance
    $parser = new \Roboxt\Parser();

    # Parse your robots.txt file
    $file = $parser->parse("http://www.google.com/robots.txt");

    # You can check whether a URL is allowed for a specific user agent
    $tests = [
        ["/events", "*"],
        ["/search", "*"],
        ["/search", "badbot"],
    ];

    foreach ($tests as $test) {
        list($url, $agent) = $test;
        if ($file->isUrlAllowedByUserAgent($url, $agent)) {
            echo "\n ✔ $url is allowed by $agent";
        } else {
            echo "\n ✘ $url is not allowed by $agent";
        }
    }

    # You can also iterate over all user agents declared in the robots.txt file
    # and check the type of each directive
    foreach ($file->allUserAgents() as $userAgent) {
        echo "\n Agent {$userAgent->getName()}: \n";

        foreach ($userAgent->allDirectives() as $directive) {
            if ($directive->isDisallow()) {
                echo "  ✘ {$directive->getValue()} \n";
            } elseif ($directive->isAllow()) {
                echo "  ✔ {$directive->getValue()} \n";
            }
        }
    }
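
The per-URL check shown above can be wrapped in a small helper when you need to filter a whole list of URLs. This is only a sketch: `filterAllowedUrls` is a hypothetical function, not part of Roboxt, and it assumes nothing about the library beyond the `isUrlAllowedByUserAgent($url, $agent)` method used in the usage example.

```php
<?php

/**
 * Hypothetical helper (not part of Roboxt): keep only the URLs that the
 * parsed robots.txt file allows for the given user agent.
 *
 * $file is assumed to be the object returned by \Roboxt\Parser::parse(),
 * i.e. anything exposing isUrlAllowedByUserAgent($url, $agent).
 */
function filterAllowedUrls($file, array $urls, $agent = '*')
{
    $allowed = array_filter($urls, function ($url) use ($file, $agent) {
        return $file->isUrlAllowedByUserAgent($url, $agent);
    });

    // array_filter() preserves the original keys, so reindex the result.
    return array_values($allowed);
}
```

With the `$file` object from the usage example, `filterAllowedUrls($file, ["/events", "/search"], "badbot")` would return the subset of those URLs that `badbot` is allowed to fetch.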

Installation

The recommended way to install Roboxt is through Composer:

$> composer require m6web/roboxt

Running the Tests

Roboxt uses PHPSpec for the unit tests:

$> composer install

$> ./vendor/bin/phpspec run

Credits

License

Roboxt is released under the MIT License.