lexxtor / easy-php-crawler
There is no license information available for the latest version (dev-master) of this package.
Simple yet flexible URL crawler.
dev-master / 0.0.x-dev
2017-02-20 10:16 UTC
Requires
- php: >=5.4
This package is not auto-updated.
Last update: 2025-04-26 23:56:46 UTC
README
It's a simple yet flexible crawler for parsing URLs and loading content.
Usage Example
<?php

use Lexxtor\EasyPhpCrawler\EasyPhpCrawler;

require 'EasyPhpCrawler.php';

EasyPhpCrawler::go('http://news.yandex.ru', [
    'beforeLoadUrl' => function($url, $crawler) {
        echo $crawler->currentUrlIndex . '/' . $crawler->getQueueSize() . " $url ";
    },
    'afterLoadUrlSuccess' => function($url, $content, $crawler) {
        echo 'loaded: ' . strlen($content) . "\n";
    },
    'afterLoadUrlFail' => static function($url, $errorMessage, $crawler) {
        echo 'Error: ' . $errorMessage . "\n";
    },
    'allowUrlRules' => [
        '/\/\/news.yandex.ru\//',
    ],
    'denyUrlRules' => [
        '/search/',
        '/\/$/',
        '/maps/',
        '/themes/',
        '/\?redircnt=/',
    ],
]);
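The allowUrlRules and denyUrlRules options in the example above are lists of regular expressions. The README does not say how the crawler combines them, so the following is only a sketch of one plausible reading: a URL is skipped if it matches any deny rule, and otherwise must match at least one allow rule. The isUrlAllowed function is a hypothetical helper written for illustration and is not part of EasyPhpCrawler; it can be handy for trying out rule lists before starting a crawl.

```php
<?php

/**
 * Hypothetical helper, NOT part of EasyPhpCrawler.
 * Assumption: deny rules take precedence, and when allow rules are given
 * a URL must match at least one of them. The library may behave differently.
 */
function isUrlAllowed($url, array $allowRules, array $denyRules)
{
    // Reject the URL as soon as any deny pattern matches.
    foreach ($denyRules as $pattern) {
        if (preg_match($pattern, $url) === 1) {
            return false;
        }
    }

    // With no allow rules, everything not denied passes.
    if (count($allowRules) === 0) {
        return true;
    }

    // Otherwise at least one allow pattern has to match.
    foreach ($allowRules as $pattern) {
        if (preg_match($pattern, $url) === 1) {
            return true;
        }
    }

    return false;
}

// The same rule lists as in the usage example.
$allow = ['/\/\/news.yandex.ru\//'];
$deny = ['/search/', '/\/$/', '/maps/', '/themes/', '/\?redircnt=/'];

var_dump(isUrlAllowed('http://news.yandex.ru/world.html', $allow, $deny)); // bool(true)
var_dump(isUrlAllowed('http://news.yandex.ru/', $allow, $deny));           // bool(false): trailing slash is denied
var_dump(isUrlAllowed('http://example.com/page', $allow, $deny));          // bool(false): no allow rule matches
```

Under this reading, the '/\/$/' deny rule excludes every URL that ends in a slash, including the start page itself if it is written with a trailing slash, so it is worth checking a few representative URLs against the rules first.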