vasily-kartashov / downloader
Downloader Library
Requires
- php: ^7 || ^8
- ext-curl: *
- cache/void-adapter: ^1.0
- psr/cache: ^1.0
- psr/log: ^1.0
Requires (Dev)
- php-parallel-lint/php-parallel-lint: @stable
- phpunit/phpunit: ^6 || ^7 || ^8 || ^9
- squizlabs/php_codesniffer: @stable
- vimeo/psalm: @stable
README
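Installation presumably follows the usual Composer workflow, using the package name shown above:

composer require vasily-kartashov/downloader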
Example
<?php

$redisClient = new Redis();
$redisClient->connect('localhost', 6379);
$redisCachePool = new RedisCachePool($redisClient);

$downloader = new Downloader($redisCachePool);

$task = Task::builder()
    ->batch(12)
    ->retry(3)
    ->validate(function ($response) {
        return strlen($response) > 1024;
    })
    ->cache('pages.', 12 * 3600)
    ->options([
        CURLOPT_SSL_VERIFYHOST => false
    ])
    ->throttle(120)
    ->add(1, 'http://example/page/1')
    ->add(2, 'http://example/page/2')
    ...
    ->add(9, 'http://example/page/9')
    ->build();
This will:
- issue parallel requests via PHP's curl_multi to the specified URLs, 12 per batch;
- cache successful responses for 12 hours;
- retry each page up to 3 times before moving on to the next batch;
- not attempt a new download if the last failure was less than 2 minutes ago;
- treat only responses longer than 1024 bytes as successful (see the validator sketch below).
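The validation callback receives the downloaded response body as a string, as the strlen() check in the example shows. A stricter validator is a plain PHP closure; the following sketch is hypothetical (the </html> check is an illustration, not part of the library):

// Hypothetical validator, following the callback signature from the
// example above: the raw response body is passed in as a string.
// Accept only responses that are long enough and contain a closing
// html tag, i.e. look like a complete HTML document.
$validator = function ($response) {
    return strlen($response) > 1024
        && strpos($response, '</html>') !== false;
};

Such a closure can be passed to the builder via ->validate($validator), exactly as in the example above.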
Executing the task produces a collection of results that can be inspected individually:

$results = $downloader->execute($task);
foreach ($results as $result) {
    if ($result->successful()) {
        echo $result->content();
    } elseif ($result->failed()) {
        echo 'Failed to fetch';
    } elseif ($result->skipped()) {
        echo 'Skipping result, to avoid too many retries';
    }
}
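The example relies on a Redis-backed PSR-6 cache pool; RedisCachePool presumably comes from the php-cache project's cache/redis-adapter package, which is not among the listed requirements and would need to be installed separately. The listed cache/void-adapter dependency suggests that caching can also be switched off entirely; a minimal sketch, assuming the same Downloader constructor as in the example:

<?php

use Cache\Adapter\Void\VoidCachePool;

// Minimal sketch: the void adapter (from the required cache/void-adapter
// package) is a PSR-6 pool that stores nothing, effectively disabling
// response caching. The Downloader constructor is assumed from the
// example above.
$downloader = new Downloader(new VoidCachePool());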
ToDo
- Embed Guzzle and build on standard interfaces; keep this library as a lean interface layer only
- Add more tests