rem42 / scraper
API Scraper website
v3.2.0
2024-06-05 07:10 UTC
Requires
- php: ^8.1
- symfony/http-client-contracts: ^3.5
- symfony/serializer-pack: ^1.3
Requires (Dev)
- phpstan/phpstan: ^1.11
- phpunit/phpunit: ^9.6
- rem42/php-cs-fixer-config: ^3.6
This package is auto-updated.
Last update: 2026-04-14 22:06:33 UTC
README
Lightweight toolbox to build reusable "scrapers":
- Declare a Request class annotated with the PHP attribute #[Scraper(...)].
- Provide the corresponding Api class (replace "Request" with "Api" in the name), which extends \Scraper\Scraper\Api\AbstractApi and implements execute().
- Use \Scraper\Scraper\Client with an HttpClientInterface to execute the request and retrieve the deserialized object.
Installation
composer require rem42/scraper "^3.0"
Short introduction
The package centralizes the following logic:
- A Request (under src/Request/) defines the necessary data and exposes getters used in path variables.
- The attribute #[\Scraper\Scraper\Attribute\Scraper(...)] (on the Request) describes method, scheme, host, path.
- \Scraper\Scraper\Client::send() reads this attribute (via ExtractAttribute), builds the HTTP options (headers, query, body, json, auth) according to the interfaces implemented by the Request, then performs the HTTP call.
- The matching Api class (e.g. FooApi) is instantiated and its execute() method returns the final object/array/string.
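The flow described above can be sketched in plain PHP without the package installed. This is a minimal illustration under stated assumptions, not the package's actual code: Endpoint, WithHeaders, WithQuery, SearchRequest and buildCall() are hypothetical names standing in for the Scraper attribute, the request interfaces and the Client logic.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the package's #[Scraper(...)] attribute.
#[Attribute(Attribute::TARGET_CLASS)]
final class Endpoint
{
    public function __construct(
        public string $method,
        public string $scheme,
        public string $host,
        public string $path,
    ) {}
}

// Hypothetical interfaces playing the role of RequestHeaders / RequestQuery.
interface WithHeaders
{
    /** @return array<string, string> */
    public function getHeaders(): array;
}

interface WithQuery
{
    /** @return array<string, string> */
    public function getQuery(): array;
}

#[Endpoint(method: 'GET', scheme: 'https', host: 'example.com', path: '/search')]
final class SearchRequest implements WithHeaders, WithQuery
{
    public function __construct(private string $term) {}

    public function getHeaders(): array
    {
        return ['Accept' => 'application/json'];
    }

    public function getQuery(): array
    {
        return ['q' => $this->term];
    }
}

/**
 * Reads the attribute by reflection, then collects only the options
 * whose interface the request implements.
 *
 * @return array{method: string, url: string, options: array<string, array<string, string>>}
 */
function buildCall(object $request): array
{
    $attributes = (new ReflectionClass($request))->getAttributes(Endpoint::class);
    if ($attributes === []) {
        throw new LogicException('Missing #[Endpoint] attribute on ' . $request::class);
    }
    $endpoint = $attributes[0]->newInstance();

    $options = [];
    if ($request instanceof WithHeaders) {
        $options['headers'] = $request->getHeaders();
    }
    if ($request instanceof WithQuery) {
        $options['query'] = $request->getQuery();
    }

    return [
        'method'  => $endpoint->method,
        'url'     => $endpoint->scheme . '://' . $endpoint->host . $endpoint->path,
        'options' => $options,
    ];
}

$call = buildCall(new SearchRequest('php'));
```

The three values in $call line up with the arguments of Symfony's HttpClientInterface::request($method, $url, $options), which is why a request only has to implement an interface to contribute an option.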
Quickstart (minimal example)
Schematic example (adapt it to your autoload and imports). Classes are brought in with use statements:
use Symfony\Component\HttpClient\HttpClient;
use Scraper\Scraper\Client;
use Scraper\Scraper\Request\ScraperRequest;
use Scraper\Scraper\Attribute\Scraper;
use Scraper\Scraper\Attribute\Method;
use Scraper\Scraper\Attribute\Scheme;
use Scraper\Scraper\Api\AbstractApi;

#[Scraper(
    method: Method::GET,
    scheme: Scheme::HTTPS,
    host: 'example.com',
    path: '/items/{id}'
)]
class ItemRequest extends ScraperRequest
{
    public function __construct(private string $id) {}

    // Used to fill the {id} variable in the path.
    public function getId(): string
    {
        return $this->id;
    }
}

// Provide a matching Api: ItemApi extends AbstractApi

$http = HttpClient::create();
$client = new Client($http);
$result = $client->send(new ItemRequest('42'));
Important conventions
- PSR-4 root namespace: Scraper\\Scraper\\ -> src/ (see composer.json).
- Naming convention: XRequest -> XApi (Client performs this replacement automatically using reflection).
- In the path attribute, variables {name} are replaced by calling getName() on the Request instance (see src/Attribute/ExtractAttribute.php).
- Implement the interfaces in src/Request/ to enable options: RequestHeaders, RequestQuery, RequestBody, RequestBodyJson, RequestAuthBearer, RequestAuthBasic.
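The path-variable and naming conventions above can be illustrated with a short, self-contained sketch. It does not reproduce ExtractAttribute or Client: resolvePath() and apiClassFor() are hypothetical helpers, and the sketch assumes that every occurrence of "Request" in the fully qualified class name is replaced by "Api", which the source does not confirm.

```php
<?php

declare(strict_types=1);

// Hypothetical request class used only for this illustration.
final class ItemRequest
{
    public function __construct(private string $id, private string $lang) {}

    public function getId(): string
    {
        return $this->id;
    }

    public function getLang(): string
    {
        return $this->lang;
    }
}

/**
 * Replaces each {name} placeholder with the result of getName() on the request.
 */
function resolvePath(string $path, object $request): string
{
    return preg_replace_callback(
        '/\{(\w+)\}/',
        static function (array $match) use ($request): string {
            $getter = 'get' . ucfirst($match[1]);
            if (!method_exists($request, $getter)) {
                throw new LogicException("No getter {$getter}() for placeholder {$match[0]}");
            }

            return (string) $request->{$getter}();
        },
        $path
    );
}

/**
 * Derives the Api class name from the Request class name (XRequest -> XApi).
 * Assumption: a plain string replacement over the whole class name,
 * so a "Request" namespace segment is rewritten as well.
 */
function apiClassFor(string $requestClass): string
{
    return str_replace('Request', 'Api', $requestClass);
}

$path = resolvePath('/{lang}/items/{id}', new ItemRequest('42', 'en'));
$api  = apiClassFor('App\Request\ItemRequest');
```

Under these assumptions $path is /en/items/42 and $api is App\Api\ItemApi. A placeholder without a matching getter fails early with an exception instead of leaving {name} in the URL.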
Tests / quality / style
- Run unit tests:
composer run unit-test
# or
./vendor/bin/phpunit
- Static analysis (phpstan):
composer run static-analysis
- Check / apply coding style (php-cs-fixer):
composer run code-style-check
composer run code-style-fix
PHP compatibility
composer.json requires php: ^8.4 (the v3.2.0 release metadata above still lists php: ^8.1). The code uses enums and recent types, so PHP 8.4+ is recommended.
Resources and documentation for agents
- Agent helper file: AGENTS.md (tips, patterns, commands). See packages/scraper/AGENTS.md.
- Key code points: src/Client.php, src/Attribute/ExtractAttribute.php, src/Factory/SerializerFactory.php.
Non-exhaustive list of published scrapers
- rem42/scraper-allocine
- rem42/scraper-colissimo
- rem42/scraper-deezer
- rem42/scraper-giantbomb
- rem42/scraper-jeuxvideo
- rem42/scraper-prestashop
- rem42/scraper-shopify
- rem42/scraper-tmdb
- rem42/scraper-tnt
Contributing
See AGENTS.md for rules and patterns to follow. Pull requests need green tests and must pass phpstan at the highest level.