ppajer / webscraper
A straightforward web scraper written in PHP, with support for parallel processing and HTML5.
Last update: 2025-03-20 09:16:04 UTC
Installation
To start using this package, add it to your composer.json file and run composer install, then include the generated autoload.php in your project. Alternatively, download the package and its dependencies and include them directly in your project.
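As a rough illustration of the Composer route, the entry in composer.json might look like the fragment below. The package name is taken from this page; the version constraint "*" is only a placeholder assumption and should be replaced with a constraint that matches a release or branch actually published for the package.

```json
{
    "require": {
        "ppajer/webscraper": "*"
    }
}
```

After composer install has run, Composer's generated autoloader is normally found at vendor/autoload.php, relative to the directory that contains composer.json.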
Dependencies
Usage
The scraper takes two inputs: an array of Request Options that defines the resources to gather, and an array of Extraction Rules that specifies what data to look for in those resources. For more information on Request Options or Extraction Rules, read the respective docs.
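// Load the autoloader generated by Composer
require 'autoload.php';
// Extraction Rules, given here as a path to a rules file
$rules = 'path/to/rules.json';
// Request Options: one named entry per resource to gather
$options = [
'foo' => ['URL' => 'https://...']
];
// The rules go to the constructor, the options to start()
$scraper = new WebScraper($rules);
$result = $scraper->start($options);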
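Before handing the inputs to the scraper, it can help to check them first. The sketch below is not part of the package: checkRulesFile and checkRequestOptions are hypothetical helpers. It assumes the rules file is JSON (as the .json extension in the example suggests) and that every Request Options entry is an array with a 'URL' key, as in the example above. The URL https://example.com/page is a placeholder.

```php
<?php
// Hypothetical pre-flight helpers; neither function belongs to the webscraper package.

/** Returns a list of problems with the rules file; an empty list means it looks usable. */
function checkRulesFile(string $path): array
{
    if (!is_file($path) || !is_readable($path)) {
        return ["Rules file not found or not readable: $path"];
    }
    // Assumption: the rules are stored as JSON.
    json_decode(file_get_contents($path));
    if (json_last_error() !== JSON_ERROR_NONE) {
        return ['Rules file is not valid JSON: ' . json_last_error_msg()];
    }
    return [];
}

/** Returns a list of problems with the Request Options array. */
function checkRequestOptions(array $options): array
{
    $problems = [];
    foreach ($options as $name => $entry) {
        // Assumption: each entry is an array carrying a string 'URL' key.
        if (!is_array($entry) || !isset($entry['URL']) || !is_string($entry['URL'])) {
            $problems[] = "Entry '$name' has no URL string";
        } elseif (filter_var($entry['URL'], FILTER_VALIDATE_URL) === false) {
            $problems[] = "Entry '$name' has an invalid URL: {$entry['URL']}";
        }
    }
    return $problems;
}

// Example use with placeholder inputs.
$options = ['foo' => ['URL' => 'https://example.com/page']];
$problems = array_merge(
    checkRulesFile('path/to/rules.json'),
    checkRequestOptions($options)
);
foreach ($problems as $problem) {
    echo $problem, PHP_EOL;
}
```

If $problems comes back empty, the same $options array and rules path can be passed on to the scraper as shown in the usage example.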