smartango/verygrabber

There is no license information available for the latest version (dev-main) of this package.

Generic grabber based on DOMDocument that extracts information from an HTML page.

dev-main 2025-05-12 05:45 UTC

This package is auto-updated.

Last update: 2025-06-12 06:04:34 UTC


README

smartango/verygrabber scrapes HTML according to a JSON schema definition.

It is designed to grab an array of elements, such as table rows or a list of DIVs.

Usage

use smrtg\VeryGrabber\GrabFromSchema;

// Load the HTML page to scrape.
$doc = file_get_contents(__DIR__ . '/data/file.html');
$grab = new GrabFromSchema($doc);

// Load the JSON schema that describes what to extract.
$schema = file_get_contents(__DIR__ . '/data/schema.json');

// Extract the data described by the schema.
$data = $grab->getStruct($schema);

See tests/data/schema.json for the JSON schema definition. It follows the concept of a recursive descent parser: each level of the schema is resolved against the matching level of the DOM tree.
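
To illustrate the recursive descent idea in general terms, here is a minimal sketch built directly on DOMDocument and DOMXPath. It does not reproduce the package's actual schema format or internals: the schema keys "select", "many" and "fields", the grabNode() function and the sample HTML are all hypothetical and exist only for this example. For the real format, refer to tests/data/schema.json.

```php
<?php
// Hypothetical schema format, NOT the one used by smartango/verygrabber:
//   "select" - XPath expression, evaluated relative to the current node
//   "many"   - when true, return every match as a list
//   "fields" - named child schemas, resolved inside each matched node
function grabNode(DOMXPath $xpath, DOMNode $context, array $schema)
{
    $matches = $xpath->query($schema['select'], $context);
    if ($matches === false) {
        throw new InvalidArgumentException('Invalid XPath: ' . $schema['select']);
    }

    $results = [];
    foreach ($matches as $node) {
        if (isset($schema['fields'])) {
            // Descend: each field is resolved relative to the matched node.
            $item = [];
            foreach ($schema['fields'] as $name => $childSchema) {
                $item[$name] = grabNode($xpath, $node, $childSchema);
            }
            $results[] = $item;
        } else {
            // Leaf: keep the trimmed text content of the node.
            $results[] = trim($node->textContent);
        }
    }

    // Return the full list, or only the first match (null when none).
    return !empty($schema['many']) ? $results : ($results[0] ?? null);
}

$html = '<table>'
      . '<tr><td>Ada</td><td>36</td></tr>'
      . '<tr><td>Alan</td><td>41</td></tr>'
      . '</table>';

$schemaJson = '{
  "select": "//tr",
  "many": true,
  "fields": {
    "name": {"select": "td[1]"},
    "age":  {"select": "td[2]"}
  }
}';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // tolerate non-strict HTML without warnings
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$rows = grabNode($xpath, $doc, json_decode($schemaJson, true));
print_r($rows);
```

With this sample input, $rows holds one associative array per table row, each with a "name" and an "age" key. The point of the sketch is that nesting in the schema mirrors nesting in the DOM: every "fields" entry narrows the search to the node matched one level above.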