gavinggordon/sorcerer

An easy-to-use PHP class for scraping webpages' source code.

1.0.0 2017-01-14 10:32 UTC

This package is not auto-updated.

Last update: 2024-09-29 01:07:32 UTC


README


Description

An easy-to-use PHP class for scraping webpages' source code.

Usage

Installation

	$ composer require gavinggordon/sorcerer

Examples

Instantiation

	// Load Composer's autoloader so the Sorcerer class can be found.
	include( 'vendor/autoload.php' );

	use GGG\Http\Data\Collection\Sorcerer;
	
	$scraper = new Sorcerer();

Configuration

	$url = 'http://www.testurl.com/index.php';
	
	$regexes = [
		'/\<a\s?[^\>]+?\>(.+)\<\/a\>/i',
		'/\<img\s?([^\>]+?)[\s\/]*?\>/i'
	];
	
	$savefile = __DIR__ . '/testurl-scrapedata.txt';
	
	$scraper->configure( $url, $regexes, $savefile );
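To see what the two example patterns capture, they can be tried against a small HTML string before pointing the scraper at a real page. The snippet below is a minimal sketch that does not use the package at all: the sample markup is made up, and applying the patterns with preg_match_all() is an assumption about how they are meant to be used, not a description of Sorcerer's internals.

```php
<?php
// Hypothetical sample markup, not taken from any real page.
$html = '<p><a href="/home">Home</a></p>' . "\n"
      . '<p><img src="logo.png" alt="Logo" /></p>';

// The two patterns from the configuration example.
$regexes = [
    '/\<a\s?[^\>]+?\>(.+)\<\/a\>/i',
    '/\<img\s?([^\>]+?)[\s\/]*?\>/i',
];

$results = [];
foreach ($regexes as $index => $regex) {
    // Assumption: matches are collected with preg_match_all();
    // the package itself may apply the patterns differently.
    preg_match_all($regex, $html, $matches);
    // Index 1 holds the text captured by the first parenthesised group.
    $results[$index] = $matches[1];
}

print_r($results);
```

With this input, the first pattern captures the link text and the second captures the attribute string of the image tag. Note that `(.+)` is greedy, so when two links share a line the capture runs from the first link's text to the last closing tag on that line; `(.+?)` keeps each link separate.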

Run

If no filepath was set for "$savefile",...

	$data = $scraper->scrape();
	
	print_r( $data );

...the scraped data will be returned.

If a filepath was set for "$savefile",...

	$scraper->scrape();

...the scraped data will be saved to the file which you specified.
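The return-or-save behaviour described above can be pictured with a small stand-in function. This is only a sketch under stated assumptions: `scrape_sketch()` is a hypothetical name, it takes the HTML directly instead of fetching a URL, and the print_r() text it writes is an assumed file format. None of it is taken from the package's source.

```php
<?php
/**
 * Hypothetical stand-in, NOT the package's code: applies each pattern to
 * $html and either returns the matches or writes them to $savefile.
 */
function scrape_sketch(string $html, array $regexes, ?string $savefile = null): ?array
{
    $data = [];
    foreach ($regexes as $regex) {
        preg_match_all($regex, $html, $matches);
        // Keep the first capture group of every match (empty if no group).
        $data[$regex] = $matches[1] ?? [];
    }

    if ($savefile === null) {
        // No file path given: hand the data back to the caller.
        return $data;
    }

    // Assumed output format: print_r() text. The real package may differ.
    file_put_contents($savefile, print_r($data, true));
    return null;
}

$html    = '<a href="/docs">Docs</a>';
$regexes = ['/\<a\s?[^\>]+?\>(.+)\<\/a\>/i'];

// Without a save file, the matches come back as an array.
$returned = scrape_sketch($html, $regexes);

// With a save file, the matches are written to disk instead.
$savefile = sys_get_temp_dir() . '/sorcerer-sketch.txt';
$saved    = scrape_sketch($html, $regexes, $savefile);
```

Keeping the two modes behind one optional argument mirrors the usage shown above, where the same scrape() call either returns data or writes it, depending on how the scraper was configured.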

Issues

If you run into any problems, please report them on the issues page at https://github.com/gavinggordon/sorcerer/issues.

License

This package is released under the MIT License.