jez500 / web-scraper-for-laravel
A web scraper for Laravel
pkg:composer/jez500/web-scraper-for-laravel
Requires
- symfony/dom-crawler: ^7.2
Requires (Dev)
- larastan/larastan: ^2.0
- laravel/pint: ^1.20
- orchestra/testbench: ^9.9
README
A package that makes it easier to scrape external web pages from a Laravel application.
Key Features
- Support for standard HTTP requests (using Laravel's HTTP client)
- Support for scraping JavaScript-rendered pages (using https://github.com/amerkurev/scrapper)
- Rotating user agents to avoid being blocked
- Support for extracting data using CSS selectors
- Support for extracting data using XPath expressions
- Support for extracting data using dot notation from JSON responses
- Support for extracting data using regular expressions
- Caching responses to avoid repeated requests
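To illustrate what "dot notation" means for JSON responses, here is a minimal sketch in plain PHP. The `dotGet` function is a hypothetical helper written for this example and is not part of the package; how the package resolves paths internally is not documented here. Laravel's own `data_get()` helper offers similar behavior.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the package): resolve a dot-notation
 * path such as "user.name" against decoded JSON data.
 */
function dotGet(array $data, string $path, mixed $default = null): mixed
{
    $current = $data;

    // Walk one path segment at a time, descending into nested arrays.
    foreach (explode('.', $path) as $segment) {
        if (!is_array($current) || !array_key_exists($segment, $current)) {
            return $default;
        }
        $current = $current[$segment];
    }

    return $current;
}

// Sample payload invented for illustration only.
$json = '{"user": {"name": "Ada", "roles": ["admin", "editor"]}}';
$data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

$name      = dotGet($data, 'user.name');         // nested object key
$firstRole = dotGet($data, 'user.roles.0');      // numeric segment indexes a list
$missing   = dotGet($data, 'user.email', 'n/a'); // falls back to the default
```

A path like `user.name` in the feature above would address the same nested value that `dotGet($data, 'user.name')` reads in this sketch.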
Usage examples
use Jez500\WebScraperForLaravel\Facades\WebScraper;

// Get an instance of the scraper with the body of the page loaded.
$scraper = WebScraper::http()->from('https://example.com')->get();

// Get the full page body
$body = $scraper->getBody();

// Get the first title element
$title = $scraper->getSelector('title')->first();

// Get the content attribute of the first meta tag with property og:image
$image = $scraper->getSelector('meta[property=og:image]|content')->first();

// Get the innerHtml of all paragraphs as an array
$paragraphs = $scraper->getSelector('p')->all();

// Get the first h1 element using XPath
$h1 = $scraper->getXpath('//h1')->first();

// Get the href attribute of the first link using XPath
$linkHref = $scraper->getXpath('//a', 'attr', ['href'])->first();

// Get values from the page via regex
$author = $scraper->getRegex('~"user"\:"(.*)"~')->first();

// Get JSON data
$authorName = WebScraper::http()
    ->from('https://example.com/page.json')
    ->get()
    ->getJson('user.name')
    ->first();

// Get the title from a JavaScript-rendered page
$title = WebScraper::api()
    ->from('https://example.com')
    ->get()
    ->getSelector('title')
    ->first();
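For readers unfamiliar with XPath, the sketch below shows roughly what the `//h1` and `//a` queries above select. It uses PHP's built-in `DOMDocument` and `DOMXPath` classes instead of the package, so it says nothing about how the package is implemented. The HTML string is invented for illustration.

```php
<?php

declare(strict_types=1);

// Sample markup, assumed for this example only.
$html = '<html><head><title>Example</title></head>'
      . '<body><h1>Hello</h1>'
      . '<a href="/first">First</a><a href="/second">Second</a>'
      . '</body></html>';

$dom = new DOMDocument();

// Collect libxml warnings internally; real-world HTML is often imperfect.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Text of the first h1 element, or null when there is none.
$h1 = $xpath->query('//h1')->item(0)?->textContent;

// href attribute of every link, in document order.
$hrefs = [];
foreach ($xpath->query('//a') as $node) {
    if ($node instanceof DOMElement) {
        $hrefs[] = $node->getAttribute('href');
    }
}

$firstHref = $hrefs[0] ?? null;
```

Here `$firstHref` plays the role of the `$linkHref` value in the usage example: the `href` of the first node matched by `//a`.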
Installation
composer require jez500/web-scraper-for-laravel
Contributing
PRs are welcome! It's a good idea to run coding standards and tests locally before submitting a PR. This can be done with:
composer analyse
composer test