spatie/crawler
Crawl all internal links found on a website
Requires
- php: ^8.4
- guzzlehttp/guzzle: ^7.3
- guzzlehttp/psr7: ^2.0
- spatie/robots-txt: ^2.0
- symfony/css-selector: ^7.0|^8.0
- symfony/dom-crawler: ^7.0|^8.0
Requires (Dev)
- pestphp/pest: ^4.0
- phpstan/extension-installer: ^1.4
- phpstan/phpstan: ^2.0
- phpstan/phpstan-deprecation-rules: ^2.0
- phpstan/phpstan-phpunit: ^2.0
- spatie/invade: ^2.1
- spatie/ray: ^1.37
Suggests
- spatie/browsershot: Required for JavaScript rendering with BrowsershotRenderer (^5.0.5)
This package is auto-updated.
Last update: 2026-03-02 08:37:19 UTC
README
Crawl the web using PHP
This package provides a powerful, easy-to-use class to crawl links on a website. Under the hood, Guzzle promises are used to crawl multiple URLs concurrently.
Because the crawler can execute JavaScript, it can crawl JavaScript-rendered sites. Under the hood, Chrome and Puppeteer are used to power this feature.
Here's a quick example:
use Spatie\Crawler\Crawler;
use Spatie\Crawler\CrawlResponse;

Crawler::create('https://example.com')
    ->onCrawled(function (string $url, CrawlResponse $response) {
        echo "{$url}: {$response->status()}\n";
    })
    ->start();
Or collect all URLs on a site:
$urls = Crawler::create('https://example.com')
    ->internalOnly()
    ->depth(3)
    ->foundUrls();
You can also test your crawl logic without making real HTTP requests:
Crawler::create('https://example.com')
    ->fake([
        'https://example.com' => '<html><a href="/about">About</a></html>',
        'https://example.com/about' => '<html>About page</html>',
    ])
    ->foundUrls();
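The pieces above compose: a fake response map can be combined with a depth limit and an onCrawled callback to exercise crawl logic end to end without network access. A minimal sketch, assuming only the methods shown in the examples above (fake, depth, onCrawled, start) chain on the same builder:

```php
<?php

use Spatie\Crawler\Crawler;
use Spatie\Crawler\CrawlResponse;

// Assumed composition of the methods shown above: the fake map
// stands in for real HTTP responses, so the callback fires for
// each faked URL the crawler discovers by following links.
Crawler::create('https://example.com')
    ->fake([
        'https://example.com' => '<html><a href="/about">About</a></html>',
        'https://example.com/about' => '<html>About page</html>',
    ])
    ->depth(1)
    ->onCrawled(function (string $url, CrawlResponse $response) {
        echo "crawled {$url}\n";
    })
    ->start();
```

Because no real requests are made, this runs deterministically in a test suite and stays fast regardless of the size of the fake map.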
Support us
We invest a lot of resources into creating best-in-class open source packages. You can support us by buying one of our paid products.
We highly appreciate you sending us a postcard from your hometown, mentioning which of our package(s) you are using. You'll find our address on our contact page. We publish all received postcards on our virtual postcard wall.
Documentation
All documentation is available on our documentation site.
Testing
composer test
Changelog
Please see CHANGELOG for more information on what has changed recently.
Contributing
Please see CONTRIBUTING for details.
Security Vulnerabilities
Please review our security policy on how to report security vulnerabilities.
Credits
License
The MIT License (MIT). Please see License File for more information.