teamtnt / crawler
Distributed Crawler Architecture
Type: project
Requires
- php: ^7.1.3
- fideloper/proxy: ^4.0
- guzzlehttp/guzzle: ^6.3
- laravel/framework: 5.8.*
- laravel/tinker: ^1.0
- symfony/filesystem: ^4.2
- symfony/process: ^4.2
Requires (Dev)
- beyondcode/laravel-dump-server: ^1.0
- filp/whoops: ^2.0
- fzaninotto/faker: ^1.4
- mockery/mockery: ^1.0
- nunomaduro/collision: ^3.0
- phpunit/phpunit: ^7.5
This package is auto-updated.
Last update: 2024-11-30 01:59:22 UTC
README
A distributed crawler
Requirements
Installation
Via Composer:
composer require teamtnt/crawler
Configuration
Each instance needs its own identifier, which can be set in the .env file:
NODE_NAME="Instance 1"
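When several instances are provisioned from the same checkout, editing .env by hand on each one is easy to get wrong. The sketch below shows one way to set the identifier from a script. The function name set_node_name is hypothetical and not part of the package; it only assumes that .env uses the NODE_NAME="..." line shown above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with teamtnt/crawler):
# set or replace the NODE_NAME entry in a .env file.
set_node_name() {
    env_file="$1"
    node_name="$2"
    tmp_file="${env_file}.tmp"

    # Create the file if it does not exist yet.
    touch "$env_file"

    # Copy every line except an existing NODE_NAME entry.
    # grep exits non-zero when nothing is left, which is fine here.
    grep -v '^NODE_NAME=' "$env_file" > "$tmp_file" || true

    # Append the new identifier, quoted so that spaces are preserved.
    printf 'NODE_NAME="%s"\n' "$node_name" >> "$tmp_file"

    mv "$tmp_file" "$env_file"
}
```

For example, running set_node_name .env "Instance 2" on the second machine leaves all other settings untouched and guarantees exactly one NODE_NAME line.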
The domain feeder needs a seed domain to start from. Once a seed domain is in place, start the crawler by running:
php artisan crawler
To scrape a single URL, run:
php artisan url:frontier www.example.com/something
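The url:frontier command above takes one URL per invocation. To scrape a handful of URLs, a small wrapper can call it once per line of a text file. This is only a sketch: the function frontier_from_file and the ARTISAN_CMD override are hypothetical names introduced here, and the only thing assumed about the package is the command form shown above.

```shell
#!/bin/sh
# Hypothetical wrapper (not shipped with teamtnt/crawler):
# run `php artisan url:frontier <url>` once for each URL listed in a file.
# ARTISAN_CMD is an assumed override, useful for dry runs (e.g. ARTISAN_CMD=echo).
frontier_from_file() {
    url_file="$1"

    # The `|| [ -n "$url" ]` part handles a last line without a trailing newline.
    while IFS= read -r url || [ -n "$url" ]; do
        # Skip blank lines and lines starting with '#'.
        case "$url" in
            ''|'#'*) continue ;;
        esac

        # Left unquoted on purpose so "php artisan" splits into two words.
        # stdin is redirected so the command cannot consume the URL list.
        ${ARTISAN_CMD:-php artisan} url:frontier "$url" < /dev/null
    done < "$url_file"
}
```

Calling frontier_from_file urls.txt from the project root processes the URLs one after another. Setting ARTISAN_CMD=echo first prints the commands instead of running them, which is a cheap way to check the list.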