braseidon / mole
Scraper is a fast and efficient web scraper that targets various forms of data.
Requires
- php: >=5.4.0
- braseidon/rolling-curl: dev-master
- chuyskywalker/rolling-curl: ~3.1.0
- stefangabos/zebra_curl: dev-master
- symfony/css-selector: ~2.6
- symfony/dom-crawler: ~2.6
Requires (Dev)
- mockery/mockery: ~0.9
- phpunit/php-token-stream: >=1.3.0
This package is not auto-updated.
Last update: 2024-11-09 16:48:08 UTC
README
Mole (name is still being decided) is a powerful web scraping tool. You specify a URL and what you want to scrape (outbound links or emails so far), and Mole keeps crawling until a limit is reached.
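To make the "crawl until a limit is reached" idea concrete, here is a minimal sketch of such a loop. It is not Mole's actual API (the documentation is not published yet): the function names `extractLinks` and `crawl`, and the idea of passing the fetcher in as a callable, are assumptions made purely for illustration.

```php
<?php
// Hypothetical sketch: none of these names come from Mole itself.

/**
 * Pull absolute http(s) links out of an HTML string with a simple regex.
 */
function extractLinks($html)
{
    preg_match_all('/href=["\'](https?:\/\/[^"\'\s>]+)["\']/i', $html, $matches);

    return array_values(array_unique($matches[1]));
}

/**
 * Crawl breadth-first from $startUrl until $limit URLs have been visited
 * or the queue runs dry. $fetch is any callable that takes a URL and
 * returns the page HTML, or false on failure.
 */
function crawl($startUrl, $limit, $fetch)
{
    $queue   = [$startUrl];
    $visited = [];

    while ($queue && count($visited) < $limit) {
        $url = array_shift($queue);
        if (isset($visited[$url])) {
            continue; // already handled this URL
        }
        $visited[$url] = true;

        $html = call_user_func($fetch, $url);
        if ($html === false) {
            continue; // failed request: nothing to extract
        }

        // Queue every outbound link we have not visited yet.
        foreach (extractLinks($html) as $link) {
            if (!isset($visited[$link])) {
                $queue[] = $link;
            }
        }
    }

    return array_keys($visited);
}
```

Because the fetcher is injected, the loop can be exercised against an in-memory array of pages before pointing it at real sites, for example `crawl('http://a.test/', 2, $fetch)` where `$fetch` looks URLs up in an array.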
Highlights
- Crawl websites and gather any sort of information you want using regex
- All requests can be sent through a random proxy from your list of proxies
- Powered by RollingCurl, a fast cURL wrapper that runs many requests concurrently
- Framework-agnostic; works with any project
- Composer ready and PSR-2 compliant
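The regex and proxy highlights can be pictured with a small sketch in plain PHP. This is an illustration under assumptions, not Mole's implementation: `extractEmails`, `pickProxy` and `fetchViaProxy` are hypothetical names, the email pattern is deliberately simple rather than RFC-complete, and the proxy strings are expected in the `host:port` form that cURL's `CURLOPT_PROXY` option accepts.

```php
<?php
// Hypothetical helpers for illustration only; not part of Mole's API.

/**
 * Collect unique, lower-cased email addresses from arbitrary text.
 * The pattern is a pragmatic approximation, not a full RFC 5322 matcher.
 */
function extractEmails($text)
{
    preg_match_all('/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}/i', $text, $matches);

    return array_values(array_unique(array_map('strtolower', $matches[0])));
}

/**
 * Pick one proxy at random from the list, or null if the list is empty.
 */
function pickProxy(array $proxies)
{
    return $proxies ? $proxies[array_rand($proxies)] : null;
}

/**
 * Fetch a URL with the cURL extension, routing the request through a
 * randomly chosen proxy when one is available. Returns the body or false.
 */
function fetchViaProxy($url, array $proxies)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);

    $proxy = pickProxy($proxies);
    if ($proxy !== null) {
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
    }

    return curl_exec($ch);
}
```

A call such as `extractEmails(fetchViaProxy($url, $proxies))` would then tie the two pieces together, with each request picking a fresh proxy from the list.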
Documentation
Documentation will be finished when v1.0.0 is up.
Installation
Mole is available via Composer:
$ composer require braseidon/mole
Testing
Mole has a PHPUnit test suite. To run the tests, run the following command from the project folder:
$ phpunit
Contributing
Contributions are more than welcome and will be fully credited.
Security
If you discover any security related issues, please email brandon@poseidonwebstudios.com instead of using the issue tracker.
Credits
License
The MIT License (MIT). Please see LICENSE for more information.