PHPCrawl is a web crawler/spider library written in PHP. It supports filters, limiters, cookie handling, robots.txt handling, multiprocessing, and more.
Because the main project now appears to be abandoned (no updates for 4 years), I will make changes and fixes in this repository.
- PHP 7 only; not backwards compatible with the 0.8 versions
- Introduced namespaces
- Lots of bug fixes
- Refactored various class sections
- Preparation for multiprocess mode on Windows (pthreads or parallel extension)
Pull requests are welcome.