dotpack / php-boiler-pipe
PhpBoilerPipe. Boilerplate Removal and Fulltext Extraction from HTML pages
Requires (Dev)
- phpunit/phpunit: 4.0.*
This package is not auto-updated.
Last update: 2024-06-12 17:59:32 UTC
README
Project Archived
This project is no longer maintained. Please refer to pforret/pf-article-extractor for further updates and continued development.
Thank you for your support!
Boilerplate Removal and Fulltext Extraction from HTML pages.
Partial implementation of https://github.com/kohlschutter/boilerpipe in PHP. Requires PHP >= 5.4.
Example
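    <?php
    # autoload (assumes the package was installed with Composer)
    require __DIR__ . '/vendor/autoload.php';

    # html
    $path = "http://example.com/some-article.html";
    $data = file_get_contents($path);

    # code
    $ae = new DotPack\PhpBoilerPipe\ArticleExtractor();
    echo $ae->getContent($data) . "\n";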
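The example above passes the result of file_get_contents() straight to the extractor, but that function returns false when the page cannot be read. A minimal sketch of a guard around that step follows. The helper name extractArticleText is hypothetical and not part of PhpBoilerPipe; the extractor is passed in as a callable so the sketch does not assume anything about the library beyond the getContent() call shown above.

```php
<?php
// Hypothetical helper (not part of PhpBoilerPipe): read HTML from a local
// path or URL, then hand it to an extractor callable.
function extractArticleText($path, callable $extract)
{
    // file_get_contents() returns false on failure. The @ operator silences
    // the PHP warning so the failure is reported as an exception instead.
    $html = @file_get_contents($path);
    if ($html === false) {
        throw new RuntimeException("Could not read HTML from: " . $path);
    }

    // Delegate the actual boilerplate removal to the supplied callable.
    return $extract($html);
}
```

With the library installed, the callable could be array($ae, 'getContent'), where $ae is the ArticleExtractor instance from the example, so that extractArticleText($path, array($ae, 'getContent')) returns the extracted text or throws if the page could not be fetched. The array form keeps the sketch compatible with PHP 5.4.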