content-extract / content-processor
Robust PHP library for batch document processing. Extracts content from PDFs/text and generates structured JSON according to user-defined schemas. Now with semantic structuring, OCR support for scanned PDFs, text normalization, and alias-driven field matching. Production-ready, secure, zero unnecess
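As a rough illustration of what "alias-driven field matching" against a user-defined schema could look like, here is a minimal sketch in plain PHP. It does not use this package's API, which is not shown on this page: the function names `normalizeText` and `extractFields`, the schema layout (field name mapped to a list of lowercase aliases), and the `Label: value` line format are all assumptions made for the example. Text from a PDF would first have to be obtained separately, for instance through the smalot/pdfparser dependency listed below.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the package): collapse runs of
 * spaces and tabs into one space and trim the ends.
 */
function normalizeText(string $text): string
{
    $collapsed = preg_replace('/[ \t]+/', ' ', $text) ?? $text;
    return trim($collapsed);
}

/**
 * Hypothetical helper (not part of the package): scan "Label: value"
 * lines and map each label to a schema field through its aliases.
 *
 * @param array<string, list<string>> $schema field name => lowercase aliases
 * @return array<string, ?string> field name => first matched value, or null
 */
function extractFields(string $text, array $schema): array
{
    $result = array_fill_keys(array_keys($schema), null);
    $lines = preg_split('/\R/', $text) ?: [];

    foreach ($lines as $line) {
        $line = normalizeText($line);
        if (!str_contains($line, ':')) {
            continue; // not a "Label: value" line
        }
        [$label, $value] = array_map('trim', explode(':', $line, 2));
        $label = strtolower($label);

        foreach ($schema as $field => $aliases) {
            // Keep the first match for each field; later duplicates are ignored.
            if ($result[$field] === null && in_array($label, $aliases, true)) {
                $result[$field] = $value;
            }
        }
    }

    return $result;
}

// Assumed example schema and input text.
$schema = [
    'name'  => ['name', 'full name'],
    'email' => ['email', 'e-mail'],
    'phone' => ['phone', 'telephone'],
];
$text = "Full Name:   Ada Lovelace\nE-mail: ada@example.com\n";

echo json_encode(extractFields($text, $schema), JSON_PRETTY_PRINT), PHP_EOL;
```

Fields with no matching label (here `phone`) stay `null` in the JSON output, which keeps the result shape identical across a batch of documents.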
Package info
github.com/saul9809/content_extract-library
pkg:composer/content-extract/content-processor
1.5.0
2026-04-20 06:29 UTC
Requires
- php: >=8.1
- smalot/pdfparser: ^2.0
Requires (Dev)
- phpunit/phpunit: ^11.0
- squizlabs/php_codesniffer: ^3.7
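For a project that consumes this package, the metadata above suggests a `composer.json` fragment along the following lines. The `^1.5` constraint is an assumption derived from the listed 1.5.0 release; pin whichever version range suits your project. Composer resolves smalot/pdfparser transitively, so it does not need to be listed here.

```json
{
    "require": {
        "php": ">=8.1",
        "content-extract/content-processor": "^1.5"
    }
}
```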