content-extract/content-processor - Packagist.org

Robust PHP library for batch document processing. Extracts content from PDFs/text and generates structured JSON according to user-defined schemas. Now with semantic structuring, OCR support for scanned PDFs, text normalization, and alias-driven field matching. Production-ready, secure, zero unnecess

Package info

github.com/saul9809/content_extract-library

pkg:composer/content-extract/content-processor

1.5.0 2026-04-20 06:29 UTC

Requires

php: >=8.1
smalot/pdfparser: ^2.0
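
As a rough sketch of how a consuming project might declare this package, the composer.json fragment below uses the package name and PHP constraint listed on this page. The `^1.5` constraint is an assumption chosen to match the 1.5.0 release shown here; smalot/pdfparser does not need to be listed, since Composer resolves it as a transitive requirement.

```json
{
    "require": {
        "php": ">=8.1",
        "content-extract/content-processor": "^1.5"
    }
}
```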

Requires (Dev)

phpunit/phpunit: ^11.0
squizlabs/php_codesniffer: ^3.7

Suggests

None

Provides

None

Conflicts

None

Replaces

None

MIT 62f39541e88b7d73b6b1b138d97341cd6f2ac76d

Content Extract Contributors <info.woop@content-extract.org>

Keywords: php, security, pdf, PSR-4, batch-processing, json-schema, psr-12, production-ready, content-extraction, document-processing

dev-main
1.5.0
1.4.0
1.3.1
v1.3.0
dev-copilot/vscode-mo5vidd6-4itm

This package is auto-updated.

Last update: 2026-04-20 12:15:06 UTC