tecnickcom / tc-lib-pdf-parser
PHP library to parse PDF documents
Requires
- php: >=8.1
- ext-pcre: *
- tecnickcom/tc-lib-pdf-filter: ^2.0
Requires (Dev)
- pdepend/pdepend: ^2.16
- phpcompatibility/php-compatibility: ^10.0.0@dev
- phpmd/phpmd: ^2.15
- phpunit/phpunit: ^13.1 || ^12.5 || ^11.5 || ^10.5
- squizlabs/php_codesniffer: ^4.0
- dev-main
- 3.0.46
- 3.0.45
- 3.0.43
- 3.0.42
- 3.0.41
- 3.0.39
- 3.0.38
- 3.0.37
- 3.0.36
- 3.0.35
- 3.0.34
- 3.0.33
- 3.0.32
- 3.0.31
- 3.0.30
- 3.0.28
- 3.0.27
- 3.0.26
- 3.0.25
- 3.0.24
- 3.0.23
- 3.0.21
- 3.0.20
- 3.0.19
- 3.0.18
- 3.0.17
- 3.0.16
- 3.0.15
- 3.0.14
- 3.0.13
- 3.0.12
- 3.0.11
- 3.0.10
- 3.0.9
- 3.0.8
- 3.0.7
- 3.0.5
- 3.0.4
- 3.0.3
- 2.4.33
- 2.4.32
- 2.4.31
- 2.4.29
- 2.4.28
- 2.4.27
- 2.4.26
- 2.4.25
- 2.4.23
- 2.4.22
- 2.4.21
- 2.4.20
- 2.4.19
- 2.4.18
- 2.4.17
- 2.4.16
- 2.4.15
- 2.4.14
- 2.4.13
- 2.4.12
- 2.4.11
- 2.4.10
- 2.4.9
- 2.4.8
- 2.4.7
- 2.4.6
- 2.4.5
- 2.4.4
- 2.4.1
- 2.4.0
- 2.3.9
- 2.3.8
- 2.3.7
- 2.3.6
- 2.3.5
- 2.3.4
- 2.3.3
- 2.3.2
- 2.3.0
- 2.2.3
- 2.2.2
- 2.2.1
- 2.2.0
- 2.1.22
- 2.1.21
- 2.1.20
- 2.1.19
- 2.1.18
- 2.1.17
- 2.1.16
- 2.1.15
- 2.1.14
- 2.1.13
- 2.1.12
- 2.1.11
- 2.1.10
- 2.1.9
- 2.1.8
- 2.1.7
- 2.1.6
- 2.1.5
- 2.1.4
- 2.1.3
- 2.1.2
- 2.1.1
- 2.1.0
- 2.0.1
- 2.0.0
This package is auto-updated.
Last update: 2026-04-20 07:16:41 UTC
README
Parser library for reading and extracting PDF document structures.
If this library helps your analysis pipeline, please consider supporting development via PayPal.
Overview
tc-lib-pdf-parser parses raw PDF data into structured PHP arrays suitable for extraction, analysis, and downstream processing.
The parser is designed for tooling scenarios such as content inspection, metadata extraction, validation, and migration pipelines. It favors clear structured output so applications can build higher-level analysis features without depending on fragile regular-expression parsing.
| Namespace | \Com\Tecnick\Pdf\Parser |
| Author | Nicola Asuni info@tecnick.com |
| License | GNU LGPL v3 - see LICENSE |
| API docs | https://tcpdf.org/docs/srcdoc/tc-lib-pdf-parser |
| Packagist | https://packagist.org/packages/tecnickcom/tc-lib-pdf-parser |
Features
Parsing Capabilities
- Cross-reference and object stream parsing
- Filter-aware stream decoding integration
- Structured output suitable for custom extractors
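Since the exact layout of the returned array is not described here, a custom extractor can start by printing the shape of whatever the parser returns. The following is a minimal sketch, not part of the library: `describeStructure()` is a hypothetical helper, and the sample array at the end is made up for illustration and does not reflect the parser's real output format.

```php
<?php

declare(strict_types=1);

/**
 * Describe the shape of a nested array without assuming any particular layout.
 * Hypothetical helper for illustration; it is not provided by the library.
 *
 * @param array<mixed> $data     Any nested array, e.g. the parser's return value.
 * @param int          $maxDepth How many levels to descend before stopping.
 * @param string       $prefix   Dotted path of the parent entry (used in recursion).
 * @return list<string> One line per visited entry, formatted "path: type".
 */
function describeStructure(array $data, int $maxDepth = 2, string $prefix = ''): array
{
    $lines = [];
    foreach ($data as $key => $value) {
        $path = $prefix === '' ? (string) $key : $prefix . '.' . $key;
        if (is_array($value)) {
            // Report the element count, then descend while depth budget remains.
            $lines[] = $path . ': array(' . count($value) . ')';
            if ($maxDepth > 1) {
                $lines = array_merge($lines, describeStructure($value, $maxDepth - 1, $path));
            }
        } else {
            $lines[] = $path . ': ' . get_debug_type($value);
        }
    }
    return $lines;
}

// Made-up sample data, only to show the helper's output format.
$sample = ['a' => 1, 'b' => ['c' => 'x', 'd' => [1, 2]]];
echo implode(PHP_EOL, describeStructure($sample, 2)), PHP_EOL;
```

Passing the result of `$parser->parse()` instead of `$sample` gives a quick overview of the top-level keys before writing extraction logic against them.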
Runtime Design
- Configuration options for tolerant parsing modes
- Pure-PHP parser with no external service dependency
- Typed exceptions for error handling
Requirements
- PHP 8.1 or later
- Extension: pcre
- Composer
Installation
composer require tecnickcom/tc-lib-pdf-parser
Quick Start
<?php
require_once __DIR__ . '/vendor/autoload.php';

$raw = file_get_contents('/path/to/document.pdf');
$parser = new \Com\Tecnick\Pdf\Parser\Parser(['ignore_filter_errors' => true]);
$data = $parser->parse((string) $raw);

var_dump($data);
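The Quick Start casts the result of `file_get_contents()` to a string, so an unreadable file is passed to the parser as an empty string. A more defensive variant might look like the sketch below. `parsePdfFile()` is a hypothetical wrapper, not a library function; it takes the parser call as a callback so the file checks stay independent of the library. It catches `Throwable` because the names of the library's typed exceptions are not listed in this README.

```php
<?php

declare(strict_types=1);

/**
 * Read a PDF file and hand its bytes to a parsing callback.
 * Hypothetical wrapper for illustration; it is not provided by the library.
 *
 * @param string                  $path  File to read.
 * @param callable(string): array $parse Callback wrapping the actual parser call.
 * @return array<mixed> Whatever the callback returns.
 * @throws RuntimeException When the file is unreadable, lacks a PDF header, or parsing fails.
 */
function parsePdfFile(string $path, callable $parse): array
{
    $raw = @file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read file: {$path}");
    }
    // Conforming PDF files begin with "%PDF-". Some real-world files carry
    // leading bytes before the header, so relax this check if that matters.
    if (!str_starts_with($raw, '%PDF-')) {
        throw new RuntimeException("Not a PDF file: {$path}");
    }
    try {
        return $parse($raw);
    } catch (Throwable $e) {
        // Catch-all: wrap any failure and keep the original as the previous exception.
        throw new RuntimeException("Parsing failed for {$path}: " . $e->getMessage(), 0, $e);
    }
}

// Usage with the parser from the Quick Start (requires the library to be installed):
// $data = parsePdfFile('/path/to/document.pdf', static function (string $raw): array {
//     $parser = new \Com\Tecnick\Pdf\Parser\Parser(['ignore_filter_errors' => true]);
//     return $parser->parse($raw);
// });
```

Once the library's specific exception classes are known from the API docs, the `Throwable` catch can be narrowed to them.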
Development
make deps
make help
make qa
Packaging
make rpm
make deb
For system packages, bootstrap with:
require_once '/usr/share/php/Com/Tecnick/Pdf/Parser/autoload.php';
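A script that has to run under both a Composer install and a system package could pick whichever autoloader is present. This is a sketch under the assumption that only the two paths mentioned in this README are relevant; `firstReadableFile()` is a hypothetical helper, not part of the library.

```php
<?php

declare(strict_types=1);

/**
 * Return the first path in the list that points to a readable file, or null.
 * Hypothetical helper for illustration; it is not provided by the library.
 *
 * @param list<string> $candidates Paths to try, in priority order.
 */
function firstReadableFile(array $candidates): ?string
{
    foreach ($candidates as $candidate) {
        if (is_file($candidate) && is_readable($candidate)) {
            return $candidate;
        }
    }
    return null;
}

$autoload = firstReadableFile([
    __DIR__ . '/vendor/autoload.php',                     // Composer install
    '/usr/share/php/Com/Tecnick/Pdf/Parser/autoload.php', // system package (RPM/DEB)
]);

if ($autoload !== null) {
    require_once $autoload;
}
// If $autoload is null, neither autoloader was found; report that in
// whatever way suits the application before using the parser classes.
```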
Contributing
Contributions are welcome. Please review CONTRIBUTING.md, CODE_OF_CONDUCT.md, and SECURITY.md.
Contact
Nicola Asuni - info@tecnick.com