xp-forge/pdf-parser

PDF Parser

github.com/xp-forge/pdf-parser

v0.1.0 2026-05-30 16:36 UTC

XP Framework module, BSD licence. Requires PHP 7.4+, supports PHP 8.0+.

Parses PDF files to extract text and images.

Example

Low-level usage:

use com\adobe\pdf\PdfReader;
use util\cmd\Console;
use io\streams\FileInputStream;

$reader= new PdfReader(new FileInputStream($argv[1]));

// Create objects lookup table while streaming
$objects= $trailer= [];
foreach ($reader->objects() as $kind => $value) {
  if ('object' === $kind) {
    $objects[$value['id']->hashCode()]= $value['dict'];
  } else if ('trailer' === $kind) {
    $trailer+= $value;
  }
}

Console::writeLine('Trailer: ', $trailer);

// Optional meta information like author and creation date
if ($info= ($trailer['Info'] ?? null)) {
  Console::writeLine('Info: ', $objects[$info->hashCode()]);
}

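// Root catalogue and pages enumeration. The trailer references the catalogue
// only; the page tree root is referenced by the catalogue's Pages entry.
$root= $objects[$trailer['Root']->hashCode()];
Console::writeLine('Root: ', $root);
Console::writeLine('Pages: ', $objects[$root['Pages']->hashCode()]);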
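
Once the lookup table is built, the same `hashCode()` lookups can be used to walk the page tree. The following is a minimal sketch and not part of this library: `Ref` is a hypothetical stand-in for the reference objects the reader yields, the dictionaries are hand-written, and it assumes names such as `Type` arrive as plain strings and `Kids` as an array of references. The `Catalog`, `Pages`, `Kids` and `Page` entries follow the PDF specification's page tree; the values `PdfReader` actually returns may be shaped differently.

```php
<?php

// Hypothetical stand-in for the reader's reference objects. Only hashCode()
// is assumed, mirroring how references are used in the example above.
final class Ref {
  private $number, $generation;

  public function __construct(int $number, int $generation= 0) {
    $this->number= $number;
    $this->generation= $generation;
  }

  public function hashCode(): string {
    return $this->number.'_'.$this->generation;
  }
}

// Collects leaf page dictionaries depth-first by following Kids references.
// Assumes every referenced object exists in the lookup table.
function pages(array $objects, Ref $ref): array {
  $node= $objects[$ref->hashCode()];
  if ('Page' === $node['Type']) return [$node];

  $pages= [];
  foreach ($node['Kids'] as $kid) {
    $pages= array_merge($pages, pages($objects, $kid));
  }
  return $pages;
}

// Hand-built lookup table (assumption): catalogue -> page tree -> two pages
$objects= [
  '1_0' => ['Type' => 'Catalog', 'Pages' => new Ref(2)],
  '2_0' => ['Type' => 'Pages', 'Kids' => [new Ref(3), new Ref(4)], 'Count' => 2],
  '3_0' => ['Type' => 'Page', 'Parent' => new Ref(2)],
  '4_0' => ['Type' => 'Page', 'Parent' => new Ref(2)],
];
$trailer= ['Root' => new Ref(1)];

$root= $objects[$trailer['Root']->hashCode()];
$found= pages($objects, $root['Pages']);
echo count($found), " page(s)\n";
```

Nested `Pages` nodes are handled by the recursion, so the same function works for documents whose page tree has more than one level.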

See also