1tomany/pdf-pack

A simple PHP library for rasterizing PDF pages and extracting their text for use with large language models

Package info

github.com/1tomany/pdf-pack

pkg:composer/1tomany/pdf-pack

v0.6.0 2026-03-05 00:02 UTC

Last update: 2026-03-05 00:07:02 UTC


README

pdf-pack is a simple PHP library for rasterizing PDF pages and extracting their text for use with large language models. It has a single dependency, the Symfony Process component, which it uses to run the Poppler command-line tools (Poppler is a PDF rendering library derived from xpdf).

Installation

Install the library using Composer:

composer require 1tomany/pdf-pack

Installing Poppler

Before you begin, make sure the pdfinfo, pdftoppm, and pdftotext binaries are installed and available on your $PATH.

macOS

brew install poppler

Debian and Ubuntu

apt-get install poppler-utils
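
After installing, it can help to confirm that all three binaries are actually reachable. The following is a minimal sketch, not part of pdf-pack itself; the function name check_poppler is hypothetical and only uses the standard POSIX `command -v` lookup.

```shell
# Hypothetical helper (not shipped with pdf-pack): report which of the
# Poppler binaries required by the library can be found on $PATH.
check_poppler() {
  missing=0
  for bin in pdfinfo pdftoppm pdftotext; do
    if command -v "$bin" >/dev/null 2>&1; then
      echo "found: $bin"
    else
      echo "missing: $bin" >&2
      missing=1
    fi
  done
  # Non-zero exit status when at least one binary is missing.
  return "$missing"
}

check_poppler || echo "Install Poppler before using pdf-pack" >&2
```

If any line starts with "missing:", install or reinstall Poppler with the commands above and check that its install directory is on your $PATH.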

Usage

This library has three main features:

  • Read PDF metadata such as the number of pages
  • Rasterize one or more pages to JPEG or PNG images
  • Extract text from one or more pages
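
For orientation, the three features correspond roughly to the three Poppler tools. The sketch below shows how each tool can be called by hand using its standard flags. The function names, the 150 DPI resolution, and the argument order are illustrative assumptions; the exact invocations pdf-pack makes internally are not documented here and may differ.

```shell
# Hypothetical wrappers around the Poppler CLI tools (not pdf-pack's API).

# Print the number of pages reported by pdfinfo.
pdf_page_count() {
  pdfinfo "$1" | awk '/^Pages:/ {print $2}'
}

# Rasterize a single page to PNG at 150 DPI (an assumed resolution).
# $1 = PDF file, $2 = page number, $3 = output file name prefix.
pdf_rasterize_page() {
  pdftoppm -png -r 150 -f "$2" -l "$2" "$1" "$3"
}

# Write the text of a single page to standard output.
# $1 = PDF file, $2 = page number.
pdf_extract_page() {
  pdftotext -f "$2" -l "$2" "$1" -
}

# Example usage, assuming a file named document.pdf exists:
#   pdf_page_count document.pdf
#   pdf_rasterize_page document.pdf 1 page
#   pdf_extract_page document.pdf 1
```

pdftoppm writes its output to files whose names start with the given prefix, while pdftotext with `-` as the output argument prints to standard output.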

Extracted data is held in memory and can be written to the filesystem or converted to a data: URI. Because extracted data is held in memory, the library returns a \Generator that yields each extracted or rasterized page in turn.

There are two ways to use the library:

  1. Direct: Instantiate the OneToMany\PdfPack\Client\Poppler\PopplerClient class and call its methods directly. This approach is simpler, but it makes your application less flexible and harder to test.
  2. Actions: Create a container of OneToMany\PdfPack\Contract\Client\ClientInterface objects, and use the OneToMany\PdfPack\Factory\ClientFactory class to instantiate them.

Note: A Symfony bundle is available if you wish to integrate this library into your Symfony applications with autowiring and configuration support.

Direct usage

See examples/direct.php.

Credits

License

The MIT License