divido / pdf-to-img
Library to convert PDFs into images.
This package's canonical repository appears to be gone and the package has been frozen as a result.
Requires
- php: >=7
- alchemy/binary-driver: ^1.6
- howtomakeaturn/pdfinfo: ^1.1
- psr/http-message: ~1.0
- setasign/fpdi-fpdf: ^1.6
- symfony/event-dispatcher: ^3.2
Requires (Dev)
- kint-php/kint: ^1.1
- phpunit/phpunit: ^5.5
This package is auto-updated.
Last update: 2020-01-10 16:09:29 UTC
README
This library helps to convert PDF documents to images.
Dependencies
$ brew install imagemagick gs
$ brew install poppler
You'll also need to install the PHP dependencies with Composer:
$ composer install
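Before running the converter it can help to confirm that the command-line tools are actually on the PATH. The following is a minimal sketch, not part of this library: `findBinary()` is a hypothetical helper, and the binary names (`convert`, `gs`, `pdftoppm`) are assumed from the packages installed above.

```php
<?php
// Hypothetical helper (not part of pdf-to-img): look for an executable
// called $name in each directory of a PATH-style string.
function findBinary(string $name, ?string $pathEnv = null): ?string
{
    // Fall back to the process PATH when no search path is given.
    $pathEnv = $pathEnv ?? (getenv('PATH') ?: '');
    foreach (explode(PATH_SEPARATOR, $pathEnv) as $dir) {
        if ($dir === '') {
            continue;
        }
        $candidate = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $name;
        if (is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }
    return null;
}

// Binary names are assumptions based on the packages listed above.
foreach (['convert', 'gs', 'pdftoppm'] as $binary) {
    $path = findBinary($binary);
    echo str_pad($binary, 10), $path ?? 'NOT FOUND', PHP_EOL;
}
```

Any line reporting NOT FOUND points at a package that still needs installing.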
Testing
$ vendor/bin/phpunit
PDF Sources
The bundled source options for the converter are:
1. Buffer
This source is available for when the entire PDF has been read into a string variable. E.g.
$pdf_source = file_get_contents('example.pdf');
$source = new Buffer($pdf_source, 'example.pdf');
2. BufferBase64
This source is available for when the PDF has been read into a base64-encoded string. E.g.
$pdf_source = base64_encode(file_get_contents('example.pdf'));
$source = new BufferBase64($pdf_source, 'example.pdf');
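When the base64 string comes from outside your code (an API payload, for instance), it may be worth checking it before handing it to BufferBase64. The sketch below is an assumption about what a caller might do, not how the library itself validates input: `decodeBase64Pdf()` is a hypothetical helper that decodes strictly and checks for the `%PDF-` header that PDF files start with.

```php
<?php
// Hypothetical pre-check, not part of pdf-to-img: decode a base64 string
// strictly and confirm the result starts with the PDF header "%PDF-".
function decodeBase64Pdf(string $encoded): string
{
    // Strict mode returns false if the input contains characters
    // outside the base64 alphabet.
    $decoded = base64_decode($encoded, true);
    if ($decoded === false) {
        throw new InvalidArgumentException('Input is not valid base64.');
    }
    if (strncmp($decoded, '%PDF-', 5) !== 0) {
        throw new InvalidArgumentException('Decoded data does not look like a PDF.');
    }
    return $decoded;
}

// Stand-in bytes for illustration only, not a real document.
$raw = "%PDF-1.4\n% stand-in bytes\n";
$encoded = base64_encode($raw);
var_dump(decodeBase64Pdf($encoded) === $raw); // bool(true)
```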
3. Stream
This source is available for when the PDF is available as a PSR-7 StreamInterface. E.g.
// Assume the file has been downloaded from S3.
$response = $s3->getObject([
    "Bucket" => 'your-bucket',
    "Key" => '/folder/example.pdf'
]);
// $response['Body'] is now a Guzzle Stream which implements PSR-7 StreamInterface
$source = new Stream($response['Body'], 'example.pdf');
4. FileResource
This source is available for when the PDF is available from a file pointer. E.g.
$fp = fopen('example.pdf', 'r');
$source = new FileResource($fp, 'example.pdf');
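A file pointer does not have to come from a file on disk. As a rough sketch, the hypothetical helper `stringToFilePointer()` below (not part of this library) copies bytes into a `php://temp` stream and rewinds it. Whether FileResource rewinds the pointer itself is not stated here, so the sketch resets the position explicitly.

```php
<?php
// Hypothetical helper, not part of pdf-to-img: copy a string into a
// temporary stream and return a file pointer positioned at the start.
function stringToFilePointer(string $bytes)
{
    $fp = fopen('php://temp', 'r+b');
    if ($fp === false) {
        throw new RuntimeException('Could not open php://temp.');
    }
    fwrite($fp, $bytes);
    rewind($fp); // reset the position so a consumer reads from byte 0
    return $fp;
}

// Stand-in bytes for illustration only, not a real document.
$fp = stringToFilePointer("%PDF-1.4\n% stand-in bytes\n");
var_dump(is_resource($fp));       // bool(true)
var_dump(get_resource_type($fp)); // string(6) "stream"
var_dump(ftell($fp));             // int(0)
fclose($fp);
```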
Conversion Engines
1. ConvertBinaryEngine
$engine = EngineFactory::GetEngine('convert-binary');
// Optionally set arguments. @see https://www.imagemagick.org/script/convert.php for CLI options.
$engine->withOptions([
    '-quality' => '100',
]);
2. PpmToPdfBinaryEngine
$engine = EngineFactory::GetEngine('pdftoppm-binary');
// Optionally set arguments. @see http://manpages.ubuntu.com/manpages/yakkety/man1/pdftoppm.1.html for CLI options.
$engine->withOptions([
    'r' => '150',
]);
// Do not set the image type here; it is passed to $converter->process() instead.
$engine->withOptions([
    '-png' => '', // Don't do this...!
]);
Output
The converter will return an Output object, which has the following methods:
echo $output->getPath(); // string(14) /tmp/j2io0caMA
echo $output->getOriginalPdf(); // string(11) example.pdf
// Or with full path
echo $output->getOriginalPdf(true); // string(26) /tmp/j2io0caMA/example.pdf
echo $output->getSubsetPdf(); // string(18) example-subset.pdf
// Or with full path
echo $output->getSubsetPdf(true); // string(33) /tmp/j2io0caMA/example-subset.pdf
var_dump($output->getGeneratedImages());
// array(2) [
//   string(13) example-1.jpg
//   string(13) example-2.jpg
// ]
// Or with full path
var_dump($output->getGeneratedImages(true));
// array(2) [
//   string(28) /tmp/j2io0caMA/example-1.jpg
//   string(28) /tmp/j2io0caMA/example-2.jpg
// ]
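The listings above suggest two conventions: images are named after the original PDF with a numeric suffix, and the full-path variants are the directory from getPath() joined with the bare file name. The sketch below only mirrors those conventions for illustration. `imageNameFor()` and `withDirectory()` are hypothetical helpers, not library methods, and whether the suffix is the page number or a running index is not confirmed here.

```php
<?php
// Hypothetical helper mirroring the naming seen in the listings above:
// "<original name without extension>-<number>.<image extension>".
function imageNameFor(string $pdfName, int $number, string $extension): string
{
    $base = pathinfo($pdfName, PATHINFO_FILENAME);
    return sprintf('%s-%d.%s', $base, $number, $extension);
}

// Hypothetical helper: prefix each bare file name with a directory,
// as the full-path variants appear to do.
function withDirectory(string $directory, array $names): array
{
    $prefix = rtrim($directory, '/') . '/';
    return array_map(function (string $name) use ($prefix): string {
        return $prefix . $name;
    }, $names);
}

$names = [
    imageNameFor('example.pdf', 1, 'jpg'),
    imageNameFor('example.pdf', 2, 'jpg'),
];
print_r(withDirectory('/tmp/j2io0caMA', $names));
```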
Converter
The Converter class puts everything together and starts the conversion process.
$converter = new Converter($source, $engine); // $source and $engine are described above.
// This will convert the entire PDF into JPEGs
$output = $converter->process("jpg");
// This will convert only pages 2 & 4 of the PDF into PNGs
// Note that this will also create a subset PDF with just pages 2 & 4
$output = $converter->process("png", [2,4]);
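If the page list comes from user input, a caller may want to tidy it before passing it to process(). The following is a sketch under assumptions, not library behaviour: `normalizePages()` is a hypothetical helper, page numbers are assumed to be 1-based, and the page count is assumed to be known to the caller.

```php
<?php
// Hypothetical helper, not part of pdf-to-img: validate a list of page
// numbers (assumed 1-based), drop duplicates and sort ascending.
function normalizePages(array $pages, int $pageCount): array
{
    $clean = [];
    foreach ($pages as $page) {
        if (!is_int($page) || $page < 1 || $page > $pageCount) {
            throw new InvalidArgumentException(
                'Invalid page number: ' . var_export($page, true)
            );
        }
        $clean[$page] = $page; // keyed by value to drop duplicates
    }
    sort($clean); // sorts in place and reindexes from 0
    return $clean;
}

print_r(normalizePages([4, 2, 4], 5)); // pages 2 and 4, in that order
```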
Putting it all together
The example below assumes the following:
- The PDF has been read into a variable as a raw string, from a locally saved PDF (example.pdf)
- We are using the pdftoppm binary to do the conversion
- We want to save the PDF as credit-agreement.pdf
- We want our images to be JPEGs
- We want our images to be saved as credit-agreement-<page_num>.jpg
use DividoFinancialServices\PdfToImg\EngineFactory;
use DividoFinancialServices\PdfToImg\Converter;
use DividoFinancialServices\PdfToImg\Sources\Buffer;

// Load a PDF into a string
$buffer = new Buffer(file_get_contents('example.pdf'), 'credit-agreement.pdf');

// Create the conversion engine type. In this example we are using the pdftoppm binary.
$engine = EngineFactory::GetEngine('pdftoppm-binary');

// Create a Converter with the source PDF and conversion engine.
$converter = new Converter($buffer, $engine);

// Do the conversion (saving images to JPEG)
$output = $converter->process("jpg", [2, 4]);

// 2 images (pages 2 & 4) are now saved in a temp folder on the disk.
// Get the list of image filenames on disk
$images = $output->getGeneratedImages();

// A subset PDF has been created because the pages were specified. The 2-page PDF:
$subsetPdf = $output->getSubsetPdf();

// Do something with your images (upload to S3, etc.)

// When finished, perform a clean up to free up the disk space
$converter->cleanUp();
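If the step that consumes the images can throw (a failed upload, for example), the temp folder would be left behind unless cleanUp() still runs. One way to guard against that is a try/finally wrapper. This is a sketch only: `StandInConverter` and `withCleanUp()` are hypothetical, and the stand-in class exists purely so the example runs without the library. Only the `cleanUp()` method name is taken from the example above.

```php
<?php
// Stand-in for the real Converter so the sketch runs on its own.
class StandInConverter
{
    public $cleaned = false;

    public function cleanUp()
    {
        $this->cleaned = true;
    }
}

// Hypothetical wrapper: run $work and always call cleanUp() afterwards,
// even if $work throws.
function withCleanUp($converter, callable $work)
{
    try {
        return $work();
    } finally {
        $converter->cleanUp();
    }
}

$converter = new StandInConverter();
try {
    withCleanUp($converter, function () {
        // Simulate a failure while handling the generated images.
        throw new RuntimeException('Upload failed');
    });
} catch (RuntimeException $e) {
    echo 'Caught: ', $e->getMessage(), PHP_EOL;
}
var_dump($converter->cleaned); // bool(true)
```

With the real classes, the closure would hold the upload logic and `$converter` would be the Converter instance created earlier.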