liz / php-xpdf
XPDF PHP, an Object Oriented library to manipulate XPDF
Requires
- php: >=5.3.3
- alchemy/binary-driver: ~2.0
Requires (Dev)
- phpunit/phpunit: ~3.7
- sami/sami: ~1.0
- silex/silex: ~1.0
README
PHP-XPDF is an object oriented wrapper for XPDF. For the moment, only PdfTotext wrapper is available.
Installation
It is recommended to install PHP-XPDF through Composer :
{ "require": { "php-xpdf/php-xpdf": "~0.2.0" } }
Dependencies :
In order to use PHP-XPDF, you need to install XPDF. Depending of your configuration, please follow the instructions at on the XPDF website.
Documentation
Driver Initialization
The easiest way to instantiate the driver is to call the `create method.
$pdfToText = XPDF\PdfToText::create();
You can optionaly pass a configuration and a logger (any
Psr\Logger\LoggerInterface
).
$pdfToText = XPDF\PdfToText::create(array( 'pdftotext.binaries' => '/opt/local/xpdf/bin/pdftotext', 'pdftotext.timeout' => 30, // timeout for the underlying process ), $logger);
Extract text
To extract text from PDF, use the getText
method.
$text = $pdtTotext->getText('document.pdf');
You can optionally extract from a page to another page.
$text = $pdtTotext->getText('document.pdf', $from = 1, $to = 4);
You can also predefined how much pages would be extracted on any call.
$pdtTotext->setpageQuantity(2); $pdtTotext->getText('document.pdf'); // extracts page 1 and 2
Use with Silex
A Silex service provider is available
$app = new Silex\Application(); $app->register(new XPDF\XPDFServiceProvider()); $app['xpdf.pdftotext']->getText('document.pdf');
Options can be passed to customize the provider.
$app->register(new XPDF\XPDFServiceProvider(), array( 'xpdf.configuration' => array( 'pdftotext.timeout' => 30, 'pdftotext.binaries' => '/opt/local/xpdf/bin/pdftotext', ), 'xpdf.logger' => $logger, ));
License
This project is licensed under the MIT license.