agussaputrapro/ocrmypdf

A simple PHP wrapper for OCRmyPDF

0.1.3 2022-02-07 07:57 UTC

This package is auto-updated.

Last update: 2024-05-07 12:36:11 UTC


README

Installation

Via Composer:

$ composer require mishagp/ocrmypdf

This library depends on OCRmyPDF, which must be installed separately. See the OCRmyPDF GitHub repository for instructions on installing it on your platform.
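Since the wrapper only works when OCRmyPDF itself is installed, it can help to confirm that before the first call. The following is a minimal sketch, not part of this library: `ocrmypdfVersion()` is a hypothetical helper that runs `ocrmypdf --version` and returns the first line of output, or `null` if the command fails. It assumes `exec()` is enabled in your PHP configuration.

```php
<?php
// Hypothetical helper, not part of this library: returns the first line
// printed by `<executable> --version`, or null if the command fails.
function ocrmypdfVersion(string $executable = 'ocrmypdf'): ?string
{
    $output = [];
    $exitCode = 1;
    // 2>&1 folds stderr into the captured output so a missing binary stays quiet.
    exec(escapeshellarg($executable) . ' --version 2>&1', $output, $exitCode);

    if ($exitCode !== 0 || count($output) === 0) {
        return null;
    }
    return trim($output[0]);
}

$version = ocrmypdfVersion();
echo $version === null
    ? "ocrmypdf was not found; install it before using this library\n"
    : "Found ocrmypdf {$version}\n";
```

Running this check once at deployment time, rather than on every request, is usually enough.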

Usage

Basic example

use mishagp\OCRmyPDF\OCRmyPDF;

// Returns the file path of the generated, OCRed PDF
echo (new OCRmyPDF('document.pdf'))->run();

// Returns the file contents of the generated, OCRed PDF
echo (new OCRmyPDF('scannedImage.png'))->setOutputPDFPath(null)->run();

API

This section is a work-in-progress.

setInputData

Pass image or PDF data already loaded in memory to ocrmypdf directly via stdin.

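use mishagp\OCRmyPDF\OCRmyPDF;

// Option 1: using Imagick ($img is an Imagick instance)
$data = $img->getImageBlob();
$size = $img->getImageLength();

// Option 2: using GD ($img is a GD image); capture the PNG output in a buffer
ob_start();
imagepng($img, null, 0);
$size = ob_get_length();
$data = ob_get_clean();

// Pass the raw bytes and their length to ocrmypdf via stdin
echo (new OCRmyPDF())
    ->setInputData($data, $size)
    ->run();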
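The same approach should work for data that comes from a file or an upload rather than from an image library. The sketch below is an illustration, not part of this library: `loadInputData()` is a hypothetical helper, and it assumes that the second argument of `setInputData` is the length of the data in bytes, as the Imagick and GD examples above suggest.

```php
<?php
// Hypothetical helper, not part of this library: read a file into memory
// and return its contents together with their length in bytes.
function loadInputData(string $path): array
{
    if (!is_file($path) || !is_readable($path)) {
        throw new RuntimeException("Unable to read input file: {$path}");
    }
    $data = file_get_contents($path);
    if ($data === false) {
        throw new RuntimeException("Unable to read input file: {$path}");
    }
    // strlen() counts bytes, not characters, which suits binary data.
    return [$data, strlen($data)];
}

// Usage with this library (requires Composer's autoloader):
// [$data, $size] = loadInputData('/path/to/scan.pdf');
// echo (new \mishagp\OCRmyPDF\OCRmyPDF())->setInputData($data, $size)->run();
```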

setOutputPDFPath

Specify a writable path where ocrmypdf should write the output PDF.

use mishagp\OCRmyPDF\OCRmyPDF;
echo (new OCRmyPDF('document.pdf'))
    ->setOutputPDFPath('/outputDir/ocr_document.pdf')
    ->run();
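Because the output path must be writable, a quick check before calling `run()` can surface permission problems early. This is a minimal sketch under that assumption: `checkedOutputPath()` is a hypothetical helper, not part of this library, and it only verifies the target directory with standard PHP filesystem functions.

```php
<?php
// Hypothetical helper, not part of this library: confirm that the directory
// for the output PDF exists and is writable, then return the path unchanged.
function checkedOutputPath(string $outputPath): string
{
    $dir = dirname($outputPath);
    if (!is_dir($dir) || !is_writable($dir)) {
        throw new RuntimeException("Output directory is not writable: {$dir}");
    }
    return $outputPath;
}

// Usage with this library (requires Composer's autoloader):
// echo (new \mishagp\OCRmyPDF\OCRmyPDF('document.pdf'))
//     ->setOutputPDFPath(checkedOutputPath('/outputDir/ocr_document.pdf'))
//     ->run();
```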

setExecutable

Define a custom location for the ocrmypdf executable if, for any reason, it is not on your $PATH.

use mishagp\OCRmyPDF\OCRmyPDF;
echo (new OCRmyPDF('document.pdf'))
    ->setExecutable('/path/to/ocrmypdf')
    ->run();
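One way to decide whether `setExecutable` is needed is to search the PATH yourself and fall back to a known location. The sketch below is an illustration only: `findExecutable()` is a hypothetical helper, not part of this library, and it does not handle Windows file extensions such as `.exe`.

```php
<?php
// Hypothetical helper, not part of this library: search the directories
// listed in PATH for an executable file with the given name.
function findExecutable(string $name, ?string $path = null): ?string
{
    $path = $path ?? (getenv('PATH') ?: '');
    foreach (explode(PATH_SEPARATOR, $path) as $dir) {
        if ($dir === '') {
            continue; // skip empty PATH entries
        }
        $candidate = rtrim($dir, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR . $name;
        if (is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }
    return null; // not found in any PATH directory
}

// Usage with this library (requires Composer's autoloader):
// $executable = findExecutable('ocrmypdf') ?? '/path/to/ocrmypdf';
// echo (new \mishagp\OCRmyPDF\OCRmyPDF('document.pdf'))
//     ->setExecutable($executable)
//     ->run();
```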

License

ocrmypdf-php is released under the AGPL-3.0 License.

Credits

Development of ocrmypdf-php is based on tesseract-ocr-for-php, a PHP wrapper library for tesseract developed by thiagoalessio and associated contributors.