A wrapper for the Tesseract OCR engine


Version 1.1, released 2013-06-26 21:04 UTC



A small library for PHP 5.3 and later that makes it easier to work with the open-source Tesseract OCR engine.


You need a working Tesseract installation. For more information about installation and adding language support, see Tesseract's README.

Then install this library, which is available on Packagist, through Composer:

$ composer require ddeboer/tesseract:1.0


If the tesseract binary is on your PATH, construct Tesseract without arguments:

use Ddeboer\Tesseract\Tesseract;

$tesseract = new Tesseract();

Otherwise, construct Tesseract with the path to the binary:

$tesseract = new Tesseract('/usr/local/bin/tesseract');
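If the binary lives in different places on different machines, one option is to look it up before constructing the wrapper. The following is a minimal sketch and not part of this library: `findTesseractBinary()` is a hypothetical helper that scans the directories of a PATH-style string for an executable file with the given name. On Windows the name would presumably need to be `tesseract.exe`.

```php
<?php

/**
 * Hypothetical helper (not part of ddeboer/tesseract): return the full path
 * of the first executable named $name found in $pathList, or null if none.
 *
 * $pathList is a PATH-style string; it defaults to the PATH environment variable.
 */
function findTesseractBinary($name = 'tesseract', $pathList = null)
{
    if ($pathList === null) {
        // getenv() returns false when PATH is unset; the cast turns that into ''.
        $pathList = (string) getenv('PATH');
    }

    foreach (explode(PATH_SEPARATOR, $pathList) as $dir) {
        if ($dir === '') {
            continue; // skip empty PATH entries
        }
        $candidate = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $name;
        if (is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }

    return null;
}
```

When the helper returns a path, it can be passed to the constructor as shown above; when it returns null, Tesseract is most likely not installed or not on the PATH.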

Get the Tesseract version and the list of supported languages:

$version = $tesseract->getVersion();

$languages = $tesseract->getSupportedLanguages();
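One use for the language list is to check up front that the language data you need is installed. The sketch below assumes that `getSupportedLanguages()` returns a flat array of language codes such as `eng` and `nld`; the return type is not stated here, so treat that as an assumption. `missingLanguages()` is a hypothetical helper, not part of this library, and the `$supported` array stands in for the real call.

```php
<?php

/**
 * Hypothetical helper: return the codes from $required that are absent
 * from $supported, in the order they appear in $required.
 */
function missingLanguages(array $required, array $supported)
{
    return array_values(array_diff($required, $supported));
}

// Stand-in for $tesseract->getSupportedLanguages(); the values are illustrative.
$supported = array('eng', 'nld', 'deu');

$missing = missingLanguages(array('nld', 'fra'), $supported);
if (count($missing) > 0) {
    echo 'Missing language data: ' . implode(', ', $missing) . PHP_EOL;
}
```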

Perform OCR on an image file:

$text = $tesseract->recognize('myfile.tif');

Optionally, specify the language(s) as the second argument:

$text = $tesseract->recognize('myfile.tif', array('nld', 'eng'));

To set Tesseract’s page segmentation mode, pass it as the third argument:

$text = $tesseract->recognize('myfile.tif', null, Tesseract::PAGE_SEG_MODE_AUTOMATIC_OSD);
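When running OCR over several files, it can help to keep going after one file fails. The sketch below is not part of this library: `recognizeAll()` is a hypothetical helper that takes any callable mapping a file path to text, so it can wrap `recognize()` with whatever language and page segmentation arguments you need. Which exceptions `recognize()` throws on failure is not documented here, so the sketch simply catches the base `\Exception`.

```php
<?php

/**
 * Hypothetical helper: call $recognize on each file and collect the results.
 *
 * $recognize is any callable that takes a file path and returns text,
 * for example a closure around $tesseract->recognize().
 *
 * Returns array('text' => array(file => text), 'errors' => array(file => message)).
 */
function recognizeAll(array $files, $recognize)
{
    $results = array('text' => array(), 'errors' => array());

    foreach ($files as $file) {
        try {
            $results['text'][$file] = call_user_func($recognize, $file);
        } catch (\Exception $e) {
            // Record the failure for this file and continue with the next one.
            $results['errors'][$file] = $e->getMessage();
        }
    }

    return $results;
}
```

With the wrapper, the callable could be a closure such as `function ($file) use ($tesseract) { return $tesseract->recognize($file, array('eng'), Tesseract::PAGE_SEG_MODE_AUTOMATIC_OSD); }`.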