shababsoftwares/ocr-text-extractor

This is a simple PHP project that uses Tesseract to extract text from images.

v1.0 2024-01-19 10:37 UTC

This package is auto-updated.

Last update: 2024-05-19 16:25:21 UTC


README


This is simple PHP code that uses Tesseract-OCR to read and extract the text in an image.

Installation

First, install the package through Composer.

composer create-project shababsoftwares/ocr-text-extractor

How to Install Tesseract-OCR

Tesseract-OCR must be installed and its environment variables must be set.

On Windows 10 and 11

Installation guide: How to install Tesseract OCR on Windows

Step 1 : Install Tesseract-OCR.

Step 2 : Under System variables, edit Path and add a new entry: %InstallationPath%\Tesseract-OCR, for example C:\Program Files (x86)\Tesseract-OCR

Step 3 : Under System variables, add a new variable. Variable Name: TESSDATA_PREFIX. Variable Value: %InstallationPath%\Tesseract-OCR, for example C:\Program Files (x86)\Tesseract-OCR\
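The Windows settings above can be checked from PHP before running any OCR. The following is a minimal sketch, not part of this package: `path_contains` is a hypothetical helper, and the install directory is assumed to be the example location shown above.

```php
<?php
// Hypothetical helper: reports whether a directory appears in a PATH-style string.
// The comparison is case-insensitive and ignores trailing slashes, which suits Windows paths.
function path_contains(string $pathValue, string $dir, string $separator = PATH_SEPARATOR): bool
{
    $normalize = static function (string $p): string {
        return strtolower(rtrim(str_replace('/', '\\', trim($p)), '\\'));
    };
    $wanted = $normalize($dir);
    foreach (explode($separator, $pathValue) as $entry) {
        if ($entry !== '' && $normalize($entry) === $wanted) {
            return true;
        }
    }
    return false;
}

// Assumed install location; adjust it to match your system.
$installDir = 'C:\\Program Files (x86)\\Tesseract-OCR';

$path = getenv('PATH') ?: '';
$prefix = getenv('TESSDATA_PREFIX');

echo 'Tesseract directory on PATH: ', path_contains($path, $installDir, ';') ? 'yes' : 'no', PHP_EOL;
echo 'TESSDATA_PREFIX: ', $prefix === false ? '(not set)' : $prefix, PHP_EOL;
```

If either line reports a missing value, revisit the environment variable steps and restart the web server so that PHP picks up the new environment.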

On Linux / Ubuntu

Step 1 : Update your system. Begin the installation process by updating the APT package index.

    sudo apt update

Step 2 : Add the Tesseract OCR 5 PPA to your system by running the command below.

    sudo add-apt-repository ppa:alex-p/tesseract-ocr-devel

Step 3 : Install Tesseract on Ubuntu. Run the command:

    sudo apt install -y tesseract-ocr

Once installation is complete, update your system.

    sudo apt update 

Confirm the Tesseract version installed.

    tesseract --version
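The same version check can be run from PHP, which also confirms that the web server's user can reach the binary. This is a rough sketch under two assumptions: `tesseract` is on the PATH, and the first line of its output looks like `tesseract 5.3.0` (some builds print a leading `v`). The function name `parse_tesseract_version` is hypothetical.

```php
<?php
// Hypothetical helper: pulls the numeric version out of `tesseract --version` output.
// Returns null when no line starts with "tesseract <version>".
function parse_tesseract_version(string $output): ?string
{
    if (preg_match('/^tesseract\s+v?(\d+(?:\.\d+)*)/mi', $output, $matches)) {
        return $matches[1];
    }
    return null;
}

// Redirect stderr to stdout because some Tesseract versions print the version there.
$raw = shell_exec('tesseract --version 2>&1');
$version = is_string($raw) ? parse_tesseract_version($raw) : null;

echo $version !== null ? "Tesseract $version detected" : 'Tesseract not found on PATH', PHP_EOL;
```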

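Set the paths of the Tesseract executable and the input image, and the name of the output text file, in the shell_exec call. Tesseract appends .txt to the output base name, so out produces out.txt.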

shell_exec('"C:\\Program Files (x86)\\Tesseract-OCR\\tesseract" "C:\\xampp\\htdocs\\OCR-Text-Recognition\\images\\'.$file_name.'" out');
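Because `$file_name` usually comes from an upload, it is worth quoting every argument before it reaches the shell. The sketch below shows one possible way to do that with PHP's built-in `escapeshellarg` and then read the result back. The function names `build_tesseract_command` and `extract_text`, the `images` directory, and the `sample.png` file name are assumptions for illustration, not part of this package.

```php
<?php
// Hypothetical helper: builds the Tesseract command with every argument shell-quoted.
function build_tesseract_command(string $binary, string $imagePath, string $outputBase): string
{
    return escapeshellarg($binary) . ' ' . escapeshellarg($imagePath) . ' ' . escapeshellarg($outputBase);
}

// Hypothetical helper: runs Tesseract and returns the extracted text, or null on failure.
function extract_text(string $binary, string $imagePath, string $outputBase): ?string
{
    if (!is_file($imagePath)) {
        return null; // nothing to process
    }
    shell_exec(build_tesseract_command($binary, $imagePath, $outputBase) . ' 2>&1');

    // Tesseract writes its result to "<outputBase>.txt".
    $textFile = $outputBase . '.txt';
    if (!is_file($textFile)) {
        return null;
    }
    $text = file_get_contents($textFile);
    return $text === false ? null : $text;
}

// Assumed example values; basename() drops any directory parts from the supplied name.
$file_name = 'sample.png';
$image = __DIR__ . DIRECTORY_SEPARATOR . 'images' . DIRECTORY_SEPARATOR . basename($file_name);

$text = extract_text('tesseract', $image, __DIR__ . DIRECTORY_SEPARATOR . 'out');
echo $text ?? 'No text extracted', PHP_EOL;
```

Here `tesseract` is resolved through the PATH. If it is not on the PATH, pass the full path to the executable as the first argument instead.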

License

The MIT License (MIT). Please see LICENSE for more information.

Shabab Softwares

www.shababsoftwares.com

Shabab Softwares (c) 2024