ges / ocr
Core document processing services for OCR, classification, extraction, and normalization.
Requires
- php: ^8.4
- illuminate/http: ^12.0
- illuminate/support: ^12.0
- nesbot/carbon: ^3.0
Requires (Dev)
- mockery/mockery: ^1.6
- orchestra/testbench: ^10.0
- pestphp/pest: ^3.0
- pestphp/pest-plugin-laravel: ^3.0
README
Laravel package for document OCR, classification, extraction, and normalization.
This package is built for French business and identity documents, with current support for:
- identity_card
- residence_permit
- passport
- visa
- crew_card
- travel_document
- other_identity_document
- kbis
- acte_propriete (land-title deed only)
- msa (parcel table)
What This Package Does
Input pipeline:
- detect technical input type: image, pdf_text, pdf_scan
- transcribe images and scanned PDFs
- classify the business document type
- extract structured data
- normalize values into a stable shape
- return a ProcessedDocumentResult
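The first pipeline step can be pictured with a minimal sketch. This is not the package's implementation: `detectInputType()`, its parameters, and the text-length heuristic are hypothetical and only show how the three input types could be told apart once `pdftotext` output is available.

```php
<?php

/**
 * Hypothetical helper (not part of ges/ocr): maps a MIME type and, for PDFs,
 * the text extracted by pdftotext to one of the three technical input types.
 */
function detectInputType(string $mimeType, string $pdfText = '', int $minTextLength = 20): string
{
    if (str_starts_with($mimeType, 'image/')) {
        return 'image';
    }

    if ($mimeType === 'application/pdf') {
        // Assumption: a PDF with almost no extractable text is a scan.
        $visible = preg_replace('/\s+/', '', $pdfText) ?? '';

        return strlen($visible) >= $minTextLength ? 'pdf_text' : 'pdf_scan';
    }

    throw new InvalidArgumentException("Unsupported MIME type: {$mimeType}");
}
```

Under this assumption, only image and pdf_scan inputs would need the vision transcription step; pdf_text inputs already carry usable text.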
Current model strategy:
- qwen2.5vl:7b for visual transcription only
- qwen2.5:7b for classification and structured extraction
Available AI providers:
- ollama
- openai
Provider strategy:
- ollama uses a multi-step pipeline: vision transcription, classification, extraction, optional MRZ merge
- openai uses a single structured request per document and returns classification plus extracted data in one response
Package Boundaries
This package contains:
- OCR/transcription services
- classifier
- extractor
- normalizer
- schema factory
- AI clients for Ollama and OpenAI
- package DocumentProcessing model
- package migration and factory
- install command
This package does not own your application workflow.
Typical app-specific code stays outside:
- accepted Document model
- upload flow
- matching an identity document against a user
- deciding whether to persist a final document
- queue jobs tied to your app domain
Install
composer require ges/ocr
Then install package assets:
php artisan ocr:install
Or install and migrate immediately:
php artisan ocr:install --migrate
Optional install flags:
php artisan ocr:install --check
php artisan ocr:install --no-config
php artisan ocr:install --no-migrations
php artisan ocr:install --force
What this command does:
- publishes config/ges-ocr.php
- publishes package migrations
- optionally runs php artisan migrate
- optionally runs php artisan ocr:health
Health check command:
php artisan ocr:health
It checks:
- pdftotext
- pdftoppm
- selected AI provider connectivity
- configured text and vision models
Configuration
Published config file:
config/ges-ocr.php
Main environment variables:
GES_OCR_AI_PROVIDER=ollama
GES_OCR_CLASSIFICATION_CONFIDENCE_THRESHOLD=0.75
GES_OCR_MAX_PAGES=0
OLLAMA_BASE_URL=http://host.docker.internal:11434
OLLAMA_TEXT_MODEL=qwen2.5:7b
OLLAMA_VISION_MODEL=qwen2.5vl:7b
OLLAMA_CONNECT_TIMEOUT=10
OLLAMA_TIMEOUT=120
OLLAMA_RETRY_TIMES=2
OLLAMA_RETRY_SLEEP_MS=500
OLLAMA_BASIC_AUTH_ENABLED=false
OLLAMA_BASIC_AUTH_USERNAME=
OLLAMA_BASIC_AUTH_PASSWORD=
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_API_KEY=
OPENAI_TEXT_MODEL=gpt-4.1-mini
OPENAI_VISION_MODEL=gpt-4.1-mini
OPENAI_CONNECT_TIMEOUT=10
OPENAI_TIMEOUT=120
OPENAI_RETRY_TIMES=2
OPENAI_RETRY_SLEEP_MS=500
GES_OCR_MRZ_OCR_ENABLED=true
GES_OCR_CLEANUP_TEMPORARY_FILES=true
GES_OCR_AI_PROVIDER accepts ollama or openai.
GES_OCR_MAX_PAGES=0 means unlimited pages.
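As an illustration of the "0 means unlimited" convention, the sketch below shows how a page limit of this kind is usually applied. `pagesToProcess()` is a hypothetical helper, not a function exposed by the package, and treating negative values as unlimited is an assumption.

```php
<?php

/**
 * Hypothetical helper (not part of ges/ocr): number of pages to process
 * for a document, given a GES_OCR_MAX_PAGES-style limit.
 */
function pagesToProcess(int $totalPages, int $maxPages): int
{
    // 0 means unlimited; negative values are treated the same way (assumption).
    if ($maxPages <= 0) {
        return $totalPages;
    }

    return min($totalPages, $maxPages);
}
```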
Main config areas:
- ai
- ollama
- openai
- mrz
- processing
Optional Ollama upstream basic auth:
- OLLAMA_BASIC_AUTH_ENABLED=true enables HTTP basic auth on requests sent to OLLAMA_BASE_URL
- OLLAMA_BASIC_AUTH_USERNAME sets the upstream username
- OLLAMA_BASIC_AUTH_PASSWORD sets the upstream password
Example OpenAI setup:
GES_OCR_AI_PROVIDER=openai
OPENAI_API_KEY=sk-...
OPENAI_TEXT_MODEL=gpt-4.1-mini
OPENAI_VISION_MODEL=gpt-4.1-mini
Public API
Main service:
use Ges\Ocr\DocumentProcessor;

$result = app(DocumentProcessor::class)->processFile(
    path: $absolutePath,
    mimeType: $mimeType,
    originalName: $originalName,
);
Returned DTO:
- originalName
- mimeType
- path
- inputType
- documentType
- status
- pagesCount
- rawClassificationJson
- rawExtractionJson
- normalizedJson
- errorMessage
Main statuses:
- pending
- processing
- done
- failed
- needs_review
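A consuming application typically branches on the returned status. The sketch below is one possible mapping, not behavior defined by the package: `nextAction()` and the action names it returns are hypothetical, and only the five status strings come from the list above.

```php
<?php

/**
 * Hypothetical application-side helper: decides what to do with a result
 * based on its status string.
 */
function nextAction(string $status): string
{
    return match ($status) {
        // The normalized payload is the application-facing output.
        'done' => 'use_normalized_payload',
        // Assumption: uncertain results are routed to a human reviewer.
        'needs_review' => 'send_to_manual_review',
        'failed' => 'report_error',
        'pending', 'processing' => 'wait',
        default => throw new UnexpectedValueException("Unknown status: {$status}"),
    };
}
```

Failing loudly on an unknown status keeps the application from silently accepting a document if the package adds new statuses later.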
Supported Output Shapes
Identity Card
Normalized keys:
- document_type
- civility
- first_name
- last_name
- date_of_birth
- place_of_birth
- document_number
- expiry_date
- nationality
- sex
- street_address
- postal_code
- city
Residence Permit
Normalized keys:
- document_type
- civility
- first_name
- last_name
- date_of_birth
- place_of_birth
- document_number
- expiry_date
- nationality
- sex
- street_address
- postal_code
- city
KBIS
Normalized keys:
- document_type
- company_name
- trade_name
- legal_form
- capital
- registration_number
- siret
- sirene
- street_address
- postal_code
- city
- naf_code
- registration_date
- issue_date
- registry_city
- legal_representatives
Representative shape:
- entity_type
- company_name
- legal_form
- civility
- first_name
- last_name
- street_address
- postal_code
- city
- registration_number
- registry_city
- role
Acte Propriete
Important: this currently means French land-title deed only.
Normalized keys:
- document_type
- cadastral_parcels
- owners
Parcel shape:
- prefixe
- section
- numero
- street_address
- postal_code
- city
Owner shape:
- entity_type
- company_name
- civility
- first_name
- last_name
Rules:
- owners are acquirers only
- sellers must not be returned as owners
- municipalities and administrations are treated as company
- lieudit/leudit may be used as parcel street_address
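The owner rules above can be sketched as a small filter. This is an illustration under stated assumptions, not the package's extractor: `ownersFromParties()`, the input keys `role`, `kind`, and `name`, and the `person` entity type value are all hypothetical. Only the output keys and the `company` value come from this README.

```php
<?php

/**
 * Hypothetical helper: turns a list of deed parties into the owner shape.
 * Input keys (role, kind, name) are assumptions made for this sketch.
 */
function ownersFromParties(array $parties): array
{
    $owners = [];

    foreach ($parties as $party) {
        // Rule: only acquirers are owners; sellers are dropped.
        if (($party['role'] ?? null) !== 'acquirer') {
            continue;
        }

        // Rule: municipalities and administrations are treated as company.
        $isCompany = in_array(
            $party['kind'] ?? 'person',
            ['company', 'municipality', 'administration'],
            true
        );

        $owners[] = [
            'entity_type'  => $isCompany ? 'company' : 'person', // 'person' is assumed
            'company_name' => $isCompany ? ($party['name'] ?? null) : null,
            'civility'     => $isCompany ? null : ($party['civility'] ?? null),
            'first_name'   => $isCompany ? null : ($party['first_name'] ?? null),
            'last_name'    => $isCompany ? null : ($party['last_name'] ?? null),
        ];
    }

    return $owners;
}
```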
Package Model
The package provides:
Ges\Ocr\Models\DocumentProcessing
This model stores:
- source file metadata
- detected input type
- business document type
- status
- raw classification JSON
- raw extraction JSON
- normalized JSON
- error message
If your app wants its own subclass, it can extend the package model.
AI Notes
If you are an AI agent working in a project using this package:
- Use DocumentProcessor::processFile(...) as the main entry point.
- Treat rawClassificationJson as model output, not final truth.
- Treat normalizedJson as the stable application-facing payload.
- For images and scanned PDFs, the package uses two LLM stages:
  - vision transcription
  - text classification/extraction
- Exception: when GES_OCR_AI_PROVIDER=openai, the package uses a one-shot analysis request instead.
- Do not assume acte_propriete means generic property deed. In this package it currently means land-title deed only.
- Distinguish identity_card from residence_permit.
- Use residence_permit for French residence permits and identity_card for French identity cards.
- For KBIS:
  - registration_number is the raw Immatriculation RCS
  - sirene is 9 digits
  - siret is optional and only if explicitly present
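The KBIS identifier notes above lend themselves to a quick sanity check on the normalized payload. The sketch below is an application-side illustration, not part of the package: `kbisIdentifierIssues()` is a hypothetical name, and it assumes the normalized values are digit-only strings without spaces. It relies on the fact that a SIRET is 14 digits, made of the 9-digit SIREN followed by a 5-digit establishment number.

```php
<?php

/**
 * Hypothetical helper: returns a list of problems found in the
 * sirene / siret keys of a normalized KBIS payload.
 */
function kbisIdentifierIssues(array $normalized): array
{
    $issues = [];

    $sirene = $normalized['sirene'] ?? null;
    if (!is_string($sirene) || preg_match('/\A\d{9}\z/', $sirene) !== 1) {
        $issues[] = 'sirene must be exactly 9 digits';
    }

    // siret is optional: only validate it when it is actually present.
    $siret = $normalized['siret'] ?? null;
    if ($siret !== null && $siret !== '') {
        if (!is_string($siret) || preg_match('/\A\d{14}\z/', $siret) !== 1) {
            $issues[] = 'siret, when present, must be exactly 14 digits';
        } elseif (is_string($sirene) && !str_starts_with($siret, $sirene)) {
            $issues[] = 'siret must start with the sirene value';
        }
    }

    return $issues;
}
```

An empty array means both identifiers look consistent; any entries could be used as a reason to route the document to review.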
Tests
Package tests live under:
tests/Unit
Manual OCR fixture tests exist for:
- CIN
- titre de séjour
- KBIS
- land-title deeds
They are gated by:
RUN_MANUAL_OCR_TESTS=1
Current Assumptions
- documents are French documents
- the selected AI provider is reachable from the Laravel app
- pdftotext and pdftoppm are available for PDF handling
Non-Goals
This package does not currently provide:
- user/document matching workflow
- approval workflow
- final accepted document persistence
- domain-specific queue orchestration
- UI components
Those belong in the consuming application.