imsoft-dev / devanagari-reshaper
PHP library for correcting Devanagari i-matra (ि) visual rendering order for legacy renderers (PDF, canvas, etc.)
Requires
- php: >=8.0
- ext-mbstring: *
Requires (Dev)
- phpunit/phpunit: ^10.0
README
A lightweight PHP library for correcting Devanagari i-matra (ि) rendering order in systems that do not perform Unicode shaping — such as legacy PDF generators, canvas renderers, or custom font engines.
The Problem
In Unicode logical order, the short-i matra (U+093F) is stored after its consonant:
क + ि → stored as U+0915 U+093F
Modern browsers and OS text stacks reorder this visually. But legacy renderers (FPDF, TCPDF, canvas, some PDF libs) draw characters left-to-right in stored order, producing:
कि → displays as कि ✓ (modern)
कि → displays as कि ✗ (legacy — matra appears on the wrong side)
This library physically moves the matra in front of its consonant cluster in the string, so legacy renderers display the correct result.
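To make the storage order concrete, here is a small sketch that lists the code points of a string. It is an illustration only, not part of the library: `sketchCodePoints()` is a hypothetical helper name, and the "visual" string is built by hand rather than by the reshaper.

```php
<?php
// Illustrative sketch, not the library's code.
// sketchCodePoints() is a hypothetical helper that lists a string's code points.
function sketchCodePoints(string $text): array
{
    // mb_str_split() splits on characters (code points), not bytes.
    return array_map(
        static fn (string $ch): string => sprintf('U+%04X', mb_ord($ch, 'UTF-8')),
        mb_str_split($text, 1, 'UTF-8')
    );
}

$logical = "\u{0915}\u{093F}"; // क then ि: Unicode logical order
$visual  = "\u{093F}\u{0915}"; // ि then क: the order a non-shaping renderer needs

$logicalPoints = sketchCodePoints($logical); // ['U+0915', 'U+093F']
$visualPoints  = sketchCodePoints($visual);  // ['U+093F', 'U+0915']
$byteLength    = strlen($logical);           // 6: each code point is 3 bytes in UTF-8
```

Because each Devanagari code point occupies three bytes in UTF-8, the reordering has to move whole code points, which is why the library depends on ext-mbstring and Unicode-aware matching.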
Features
- ✅ Simple consonants: क + ि → ि + क
- ✅ Combining nukta forms: ड + ़ + ि → ि + ड + ़
- ✅ Pre-composed nukta forms: ज़ (U+095B) + ि → ि + ज़
- ✅ Deep conjunct clusters: स्त्र + ि → ि + स्त्र (unlimited depth)
- ✅ HTML-safe: only text nodes are processed, attributes are untouched
- ✅ Invisible marker stripping (BOM, ZWNJ, ZWJ)
- ✅ Batch array reshaping
- ✅ Cheap pre-check via needsReshaping()
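As an illustration of the invisible marker stripping feature, the following is a minimal sketch of how such markers could be removed. The function name `sketchStripInvisibleMarkers()` is hypothetical, and the exact set of code points the library strips, and when it strips them, is an assumption here.

```php
<?php
// Hypothetical sketch, not the library's implementation.
function sketchStripInvisibleMarkers(string $text): string
{
    // U+FEFF = BOM, U+200C = ZWNJ, U+200D = ZWJ (assumed marker set).
    $result = preg_replace('/[\x{FEFF}\x{200C}\x{200D}]/u', '', $text);

    // preg_replace() returns null on failure (e.g. invalid UTF-8); keep the input then.
    return $result ?? $text;
}

$clean = sketchStripInvisibleMarkers("\u{FEFF}a\u{200C}b\u{200D}c"); // 'abc'
```

One plausible reason to strip these before reshaping is that a ZWJ or ZWNJ sitting inside a consonant cluster would interrupt a pattern that expects consonants, viramas, and the matra to be adjacent.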
Requirements
- PHP 8.0+
- ext-mbstring
Installation
composer require imsoft-dev/devanagari-reshaper
Usage
Plain text
use Devanagari\Reshaper\DevanagariReshaper;

$reshaped = DevanagariReshaper::reshapeText('किताब');
HTML (tag-safe)
$html = '<p>किसान</p><span class="nav-link">विकास</span>';
$out = DevanagariReshaper::reshapeHtml($html);
// Tag attributes are never modified; only text node content is reshaped
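The idea of touching only text nodes can be sketched by splitting the markup on tags and transforming the pieces in between. This is a simplified illustration under stated assumptions: `sketchMapTextNodes()` is a hypothetical name, the tag pattern is naive (it does not handle `>` inside attribute values, comments, or script blocks), and the library may well use a different approach.

```php
<?php
// Hypothetical sketch, not the library's implementation.
// Applies $transform to the text between tags and leaves the tags untouched.
function sketchMapTextNodes(string $html, callable $transform): string
{
    // With one capture group and PREG_SPLIT_DELIM_CAPTURE, the result alternates:
    // even indexes are text, odd indexes are the captured tags.
    $parts = preg_split('/(<[^>]*>)/u', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        return $html; // splitting failed (e.g. invalid UTF-8); return input unchanged
    }

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            $parts[$i] = $transform($part);
        }
    }

    return implode('', $parts);
}

// Demonstration with a plain ASCII transform so the effect is easy to see.
$demo = sketchMapTextNodes('<p class="x">abc</p>', 'strtoupper');
```

Passing `[DevanagariReshaper::class, 'reshapeText']` as the callable would apply the reshaper to each text piece while leaving the tags as they are.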
Array of strings
$rows = DevanagariReshaper::reshapeArray(['किसान', 'विकास', 'Hello']);
Guard check (performance)
if (DevanagariReshaper::needsReshaping($text)) {
    $text = DevanagariReshaper::reshapeText($text);
}
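A guard like this can be cheap because text without an i-matra never needs reordering. The sketch below shows one way such a check might look. It is an assumption about the general idea, not the library's actual `needsReshaping()` logic, and `sketchNeedsReshaping()` is a hypothetical name.

```php
<?php
// Hypothetical sketch, not the library's implementation.
function sketchNeedsReshaping(string $text): bool
{
    // A plain substring search for U+093F. This is safe at byte level because
    // UTF-8 encodings of different code points never overlap.
    return str_contains($text, "\u{093F}");
}

$hindi = sketchNeedsReshaping("\u{0915}\u{093F}"); // true: contains ि
$latin = sketchNeedsReshaping('Hello');            // false
```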
How it works
A consonant cluster is defined as:
cluster = consonant_unit (virama consonant_unit)*
Where a consonant_unit is:
consonant_unit = (pre-composed-nukta-form | base-consonant + optional-combining-nukta)
The regex finds every cluster + i-matra sequence and moves the i-matra to the front:
[cluster][U+093F] → [U+093F][cluster]
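The grammar above can be sketched as a single Unicode-aware regular expression. This is a minimal illustration, not the library's actual regex: the names `SKETCH_CONSONANT_UNIT` and `sketchMoveIMatra()` are hypothetical, and the code point ranges (U+0915 to U+0939 for base consonants, U+0958 to U+095F for pre-composed nukta forms) are assumptions.

```php
<?php
// Hypothetical sketch of the cluster grammar; not the library's actual regex.
// consonant_unit = pre-composed nukta form | base consonant + optional nukta (U+093C)
const SKETCH_CONSONANT_UNIT =
    '(?:[\x{0958}-\x{095F}]|[\x{0915}-\x{0939}]\x{093C}?)';

function sketchMoveIMatra(string $text): string
{
    $unit = SKETCH_CONSONANT_UNIT;

    // cluster = unit (virama unit)*, immediately followed by the i-matra U+093F.
    $pattern = '/(' . $unit . '(?:\x{094D}' . $unit . ')*)\x{093F}/u';

    // Emit the matra first, then the captured cluster ($1).
    $result = preg_replace($pattern, "\u{093F}" . '$1', $text);

    // preg_replace() returns null on failure (e.g. invalid UTF-8); keep the input then.
    return $result ?? $text;
}

// स्त्र + ि: the matra moves in front of the whole three-consonant conjunct.
$conjunct = sketchMoveIMatra("\u{0938}\u{094D}\u{0924}\u{094D}\u{0930}\u{093F}");
```

Note that a transformation of this kind is not idempotent: in already-reshaped text a moved matra can sit directly after the previous cluster, so a second pass would move it again. It should be applied once, to text in logical order.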
Running tests
composer install
./vendor/bin/phpunit
License
MIT