kit-jotform / php-ftfy
Fixes text for you — PHP port of the Python ftfy library
Requires
- php: >=8.1
- ext-intl: *
- ext-mbstring: *
Requires (Dev)
- phpunit/phpunit: ^11.0
A PHP 8.1+ port of the Python ftfy library (version 6.3.1) by Robyn Speer.
    use Ftfy\Ftfy;

    echo Ftfy::fixText("(à¸‡'âŒ£')à¸‡"); // (ง'⌣')ง
What it does
ftfy fixes mojibake — text that was encoded in UTF-8 but decoded as something else (Windows-1252, Latin-1, etc.), producing garbled characters.
    use Ftfy\Ftfy;

    // Fix common mojibake
    Ftfy::fixText('âœ” No problems'); // ✔ No problems

    // Fix multiple layers of mojibake
    Ftfy::fixText('The Mona Lisa doesnÃƒÂ¢Ã¢â€šÂ¬Ã¢â€žÂ¢t have eyebrows.'); // "The Mona Lisa doesn't have eyebrows."

    // Fix HTML entities outside of HTML
    Ftfy::fixText('P&EACUTE;REZ'); // PÉREZ

    // Correctly-decoded text is left unchanged
    Ftfy::fixText('IL Y MARQUÉ…'); // IL Y MARQUÉ…
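To make the definition above concrete, the sketch below reproduces the decoding mistake by hand using only ext-mbstring. It does not call php-ftfy and is not how the library works internally; it only shows how a single layer of Windows-1252 mojibake arises and why it is reversible. The variable names are illustrative.

```php
<?php
declare(strict_types=1);

// Requires ext-mbstring. This file is assumed to be saved as UTF-8,
// so "ö" is stored as the two bytes C3 B6.
$original = 'schön';

// Wrong decode: read the UTF-8 bytes as Windows-1252, one character per byte.
// C3 becomes "Ã" and B6 becomes "¶".
$garbled = mb_convert_encoding($original, 'UTF-8', 'Windows-1252');

// Manual repair: map each character back to its Windows-1252 byte,
// which restores the original UTF-8 byte sequence.
$repaired = mb_convert_encoding($garbled, 'Windows-1252', 'UTF-8');

echo $garbled, PHP_EOL;            // schÃ¶n
echo $repaired, PHP_EOL;           // schön
var_dump($repaired === $original); // bool(true)
```

A manual round trip like this only works when you already know which wrong encoding was applied and how many times. Detecting that automatically, and leaving correctly decoded text alone, is the part ftfy handles.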
Installing
composer require kit-jotform/php-ftfy
Requirements: PHP >= 8.1, ext-mbstring, ext-intl
Usage
Ftfy::fixText(string $text, ?TextFixerConfig $config = null): string
Fix all encoding issues in a string.
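    use Ftfy\Ftfy;

    // Double quotes are required so that \u{a0} becomes a no-break space (U+00A0);
    // inside single quotes PHP would keep the backslash sequence literally.
    $fixed = Ftfy::fixText("Ã\u{a0} perturber la rÃ©flexion"); // à perturber la réflexion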
Ftfy::fixEncoding(string $text): string
Fix only encoding/mojibake issues, without applying other text fixes.
    $fixed = Ftfy::fixEncoding("l'humanitÃ©"); // l'humanité
Ftfy::fixAndExplain(string $text, ?TextFixerConfig $config = null): array
Returns ['text' => string, 'explanation' => array] with the fixed text and a list of changes made.
    ['text' => $fixed, 'explanation' => $explanation] = Ftfy::fixAndExplain('âœ” No problems');
    // $fixed => '✔ No problems'
    // $explanation => [['name' => 'fix_encoding', 'cost' => 1, ...]]
Configuration
    use Ftfy\Ftfy;
    use Ftfy\TextFixerConfig;

    $config = new TextFixerConfig(
        fixEntities: true,           // decode HTML entities
        fixEncoding: true,           // fix mojibake
        fixSurrogates: true,         // fix surrogate characters
        fixLineBreaks: false,        // normalize line breaks
        fixLatin: false,             // fix Latin-1 lookalikes
        fixCharWidths: false,        // normalize character widths
        uncurlQuotes: true,          // straighten curly quotes
        removeTerminalEscapes: true,
        maxDecodeLength: 1_000_000,
    );

    $fixed = Ftfy::fixText($garbled, $config);
Use $config->with(fixEntities: false) to produce a modified copy.
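As a rough illustration of deriving one configuration from another, the sketch below builds a base config and a copy with entity decoding turned off, then runs the same input through both. It uses only the calls documented in this README. The autoloader path, the sample string, and the assumption that constructor parameters left out here have defaults are mine, so adjust them to your setup.

```php
<?php
declare(strict_types=1);

// Assumption: the package was installed with Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Ftfy\Ftfy;
use Ftfy\TextFixerConfig;

// Assumption: parameters not named here fall back to library defaults.
$base = new TextFixerConfig(fixEntities: true);

// with() is described above as producing a modified copy,
// so $base should keep fixEntities: true.
$keepEntities = $base->with(fixEntities: false);

$sample = 'Tom &amp; Jerry'; // hypothetical input containing an HTML entity

// Print both results side by side to compare the two configurations.
echo Ftfy::fixText($sample, $base), PHP_EOL;
echo Ftfy::fixText($sample, $keepEntities), PHP_EOL;
```

Keeping a shared base config and deriving variants from it avoids repeating the full argument list wherever only one option differs.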
Command-line usage
A CLI script is included at bin/ftfy.
Fix a string directly:
php bin/ftfy "schÃ¶n" # schön
Pipe from stdin:
echo "Hello &amp; world" | php bin/ftfy # Hello & world
Fix a file:
php bin/ftfy --file input.txt
Show what was fixed (explanation goes to stderr):
    php bin/ftfy --explain "schÃ¶n"
    # schön
    #
    # explanation:
    #   - encode: sloppy-windows-1252
    #   - decode: utf-8
Install globally (optional):
    ln -s "$(pwd)/bin/ftfy" /usr/local/bin/ftfy
    ftfy "schÃ¶n"
Options:
| Option | Short | Description |
|---|---|---|
| --explain | -e | Print what was fixed (to stderr) |
| --file | -f | Read input from a file |
| --help | -h | Show help |
Running tests
    composer install
    vendor/bin/phpunit tests/
Credits
- Original Python library: ftfy by Robyn Speer, licensed under Apache 2.0
- PHP port licensed under MIT