kit-jotform/php-ftfy

Fixes text for you — PHP port of the Python ftfy library

Package info

github.com/kit-jotform/php-ftfy

pkg:composer/kit-jotform/php-ftfy

v1.2.0 2026-04-10 11:33 UTC


README

A PHP 8.1+ text-fixing library based on the Python ftfy library (version 6.3.1) by Robyn Speer.

use Ftfy\Ftfy;

echo Ftfy::fixText("(à¸‡'âŒ£')à¸‡");
// (ง'⌣')ง

What it does

ftfy fixes mojibake — text that was encoded in UTF-8 but decoded as something else (Windows-1252, Latin-1, etc.), producing garbled characters.

use Ftfy\Ftfy;

// Fix common mojibake
Ftfy::fixText('âœ” No problems');
// ✔ No problems

// Fix multiple layers of mojibake
Ftfy::fixText('The Mona Lisa doesnÃƒÂ¢Ã¢â€šÂ¬Ã¢â€žÂ¢t have eyebrows.');
// The Mona Lisa doesn't have eyebrows.

// Fix HTML entities outside of HTML
Ftfy::fixText('P&EACUTE;REZ');
// PÉREZ

// Correctly-decoded text is left unchanged
Ftfy::fixText('IL Y MARQUÉ…');
// IL Y MARQUÉ…
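
To make the definition of mojibake concrete, here is a minimal sketch that produces and then reverses one case using only ext-mbstring. It does not call this library, and it hard-codes the wrong encoding (Windows-1252) that a real fixer has to detect on its own.

```php
<?php
declare(strict_types=1);

// Correct text, stored as UTF-8 bytes: "ö" is the two bytes 0xC3 0xB6.
$original = 'schön';

// The mistake: read those bytes as Windows-1252, then save the result as UTF-8.
// 0xC3 becomes "Ã" and 0xB6 becomes "¶", so one character turns into two.
$garbled = mb_convert_encoding($original, 'UTF-8', 'Windows-1252');
echo $garbled, PHP_EOL; // schÃ¶n

// The repair for this one known case: run the same conversion backwards.
$restored = mb_convert_encoding($garbled, 'Windows-1252', 'UTF-8');
echo $restored, PHP_EOL; // schön
```

The reverse conversion is only safe when you already know the text is garbled and which encoding caused it. Applied blindly to correct text, it would damage it.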

Installing

composer require kit-jotform/php-ftfy

Requirements: PHP >= 8.1, ext-mbstring, ext-intl

Usage

Ftfy::fixText(string $text, ?TextFixerConfig $config = null): string

Fix all encoding issues in a string.

use Ftfy\Ftfy;

$fixed = Ftfy::fixText("Ã\u{a0} perturber la rÃ©flexion");
// à perturber la réflexion

Ftfy::fixEncoding(string $text): string

Fix only encoding/mojibake issues, without applying other text fixes.

$fixed = Ftfy::fixEncoding("l'humanitÃ©");
// l'humanité

Ftfy::needsFix(string $text, ?TextFixerConfig $config = null): bool

Fast dry-run that checks whether text needs fixing without performing corrections. Use as a gate before fixText() on hot paths — 10-26x faster depending on input.

use Ftfy\Ftfy;

if (Ftfy::needsFix($text)) {
    $text = Ftfy::fixText($text);
}

// Clean text exits almost instantly
Ftfy::needsFix('Hello world');   // false
Ftfy::needsFix('Héllo wörld');   // false

// Detects all fixable issues
Ftfy::needsFix('schÃ¶n');       // true (mojibake)
Ftfy::needsFix('&amp; test');    // true (HTML entity)
Ftfy::needsFix("\u{201C}test");  // true (curly quotes)

Respects TextFixerConfig — disabled fixers are skipped:

$config = new TextFixerConfig(uncurlQuotes: false);
Ftfy::needsFix("\u{201C}test", $config); // false
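
The gate pattern extends naturally to batches. The sketch below is one way to write it; `fixWhereNeeded` is a hypothetical helper, not part of the library, and it takes the check and the fixer as callables so it can be tried without the package installed.

```php
<?php
declare(strict_types=1);

/**
 * Repair only the rows that need it and count how many were touched.
 * Hypothetical helper for illustration; not part of php-ftfy.
 *
 * @param iterable<string>         $rows
 * @param callable(string): bool   $needsFix cheap check
 * @param callable(string): string $fix      full repair
 * @return array{rows: list<string>, fixed: int}
 */
function fixWhereNeeded(iterable $rows, callable $needsFix, callable $fix): array
{
    $out = [];
    $fixed = 0;
    foreach ($rows as $row) {
        if ($needsFix($row)) {
            $row = $fix($row);
            $fixed++;
        }
        $out[] = $row;
    }
    return ['rows' => $out, 'fixed' => $fixed];
}

// Stand-in callables so the sketch runs on its own (not the library's logic).
$looksGarbled = static fn (string $s): bool => str_contains($s, 'Ã');
$undoCp1252 = static fn (string $s): string => mb_convert_encoding($s, 'Windows-1252', 'UTF-8');

$result = fixWhereNeeded(['Hello world', 'schÃ¶n'], $looksGarbled, $undoCp1252);
echo $result['fixed'], ' row(s) fixed', PHP_EOL;
```

With the library installed, the two stand-ins would presumably be replaced by first-class callables: fixWhereNeeded($rows, Ftfy::needsFix(...), Ftfy::fixText(...)).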

Ftfy::fixAndExplain(string $text, ?TextFixerConfig $config = null): array

Returns ['text' => string, 'explanation' => array] with the fixed text and a list of changes made.

[$fixed, $explanation] = array_values(Ftfy::fixAndExplain('âœ” No problems'));
// $fixed      => '✔ No problems'
// $explanation => [['name' => 'fix_encoding', 'cost' => 1, ...]]
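
The returned array lends itself to logging. As a rough sketch, the hypothetical helper `describeFix` below turns a result of that shape into a one-line message. It reads only the 'text', 'explanation', and per-step 'name' keys shown above, and it assumes that an empty explanation means nothing was changed. The sample array is hand-written, not real library output.

```php
<?php
declare(strict_types=1);

/**
 * Turn a fixAndExplain()-style result into a one-line log message.
 * Hypothetical helper for illustration; not part of php-ftfy.
 */
function describeFix(array $result): string
{
    $steps = array_map(
        static fn (array $step): string => (string) ($step['name'] ?? 'unknown'),
        $result['explanation'] ?? []
    );
    if ($steps === []) {
        // Assumption: no recorded steps means the text was left as-is.
        return 'unchanged: ' . $result['text'];
    }
    return 'fixed via ' . implode(', ', $steps) . ': ' . $result['text'];
}

// Hand-written stand-in for a real Ftfy::fixAndExplain() result.
$sample = [
    'text' => '✔ No problems',
    'explanation' => [['name' => 'fix_encoding', 'cost' => 1]],
];
echo describeFix($sample), PHP_EOL; // fixed via fix_encoding: ✔ No problems
```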

Configuration

use Ftfy\Ftfy;
use Ftfy\TextFixerConfig;

$config = new TextFixerConfig(
    unescapeHtml: 'auto',           // 'auto', true, or false — decode HTML entities
    removeTerminalEscapes: true,    // strip ANSI terminal escape sequences
    fixEncoding: true,              // fix mojibake
    restoreByteA0: true,            // restore byte 0xA0 as non-breaking space
    replaceLossySequences: true,    // replace lossy codec sequences
    decodeInconsistentUtf8: true,   // decode inconsistent UTF-8
    fixC1Controls: true,            // fix C1 control characters
    fixLatinLigatures: true,        // expand Latin ligatures (ﬁ → fi)
    fixCharacterWidth: true,        // normalize fullwidth characters
    uncurlQuotes: true,             // straighten curly quotes (’ ” → ' ")
    fixLineBreaks: true,            // normalize line breaks to \n
    fixSurrogates: true,            // fix surrogate characters
    removeControlChars: true,       // remove control characters
    normalization: 'NFC',           // Unicode normalization form (NFC, NFD, NFKC, NFKD, or null)
);

$fixed = Ftfy::fixText($garbled, $config);

Use $config->with(uncurlQuotes: false) to produce a modified copy.
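
One possible use of with() is to derive several profiles from a shared base. The sketch below assumes the script sits next to Composer's vendor/ directory; the variable names and the sample text are illustrative, and no output is shown because it depends on the library's behavior.

```php
<?php
declare(strict_types=1);

// Assumption: run from the project root, next to Composer's vendor/ directory.
require __DIR__ . '/vendor/autoload.php';

use Ftfy\Ftfy;
use Ftfy\TextFixerConfig;

// Base profile: library defaults, but keep typographic quotes as written.
$prose = new TextFixerConfig(uncurlQuotes: false);

// Derived profile: same as $prose, plus compatibility normalization.
// with() returns a modified copy, so $prose itself is not changed.
$indexing = $prose->with(normalization: 'NFKC');

$text = "\u{201C}x\u{00B2}\u{201D}"; // curly quotes around "x²"

echo Ftfy::fixText($text, $prose), PHP_EOL;
echo Ftfy::fixText($text, $indexing), PHP_EOL;
```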

Note on large inputs: Internally, regex matching uses chunked processing for inputs larger than 8 KB to avoid hitting PCRE backtracking/recursion limits. No configuration is needed — this is handled automatically.
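
For readers curious what chunking UTF-8 text involves, here is a minimal sketch of splitting a string into pieces of bounded byte size without cutting a character in half. This is not the library's implementation, and `chunkUtf8` is a hypothetical name. It only illustrates the boundary problem, using mb_strcut() from ext-mbstring.

```php
<?php
declare(strict_types=1);

/**
 * Split UTF-8 text into pieces of at most $maxBytes bytes,
 * never cutting a multi-byte character in half.
 * Hypothetical helper for illustration; not part of php-ftfy.
 *
 * @return list<string>
 */
function chunkUtf8(string $text, int $maxBytes = 8192): array
{
    if ($maxBytes < 4) {
        throw new InvalidArgumentException('maxBytes must fit one UTF-8 character (4 bytes).');
    }
    $chunks = [];
    $offset = 0;
    $total = strlen($text);
    while ($offset < $total) {
        // mb_strcut() takes byte offsets but backs up to a character boundary.
        $chunk = mb_strcut($text, $offset, $maxBytes, 'UTF-8');
        if ($chunk === '') {
            break; // defensive: never loop forever on unexpected input
        }
        $chunks[] = $chunk;
        $offset += strlen($chunk);
    }
    return $chunks;
}

// Ten two-byte characters, cut into pieces of at most five bytes.
$pieces = chunkUtf8(str_repeat('é', 10), 5);
echo count($pieces), ' chunks', PHP_EOL;
```

A real implementation also has to deal with patterns that would match across a chunk boundary, which this sketch leaves out.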

Command-line usage

A CLI script is included at bin/ftfy.

Fix a string directly:

php bin/ftfy "schÃ¶n"
# schön

Pipe from stdin:

echo "Hello &amp; world" | php bin/ftfy
# Hello & world

Fix a file:

php bin/ftfy --file input.txt

Show what was fixed (explanation goes to stderr):

php bin/ftfy --explain "schÃ¶n"
# schön
#
# explanation:
#   - encode: sloppy-windows-1252
#   - decode: utf-8

Check if text needs fixing (exit code 1 = needs fix):

php bin/ftfy --needs-fix "schÃ¶n"
# true

php bin/ftfy --needs-fix "schön"
# false

Override config options with -c key=value (repeatable):

php bin/ftfy -c uncurlQuotes=false "It’s great"
php bin/ftfy -c normalization=NFKC -c fixLineBreaks=false --file input.txt

Install globally (optional):

ln -s "$(pwd)/bin/ftfy" /usr/local/bin/ftfy
ftfy "schÃ¶n"

Options:

Option             Short   Description
--explain          -e      Print what was fixed (to stderr)
--needs-fix        -n      Print true/false; exit 0 if no fix needed, 1 if fix needed
--file             -f      Read input from a file
--config key=val   -c      Set a TextFixerConfig option (repeatable)
--help             -h      Show help

Boolean config keys accept true/false/1/0: uncurlQuotes, fixEncoding, fixLineBreaks, fixSurrogates, removeControlChars, removeTerminalEscapes, restoreByteA0, replaceLossySequences, decodeInconsistentUtf8, fixC1Controls, fixLatinLigatures, fixCharacterWidth. The remaining keys take specific values: unescapeHtml (auto/true/false), normalization (NFC/NFKC/null), maxDecodeLength (an integer).
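
The value rules above can be mirrored in application code, for example when overrides arrive from an environment variable or a form. The following is a sketch under assumptions, not the CLI's actual parser: `parseOverrides` and `parseBool` are hypothetical names, and the key list is copied from this README.

```php
<?php
declare(strict_types=1);

// Boolean keys as listed in this README.
const BOOLEAN_KEYS = [
    'uncurlQuotes', 'fixEncoding', 'fixLineBreaks', 'fixSurrogates',
    'removeControlChars', 'removeTerminalEscapes', 'restoreByteA0',
    'replaceLossySequences', 'decodeInconsistentUtf8', 'fixC1Controls',
    'fixLatinLigatures', 'fixCharacterWidth',
];

function parseBool(string $key, string $value): bool
{
    // Note: filter_var() also accepts yes/no/on/off, which is broader than the README's list.
    $bool = filter_var($value, FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);
    if ($bool === null) {
        throw new InvalidArgumentException("$key expects true/false/1/0, got '$value'");
    }
    return $bool;
}

/**
 * Convert "key=value" strings into typed options.
 *
 * @param list<string> $pairs e.g. ['uncurlQuotes=false', 'normalization=NFKC']
 * @return array<string, bool|int|string|null>
 */
function parseOverrides(array $pairs): array
{
    $options = [];
    foreach ($pairs as $pair) {
        [$key, $value] = array_pad(explode('=', $pair, 2), 2, '');
        $options[$key] = match (true) {
            in_array($key, BOOLEAN_KEYS, true) => parseBool($key, $value),
            $key === 'unescapeHtml' => $value === 'auto' ? 'auto' : parseBool($key, $value),
            $key === 'normalization' => $value === 'null' ? null : $value,
            $key === 'maxDecodeLength' => (int) $value,
            default => throw new InvalidArgumentException("Unknown config key: $key"),
        };
    }
    return $options;
}

$options = parseOverrides(['uncurlQuotes=false', 'normalization=NFKC']);
var_export($options);
```

Because PHP 8.1 can unpack string-keyed arrays into named arguments, the result could plausibly be passed on as $config->with(...$options), assuming with() accepts every key produced here.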

Running tests

composer install
vendor/bin/phpunit tests/

Credits

  • Original Python library: ftfy by Robyn Speer, licensed under Apache 2.0
  • PHP port licensed under MIT