fossar / transcoder
Better encoding conversion for PHP
Requires
- php: >=7.2.5
Requires (Dev)
- symfony/phpunit-bridge: ^6.2
Suggests
- ext-iconv: For using the IconvTranscoder
- ext-mbstring: For using the MbTranscoder
Replaces
- ddeboer/transcoder: v2.0.0
This package is auto-updated.
Last update: 2024-11-08 16:11:15 UTC
README
Introduction
This is a wrapper around PHP’s mb_convert_encoding and iconv functions. This library adds:
- fallback from mb to iconv for encodings it does not support
- conversion of warnings to proper exceptions.
Installation
The recommended way to install the Transcoder library is through Composer:
$ composer require fossar/transcoder
This command requires you to have Composer installed globally, as explained in the installation chapter of the Composer documentation.
Usage
Basics
Create the right transcoder for your platform and convert a string from ISO-8859-1 (the second argument is the source encoding):
use Ddeboer\Transcoder\Transcoder;

$transcoder = Transcoder::create();
$result = $transcoder->transcode('España', 'iso-8859-1');
You can also manually instantiate a transcoder of your liking:
use Ddeboer\Transcoder\MbTranscoder;

$transcoder = new MbTranscoder();
Or:
use Ddeboer\Transcoder\IconvTranscoder;

$transcoder = new IconvTranscoder();
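Before instantiating a specific backend by hand, it can help to check which of the two suggested extensions (ext-mbstring, ext-iconv) are actually loaded. The following is a minimal sketch using only PHP's built-in extension_loaded(); the helper name loadedTranscoderExtensions is hypothetical and not part of this library.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): lists which of the two
 * backend extensions are loaded in the current PHP process.
 *
 * @return string[] a subset of ['mbstring', 'iconv']
 */
function loadedTranscoderExtensions(): array
{
    $candidates = ['mbstring', 'iconv'];

    // extension_loaded() is a core PHP function; array_values() reindexes
    // the filtered list so that it starts at 0.
    return array_values(array_filter($candidates, 'extension_loaded'));
}

foreach (loadedTranscoderExtensions() as $extension) {
    echo $extension, " is available\n";
}
```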
Source encoding
The second argument accepts the source encoding; it can be omitted or passed as null.
$transcoder->transcode('España');
In that case, however, the behaviour is backend-specific:
- IconvTranscoder will use the encoding of the current locale of the process.
- MbTranscoder will try to detect the encoding from a list based on the value of the mbstring.language setting. By default, this tries ASCII, followed by UTF-8. The number of supported languages is limited, though, and the encoding tables often overlap, so the detection might be unreliable.
As you can see, this is mostly useless for western languages. You will get much more reliable results when you specify the source encoding explicitly.
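To make the detection problem concrete, here is a small sketch that uses PHP's own mbstring functions directly rather than this library (it assumes ext-mbstring is loaded; the variable names are illustrative). It shows that an ISO-8859-1 string matches neither ASCII nor UTF-8, and that UTF-8 bytes are also valid ISO-8859-1, so a wrong guess silently produces mojibake.

```php
<?php
declare(strict_types=1);

// "España" encoded as ISO-8859-1: the ñ is the single byte 0xF1.
$latin1 = "Espa\xF1a";

// Strict detection against ASCII and UTF-8 finds no match, because 0xF1 is
// neither an ASCII byte nor part of a valid UTF-8 sequence here.
$detected = mb_detect_encoding($latin1, ['ASCII', 'UTF-8'], true);
var_dump($detected); // bool(false)

// "España" encoded as UTF-8: the ñ is the two bytes 0xC3 0xB1.
$utf8 = "Espa\xC3\xB1a";

// The same bytes are valid in both encodings, so validity checks alone
// cannot tell which one was intended.
$validAsUtf8 = mb_check_encoding($utf8, 'UTF-8');
$validAsLatin1 = mb_check_encoding($utf8, 'ISO-8859-1');

// Naming the source encoding explicitly gives the intended result...
$converted = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// ...while naming the wrong one double-encodes the ñ without any error.
$mojibake = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
```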
Target encoding
Specify a default target encoding as the first argument to create():
use Ddeboer\Transcoder\Transcoder;

$isoTranscoder = Transcoder::create('iso-8859-1');
Alternatively, specify a target encoding as the third argument in a transcode() call:
use Ddeboer\Transcoder\Transcoder;

$transcoder->transcode('España', 'iso-8859-1', 'UTF-8');
Error handling
PHP’s mb_convert_encoding and iconv are inconvenient to use because they generate notices and warnings instead of proper exceptions. This library fixes that:
use Ddeboer\Transcoder\Exception\UndetectableEncodingException;
use Ddeboer\Transcoder\Exception\UnsupportedEncodingException;
use Ddeboer\Transcoder\Exception\IllegalCharacterException;

$input = 'España';

try {
    $transcoder->transcode($input, 'utf-8', 'not-a-real-encoding');
} catch (UnsupportedEncodingException $e) {
    // ‘not-a-real-encoding’ is an unsupported encoding
}

try {
    $transcoder->transcode('Illegal quotes: ‘ ’', 'utf-8', 'iso-8859-1');
} catch (IllegalCharacterException $e) {
    // Curly quotes ‘ ’ are illegal in ISO-8859-1
}

try {
    $transcoder->transcode($input);
} catch (UndetectableEncodingException $e) {
    // Failed to automatically detect $input’s encoding (mb) or not a valid string in current locale (iconv)
}
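For readers curious about the general technique of turning warnings into exceptions, the sketch below wraps PHP's native iconv() with a temporary error handler. It is an illustration only, not this library's implementation: the function name iconvOrThrow is hypothetical, and it uses PHP's built-in ErrorException rather than the library's exception classes. It assumes ext-iconv is loaded.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): runs iconv() and turns any
 * notice or warning it emits into an ErrorException.
 */
function iconvOrThrow(string $string, string $from, string $to): string
{
    // Install a temporary handler that rethrows PHP errors as exceptions.
    set_error_handler(
        static function (int $severity, string $message, string $file, int $line): bool {
            throw new ErrorException($message, 0, $severity, $file, $line);
        }
    );

    try {
        $result = iconv($from, $to, $string);
    } finally {
        // Always put the previous handler back, even when an exception is thrown.
        restore_error_handler();
    }

    if ($result === false) {
        throw new RuntimeException('iconv() failed without emitting a warning');
    }

    return $result;
}

try {
    iconvOrThrow('abc', 'UTF-8', 'not-a-real-encoding');
} catch (ErrorException $e) {
    echo 'Conversion failed: ', $e->getMessage(), "\n";
}
```

The finally block matters here: without it, a thrown exception would leave the temporary handler installed for the rest of the request.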
Transcoder fallback
In general, mb_convert_encoding is faster than iconv. However, as iconv supports more encodings than mb_convert_encoding, it makes sense to combine the two.
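The idea of combining the two can be sketched with the native functions alone. This is a simplified illustration under stated assumptions, not the library's actual code: the function name convertWithFallback is hypothetical, both extensions are assumed to be loaded, and warnings from mbstring are simply silenced instead of being mapped to specific exceptions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): tries mbstring first and
 * falls back to iconv when mbstring cannot handle the conversion.
 */
function convertWithFallback(string $string, string $from, string $to): string
{
    try {
        // For an encoding mbstring does not know, PHP 8 throws a ValueError,
        // while PHP 7 emits a warning (silenced here) and returns false.
        $result = @mb_convert_encoding($string, $to, $from);
    } catch (ValueError $e) {
        $result = false;
    }

    if ($result === false) {
        // Second attempt with iconv, which may know encodings mbstring lacks.
        $result = @iconv($from, $to, $string);
    }

    if ($result === false) {
        throw new RuntimeException("Cannot convert from $from to $to");
    }

    return $result;
}

$utf8 = convertWithFallback("Espa\xF1a", 'ISO-8859-1', 'UTF-8');
```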
So, the Transcoder returned from create():