covaleski / data-encoding
Data encoding and decoding.
Requires (Dev)
- phpunit/phpunit: ^10
A complete PHP implementation of RFC 4648. It provides methods for encoding and decoding data in multiple bases, including custom encodings.
It relies on bitwise operations rather than string functions for faster processing.
Note that PHP already supports the base16 and base64 encoding styles natively, through the bin2hex/hex2bin and base64_encode/base64_decode functions respectively.
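For comparison, the native functions mentioned above can be used as follows. This is a minimal sketch that relies only on PHP built-ins and does not touch this library; the input string is an arbitrary sample.

```php
<?php

$data = 'foobar';

// base16: bin2hex() emits lowercase digits, while RFC 4648 lists uppercase ones.
$hex = bin2hex($data);
echo $hex, PHP_EOL;                       // 666f6f626172
echo strtoupper($hex), PHP_EOL;           // 666F6F626172
echo hex2bin($hex), PHP_EOL;              // foobar

// base64: the second argument of base64_decode() enables strict validation.
$b64 = base64_encode($data);
echo $b64, PHP_EOL;                       // Zm9vYmFy
echo base64_decode($b64, true), PHP_EOL;  // foobar
```

PHP has no built-in base32 or base32hex functions, which is where a library like this one is most useful.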
1 Installation
composer require covaleski/data-encoding
2 Usage
The following encoding styles - defined in RFC 4648 - are natively provided by this library:
- base16:
  - Encoding with a case-insensitive 16-character alphabet;
  - Available through the Base16 class;
- base32:
  - Encoding with a case-insensitive 32-character alphabet;
  - Available through the Base32 class;
- base32hex:
  - Base32 with the extended hex alphabet (0-9, then A-V);
  - Available through the Base32Hex class;
- base64:
  - Encoding with a case-sensitive 64-character alphabet;
  - Available through the Base64 class;
- base64url:
  - Base64 with a URL- and filename-safe alphabet;
  - Available through the Base64Url class.
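To illustrate how base64url differs from base64, the sketch below uses only PHP built-ins rather than this library's classes. In RFC 4648 the two alphabets differ only in their last two characters ("+" and "/" become "-" and "_"). The sample bytes are chosen so that both characters appear, and padding is kept as-is here, which is an assumption of this sketch.

```php
<?php

// Two bytes whose bits map to the last two symbols of the base64 alphabet.
$bytes = "\xfb\xff";

$standard = base64_encode($bytes);              // "+/8="
$urlSafe  = strtr($standard, '+/', '-_');       // "-_8="

// Reversing the substitution restores input accepted by base64_decode().
$restored = base64_decode(strtr($urlSafe, '-_', '+/'), true);

echo $standard, ' ', $urlSafe, PHP_EOL;
```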
You can easily extend the Encoder class to create custom encoding styles. See 2.3 Creating custom encodings.
2.1 Encoding
Encoding is available through the encode() static method in all encoder classes listed above.
    use Covaleski\DataEncoding\Base32;

    $encoded = Base32::encode('Dunder Mifflin Paper Company, Inc.');
    // Will produce: IR2W4ZDFOIQE22LGMZWGS3RAKBQXAZLSEBBW63LQMFXHSLBAJFXGGLQ=
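The bitwise approach mentioned in the introduction can be pictured with the sketch below. It is not the library's actual implementation: sketchEncode() is a hypothetical standalone function that assumes an alphabet of at least two symbols whose size is a power of 2, and it regroups the input bytes into fixed-width bit groups that index the alphabet.

```php
<?php

/**
 * Illustrative sketch only; not the library's actual implementation.
 * Encodes $data with an alphabet whose size is a power of two (2 or more).
 */
function sketchEncode(string $data, array $alphabet, string $pad = '='): string
{
    $base = count($alphabet);

    // Bits carried by one output symbol: 2 for base4, 5 for base32, 6 for base64.
    $bitsPerSymbol = 0;
    while ((1 << $bitsPerSymbol) < $base) {
        $bitsPerSymbol++;
    }
    $mask = $base - 1;

    $buffer = 0;      // bits read from the input but not yet emitted
    $bufferBits = 0;  // number of valid bits currently held in $buffer
    $symbols = [];

    for ($i = 0, $length = strlen($data); $i < $length; $i++) {
        // Append the next byte to the low end of the buffer.
        $buffer = ($buffer << 8) | ord($data[$i]);
        $bufferBits += 8;

        // Emit symbols from the high end while enough bits are available.
        while ($bufferBits >= $bitsPerSymbol) {
            $bufferBits -= $bitsPerSymbol;
            $symbols[] = $alphabet[($buffer >> $bufferBits) & $mask];
        }

        // Keep only the bits that have not been emitted yet.
        $buffer &= (1 << $bufferBits) - 1;
    }

    // Left-align any leftover bits, zero-filling the final symbol.
    if ($bufferBits > 0) {
        $symbols[] = $alphabet[($buffer << ($bitsPerSymbol - $bufferBits)) & $mask];
    }

    // Pad to a whole block: the smallest symbol count that covers whole bytes.
    $blockSize = 1;
    while (($blockSize * $bitsPerSymbol) % 8 !== 0) {
        $blockSize++;
    }
    while (count($symbols) % $blockSize !== 0) {
        $symbols[] = $pad;
    }

    return implode('', $symbols);
}

$base32 = str_split('ABCDEFGHIJKLMNOPQRSTUVWXYZ234567');
echo sketchEncode('foobar', $base32), PHP_EOL;            // MZXW6YTBOI======
echo sketchEncode('foo', ['A', 'B', 'C', 'D']), PHP_EOL;  // BCBCBCDDBCDD
```

The first output matches the RFC 4648 test vector for base32. The second uses a four-symbol alphabet, where every byte maps to exactly four symbols and no padding is ever needed.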
2.2 Decoding
Decoding is available through the decode() static method in all encoder classes listed above.
    use Covaleski\DataEncoding\Base32;

    $decoded = Base32::decode('4WGZPZF2VTULPLY=');
    // Will produce: 南京路
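Decoding reverses the bit regrouping: each symbol contributes 5 bits, and a byte is emitted whenever 8 bits have accumulated. The sketch below is a hypothetical standalone function, sketchBase32Decode(), written for illustration under the RFC 4648 base32 alphabet. It is not the library's code and its error handling is only an assumption.

```php
<?php

/**
 * Illustrative base32 decoder (RFC 4648 alphabet); not the library's code.
 */
function sketchBase32Decode(string $encoded): string
{
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';

    // base32 is case-insensitive, so normalise before the lookup
    // and drop the trailing padding.
    $encoded = rtrim(strtoupper($encoded), '=');

    $buffer = 0;      // bits collected so far
    $bufferBits = 0;  // number of valid bits in $buffer
    $output = '';

    for ($i = 0, $length = strlen($encoded); $i < $length; $i++) {
        $value = strpos($alphabet, $encoded[$i]);
        if ($value === false) {
            throw new InvalidArgumentException("Invalid base32 character at offset {$i}.");
        }

        // Each symbol contributes 5 bits.
        $buffer = ($buffer << 5) | $value;
        $bufferBits += 5;

        // Emit a byte as soon as 8 bits are available.
        if ($bufferBits >= 8) {
            $bufferBits -= 8;
            $output .= chr(($buffer >> $bufferBits) & 0xFF);
            $buffer &= (1 << $bufferBits) - 1;
        }
    }

    // Any leftover bits (fewer than 8) are encoder zero-fill and are discarded.
    return $output;
}

echo sketchBase32Decode('4WGZPZF2VTULPLY='), PHP_EOL;  // 南京路
echo sketchBase32Decode('mzxw6==='), PHP_EOL;          // foo
```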
2.3 Creating custom encodings
For custom encodings, you must extend the Encoder class and set the following properties (all static):
- int $base:
  - Required;
  - Is the number of characters the alphabet should have;
  - Must be a power of 2;
- array $alphabet:
  - Required;
  - Is a list with all the alphabet's characters;
  - Must have unique characters only;
  - Is accessed with indexes generated by the bits of input data;
- bool $isCaseSensitive:
  - Required;
  - Defines whether the alphabet is case-sensitive;
  - Example: if not case-sensitive, decoding MZXW6=== and mzxw6=== would generate identical results;
- string $alphabetEncoding:
  - Optional - defaults to "ASCII";
  - Is the encoding system used by the alphabet characters;
  - Non-ASCII alphabets MUST have their encoding explicitly defined;
  - Is used to build and split the encoded string;
- string $paddingCharacter:
  - Optional - defaults to "=";
  - Is used to pad encoded strings;
  - Must not exist in the alphabet.
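The constraints listed above can be checked before defining a custom encoder. The helper below, sketchValidateEncoding(), is hypothetical and not part of the library. It assumes that the alphabet size must equal $base and that a base of 1 is not useful, neither of which is stated explicitly above.

```php
<?php

/**
 * Hypothetical helper, not part of the library: returns the list of rules
 * that a custom encoder configuration breaks (empty when none are broken).
 */
function sketchValidateEncoding(int $base, array $alphabet, string $paddingCharacter = '='): array
{
    $errors = [];

    // A power of two has exactly one bit set, so $base & ($base - 1) is 0.
    if ($base < 2 || ($base & ($base - 1)) !== 0) {
        $errors[] = 'base must be a power of 2';
    }

    // Assumption of this sketch: one alphabet entry per possible index.
    if (count($alphabet) !== $base) {
        $errors[] = 'alphabet size must match the base';
    }

    if (count(array_unique($alphabet)) !== count($alphabet)) {
        $errors[] = 'alphabet characters must be unique';
    }

    if (in_array($paddingCharacter, $alphabet, true)) {
        $errors[] = 'padding character must not exist in the alphabet';
    }

    return $errors;
}

var_dump(sketchValidateEncoding(4, ['A', 'B', 'C', 'D']));  // empty array
var_dump(sketchValidateEncoding(3, ['A', 'B', 'C']));       // one error: base
```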
Example with ASCII alphabet:
    use Covaleski\DataEncoding\Encoder;

    class Base4 extends Encoder
    {
        protected static array $alphabet = [
            'A', 'B', 'C', 'D',
        ];
        protected static int $base = 4;
        protected static bool $isCaseSensitive = true;
    }

    $encoded = Base4::encode('foo');
    // Will produce: BCBCBCDDBCDD
Example with Unicode alphabet:
    use Covaleski\DataEncoding\Encoder;

    class Base8Emoji extends Encoder
    {
        protected static array $alphabet = [
            'π', 'π', 'π³', 'π', 'π€‘', 'π', 'β', 'π ',
        ];
        protected static int $base = 8;
        protected static string $alphabetEncoding = 'UTF-8';
        protected static bool $isCaseSensitive = false;
    }

    $encoded = Base8Emoji::encode('foo');
    // Will produce: πππ€‘βπ πππ
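The reason a non-ASCII alphabet needs its encoding declared can be seen by comparing byte-oriented and character-oriented views of the same string. The sketch below uses a hypothetical encoded string made of two-byte Greek letters and PCRE's /u flag to split on code points. How the library itself splits strings is not shown in this README, so treat this only as an illustration of the problem.

```php
<?php

// Hypothetical encoded output built from two-byte UTF-8 symbols.
$encoded = 'βγβγ';

// Byte-oriented view: each Greek letter occupies two bytes in UTF-8,
// so indexing the string byte by byte would cut symbols in half.
$byteCount = strlen($encoded);

// Character-oriented view: the /u flag makes PCRE split on code points.
$symbols = preg_split('//u', $encoded, -1, PREG_SPLIT_NO_EMPTY);
$symbolCount = count($symbols);

echo $byteCount, ' bytes, ', $symbolCount, ' symbols', PHP_EOL;  // 8 bytes, 4 symbols
```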
3 Testing
Tests are written with PHPUnit. Use the following command to run all of them:
./vendor/bin/phpunit