arnapou/encoder

Library - Basic encoders with a common interface.

v2.1 2024-02-03 18:09 UTC

Last update: 2024-03-03 17:26:18 UTC


This library exposes basic encoders with a common interface.

Installation

composer require arnapou/encoder

packagist 👉️ arnapou/encoder

Book Encoders

These are predictive encoders that offer great compression for text built from a specific, known book of words.

They perform poorly on strings that are not in the book: each unmatched byte is encoded on 2 bytes instead of 1.

One byte encoder (8 bits)

Maximum number of words : 255

use Arnapou\Encoder\Book\ArrayBook;
use Arnapou\Encoder\Book\OneByteBookEncoder;

$book = new ArrayBook(['Hello', 'World']);
$encoder = new OneByteBookEncoder($book);

// 11 to 4 bytes (-63%)

var_dump(bin2hex($encoder->encode('Hello World')));
// string(8) "00ff2001"

var_dump($encoder->decode(hex2bin('00ff2001')));
// string(11) "Hello World"
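
The output above (00, then ff 20, then 01) suggests a simple scheme: one byte per book word, and an escape byte followed by the literal byte for anything else. The following is a minimal sketch of that idea in plain PHP, not the library's actual implementation. The function names sketchEncode and sketchDecode, the 0xff escape value and the first-match strategy are all assumptions inferred from this single example.

```php
<?php
// Hypothetical sketch of a one-byte book encoding; NOT the library's code.
// Assumption: byte 0xff is an escape marker, so the book holds at most 255 words.
const SKETCH_ESCAPE = "\xff";

function sketchEncode(string $text, array $book): string
{
    $book = array_slice(array_values($book), 0, 255);
    $out = '';
    $i = 0;
    $len = strlen($text);
    while ($i < $len) {
        $matched = false;
        foreach ($book as $index => $word) {
            // First match wins (a real encoder may prefer the longest match).
            if ($word !== '' && substr($text, $i, strlen($word)) === $word) {
                $out .= chr($index);      // 1 byte for a whole word
                $i += strlen($word);
                $matched = true;
                break;
            }
        }
        if (!$matched) {
            $out .= SKETCH_ESCAPE . $text[$i]; // 2 bytes for 1 unmatched byte
            $i++;
        }
    }
    return $out;
}

function sketchDecode(string $bytes, array $book): string
{
    $book = array_slice(array_values($book), 0, 255);
    $out = '';
    $len = strlen($bytes);
    for ($i = 0; $i < $len; $i++) {
        if ($bytes[$i] === SKETCH_ESCAPE) {
            $i++;                          // skip the marker, copy the literal byte
            $out .= $bytes[$i] ?? '';
        } else {
            $out .= $book[ord($bytes[$i])] ?? '';
        }
    }
    return $out;
}

var_dump(bin2hex(sketchEncode('Hello World', ['Hello', 'World'])));
// string(8) "00ff2001"
```

This also shows why text outside the book is costly: every unmatched byte is doubled by the escape marker.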

Two bytes encoder (16 bits)

Maximum number of words : 65535

This is mostly useful when you have many words to store (more than 255, the one-byte limit) and each of them is at least 2 bytes long.

use Arnapou\Encoder\Book\ArrayBook;
use Arnapou\Encoder\Book\TwoBytesBookEncoder;

$book = new ArrayBook(['https://', 'arnapou.net']);
$encoder = new TwoBytesBookEncoder($book);

// 19 to 4 bytes (-79%)

var_dump(bin2hex($encoder->encode('https://arnapou.net')));
// string(8) "00010000"

var_dump($encoder->decode(hex2bin('00010000')));
// string(19) "https://arnapou.net"

Example

You can use this encoder for shortening predefined text patterns.

I use it to shorten URLs that follow the same patterns.

Look at available books in src/Book :

  • ArrayBook : simple implementation of a generic iterable book
  • EnglishBook : built-in book for English text
  • WebBook : built-in book for URIs

Feel free to submit dedicated books.

use Arnapou\Encoder\Book\EnglishBook;
use Arnapou\Encoder\Book\OneByteBookEncoder;

$book = new EnglishBook();
$encoder = new OneByteBookEncoder($book);

// 22 to 11 bytes (-50%)

var_dump(bin2hex($encoder->encode('This is a small string')));
// string(22) "ff54446e5675aeac75d446"

var_dump($encoder->decode(hex2bin('ff54446e5675aeac75d446')));
// string(22) "This is a small string"

Miscellaneous Encoders

You will find other useful encoders, such as:

Encoder        Description
Identity       no encoding
Hexadecimal    hexadecimal encoding
Base64         encoding in base 64 with optional trimming
Base64UrlSafe  encoding in base 64 for urls (+/ => -_)
Zlib           gzdeflate/gzencode/gzcompress family
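
The "+/ => -_" substitution listed for Base64UrlSafe can be illustrated with PHP built-ins only. This is a sketch under assumptions: urlSafeEncode and urlSafeDecode are hypothetical helper names, and trimming the "=" padding by default is a choice made here, not a documented behavior of the library class.

```php
<?php
// Hypothetical helpers; the library's own Base64UrlSafe encoder may differ.
function urlSafeEncode(string $data, bool $trim = true): string
{
    // Replace the two characters that are unsafe in URLs.
    $encoded = strtr(base64_encode($data), '+/', '-_');

    // Optionally drop the '=' padding (an assumption made for this sketch).
    return $trim ? rtrim($encoded, '=') : $encoded;
}

function urlSafeDecode(string $encoded): string
{
    // base64_decode() in its default non-strict mode accepts missing padding.
    return (string) base64_decode(strtr($encoded, '-_', '+/'));
}

var_dump(urlSafeEncode("\xfb\xff"));
// string(3) "-_8"
```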

A PipelineEncoder is also provided, which is very useful to chain encoders:

use Arnapou\Encoder\Base64\Base64Encoder;
use Arnapou\Encoder\PipelineEncoder;
use Arnapou\Encoder\Zlib\ZlibEncoder;
use Arnapou\Encoder\Zlib\ZlibEncoding;

$encoder = new PipelineEncoder(
    new ZlibEncoder(encoding: ZlibEncoding::raw),
    new Base64Encoder(),
);

var_dump($encoder->encode('Lorem ipsum dolor sit amet'));
// string(38) "88kvSs1VyCwoLs1VSMnPyS9SKM4sUUjMTS0BAA"

var_dump($encoder->decode('88kvSs1VyCwoLs1VSMnPyS9SKM4sUUjMTS0BAA'));
// string(26) "Lorem ipsum dolor sit amet"
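
To make the chaining explicit, here is a rough equivalent written with PHP built-ins only. It is a sketch, not the library's code: it assumes that ZlibEncoding::raw corresponds to raw DEFLATE (gzdeflate/gzinflate) and that the Base64 step trims the "=" padding, as the 38-character output above suggests. pipelineEncode and pipelineDecode are hypothetical names, and the zlib extension is required.

```php
<?php
// Hypothetical stand-in for the pipeline above, using built-in functions only.
function pipelineEncode(string $data): string
{
    $compressed = gzdeflate($data); // step 1: raw DEFLATE (assumed meaning of "raw")
    if ($compressed === false) {
        throw new RuntimeException('Compression failed.');
    }

    // step 2: Base64, with the trailing '=' padding trimmed
    return rtrim(base64_encode($compressed), '=');
}

function pipelineDecode(string $encoded): string
{
    // Decoding applies the inverse steps in reverse order.
    $compressed = base64_decode($encoded);
    if ($compressed === false) {
        throw new InvalidArgumentException('Invalid Base64 input.');
    }

    $data = gzinflate($compressed);
    if ($data === false) {
        throw new InvalidArgumentException('Invalid DEFLATE stream.');
    }

    return $data;
}

$encoded = pipelineEncode('Lorem ipsum dolor sit amet');
var_dump(pipelineDecode($encoded) === 'Lorem ipsum dolor sit amet');
// bool(true)
```

The point of the design is that encode runs the steps left to right, while decode must undo them right to left.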

Technical versions changelog

Start       Tag, Branch  PHP
25/11/2023  2.x, main    8.3
13/12/2022  1.x          8.2