iliaal / phpser
Fast binary serializer for PHP cache workloads. Decoder-optimized, beats igbinary on packed numerics, deep-nested structures, and same-class DTO batches.
Requires
- php: >=8.3
Last update: 2026-05-21 14:10:36 UTC
A PHP serialization extension in C, targeting read-heavy cache workloads where decode time matters more than encode time or payload size.
Why phpser?
PHP cache workloads pay decode cost on every read. Encode happens once per write. The default igbinary was the right answer for over a decade, but lags on three shapes that show up everywhere: packed numeric arrays, deep-nested structures, and same-class DTO batches (Laravel queue payloads, cached models).
phpser is decoder-optimized. Pointer-equality dict intern, refcount-reuse of zend_strings, pre-sized hash tables with direct arPacked writes, tagged scalar runs. On packed numeric arrays it cuts size by 63-65% and decode time by 73-77% vs igbinary; same-class DTO batches come out 10-12% smaller with 11-14% faster decode, and deep-nested structures encode ~49% faster at parity size. On 1000-row general-purpose rowsets it sits within about 1% of igbinary's size with 25% faster encode and ~5% slower decode.
Not a universal win. Encode on small rowsets (100 rows) costs +30% over igbinary, and object-heavy mixed shapes pay +42% on encode because obj->handlers->get_properties is per-object. The bench table below has the full shape-by-shape breakdown.
Install
    # PIE (PHP Foundation's extension installer; uses the composer.json
    # at the repo root with type: "php-ext")
    pie install iliaal/phpser
On a minimal PHP image (e.g. php:8.x-cli from Docker Hub), PIE needs a
few build tools installed first:
    # Debian/Ubuntu
    sudo apt install -y git bison libtool-bin unzip
    # macOS
    brew install bison libtool
unzip is load-bearing on Debian: composer shells out to /usr/bin/unzip
when extracting PIE's prebuilt-binary zip. If unzip is missing, composer
silently falls back to PHP's ZipArchive which lays the .so out at a
path PIE doesn't check, and install fails with ExtensionBinaryNotFound
even though the zip downloaded fine.
From source
    git clone https://github.com/iliaal/phpser.git
    cd phpser
    phpize && ./configure --enable-phpser
    make -j$(nproc)
    sudo make install
    echo 'extension=phpser.so' | sudo tee /etc/php/conf.d/phpser.ini
Pre-built binaries
Pre-built .dlls for Windows (PHP 8.3-8.5, TS/NTS, x64) and .sos for
Linux glibc (x86_64, arm64) and macOS arm64 (PHP 8.4-8.5) are attached
to each GitHub release. PIE fetches the matching binary automatically and
falls back to a source build when no asset matches.
Usage
Basic round-trip. The encoded payload is opaque bytes; treat it as a binary blob in storage (no JSON-safety, no UTF-8 guarantees):
    $payload = phpser_serialize(['id' => 42, 'name' => 'row', 'tags' => ['a','b']]);
    $value = phpser_unserialize($payload);
    // $value === ['id' => 42, 'name' => 'row', 'tags' => ['a','b']]
HMAC-signed mode for untrusted storage (memcached, redis, files, cookies). The signed entry points wrap the payload in an HMAC-SHA256 frame and verify it in constant time; tampered or foreign-keyed input is rejected before any decoding work runs:
    $key = random_bytes(32); // store this key in your app config
    $payload = phpser_serialize_signed($cacheValue, $key);
    // ... later, possibly across a process boundary ...
    $value = phpser_unserialize_signed($payload, $key);
    // returns NULL if the payload was tampered with or signed with a different key
allowed_classes option on both unserialize entry points. Same shape as
PHP's native unserialize($payload, ['allowed_classes' => ...]):
    // Reject all classes (decode them as __PHP_Incomplete_Class)
    $value = phpser_unserialize($payload, ['allowed_classes' => false]);

    // Allowlist specific classes; everything else becomes __PHP_Incomplete_Class
    $value = phpser_unserialize($payload, ['allowed_classes' => [Foo::class, Bar::class]]);

    // Allow all (default)
    $value = phpser_unserialize($payload, ['allowed_classes' => true]);
    $value = phpser_unserialize($payload); // same as above
When decoding attacker-controlled bytes, use one of the two restricted
modes or the signed entry point. See SECURITY.md for the full threat
model.
✨ Features
- Signed payloads for integrity. `phpser_serialize_signed($value, $key)` wraps the payload in an HMAC-SHA256 frame; `phpser_unserialize_signed($payload, $key)` verifies in constant time and rejects tampered or foreign-keyed input before any decoding work runs. Use this whenever the storage layer crosses a trust boundary: memcached, redis, files, cookies, anywhere an attacker who can write to the store could otherwise feed a crafted payload to your decoder.
- Safe handling of untrusted input. `allowed_classes` option on both unserialize entry points, matching PHP's native `unserialize($payload, ['allowed_classes' => ...])` shape: pass `false` to reject all classes, an array to allowlist specific ones, or `true` for the default. Disallowed classes decode as `__PHP_Incomplete_Class` with the original name preserved, never instantiated. Recursion depth is capped at 512 on both encode and decode, and assoc decode uses `zend_hash_update` so duplicate-key payloads collapse to last-write-wins rather than phantom buckets.
- PHP 8.3+ (8.4, 8.5, master). BSD 3-Clause.
Bench (opt PHP 8.4.22-dev NTS release, 1000 iters, median of 9 runs)
| Shape | Size in bytes: igbinary → phpser | Encode time: igbinary → phpser | Decode time: igbinary → phpser |
|---|---|---|---|
| rowset_100 | 4570 → 4771 (+4.4%) | 9k → 11k ns (+30%) | 9k → 10k ns (~parity) |
| rowset_1000 | 47K → 48K (+1.1%) | 143k → 113k ns (-25%) | 93k → 98k ns (+5%) |
| packed_1k | 5495 → 1941 (-65%) | 4.2k → 1.4k ns (-67%) | 7.0k → 1.7k ns (-77%) |
| packed_10k | 60K → 22K (-63%) | 41k → 16k ns (-61%) | 67k → 17k ns (-73%) |
| deep_50 | 419 → 424 (parity) | 1.3k → 0.65k ns (-49%) | 1.7k → 1.5k ns (-9%) |
| dto_100 | 7083 → 6362 (-10%) | 14k → 18k ns (+22%) | 25k → 22k ns (-11%) |
| dto_1000 | 73K → 65K (-12%) | 175k → 173k ns (parity) | 250k → 214k ns (-14%) |
| dto_mixed | 22K → 29K (+33%) | 54k → 76k ns (+42%) | 103k → 88k ns (-14%) |
Wins: packed numerics 63-65% smaller, 73-77% faster decode, and 61-67% faster
encode. Deep-nested ~49% faster encode at parity size. Rowset_1000
encode beats igbinary by ~25%, size within 1.1%; decode pays a ~5%
tax for the front-loaded dict header walk + refcount-reuse machinery.
DTO workloads (Laravel-queue-style payloads, single-class arrays):
10-12% smaller, 11-14% faster decode vs igbinary thanks to dict
dedup on prop names + the class-entry lookup cache that amortizes
zend_lookup_class_ex across same-typed batches.
rowset_100 encode (+30%) is the durable gap: a fixed-cost floor for
the dict header emission and first-row inline emissions, amortized
over too few rows to recover. The absolute time is small (11 µs for
the entire 100-row payload). Decode is essentially at parity (per-run
delta median +0.4%, absolute ratio +6%): the skip-DICT cache-eviction
policy keeps ['a','b','c']-style repeated values in DICT slots so
detect_packed_run picks the TAG_PACKED_STRINGS typed-run path
instead of falling back to PACKED_MIXED mid-rowset.
dto_mixed encode (+42%) is the durable encode gap on object-heavy
shapes: obj->handlers->get_properties is called per object and
isn't trivially avoidable without a custom fast path for default
property layouts.
Design highlights
The core ideas that drive the perf wins above:
- Pointer-equality dict intern. Encoding hits a `zend_string* == zend_string*` check first; only on miss do we hash the bytes. Cuts intern cost to near-zero for rowset-shaped data where PHP literals share interned zend_strings.
- Front-loaded string dictionary. Same shape as igbinary's `compact_strings`, except we emit the table once at the head and reference by varint index from values. Trade-off: not streamable.
- Refcount-reuse of zend_strings on decode. Per-decode cache parallel to the dict. First reference allocates, subsequent ones `addref`.
- HT_IS_PACKED detection via flag, not iteration. Avoid scanning the buckets just to determine layout.
- `arPacked` stride awareness. PHP 8.2+'s packed-array layout stores zvals directly, not Buckets. Stride is 16, not 32.
- Sparse-packed fallback. Arrays with holes (post-`unset`) preserve original int keys via Assoc rather than silently re-indexing.
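The refcount-reuse idea in the list above can be sketched in plain C. This is a minimal illustration under stated assumptions, not phpser's code: `rc_str`, `decode_ctx`, `decode_dict_ref`, and `decode_ctx_release` are hypothetical names, `rc_str` is a stand-in for `zend_string`, dict entries are assumed NUL-terminated for brevity, and the cache is assumed to hold one reference of its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a refcounted string (not the real zend_string). */
typedef struct {
    unsigned refcount;
    size_t   len;
    char     val[];              /* C99 flexible array member */
} rc_str;

static rc_str *rc_str_new(const char *bytes, size_t len) {
    rc_str *s = malloc(sizeof *s + len + 1);
    if (!s) return NULL;
    s->refcount = 1;
    s->len = len;
    memcpy(s->val, bytes, len);
    s->val[len] = '\0';
    return s;
}

static void rc_str_release(rc_str *s) {
    if (s && --s->refcount == 0) free(s);
}

/* Per-decode cache, parallel to the wire dictionary: slot i holds the
 * string already materialized for dict index i, or NULL. */
typedef struct {
    const char **dict;           /* raw dict entries from the header */
    rc_str     **cache;          /* same length as dict */
    size_t       n;
} decode_ctx;

/* First reference to an index allocates; later references only bump the
 * refcount. The cache keeps one reference itself (an assumption here). */
static rc_str *decode_dict_ref(decode_ctx *ctx, size_t idx) {
    assert(ctx != NULL);
    if (idx >= ctx->n) return NULL;              /* malformed index */
    if (ctx->cache[idx] == NULL) {
        ctx->cache[idx] = rc_str_new(ctx->dict[idx], strlen(ctx->dict[idx]));
        if (ctx->cache[idx] == NULL) return NULL; /* allocation failed */
    }
    ctx->cache[idx]->refcount++;
    return ctx->cache[idx];
}

/* Drop the cache's own references once decoding is finished. */
static void decode_ctx_release(decode_ctx *ctx) {
    for (size_t i = 0; i < ctx->n; i++) {
        rc_str_release(ctx->cache[i]);
        ctx->cache[i] = NULL;
    }
}
```

In a 1000-row rowset where every row carries the key "id", a scheme like this allocates the string once and hands out 1000 references to it, instead of copying the bytes per row.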
Where phpser diverges from igbinary
igbinary is the closest reference point. The areas where there's still measurable perf to take, and that this project targets, are:
- Pre-sized HT + direct `arPacked` writes on decode. When the wire format declares `PACKED_LEN N`, allocate the HT once via `zend_new_array(N)` and write directly into `arPacked` with `ZVAL_*` macros. Skips N `zend_hash_next_index_insert` calls, including their hash computation, growth checks, and capacity tuning. Shipped.

- Tagged scalar runs. `[1, 2, 3, ...]` (1000 longs) emits as a single `PACKED_LONGS` header + N zigzag varints, not 1000 `(tag, varint)` pairs. Decode is one tight loop with no per-element tag dispatch. Shipped.

- Inline-cache pointer intern. 16-slot ring of recently-seen `zend_string*`. Hit rate near 100% on rowset shapes (PHP interns literals; the same `"id"` zend_string pointer flows through every row). Skips the byte-hash entirely on cache hits. Shipped.

- Eager dict materialization with warm hashes. All dict zend_strings allocated up front during header parse and their hashes pre-computed. `zend_hash_add_new` reuses the cached hash. Shipped.

- `update` insert on assoc decode. Originally `add_new` to skip the existence-check, but adversarial wire payloads with duplicate keys would produce phantom buckets that violate PHP's last-write-wins semantic (`count($arr) != count(array_unique(array_keys($arr)))`). Reverted to `zend_hash_update` for security-boundary correctness; `add_new` is a real but small perf win, given up as the cost of handling adversarial payloads cleanly. Shipped.

- Inline-short-string tag with upgrade-on-second-encounter. `TAG_STR_INLINE` (0x0c) and `KEY_STR_INLINE` (0x02) are emitted on a string's first occurrence; the next occurrence triggers an in-place upgrade to a dict entry, and all subsequent ones emit `TAG_STR_DICT`. Singletons (e.g. `row_X` values in a rowset) never hit the upgrade branch. They cost nothing in the dict header. The intern cache doubles as the "seen once?" signal: high bit of `idx` distinguishes `INLINE_EMITTED` from `DICT_IDX`. No pre-pass; single walk of the zval tree as before.

  A count-then-emit variant was tried first: pre-walk the zval tree to tag occurrences, then emit inline for singletons and dict for repeats. The pre-pass cost ~200 ns per string and ate the per-singleton savings, so the single-walk upgrade-on-second-encounter version above is what ships.

  `rowset_1000` encode landed at 25% faster than igbinary (up from 8% in the pre-upgrade implementation), with payload size dropping from +5% to +2.7%.

- Skip refcount machinery during build. All zvals built during decode are fresh and unshared until handed back to PHP. Internal writes can skip `Z_TRY_ADDREF` guards.
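The inline-cache pointer intern described above can be sketched roughly as follows. This is an illustration under assumptions, not the extension's implementation: `intern_t`, `intern_init`, `intern`, `RING_SLOTS`, and `DICT_MAX` are hypothetical names, `const char *` stands in for `zend_string*`, and the miss path does a linear byte comparison where the text says the real encoder hashes the bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_SLOTS 16    /* ring size taken from the text above */
#define DICT_MAX   256   /* arbitrary cap for this sketch */

typedef struct {
    const char *ring_ptr[RING_SLOTS]; /* recently seen string pointers */
    int         ring_idx[RING_SLOTS]; /* dict index for each ring slot */
    unsigned    ring_next;            /* next ring slot to overwrite */
    const char *dict[DICT_MAX];       /* dict entries in emission order */
    int         ndict;
    unsigned    fast_hits;            /* lookups answered by pointer compare */
} intern_t;

static void intern_init(intern_t *t) { memset(t, 0, sizeof *t); }

/* Returns the dict index for s, adding it if needed; -1 if the dict is full. */
static int intern(intern_t *t, const char *s) {
    assert(t != NULL && s != NULL);

    /* Fast path: the same pointer was seen recently, so no bytes are read. */
    for (unsigned i = 0; i < RING_SLOTS; i++) {
        if (t->ring_ptr[i] == s) {
            t->fast_hits++;
            return t->ring_idx[i];
        }
    }

    /* Slow path: compare bytes (a real encoder would hash here instead). */
    int idx = -1;
    for (int i = 0; i < t->ndict; i++) {
        if (strcmp(t->dict[i], s) == 0) { idx = i; break; }
    }
    if (idx < 0) {
        if (t->ndict == DICT_MAX) return -1;
        idx = t->ndict;
        t->dict[t->ndict++] = s;
    }

    /* Remember the pointer so the next row's lookup hits the fast path. */
    t->ring_ptr[t->ring_next] = s;
    t->ring_idx[t->ring_next] = idx;
    t->ring_next = (t->ring_next + 1) % RING_SLOTS;
    return idx;
}
```

The ring only pays off when the same pointer recurs, which is the rowset case the text describes; a string with equal bytes but a different address still resolves to the same dict index through the slow path.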
Local dev build
The hand-rolled Makefile builds against an in-tree ~/php-src-8.4-opt
checkout without phpize/autoconf. Useful for hacking on the extension
while also hacking on PHP itself:
    make -j$(nproc)   # builds modules/phpser.so
    make test         # runs tests/*.phpt via run-tests.php
Override PHP_SRC= to target a different in-tree PHP checkout. Load
alongside igbinary for the A/B bench:
    ~/php-src-8.4-opt/sapi/cli/php \
      -d extension=$HOME/igbinary/modules/igbinary.so \
      -d extension=$(pwd)/modules/phpser.so \
      bench.php
The config.m4 auto-detects the session extension and registers phpser
as a session.serialize_handler when available.
Limitations / known gaps
- Recursion depth is capped at 512 on both encode and decode. Anything deeper than 512 nested containers / refs is rejected to bound stack consumption against adversarial wire payloads. Object cycles are preserved correctly via the id-table machinery and don't count against this cap for shared-graph cases; the cap only fires on genuinely deep trees. Cache workloads typically nest 5-10 deep, so the cap sits roughly 50-100x past any legitimate payload.
- Closures and resources encode as `NULL`. These types are inherently non-serializable, and PHP's own `serialize()` cannot round-trip them either (it throws on closures and writes resources as integer 0).
- Unknown classes at decode fall back to `stdClass` rather than PHP's `__PHP_Incomplete_Class`. This is deliberate for the typical cache workload; `allowed_classes => [...]` produces `__PHP_Incomplete_Class` with the original name preserved for disallowed classes, matching PHP.
- `session.serialize_handler=phpser` is shipped (compiled in when `phpize` detects the session extension; gated on `HAVE_PHP_SESSION` so the extension still loads on session-less PHP builds).
- `phpredis` integration is not yet wired; call `phpser_serialize` / `phpser_unserialize` directly when storing values through phpredis.
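The depth cap in the first item can be pictured with a small recursive walker. This is a sketch under assumptions rather than the extension's code: `node`, `walk`, and `MAX_DEPTH` are hypothetical names, and the root container is assumed to count as depth 1.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 512   /* cap taken from the text above */

/* Hypothetical value node: a container with children, or a leaf when
 * nchildren is 0. */
typedef struct node {
    struct node **children;
    size_t        nchildren;
} node;

/* Returns 0 if the tree nests at most MAX_DEPTH levels, -1 otherwise.
 * The check runs before descending, so stack use stays bounded even when
 * the input is adversarially deep. */
static int walk(const node *n, unsigned depth) {
    assert(n != NULL);
    if (depth > MAX_DEPTH) return -1;
    for (size_t i = 0; i < n->nchildren; i++) {
        if (walk(n->children[i], depth + 1) != 0) return -1;
    }
    return 0;
}
```

Calling `walk(root, 1)` on a chain of 512 nested containers succeeds, while a chain of 513 is rejected before the walker recurses into the deepest level.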
Wire format (V1)
[u8 version=0x01]
[varint ndict]
per entry: [varint len] [bytes]
[value]
value tags:
0x00 NULL
0x01 FALSE
0x02 TRUE
0x03 LONG varint (zigzag-encoded)
0x04 DOUBLE 8 bytes (LE)
0x05 STR_DICT varint dict_idx
0x06 ASSOC varint(len), N×(key, val)
0x07 PACKED_MIXED varint(len), N×val
0x08 PACKED_LONGS varint(len), N×zigzag-varint
0x09 PACKED_DOUBLES varint(len), N×8-byte LE
0x0a OBJECT varint(class_idx), varint(nprops), N×(key_idx, val)
0x0b PACKED_STRINGS varint(len), N×varint(dict_idx) // typed string run
0x0c STR_INLINE varint(len), bytes // single-use string, skips dict
0x0d ENUM varint(class_idx), varint(case_name_idx)
0x0e OBJECT_MAGIC varint(class_idx), value // class with __serialize;
// value is the array __serialize returned
0x0f OBJECT_LEGACY varint(class_idx), varint(len), bytes // class with
// ce->serialize / ce->unserialize (Serializable etc.)
0x10 REF varint(id) // back-ref to a previously-emitted container
0x11 NEW_REF value // claims the next id for an IS_REFERENCE wrap
key tags:
0x00 LONG varint(zigzag)
0x01 STR varint(dict_idx)
0x02 STR_INLINE varint(len), bytes
Varints are LEB128 (unsigned); signed values use zigzag encoding. Tags 0x10/0x11 plus 0x0a/0x0d/0x0e/0x0f each implicitly claim the next id in encounter order, so the decoder reconstructs back-refs by counting container tags as it parses.
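To make the varint and zigzag rules concrete, here is a small C sketch that encodes and decodes a single `PACKED_LONGS` value (tag 0x08, varint length, then N zigzag varints) as laid out in the table above. It covers only that one value, not the version byte or dict header, and the function names (`zigzag_encode`, `varint_write`, `packed_longs_encode`, and so on) are illustrative, not the extension's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TAG_PACKED_LONGS = 0x08 };

/* Zigzag maps signed to unsigned so small magnitudes stay small:
 * 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ... */
static uint64_t zigzag_encode(int64_t v) {
    return ((uint64_t)v << 1) ^ (v < 0 ? ~(uint64_t)0 : 0);
}

static int64_t zigzag_decode(uint64_t u) {
    return (int64_t)(u >> 1) ^ -(int64_t)(u & 1);
}

/* Unsigned LEB128: 7 payload bits per byte, high bit set on all but the
 * last byte. Writes at most 10 bytes and returns the count. */
static size_t varint_write(uint8_t *out, uint64_t v) {
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (uint8_t)(v | 0x80);
        v >>= 7;
    }
    out[n++] = (uint8_t)v;
    return n;
}

/* Returns bytes consumed, or 0 if the input is truncated or longer than
 * 10 bytes. */
static size_t varint_read(const uint8_t *in, size_t len, uint64_t *out) {
    uint64_t v = 0;
    for (size_t i = 0; i < len && i < 10; i++) {
        v |= (uint64_t)(in[i] & 0x7f) << (7 * i);
        if ((in[i] & 0x80) == 0) {
            *out = v;
            return i + 1;
        }
    }
    return 0;
}

/* Caller must provide at least 1 + 10 + 10 * n bytes in out. */
static size_t packed_longs_encode(uint8_t *out, const int64_t *vals, size_t n) {
    assert(out != NULL);
    size_t pos = 0;
    out[pos++] = TAG_PACKED_LONGS;
    pos += varint_write(out + pos, (uint64_t)n);
    for (size_t i = 0; i < n; i++)
        pos += varint_write(out + pos, zigzag_encode(vals[i]));
    return pos;
}

/* Returns 0 and sets *count on success, -1 on malformed input or when
 * the declared length exceeds cap. */
static int packed_longs_decode(const uint8_t *in, size_t len,
                               int64_t *vals, size_t cap, size_t *count) {
    uint64_t n, u;
    size_t pos = 1, used;
    if (len < 1 || in[0] != TAG_PACKED_LONGS) return -1;
    used = varint_read(in + pos, len - pos, &n);
    if (used == 0 || n > cap) return -1;
    pos += used;
    for (uint64_t i = 0; i < n; i++) {   /* one tight loop, no per-element tag */
        used = varint_read(in + pos, len - pos, &u);
        if (used == 0) return -1;
        pos += used;
        vals[i] = zigzag_decode(u);
    }
    *count = (size_t)n;
    return 0;
}
```

Under this sketch the array `{1, -2, 300}` encodes to the six bytes `08 03 02 03 D8 04`: one tag, one length byte, and four bytes of values, where per-element tagging would spend three more bytes on tags.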
🔗 PHP Performance Toolkit
Companion native PHP extensions:
- php_excel: native XLS/XLSX read/write via LibXL
- mdparser: native CommonMark + GitHub Flavored Markdown parser
- php_clickhouse: native ClickHouse client over the binary protocol
- fastchart: 19 chart types in one PHP extension
- fastjson: drop-in faster `ext/json`, backed by yyjson
- statgrab: system statistics wrapper around libstatgrab
