displace/ext-whisper

PHP 8.3+ native, in-process speech-to-text via whisper.cpp: 16kHz WAV in, text + timestamped segments out.

Package info

github.com/DisplaceTech/ext-whisper

Language: Rust

Type: php-ext

Ext name: ext-whisper

pkg:composer/displace/ext-whisper

v0.1.0 2026-06-11 20:57 UTC


README

Local speech-to-text for PHP, in-process.
16kHz WAV in, text + timestamped segments out: no Python sidecar, no remote API, no audio leaving the box.

What is ext-whisper?

ext-whisper is a PHP 8.3+ extension that loads a whisper.cpp model and runs speech-to-text in the PHP process, on CPU. Written in Rust on top of ext-php-rs and whisper-rs.

  • ๐ŸŽ™๏ธ Transcription with timestamps โ€” full text plus time-aligned segments, offsets in seconds.
  • ๐Ÿงพ Contracts-shaped output โ€” segment rows match Displace\AI\Contracts\Transcriber exactly; the adapter is two lines.
  • ๐Ÿงฐ Actionable errors โ€” a non-conforming WAV throws with the precise ffmpeg one-liner that fixes it.
  • ๐ŸŒ Multilingual + translate โ€” ['language' => 'de'] hints, ['translate' => true] to English (multilingual models).
  • ๐Ÿงต Thread-safe by construction โ€” one model handle, a fresh whisper state per call, no shared mutable state.
  • ๐Ÿคซ Quiet by default โ€” whisper.cpp's stderr firehose is silenced; EXT_WHISPER_LOG=1 restores it.

Quick start

mkdir -p models
curl -L -o models/ggml-tiny.en.bin \
    https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-tiny.en.bin

make build
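php -d extension=$PWD/target/debug/libwhisper.so examples/transcribe.php \
    models/ggml-tiny.en.bin tests/fixtures/jfk.wav

<?php
use Displace\Whisper\Model;

// Load the model once; the same handle can serve many transcribe() calls.
$model  = Model::load('models/ggml-tiny.en.bin');
$result = $model->transcribe('audio/meeting.wav');

// Full transcript as a single string.
echo $result->text(), PHP_EOL;

// Time-aligned segments; start and end are offsets in seconds.
foreach ($result->segments() as $s) {
    printf("[%6.2fs → %6.2fs] %s\n", $s['start'], $s['end'], $s['text']);
}

// Release the model handle when done.
$model->close();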
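
Because each segment row carries start and end offsets in seconds, the rows map naturally onto subtitle cues. The following is a minimal sketch of rendering them as SubRip (SRT), assuming each row is an array with start, end and text keys as in the loop above. srtTimestamp() and toSrt() are hypothetical userland helpers, not part of ext-whisper.

```php
<?php
declare(strict_types=1);

/** Format an offset in seconds as an SRT timestamp (HH:MM:SS,mmm). */
function srtTimestamp(float $seconds): string
{
    $ms = (int) round($seconds * 1000);

    return sprintf(
        '%02d:%02d:%02d,%03d',
        intdiv($ms, 3_600_000),
        intdiv($ms, 60_000) % 60,
        intdiv($ms, 1000) % 60,
        $ms % 1000
    );
}

/**
 * Render segment rows as SubRip cues, numbered from 1.
 * Assumption: each row has 'start', 'end' (seconds) and 'text' keys.
 */
function toSrt(iterable $segments): string
{
    $cues = [];
    $n = 0;
    foreach ($segments as $s) {
        $n++;
        $cues[] = sprintf(
            "%d\n%s --> %s\n%s\n",
            $n,
            srtTimestamp((float) $s['start']),
            srtTimestamp((float) $s['end']),
            trim((string) $s['text'])
        );
    }

    // Cues are separated by a blank line.
    return implode("\n", $cues);
}
```

With the quick-start variables in scope, file_put_contents('meeting.srt', toSrt($result->segments())) would write the cues to disk.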

Input must be 16kHz mono 16-bit PCM WAV; everything else converts in one line (ffmpeg -i in.mp3 -ar 16000 -ac 1 -c:a pcm_s16le out.wav) and the error messages carry that exact command.
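
The extension already rejects non-conforming input with an actionable error, but a userland preflight check can be useful, for example when validating uploads before a model is loaded. The following is a minimal sketch that reads the RIFF header and reports any mismatch with the 16kHz mono 16-bit PCM requirement. wavProblems() is a hypothetical helper, not part of ext-whisper, and it only accepts format tag 1 (plain PCM), so files using the extensible WAVE header are reported as non-PCM.

```php
<?php
declare(strict_types=1);

/**
 * List the reasons $path is not 16kHz mono 16-bit PCM WAV.
 * An empty list means the header looks conforming.
 * Hypothetical helper; the extension performs its own validation.
 */
function wavProblems(string $path): array
{
    $fh = @fopen($path, 'rb');
    if ($fh === false) {
        return ["cannot open $path"];
    }

    try {
        $riff = (string) fread($fh, 12);
        if (strlen($riff) < 12 || substr($riff, 0, 4) !== 'RIFF' || substr($riff, 8, 4) !== 'WAVE') {
            return ['not a RIFF/WAVE file'];
        }

        // Walk the chunk list until the "fmt " chunk turns up.
        while (strlen($head = (string) fread($fh, 8)) === 8) {
            $chunk = unpack('a4id/Vsize', $head);
            if ($chunk['id'] !== 'fmt ') {
                // Chunk bodies are padded to an even number of bytes.
                fseek($fh, $chunk['size'] + ($chunk['size'] & 1), SEEK_CUR);
                continue;
            }

            $body = (string) fread($fh, 16);
            if (strlen($body) < 16) {
                return ['truncated fmt chunk'];
            }
            $fmt = unpack('vformat/vchannels/Vrate/VbyteRate/valign/vbits', $body);

            $problems = [];
            if ($fmt['format'] !== 1) {
                $problems[] = "format tag {$fmt['format']} is not PCM (expected 1)";
            }
            if ($fmt['channels'] !== 1) {
                $problems[] = "{$fmt['channels']} channels (expected mono)";
            }
            if ($fmt['rate'] !== 16000) {
                $problems[] = "{$fmt['rate']} Hz (expected 16000)";
            }
            if ($fmt['bits'] !== 16) {
                $problems[] = "{$fmt['bits']}-bit samples (expected 16)";
            }

            return $problems;
        }

        return ['no fmt chunk found'];
    } finally {
        fclose($fh);
    }
}
```

If wavProblems('audio/meeting.wav') returns a non-empty list, running the ffmpeg one-liner above on the source file should produce a conforming WAV.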

Documentation

whisper.displace.tech: install, audio preparation, the full API surface. Built from docs/ with mdbook, deployed on every push to main.

Part of a stack

Transcribe (ext-whisper) → chunk (ai-toolkit) → embed (ext-infer) → search (ext-turbovec): searchable audio archives, entirely on your hardware. The ai-contracts Transcriber interface is the integration surface.

Compatibility

Version   macOS arm64   Linux x86_64   Linux arm64   Windows
PHP 8.3   ✅            ✅             ✅            not supported
PHP 8.4   ✅            ✅             ✅            not supported
PHP 8.5   ✅            ✅             ✅            not supported

Deliberately out of scope (v0.1)

Audio decoding (mp3/m4a/ogg: the ffmpeg one-liner is the API; symphonia-based decoding is a v0.2 candidate) · streaming / realtime transcription · speaker diarization · word-level timestamps · GPU-default builds (CPU-first platform-wide; use_gpu exists for custom builds) · Windows.

License

MIT © 2026 Eric Mann / Displace Technologies. Statically links whisper.cpp (MIT, © The ggml authors); see THIRD-PARTY-NOTICES.md.