displace / ext-whisper
PHP 8.3+ native, in-process speech-to-text via whisper.cpp: 16kHz WAV in, text + timestamped segments out.
Package info
github.com/DisplaceTech/ext-whisper
Language: Rust
Type: php-ext
Ext name: ext-whisper
pkg:composer/displace/ext-whisper
Requires
- php: ^8.3
This package is auto-updated.
Last update: 2026-06-11 21:16:53 UTC
README
Local speech-to-text for PHP, in-process.
16kHz WAV in, text + timestamped segments out: no Python sidecar, no remote API, no audio leaving the box.
What is ext-whisper?
ext-whisper is a PHP 8.3+ extension that loads a
whisper.cpp model and runs
speech-to-text in the PHP process, on CPU. Written in Rust on top of
ext-php-rs and
whisper-rs.
- Transcription with timestamps: full text plus time-aligned segments, offsets in seconds.
- Contracts-shaped output: segment rows match Displace\AI\Contracts\Transcriber exactly; the adapter is two lines.
- Actionable errors: a non-conforming WAV throws with the precise ffmpeg one-liner that fixes it.
- Multilingual + translate: ['language' => 'de'] hints the spoken language, ['translate' => true] translates to English (multilingual models).
- Thread-safe by construction: one model handle, a fresh whisper state per call, no shared mutable state.
- Quiet by default: whisper.cpp's stderr firehose is silenced; EXT_WHISPER_LOG=1 restores it.
Quick start
mkdir -p models
curl -L -o models/ggml-tiny.en.bin \
https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-tiny.en.bin
make build
php -d extension=$PWD/target/debug/libwhisper.so examples/transcribe.php \
models/ggml-tiny.en.bin tests/fixtures/jfk.wav
<?php

use Displace\Whisper\Model;

$model = Model::load('models/ggml-tiny.en.bin');
$result = $model->transcribe('audio/meeting.wav');

echo $result->text(), PHP_EOL;

foreach ($result->segments() as $s) {
    printf("[%6.2fs → %6.2fs] %s\n", $s['start'], $s['end'], $s['text']);
}

$model->close();
Input must be 16kHz mono 16-bit PCM WAV; everything else converts
in one line (ffmpeg -i in.mp3 -ar 16000 -ac 1 -c:a pcm_s16le out.wav)
and the error messages carry that exact command.
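The format requirement above can also be checked before a file ever reaches the extension, for example to reject uploads early. The following is a minimal sketch, not part of ext-whisper: `wavFormatProblem()` is a hypothetical helper that reads the RIFF header in plain PHP and reports why a file is not 16kHz mono 16-bit PCM. It assumes a canonical WAV layout and treats any format tag other than 1 (plain PCM) as non-conforming, which may be stricter or looser than the extension's own check.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical pre-flight check (not an ext-whisper API).
 * Returns null when the file looks like 16kHz mono 16-bit PCM WAV,
 * otherwise a short human-readable reason.
 */
function wavFormatProblem(string $path): ?string
{
    $fh = @fopen($path, 'rb');
    if ($fh === false) {
        return 'cannot open file';
    }
    try {
        $riff = (string) fread($fh, 12);
        if (strlen($riff) < 12 || substr($riff, 0, 4) !== 'RIFF' || substr($riff, 8, 4) !== 'WAVE') {
            return 'not a RIFF/WAVE file';
        }
        // Walk the chunk list until the "fmt " chunk turns up.
        while (strlen($head = (string) fread($fh, 8)) === 8) {
            $chunk = unpack('a4id/Vsize', $head);
            if ($chunk['id'] !== 'fmt ') {
                // RIFF chunks are word-aligned: odd sizes carry one pad byte.
                fseek($fh, $chunk['size'] + ($chunk['size'] & 1), SEEK_CUR);
                continue;
            }
            $body = (string) fread($fh, 16);
            if (strlen($body) < 16) {
                return 'truncated fmt chunk';
            }
            // Little-endian fields of the standard 16-byte PCM format block.
            $fmt = unpack('vformat/vchannels/Vrate/VbyteRate/vblockAlign/vbits', $body);
            if ($fmt['format'] === 1 && $fmt['channels'] === 1
                && $fmt['rate'] === 16000 && $fmt['bits'] === 16) {
                return null;
            }
            return sprintf(
                'expected 16000 Hz mono 16-bit PCM, got %d Hz, %d channel(s), %d-bit, format tag %d',
                $fmt['rate'], $fmt['channels'], $fmt['bits'], $fmt['format']
            );
        }
        return 'no fmt chunk found';
    } finally {
        fclose($fh);
    }
}
```

A caller might run this first and, when it returns a reason, convert the file with the ffmpeg command above before calling `transcribe()`.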
Documentation
whisper.displace.tech: install, audio preparation, the full API surface.
Built from docs/ with mdbook, deployed on every push to main.
Part of a stack
Transcribe (ext-whisper) → chunk (ai-toolkit) → embed (ext-infer) → search (ext-turbovec):
searchable audio archives, entirely on your hardware. The ai-contracts
Transcriber interface is the integration surface.
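To make the transcribe-to-chunk hand-off concrete, here is a minimal sketch of grouping segment rows into time-bounded chunks ready for embedding. It is not the ai-toolkit API: `chunkSegments()` and its `$maxSeconds` parameter are hypothetical, and the only thing taken from ext-whisper is the segment shape shown in the quick start (`start`, `end`, `text`, offsets in seconds).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of ai-toolkit or ext-whisper).
 * Merges consecutive segments into chunks spanning at most $maxSeconds,
 * keeping the start/end offsets so search hits can link back to the audio.
 *
 * @param array<int, array{start: float|int, end: float|int, text: string}> $segments
 * @return array<int, array{start: float|int, end: float|int, text: string}>
 */
function chunkSegments(array $segments, float $maxSeconds = 30.0): array
{
    $chunks = [];
    $current = null;

    foreach ($segments as $s) {
        // Flush the open chunk when this segment would stretch it past the window.
        if ($current !== null && $s['end'] - $current['start'] > $maxSeconds) {
            $chunks[] = $current;
            $current = null;
        }
        if ($current === null) {
            $current = ['start' => $s['start'], 'end' => $s['end'], 'text' => trim($s['text'])];
        } else {
            $current['end'] = $s['end'];
            $current['text'] .= ' ' . trim($s['text']);
        }
    }

    if ($current !== null) {
        $chunks[] = $current;
    }

    return $chunks;
}
```

A single segment longer than the window becomes its own chunk rather than being split, since splitting would need the word-level timestamps that v0.1 does not provide.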
Compatibility
| | macOS arm64 | Linux x86_64 | Linux arm64 | Windows |
|---|---|---|---|---|
| PHP 8.3 | ✓ | ✓ | ✓ | ✗ |
| PHP 8.4 | ✓ | ✓ | ✓ | ✗ |
| PHP 8.5 | ✓ | ✓ | ✓ | ✗ |
Deliberately out of scope (v0.1)
Audio decoding (mp3/m4a/ogg: the ffmpeg one-liner is the API;
symphonia-based decoding is a v0.2 candidate) · streaming /
realtime transcription · speaker diarization · word-level
timestamps · GPU-default builds (CPU-first platform-wide;
use_gpu exists for custom builds) · Windows.
License
MIT © 2026 Eric Mann / Displace Technologies. Statically links whisper.cpp (MIT, © The ggml authors); see THIRD-PARTY-NOTICES.md.