danny50610 / bpe-tokeniser
PHP port of most of openai/tiktoken
Requires
- php: ^8.1
Requires (Dev)
- phpunit/phpunit: ^10
README
Supported encodings
- gpt-3.5-turbo
- gpt-4
- gpt-4o
- more ...
For available encodings, see src/EncodingFactory.php
Installation
composer require danny50610/bpe-tokeniser
Example
GPT-4 / GPT-3.5-Turbo (cl100k_base)
use Danny50610\BpeTokeniser\EncodingFactory;

// Create the encoder directly from an encoding name.
$enc = EncodingFactory::createByEncodingName('cl100k_base');

var_dump($enc->encode("hello world"));
/**
 * output:
 * array(2) {
 *   [0]=>
 *   int(15339)
 *   [1]=>
 *   int(1917)
 * }
 */

var_dump($enc->decode($enc->encode("hello world")));
// output: string(11) "hello world"

use Danny50610\BpeTokeniser\EncodingFactory;

// Alternatively, create the encoder from a model name.
$enc = EncodingFactory::createByModelName('gpt-3.5-turbo');

var_dump($enc->decode($enc->encode("hello world")));
// output: string(11) "hello world"