mehrab-wj / tiktoken-php
A clone of Python's tiktoken for PHP: a fast BPE tokeniser for use with OpenAI's models.
Installs: 22 384
Dependents: 2
Suggesters: 0
Security: 0
Stars: 17
Watchers: 0
Forks: 4
Open Issues: 0
Requires
- php: ^8.1
- ext-mbstring: *
Requires (Dev)
- friendsofphp/php-cs-fixer: ^3.14
- phpstan/phpstan: ^1.9
- phpunit/phpunit: ^9.5
- rector/rector: ^0.15.12
- symfony/var-dumper: ^6.2
README
PHP Text Tokenizer for GPT models
About
A PHP toolkit that tokenizes text the same way the GPT family of models does.
Forked from semji/gpt3-tokenizer-php to add bug fixes and improvements.
Requirements
- PHP 8.1 or later
- mbstring extension (see the PHP manual for how to install it)
Usage
First, install the package using Composer:
composer require mehrab-wj/tiktoken-php
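// Composer autoloader (assumes this script sits in the project root)
require __DIR__ . '/vendor/autoload.php';

use TikToken\Encoder;

$prompt = "Ai is cool";
$encoder = new Encoder();
$tokens = $encoder->encode($prompt); // [32, 72, 318, 3608]

// Get tokens count:
echo count($tokens); // 4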
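A common reason to count tokens is to check a prompt against a model's token limit before sending it. The sketch below shows one way this might be done. The helper names countTokens and fitsTokenBudget are hypothetical and not part of tiktoken-php. They accept any callable that maps a string to an array of token IDs, which is assumed to match what Encoder::encode() returns in the usage above.

```php
<?php

declare(strict_types=1);

/**
 * Count the tokens in $text using any encoder callable.
 *
 * Hypothetical helper, not part of tiktoken-php. $encode must map a string
 * to an array of token IDs, as Encoder::encode() appears to do above.
 */
function countTokens(callable $encode, string $text): int
{
    return count($encode($text));
}

/**
 * Report whether $text fits within a budget of $maxTokens tokens.
 *
 * Hypothetical helper; the budget itself is chosen by the caller.
 */
function fitsTokenBudget(callable $encode, string $text, int $maxTokens): bool
{
    return countTokens($encode, $text) <= $maxTokens;
}
```

With the package installed, the encoder can be wrapped as fn (string $text): array => $encoder->encode($text) and passed as $encode. Taking a callable rather than an Encoder instance keeps the helpers testable with a stub and independent of the library's class layout.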