rajentrivedi/tokenizer-x

TokenizerX calculates the number of tokens required for a given prompt.

v1.0.4 2024-07-23 02:53 UTC

This package is auto-updated.

Last update: 2024-12-23 03:46:55 UTC



TokenizerX supports Laravel 11 and Laravel 10.


TokenizerX

TokenizerX is a Laravel package that counts the tokens in a piece of text for OpenAI models. The latest update adds support for GPT-4 models.

It calculates the number of tokens a prompt requires before you send the request to the OpenAI REST API. This helps ensure that a request does not exceed the OpenAI API token limit and leaves room for a complete response.

To access the OpenAI REST API, consider the Laravel package OpenAI PHP.

Supported OpenAI Models

  • gpt-4o
  • gpt-4
  • gpt-3.5-turbo
  • text-davinci-003
  • text-davinci-002
  • text-davinci-001
  • text-curie-001
  • text-babbage-001
  • text-ada-001
  • davinci
  • curie
  • babbage
  • ada
  • code-davinci-002
  • code-davinci-001
  • code-cushman-002
  • code-cushman-001
  • davinci-codex
  • cushman-codex
  • text-davinci-edit-001
  • code-davinci-edit-001
  • text-embedding-ada-002
  • text-similarity-davinci-001
  • text-similarity-curie-001
  • text-similarity-babbage-001
  • text-similarity-ada-001
  • text-search-davinci-doc-001
  • text-search-curie-doc-001
  • text-search-babbage-doc-001
  • text-search-ada-doc-001
  • code-search-babbage-code-001
  • code-search-ada-code-001

Supported Encodings

  • r50k_base
  • p50k_base
  • p50k_edit
  • cl100k_base

Installation

You can install the package via composer:

    composer require rajentrivedi/tokenizer-x

Usage

By default, the package uses the GPT-3 model:

    use Rajentrivedi\TokenizerX\TokenizerX;
    TokenizerX::count("how are you?");

To count tokens for a specific OpenAI model, pass a model name from the supported list above as the second argument:

    use Rajentrivedi\TokenizerX\TokenizerX;
    TokenizerX::count("how are you?", "gpt-4");
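
Counting before sending is mainly useful for checking that a prompt still leaves room for the reply. The following is a minimal sketch of such a check, not part of TokenizerX: `promptFitsBudget()` and its parameters are hypothetical names, and the context limit and reserved completion size are values you would supply for your model. The counter is passed in as a callable so the helper does not depend on the package.

```php
<?php

/**
 * Hypothetical helper (not provided by TokenizerX).
 *
 * Returns true when the prompt plus the tokens reserved for the
 * completion fit inside the model's context limit.
 *
 * $counter is any callable mapping (text, model) to a token count.
 */
function promptFitsBudget(
    callable $counter,
    string $prompt,
    string $model,
    int $contextLimit,
    int $reservedForCompletion
): bool {
    // Count the prompt tokens with whatever counter the caller supplied.
    $promptTokens = $counter($prompt, $model);

    // The prompt and the expected completion share the same context window.
    return $promptTokens + $reservedForCompletion <= $contextLimit;
}
```

Inside a Laravel app the counter could be `fn (string $text, string $model): int => TokenizerX::count($text, $model)`. The limits are assumptions that depend on the model you call, so look them up rather than hard-coding a guess.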

You can also read the text from a file:

    TokenizerX::count(file_get_contents('path_to_file'));

Make sure the text of the file does not change when it is read programmatically; this can happen because of character encoding. You can check the generated token IDs with the following:

    TokenizerX::tokens(file_get_contents('path_to_file'));

This returns an array of the generated tokens; compare those token IDs with the ones shown by the OpenAI Tokenizer.

You can also use the OpenAI Tokenizer to double-check the token counts the package generates.
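
If the counts for a file differ from what you expect, the file's encoding is one thing to rule out. The sketch below shows one way to normalize file contents to UTF-8 before counting. `normalizeToUtf8()` is a hypothetical helper, not part of TokenizerX, and the fallback source encoding (`ISO-8859-1`) is only an assumption; use whatever encoding your files actually have. It relies on PHP's mbstring extension.

```php
<?php

/**
 * Hypothetical helper (not provided by TokenizerX).
 *
 * Returns the given bytes as UTF-8 text without a byte-order mark.
 * $sourceEncoding is an assumed fallback, used only when the input
 * is not already valid UTF-8.
 */
function normalizeToUtf8(string $raw, string $sourceEncoding = 'ISO-8859-1'): string
{
    // Strip a UTF-8 byte-order mark so it is not counted as part of the text.
    if (str_starts_with($raw, "\xEF\xBB\xBF")) {
        $raw = substr($raw, 3);
    }

    // Valid UTF-8 is returned unchanged.
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }

    // Anything else is converted from the assumed source encoding.
    return mb_convert_encoding($raw, 'UTF-8', $sourceEncoding);
}
```

The result can then be passed to the package, for example `TokenizerX::count(normalizeToUtf8(file_get_contents('path_to_file')))`.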

Support

If you find TokenizerX helpful and would like to support its ongoing development, you can contribute by buying me a coffee! Your support helps in maintaining and improving the package for the Laravel community.

ko-fi

Testing

    composer test

Changelog

Please see CHANGELOG for more information on what has changed recently.

Contributing

Please see CONTRIBUTING for details.

Security Vulnerabilities

Please review our security policy on how to report security vulnerabilities.

⭐ Star the Repository ⭐

If you find this project useful or interesting, please consider giving it a ⭐ star on GitHub. Your support encourages and motivates me to continue improving and maintaining this project.

By starring the repository, you can show appreciation for the work put into developing this open-source project. It also helps to increase its visibility, making it more accessible to other developers and potentially attracting contributors.

To give a ⭐ star, simply click on the Star button at the top-right corner of the repository page.

Credits

License

TokenizerX is developed using

The MIT License (MIT). Please see License File for more information.