textualization/ropherta

Compute RoBERTa embeddings using the ONNX framework.

v0.0.11 2024-04-29 13:11 UTC

This package is auto-updated.

Last update: 2024-04-29 13:11:59 UTC


README

This brings the power of Transformers to the PHP world.

Installation

Add this project to your dependencies

composer require textualization/ropherta
composer update

Before using it, you will need to install the ONNX Runtime library:

composer exec -- php -r "require 'vendor/autoload.php'; OnnxRuntime\Vendor::check();"

and download the RoBERTa ONNX model (this takes a while; the model is 477 MB in size):

composer exec -- php -r "require 'vendor/autoload.php'; Textualization\Ropherta\Vendor::check();"

Computing embeddings

use Textualization\Ropherta\RophertaModel;

$model = new RophertaModel();
$emb = $model->embeddings("Text");

See \Textualization\Ropherta\Distances for ways to measure how close two embeddings are to each other.
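
To illustrate what comparing two embeddings involves, here is a minimal sketch of cosine similarity written in plain PHP. It does not use the Distances class, whose methods are not listed here. The helper cosineSimilarity() is hypothetical and not part of the package, and the sketch assumes that embeddings() returns a flat array of floats.

```php
<?php
// Hypothetical helper, NOT part of textualization/ropherta:
// cosine similarity between two equal-length vectors of numbers.
function cosineSimilarity(array $a, array $b): float
{
    // Re-index so both arrays are walked position by position.
    $a = array_values($a);
    $b = array_values($b);

    if (count($a) === 0 || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $x) {
        $y = $b[$i];
        $dot += $x * $y;
        $normA += $x * $x;
        $normB += $y * $y;
    }

    // A zero vector has no direction; report no similarity.
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Stand-in vectors. In practice these would come from
// $model->embeddings(...), assuming it returns a flat array of floats.
$query = [1.0, 0.0, 1.0];
$near  = [2.0, 0.0, 2.0]; // same direction as $query
$far   = [0.0, 1.0, 0.0]; // orthogonal to $query

echo cosineSimilarity($query, $near), PHP_EOL;
echo cosineSimilarity($query, $far), PHP_EOL;
```

Values close to 1 mean the two vectors point in nearly the same direction, while values close to 0 mean they are unrelated. Cosine similarity ignores vector length, which is why it is a common choice for comparing text embeddings.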

Using a custom model

$model = new RophertaModel("/path/to/model.onnx");
$emb = $model->embeddings("Text");

To fine-tune a model, you will need a large amount of in-domain text and Python on a machine with a GPU. See tuning for details.

Sponsors

We thank our sponsor:

Evoludata (https://evoludata.com)