benbjurstrom/pgvector-scout

Pgvector driver for Laravel Scout

v0.1.0 2024-11-16 16:05 UTC

This package is not auto-updated.

Last update: 2024-11-17 14:15:57 UTC




Use the pgvector extension with Laravel Scout for vector similarity search.

For a full example of how to use this package, see benbjurstrom/pgvector-scout-demo.

🚀 Quick Start

1. Install the package using composer:

composer require benbjurstrom/pgvector-scout

2. Publish and run the migrations:

php artisan vendor:publish --tag="pgvector-scout-migrations"
php artisan migrate

3. Ensure the pgvector extension is installed in your database (the query below returns a row if it is):

select * from pg_extension where extname='vector';

4. Update the model you wish to make searchable:

Add the HasEmbeddings and Searchable traits to your model and implement toSearchableArray() with the content you want converted into an embedding.

use BenBjurstrom\PgvectorScout\Models\Concerns\HasEmbeddings;
use Laravel\Scout\Searchable;

class YourModel extends Model
{
    use HasEmbeddings, Searchable;

    /**
     * Get the indexable data array for the model.
     */
    public function toSearchableArray(): array
    {
        return [
            'title' => $this->title,
            'content' => $this->content,
        ];
    }
}

5. Configure your environment:

If you're using OpenAI to generate your embeddings, be sure to add your API key to your .env file:

OPENAI_API_KEY=your-api-key

6. Publish the config:

php artisan vendor:publish --tag="pgvector-scout-config"

These are the contents of the published config file:

return [
    /*
    |--------------------------------------------------------------------------
    | Default Embedding Handler
    |--------------------------------------------------------------------------
    |
    | This option controls which embedding handler to use by default. You can
    | change this to any of the handlers defined below or create your own.
    |
    */
    'default' => env('EMBEDDING_HANDLER', 'openai'),

    /*
    |--------------------------------------------------------------------------
    | Embedding Handler Configurations
    |--------------------------------------------------------------------------
    |
    | Here you can define the configuration for different embedding handlers.
    | Each handler can have its own specific configuration options.
    |
    */
    'handlers' => [
        'openai' => [
            'class' => \BenBjurstrom\PgvectorScout\Handlers\OpenAiHandler::class,
            'model' => 'text-embedding-3-small',
            'dimensions' => 256, // See Reducing embedding dimensions https://platform.openai.com/docs/guides/embeddings#use-cases
            'url' => env('OPENAI_URL', 'https://api.openai.com/v1'),
            'api_key' => env('OPENAI_API_KEY'),
            'table' => 'embeddings',
        ],
        'fake' => [
            'class' => \BenBjurstrom\PgvectorScout\Handlers\FakeHandler::class,
            'model' => 'fake',
            'dimensions' => 3,
            'url' => 'https://example.com',
            'api_key' => '123',
            'table' => 'embeddings',
        ],
    ],
];
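As a rough illustration of how the openai handler settings above fit together, the sketch below maps a handler config array onto the request shape documented for OpenAI's embeddings endpoint. buildEmbeddingRequest is a hypothetical helper, not part of this package, and the package's OpenAiHandler may build its request differently.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative only: maps a handler config array onto the shape of an
 * OpenAI embeddings request. Hypothetical helper, not the package's code.
 */
function buildEmbeddingRequest(array $config, string $input): array
{
    return [
        'method' => 'POST',
        // The configured base URL plus the embeddings path.
        'url' => rtrim($config['url'], '/') . '/embeddings',
        'headers' => [
            'Authorization' => 'Bearer ' . $config['api_key'],
            'Content-Type' => 'application/json',
        ],
        'body' => [
            'model' => $config['model'],
            'input' => $input,
            // Asks the API for a shortened embedding of this length.
            'dimensions' => $config['dimensions'],
        ],
    ];
}

// Values mirror the 'openai' entry above; the API key is a placeholder.
$request = buildEmbeddingRequest([
    'model' => 'text-embedding-3-small',
    'dimensions' => 256,
    'url' => 'https://api.openai.com/v1',
    'api_key' => 'your-api-key',
], 'your search query');

echo json_encode($request['body']), PHP_EOL;
```

Whatever value is used for dimensions must stay the same for indexing and searching, since vectors of different lengths cannot be compared.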

🔍 Usage

Create embeddings for your models:

Laravel Scout uses Eloquent model observers to keep your search index in sync whenever your Searchable models change.

This package uses that functionality to automatically generate embeddings for your models when they are saved or updated, and to remove them when your models are deleted.

If you want to manually generate embeddings for existing models, you can use the artisan command below. See the Scout documentation for more information.

php artisan scout:import "App\Models\YourModel"

Search using vector similarity:

You can use the typical Scout syntax to search your models. For example:

$results = YourModel::search('your search query')->get();

Note that the text of your query will be converted into a vector embedding using the configured embedding handler (such as OpenAI). It's important that the same model is used for both indexing and searching.

Search using existing vectors:

You can also pass an existing embedding vector as a search parameter. This can be useful to find related models. For example:

$vector = $someModel->embedding->vector;
$results = YourModel::search($vector)->get();

Evaluate search results:

Search results are ordered by similarity to the given input and include the embedding relationship. The distance computed by the nearest-neighbor search can be accessed as follows:

$results = YourModel::search('your search query')->get();
$results->first()->embedding->neighbor_distance; // 0.26834 (example value)

The larger the distance, the less similar the result is to the input.
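To make the distance value more concrete, here is a minimal sketch of one common distance measure, cosine distance, in plain PHP. It is an assumption that the package's neighbor_distance is a cosine distance; pgvector supports several distance operators and the source does not say which one is used. cosineDistance is a hypothetical helper written for this example.

```php
<?php

declare(strict_types=1);

/**
 * Cosine distance between two equal-length vectors: 1 - cos(angle).
 * 0 means same direction; the value grows as the vectors diverge (max 2).
 */
function cosineDistance(array $a, array $b): float
{
    if (count($a) !== count($b) || count($a) === 0) {
        throw new InvalidArgumentException('Vectors must be non-empty and the same length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    if ($normA == 0.0 || $normB == 0.0) {
        throw new InvalidArgumentException('Cosine distance is undefined for zero vectors.');
    }

    return 1.0 - $dot / (sqrt($normA) * sqrt($normB));
}

$query = [1.0, 0.0, 0.0];
$close = [0.9, 0.1, 0.0]; // points almost the same way as $query
$far = [0.0, 1.0, 0.0];   // orthogonal to $query

// The closer vector has the smaller distance.
printf("close: %.4f, far: %.4f\n", cosineDistance($query, $close), cosineDistance($query, $far));
```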

🛠 Using custom handlers

By default, this package uses OpenAI to generate embeddings. To do this, it uses the OpenAiHandler class paired with the openai entry in the package's config file.

You can generate embeddings from other providers by adding a custom handler. A handler is a simple class that implements HandlerContract: it takes a string and returns a Pgvector\Laravel\Vector object.

Whatever API calls or logic are needed to turn a string into a vector should be defined in the handle method of your custom handler.

If you need to pass API keys, embedding dimensions, or any other configuration to your handler, you can define them in the config/pgvector-scout.php file.
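As a sketch of the kind of string-to-vector logic a handle method might contain, the function below builds a deterministic, fixed-length vector by hashing words into buckets. hashingEmbedding is a hypothetical helper and a toy stand-in for a real embedding provider; it captures no meaning beyond word overlap. The exact handle signature is defined by HandlerContract and is not reproduced here; in a real handler the returned array would be wrapped in a Pgvector\Laravel\Vector.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: turns text into a fixed-length, unit-length vector
 * by hashing each word into a bucket. Not a semantic embedding.
 */
function hashingEmbedding(string $text, int $dimensions): array
{
    if ($dimensions < 1) {
        throw new InvalidArgumentException('Dimensions must be at least 1.');
    }

    $vector = array_fill(0, $dimensions, 0.0);

    // Lowercase, then split on anything that is not an ASCII letter or digit.
    $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    foreach ($words as $word) {
        // abs() guards against negative crc32() results on 32-bit builds.
        $vector[abs(crc32($word)) % $dimensions] += 1.0;
    }

    // L2-normalize so texts of different lengths are comparable.
    $norm = sqrt(array_sum(array_map(fn (float $v): float => $v * $v, $vector)));

    // Text with no words stays an all-zero vector.
    return $norm > 0.0
        ? array_map(fn (float $v): float => $v / $norm, $vector)
        : $vector;
}

// The dimension count would come from the handler's config entry.
print_r(hashingEmbedding('Pgvector driver for Laravel Scout', 8));
```

Because the output length is fixed by the dimensions argument, it has to match the dimensions value configured for the handler.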

Installing pgvector when using DBngin

If you're using DBngin for local development, you can install the pgvector extension by doing the following:

  1. Add PostgreSQL to your path:
export PATH=/Users/Shared/DBngin/postgresql/14.3/bin:$PATH
  2. Then install pgvector:
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make && make install

👏 Credits

📝 License

The MIT License (MIT). Please see License File for more information.