helgesverre / chromadb
PHP client for the ChromaDB REST API
Requires
- php: ^8.2
- saloonphp/laravel-plugin: ^v3.0.0
- saloonphp/saloon: ^3.0
- spatie/laravel-data: ^3.10
- spatie/laravel-package-tools: ^1.14.0
Requires (Dev)
- larastan/larastan: ^2.0.1
- laravel/pint: ^1.0
- nunomaduro/collision: ^7.8
- orchestra/testbench: ^8.8
- pestphp/pest: ^2.20
- pestphp/pest-plugin-arch: ^2.0
- pestphp/pest-plugin-laravel: ^2.0
- phpstan/extension-installer: ^1.1
- phpstan/phpstan-deprecation-rules: ^1.0
- phpstan/phpstan-phpunit: ^1.0
README
ChromaDB PHP API Client
ChromaDB is an open-source vector database that allows you to store and query vector embeddings. This package provides a PHP client for the ChromaDB API.
Installation
You can install the package via composer:
composer require helgesverre/chromadb
You can publish the config file with:
php artisan vendor:publish --tag="chromadb-config"
This is the contents of the published config/chromadb.php file:
return [
    'token' => env('CHROMADB_TOKEN'),
    'host' => env('CHROMADB_HOST', 'localhost'),
    'port' => env('CHROMADB_PORT', '19530'),
];
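Outside Laravel the env() helper is not available. As a minimal sketch, the same defaults could be resolved from a plain array of environment values. chromadbSettings() is a hypothetical helper for illustration and is not part of the package. Note that the default port here is 19530, while the usage example connects on port 8000, so CHROMADB_PORT likely needs to be set to match your server.

```php
<?php

// Hypothetical helper (not part of the package): mirrors the defaults
// in config/chromadb.php using a plain array of environment values.
function chromadbSettings(array $env): array
{
    return [
        'token' => $env['CHROMADB_TOKEN'] ?? null,
        'host' => $env['CHROMADB_HOST'] ?? 'localhost',
        'port' => $env['CHROMADB_PORT'] ?? '19530',
    ];
}

// Override only the port; token and host fall back to the defaults.
$settings = chromadbSettings(['CHROMADB_PORT' => '8000']);
```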
Usage
$chromadb = new \HelgeSverre\Chromadb\Chromadb(
    token: 'test-token-chroma-local-dev',
    host: 'http://localhost',
    port: '8000'
);

// Create a new collection with optional metadata
$chromadb->collections()->create(
    name: 'my_collection',
);

// Count the number of collections
$chromadb->collections()->count();

// Retrieve a specific collection by name
$chromadb->collections()->get(
    collectionName: 'my_collection'
);

// Delete a collection by name
$chromadb->collections()->delete(
    collectionName: 'my_collection'
);

// Update a collection's name and/or metadata
$chromadb->collections()->update(
    collectionId: '3ea5a914-e2ab-47cb-b285-8e585c9af4f3',
    newName: 'new_collection_name',
);

// Add items to a collection with optional embeddings, metadata, and documents
// (each embedding is an array of floats; the values here are placeholders)
$chromadb->items()->add(
    collectionId: '3ea5a914-e2ab-47cb-b285-8e585c9af4f3',
    ids: ['item1', 'item2'],
    embeddings: [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    documents: ['doc1', 'doc2']
);

// Update items in a collection with new embeddings, metadata, and documents
$chromadb->items()->update(
    collectionId: '3ea5a914-e2ab-47cb-b285-8e585c9af4f3',
    ids: ['item1', 'item2'],
    embeddings: [[0.7, 0.8, 0.9], [0.1, 0.3, 0.5]],
    documents: ['new_doc1', 'new_doc2']
);

// Upsert items in a collection (insert if they do not exist, update if they do)
$chromadb->items()->upsert(
    collectionId: '3ea5a914-e2ab-47cb-b285-8e585c9af4f3',
    ids: ['item'],
    metadatas: [['title' => 'metadata']],
    documents: ['document']
);

// Retrieve specific items from a collection by their IDs
$chromadb->items()->get(
    collectionId: '3ea5a914-e2ab-47cb-b285-8e585c9af4f3',
    ids: ['item1', 'item2']
);

// Delete specific items from a collection by their IDs
$chromadb->items()->delete(
    collectionId: '3ea5a914-e2ab-47cb-b285-8e585c9af4f3',
    ids: ['item1', 'item2']
);

// Count the number of items in a collection
$chromadb->items()->count(
    collectionId: '3ea5a914-e2ab-47cb-b285-8e585c9af4f3'
);

// Query items in a collection based on embeddings, texts, and other filters
// (createTestVector() is not defined here; pass any embedding as an array of floats)
$chromadb->items()->query(
    collectionId: '3ea5a914-e2ab-47cb-b285-8e585c9af4f3',
    queryEmbeddings: [createTestVector(0.8)],
    include: ['documents', 'metadatas', 'distances'],
    nResults: 5
);
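The query call above relies on a createTestVector() helper that this README does not define. A minimal stand-in, assuming it simply produces a fixed-length vector filled with a single value, could look like the sketch below. The signature and the default of 128 dimensions are assumptions, not taken from the package.

```php
<?php

// Hypothetical stand-in for createTestVector(): a vector in which every
// component has the same value. The default dimension count is an assumption.
function createTestVector(float $value, int $dimensions = 128): array
{
    return array_fill(0, $dimensions, $value);
}

// A four-dimensional vector with every component set to 0.8.
$vector = createTestVector(0.8, 4);
```

Whatever helper you use, the query vector should have the same number of dimensions as the embeddings already stored in the collection.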
Example: Semantic Search with ChromaDB and OpenAI Embeddings
This example demonstrates how to perform a semantic search in ChromaDB using embeddings generated from OpenAI.
Full code available in SemanticSearchTest.php.
Prepare Your Data
First, create an array of data you wish to index. In this example, we'll use blog posts with titles, summaries, and tags.
$blogPosts = [
    [
        'title' => 'Exploring Laravel',
        'summary' => 'A deep dive into Laravel frameworks...',
        'tags' => ['PHP', 'Laravel', 'Web Development']
    ],
    [
        'title' => 'Introduction to React',
        'summary' => 'Understanding the basics of React and how it revolutionizes frontend development.',
        'tags' => ['JavaScript', 'React', 'Frontend']
    ],
];
Generate Embeddings
Use OpenAI's embeddings API to convert the summaries of your blog posts into vector embeddings.
$summaries = array_column($blogPosts, 'summary');
$embeddingsResponse = OpenAI::client('sk-your-openai-api-key')
    ->embeddings()
    ->create([
        'model' => 'text-embedding-ada-002',
        'input' => $summaries,
    ]);
foreach ($embeddingsResponse->embeddings as $embedding) {
    $blogPosts[$embedding->index]['vector'] = $embedding->embedding;
}
Create ChromaDB Collection
Create a collection in ChromaDB to store your blog post embeddings.
$createCollectionResponse = $chromadb->collections()->create(
    name: 'blog_posts',
);
$collectionId = $createCollectionResponse->json('id');
Insert into ChromaDB
Insert these embeddings, along with other blog post data, into your ChromaDB collection.
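foreach ($blogPosts as $post) {
    $chromadb->items()->add(
        collectionId: $collectionId,
        ids: [$post['title']],
        // The vector was stored under the 'vector' key in the previous step
        embeddings: [$post['vector']],
        // Metadata values must be scalars, so join the tags and leave out the vector
        metadatas: [[
            'title' => $post['title'],
            'summary' => $post['summary'],
            'tags' => implode(', ', $post['tags']),
        ]]
    );
}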
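Since add() accepts arrays of ids and embeddings (as in the Usage section), the loop could be replaced by one batched call. The sketch below only builds the parallel arrays. buildAddPayload() is a hypothetical helper, it assumes the 'vector' key set in the embeddings step, and it assumes metadata values need to be scalar, so the tags are joined into one string.

```php
<?php

// Hypothetical helper: turns posts (each with a 'vector' key) into the
// parallel arrays that a single add() call expects.
function buildAddPayload(array $posts): array
{
    $payload = ['ids' => [], 'embeddings' => [], 'metadatas' => []];

    foreach ($posts as $post) {
        $payload['ids'][] = $post['title'];
        $payload['embeddings'][] = $post['vector'];
        // Keep metadata values scalar: join the tags into one string.
        $payload['metadatas'][] = [
            'title' => $post['title'],
            'summary' => $post['summary'],
            'tags' => implode(', ', $post['tags']),
        ];
    }

    return $payload;
}

// Two shortened sample posts with made-up two-dimensional vectors.
$payload = buildAddPayload([
    [
        'title' => 'Exploring Laravel',
        'summary' => 'A deep dive...',
        'tags' => ['PHP', 'Laravel'],
        'vector' => [0.1, 0.2],
    ],
    [
        'title' => 'Introduction to React',
        'summary' => 'React basics...',
        'tags' => ['JavaScript'],
        'vector' => [0.3, 0.4],
    ],
]);
```

The result could then be sent in one request by passing $payload['ids'], $payload['embeddings'], and $payload['metadatas'] as the ids, embeddings, and metadatas arguments of items()->add().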
Creating a Search Vector with OpenAI
Generate a search vector for your query, akin to how you processed the blog posts.
$searchEmbedding = getOpenAIEmbedding('laravel framework');
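getOpenAIEmbedding() is not defined in this README. A helper along the following lines could back it. This is a sketch under assumptions: embedText() is a hypothetical name, and the request is delegated to a callable that returns the decoded embeddings API response, so the example can run with a stub instead of a real API call.

```php
<?php

// Hypothetical helper: $createEmbeddings is any callable that takes the
// request parameters and returns the decoded embeddings API response.
function embedText(string $text, callable $createEmbeddings): array
{
    $response = $createEmbeddings([
        'model' => 'text-embedding-ada-002',
        'input' => $text,
    ]);

    // A single input string yields a single entry under "data".
    return $response['data'][0]['embedding'];
}

// Stubbed callable so the sketch runs without network access.
$fakeApi = fn (array $params): array => [
    'data' => [['index' => 0, 'embedding' => [0.1, 0.2, 0.3]]],
];

$searchEmbedding = embedText('laravel framework', $fakeApi);
```

With the OpenAI client used earlier, the callable could wrap ->embeddings()->create($params) and convert the response object to an array; the exact conversion method depends on the client version and is an assumption here.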
Searching using the Embedding in ChromaDB
Use the ChromaDB client to perform a search with the generated embedding.
$searchResponse = $chromadb->items()->query(
    collectionId: $collectionId,
    queryEmbeddings: [$searchEmbedding],
    nResults: 3,
    include: ['metadatas']
);
// Output the search results. Chroma returns one list of matches per query
// embedding, so the matches for the single query above are at index 0.
$matches = $searchResponse->json('metadatas')[0] ?? [];
foreach ($matches as $metadata) {
    $tags = $metadata['tags'];
    echo "Title: " . $metadata['title'] . "\n";
    echo "Summary: " . $metadata['summary'] . "\n";
    echo "Tags: " . (is_array($tags) ? implode(', ', $tags) : $tags) . "\n\n";
}
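When distances are requested as well, it can help to pair each match with its distance. The sketch below assumes the query response holds parallel ids, distances, and metadatas arrays, each nested one level per query embedding. flattenQueryResult() is a hypothetical helper and the sample data is made up.

```php
<?php

// Hypothetical helper: pairs up the parallel per-query arrays in a
// query response for one query embedding.
function flattenQueryResult(array $result, int $queryIndex = 0): array
{
    $ids = $result['ids'][$queryIndex] ?? [];
    $rows = [];

    foreach ($ids as $i => $id) {
        $rows[] = [
            'id' => $id,
            // Fields that were not requested via "include" come back as null.
            'distance' => $result['distances'][$queryIndex][$i] ?? null,
            'metadata' => $result['metadatas'][$queryIndex][$i] ?? null,
        ];
    }

    return $rows;
}

// Made-up response for a single query embedding with two matches.
$rows = flattenQueryResult([
    'ids' => [['Exploring Laravel', 'Introduction to React']],
    'distances' => [[0.12, 0.48]],
    'metadatas' => [[
        ['title' => 'Exploring Laravel'],
        ['title' => 'Introduction to React'],
    ]],
]);
```

Each row then carries the id, distance, and metadata of one match, which is easier to loop over than three separate arrays.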
Running ChromaDB in Docker
To quickly get started with ChromaDB, you can run it in Docker:
# Download the docker-compose.yml file (use the raw URL; the blob URL returns an HTML page)
wget https://raw.githubusercontent.com/HelgeSverre/chromadb/main/docker-compose.yml
# Start ChromaDB
docker compose up -d
The auth token is set to test-token-chroma-local-dev by default.
You can change this in the docker-compose.yml file by changing the CHROMA_SERVER_AUTH_CREDENTIALS environment variable.
To stop ChromaDB, run docker compose down. To wipe all the data, run docker compose down -v.
NOTE
The docker-compose.yml file in this repo is provided only as an example and should not be used in production. Go to the ChromaDB deployment documentation for more information on deploying Chroma in production.
Testing
cp .env.example .env
docker compose up -d
composer test
composer analyse src
License
The MIT License (MIT). Please see License File for more information.