php-llm / llm-chain
A slim PHP component with tooling around LLMs.
Requires
- php: >=8.2
- phpdocumentor/reflection-docblock: ^5.4
- psr/cache: ^3.0
- psr/log: ^3.0
- symfony/clock: ^6.4 || ^7.1
- symfony/http-client: ^6.4 || ^7.1
- symfony/property-access: ^6.4 || ^7.1
- symfony/property-info: ^6.4 || ^7.1
- symfony/serializer: ^6.4 || ^7.1
- symfony/type-info: ^6.4 || ^7.1
- symfony/uid: ^6.4 || ^7.1
- webmozart/assert: ^1.11
Requires (Dev)
- codewithkyrian/chromadb-php: ^0.2.1
- mongodb/mongodb: ^1.20
- php-cs-fixer/shim: ^3.64
- phpstan/phpstan: ^1.12
- phpunit/phpunit: ^11.3
- probots-io/pinecone-php: ^1.0
- rector/rector: ^1.2
- symfony/console: ^6.4 || ^7.1
- symfony/css-selector: ^6.4 || ^7.1
- symfony/dom-crawler: ^6.4 || ^7.1
- symfony/dotenv: ^6.4 || ^7.1
- symfony/finder: ^6.4 || ^7.1
- symfony/process: ^6.4 || ^7.1
- symfony/var-dumper: ^6.4 || ^7.1
Suggests
- codewithkyrian/chromadb-php: For using the ChromaDB as retrieval vector store.
- mongodb/mongodb: For using MongoDB Atlas as retrieval vector store.
- probots-io/pinecone-php: For using the Pinecone as retrieval vector store.
- symfony/clock: For using the clock tool.
- symfony/css-selector: For using the YouTube transcription tool.
- symfony/dom-crawler: For using the YouTube transcription tool.
Conflicts
- mongodb/mongodb: <1.20
This package is auto-updated.
Last update: 2024-11-21 07:02:08 UTC
README
PHP library for building LLM-based features and applications.
This library is not stable yet and is still rather experimental. Feel free to try it out, give feedback, ask questions, contribute, or share your use cases. Abstractions, concepts, and interfaces are not final and potentially subject to change.
Requirements
- PHP 8.2 or higher
Installation
The recommended way to install LLM Chain is through Composer:
composer require php-llm/llm-chain
When using Symfony Framework, check out the integration bundle php-llm/llm-chain-bundle.
Examples
See the examples folder to run example implementations using this library.
Depending on the example, you need to export different environment variables for API keys or deployment configurations, or create a .env.local file based on the .env file.
To run all examples, use make run-all-examples or php example.
Basic Concepts & Usage
Models & Platforms
LLM Chain distinguishes two main types of models: Language Models and Embeddings Models.
Language Models, like GPT, Claude, and Llama, are the essential centerpiece of LLM applications, while Embeddings Models are supporting models that provide vector representations of text.
Those models are provided by different platforms, like OpenAI, Azure, Replicate, and others.
Example Instantiation
use PhpLlm\LlmChain\OpenAI\Model\Embeddings;
use PhpLlm\LlmChain\OpenAI\Model\Gpt;
use PhpLlm\LlmChain\OpenAI\Model\Gpt\Version;
use PhpLlm\LlmChain\OpenAI\Platform\OpenAI;
use Symfony\Component\HttpClient\HttpClient;

// Platform: OpenAI
$platform = new OpenAI(HttpClient::create(), $_ENV['OPENAI_API_KEY']);

// Language Model: GPT (OpenAI)
$llm = new Gpt($platform, Version::gpt4oMini());

// Embeddings Model: Embeddings (OpenAI)
$embeddings = new Embeddings($platform);
Supported Models & Platforms
- Language Models
  - OpenAI's GPT with OpenAI and Azure as Platform
  - Anthropic's Claude with Anthropic as Platform
- Embeddings Models
  - OpenAI's Text Embeddings with OpenAI and Azure as Platform
  - Voyage's Embeddings with Voyage as Platform
See issue #28 for planned support of other models and platforms.
Chain & Messages
The core feature of LLM Chain is to interact with language models via messages. This interaction is done by sending a MessageBag to a Chain, which takes care of LLM invocation and response handling.
Messages can be of different types, most importantly UserMessage, SystemMessage, or AssistantMessage, and can also have different content types, like Text or Image.
Example Chain call with messages
use PhpLlm\LlmChain\Chain;
use PhpLlm\LlmChain\Message\MessageBag;
use PhpLlm\LlmChain\Message\SystemMessage;
use PhpLlm\LlmChain\Message\UserMessage;

// LLM instantiation

$chain = new Chain($llm);
$messages = new MessageBag(
    new SystemMessage('You are a helpful chatbot answering questions about LLM Chain.'),
    new UserMessage('Hello, how are you?'),
);
$response = $chain->call($messages);

echo $response->getContent(); // "I'm fine, thank you. How can I help you today?"
The MessageInterface and Content interface help to customize this process if needed, e.g. for additional state handling.
Code Examples
- Anthropic's Claude: chat-claude-anthropic.php
- OpenAI's GPT with Azure: chat-gpt-azure.php
- OpenAI's GPT: chat-gpt-openai.php
- OpenAI's o1: chat-o1-openai.php
Tools
To integrate LLMs with your application, LLM Chain supports tool calling out of the box. Tools are services that can be called by the LLM to provide additional features or process data.
Tool calling can be enabled by registering the processors in the chain:
use PhpLlm\LlmChain\Chain;
use PhpLlm\LlmChain\ToolBox\ChainProcessor;
use PhpLlm\LlmChain\ToolBox\ToolAnalyzer;
use PhpLlm\LlmChain\ToolBox\ToolBox;

// Platform & LLM instantiation

$yourTool = new YourTool();

$toolBox = new ToolBox(new ToolAnalyzer(), [$yourTool]);
$toolProcessor = new ChainProcessor($toolBox);

$chain = new Chain($llm, inputProcessor: [$toolProcessor], outputProcessor: [$toolProcessor]);
Custom tools can basically be any class, but must be configured with the #[AsTool] attribute.
use PhpLlm\LlmChain\ToolBox\Attribute\AsTool;

#[AsTool('company_name', 'Provides the name of your company')]
final class CompanyName
{
    public function __invoke(): string
    {
        return 'ACME Corp.';
    }
}
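To illustrate how attribute-based configuration of this kind can be read in plain PHP, here is a minimal sketch using reflection. The ToolInfo attribute and the describeTool() helper are hypothetical stand-ins defined only for this sketch; they are not LLM Chain's AsTool or ToolAnalyzer, whose internals may differ.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for an attribute like #[AsTool]; not part of LLM Chain.
#[Attribute(Attribute::TARGET_CLASS)]
final class ToolInfo
{
    public function __construct(
        public readonly string $name,
        public readonly string $description,
    ) {
    }
}

#[ToolInfo('company_name', 'Provides the name of your company')]
final class CompanyName
{
    public function __invoke(): string
    {
        return 'ACME Corp.';
    }
}

/**
 * Reads the ToolInfo attribute of a tool object via reflection.
 *
 * @return array{name: string, description: string}
 */
function describeTool(object $tool): array
{
    $attributes = (new ReflectionClass($tool))->getAttributes(ToolInfo::class);

    if ([] === $attributes) {
        throw new InvalidArgumentException(sprintf('Class "%s" is not configured as a tool.', $tool::class));
    }

    // Instantiate the attribute to access its constructor arguments.
    $info = $attributes[0]->newInstance();

    return ['name' => $info->name, 'description' => $info->description];
}

$tool = new CompanyName();
$metadata = describeTool($tool);
$result = $tool(); // invoke the tool like a function
```

Name and description are what a language model would see when deciding whether to call the tool, so it is worth keeping the description precise.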
Code Examples (with built-in tools)
- Clock Tool: toolbox-clock.php
- SerpAPI Tool: toolbox-serpapi.php
- Weather Tool: toolbox-weather.php
- Wikipedia Tool: toolbox-wikipedia.php
- YouTube Transcriber Tool: toolbox-youtube.php (with streaming)
Document Embedding, Vector Stores & Similarity Search (RAG)
LLM Chain supports document embedding and similarity search using vector stores like ChromaDB, Azure AI Search, MongoDB Atlas Search, or Pinecone.
For populating a vector store, LLM Chain provides the service DocumentEmbedder, which requires an instance of an EmbeddingsModel and one of StoreInterface, and works with a collection of Document objects as input:
use PhpLlm\LlmChain\DocumentEmbedder;
use PhpLlm\LlmChain\OpenAI\Model\Embeddings;
use PhpLlm\LlmChain\OpenAI\Platform\OpenAI;
use PhpLlm\LlmChain\Store\Pinecone\Store;
use Probots\Pinecone\Pinecone;
use Symfony\Component\HttpClient\HttpClient;

$embedder = new DocumentEmbedder(
    new Embeddings(new OpenAI(HttpClient::create(), $_ENV['OPENAI_API_KEY'])),
    new Store(Pinecone::client($_ENV['PINECONE_API_KEY'], $_ENV['PINECONE_HOST'])),
);
$embedder->embed($documents);
The collection of Document instances is usually created from text input of your domain entities:
use PhpLlm\LlmChain\Document\Metadata;
use PhpLlm\LlmChain\Document\TextDocument;

$documents = [];
foreach ($entities as $entity) {
    $documents[] = new TextDocument(
        id: $entity->getId(),                       // UUID instance
        content: $entity->toString(),               // Text representation of relevant data for embedding
        metadata: new Metadata($entity->toArray()), // Array representation of entity to be stored additionally
    );
}
Note
Not all data needs to be stored in the vector store; you could also hydrate the original data entry based on the ID or metadata after retrieval from the store.
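As a sketch of that hydration step, assume the similarity search returned only IDs and the original entries live in your own storage. The $entitiesById map and the hydrate() helper below are hypothetical and use plain arrays instead of LLM Chain or database classes.

```php
<?php

declare(strict_types=1);

/**
 * Maps IDs returned from a vector store back to full entries, keeping the ranking order.
 *
 * @param list<string>                        $retrievedIds IDs in ranking order
 * @param array<string, array<string, mixed>> $entitiesById original data, keyed by ID
 *
 * @return list<array<string, mixed>>
 */
function hydrate(array $retrievedIds, array $entitiesById): array
{
    $entities = [];
    foreach ($retrievedIds as $id) {
        // Skip IDs whose entry was deleted after it had been embedded.
        if (isset($entitiesById[$id])) {
            $entities[] = $entitiesById[$id];
        }
    }

    return $entities;
}

$entitiesById = [
    'a1' => ['title' => 'Installation'],
    'b2' => ['title' => 'Tools'],
    'c3' => ['title' => 'Streaming'],
];

// 'zz' simulates a stale ID that no longer exists in the primary storage.
$hydrated = hydrate(['c3', 'zz', 'a1'], $entitiesById);
```

Keeping only the ID in the vector store avoids duplicating data, at the cost of one additional lookup per retrieval.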
In the end, the chain is used in combination with a retrieval tool on top of the vector store, e.g. the built-in SimilaritySearch tool provided by the library:
use PhpLlm\LlmChain\Chain;
use PhpLlm\LlmChain\Message\Message;
use PhpLlm\LlmChain\Message\MessageBag;
use PhpLlm\LlmChain\OpenAI\Model\Gpt;
use PhpLlm\LlmChain\ToolBox\ChainProcessor;
use PhpLlm\LlmChain\ToolBox\Tool\SimilaritySearch;
use PhpLlm\LlmChain\ToolBox\ToolAnalyzer;
use PhpLlm\LlmChain\ToolBox\ToolBox;

// Initialize Platform, LLM, embeddings model and store

$similaritySearch = new SimilaritySearch($embeddings, $store);
$toolBox = new ToolBox(new ToolAnalyzer(), [$similaritySearch]);
$processor = new ChainProcessor($toolBox);
$chain = new Chain(new Gpt($platform), [$processor], [$processor]);

$messages = new MessageBag(
    Message::forSystem(<<<PROMPT
        Please answer all user questions only using the similarity_search tool. Do not add information and if you cannot
        find an answer, say so.
        PROMPT),
    Message::ofUser('...') // The user's question.
);
$response = $chain->call($messages);
Code Examples
- MongoDB Store: store-mongodb-similarity-search.php
- Pinecone Store: store-pinecone-similarity-search.php
Supported Stores
- ChromaDB (requires codewithkyrian/chromadb-php as additional dependency)
- Azure AI Search
- MongoDB Atlas Search (requires mongodb/mongodb as additional dependency)
- Pinecone (requires probots-io/pinecone-php as additional dependency)
See issue #28 for planned support of other models and platforms.
Advanced Usage & Features
Structured Output
A typical use-case of LLMs is to classify and extract data from unstructured sources, which is supported by some models by features like Structured Output or providing a Response Format.
PHP Classes as Output
LLM Chain supports that use case by abstracting the hassle of defining and providing schemas to the LLM and converting the response back to PHP objects.
To achieve this, a specific chain processor needs to be registered:
use PhpLlm\LlmChain\Chain;
use PhpLlm\LlmChain\Message\Message;
use PhpLlm\LlmChain\Message\MessageBag;
use PhpLlm\LlmChain\StructuredOutput\ChainProcessor;
use PhpLlm\LlmChain\StructuredOutput\ResponseFormatFactory;
use PhpLlm\LlmChain\Tests\StructuredOutput\Data\MathReasoning;
use Symfony\Component\Serializer\Encoder\JsonEncoder;
use Symfony\Component\Serializer\Normalizer\ObjectNormalizer;
use Symfony\Component\Serializer\Serializer;

// Initialize Platform and LLM

$serializer = new Serializer([new ObjectNormalizer()], [new JsonEncoder()]);
$processor = new ChainProcessor(new ResponseFormatFactory(), $serializer);
$chain = new Chain($llm, [$processor], [$processor]);

$messages = new MessageBag(
    Message::forSystem('You are a helpful math tutor. Guide the user through the solution step by step.'),
    Message::ofUser('how can I solve 8x + 7 = -23'),
);
$response = $chain->call($messages, ['output_structure' => MathReasoning::class]);

dump($response->getContent()); // returns an instance of `MathReasoning` class
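The MathReasoning class itself is not shown above. As an illustration only, an output class for this prompt could be a plain PHP object with typed public properties. The property names steps, explanation, output, and finalAnswer are assumptions made for this sketch and do not necessarily match the library's test fixture; the manual hydration at the end merely stands in for what a serializer would do.

```php
<?php

declare(strict_types=1);

// Hypothetical output classes; names and properties are assumptions for this sketch.
final class Step
{
    public function __construct(
        public string $explanation,
        public string $output,
    ) {
    }
}

final class MathReasoning
{
    /**
     * @param list<Step> $steps
     */
    public function __construct(
        public array $steps,
        public string $finalAnswer,
    ) {
    }
}

// JSON as a model might return it for a schema derived from the classes above.
$json = '{"steps":[{"explanation":"Subtract 7 from both sides.","output":"8x = -30"},'
    .'{"explanation":"Divide both sides by 8.","output":"x = -3.75"}],"finalAnswer":"x = -3.75"}';

$data = json_decode($json, true, flags: JSON_THROW_ON_ERROR);

// Manual hydration keeps the sketch free of dependencies.
$reasoning = new MathReasoning(
    array_map(static fn (array $step): Step => new Step($step['explanation'], $step['output']), $data['steps']),
    $data['finalAnswer'],
);
```

Typed properties matter here because the schema sent to the model has to be derived from the class definition.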
Array Structures as Output
PHP array structures are also supported as response_format, which likewise requires the chain processor mentioned above:
use PhpLlm\LlmChain\Message\Message;
use PhpLlm\LlmChain\Message\MessageBag;

// Initialize Platform, LLM and Chain with processors and Clock tool

$messages = new MessageBag(Message::ofUser('What date and time is it?'));
$response = $chain->call($messages, ['response_format' => [
    'type' => 'json_schema',
    'json_schema' => [
        'name' => 'clock',
        'strict' => true,
        'schema' => [
            'type' => 'object',
            'properties' => [
                'date' => ['type' => 'string', 'description' => 'The current date in the format YYYY-MM-DD.'],
                'time' => ['type' => 'string', 'description' => 'The current time in the format HH:MM:SS.'],
            ],
            'required' => ['date', 'time'],
            'additionalProperties' => false,
        ],
    ],
]]);

dump($response->getContent()); // returns an array
Code Examples
- Structured Output (PHP class): structured-output-math.php
- Structured Output (array): structured-output-clock.php
Tool Parameters
LLM Chain generates a JSON Schema representation for all tools in the ToolBox based on the #[AsTool] attribute, the method arguments, and the doc block. Additionally, JSON Schema supports validation rules, which are partially supported by LLMs like GPT.
To leverage this, configure the #[ToolParameter] attribute on the method arguments of your tool:
use PhpLlm\LlmChain\ToolBox\Attribute\AsTool;
use PhpLlm\LlmChain\ToolBox\Attribute\ToolParameter;

#[AsTool('my_tool', 'Example tool with parameters requirements.')]
final class MyTool
{
    /**
     * @param string $name   The name of an object
     * @param int    $number The number of an object
     */
    public function __invoke(
        #[ToolParameter(pattern: '/([a-z0-1]){5}/')]
        string $name,
        #[ToolParameter(minimum: 0, maximum: 10)]
        int $number,
    ): string {
        // ...
    }
}
Note
Please be aware that this is only converted into a JSON Schema for the LLM to respect, but not validated by LLM Chain.
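Because the constraints are not enforced by LLM Chain, a tool may want to re-check its arguments itself. The following is a minimal sketch in plain PHP that mirrors the pattern and range from the example above; the assertToolArguments() helper is hypothetical and not part of the library.

```php
<?php

declare(strict_types=1);

/**
 * Re-checks the constraints that were only advertised to the LLM via JSON Schema.
 *
 * @throws InvalidArgumentException if an argument violates its constraint
 */
function assertToolArguments(string $name, int $number): void
{
    // Same pattern as in the ToolParameter attribute of the example tool.
    if (1 !== preg_match('/([a-z0-1]){5}/', $name)) {
        throw new InvalidArgumentException(sprintf('Name "%s" does not match the expected pattern.', $name));
    }

    // Same bounds as minimum and maximum in the ToolParameter attribute.
    if ($number < 0 || $number > 10) {
        throw new InvalidArgumentException(sprintf('Number %d is outside of the range 0 to 10.', $number));
    }
}

assertToolArguments('abc01', 5); // valid arguments pass silently

try {
    assertToolArguments('abc01', 42);
    $error = null;
} catch (InvalidArgumentException $e) {
    $error = $e->getMessage();
}
```

Calling such a check at the start of __invoke() would turn a schema violation by the model into an explicit error instead of silently processing bad input.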
Response Streaming
Since LLMs usually generate a response word by word, most of them also support streaming the response using Server-Sent Events. LLM Chain supports that by abstracting the conversion and returning a Generator as content of the response.
use PhpLlm\LlmChain\Chain;
use PhpLlm\LlmChain\Message\Message;
use PhpLlm\LlmChain\Message\MessageBag;

// Initialize Platform and LLM

$chain = new Chain($llm);
$messages = new MessageBag(
    Message::forSystem('You are a thoughtful philosopher.'),
    Message::ofUser('What is the purpose of an ant?'),
);
$response = $chain->call($messages, [
    'stream' => true, // enable streaming of response text
]);

foreach ($response->getContent() as $word) {
    echo $word;
}
In a terminal application this generator can be used directly, but with a web app an additional layer like Mercure needs to be used.
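The consuming side can be tried out without an API call. In the sketch below, the hypothetical fakeStream() generator stands in for the content of a streamed response; the loop prints each chunk as it arrives and also collects the full text, e.g. for storing it afterwards.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for the Generator returned as content of a streamed response.
 *
 * @return Generator<int, string>
 */
function fakeStream(): Generator
{
    foreach (['An ', 'ant ', 'serves ', 'its ', 'colony.'] as $chunk) {
        yield $chunk;
    }
}

$fullText = '';
foreach (fakeStream() as $chunk) {
    echo $chunk;         // forward the chunk immediately
    $fullText .= $chunk; // keep a copy, since a generator can only be iterated once
}
echo PHP_EOL;
```

Collecting the chunks while forwarding them is useful when the complete answer should be added to the message history after streaming has finished.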
Code Examples
- Streaming Claude: stream-claude-anthropic.php
- Streaming GPT: stream-gpt-openai.php
Image Processing
Some LLMs also support images as input, which LLM Chain supports as Content type within the UserMessage:
use PhpLlm\LlmChain\Message\Content\Image;
use PhpLlm\LlmChain\Message\Message;
use PhpLlm\LlmChain\Message\MessageBag;

// Initialize Platform, LLM & Chain

$messages = new MessageBag(
    Message::forSystem('You are an image analyzer bot that helps identify the content of images.'),
    Message::ofUser(
        'Describe the image as a comedian would do it.',
        new Image(dirname(__DIR__).'/tests/Fixture/image.png'), // Path to an image file
        new Image('https://foo.com/bar.png'), // URL to an image
        new Image('data:image/png;base64,...'), // Data URL of an image
    ),
);
$response = $chain->call($messages);
Code Examples
- Image Description: image-describer-binary.php (with binary file)
- Image Description: image-describer-url.php (with URL)
Embeddings
Creating embeddings of words, sentences, or paragraphs is a typical use case around the interaction with LLMs, and therefore LLM Chain implements an EmbeddingsModel interface with various models, see above.
The standalone usage results in a Vector instance:
use PhpLlm\LlmChain\OpenAI\Model\Embeddings;
use PhpLlm\LlmChain\OpenAI\Model\Embeddings\Version;

// Initialize Platform

$embeddings = new Embeddings($platform, Version::textEmbedding3Small());

$vector = $embeddings->create($textInput);

dump($vector->getData()); // Array of float values
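Float arrays like these are what a similarity search compares. As an illustration, cosine similarity between two vectors can be computed in plain PHP as follows. The cosineSimilarity() helper is a hypothetical example and not part of LLM Chain; vector stores typically perform this comparison for you.

```php
<?php

declare(strict_types=1);

/**
 * Computes the cosine of the angle between two vectors of equal length.
 *
 * @param list<float> $a
 * @param list<float> $b
 */
function cosineSimilarity(array $a, array $b): float
{
    if (count($a) !== count($b) || [] === $a) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = $normA = $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    if (0.0 === $normA || 0.0 === $normB) {
        throw new InvalidArgumentException('Vectors must not be zero vectors.');
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

$same = cosineSimilarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]); // vectors pointing in the same direction
$orthogonal = cosineSimilarity([1.0, 0.0], [0.0, 1.0]);     // perpendicular vectors
```

Values close to 1 indicate texts with similar meaning, while values close to 0 indicate unrelated ones.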
Code Examples
- OpenAI's Embeddings: embeddings-openai.php
- Voyage's Embeddings: embeddings-voyage.php
Input & Output Processing
The behavior of the Chain is extendable with services that implement the InputProcessor and/or OutputProcessor interface. They are provided while instantiating the Chain instance:
use PhpLlm\LlmChain\Chain;

// Initialize LLM and processors

$chain = new Chain($llm, $inputProcessors, $outputProcessors);
InputProcessor
InputProcessor instances are called in the chain before handing over the MessageBag and the $options array to the LLM, and are able to mutate both on top of the Input instance provided.
use PhpLlm\LlmChain\Chain\Input;
use PhpLlm\LlmChain\Chain\InputProcessor;
use PhpLlm\LlmChain\Message\AssistantMessage;

final class MyProcessor implements InputProcessor
{
    public function __construct(
        private readonly string $locale,
    ) {
    }

    public function processInput(Input $input): void
    {
        // mutate options
        $options = $input->getOptions();
        $options['foo'] = 'bar';
        $input->setOptions($options);

        // mutate MessageBag
        $input->messages->append(new AssistantMessage(sprintf('Please answer using the locale %s', $this->locale)));
    }
}
OutputProcessor
OutputProcessor instances are called after the LLM provided a response and can, on top of options and messages, mutate or replace the given response:
use PhpLlm\LlmChain\Chain\Output;
use PhpLlm\LlmChain\Chain\OutputProcessor;
use PhpLlm\LlmChain\Response\TextResponse; // namespace assumed; adjust to your installed version

final class MyProcessor implements OutputProcessor
{
    private const STOP_WORD = '...'; // marker to look for in the response

    public function processOutput(Output $output): void
    {
        // mutate response
        if (str_contains($output->response->getContent(), self::STOP_WORD)) {
            $output->response = new TextResponse('Sorry, we were unable to find relevant information.');
        }
    }
}
Chain Awareness
Both Input and Output instances provide access to the LLM used by the Chain, but the chain itself is only provided if the processor implements the ChainAwareProcessor interface, which can be combined with the ChainAwareTrait:
use PhpLlm\LlmChain\Chain\ChainAwareProcessor;
use PhpLlm\LlmChain\Chain\ChainAwareTrait;
use PhpLlm\LlmChain\Chain\Output;
use PhpLlm\LlmChain\Chain\OutputProcessor;

final class MyProcessor implements OutputProcessor, ChainAwareProcessor
{
    use ChainAwareTrait;

    public function processOutput(Output $output): void
    {
        // additional chain interaction
        $response = $this->chain->call(...);
    }
}
Contributions
Contributions are always welcome, so feel free to join the development of this library.