larswiegers/laravel-ai-evaluation

Run AI/LLM evals for your AI features

github.com/LarsWiegers/laravel-ai-evaluation



Laravel AI Evaluation helps you run LLM output evaluations directly in your test suite using real model calls.

Installation

You can install the package via Composer:

composer require larswiegers/laravel-ai-evaluation

Usage

use App\Ai\Agents\SupportAgent;
use LaravelAIEvaluation\LaravelAIEvaluation\LaravelAIEvaluationFacade as AiEval;

AiEval::agent(SupportAgent::class)
    ->case('refund-policy')
    ->input('What is your refund policy?')
    ->expectContains(['refund', '30 days'])
    ->run()
    ->assertPasses();

You can also assert exact outputs:

AiEval::agent(SupportAgent::class)
    ->case('healthcheck')
    ->input('Reply with exactly: OK')
    ->expectExact('OK')
    ->run()
    ->assertPasses();

You can also evaluate with an LLM judge, using a rubric and a reference answer:

AiEval::agent(SupportAgent::class)
    ->input('What is your refund policy?')
    ->expectJudgeAgainst(
        reference: 'Refunds are available within 30 days of purchase.',
        criteria: 'The answer should be correct, concise, and mention the 30 day window.',
        threshold: 0.8,
        judge: App\Ai\Agents\JudgeAgent::class,
    )
    ->run()
    ->assertPasses();

You can configure one judge for the whole eval chain:

AiEval::agent(SupportAgent::class)
    ->input('What is your refund policy?')
    ->useJudge(App\Ai\Agents\JudgeAgent::class)
    ->expectJudge('The answer should be concise and mention the refund window.', threshold: 0.8)
    ->expectJudgeAgainst(
        reference: 'Refunds are available within 30 days of purchase.',
        criteria: 'The answer should be correct and complete.',
        threshold: 0.8,
    )
    ->run()
    ->assertPasses();

The package includes a default judge agent, so you can start immediately if Laravel AI is available. You can still override the default in config or pass one per expectation as shown above.
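
If you publish the package's config file, the default judge can be swapped there instead of per call. A minimal sketch of what such an override might look like; the config file name and key below are assumptions for illustration, not documented by this README:

```php
<?php

// config/ai-evaluation.php (hypothetical file name and key)
return [
    // Fully qualified class name of the judge agent used whenever
    // no judge is passed via useJudge() or an expectation's judge: argument.
    'judge' => App\Ai\Agents\JudgeAgent::class,
];
```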

Debug output and formats

EvalResult supports dump() and dd() in text and json formats:

$result = AiEval::agent(SupportAgent::class)
    ->input('What is your refund policy?')
    ->expectContains(['refund'])
    ->run();

$result->dump(); // text
$result->dump(format: 'json'); // JSON line

Verbose mode and the default output format are configurable via environment variables:

AI_EVAL_VERBOSE=true
AI_EVAL_FORMAT=text

Run summaries (passed / failed / token usage / cost) are also configurable:

AI_EVAL_SUMMARY=true
AI_EVAL_SUMMARY_FORMAT=json
AI_EVAL_SUMMARY_CURRENCY=USD

The recommended location for these eval tests is tests/AgentEvals.
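
In practice, an eval lives inside an ordinary test case so it runs with your suite. A sketch of what a file under tests/AgentEvals might look like; the test class and method names are illustrative, and any test-integration helpers the package may ship are not documented here:

```php
<?php

namespace Tests\AgentEvals;

use App\Ai\Agents\SupportAgent;
use LaravelAIEvaluation\LaravelAIEvaluation\LaravelAIEvaluationFacade as AiEval;
use Tests\TestCase;

class SupportAgentEvalTest extends TestCase
{
    public function test_refund_policy_answer_mentions_the_window(): void
    {
        // This performs a real model call, so consider keeping
        // eval tests out of your fast unit-test suite.
        AiEval::agent(SupportAgent::class)
            ->case('refund-policy')
            ->input('What is your refund policy?')
            ->expectContains(['refund', '30 days'])
            ->run()
            ->assertPasses();
    }
}
```

Keeping evals in their own directory makes it easy to run them separately, for example via a dedicated PHPUnit test suite entry or a filtered `composer test` invocation.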

Testing

composer test

Changelog

Please see CHANGELOG for more information on what has changed recently.

Contributing

Please see CONTRIBUTING for details.

Security

If you discover any security-related issues, please email test@test.com instead of using the issue tracker.

Credits

License

The MIT License (MIT). Please see License File for more information.