larswiegers / laravel-ai-evaluation
Run AI/LLM evals for your AI features
Package info
github.com/LarsWiegers/laravel-ai-evaluation
pkg:composer/larswiegers/laravel-ai-evaluation
Requires
- php: ^8.3|^8.4|^8.5
- illuminate/support: ^12.0|^13.0
- laravel/ai: ^0.5.0
Requires (Dev)
- orchestra/testbench: ^11.0
- pestphp/pest: ^4.5
- phpunit/phpunit: ^12.0
This package is auto-updated.
Last update: 2026-04-19 11:01:21 UTC
README
Laravel AI Evaluation helps you run LLM output evaluations directly in your test suite using real model calls.
Installation
You can install the package via composer:
```bash
composer require larswiegers/laravel-ai-evaluation
```
Usage
```php
use App\Ai\Agents\SupportAgent;
use LaravelAIEvaluation\LaravelAIEvaluation\LaravelAIEvaluationFacade as AiEval;

AiEval::agent(SupportAgent::class)
    ->case('refund-policy')
    ->input('What is your refund policy?')
    ->expectContains(['refund', '30 days'])
    ->run()
    ->assertPasses();
```
You can also assert exact outputs:
```php
AiEval::agent(SupportAgent::class)
    ->case('healthcheck')
    ->input('Reply with exactly: OK')
    ->expectExact('OK')
    ->run()
    ->assertPasses();
```
And evaluate with an LLM judge rubric + reference answer:
```php
AiEval::agent(SupportAgent::class)
    ->input('What is your refund policy?')
    ->expectJudgeAgainst(
        reference: 'Refunds are available within 30 days of purchase.',
        criteria: 'The answer should be correct, concise, and mention the 30 day window.',
        threshold: 0.8,
        judge: App\Ai\Agents\JudgeAgent::class,
    )
    ->run()
    ->assertPasses();
```
You can configure one judge for the whole eval chain:
```php
AiEval::agent(SupportAgent::class)
    ->input('What is your refund policy?')
    ->useJudge(App\Ai\Agents\JudgeAgent::class)
    ->expectJudge('The answer should be concise and mention the refund window.', threshold: 0.8)
    ->expectJudgeAgainst(
        reference: 'Refunds are available within 30 days of purchase.',
        criteria: 'The answer should be correct and complete.',
        threshold: 0.8,
    )
    ->run()
    ->assertPasses();
```
The package includes a default judge agent, so you can start immediately if Laravel AI is available. You can still override the default in config or pass one per expectation as shown above.
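As a hypothetical sketch of the config-based override (the config file name and keys below are assumptions for illustration, not documented by the package), swapping the default judge might look like:

```php
<?php

// config/ai-evaluation.php — hypothetical example; the actual file
// name and key names may differ in the published package config.
return [
    // Judge agent class used when no judge is passed per expectation.
    'judge' => App\Ai\Agents\JudgeAgent::class,
];
```

Check the package's published config file for the real key names before relying on this.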
Debug output and formats
EvalResult supports dump() and dd() in text and json formats:
```php
$result = AiEval::agent(SupportAgent::class)
    ->input('What is your refund policy?')
    ->expectContains(['refund'])
    ->run();

$result->dump();               // text
$result->dump(format: 'json'); // JSON line
```
Verbose mode and default output format are configurable:
```
AI_EVAL_VERBOSE=true
AI_EVAL_FORMAT=text
```
Run summaries (passed / failed / token usage / cost) are also configurable:
```
AI_EVAL_SUMMARY=true
AI_EVAL_SUMMARY_FORMAT=json
AI_EVAL_SUMMARY_CURRENCY=USD
```
The recommended location for these eval tests is `tests/AgentEvals`.
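For example, a minimal Pest test wrapping one of the evals above could live there (a sketch assuming only the fluent API shown in this README):

```php
<?php

// tests/AgentEvals/SupportAgentEvalTest.php
// Sketch only: uses the AiEval API exactly as shown in the examples above.

use App\Ai\Agents\SupportAgent;
use LaravelAIEvaluation\LaravelAIEvaluation\LaravelAIEvaluationFacade as AiEval;

it('mentions the refund window', function () {
    AiEval::agent(SupportAgent::class)
        ->case('refund-policy')
        ->input('What is your refund policy?')
        ->expectContains(['refund', '30 days'])
        ->run()
        ->assertPasses();
});
```

Because these tests make real model calls, you may want to keep them in their own directory (as above) so they can be run separately from your unit tests.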
Testing
```bash
composer test
```
Changelog
Please see CHANGELOG for more information on what has changed recently.
Contributing
Please see CONTRIBUTING for details.
Security
If you discover any security related issues, please email test@test.com instead of using the issue tracker.
Credits
- [Lars Wiegers](https://github.com/LarsWiegers)
- All Contributors
License
The MIT License (MIT). Please see License File for more information.