padosoft / laravel-ai-search-providers
Plug-and-play Laravel package exposing a single contract over Brave Search, Tavily, Exa.ai, Firecrawl, WebSearchAPI.ai and DuckDuckGo — the search/extraction backbone for AI agents and product/catalog tooling.
Package info
github.com/padosoft/laravel-ai-search-providers
pkg:composer/padosoft/laravel-ai-search-providers
Requires
- php: ^8.3
- ext-dom: *
- ext-libxml: *
- illuminate/contracts: ^11.0 || ^12.0 || ^13.0
- illuminate/database: ^11.0 || ^12.0 || ^13.0
- illuminate/http: ^11.0 || ^12.0 || ^13.0
- illuminate/support: ^11.0 || ^12.0 || ^13.0
Requires (Dev)
- orchestra/testbench: ^9.0 || ^10.0 || ^11.0
- phpunit/phpunit: ^11.0 || ^12.0
Last update: 2026-05-23 16:43:25 UTC
README
One Laravel-native contract over Brave, Tavily, Exa.ai, Firecrawl, WebSearchAPI.ai and DuckDuckGo. Plug any AI-friendly search API into your app in three commands. Swap providers with a config row. Test offline with the bundled fake provider, then go live with Http::fake-driven unit tests and an opt-in real-API E2E suite.
7 providers. 1 interface. 0 boilerplate.
Table of Contents
- Why this package
- Features
- Supported providers
- Quick Start (5 minutes, junior-friendly)
- Per-provider setup
- Architecture
- Configuration reference
- Extending: add a custom driver
- Testing
- Backward compatibility tips for host apps
- Roadmap
- Contributing
- Credits
- License
Why this package
Modern AI agents and product/catalog/price-comparison tooling all need the same primitive: search the web, parse the results, hand them to the next stage. Every API does it differently — Brave returns one shape, Tavily another, Exa flattens images inside extras, Firecrawl uses data.images[], DuckDuckGo has no API at all. Re-implementing that plumbing in every Laravel project is repetitive, brittle, and impossible to test offline.
This package gives you:
- One interface (SearchProviderInterface): searchImages() and searchWeb(), period.
- One manager (SearchProviderManager): picks the right provider by priority, falls back when one fails, logs every attempt, never leaks API keys.
- One config row: drop a row in search_providers, set is_active=true, you're done. Switch providers from staging to production with a SQL update.
It's the search/extraction backbone the padosoft/product-image-discovery catalog pipeline runs in production, extracted and hardened for everyone else.
Features
- 🔌 7 providers out of the box: Brave, Tavily, Exa.ai, Firecrawl, WebSearchAPI.ai, DuckDuckGo (no key), and a deterministic Fake for tests.
- 🧪 Test-first: every driver is unit-tested via Http::fake. Live E2E suite is opt-in and self-skips when API keys are absent.
- 🛡️ Secrets-safe by default: api_key/api_secret are encrypted at rest, redacted from logs and execution metadata, never exposed in toSafeArray().
- 🔄 Priority + fallback orchestration: list multiple providers, set priority, the manager tries them in order and falls back on failure or empty results.
- 🧱 Pluggable model + table: the package ships a search_providers table; host apps can override via one config key without subclassing.
- 🪶 Zero-config Quick Start: composer require + migrate + activate the fake provider = working pipeline in 5 minutes, no API keys.
- 📦 Laravel-native: auto-discovery, publishable config + migrations, loadMigrationsFrom so junior devs don't have to copy files.
- 🤖 Built for agents: SearchEventLoggerInterface lets you stream every provider attempt into your own audit/observability layer.
- 🦆 Free fallback included: DuckDuckGo HTML lite parser ships with the package; no key, anti-bot-aware live test self-skips in CI.
- 🇮🇹 EU-friendly: Apache-2.0 license, runs on Laravel 11/12/13 with PHP 8.3+, no proprietary lock-in.
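The redaction idea behind the secrets-safe bullet can be sketched in plain PHP. This is only an illustration under stated assumptions: the function name redactSecrets() and the `[REDACTED]` placeholder are hypothetical and are not taken from the package; only the api_key / api_secret / api_key_encrypted names come from this README.

```php
<?php

declare(strict_types=1);

/**
 * Recursively mask secret-looking values before a context array is logged.
 * Hypothetical helper for illustration only, not part of the package.
 *
 * @param array<mixed> $context
 * @param list<string> $secretKeys lower-case key names to mask
 * @return array<mixed>
 */
function redactSecrets(
    array $context,
    array $secretKeys = ['api_key', 'api_secret', 'api_key_encrypted'],
): array {
    $redacted = [];

    foreach ($context as $key => $value) {
        if (is_string($key) && in_array(strtolower($key), $secretKeys, true)) {
            // The value is replaced, the key is kept so the log stays readable.
            $redacted[$key] = '[REDACTED]';
        } elseif (is_array($value)) {
            // Nested arrays (request metadata, headers) are scrubbed as well.
            $redacted[$key] = redactSecrets($value, $secretKeys);
        } else {
            $redacted[$key] = $value;
        }
    }

    return $redacted;
}

$safe = redactSecrets([
    'provider' => 'brave',
    'api_key' => 'secret-value',
    'request' => ['api_secret' => 'another-secret', 'query' => 'nike air force 1'],
]);
```

Masking by key name rather than by value keeps the helper independent of how the secret is stored, at the cost of missing secrets that travel under an unexpected key.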
Supported providers
Out of the box the package ships 7 search providers ready to plug in:
| Provider | Driver | Image search | Site filter | Free tier | Docs |
|---|---|---|---|---|---|
| Fake (deterministic) | fake | ✅ | ✅ | n/a | bundled |
| Brave Search | brave | ✅ | ✅ | 2 000 queries / month | https://api-dashboard.search.brave.com/app/documentation |
| Tavily | tavily | ✅ | ✅ (include_domains) | 1 000 credits / month | https://docs.tavily.com |
| Exa.ai | exa | ✅ (extras.imageLinks) | ✅ (includeDomains) | trial credits | https://exa.ai/docs |
| Firecrawl | firecrawl | ✅ (sources:[{type:"images"}]) | ✅ (includeDomains) | 500 credits / month | https://docs.firecrawl.dev |
| WebSearchAPI.ai | websearchapi | ❌ (web-only) | ✅ (includeDomains) | trial credits | https://websearchapi.ai/docs/search-api |
| DuckDuckGo (HTML lite) | duckduckgo | ❌ (web-only) | ✅ (site:) | no key | https://duckduckgo.com/html/ |
SearchProviderManager automatically skips drivers whose supportsImageSearch() returns false when you call searchImages(), so mixing image-capable and web-only drivers in the same priority list is safe.
Quick Start (5 minutes, junior-friendly)
Prerequisites: PHP 8.3+, Composer, a Laravel 11/12/13 app, ~5 minutes. No API keys, no Redis, no queue worker.
The Quick Start uses the bundled fake driver. You write zero code beyond the snippets below.
1. Install the package
composer require padosoft/laravel-ai-search-providers
2. (Optional) Publish config + migrations
The package's service provider auto-loads the migration, so this is only needed if you want to customize the schema or config defaults.
php artisan vendor:publish --tag=ai-search-providers-config
php artisan vendor:publish --tag=ai-search-providers-migrations
3. Run migrations
php artisan migrate
You now have a search_providers table.
4. Insert a fake provider row
In php artisan tinker:
\Padosoft\LaravelAiSearchProviders\Models\SearchProviderConfig::query()->create([
    'code' => 'quickstart-fake',
    'name' => 'Quickstart Fake',
    'driver' => 'fake',
    'config' => [
        'image_results' => [[
            'title' => 'Quick Start Demo',
            'page_url' => 'https://example.test/p/demo',
            'image_url' => 'https://cdn.example.test/demo.jpg',
            'source_domain' => 'example.test',
            'width' => 1200,
            'height' => 1200,
        ]],
    ],
    'priority' => 1,
    'timeout_seconds' => 5,
    'is_active' => true,
]);
5. Run a search
Anywhere in your app (controller, console command, tinker):
use Padosoft\LaravelAiSearchProviders\Data\SearchQueryData;
use Padosoft\LaravelAiSearchProviders\SearchProviderManager;

$manager = app(SearchProviderManager::class);

$execution = $manager->searchImages(SearchQueryData::fromArray([
    'brand' => 'Nike',
    'model' => 'Air Force 1 07',
    'color' => 'White',
    'site' => 'nike.com',
    'limit' => 5,
]));

dump($execution->provider?->code);         // "quickstart-fake"
dump($execution->results->count());        // 1
dump($execution->results->first()->title); // "Quick Start Demo"
✅ Done. You just plugged in your first AI search provider. Now swap the row's driver to tavily/brave/exa/firecrawl/websearchapi/duckduckgo, paste the API key into api_key_encrypted, and the same code runs live. See Per-provider setup for each driver's env vars.
Per-provider setup
Every live driver shares the same activation flow:
- Set the API key in .env.
- Insert a SearchProviderConfig row (tinker snippet below).
- The next SearchProviderManager::searchImages() / ::searchWeb() call uses it.
Below are the driver-specific details.
Brave Search
BRAVE_SEARCH_API_KEY=your-key
\Padosoft\LaravelAiSearchProviders\Models\SearchProviderConfig::query()->create([
    'code' => 'brave',
    'name' => 'Brave Search',
    'driver' => 'brave',
    'base_url' => 'https://api.search.brave.com',
    'api_key_encrypted' => env('BRAVE_SEARCH_API_KEY'),
    'priority' => 10,
    'timeout_seconds' => 15,
    'is_active' => true,
]);
Tavily
TAVILY_API_KEY=your-key
\Padosoft\LaravelAiSearchProviders\Models\SearchProviderConfig::query()->create([
    'code' => 'tavily',
    'name' => 'Tavily',
    'driver' => 'tavily',
    'base_url' => 'https://api.tavily.com',
    'api_key_encrypted' => env('TAVILY_API_KEY'),
    'config' => ['search_depth' => 'basic'],
    'priority' => 20,
    'timeout_seconds' => 20,
    'is_active' => true,
]);
Tavily returns images[] either as a string[] (legacy) or as {url, description?}[] (current); the driver normalizes both.
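As a rough illustration of what normalizing both shapes can look like, here is a minimal plain-PHP sketch. The function name normalizeTavilyImages() and the output shape are hypothetical; the package's real driver may handle this differently.

```php
<?php

declare(strict_types=1);

/**
 * Normalize Tavily-style images[] into a uniform list of url/description pairs.
 * Hypothetical helper for illustration; not the package's actual code.
 *
 * @param array<int, mixed> $images
 * @return list<array{url: string, description: ?string}>
 */
function normalizeTavilyImages(array $images): array
{
    $normalized = [];

    foreach ($images as $image) {
        if (is_string($image) && $image !== '') {
            // Legacy shape: a bare URL string.
            $normalized[] = ['url' => $image, 'description' => null];
        } elseif (is_array($image) && is_string($image['url'] ?? null) && $image['url'] !== '') {
            // Current shape: an object with url and an optional description.
            $description = $image['description'] ?? null;
            $normalized[] = [
                'url' => $image['url'],
                'description' => is_string($description) ? $description : null,
            ];
        }
        // Anything else (null, empty string, malformed entry) is dropped.
    }

    return $normalized;
}

$legacy = normalizeTavilyImages(['https://cdn.example.test/a.jpg']);
$current = normalizeTavilyImages([
    ['url' => 'https://cdn.example.test/b.jpg', 'description' => 'Side view'],
    ['url' => 'https://cdn.example.test/c.jpg'],
]);
```

Both calls return the same row shape, so downstream code never needs to know which response format the API used.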
Exa.ai
EXA_API_KEY=your-key
\Padosoft\LaravelAiSearchProviders\Models\SearchProviderConfig::query()->create([
    'code' => 'exa',
    'name' => 'Exa.ai',
    'driver' => 'exa',
    'base_url' => 'https://api.exa.ai',
    'api_key_encrypted' => env('EXA_API_KEY'),
    'config' => ['search_type' => 'auto', 'image_links_per_result' => 5],
    'priority' => 30,
    'timeout_seconds' => 20,
    'is_active' => true,
]);
Each Exa result can carry up to image_links_per_result images in extras.imageLinks. The driver flattens them 1:N and emits one candidate per image URL, deduped against the primary image.
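The 1:N flattening with dedupe can be sketched as follows. This is an assumption-laden illustration, not the driver's code: the function name flattenExaResult() is hypothetical, and the payload field names image, url and title are assumed; only extras.imageLinks and image_links_per_result come from this README.

```php
<?php

declare(strict_types=1);

/**
 * Flatten one Exa-style result into image candidates, one per distinct image URL.
 * Hypothetical helper; 'image', 'url' and 'title' are assumed payload fields.
 *
 * @param array<string, mixed> $result
 * @return list<array{title: string, page_url: ?string, image_url: string}>
 */
function flattenExaResult(array $result, int $imageLinksPerResult = 5): array
{
    $primary = is_string($result['image'] ?? null) ? $result['image'] : null;

    $extras = $result['extras']['imageLinks'] ?? [];
    $extras = is_array($extras) ? array_slice($extras, 0, $imageLinksPerResult) : [];

    // Start with the primary image so extras are deduped against it.
    $urls = $primary !== null ? [$primary] : [];
    foreach ($extras as $link) {
        if (is_string($link) && $link !== '' && !in_array($link, $urls, true)) {
            $urls[] = $link;
        }
    }

    $title = is_string($result['title'] ?? null) ? $result['title'] : 'Untitled';
    $pageUrl = is_string($result['url'] ?? null) ? $result['url'] : null;

    return array_map(static fn (string $url): array => [
        'title' => $title,
        'page_url' => $pageUrl,
        'image_url' => $url,
    ], $urls);
}

$candidates = flattenExaResult([
    'title' => 'Nike Air Force 1',
    'url' => 'https://example.test/p/af1',
    'image' => 'https://cdn.example.test/af1.jpg',
    'extras' => ['imageLinks' => [
        'https://cdn.example.test/af1.jpg',      // duplicate of the primary image
        'https://cdn.example.test/af1-side.jpg',
    ]],
]);
```

Here one result with a primary image and two extra links yields two candidates, because the first extra link repeats the primary image.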
Firecrawl
FIRECRAWL_API_KEY=your-key
\Padosoft\LaravelAiSearchProviders\Models\SearchProviderConfig::query()->create([
    'code' => 'firecrawl',
    'name' => 'Firecrawl',
    'driver' => 'firecrawl',
    'base_url' => 'https://api.firecrawl.dev',
    'api_key_encrypted' => env('FIRECRAWL_API_KEY'),
    'priority' => 40,
    'timeout_seconds' => 60, // /v2/search is synchronous; 20–40 s typical on free tier
    'rate_limit_per_minute' => 30,
    'is_active' => true,
]);
WebSearchAPI.ai
WEBSEARCHAPI_API_KEY=your-key
\Padosoft\LaravelAiSearchProviders\Models\SearchProviderConfig::query()->create([
    'code' => 'websearchapi',
    'name' => 'WebSearchAPI.ai',
    'driver' => 'websearchapi',
    'base_url' => 'https://api.websearchapi.ai',
    'api_key_encrypted' => env('WEBSEARCHAPI_API_KEY'),
    'priority' => 50,
    'timeout_seconds' => 30,
    'is_active' => true,
]);
WebSearchAPI exposes only Google-backed organic web results, so the driver reports supportsImageSearch() === false and the manager skips it for image queries automatically. Use it as a primary web driver or as a fallback.
DuckDuckGo (HTML lite)
No key required. The driver POSTs to https://html.duckduckgo.com/html/, parses the response with DOMDocument + DOMXPath, and decodes the //duckduckgo.com/l/?uddg=... redirect links transparently.
# Optional override (defaults to https://html.duckduckgo.com):
DUCKDUCKGO_URL=https://html.duckduckgo.com
\Padosoft\LaravelAiSearchProviders\Models\SearchProviderConfig::query()->create([
    'code' => 'duckduckgo',
    'name' => 'DuckDuckGo (HTML lite)',
    'driver' => 'duckduckgo',
    'base_url' => 'https://html.duckduckgo.com',
    'priority' => 60,
    'timeout_seconds' => 20,
    'rate_limit_per_minute' => 20, // be polite, shared anti-bot infra
    'is_active' => true,
]);
Caveats:
- Web search only (supportsImageSearch() === false).
- DuckDuckGo applies anti-bot rate limits to shared/datacenter IPs. Use sparingly. The bundled live test self-skips in CI (CI=true) and on 403/429/503 responses.
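The redirect decoding mentioned at the top of this section (//duckduckgo.com/l/?uddg=...) can be sketched with PHP's built-in URL functions. The function name decodeDuckDuckGoRedirect() is hypothetical and the driver's actual implementation may differ; the sketch only assumes that the target URL travels URL-encoded in the uddg query parameter, as the link format above suggests.

```php
<?php

declare(strict_types=1);

/**
 * Resolve a DuckDuckGo HTML-lite redirect link to its target URL.
 * Hypothetical helper; returns the input unchanged when it is not a redirect.
 */
function decodeDuckDuckGoRedirect(string $href): string
{
    // Result links are protocol-relative ("//duckduckgo.com/l/?..."), so add a scheme first.
    $absolute = str_starts_with($href, '//') ? 'https:' . $href : $href;

    $parts = parse_url($absolute);
    if (!is_array($parts) || !isset($parts['host'], $parts['query'])) {
        return $href;
    }

    $host = strtolower($parts['host']);
    $isDuckDuckGo = $host === 'duckduckgo.com' || str_ends_with($host, '.duckduckgo.com');
    if (!$isDuckDuckGo || ($parts['path'] ?? '') !== '/l/') {
        return $href;
    }

    // parse_str() URL-decodes values, so uddg comes back as a plain URL.
    parse_str($parts['query'], $params);
    $target = $params['uddg'] ?? null;

    return is_string($target) && $target !== '' ? $target : $href;
}

$decoded = decodeDuckDuckGoRedirect(
    '//duckduckgo.com/l/?uddg=https%3A%2F%2Fexample.test%2Fp%2Fdemo%3Fref%3D1&rut=abc'
);
$untouched = decodeDuckDuckGoRedirect('https://example.test/direct');
```

Links that are not DuckDuckGo redirects pass through unchanged, so the helper can be applied to every href extracted from the result page.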
Fake provider
Deterministic, no network. Configure via the config JSON column:
\Padosoft\LaravelAiSearchProviders\Models\SearchProviderConfig::query()->create([
    'code' => 'tests',
    'name' => 'Fake (tests)',
    'driver' => 'fake',
    'config' => [
        'image_results' => [/* array of result rows */],
        'web_results' => [/* array of result rows */],
        'throw' => false,        // set true to simulate failure
        'throw_for' => ['web'],  // or fail just one method
        'supports_image_search' => true,
        'supports_site_filter' => true,
    ],
    'priority' => 1,
    'is_active' => true,
]);
Perfect for feature tests, smoke tests, and the Quick Start.
Architecture
flowchart LR
A[Your code] -->|SearchQueryData| B(SearchProviderManager)
B --> C{ConfigRepository}
C -->|active+ordered| D[SearchProviderDefinition list]
B -->|driver name| E{Factories registry}
E --> F[fake]
E --> G[brave]
E --> H[tavily]
E --> I[exa]
E --> J[firecrawl]
E --> K[websearchapi]
E --> L[duckduckgo]
F --> M[SearchResultCollection]
G --> M
H --> M
I --> M
J --> M
K --> M
L --> M
B -->|attempts + result| N[Your code]
B -.->|optional| O[SearchEventLoggerInterface]
Core moving pieces:
- SearchProviderManager: orchestrates a search call. Reads active providers from the repository, sorts by priority, tries each driver via its factory, falls back on failure, skips when a driver doesn't support the requested method, returns a SearchProviderExecutionResult with the full attempt log.
- SearchProviderConfigRepositoryInterface: pluggable backing store. Default EloquentSearchProviderConfigRepository reads rows from the search_providers table; you can swap it for an in-memory implementation in tests or a Redis/etcd-backed one in production.
- SearchProviderFactoryInterface + CallableSearchProviderFactory: closure-based factory pattern. Register your own driver by appending to config('ai-search-providers.factories').
- AbstractHttpSearchProvider: base class providing the shared HTTP / parsing helpers (pickUrl, dotGet, extractDomain, normalizeDomain, normalizeInt, normalizeFloat, applySiteFilter). New drivers add ~80 LOC of provider-specific code.
- SearchEventLoggerInterface: single hook (record($eventType, $context, $level)). Bind your audit logger and you'll get one event per provider attempt: success, failure, empty result, skipped.
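The manager's priority, skip, and fallback behavior can be sketched in plain PHP without any Laravel dependency. Everything here is illustrative: the function name searchImagesWithFallback(), the provider array shape, and the status strings are hypothetical, and the sketch assumes that lower priority numbers are tried first (consistent with the sample rows in this README, but not stated explicitly).

```php
<?php

declare(strict_types=1);

/**
 * Try providers in ascending priority order and return the first non-empty result set.
 * Plain-PHP illustration; not the package's SearchProviderManager.
 *
 * @param list<array{code: string, priority: int, supports_images: bool, search: callable}> $providers
 * @return array{provider: ?string, results: array<mixed>, attempts: list<array{code: string, status: string}>}
 */
function searchImagesWithFallback(array $providers, string $query): array
{
    // Assumption: a lower priority number means "try earlier".
    usort($providers, static fn (array $a, array $b): int => $a['priority'] <=> $b['priority']);

    $attempts = [];

    foreach ($providers as $provider) {
        if (!$provider['supports_images']) {
            // Web-only drivers are skipped for image queries.
            $attempts[] = ['code' => $provider['code'], 'status' => 'skipped'];
            continue;
        }

        try {
            $results = ($provider['search'])($query);
        } catch (Throwable) {
            // A failing driver is recorded and the next one is tried.
            $attempts[] = ['code' => $provider['code'], 'status' => 'failure'];
            continue;
        }

        if ($results === []) {
            $attempts[] = ['code' => $provider['code'], 'status' => 'empty'];
            continue;
        }

        $attempts[] = ['code' => $provider['code'], 'status' => 'success'];

        return ['provider' => $provider['code'], 'results' => $results, 'attempts' => $attempts];
    }

    return ['provider' => null, 'results' => [], 'attempts' => $attempts];
}

$execution = searchImagesWithFallback([
    ['code' => 'tavily', 'priority' => 20, 'supports_images' => true,
        'search' => static fn (string $q): array => [['image_url' => 'https://cdn.example.test/demo.jpg']]],
    ['code' => 'websearchapi', 'priority' => 5, 'supports_images' => false,
        'search' => static fn (string $q): array => []],
    ['code' => 'brave', 'priority' => 10, 'supports_images' => true,
        'search' => static function (string $q): array { throw new RuntimeException('HTTP 500'); }],
], 'nike air force 1');
```

In this run the web-only driver is skipped, the failing driver is recorded, and the third driver supplies the results, so the attempt log holds one entry per provider that was considered.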
Configuration reference
config/ai-search-providers.php (after vendor:publish):
| Key | Type | Default | Purpose |
|---|---|---|---|
| table | string | search_providers (or env('AI_SEARCH_PROVIDERS_TABLE')) | Eloquent table name read by SearchProviderConfig::getTable() and the create-table migration. |
| model | class-string\|null | null | When set, EloquentSearchProviderConfigRepository uses this class instead of the bundled SearchProviderConfig. Useful for host apps that want a thin subclass with extra columns or a custom table. |
| load_migrations | bool | true | Toggle loadMigrationsFrom() in the service provider. Disable when you fully manage the schema yourself. |
| factories | array<string, callable\|class-string\|SearchProviderFactoryInterface> | [] | Per-driver factory overrides, merged on top of the 7 bundled defaults. |
Plus the per-provider env vars listed under Per-provider setup.
Extending: add a custom driver
namespace App\Search;

use Padosoft\LaravelAiSearchProviders\Data\SearchQueryData;
use Padosoft\LaravelAiSearchProviders\Data\SearchResultCollection;
use Padosoft\LaravelAiSearchProviders\Providers\AbstractHttpSearchProvider;

final class SerperSearchProvider extends AbstractHttpSearchProvider
{
    public function searchImages(SearchQueryData $query): SearchResultCollection
    {
        $this->assertHttpClientAvailable();

        $payload = \Illuminate\Support\Facades\Http::baseUrl($this->definition->baseUrl ?? 'https://google.serper.dev')
            ->withHeaders(['X-API-KEY' => (string) $this->definition->apiKey])
            ->timeout($this->definition->timeoutSeconds)
            ->post('/images', ['q' => $this->applySiteFilter($query), 'num' => $query->limit])
            ->throw()
            ->json();

        return new SearchResultCollection(array_map(fn (array $hit): array => [
            'title' => $hit['title'] ?? 'Untitled',
            'page_url' => $hit['link'] ?? null,
            'image_url' => $hit['imageUrl'] ?? null,
            'thumbnail_url' => $hit['thumbnailUrl'] ?? null,
            'source_domain' => $this->extractDomain($hit['link'] ?? null),
        ], $payload['images'] ?? []));
    }

    public function searchWeb(SearchQueryData $query): SearchResultCollection
    {
        return new SearchResultCollection();
    }
}
Register it in config/ai-search-providers.php:
'factories' => [
    'serper' => static fn ($definition) => new \App\Search\SerperSearchProvider($definition),
],
Insert a SearchProviderConfig row with driver = 'serper' and your code is now reachable from SearchProviderManager. Done.
Testing
The package ships a full unit suite that does not touch the network:
vendor/bin/phpunit --testsuite Unit,Feature
The optional E2E suite under tests/E2E/ exercises every live driver against real APIs:
# Put your keys in .env first (see .env.example)
vendor/bin/phpunit --testsuite E2E
Without keys the live tests skip cleanly, so CI stays green even in PRs from contributors without credentials. The DuckDuckGo live test additionally skips on CI=true (shared runner IPs are anti-bot-throttled).
In your own app's tests, use the bundled fake provider (see Fake provider) or the test-helper repository Padosoft\LaravelAiSearchProviders\Tests\Support\InMemorySearchProviderConfigRepository to inject definitions without touching the database.
Backward compatibility tips for host apps
If you're extracting your own search layer onto this package (the way padosoft/product-image-discovery does):
- Keep your existing table: set config('ai-search-providers.table') to your legacy table name (e.g. product_image_search_providers). The bundled migration is Schema::hasTable()-guarded so it skips automatically.
- Keep your existing model: extend SearchProviderConfig with a thin subclass, hard-code protected $table = '...', and set config('ai-search-providers.model') to the FQCN. All your downstream Foo::with('searchProvider') calls keep working.
- Wire your audit logger: bind your existing event logger to SearchEventLoggerInterface in your service provider, and the manager emits one event per attempt into your audit trail.
Roadmap
- 🔁 Runtime enforcement of rate_limit_per_minute (currently advisory).
- 🧠 Built-in caching adapter (decorator over any driver).
- 🧬 Perceptual-hash dedupe utility for image results.
- 🪙 More drivers as the community asks: Serper, SearchAPI.io, Google Custom Search, You.com.
See docs/PROGRESS.md for in-flight work and docs/LESSON.md for design notes accumulated during the original extraction.
Contributing
Pull requests welcome. Before opening one:
- Run vendor/bin/phpunit --testsuite Unit,Feature locally and make sure it stays green.
- If you touched HTTP plumbing, drive it with Http::fake; don't add network-bound unit tests.
- New providers: ship the driver, ≥4 unit-test cases (parse OK, empty payload, HTTP 4xx, peculiarity), and an opt-in live E2E test under tests/E2E/.
- Update README's Supported providers matrix.
Credits
- Lorenzo Padovani — author and maintainer.
- The drivers were originally engineered inside padosoft/product-image-discovery v0.2.0 → v0.3.0 and extracted here as v1.0.0.
- Big thanks to Brave, Tavily, Exa.ai, Firecrawl, WebSearchAPI.ai and DuckDuckGo for the APIs this package wraps.
License
Apache-2.0. See LICENSE.
