rkdhatterwal / decodo-scraper
A Laravel package for the Decodo Web Scraping API — real-time, async, webhooks, DB tracking, caching, and testing helpers
dev-master
2026-05-21 08:01 UTC
Requires
- php: ^8.2
- illuminate/cache: ^11.0|^12.0
- illuminate/console: ^11.0|^12.0
- illuminate/database: ^11.0|^12.0
- illuminate/events: ^11.0|^12.0
- illuminate/http: ^11.0|^12.0
- illuminate/routing: ^11.0|^12.0
- illuminate/support: ^11.0|^12.0
Requires (Dev)
- orchestra/testbench: ^9.0
- pestphp/pest: ^3.0
- pestphp/pest-plugin-laravel: ^3.0
- phpunit/phpunit: ^11.0
This package is auto-updated.
Last update: 2026-05-21 08:05:50 UTC
README
A clean, well-tested Laravel wrapper for the Decodo Web Scraping API — supports both real-time (v2) and async/batch (v3) scraping.
Requirements
- PHP 8.2+
- Laravel 11 or 12
Installation
composer require rkdhatterwal/decodo-scraper
Publish the config file:
php artisan vendor:publish --tag=decodo-config
Add credentials to .env:
DECODO_TOKEN=your_basic_auth_token_here DECODO_TIMEOUT=120
Your token is in the Decodo dashboard under Scraping APIs → Username / Token.
Real-time Scraping (v2)
Use the Decodo facade or inject DecodoClient.
use Rkdhatterwal\DecodoScraper\Facades\Decodo; // Simple scrape → ScrapeResult $result = Decodo::scrape('https://example.com'); echo $result->content; // raw HTML echo $result->statusCode; // upstream HTTP status // JavaScript rendering $result = Decodo::scrapeWithJs('https://example.com'); // Screenshot (PNG) $result = Decodo::screenshot('https://example.com'); // Geo-targeted $result = Decodo::scrapeFromGeo('https://example.com', 'United States'); // Return Markdown (great for LLM pipelines) $result = Decodo::scrapeAsMarkdown('https://example.com'); // Structured data via a target template's parser $result = Decodo::scrapeWithParser('amazon_pricing', 'https://amazon.com/dp/B0BS1QCF'); // Scrape multiple URLs at once $results = Decodo::scrapeMany(['https://example.com', 'https://another.com']);
Full control with PayloadBuilder
use Rkdhatterwal\DecodoScraper\PayloadBuilder; use Rkdhatterwal\DecodoScraper\Facades\Decodo; $results = Decodo::send( (new PayloadBuilder()) ->url('https://example.com') ->headless('html') ->geo('Germany') ->locale('de-DE') ->deviceType('mobile') ->proxyPool('premium') ->markdown() ->successfulStatusCodes([200, 301]) );
Async Scraping (v3)
Use the DecodoAsync facade or inject AsyncDecodoClient.
Queue a single task
use Rkdhatterwal\DecodoScraper\Facades\DecodoAsync; // Queue and get a task ID immediately $task = DecodoAsync::queueTask('https://example.com'); echo $task->id; // "7434928397127555073" echo $task->status; // "pending" // With a webhook callback $task = DecodoAsync::queueTask( url: 'https://example.com', options: ['headless' => 'html', 'geo' => 'United States'], callbackUrl: 'https://my.app/webhook/decodo', passthrough: 'my-verification-secret', // echoed back for auth );
Queue a batch
$batch = DecodoAsync::queueBatch( urls: ['https://site1.com', 'https://site2.com', 'https://site3.com'], options: ['geo' => 'United States'], callbackUrl: 'https://my.app/webhook/decodo-batch', ); $batch->ids(); // Collection of task IDs $batch->count(); // 3
Check status & retrieve results
// Poll status manually $status = DecodoAsync::getTaskStatus($task->id); $status->isPending(); // true / false $status->isDone(); // true / false $status->isFaulted(); // true / false // Retrieve results once done (valid for 24 hours) $results = DecodoAsync::getTaskResults($task->id); // Collection<ScrapeResult> $result = DecodoAsync::getFirstTaskResult($task->id); // ScrapeResult // Convenience: poll and block until done (for scripts/queues) $results = DecodoAsync::pollUntilDone($task->id, intervalMs: 2000, maxAttempts: 30);
ScrapeResult DTO
| Property | Type | Description |
|---|---|---|
content |
string |
Raw HTML, Markdown, or parsed JSON |
statusCode |
int |
HTTP status of the upstream page |
url |
string |
The URL that was scraped |
taskId |
string |
Decodo task ID |
createdAt |
string |
Task creation timestamp |
updatedAt |
string |
Task completion timestamp |
$result->isSuccessful(); // true when statusCode is 2xx $result->toArray();
TaskResponse DTO
Returned by queueTask(), queueBatch(), getTaskStatus().
$task->id; // Task ID for later retrieval $task->status; // "pending" | "done" | "faulted" $task->isPending(); // bool $task->isDone(); // bool $task->isFaulted(); // bool $task->toArray();
All Payload Parameters
See the Decodo parameters docs.
| Method on PayloadBuilder | API Parameter | Default |
|---|---|---|
->url($url) |
url |
required |
->query($q) |
query |
— |
->target($t) |
target |
null |
->proxyPool('standard') |
proxy_pool |
premium |
->headless('html'/'png') |
headless |
null |
->geo('United States') |
geo |
auto |
->domain('co.uk') |
domain |
com |
->locale('en-GB') |
locale |
matched |
->headers([...]) |
headers |
null |
->forceHeaders() |
force_headers |
false |
->cookies([...]) |
cookies |
null |
->forceCookies() |
force_cookies |
false |
->deviceType('mobile') |
device_type |
desktop |
->parse() |
parse |
false |
->sessionId('1234') |
session_id |
null |
->httpMethod('post') |
http_method |
get |
->payload($body) |
payload (base64) |
null |
->successfulStatusCodes([]) |
successful_status_codes |
null |
->markdown() |
markdown |
false |
->xhr() |
xhr |
false |
->callbackUrl($url) |
callback_url |
null |
->passthrough($val) |
passthrough |
null |
Testing
composer test
All tests use Http::fake() — no live requests are made.
Changelog
See CHANGELOG.md.
License
MIT