rushing/saloon-playwright-sender

A Saloon Sender that dispatches requests through a Playwright browser microservice

Package info

github.com/stephenr85/saloon-playwright-sender

pkg:composer/rushing/saloon-playwright-sender

dev-main 2026-06-09 18:25 UTC

This package is auto-updated.

Last update: 2026-06-09 18:25:58 UTC


README

A Saloon Sender that dispatches requests through a Playwright browser microservice. Use it when a target site requires JavaScript rendering, bot-detection bypass, or any browser interaction before the page is ready to scrape.

How it works

  1. Your Saloon connector sends a request through PlaywrightSender instead of the default Guzzle sender.
  2. PlaywrightSender forwards the request to a local Node.js service.
  3. The Node.js service uses a Playwright browser to navigate to the URL, optionally runs an interaction script, then returns the rendered HTML (or raw response body) to PHP.
  4. Saloon receives a normal Response — DTOs, plugins, retries, and MockClient all work as expected.
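
Steps 2 to 4 can be sketched in JavaScript as follows. This is a minimal illustration under stated assumptions, not the package's actual service code: the function name `renderPage`, the payload fields `url`, `responseMode` and `interact`, and the stub page are all hypothetical. Only `page.goto()`, `page.content()`, `response.status()` and `response.text()` are real Playwright calls, and the interaction script is represented here as an already compiled async function.

```javascript
// Hypothetical sketch of the service-side flow; names and payload shape are assumed.
async function renderPage(page, { url, responseMode = 'html', interact = null }) {
  // Navigate; 'load' mirrors the documented default waitUntil strategy.
  const response = await page.goto(url, { waitUntil: 'load' });

  // Optionally run the interaction step once the page has loaded.
  if (interact) {
    await interact(page);
  }

  // 'html' returns the rendered DOM, 'body' returns the raw response body.
  const body = responseMode === 'body' ? await response.text() : await page.content();
  return { status: response.status(), body };
}

// Stub standing in for a Playwright Page so the sketch runs without a browser.
function createStubPage() {
  let dom = '<html><body><p>rendered</p></body></html>';
  return {
    goto: async () => ({ status: () => 200, text: async () => '<p>raw</p>' }),
    content: async () => dom,
    setDom: (html) => { dom = html; },
  };
}

renderPage(createStubPage(), { url: 'https://example.com' })
  .then((result) => console.log(result.status, result.body));
```

Passing the page in as a parameter keeps the flow testable with a stub; a real service would hand in a Playwright `Page` instead.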

Requirements

Installation

composer require rushing/saloon-playwright-sender

If you are using Laravel, publish the config:

php artisan vendor:publish --tag=playwright-sender-config

Starting the Playwright service

cd vendor/rushing/saloon-playwright-sender/playwright-service
yarn install          # or: npm install
npx playwright install chromium
node index.js

The service binds to 127.0.0.1 (loopback only) by default and is not reachable from outside the host.

Tip — Laravel projects: add the install steps to your composer.json post-autoload-dump scripts so they run automatically on composer install / composer update:

"post-autoload-dump": [
    "...",
    "yarn --cwd vendor/rushing/saloon-playwright-sender/playwright-service install --silent",
    "@php -r \"passthru('cd vendor/rushing/saloon-playwright-sender/playwright-service && npx playwright install chromium --quiet 2>&1');\""
]

Set PLAYWRIGHT_AUTO_START=true in .env.testing so the service starts automatically during test runs.

Bot-detection bypass

The service launches Chrome (preferred, via channel: 'chrome') or the downloaded Chromium headless shell as a fallback. It uses a persistent browser profile (~/.playwright-sender-profile by default) so that bot-detection clearance cookies survive between process restarts.

To pass Cloudflare and similar bot checks, the service:

  • Uses --disable-blink-features=AutomationControlled
  • Masks navigator.webdriver via an init script
  • Launches in headed mode by default (PLAYWRIGHT_HEADLESS=true to override)
  • Retries once after a passive 4-second wait on 429 Too Many Requests
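
For orientation, the first three measures map onto Playwright roughly as sketched here. This is an illustration under assumptions, not the service's source: `buildLaunchOptions` and `maskWebdriver` are hypothetical names, the rule that only the string `true` enables headless mode is assumed, and the Chromium fallback is not shown. The option keys `channel`, `headless` and `args` are real Playwright launch options, and in a running service the masking function would be registered through `context.addInitScript()`.

```javascript
// Hypothetical helper: builds an options object for chromium.launchPersistentContext().
function buildLaunchOptions(env = {}) {
  return {
    channel: 'chrome',                            // prefer the installed Chrome
    headless: env.PLAYWRIGHT_HEADLESS === 'true', // headed unless overridden (assumed parsing)
    args: ['--disable-blink-features=AutomationControlled'],
  };
}

// Init script body: hides the automation flag that bot checks commonly read.
// It takes the navigator-like object as a parameter so it can be exercised outside a browser.
function maskWebdriver(nav) {
  Object.defineProperty(nav, 'webdriver', { get: () => undefined, configurable: true });
}

// Demo against a stand-in object; a real page would pass window.navigator.
const fakeNavigator = { webdriver: true };
maskWebdriver(fakeNavigator);
console.log(buildLaunchOptions({ PLAYWRIGHT_HEADLESS: 'true' }).headless); // true
console.log(fakeNavigator.webdriver);                                      // undefined
```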

Headless vs. headed on a server. Headed mode passes fingerprint checks that headless Chromium fails, so the default is headed. On a server with no display (CI, a headless VM), run the headed browser under a virtual display: xvfb-run -a node index.js. xvfb-run is only needed for headed mode — if you set PLAYWRIGHT_HEADLESS=true, no display (and no Xvfb) is required, but expect more bot blocks.

The waitUntil strategy defaults to 'load' rather than 'networkidle' so pages with persistent analytics requests don't time out. Override with PLAYWRIGHT_WAIT_UNTIL.

Environment variables (Node service)

| Variable | Default | Description |
| --- | --- | --- |
| PORT | 3000 | Port the service listens on |
| HOST | 127.0.0.1 | Interface to bind; keep as loopback |
| PLAYWRIGHT_HEADLESS | (headed) | Set to true to run headless (no display needed). The headed default needs a display; on a headless server, wrap the command with xvfb-run |
| PLAYWRIGHT_USER_DATA_DIR | ~/.playwright-sender-profile | Persistent browser profile directory |
| PLAYWRIGHT_WAIT_UNTIL | load | Playwright waitUntil navigation strategy |
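
As an illustration of how these variables and their defaults fit together, here is a minimal sketch of a loader. The function name `loadServiceConfig` and the parsing rules (for example, treating only the string `true` as headless) are assumptions, not the service's actual code. Tilde expansion of the profile path is left out for brevity.

```javascript
// Hypothetical loader: applies the documented defaults to an env-like object.
function loadServiceConfig(env = process.env) {
  return {
    port: Number.parseInt(env.PORT ?? '3000', 10),
    host: env.HOST ?? '127.0.0.1',                // loopback only by default
    headless: env.PLAYWRIGHT_HEADLESS === 'true', // assumed parsing rule
    userDataDir: env.PLAYWRIGHT_USER_DATA_DIR ?? '~/.playwright-sender-profile',
    waitUntil: env.PLAYWRIGHT_WAIT_UNTIL ?? 'load',
  };
}

// Example: override the port and the navigation strategy, keep everything else.
console.log(loadServiceConfig({ PORT: '4000', PLAYWRIGHT_WAIT_UNTIL: 'networkidle' }));
```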

Configuration

PHP / Laravel

Set these in .env (or pass a PlaywrightServiceConfig directly — see below):

| Variable | Default | Description |
| --- | --- | --- |
| PLAYWRIGHT_SERVICE_URL | http://localhost:3000 | URL of the running Node service |
| PLAYWRIGHT_TIMEOUT | 30 | Request timeout in seconds |
| PLAYWRIGHT_RESPONSE_MODE | html | html returns the rendered DOM; body returns the raw HTTP response body |
| PLAYWRIGHT_AUTO_START | false | Start the Node service automatically if it is not already running |

Basic usage

Set PlaywrightSender as the default sender on any Saloon connector:

use Rushing\SaloonPlaywright\PlaywrightSender;
use Saloon\Contracts\Sender;
use Saloon\Http\Connector;

class MyConnector extends Connector
{
    public function resolveBaseUrl(): string
    {
        return 'https://example.com';
    }

    protected function defaultSender(): Sender
    {
        return new PlaywrightSender();
    }
}

To override the service URL or timeout at runtime:

use Rushing\SaloonPlaywright\PlaywrightServiceConfig;

protected function defaultSender(): Sender
{
    return new PlaywrightSender(new PlaywrightServiceConfig(
        serviceUrl: 'http://playwright-service:3000',
        timeout: 60,
        responseMode: 'html',
    ));
}

Parse the rendered HTML in your request's createDtoFromResponse() using Saloon's built-in $response->dom(), which returns a Symfony\Component\DomCrawler\Crawler:

use Saloon\Http\Response;
use Symfony\Component\DomCrawler\Crawler;

public function createDtoFromResponse(Response $response): array
{
    $items = [];

    $response->dom()->filter('.product-card')->each(function (Crawler $node) use (&$items) {
        $items[] = [
            'name'  => trim($node->filter('h2')->text()),
            'price' => trim($node->filter('.price')->text()),
        ];
    });

    return $items;
}

Browser interactions

playwright_script — post-load interactions

Some pages require an action before their full content is available — expanding a collapsed section, clicking a "load more" button, or waiting for a lazy-loaded element.

Define a playwright_script in your request's defaultConfig(). The script runs after the page loads successfully and has access to the full Playwright page API:

public function defaultConfig(): array
{
    return [
        'playwright_script' => "
            const btn = await page.$('.load-more');
            if (btn) {
                await btn.click();
                await page.waitForLoadState('networkidle');
            }
        ",
    ];
}

The script runs as an async function body — await works at the top level. The page variable is the Playwright Page object.
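
To make "async function body" concrete, the sketch below shows one way a script string can be compiled into a callable with `page` in scope. It uses the `AsyncFunction` constructor, the asynchronous counterpart of `new Function()`. `runScript` and the stub page are hypothetical names, and the service's real wrapper may differ.

```javascript
// AsyncFunction is not a global; it is reachable through any async function's prototype.
const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;

// Hypothetical runner: compiles the script text with `page` as its only parameter.
async function runScript(script, page) {
  const fn = new AsyncFunction('page', script);
  return fn(page);
}

// Stub page that records calls, standing in for a Playwright Page.
const calls = [];
const stubPage = {
  $: async (selector) => ({ click: async () => calls.push('click ' + selector) }),
  waitForLoadState: async (state) => calls.push('wait ' + state),
};

const script = `
  const btn = await page.$('.load-more');
  if (btn) {
    await btn.click();
    await page.waitForLoadState('networkidle');
  }
`;

// Logs the two recorded calls: the click first, then the wait.
runScript(script, stubPage).then(() => console.log(calls));
```

Because the script text becomes the function body, a bare `return` ends the script early, which is what the challenge example further down relies on.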

Common patterns

Wait for an element to appear before capturing:

await page.waitForSelector('.results-container');

Scroll to the bottom to trigger lazy loading:

await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
await page.waitForLoadState('networkidle');

Dismiss a cookie banner:

const dismiss = await page.$('[data-testid="accept-cookies"]');
if (dismiss) await dismiss.click();

Multiple sequential interactions:

await page.click('#expand-section');
await page.waitForSelector('.expanded-content');
await page.click('.load-all-results');
await page.waitForLoadState('networkidle');

challenge_script — bot-detection challenge handling

Some sites return a 429 Too Many Requests with an interactive challenge page (e.g. a DataDome slider) that must be solved before the real content loads.

Define a challenge_script in your request's defaultConfig(). The script runs on the 429 challenge page instead of the passive 4-second retry:

public function defaultConfig(): array
{
    return [
        'challenge_script' => "
            // Wait for the challenge component to render
            const handle = await page.waitForSelector(
                '.challenge-slider-handle, [role=\"slider\"]',
                { timeout: 10000 }
            ).catch(() => null);

            if (!handle) return;

            const handleBox = await handle.boundingBox();
            if (!handleBox) return; // boundingBox() is null when the handle is not visible
            const track = await page.$('.challenge-slider-track');
            const trackBox = track ? await track.boundingBox() : null;

            const startX = handleBox.x + handleBox.width / 2;
            const startY = handleBox.y + handleBox.height / 2;
            const endX = trackBox ? trackBox.x + trackBox.width - handleBox.width / 2 : startX + 280;

            await page.mouse.move(startX, startY, { steps: 3 });
            await page.mouse.down();
            const steps = 40;
            for (let i = 1; i <= steps; i++) {
                const t = i / steps;
                const eased = t < 0.5 ? 4*t*t*t : 1 - Math.pow(-2*t + 2, 3) / 2;
                await page.mouse.move(startX + (endX - startX) * eased, startY + (Math.random() - 0.5) * 2);
                await page.waitForTimeout(8 + Math.random() * 18);
            }
            await page.mouse.up();
        ",
    ];
}
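
The drag loop in the script above moves the pointer along an ease-in-out cubic curve, so the handle starts slowly, speeds up through the middle of the track, and slows again near the end. The standalone sketch below isolates that math. `easeInOutCubic` and `dragPositions` are names introduced here for illustration, and the vertical jitter and per-step delays from the script are omitted.

```javascript
// Same easing expression as in the challenge script: maps 0 to 0, 0.5 to 0.5, 1 to 1.
function easeInOutCubic(t) {
  return t < 0.5 ? 4 * t * t * t : 1 - Math.pow(-2 * t + 2, 3) / 2;
}

// X coordinates visited when dragging from startX to endX in `steps` moves.
function dragPositions(startX, endX, steps = 40) {
  const xs = [];
  for (let i = 1; i <= steps; i++) {
    xs.push(startX + (endX - startX) * easeInOutCubic(i / steps));
  }
  return xs;
}

// Four steps across a 280px track: small first move, large middle moves, exact finish.
console.log(dragPositions(100, 380, 4)); // [ 117.5, 240, 362.5, 380 ]
```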

After the script runs, the service waits for the page to finish loading and returns whatever content is now available. If no challenge_script is set, the service falls back to a passive 4-second wait and one retry.
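
The decision described above can be summarized in a short sketch. It approximates the documented behavior and is not the service's code: `loadWithChallengeHandling`, the injected `navigate`, `challenge` and `sleep` callbacks, and the returned fields are all assumptions made for illustration.

```javascript
// Hypothetical control flow for a 429 response; all names are illustrative.
// navigate: async function that loads the page and resolves to the HTTP status.
// challenge: optional async function standing in for a compiled challenge_script.
// sleep: async delay, injected so the flow can be exercised without real waits.
async function loadWithChallengeHandling({ navigate, challenge = null, sleep }) {
  let navigations = 1;
  const status = await navigate();
  if (status !== 429) {
    return { path: 'direct', navigations };
  }

  if (challenge) {
    // Active path: the script runs on the challenge page instead of the passive retry.
    await challenge();
    return { path: 'challenge_script', navigations };
  }

  // Passive path: wait 4 seconds, then retry exactly once.
  await sleep(4000);
  await navigate();
  navigations += 1;
  return { path: 'passive_retry', navigations };
}
```

Injecting `sleep` and `navigate` keeps the branch logic separate from the browser, so each path can be checked with simple fakes.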

Security note

Both playwright_script and challenge_script are executed inside the Node service via new Function(), so any client that can reach the service can run arbitrary JavaScript on the host. This is why the service binds to 127.0.0.1 by default. Scripts should always originate from your PHP application code, never from external user input.

Testing

PlaywrightSender works with Saloon's MockClient — no Node service needed for tests:

use Saloon\Http\Faking\MockClient;
use Saloon\Http\Faking\MockResponse;

$connector = new MyConnector();
$connector->withMockClient(new MockClient([
    MyRequest::class => MockResponse::make(
        body: file_get_contents(__DIR__ . '/fixtures/page.html'),
        headers: ['Content-Type' => 'text/html'],
    ),
]));

$result = $connector->send(new MyRequest())->dto();