hryvinskyi/magento2-external-media-prefetcher

github.com/hryvinskyi/magento2-external-media-prefetcher

Type: magento2-module

pkg:composer/hryvinskyi/magento2-external-media-prefetcher

2.0.1 2026-04-17 11:01 UTC

Last update: 2026-04-17 11:01:52 UTC


README

A Magento 2 module that automatically fetches missing media files from an external source (typically your production site) with SSRF protection, configurable retries/timeouts, an async queue mode, and placeholder-aware filtering so you never persist a CDN's "image not found" body as a real product image.

Description

This module solves the common problem of missing media files in development or staging environments. When a media file is requested but not found locally, the module automatically attempts to download it from a configured external URL. It hooks in at two points:

  1. Magento\MediaStorage\Model\File\Storage\Synchronization::synchronize() — catches requests routed through pub/get.php.
  2. Magento\Catalog\Model\Product\Image::setBaseFile() — catches catalog image helper calls in templates ($imageHelper->init($product, $type)->resize(...)) before Magento falls back to the placeholder.

Requirements

  • Magento 2.4.x
  • PHP 8.1 or higher

Installation

composer require hryvinskyi/magento2-external-media-prefetcher
bin/magento module:enable Hryvinskyi_ExternalMediaPrefetcher
bin/magento setup:upgrade

Configuration

All configuration is stored in app/etc/env.php and can be managed through a dedicated CLI command — no admin UI is involved.

CLI command

bin/magento hryvinskyi:external-media-prefetcher:configure [options]
Option                        Default  Description
--show                        -        Print current values and exit.
--enabled=1|0                 0        Enable/disable the prefetcher.
--url=<base>                  -        Base URL of the external media source (e.g. https://cdn.example.com/media/).
--allowed-schemes=https,http  https    Comma-separated URL schemes the validator will allow.
--async=1|0                   0        Dispatch downloads via the message queue instead of running inline.
--connect-timeout=<s>         3        HTTP connect timeout in seconds.
--request-timeout=<s>         8        HTTP request timeout in seconds.
--retry-count=<n>             2        Retries on 5xx / 408 / 429 / transport errors. Other 4xx responses are not retried.
--retry-delay-ms=<ms>         200      Base delay for exponential backoff between retries.
--placeholder-hash=<sha256>   -        Manual placeholder SHA-256 (64-char lowercase hex).
--placeholder-length=<bytes>  -        Manual placeholder byte length.
--auto                        -        Probe the external URL once and auto-fill placeholder-hash / placeholder-length. Cannot be combined with the manual flags.

Values are validated up-front: boolean flags accept 1/0/true/false/yes/no/on/off, integer flags must be non-negative, and --placeholder-hash must be a 64-char lowercase hex string.
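
The validation rules above can be sketched in plain PHP. This is a minimal illustration, not the module's actual code: the function names are hypothetical, and lowercasing boolean input before matching is an assumption.

```php
<?php
declare(strict_types=1);

// Hypothetical helpers for illustration; the module's own validator may differ.

/** Parse a boolean flag value such as "1", "off", or "yes". */
function parseBoolFlag(string $value): bool
{
    // Assumption: input is matched case-insensitively.
    $normalized = strtolower(trim($value));
    if (in_array($normalized, ['1', 'true', 'yes', 'on'], true)) {
        return true;
    }
    if (in_array($normalized, ['0', 'false', 'no', 'off'], true)) {
        return false;
    }
    throw new InvalidArgumentException("Not a boolean flag value: {$value}");
}

/** Parse an integer flag value, rejecting negatives and non-numeric input. */
function parseNonNegativeInt(string $value): int
{
    if (preg_match('/\A[0-9]+\z/', $value) !== 1) {
        throw new InvalidArgumentException("Not a non-negative integer: {$value}");
    }
    return (int) $value;
}

/** Check that a value is a 64-character lowercase hex string. */
function isPlaceholderHash(string $value): bool
{
    return preg_match('/\A[0-9a-f]{64}\z/', $value) === 1;
}
```

Rejecting bad input before anything is written keeps a typo such as --retry-count=-1 out of app/etc/env.php.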

Example: set everything in one call

bin/magento hryvinskyi:external-media-prefetcher:configure \
    --enabled=1 \
    --url=https://www.example.com/media/ \
    --async=0 \
    --connect-timeout=3 \
    --request-timeout=8 \
    --retry-count=2 \
    --auto

bin/magento cache:clean config

The --auto flag probes the URL you just passed (no need to run a separate detect step) and persists the resulting placeholder signature alongside the rest of the options.

Direct env.php editing (if you prefer)

'system' => [
    'default' => [
        'external_media_prefetcher' => [
            'general' => [
                'enabled'            => '1',
                'external_media_url' => 'https://www.example.com/media/',
                'allowed_schemes'    => 'https',
                'async_enabled'      => '0',
                'connect_timeout'    => '3',
                'request_timeout'    => '8',
                'retry_count'        => '2',
                'retry_delay_ms'     => '200',
            ],
            'placeholder' => [
                'hash'   => 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855',
                'length' => '0',
            ],
        ],
    ],
],
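
The hash and length shown in the snippet above are the SHA-256 digest and byte length of an empty body, so treat them as sample values. To derive a manual signature from a known placeholder body, something like the following should work. It assumes the signature is computed over the raw response bytes, and placeholderSignature is a hypothetical helper name, not part of the module.

```php
<?php
declare(strict_types=1);

/**
 * Compute a {hash, length} pair for a placeholder body.
 * Hypothetical helper; assumes the signature covers the raw body bytes.
 *
 * @return array{hash: string, length: int}
 */
function placeholderSignature(string $body): array
{
    return [
        'hash'   => hash('sha256', $body), // 64-char lowercase hex
        'length' => strlen($body),         // length in bytes
    ];
}

// An empty body reproduces the sample values from the env.php snippet.
$signature = placeholderSignature('');
echo $signature['hash'], ' ', $signature['length'], PHP_EOL;
```

For a placeholder saved on disk, hash_file('sha256', $path) and filesize($path) give the same two values without loading the file into a string.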

How It Works

Sync mode (default)

  1. A request hits a missing media file.
  2. Either plugin fires (Synchronization or Product\Image), cleans the path, and checks the media directory.
  3. Security\UrlValidator asserts the resulting URL: scheme allowlist, FILTER_VALIDATE_URL, and DNS resolution that rejects loopback / private / link-local / reserved IPs (SSRF protection).
  4. FileDownloader requests the file using the configured connect/request timeouts, with exponential-backoff retries on 5xx / 408 / 429 / transport errors. All other 4xx responses are terminal.
  5. Before writing, the response body is compared against the cached placeholder signature. If it matches, the file is not saved — preventing "placeholder-as-product-image" pollution when the CDN returns a generic fallback image for missing files.
  6. The destination directory is realpath-checked to guarantee it's inside the Magento media directory (path-traversal protection).
  7. The file is written and the request proceeds against the real image.
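
The retry policy in step 4 can be sketched as follows. This is an illustration under stated assumptions, not the module's FileDownloader: the function names are hypothetical, the delay is assumed to double on each retry, and a transport error is modeled as status 0.

```php
<?php
declare(strict_types=1);

/** Retryable statuses: any 5xx, plus 408 and 429. */
function isRetryableStatus(int $status): bool
{
    return ($status >= 500 && $status <= 599) || $status === 408 || $status === 429;
}

/** Delay before retry number $attempt (1-based). Assumption: doubles each time. */
function backoffDelayMs(int $baseDelayMs, int $attempt): int
{
    return $baseDelayMs * (2 ** ($attempt - 1));
}

/**
 * Run $request, retrying up to $retryCount times on retryable outcomes.
 *
 * @param callable(): int     $request Returns an HTTP status, or 0 on transport error.
 * @param callable(int): void $sleepMs Waits for the given number of milliseconds.
 */
function fetchWithRetries(callable $request, int $retryCount, int $baseDelayMs, callable $sleepMs): int
{
    $status = $request();
    for ($attempt = 1; $attempt <= $retryCount; $attempt++) {
        // Stop on success or on a terminal status such as 404.
        if ($status !== 0 && !isRetryableStatus($status)) {
            break;
        }
        $sleepMs(backoffDelayMs($baseDelayMs, $attempt));
        $status = $request();
    }
    return $status;
}
```

With the defaults (retry-count 2, retry-delay-ms 200) this sketch waits 200 ms and then 400 ms. In real use the sleeper could be a closure that calls usleep($ms * 1000); passing it in keeps the logic testable without real delays.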

Async mode

When --async=1 is set, both plugins publish the clean relative path onto the hryvinskyi.external_media_prefetcher.download queue (db connection) instead of fetching inline. The current request still renders the placeholder; once the consumer has processed the message, later requests for the same file get the real image. Start the consumer with:

bin/magento queue:consumers:start hryvinskyi.external_media_prefetcher.download

Placeholder detection

If you don't configure a manual signature, PlaceholderDetector probes the external URL once against a guaranteed-missing path (<base>/catalog/product/__hryvinskyi_emp_probe_<random>.jpg) and caches the {hash, length} result in Magento's app cache (tag HRYVINSKYI_EMP_PLACEHOLDER, TTL 1 day). Negative probes (4xx/5xx/empty) are also cached so we don't hammer the source.
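
A rough sketch of the probe URL and the signature comparison described above. The function names are hypothetical, the format of the random suffix is an assumption, and a negative probe is modeled as a null signature.

```php
<?php
declare(strict_types=1);

/** Build a URL for a path that should not exist on the source. */
function buildProbeUrl(string $baseUrl): string
{
    // Assumption: 16 random hex characters; the module's suffix format may differ.
    $random = bin2hex(random_bytes(8));
    return rtrim($baseUrl, '/') . '/catalog/product/__hryvinskyi_emp_probe_' . $random . '.jpg';
}

/**
 * Decide whether a downloaded body is the source's fallback image.
 *
 * @param array{hash: string, length: int}|null $signature Null when the probe
 *        found no placeholder (4xx/5xx/empty), so nothing can match.
 */
function matchesPlaceholder(string $body, ?array $signature): bool
{
    if ($signature === null) {
        return false;
    }
    // Cheap length check first, then the SHA-256 comparison.
    return strlen($body) === $signature['length']
        && hash_equals($signature['hash'], hash('sha256', $body));
}
```

Comparing the length before hashing skips the digest for most real images, which rarely share the placeholder's exact byte length.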

Logging

All module output goes to var/log/external_media_prefetcher.log (a dedicated Monolog channel), including probe results, blocked SSRF attempts, rejected placeholder bodies, and retry attempts.

Security Notes

  • URL validation blocks 127.0.0.0/8, RFC1918 ranges, 169.254.0.0/16, and reserved blocks — so a compromised / misconfigured external_media_url cannot be used to reach internal metadata services (169.254.169.254, etc.) or internal hosts.
  • Default allowed scheme is https only. Plain http must be opted into explicitly.
  • Both destination-directory preparation and the pre-write destination check use realpath + prefix comparison against the media root to block .. traversal.
  • The module never trusts the source CDN's response body on its own: the placeholder signature check protects against silently caching fallback images.
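
The URL and path checks listed above can be approximated with PHP built-ins. This is a sketch with hypothetical function names, not the module's Security\UrlValidator. It relies on filter_var's private and reserved range flags, and gethostbynamel only returns IPv4 addresses, so IPv6 handling is left out.

```php
<?php
declare(strict_types=1);

/** True when the URL is well-formed and its scheme is on the allowlist. */
function hasAllowedScheme(string $url, array $allowedSchemes): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, $allowedSchemes, true);
}

/** True when the IP is outside private (RFC1918) and reserved ranges. */
function isPublicIp(string $ip): bool
{
    $flags = FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE;
    return filter_var($ip, FILTER_VALIDATE_IP, $flags) !== false;
}

/** Resolve a host and require every returned IPv4 address to be public. */
function hostResolvesToPublicIps(string $host): bool
{
    $ips = gethostbynamel($host); // false when resolution fails
    if ($ips === false || $ips === []) {
        return false;
    }
    foreach ($ips as $ip) {
        if (!isPublicIp($ip)) {
            return false;
        }
    }
    return true;
}

/** True when $directory resolves to the media root or a path beneath it. */
function isInsideMediaRoot(string $directory, string $mediaRoot): bool
{
    $realDir = realpath($directory);
    $realRoot = realpath($mediaRoot);
    if ($realDir === false || $realRoot === false) {
        return false;
    }
    // The trailing separator stops "/media_evil" from matching "/media".
    $rootPrefix = rtrim($realRoot, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return str_starts_with($realDir . DIRECTORY_SEPARATOR, $rootPrefix);
}
```

Because realpath resolves .. segments and symlinks before the prefix comparison, a path such as media/catalog/../../etc is judged by where it actually points.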

Testing

ddev exec vendor/bin/phpunit \
    -c dev/tests/unit/phpunit.xml.dist \
    vendor/hryvinskyi/magento2-external-media-prefetcher/Test/Unit

Unit test coverage: Config, PathResolver, FileDownloader (happy path, SSRF rejection, placeholder skip, retry on 5xx, no-retry on 404, path-traversal), PlaceholderDetector (override, cache hit/miss, negative cache, probe), UrlValidator (scheme / loopback / private / link-local / public), SynchronizationPlugin (sync + async branches).

Why It's Useful

  • Development and staging: transparently pull media from production on demand.
  • Missing files: no more broken product images without manual intervention.
  • Safe by default: SSRF-safe, traversal-safe, placeholder-aware.
  • Tunable: switch between inline and queue-backed downloads per environment.

License

MIT

Author

Volodymyr Hryvinskyi
Email: volodymyr@hryvinskyi.com
GitHub: https://github.com/hryvinskyi