hryvinskyi / magento2-external-media-prefetcher
Package info
github.com/hryvinskyi/magento2-external-media-prefetcher
Type:magento2-module
pkg:composer/hryvinskyi/magento2-external-media-prefetcher
Requires
- magento/framework: *
- magento/module-media-storage: 100.4.*
A Magento 2 module that automatically fetches missing media files from an external source (typically your production site) with SSRF protection, configurable retries/timeouts, an async queue mode, and placeholder-aware filtering so you never persist a CDN's "image not found" body as a real product image.
Description
This module solves the common problem of missing media files in development or staging environments. When a media file is requested but not found locally, the module automatically attempts to download it from a configured external URL. It hooks in at two points:
- Magento\MediaStorage\Model\File\Storage\Synchronization::synchronize(), which catches requests routed through pub/get.php.
- Magento\Catalog\Model\Product\Image::setBaseFile(), which catches catalog image helper calls in templates ($imageHelper->init($product, $type)->resize(...)) before Magento falls back to the placeholder.
Requirements
- Magento 2.4.x
- PHP 8.1 or higher
Installation
composer require hryvinskyi/magento2-external-media-prefetcher
bin/magento module:enable Hryvinskyi_ExternalMediaPrefetcher
bin/magento setup:upgrade
Configuration
All configuration is stored in app/etc/env.php and can be managed through a dedicated CLI
command — no admin UI is involved.
CLI command
bin/magento hryvinskyi:external-media-prefetcher:configure [options]
| Option | Description | Default |
|---|---|---|
| `--show` | Print current values and exit. | - |
| `--enabled=1\|0` | Enable/disable the prefetcher. | 0 |
| `--url=<base>` | Base URL of the external media source (e.g. https://cdn.example.com/media/). | - |
| `--allowed-schemes=https,http` | Comma-separated URL schemes the validator will allow. | https |
| `--async=1\|0` | Dispatch downloads via the message queue instead of running inline. | 0 |
| `--connect-timeout=<s>` | HTTP connect timeout in seconds. | 3 |
| `--request-timeout=<s>` | HTTP request timeout in seconds. | 8 |
| `--retry-count=<n>` | Retries on 5xx / 408 / 429 / transport errors. Other 4xx responses are not retried. | 2 |
| `--retry-delay-ms=<ms>` | Base delay for exponential backoff between retries. | 200 |
| `--placeholder-hash=<sha256>` | Manual placeholder SHA-256 (64-char lowercase hex). | - |
| `--placeholder-length=<bytes>` | Manual placeholder byte length. | - |
| `--auto` | Probe the external URL once and auto-fill placeholder-hash / placeholder-length. Cannot be combined with the manual flags. | - |
Values are validated up-front: boolean flags accept 1/0/true/false/yes/no/on/off, integer
flags must be non-negative, and --placeholder-hash must be a 64-char lowercase hex string.
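The validation rules above can be illustrated with a short sketch. The function names here are hypothetical and are not taken from the module. Exact, case-sensitive matching of the boolean words is also an assumption, since the command itself may be more lenient.

```php
<?php
// Hypothetical helpers illustrating the documented rules; not the module's API.

/** Accepts only 1/0/true/false/yes/no/on/off (exact match is an assumption). */
function parseBoolFlag(string $value): bool
{
    if (in_array($value, ['1', 'true', 'yes', 'on'], true)) {
        return true;
    }
    if (in_array($value, ['0', 'false', 'no', 'off'], true)) {
        return false;
    }
    throw new InvalidArgumentException("Invalid boolean value: {$value}");
}

/** Accepts only strings made of decimal digits, so negatives and blanks fail. */
function parseNonNegativeInt(string $value): int
{
    if (!ctype_digit($value)) {
        throw new InvalidArgumentException("Expected a non-negative integer: {$value}");
    }
    return (int) $value;
}

/** True for exactly 64 lowercase hex characters (the D modifier rejects a trailing newline). */
function isPlaceholderHash(string $value): bool
{
    return preg_match('/^[0-9a-f]{64}$/D', $value) === 1;
}
```

With these helpers, `parseBoolFlag('on')` yields `true`, `parseNonNegativeInt('-1')` throws, and an uppercase hash is rejected by `isPlaceholderHash`.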
Example: set everything in one call
bin/magento hryvinskyi:external-media-prefetcher:configure \
--enabled=1 \
--url=https://www.example.com/media/ \
--async=0 \
--connect-timeout=3 \
--request-timeout=8 \
--retry-count=2 \
--auto
bin/magento cache:clean config
The --auto flag probes the URL you just passed (no need to run a separate detect step)
and persists the resulting placeholder signature alongside the rest of the options.
Direct env.php editing (if you prefer)
'system' => [
    'default' => [
        'external_media_prefetcher' => [
            'general' => [
                'enabled' => '1',
                'external_media_url' => 'https://www.example.com/media/',
                'allowed_schemes' => 'https',
                'async_enabled' => '0',
                'connect_timeout' => '3',
                'request_timeout' => '8',
                'retry_count' => '2',
                'retry_delay_ms' => '200',
            ],
            'placeholder' => [
                'hash' => 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855',
                'length' => '0',
            ],
        ],
    ],
],
How It Works
Sync mode (default)
- A request hits a missing media file.
- Either plugin fires (Synchronization or Product\Image), cleans the path, and checks the media directory.
- Security\UrlValidator asserts the resulting URL: scheme allowlist, FILTER_VALIDATE_URL, and DNS resolution that rejects loopback / private / link-local / reserved IPs (SSRF protection).
- FileDownloader requests the file using the configured connect/request timeouts, with exponential-backoff retries on 5xx / 408 / 429 / transport errors. Other 4xx responses are terminal.
- Before writing, the response body is compared against the cached placeholder signature. If it matches, the file is not saved, which prevents "placeholder-as-product-image" pollution when the CDN returns a generic fallback image for missing files.
- The destination directory is realpath-checked to guarantee it's inside the Magento media directory (path-traversal protection).
- The file is written and the request proceeds against the real image.
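The retry behaviour in the download step can be sketched roughly as follows. This is a minimal illustration, not the module's FileDownloader. The function names are hypothetical, status `0` is used here as a stand-in for a transport error, and the doubling formula (base, 2x base, 4x base) is an assumption about what "exponential backoff" means in this module.

```php
<?php
// Hypothetical sketch of the documented retry policy; not the module's code.

/** Retryable per the README: 5xx, 408, 429. Status 0 models a transport error (assumption). */
function isRetryable(int $status): bool
{
    return $status === 0
        || $status === 408
        || $status === 429
        || ($status >= 500 && $status <= 599);
}

/** Delay before retry number $attempt (1-based), doubling each time (assumed formula). */
function backoffDelayMs(int $baseDelayMs, int $attempt): int
{
    return $baseDelayMs * (2 ** ($attempt - 1));
}

/**
 * Calls $fetch once, then up to $retryCount more times while the status is retryable.
 * $fetch returns an HTTP status code (or 0 on transport error).
 * $sleep receives milliseconds and is injectable so the loop can be tested without waiting.
 */
function fetchWithRetries(callable $fetch, int $retryCount, int $baseDelayMs, ?callable $sleep = null): int
{
    $sleep ??= static fn (int $ms) => usleep($ms * 1000);

    $status = $fetch();
    for ($attempt = 1; $attempt <= $retryCount && isRetryable($status); $attempt++) {
        $sleep(backoffDelayMs($baseDelayMs, $attempt));
        $status = $fetch();
    }

    return $status;
}
```

With the defaults from the option table (`retry-count=2`, `retry-delay-ms=200`), a source that answers 503, then 429, then 200 would be tried three times with waits of 200 ms and 400 ms under this formula, while a 404 would return immediately.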
Async mode
When --async=1 is set, both plugins publish the clean relative path onto the
hryvinskyi.external_media_prefetcher.download queue (db connection) instead of fetching
inline. The current request still renders the placeholder, but the next request for the
same file gets the real image. Start the consumer with:
bin/magento queue:consumers:start hryvinskyi.external_media_prefetcher.download
Placeholder detection
If you don't configure a manual signature, PlaceholderDetector probes the external URL
once against a guaranteed-missing path
(<base>/catalog/product/__hryvinskyi_emp_probe_<random>.jpg) and caches the
{hash, length} result in Magento's app cache (tag HRYVINSKYI_EMP_PLACEHOLDER, TTL 1 day).
Negative probes (4xx/5xx/empty) are also cached so we don't hammer the source.
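To make the {hash, length} signature concrete, here is a small sketch of how a response body could be compared against it. The function names are hypothetical, and comparing both length and SHA-256 is an assumption about how the module uses the two values.

```php
<?php
// Hypothetical helpers; the module's PlaceholderDetector may work differently.

/** Builds a {hash, length} signature for a response body. */
function bodySignature(string $body): array
{
    return [
        'hash' => hash('sha256', $body),
        'length' => strlen($body),
    ];
}

/**
 * True when $body has the same byte length and SHA-256 as the known placeholder.
 * A null signature (nothing configured or detected) never matches.
 */
function matchesPlaceholder(string $body, ?array $placeholder): bool
{
    if ($placeholder === null) {
        return false;
    }
    // Cheap length check first; env.php stores values as strings, hence the cast.
    if (strlen($body) !== (int) $placeholder['length']) {
        return false;
    }
    return hash_equals((string) $placeholder['hash'], hash('sha256', $body));
}
```

The sample values in the env.php snippet earlier (hash `e3b0c442...b855`, length `0`) are the signature of an empty body, so `matchesPlaceholder('', ...)` with those values returns `true` and an empty response would not be saved.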
Logging
All module output goes to var/log/external_media_prefetcher.log (a dedicated Monolog
channel), including probe results, blocked SSRF attempts, rejected placeholder bodies, and
retry attempts.
Security Notes
- URL validation blocks 127.0.0.0/8, RFC1918 ranges, 169.254.0.0/16, and reserved blocks, so a compromised / misconfigured external_media_url cannot be used to reach internal metadata services (169.254.169.254, etc.) or internal hosts.
- Default allowed scheme is https only. Plain http must be opted into explicitly.
- Both destination-directory preparation and the pre-write destination check use realpath + prefix comparison against the media root to block .. traversal.
- The module never trusts the source CDN's response body on its own: the placeholder signature check protects against silently caching fallback images.
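The address checks described above can be sketched as follows. This is an illustration only, not the module's Security\UrlValidator. The function names and the injectable resolver are hypothetical, the sketch covers IPv4 only, and the exact list of reserved blocks is an assumption.

```php
<?php
// Hypothetical, IPv4-only sketch of the documented SSRF checks; not the module's code.

const BLOCKED_CIDRS = [
    '0.0.0.0/8',       // "this network" (reserved)
    '10.0.0.0/8',      // RFC1918
    '127.0.0.0/8',     // loopback
    '169.254.0.0/16',  // link-local, includes 169.254.169.254
    '172.16.0.0/12',   // RFC1918
    '192.168.0.0/16',  // RFC1918
    '240.0.0.0/4',     // reserved
];

/** True when the IPv4 address falls inside the CIDR block. */
function ipInCidr(string $ip, string $cidr): bool
{
    [$subnet, $bits] = explode('/', $cidr);
    $mask = (0xFFFFFFFF << (32 - (int) $bits)) & 0xFFFFFFFF;
    return (ip2long($ip) & $mask) === (ip2long($subnet) & $mask);
}

/** Fails closed: anything that is not a plain IPv4 address counts as blocked. */
function isBlockedIp(string $ip): bool
{
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        return true;
    }
    foreach (BLOCKED_CIDRS as $cidr) {
        if (ipInCidr($ip, $cidr)) {
            return true;
        }
    }
    return false;
}

/**
 * Applies the three documented checks: URL syntax, scheme allowlist, resolved IPs.
 * $resolve maps a host name to a list of IP strings and is injectable for testing.
 */
function isUrlAllowed(string $url, array $allowedSchemes, callable $resolve): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    if (!in_array($scheme, $allowedSchemes, true)) {
        return false;
    }
    $ips = $resolve((string) parse_url($url, PHP_URL_HOST));
    if ($ips === []) {
        return false; // unresolvable host
    }
    foreach ($ips as $ip) {
        if (isBlockedIp($ip)) {
            return false;
        }
    }
    return true;
}
```

In real use the resolver could wrap `gethostbynamel()`, for example `static fn (string $host) => gethostbynamel($host) ?: []`. Rejecting the URL when any resolved address is blocked, rather than only when all of them are, is a deliberately conservative choice in this sketch.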
Testing
ddev exec vendor/bin/phpunit \
-c dev/tests/unit/phpunit.xml.dist \
vendor/hryvinskyi/magento2-external-media-prefetcher/Test/Unit
Unit test coverage: Config, PathResolver, FileDownloader (happy path, SSRF rejection,
placeholder skip, retry on 5xx, no-retry on 404, path-traversal), PlaceholderDetector
(override, cache hit/miss, negative cache, probe), UrlValidator (scheme / loopback /
private / link-local / public), SynchronizationPlugin (sync + async branches).
Why It's Useful
- Development and staging: transparently pull media from production on demand.
- Missing files: no more broken product images without manual intervention.
- Safe by default: SSRF-safe, traversal-safe, placeholder-aware.
- Tunable: switch between inline and queue-backed downloads per environment.
License
Author
Volodymyr Hryvinskyi
- Email: volodymyr@hryvinskyi.com
- GitHub: https://github.com/hryvinskyi