maikschneider / solr-shield
TYPO3 extension that protects the Apache Solr search endpoint from bots and crawlers via URI obfuscation and bot detection.
Package info
github.com/maikschneider/solr-shield
Type: typo3-cms-extension
pkg:composer/maikschneider/solr-shield
Requires
- php: ^8.2
- apache-solr-for-typo3/solr: ^13.0 || ^14.0 || dev-main
- typo3/cms-core: ^13.4 || ^14.0
- typo3/cms-frontend: ^13.4 || ^14.0
Requires (Dev)
- armin/editorconfig-cli: ^2.0
- ergebnis/composer-normalize: ^2.45
- friendsofphp/php-cs-fixer: ^3.64
- helmich/typo3-typoscript-lint: ^3.3
- move-elevator/composer-translation-validator: ^1.3
- phpstan/phpstan: ^2.1
- saschaegerer/phpstan-typo3: ^2.1 || ^3.0
- ssch/typo3-rector: ^2.13 || ^3.0
- typo3/cms-backend: ^13.4 || ^14.0
Last update: 2026-06-11 11:01:48 UTC
README
TYPO3 extension that protects the Apache Solr search endpoint from bots and crawlers that enumerate search parameters (tx_solr[filter][…], tx_solr[q], the suggest/autocomplete endpoint) to generate thousands of unique, uncacheable requests.
URI Obfuscation — tx_solr parameters never appear in URLs or HTML form fields; every Solr request must carry a single server-signed _ss token. This layer is server-side, requires no JavaScript, and needs no changes to your existing Solr templates.
Bot detection (JavaScript) is not part of main. A second, behavioural bot-detection layer is under development on the feature/js-bot-detection branch. The server-side validator (BotDetectionService) ships on main but is disabled by default because it depends on that client-side layer. Until the branch lands, leave solrShield.botDetection.enabled off.
Requirements
| Dependency | Version |
|---|---|
| PHP | ^8.2 |
| TYPO3 CMS | ^13.4 \|\| ^14.0 |
| apache-solr-for-typo3/solr | ^13.0 \|\| ^14.0 |
Installation
composer require maikschneider/solr-shield
Then add the Solr Shield site set to your site configuration (TYPO3 backend → Site Management → Sites → [your site] → Sets), or declare it as a dependency in your sitepackage's set:
# packages/your-sitepackage/Configuration/Sets/YourSet/config.yaml
dependencies:
  - maikschneider/solr-shield
Flush all caches after installation.
Configuration
Settings are configured per site under Site Management → Sites → [your site] → Solr Shield. All settings ship with sensible defaults.
URI Obfuscation
| Setting | Type | Default | Description |
|---|---|---|---|
| solrShield.uriObfuscation.enabled | bool | true | Enable/disable the obfuscation layer |
| solrShield.uriObfuscation.rejectAction | string | redirect | What to do with an unsigned direct tx_solr request: redirect (back to the referer) or 403 |
Bot Detection
Bot-detection settings exist but default to disabled on main (the client-side layer they depend on lives on the feature/js-bot-detection branch). solrShield.botDetection.enabled defaults to false; the remaining keys (minFormTimeout, securityLevel, requireInteraction) only take effect once it is enabled together with the JavaScript layer.
You can also override the obfuscation settings in config/sites/<site>/settings.yaml:
solrShield:
  uriObfuscation:
    enabled: true
    rejectAction: redirect
How It Works
Protection 1 — URI Obfuscation
Goal: make tx_solr[…] parameters invisible in both URLs and HTML so bots cannot enumerate parameter combinations.
Token mechanism
TokenService derives a compact token: the first 12 bytes of HMAC-SHA256(encryptionKey, "solr-shield"), base64url-encoded — 16 characters. Tokens are stateless and do not expire; they remain valid as long as the TYPO3 encryptionKey is unchanged. This is intentional — filter and pagination links are permanent and must not rot over time.
The signed payload carried on every request is a single _ss parameter:
_ss = base64( {"p": "<tx_solr query string>", "s": "<token>"} )
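As a rough illustration of the two formulas above, the following Python sketch derives the token and builds the _ss value. It is not the extension's PHP code: the function names are hypothetical, and it assumes that the encryptionKey is the HMAC key with "solr-shield" as the message, that the JSON is compact with p before s, and that the outer encoding is standard (not URL-safe) base64.

```python
import base64
import hashlib
import hmac
import json
from urllib.parse import urlencode


def derive_token(encryption_key: str) -> str:
    """First 12 bytes of HMAC-SHA256, base64url-encoded (16 characters)."""
    # Assumption: encryption_key is the HMAC key, "solr-shield" the message.
    digest = hmac.new(encryption_key.encode(), b"solr-shield", hashlib.sha256).digest()
    # 12 bytes encode to exactly 16 base64 characters, so no padding appears.
    return base64.urlsafe_b64encode(digest[:12]).decode("ascii")


def encode_ss(query_string: str, token: str) -> str:
    """Wrap an URL-encoded tx_solr query string and the token into one value."""
    # Assumption: compact JSON, key order p then s, standard base64.
    payload = json.dumps({"p": query_string, "s": token}, separators=(",", ":"))
    return base64.b64encode(payload.encode()).decode("ascii")


if __name__ == "__main__":
    token = derive_token("example-encryption-key")  # placeholder key
    print(encode_ss(urlencode({"tx_solr[q]": "foo"}), token))
```

Under these assumptions, a payload whose query string starts with tx_solr[q]= begins with eyJwIjoidHhfc29sciU1QnElNUQ9, which is the same prefix as the After URL in the example further down.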
Filter / pagination / typolink URLs (server-side)
Two PSR-14 event listeners encode any URL that carries tx_solr parameters into the _ss payload:
- AfterUriIsProcessedEventListener: hooks Solr's SearchUriBuilder (AfterUriIsProcessedEvent) for facet, pagination and sorting links.
- AfterLinkIsGeneratedEventListener: hooks typolink() (AfterLinkIsGeneratedEvent) for any other generated Solr link. It deliberately skips Solr routing template links that still contain ###tx_solr:…### placeholders, leaving those to the listener above.
Before: /search?tx_solr[q]=foo&tx_solr[filter][0]=type:pages&tx_solr[page]=2
After: /search?_ss=eyJwIjoidHhfc29sciU1QnElNUQ9…
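The Before/After pair can be approximated with a short Python sketch of the rewriting step. It only illustrates the idea and is not how the PSR-14 listeners are implemented: obfuscate_url is a hypothetical helper, the token is passed in as a plain string, and the compact-JSON and standard-base64 choices are assumptions.

```python
import base64
import json
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def obfuscate_url(url: str, token: str) -> str:
    """Move all tx_solr[...] query parameters into a single _ss parameter."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    solr = [(k, v) for k, v in pairs if k.startswith("tx_solr[")]
    if not solr:
        return url  # nothing to hide, leave the link untouched
    other = [(k, v) for k, v in pairs if not k.startswith("tx_solr[")]
    # Assumption: compact JSON with p before s, standard base64.
    payload = json.dumps({"p": urlencode(solr), "s": token}, separators=(",", ":"))
    ss = base64.b64encode(payload.encode()).decode("ascii")
    # urlencode percent-encodes any "+", "/" or "=" in the base64 value.
    return urlunsplit(parts._replace(query=urlencode(other + [("_ss", ss)])))


if __name__ == "__main__":
    before = "/search?tx_solr[q]=foo&tx_solr[filter][0]=type:pages&tx_solr[page]=2"
    print(obfuscate_url(before, "placeholder-token"))
```

Non-Solr parameters are kept as they are in this sketch; whether the extension preserves their order or position is not stated in this README.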
Search form (server-side, no JavaScript required)
HtmlOutputObfuscationMiddleware rewrites the rendered HTML response:
- Replaces every name="tx_solr[ attribute with name="_s[.
- Injects a hidden <input name="_ss"> carrying a signed empty-payload token into each form that now contains _s[…] fields.
On submission SolrShieldMiddleware remaps _s[…] back to tx_solr[…] and validates the _ss token — so the search form works without any client-side JavaScript.
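The two rewriting steps can be pictured with the following Python sketch. It is a simplified stand-in rather than the middleware's actual code: obfuscate_html is a hypothetical name, forms are located with a regular expression instead of an HTML parser, and placing the hidden input directly before the closing form tag is an assumption.

```python
import html
import re

# Non-greedy match of each <form ...> ... </form> block.
FORM_RE = re.compile(r"(<form\b.*?)(</form>)", re.DOTALL | re.IGNORECASE)


def obfuscate_html(markup: str, ss_value: str) -> str:
    """Rename tx_solr[...] field names and add a hidden _ss input per Solr form."""
    # Step 1: hide the tx_solr prefix in every name attribute.
    markup = markup.replace('name="tx_solr[', 'name="_s[')
    hidden = '<input type="hidden" name="_ss" value="%s">' % html.escape(ss_value, quote=True)

    def inject(match: re.Match) -> str:
        body, closing = match.group(1), match.group(2)
        # Step 2: only forms that now contain _s[...] fields get the token.
        if 'name="_s[' not in body:
            return match.group(0)
        return body + hidden + closing

    return FORM_RE.sub(inject, markup)


if __name__ == "__main__":
    page = '<form action="/search"><input name="tx_solr[q]"></form>'
    print(obfuscate_html(page, "signed-empty-payload"))  # placeholder value
```

Forms without Solr fields, such as a login form on the same page, pass through unchanged in this sketch.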
Middleware validation (inbound)
SolrShieldMiddleware runs on every frontend request, positioned after site resolution (typo3/cms-frontend/site) and before the request-token middleware:
| Request | Action |
|---|---|
| _s[…] fields present | Remap to tx_solr[…] |
| Signed _ss present | Validate token → base64-decode → parse_str → inject tx_solr params → strip _ss |
| Direct tx_solr params without _ss | Reject (redirect / 403) |
| Suggest request (pageType 7384) without _ss | Reject |
| No Solr params, or obfuscation disabled | Pass through unchanged |
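The table can be read as a decision procedure. The Python sketch below is one possible reading, not the middleware's code: handle_request is a hypothetical function, request parameters are modelled as a flat dict of string keys, the order in which the rules are checked is an assumption, and so is the choice to reject a malformed or wrongly signed _ss value.

```python
import base64
import hmac
import json
from urllib.parse import parse_qsl


def handle_request(params, expected_token, enabled=True, is_suggest=False):
    """Return ("pass", params) or ("reject", None) for one inbound request."""
    if not enabled:
        return "pass", dict(params)
    # Form fields arrive as _s[...]; map them back to tx_solr[...].
    remapped = {
        ("tx_solr[" + key[3:] if key.startswith("_s[") else key): value
        for key, value in params.items()
    }
    ss = remapped.pop("_ss", None)  # the token is stripped in every case
    if ss is None:
        has_solr = any(key.startswith("tx_solr[") for key in remapped)
        # Unsigned Solr parameters and unsigned suggest requests are refused.
        return ("reject", None) if has_solr or is_suggest else ("pass", remapped)
    try:
        payload = json.loads(base64.b64decode(ss, validate=True))
        query, token = payload["p"], payload["s"]
    except (ValueError, KeyError, TypeError):
        return "reject", None  # assumption: broken payloads are rejected
    if not isinstance(query, str) or not isinstance(token, str):
        return "reject", None
    # Constant-time comparison against the server-derived token.
    if not hmac.compare_digest(token.encode(), expected_token.encode()):
        return "reject", None
    # Equivalent of parse_str: inject the decoded tx_solr parameters.
    remapped.update(parse_qsl(query, keep_blank_values=True))
    return "pass", remapped
```

A caller would translate "reject" into either a redirect to the referer or a 403 response, depending on solrShield.uriObfuscation.rejectAction.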
Protection 2 — Bot Detection (in development)
A second, optional layer scores behavioural signals (elapsed time, mouse/scroll/key/touch interaction, navigator.webdriver, screen/viewport dimensions) to reject headless and scripted clients. The server-side validator (BotDetectionService) ships on main, but the client-side JavaScript that collects the signals — and the TypoScript that wires it up — lives on the feature/js-bot-detection branch and is not yet operational.
Bot detection therefore defaults to off on main. Enabling it without the client-side layer would reject every form submission. Track or contribute to the work on the feature branch.
File Structure
solr-shield/
├── Classes/
│   ├── EventListener/
│   │   ├── AfterLinkIsGeneratedEventListener.php   # Encodes typolink() Solr URLs
│   │   └── AfterUriIsProcessedEventListener.php    # Encodes Solr SearchUriBuilder URLs
│   ├── Middleware/
│   │   ├── HtmlOutputObfuscationMiddleware.php     # Renames tx_solr[ → _s[ in HTML, injects _ss
│   │   └── SolrShieldMiddleware.php                # Validates & decodes incoming requests
│   └── Service/
│       ├── TokenService.php                        # HMAC token generation & validation
│       └── BotDetectionService.php                 # Bot-detection payload validation
├── Configuration/
│   ├── RequestMiddlewares.php                      # PSR-15 middleware registration
│   ├── Services.yaml                               # DI + event-listener registration
│   ├── Sets/SolrShield/                            # Site set: config, settings, setup
│   └── TypoScript/setup.typoscript                 # (JS wiring lives on the feature branch)
├── Resources/
│   └── Private/Language/locallang.xlf
├── composer.json
├── ext_emconf.php
└── ext_localconf.php
Development
This repository uses Composer-based quality tooling. After composer install:
composer ci:sca   # run all static analysis (cs-fixer, phpstan, rector, lint, editorconfig, yaml/typoscript, translations)
composer fix      # auto-fix: cs-fixer, rector, editorconfig, composer normalize
See CONTRIBUTING.md for details.
License
GPL-2.0-or-later — Maik Schneider.