mkopcic / laravel-bot-protection
Laravel middleware for blocking AI crawlers and search engines. Auto-registering, configurable, supports Laravel 10/11/12/13.
Requires
- php: ^8.1
- illuminate/console: ^10.0|^11.0|^12.0|^13.0
- illuminate/http: ^10.0|^11.0|^12.0|^13.0
- illuminate/routing: ^10.0|^11.0|^12.0|^13.0
- illuminate/support: ^10.0|^11.0|^12.0|^13.0
Requires (Dev)
- orchestra/testbench: ^8.0|^9.0|^10.0|^11.0
- pestphp/pest: ^2.0|^3.0
README
Laravel Bot Protection
Block AI crawlers, search engines, and known scrapers from your Laravel app, with one line of composer require.
About
laravel-bot-protection is a drop-in middleware package that protects Laravel applications from unwanted automated traffic: AI training crawlers, LLM agents, SEO bots, and generic scrapers. It blocks known bot User-Agents with HTTP 403 and adds the `X-Robots-Tag: noindex, nofollow` header to every response so well-behaved crawlers (Google, Bing, etc.) also skip indexing.
Built for production apps where you need zero-config setup but fine-grained control when you want it.
Features
- Blocks 30+ known bots out of the box: GPTBot, ClaudeBot, PerplexityBot, Bytespider, Google-Extended, CCBot, AhrefsBot, SemrushBot, and more
- Auto-registers globally: install and you're protected, no manual middleware setup
- Adds `X-Robots-Tag` header to every response: covers crawlers that respect HTTP-level directives
- `@botProtectionMeta` Blade directive: one-liner for `<meta name="robots">` and `noai` / `noimageai` tags
- `noai, noimageai` AI opt-out meta tag: emerging standard adopted by DeviantArt, ArtStation
- Dynamic `/robots.txt` route (opt-in): generated from config, single source of truth
- `BotBlocked` event: listen and react (log, alert, feed analytics)
- Optional logging: write blocked requests to any Laravel log channel
- Fully configurable via `.env` or published config: toggle, status code, custom message, allow-list IPs
- Publishable `robots.txt` with comprehensive AI/SEO crawler disallow list
- Server-level config stubs: Nginx (shared map + per-vhost), Apache vhost, `.htaccess`
- Artisan test command: verify protection works against a live URL
- CI-tested across Laravel 10/11/12/13 × PHP 8.1 to 8.4 (34 Pest tests)
- Wide compatibility: Laravel 10 / 11 / 12 / 13, PHP 8.1+
Requirements
| Requirement | Version |
|---|---|
| PHP | ^8.1 |
| Laravel | 10.x, 11.x, 12.x, 13.x |
Installation
composer require mkopcic/laravel-bot-protection
That's it. Laravel package auto-discovery registers the service provider and pushes the middleware into the web group. Your app is now protected.
Publishing assets (optional)
| Tag | What it publishes | Destination |
|---|---|---|
| `bot-protection-config` | Configuration file | `config/bot-protection.php` |
| `bot-protection-robots` | Comprehensive `robots.txt` | `public/robots.txt` (⚠️ overwrites!) |
| `bot-protection-server` | Nginx + Apache + .htaccess snippets | `bot-protection/` |
| `bot-protection` | Config + server stubs (everything except robots.txt) | mixed |

    # Publish config to customize blocked agents, status codes, etc.
    php artisan vendor:publish --tag=bot-protection-config

    # Publish robots.txt (heads up, this overwrites your existing one!)
    php artisan vendor:publish --tag=bot-protection-robots

    # Publish Nginx / Apache config examples
    php artisan vendor:publish --tag=bot-protection-server

Quick Start
After installation, verify the protection works:

    # Show current configuration
    php artisan bot-protection:test config

    # Test live URL against default bot User-Agents
    php artisan bot-protection:test url https://mojaapp.hr

    # Test all configured bot agents
    php artisan bot-protection:test url https://mojaapp.hr --all

You should see BLOCKED [403] for each agent.
Configuration
All settings can be controlled via environment variables (no need to publish config):

    # Master toggle
    BOT_PROTECTION_ENABLED=true

    # Auto-register middleware into web group
    BOT_PROTECTION_AUTO_REGISTER=true

    # Which middleware group to attach to
    BOT_PROTECTION_MIDDLEWARE_GROUP=web

    # What status code to return for blocked bots
    BOT_PROTECTION_BLOCK_STATUS=403

    # Message body for blocked responses
    BOT_PROTECTION_BLOCK_MESSAGE="Forbidden"

    # X-Robots-Tag header value (empty string to disable)
    BOT_PROTECTION_X_ROBOTS_TAG="noindex, nofollow, noarchive, nosnippet"

    # Block requests with empty User-Agent (suspicious)
    BOT_PROTECTION_BLOCK_EMPTY_UA=false

    # IPs that bypass blocking (comma-separated)
    BOT_PROTECTION_ALLOWED_IPS="1.2.3.4,5.6.7.8"

    # Log every blocked request as a warning
    BOT_PROTECTION_LOG_BLOCKED=false

    # Specific log channel (defaults to logging.default)
    BOT_PROTECTION_LOG_CHANNEL=daily

    # AI opt-out meta tags (rendered by @botProtectionMeta)
    BOT_PROTECTION_AI_META_TAGS="noai, noimageai"

    # Serve /robots.txt dynamically from blocked_agents config
    BOT_PROTECTION_GENERATE_ROBOTS_ROUTE=false

For custom blocked agent lists, publish the config and edit config/bot-protection.php.
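The comma-separated `BOT_PROTECTION_ALLOWED_IPS` value has to be turned into a list before it can be compared against a client IP. How the package does this internally is not shown in this README; the sketch below is one plausible way, and `parseAllowedIps` is a hypothetical helper name, not part of the package's API.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not the package's code): split a comma-separated
 * env value into a clean list of IP strings.
 */
function parseAllowedIps(string $raw): array
{
    // Tolerate spaces around commas and a trailing comma.
    $parts = array_map('trim', explode(',', $raw));

    // Drop empty fragments and reindex so the result is a plain list.
    return array_values(array_filter(
        $parts,
        static fn (string $ip): bool => $ip !== ''
    ));
}

var_export(parseAllowedIps('1.2.3.4, 5.6.7.8,'));
```

Trimming and filtering means a value such as `"1.2.3.4, 5.6.7.8,"` and an unset (empty) value both produce a usable list.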
Blade Directive: @botProtectionMeta
Drop one line into your <head> and the package renders the standard robots meta tags using your configured x_robots_tag value:

    <!doctype html>
    <html>
    <head>
        <meta charset="utf-8">
        <title>My App</title>
        @botProtectionMeta
    </head>

Renders:

    <meta name="robots" content="noindex, nofollow, noarchive, nosnippet">
    <meta name="googlebot" content="noindex, nofollow, noarchive, nosnippet">
    <meta name="googlebot-news" content="noindex">
    <meta name="bingbot" content="noindex, nofollow, noarchive, nosnippet">
    <meta name="robots" content="noai, noimageai">

The last `<meta>` is the AI opt-out directive, an emerging standard adopted by DeviantArt, ArtStation, and Squarespace. Some AI scrapers already respect it. Disable it via `BOT_PROTECTION_AI_META_TAGS=""`.
If both x_robots_tag and ai_meta_tags are empty, the directive renders nothing.
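To make the rendering rules concrete, here is a minimal plain-PHP sketch that reproduces the output listed above from the two config values. It is not the directive's real implementation: `renderBotProtectionMeta` is a hypothetical function name, and omitting the `googlebot-news` tag when `x_robots_tag` is empty is an assumption.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical re-creation of what @botProtectionMeta renders.
 * Returns an empty string when both values are empty.
 */
function renderBotProtectionMeta(string $xRobotsTag, string $aiMetaTags): string
{
    $tags = [];

    if ($xRobotsTag !== '') {
        $value = htmlspecialchars($xRobotsTag, ENT_QUOTES);
        $tags[] = '<meta name="robots" content="' . $value . '">';
        $tags[] = '<meta name="googlebot" content="' . $value . '">';
        // Assumption: this tag is tied to x_robots_tag being set.
        $tags[] = '<meta name="googlebot-news" content="noindex">';
        $tags[] = '<meta name="bingbot" content="' . $value . '">';
    }

    if ($aiMetaTags !== '') {
        $value = htmlspecialchars($aiMetaTags, ENT_QUOTES);
        $tags[] = '<meta name="robots" content="' . $value . '">';
    }

    return implode("\n", $tags);
}

echo renderBotProtectionMeta(
    'noindex, nofollow, noarchive, nosnippet',
    'noai, noimageai'
), PHP_EOL;
```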
Dynamic /robots.txt Route
Instead of publishing a static public/robots.txt and keeping it in sync with your config, opt in to a dynamic route:
BOT_PROTECTION_GENERATE_ROBOTS_ROUTE=true
The package registers `GET /robots.txt`, which emits content generated from your `blocked_agents` config. Change the config and robots.txt updates immediately: a single source of truth.
⚠️ If `public/robots.txt` exists, your web server (Nginx/Apache) serves the static file first and the dynamic route never fires. Delete `public/robots.txt` for full dynamic behavior.
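As an illustration of generating robots.txt from a list of agents, the sketch below emits one `User-agent` / `Disallow: /` group per entry. The exact text the package's route returns is not documented here, so treat the format and the `buildRobotsTxt` name as assumptions.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical generator: one disallow-all group per blocked agent.
 */
function buildRobotsTxt(array $blockedAgents): string
{
    $groups = [];

    foreach ($blockedAgents as $agent) {
        $groups[] = "User-agent: {$agent}\nDisallow: /";
    }

    // Groups are separated by a blank line; the file ends with a newline.
    return implode("\n\n", $groups) . "\n";
}

echo buildRobotsTxt(['GPTBot', 'ClaudeBot']);
```

Because the output is derived from the same array the middleware matches against, adding an agent to the list updates both the blocking and the advertised disallow rules.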
BotBlocked Event
Every block fires a Mkopcic\BotProtection\Events\BotBlocked event with full request context. Listen to it for logging, alerting, or analytics:

    // app/Providers/AppServiceProvider.php
    use Illuminate\Support\Facades\Event;
    use Mkopcic\BotProtection\Events\BotBlocked;

    public function boot(): void
    {
        Event::listen(function (BotBlocked $event) {
            // $event->userAgent    : full UA string
            // $event->ip           : client IP
            // $event->url          : full URL the bot tried
            // $event->matchedAgent : which needle from blocked_agents matched
            \Log::channel('bots')->info('Blocked', (array) $event);
        });
    }

Or use a dedicated listener class:
php artisan make:listener LogBlockedBot --event="Mkopcic\BotProtection\Events\BotBlocked"
Built-in Logging
If you don't need custom event handling, just turn on logging:

    BOT_PROTECTION_LOG_BLOCKED=true
    BOT_PROTECTION_LOG_CHANNEL=daily

Every blocked request writes a warning to the chosen channel with user_agent, ip, url, and matched_agent in the context.
Manual Middleware Registration
If you want full control (e.g. apply only to specific route groups), disable auto-register:
BOT_PROTECTION_AUTO_REGISTER=false
Then register manually.
Laravel 11 / 12 / 13, in bootstrap/app.php:

    use Mkopcic\BotProtection\Http\Middleware\BotProtectionMiddleware;

    ->withMiddleware(function (Middleware $middleware) {
        $middleware->web(append: [
            BotProtectionMiddleware::class,
        ]);
    })

Laravel 10, in app/Http/Kernel.php:

    protected $middlewareGroups = [
        'web' => [
            // ...
            \Mkopcic\BotProtection\Http\Middleware\BotProtectionMiddleware::class,
        ],
    ];

Or apply per-route:

    Route::middleware(BotProtectionMiddleware::class)->group(function () {
        // protected routes
    });

Artisan Command: bot-protection:test
The package ships with a built-in tester with two subactions: url and config.
config: dump current configuration
php artisan bot-protection:test config
Outputs all settings, allowed IPs, and the full list of blocked agents.
url: fire HTTP requests with bot User-Agents

    # Default 3 representative agents (GPTBot, ClaudeBot, PerplexityBot)
    php artisan bot-protection:test url https://example.com

    # Specific agent
    php artisan bot-protection:test url https://example.com --agent=GPTBot

    # Test every agent from config
    php artisan bot-protection:test url https://example.com --all

    # Custom timeout
    php artisan bot-protection:test url https://example.com --timeout=30

Sample output:
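
    Testiranje: https://example.com
    Broj agenata: 3
    BLOCKED [403] GPTBot
    BLOCKED [403] ClaudeBot
    BLOCKED [403] PerplexityBot
    ---------------------------------------
    Blocked: 3 Allowed: 0 Errors: 0

The first two labels are printed in Croatian: "Testiranje" means "Testing" and "Broj agenata" means "Number of agents".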
Returns exit code 0 if all agents are blocked, 1 if any get through.
Server-Level Protection (Recommended)
The middleware protects at the Laravel layer. For defense in depth, block bots at the web server too: blocked requests never reach PHP, which saves CPU.
Publish the server config examples:
php artisan vendor:publish --tag=bot-protection-server
You'll get a bot-protection/ directory with:
| File | Use |
|---|---|
| `nginx-shared-map.conf` | Drop in `/etc/nginx/conf.d/` once; defines the `$blocked_bot` map for all vhosts |
| `nginx-vhost-snippet.conf` | Paste into each Nginx `server {}` block |
| `apache-vhost-snippet.conf` | Full Apache vhost example with `SetEnvIf` |
| `htaccess-snippet.txt` | `.htaccess` rules (when you can't edit vhosts) |
How It Works

    ┌────────────────────────┐
    │    Incoming Request    │
    └───────────┬────────────┘
                ▼
    ┌────────────────────────┐
    │  Web Server            │  ← optional: blocks at nginx/apache layer
    │  (nginx/apache)        │
    └───────────┬────────────┘
                ▼
    ┌────────────────────────┐
    │  BotProtection         │
    │  Middleware            │
    │                        │
    │  1. Check enabled?     │
    │  2. IP in allow-list?  │
    │  3. UA matches bot?    ├─── YES ──▶ HTTP 403
    │  4. Empty UA + flag?   │
    └───────────┬────────────┘
                │ NO
                ▼
    ┌────────────────────────┐
    │  Laravel App           │
    └───────────┬────────────┘
                ▼
    ┌────────────────────────┐
    │  Response              │
    │  + X-Robots-Tag header │
    └────────────────────────┘

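The four middleware checks in the diagram can be sketched in plain PHP as follows. This is an illustration of the flow, not the package's source: the function names and the `$config` keys are hypothetical, and case-insensitive substring matching is an assumption based on the README's mention of "needles" and case-insensitive User-Agent matching.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns the first needle found in the User-Agent
 * (case-insensitive substring match), or null when nothing matches.
 */
function matchBlockedAgent(string $userAgent, array $blockedAgents): ?string
{
    foreach ($blockedAgents as $needle) {
        if ($needle !== '' && stripos($userAgent, $needle) !== false) {
            return $needle;
        }
    }

    return null;
}

/**
 * Walks the four checks from the diagram and returns 'block' or 'pass'.
 * The $config keys are illustrative names, not a documented schema.
 */
function decide(array $config, string $ip, string $userAgent): string
{
    if (!$config['enabled']) {
        return 'pass'; // 1. protection switched off
    }
    if (in_array($ip, $config['allowed_ips'], true)) {
        return 'pass'; // 2. allow-listed IP bypasses blocking
    }
    if (matchBlockedAgent($userAgent, $config['blocked_agents']) !== null) {
        return 'block'; // 3. known bot User-Agent
    }
    if ($userAgent === '' && $config['block_empty_ua']) {
        return 'block'; // 4. empty User-Agent with the flag enabled
    }

    return 'pass';
}

$config = [
    'enabled'        => true,
    'allowed_ips'    => ['1.2.3.4'],
    'blocked_agents' => ['GPTBot', 'ClaudeBot', 'curl/'],
    'block_empty_ua' => false,
];

echo decide($config, '9.9.9.9', 'Mozilla/5.0 (compatible; gptbot/1.1)'), PHP_EOL; // block
echo decide($config, '1.2.3.4', 'GPTBot/1.1'), PHP_EOL;                           // pass
echo decide($config, '9.9.9.9', 'Mozilla/5.0 (X11; Linux x86_64)'), PHP_EOL;      // pass
```

Checking the allow-list before the User-Agent mirrors the order in the diagram, so an allow-listed monitoring host can send any User-Agent without being blocked.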
Running Tests

    composer install
    ./vendor/bin/pest

13 Pest tests cover:
- Blocking known bot User-Agents
- Allowing legitimate browser User-Agents
- Adding `X-Robots-Tag` header to passed responses
- Case-insensitive User-Agent matching
- `enabled=false` bypass
- Custom block status codes
- Custom block messages
- Empty `x_robots_tag` disables header
- Empty User-Agent handling (both modes)
- Allowed-IP bypass
What's Blocked Out of the Box
The full default list (33 agents):
| Category | Agents |
|---|---|
| OpenAI | GPTBot, ChatGPT-User, OAI-SearchBot |
| Anthropic | ClaudeBot, anthropic-ai, Claude-Web |
| Google | Google-Extended, Googlebot, AdsBot-Google |
| Meta | Meta-ExternalAgent, FacebookBot, facebookexternalhit |
| Apple | Applebot, Applebot-Extended |
| Amazon | Amazonbot |
| Perplexity | PerplexityBot |
| ByteDance | Bytespider |
| Common Crawl | CCBot |
| Cohere | cohere-ai |
| Mistral | MistralAI-User |
| Diffbot | Diffbot |
| SEO crawlers | SemrushBot, AhrefsBot, MJ12bot, DotBot, BLEXBot |
| Eastern engines | YandexBot, Baiduspider, Sogou |
| Generic scrapers | Scrapy, python-requests, curl/, wget/ |
You can add, remove, or fully override the list by publishing config and editing blocked_agents.
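As a sketch of adding and removing entries, the snippet below derives a custom list from a shortened stand-in for the defaults. It assumes `blocked_agents` is a flat array of strings in the published config; `InternalScraperBot` is a made-up example name.

```php
<?php

declare(strict_types=1);

// Shortened stand-in for the default list shown in the table above.
$defaults = ['GPTBot', 'ClaudeBot', 'Googlebot', 'AhrefsBot', 'curl/'];

// Entries to let through, e.g. to keep Google Search indexing and curl health checks.
$allowed = ['Googlebot', 'curl/'];

// Needles the defaults do not cover (hypothetical name).
$extra = ['InternalScraperBot'];

// Remove the allowed entries, append the extras, de-duplicate, and reindex.
$blockedAgents = array_values(array_unique(array_merge(
    array_diff($defaults, $allowed),
    $extra
)));

print_r($blockedAgents);
```

The resulting array is what you would place under `blocked_agents` in `config/bot-protection.php`.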
⚠️ What This Package Is NOT
- Not authentication. If content must be private, use Laravel auth, Basic Auth, or Cloudflare Zero Trust.
- Not foolproof against UA spoofing. A determined scraper can fake any User-Agent. This package targets mass crawlers that identify themselves correctly.
- Not a WAF. For rate limiting, geo-blocking, and DDoS protection, layer in Cloudflare or a dedicated WAF.
For maximum protection: this package + server-level rules + authentication for sensitive content.
Related
- Google: Robots meta tag and X-Robots-Tag specifications
- OpenAI: GPTBot opt-out documentation
- Cloudflare: Block AI bots and scrapers
Contributing
Contributions are welcome! Please open an issue or PR.
For new bot User-Agents to add to the default list, please include a source link (the bot's official documentation page).
License
The MIT License (MIT). See LICENSE for details.
Built with ❤️ for the Laravel community.
If this package saved your bandwidth or your sanity, ⭐ the repo!