mkopcic/laravel-bot-protection

Laravel middleware for blocking AI crawlers and search engines. Auto-registration, configurable, supports Laravel 10/11/12/13.

Package info

github.com/mkopcic/laravel-bot-protection

pkg:composer/mkopcic/laravel-bot-protection

v1.2.1 2026-05-21 10:03 UTC

Last update: 2026-05-21 10:07:29 UTC


README

🤖🛡️ Laravel Bot Protection

Block AI crawlers, search engines, and known scrapers from your Laravel app with a single composer require.

📖 About

laravel-bot-protection is a drop-in middleware package that protects Laravel applications from unwanted automated traffic: AI training crawlers, LLM agents, SEO bots, and generic scrapers. It blocks known bot User-Agents with HTTP 403 and adds an X-Robots-Tag header (noindex, nofollow, noarchive, nosnippet by default) to every response, so well-behaved crawlers (Google, Bing, etc.) also skip indexing.

Built for production apps where you need zero-config setup but fine-grained control when you want it.

✨ Features

  • 🚫 Blocks 30+ known bots out of the box: GPTBot, ClaudeBot, PerplexityBot, Bytespider, Google-Extended, CCBot, AhrefsBot, SemrushBot, and more
  • ⚡ Auto-registers into the web middleware group: install and you're protected, no manual middleware setup
  • 🏷️ Adds X-Robots-Tag header to every response: covers crawlers that respect HTTP-level directives
  • 🎨 @botProtectionMeta Blade directive: one-liner for <meta name="robots"> and noai/noimageai tags
  • 🤖 noai, noimageai AI opt-out meta tag: emerging convention adopted by DeviantArt, ArtStation
  • 🔄 Dynamic /robots.txt route (opt-in): generated from config, single source of truth
  • 📡 BotBlocked event: listen and react (log, alert, feed analytics)
  • 📝 Optional logging: write blocked requests to any Laravel log channel
  • 🔧 Fully configurable via .env or published config: toggle, status code, custom message, allow-list IPs
  • 📄 Publishable robots.txt with comprehensive AI/SEO crawler disallow list
  • 🌐 Server-level config stubs: Nginx (shared map + per-vhost), Apache vhost, .htaccess
  • 🧪 Artisan test command: verify protection works against a live URL
  • ✅ CI-tested across Laravel 10/11/12/13 × PHP 8.1–8.4 (34 Pest tests)
  • 🐘 Wide compatibility: Laravel 10 / 11 / 12 / 13, PHP 8.1+

📋 Requirements

| Requirement | Version |
|-------------|---------|
| PHP | ^8.1 |
| Laravel | 10.x, 11.x, 12.x, 13.x |

📦 Installation

composer require mkopcic/laravel-bot-protection

That's it. Laravel package auto-discovery registers the service provider and pushes the middleware into the web group. Your app is now protected.

🎨 Publishing assets (optional)

| Tag | What it publishes | Destination |
|-----|-------------------|-------------|
| bot-protection-config | Configuration file | config/bot-protection.php |
| bot-protection-robots | Comprehensive robots.txt | public/robots.txt ⚠️ overwrites! |
| bot-protection-server | Nginx + Apache + .htaccess snippets | bot-protection/ |
| bot-protection | Config + server stubs (everything except robots.txt) | mixed |

# Publish config to customize blocked agents, status codes, etc.
php artisan vendor:publish --tag=bot-protection-config

# Publish robots.txt (heads up: this overwrites your existing one!)
php artisan vendor:publish --tag=bot-protection-robots

# Publish Nginx / Apache config examples
php artisan vendor:publish --tag=bot-protection-server

🚀 Quick Start

After installation, verify the protection works:

# Show current configuration
php artisan bot-protection:test config

# Test live URL against default bot User-Agents
php artisan bot-protection:test url https://mojaapp.hr

# Test all configured bot agents
php artisan bot-protection:test url https://mojaapp.hr --all

You should see ✓ BLOCKED [403] for each agent.

⚙️ Configuration

All settings can be controlled via environment variables (no need to publish config):

# Master toggle
BOT_PROTECTION_ENABLED=true

# Auto-register middleware into web group
BOT_PROTECTION_AUTO_REGISTER=true

# Which middleware group to attach to
BOT_PROTECTION_MIDDLEWARE_GROUP=web

# What status code to return for blocked bots
BOT_PROTECTION_BLOCK_STATUS=403

# Message body for blocked responses
BOT_PROTECTION_BLOCK_MESSAGE="Forbidden"

# X-Robots-Tag header value (empty string to disable)
BOT_PROTECTION_X_ROBOTS_TAG="noindex, nofollow, noarchive, nosnippet"

# Block requests with empty User-Agent (suspicious)
BOT_PROTECTION_BLOCK_EMPTY_UA=false

# IPs that bypass blocking (comma-separated)
BOT_PROTECTION_ALLOWED_IPS="1.2.3.4,5.6.7.8"

# Log every blocked request as a warning
BOT_PROTECTION_LOG_BLOCKED=false

# Specific log channel (defaults to logging.default)
BOT_PROTECTION_LOG_CHANNEL=daily

# AI opt-out meta tags (rendered by @botProtectionMeta)
BOT_PROTECTION_AI_META_TAGS="noai, noimageai"

# Serve /robots.txt dynamically from blocked_agents config
BOT_PROTECTION_GENERATE_ROBOTS_ROUTE=false

For custom blocked agent lists, publish the config and edit config/bot-protection.php.
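BOT_PROTECTION_ALLOWED_IPS is a single comma-separated string, so it has to be split into a list before any IP comparison. The sketch below shows one way that could be done in plain PHP. parseAllowedIps is a hypothetical helper written for illustration; it is not taken from the package, whose actual parsing (whitespace handling, validation) may differ.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the package): turn a comma-separated
 * env value such as "1.2.3.4,5.6.7.8" into a clean list of IP addresses.
 *
 * @return list<string>
 */
function parseAllowedIps(?string $raw): array
{
    if ($raw === null || trim($raw) === '') {
        return [];
    }

    // Split on commas and strip surrounding whitespace from each entry.
    $candidates = array_map('trim', explode(',', $raw));

    // Keep only syntactically valid IPv4/IPv6 addresses.
    $valid = array_filter(
        $candidates,
        static fn (string $ip): bool => filter_var($ip, FILTER_VALIDATE_IP) !== false
    );

    // Remove duplicates and reindex from zero.
    return array_values(array_unique($valid));
}
```

Validating each entry means a typo in .env results in a missing allow-list entry rather than a silently accepted garbage value.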

🎨 Blade Directive: @botProtectionMeta

Drop one line into your <head> and the package renders the standard robots meta tags using your configured x_robots_tag value:

<!doctype html>
<html>
<head>
    <meta charset="utf-8">
    <title>My App</title>

    @botProtectionMeta
</head>

Renders:

<meta name="robots" content="noindex, nofollow, noarchive, nosnippet">
<meta name="googlebot" content="noindex, nofollow, noarchive, nosnippet">
<meta name="googlebot-news" content="noindex">
<meta name="bingbot" content="noindex, nofollow, noarchive, nosnippet">
<meta name="robots" content="noai, noimageai">

The last <meta> is the AI opt-out directive, an emerging convention adopted by DeviantArt, ArtStation, and Squarespace. Some AI scrapers already respect it. Disable it via BOT_PROTECTION_AI_META_TAGS="".

If both x_robots_tag and ai_meta_tags are empty, the directive renders nothing.
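The rendering rules described above can be sketched as a plain PHP function. This is an illustration only, not the package's directive: renderBotProtectionMeta is a hypothetical name, and treating the googlebot-news tag as a fixed noindex value is an assumption read off the rendered output shown earlier.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of the meta-tag output (not the package's own code).
 *
 * $xRobotsTag  e.g. "noindex, nofollow, noarchive, nosnippet"
 * $aiMetaTags  e.g. "noai, noimageai"
 */
function renderBotProtectionMeta(string $xRobotsTag, string $aiMetaTags): string
{
    $lines = [];

    $robots = trim($xRobotsTag);
    if ($robots !== '') {
        // Escape the value so it is safe inside an HTML attribute.
        $value = htmlspecialchars($robots, ENT_QUOTES, 'UTF-8');
        $lines[] = '<meta name="robots" content="' . $value . '">';
        $lines[] = '<meta name="googlebot" content="' . $value . '">';
        // Assumption: this tag is always plain "noindex".
        $lines[] = '<meta name="googlebot-news" content="noindex">';
        $lines[] = '<meta name="bingbot" content="' . $value . '">';
    }

    $ai = trim($aiMetaTags);
    if ($ai !== '') {
        $lines[] = '<meta name="robots" content="'
            . htmlspecialchars($ai, ENT_QUOTES, 'UTF-8') . '">';
    }

    // Both settings empty: nothing is rendered.
    return implode("\n", $lines);
}
```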

🔄 Dynamic /robots.txt Route

Instead of publishing a static public/robots.txt and keeping it in sync with your config, opt in to a dynamic route:

BOT_PROTECTION_GENERATE_ROBOTS_ROUTE=true

The package registers GET /robots.txt, which emits content generated from your blocked_agents config. Change the config and robots.txt updates with it, giving you a single source of truth.

⚠️ If public/robots.txt exists, your web server (Nginx/Apache) serves the static file first and the dynamic route never fires. Delete public/robots.txt for full dynamic behavior.
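To make the idea of generating robots.txt from the agent list concrete, here is a minimal sketch in plain PHP. buildRobotsTxt is a hypothetical function, and the exact layout the package emits (grouping, comments, extra directives) is not documented here, so treat the output format as an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: build a robots.txt body that disallows every
 * listed agent. Not the package's actual generator.
 *
 * @param list<string> $blockedAgents
 */
function buildRobotsTxt(array $blockedAgents): string
{
    $groups = [];

    foreach ($blockedAgents as $agent) {
        // Needles such as "curl/" are User-Agent substrings; drop the
        // trailing slash so the robots.txt token is a plain product name.
        $token = rtrim(trim($agent), '/');
        if ($token === '') {
            continue;
        }
        $groups[] = "User-agent: {$token}\nDisallow: /";
    }

    if ($groups === []) {
        return '';
    }

    // One blank line between groups, newline at end of file.
    return implode("\n\n", $groups) . "\n";
}
```

In a Laravel route the resulting string would be returned with a text/plain content type.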

📡 BotBlocked Event

Every block fires a Mkopcic\BotProtection\Events\BotBlocked event with full request context. Listen to it for logging, alerting, or analytics:

// app/Providers/AppServiceProvider.php
use Illuminate\Support\Facades\Event;
use Illuminate\Support\Facades\Log;
use Mkopcic\BotProtection\Events\BotBlocked;

public function boot(): void
{
    Event::listen(function (BotBlocked $event) {
        // Pass the event's properties explicitly as log context.
        // The 'bots' channel must be defined in config/logging.php.
        Log::channel('bots')->info('Blocked', [
            'user_agent'    => $event->userAgent,    // full UA string
            'ip'            => $event->ip,           // client IP
            'url'           => $event->url,          // full URL the bot tried
            'matched_agent' => $event->matchedAgent, // which needle from blocked_agents matched
        ]);
    });
}

Or use a dedicated listener class:

php artisan make:listener LogBlockedBot --event="Mkopcic\BotProtection\Events\BotBlocked"
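The closure example logs to a channel named bots, which a fresh Laravel install does not define. A channel along the following lines could be added to config/logging.php. The channel name matches the example above; the file path, level, and retention period are illustrative assumptions, not package defaults.

```php
<?php
// config/logging.php (excerpt): merge the 'bots' entry into the
// existing 'channels' array rather than replacing the whole file.
return [
    'channels' => [
        'bots' => [
            'driver' => 'daily',                        // one file per day
            'path'   => storage_path('logs/bots.log'),  // assumed location
            'level'  => 'info',
            'days'   => 14,                             // assumed retention
        ],
    ],
];
```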

📝 Built-in Logging

If you don't need custom event handling, just turn on logging:

BOT_PROTECTION_LOG_BLOCKED=true
BOT_PROTECTION_LOG_CHANNEL=daily

Every blocked request writes a warning to the chosen channel with user_agent, ip, url, and matched_agent in the context.

🛠️ Manual Middleware Registration

If you want full control (e.g. apply only to specific route groups), disable auto-register:

BOT_PROTECTION_AUTO_REGISTER=false

Then register manually.

Laravel 11 / 12 / 13, in bootstrap/app.php:

use Illuminate\Foundation\Configuration\Middleware;
use Mkopcic\BotProtection\Http\Middleware\BotProtectionMiddleware;

->withMiddleware(function (Middleware $middleware) {
    // Append the bot filter to the end of the web middleware group.
    $middleware->web(append: [
        BotProtectionMiddleware::class,
    ]);
})

Laravel 10, in app/Http/Kernel.php:

protected $middlewareGroups = [
    'web' => [
        // ...
        \Mkopcic\BotProtection\Http\Middleware\BotProtectionMiddleware::class,
    ],
];

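Or apply per-route:

use Illuminate\Support\Facades\Route;
use Mkopcic\BotProtection\Http\Middleware\BotProtectionMiddleware;

Route::middleware(BotProtectionMiddleware::class)->group(function () {
    // protected routes
});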

🧪 Artisan Command: bot-protection:test

The package ships with a built-in tester with two subactions: url and config.

config: dump current configuration

php artisan bot-protection:test config

Outputs all settings, allowed IPs, and the full list of blocked agents.

url: fire HTTP requests with bot User-Agents

# Default 3 representative agents (GPTBot, ClaudeBot, PerplexityBot)
php artisan bot-protection:test url https://example.com

# Specific agent
php artisan bot-protection:test url https://example.com --agent=GPTBot

# Test every agent from config
php artisan bot-protection:test url https://example.com --all

# Custom timeout
php artisan bot-protection:test url https://example.com --timeout=30

Sample output:

Testiranje: https://example.com
Broj agenata: 3

  ✓ BLOCKED [403] GPTBot
  ✓ BLOCKED [403] ClaudeBot
  ✓ BLOCKED [403] PerplexityBot

───────────────────────────────────────
Blocked: 3   Allowed: 0   Errors: 0

Returns exit code 0 if all agents are blocked, 1 if any get through.
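For readers who want to reproduce the check outside Artisan, the following standalone sketch sends a request with a chosen User-Agent and tallies the results the way the sample output does. probeStatus and summarize are hypothetical names, the code needs ext-curl, and counting request errors as a failing exit code is an assumption; the command's real implementation may differ.

```php
<?php
declare(strict_types=1);

/**
 * Request $url with the given User-Agent and return the HTTP status,
 * or null if the request itself failed (DNS error, timeout, ...).
 */
function probeStatus(string $url, string $userAgent, int $timeoutSeconds = 10): ?int
{
    $handle = curl_init($url);
    if ($handle === false) {
        return null;
    }

    curl_setopt_array($handle, [
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_RETURNTRANSFER => true,   // do not echo the body
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);

    if (curl_exec($handle) === false) {
        return null;
    }

    return (int) curl_getinfo($handle, CURLINFO_RESPONSE_CODE);
}

/**
 * Tally statuses keyed by agent name.
 *
 * @param array<string, int|null> $statusByAgent
 * @return array{blocked: int, allowed: int, errors: int, exitCode: int}
 */
function summarize(array $statusByAgent, int $blockStatus = 403): array
{
    $blocked = 0;
    $allowed = 0;
    $errors = 0;

    foreach ($statusByAgent as $status) {
        if ($status === null) {
            $errors++;
        } elseif ($status === $blockStatus) {
            $blocked++;
        } else {
            $allowed++;
        }
    }

    return [
        'blocked'  => $blocked,
        'allowed'  => $allowed,
        'errors'   => $errors,
        // Assumption: anything other than "all blocked" is a failure.
        'exitCode' => ($allowed === 0 && $errors === 0) ? 0 : 1,
    ];
}
```

A typical use would loop over a few agent names, call probeStatus for each, and pass the collected array to summarize.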

🌐 Server-Level Protection (Recommended)

The middleware protects at the Laravel layer. For defense in depth, block bots at the web server too: they never reach PHP, which saves CPU.

Publish the server config examples:

php artisan vendor:publish --tag=bot-protection-server

You'll get a bot-protection/ directory with:

| File | Use |
|------|-----|
| nginx-shared-map.conf | Drop in /etc/nginx/conf.d/ once; defines the $blocked_bot map for all vhosts |
| nginx-vhost-snippet.conf | Paste into each Nginx server {} block |
| apache-vhost-snippet.conf | Full Apache vhost example with SetEnvIf |
| htaccess-snippet.txt | .htaccess rules (when you can't edit vhosts) |

🧬 How It Works

   ┌─────────────────────┐
   │  Incoming Request   │
   └──────────┬──────────┘
              ▼
   ┌────────────────────────┐
   │   Web Server           │  ← optional: blocks at nginx/apache layer
   │   (nginx/apache)       │
   └──────────┬─────────────┘
              ▼
   ┌────────────────────────┐
   │  BotProtection         │
   │  Middleware            │
   │                        │
   │  1. Check enabled?     │
   │  2. IP in allow-list?  │
   │  3. UA matches bot?    │──── YES ──▶  HTTP 403
   │  4. Empty UA + flag?   │
   └──────────┬─────────────┘
              │ NO
              ▼
   ┌────────────────────────┐
   │   Laravel App          │
   └──────────┬─────────────┘
              ▼
   ┌────────────────────────┐
   │  Response              │
   │  + X-Robots-Tag header │
   └────────────────────────┘
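The four checks inside the middleware box can be sketched as one pure function. This is a simplified illustration, not the package's source: matchBlockedAgent and its parameters are hypothetical, and the case-insensitive substring match is an assumption based on the "needle" wording and the case-insensitivity test listed in this README.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of the decision flow. Returns the matched needle
 * (or "(empty)" for a blocked empty User-Agent), or null when the
 * request should pass through to the application.
 *
 * @param list<string> $blockedAgents
 * @param list<string> $allowedIps
 */
function matchBlockedAgent(
    string $userAgent,
    string $clientIp,
    array $blockedAgents,
    array $allowedIps = [],
    bool $enabled = true,
    bool $blockEmptyUa = false,
): ?string {
    // 1. Master toggle off: never block.
    if (!$enabled) {
        return null;
    }

    // 2. Allow-listed IPs bypass all further checks.
    if (in_array($clientIp, $allowedIps, true)) {
        return null;
    }

    // 3. Case-insensitive substring match against each needle.
    foreach ($blockedAgents as $needle) {
        if ($needle !== '' && stripos($userAgent, $needle) !== false) {
            return $needle;
        }
    }

    // 4. Optionally treat a missing User-Agent as a bot.
    if ($blockEmptyUa && trim($userAgent) === '') {
        return '(empty)';
    }

    return null;
}
```

A non-null result corresponds to the "YES, HTTP 403" branch of the diagram; null corresponds to the "NO" branch, where the request continues and the X-Robots-Tag header is added to the response.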

🧪 Running Tests

composer install
./vendor/bin/pest

The Pest suite covers:

  • ✅ Blocking known bot User-Agents
  • ✅ Allowing legitimate browser User-Agents
  • ✅ Adding X-Robots-Tag header to passed responses
  • ✅ Case-insensitive User-Agent matching
  • ✅ enabled=false bypass
  • ✅ Custom block status codes
  • ✅ Custom block messages
  • ✅ Empty x_robots_tag disables header
  • ✅ Empty User-Agent handling (both modes)
  • ✅ Allowed-IP bypass

🤖 What's Blocked Out of the Box

Full list (33 agents):

| Category | Agents |
|----------|--------|
| OpenAI | GPTBot, ChatGPT-User, OAI-SearchBot |
| Anthropic | ClaudeBot, anthropic-ai, Claude-Web |
| Google | Google-Extended, Googlebot, AdsBot-Google |
| Meta | Meta-ExternalAgent, FacebookBot, facebookexternalhit |
| Apple | Applebot, Applebot-Extended |
| Amazon | Amazonbot |
| Perplexity | PerplexityBot |
| ByteDance | Bytespider |
| Common Crawl | CCBot |
| Cohere | cohere-ai |
| Mistral | MistralAI-User |
| Diffbot | Diffbot |
| SEO crawlers | SemrushBot, AhrefsBot, MJ12bot, DotBot, BLEXBot |
| Eastern engines | YandexBot, Baiduspider, Sogou |
| Generic scrapers | Scrapy, python-requests, curl/, wget/ |

You can add, remove, or fully override the list by publishing config and editing blocked_agents.
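As a rough illustration of such an override, the excerpt below shows what an edited blocked_agents entry might look like in the published config. Only the key name blocked_agents comes from this README; the flat list-of-strings shape is an assumption, and ExampleScraperBot is a made-up name.

```php
<?php
// config/bot-protection.php (published copy), excerpt only.
// All other keys from the published file stay as they are.
return [
    'blocked_agents' => [
        'GPTBot',
        'ClaudeBot',
        'PerplexityBot',
        // Added for illustration: a hypothetical crawler name.
        'ExampleScraperBot',
        // 'Googlebot' deliberately left out of this copy, so requests
        // from Google Search are no longer answered with 403.
    ],
];
```

If the application caches its configuration, the cache has to be rebuilt (php artisan config:cache) before the edited list takes effect.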

⚠️ What This Package Is NOT

  • ❌ Not authentication. If content must be private, use Laravel auth, Basic Auth, or Cloudflare Zero Trust.
  • ❌ Not foolproof against UA spoofing. A determined scraper can fake any User-Agent. This package targets mass crawlers that identify themselves correctly.
  • ❌ Not a WAF. For rate limiting, geo-blocking, DDoS protection, layer in Cloudflare or a dedicated WAF.

For maximum protection: this package + server-level rules + authentication for sensitive content.

🔗 Related

🤝 Contributing

Contributions are welcome! Please open an issue or PR.

For new bot User-Agents to add to the default list, please include a source link (the bot's official documentation page).

📜 License

The MIT License (MIT). See LICENSE for details.

Built with ❤️ for the Laravel community.

If this package saved your bandwidth or your sanity, ⭐ the repo!