rtfirst/llms-txt

LLMs.txt Generator - Generates llms.txt files for AI/LLM crawlers with website content in Markdown format, with optional API key protection.

github.com/rtfirst/llms-txt

Type: typo3-cms-extension

pkg:composer/rtfirst/llms-txt

1.0.7 2026-02-06 22:36 UTC

Generates llms.txt for AI/LLM crawlers - a compact index of your website with SEO metadata and instructions for accessing page content in any language. Optionally protect access with an API key.

Note: This extension implements the llmstxt.org specification.

Concept

The extension provides a two-tier approach for LLM content access:

  1. llms.txt - A single index file containing:

    • Website metadata (title, description, domain)
    • Page structure with SEO descriptions and keywords
    • Instructions for accessing full page content
  2. Page content - Each page is also available as Markdown, following the llmstxt.org convention:

    • .md suffix - Clean Markdown (e.g., /page.md)

Multi-Language Support

Instead of generating separate llms.txt files per language, this extension uses a simpler approach:

  • Single llms.txt - Contains the site structure in the default language
  • Language-specific content - Access any page in any language using the .md suffix with language URL prefix:
    • Default: https://example.com/about.md
    • English: https://example.com/en/about.md
    • German: https://example.com/de/about.md

This keeps a single index file and reuses the language URL prefixes the site already has, so no per-language llms.txt files need to be generated.
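
The URL scheme above can be sketched in a few lines of Python. This is only an illustration of how a client might assemble the addresses; the function name markdown_url and its parameters are hypothetical and not part of the extension. The root page is deliberately left out, because the example output further down lists it under a different form (/index.html.md).

```python
def markdown_url(base_url: str, slug: str, language_prefix: str = "") -> str:
    """Build the .md URL for a page slug, optionally under a language prefix.

    Hypothetical client-side helper; it only mirrors the URL pattern
    described in the README.
    """
    slug = slug.strip("/")
    if not slug:
        # The root page is listed differently in llms.txt, so it is not guessed here.
        raise ValueError("slug must not be empty")
    # Drop stray slashes so the joined path never contains "//".
    parts = [part for part in (language_prefix.strip("/"), slug) if part]
    return f"{base_url.rstrip('/')}/{'/'.join(parts)}.md"


print(markdown_url("https://example.com", "about"))        # https://example.com/about.md
print(markdown_url("https://example.com", "about", "en"))  # https://example.com/en/about.md
print(markdown_url("https://example.com", "about", "de"))  # https://example.com/de/about.md
```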

Features

  • Automatic generation: llms.txt is regenerated whenever the TYPO3 cache is cleared
  • Page properties tab: Configure LLM-specific metadata for each page
  • HTML header link: Adds a <link rel="alternate"> tag pointing to llms.txt to HTML pages
  • Clean output formats: Well-formatted HTML and Markdown without excessive whitespace
  • Flexible configuration: Via Site Settings and page properties

Requirements

  • TYPO3 13.0 - 14.x
  • PHP 8.2+

Installation

composer require rtfirst/llms-txt

Then activate the extension (commands shown for a DDEV environment):

ddev typo3 extension:setup
ddev typo3 cache:flush

Configuration

Site Settings

Add the Site Set "LLMs.txt Generator" to your site configuration, then configure the following options in Site Settings:

Setting                 Description
llmsTxt.baseUrl         Full URL of the website (e.g., https://example.com)
llmsTxt.intro           Website description shown in the intro section
llmsTxt.excludePages    Comma-separated page UIDs to exclude
llmsTxt.includeHidden   Include hidden pages (default: false)
llmsTxt.apiKey          API key for protected access (empty = public access)

Page Properties (LLM Tab)

Each page has an "LLM" tab with these fields:

Field                   Description
Exclude from llms.txt   Do not include this page in the index
LLM Priority            Value from 0 to 100; pages with higher values appear first in the list
LLM Description         Custom description (fallback: meta description)
LLM Summary             Additional summary text, shown as a quote
LLM Keywords            Comma-separated topics for this page
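
To make the interplay of these fields concrete, here is a small Python model of the documented behaviour: excluded pages are dropped, higher priorities come first, and the LLM description falls back to the meta description. It is a sketch under assumptions, not the extension's code. The Page class and index_entries function are hypothetical, the list is flat rather than hierarchical, and keeping the input order for equal priorities is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Page:
    """Hypothetical stand-in for a page record with its LLM tab fields."""
    title: str
    priority: int = 0           # "LLM Priority", 0-100
    exclude: bool = False       # "Exclude from llms.txt"
    llm_description: str = ""   # "LLM Description"
    meta_description: str = ""  # regular meta description (fallback)


def index_entries(pages: List[Page]) -> List[Tuple[str, str]]:
    """Return (title, description) pairs in the order they would be listed."""
    visible = [page for page in pages if not page.exclude]
    # sorted() is stable, so pages with equal priority keep their input order
    # (that tie-breaking rule is an assumption of this sketch).
    ordered = sorted(visible, key=lambda page: page.priority, reverse=True)
    return [(page.title, page.llm_description or page.meta_description) for page in ordered]


pages = [
    Page("About", priority=10, meta_description="Company history"),
    Page("Services", priority=90, llm_description="What we offer"),
    Page("Imprint", priority=50, exclude=True),
]
print(index_entries(pages))
# [('Services', 'What we offer'), ('About', 'Company history')]
```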

Output File

After a cache flush, llms.txt is written to the public/ directory.

Content Access Formats

Markdown (.md suffix)

Returns clean Markdown with YAML frontmatter, following the llmstxt.org specification.

https://example.com/about.md

Output:

---
title: "About Us"
description: "Learn about our company..."
language: en
date: 2026-01-31
canonical: "/about"
format: markdown
generator: "TYPO3 LLMs.txt Extension"
---

# About Us

> Learn about our company...

## Our History

Our company was founded in 1985...

## Our Values

- Quality and reliability
- Fair and transparent prices
- Personal consultation
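
A consumer of this output usually wants the frontmatter fields and the Markdown body separately. The following Python sketch shows one way to split them using only the standard library. The function name split_frontmatter is hypothetical, and the parser handles just the flat "key: value" lines seen in the example above; for anything more complex a real YAML parser would be the safer choice.

```python
from typing import Dict, Tuple


def split_frontmatter(document: str) -> Tuple[Dict[str, str], str]:
    """Split a .md response into (frontmatter fields, Markdown body).

    Simplified sketch: supports only flat "key: value" lines, not full YAML.
    """
    lines = document.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, document  # no frontmatter block present

    fields: Dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":  # closing delimiter reached
            body = "\n".join(lines[index + 1:]).lstrip("\n")
            return fields, body
        key, separator, value = line.partition(":")
        if separator:
            # Surrounding double quotes are dropped from the value.
            fields[key.strip()] = value.strip().strip('"')
    return {}, document  # opening delimiter without a closing one


sample = '---\ntitle: "About Us"\nlanguage: en\ncanonical: "/about"\n---\n\n# About Us\n'
fields, body = split_frontmatter(sample)
print(fields["title"], fields["language"])  # About Us en
```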

Accessing Different Languages

Simply use the language prefix with the .md suffix:

# German (default)
https://example.com/ueber-uns.md

# English
https://example.com/en/about.md

# French
https://example.com/fr/a-propos.md

API Key Protection

You can protect both /llms.txt and the .md suffix endpoint with an API key. This is useful when you want to:

  • Restrict access to your own chatbots/RAG systems
  • Prevent external scraping of structured content
  • Control who can access your LLM-optimized content

Configuration

Set llmsTxt.apiKey in your Site Settings. Leave it empty for public access (the default).

Usage

Pass the API key via HTTP header (recommended):

# Access llms.txt
curl -H "X-LLM-API-Key: your-secret-key" https://example.com/llms.txt

# Access page as Markdown
curl -H "X-LLM-API-Key: your-secret-key" https://example.com/about.md

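Or via query parameter. The key then becomes part of the URL, which is commonly recorded in server logs, so prefer the header where possible: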

https://example.com/llms.txt?api_key=your-secret-key
https://example.com/about.md?api_key=your-secret-key
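
The same header-based access can be scripted. Below is a minimal Python sketch using only the standard library; the helper names build_request and fetch_text are hypothetical, and the only details taken from this README are the X-LLM-API-Key header name and the example URLs.

```python
import urllib.request

API_KEY_HEADER = "X-LLM-API-Key"  # header name as documented above


def build_request(url: str, api_key: str = "") -> urllib.request.Request:
    """Create a GET request that carries the API key header when a key is given."""
    request = urllib.request.Request(url, method="GET")
    if api_key:
        request.add_header(API_KEY_HEADER, api_key)
    return request


def fetch_text(url: str, api_key: str = "", timeout: float = 10.0) -> str:
    """Fetch llms.txt or a .md page and return the decoded response body."""
    with urllib.request.urlopen(build_request(url, api_key), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

A call such as fetch_text("https://example.com/llms.txt", "your-secret-key") would then return the index as a string. With a wrong or missing key, urlopen raises urllib.error.HTTPError for the 401 response.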

n8n Integration

In the n8n HTTP Request node, add the header:

Name            Value
X-LLM-API-Key   your-secret-key

Error Response

An invalid or missing API key returns 401 Unauthorized:

{
  "error": "Unauthorized",
  "message": "Valid API key required. Provide via X-LLM-API-Key header or api_key query parameter."
}
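
A client can turn this response into a readable message. The Python sketch below assumes the JSON shape shown above ("error" and "message" keys) and falls back gracefully when the body looks different; the function name explain_failure is hypothetical.

```python
import json
from typing import Optional


def explain_failure(status: int, body: str) -> Optional[str]:
    """Return a readable message for a 401 response, or None for other statuses."""
    if status != 401:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return "Unauthorized (response body was not JSON)"
    if not isinstance(payload, dict):
        return "Unauthorized"
    error = str(payload.get("error", "Unauthorized"))
    message = str(payload.get("message", ""))
    return f"{error}: {message}" if message else error


body = '{"error": "Unauthorized", "message": "Valid API key required."}'
print(explain_failure(401, body))  # Unauthorized: Valid API key required.
```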

Example llms.txt Output

# My Website

> Your expert for quality products and services.

**Specification:** <https://llmstxt.org/>
**Domain:** https://example.com
**Language:** de
**Generated:** 2026-01-31 12:00:00

## LLM-Optimized Content Access

This site provides LLM-friendly Markdown output for all pages:

### Markdown Format
Append `.md` to any page URL to get plain Markdown with YAML frontmatter.
- **Example:** `https://example.com/page-slug.md`

### Multi-Language Access
Use language-specific URL prefixes with the `.md` suffix:
- **Default language:** `https://example.com/page.md`
- **English:** `https://example.com/en/page.md`
- **Other languages:** Use configured prefix (e.g., `/de/page.md`, `/fr/page.md`)

## Page Structure

- **[Home](/)**
  Welcome to our website with all important information.
  [Markdown](/index.html.md)

  - **[About](/about/)**
    Learn about our company history and values.
    [Markdown](/about.md)

  - **[Services](/services/)**
    Professional services for your needs.
    *Keywords: services, consulting, support*
    [Markdown](/services.md)

- **[Contact](/contact/)**
  Get in touch with us via phone or email.
  [Markdown](/contact.md)
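
For a RAG pipeline or crawler, the useful part of this file is the list of Markdown links. The following Python sketch extracts them with a regular expression that matches the "[Markdown](...)" lines in the example above. The function names are hypothetical, and the pattern is an assumption based on this sample output only.

```python
import re
from typing import List
from urllib.parse import urljoin

# Matches the "[Markdown](/path.md)" entries of the Page Structure section.
MARKDOWN_LINK = re.compile(r"\[Markdown\]\((?P<path>[^)\s]+\.md)\)")


def markdown_paths(llms_txt: str) -> List[str]:
    """Return the .md paths listed in an llms.txt document, in file order."""
    return [match.group("path") for match in MARKDOWN_LINK.finditer(llms_txt)]


def absolute_urls(llms_txt: str, base_url: str) -> List[str]:
    """Resolve the listed paths against the site's base URL."""
    return [urljoin(base_url, path) for path in markdown_paths(llms_txt)]


sample = """- **[Home](/)**
  Welcome to our website with all important information.
  [Markdown](/index.html.md)

  - **[About](/about/)**
    [Markdown](/about.md)
"""
print(markdown_paths(sample))  # ['/index.html.md', '/about.md']
```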

robots.txt Configuration

Add these lines to your public/robots.txt to allow AI crawlers to fetch llms.txt:

# Allow AI crawlers to access llms.txt
User-agent: GPTBot
Allow: /llms.txt

User-agent: Claude-Web
Allow: /llms.txt

User-agent: Anthropic-AI
Allow: /llms.txt
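
To check that a given robots.txt really permits a crawler to fetch llms.txt, Python's standard urllib.robotparser can evaluate the rules offline. This is a verification sketch only; the function name can_fetch_llms_txt is hypothetical, and the second rule set is an invented example of a more restrictive file.

```python
from urllib.robotparser import RobotFileParser


def can_fetch_llms_txt(robots_txt: str, user_agent: str, path: str = "/llms.txt") -> bool:
    """Evaluate robots.txt rules for one user agent without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


documented = """\
User-agent: GPTBot
Allow: /llms.txt
"""

# Hypothetical stricter file: everything is blocked except llms.txt for GPTBot.
restrictive = """\
User-agent: *
Disallow: /

User-agent: GPTBot
Allow: /llms.txt
"""

print(can_fetch_llms_txt(documented, "GPTBot"))     # True
print(can_fetch_llms_txt(restrictive, "GPTBot"))    # True
print(can_fetch_llms_txt(restrictive, "OtherBot"))  # False
```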

HTML Header Link

The extension automatically adds a link tag to all HTML pages:

<link rel="alternate" type="text/plain" href="/llms.txt" title="LLM Content Guide">

This helps AI crawlers discover the llms.txt file from any page.
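
From the crawler's side, discovery could look like the following Python sketch, which scans an HTML document for a matching link tag with the standard html.parser module. The class and function names are hypothetical; the matching criteria (rel="alternate" and type="text/plain") are taken from the tag shown above.

```python
from html.parser import HTMLParser
from typing import List


class LlmsTxtLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate" type="text/plain"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # Attribute names arrive lowercased; valueless attributes have value None.
        attributes = {name: (value or "") for name, value in attrs}
        rel_values = attributes.get("rel", "").lower().split()
        is_plain_text = attributes.get("type", "").lower() == "text/plain"
        if "alternate" in rel_values and is_plain_text and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def find_llms_txt_links(html: str) -> List[str]:
    """Return the hrefs of all matching link tags in document order."""
    finder = LlmsTxtLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.hrefs


page = '<head><link rel="alternate" type="text/plain" href="/llms.txt" title="LLM Content Guide"></head>'
print(find_llms_txt_links(page))  # ['/llms.txt']
```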

Development

Code Quality

# Static analysis (from DDEV project root)
ddev exec vendor/bin/phpstan analyse packages/llms_txt/Classes --level=8

# Code style check
ddev exec vendor/bin/php-cs-fixer fix packages/llms_txt --dry-run

# Fix code style
ddev exec vendor/bin/php-cs-fixer fix packages/llms_txt

Testing

# Run unit tests (from DDEV project root)
ddev exec "cd packages/llms_txt && ../../vendor/bin/phpunit --bootstrap ../../vendor/autoload.php"

CI Pipeline

The extension includes a GitHub Actions workflow (.github/workflows/ci.yaml) that runs:

  • PHP-CS-Fixer (code style)
  • PHPStan Level 8 (static analysis)
  • Rector (code modernization)
  • Unit Tests (PHP 8.2-8.4, TYPO3 13 & 14)

Author

Roland Tfirst
Email: roland@tfirst.de

License

GPL-2.0-or-later