rtfirst / llms-txt
LLMs.txt Generator - Generates llms.txt files for AI/LLM crawlers with website content in Markdown format, with optional API key protection.
Package info
Type: typo3-cms-extension
pkg:composer/rtfirst/llms-txt
Requires
- php: ^8.2 || ^8.3 || ^8.4
- league/html-to-markdown: ^5.1
- typo3/cms-core: ^13.0 || ^14.0
- typo3/cms-frontend: ^13.0 || ^14.0
Requires (Dev)
- friendsofphp/php-cs-fixer: ^3.0
- phpstan/phpstan: ^2.1
- rector/rector: ^2.0
- saschaegerer/phpstan-typo3: ^2.0
- typo3/testing-framework: ^8.0 || ^9.0
README
Generates llms.txt for AI/LLM crawlers - a compact index of your website with SEO metadata and instructions for accessing page content in any language. Optionally protect access with an API key.
Note: This extension implements the llmstxt.org specification.
Concept
The extension provides a two-tier approach for LLM content access:
- llms.txt - A single index file containing:
  - Website metadata (title, description, domain)
  - Page structure with SEO descriptions and keywords
  - Instructions for accessing full page content
- Content Format - Access page content via (spec-compliant with llmstxt.org):
  - .md suffix - Clean Markdown (e.g., /page.md)
Multi-Language Support
Instead of generating separate llms.txt files per language, this extension uses a simpler approach:
- Single llms.txt - Contains the site structure in the default language
- Language-specific content - Access any page in any language using the .md suffix with the language URL prefix:
  - Default: https://example.com/about.md
  - English: https://example.com/en/about.md
  - German: https://example.com/de/about.md
This approach avoids maintaining one index per language and reuses the language URL prefixes that a multi-language site already serves.
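The URL scheme above can be sketched in a few lines of Python. This is a minimal client-side illustration, not part of the extension: `markdown_url` and its parameters are hypothetical names, and the language prefixes are assumed to match whatever the site configuration defines.

```python
def markdown_url(base_url: str, slug: str, language_prefix: str = "") -> str:
    """Build the .md URL for a page slug, optionally below a language prefix.

    Hypothetical helper: it only mirrors the URL pattern shown in this README.
    """
    slug = slug.strip("/")
    if not slug:
        # The home page has no slug; the example llms.txt output links it
        # differently, so this sketch does not try to guess that URL.
        raise ValueError("a non-empty page slug is required")
    segments = [language_prefix.strip("/"), slug]
    path = "/".join(segment for segment in segments if segment)
    return f"{base_url.rstrip('/')}/{path}.md"


if __name__ == "__main__":
    print(markdown_url("https://example.com", "about"))
    print(markdown_url("https://example.com", "about", "en"))
    print(markdown_url("https://example.com", "about", "de"))
```

Leaving `language_prefix` empty yields the default-language URL, which matches the first entry in the list above.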
Features
- Automatic generation of llms.txt when TYPO3 cache is cleared
- Page properties tab: Configure LLM-specific metadata for each page
- HTML header link: Adds <link rel="alternate"> to HTML pages
- Clean output formats: Well-formatted HTML and Markdown without excessive whitespace
- Flexible configuration: Via Site Settings and page properties
Requirements
- TYPO3 13.0 - 14.x
- PHP 8.2+
Installation
composer require rtfirst/llms-txt
Then activate the extension:
ddev typo3 extension:setup
ddev typo3 cache:flush
Configuration
Site Settings
Add the Site Set "LLMs.txt Generator" to your site configuration, then configure in Site Settings:
| Setting | Description |
|---|---|
| llmsTxt.baseUrl | Full URL of the website (e.g., https://example.com) |
| llmsTxt.intro | Website description shown in the intro section |
| llmsTxt.excludePages | Comma-separated page UIDs to exclude |
| llmsTxt.includeHidden | Include hidden pages (default: false) |
| llmsTxt.apiKey | API key for protected access (empty = public access) |
Page Properties (LLM Tab)
Each page has an "LLM" tab with these fields:
| Field | Description |
|---|---|
| Exclude from llms.txt | Don't include this page in the index |
| LLM Priority | Higher values (0-100) appear first in the list |
| LLM Description | Custom description (fallback: meta description) |
| LLM Summary | Additional summary text shown as quote |
| LLM Keywords | Comma-separated topics for this page |
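How these fields might interact when the index is built can be sketched as follows. This is an illustration of the rules in the table only (exclusion, priority ordering, description fallback), not the extension's actual code: the `Page` class and `index_entries` function are hypothetical, and keeping the original order for pages with equal priority is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a page record with its LLM tab fields."""
    title: str
    priority: int = 0            # LLM Priority, 0-100
    excluded: bool = False       # Exclude from llms.txt
    llm_description: str = ""    # LLM Description
    meta_description: str = ""   # regular meta description (fallback)


def index_entries(pages):
    """Return (title, description) pairs in the order the index would list them."""
    visible = [page for page in pages if not page.excluded]
    # Higher priority first; sorted() is stable, so ties keep their input order
    # (tie handling is an assumption, the README does not specify it).
    ordered = sorted(visible, key=lambda page: page.priority, reverse=True)
    return [(page.title, page.llm_description or page.meta_description) for page in ordered]


if __name__ == "__main__":
    pages = [
        Page("Contact", 10, meta_description="Get in touch"),
        Page("Home", 100, llm_description="Start here", meta_description="Welcome"),
        Page("Imprint", 50, excluded=True),
    ]
    for title, description in index_entries(pages):
        print(title, "-", description)
```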
Output File
After a cache flush, llms.txt is created in public/.
Content Access Formats
Markdown (.md suffix)
Returns clean Markdown with YAML frontmatter. Spec-compliant with llmstxt.org.
https://example.com/about.md
Output:
---
title: "About Us"
description: "Learn about our company..."
language: en
date: 2026-01-31
canonical: "/about"
format: markdown
generator: "TYPO3 LLMs.txt Extension"
---

# About Us

> Learn about our company...

## Our History

Our company was founded in 1985...

## Our Values

- Quality and reliability
- Fair and transparent prices
- Personal consultation
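A consumer of the .md endpoint usually wants the frontmatter and the Markdown body separately. The sketch below is one way to split them with the Python standard library only. It assumes the frontmatter is a flat list of `key: value` pairs as in the example output; it is not a general YAML parser, and `split_front_matter` is a hypothetical name.

```python
def split_front_matter(document: str):
    """Split a Markdown document into (metadata dict, body).

    Assumes a leading '---' line, flat 'key: value' pairs, and a closing '---'.
    Returns ({}, document) when no complete frontmatter block is found.
    """
    lines = document.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, document
    metadata = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[index + 1:]).lstrip("\n")
            return metadata, body
        key, separator, value = line.partition(":")
        if separator:
            # Values stay strings; surrounding double quotes are removed.
            metadata[key.strip()] = value.strip().strip('"')
    return {}, document  # closing delimiter missing


if __name__ == "__main__":
    sample = '---\ntitle: "About Us"\nlanguage: en\ncanonical: "/about"\n---\n\n# About Us\n'
    metadata, body = split_front_matter(sample)
    print(metadata["title"], metadata["language"])
    print(body)
```

For frontmatter with nested structures or multi-line values, a real YAML library would be the safer choice.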
Accessing Different Languages
Simply use the language prefix with the .md suffix:
# German (default)
https://example.com/ueber-uns.md
# English
https://example.com/en/about.md
# French
https://example.com/fr/a-propos.md
API Key Protection
You can protect both /llms.txt and the .md suffix endpoint with an API key. This is useful when you want to:
- Restrict access to your own chatbots/RAG systems
- Prevent external scraping of structured content
- Control who can access your LLM-optimized content
Configuration
Set the llmsTxt.apiKey in your Site Settings. Leave empty for public access (default).
Usage
Pass the API key via HTTP header (recommended):
# Access llms.txt
curl -H "X-LLM-API-Key: your-secret-key" https://example.com/llms.txt
# Access page as Markdown
curl -H "X-LLM-API-Key: your-secret-key" https://example.com/about.md
Or via query parameter:
https://example.com/llms.txt?api_key=your-secret-key
https://example.com/about.md?api_key=your-secret-key
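For scripted access, the header variant can be sketched with Python's standard library. The header name comes from this README; `build_request` and `fetch_text` are hypothetical helper names, and the URLs and key are placeholders.

```python
import urllib.request

API_KEY_HEADER = "X-LLM-API-Key"  # header name documented in this README


def build_request(url: str, api_key: str = "") -> urllib.request.Request:
    """Create a GET request, attaching the API key header only when a key is given."""
    request = urllib.request.Request(url)
    if api_key:
        request.add_header(API_KEY_HEADER, api_key)
    return request


def fetch_text(url: str, api_key: str = "", timeout: float = 10.0) -> str:
    """Fetch llms.txt or a .md page and return it as text (performs a network call)."""
    with urllib.request.urlopen(build_request(url, api_key), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


if __name__ == "__main__":
    request = build_request("https://example.com/llms.txt", "your-secret-key")
    print(request.full_url, request.header_items())
```

Sending the key as a header keeps it out of URLs, which tend to end up in server logs and browser history.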
n8n Integration
In the n8n HTTP Request node, add the header:
| Name | Value |
|---|---|
| X-LLM-API-Key | your-secret-key |
Error Response
An invalid or missing API key returns 401 Unauthorized:
{
"error": "Unauthorized",
"message": "Valid API key required. Provide via X-LLM-API-Key header or api_key query parameter."
}
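A client can turn this response into a clear failure instead of treating the JSON as page content. The following is a minimal sketch that assumes the body has the shape shown above; `UnauthorizedError` and `check_response` are hypothetical names.

```python
import json


class UnauthorizedError(Exception):
    """Raised when the endpoint answers with 401 Unauthorized."""


def check_response(status: int, body: str) -> str:
    """Return the body unchanged, or raise UnauthorizedError for a 401 response."""
    if status != 401:
        return body
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    # Fall back to a generic text if the body is not the documented JSON object.
    raise UnauthorizedError(payload.get("message", "API key missing or invalid"))


if __name__ == "__main__":
    error_body = '{"error": "Unauthorized", "message": "Valid API key required."}'
    try:
        check_response(401, error_body)
    except UnauthorizedError as exc:
        print("Access denied:", exc)
```

With `urllib.request`, a 401 surfaces as an `HTTPError`; its `code` attribute and `read()` method provide the two arguments used here.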
Example llms.txt Output
# My Website

> Your expert for quality products and services.

**Specification:** <https://llmstxt.org/>
**Domain:** https://example.com
**Language:** de
**Generated:** 2026-01-31 12:00:00

## LLM-Optimized Content Access

This site provides LLM-friendly Markdown output for all pages:

### Markdown Format

Append `.md` to any page URL to get plain Markdown with YAML frontmatter.

- **Example:** `https://example.com/page-slug.md`

### Multi-Language Access

Use language-specific URL prefixes with the `.md` suffix:

- **Default language:** `https://example.com/page.md`
- **English:** `https://example.com/en/page.md`
- **Other languages:** Use configured prefix (e.g., `/de/page.md`, `/fr/page.md`)

## Page Structure

- **[Home](/)** Welcome to our website with all important information. [Markdown](/index.html.md)
- **[About](/about/)** Learn about our company history and values. [Markdown](/about.md)
- **[Services](/services/)** Professional services for your needs. *Keywords: services, consulting, support* [Markdown](/services.md)
- **[Contact](/contact/)** Get in touch with us via phone or email. [Markdown](/contact.md)
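To use the index programmatically, the page entries can be extracted with a regular expression. The sketch below relies only on the entry pattern visible in the example (`**[Title](url)**`, a description, optional `*Keywords: ...*`, then `[Markdown](url)`) and does not depend on where line breaks fall. `parse_pages` is a hypothetical helper; real output may contain variations this pattern does not cover.

```python
import re

# One page entry: bold link, free text, then the Markdown link.
ENTRY = re.compile(
    r"\*\*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)\*\*"
    r"(?P<text>.*?)"
    r"\[Markdown\]\((?P<markdown>[^)\s]+)\)",
    re.DOTALL,
)
KEYWORDS = re.compile(r"\*Keywords:\s*(?P<list>[^*]+)\*")


def parse_pages(llms_txt: str):
    """Return a list of dicts with title, url, description, keywords, markdown."""
    pages = []
    for match in ENTRY.finditer(llms_txt):
        text = match.group("text")
        found = KEYWORDS.search(text)
        keywords = [k.strip() for k in found.group("list").split(",")] if found else []
        description = " ".join(KEYWORDS.sub("", text).split())
        pages.append({
            "title": match.group("title"),
            "url": match.group("url"),
            "description": description,
            "keywords": keywords,
            "markdown": match.group("markdown"),
        })
    return pages


if __name__ == "__main__":
    sample = (
        "## Page Structure\n"
        "- **[About](/about/)** Learn about our company history and values. "
        "[Markdown](/about.md)\n"
    )
    for page in parse_pages(sample):
        print(page["title"], "->", page["markdown"])
```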
robots.txt Configuration
Add these lines to your public/robots.txt to allow AI crawlers:
# Allow AI crawlers to access llms.txt
User-agent: GPTBot
Allow: /llms.txt
User-agent: Claude-Web
Allow: /llms.txt
User-agent: Anthropic-AI
Allow: /llms.txt
HTML Header Link
The extension automatically adds a link tag to all HTML pages:
<link rel="alternate" type="text/plain" href="/llms.txt" title="LLM Content Guide">
This helps AI crawlers discover the llms.txt file from any page.
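How a crawler might use this tag can be sketched with Python's built-in HTML parser. This is an illustrative client, not code from the extension; `LlmsLinkFinder` and `find_llms_txt` are hypothetical names, and matching on an href that ends in `llms.txt` is an assumption based on the tag shown above.

```python
from html.parser import HTMLParser


class LlmsLinkFinder(HTMLParser):
    """Remember the href of the first <link rel="alternate"> that points to llms.txt."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href") or ""
        if "alternate" in rel_values and href.endswith("llms.txt"):
            self.href = href


def find_llms_txt(html: str):
    """Return the llms.txt href found in the HTML, or None if there is none."""
    finder = LlmsLinkFinder()
    finder.feed(html)
    return finder.href


if __name__ == "__main__":
    page = '<head><link rel="alternate" type="text/plain" href="/llms.txt" title="LLM Content Guide"></head>'
    print(find_llms_txt(page))
```

The returned href is relative in the example, so a crawler would resolve it against the page URL before requesting it.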
Development
Code Quality
# Static analysis (from DDEV project root)
ddev exec vendor/bin/phpstan analyse packages/llms_txt/Classes --level=8
# Code style check
ddev exec vendor/bin/php-cs-fixer fix packages/llms_txt --dry-run
# Fix code style
ddev exec vendor/bin/php-cs-fixer fix packages/llms_txt
Testing
# Run unit tests (from DDEV project root)
ddev exec "cd packages/llms_txt && ../../vendor/bin/phpunit --bootstrap ../../vendor/autoload.php"
CI Pipeline
The extension includes a GitHub Actions workflow (.github/workflows/ci.yaml) that runs:
- PHP-CS-Fixer (code style)
- PHPStan Level 8 (static analysis)
- Rector (code modernization)
- Unit Tests (PHP 8.2-8.4, TYPO3 13 & 14)
Author
Roland Tfirst
Email: roland@tfirst.de
License
GPL-2.0-or-later