tomstgeorge / silverstripe-llm-markdown
Generates llm.txt and Markdown versions of pages for AI agents, building on staticpublishqueue.
Type: silverstripe-vendormodule
pkg:composer/tomstgeorge/silverstripe-llm-markdown
Requires
- php: ^8.1
- league/html-to-markdown: ^5.1
- silverstripe/cms: ^5.0
- silverstripe/framework: ^5.0
- silverstripe/staticpublishqueue: ^6.3
This package is auto-updated.
Last update: 2026-02-23 18:22:28 UTC
README
Generates llm.txt and Markdown (.md) versions of all statically published pages for AI agents. Builds on silverstripe/staticpublishqueue: when a page is published and the static cache is built, this module also writes a .md file alongside each .html file and can regenerate a single llm.txt from all markdown.
Features
- Per-page Markdown: For every URL that gets a static .html file, a .md file is written to the same path (e.g. index.md, about-us.md) using HTML-to-Markdown conversion.
- Purge on unpublish: When a URL is purged from the static cache, the corresponding .md file is removed.
- llm.txt: A build task regenerates a single llm.txt in the cache root from all .md files (one ## URL section per page).
- Serving Markdown to agents: An optional static request handler serves .md when the request has Accept: text/markdown or Accept: text/plain, otherwise delegates to the normal static HTML handler.
Requirements
- PHP ^8.1
- Silverstripe Framework ^5.0, CMS ^5.0
- silverstripe/staticpublishqueue ^6.3
- league/html-to-markdown ^5.1
Dependencies are installed at the project level (in your root vendor/), not inside the module.
Installation
Install the module and its dependencies via Composer:
composer require tomstgeorge/silverstripe-llm-markdown
If you use a path repository for local development:
{
    "repositories": [
        {
            "type": "path",
            "url": "./silverstripe-llm-markdown-src",
            "options": { "symlink": true }
        }
    ],
    "require": {
        "tomstgeorge/silverstripe-llm-markdown": "@dev"
    }
}
Then run composer update. To avoid duplicate class errors, make sure the module source does not also sit at the project root under the module's own name; give the path repository folder a distinct name such as silverstripe-llm-markdown-src.
Usage
Automatic behaviour
- Publishing: When you publish a page, the static publish queue runs as usual. This module’s publisher extension runs after each URL is generated and writes the same path with a .md extension (same directory as the .html files, typically under your static cache folder).
- Unpublishing / purge: When a URL is purged from the static cache, the matching .md file is deleted.
No extra configuration is required for per-page markdown generation; it works with your existing staticpublishqueue setup.
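The pairing of .html and .md files described above can be pictured with a small helper. This is only a sketch: markdownPathFor() is a hypothetical name, not part of the module's API, and the cache path in the example is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the module): given the path of a cached
 * .html file, return the sibling path that carries a .md extension.
 */
function markdownPathFor(string $htmlPath): string
{
    if (!str_ends_with($htmlPath, '.html')) {
        throw new InvalidArgumentException("Not a .html path: {$htmlPath}");
    }

    // Drop the ".html" suffix and append ".md" in the same directory.
    return substr($htmlPath, 0, -strlen('.html')) . '.md';
}

// Example with an assumed cache location:
echo markdownPathFor('/var/www/public/cache/about-us.html'), PHP_EOL;
```

On purge, the same mapping would identify the .md file to delete alongside the .html file.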
Regenerating llm.txt
After a full static cache build (or whenever you want to refresh the combined file), run the build task:
- URL: https://yoursite.com/dev/tasks/RegenerateLLMTxtTask
- CLI: vendor/bin/sake dev/tasks/RegenerateLLMTxtTask
This scans all .md files in the static cache and writes llm.txt in the cache root, with one ## <url> section per page.
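To make the shape of the combined file concrete, here is a minimal sketch of such a scan. It is not the module's implementation: buildLlmTxt() is a hypothetical function, the rule that index.md maps to the parent URL is an assumption, and the code expects POSIX-style path separators.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: concatenate every .md file under $cacheDir into one
 * string, with a "## <url>" heading per page.
 */
function buildLlmTxt(string $cacheDir): string
{
    $cacheDir = rtrim($cacheDir, '/');
    $files = [];

    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($cacheDir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile() && $file->getExtension() === 'md') {
            $files[] = $file->getPathname();
        }
    }
    sort($files); // predictable section order

    $sections = [];
    foreach ($files as $path) {
        // "<cacheDir>/about-us.md" becomes "/about-us".
        $relative = substr($path, strlen($cacheDir), -strlen('.md'));
        // Assumption: "index" stands for its parent URL, so "/index" becomes "/".
        $url = preg_replace('#/index$#', '/', $relative);
        $sections[] = "## {$url}\n\n" . trim((string) file_get_contents($path));
    }

    return implode("\n\n", $sections) . "\n";
}

// Usage (assumed cache path):
// file_put_contents($cacheDir . '/llm.txt', buildLlmTxt($cacheDir));
```

Because only files with the .md extension are collected, an existing llm.txt in the cache root is not fed back into the next build.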
Serving Markdown to clients that request it
The module does not modify public/index.php or any core Silverstripe files. Like StaticPublishQueue (which provides a static handler file but does not wire it into your project), you choose whether to serve static cache from your front controller.
To serve cached HTML and Markdown (so that requests with Accept: text/markdown or Accept: text/plain get the .md file, and others get .html), use the same cache directory as StaticPublishQueue: get it from the Publisher after booting the kernel, then run the module’s static handler before handling the request. Example for public/index.php:
// After require autoload.php:
$request = HTTPRequestBuilder::createFromEnvironment();
$kernel = new CoreKernel(BASE_PATH);
$kernel->boot();

// Static cache: serve .md for Accept text/markdown|text/plain, else .html
// (optional – requires tomstgeorge/silverstripe-llm-markdown)
$publisher = \SilverStripe\StaticPublishQueue\Publisher::singleton();
if ($publisher instanceof \SilverStripe\StaticPublishQueue\Publisher\FilesystemPublisher) {
    $cacheDir = $publisher->getDestPath();
    $staticHandlerPath = __DIR__ . '/../vendor/tomstgeorge/silverstripe-llm-markdown/includes/staticrequesthandler.php';
    if (is_file($staticHandlerPath)) {
        $staticHandler = require $staticHandlerPath;
        if ($staticHandler($cacheDir)) {
            exit;
        }
    }
}

$app = new HTTPApplication($kernel);
$response = $app->handle($request);
$response->output();
The handler will:
- Serve the .md file with Content-Type: text/markdown; charset=utf-8 when the request includes Accept: text/markdown or Accept: text/plain.
- Otherwise delegate to the staticpublishqueue handler (serve .html or fall through).
Using $publisher->getDestPath() ensures the cache directory matches the one used by the static publisher (same as FilesystemPublisher’s dest path, including any custom destFolder config).
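The Accept-based choice between .md and .html can be sketched in isolation. The function below is hypothetical and only mirrors the behaviour described above. It is not the handler's code, and it deliberately ignores q-values, which a stricter negotiation would take into account.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: pick the cached file extension for an Accept header.
 * Returns "md" for text/markdown or text/plain, otherwise "html".
 */
function cachedExtensionFor(string $acceptHeader): string
{
    // Split "text/markdown, text/html;q=0.8" into bare media types.
    foreach (explode(',', $acceptHeader) as $part) {
        $type = strtolower(trim(explode(';', $part)[0]));
        if ($type === 'text/markdown' || $type === 'text/plain') {
            return 'md';
        }
    }

    return 'html';
}

// Falls back to "html" when no Accept header is present (e.g. on the CLI).
echo cachedExtensionFor($_SERVER['HTTP_ACCEPT'] ?? ''), PHP_EOL;
```

A typical browser Accept header starting with text/html resolves to html here, so regular visitors keep getting the static HTML.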
Configuration
The module replaces the static publish queue’s publisher with TomStGeorge\LLMMarkdown\Publisher\LLMMarkdownPublisher and attaches the markdown extension. Your existing staticpublishqueue config (e.g. disallowed_status_codes, regenerate_children / regenerate_parents) still applies.
Regenerating llm.txt after each job
By default, llm.txt is regenerated at the end of every static publish queue job (generate, delete, full build). To disable this and regenerate only via the build task or your own schedule, set in your YAML:
TomStGeorge\LLMMarkdown\Publisher\LLMMarkdownPublisher:
  regenerate_llm_txt_after_job: false
Optional: to trigger the same regeneration from code (e.g. after a full build job), get the publisher and call regenerateLLMTxt():
use SilverStripe\StaticPublishQueue\Publisher;

$publisher = Publisher::singleton();
if ($publisher instanceof \TomStGeorge\LLMMarkdown\Publisher\LLMMarkdownPublisher) {
    $publisher->regenerateLLMTxt();
}
License
BSD-3-Clause (see LICENSE).