goldziher / html-to-markdown
Modern PHP API for the html_to_markdown native extension powered by the Rust html-to-markdown engine.
Package info
github.com/kreuzberg-dev/html-to-markdown
Language: HTML
Type: php-ext
Ext name: ext-html_to_markdown
pkg:composer/goldziher/html-to-markdown
Requires
- php: ^8.2
Requires (Dev)
- php/pie: ^1.0
This package is auto-updated.
Last update: 2026-04-11 16:26:45 UTC
README
High-performance HTML to Markdown conversion powered by Rust. Ships as native bindings for Rust, Python, TypeScript/Node.js, Ruby, PHP, Go, Java, C#, Elixir, R, C (FFI), and WebAssembly with identical rendering across all runtimes.
Documentation | Live Demo | API Reference
Highlights
- 150-280 MB/s throughput (10-80x faster than pure Python alternatives)
- 12 language bindings with consistent output across all runtimes
- Structured result: convert() returns ConversionResult with content, metadata, tables, images, and warnings
- Metadata extraction: title, headers, links, images, structured data (JSON-LD, Microdata, RDFa)
- Visitor pattern: custom callbacks for content filtering, URL rewriting, domain-specific dialects
- Table extraction: extract structured table data (cells, headers, rendered markdown) during conversion
- Secure by default: built-in HTML sanitization via ammonia
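The visitor idea from the list above can be illustrated without this library. The sketch below uses only Python's standard html.parser to pass every link URL through a callback while emitting Markdown links. The names LinkRewriter, links_to_markdown, and force_https are hypothetical, and this is not the html-to-markdown visitor API, whose actual interface is described in the project documentation.

```python
from html.parser import HTMLParser


class LinkRewriter(HTMLParser):
    """Hypothetical stand-in, NOT the library's visitor API.

    Emits text unchanged and turns <a href="..."> into Markdown links,
    passing each URL through a user-supplied callback first.
    """

    def __init__(self, rewrite_url):
        super().__init__()
        self.rewrite_url = rewrite_url  # callback: str -> str
        self.parts = []                 # output fragments, joined at the end
        self.href_stack = []            # rewritten URLs of currently open <a> tags

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            self.href_stack.append(self.rewrite_url(href))
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag == "a" and self.href_stack:
            self.parts.append(f"]({self.href_stack.pop()})")

    def handle_data(self, data):
        self.parts.append(data)


def links_to_markdown(html, rewrite_url):
    """Convert links in an HTML fragment, applying rewrite_url to each href."""
    parser = LinkRewriter(rewrite_url)
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


def force_https(url):
    """Example callback: upgrade plain http:// URLs to https://."""
    if url.startswith("http://"):
        return "https://" + url[len("http://"):]
    return url


print(links_to_markdown('See <a href="http://example.com/docs">the docs</a>.', force_https))
# See [the docs](https://example.com/docs).
```

The callback stays separate from the traversal, so the same walker can serve filtering or dialect-specific output by swapping in a different function.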
Quick Start
    # Rust
    cargo add html-to-markdown-rs

    # Python
    pip install html-to-markdown

    # TypeScript / Node.js
    npm install @kreuzberg/html-to-markdown-node

    # Ruby
    gem install html-to-markdown

    # CLI
    cargo install html-to-markdown-cli
    # or
    brew install kreuzberg-dev/tap/html-to-markdown
See the Installation Guide for all languages including PHP, Go, Java, C#, Elixir, R, and WASM.
Usage
convert() is the single entry point. It returns a structured ConversionResult:
    # Python
    from html_to_markdown import convert

    result = convert("<h1>Hello</h1><p>World</p>")
    print(result["content"])   # # Hello\n\nWorld
    print(result["metadata"])  # title, links, headings, …

    // TypeScript / Node.js
    import { convert } from "@kreuzberg/html-to-markdown-node";

    const result = convert("<h1>Hello</h1><p>World</p>");
    console.log(result.content);   // # Hello\n\nWorld
    console.log(result.metadata);  // title, links, headings, …

    // Rust
    use html_to_markdown_rs::convert;

    let result = convert("<h1>Hello</h1><p>World</p>", None)?;
    println!("{}", result.content.unwrap_or_default());
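The Rust example calls unwrap_or_default() on content, which suggests the converted body may be absent. A minimal Python sketch of reading the result defensively follows. It assumes the result behaves like a mapping with the fields named in this README (content, metadata, tables, images, warnings) and that tables and warnings are sized collections. The helper summarize_result and the sample data are hypothetical, and the sample stands in for a real convert() call so the block runs without the package installed.

```python
from typing import Any, Mapping


def summarize_result(result: Mapping[str, Any]) -> str:
    """Build a one-line summary from a ConversionResult-shaped mapping.

    Assumption: the keys match the fields listed in the README; missing or
    None values are treated as empty rather than raising.
    """
    content = result.get("content") or ""
    tables = result.get("tables") or []
    warnings = result.get("warnings") or []
    first_line = content.splitlines()[0] if content else "(no content)"
    return f"{first_line} | {len(tables)} table(s), {len(warnings)} warning(s)"


# Stand-in data shaped like the documented fields. With the package installed,
# this would instead come from convert("<h1>Hello</h1><p>World</p>").
sample = {
    "content": "# Hello\n\nWorld",
    "metadata": {},
    "tables": [],
    "images": [],
    "warnings": [],
}

print(summarize_result(sample))  # # Hello | 0 table(s), 0 warning(s)
```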
Language Bindings
| Language | Package | Install |
|---|---|---|
| Rust | html-to-markdown-rs | cargo add html-to-markdown-rs |
| Python | html-to-markdown | pip install html-to-markdown |
| TypeScript / Node.js | @kreuzberg/html-to-markdown-node | npm install @kreuzberg/html-to-markdown-node |
| WebAssembly | @kreuzberg/html-to-markdown-wasm | npm install @kreuzberg/html-to-markdown-wasm |
| Ruby | html-to-markdown | gem install html-to-markdown |
| PHP | kreuzberg-dev/html-to-markdown | composer require kreuzberg-dev/html-to-markdown |
| Go | htmltomarkdown | go get github.com/kreuzberg-dev/html-to-markdown/packages/go/v3 |
| Java | dev.kreuzberg:html-to-markdown | Maven / Gradle |
| C# | KreuzbergDev.HtmlToMarkdown | dotnet add package KreuzbergDev.HtmlToMarkdown |
| Elixir | html_to_markdown | mix deps.get html_to_markdown |
| R | htmltomarkdown | install.packages("htmltomarkdown") |
| C (FFI) | releases | Pre-built .so / .dll / .dylib |
Part of the Kreuzberg Ecosystem
html-to-markdown is developed by kreuzberg.dev and powers the HTML conversion pipeline in Kreuzberg, a document intelligence library for extracting text from PDFs, images, and office documents.
Contributing
Contributions welcome! See CONTRIBUTING.md for setup instructions and guidelines.
License
MIT License — see LICENSE for details.