marcgoertz/shorten

Safely truncate HTML markup while preserving tags, handling entities, and supporting Unicode/emoji with optional word-safe truncation

5.0.1 2025-08-19 12:15 UTC

This package is auto-updated.

Last update: 2025-08-19 13:28:46 UTC


Installation

I recommend installing Shorten with Composer:

composer require marcgoertz/shorten

Alternatively, you can require the library's source file directly in your scripts.

Usage

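<?php

// Load Composer's autoloader (adjust the path to where your vendor/ directory lives).
require __DIR__ . '/vendor/autoload.php';

use Marcgoertz\Shorten\Shorten;

$shorten = new Shorten();
print $shorten->truncateMarkup('<a href="https://example.com/">Go to example site</a>', 10);
?>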

Output:

<a href="https://example.com/">Go to exam</a>…

Functions

truncateMarkup()

truncateMarkup(
    string $markup,
    int $length = 400,
    string $appendix = '…',
    bool $appendixInside = false,
    bool $wordsafe = false,
    string $delimiter = ' '
): string

Parameters

  • string $markup: Text containing markup
  • int $length: Maximum length of truncated text (default: 400)
  • string $appendix: Text added after truncated text (default: '…')
  • bool $appendixInside: Place the appendix inside the last tag that contains content instead of after the closing tags; increases $length by 1 (default: false)
  • bool $wordsafe: Wordsafe truncation, cuts at word boundaries (default: false)
  • string $delimiter: Delimiter for wordsafe truncation (default: ' ')
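
With six positional parameters, reaching a later option such as $wordsafe means spelling out every default before it. A minimal sketch using named arguments instead is shown here. It assumes PHP 8.0 or newer, that the method's real parameter names match the signature shown above, and that Composer's autoloader sits at the path used below.

```php
<?php

// Assumption: this script sits next to Composer's vendor/ directory.
require __DIR__ . '/vendor/autoload.php';

use Marcgoertz\Shorten\Shorten;

$shorten = new Shorten();

// Named arguments (PHP 8.0+) skip $appendix and $appendixInside,
// so both keep their defaults while $wordsafe is switched on.
// The argument names are taken from the signature documented above.
$teaser = $shorten->truncateMarkup(
    markup: '<p>Hello world test</p>',
    length: 10,
    wordsafe: true,
);

echo $teaser, "\n";
```

If the parameter names ever change between releases, the positional form from the examples remains the safer choice.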

Examples

<?php
use Marcgoertz\Shorten\Shorten;

$shorten = new Shorten();

// Basic truncation
$result = $shorten->truncateMarkup('<b>Hello world test</b>', 10);
// Output: <b>Hello worl</b>…

// Appendix inside tags
$result = $shorten->truncateMarkup('<b>Hello world test</b>', 10, '...', true);
// Output: <b>Hello worl...</b>

// Wordsafe truncation (cuts at word boundaries)
$result = $shorten->truncateMarkup('<b>Hello world test</b>', 10, '...', false, true);
// Output: <b>Hello</b>...

// Custom delimiter for wordsafe truncation
$result = $shorten->truncateMarkup('<b>Hello-world-test</b>', 10, '...', false, true, '-');
// Output: <b>Hello</b>...

// Preserves HTML structure with nested tags
$result = $shorten->truncateMarkup('<div><b><i>Hello world</i></b></div>', 8);
// Output: <div><b><i>Hello wo</i></b></div>…

// Handles HTML entities correctly
$result = $shorten->truncateMarkup('<b>Caf&eacute; &amp; Restaurant</b>', 8);
// Output: <b>Café &amp; Re</b>…
?>
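
The expected outputs above suggest that $length limits the visible text and that tags do not count toward it. To check results in your own tests, one option is a small helper that measures what a reader actually sees. The function name visibleLength() is hypothetical and not part of Shorten. It uses only built-in PHP functions (mbstring required) and makes no claim about how Shorten itself counts entities.

```php
<?php

// Hypothetical helper (not part of Shorten): counts the characters a reader sees.
function visibleLength(string $markup): int
{
    // Drop the tags, then turn entities such as &amp; into single characters.
    $text = html_entity_decode(strip_tags($markup), ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Count Unicode code points rather than bytes.
    return mb_strlen($text, 'UTF-8');
}

// Expected outputs copied from the examples above.
echo visibleLength('<b>Hello worl</b>'), "\n";                  // 10
echo visibleLength('<div><b><i>Hello wo</i></b></div>'), "\n";  // 8

// Untruncated input: visible characters versus raw bytes of markup.
echo visibleLength('<b>Caf&eacute; &amp; Restaurant</b>'), "\n"; // 17
echo strlen('<b>Caf&eacute; &amp; Restaurant</b>'), "\n";        // 35
```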

Features

  • ✅ Preserves HTML tag structure and proper nesting
  • ✅ Handles HTML entities correctly
  • ✅ Supports self-closing tags (both XML and HTML5 style)
  • ✅ UTF-8 and multibyte character support (including emojis)
  • ✅ Wordsafe truncation to avoid cutting words in the middle
  • ✅ Configurable appendix text and placement
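
These features matter because PHP's byte-oriented string functions know nothing about tags, entities, or multibyte characters. As a rough illustration, the sketch below uses only built-in functions (no Shorten calls, mbstring required, source file saved as UTF-8) to show what naive truncation does to this kind of input.

```php
<?php

$markup = '<b>Caf&eacute; &amp; Restaurant</b>';

// substr() counts bytes of markup: the <b> tag is never closed
// and the &eacute; entity is cut in half.
$naive = substr($markup, 0, 10);
echo $naive, "\n"; // <b>Caf&eac

$emoji = 'Hi 👋 there';

// The emoji occupies 4 bytes, so cutting after 5 bytes splits it
// and leaves a string that is no longer valid UTF-8.
$bytes = substr($emoji, 0, 5);
var_dump(mb_check_encoding($bytes, 'UTF-8')); // bool(false)

// mb_substr() counts characters, so the emoji survives intact,
// but it still has no notion of tags or entities.
$chars = mb_substr($emoji, 0, 5, 'UTF-8');
echo $chars, "\n"; // "Hi 👋 " (5 characters, including the trailing space)
```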

License

MIT © Marc Görtz